* fix some warnings when running tests
* revert changes in CMakeLists.txt
* update example to make use of newly promoted method
* update tests to make use of newly promoted methods
* wip: dcgan-example
* wip: dcgan-example
* update example to use leaky_relu and remove bias from net
* wip
* it works!
* add more comments
* add visualization code
* add example documentation
* rename example
* fix comment
* better comment format
* fix the noise generator seed
* add message to hit enter for image generation
* fix srand, too
* add std::vector overload to update_parameters
* improve training stability
* better naming of variables
make sure it is clear we update the generator with the discriminator's
gradient using fake samples and true labels
* fix comment: generator -> discriminator
* update leaky_relu docs to match the relu ones
* replace not with !
* add Davis' suggestions to make training more stable
* use tensor instead of resizable_tensor
* do not use dnn_trainer for discriminator
* Add instance segmentation example - first version of training code
* Add MMOD options; get rid of the cache approach, and instead load all MMOD rects upfront
* Improve console output
* Set filter count
* Minor tweaking
* Inference - first version, at least compiles!
* Ignore overlapped boxes
* Ignore even small instances
* Set overlaps_ignore
* Add TODO remarks
* Revert "Set overlaps_ignore"
This reverts commit 65adeff1f8.
* Set result size
* Set label image size
* Take ignore-color into account
* Fix the cropping rect's aspect ratio; also slightly expand the rect
* Draw the largest findings last
* Improve masking of the current instance
* Add some perturbation to the inputs
* Simplify ground-truth reading; fix random cropping
* Read class labels as well
* Tweak default minibatch size
* Learn only one class
* Really train only instances of the selected class
* Remove outdated TODO remark
* Automatically skip images with no detections
* Print to console what was found
* Fix class index problem
* Fix indentation
* Allow choosing multiple classes
* Draw rect in the color of the corresponding class
* Write detector window classes to ostream; also group detection windows by class (when ostreaming)
* Train a separate instance segmentation network for each classlabel
* Use separate synchronization file for each seg net of each class
* Allow more overlap
* Fix sorting criterion
* Fix interpolating the predicted mask
* Improve bilinear interpolation: if output type is an integer, round instead of truncating
* Add helpful comments
* Ignore large aspect ratios; refactor the code; tweak some network parameters
* Simplify the segmentation network structure; make the object detection network more complex in turn
* Problem: CUDA errors not reported properly to console
Solution: stop and join data loader threads even in case of exceptions
* Minor parameters tweaking
* Loss may have increased, even if prob_loss_increasing_thresh > prob_loss_increasing_thresh_max_value
* Add previous_loss_values_dump_amount to previous_loss_values.size() when deciding if loss has been increasing
* Improve behaviour when loss actually increased after disk sync
* Revert some of the earlier change
* Disregard dumped loss values only when deciding if learning rate should be shrunk, but *not* when deciding if loss has been going up since last disk sync
* Revert "Revert some of the earlier change"
This reverts commit 6c852124ef.
* Keep enough previous loss values, until the disk sync
* Fix maintaining the dumped (now "effectively disregarded") loss values count
* Detect cats instead of aeroplanes
* Add helpful logging
* Clarify the intention and the code
* Review fixes
* Add operator== for the other pixel types as well; remove the inline
* If available, use constexpr if
* Revert "If available, use constexpr if"
This reverts commit 503d4dd335.
* Simplify code as per review comments
* Keep estimating steps_without_progress, even if steps_since_last_learning_rate_shrink < iter_without_progress_thresh
* Clarify console output
* Revert "Keep estimating steps_without_progress, even if steps_since_last_learning_rate_shrink < iter_without_progress_thresh"
This reverts commit 9191ebc776.
* To keep the changes to a bare minimum, revert the steps_since_last_learning_rate_shrink change after all (at least for now)
* Even empty out some of the previous test loss values
* Minor review fixes
* Can't use C++14 features here
* Do not use the struct name as a variable name
* Add concat_prev layer, and U-net example for semantic segmentation
* Allow supplying the mini-batch size as a command-line parameter
* Decrease default mini-batch size from 30 to 24
* Resize t1, if needed
* Use DenseNet-style blocks instead of residual learning
* Increase default mini-batch size to 50
* Increase default mini-batch size from 50 to 60
* Resize even during the backward step, if needed
* Use resize_bilinear_gradient for the backward step
* Fix function call ambiguity problem
* Clear destination before adding gradient
* Works OK-ish
* Add more U-tags
* Tweak default mini-batch size
* Define a simpler network when using Microsoft Visual C++ compiler; clean up the DenseNet stuff (leaving it for a later PR)
* Decrease default mini-batch size from 24 to 23
* Define separate dnn filenames for MSVC++ and other compilers
* Add documentation for the resize_to_prev layer; move the implementation so that it comes after mult_prev
* Fix previous typo
* Minor formatting changes
* Reverse the ordering of levels
* Increase the learning-rate stopping criterion back to 1e-4 (was 1e-8)
* Use more U-tags even on Windows
* Minor formatting
* Latest MSVC 2017 builds fast, so there's no need to limit the depth any longer
* Tweak default mini-batch size again
* Even though latest MSVC can now build the extra layers, it does not mean we should add them!
* Fix naming
* Problem: integer overflow when calculating sizes (may happen e.g. with very large images)
Solution: change some types from (unsigned) long to size_t
* Fix the fact that std::numeric_limits<unsigned long>::max() isn't always the same number
* Revert serialization changes
* Review fix: use long long instead of size_t
* From long to long long all the way
* Change more types to (hopefully) make the compiler happy
* Change many more types to size_t
* Change even more types to size_t
* Minor type changes
the code, but it helps visual studio use less RAM when building the example,
and might make appveyor not crash. It's also a
slightly cleaner way to write the code anyway.
* Exposed jitter_image in Python and added an example
* Return Numpy array directly
* Require numpy during setup
* Added install of Numpy before builds
* Changed pip install to user-only due to security issues.
* Removed malloc
* Made presence of Numpy during compile optional.
* Conflict
* Refactored get_face_chip/get_face_chips to use Numpy as well.
* Add example of semantic segmentation using the PASCAL VOC2012 dataset
* Add note about Debug Information Format when using MSVC
* Make the upsampling layers residual as well
* Fix declaration order
* Use a wider net
* trainer.set_iterations_without_progress_threshold(5000); // (was 20000)
* Add residual_up
* Process entire directories of images (just easier to use)
* Simplify network structure so that builds finish even on Visual Studio (faster, or at all)
* Remove the training example from CMakeLists, because it's too much for the 32-bit MSVC++ compiler to handle
* Remove the probably-now-unnecessary set_dnn_prefer_smallest_algorithms call
* Review fix: remove the batch normalization layer from right before the loss
* Review fix: point out that only the Visual C++ compiler has problems.
Also expand the instructions how to run MSBuild.exe to circumvent the problems.
* Review fix: use dlib::match_endings
* Review fix: use dlib::join_rows. Also add some comments, and instructions where to download the pre-trained net from.
* Review fix: make formatting comply with dlib style conventions.
* Review fix: output training parameters.
* Review fix: remove #ifndef __INTELLISENSE__
* Review fix: use std::string instead of char*
* Review fix: update interpolation_abstract.h to say that extract_image_chips can now take the interpolation method as a parameter
* Fix whitespace formatting
* Add more comments
* Fix finding image files for inference
* Resize inference test output to the size of the input; add clarifying remarks
* Resize net output even in calculate_accuracy
* After all crop the net output instead of resizing it by interpolation
* For clarity, add an empty line in the console output
dimensions in the same format as the mmod_options object (i.e. two lengths
measured in pixels). This should make defining random_cropping strategies that
are consistent with MMOD settings much more straightforward since you can just
take the mmod_options settings and give them to the random_cropper and it will
do the right thing.
* Testing on a given video like cv::VideoCapture cap("Sample.avi") may break when the video runs out of frames before the main window is closed by the user.
min and max object height, it's now min and max object size. This way, if you
have objects that are short and wide (i.e. objects where the relevant dimension
is width rather than height) you will get sensible behavior out of the random
cropper.