* wip: layer normalization on cpu
* wip: add cuda implementation, not working yet
* wip: try to fix cuda implementation
* swap grid_stride_range and grid_stride_range_y: does not work yet
* fix CUDA implementation
* implement cuda gradient
* add documentation, move layer_norm, update bn_visitor
* add tests
* use stddev instead of variance in test (they are both 1, anyway)
* add test for means and invstds on CPU and CUDA
* rename visitor to disable_duplicative_bias
* handle more cases in the visitor_disable_input_bias
* Add tests for visitor_disable_input_bias
Now the user doesn't have to supply a visitor capable of visiting all
layers, but instead just the ones they are interested in. Also added
visit_computational_layers() and visit_computational_layers_range()
since those capture a very common use case more concisely than
visit_layers(). That is, users generally want to mess with the
computational layers specifically as those are the stateful layers.
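To illustrate the idea of a visitor that only handles the layers it cares about, here is a minimal, self-contained sketch. It is not dlib's implementation: the layer structs, `call_if_supported`, `visit_layers_sketch` and `disable_fc_bias` are hypothetical names invented for this example, and the "network" is just a `std::tuple`. The sketch assumes C++14 and shows one common way to skip layers for which the visitor has no matching overload.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical stand-ins for network layers (these are NOT dlib types).
struct fc_layer   { bool use_bias = true; };
struct relu_layer {};
struct bn_layer   {};

// Preferred overload: participates only if v(i, layer) is a valid expression.
template <typename V, typename L>
auto call_if_supported(V& v, std::size_t i, L& layer, int)
    -> decltype(v(i, layer), void())
{
    v(i, layer);
}

// Fallback overload: chosen when the visitor cannot be called with this
// layer type, so the layer is silently skipped.
template <typename V, typename L>
void call_if_supported(V&, std::size_t, L&, long) {}

template <typename V, typename Tuple, std::size_t... I>
void visit_impl(Tuple& net, V& v, std::index_sequence<I...>)
{
    // Expand the pack in order; each element visits one layer.
    int expand[] = { 0, (call_if_supported(v, I, std::get<I>(net), 0), 0)... };
    (void)expand;
}

// Hypothetical analogue of a visit_layers-style function over a tuple "network".
template <typename V, typename... Ls>
void visit_layers_sketch(std::tuple<Ls...>& net, V& v)
{
    visit_impl(net, v, std::index_sequence_for<Ls...>{});
}

// A visitor that only knows about fc_layer; every other layer is ignored.
struct disable_fc_bias
{
    int visited = 0;
    void operator()(std::size_t, fc_layer& l) { l.use_bias = false; ++visited; }
};
```

With a network such as `std::tuple<fc_layer, relu_layer, bn_layer, fc_layer>`, calling `visit_layers_sketch(net, v)` with a `disable_fc_bias` visitor touches only the two `fc_layer` entries and compiles even though the visitor has no overload for the other layer types.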
* add visitor to remove bias from bn_ inputs (closes #2155)
* remove unused parameter and make documentation more clear
* remove bias from bn_ layers too and use better name
* let the batch norm layers keep their bias, use even better name
* be more consistent with impl naming
* remove default constructor
* do not use method to prevent some errors
* add disable bias method to pertinent layers
* update dcgan example
- grammar
- print number of network parameters to be able to check bias is not allocated
- at the end, give feedback to the user about what the discriminator thinks about each generated sample
* fix fc_ logic
* add documentation
* add bias_is_disabled methods and update to_xml
* print use_bias=false when bias is disabled
* fix some warnings when running tests
* revert changes in CMakeLists.txt
* update example to make use of newly promoted method
* update tests to make use of newly promoted methods
* wip: dcgan-example
* wip: dcgan-example
* update example to use leaky_relu and remove bias from net
* wip
* it works!
* add more comments
* add visualization code
* add example documentation
* rename example
* fix comment
* better comment format
* fix the noise generator seed
* add message to hit enter for image generation
* fix srand, too
* add std::vector overload to update_parameters
* improve training stability
* better naming of variables
make sure it is clear we update the generator with the discriminator's
gradient using fake samples and true labels
* fix comment: generator -> discriminator
* update leaky_relu docs to match the relu ones
* replace not with !
* add Davis' suggestions to make training more stable
* use tensor instead of resizable_tensor
* do not use dnn_trainer for discriminator
* Add instance segmentation example - first version of training code
* Add MMOD options; get rid of the cache approach, and instead load all MMOD rects upfront
* Improve console output
* Set filter count
* Minor tweaking
* Inference - first version, at least compiles!
* Ignore overlapped boxes
* Ignore even small instances
* Set overlaps_ignore
* Add TODO remarks
* Revert "Set overlaps_ignore"
This reverts commit 65adeff1f8.
* Set result size
* Set label image size
* Take ignore-color into account
* Fix the cropping rect's aspect ratio; also slightly expand the rect
* Draw the largest findings last
* Improve masking of the current instance
* Add some perturbation to the inputs
* Simplify ground-truth reading; fix random cropping
* Read even class labels
* Tweak default minibatch size
* Learn only one class
* Really train only instances of the selected class
* Remove outdated TODO remark
* Automatically skip images with no detections
* Print to console what was found
* Fix class index problem
* Fix indentation
* Allow choosing multiple classes
* Draw rect in the color of the corresponding class
* Write detector window classes to ostream; also group detection windows by class (when ostreaming)
* Train a separate instance segmentation network for each class label
* Use separate synchronization file for each seg net of each class
* Allow more overlap
* Fix sorting criterion
* Fix interpolating the predicted mask
* Improve bilinear interpolation: if output type is an integer, round instead of truncating
* Add helpful comments
* Ignore large aspect ratios; refactor the code; tweak some network parameters
* Simplify the segmentation network structure; make the object detection network more complex in turn
* Problem: CUDA errors not reported properly to console
Solution: stop and join data loader threads even in case of exceptions
* Minor parameter tweaking
* Loss may have increased, even if prob_loss_increasing_thresh > prob_loss_increasing_thresh_max_value
* Add previous_loss_values_dump_amount to previous_loss_values.size() when deciding if loss has been increasing
* Improve behaviour when loss actually increased after disk sync
* Revert some of the earlier change
* Disregard dumped loss values only when deciding if learning rate should be shrunk, but *not* when deciding if loss has been going up since last disk sync
* Revert "Revert some of the earlier change"
This reverts commit 6c852124ef.
* Keep enough previous loss values, until the disk sync
* Fix maintaining the dumped (now "effectively disregarded") loss values count
* Detect cats instead of aeroplanes
* Add helpful logging
* Clarify the intention and the code
* Review fixes
* Add operator== for the other pixel types as well; remove the inline
* If available, use constexpr if
* Revert "If available, use constexpr if"
This reverts commit 503d4dd335.
* Simplify code as per review comments
* Keep estimating steps_without_progress, even if steps_since_last_learning_rate_shrink < iter_without_progress_thresh
* Clarify console output
* Revert "Keep estimating steps_without_progress, even if steps_since_last_learning_rate_shrink < iter_without_progress_thresh"
This reverts commit 9191ebc776.
* To keep the changes to a bare minimum, revert the steps_since_last_learning_rate_shrink change after all (at least for now)
* Even empty out some of the previous test loss values
* Minor review fixes
* Can't use C++14 features here
* Do not use the struct name as a variable name
* Add concat_prev layer, and U-net example for semantic segmentation
* Allow supplying the mini-batch size as a command-line parameter
* Decrease default mini-batch size from 30 to 24
* Resize t1, if needed
* Use DenseNet-style blocks instead of residual learning
* Increase default mini-batch size to 50
* Increase default mini-batch size from 50 to 60
* Resize even during the backward step, if needed
* Use resize_bilinear_gradient for the backward step
* Fix function call ambiguity problem
* Clear destination before adding gradient
* Works OK-ish
* Add more U-tags
* Tweak default mini-batch size
* Define a simpler network when using Microsoft Visual C++ compiler; clean up the DenseNet stuff (leaving it for a later PR)
* Decrease default mini-batch size from 24 to 23
* Define separate dnn filename for MSVC++ and not
* Add documentation for the resize_to_prev layer; move the implementation so that it comes after mult_prev
* Fix previous typo
* Minor formatting changes
* Reverse the ordering of levels
* Increase the learning-rate stopping criterion back to 1e-4 (was 1e-8)
* Use more U-tags even on Windows
* Minor formatting
* Latest MSVC 2017 builds fast, so there's no need to limit the depth any longer
* Tweak default mini-batch size again
* Even though latest MSVC can now build the extra layers, it does not mean we should add them!
* Fix naming
* Problem: integer overflow when calculating sizes (may happen e.g. with very large images)
Solution: change some types from (unsigned) long to size_t
# Conflicts:
# dlib/dnn/tensor.h
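The overflow described in the commit above can be reproduced in isolation. The following is a minimal sketch, not the actual dlib code: `element_count_32` and `element_count_64` are hypothetical helpers, and `std::uint32_t` stands in for `unsigned long` on platforms where that type is 32 bits wide. The dimensions used in the usage note are made-up values chosen only so that the product exceeds 2^32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: element count computed in 32-bit unsigned arithmetic.
// The product wraps modulo 2^32 when it does not fit.
inline std::uint32_t element_count_32(std::uint32_t n, std::uint32_t k,
                                      std::uint32_t nr, std::uint32_t nc)
{
    return n * k * nr * nc;
}

// Hypothetical helper: the same computation in long long, which is at least
// 64 bits wide on every conforming platform.
inline long long element_count_64(long long n, long long k,
                                  long long nr, long long nc)
{
    return n * k * nr * nc;
}
```

For a single 3-channel 40000x40000 image, `element_count_64(1, 3, 40000, 40000)` yields 4800000000, while `element_count_32` with the same arguments wraps around to 505032704. Unlike `unsigned long`, whose width differs between platforms, `long long` has a guaranteed minimum of 64 bits.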
* Fix the fact that std::numeric_limits<unsigned long>::max() isn't always the same number
* Revert serialization changes
* Review fix: use long long instead of size_t
* From long to long long all the way
* Change more types to (hopefully) make the compiler happy
* Change many more types to size_t
* Change even more types to size_t
* Minor type changes
the code, but it helps visual studio use less RAM when building the example,
and might make appveyor not crash. It's also a
slightly cleaner way to write the code anyway.
* Exposed jitter_image in Python and added an example
* Return Numpy array directly
* Require numpy during setup
* Added install of Numpy before builds
* Changed pip install for user only due to security issues.
* Removed malloc
* Made presence of Numpy during compile optional.
* Conflict
* Refactored get_face_chip/get_face_chips to use Numpy as well.
* Add example of semantic segmentation using the PASCAL VOC2012 dataset
* Add note about Debug Information Format when using MSVC
* Make the upsampling layers residual as well
* Fix declaration order
* Use a wider net
* trainer.set_iterations_without_progress_threshold(5000); // (was 20000)
* Add residual_up
* Process entire directories of images (just easier to use)
* Simplify network structure so that builds finish even on Visual Studio (faster, or at all)
* Remove the training example from CMakeLists, because it's too much for the 32-bit MSVC++ compiler to handle
* Remove the probably-now-unnecessary set_dnn_prefer_smallest_algorithms call
* Review fix: remove the batch normalization layer from right before the loss
* Review fix: point out that only the Visual C++ compiler has problems.
Also expand the instructions on how to run MSBuild.exe to circumvent the problems.
* Review fix: use dlib::match_endings
* Review fix: use dlib::join_rows. Also add some comments, and instructions on where to download the pre-trained net.
* Review fix: make formatting comply with dlib style conventions.
* Review fix: output training parameters.
* Review fix: remove #ifndef __INTELLISENSE__
* Review fix: use std::string instead of char*
* Review fix: update interpolation_abstract.h to say that extract_image_chips can now take the interpolation method as a parameter
* Fix whitespace formatting
* Add more comments
* Fix finding image files for inference
* Resize inference test output to the size of the input; add clarifying remarks
* Resize net output even in calculate_accuracy
* After all, crop the net output instead of resizing it by interpolation
* For clarity, add an empty line in the console output
dimensions in the same format as the mmod_options object (i.e. two lengths
measured in pixels). This should make defining random_cropping strategies that
are consistent with MMOD settings much more straightforward since you can just
take the mmod_options settings and give them to the random_cropper and it will
do the right thing.