* typo
* - added compile time information to audio object. Not convinced this is needed actually. I'm perfectly happy just using the ffmpeg::frame object. I'm pretty sure I'm the only user who cares about audio.
- created resizing_args and resampling_args
* smaller videos for unit tests
* shorter videos for unit tests
* - decoder and demuxer: you now resize or resample at the time of read. therefore you don't set resizing or resampling parameters in constructor, but you pass them to read()
- added templated read() function
- simplified load_frame()
* inherit from resizing_args and resampling_args
* reorganised the tests to segregate decoding, demuxing, encoding and muxing as much as possible
* much more basic example
* demuxing examples split
* examples
* fixing examples
* wip
* Fix load_frame()
* added frame - specific tests
* - makes sense to have a set_params() method rather than constructing a new object and moving. I mean, it works and it absolutely does the right thing, and in fact the same thing as calling set_params() now, but it can look a bit weird.
* notes on defaults and good pairings
* Update ffmpeg_demuxer.h
Watch out for `DLIB_ASSERT` statements. Maybe one of the unit tests should build with asserts enabled.
* Update ffmpeg_details.h
* Update ffmpeg_muxer.h
* WIP
* WIP
* - simplified details::resizer
- added frame::set_params()
- added frame::clear()
- forward packet directly into correct queue
* pick best codec if not specified
* added image data
* warn when we're choosing an appropriate codec
* test load_frame()
* - for some reason, you sometimes get warning messages about too many b-frames. Resetting pict_type suppresses this.
- you can move freshly decoded frames directly out.
* callback passed to push()
* I think it's prettier this way
* WIP
* full callback API for decoder
* updated tests
* updated example
* check the template parameter is callable and has 1 argument first before getting its first argument
* Potential bug fix
* - write out the enable_if's explicitly. It's fine. I think it's clear what's going on if someone cares
- guard push() with a boolean which asserts when recursion is detected
* pre-conditions on callbacks: no recursion
---------
Co-authored-by: pf <pf@me>
Co-authored-by: Your name <you@example.com>
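A minimal sketch of the recursion guard described above for push(): a boolean flag is set while push() runs, and re-entering push() from inside the callback is rejected. The class `toy_decoder`, its members, and the use of an exception in place of an assert macro are assumptions made for illustration; this is not dlib's decoder.

```cpp
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a decoder; not dlib's real API.
class toy_decoder
{
public:
    // Calls `on_item` once per pushed value. Re-entering push() from inside
    // `on_item` violates the "no recursion" precondition and is reported
    // as a std::logic_error.
    void push(int value, const std::function<void(int)>& on_item)
    {
        if (inside_push)
            throw std::logic_error("push() called recursively from a callback");

        guard g(inside_push);   // sets the flag, clears it again on scope exit
        delivered.push_back(value);
        on_item(value);
    }

    std::vector<int> delivered;

private:
    // RAII helper so the flag is reset even if the callback throws.
    struct guard
    {
        explicit guard(bool& f) : flag(f) { flag = true; }
        ~guard() { flag = false; }
        bool& flag;
    };

    bool inside_push = false;
};
```

The RAII helper matters here: without it, an exception thrown by the callback would leave the flag set and every later push() would be rejected.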
* muxing
* Add HSV support (#2758)
* Add HSV support
* Add tests
* Update dlib/pixel.h
Co-authored-by: Adrià Arrufat <1671644+arrufat@users.noreply.github.com>
* Add HSV struct and make more things const
---------
Co-authored-by: Davis E. King <davis685@gmail.com>
* Fix imglab changing the current dir too soon (#2761)
* A bit of cleanup
---------
Co-authored-by: pf <pf@me>
Co-authored-by: Adrià Arrufat <1671644+arrufat@users.noreply.github.com>
Co-authored-by: Davis E. King <davis685@gmail.com>
Co-authored-by: Davis King <davis@dlib.net>
* docs
* callbacks for encoder
* shorter video
* shorter video
* added is_byte type trait
* leave muxer for next PR
* added overloads for set_layout() and get_layout() in details namespace
* unit test
* example
* build
* overloads for ffmpeg < 5
* Update examples/ffmpeg_video_encoding_ex.cpp
Co-authored-by: Adrià Arrufat <1671644+arrufat@users.noreply.github.com>
* Update dlib/media/ffmpeg_abstract.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* Update dlib/media/ffmpeg_abstract.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* Update dlib/media/ffmpeg_abstract.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* Update dlib/media/ffmpeg_abstract.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* Update dlib/media/ffmpeg_abstract.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* as per suggestion
* remove requires clause
* Update examples/ffmpeg_video_encoding_ex.cpp
Co-authored-by: Davis E. King <davis685@gmail.com>
* Update dlib/media/ffmpeg_abstract.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* Update dlib/media/ffmpeg_abstract.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* Update dlib/media/ffmpeg_abstract.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* Update dlib/media/ffmpeg_muxer.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* use dlib::logger
* oops
* Update dlib/media/ffmpeg_muxer.h
Co-authored-by: Davis E. King <davis685@gmail.com>
* Update dlib/media/ffmpeg_demuxer.h
* Update dlib/media/ffmpeg_demuxer.h
* Update dlib/media/ffmpeg_abstract.h
---------
Co-authored-by: pf <pf@me>
Co-authored-by: Davis E. King <davis685@gmail.com>
Co-authored-by: Adrià Arrufat <1671644+arrufat@users.noreply.github.com>
* - use add_executable directly
- use target_compile_definitions()
- strip binaries in release mode
* Added a comment
---------
Co-authored-by: pf <pf@me>
Co-authored-by: Davis E. King <davis685@gmail.com>
* - added ffmpeg stuff to cmake
* - added observer_ptr
* ffmpeg utils
* WIP
* - added ffmpeg_decoder
* config file for test data
* another test file
* install ffmpeg
* added ffmpeg_demuxer
* install all ffmpeg libraries
* support older version of ffmpeg
* simplified loop
* - test converting to dlib object
- added docs
- support older ffmpeg
* added convert() overload
* added comment
* only register stuff when API not deprecated
* - fixed version issues
- fixed decoding
* added tests for ffmpeg_demuxer
* removed unused code
* test GIF
* added docs
* added audio test
* test for audio
* more tests
* review changes
* don't need observer_ptr
* made deps public. I could be wrong but just in case.
* - added some static asserts. Some areas of the code might do memcpy's on arrays of pixels. This requires the structures to be packed. Check this.
- added convert() functions
- changed default decoder options. By default, always decode to RGB and S16 audio
- added convenience constructor to demuxer
* - no longer need opencv
* oops. I let that slip
* - made a few functions public
- more precise requires clauses
* enhanced example
* - avoid FFMPEG_INITIALIZED being optimized away at link time
- added decoding example
* - avoid -Wunused-parameter error
* constexpr and noexcept correctness. This probably makes no difference to performance, BUT, it's what the core guidelines tell you to do. It does however demonstrate how complicated and unnecessarily verbose C++ is becoming. Sigh, maybe one day I'll make the switch to something that doesn't make my eyes twitch.
* - simplified metadata structure
* hopefully more educational
* added another example
* ditto
* typo
* screen grab example
* whoops
* avoid -Wunused-parameter errors
* ditto
* - added methods to av_dict
- print the demuxer format options that were not used
- enhanced webcam_face_pose_ex.cpp so you can set webcam options
* if height and width are specified, attempt to set video_size in format_options. Otherwise set the bilinear resizer.
* updated docs
* once again, the ffmpeg APIs do a lot for you. It's a matter of knowing which APIs to call.
* made header-only
* - some Werror thing
* don't use type_safe_union
* - templated sample type
- reverted deep copy of AVFrame for frame copy constructor
* - added is_pixel_type and is_pixel_check
* unit tests for pixel traits
* enhanced is_image_type type trait and added is_image_check
* added unit tests for is_image_type
* added pix_traits, improved convert() functions
* bug fix
* get rid of -Werror=unused-variable error
* added a type alias
* that's the last of the manual memcpys gone. We're using the ffmpeg API everywhere now for copying frames to buffers and back
* missing doc
* set framerate for webcam
* list input devices
* oops. I was trying to make ffmpeg 5 happy but I've given up on ffmpeg v5 compatibility in this PR. Future PR.
* enhanced the information provided by list_input_devices and list_output_devices
* removed vscode settings.json file
* - added a type trait for checking whether a type is complete. This is useful for writing type traits that check other types have type trait specializations. But also other useful things. For example, std::unique_ptr uses something similar to this.
* Davis was keen to simply check pixel_traits is specialised. That's equivalent to checking pixel_traits<> is complete for some type
* code review
* just use the void_t in dlib/type_traits.h
* one liners
* just need is_image_check
* more tests for is_image_type
* i think this is correct
* removed printf
* better docs
* Keep opencv out of it
* keep old face pose example, then add new one which uses dlib's ffmpeg wrappers
* revert
* revert
* better docs
* better docs
---------
Co-authored-by: pf <pf@me>
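A minimal sketch of the "is this type complete?" idea mentioned above, and of how it can stand in for "is this traits class specialised?". The names `is_complete`, `toy_pixel_traits`, `toy_rgb`, and `is_toy_pixel` are hypothetical, and the local `void_t` is a C++11 stand-in; dlib's own traits may be written differently.

```cpp
#include <type_traits>

// C++11 stand-in for std::void_t (local helper for this sketch).
template <class...> struct make_void { using type = void; };
template <class... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: assume T is incomplete.
template <class T, class = void>
struct is_complete : std::false_type {};

// Selected only when sizeof(T) is well-formed, i.e. T is complete at this point.
template <class T>
struct is_complete<T, void_t<decltype(sizeof(T))>> : std::true_type {};

// Toy traits class: declared for every T, but defined only for some types.
template <class T> struct toy_pixel_traits;

struct toy_rgb { unsigned char r, g, b; };
template <> struct toy_pixel_traits<toy_rgb> { static const int num = 3; };

// "T is a pixel" becomes "toy_pixel_traits<T> is a complete type".
template <class T>
using is_toy_pixel = is_complete<toy_pixel_traits<T>>;

static_assert(is_toy_pixel<toy_rgb>::value, "specialised, hence complete");
static_assert(!is_toy_pixel<int>::value, "not specialised, hence incomplete");
```

One caveat with this technique: the answer is fixed the first time the trait is instantiated for a given type, so a specialisation that only appears later in the translation unit will not be seen.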
* wip: loss goes down when training without a dnn_trainer
if I use a dnn_trainer, it segfaults (also with bigger batch sizes...)
* remove commented code
* fix gradient computation (hopefully)
* fix loss computation
* fix crash in input_rgb_image_pair::to_tensor
* fix alias tensor offset
* refactor loss and input layers and complete the example
* add more data augmentation
* add documentation
* add documentation
* small fix in the gradient computation and reuse terms
* fix warning in comment
* use tensor_tools instead of matrix to compute the gradients
* complete the example program
* add support for multi-gpu
* Update dlib/dnn/input_abstract.h
* Update dlib/dnn/input_abstract.h
* Update dlib/dnn/loss_abstract.h
* Update examples/dnn_self_supervised_learning_ex.cpp
* Update examples/dnn_self_supervised_learning_ex.cpp
* Update examples/dnn_self_supervised_learning_ex.cpp
* Update examples/dnn_self_supervised_learning_ex.cpp
* [TYPE_SAFE_UNION] upgrade (#2443)
* [TYPE_SAFE_UNION] upgrade
* MSVC doesn't like keyword not
* MSVC doesn't like keyword and
* added tests for emplace(), copy semantics, move semantics, swap, overloaded and apply_to_contents with non void return types
* - didn't need is_void anymore
- added result_of_t
- didn't really need ostream_helper or istream_helper
- split apply_to_contents into apply_to_contents (return void) and visit (return anything so long as visitor is publicly accessible)
* - updated abstract file
* - added get_type_t
- removed deserialize_helper duplicate
- don't use std::decay_t, that's c++14
* - removed white spaces
- don't need a return-statement when calling apply_to_contents_impl()
- use unchecked_get() whenever possible to minimise explicit use of pointer casting. Let's keep that to a minimum
* - added type_safe_union_size
- added type_safe_union_size_v if C++14 is available
- added tests for above
* - test type_safe_union_size_v
* testing nested unions with visitors.
* re-added comment
* added index() in abstract file
* - refactored reset() to clear()
- added comment about clear() in abstract file
- in deserialize(), only reset the object if necessary
* - removed unnecessary comment about exceptions
- removed unnecessary // -------------
- struct is_valid is not mentioned in abstract. Rather than requiring T to be a valid type, it is ensured!
- get_type and get_type_t are private. Client code shouldn't need this.
- shuffled some functions around
- type_safe_union_size and type_safe_union_size_v are removed. not needed
- reset() -> clear()
- bug fix in deserialize() index counts from 1, not 0
- improved the abstract file
* refactored index() to get_current_type_id() as per suggestion
* maybe slightly improved docs
* - HURRAY, don't need std::result_of or std::invoke_result for visit() to work. Just privately define your own type trait, in this case called return_type and return_type_t. it works!
- apply_to_contents() now always calls visit()
* example with private visitor using friendship with non-void return types.
* Fix up contracts
It can't be a post condition that T is a valid type, since the choice of T is up to the caller, it's not something these functions decide. Making it a precondition.
* Update dlib/type_safe_union/type_safe_union_kernel_abstract.h
* Update dlib/type_safe_union/type_safe_union_kernel_abstract.h
* Update dlib/type_safe_union/type_safe_union_kernel_abstract.h
* - added more tests for copy constructors/assignments, move constructors/assignments, and converting constructors/assignments
- helper_copy -> helper_forward
- added validate_type<T> in a couple of places
* - helper_move only takes non-const lvalue references. So we are not using std::move with universal references !
- use enable_if<is_valid<T>> in favor of validate_type<T>()
* - use enable_if<is_valid<T>> in favor of validate_type<T>()
* - added is_valid_check<>. This wraps enable_if<is_valid<T>,bool> and makes use of SFINAE more robust
Co-authored-by: pfeatherstone <peter@me>
Co-authored-by: pf <pf@me>
Co-authored-by: Davis E. King <davis685@gmail.com>
* Just minor cleanup of docs and renamed some stuff, tweaked formatting.
* fix spelling error
* fix most vexing parse error
Co-authored-by: Davis E. King <davis@dlib.net>
Co-authored-by: pfeatherstone <45853521+pfeatherstone@users.noreply.github.com>
Co-authored-by: pfeatherstone <peter@me>
Co-authored-by: pf <pf@me>
Co-authored-by: Davis E. King <davis685@gmail.com>
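A minimal sketch of the visit() note above: the visitor's result type can be computed with a small private-style trait built on decltype and std::declval, with no need for std::result_of or std::invoke_result. The names `return_type` and `return_type_t` come from the commit message; their definitions here, and the `int_or_string` and `size_visitor` types, are assumptions for illustration rather than dlib's type_safe_union.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Result type of calling F with a single argument of type T.
template <class F, class T>
struct return_type
{
    using type = decltype(std::declval<F>()(std::declval<T>()));
};

template <class F, class T>
using return_type_t = typename return_type<F, T>::type;

// Toy two-alternative union (hypothetical), just enough to show visit().
struct int_or_string
{
    bool holds_int;
    int i;
    std::string s;

    // The return type is taken from the visitor applied to the first
    // alternative; the visitor must return the same type for every alternative.
    template <class F>
    return_type_t<F, int&> visit(F&& f)
    {
        return holds_int ? f(i) : f(s);
    }
};

// Visitor with a non-void return type.
struct size_visitor
{
    std::size_t operator()(int&) const { return sizeof(int); }
    std::size_t operator()(std::string& str) const { return str.size(); }
};
```

With these definitions, `int_or_string{false, 0, "abc"}.visit(size_visitor{})` yields the string length, and the same call on an int-holding value yields `sizeof(int)`.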
* wip: dcgan-example
* wip: dcgan-example
* update example to use leaky_relu and remove bias from net
* wip
* it works!
* add more comments
* add visualization code
* add example documentation
* rename example
* fix comment
* better comment format
* fix the noise generator seed
* add message to hit enter for image generation
* fix srand, too
* add std::vector overload to update_parameters
* improve training stability
* better naming of variables
make sure it is clear we update the generator with the discriminator's
gradient using fake samples and true labels
* fix comment: generator -> discriminator
* update leaky_relu docs to match the relu ones
* replace not with !
* add Davis' suggestions to make training more stable
* use tensor instead of resizable_tensor
* do not use dnn_trainer for discriminator
* Add instance segmentation example - first version of training code
* Add MMOD options; get rid of the cache approach, and instead load all MMOD rects upfront
* Improve console output
* Set filter count
* Minor tweaking
* Inference - first version, at least compiles!
* Ignore overlapped boxes
* Ignore even small instances
* Set overlaps_ignore
* Add TODO remarks
* Revert "Set overlaps_ignore"
This reverts commit 65adeff1f8.
* Set result size
* Set label image size
* Take ignore-color into account
* Fix the cropping rect's aspect ratio; also slightly expand the rect
* Draw the largest findings last
* Improve masking of the current instance
* Add some perturbation to the inputs
* Simplify ground-truth reading; fix random cropping
* Read even class labels
* Tweak default minibatch size
* Learn only one class
* Really train only instances of the selected class
* Remove outdated TODO remark
* Automatically skip images with no detections
* Print to console what was found
* Fix class index problem
* Fix indentation
* Allow to choose multiple classes
* Draw rect in the color of the corresponding class
* Write detector window classes to ostream; also group detection windows by class (when ostreaming)
* Train a separate instance segmentation network for each class label
* Use separate synchronization file for each seg net of each class
* Allow more overlap
* Fix sorting criterion
* Fix interpolating the predicted mask
* Improve bilinear interpolation: if output type is an integer, round instead of truncating
* Add helpful comments
* Ignore large aspect ratios; refactor the code; tweak some network parameters
* Simplify the segmentation network structure; make the object detection network more complex in turn
* Problem: CUDA errors not reported properly to console
Solution: stop and join data loader threads even in case of exceptions
* Minor parameters tweaking
* Loss may have increased, even if prob_loss_increasing_thresh > prob_loss_increasing_thresh_max_value
* Add previous_loss_values_dump_amount to previous_loss_values.size() when deciding if loss has been increasing
* Improve behaviour when loss actually increased after disk sync
* Revert some of the earlier change
* Disregard dumped loss values only when deciding if learning rate should be shrunk, but *not* when deciding if loss has been going up since last disk sync
* Revert "Revert some of the earlier change"
This reverts commit 6c852124ef.
* Keep enough previous loss values, until the disk sync
* Fix maintaining the dumped (now "effectively disregarded") loss values count
* Detect cats instead of aeroplanes
* Add helpful logging
* Clarify the intention and the code
* Review fixes
* Add operator== for the other pixel types as well; remove the inline
* If available, use constexpr if
* Revert "If available, use constexpr if"
This reverts commit 503d4dd335.
* Simplify code as per review comments
* Keep estimating steps_without_progress, even if steps_since_last_learning_rate_shrink < iter_without_progress_thresh
* Clarify console output
* Revert "Keep estimating steps_without_progress, even if steps_since_last_learning_rate_shrink < iter_without_progress_thresh"
This reverts commit 9191ebc776.
* To keep the changes to a bare minimum, revert the steps_since_last_learning_rate_shrink change after all (at least for now)
* Even empty out some of the previous test loss values
* Minor review fixes
* Can't use C++14 features here
* Do not use the struct name as a variable name
* Add example of semantic segmentation using the PASCAL VOC2012 dataset
* Add note about Debug Information Format when using MSVC
* Make the upsampling layers residual as well
* Fix declaration order
* Use a wider net
* trainer.set_iterations_without_progress_threshold(5000); // (was 20000)
* Add residual_up
* Process entire directories of images (just easier to use)
* Simplify network structure so that builds finish even on Visual Studio (faster, or at all)
* Remove the training example from CMakeLists, because it's too much for the 32-bit MSVC++ compiler to handle
* Remove the probably-now-unnecessary set_dnn_prefer_smallest_algorithms call
* Review fix: remove the batch normalization layer from right before the loss
* Review fix: point out that only the Visual C++ compiler has problems.
Also expand the instructions how to run MSBuild.exe to circumvent the problems.
* Review fix: use dlib::match_endings
* Review fix: use dlib::join_rows. Also add some comments, and instructions where to download the pre-trained net from.
* Review fix: make formatting comply with dlib style conventions.
* Review fix: output training parameters.
* Review fix: remove #ifndef __INTELLISENSE__
* Review fix: use std::string instead of char*
* Review fix: update interpolation_abstract.h to say that extract_image_chips can now take the interpolation method as a parameter
* Fix whitespace formatting
* Add more comments
* Fix finding image files for inference
* Resize inference test output to the size of the input; add clarifying remarks
* Resize net output even in calculate_accuracy
* After all crop the net output instead of resizing it by interpolation
* For clarity, add an empty line in the console output
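A minimal sketch of the bilinear-interpolation change listed above (round instead of truncating when the output type is an integer). The function names `to_output` and `bilinear` are hypothetical and this is not dlib's interpolation code; it only illustrates why the rounding matters. No clamping to the output range is done here.

```cpp
#include <cmath>
#include <type_traits>

// Convert an interpolated floating-point value to the output pixel type.
// Integer outputs are rounded to the nearest value instead of truncated.
template <class Out>
typename std::enable_if<std::is_integral<Out>::value, Out>::type
to_output(double v)
{
    return static_cast<Out>(std::lround(v));
}

// Floating-point outputs are passed through unchanged.
template <class Out>
typename std::enable_if<!std::is_integral<Out>::value, Out>::type
to_output(double v)
{
    return static_cast<Out>(v);
}

// Bilinear blend of four neighbours (top-left, top-right, bottom-left,
// bottom-right); fx and fy are fractional offsets in [0, 1].
template <class Out>
Out bilinear(double tl, double tr, double bl, double br, double fx, double fy)
{
    const double top    = tl + fx * (tr - tl);
    const double bottom = bl + fx * (br - bl);
    return to_output<Out>(top + fy * (bottom - top));
}
```

For example, blending 0 and 1 at fx = 0.75 gives 0.75: a truncating cast to an 8-bit pixel would store 0, whereas rounding stores 1, which avoids a systematic darkening bias in integer images.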