set.
The optimization is based on the observation that matrix matrix multiplication
with a dense matrix 4x4 is 4^3 Operations whereas multiplication with a
transform, or scale matrix is only 4^2 operations. Which is a gain of a
*FACTOR*4* for these special cases.
The change implements these special cases, provides a unit test for these
implementation and converts uses of the expensiver dense matrix matrix
routine with the specialized versions.
Depending on the transform nodes in the scenegraph this change gives a
noticable improovement.
For example the osgforest code using the MatrixTransform is about 20% slower
than the same codepath using the PositionAttitudeTransform instead of the
MatrixTransform with this patch applied.
If I remember right, the sse type optimizations did *not* provide a factor 4
improovement. Also these changes are totally independent of any cpu or
instruction set architecture. So I would prefer to have this current kind of
change instead of some hand coded and cpu dependent assembly stuff. If we
need that hand tuned stuff, these can go on top of this changes which must
provide than hand optimized additional variants for the specialized versions
to give a even better result in the end.
An other change included here is a change to rotation matrix from quaterion
code. There is a sqrt call which couold be optimized away. Since we divide in
effect by sqrt(length)*sqrt(length) which is just length ...
"
they are written to disk, either inline or as an external file. Added support for
this in the .ive plugin. Default of WriteHint is NO_PREFERNCE, in which case it's
up to the reader/writer to decide.
"I made several modifications:
* The cause of my errors was that my OSG source directory path contains spaces. To fix this issue I wrapped all paths with quotes, as stated in doxygen documentation.
* I also received some warning messages about deprecated doxygen settings, which I fixed by updating the doxygen file, i.e. running \u2018doxygen \u2013u doxygen.cmake\u2018. By running this command deprecated doxygen options are removed, some option comments have changed and quite some options have been added (I kept their default settings unless mentioned).
* I was surprised to find that the doxygen OUTPUT_DIRECTORY was set to \u201c${OpenSceneGraph_SOURCE_DIR}/doc\u201d, which does not seem appropriate for out of source builds; I changed this to \u201c${OpenSceneGraph_BINARY_DIR}/doc\u201d. (On the other hand, maybe a cmake selectable option should be given to the user?)
* Fixed two warnings I received about unexpected end-of-list-markers in \u2018osg\AnimationPath and \u2018osgUtil\CullVisitor due to excess trailing points in comments.
* Fixed a warning in osgWidget\StyleInterface due to an #include directive (strangely) placed inside a namespace.
* Fixed a warning in osg\Camera due to the META_Object macro that confused doxygen. Adding a semi-colon fixed this.
* Removed auto_Mainpage from the INCLUDE option, because I am positive that this file does not belong there; It never generated useful documentation anyway.
* I added the OSG version number environment variable to the PROJECT_NUMBER option so that the version number is now shown on the main page of generated documentation (e.g. index.html).
* Changed option FULL_PATH_NAMES to YES, but made sure STRIP_FROM_PATH stripped the absolute path until the include dir. This fixed an issue that created mangled names for identical filenames in different directories. E.g. osg/Export and osgDB/Export are now correctly named.
* Changed option SHOW_DIRECTORIES to yes, which is a case of preference I guess.
"
New attribute DatabasePager::_expiryFrames sets number of frames a PagedLOD child is kept in memory. The attribute is set with DatabasePager::setExpiryFrames method or OSG_EXPIRY_FRAMES environmental variable.
New attribute PagedLOD::PerRangeData::_
frameNumber contains frame number of last cull traversal.
Children of PagedLOD are expired when time _AND_ number of frames since last cull traversal exceed OSG_EXPIRY_DELAY _AND_ OSG_EXPIRY_FRAMES respectively. By default OSG_EXPIRY_FRAMES = 1 which means that nodes from last cull/rendering
traversal will not be expired even if last cull time exceeds OSG_EXPIRY_DELAY. Setting OSG_EXPIRY_FRAMES = 0 revokes previous behaviour of PagedLOD.
Setting OSG_EXPIRY_FRAMES > 0 fixes problems of children reloading in lazy rendering applications. Required behaviour is achieved by manipulating OSG_EXPIRY_DELAY and OSG_EXPIRY_FRAMES together.
Two interface changes are made:
DatabasePager::updateSceneGraph(double currentFrameTime) is replaced by DatabasePager::updateSceneGraph(const osg::FrameStamp &frameStamp). The previous method is in #if 0 clause in the header file. Robert, decide if You want to include it.
PagedLOD::removeExpiredChildren(double expiryTime, NodeList &removedChildren) is deprecated (warning is printed), when subclassing use PagedLOD::removeExpiredChildren(double expiryTime, int expiryFrame, NodeList &removedChildren) instead. "
implementation of the atomic increment and decrement into a implementation
file.
This way inlining and compiler optimization can no longer happen for these
implementations, but it fixes compilation on win32 msvc targets. I expect
that this is still faster than with with mutexes.
Also the i386 gcc target gets atomic operations with this patch. By using an
implementation file we can guarantee that we have the right compiler flags
available."
"I have taken the liberty of updating a few files so that there is no longer any derivation from std::vector. I have done this by adding a new file osg/MixinVector and by updating only two others: osg/PrimitiveSet and osg/Array. You will notice that this actually removes what is acknowledged as a \u2018hack\u2019 in osg/PrimitiveSet.
With the original code I did manage to find memory leaks with some compiler options on VC 8 and 9, as well as Intel compiler. I determined the leak existence by instrumenting the destructor code, and by use of a garbage collector as a leak detector (in a similar manner to the Firefox project). Hence in contrast to what I said originally, it is exhibiting symptoms on at least some platforms.
Since I am trying to be a good OSG citizen I got out my editor and started hacking! I have built and tested on Linux (Ubuntu) with GCC 4.x and Windows VC 8 SP1. It appears that nothing is broken, and that I\u2019m using less memory J"
via StateSet::setNestedRenderBin(bool) whether the new RenderBin should be nested
with the existing RenderBin, or be nested with the enclosing RenderStage.
GL_GENERATE_MIPMAP_SGIS is very slow (over half a second for a 720*576
texture). However, glGenerateMipmapEXT() performs well (16ms for the
same texture), so I have modified the attached files to use
Texture::generateMipmap() if glGenerateMipmapEXT is supported, instead
of enabling & disabling GL_GENERATE_MIPMAP_SGIS."
Notes, from Robert Osfield, I've tested the out of the previous path using
GL_GENERATE_MIPMAP_SGIS and non power of two textures on NVidia 7800GT and
Nvidia linux drivers with the image size 720x576 and only get compile times
of 56ms, so the above half second speed looks to be a driver bug. With
Muchael's changes the cost goes done to less than 5ms, so it's certainly
an effective change, even given that Michael's poor expereiences with
GL_GENERATE_MIP_SGIS do look to be a driver bug.
- Solves issues of loading image data into the texture memory
- Print a warning if images are of different dimensions or have different internal formats (GL specification requires images to be the same)
Patch is tested and seems to work fine. It shouldn't break any other functionality. It should go into include/osg and src/osg
"
multi-threaded paging, where the Pager manages threads of reading local
and http files via seperate threads. This makes it possible to smoothly
browse large databases where parts of the data are locally cached while
others are on a remote server. Previously with this type of dataset
the pager would stall all paging while http requests were being served,
even when parts of the models are still loadable virtue of being in the
local cache.
Also as part of the refactoring the DatabaseRequest are now stored in the
ProxyNode/PagedLOD nodes to facilitate quite updating in the cull traversal,
with the new code avoiding mutex locks and searches. Previous on big
databases the overhead involved in make database requests could accumulate
to a point where it'd cause the cull traversal to break frame. The overhead
now is negligable.
Finally OSG_FILE_CACHE support has been moved from the curl plugin into
the DatabasePager. Eventually this functionality will be moved out into
osgDB for more general usage.
from: DEEP_COPY_STATESETS = 8,
to: DEEP_COPY_STATESETS = 1<<3,
showing clearly that this isn't the _value_ 8, but the _bit_ 8. this is an old pattern i see (and like to promulgate) to make code a bit more readable and maintainable.
"
From Robert Osfield, refactored the FrameBufferObejcts::_drawBuffers set up so that its done
within the setAttachment method to avoid potential threading/execution order issues.
Introduced code in BoundgingSphere, BoundingBox, ProxyNode and LOD to utilise the above settings.
Added Matrix::value_type, Plane::value_type, BoundingSphere::value_type and BoundingBox::value_type command line
options that report where the types of floats or doubles.
and a new scheme for computing the scaling when using autoscale that introduces smooth
transitions to the scaling of the subgraph so that it looks more natural.
Attached is a fixed version of OverlayNode.cpp. I fixed CustomPolytope::cut( osg::Plane ) method. Bug was apparent in such scenario:
Let P1 be some random frustum polytope
Let P2 be the polytope that was created from P1 bounding box (P2 contains P1 entirely)
Then ignoring precision errors: P1.cut( P2 ) == P2.cut( P1 ) == P1. But this condition was not always met. Cut failed when some of the polytope reference points happened to lie exactly on some intersecting planes in both P1 & P2 (plane distance was = 0).
I only use CustomPolytope for my shadowing stuff so I did not test how this affects rest of OverlayNode.cpp.
----2----
Also attached is a minor precision improvement for osg::Plane intersect method (double version).
----3----
I have also one observation regarding osg::Plane - There are two intersect vertices methods (float and double flavour):
inline int intersect(const std::vector<Vec3>& vertices) const
inline int intersect(const std::vector<Vec3d>& vertices) const
I guess osg::Plane won't compile when someone changes default vec3 typedef to vec3d. Shouldn't the first method be changed to use vec3f explicitly ? Ie:
inline int intersect(const std::vector<Vec3f>& vertices) const"
pbuffer functions or exactly ask for the extensions we need to call the
apropriate glx extension functions for and around pbuffers extensions.
The glx 1.3 version of this functios are prefered. If this is not pressent we
are looking for the glx extensions and check for them.
Prevously we just used some mix of the glx 1.3 functions or the extension
functions without making sure that this extension is present.
"
creating subclasses of osg::Array that referenced data
stored an application's internal data structures. I took
a stab at implementing that and ran into a couple of
downcasts in Geometry.cpp. Enclosed is my take at fixing
those along with a simple example of how to do this."
StateSet::removeAssociatedModes(const StateAttribute*)
and a
StateSet::removeAssociatedTextureModes(unsigned, const StateAttribute*)
call. These funktions are just missing for a complete api IMO."