set.
The optimization is based on the observation that matrix-matrix multiplication
with a dense 4x4 matrix takes 4^3 operations, whereas multiplication with a
transform or scale matrix takes only 4^2 operations, which is a gain of a
*FACTOR*4* for these special cases.
The change implements these special cases, provides a unit test for these
implementations and replaces uses of the more expensive dense matrix-matrix
routine with the specialized versions.
Depending on the transform nodes in the scenegraph this change gives a
noticeable improvement.
For example, with this patch applied, the osgforest code using the
MatrixTransform is about 20% slower than the same codepath using the
PositionAttitudeTransform instead.
If I remember right, the sse type optimizations did *not* provide a factor 4
improvement. Also these changes are totally independent of any cpu or
instruction set architecture. So I would prefer to have this current kind of
change instead of some hand coded and cpu dependent assembly stuff. If we
need that hand tuned stuff, it can go on top of these changes, and must
then provide additional hand optimized variants of the specialized versions
to give an even better result in the end.
Another change included here is a change to the rotation matrix from
quaternion code. There is a sqrt call which could be optimized away. Since we
divide in effect by sqrt(length)*sqrt(length) which is just length ...
"
Attached is a fixed version of OverlayNode.cpp. I fixed the CustomPolytope::cut( osg::Plane ) method. The bug was apparent in the following scenario:
Let P1 be some random frustum polytope
Let P2 be the polytope that was created from P1 bounding box (P2 contains P1 entirely)
Then ignoring precision errors: P1.cut( P2 ) == P2.cut( P1 ) == P1. But this condition was not always met. The cut failed when some of the polytope reference points happened to lie exactly on some intersecting planes in both P1 & P2 (plane distance was exactly 0).
I only use CustomPolytope for my shadowing stuff so I did not test how this affects the rest of OverlayNode.cpp.
----2----
Also attached is a minor precision improvement for the osg::Plane intersect method (double version).
----3----
I also have one observation regarding osg::Plane - There are two intersect vertices methods (float and double flavours):
inline int intersect(const std::vector<Vec3>& vertices) const
inline int intersect(const std::vector<Vec3d>& vertices) const
I guess osg::Plane won't compile when someone changes the default Vec3 typedef to Vec3d. Shouldn't the first method be changed to use Vec3f explicitly? I.e.:
inline int intersect(const std::vector<Vec3f>& vertices) const"
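The overload concern above can be sketched with stand-in types. Everything here (the Vec3f, Vec3d and Plane structs, the classify helper, and the return convention) is hypothetical and only mirrors the shape of the problem; it is not the osg::Plane source.

```cpp
#include <vector>

// Hypothetical stand-ins for the float and double vector types.
struct Vec3f { float x, y, z; };
struct Vec3d { double x, y, z; };

// Hypothetical configurable default. If this were changed to
//   typedef Vec3d Vec3;
// then overloads on std::vector<Vec3> and std::vector<Vec3d> would
// share one signature and the class would not compile.
typedef Vec3f Vec3;

struct Plane {
    double a, b, c, d;  // plane equation: a*x + b*y + c*z + d = 0

    double distance(double x, double y, double z) const {
        return a * x + b * y + c * z + d;
    }

    // Returns 0 if vertices lie on both sides, 1 if at least one is
    // above and none below, -1 otherwise. Points exactly on the plane
    // count as neither above nor below.
    template <typename V>
    int classify(const std::vector<V>& vertices) const {
        unsigned above = 0, below = 0;
        for (const V& v : vertices) {
            const double dist = distance(v.x, v.y, v.z);
            if (dist > 0.0) ++above;
            else if (dist < 0.0) ++below;
        }
        if (above > 0) return below > 0 ? 0 : 1;
        return -1;
    }

    // Overloads name Vec3f and Vec3d explicitly, so they stay distinct
    // whichever type Vec3 aliases.
    int intersect(const std::vector<Vec3f>& v) const { return classify(v); }
    int intersect(const std::vector<Vec3d>& v) const { return classify(v); }
};
```

Because neither overload mentions the Vec3 alias, flipping the typedef does not affect overload resolution in this sketch.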
to bugfixes in osg::Polytope.setToUnitFrustum and setToBoundingBox. It was
sent at the beginning of December. I read it when purging my Trash emails and
found it there because it was wrongly classified as SPAM.
What struck me in this email was the fact that there was once an error in
the Polytope class. Since I adopted CustomPolytope (osgSim OverlayNode.cpp)
for my minimal shadow area computations I checked my code for this error. And
I found it in the CustomPolytope::setToUnitFrustum method.
CustomPolytope::setToBoundingBox seemed OK.
So I went back to the origin and fixed this error in OverlayNode.cpp as
well. I have not tested it in OverlayNode though (I don't know how) so
please look at this carefully. But it seems to work fine with my shadow
calculations."
of the OverlayNode.
I change the overlay subgraph dynamically and when I remove all the
subgraph nodes that are inside the current main camera FOV (others
outside still exist), the overlay texture does not update because of the
early return in the traversal. I then get a kind of ghost texture moving
around the terrain.
The attached file fixes the problem for me, but I'm not sure if it is
the best way to address the problem."