OpenSceneGraph/src/osg/PositionAttitudeTransform.cpp

/* -*-c++-*- OpenSceneGraph - Copyright (C) 1998-2006 Robert Osfield
*
* This library is open source and may be redistributed and/or modified under
* the terms of the OpenSceneGraph Public License (OSGPL) version 0.0 or
* (at your option) any later version. The full license is in LICENSE file
* included with this distribution, and on the openscenegraph.org website.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* OpenSceneGraph Public License for more details.
*/
#include <osg/PositionAttitudeTransform>
using namespace osg;
// Change note, 2008-09-18, from Mathias Froehlich:
// "This is a generic optimization that does not depend on any CPU or
// instruction set. It is based on the observation that multiplying by a dense
// 4x4 matrix takes 4^3 operations, whereas multiplying by a translate or scale
// matrix takes only 4^2 operations, a gain of a factor of 4 for these special
// cases. The change implements these special cases, provides a unit test for
// the implementations, and replaces uses of the more expensive dense
// matrix-matrix routine with the specialized versions.
// Depending on the transform nodes in the scene graph, this change gives a
// noticeable improvement. For example, the osgforest code using
// MatrixTransform is about 20% slower than the same code path using
// PositionAttitudeTransform with this patch applied.
// If I remember right, the SSE-type optimizations did *not* provide a factor-4
// improvement. These changes are also entirely independent of any CPU or
// instruction set architecture, so I would prefer this kind of change over
// hand-coded, CPU-dependent assembly. If we need hand-tuned code, it can go on
// top of these changes by providing hand-optimized variants of the specialized
// versions, for an even better result in the end.
// Another change included here is to the rotation-matrix-from-quaternion code.
// There is a sqrt call which could be optimized away, since we divide in
// effect by sqrt(length)*sqrt(length), which is just length ... "
PositionAttitudeTransform::PositionAttitudeTransform():
    _scale(1.0,1.0,1.0)
{
}
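The quaternion remark in the change note can be illustrated with a small standalone sketch. It does not use OSG; `QuatSketch`, `Mat3`, and every function name below are hypothetical stand-ins, and it shows one reading of the note rather than the library's actual code: each term of the rotation matrix is a product of two quaternion components, so dividing by the squared length gives the same result as normalizing first and avoids the sqrt. The quaternion is assumed to be nonzero.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-in types; these are not osg::Quat or osg::Matrixd.
struct QuatSketch { double x, y, z, w; };
using Mat3 = std::array<std::array<double, 3>, 3>;

// Shared formula. The caller must pass s = 2 / |q|^2 for the given q.
Mat3 rotationFromScaled(const QuatSketch& q, double s)
{
    const double xx = s*q.x*q.x, yy = s*q.y*q.y, zz = s*q.z*q.z;
    const double xy = s*q.x*q.y, xz = s*q.x*q.z, yz = s*q.y*q.z;
    const double wx = s*q.w*q.x, wy = s*q.w*q.y, wz = s*q.w*q.z;
    Mat3 r{};
    r[0] = {1.0 - (yy + zz), xy - wz, xz + wy};
    r[1] = {xy + wz, 1.0 - (xx + zz), yz - wx};
    r[2] = {xz - wy, yz + wx, 1.0 - (xx + yy)};
    return r;
}

// Variant 1: normalize first (one sqrt), then s = 2 because |q| is now 1.
Mat3 rotationNormalizeFirst(QuatSketch q)
{
    const double len = std::sqrt(q.x*q.x + q.y*q.y + q.z*q.z + q.w*q.w);
    q.x /= len; q.y /= len; q.z /= len; q.w /= len;
    return rotationFromScaled(q, 2.0);
}

// Variant 2: no sqrt. Dividing each two-component product by |q|^2 is the
// same as dividing both factors by |q|.
Mat3 rotationNoSqrt(const QuatSketch& q)
{
    const double len2 = q.x*q.x + q.y*q.y + q.z*q.z + q.w*q.w;
    return rotationFromScaled(q, 2.0 / len2);
}

// Largest element-wise difference between two matrices.
double maxAbsDiff(const Mat3& a, const Mat3& b)
{
    double d = 0.0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            d = std::fmax(d, std::fabs(a[i][j] - b[i][j]));
    return d;
}
```

Both variants should agree to rounding error for any nonzero input, including quaternions that are far from unit length.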
// Local-to-world: composes T(-pivotPoint) * S(scale) * R(attitude) * T(position)
// in OSG's row-vector convention (v' = v * M).
bool PositionAttitudeTransform::computeLocalToWorldMatrix(Matrix& matrix,NodeVisitor*) const
{
    if (_referenceFrame==RELATIVE_RF)
    {
        // Pre-multiply the local transform onto the accumulated parent matrix.
        matrix.preMultTranslate(_position);
        matrix.preMultRotate(_attitude);
        matrix.preMultScale(_scale);
        matrix.preMultTranslate(-_pivotPoint);
    }
    else // absolute
    {
        // Discard the incoming matrix and build the local transform from scratch.
        matrix.makeRotate(_attitude);
        matrix.postMultTranslate(_position);
        matrix.preMultScale(_scale);
        matrix.preMultTranslate(-_pivotPoint);
    }
    return true;
}
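The `preMult*` calls above rely on the observation that multiplying by a translate or scale matrix touches far fewer elements than a dense 4x4 product. A minimal standalone sketch of that idea for the translate case follows. It assumes the row-vector convention (translation stored in the last row); `Mat4` and all function names are hypothetical and are not the OSG implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>;

Mat4 identity()
{
    Mat4 m{};  // value-initialized to all zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

// Translation matrix for row vectors (v' = v * M): offsets sit in the last row.
Mat4 translation(double tx, double ty, double tz)
{
    Mat4 m = identity();
    m[3][0] = tx; m[3][1] = ty; m[3][2] = tz;
    return m;
}

// General product a * b: 4 * 4 * 4 = 64 multiplications.
Mat4 denseMultiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// m = T(tx, ty, tz) * m without building T. Rows 0..2 of T are identity rows,
// so only the last row of m changes: 12 multiplications instead of 64.
void preMultTranslateSketch(Mat4& m, double tx, double ty, double tz)
{
    for (int j = 0; j < 4; ++j)
        m[3][j] += tx * m[0][j] + ty * m[1][j] + tz * m[2][j];
}

// Largest element-wise difference between two matrices.
double maxAbsDiff(const Mat4& a, const Mat4& b)
{
    double d = 0.0;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            d = std::fmax(d, std::fabs(a[i][j] - b[i][j]));
    return d;
}
```

Comparing `preMultTranslateSketch` against `denseMultiply(translation(...), m)` on the same input is a quick way to confirm that the shortcut computes the same product.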
// World-to-local: the inverse of the local-to-world transform, composed as
// T(-position) * R(attitude)^-1 * S(1/scale) * T(pivotPoint).
bool PositionAttitudeTransform::computeWorldToLocalMatrix(Matrix& matrix,NodeVisitor*) const
{
    // A zero scale component cannot be inverted, so report failure.
    if (_scale.x() == 0.0 || _scale.y() == 0.0 || _scale.z() == 0.0)
        return false;

    if (_referenceFrame==RELATIVE_RF)
    {
        // Post-multiply the inverse steps onto the accumulated matrix.
        matrix.postMultTranslate(-_position);
        matrix.postMultRotate(_attitude.inverse());
        matrix.postMultScale(Vec3d(1.0/_scale.x(), 1.0/_scale.y(), 1.0/_scale.z()));
        matrix.postMultTranslate(_pivotPoint);
    }
    else // absolute
    {
        // Discard the incoming matrix and build the inverse from scratch.
        matrix.makeRotate(_attitude.inverse());
        matrix.preMultTranslate(-_position);
        matrix.postMultScale(Vec3d(1.0/_scale.x(), 1.0/_scale.y(), 1.0/_scale.z()));
        matrix.postMultTranslate(_pivotPoint);
    }
    return true;
}
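As a sanity check on the two functions above, the following standalone sketch composes the same sequence of translate, scale, and rotate steps with plain 4x4 matrices and verifies that the world-to-local product undoes the local-to-world product. It is not OSG code: `PatSketch`, `Mat4`, and the helper names are hypothetical, and the attitude is simplified to a rotation about the Z axis instead of a quaternion.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>;

Mat4 identity()
{
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Row-vector convention (v' = v * M): translation sits in the last row.
Mat4 translation(double x, double y, double z)
{
    Mat4 m = identity();
    m[3][0] = x; m[3][1] = y; m[3][2] = z;
    return m;
}

Mat4 scaling(double x, double y, double z)
{
    Mat4 m = identity();
    m[0][0] = x; m[1][1] = y; m[2][2] = z;
    return m;
}

// Rotation about Z; rotationZ(-a) is the inverse of rotationZ(a).
Mat4 rotationZ(double a)
{
    Mat4 m = identity();
    m[0][0] = std::cos(a);  m[0][1] = std::sin(a);
    m[1][0] = -std::sin(a); m[1][1] = std::cos(a);
    return m;
}

// Hypothetical stand-in for the node's state (position, attitude, scale, pivot).
struct PatSketch
{
    double px, py, pz;   // position
    double angleZ;       // attitude, simplified to one angle
    double sx, sy, sz;   // scale
    double vx, vy, vz;   // pivot point
};

// T(-pivot) * S(scale) * R(attitude) * T(position)
Mat4 localToWorld(const PatSketch& p)
{
    return multiply(multiply(translation(-p.vx, -p.vy, -p.vz), scaling(p.sx, p.sy, p.sz)),
                    multiply(rotationZ(p.angleZ), translation(p.px, p.py, p.pz)));
}

// T(-position) * R^-1 * S(1/scale) * T(pivot); fails when a scale component is zero.
bool worldToLocal(const PatSketch& p, Mat4& out)
{
    if (p.sx == 0.0 || p.sy == 0.0 || p.sz == 0.0)
        return false;
    out = multiply(multiply(translation(-p.px, -p.py, -p.pz), rotationZ(-p.angleZ)),
                   multiply(scaling(1.0/p.sx, 1.0/p.sy, 1.0/p.sz), translation(p.vx, p.vy, p.vz)));
    return true;
}

// Largest element-wise difference between two matrices.
double maxAbsDiff(const Mat4& a, const Mat4& b)
{
    double d = 0.0;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            d = std::fmax(d, std::fabs(a[i][j] - b[i][j]));
    return d;
}
```

Multiplying `localToWorld(p)` by the matrix produced by `worldToLocal(p, out)` should give the identity to rounding error, and `worldToLocal` should return false as soon as any scale component is zero.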