set.
The optimization is based on the observation that matrix-matrix multiplication
with a dense 4x4 matrix takes 4^3 operations, whereas multiplication with a
transform or scale matrix takes only 4^2 operations, which is a gain of a
*FACTOR 4* for these special cases.
The change implements these special cases, provides a unit test for these
implementations, and replaces uses of the more expensive dense matrix-matrix
routine with the specialized versions.
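A minimal sketch of the idea, not the code from this submission: the Mat4 type, the function names and the row-vector convention (translation stored in the last row) are assumptions made for illustration. It compares a dense product with specialized post-multiplication by a translate or scale matrix.

```cpp
#include <array>
#include <cmath>

// Row-major 4x4 matrix; row-vector convention (v' = v * M) is assumed here.
using Mat4 = std::array<std::array<double, 4>, 4>;

// General dense product: 4 * 4 * 4 = 64 multiply-adds.
Mat4 multDense(const Mat4& a, const Mat4& b)
{
    Mat4 r{};  // value-initialized to all zeros
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Translation matrix with the offset stored in the last row.
Mat4 makeTranslate(double x, double y, double z)
{
    Mat4 t{};
    for (int i = 0; i < 4; ++i) t[i][i] = 1.0;
    t[3][0] = x; t[3][1] = y; t[3][2] = z;
    return t;
}

// Scale matrix with the factors on the diagonal.
Mat4 makeScale(double x, double y, double z)
{
    Mat4 s{};
    s[0][0] = x; s[1][1] = y; s[2][2] = z; s[3][3] = 1.0;
    return s;
}

// m = m * translate(x, y, z), using only 12 multiply-adds.
void postMultTranslate(Mat4& m, double x, double y, double z)
{
    const double t[3] = {x, y, z};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 3; ++j)
            m[i][j] += m[i][3] * t[j];
}

// m = m * scale(x, y, z), using only 12 multiplications.
void postMultScale(Mat4& m, double x, double y, double z)
{
    const double s[3] = {x, y, z};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 3; ++j)
            m[i][j] *= s[j];
}

// Element-wise comparison with an absolute tolerance.
bool nearlyEqual(const Mat4& a, const Mat4& b, double eps = 1e-12)
{
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            if (std::fabs(a[i][j] - b[i][j]) > eps) return false;
    return true;
}
```

The specialized routines touch each of the 12 affected elements once, which is where the roughly fourfold saving over the dense triple loop comes from.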
Depending on the transform nodes in the scenegraph this change gives a
noticeable improvement.
For example the osgforest code using the MatrixTransform is about 20% slower
than the same codepath using the PositionAttitudeTransform instead of the
MatrixTransform with this patch applied.
If I remember right, the sse type optimizations did *not* provide a factor 4
improvement. Also these changes are totally independent of any cpu or
instruction set architecture. So I would prefer to have this current kind of
change instead of some hand coded and cpu dependent assembly stuff. If we
need that hand tuned stuff, it can go on top of these changes, which must
then provide additional hand optimized variants of the specialized versions
to give an even better result in the end.
Another change included here is a change to the rotation matrix from quaternion
code. There is a sqrt call which could be optimized away, since we divide in
effect by sqrt(length)*sqrt(length) which is just length ...
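A hedged sketch of the sqrt removal, not the submitted code: the Quat and Mat3 types, the function names and the column-vector matrix layout are assumptions for illustration. Normalizing the quaternion divides every product of two components by sqrt(len2)*sqrt(len2), so the same result follows from scaling the products by 2/len2 directly.

```cpp
#include <array>
#include <cmath>

struct Quat { double x, y, z, w; };
using Mat3 = std::array<std::array<double, 3>, 3>;

// Fills a rotation matrix (column-vector convention assumed) from the
// quaternion component products, each scaled by s.
Mat3 buildRotation(const Quat& q, double s)
{
    const double xx = s * q.x * q.x, yy = s * q.y * q.y, zz = s * q.z * q.z;
    const double xy = s * q.x * q.y, xz = s * q.x * q.z, yz = s * q.y * q.z;
    const double wx = s * q.w * q.x, wy = s * q.w * q.y, wz = s * q.w * q.z;

    Mat3 m{};
    m[0][0] = 1.0 - (yy + zz); m[0][1] = xy - wz;         m[0][2] = xz + wy;
    m[1][0] = xy + wz;         m[1][1] = 1.0 - (xx + zz); m[1][2] = yz - wx;
    m[2][0] = xz - wy;         m[2][1] = yz + wx;         m[2][2] = 1.0 - (xx + yy);
    return m;
}

// Variant with sqrt: normalize first, then use the factor 2.
// Assumes a non-zero quaternion.
Mat3 rotationWithSqrt(const Quat& q)
{
    const double len = std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w);
    const Quat n = {q.x / len, q.y / len, q.z / len, q.w / len};
    return buildRotation(n, 2.0);
}

// Variant without sqrt: fold the normalization into the scale factor.
// A zero quaternion yields the identity matrix.
Mat3 rotationWithoutSqrt(const Quat& q)
{
    const double len2 = q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w;
    const double s = (len2 > 0.0) ? 2.0 / len2 : 0.0;
    return buildRotation(q, s);
}
```

Both variants agree up to rounding; the second avoids the sqrt and the four divisions.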
"
they are written to disk, either inline or as an external file. Added support for
this in the .ive plugin. Default of WriteHint is NO_PREFERENCE, in which case it's
up to the reader/writer to decide.
"I made several modifications:
* The cause of my errors was that my OSG source directory path contains spaces. To fix this issue I wrapped all paths with quotes, as stated in doxygen documentation.
* I also received some warning messages about deprecated doxygen settings, which I fixed by updating the doxygen file, i.e. running ‘doxygen -u doxygen.cmake’. By running this command deprecated doxygen options are removed, some option comments have changed and quite some options have been added (I kept their default settings unless mentioned).
* I was surprised to find that the doxygen OUTPUT_DIRECTORY was set to “${OpenSceneGraph_SOURCE_DIR}/doc”, which does not seem appropriate for out of source builds; I changed this to “${OpenSceneGraph_BINARY_DIR}/doc”. (On the other hand, maybe a cmake selectable option should be given to the user?)
* Fixed two warnings I received about unexpected end-of-list-markers in ‘osg\AnimationPath’ and ‘osgUtil\CullVisitor’ due to excess trailing points in comments.
* Fixed a warning in osgWidget\StyleInterface due to an #include directive (strangely) placed inside a namespace.
* Fixed a warning in osg\Camera due to the META_Object macro that confused doxygen. Adding a semi-colon fixed this.
* Removed auto_Mainpage from the INCLUDE option, because I am positive that this file does not belong there; it never generated useful documentation anyway.
* I added the OSG version number environment variable to the PROJECT_NUMBER option so that the version number is now shown on the main page of generated documentation (e.g. index.html).
* Changed option FULL_PATH_NAMES to YES, but made sure STRIP_FROM_PATH stripped the absolute path until the include dir. This fixed an issue that created mangled names for identical filenames in different directories. E.g. osg/Export and osgDB/Export are now correctly named.
* Changed option SHOW_DIRECTORIES to yes, which is a case of preference I guess.
"
New attribute DatabasePager::_expiryFrames sets the number of frames a PagedLOD child is kept in memory. The attribute is set with the DatabasePager::setExpiryFrames method or the OSG_EXPIRY_FRAMES environment variable.
New attribute PagedLOD::PerRangeData::_frameNumber contains the frame number of the last cull traversal.
Children of PagedLOD are expired when time _AND_ number of frames since last cull traversal exceed OSG_EXPIRY_DELAY _AND_ OSG_EXPIRY_FRAMES respectively. By default OSG_EXPIRY_FRAMES = 1, which means that nodes from the last cull/rendering traversal will not be expired even if the last cull time exceeds OSG_EXPIRY_DELAY. Setting OSG_EXPIRY_FRAMES = 0 restores the previous behaviour of PagedLOD.
Setting OSG_EXPIRY_FRAMES > 0 fixes problems of children reloading in lazy rendering applications. The required behaviour is achieved by adjusting OSG_EXPIRY_DELAY and OSG_EXPIRY_FRAMES together.
Two interface changes are made:
DatabasePager::updateSceneGraph(double currentFrameTime) is replaced by DatabasePager::updateSceneGraph(const osg::FrameStamp &frameStamp). The previous method is in an #if 0 clause in the header file. Robert, decide if you want to include it.
PagedLOD::removeExpiredChildren(double expiryTime, NodeList &removedChildren) is deprecated (warning is printed), when subclassing use PagedLOD::removeExpiredChildren(double expiryTime, int expiryFrame, NodeList &removedChildren) instead. "
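The expiry rule described above can be sketched as follows. This is an illustration under assumptions, not the pager code: ChildStateSketch and isExpired are hypothetical names, and the strict "greater than" comparisons are one reading of "exceed".

```cpp
// Per-child bookkeeping, loosely modelled on a time stamp plus a frame number
// recorded at the last cull traversal (names are hypothetical).
struct ChildStateSketch
{
    double lastCullTime;   // time of the last cull traversal that saw the child
    int    lastCullFrame;  // frame number of that traversal
};

// A child expires only when BOTH the elapsed time and the elapsed frame count
// exceed their limits.
bool isExpired(const ChildStateSketch& child,
               double currentTime, int currentFrame,
               double expiryDelay, int expiryFrames)
{
    const bool timeExceeded   = (currentTime - child.lastCullTime) > expiryDelay;
    const bool framesExceeded = (currentFrame - child.lastCullFrame) > expiryFrames;
    return timeExceeded && framesExceeded;
}
```

With expiryFrames = 1 a child seen in the previous frame is kept however much wall-clock time has passed, which is the lazy rendering case; with expiryFrames = 0 the frame test passes for any child not seen in the current frame, so the time test alone decides.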
implementation of the atomic increment and decrement into an implementation
file.
This way inlining and compiler optimization can no longer happen for these
implementations, but it fixes compilation on win32 msvc targets. I expect
that this is still faster than with mutexes.
Also the i386 gcc target gets atomic operations with this patch. By using an
implementation file we can guarantee that we have the right compiler flags
available."
"I have taken the liberty of updating a few files so that there is no longer any derivation from std::vector. I have done this by adding a new file osg/MixinVector and by updating only two others: osg/PrimitiveSet and osg/Array. You will notice that this actually removes what is acknowledged as a ‘hack’ in osg/PrimitiveSet.
With the original code I did manage to find memory leaks with some compiler options on VC 8 and 9, as well as Intel compiler. I determined the leak existence by instrumenting the destructor code, and by use of a garbage collector as a leak detector (in a similar manner to the Firefox project). Hence in contrast to what I said originally, it is exhibiting symptoms on at least some platforms.
Since I am trying to be a good OSG citizen I got out my editor and started hacking! I have built and tested on Linux (Ubuntu) with GCC 4.x and Windows VC 8 SP1. It appears that nothing is broken, and that I’m using less memory :-)"
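A minimal sketch of the mixin approach, assuming a much smaller interface than the real osg/MixinVector: the class names MixinVectorSketch and IndexListSketch are hypothetical. The point is that std::vector has no virtual destructor, so the vector is held as a member and its interface is forwarded instead of being inherited.

```cpp
#include <vector>

// Holds a std::vector as a member and forwards a small part of its interface.
// Unlike std::vector, this class has a virtual destructor, so deleting a
// subclass through a base pointer is well defined.
template <class T>
class MixinVectorSketch
{
public:
    typedef std::vector<T> vector_type;
    typedef typename vector_type::size_type size_type;
    typedef typename vector_type::iterator iterator;
    typedef typename vector_type::const_iterator const_iterator;

    MixinVectorSketch() {}
    explicit MixinVectorSketch(size_type n, const T& value = T()) : _impl(n, value) {}
    virtual ~MixinVectorSketch() {}

    bool empty() const { return _impl.empty(); }
    size_type size() const { return _impl.size(); }
    void push_back(const T& value) { _impl.push_back(value); }
    void clear() { _impl.clear(); }

    T& operator[](size_type i) { return _impl[i]; }
    const T& operator[](size_type i) const { return _impl[i]; }

    iterator begin() { return _impl.begin(); }
    iterator end() { return _impl.end(); }
    const_iterator begin() const { return _impl.begin(); }
    const_iterator end() const { return _impl.end(); }

    // Escape hatch for code that needs the underlying std::vector.
    vector_type& asVector() { return _impl; }
    const vector_type& asVector() const { return _impl; }

private:
    vector_type _impl;
};

// Example subclass that adds behaviour on top of the forwarded interface.
class IndexListSketch : public MixinVectorSketch<unsigned int>
{
public:
    unsigned int maxIndex() const
    {
        unsigned int best = 0;
        for (const_iterator it = begin(); it != end(); ++it)
            if (*it > best) best = *it;
        return best;
    }
};
```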
via StateSet::setNestedRenderBin(bool) whether the new RenderBin should be nested
with the existing RenderBin, or be nested with the enclosing RenderStage.
GL_GENERATE_MIPMAP_SGIS is very slow (over half a second for a 720*576
texture). However, glGenerateMipmapEXT() performs well (16ms for the
same texture), so I have modified the attached files to use
Texture::generateMipmap() if glGenerateMipmapEXT is supported, instead
of enabling & disabling GL_GENERATE_MIPMAP_SGIS."
Notes from Robert Osfield: I've tested the previous path using
GL_GENERATE_MIPMAP_SGIS and non power of two textures on an NVidia 7800GT with
NVidia Linux drivers with the image size 720x576 and only get compile times
of 56ms, so the above half second speed looks to be a driver bug. With
Michael's changes the cost goes down to less than 5ms, so it's certainly
an effective change, even given that Michael's poor experiences with
GL_GENERATE_MIPMAP_SGIS do look to be a driver bug.
- Solves issues of loading image data into the texture memory
- Print a warning if images are of different dimensions or have different internal formats (GL specification requires images to be the same)
Patch is tested and seems to work fine. It shouldn't break any other functionality. It should go into include/osg and src/osg
"
multi-threaded paging, where the Pager manages the reading of local
and http files via separate threads. This makes it possible to smoothly
browse large databases where parts of the data are locally cached while
others are on a remote server. Previously with this type of dataset
the pager would stall all paging while http requests were being served,
even when parts of the models were still loadable by virtue of being in the
local cache.
Also as part of the refactoring the DatabaseRequests are now stored in the
ProxyNode/PagedLOD nodes to facilitate quick updating in the cull traversal,
with the new code avoiding mutex locks and searches. Previously on big
databases the overhead involved in making database requests could accumulate
to a point where it'd cause the cull traversal to break frame. The overhead
is now negligible.
Finally OSG_FILE_CACHE support has been moved from the curl plugin into
the DatabasePager. Eventually this functionality will be moved out into
osgDB for more general usage.
from: DEEP_COPY_STATESETS = 8,
to: DEEP_COPY_STATESETS = 1<<3,
showing clearly that this isn't just the _value_ 8, but a single _bit_ (bit 3, worth 8). This is an old pattern I see (and like to promulgate) to make code a bit more readable and maintainable.
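A small sketch of the shift-style flag pattern. Only DEEP_COPY_STATESETS comes from the text above; the enum name and the EXAMPLE_FLAG_* entries are hypothetical placeholders.

```cpp
// Flags written as shifts make it obvious that each entry is one bit.
enum CopyFlagsSketch : unsigned int
{
    SHALLOW_COPY_SKETCH = 0,
    EXAMPLE_FLAG_BIT0   = 1u << 0,  // hypothetical placeholder
    EXAMPLE_FLAG_BIT1   = 1u << 1,  // hypothetical placeholder
    EXAMPLE_FLAG_BIT2   = 1u << 2,  // hypothetical placeholder
    DEEP_COPY_STATESETS = 1u << 3   // same numeric value as before: 8
};

static_assert(DEEP_COPY_STATESETS == 8, "the shift form must not change the value");

// Returns true if the given single-bit flag is set in the combined mask.
inline bool hasFlag(unsigned int flags, CopyFlagsSketch flag)
{
    return (flags & flag) != 0;
}
```

Written this way, a duplicated or skipped bit is easy to spot when a new flag is added to the list.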
"
From Robert Osfield, refactored the FrameBufferObject::_drawBuffers set up so that it's done
within the setAttachment method to avoid potential threading/execution order issues.
Introduced code in BoundingSphere, BoundingBox, ProxyNode and LOD to utilise the above settings.
Added Matrix::value_type, Plane::value_type, BoundingSphere::value_type and BoundingBox::value_type command line
options that report whether the types are floats or doubles.
and a new scheme for computing the scaling when using autoscale that introduces smooth
transitions to the scaling of the subgraph so that it looks more natural.
Attached is a fixed version of OverlayNode.cpp. I fixed the CustomPolytope::cut( osg::Plane ) method. The bug was apparent in the following scenario:
Let P1 be some random frustum polytope
Let P2 be the polytope that was created from P1 bounding box (P2 contains P1 entirely)
Then ignoring precision errors: P1.cut( P2 ) == P2.cut( P1 ) == P1. But this condition was not always met. Cut failed when some of the polytope reference points happened to lie exactly on some intersecting planes in both P1 & P2 (plane distance was = 0).
I only use CustomPolytope for my shadowing stuff so I did not test how this affects rest of OverlayNode.cpp.
----2----
Also attached is a minor precision improvement for osg::Plane intersect method (double version).
----3----
I have also one observation regarding osg::Plane - There are two intersect vertices methods (float and double flavour):
inline int intersect(const std::vector<Vec3>& vertices) const
inline int intersect(const std::vector<Vec3d>& vertices) const
I guess osg::Plane won't compile when someone changes the default Vec3 typedef to Vec3d. Shouldn't the first method be changed to use Vec3f explicitly? I.e.:
inline int intersect(const std::vector<Vec3f>& vertices) const"
pbuffer functions or exactly ask for the extensions we need to call the
appropriate glx extension functions for and around pbuffers extensions.
The glx 1.3 versions of these functions are preferred. If they are not present we
look for the glx extensions and check for them.
Previously we just used some mix of the glx 1.3 functions or the extension
functions without making sure that the extension is present.
"
creating subclasses of osg::Array that referenced data
stored in an application's internal data structures. I took
a stab at implementing that and ran into a couple of
downcasts in Geometry.cpp. Enclosed is my take at fixing
those along with a simple example of how to do this."
StateSet::removeAssociatedModes(const StateAttribute*)
and a
StateSet::removeAssociatedTextureModes(unsigned, const StateAttribute*)
call. These functions are just missing for a complete api IMO."
highlighted problems with Light, ClipPlane and Hint usage in osg::State's usage of cloneType
and reassignment of target/num in StateSet/these StateAttributes.
- GL2Extensions, Program and Program.cpp
Features:
- Support for fragment output binding. (E.g. you can now specify in the fragment shader varying out vec3 fragOut; fragOut = vec3(1,0,1); to write to the fragOut variable. In your program you call glBindFragDataLocation(program, 1, "fragOut") to bind the fragOut variable to MRT 1, i.e. GL_COLOR_ATTACHMENT1_EXT.)
- new methods Program::add/removeBindFragDataLocation Program::getFragDataBindingList
"
- Implementation of integer textures as in EXT_texture_integer
- setBorderColor(Vec4) changed to setBorderColor(Vec4d) to pass double values
as border color. (Probably we have to provide an overloading function to
still support Vec4f ?)
- new method Texture::getInternalFormatType() added. Reports whether the
internal format is normalized, float, signed integer or unsigned integer. Can
help people to write better code ;-)
"
Further changes to this submission by Robert Osfield: changed the dirty mipmap
flag into a buffer_value<> vector to ensure safe handling of multiple contexts.
local function pointer to avoid compiler warnings related to casting void*.
Moved various OSG classes across to using setGLExtensions instead of getGLExtensions,
and changed them to use typedef declarations in the headers rather than casts in
the .cpp.
Updated wrappers
"A new texture class Texture2DArray derived from
Texture extends the osg to support the new
EXT_texture_array extensions. Texture arrays provide
a feature for people interested in GPGPU programming.
Features and changes:
- Full support for layered 2D textures.
- New uniform types were added (sampler2DArray)
- FrameBufferObject implementation was changed to
support attaching of 2D array textures to the
framebuffer
- StateSet was slightly changed to support texture
arrays. NOTE: array textures can not be used in the fixed
function pipeline. Thus using the layered texture as a
statemode for a Drawable produces invalid enumerant
OpenGL errors.
- Image class was extended to support handling of
array textures
Tests:
I have used this class as a new feature of my
application. It works for me without problems (Note:
Texture arrays were introduced only for shading
languages and not for fixed function pipelines!!!).
RTT with Texture2DArray works, as I have tested them
as texture targets for a camera with 6 layers/faces
(i.e. replacement for cube maps). I am using the array
textures in shader programming. Array textures can be
attached to the FBO and used as input and as output."
stereo format to work. It's a good thing I tested these on a TV
before submitting them since I did indeed have a bug. One thing I
did not test was to see how this would work in windowed mode. Does
the interlaced stereo code have support for 'absolute' positions?
For example a given pixel on the screen is always shown in a given
eye no matter where the graphics context is placed?
"
to the view to be done during synchronous updateTraversal().
This feature can be used for doing things like merging subgraphs that have been loaded
in a background thread.
Created a new GraphicsThread subclass from OperationThread which allows the
GraphicsContext specific calls to be moved out of the base OperationThread class.
Updated the rest of the OSG to respect these changes.
Added and cleaned up DeleteHandler calls in osgViewer to help avoid crashes on exit.
Changed DatabasePager across to dynamically checking osg::getCompileContext(..)
Updated wrappers.
is not the usual OpenGL BOTTOM_LEFT orientation, but with the origin TOP_LEFT. This
allows geometry setup code to flip the t tex coord to render the movie the correct way up.
I added a _preDrawCallback member and the necessary access methods, plus modified osgUtil RenderStage.cpp to invoke it before all drawInner calls are made. I tried to maintain symmetry with postDrawCallback but you know better where the proper place for this call is ;-)
"
"Since we desperately needed a means for picking Lines
and Points I implemented (hopefully!) proper geometrical tests
for the PolytopeIntersector.
First of all I implemented a new "GenericPrimitiveFunctor"
which is basically an extended copy of TriangleFunctor which also
handles Points, Lines and Quads through suitable overloads of
operator(). I would have liked to call it "PrimitiveFunctor"
but that name was already used...
I used a template method to remove redundancy in the
drawElements method overloads. If you know of platforms where
this will not work I can change it to the style used
in TriangleFunctor.
In PolytopeIntersector.cpp I implemented a
"PolytopePrimitiveIntersector" which provides the needed
overloads for Points, Lines, Triangles and Quads to
the GenericPrimitiveFunctor. This is then used in the
intersect method of PolytopeIntersector.
Implementation summary:
- Points: Check distance to all planes
- Lines: Check distance of both ends against each plane.
If both are outside -> line is out
If both are in -> continue checking
One is in, one is out -> compute intersection point (candidate)
Then check all candidates against all other polytope
planes. The remaining candidates are the proper
intersection points of the line with the polytope.
- Triangles: Perform Line-Checks for all edges of the
triangle as above. If there is a proper intersection
-> done.
In the case where there are more than 2 polytope
planes to check against we have to check for the case
where the triangle encloses the polytope.
In that case the intersection lines of the polytope
planes are computed and checked against the triangle.
- Quads: handled as two triangles.
This implementation is certainly not the fastest.
There are certainly ways and strategies to improve it.
I also enabled the code for PolytopeIntersector
in osgkeyboardmouse and added keybindings to
switch the type of intersector ('p') and the picking
coordinate system ('c') on the fly. Since the
PolytopeIntersector does not have a canonical
ordering for its intersections (as opposed to
the LineSegmentIntersector) I changed the
implementation to toggle all hit geometries.
I tested the functionality with osgkeyboardmouse
and several models and it seems to work for
polygonal models. Special nodes such as billboards
do not work.
The next thing on my todo-list is to implement
an improved Intersection-Structure for the
PolytopeIntersector. We need to know
which primitives were hit (and where).
"
selectively set the pixel format for windows that are inherited, following
some discussions on the mailing list last week.
This is implemented through a new traits flag
(setInheritedWindowPixelFormat) with a default state of false (to avoid
breaking existing applications). When set to true, the pixel format of the
inherited window will be set according to the traits specifications.
"
setActiveTextureUnit methods of osg::State so they return false if the
texture unit is outside the range of allowable units for the driver.
Currently, the functions would return true even if the units are
invalid. This would cause the osg::State to become out of sync with
the actual driver state, which can cause some bugs in certain cases.
The change I made would verify that the unit passed to
setClientActiveTextureUnit is below GL_MAX_TEXTURE_COORDS, and the
unit passed to setActiveTextureUnit is below
max(GL_MAX_TEXTURE_COORDS,GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS). I
modeled this behavior from the OpenGL docs for these commands which
can be found here:
http://www.opengl.org/sdk/docs/man/xhtml/glClientActiveTexture.xml
http://www.opengl.org/sdk/docs/man/xhtml/glActiveTexture.xml
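A rough sketch of the range checks described above, with no real OpenGL calls: TextureUnitStateSketch and its members are hypothetical stand-ins, and the two limits are passed in rather than queried from a driver.

```cpp
#include <algorithm>

// Stand-in for the part of the state tracker that remembers the active units.
// The limits would normally come from GL_MAX_TEXTURE_COORDS and
// GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS; here they are plain constructor inputs.
class TextureUnitStateSketch
{
public:
    TextureUnitStateSketch(unsigned int maxTextureCoords,
                           unsigned int maxCombinedImageUnits)
        : _maxCoords(maxTextureCoords),
          _maxUnits(std::max(maxTextureCoords, maxCombinedImageUnits)),
          _activeUnit(0),
          _clientActiveUnit(0) {}

    // Returns false and leaves the tracked state untouched for invalid units.
    bool setActiveTextureUnit(unsigned int unit)
    {
        if (unit >= _maxUnits) return false;
        _activeUnit = unit;
        return true;
    }

    // Client-side units are limited by the texture coordinate count only.
    bool setClientActiveTextureUnit(unsigned int unit)
    {
        if (unit >= _maxCoords) return false;
        _clientActiveUnit = unit;
        return true;
    }

    unsigned int activeUnit() const { return _activeUnit; }
    unsigned int clientActiveUnit() const { return _clientActiveUnit; }

private:
    unsigned int _maxCoords;
    unsigned int _maxUnits;
    unsigned int _activeUnit;
    unsigned int _clientActiveUnit;
};
```

Rejecting the unit before recording it keeps the tracked state in step with what the driver would accept.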
"
"Attached are updates of src/osg/Sequence.cpp and include/osg/Sequence.
I've taken _sbegin/_send/_ubegin/_uend and _step out of the include file
and made them local variables in whatever method might need them.
I got rid of the _recalculate method as it was only getting used in
one place.
I also found a cut/paste bug in setMode's START case."
Note from Robert Osfield: also includes some guards against crashes that were occurring in this new
code when handling empty Sequences.
return the length of the stream.
Implemented the virtual methods in QuicktimeImageStream, (getLength,
getReferenceTime, setTimeMultiplier) to return a valid value for each.
"
it to be assigned as a vertex attribute array to an osg::Geometry.
Removed the osgTerrain::ArrayLayer as it's no longer required thanks to the above change
which makes the osgTerrain::HeightFieldLayer more flexible.
Updated wrappers
multithreaded-opengl-engine on os x or not. I set its default to false,
perhaps other os x users can test this setting with their data/apps, to
see if we can enable it by default.
I also changed the borderless-window-type, so Exposé works correctly."
retain objects for several frames before deleting them. Also added RenderStageCache
into CullVisitor.cpp that is used for handling RTT osg::Camera's that are being
used in double buffered SceneView usage.
related support into osgViewer::Viewer and osgViewer::StatsHandler.
Added lazy updating of text in StatsHandler HUD to minimize the impact of
slow text updating on observed frame rates.
Added setting of osg_SimulationTime and osg_DeltaSimulationTime to the uniforms set by SceneView
Added frame(double simulationTime) and advance(double simulationTime) parameters to
osgViewer::SimpleViewer, Viewer and CompositeViewer.
Updated various examples and Nodes to use SimulationTime where appropriate.
Added s/getStats() to osg::View and osg::Camera.
Added population of View::getStats() with frame stats in osgViewer/Viewer.
Added Basic StatsHandler to osgviewer example.
as osg::GraphicsOperation. Updated parts of OSG depending upon these.
Added a virtual bool valid() method to osg::GraphicsContext to allow apps to
test whether a valid graphics context has been created or not.
added frame stamp updating and update traversal to osgViewer::Scene/Viewer.
Updated osgcamera example to use new Viewer API calls instead of using local
rendering calls.
Performance tests on big models did not indicate any performance penalty in using doubles over floats,
so the move to doubles should mainly impact precision improvements for whole earth databases.
Also made improvements to osgUtil::PlaneIntersector and osgSim::ElevationSlice classes
handle scenes with multiple views with elements that need coordinating on a per view basis.
Added beginnings of new osgText::FadeText class (no functionality yet).
"I've made some changes to osg which I think make it easier to control
the render order of CameraNode's. Instead of using the built-in orders
(PRE_RENDER, POST_RENDER, NESTED_RENDER), you can specify an integer
order. Values less than zero are pre rendered in order. Values greater
than zero are post rendered in order. And a value of 0 is equivalent
to NESTED_RENDER.
The changes should be fully backward compatible. Also, I changed the
RenderStageList type from a vector to a list because I needed to be
able to insert values anywhere in the list.
The reason I made these changes was because I wanted to be able to set
the render order of a CameraNode at runtime without having to reorder
it in the scenegraph."
and later in the final submission message (relating to what has finally been merged):
"I've rethought my implementation and came up with something a little
better. The setRenderOrder will continue to take an enum, but will
have an optional orderNum parameter which can be both positive and
negative. I think this method is more intuitive and flexible."
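As an illustration of why a list suits ordered insertion, here is a small sketch. StageEntrySketch, StageListSketch and insertOrdered are hypothetical names, and only the integer order number is modelled, not the enum part of setRenderOrder.

```cpp
#include <algorithm>
#include <list>
#include <string>

// One entry per render stage; orderNum may be negative or positive.
struct StageEntrySketch
{
    int orderNum;
    std::string name;
};

typedef std::list<StageEntrySketch> StageListSketch;

// Inserts before the first entry with a larger order number, so entries with
// equal numbers keep their insertion order.
void insertOrdered(StageListSketch& stages, const StageEntrySketch& entry)
{
    StageListSketch::iterator pos = std::find_if(
        stages.begin(), stages.end(),
        [&entry](const StageEntrySketch& s) { return s.orderNum > entry.orderNum; });
    stages.insert(pos, entry);
}
```

A std::list lets the new entry go anywhere without shifting the elements after it, which matches the reason given above for moving away from a vector.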
"I was experiencing hard crashes of my application when using PBO's on
machines that don't support PBO's. I think osg incorrectly checks if
PBO's are supported.
I added a new method to the BufferObject::Extensions class which
returns if the "GL_ARB_pixel_buffer_object" string is supported. This
fixes the problem on my end. Machines without PBO support will
continue to work and machines with PBO support will still be able to
use it."
think it can be simplified quite a bit. The old code includes
<cmath> for pre-10.2 and anything using something other than g++ 4
and then uses std::isnan. For the most current version, it leaves
out cmath and uses isnan(). std::isnan and cmath work for the
current version, so I just made it include cmath if __APPLE__ is
defined and removed the ifdef between versions of OS X for isnan
related things.
This way the code is all the same, and it's not fragile to someone
including <cmath> prior to including osg/Math."
in incorrect texture assignment. Solution was to add a compareTextureObjects() test to the Texture*::compare(..) method that
the osgUtil::Optimizer::StateSetVisitor uses to determine uniqueness.