algorithm consisting of two consequent phases :
- first phase is a GLSL shader performing object culling and LOD picking ( a culling shader ).
Every culled object is represented as GL_POINT in the input osg::Geometry.
The output of the culling shader is a set of object LODs that need to be rendered.
The output is stored in texture buffer objects. No pixel is drawn to the screen
because GL_RASTERIZER_DISCARD mode is used.
- second phase draws osg::Geometry containing merged LODs using glDrawArraysIndirect()
function. Information about quantity of instances to render, its positions and other
parameters is sourced from texture buffer objects filled in the first phase.
The example uses various OpenGL 4.2 features such as texture buffer objects,
atomic counters, image units and functions defined in GL_ARB_shader_image_load_store
extension to achieve its goal and thus will not work on graphic cards with older OpenGL
versions.
The example was tested on Linux and Windows with NVidia 570 and 580 cards.
The tests on AMD cards were not conducted ( due to lack of it ).
The tests were performed using OSG revision 14088.
The main advantages of this rendering method :
- instanced rendering capable of drawing thousands of different objects with
almost no CPU intervention ( cull and draw times are close to 0 ms ).
- input objects may be sourced from any OSG graph ( for example - information about
object points may be stored in a PagedLOD graph. This way we may cover the whole
countries with trees, buildings and other objects ).
Furthermore if we create osgDB plugins that generate data on the fly, we may
generate information for every grass blade for that country.
- every object may have its own parameters and thus may be distinct from other objects
of the same type.
- relatively low memory footprint ( single object information is stored in a few
vertex attributes ).
- no GPU->CPU roundtrip typical for such methods ( method uses atomic counters
and glDrawArraysIndirect() function instead of OpenGL queries. This way
information about quantity of rendered objects never goes back to CPU.
The typical GPU->CPU roundtrip cost is about 2 ms ).
- this example also shows how to render dynamic objects ( objects that may change
its position ) with moving parts ( like car wheels or airplane propellers ) .
The obvious extension to that dynamic method would be the animated crowd rendering.
- rendered objects may be easily replaced ( there is no need to process the whole
OSG graphs, because these graphs store only positional information ).
The main disadvantages of a method :
- the maximum quantity of objects to render must be known beforehand
( because texture buffer objects holding data between phases have constant size ).
- OSG statistics are flawed ( they don't know anymore how many objects are drawn ).
- osgUtil::Intersection does not work
Example application may be used to make some performance tests, so below you
will find some extended parameter description :
--skip-dynamic - skip rendering of dynamic objects if you only want to
observe static object statistics
--skip-static - the same for static objects
--dynamic-area-size - size of the area for dynamic rendering. Default = 1000 meters
( square 1000m x 1000m ). Along with density defines
how many dynamic objects is there in the example.
--static-area-size - the same for static objects. Default = 2000 meters
( square 2000m x 2000m ).
Example application defines some parameters (density, LOD ranges, object's triangle count).
You may manipulate its values using below described modifiers:
--density-modifier - density modifier in percent. Default = 100%.
Density ( along with LOD ranges ) defines maximum
quantity of rendered objects. registerType() function
accepts maximum density ( in objects per square kilometer )
as its parameter.
--lod-modifier - defines the LOD ranges. Default = 100%.
--triangle-modifier - defines the number of triangles in finally rendered objects.
Default = 100 %.
--instances-per-cell - for static rendering the application builds OSG graph using
InstanceCell class ( this class is a modified version of Cell class
from osgforest example - it builds simple quadtree from a list
of static instances ). This parameter defines maximum number
of instances in a single osg::Group in quadtree.
If, for example, you modify it to value=100, you will see
really big cull time in OSG statistics ( because resulting
tree generated by InstanceCell will be very deep ).
Default value = 4096 .
--export-objects - write object geometries and quadtree of instances to osgt files
for later analysis.
--use-multi-draw - use glMultiDrawArraysIndirect() instead of glDrawArraysIndirect() in a
draw shader. Thanks to this we may render all ( different ) objects
using only one draw call. Requires OpenGL version 4.3 and some more
work from me, because now it does not work ( probably I implemented
it wrong, or Windows NVidia driver has errors, because it hangs
the apllication at the moment ).
This application is inspired by Daniel Rákos work : "GPU based dynamic geometry LOD" that
may be found under this address : http://rastergrid.com/blog/2010/10/gpu-based-dynamic-geometry-lod/
There are however some differences :
- Daniel Rákos uses GL queries to count objects to render, while this example
uses atomic counters ( no GPU->CPU roundtrip )
- this example does not use transform feedback buffers to store intermediate data
( it uses texture buffer objects instead ).
- I use only the vertex shader to cull objects, whereas Daniel Rákos uses vertex shader
and geometry shader ( because only geometry shader can send more than one primitive
to transform feedback buffers ).
- objects in the example are drawn using glDrawArraysIndirect() function,
instead of glDrawElementsInstanced().
Finally there are some things to consider/discuss :
- the whole algorithm exploits nice OpenGL feature that any GL buffer
may be bound as any type of buffer ( in our example a buffer is once bound
as a texture buffer object, and later is bound as GL_DRAW_INDIRECT_BUFFER ).
osg::TextureBuffer class has one handy method to do that trick ( bindBufferAs() ),
and new primitive sets use osg::TextureBuffer as input.
For now I added new primitive sets to example ( DrawArraysIndirect and
MultiDrawArraysIndirect defined in examples/osggpucull/DrawIndirectPrimitiveSet.h ),
but if Robert will accept its current implementations ( I mean - primitive
sets that have osg::TextureBuffer in constructor ), I may add it to
osg/include/PrimitiveSet header.
- I used BufferTemplate class writen and published by Aurelien in submission forum
some time ago. For some reason this class never got into osg/include, but is
really needed during creation of UBOs, TBOs, and possibly SSBOs in the future.
I added std::vector specialization to that template class.
- I needed to create similar osg::Geometries with variable number of vertices
( to create different LODs in my example ). For this reason I've written
some code allowing me to create osg::Geometries from osg::Shape descendants.
This code may be found in ShapeToGeometry.* files. Examples of use are in
osggpucull.cpp . The question is : should this code stay in example, or should
it be moved to osgUtil ?
- this remark is important for NVidia cards on Linux and Windows : if
you have "Sync to VBlank" turned ON in nvidia-settings and you want to see
real GPU times in OSG statistics window, you must set the power management
settings to "Prefer maximum performance", because when "Adaptive mode" is used,
the graphic card's clock may be slowed down by the driver during program execution
( On Linux when OpenGL application starts in adaptive mode, clock should work
as fast as possible, but after one minute of program execution, the clock slows down ).
This happens when GPU time in OSG statistics window is shorter than 3 ms.
"
git-svn-id: http://svn.openscenegraph.org/osg/OpenSceneGraph/trunk@14531 16af8721-9629-0410-8352-f15c8da7e697
lists when drawing lots of batches and noticed that my program
generated a lot of unneeded glClientActiveTexture calls. Digging
deeper I found out it came from State::disableTexCoordPointer where
the function would call glClientActiveTexture but not
glDisableClientState because the geometry didn't have texture
coordinates for that channel. This is because in our scene there are
some geometries that have move than one uv channels making
State::_texCoordArrayList grow. Then the method
State::applyDisablingOfVertexAttributes() will call
disableTexCoordPointer multiple times.
I rearrange the method a little to combat this. Now the logic has the
same ordering as disableTexCoordPointersAboveAndIncluding which
already combats this."
git-svn-id: http://svn.openscenegraph.org/osg/OpenSceneGraph/trunk@14508 16af8721-9629-0410-8352-f15c8da7e697
To enable the sync of swap buffers set the env var OSG_SYNC_SWAP_BUFFERS to ON or 1, to switch off set to OFF or 0.
One can also use the --sync command line option for application that pass on command line options to the osg::DisplaySettings::instance().
git-svn-id: http://svn.openscenegraph.org/osg/OpenSceneGraph/trunk@14456 16af8721-9629-0410-8352-f15c8da7e697
The State::AppliedProgramObjectSet wasn't ever being used actively in the current rev of the OSG so populating and clearing was no longer neccessary, allowing the code to be removed completely.
git-svn-id: http://svn.openscenegraph.org/osg/OpenSceneGraph/trunk@14377 16af8721-9629-0410-8352-f15c8da7e697
Initially I described issue in message:
http://forum.openscenegraph.org/viewtopic.php?t=13820
It solves issue with compiling texture using ico from image with mipmaps
I added enviroment variable OSG_GL_TEXTURE_STORAGE_ENABLE to control usage of glTexStorage2d. Initially it is disabled.
It used only if image have mipmaps.
Another issue is converting from internalFormat + type to sized internal format. I created sizedInternalFormats[] struct where sized internal formats are ordered from worse->best.
also this struct have commented lines. Commented formats are listed in
http://www.opengl.org/wiki/GLAPI/glTexStorage2D
but looks like not using in osg."
Note from Robert Osfield. Changed the env var control to OSG_GL_TEXTURE_STORAGE and made it's value true by default when the feature is supported by the OpenGL driver. To disable to
use of glTexStorage2D use OSG_GL_TEXTURE_STORAGE="OFF" or "DISABLE"
git-svn-id: http://svn.openscenegraph.org/osg/OpenSceneGraph/trunk@14275 16af8721-9629-0410-8352-f15c8da7e697
If Drawable::getBoundingBox would compute an invalid bounding box (if it was for example empty) it would make a bounding sphere with a infinite radius which counts as a valid sphere in osg.
Attached is a small fix."
- Added apply(Drawable) and apply(Geometry) to NodeVisitor
- Added accept(NodeVisitor) method to Drawable/Geometry
- Added traverse(NodeVisitor) to Geode which calls accept(NodeVisitor) on all Drawables
- Updated CullVisitor to use new apply(Drawable) to handle drawables. The apply(Billboard) method still manually handles the drawables since it is depends on the billboard settings. I needed to disable the traverse within billboard to prevent duplicate traversal of drawables.
- Update other osgUtil node visitors (GLObjectsVisitor, IncrementalCompileOperation, ..) to use new apply(Drawable) method.
"
To select standard OpenGL 1/2 build with full backwards and forwards comtability use:
./configure
make
OR
./configure -DOPENGL_PROFILE=GL2
To select OpenGL 3 core profile build using GL3/gl3.h header:
./configure -DOPENGL_PROFILE=GL3
To select OpenGL Arb core profile build using GL/glcorearb.h header:
./configure -DOPENGL_PROFILE=GLCORE
To select OpenGL ES 1.1 profile use:
./configure -DOPENGL_PROFILE=GLES1
To select OpenGL ES 2 profile use:
./configure -DOPENGL_PROFILE=GLES2
Using OPENGL_PROFILE will select all the appropriate features required so no other settings in cmake will need to be adjusted.
The new configuration options are stored in the include/osg/OpenGL header that deprecates the old include/osg/GL header.
osg/GL2Extensions was incorrectly defining GL_RED_SNORM and GL_RG_SNORM as part of the definitions for OpenGL v3.1. However, a quick review of the 3.1 spec indicates that these are not part of the 3.1 standard.
My attached change moves these definitions out of the #ifndef GL_VERSION_3_1 conditional block, and defines them conditionally if not already defined. This allows the DDS plugin to build for GL3.
"
To the Lua plugin added support for assigned lua functions to C++ osg::Objects via the new osg::CallbackObject mechanism. To invoke the scripts function from C++ one must get the CallbackObject and call run on it.
Renamed ScriptCallback to ScriptNodeCallback to avoid possibly confusion between osg::CallbackObject and the ScriptNodeCallback.
One solution is naturally to create a new class that would inherit the osg::ComputeBoundVisitor, and use that. I don't like that idea as the ComputeBoundVisitor does actually have what I need - it is only hidden in a protected function.
I am therefor suggesting a slight generalization of the ComputeBoundVisitor with the attached patch, which is tested.
The patch has two parts:
we add applyBBox() so that one can use that in a customized traverse-function and add a bbox to the visitor. I considered calling this function expandByBBox(), but I though applyBBox was better.
The MatrixStack is made available to the outside world. That enables a traverse-function to do whatever it wishes.
I do actually only need one of the two, as I can implement what I wish either way, but adding getMatrixStack() will make more generic expansions possible.
"
From Robert Osfield, changed the name of the new applyBBox(..) method to applyBoundingBox(..) to keep it's naming more consistent with the rest of the OSG.
Added new WindowSystemInterface::setDisplaySettings() method to provide a mechanism for passing settings onto the WindowSystemInterface so it can then set up the system appropriately.
Added assignment of the DisplaySettings to the WindowSystemInterface in Viewer/ComppsiteViewer::realize().
To make easier "lazy apply" on the customer OpenGL shaders, the easiest way was to add an accessor to current OSG state's UniformMap.
I've also added accessors for modes and texture, since it could be usefull in the same way.
All methods are const, so I think there is no side-effects."
Provided are lua, python and V8 (for javascript) plugins that just open up enough of a link to the respective libs to run a script, there is no scene graph <-> script communication in current implementation.
But for glPrograms, in order to get all osg's uniform management system to work, I had to subclass osg::program::PerContextProgram.
Here is a modified version of this class, which add some "virtual" method to allow easy subclassing."
OpenSceneGraph/include\osg/BufferObject(701): warning C4138: '*/' found outside of comment (E:\osg\osgSvn\OpenSceneGraph\src\osg\Array.cpp)
adding a space before /* fixes the problem
void removeClient(osg::Object * /*client*/) { --_numClients; }
"
"The idea of this new OpenGL feature is :
- set RestartIndex = "n"
- draw elements strip
-> when the index is "n", the strip is "stopped" and restarted
It's very usefull for drawing tiles with a single strip and a "restart" at the end of each row.
The idea a an OSG StateAttribute is :
Usually we use to build geometry from code, because software modelers rarely support it (and 3d file formats doesn't support it) :
-RootNode <= "PrimitiveRestartIndex=0" // So now, we know that our restart index is 0 for all drawables under this node
|
- Drawable 1 : triangles => as usual
|
- Drawable 2 : triangles strip => as usual
|
- Drawable 3 : triangles strip + "GL_PRIMITIVE_RESTART" mode = ON => use the restart index
|
- Drawable 4 : triangles strip + "GL_PRIMITIVE_RESTART" mode = ON => use the restart index
|
- Drawable 5 : triangles strip => as usual
With a StateAttribute, it's easy for the developper to say "0 will be my restart index for all this object" and then activate the mode only on some nodes.
The main problem is if you set and restart index value which is not included in the vertex array (for exemple set restart index = 100 but you have only 50 vertex). There is no problem with OpenGL, but some OSG algorithms will try to access the vertex[100] and will segfault.
To solve this, I think there is two ways :
1/ add restart index in osg::PrimitiveSet and use this value in all algorithms. It's a lot of work, maybe dangerous, and it concern only a few situations : developpers who use this extension should be aware of advanced OpenGL (and OSG) data management
2/ use a StateAttribute, and choose a "correct" restart index. In my applications, I always use "0" as a restart index and duplicate the first vertex (vertex[0] = vertex[1]). So there is no difference for OpenGL and all OSG algorithms works properly.
"
"The attached file contains:
- a per-context read counter in GLBufferObject::BufferEntry
- a global client counter in BufferData
- the glue between Texture* and Image client counter
"
I think this is necessary on OpenGL 3.2+ since this is no more "default" locations in the OpenGL specs.
The default behaviour stay the same.
There is a few new methods on osg::State :
- resetVertexAttributeAlias : reset all vertex alias to osg's default ones
- set**Alias : set a vertex attribute alias configuration
- setAttributeBindingList : set the attribute binding list (allow to specify an empty list if you're using "layout" qualifier in glsl code to specify the bindings. This save some CPU operations)"
following in the qopengl.h header:
# include <QtGui/qopengles2ext.h>
# ifndef GL_DOUBLE
# define GL_DOUBLE GL_FLOAT
# endif
# ifndef GLdouble
typedef GLfloat GLdouble;
# endif
Unfortunately, when building for normal OpenGL (not GL/ES!) on Windows
with MSVC2012, GLdouble is not defined (it is not a macro but typedef)
and the code above produces a conflicting definition, making the
compile fail. I am attaching a bit hackish workaround for this problem
in osg/GL "
New methods osg::Geometry::containsDeprecatedData() and osg::Geometry::fixDeprecatedData() provide a means for converting geometries that still use the array indices and BIND_PER_PRIMITIVE across to complient
versions.
Cleaned up the rest of the OSG where use of array indices and BIND_PER_PRIMITIVE were accessed or used.
GeometryNew is only temporary and will be renamed to Geometry on the completion of refactoring work and feedback from community.
Ported osggeometry across to use GeometryNew.
TextureBuffer objects may use osg::Texture::bindToImageUnit(), so GLSL shaders are able to use not only texelFetch() function , but also functions defined in GL_ARB_shader_image_load_store extension : imageLoad(), imageStore(), imageAtomicAdd() etc."
second email: "After a while I found that osg::Texture::applyTexParameters() used with TextureBuffer may cause some OpenGL errors ( applying texture filters and wraps to TextureBuffer makes no sense ) so I fixed it."
The set method modify the buffer object of the BufferData while the get method returned the buffer object of the Image.
I've also removed the _bufferObject member of Image (not used anymore)."
--This line, Lionel Lagardeand those below, will be ignored--
M include/osg/Image
I fixed some bugs and did some more tests with both of the video-plugins. I integrated CoreVideo with osgPresentation, ImageStream has a new virtual method called createSuitableTexture which returns NULL for default implementations. Specialized implementations like the QTKit-plugin return a CoreVideo-texture. I refactored the code in SlideShowConstructor::createTexturedQuad to use a texture returned from ImageStream::createSuitableTexture.
I did not use osgDB::readObjectFile to get the texture-object, as a lot of image-related code in SlideShowConstructor had to be refactored to use a texture. My changes are minimal and should not break existing code.
There's one minor issue with CoreVideo in general: As the implementation is asynchronous, there might be no texture available, when first showing the video the first frame. I am a bit unsure how to tackle this problem, any input on this is appreciated.
Back to the AVFoundation-plugin: the current implementation does not support CoreVideo as the QTKit-plugin supports it. There's no way to get decoded frames from AVFoundation stored on the GPU, which is kind of sad. I added some support for CoreVideo to transfer decoded frames back to the GPU, but in my testings the performance was worse than using the normal approach using glTexSubImage. This is why I disabled CoreVideo for AVFoundation. You can still request a CoreVideoTexture via readObjectFile, though.
"
My previous patch for Atomic Counter Uniform provide new template implementation
of Matrix{2,3,4}x{2,3,4}{fd}. This new implementation use Column-Major Matrix.
Original code define matrix as Row-Major matrix like other Matrix in OSG, and
my matrix implementation break compatibility with previous code.
For example osg_normalMatrix define in osg::State report by Roland Hill.
Thanks to Paul Martz to spot me when the bug appear."
- add non square matrix
- add double
- add all uniform type available in OpenGL 4.2
- backward compatibility for Matrixd to set/get an float uniform matrix
- update of IVE / Wrapper ReadWriter
implementation of AtomicCounterBuffer based on BufferIndexBinding
add example that use AtomicCounterBuffer and show rendering order of fragments,
original idea from geeks3d.com."
in osg::Program::PerContextProgram :
typedef std::vector<UniformModifiedCountPair> LastAppliedUniformList;
should be
typedef std::map<unsigned int, UniformModifiedCountPair> LastAppliedUniformList;
Intel driver can use index uniform value > 200000.
With a std::vector, this index uniform value generate an out of memory error
Nothing in OpenGL or GLSL specification define index uniform value rules.
And all other implementation that deal with uniform index in osg::Program
use a std::map.
This fix could have a little performance impact but this is the cost
to pay to work with
all driver."
Without the change the application does not work properly. First I get the notification that an OpenGL error occured. After some more of this error messages I see broken textures on the screen. With the changes attached to this message my application works as intended."
Note from Robert Osfield, changed the Image::supportsTextureSubloading() to be const and to be implemented in the .cpp rather than inline.
I found that some of the items that had been paged in were being expired on the first frame that they were not visible (as the cache was full). This resulted in excessive paging every time the view was moved. With the following changes I could only allow children to be expired if they had not been used for e.g. 30 seconds or 60 frames."
parameter in osg::Image. To support this Image::setData(..) now has a new optional rowLength parameter which
defaults to 0, which provides the original behaviour, Image::setRowLength(int) and int Image::getRowLength() are also provided.
With the introduction of RowLength support in osg::Image it is now possible to create a sub image where
the t size of the image are smaller than the row length, useful for when you have a large image on the CPU
and which to use a small portion of it on the GPU. However, when these sub images are created the data
within the image is no longer contiguous so data access can no longer assume that all the data is in
one block. The new method Image::isDataContiguous() enables the user to check whether the data is contiguous,
and if not one can either access the data row by row using Image::data(column,row,image) accessor, or use the
new Image::DataIterator for stepping through each block on memory assocatied with the image.
To support the possibility of non contiguous osg::Image usage of image objects has had to be updated to
check DataContiguous and handle the case or use access via the DataIerator or by row by row. To achieve
this a relatively large number of files has had to be modified, in particular the texture classes and
image plugins that doing writing.
I don’t have access to an OSX or Linux dev machine to make the changes required to the quick time plugin. This plugin will just default to returning 0."
Motivation ;
When using PagedLODs, you don't always know the real size of loaded children,
If it occurs that they are out of predefined bounds, picking on the parts that are out of bound will fail
They also can be culled out too soon.
The problem often occurs with long object (roads).
I've modified LOD and ProxyNode to include this option."
and later email:
"Attached the UNION_OF_BOUNDING_SPHERE_AND_USER_DEFINED version
There are impacts on some serializers (dae, osgWrapper).
I haven't modified deprecated osg, since it's deprecated"
So with the 3.0 api change we propose the following change:
- put OSG_EXPORT on the QueryGeometry class so that we get access to the getNumPixels method.
- Create a function called getQueryGeometry that returns a casted _queryGeode->getDrawable(). Or a function called getQueryGeode that returns _queryGeode."
Added COMPUTE_NEAR_FAR_USING_PRIMITIVES option that provides the original functionality where only the near plane
is computed in a fine grained way, with the far plane being computed simply from bound volumes.
with a osg::DefaultUserDataContainer subclassed from this. The user object access methods have now all
been moved from osg::Object into the UserDataContainer class, except for the set/getUserData() methods
that are left in osg::Object for backwards compatibility, and the description list access methods have
been moved back into osg::Node.
main UserObject access methods are now all def
Refactored original UserData and Descriptions strings to be managed alongside the new user object suppport within
a single osg::Object::UserDataContainer.
CID 11666: Uninitialized pointer field (UNINIT_CTOR)
Non-static class member _glMultiTexCoord1dv is not initialized in this constructor nor in any functions that it calls.
Non-static class member _glVertexAttrib1dv is not initialized in this constructor nor in any functions that it calls.