Sweep and Prune or dBVT for this use case

Please don't post Bullet support questions here, use the above forums instead.
sphet
Posts: 38
Joined: Mon May 06, 2013 6:14 pm

Sweep and Prune or dBVT for this use case

Post by sphet »

Hey I was profiling our engine and noticed that, since the artists have added character collision bones, the broadphase cost has really climbed.

We're using a SAP (not quantized, have not got there yet) and I wonder if a dBVT would make more sense. Before I go and do it, I wanted to outline our situation.

We have one static object in the scene - a triangle mesh wrapped in a KD-Tree. All other objects are dynamic, with a number of them being kinematic - driven by animation.

Would a dbvt help here? I suspect the SAP is so expensive because all the player bones are overlapping each other causing swaps (even though they are not allowed to collide with each other), but not sure if the Static & Dynamic dBVT system would be effective in my case where almost everything is moving all the time.

With a dBvt, I realize that a new tree is rebuilt over multiple frames, but what I don't understand is this: On frame 0 the tree is built. Frame 1 a number of objects move, invalidating the tree so rebuilding is required. On Frame 2 they move again, invalidating any effort put into the rebuild. Frame 3, and on the same thing. How is this handled if the scene is always changing?

One other thought would be to put a new type of object into the world to package all of the player bones into one broadphase agent to limit the swaps, but that doesn't seem like the best use of my time, and I don't see this is any other engine.

Any input would be great, thanks.

S
RandyGaul
Posts: 43
Joined: Mon May 20, 2013 8:01 am
Location: Redmond, WA

Re: Sweep and Prune or dBVT for this use case

Post by RandyGaul »

The dynamic tree only reinserts leaves when they move enough, not when they move in general. Your concern is handled appropriately.

It is my guess that you will see significant performance jumps from switching from SAP to a dynamic tree.
Dirk Gregorius
Posts: 861
Joined: Sun Jul 03, 2005 4:06 pm
Location: Kirkland, WA

Re: Sweep and Prune or dBVT for this use case

Post by Dirk Gregorius »

The AABBs in the dynamic tree are inflated by some static amount and also extruded in the direction of the movement. A tree update is only triggered if the tight 'shape' AABB leaves the fat 'broadphase' AABB. I use a dynamic tree as well and it works with very large scenes (50k - 100k proxies) if tweaked correctly. Of course you don't want that large scenes, but it is nice when your game doesn't come to a full stop (like a SAP would for such scenes) in production when testing things.
sphet
Posts: 38
Joined: Mon May 06, 2013 6:14 pm

Re: Sweep and Prune or dBVT for this use case

Post by sphet »

Dirk Gregorius wrote:The AABBs in the dynamic tree are inflated by some static amount and also extruded in the direction of the movement. A tree update is only triggered if the tight 'shape' AABB leaves the fat 'broadphase' AABB. I use a dynamic tree as well and it works with very large scenes (50k - 100k proxies) if tweaked correctly. Of course you don't want that large scenes, but it is nice when your game doesn't come to a full stop (like a SAP would for such scenes) in production when testing things.
Dirk, thanks for the overview. Previously you mentioned that you adapted your dynamic tree from box2d. Are you using two trees, one for dynamic, one for static, or a combined tree? It looks like box2d doesn't build a second tree then swap it - it has only a single tree that it rebalances, whereas the bullet implementation using multiple trees - reconstructing a tight fight while the other deteriorates.

When implementing the RayCast in 3D, can you choose any perpendicular vector to the segment when doing the separating axis for segment tests? Box2D gets to exploit the fact that a cross product of a scalar with a vector is consistently perpendicular in 2D, but in 3D there could be any number of 'perpendicular' vectors.

The box2d seems like less of a headache, but I'm curious about your approach.


S
Dirk Gregorius
Posts: 861
Joined: Sun Jul 03, 2005 4:06 pm
Location: Kirkland, WA

Re: Sweep and Prune or dBVT for this use case

Post by Dirk Gregorius »

I pretty much use a 3D version of the Box2D tree. There is really nothing special about my implementation. With regards to the Bullet version I only have a high level overview and didn't investigate it much.

Tree traversal is another topic. You can clip the ray against the AABB instead of using a SAT. This can be implemented quite efficiently like this:

Code: Select all

RN_FORCEINLINE void rnClipRay( const rnRay& Ray, const rnBounds3& Bounds, float& MinT, float& MaxT )
	{
	rnVector3 T1 = ( Bounds.Min - Ray.Origin ) * Ray.DeltaInv;
	rnVector3 T2 = ( Bounds.Max - Ray.Origin ) * Ray.DeltaInv;

	for ( int Axis = 0; Axis < 3; ++Axis )
		{
		MinT = rnMax( MinT, rnMin( T1[ Axis ], T2[ Axis ] ) );
		MaxT = rnMin( MaxT, rnMax( T1[ Axis ], T2[ Axis ] ) );
		}
	}
You just need to be careful when building the inverse ray delta to avoid divide by zeros for rays parallel to any of the canonical axes (ex, ey, ez). I have a SAT version as well which I use for my static tree and both seem equally fast. You can make traversals faster by using a directed search (e.g. you decide which node you decent into based on the ray direction). I do these things in my static tree, but not in my dynamic tree yet.
sphet
Posts: 38
Joined: Mon May 06, 2013 6:14 pm

Re: Sweep and Prune or dBVT for this use case

Post by sphet »

Dirk,

Starting to get away from the topic but I am curious what you mean by 'static tree' and 'dynamic tree'? Do you have one tree with all the static objects in the scene, and another with the dynamics? I suppose that makes sense then you do Dynamic v Dynamic and Dynamic v Static. Or is your static tree a mid-phase wrapper around triangle mesh data? Given that we only have one static object (the triangle mesh) I was just going to put it in the same tree as everything else, but then I guess the nature of it's giant bound (the whole level) would upset the tree too much.

One adjustment I am trying to deal with is that Sweep and Prune have explicit Add/Remove edges. The bounding volume tree only has an Add edge. Is the trick to test the AABBs between pairs during collision dispatching (potentially a second time if it was just added) and then remove pairs that no longer overlap?

As always, thanks for your input.

S
Dirk Gregorius
Posts: 861
Joined: Sun Jul 03, 2005 4:06 pm
Location: Kirkland, WA

Re: Sweep and Prune or dBVT for this use case

Post by Dirk Gregorius »

I only have a dynamic tree for both static and dynamic bodies which I use as broadphase. As in Box2D I insert shapes (not bodies) into the tree. My static tree is a classical SAH - AABB tree for triangle meshes. Those get inserted as MeshShape into the dynamic tree as well. So when I dispatch pairs, I am dealing with shapes and no mid-phase is required.

Hope that makes sense!
sphet
Posts: 38
Joined: Mon May 06, 2013 6:14 pm

Re: Sweep and Prune or dBVT for this use case

Post by sphet »

Dirk,

Makes a lot of sense - when you say you put shapes, not bodies in the tree does that mean that there are no listshape/compound/composite/whathaveyou shapes - all shapes of a body are all inserted into the tree, without a narrow phase component for the compound shape?

Appreciate your insight.

S
Dirk Gregorius
Posts: 861
Joined: Sun Jul 03, 2005 4:06 pm
Location: Kirkland, WA

Re: Sweep and Prune or dBVT for this use case

Post by Dirk Gregorius »

The rigid body has a list of shapes - that's it. I never understood the need for list/compound/composite shapes. And I don't understand these complicated shape hierarchies. I support only spheres, capsules, hulls and meshes. No boxes, those are hulls too. And I can have has many shapes on a body as I want and create any compound shape nevessary. There is also no mid phase, which simplifies things a lot.

Maybe I am getting old, but I value simple approaches a lot these day. I am not shying away from more complicated solutions if necessary, but I need a good reason to add complexity. An engine like ours lives for so many years that maintainabiilityt is a primary design goal in my opion. Complex code becomes 'stiff' very easily and difficult to maintain from my experience.

Hope that makes sense!
-Dirk
Dirk Gregorius
Posts: 861
Joined: Sun Jul 03, 2005 4:06 pm
Location: Kirkland, WA

Re: Sweep and Prune or dBVT for this use case

Post by Dirk Gregorius »

There is one thing I should mention. New engines which use mechanisms like prefabs make it easy for artists to densily populate the world very quickly. You can easily run into situations where you find yourself with 50-100k shapes in as scene. A dynamic tree can handle these large scenes, but you will notice performance degration. To handle such scenarios I create a small grid and merge all shapes in a bucket into a mesh.
sphet
Posts: 38
Joined: Mon May 06, 2013 6:14 pm

Re: Sweep and Prune or dBVT for this use case

Post by sphet »

Dirk Gregorius wrote:The rigid body has a list of shapes - that's it. I never understood the need for list/compound/composite shapes. And I don't understand these complicated shape hierarchies.
...
Maybe I am getting old, but I value simple approaches a lot these day.
...
-Dirk
Dirk - I don't think it's age, I think it's common sense. I have worked at companies that prided themselves on line count in the code base but my current job works aggressively to push down line count whenever possible. I would guess that the concept of a listshape is needed because of historical lineage from ODE, Havok, etc. Our physics API is no different but I am desperate to get it running fast enough and reducing complexity seems to be helping. I have experience with ODE, Havok, Bullet so naturally I feel comfortable with those kinds of APIs - but when I stop to ask, "why bother with a listshape" I have to agree that it doesn't seem necessary (proving the broadphase can handle the extra objects, which SAP cannot).

If a shape is in the broadphase, it must store a local-to-body matrix as well as a current world matrix, is that right?

I can see two distinct reasons for having list shapes:

1. The local-to-parent offset of child shapes, and any AABB data can be stored in the local space of the parent, thereby minimizing expensive costs when a complicated list-shape is changing orientation. I would say that the Bounding Volume Tree doesn't solve this - for a table with four legs the 5 shapes would all need to be re-oriented into world space when the associated body changes orientation.

2. Child shapes/shape hierarchies can be shared - for a shape made up of multiple child shapes that are complex this might be meaningful, but duplicating a Sphere, Capsule or Hull is not a big deal - the hull can still share the vertices/plane equations internally.

#1 seems meaningful, certainly in my use case as I am fighting an exceptionally depressing battle against the hardware.

I've thrown together the dynamic tree and will hook it up to the broadphase tomorrow and we'll see what the performance is like. I will then do some analysis to see how much work it would be to put all the shapes in the broadphase, which is an interesting concept - it has the potential to reduce a number operations which really do add up.

Thanks again.

S
Dirk Gregorius
Posts: 861
Joined: Sun Jul 03, 2005 4:06 pm
Location: Kirkland, WA

Re: Sweep and Prune or dBVT for this use case

Post by Dirk Gregorius »

I don't use any shape transforms. The shapes are defined as:

1) Sphere: Center and Radius
3) Capsule: 2 Centers and Radius (This is nice since you don't need to deal with any up-directions)
3) Hulls: Vertices, Faces and Edge
4) Meshes: Vertices and Triangles

Is is basically all vertices defined under some parent transform. You could cache a world transform, I guess. Erin uses a fixture to attach a shape to a body which does the caching iirc, but I don't even do this. In all function calls I usually pass down the a parent transform and the associated shape. So far this really works out pretty clean and concise.

You usually want to keep tree updates low anyway. That is why the dynamic tree works with inflated (fat) AABBs. So far the tree updates haven't been any issue. Even in pathological test scenes where each shape is spawned at the same time and invalidate their fat broadphase AABB all in the same frame. I think your point is valid, but before it is not a practical problem I would not address it. It is probably as you said and it goes back to the inefficiencies of SAP handling insertions and removals.

Good luck! Let me know how it works out :)


Edit:
I can share hulls and meshes. rnHullShape and rnMeshShape reference rnHull and rnMesh which is owned by the user. For meshes this might make sense, but for hulls I don't even bother. They are so memory efficient the way I store them they never showed up. Surprisingly hulls are very expensive in PhysX though. Havok usually cares for memory more. My shapes are really cheap and have hardly any overhead since they don't cache any transforms.
sphet
Posts: 38
Joined: Mon May 06, 2013 6:14 pm

Re: Sweep and Prune or dBVT for this use case

Post by sphet »

Dirk Gregorius wrote:I don't use any shape transforms. The shapes are defined as:

1) Sphere: Center and Radius
3) Capsule: 2 Centers and Radius (This is nice since you don't need to deal with any up-directions)
3) Hulls: Vertices, Faces and Edge
4) Meshes: Vertices and Triangles
<snip>
Interesting - I pass down the transform in all cases as well - shapes are always defined in the local space of the parent - but sphere's are simply a radius, and capsules oriented along z(up). Perhaps the need for listshapes also comes from this kind of implementation because there is no way to offset a sphere in the local space of a rigid body without another shape in the hierarchy. So essentially you can create many spheres shapes with local space offsets and radii and then attach them all to the body - there is a lot of potential savings here compare to listshapes with a child-offset matrix which requires full concatenation.

Regarding sharing, the size of a sphere or capsule shape will be very small so duplicating that isn't the end of the world. This is the same as my system.

I will have to look over my implementations as I suspect that all of the collision agents easily handle non-centered points.

For hulls are you still using SAT or GJK? You've written a lot about SAT. I need to tackle that soon as of right now I only support sphere, capsule, finite/infinite plane and mesh. I need to add box soon.

Sorry for dragging this topic so way off course but it's good to talk out these issues and ideas.

S
Dirk Gregorius
Posts: 861
Joined: Sun Jul 03, 2005 4:06 pm
Location: Kirkland, WA

Re: Sweep and Prune or dBVT for this use case

Post by Dirk Gregorius »

For hulls I use SAT. You can find my GDC presentation and some implementation here:
https://box2d.googlecode.com/files/DGre ... DC2013.zip

It is pretty much what I use these days though there is some room for SIMD optimizations to get rid of some branches. The major idea to make the narrow phase (and also the broadphase) fast is to use temporal coherence. So once you have a separating axis you test it in the next frame first. Once you have two touching features you try to rebuild the contact manifold from the last two touching features based on some heuristic (e.g. whether the relative orientation has changed). The SAT is still O(n^2) as I presented it. You should see the pattern here for both broadphase and narrow phase. As long as you hit your caches you do nothing and you should only update a small part of your proxies and manifolds each frame. If everything is settled (before going to sleep) nothing should be updated at all!
sphet
Posts: 38
Joined: Mon May 06, 2013 6:14 pm

Re: Sweep and Prune or dBVT for this use case

Post by sphet »

So after implementing a dynamic AABB tree following the example by Erin, and hooking it up to the broadphase, I did a comparison between the Sweep and Prune and the Dynamic Tree specifically looking at character bones and their costs in the broadphase. I can't really print any numbers out because they don't have any significance to anyone but me, but the results were interesting.

My setup involves many characters with 12 bone rigs running against each other. The bones in one system cannot collide with themselves, only with bones of opposing players.

Having all bones in the world is much slower in SAP than Dynamic Tree (although I needed a large AABB expansion buffer on the tree nodes of approximately 1 meter to make this worthwhile).

Having all the bones in the world in the Dynamic Tree is still slower than having all the bones in a List Shape ( List shape takes 12 bones out of broadphase and replaces with 1 object). Additional cost is incurred because of some bounding box combining, and no mid-phase detection in the list shape versus list shape collider.

Having all the bones in a List Shape in Sweep and Prune is comparable to the Dynamic Tree. I think this is now due to the low number of entities in the broadphase.

I don't think my Dynamic Tree broadphase wrapper is nearly as efficient as the SAP was, given that my class hierarchy (overlapping pair management, dispatcher, etc) is based on SAP's explicit Add/Remove callback. Dynamic Tree leaves it up to the client broadphase to perform AABB test on the existing overlapped pairs to check for removal every frame, which can cause the AABB overlap test to be duplicated in some cases. Plus the plumbing for some of that logic just isn't there right now.

When bones move enough to trigger a leaf re-insert, the cost of combining AABBs to get the SAH costs show up in the profiler on my device, so I need to minimize this happening.

I believe that the Dynamic Tree would be faster than the SAP in all cases (and is easier to debug) plus the benefits of Ray Casting against it are well documented. I am sad that the BVT was not enough to allow me to keep the bones in the broadphase directly, without the list shape collider.

Thanks for your help.
Post Reply