Multithreaded Bullet and Triangle Meshes

KulSeran
Posts: 9
Joined: Thu Dec 10, 2009 10:32 pm

Post by KulSeran »

So, I've been attempting to get the multi-threaded version of Bullet's collision dispatcher working, and I've come across several issues along the way.
We are using Bullet 2.75, but are in the process of patching up to 2.76.
We are also going to be patching in 2.75-beta2 for the PS3.

In all three versions:
1) *ACK!* The equivalent functions for btManifoldResult::addContactPoint() aren't providing me with valid data in m_index0 and m_partId0. This is very important to our system, as we have per-triangle collision data we are using.

2) The ContactAddedCallback is not called. Understandably, that comes from not being able to run a callback from a thread/SPU, and since we only use it for filtering, a workaround seems simple. But was there a plan to implement it?
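
For the filtering case in 2), one possible workaround is to record contacts on the worker threads and run the filter afterwards on the main thread. The sketch below is standalone and does not use Bullet's headers; `ContactRecord`, `DeferredContactQueue`, and the filter signature are hypothetical names for illustration, not Bullet API.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical record of one contact. The fields mirror the per-triangle
// data mentioned above (part id and triangle index); not a Bullet type.
struct ContactRecord {
    int bodyA;
    int bodyB;
    int partId0;
    int index0;
};

// Hypothetical queue: workers push contacts, the main thread drains them.
class DeferredContactQueue {
public:
    // Safe to call from any worker thread.
    void push(const ContactRecord& c) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push_back(c);
    }

    // Call from the main thread once all collision tasks have finished.
    // Returns the contacts accepted by the filter and empties the queue.
    std::vector<ContactRecord> drain(
        const std::function<bool(const ContactRecord&)>& keep) {
        assert(keep && "filter must be callable");
        std::vector<ContactRecord> pending;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            pending.swap(m_pending);
        }
        std::vector<ContactRecord> accepted;
        for (const ContactRecord& c : pending) {
            if (keep(c)) accepted.push_back(c);
        }
        return accepted;
    }

private:
    std::mutex m_mutex;
    std::vector<ContactRecord> m_pending;
};
```

The filter then runs in one place, on one thread, so it can touch game state that would not be safe to reach from a worker or an SPU.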

Testing with just the 2.75 version:
1) The system seems to issue very few tasks. With the number of tasks set to 16, Bullet seems to consistently call "sendRequest" and then immediately "waitForResponse" without launching additional jobs.

Using this archetype (sorry, can't post the actual code...) for the btThreadSupportInterface:

virtual void sendRequest(uint32_t uiCommand, ppu_address_t uiArgument0, uint32_t uiArgument1)
{
    // issue request to thread pool
}
virtual void waitForResponse(unsigned int *puiArgument0, unsigned int *puiArgument1)
{
    // wait for ALL jobs in thread pool
}
virtual void startSPU()
{
    // do nothing
}
virtual void stopSPU()
{
    // wait for ALL jobs in thread pool
}

What would cause this? What kinds of work will actually see a speedup from the multithreading? Does the PS3 version suffer from the same problem?

2) The system seems to take a significant amount of time while nothing is going on. Our controlled objects (16+ controllers) perform a significant number of sweep tests in serial using the btDynamicsWorld::addAction() interface. This is based on the kinematic controller code, and doesn't seem like it could run in parallel, but it is ~67% of our physics update.
The other 33% is the regular collision update running on ~150 objects, all of which are either kinematic or deactivated. This comes out to ~8ms of physics update. Are there any flags that I may not have set that would speed this up?
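
The sendRequest/waitForResponse archetype from 1) can be fleshed out as a standalone sketch. It starts one std::thread per job instead of using a real pool, and `SimpleThreadSupport` and its signatures are hypothetical, not Bullet's btThreadSupportInterface. One assumption to flag: here waitForResponse returns as soon as any one job finishes instead of waiting for all of them. Whether Bullet expects that is not confirmed in this thread, but if a dispatcher only issues new work after a response, a wait-for-all version would act as a barrier, so it may be worth checking.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for a thread-support layer. It only mimics the
// request/response shape; it is not Bullet's interface.
class SimpleThreadSupport {
public:
    ~SimpleThreadSupport() {
        // Join every worker before the members they use are destroyed.
        for (std::thread& t : m_threads) {
            if (t.joinable()) t.join();
        }
    }

    // Start one job on its own thread and return immediately.
    // Intended to be called from a single dispatching thread.
    void sendRequest(unsigned int taskId, std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            ++m_outstanding;
        }
        m_threads.emplace_back([this, taskId, job]() {
            job();
            {
                std::lock_guard<std::mutex> lock(m_mutex);
                m_finished.push(taskId);
            }
            m_cv.notify_one();
        });
    }

    // Block until ONE outstanding job has finished and return its id.
    // Assumption: the caller wants per-task completion, not a full barrier.
    unsigned int waitForResponse() {
        std::unique_lock<std::mutex> lock(m_mutex);
        assert(m_outstanding > 0 && "no outstanding jobs to wait for");
        m_cv.wait(lock, [this]() { return !m_finished.empty(); });
        unsigned int taskId = m_finished.front();
        m_finished.pop();
        --m_outstanding;
        return taskId;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::queue<unsigned int> m_finished;
    std::vector<std::thread> m_threads;
    unsigned int m_outstanding = 0;
};
```

With this shape a caller can keep several jobs in flight and refill a slot each time waitForResponse returns, rather than draining everything after each request.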
KulSeran
Posts: 9
Joined: Thu Dec 10, 2009 10:32 pm

Re: Multithreaded Bullet and Triangle Meshes

Post by KulSeran »

I attached the git diff/patch I used to add the triangle support back into the multithreaded Bullet lib. (OK, it is in a code block because I couldn't get the forum to let me attach a file.)

After experimenting with it, now that it is all finally working, the multithreaded dispatcher results in almost a 2x slowdown vs. the serial dispatcher.
Maybe this is an indication that I set up the btThreadSupportInterface incorrectly? What should I be looking at to get the speedup I'd expect from adding the threading?
Erwin Coumans
Site Admin
Posts: 4221
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: Multithreaded Bullet and Triangle Meshes

Post by Erwin Coumans »

The number of tasks should be similar to the number of actual cores. So for a quadcore machine, use 4. The narrowphase should improve performance, although less than linear scaling over the number of cores. The narrowphase collision detection of 1000 convexes on a triangle mesh in Benchmark 5 runs more than twice as fast on one of my test machines.
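
As a small sketch of that advice, the task count can be derived from the core count at startup. `chooseTaskCount` is a hypothetical helper, not part of Bullet, and it assumes `maxTasks` is whatever upper limit the application allows.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Hypothetical helper: pick a task count close to the core count.
// std::thread::hardware_concurrency() may return 0 when the value is
// unknown, so fall back to a single task in that case.
unsigned int chooseTaskCount(unsigned int maxTasks) {
    assert(maxTasks >= 1 && "need at least one task slot");
    unsigned int cores = std::thread::hardware_concurrency();
    if (cores == 0) cores = 1;
    return std::min(cores, maxTasks);
}
```

On a quad-core machine with a limit of 16 this would typically give 4 instead of 16.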

You can check the performance difference, using the Bullet/Demos/Benchmarks. Enable benchmark 5, edit main.cpp and use #define benchmarkDemo benchmarkDemo5 at the top.

Use cmake-gui under Windows and enable the multi-threaded narrowphase:
[Attached image: bench.JPG]
I'm not sure why you don't see any performance increase. How many objects are colliding against the mesh? Can you attach a .bullet file? (.zip and .bullet files can be attached)

Also, would it be possible to share the patch using svn diff (can't use git) and use the issue tracker?
Thanks a lot,
Erwin
KulSeran
Posts: 9
Joined: Thu Dec 10, 2009 10:32 pm

Re: Multithreaded Bullet and Triangle Meshes

Post by KulSeran »

OK. That might be a while.
1) I don't have SVN installed or set up on any of my machines, but if I get a chance I'll try to get the patch into that diff format.
Strangely, I thought diff outputs were all the same format, but I guess not.

2) I'll see if I can replicate the slowdown in a non-production scene. In our production level, we are running 100+ static objects, 1 static concave mesh, 16+ kinematic objects (using code adapted from the kinematic character controller), and occasionally spawning 64+ pieces of dynamic shrapnel. There is a lot of custom filtering going on, so only the kinematics can collide with most things. The dynamic shrapnel can't collide with other dynamic shrapnel. The mesh is 2000+ tris, and all the other objects use btCompoundShape with an assortment of spheres and boxes making up the sub-shapes (though there is a subset that is compound with only one child).

3) The serialization code (and soft body code) is not in our engine for the moment, since we weren't planning on using it. There have also been a host of small items that don't compile quite right on the X360 platform, so gutting things we didn't need has kept us from having to edit too much Bullet source code. If the process of including it is painless enough, I'll see about adding the serialization to export a test scene, if I can make one.
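
The filtering described in 2) can be sketched as a group/mask test. This is only an illustration under assumptions: the group names, the masks, and `needsCollision` are hypothetical and not taken from our engine; the rule itself is the usual "each object's mask must accept the other's group" check.

```cpp
#include <cassert>

// Hypothetical collision groups for the scene described above.
enum CollisionGroup {
    GROUP_STATIC    = 1 << 0,
    GROUP_KINEMATIC = 1 << 1,
    GROUP_SHRAPNEL  = 1 << 2
};

struct FilterInfo {
    int group;  // which group this object belongs to
    int mask;   // which groups it is allowed to collide with
};

// Assumed layout: statics never test against each other, kinematics test
// against everything, shrapnel ignores other shrapnel.
const FilterInfo kStatic    = {GROUP_STATIC, GROUP_KINEMATIC | GROUP_SHRAPNEL};
const FilterInfo kKinematic = {GROUP_KINEMATIC,
                               GROUP_STATIC | GROUP_KINEMATIC | GROUP_SHRAPNEL};
const FilterInfo kShrapnel  = {GROUP_SHRAPNEL, GROUP_STATIC | GROUP_KINEMATIC};

// Two objects collide only if each one's mask accepts the other's group.
bool needsCollision(const FilterInfo& a, const FilterInfo& b) {
    assert(a.group != 0 && b.group != 0);  // every object needs a group
    return (a.group & b.mask) != 0 && (b.group & a.mask) != 0;
}
```

Pairs rejected here never reach the narrowphase, so with this layout 64 pieces of shrapnel add shrapnel-vs-world pairs but no shrapnel-vs-shrapnel pairs.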

The threaded dispatcher seems to occasionally dispatch multiple threads when there is dynamic shrapnel around, and sends occasional single jobs (i.e., still serial) while in the kinematic controllers.
But overall, it seems to just add overhead, since the physics simulation still remains serial most of the time. I suspect it might just be that there are very few dynamic objects to take advantage of the parallel code, considering the controllers all run in serial with the "action" interface. It does seem, though, that the CPU usage of Bullet's collision dispatcher (parallel or not) grows linearly with the number of objects in the collision world. It doesn't seem to matter whether the items are inactive or not; they take up the same CPU. There are spikes when we spawn shrapnel, but the base CPU usage for a scene with 0 dynamic objects, ~100 static objects and 16 kinematic controllers seems very high (4-6ms serial, 5-8ms parallel).
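
The suspicion above can be put into rough numbers with Amdahl's law. This is a sketch under assumptions: it takes the ~67% serial share reported earlier in the thread, assumes the remaining ~33% scales perfectly, and ignores threading overhead. `idealSpeedup` is a hypothetical helper name.

```cpp
#include <cassert>

// Ideal speedup when only part of the update can run in parallel.
// serialFraction: share of the update that stays serial (0..1).
// workers: number of threads used for the parallel part.
// Assumes perfect scaling and zero threading overhead, so the result
// is an upper bound, not a prediction.
double idealSpeedup(double serialFraction, int workers) {
    assert(serialFraction >= 0.0 && serialFraction <= 1.0);
    assert(workers >= 1);
    double parallelFraction = 1.0 - serialFraction;
    return 1.0 / (serialFraction + parallelFraction / workers);
}
```

With a serial fraction of 0.67 and 4 workers this gives about 1.33x, so even in the ideal case the gain is small, and any dispatch overhead can turn it into a net loss.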
Erwin Coumans
Site Admin
Posts: 4221
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: Multithreaded Bullet and Triangle Meshes

Post by Erwin Coumans »

Ah, you might be hitting a few known issues:

The btKinematicCharacterController doesn't work well and is very slow indeed.

btCompoundShape also doesn't perform well when using the btSpuGatheringCollisionDispatcher, see this issue.

If you are using Bullet 2.76, the serialization is built into the library (in Bullet/src/LinearMath). The Extras/Serialization is just for loading...

Thanks,
Erwin