Running Bullet raytests in Parallel

Pursche01
Posts: 4
Joined: Thu Apr 06, 2017 6:21 pm

Running Bullet raytests in Parallel

Post by Pursche01 »

Hello! I am working on a game engine (surprise, surprise?), and one of our main features is a parallelized Entity Component System. We are investigating the possibility of running Bullet raytests in our ParallelUpdate function. I found conflicting information on the matter, so I decided to just give it a try. I pulled Bullet3 from GitHub as it was at the time of writing, and I get some really interesting results.

I have 5000 entities, each with one component whose ParallelUpdate function calls this function:
https://pastebin.com/kDkWc84z
With _dynamics_world being a btDiscreteDynamicsWorld.

ParallelUpdate gets run in parallel through the use of a task scheduler (job system) called enkiTS:
https://github.com/dougbinks/enkiTS/tree/C++11

Now for the interesting results.
The program is most likely to start and work perfectly fine with our ECS using 16 threads on a Ryzen R7 1700 (we get the same result on an i5 4690k using 4 threads), but occasionally we get one of two crashes. The crashes happen immediately on program start: if a crash doesn't happen during the first update cycle, it doesn't happen at all. A program that started correctly will keep on working and will never randomly crash while "in game".

Crash 1:
Exception thrown at 0x00007FFDB79859F4 (grv_x86_64d.dll) in dev_app_x86_64d.exe: 0xC0000005: Access violation reading location 0x0000000000000008.

This error happens in btVector3.h in the operator-(const btVector3& v1, const btVector3& v2) function. At line 14 in the following pastebin.
https://pastebin.com/2PyxZS6q
It appears that v1 is null.

This gets called from btDbvtBroadphase::rayTest in btDbvt.h. At line 27 in the following pastebin.
https://pastebin.com/Q4Hzg2zC

Looking at the context around line 27, it seems that node is NULL, "depth" is 0, and stack->m_data->childs has two members: index 0 looks valid and index 1 is a nullptr.
I'm thinking about just wrapping everything in "if (node != nullptr)", but since I barely have an idea what is happening here I figured I would ask around.
If anyone has a better idea I would love to hear it!

Crash 2:
HEAP[dev_app_x86_64d.exe]: Invalid address specified to RtlValidateHeap( 000001E324690000, 000001E333658200 )

This error happens in btAlignedAllocator.cpp, at line 3 in the following Pastebin.
https://pastebin.com/px6Hn7zr

This one is a crash in a simple free function; my guess is that it tries to free the same memory address twice. The void* looks valid.
This almost certainly happens because I'm running it in parallel. I tracked this down to btDbvtBroadphase::rayTest in btDbvt.h, line 21 in the following pastebin.
https://pastebin.com/Q4Hzg2zC

It looks like it is trying to resize the stack to double the normal size. I'm thinking that I can solve this by simply increasing the default size of the stack. I heard that I should be able to do it through a btCollisionConfiguration passed to the btWorld when I create it, but I'm absolutely stumped as to how I actually do this. Do I need to increase the persistentManifoldPool? The CollisionAlgorithmPool? How exactly do I create a btCollisionConfiguration that isn't the default one? It looks like I can't set the members either in the constructor or with functions. Do I need to create my own btCollisionConfiguration class that inherits from the default one?



If you think that I am far away from the solution, please say so and I will try other methods. Since I have almost no experience with Bullet (I did use Box2D, PhysX and a bit of Havok, though), I figured I should ask! Do I perhaps need to try something other than DbvtBroadphase?

I am happy to provide any extra information you might need about these crashes. The code (with defaults that should replicate the crash) is public in our Bitbucket and I'll add a link to it on request (not sure if it's against the forum rules...). I am also available all weekend, EU time, in case someone wants to look at the problem directly through TeamViewer or something similar.

Thank you in advance. I will happily provide beer money through PayPal in exchange for help that leads to a solution. :)
ktfh
Posts: 44
Joined: Thu Apr 14, 2016 3:44 pm

Re: Running Bullet raytests in Parallel

Post by ktfh »

btDbvtBroadphase::rayTest isn't thread safe by default. I think you need to build Bullet with BT_THREADSAFE defined, and there may be some other setup requirements. Check out lunkhound's thread; he implemented it all: http://www.bulletphysics.org/Bullet/php ... =9&t=10232
Pursche01
Posts: 4
Joined: Thu Apr 06, 2017 6:21 pm

Re: Running Bullet raytests in Parallel

Post by Pursche01 »

ktfh wrote:btDbvtBroadphase::rayTest isn't thread safe by default. I think you need to build Bullet with BT_THREADSAFE defined, and there may be some other setup requirements. Check out lunkhound's thread; he implemented it all: http://www.bulletphysics.org/Bullet/php ... =9&t=10232
Thank you, I will look at the thread and do some experiments with it tomorrow. Feel free to PM me a PayPal-connected email if you want a beer. Or you could find a charity or donation page of your choice. :D
Pursche01
Posts: 4
Joined: Thu Apr 06, 2017 6:21 pm

Re: Running Bullet raytests in Parallel

Post by Pursche01 »

ktfh wrote:btDbvtBroadphase::rayTest isn't thread safe by default. I think you need to build Bullet with BT_THREADSAFE defined, and there may be some other setup requirements. Check out lunkhound's thread; he implemented it all: http://www.bulletphysics.org/Bullet/php ... =9&t=10232
Sadly, this didn't fix the problems. I built everything with BT_THREADSAFE defined to 1 (which I verified by checking which defines are actually being compiled), but I still get the exact same crashes. Lunkhound did say that btDbvtBroadphase::rayTest should be thread safe once this define is set.
ktfh
Posts: 44
Joined: Thu Apr 14, 2016 3:44 pm

Re: Running Bullet raytests in Parallel

Post by ktfh »

Can you paste a full stack trace? It might help a bit. Both of the crashes you described, though, seem to have been caused by the btDbvtNode stack being shared across threads. Compiling the Bullet lib with BT_THREADSAFE should have created an array of 64 stacks. Double check that you're compiling and linking to the newest version of Bullet, and try stepping through the code or compiling with some printfs to see if the stack in btDbvtBroadphase::rayTest is unique to each thread.
https://github.com/bulletphysics/bullet ... e.cpp#L236
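
To illustrate the "one stack per thread" idea, here is a minimal C++ sketch. It is not Bullet's code: Node, TraversalScratch, sumTree and sumTreeInParallel are hypothetical names, and the traversal just sums node values instead of testing a ray. The only detail taken from the post above is the array of 64 stacks. The point is that each worker only ever touches its own scratch stack, so a resize on one thread cannot free or move memory another thread is still reading.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a tree node; this is NOT Bullet's btDbvtNode.
struct Node {
    int value = 0;
    const Node* childs[2] = {nullptr, nullptr};
};

// Mirrors the "array of 64 stacks" mentioned above (assumed upper bound).
constexpr std::size_t kMaxThreads = 64;

// One scratch stack per worker thread.
struct TraversalScratch {
    std::array<std::vector<const Node*>, kMaxThreads> stacks;
};

// Iterative depth-first traversal that sums node values using only the
// calling thread's stack. threadIndex must be unique per concurrent caller.
int sumTree(const Node* root, TraversalScratch& scratch, std::size_t threadIndex) {
    assert(threadIndex < kMaxThreads);
    std::vector<const Node*>& stack = scratch.stacks[threadIndex];
    stack.clear();
    if (root) stack.push_back(root);
    int total = 0;
    while (!stack.empty()) {
        const Node* node = stack.back();
        stack.pop_back();
        total += node->value;
        for (const Node* child : node->childs) {
            if (child) stack.push_back(child);  // may grow this thread's stack only
        }
    }
    return total;
}

// Runs the same read-only traversal from several threads at once.
std::vector<int> sumTreeInParallel(const Node* root, std::size_t threadCount) {
    assert(threadCount <= kMaxThreads);
    TraversalScratch scratch;
    std::vector<int> results(threadCount, 0);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < threadCount; ++i) {
        // Each worker writes only results[i] and uses only scratch.stacks[i].
        workers.emplace_back([&, i] { results[i] = sumTree(root, scratch, i); });
    }
    for (std::thread& w : workers) w.join();
    return results;
}
```

If every worker instead pushed to and resized a single shared vector, the symptoms would plausibly resemble the two crashes described earlier in the thread: a popped null node, or a buffer freed twice during a resize.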
Pursche01
Posts: 4
Joined: Thu Apr 06, 2017 6:21 pm

Re: Running Bullet raytests in Parallel

Post by Pursche01 »

ktfh wrote:Can you paste a full stack trace? It might help a bit. Both of the crashes you described, though, seem to have been caused by the btDbvtNode stack being shared across threads. Compiling the Bullet lib with BT_THREADSAFE should have created an array of 64 stacks. Double check that you're compiling and linking to the newest version of Bullet, and try stepping through the code or compiling with some printfs to see if the stack in btDbvtBroadphase::rayTest is unique to each thread.
https://github.com/bulletphysics/bullet ... e.cpp#L236
Okay, I feel really stupid now. It turns out that in Premake5, this:
defines {
    "_DEBUG",
    "BT_THREADSAFE = 1"
}

is not the same thing as this:
defines {
    "_DEBUG",
    "BT_THREADSAFE=1"
}

The first one, which is what I used initially, seems to define BT_THREADSAFE without assigning it a value. Thus BT_THREADSAFE looks defined in Visual Studio, but it still fails the #if checks.
I changed to the second one and it now seems to work perfectly. Thank you very much! :D
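
For anyone hitting the same thing, a small compile-time check can catch a define that is present but has no usable value. This is a sketch only: BT_THREADSAFE is the macro from this thread, while btThreadsafeEnabled is a hypothetical helper name. The `+ 0` keeps the preprocessor expression valid even if the macro expands to nothing.

```cpp
#include <cassert>

// Hypothetical helper (not part of Bullet). Returns true only when
// BT_THREADSAFE is defined AND expands to a non-zero value.
// "(BT_THREADSAFE + 0)" becomes "(+ 0)" for an empty definition, which
// evaluates to 0 instead of breaking the #if expression.
constexpr bool btThreadsafeEnabled() {
#if defined(BT_THREADSAFE) && (BT_THREADSAFE + 0)
    return true;
#else
    return false;
#endif
}
```

A line such as `static_assert(btThreadsafeEnabled(), "BT_THREADSAFE must be defined to 1");` in a source file built with the same defines would then fail the build instead of crashing at runtime. Note that this only checks the translation unit it is compiled in; the Bullet library itself has to be built with the same define.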