Making bullet faster.

GilsonLaurent
Posts: 8
Joined: Wed Oct 31, 2007 2:59 pm

Post by GilsonLaurent »

Hi,

I'm evaluating several engines for a bigger project. It's about simulating a large number of small convex trimesh objects colliding. I do not expect real-time performance.

But Bullet is way slower than any other engine. I think I missed something, triggered a debug mode, or forgot to initialise some speedups. It is compiled in release mode. This is the initialisation:

Code:

    int maxProxies = 32766;
    btDefaultCollisionConfiguration* m_collisionConfiguration = new btDefaultCollisionConfiguration();
    
    btCollisionDispatcher* m_dispatcher = new btCollisionDispatcher(m_collisionConfiguration);
    
    btVector3 worldAabbMin(-50, -50, -50);
    btVector3 worldAabbMax(50, 50, 50);
    btAxisSweep3* m_overlappingPairCache = new btAxisSweep3(worldAabbMin, worldAabbMax, maxProxies);
    
    btSequentialImpulseConstraintSolver* sol = new btSequentialImpulseConstraintSolver;
    
    world = new btDiscreteDynamicsWorld(m_dispatcher, m_overlappingPairCache, sol, m_collisionConfiguration);
    
    world->setGravity(btVector3(0, -9.81, 0));
And this generates an object:

Code:

    float mass = 1.f;
    btTransform startTransform;
    startTransform.setIdentity();
    startTransform.setOrigin(btVector3(pos_x, pos_y, pos_z));
    
    btTriangleMesh* mesh = new btTriangleMesh();
    
    float* tri = m_type->getVertex();        // the triangles: 9 floats (3 vertices x xyz) each
    int number_of_tris = m_type->getSize();  // number of triangles
    for (int i = 0; i < number_of_tris; i++) {
        mesh->addTriangle(btVector3(tri[i*9], tri[(i*9)+1], tri[(i*9)+2]),
                btVector3(tri[(i*9)+3], tri[(i*9)+4], tri[(i*9)+5]),
                btVector3(tri[(i*9)+6], tri[(i*9)+7], tri[(i*9)+8]));
    }
    btCollisionShape* shape = new btConvexTriangleMeshShape(mesh);
    
    btVector3 localInertia(0, 0, 0);
    localInertia[0] = m_type->inertia[0][0];
    localInertia[1] = m_type->inertia[1][1];
    localInertia[2] = m_type->inertia[2][2];
    
    btDefaultMotionState* myMotionState = new btDefaultMotionState(startTransform);
    m_body = new btRigidBody(btRigidBody::btRigidBodyConstructionInfo(mass, myMotionState, shape, localInertia));
I'm on Linux, using version 2.67. It cannot be the missing pthreads. I see ~30 s per step for Bullet vs. 0.1-0.2 s per step for the other candidates.

Any ideas ?

Thanks
sparkprime
Posts: 508
Joined: Fri May 30, 2008 2:51 am
Location: Ossining, New York

Re: Making bullet faster.

Post by sparkprime »

This may be a silly question, but what do you consider one step to be? Are you letting Bullet run steps internally, or explicitly doing each step yourself?

If you ask for 1 second and Bullet does 60 steps, that may explain the large difference in time.
GilsonLaurent
Posts: 8
Joined: Wed Oct 31, 2007 2:59 pm

Re: Making bullet faster.

Post by GilsonLaurent »

One step is 0.01 s, run by

Code:

world->stepSimulation((btScalar)0.01,0);
I hope that makes one step of 10 ms?
chunky
Posts: 145
Joined: Tue Oct 30, 2007 9:23 pm

Re: Making bullet faster.

Post by chunky »

maxSubSteps = 0 is not your friend.

Your stepSimulation call would be better written as stepSimulation((btScalar)0.01, 1, (btScalar)0.01);

I don't know if that fixes your specific case, but it's something to be aware of.

Hm. Question for someone else, though: How *would* you do single steps of the simulation as described above? It seems like you're likely to run into issues of float accuracy in the time accumulator leading [over a long enough period] to individual extra steps.
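For what it's worth, here is a minimal sketch of how I understand a fixed-timestep accumulator of this kind to behave. It is a simplified model, not Bullet's actual source; StepAccumulator, advance and localTime are names made up for illustration. If the model is right, maxSubSteps = 0 bypasses the accumulator and every call takes exactly one step of timeStep, whereas maxSubSteps >= 1 carries leftover time between calls, which is where rounding could eventually add or drop a substep.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical, simplified model of a fixed-timestep accumulator.
// This is an illustration only, not code taken from Bullet.
struct StepAccumulator {
    double localTime = 0.0;  // time accumulated but not yet simulated

    // Returns how many substeps would be simulated for this call.
    int advance(double timeStep, int maxSubSteps, double fixedTimeStep) {
        assert(fixedTimeStep > 0.0);
        if (maxSubSteps <= 0) {
            // Variable-timestep mode: one step of exactly timeStep,
            // nothing is carried over to the next call.
            localTime = timeStep;
            return timeStep > 0.0 ? 1 : 0;
        }
        // Fixed-timestep mode: accumulate, then consume whole substeps.
        localTime += timeStep;
        int pending = 0;
        if (localTime >= fixedTimeStep) {
            pending = static_cast<int>(localTime / fixedTimeStep);
            localTime -= pending * fixedTimeStep;
        }
        // Substeps beyond maxSubSteps are dropped, so simulated time is lost.
        return std::min(pending, maxSubSteps);
    }
};
```

In this model, passing timeStep equal to fixedTimeStep with maxSubSteps = 1 normally gives one substep per call, but the division above is done in floating point, so whether a given call yields 0, 1 or 2 pending substeps after many accumulations depends on rounding.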

Gary (-;
sparkprime
Posts: 508
Joined: Fri May 30, 2008 2:51 am
Location: Ossining, New York

Re: Making bullet faster.

Post by sparkprime »

Try a variety of stepSimulation calls, and also get some profiling done.

On Linux, I recommend "google performance tools" and/or "valgrind + callgrind", neither of which requires invasive recompilation. Edit: although debug symbols would obviously be required.
GilsonLaurent
Posts: 8
Joined: Wed Oct 31, 2007 2:59 pm

Re: Making bullet faster.

Post by GilsonLaurent »

The different stepSimulation calls make little to no difference. callgrind shows most of the time is spent in getLocalSupportingVertex. That makes sense: my models are very complex convex hulls, with 1300 vertices per hull (don't ask, that number is not going to change). I switched from convex triangle meshes to convex hulls, which helped a bit.
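To illustrate why the vertex count matters, here is a minimal sketch of a brute-force support query, i.e. finding the hull vertex farthest along a given direction. This is a generic illustration and not Bullet's implementation; Vec3, dot and supportIndex are made-up names. Each query scans every vertex, so with 1300 vertices per hull and many queries per colliding pair, the cost adds up quickly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

// Plain 3D dot product.
double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Hypothetical brute-force support mapping: returns the index of the
// vertex with the largest projection onto dir. Cost is O(n) per query.
std::size_t supportIndex(const std::vector<Vec3>& verts, const Vec3& dir) {
    assert(!verts.empty());
    std::size_t best = 0;
    double bestDot = dot(verts[0], dir);
    for (std::size_t i = 1; i < verts.size(); ++i) {
        const double d = dot(verts[i], dir);
        if (d > bestDot) {
            bestDot = d;
            best = i;
        }
    }
    return best;
}
```

Iterative convex collision algorithms typically issue several such queries per pair per step, which would be consistent with the profile above.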

Is it faster to split the convex hulls into smaller subparts and use compound shapes? Does the compound shape prevent getLocalSupportingVertex-calls if the subshape is out of reach?

Thanks
sparkprime
Posts: 508
Joined: Fri May 30, 2008 2:51 am
Location: Ossining, New York

Re: Making bullet faster.

Post by sparkprime »

Do you mean localGetSupportingVertex?
sparkprime
Posts: 508
Joined: Fri May 30, 2008 2:51 am
Location: Ossining, New York

Re: Making bullet faster.

Post by sparkprime »

Also I forgot to mention kcachegrind is a nice tool for visualising the callgrind output, if you haven't already found it.

I think callgrind's "time" is not real CPU time but just the number of instructions executed, so the results may be skewed a bit. Google performance tools is a bit nicer in this respect, as the program runs on the real CPU rather than in a simulation. I usually use both, but I prefer callgrind, mainly because of kcachegrind :)