Would it make sense to add non-allocating versions of Vec2 and Mat22 methods (i.e. versions that take a destination parameter)?
Yes, that might be a good way to avoid having to explicitly inline everything; it's at least worth a shot. I believe I added a few local versions of Vec2 operators, but going further might be very worthwhile, as I think the Mat22 and XForm ops currently make up a decent proportion of allocations.
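To make the idea concrete, here is a rough sketch of what "destination parameter" versions could look like next to the allocating ones. The class bodies and the names `addToOut` and `mulToOut` are made up for illustration and are not taken from the actual JBox2d source; treat the layout (two column vectors per matrix) as an assumption too.

```java
// Hypothetical minimal types for illustration; NOT the real JBox2d classes.
final class Vec2 {
    float x, y;

    Vec2(float x, float y) {
        this.x = x;
        this.y = y;
    }

    // Allocating version: creates a new Vec2 on every call.
    Vec2 add(Vec2 v) {
        return new Vec2(x + v.x, y + v.y);
    }

    // Non-allocating version: writes a + b into out.
    // out may be the same object as a or b.
    static void addToOut(Vec2 a, Vec2 b, Vec2 out) {
        out.x = a.x + b.x;
        out.y = a.y + b.y;
    }
}

final class Mat22 {
    // The matrix is stored as two column vectors (an assumption of this sketch).
    final Vec2 col1, col2;

    Mat22(Vec2 col1, Vec2 col2) {
        this.col1 = col1;
        this.col2 = col2;
    }

    // Allocating version: returns a fresh Vec2 holding M * v.
    Vec2 mul(Vec2 v) {
        return new Vec2(col1.x * v.x + col2.x * v.y,
                        col1.y * v.x + col2.y * v.y);
    }

    // Non-allocating version: writes M * v into out.
    // Both components are computed into locals first so that out may alias v.
    static void mulToOut(Mat22 m, Vec2 v, Vec2 out) {
        float x = m.col1.x * v.x + m.col2.x * v.y;
        float y = m.col1.y * v.x + m.col2.y * v.y;
        out.x = x;
        out.y = y;
    }
}
```

In a hot loop the caller would create one `Vec2` up front and pass it as `out` on every iteration, so the loop body itself allocates nothing. The local-variable step in `mulToOut` matters: writing `out.x` directly before reading `v.y` would give wrong results whenever the caller passes the same object for both.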
Would it be acceptable to add private members to classes to be used as temporaries inside methods? If so, should these temporaries be shared between methods when possible to save space?
Absolutely. In a few cases I did this with static class variables, as BorisTheBrave mentioned, but this can be a bit dangerous, esp. if anyone ever decides to try anything multithreaded (i.e. run more than one world at once or something like that), which is conceivable. But I wouldn't worry too much about that for now: nobody's really doing anything like that, and it's considered unsupported. IOW, I'd be perfectly happy to merge changes that relied on private statics, and I'd probably do them myself if I had the time.
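For what it's worth, the temporaries idea might look something like the sketch below. The `Body` class here is a made-up stand-in (not the real JBox2d `Body`), and `tmp`, `step` and `distanceSquaredTo` are hypothetical names; the point is only to show one scratch vector shared by several methods, with the static variant noted in a comment.

```java
// Hypothetical minimal vector type for illustration; NOT the real JBox2d Vec2.
final class Vec2 {
    float x, y;

    Vec2(float x, float y) {
        this.x = x;
        this.y = y;
    }

    void set(float x, float y) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical stand-in class; the fields and methods are assumptions of this sketch.
final class Body {
    final Vec2 position = new Vec2(0f, 0f);
    final Vec2 velocity = new Vec2(0f, 0f);

    // Per-instance scratch vector shared by the methods below.
    // Sharing is only safe because neither method calls the other while
    // it still needs the value it stored in tmp.
    private final Vec2 tmp = new Vec2(0f, 0f);

    // A static scratch vector would save one Vec2 per Body, but every Body
    // in every world (and thread) would then write to the same object:
    // private static final Vec2 sTmp = new Vec2(0f, 0f);

    // Advances position by velocity * dt without allocating.
    void step(float dt) {
        tmp.set(velocity.x * dt, velocity.y * dt);
        position.x += tmp.x;
        position.y += tmp.y;
    }

    // Squared distance from this body's position to p, without allocating.
    float distanceSquaredTo(Vec2 p) {
        tmp.set(p.x - position.x, p.y - position.y);
        return tmp.x * tmp.x + tmp.y * tmp.y;
    }
}
```

The instance field costs a little memory per object but stays correct if two worlds run on different threads, as long as each object is only touched by one thread at a time. The static version is the cheaper option and fine for the single-threaded case discussed above.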
In an ideal world, I'd love to get the Pyramid demo running with fewer than 10k Vec2 allocations per frame; that's going to be tough to achieve, but it would address the main speed concerns and allow JBox2d to run a lot better on constrained devices where GC is disproportionately expensive.
FWIW, the changes along these lines that I already made (over the last several versions) led to roughly a 100% speedup, so they are extremely valuable and a fairly good use of optimization energy, esp. since cache-focused optimizations are far less effective in Java than in C++. I did some tests on this, and the desktop JVM is actually amazingly smart about rearranging things in memory to optimize the hit rate, even if your algorithm doesn't make it easy. If you ever see an instance where Java code runs faster than the "same" C++ code, this is usually what's being exploited.