The small-kernels version of P3 allocates a number of arrays on every P3 call. These should be turned into buffers that are allocated once and reused across calls.
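A rough sketch of the intended change, not the actual P3 code: `Workspace`, `step_allocating`, `step_buffered`, `tmp_a` and `tmp_b` are all hypothetical names, and plain `std::vector` stands in for whatever array type the real kernels use. The per-call temporaries move into a struct that is allocated once at initialization and passed in on each call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scratch space; the field names are illustrative only.
struct Workspace {
  std::vector<double> tmp_a;
  std::vector<double> tmp_b;

  // Allocate once, at initialization, sized for the whole problem.
  Workspace(std::size_t ncol, std::size_t nlev)
      : tmp_a(ncol * nlev), tmp_b(ncol * nlev) {}
};

// Before: temporaries are allocated (and freed) on every call.
double step_allocating(const std::vector<double>& q) {
  std::vector<double> tmp_a(q.size());  // per-call allocation
  std::vector<double> tmp_b(q.size());  // per-call allocation
  double sum = 0.0;
  for (std::size_t i = 0; i < q.size(); ++i) {
    tmp_a[i] = 2.0 * q[i];
    tmp_b[i] = tmp_a[i] + 1.0;
    sum += tmp_b[i];
  }
  return sum;
}

// After: the caller owns the buffers and they are reused across calls.
double step_buffered(const std::vector<double>& q, Workspace& ws) {
  // The workspace must be at least as large as the input.
  assert(ws.tmp_a.size() >= q.size() && ws.tmp_b.size() >= q.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < q.size(); ++i) {
    ws.tmp_a[i] = 2.0 * q[i];
    ws.tmp_b[i] = ws.tmp_a[i] + 1.0;
    sum += ws.tmp_b[i];
  }
  return sum;
}
```

The caller would construct one `Workspace` up front and pass the same instance to every `step_buffered` call, so no allocation happens inside the time-step loop.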