Release v0.8.1 · gorgonia/tensor

In v0.8.1, the built-in transpose function uses a different algorithm to perform in-place transposes.

Prior to this version, the transpose used a cycle-chasing algorithm, which turned out to have poor cache locality. The replacement allocates a new temporary array. The transpose operation is then simply an iterative copy into the temporary array, after which the data is copied from the temporary array back to the original array.
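
To illustrate the idea, here is a minimal sketch of a temporary-array transpose for a 2-D, row-major matrix stored in a flat slice. It is not the library's actual code: the function name `transposeInPlace`, its signature, and the restriction to two dimensions and `float64` are assumptions made for this example only.

```go
package main

import "fmt"

// transposeInPlace is a hypothetical helper, not part of gorgonia/tensor.
// It transposes a rows x cols row-major matrix held in data. The result is
// a cols x rows row-major matrix written back into the same slice.
func transposeInPlace(data []float64, rows, cols int) {
	if len(data) != rows*cols {
		panic("data length does not match rows*cols")
	}

	// Allocate the temporary array that receives the transposed layout.
	tmp := make([]float64, len(data))

	// Iteratively copy each element to its transposed position.
	// Element (i, j) of the source becomes element (j, i) of the result.
	for i := 0; i < rows; i++ {
		for j := 0; j < cols; j++ {
			tmp[j*rows+i] = data[i*cols+j]
		}
	}

	// Copy the temporary array back over the original data.
	copy(data, tmp)
}

func main() {
	// A 2 x 3 matrix: [[1 2 3] [4 5 6]]
	data := []float64{1, 2, 3, 4, 5, 6}
	transposeInPlace(data, 2, 3)
	fmt.Println(data) // [1 4 2 5 3 6]
}
```

The sketch reads the source sequentially and ends with a single bulk copy, at the cost of one extra allocation the size of the data. A cycle-chasing transpose avoids that allocation but jumps around the array as it follows each permutation cycle.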

v0.8.1 also includes an improvement to the FlatIterator structure, contributed by @stuartcarnie. Here are the benchmark results:

benchmark                     old ns/op     new ns/op     delta
BenchmarkComplicatedGet-8     228778        199737        -12.69%

benchmark                     old allocs     new allocs     delta
BenchmarkComplicatedGet-8     2              2              +0.00%

benchmark                     old bytes     new bytes     delta
BenchmarkComplicatedGet-8     112           112           +0.00%
