Unoptimized BLAKE2b #60

@veorq

Issue reported in the context of Kudelski Security's audit

The implementation does not leverage vectorized instructions. For example, on platforms supporting AVX2, a portable reference implementation is about 40% slower than an AVX2 implementation, according to SUPERCOP benchmarks on the Cannon Lake microarchitecture.

An AVX2 implementation of BLAKE2b can be found in the SUPERCOP archive as well as in libsodium.
An AVX-512-optimized version of BLAKE2s (not BLAKE2b) is used in WireGuard.
Similar techniques may be used to optimize BLAKE2b for the AVX-512 instruction set.
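
To illustrate where the speedup comes from, here is a rough Python sketch of the data layout that AVX2 implementations of BLAKE2b typically rely on. It is not this project's code and not a copy of the SUPERCOP or libsodium sources. The 16-word state is held as four rows of four 64-bit lanes (one row per 256-bit register), the four G mixes of a half-round run lane-wise in a single pass, and the diagonal step is obtained by rotating lanes within rows b, c and d. The names `g_lanes`, `rot_lanes`, `compress` and `blake2b_lanes` are made up for this example, and plain Python lists stand in for vector registers, so this shows the structure only and says nothing about performance.

```python
import hashlib
import struct

MASK = (1 << 64) - 1
IV = [0x6a09e667f3bcc908, 0xbb67ae8584caa73b, 0x3c6ef372fe94f82b,
      0xa54ff53a5f1d36f1, 0x510e527fade682d1, 0x9b05688c2b3e6c1f,
      0x1f83d9abfb41bd6b, 0x5be0cd19137e2179]
SIGMA = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
    [14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3],
    [11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4],
    [7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8],
    [9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13],
    [2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9],
    [12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11],
    [13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10],
    [6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5],
    [10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0],
]

# Each helper works on a "row" of four 64-bit lanes, standing in for one
# 256-bit vector register.
def add(*rows):
    return [sum(ws) & MASK for ws in zip(*rows)]

def xor(p, q):
    return [x ^ y for x, y in zip(p, q)]

def rotr(row, n):
    return [((w >> n) | (w << (64 - n))) & MASK for w in row]

def rot_lanes(row, k):
    # Rotate lanes left by k positions (a lane permutation, not a bit rotation).
    return row[k:] + row[:k]

def g_lanes(a, b, c, d, x, y):
    # Four independent G mixes at once, one per lane.
    a = add(a, b, x); d = rotr(xor(d, a), 32)
    c = add(c, d);    b = rotr(xor(b, c), 24)
    a = add(a, b, y); d = rotr(xor(d, a), 16)
    c = add(c, d);    b = rotr(xor(b, c), 63)
    return a, b, c, d

def compress(h, block, t, last):
    m = struct.unpack("<16Q", block)
    a, b, c = h[0:4], h[4:8], IV[0:4]
    d = [IV[4] ^ (t & MASK), IV[5] ^ (t >> 64),
         IV[6] ^ (MASK if last else 0), IV[7]]
    for r in range(12):
        s = SIGMA[r % 10]
        # Column step: lane i mixes column i of the 4x4 state.
        a, b, c, d = g_lanes(a, b, c, d,
                             [m[s[i]] for i in (0, 2, 4, 6)],
                             [m[s[i]] for i in (1, 3, 5, 7)])
        # Diagonalize: shift rows so each diagonal lines up in one lane.
        b, c, d = rot_lanes(b, 1), rot_lanes(c, 2), rot_lanes(d, 3)
        a, b, c, d = g_lanes(a, b, c, d,
                             [m[s[i]] for i in (8, 10, 12, 14)],
                             [m[s[i]] for i in (9, 11, 13, 15)])
        # Undo the lane rotation before the next round.
        b, c, d = rot_lanes(b, 3), rot_lanes(c, 2), rot_lanes(d, 1)
    v = a + b + c + d
    return [h[i] ^ v[i] ^ v[i + 8] for i in range(8)]

def blake2b_lanes(data, outlen=64):
    # Unkeyed BLAKE2b with default parameters (no salt, no personalization).
    h = list(IV)
    h[0] ^= 0x01010000 ^ outlen
    t = 0
    while len(data) > 128:
        t += 128
        h = compress(h, data[:128], t, False)
        data = data[128:]
    t += len(data)
    h = compress(h, data.ljust(128, b"\0"), t, True)
    return struct.pack("<8Q", *h)[:outlen]

# Sanity check against the standard library implementation.
assert blake2b_lanes(b"abc") == hashlib.blake2b(b"abc").digest()
```

In a real AVX2 port, each list-of-four operation above would presumably map to a single 256-bit instruction (64-bit lane add, XOR, shift or shuffle based rotation, and a 64-bit lane permute for the diagonalization), which is where the gain over a scalar, one-word-at-a-time implementation would come from.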

Labels: enhancement