Paddle-mobile has rewritten and optimized its matrix operation kernels sgemm and sgemv, yielding a performance improvement of 10%~100% on most models.
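For context, sgemv is the BLAS-style matrix-vector product y = alpha*A*x + beta*y. A minimal scalar reference of those semantics (the optimized kernels tile the loops and use NEON intrinsics; this sketch only fixes the math):

```cpp
#include <cstddef>

// Reference single-precision GEMV: y = alpha * A * x + beta * y,
// with A stored row-major as an m x n matrix. Real kernels block this
// loop nest and vectorize it; this shows only the semantics.
void sgemv_ref(std::size_t m, std::size_t n, float alpha,
               const float* A, const float* x, float beta, float* y) {
  for (std::size_t i = 0; i < m; ++i) {
    float acc = 0.0f;
    for (std::size_t j = 0; j < n; ++j) {
      acc += A[i * n + j] * x[j];
    }
    y[i] = alpha * acc + beta * y[i];
  }
}
```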
This version adds 19 new operators, including while, sequence_expand, sequence_pool, sequence_softmax, gru_unit, beam_search, and beam_search_decode. Together with extensive optimization work, this enables prediction for attention-based end-to-end models.
Winograd implementation for ARMv8: higher inference performance on ARMv8 hardware on iOS; the Winograd path also supports operator fusion, so fused operators remain efficient.
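For intuition, the 1-D Winograd transform F(2,3) computes two outputs of a 3-tap filter with four multiplications instead of six; the F(2x2, 3x3) form used for 3x3 convolution nests the same transforms over rows and columns. A minimal sketch of the 1-D case (illustrative only, not the tiled NEON kernel):

```cpp
#include <array>

// Winograd F(2,3): two outputs of a 3-tap correlation over a 4-element
// input tile, using 4 multiplications instead of 6.
std::array<float, 2> winograd_f23(const std::array<float, 4>& d,
                                  const std::array<float, 3>& g) {
  // Filter transform G*g and input transform B^T*d, fused per product.
  float m1 = (d[0] - d[2]) * g[0];
  float m2 = (d[1] + d[2]) * 0.5f * (g[0] + g[1] + g[2]);
  float m3 = (d[2] - d[1]) * 0.5f * (g[0] - g[1] + g[2]);
  float m4 = (d[1] - d[3]) * g[2];
  // Output transform A^T * m:
  // y[0] == d0*g0 + d1*g1 + d2*g2, y[1] == d1*g0 + d2*g1 + d3*g2.
  return {m1 + m2 + m3, m2 - m3 - m4};
}
```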
Direct convolution for 3x3 kernels, which is more efficient than Winograd and GEMM when the number of channels is small.
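Direct convolution skips the input/filter transforms entirely, which is why it wins when channel counts are small and there is little reduction work to amortize a transform over. A naive sketch (stride 1, no padding, NCHW layout assumed for illustration; real kernels vectorize and block):

```cpp
#include <cstddef>
#include <vector>

// Naive direct 3x3 convolution, stride 1, no padding, NCHW layout.
// in: [C_in, H, W], wgt: [C_out, C_in, 3, 3], out: [C_out, H-2, W-2].
void direct_conv3x3(const std::vector<float>& in, int c_in, int h, int w,
                    const std::vector<float>& wgt, int c_out,
                    std::vector<float>& out) {
  int oh = h - 2, ow = w - 2;
  out.assign(static_cast<std::size_t>(c_out) * oh * ow, 0.0f);
  for (int oc = 0; oc < c_out; ++oc)
    for (int ic = 0; ic < c_in; ++ic)         // accumulate over input channels
      for (int y = 0; y < oh; ++y)
        for (int x = 0; x < ow; ++x) {
          float acc = 0.0f;
          for (int ky = 0; ky < 3; ++ky)      // 3x3 sliding window
            for (int kx = 0; kx < 3; ++kx)
              acc += in[(ic * h + y + ky) * w + x + kx] *
                     wgt[((oc * c_in + ic) * 3 + ky) * 3 + kx];
          out[(oc * oh + y) * ow + x] += acc;
        }
}
```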
Rewritten and optimized 3x3 depthwise convolution: unlike previous versions, it supports arbitrary padding, and delivers better performance and more reliable numerical results.
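A minimal sketch of depthwise convolution with the arbitrary-padding behavior described above (each channel has its own filter; taps outside the input read as zero; layout and signature are assumptions for illustration):

```cpp
#include <cstddef>
#include <vector>

// Naive depthwise convolution with arbitrary zero padding, stride 1,
// NCHW layout. Channel ch uses its own k x k filter.
void depthwise_conv(const std::vector<float>& in, int c, int h, int w,
                    const std::vector<float>& wgt, int k,
                    int pad_h, int pad_w, std::vector<float>& out) {
  int oh = h + 2 * pad_h - k + 1;
  int ow = w + 2 * pad_w - k + 1;
  out.assign(static_cast<std::size_t>(c) * oh * ow, 0.0f);
  for (int ch = 0; ch < c; ++ch)
    for (int y = 0; y < oh; ++y)
      for (int x = 0; x < ow; ++x) {
        float acc = 0.0f;
        for (int ky = 0; ky < k; ++ky)
          for (int kx = 0; kx < k; ++kx) {
            int iy = y + ky - pad_h, ix = x + kx - pad_w;
            if (iy >= 0 && iy < h && ix >= 0 && ix < w)  // zero padding
              acc += in[(ch * h + iy) * w + ix] *
                     wgt[(ch * k + ky) * k + kx];
          }
        out[(ch * oh + y) * ow + x] = acc;
      }
}
```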
5x5 depthwise convolution on ARMv8: NAS model prediction speeds up by more than 30%.
Completed the efficiency optimization of deconvolution (conv2d_transpose).
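For reference, conv2d_transpose can be read as convolution run as a scatter: each input element multiplies the whole kernel and accumulates into a strided window of the output. A naive single-channel sketch (an illustration of the operator's semantics, not the optimized kernel):

```cpp
#include <cstddef>
#include <vector>

// Naive single-channel conv2d_transpose: every input pixel broadcasts
// the k x k kernel into the output at the given stride, shifted by pad.
// Output size: (H-1)*stride + k - 2*pad per spatial dimension.
void conv2d_transpose(const std::vector<float>& in, int h, int w,
                      const std::vector<float>& wgt, int k,
                      int stride, int pad, std::vector<float>& out) {
  int oh = (h - 1) * stride + k - 2 * pad;
  int ow = (w - 1) * stride + k - 2 * pad;
  out.assign(static_cast<std::size_t>(oh) * ow, 0.0f);
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x)
      for (int ky = 0; ky < k; ++ky)
        for (int kx = 0; kx < k; ++kx) {
          int oy = y * stride + ky - pad;
          int ox = x * stride + kx - pad;
          if (oy >= 0 && oy < oh && ox >= 0 && ox < ow)
            out[oy * ow + ox] += in[y * w + x] * wgt[ky * k + kx];
        }
}
```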
Integrated a memory-reuse strategy based on graph optimization. With the strategy applied, most models reduce memory usage by nearly 50%. It is enabled automatically for the ARM CPU (not compatible with FPGA and GPU).
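The idea behind a graph-based memory-reuse pass: derive each tensor's live range from the topologically sorted graph, then let tensors with non-overlapping ranges share one buffer. A greedy sketch of that sharing step (an illustration of the general technique, not the library's actual pass):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct LiveRange {
  int first_op, last_op;  // ops (topological order) where the tensor is live
  std::size_t bytes;      // tensor size in bytes
  int buffer = -1;        // assigned shared buffer
};

// Greedy reuse over tensors sorted by first_op: two tensors may share a
// buffer iff their live ranges do not overlap. Returns each buffer's size.
std::vector<std::size_t> assign_buffers(std::vector<LiveRange>& tensors) {
  std::vector<std::size_t> buf_bytes;  // size of each shared buffer
  std::vector<int> buf_free_from;      // last op after which buffer is free
  for (auto& t : tensors) {
    int chosen = -1;
    for (std::size_t b = 0; b < buf_bytes.size(); ++b)
      if (buf_free_from[b] < t.first_op) {  // previous occupant already dead
        chosen = static_cast<int>(b);
        break;
      }
    if (chosen < 0) {  // no reusable buffer: open a new one
      chosen = static_cast<int>(buf_bytes.size());
      buf_bytes.push_back(0);
      buf_free_from.push_back(-1);
    }
    t.buffer = chosen;
    buf_bytes[chosen] = std::max(buf_bytes[chosen], t.bytes);
    buf_free_from[chosen] = t.last_op;
  }
  return buf_bytes;
}
```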
ARM GPU
Paddle-mobile optimizes 1x1 convolution; MobileNet v1 achieves an average inference performance improvement of 35% on Qualcomm Adreno GPUs.
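The property such an optimization exploits is that a 1x1 convolution over NCHW data is exactly a matrix multiply, out[C_out, H*W] = wgt[C_out, C_in] x in[C_in, H*W], so it maps straight onto a tuned GEMM (an OpenCL kernel on the GPU). A CPU-side sketch of the equivalence:

```cpp
#include <cstddef>
#include <vector>

// 1x1 convolution over an NCHW tensor as a plain matrix multiply:
//   out[C_out, H*W] = wgt[C_out, C_in] * in[C_in, H*W].
void conv1x1_as_gemm(const std::vector<float>& in, int c_in, int hw,
                     const std::vector<float>& wgt, int c_out,
                     std::vector<float>& out) {
  out.assign(static_cast<std::size_t>(c_out) * hw, 0.0f);
  for (int oc = 0; oc < c_out; ++oc)
    for (int ic = 0; ic < c_in; ++ic) {
      float wv = wgt[oc * c_in + ic];     // one scalar weight per (oc, ic)
      for (int p = 0; p < hw; ++p)        // sweep all spatial positions
        out[oc * hw + p] += wv * in[ic * hw + p];
    }
}
```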