Converter
The Converter class provides static methods for image format conversions, memory allocation, and configuration settings. It supports RGB, BGR, RGBA, BGRA to YUV420P conversions, YUV420P to RGB conversions, YUV NV12 to RGB conversions, YUV NV12 to YUV I420 conversions, and image downscaling.
The Converter class is designed to provide efficient image conversion and memory management functionalities. It leverages native implementations for performance-critical operations and offers various configurations to optimize conversions.
- RGB/BGR/RGBA/BGRA to YUV420P conversion.
- YUV420P to RGB conversion.
- YUV NV12 to RGB conversion.
- YUV NV12 to YUV I420 conversion.
- Image downscaling.
- Aligned native memory allocation and freeing.
- Global configuration settings.
- SIMD optimizations (SSE, NEON, AVX2).
| Method | Return Type | Description |
|---|---|---|
| `SetConfig(ConverterConfig config)` | `void` | Sets global configuration for the converter. |
| `GetCurrentConfig()` | `ConverterConfig` | Gets the current configuration. |
| `SetOption(ConverterOption option, int value)` | `void` | Sets a configuration option. |
| `AllocAllignedNative(int size)` | `IntPtr` | Allocates aligned native memory. |
| `FreeAllignedNative(IntPtr p)` | `void` | Frees native memory allocated by `AllocAllignedNative`. |
| `Rgb2Yuv(RgbImage from, YuvImage yuv)` | `void` | Converts RGB, BGR, RGBA, BGRA to YUV420P. |
| `Yuv2Rgb(YuvImage yuv, RgbImage image)` | `void` | Converts YUV420P image to RGB format. |
| `Yuv2Rgb(YUVImagePointer yuv, RgbImage image)` | `void` | Converts YUV420P image to RGB format using `YUVImagePointer`. |
| `Yuv2Rgb(YUVNV12ImagePointer yuv, RgbImage image)` | `void` | Converts YUV NV12 planar image to RGB format. |
| `YuvNV12toYV12(YUVNV12ImagePointer nv12, YuvImage yv12)` | `void` | Converts YUV NV12 planar image to YUV I420. |
| `Downscale(RgbImage from, RgbImage to, int multiplier)` | `void` | Downscales an image by a given factor. |
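
For readers unfamiliar with the two YUV 4:2:0 layouts named in the table: NV12 stores a full-resolution Y plane followed by one plane of interleaved U and V samples, while I420 stores Y, then U, then V as three separate planes. The following is a minimal managed sketch of that repacking on tightly packed byte arrays, assuming even dimensions and no row padding. `Nv12Sketch` and `Nv12ToI420` are hypothetical names used for illustration only; this is not the native routine behind `YuvNV12toYV12`.

```csharp
using System;

// Illustrative only: hypothetical helper, not part of H264Sharp.
public static class Nv12Sketch
{
    // Repacks a tightly packed NV12 buffer (Y plane + interleaved UV plane)
    // into I420 (Y plane, U plane, V plane).
    // Assumes even width/height and no row padding (stride == width).
    public static byte[] Nv12ToI420(byte[] nv12, int width, int height)
    {
        if (width % 2 != 0 || height % 2 != 0)
            throw new ArgumentException("Width and height must be even.");

        int ySize = width * height;
        int chromaSize = ySize / 4; // U and V are each (width/2) x (height/2)
        if (nv12.Length != ySize + 2 * chromaSize)
            throw new ArgumentException("Unexpected NV12 buffer length.");

        var i420 = new byte[nv12.Length];

        // The luma plane is identical in both layouts.
        Buffer.BlockCopy(nv12, 0, i420, 0, ySize);

        // De-interleave UVUV... into UU... followed by VV...
        int uOffset = ySize;
        int vOffset = ySize + chromaSize;
        for (int i = 0; i < chromaSize; i++)
        {
            i420[uOffset + i] = nv12[ySize + 2 * i];
            i420[vOffset + i] = nv12[ySize + 2 * i + 1];
        }
        return i420;
    }

    public static void Main()
    {
        // 4x2 image: 8 luma samples, then two interleaved (U, V) pairs.
        byte[] nv12 = { 1, 2, 3, 4, 5, 6, 7, 8, 100, 200, 101, 201 };
        byte[] i420 = Nv12ToI420(nv12, 4, 2);
        Console.WriteLine(string.Join(",", i420));
        // prints 1,2,3,4,5,6,7,8,100,101,200,201
    }
}
```

Only the chroma samples move; the luma plane is copied unchanged, which is why this conversion is cheap compared to a color space conversion.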
| Name | Description |
|---|---|
| NumThreads | Number of chunks the image is divided into and sent to the thread pool. Defaults to 1 on ARM systems. |
| EnableSSE | Allows use of SSE SIMD implementations of Converter operations. Does nothing on ARM. |
| EnableNeon | Allows use of NEON SIMD implementations of Converter operations. Does nothing on x86 systems. |
| EnableAvx2 | Allows use of AVX2 SIMD implementations of Converter operations. Does nothing on ARM. |
| EnableAvx512 | Not supported yet. |
| EnableCustomThreadPool | Enables the custom thread pool. On Windows, you can optionally use the Windows pool provided by ppl.h instead. Performance may vary depending on hardware. Does nothing on other platforms. |
| EnableDebugPrints | Enables debug prints. |
| ForceNaiveConversion | For test purposes only. When no SIMD is enabled, uses the naive fixed-point approximation converter. |
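
`ForceNaiveConversion` refers to a fixed-point approximation converter. This page does not state which coefficients H264Sharp uses, so the sketch below assumes a widely used 8-bit integer approximation of the BT.601 limited-range matrix, purely to show what "fixed point" means here: the floating-point coefficients are pre-multiplied by 256 and the result is shifted right by 8. `FixedPointSketch` and `RgbToYuv` are hypothetical names.

```csharp
using System;

// Illustrative only: hypothetical helper, not part of H264Sharp.
// Coefficients are an assumption (common BT.601 limited-range approximation).
public static class FixedPointSketch
{
    // Converts one RGB pixel to Y, U, V using integer math only.
    // Each coefficient is the real-valued weight scaled by 256;
    // "+ 128" rounds to nearest before the ">> 8" division by 256.
    public static (byte Y, byte U, byte V) RgbToYuv(byte r, byte g, byte b)
    {
        int y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16;
        int u = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128;
        int v = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128;

        // With these coefficients Y stays in [16, 235] and U, V in [16, 240],
        // so the casts below cannot overflow.
        return ((byte)y, (byte)u, (byte)v);
    }

    public static void Main()
    {
        Console.WriteLine(RgbToYuv(255, 255, 255)); // white: (235, 128, 128)
        Console.WriteLine(RgbToYuv(0, 0, 0));       // black: (16, 128, 128)
        Console.WriteLine(RgbToYuv(255, 0, 0));     // red:   (82, 90, 240)
    }
}
```

A SIMD implementation typically applies the same integer arithmetic to many pixels per instruction, which is where the speedup over this per-pixel form comes from.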
Setting the global configuration:

```csharp
ConverterConfig config = Converter.GetCurrentConfig();
config.NumThreads = 4;
Converter.SetConfig(config);
```

Setting individual options:

```csharp
Converter.SetOption(ConverterOption.EnableAvx2, 1);
Converter.SetOption(ConverterOption.NumThreads, 8);
```

RGB to YUV conversion:

```csharp
RgbImage rgbImage = new RgbImage(ImageFormat.Rgb, 800, 600);
YuvImage yuvImage = new YuvImage(800, 600);
Converter.Rgb2Yuv(rgbImage, yuvImage);
```

YUV to RGB conversion:

```csharp
YuvImage yuvImage = new YuvImage(800, 600);
RgbImage rgbImage = new RgbImage(ImageFormat.Rgb, 800, 600);
Converter.Yuv2Rgb(yuvImage, rgbImage);
```

Downscaling:

```csharp
RgbImage fromImage = new RgbImage(ImageFormat.Rgb, 1600, 1200);
RgbImage toImage = new RgbImage(ImageFormat.Rgb, 800, 600);
Converter.Downscale(fromImage, toImage, 2);
```

Aligned native memory allocation:

```csharp
IntPtr nativePtr = Converter.AllocAllignedNative(1024 * 1024); // Allocate 1 MB, aligned to 64 bytes
Converter.FreeAllignedNative(nativePtr);
```
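
The examples above pair source and destination dimensions by hand: 800x600 on both sides of a conversion, and 1600x1200 to 800x600 for a downscale by 2. As a rough guide to the memory involved, the sketch below computes sizes for tightly packed buffers. The helper names are hypothetical, and it assumes no row padding; the actual `RgbImage` and `YuvImage` types may lay out memory differently.

```csharp
using System;

// Illustrative only: hypothetical helpers, not part of H264Sharp.
public static class BufferSizeSketch
{
    // Bytes for a packed RGB-family image (3 bytes/pixel for RGB or BGR,
    // 4 bytes/pixel for RGBA or BGRA), assuming no row padding.
    public static int PackedRgbBytes(int width, int height, int bytesPerPixel)
        => width * height * bytesPerPixel;

    // Bytes for YUV420P: one full-resolution Y plane plus U and V planes
    // subsampled by 2 in each direction (rounded up for odd sizes).
    public static int Yuv420pBytes(int width, int height)
        => width * height + 2 * ((width + 1) / 2) * ((height + 1) / 2);

    // Destination size for an integer downscale factor.
    public static (int Width, int Height) Downscaled(int width, int height, int multiplier)
        => (width / multiplier, height / multiplier);

    public static void Main()
    {
        Console.WriteLine(PackedRgbBytes(800, 600, 3)); // 1440000
        Console.WriteLine(Yuv420pBytes(800, 600));      // 720000
        Console.WriteLine(Downscaled(1600, 1200, 2));   // (800, 600)
    }
}
```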
Parallelization:
- Color format conversion (RGB to YUV and vice versa) can optionally be parallelized through the thread configuration.
- A single thread consumes the fewest total CPU cycles (less context switching and cache thrashing), but each conversion may take longer.
- Conversion performance depends heavily on image size, system memory speed, L3 cache size, core IPC (instructions per clock), cache performance, and other system-specific characteristics, so results vary with processor architecture.
- NEON implementations are very efficient on ARM systems in particular, where the bottleneck becomes memory speed and cache. In my tests, parallelization on ARM systems gave little to no benefit.
- On Windows, an option switches between the Windows thread pool (ppl.h) and the custom thread pool. In my tests on Windows, the custom thread pool performed significantly better on an AMD system and the same on an Intel system. Non-Windows platforms always use the custom pool.
- Setting NumThreads to 1 or 0 disables the thread pool.
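
`NumThreads` is described as the number of chunks the image is divided into. One way to picture this is splitting the image into contiguous row ranges and handing each range to a worker. The sketch below does that with the .NET `Parallel.ForEach` rather than H264Sharp's native pool, so the chunking strategy is an assumption. Chunk boundaries are kept on even rows because in 4:2:0 formats two luma rows share one chroma row. `ChunkSketch` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Illustrative only: hypothetical helper, not part of H264Sharp.
public static class ChunkSketch
{
    // Splits [0, height) into at most `chunks` contiguous row ranges
    // whose boundaries fall on even rows.
    public static List<(int Start, int End)> SplitRows(int height, int chunks)
    {
        var ranges = new List<(int Start, int End)>();
        if (chunks < 1) chunks = 1;
        int rowPairs = (height + 1) / 2;
        int start = 0;
        for (int i = 0; i < chunks; i++)
        {
            // Spread row pairs as evenly as possible across chunks.
            int pairs = rowPairs / chunks + (i < rowPairs % chunks ? 1 : 0);
            if (pairs == 0) continue;
            int end = Math.Min(height, start + pairs * 2);
            ranges.Add((start, end));
            start = end;
        }
        return ranges;
    }

    // Applies a per-pixel operation (here: invert) to one byte-per-pixel plane.
    public static void ProcessInChunks(byte[] plane, int width, int height, int numThreads)
    {
        var ranges = SplitRows(height, numThreads);
        if (ranges.Count <= 1)
        {
            // Mirrors the documented behavior: 0 or 1 means no thread pool.
            ProcessRows(plane, width, 0, height);
            return;
        }
        Parallel.ForEach(ranges, r => ProcessRows(plane, width, r.Start, r.End));
    }

    private static void ProcessRows(byte[] plane, int width, int startRow, int endRow)
    {
        for (int i = startRow * width; i < endRow * width; i++)
            plane[i] = (byte)(255 - plane[i]);
    }

    public static void Main()
    {
        foreach (var r in SplitRows(1080, 4))
            Console.WriteLine(r); // four ranges of 270 rows each

        var plane = new byte[1920 * 1080];
        ProcessInChunks(plane, 1920, 1080, 4);
        Console.WriteLine(plane[0]); // 255
    }
}
```

Each chunk touches a disjoint slice of the buffer, so no locking is needed. Past a certain chunk count the work becomes memory-bound, which is consistent with the flat results at higher thread counts in the benchmarks.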
RGB to YUV Conversion SIMD Support:
- SIMD (Single Instruction, Multiple Data) implementations are significantly faster than naive or compiler auto-vectorized implementations.
- SIMD support can be configured for RGB to YUV conversions.
- By default, the highest supported instruction set (e.g., AVX2, SSE) is selected automatically at runtime. For example, if AVX2 is available, the SSE version is not executed.
- NEON instructions are used on ARM architectures and are inactive on x86 systems; likewise, SSE and AVX2 are inactive on ARM.
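
The selection rule above (highest enabled and supported instruction set wins, with a scalar fallback) can be sketched in managed code using the .NET hardware intrinsics `IsSupported` properties. H264Sharp performs its detection in native code, so this is only an illustration of the logic. `SimdSketch` and its members are hypothetical, and `Sse2` is an assumed stand-in because this page does not say which SSE level is required.

```csharp
using System;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

// Illustrative only: hypothetical helper, not part of H264Sharp.
public static class SimdSketch
{
    // Pure selection logic: the first path that is both enabled by
    // configuration and supported by the hardware is chosen.
    public static string Select(
        bool enableAvx2, bool enableSse, bool enableNeon,
        bool hasAvx2, bool hasSse, bool hasNeon)
    {
        if (enableAvx2 && hasAvx2) return "AVX2";
        if (enableSse && hasSse) return "SSE";
        if (enableNeon && hasNeon) return "NEON";
        return "Scalar";
    }

    // Same logic, fed with what the current CPU reports.
    // IsSupported is simply false on architectures lacking the feature.
    public static string SelectForThisMachine(bool enableAvx2, bool enableSse, bool enableNeon)
        => Select(enableAvx2, enableSse, enableNeon,
                  Avx2.IsSupported, Sse2.IsSupported, AdvSimd.IsSupported);

    public static void Main()
    {
        Console.WriteLine(SelectForThisMachine(true, true, true));
        // Disabling AVX2 falls back to the next supported path.
        Console.WriteLine(SelectForThisMachine(false, true, true));
    }
}
```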
In the benchmarks below, H264Sharp conversion operations are up to about 3x faster than OpenCV implementations.
1080p 5000 Iterations of RGB -> YUV and YUV -> RGB, CustomThreadPool
AMD Ryzen 7 3700X Desktop CPU
| #Threads | OpenCV (ms) | H264Sharp (ms) |
|---|---|---|
| 1 | 11919 | 4899 |
| 2 | 6205 | 2479 |
| 4 | 3807 | 1303 |
| 8 | 2543 | 822 |
| 16 | 2462 | 824 |
Intel i7 10600U Laptop CPU
1080p 5000 Iterations of RGB -> YUV and YUV -> RGB, CustomThreadPool
| #Threads | OpenCV (ms) | H264Sharp (ms) |
|---|---|---|
| 1 | 11719 | 6010 |
| 2 | 6600 | 3210 |
| 4 | 4304 | 2803 |
| 8 | 3560 | 1839 |
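
To read the two tables above as speedup factors, divide the OpenCV time by the H264Sharp time for each thread count. The sketch below does this arithmetic with the figures copied from the tables; `SpeedupSketch` is a hypothetical name.

```csharp
using System;

// Illustrative only: arithmetic on the benchmark figures listed above.
public static class SpeedupSketch
{
    public static double Speedup(double baselineMs, double candidateMs)
        => baselineMs / candidateMs;

    private static void PrintTable(string name, int[] threads, double[] openCv, double[] h264Sharp)
    {
        Console.WriteLine(name);
        for (int i = 0; i < threads.Length; i++)
            Console.WriteLine($"  {threads[i],2} threads: {Speedup(openCv[i], h264Sharp[i]):F2}x");
    }

    public static void Main()
    {
        PrintTable("AMD Ryzen 7 3700X",
            new[] { 1, 2, 4, 8, 16 },
            new double[] { 11919, 6205, 3807, 2543, 2462 },
            new double[] { 4899, 2479, 1303, 822, 824 });

        PrintTable("Intel i7 10600U",
            new[] { 1, 2, 4, 8 },
            new double[] { 11719, 6600, 4304, 3560 },
            new double[] { 6010, 3210, 2803, 1839 });
    }
}
```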
1080p, 1000 iterations.
Pixel 6 Pro, Google Tensor SOC
| #Threads | Yuv2Rgb (ms) | Rgb2Yuv (ms) |
|---|---|---|
| 1 | 523 | 634 |
| 2 | 402 | 635 |
| 4 | 429 | 638 |
| 8 | 466 | 653 |