Implement YUV to RGB #291
Conversation
It would be great if you could compare the results to an existing library in Rust like https://github.com/awxkee/yuvutils-rs
/// assert_eq!(yuv.size().width, 4);
/// assert_eq!(yuv.size().height, 5);
/// ```
pub fn yuv_from_rgb(src: &Image<f32, 3>, dst: &mut Image<f32, 3>) -> Result<(), ImageError> {
@andrew-shc feel free to add comments, especially on the input types / conversion, to match the other open discussion on this topic.
Can you please point out the discussion link so that I may be able to add the comments myself instead?
yep, here #296
I can compare the result to that crate, but it only uses u8. Questions:
Let's target f32 first. We can attempt u8 in another PR.
Hey @edgarriba! Can you please take a look at the comparison with yuv_utils in the last commit? Please tell me if there is anything missing.
)?;
let mut yuv = Image::<f32, 3>::from_size_val(rgb_image.size(), 0.0)?;
super::yuv_from_rgb(&rgb_image, &mut yuv)?;
super::parallel::par_iter_rows(&yuv.clone(), &mut yuv, |src_p, dst_p| {
why this?
This is because of the output range I was asking about earlier:
Questions:
Should I make the two APIs similar or just transform them manually and compare them?
(This is related to the first conversation in this PR.)
Should the YUV image be in the range:

Normalized:
- Y: [0, 1]
- U: [-0.436, +0.436]
- V: [-0.615, +0.615]

Image pixel values:
- Y: [0, 255]
- U: [0, 255]
- V: [0, 255]
These ranges are the same as in kornia, and since the yuv_utils crate uses u8, I have to transform the range back to [0, 255] to compare them.
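To make the comparison concrete, the rescaling would look something like this; a minimal sketch assuming the kornia-style normalized ranges quoted above (the helper name is hypothetical, not part of the PR):

```rust
/// Map a kornia-style normalized YUV pixel (Y in [0, 1], U in
/// [-0.436, 0.436], V in [-0.615, 0.615]) to u8 values in [0, 255]
/// so it can be compared against a u8 crate such as yuvutils-rs.
/// Hypothetical helper for the comparison test, not part of the PR.
fn normalized_yuv_to_u8(y: f32, u: f32, v: f32) -> (u8, u8, u8) {
    let y8 = (y * 255.0).round().clamp(0.0, 255.0) as u8;
    let u8_ = (((u + 0.436) / 0.872) * 255.0).round().clamp(0.0, 255.0) as u8;
    let v8 = (((v + 0.615) / 1.230) * 255.0).round().clamp(0.0, 255.0) as u8;
    (y8, u8_, v8)
}

fn main() {
    // black: zero chroma lands at the u8 chroma midpoint
    println!("{:?}", normalized_yuv_to_u8(0.0, 0.0, 0.0));
}
```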
yuvutils supports 10/12-bit, and sometimes 14- and 16-bit, depths if you want to do high-resolution tests: docs
I don't know where you took those values from, but they are definitely wrong. UV ranges always match, and they are always [-0.5, 0.5]. If the signal will be transmitted out of your processing engine, the standard mandates adding a +0.5 bias to the chroma components (UV) so the signal is in the positive range. So yes, that puts a YUV image in storage (prepared for transmission) in [0, 1], or [0, 255] in u8 representation. Your matrices are unclear to me: you're testing against Bt.601 full range, and then you're using a matrix similar to Bt.601 limited range. What matrix and range did you actually try to use?
I took these values from kornia. Part of the docstring of rgb_to_yuv says:
Disclaimer: I am not an expert in YUV conversions by any stretch of the imagination. I just wanted the two APIs of
To be honest, I didn't know which range and matrix correspond to the implementation in kornia that I was testing against. I used this combination of range and matrix because I found that the tests passed (but with low precision). Could you please provide a source/docs that explain the matrix parameter?
Sure, YUV is pretty well documented by ITU-R.
Whoever did this was wrong. You may not be able to reproduce those coefficients in yuvutils because the coefficients aren't correct. You could try to reproduce their coefficients:

```rust
fn kr_kb_for_color_primaries(primaries: [[f32; 2]; 4]) -> (f32, f32) {
    let r_x = primaries[0][0];
    let r_y = primaries[0][1];
    let g_x = primaries[1][0];
    let g_y = primaries[1][1];
    let b_x = primaries[2][0];
    let b_y = primaries[2][1];
    let w_x = primaries[3][0];
    let w_y = primaries[3][1];
    let r_z = 1.0 - (r_x + r_y);
    let g_z = 1.0 - (g_x + g_y);
    let b_z = 1.0 - (b_x + b_y);
    let w_z = 1.0 - (w_x + w_y);
    let kr = (r_y
        * (w_x * (g_y * b_z - b_y * g_z) + w_y * (b_x * g_z - g_x * b_z) + w_z * (g_x * b_y - b_x * g_y)))
        / (w_y * (r_x * (g_y * b_z - b_y * g_z) + g_x * (b_y * r_z - r_y * b_z) + b_x * (r_y * g_z - g_y * r_z)));
    let kb = (b_y
        * (w_x * (r_y * g_z - g_y * r_z) + w_y * (g_x * r_z - r_x * g_z) + w_z * (r_x * g_y - g_x * r_y)))
        / (w_y * (r_x * (g_y * b_z - b_y * g_z) + g_x * (b_y * r_z - r_y * b_z) + b_x * (r_y * g_z - g_y * r_z)));
    (kr, kb)
}
```

From their link https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.470-5-199802-S!!PDF-E.pdf, using the Red/Green/Blue and White point for M/PAL:

```rust
let (kr, kb) = kr_kb_for_color_primaries([
    [0.67, 0.33],
    [0.21, 0.71],
    [0.14, 0.08],
    [0.31, 0.316],
]);
```
which gives kr: 0.29896665, kb: 0.11461213. Then from:

```rust
pub fn get_forward_transform(
    range_rgba: u32,
    range_y: u32,
    range_uv: u32,
    kr: f32,
    kb: f32,
) -> CbCrForwardTransform<f32> {
    let kg = 1.0f32 - kr - kb;
    let yr = kr * range_y as f32 / range_rgba as f32;
    let yg = kg * range_y as f32 / range_rgba as f32;
    let yb = kb * range_y as f32 / range_rgba as f32;
    let cb_r = -0.5f32 * kr / (1f32 - kb) * range_uv as f32 / range_rgba as f32;
    let cb_g = -0.5f32 * kg / (1f32 - kb) * range_uv as f32 / range_rgba as f32;
    let cb_b = 0.5f32 * range_uv as f32 / range_rgba as f32;
    let cr_r = 0.5f32 * range_uv as f32 / range_rgba as f32;
    let cr_g = -0.5f32 * kg / (1f32 - kr) * range_uv as f32 / range_rgba as f32;
    let cr_b = -0.5f32 * kb / (1f32 - kr) * range_uv as f32 / range_rgba as f32;
    CbCrForwardTransform {
        yr,
        yg,
        yb,
        cb_r,
        cb_g,
        cb_b,
        cr_r,
        cr_g,
        cr_b,
    }
}
```

which results in get_forward_transform(255, 255, 255, kr, kb)
And for limited range:

The 8-bit scale values don't matter here because these are relative values; they may contribute at most a 0.5% error. So your final matrix should be:

TLDR: I'm sure the original implementation is not correct. The UV range may not go outside [-0.5, 0.5], so anything that produces values significantly out of that range is immediately not trustworthy. You may follow the original implementation for compatibility, but testing against yuvutils has no meaning, because it will not produce such values.
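As a sanity check on those formulas, plugging the textbook Bt.601 values kr = 0.299, kb = 0.114 into the same expressions (with equal ranges, so the range factors cancel) reproduces the well-known full-range coefficients. A sketch, not yuvutils' code:

```rust
// Derive the Bt.601 full-range forward matrix rows from kr/kb using
// the same formulas as get_forward_transform above; sketch assuming
// range_rgba == range_y == range_uv so the range factors cancel out.
fn bt601_full_forward() -> [[f32; 3]; 3] {
    let (kr, kb) = (0.299f32, 0.114f32);
    let kg = 1.0 - kr - kb;
    [
        [kr, kg, kb],                                          // Y
        [-0.5 * kr / (1.0 - kb), -0.5 * kg / (1.0 - kb), 0.5], // Cb
        [0.5, -0.5 * kg / (1.0 - kr), -0.5 * kb / (1.0 - kr)], // Cr
    ]
}

fn main() {
    let m = bt601_full_forward();
    // Cb row should be roughly [-0.1687, -0.3313, 0.5]
    println!("{:?}", m[1]);
}
```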
@awxkee thanks for looking into this, some context here kornia/kornia#934
The issue you mentioned has almost no relation to the coefficients. The discussion there is about chroma subsampling, as far as I could see.
@omar-abdelgawad can you try to apply the simplified matrix that @awxkee provided? Curious to see how it also compares with OpenCV and other standard libraries.
OpenCV uses the Bt.601 limited range matrix, which is quite common. Nowadays Bt.709 limited range is more common, because for any size larger than SD Bt.709 should be used, but Bt.601 is still the default matrix on many systems. TBH, I don't know why M/PAL was chosen previously. It is quite an uncommon matrix in the PC world; it's purely a TV matrix, and you are unlikely to find it used anywhere except very specialized software.
Most software will use Bt.601 for SD content, Bt.709 for HD content, and Bt.2020 for wide gamut or UHD content. This is the common workflow; you'll find it as the default on almost any camera (except professional cameras) and device. For pure analytical purposes, Bt.601 or Bt.709 limited range is the common choice. It might sometimes be full range, if you're not expecting content directly from a camera. Cameras are almost always in limited range unless configured otherwise. For pure analytics, limited range might be worse, because it quantizes the image.
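The rule of thumb above could be sketched as a small helper (hypothetical function; the SD and UHD line-count boundaries are conventional assumptions, not from this PR):

```rust
// Pick a YUV matrix by content resolution, following the common
// workflow described above: Bt.601 for SD, Bt.709 for HD, Bt.2020
// for UHD / wide gamut. Hypothetical helper; 576 lines is the
// conventional SD ceiling and 2160 the conventional UHD floor.
fn matrix_for_resolution(_width: u32, height: u32) -> &'static str {
    if height >= 2160 {
        "bt2020"
    } else if height > 576 {
        "bt709"
    } else {
        "bt601"
    }
}

fn main() {
    println!("{}", matrix_for_resolution(720, 576));   // SD
    println!("{}", matrix_for_resolution(1920, 1080)); // HD
    println!("{}", matrix_for_resolution(3840, 2160)); // UHD
}
```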
The primary goal is real-time camera-based processing, so I'd choose this path first. Probably the first thing to validate is that, from the camera pipes we have, we can take the raw I420 YUV format and convert it to rgb8 as a common representation for most of the vision algorithms covered in the library.
You need this pipeline then. You may ignore it and use OpenCV with BT.601 as always; it is still acceptable. Colors will be slightly washed out and have a "greenish effect", but this might be okay depending on the context.
P.S. I420 means Y'UV 4:2:0, exactly in Bt.601, just to make it clear. But it is unlikely that all cameras will always send you I420. As I already wrote, though, this might be acceptable.
@edgarriba I would be happy to keep working on this PR but I am now a bit busy refining my proposal for "Implement GPU image transforms" (and I also have midterms 😢) so I would like to continue working on it after submitting my proposal. |
@edgarriba Just to be clear on what I should do: I believe we agreed that I have to fix the wrong range of values so that UV values are in [-0.5, 0.5].
I will first try the coefficients provided.
I will also test their accuracy against yuv_444 Bt.601 with full range, and finally report their matching accuracy and inverse relation here to see if that is the wanted behaviour.
I420 is BT.601 Limited Range with a Y'UV 4:2:0 layout. You could check the common YUV FourCCs on the libyuv page https://chromium.googlesource.com/libyuv/libyuv/+show/master/docs/formats.md.
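For the planar I420 layout specifically, the plane offsets inside a buffer can be sketched like this (illustrative helper, assuming even dimensions and a tightly packed buffer; not from libyuv or this PR):

```rust
// Plane offsets for an I420 (Y'UV 4:2:0 planar) buffer of
// width x height pixels: a full-resolution Y plane followed by
// quarter-size U and V planes. Assumes even dimensions and no
// row padding; illustrative only.
fn i420_plane_offsets(width: usize, height: usize) -> (usize, usize, usize) {
    let y_size = width * height;
    let chroma_size = (width / 2) * (height / 2);
    let y_off = 0;
    let u_off = y_size;
    let v_off = y_size + chroma_size;
    (y_off, u_off, v_off)
}

fn main() {
    // for a 4x4 frame: Y is 16 bytes, U and V are 4 bytes each
    println!("{:?}", i420_plane_offsets(4, 4));
}
```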
Got it, so I should test against
To add a bit more context, this is an implementation I've been effectively using in another project where we stream NV12 images, which is probably the next feature I wanted to support:

```rust
pub fn yuv420sp_to_rgb8(image: &CompressedImage) -> Image<u8, 3> {
    let rows = image.rows();
    let cols = image.cols();
    let mut rgb8_data = vec![0; rows * cols * 3];
    // NV12 layout: full-resolution Y plane followed by an interleaved UV plane
    let y_plane = &image.as_slice()[..rows * cols];
    let uv_plane = &image.as_slice()[rows * cols..];
    for row in 0..rows {
        let index = row * cols;
        for x in 0..cols {
            let y_index = index + x;
            let uv_index = (row / 2) * (cols / 2) * 2 + (x / 2) * 2;
            let y = y_plane[y_index] as f32;
            let u = uv_plane[uv_index] as f32 - 128.0;
            let v = uv_plane[uv_index + 1] as f32 - 128.0;
            let r = (y + 1.402 * v).clamp(0.0, 255.0) as u8;
            let g = (y - 0.34414 * u - 0.71414 * v).clamp(0.0, 255.0) as u8;
            let b = (y + 1.772 * u).clamp(0.0, 255.0) as u8;
            let rgb_index = y_index * 3;
            rgb8_data[rgb_index] = r;
            rgb8_data[rgb_index + 1] = g;
            rgb8_data[rgb_index + 2] = b;
        }
    }
    // build the output image from the interleaved RGB buffer
    // (the snippet was truncated here; assuming an Image constructor
    // that takes a size and a data buffer)
    Image::new(ImageSize { width: cols, height: rows }, rgb8_data).unwrap()
}
```
This is the Bt.601 full range matrix, not limited. I'm sure that 99% of cameras stream NV12 in limited range, but I don't know the context; since I'm here, I just prefer to clarify how it works. You could re-check the matrix against Bt.601 limited in OpenCV and see it's different, and the bias is missing in your code, if it actually should be NV12 Bt.601 limited.
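For comparison, a limited-range Bt.601 variant of that per-pixel step might look roughly like this, with the 16/128 biases and the 219/224 scale factors from the standard (a sketch, not the project's code):

```rust
// Limited-range Bt.601 YUV -> RGB for one pixel: Y is scaled from
// [16, 235] and chroma from [16, 240] around the 128 midpoint,
// including the Y bias that the full-range snippet above omits.
// Sketch for illustration only.
fn yuv_limited_to_rgb8(y: u8, u: u8, v: u8) -> (u8, u8, u8) {
    let y = 1.164_383_6 * (y as f32 - 16.0); // 255 / 219
    let u = u as f32 - 128.0;
    let v = v as f32 - 128.0;
    let r = (y + 1.596_026_8 * v).clamp(0.0, 255.0) as u8; // 1.402 * 255/224
    let g = (y - 0.391_762_3 * u - 0.812_967_6 * v).clamp(0.0, 255.0) as u8;
    let b = (y + 2.017_232_1 * u).clamp(0.0, 255.0) as u8; // 1.772 * 255/224
    (r, g, b)
}

fn main() {
    // reference white (235, 128, 128) and reference black (16, 128, 128)
    println!("{:?}", yuv_limited_to_rgb8(235, 128, 128));
    println!("{:?}", yuv_limited_to_rgb8(16, 128, 128));
}
```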
I once wrote, for image-rs, a simple and reasonably effective implementation supporting an almost arbitrary inversion matrix.
Got it -- actually you found a bug in my other project :) Agreed, let's do it then.
I'll spend some time to enable support for i420 in the gstreamer pipeline to validate this /cc @AS1100K
📢 Describe your changes
- Implemented yuv_from_rgb() and rgb_from_yuv() in kornia-imgproc::color::yuv.rs and exported them in the mod.rs, with the same interface as hsv_from_rgb in hsv.rs.
- 1e-1 precision between the two functions.

🧪 How to test this Pull Request
cargo test --package kornia-imgproc color::yuv
cargo test --package kornia-imgproc color::yuv --doc

✅ Checklist before requesting a review