
Conversation

@omar-abdelgawad commented Mar 19, 2025

📢 Describe your changes

  • 📌 Changes summary
    • Closes #100 (Implement YUV to RGB)
    • Added yuv_from_rgb() and rgb_from_yuv() in kornia-imgproc::color::yuv.rs and exported them in the mod.rs with the same interface as hsv_from_rgb in hsv.rs.
    • Added 2 unit tests that match the output of the original kornia and 1 test that affirms the inverse relation between the two functions to a 1e-1 precision.

🧪 How to test this Pull Request

cargo test --package kornia-imgproc color::yuv
cargo test --package kornia-imgproc color::yuv --doc

✅ Checklist before requesting a review

  • I have performed a self-review of my code
  • If it is a core feature, I have added thorough tests.

@edgarriba (Member) left a comment

It would be great if you could compare the results to an existing Rust library like https://github.com/awxkee/yuvutils-rs

/// assert_eq!(yuv.size().width, 4);
/// assert_eq!(yuv.size().height, 5);
/// ```
pub fn yuv_from_rgb(src: &Image<f32, 3>, dst: &mut Image<f32, 3>) -> Result<(), ImageError> {
Member:

@andrew-shc feel free to add comments, especially on the input types / conversion, to match the other open discussion on this topic.

Author:

Can you please point out the discussion link so that I may be able to add the comments myself instead?

Member:

yep, here #296

@omar-abdelgawad (Author)

It would be great if you could compare the results to an existing Rust library like https://github.com/awxkee/yuvutils-rs

I can compare the result to that crate, but it only uses u8 representation instead of f32 representation as far as I know.

Questions:

  1. Should I make the two APIs similar or just transform them manually and compare them?
    (This is related to the first conversation in this PR.)
    Should the YUV image be in the range:

    • Normalized:

      • Y: [0, 1]
      • U: [-0.436, +0.436]
      • V: [-0.615, +0.615]
    • Image pixel values:

      • Y: [0, 255]
      • U: [0, 255]
      • V: [0, 255]
  2. Should the image dtype be f32 or u8 for the second representation?

@edgarriba (Member)

Let's target f32 first. We can attempt u8 in another PR.

@omar-abdelgawad (Author)

Hey @edgarriba! Can you please take a look at the comparison with yuv_utils in the last commit? Please tell me if there is anything missing.

)?;
let mut yuv = Image::<f32, 3>::from_size_val(rgb_image.size(), 0.0)?;
super::yuv_from_rgb(&rgb_image, &mut yuv)?;
super::parallel::par_iter_rows(&yuv.clone(), &mut yuv, |src_p, dst_p| {
Member:

Why this?

Author:

This is because of the output range I was asking about earlier:

Questions:

  1. Should I make the two APIs similar or just transform them manually and compare them?
    (This is related to the first conversation in this PR.)
    Should the YUV image be in the range:

    • Normalized:

      • Y: [0, 1]
      • U: [-0.436, +0.436]
      • V: [-0.615, +0.615]
    • Image pixel values:

      • Y: [0, 255]
      • U: [0, 255]
      • V: [0, 255]

These ranges are the same as in kornia, and since the yuv_utils crate uses u8, I have to transform the range back to [0, 255] to compare them.
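
Roughly, that remapping looks like the sketch below (illustrative only; the constants just mirror the kornia docstring ranges, and the actual closure in the commit may differ):

// Sketch: map kornia's normalized YUV ranges onto [0, 255] so the values can be
// compared elementwise with yuvutils' u8 output. The range constants mirror the
// kornia docstring; this is illustrative, not the code in the commit.
fn yuv_normalized_to_u8_range(y: f32, u: f32, v: f32) -> (f32, f32, f32) {
    let y_out = y * 255.0;                            // Y: [0, 1]          -> [0, 255]
    let u_out = (u + 0.436) / (2.0 * 0.436) * 255.0;  // U: [-0.436, 0.436] -> [0, 255]
    let v_out = (v + 0.615) / (2.0 * 0.615) * 255.0;  // V: [-0.615, 0.615] -> [0, 255]
    (y_out, u_out, v_out)
}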

@awxkee commented Mar 25, 2025

I can compare the result to that crate, but it only uses u8 representation instead of f32 representation as far as I know.

yuvutils supports 10/12-bit, and sometimes 14- and 16-bit, depth if you want to do higher-precision tests (docs).

Should I make the two APIs similar or just transform them manually and compare them?
(This is related to the first conversation in this PR.)
Should the YUV image be in the range:
Normalized:
Y: [0, 1]
U: [-0.436, +0.436]
V: [-0.615, +0.615]

I don't know where you took those values from, but they are definitely wrong. The U and V ranges always match, and they are always [-0.5; 0.5]. If the signal is transmitted out of your processing engine, the standard mandates adding a +0.5 bias to the chroma components (UV) so the signal is in the positive range. Thus yes, that puts the stored YUV image (prepared for transmission) in [0; 1], or [0; 255] in u8 representation.

Your matrices are unclear to me. You're testing against BT.601 full range, and then you're using a matrix similar to BT.601 limited range. What matrix and range did you actually try to use?

@omar-abdelgawad (Author) commented Mar 25, 2025

I don't know where you took those values from, but they are definitely wrong. The U and V ranges always match, and they are always [-0.5; 0.5]. If the signal is transmitted out of your processing engine, the standard mandates adding a +0.5 bias to the chroma components (UV) so the signal is in the positive range. Thus yes, that puts the stored YUV image (prepared for transmission) in [0; 1], or [0; 255] in u8 representation.

I took these values from kornia. Part of the docstring of rgb_to_yuv says:

    The image data is assumed to be in the range of :math:`(0, 1)`. The range of the output is of
    :math:`(0, 1)` to luma and the ranges of U and V are :math:`(-0.436, 0.436)` and :math:`(-0.615, 0.615)`,
    respectively.

    The YUV model adopted here follows M/PAL values (see
    `BT.470-5 <https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.470-5-199802-S!!PDF-E.pdf>`_, Table 2,
    items 2.5 and 2.6).

Disclaimer: I am not an expert in YUV conversions by any stretch of the imagination. I just wanted the two APIs of kornia and kornia-rs to be similar and perform the same calculations. Please tell me if I am doing anything wrong in my calculations.

Your matrices are unclear to me. You're testing against BT.601 full range, and then you're using a matrix similar to BT.601 limited range. What matrix and range did you actually try to use?

To be honest, I didn't know which range and matrix correspond to the implementation in kornia that I was testing against. I used this combination of range and matrix when I found that the tests passed (but with low precision). Could you please provide a source/docs that explain the matrix parameter?

@awxkee commented Mar 26, 2025

To be honest, I didn't know which range and matrix correspond to the implementation in kornia that I was testing against. I used this combination of range and matrix when I found that the tests passed (but with low precision). Could you please provide a source/docs that explain the matrix parameter?

Sure, YUV is pretty well documented by ITU-R; see 45, 46, 47.

I don't know where you took those values from, but they are definitely wrong. The U and V ranges always match, and they are always [-0.5; 0.5]. If the signal is transmitted out of your processing engine, the standard mandates adding a +0.5 bias to the chroma components (UV) so the signal is in the positive range. Thus yes, that puts the stored YUV image (prepared for transmission) in [0; 1], or [0; 255] in u8 representation.

I took these values from kornia. Part of the docstring of rgb_to_yuv says:

    The image data is assumed to be in the range of :math:`(0, 1)`. The range of the output is of
    :math:`(0, 1)` to luma and the ranges of U and V are :math:`(-0.436, 0.436)` and :math:`(-0.615, 0.615)`,
    respectively.

    The YUV model adopted here follows M/PAL values (see
    `BT.470-5 <https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.470-5-199802-S!!PDF-E.pdf>`_, Table 2,
    items 2.5 and 2.6).

Whoever did this was wrong. You cannot reproduce those coefficients in yuvutils because the coefficients aren't correct.

You could try to reproduce their coefficients; see ITU-R (39) and (40):

// Derive the luma weights kr and kb from the xy chromaticities of the
// R, G, B primaries and the white point (the ITU-R derivation).
fn kr_kb_for_color_primaries(primaries: [[f32; 2]; 4]) -> (f32, f32) {
    let r_x = primaries[0][0];
    let r_y = primaries[0][1];
    let g_x = primaries[1][0];
    let g_y = primaries[1][1];
    let b_x = primaries[2][0];
    let b_y = primaries[2][1];
    let w_x = primaries[3][0];
    let w_y = primaries[3][1];
    let r_z = 1.0 - (r_x + r_y);
    let g_z = 1.0 - (g_x + g_y);
    let b_z = 1.0 - (b_x + b_y);
    let w_z = 1.0 - (w_x + w_y);
    let kr = (r_y
        * (w_x * (g_y * b_z - b_y * g_z) + w_y * (b_x * g_z - g_x * b_z) + w_z * (g_x * b_y - b_x * g_y)))
        / (w_y * (r_x * (g_y * b_z - b_y * g_z) + g_x * (b_y * r_z - r_y * b_z) + b_x * (r_y * g_z - g_y * r_z)));
    let kb = (b_y
        * (w_x * (r_y * g_z - g_y * r_z) + w_y * (g_x * r_z - r_x * g_z) + w_z * (r_x * g_y - g_x * r_y)))
        / (w_y * (r_x * (g_y * b_z - b_y * g_z) + g_x * (b_y * r_z - r_y * b_z) + b_x * (r_y * g_z - g_y * r_z)));
    (kr, kb)
}

From their link https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.470-5-199802-S!!PDF-E.pdf

Red/Green/Blue and White point for M/PAL:

   let (kr, kb) = kr_kb_for_color_primaries([
        [0.67, 0.33],
        [0.21, 0.71],
        [0.14, 0.08],
        [0.31, 0.316]
    ]);
   kr: 0.29896665 kb 0.11461213

Then, from kr and kb, you can compute your YUV matrix:

pub fn get_forward_transform(
    range_rgba: u32,
    range_y: u32,
    range_uv: u32,
    kr: f32,
    kb: f32,
) -> CbCrForwardTransform<f32> {
    let kg = 1.0f32 - kr - kb;

    let yr = kr * range_y as f32 / range_rgba as f32;
    let yg = kg * range_y as f32 / range_rgba as f32;
    let yb = kb * range_y as f32 / range_rgba as f32;

    let cb_r = -0.5f32 * kr / (1f32 - kb) * range_uv as f32 / range_rgba as f32;
    let cb_g = -0.5f32 * kg / (1f32 - kb) * range_uv as f32 / range_rgba as f32;
    let cb_b = 0.5f32 * range_uv as f32 / range_rgba as f32;

    let cr_r = 0.5f32 * range_uv as f32 / range_rgba as f32;
    let cr_g = -0.5f32 * kg / (1f32 - kr) * range_uv as f32 / range_rgba as f32;
    let cr_b = -0.5f32 * kb / (1f32 - kr) * range_uv as f32 / range_rgba as f32;
    CbCrForwardTransform {
        yr,
        yg,
        yb,
        cb_r,
        cb_g,
        cb_b,
        cr_r,
        cr_g,
        cr_b,
    }
}

Which results in

get_forward_transform(255, 255, 255, kr, kb)
CbCrForwardTransform { yr: 0.29896665, yg: 0.58642125, yb: 0.11461213, cb_r: -0.1688337, cb_g: -0.3311663, cb_b: 0.5, cr_r: 0.5, cr_g: -0.41825488, cr_b: -0.08174513 }

And for limited range:

get_forward_transform(255, 219, 224, kr, kb);
CbCrForwardTransform { yr: 0.25675958, yg: 0.50363237, yb: 0.0984316, cb_r: -0.14830881, cb_g: -0.29090688, cb_b: 0.4392157, cr_r: 0.4392157, cr_g: -0.36740822, cr_b: -0.07180749 }

The 8-bit scale values don't matter much here because they are relative values; they may contribute at most 0.5% error.

From which your final matrix application should be:

Y = r * yr + g * yg + b * yb;
U = r * cb_r + g * cb_g + b * cb_b;
V = r * cr_r + g * cr_g + b * cr_b;

TL;DR: I'm sure the original implementation is not correct. The UV range may not fall outside [-0.5; 0.5], so anything that produces values significantly out of that range is immediately not trustworthy. You may follow the original implementation for compatibility, but testing against yuvutils has no meaning because it will not produce such values.

@edgarriba (Member) commented Mar 26, 2025

@awxkee thanks for looking into this; some context here: kornia/kornia#934.
Pinging @oskarflordal @cceyda, who have been helping with other color stuff in the PyTorch Kornia library.

@awxkee commented Mar 27, 2025

The issue you mentioned has almost no relation to the coefficients. The discussion there is about chroma subsampling, as far as I can see.
One of the authors mentioned there that the coefficients are incorrect, and that's it.
The PRs attached to the issue definitely only address chroma subsampling, which is a different matter.

@edgarriba (Member)

@omar-abdelgawad can you try to apply the simplified matrix that @awxkee provided? Curious to see how it also compares with OpenCV and other standard libraries.

@awxkee commented Mar 30, 2025

OpenCV uses the BT.601 limited range matrix, which is quite common. Nowadays, BT.709 limited range is more common, because BT.709 should be used for any size larger than SD, but BT.601 is still the default matrix on many systems.
https://github.com/opencv/opencv/blob/42a132088cfc2060e9ae816bbcf7ebfcab4f1de8/modules/imgproc/src/color_yuv.simd.hpp#L1096

TBH, I don't know why M/PAL was chosen previously. It is quite an uncommon matrix in the PC world. It's purely a TV matrix; you are unlikely to find it used anywhere except very specialized software.

@awxkee commented Mar 30, 2025

Most software will use BT.601 for SD content, BT.709 for HD content, and BT.2020 for wide gamut or UHD content. This is the common workflow; you'll find it as the default on almost any camera (except professional cameras) and device.

For pure analytical purposes, BT.601 or BT.709 limited range is the common choice. It might sometimes be full range if you're not expecting content directly from a camera. Cameras are almost always in limited range unless configured otherwise. For pure analytics, limited range might be worse, because it quantizes the image.

@edgarriba (Member)

The primary goal is real-time camera-based processing, so I'd choose this path first. Probably something to validate first: from the camera pipes we have, take the raw I420 YUV format and convert it to rgb8 as a common representation for most of the vision algorithms covered in the library.

@awxkee commented Mar 30, 2025

Most software will use BT.601 for SD content, BT.709 for HD content, and BT.2020 for wide gamut or UHD content

You need this pipeline then. You may ignore it and use OpenCV with BT.601 as always—it is still acceptable. Colors will be slightly washed out and have a "greenish effect", but this might be okay depending on the context.
This pipeline will match most cameras, except for some cheap ones that record HD content using BT.601.
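
As a rough sketch, that resolution-based choice could look like the following (the thresholds and type names are illustrative only, not from yuvutils or kornia-rs):

// Illustrative helper: choose a YUV matrix by frame size, following the
// convention above (BT.601 for SD, BT.709 for HD, BT.2020 for UHD / wide gamut).
#[derive(Debug, Clone, Copy, PartialEq)]
enum YuvMatrix {
    Bt601,
    Bt709,
    Bt2020,
}

fn matrix_for_resolution(width: usize, height: usize) -> YuvMatrix {
    if width > 1920 || height > 1080 {
        YuvMatrix::Bt2020 // UHD / wide gamut content
    } else if width > 720 || height > 576 {
        YuvMatrix::Bt709 // HD content
    } else {
        YuvMatrix::Bt601 // SD content
    }
}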

@awxkee commented Mar 30, 2025

P.S. I420 means Y'UV 4:2:0 exactly in BT.601, just to make it clear. But it is unlikely that all cameras will always send you I420. As I already wrote, though, this might be acceptable.

@omar-abdelgawad (Author)

@edgarriba I would be happy to keep working on this PR, but I am now a bit busy refining my proposal for "Implement GPU image transforms" (and I also have midterms 😢), so I would like to continue working on it after submitting my proposal.

@omar-abdelgawad (Author) commented Apr 9, 2025

@edgarriba Just to be clear on what I should do: I believe we agreed that I have to fix the wrong range of values so that the UV values are in [-0.5; 0.5].

@omar-abdelgawad can you try to apply the simplified matrix that @awxkee provided? Curious to see how it also compares with OpenCV and other standard libraries.

I will first try the coefficients provided:

Which results in

get_forward_transform(255, 255, 255, kr, kb)
CbCrForwardTransform { yr: 0.29896665, yg: 0.58642125, yb: 0.11461213, cb_r: -0.1688337, cb_g: -0.3311663, cb_b: 0.5, cr_r: 0.5, cr_g: -0.41825488, cr_b: -0.08174513 }

I will also test their accuracy against yuv_444 BT.601 with full range, and finally report their matching accuracy and inverse relation here to see whether that is the wanted behaviour.
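
For the inverse-relation check, I have something along these lines in mind (a sketch against the yuv_from_rgb / rgb_from_yuv signatures in this PR, assuming it sits next to the existing tests so the types are in scope; the constant-valued input, the ImageSize literal, and the 1e-4 tolerance are placeholders):

// Sketch of the inverse-relation test: rgb -> yuv -> rgb should recover the
// input within some tolerance. The constant input and the 1e-4 threshold are
// placeholders, not the final test values.
#[test]
fn yuv_roundtrip_recovers_rgb() -> Result<(), ImageError> {
    let rgb = Image::<f32, 3>::from_size_val(ImageSize { width: 4, height: 5 }, 0.25)?;
    let mut yuv = Image::<f32, 3>::from_size_val(rgb.size(), 0.0)?;
    let mut rgb_back = Image::<f32, 3>::from_size_val(rgb.size(), 0.0)?;

    yuv_from_rgb(&rgb, &mut yuv)?;
    rgb_from_yuv(&yuv, &mut rgb_back)?;

    for (a, b) in rgb.as_slice().iter().zip(rgb_back.as_slice().iter()) {
        assert!((a - b).abs() < 1e-4, "round trip mismatch: {a} vs {b}");
    }
    Ok(())
}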
Does that sound OK, @edgarriba?

@awxkee commented Apr 9, 2025

I420 is in BT.601 limited range with a Y'UV 4:2:0 layout. You could check the common YUV FourCCs on the libyuv page https://chromium.googlesource.com/libyuv/libyuv/+show/master/docs/formats.md.
From a practical standpoint, you won't find cameras recording in BT.601 Full Range—there's no real benefit to recording 640×480 in full range.

@omar-abdelgawad (Author)

Got it, so I should test against rgb_to_yuv_420 with BT.601 limited range 👍.

@edgarriba (Member) commented Apr 9, 2025

To add a bit more context, this is an implementation I've been effectively using in another project where we stream NV12 images, which is probably the next feature I wanted to support in the StreamCapture component:

pub fn yuv420sp_to_rgb8(image: &CompressedImage) -> Image<u8, 3> {
    let rows = image.rows();
    let cols = image.cols();

    let mut rgb8_data = vec![0; rows * cols * 3];
    // NV12 layout: a full-resolution Y plane followed by an interleaved,
    // half-resolution UV plane.
    let y_plane = &image.as_slice()[..rows * cols];
    let uv_plane = &image.as_slice()[rows * cols..];

    for y in 0..rows {
        let index = y * cols;
        for x in 0..cols {
            let y_index = index + x;
            let uv_index = (y / 2) * (cols / 2) * 2 + (x / 2) * 2;

            let y = y_plane[y_index] as f32;
            let u = uv_plane[uv_index] as f32 - 128.0;
            let v = uv_plane[uv_index + 1] as f32 - 128.0;

            let r = (y + 1.402 * v).clamp(0.0, 255.0) as u8;
            let g = (y - 0.34414 * u - 0.71414 * v).clamp(0.0, 255.0) as u8;
            let b = (y + 1.772 * u).clamp(0.0, 255.0) as u8;

            let rgb_index = y_index * 3;
            rgb8_data[rgb_index] = r;
            rgb8_data[rgb_index + 1] = g;
            rgb8_data[rgb_index + 2] = b;
        }
    }

@awxkee commented Apr 9, 2025

This is the BT.601 full range matrix, not limited.
This is the effect you'll see when using an invalid YUV range:
gcanat/video_reader-rs#45

I'm sure that 99% of cameras stream NV12 in limited range, but I don't know the context. Since I'm here, I just prefer to clarify how it works.

You could re-check the matrix against the BT.601 limited one in OpenCV and see that it's different, and that the bias is missing in your code, if it actually should be NV12 BT.601 limited:
https://github.com/opencv/opencv/blob/42a132088cfc2060e9ae816bbcf7ebfcab4f1de8/modules/imgproc/src/color_yuv.simd.hpp#L1035
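
For reference, the per-pixel step with the commonly quoted BT.601 limited-range constants and bias looks roughly like this (rounded coefficients for illustration; this is neither the OpenCV code nor code from this PR):

// Sketch: BT.601 *limited range* YUV -> RGB8 for one pixel. Note the Y bias of
// 16 and the ~1.164 (255/219) luma scale, which full-range code omits.
fn yuv601_limited_to_rgb8(y: u8, u: u8, v: u8) -> (u8, u8, u8) {
    let y = 1.164 * (y as f32 - 16.0); // Y nominally lives in [16, 235]
    let u = u as f32 - 128.0;          // U/V nominally live in [16, 240]
    let v = v as f32 - 128.0;

    let r = (y + 1.596 * v).clamp(0.0, 255.0) as u8;
    let g = (y - 0.392 * u - 0.813 * v).clamp(0.0, 255.0) as u8;
    let b = (y + 2.017 * u).clamp(0.0, 255.0) as u8;
    (r, g, b)
}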

@awxkee commented Apr 9, 2025

I once wrote a simple and reasonably effective implementation for image-rs that supports an almost arbitrary inversion matrix.
https://github.com/image-rs/image/blob/f337e27aadaae8b86484429bc6020fef8a019c95/src/codecs/avif/yuv.rs#L625
I think it might be easier for @omar-abdelgawad to just drop fixed point in there in a few minutes and reuse it :)
Or, if RGB8 is the target, it might be reused as-is.

@edgarriba (Member)

got it -- actually you found a bug in my other project :)

Agreed, let's do u8 with limited range then. For simplicity and consistency with the rest of the library, I would like us to keep separate functionality under different signatures, e.g. yuv420_limited_to_rgb8. Generalization and high-level APIs will come later, separately (we are still early in terms of global design).
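
Something of this shape, for example (an illustrative sketch of the signature only, not a settled API):

// Illustrative signature only: planar I420 input, BT.601 limited range, rgb8 output.
pub fn yuv420_limited_to_rgb8(
    y_plane: &[u8],
    u_plane: &[u8],
    v_plane: &[u8],
    dst: &mut Image<u8, 3>,
) -> Result<(), ImageError> {
    // per-pixel limited-range conversion plus 4:2:0 chroma upsampling would go here
    todo!()
}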

@edgarriba (Member)

I'll spend some time enabling support for I420 in the GStreamer pipeline to validate this. /cc @AS1100K
