Skip to content

[FEATURE REQUEST] Add DCT #702

Open
Open
@zeroby0

Description

@zeroby0

Add DCT-2 (Type-II Discrete Cosine Transform) functionality to ulab, similar to SciPy's implementation. DCT is widely used in image and audio processing applications.

Wikipedia: https://en.wikipedia.org/wiki/Discrete_cosine_transform
SciPy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.dct.html

Technical Details

  • Implementation can be based on FFT using one of four established methods (detailed in this DSP StackExchange post)
  • I've benchmarked all four methods on a Raspberry Pi Pico using the 2025-01-01 micropython-builder build
  • Benchmark results and the code are at the end of this post

Benefits

  • Essential transform for signal processing applications
  • Minimal code footprint when implemented using existing FFT
  • Potential for performance improvement with C implementation (as opposed to users writing their own DCT)
  • Aligns with ulab's goal of providing NumPy/SciPy-like functionality

Benchmark

  • DCT_2N_M is the mirrored version
  • DCT_2N_P is the padded version.
  • Exprange is the time taken for np.exp(...) (refer to code below) , which can be memoized.
N DCT_4N DCT_2N_M DCT_2N_P DCT_N FFT EXPRANGE
4 1.455 1.976 2.026 2.076 0.462 0.992
8 2.136 2.541 2.568 2.476 0.678 1.199
16 3.618 3.691 3.741 3.293 1.096 1.654
32 6.819 6.125 6.244 4.923 1.735 2.532
64 13.798 11.278 11.448 8.36 3.23 4.312
128 28.994 22.136 22.56 15.504 6.481 7.843
256 61.7 45.2 47.0 30.6 13.2 14.6
512 133.5 96.0 98.5 62.8 28.6 29.5
1024 284.9 201.0 205.3 132.6 62.1 59.1
2048 611.8 416.7 425.3 268.6 134.5 117.2

I would recommend DCT_N (DCT_FFTN) in code with the L = 2 * np.exp(...) memoized for values of N for the last few calls. FFT compute and L = 2 * np.exp(...) cache memory occupied are the least for it.

If there is interest and drive to implement this feature, I will help find FFT implementations for other forms of DCT. I would implement this and create a PR, but wanted to check the waters first, is there interest in the community for this?

Code

from ulab import numpy as np
from time import ticks_ms, ticks_diff


def timer(func):
    def wrapper(*args, **kwargs):
        start = ticks_ms()
        for i in range(10):
            result = func(*args, **kwargs)
        end = ticks_ms()
        print(f"[] {func.__name__} took {ticks_diff(end, start)/10} milliseconds")
        return result
    return wrapper


@timer
def FFT(x):
    return np.fft.fft(x)


@timer
def DCT_FFT4N(x):
    N = x.shape[-1]

    u = np.zeros(4 * N)
    u[1:2*N:2] = x
    u[2*N+1::2] = x[::-1]

    U = np.fft.fft(u)[:N]
    return U.real


@timer
def DCT_FFT2NM(x):
    N = x.shape[-1]

    y = np.zeros(2*N)
    y[:N] = x
    y[N:] = x[::-1]

    Y = np.fft.fft(y)[:N]

    L = np.exp(-1j*np.pi*np.arange(N)/(2*N))

    return (Y.real * L.real) - (Y.imag * L.imag)


@timer
def DCT_FFT2NP(x):
    N = x.shape[-1]

    y = np.zeros(2*N)
    y[:N] = x

    Y = np.fft.fft(y)[:N]

    L = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))

    return (Y.real * L.real) - (Y.imag * L.imag)


@timer
def DCT_FFTN(x):
    N = x.shape[-1]

    v = np.zeros(x.shape, dtype=x.dtype)
    v[:(N-1)//2+1] = x[::2]

    if N % 2: # odd length
        v[(N-1)//2+1:] = x[-2::-2]
    else: # even length
        v[(N-1)//2+1:] = x[::-2]

    V = np.fft.fft(v)

    L = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))

    return (V.real * L.real) - (V.imag * L.imag)


@timer
def EXPRANGE(N):
    L = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))
    return L


for i in range(2, 12):

    P = 2**i

    x = np.arange(P)

    print(f"\nP = {P}")
    y = FFT(x)
    y = DCT_FFT4N(x)
    y = DCT_FFT2NM(x)
    y = DCT_FFT2NP(x)
    y = DCT_FFTN(x)

    z = EXPRANGE(P)

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions