[FEATURE REQUEST] Add DCT #702

Open · zeroby0 opened this issue Jan 3, 2025 · 6 comments
Labels
enhancement New feature or request

Comments

zeroby0 commented Jan 3, 2025

Add DCT-2 (Type-II Discrete Cosine Transform) functionality to ulab, similar to SciPy's implementation. DCT is widely used in image and audio processing applications.

Wikipedia: https://en.wikipedia.org/wiki/Discrete_cosine_transform
SciPy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.dct.html

Technical Details

  • Implementation can be based on FFT using one of four established methods (detailed in this DSP StackExchange post)
  • I've benchmarked all four methods on a Raspberry Pi Pico using the 2025-01-01 micropython-builder build
  • Benchmark results and the code are at the end of this post

Benefits

  • Essential transform for signal processing applications
  • Minimal code footprint when implemented using existing FFT
  • Potential for performance improvement with C implementation (as opposed to users writing their own DCT)
  • Aligns with ulab's goal of providing NumPy/SciPy-like functionality

Benchmark

  • DCT_2N_M is the mirrored version.
  • DCT_2N_P is the padded version.
  • EXPRANGE is the time taken for L = 2 * np.exp(...) (see the code below), which can be memoized.
| N | DCT_4N | DCT_2N_M | DCT_2N_P | DCT_N | FFT | EXPRANGE |
|---:|---:|---:|---:|---:|---:|---:|
| 4 | 1.455 | 1.976 | 2.026 | 2.076 | 0.462 | 0.992 |
| 8 | 2.136 | 2.541 | 2.568 | 2.476 | 0.678 | 1.199 |
| 16 | 3.618 | 3.691 | 3.741 | 3.293 | 1.096 | 1.654 |
| 32 | 6.819 | 6.125 | 6.244 | 4.923 | 1.735 | 2.532 |
| 64 | 13.798 | 11.278 | 11.448 | 8.36 | 3.23 | 4.312 |
| 128 | 28.994 | 22.136 | 22.56 | 15.504 | 6.481 | 7.843 |
| 256 | 61.7 | 45.2 | 47.0 | 30.6 | 13.2 | 14.6 |
| 512 | 133.5 | 96.0 | 98.5 | 62.8 | 28.6 | 29.5 |
| 1024 | 284.9 | 201.0 | 205.3 | 132.6 | 62.1 | 59.1 |
| 2048 | 611.8 | 416.7 | 425.3 | 268.6 | 134.5 | 117.2 |

All times are in milliseconds, averaged over 10 runs.

I would recommend DCT_N (DCT_FFTN in the code), with L = 2 * np.exp(...) memoized for the values of N used in the last few calls. Of the four methods, it requires the smallest FFT, and its memoized L cache occupies the least memory.

If there is interest and drive to implement this feature, I will help find FFT implementations for the other forms of DCT. I would implement this and create a PR, but wanted to test the waters first: is there interest in the community for this?

Code

from ulab import numpy as np
from time import ticks_ms, ticks_diff


def timer(func):
    def wrapper(*args, **kwargs):
        start = ticks_ms()
        for i in range(10):
            result = func(*args, **kwargs)
        end = ticks_ms()
        print(f"[] {func.__name__} took {ticks_diff(end, start)/10} milliseconds")
        return result
    return wrapper


@timer
def FFT(x):
    return np.fft.fft(x)


@timer
def DCT_FFT4N(x):
    N = x.shape[-1]

    u = np.zeros(4 * N)
    u[1:2*N:2] = x
    u[2*N+1::2] = x[::-1]

    U = np.fft.fft(u)[:N]
    return U.real


@timer
def DCT_FFT2NM(x):
    N = x.shape[-1]

    y = np.zeros(2*N)
    y[:N] = x
    y[N:] = x[::-1]

    Y = np.fft.fft(y)[:N]

    L = np.exp(-1j*np.pi*np.arange(N)/(2*N))

    return (Y.real * L.real) - (Y.imag * L.imag)


@timer
def DCT_FFT2NP(x):
    N = x.shape[-1]

    y = np.zeros(2*N)
    y[:N] = x

    Y = np.fft.fft(y)[:N]

    L = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))

    return (Y.real * L.real) - (Y.imag * L.imag)


@timer
def DCT_FFTN(x):
    N = x.shape[-1]

    v = np.zeros(x.shape, dtype=x.dtype)
    v[:(N-1)//2+1] = x[::2]

    if N % 2: # odd length
        v[(N-1)//2+1:] = x[-2::-2]
    else: # even length
        v[(N-1)//2+1:] = x[::-2]

    V = np.fft.fft(v)

    L = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))

    return (V.real * L.real) - (V.imag * L.imag)


@timer
def EXPRANGE(N):
    L = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))
    return L


for i in range(2, 12):

    P = 2**i

    x = np.arange(P)

    print(f"\nP = {P}")
    y = FFT(x)
    y = DCT_FFT4N(x)
    y = DCT_FFT2NM(x)
    y = DCT_FFT2NP(x)
    y = DCT_FFTN(x)

    z = EXPRANGE(P)

zeroby0 added the enhancement label on Jan 3, 2025

zeroby0 (Author) commented Jan 3, 2025

People tend to run the same-sized DCT many times in their code, so we can alternatively do

def mkDCT(N):

    L_N = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))
    def DCT_N(x):
        y = np.zeros(x.shape)

        y[:(N-1)//2+1] = x[::2]

        if N % 2: # odd length
            y[(N-1)//2+1:] = x[-2::-2]
        else: # even length
            y[(N-1)//2+1:] = x[::-2]

        Y = np.fft.fft(y)

        return (Y.real * L_N.real) - (Y.imag * L_N.imag)

    return DCT_N

So we can have memoization, and also have it garbage collected when it's not needed. This breaks correspondence with the scipy API though.

v923z (Owner) commented Jan 3, 2025

If compatibility with scipy is a concern, you should implement this as part of the utils module. I would be a bit cautious with the benchmarks above, for the simple reason that all your functions generate a lot of one-time-use artifacts, thus, I believe that the real gain is actually larger. You can look at the implementation of https://micropython-ulab.readthedocs.io/en/latest/ulab-utils.html#spectrogram, where you can supply pre-allocated RAM to the method.

Having said that, if you think that there is interest in this function, I would be happy to incorporate it here. If you want to boost the visibility of the issue, you can also post in https://github.com/v923z/micropython-ulab/discussions.

zeroby0 (Author) commented Jan 4, 2025

It would be very nice if this could be even faster! I've been looking at the source and documentation of spectrogram and FFT, trying to find a way to avoid creating the y variable, but I haven't figured it out yet. However, y can be shared across DCT calls.

That said, it seems most of the time is spent in the FFT and in generating L_N with the exponential. The gain from removing the y variable might be less than 5%. But I would love to be proven wrong here! Even if the improvement is small, I love seeing clever code!

Do you have any comment on why the FFT is taking so long? The Pico has hardware floating-point support, so I expected it to be much faster than the ulab_samples benchmarks. NVM, the Pico's RP2040 doesn't have an FPU.

| Operation | Time |
|---|---:|
| FFT | 59.3 ms |
| Exponent generation | 58.2 ms |
| Complex multiplication | 8.6 ms |
| Zero allocation | 0.1 ms |
| Rest of the code | 2.7 ms |
def mkDCT(N):

    L_N = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))
    y   = np.zeros(N)

    @timer
    def DCT_N(x):

        y[:(N-1)//2+1] = x[::2]

        if N % 2: # odd length
            y[(N-1)//2+1:] = x[-2::-2]
        else: # even length
            y[(N-1)//2+1:] = x[::-2]

        Y = np.fft.fft(y)

        return (Y.real * L_N.real) - (Y.imag * L_N.imag)

    return DCT_N
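The per-operation breakdown in the table above could be reproduced with a sketch along these lines. The helper name and structure are hypothetical, not from the thread; it falls back to time.perf_counter so it also runs under CPython/NumPy:

```python
import numpy as np  # "from ulab import numpy as np" on the device

try:
    from time import ticks_us, ticks_diff  # MicroPython
except ImportError:
    import time
    ticks_us = lambda: int(time.perf_counter() * 1_000_000)
    ticks_diff = lambda end, start: end - start

def profile_stages(x, reps=10):
    """Time each stage of the N-point-FFT DCT separately (microseconds)."""
    N = x.shape[-1]
    stages = {}

    t0 = ticks_us()
    for _ in range(reps):
        v = np.zeros(N)
    stages["zero allocation"] = ticks_diff(ticks_us(), t0) / reps

    # Reorder once, so the FFT timing below sees realistic data.
    v[:(N - 1) // 2 + 1] = x[::2]
    v[(N - 1) // 2 + 1:] = x[-2::-2] if N % 2 else x[::-2]

    t0 = ticks_us()
    for _ in range(reps):
        V = np.fft.fft(v)
    stages["FFT"] = ticks_diff(ticks_us(), t0) / reps

    t0 = ticks_us()
    for _ in range(reps):
        L = 2 * np.exp(-1j * np.pi * np.arange(N) / (2 * N))
    stages["exponent generation"] = ticks_diff(ticks_us(), t0) / reps

    t0 = ticks_us()
    for _ in range(reps):
        out = (V.real * L.real) - (V.imag * L.imag)
    stages["complex multiplication"] = ticks_diff(ticks_us(), t0) / reps

    return stages
```

Calling profile_stages(np.arange(2048.0)) on the Pico should roughly reproduce the FFT/exponent split shown above; absolute numbers will differ by hardware and build.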

v923z (Owner) commented Jan 4, 2025

> It would be very nice if this can be even faster! I've been looking at the source and documentation spectrogram and FFT, and figuring out a way to not create the y variable, but haven't figured it out yet. However, y can be shared across DCT calls.

I'm not quite sure I follow: in the FFT, you return a new array, because you want to leave x alone. So, can't you just do the assignment in C like here?

        *tmp++ = func2(array);
        array += in2->strides[ULAB_MAX_DIMS - 1];
    }
    tmp -= len;
} else {
    // if there is only one input argument, clear the imaginary part
    memset(tmp, 0, len * sizeof(mp_float_t));
}
tmp -= len;
fft_kernel(tmp, tmp + len, len, 1);

> However, it seems most of the time is spent in FFT and in generating the L_N with the exponent.

On a microcontroller, that's not the most efficient code: you calculate the exponential of a purely imaginary number, and then you also take the real and imaginary parts at the very end of your function. The same can be achieved by taking the sin and cos of the same array. You can save a lot by that. Standard math functions accept the out keyword argument, which means that you can also re-use RAM, if you calculate Y.real * L_N.real - Y.imag * L_N.imag in two separate calls.

Also note that the instruction L_N = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N)) creates

  1. an integer array (np.arange(N)),
  2. then a float array (np.arange(N)/(2*N)),
  3. then another float array (np.pi * np.arange(N)/(2*N)),
  4. then a complex array (-1j*np.pi*np.arange(N)/(2*N)),
  5. then another complex array (np.exp(-1j*np.pi*np.arange(N)/(2*N))),
  6. and finally another complex array (2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))).

And after all this, you then take L_N.real and L_N.imag, which then generate two real arrays.

This is really a lot of hassle, when you actually want something like this:

t = np.linspace(0, 0.5*np.pi, num=N, endpoint=False)

...

Y.real *= np.cos(t)
Y.imag *= np.sin(t)
Y.real += Y.imag
Y.real *= 2

return Y.real

If you really want to, you can save one memory allocation by doing this:

t = np.linspace(0, 0.5*np.pi, num=N, endpoint=False)

...
scratchpad = np.zeros(N)
np.cos(t, out=scratchpad)

Y.real *= scratchpad

np.sin(t, out=scratchpad)

Y.imag *= scratchpad
Y.real += Y.imag
Y.real *= 2

return Y.real
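As a desktop sanity check (plain NumPy rather than ulab), the N-point method with the cos/sin rescaling reproduces the direct DCT-II definition X[k] = 2 Σ x[n] cos(πk(2n+1)/(2N)). Note the combination must be `+`, since Re(V · 2e^{-it}) expands to 2(V.real·cos t + V.imag·sin t):

```python
import numpy as np

def dct2_fftn(x):
    # DCT_FFTN from above, with exp() replaced by cos/sin.
    N = x.shape[-1]
    v = np.zeros(N)
    v[:(N - 1) // 2 + 1] = x[::2]          # even-index samples, in order
    if N % 2:  # odd length
        v[(N - 1) // 2 + 1:] = x[-2::-2]   # odd-index samples, reversed
    else:      # even length
        v[(N - 1) // 2 + 1:] = x[::-2]
    V = np.fft.fft(v)
    t = np.pi * np.arange(N) / (2 * N)     # endpoint-excluded angles pi*k/(2N)
    # Re(V * 2*exp(-1j*t)) = 2*(V.real*cos(t) + V.imag*sin(t))
    return 2 * (V.real * np.cos(t) + V.imag * np.sin(t))

def dct2_direct(x):
    # Textbook DCT-II, O(N^2), as ground truth.
    N = x.shape[-1]
    n = np.arange(N)
    return np.array([2 * np.sum(x * np.cos(np.pi * k * (2 * n + 1) / (2 * N)))
                     for k in range(N)])

x = np.arange(16.0)
assert np.allclose(dct2_fftn(x), dct2_direct(x))
```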

I don't know where the most time is spent, but since the operations are all in-place, your RAM won't be fragmented. I expect this to be very processor-specific, thus, hard numbers probably won't mean too much.

zeroby0 (Author) commented Jan 4, 2025

I thought I knew how to write performant code and, reading this reply, I'm just amazed. I've never seriously written for micro-controllers, but now I wish I had. I'm going to try out these ideas first thing in the morning! Thank you so much!

v923z (Owner) commented Jan 4, 2025

As I said, gains are specific to the hardware, even to the compiler settings, so you might have to experiment. But here are a couple of comments: https://micropython-ulab.readthedocs.io/en/latest/ulab-tricks.html, and I would also highly recommend Damien George's talk at pyconau: https://www.youtube.com/watch?v=hHec4qL00x0.

Part of the problem here is that the word efficient is ill-defined. Do you mean that the code should run fast? Or that it should consume little RAM? Or it could consume RAM, but it shouldn't call the garbage collector all the time? Or it should consume little flash? Or should the call overhead be minimal? These are not trivial questions, and your answer will depend on many factors, on the application, on the hardware, on the acceptable complexity of the firmware etc.
