Description
Add DCT-2 (Type-II Discrete Cosine Transform) functionality to ulab, similar to SciPy's implementation. DCT is widely used in image and audio processing applications.
Wikipedia: https://en.wikipedia.org/wiki/Discrete_cosine_transform
SciPy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.dct.html
Technical Details
- Implementation can be based on FFT using one of four established methods (detailed in this DSP StackExchange post)
- I've benchmarked all four methods on a Raspberry Pi Pico using the 2025-01-01 micropython-builder build
- Benchmark results and the code are at the end of this post
Benefits
- Essential transform for signal processing applications
- Minimal code footprint when implemented using existing FFT
- Potential for performance improvement with C implementation (as opposed to users writing their own DCT)
- Aligns with ulab's goal of providing NumPy/SciPy-like functionality
Benchmark
DCT_2N_M
is the mirrored versionDCT_2N_P
is the padded version.- Exprange is the time taken for
np.exp(...)
(refer to code below) , which can be memoized.
N | DCT_4N | DCT_2N_M | DCT_2N_P | DCT_N | FFT | EXPRANGE |
---|---|---|---|---|---|---|
4 | 1.455 | 1.976 | 2.026 | 2.076 | 0.462 | 0.992 |
8 | 2.136 | 2.541 | 2.568 | 2.476 | 0.678 | 1.199 |
16 | 3.618 | 3.691 | 3.741 | 3.293 | 1.096 | 1.654 |
32 | 6.819 | 6.125 | 6.244 | 4.923 | 1.735 | 2.532 |
64 | 13.798 | 11.278 | 11.448 | 8.36 | 3.23 | 4.312 |
128 | 28.994 | 22.136 | 22.56 | 15.504 | 6.481 | 7.843 |
256 | 61.7 | 45.2 | 47.0 | 30.6 | 13.2 | 14.6 |
512 | 133.5 | 96.0 | 98.5 | 62.8 | 28.6 | 29.5 |
1024 | 284.9 | 201.0 | 205.3 | 132.6 | 62.1 | 59.1 |
2048 | 611.8 | 416.7 | 425.3 | 268.6 | 134.5 | 117.2 |
I would recommend DCT_N (DCT_FFTN
) in code with the L = 2 * np.exp(...)
memoized for values of N for the last few calls. FFT compute and L = 2 * np.exp(...)
cache memory occupied are the least for it.
If there is interest and drive to implement this feature, I will help find FFT implementations for other forms of DCT. I would implement this and create a PR, but wanted to check the waters first, is there interest in the community for this?
Code
from ulab import numpy as np
from time import ticks_ms, ticks_diff
def timer(func):
def wrapper(*args, **kwargs):
start = ticks_ms()
for i in range(10):
result = func(*args, **kwargs)
end = ticks_ms()
print(f"[] {func.__name__} took {ticks_diff(end, start)/10} milliseconds")
return result
return wrapper
@timer
def FFT(x):
return np.fft.fft(x)
@timer
def DCT_FFT4N(x):
N = x.shape[-1]
u = np.zeros(4 * N)
u[1:2*N:2] = x
u[2*N+1::2] = x[::-1]
U = np.fft.fft(u)[:N]
return U.real
@timer
def DCT_FFT2NM(x):
N = x.shape[-1]
y = np.zeros(2*N)
y[:N] = x
y[N:] = x[::-1]
Y = np.fft.fft(y)[:N]
L = np.exp(-1j*np.pi*np.arange(N)/(2*N))
return (Y.real * L.real) - (Y.imag * L.imag)
@timer
def DCT_FFT2NP(x):
N = x.shape[-1]
y = np.zeros(2*N)
y[:N] = x
Y = np.fft.fft(y)[:N]
L = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))
return (Y.real * L.real) - (Y.imag * L.imag)
@timer
def DCT_FFTN(x):
N = x.shape[-1]
v = np.zeros(x.shape, dtype=x.dtype)
v[:(N-1)//2+1] = x[::2]
if N % 2: # odd length
v[(N-1)//2+1:] = x[-2::-2]
else: # even length
v[(N-1)//2+1:] = x[::-2]
V = np.fft.fft(v)
L = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))
return (V.real * L.real) - (V.imag * L.imag)
@timer
def EXPRANGE(N):
L = 2 * np.exp(-1j*np.pi*np.arange(N)/(2*N))
return L
for i in range(2, 12):
P = 2**i
x = np.arange(P)
print(f"\nP = {P}")
y = FFT(x)
y = DCT_FFT4N(x)
y = DCT_FFT2NM(x)
y = DCT_FFT2NP(x)
y = DCT_FFTN(x)
z = EXPRANGE(P)