Fourier Transforms
Fourier Transforms can be used to transform (change one set of data to another set of data) between time space and frequency space. The basic idea is that every signal, no matter how complex, can be represented using pure sinusoids at different frequencies. There are 4 different categories of the fourier transform, but for DSP we’re mostly concerned with the Discrete Fourier Transform, which is the fourier transform that concerns a sampled (discrete), periodic signal. Each of these categories can be split out further into real and complex versions, which take a different input.
A Fourier Transform is relatively complex for non-mathematicians (like me) so before continuing into the exact working i’d recommend reading about Complex Numbers as well as Polar- & Rectangular Notation.
Discrete Fourier Transform (DFT)
The discrete fourier transform is a function that returns a series of complex numbers that represents the frequency content of it’s input signal.
The amount of complex numbers returned (we call these bins) depends on the length of the input signal. This is what we call the resolution of the signal. The resolution is generall half of the signal length, because you need a signal to make at least one cycle in the window to properly detect it’s frequency. The equations for the Discrete Fourier Transform are as follows:
$$ ReX[k] = \sum_{i = 0}^{N-1}x[i]\cos(2\pi k i/N) \newline ImX[k] = -\sum_{i = 0}^{N - 1}x[i]\sin(2\pi k i/N) $$
Where $N$ is the size of the input signal with $k$ running from 0 to $N/2$ and $i$ running from 0 to $N - 1$.
Synthesis using the inverse DFT
The inverse FFT can be used to reconstruct the time domain signal from the frequency domain signal. A bit more work goes into making this work though, due to scaling factor issues.
This is the equation for the inverse DFT:
$$ x[i] = \sum_{k = 0}^{N/2} Re\bar{X}[k]\cos(2\pi ki/N) + \sum_{k = 0}^{N/2} Im\bar{X}[k]\sin(2\pi ki/N) $$
This can construct a discrete signal $x$ with $i$ running over time from 0 to $N - 1$. The amplitudes we need for reconstructing the signal are slightly different than the frequency domain. The equations to compensate for that are as follows:
$$ Re\bar{X}[k] = \frac{ReX[k]}{N/2} \newline Im\bar{X}[k] = -\frac{ImX[k]}{N/2} $$
except for two special values of k:
$$ Re\bar{X}[0] = \frac{ReX[0]}{N} \newline Re\bar{X}[N/2] = \frac{ReX[N/2]}{N/2} $$
Parseval’s Relation
Because the time and frequency representations represent the same signal, their energy must be the same. This is known as Parseval’s Relation and can be expressed for every Fourier transform.
For the DFT this is expressed as follows:
$$ \sum_{i=0}^{N-1}x[i]^2 = \frac{2}{N}\sum_{k=0}^{N/2}MagX[k]^2 $$
Complex DFT
The Complex DFT is a “more sophisticated” version of the DFT, that takes in 2 $N$ point time domain signals and returns 2 $N$ point domain signals (both in the form of complex numbers, hence the name). To convert a real DFT input to an input for the complex DFT what we do is we keep the DFT input that we already have for the real DFT and set all the samples in the imaginary part to 0.
The negative frequencies in a complex DFT are included in the second part of the DFT. This means that if you have a window of length $N$, the positive frequencies are contained in $0$ - $N/2$ and the negative frequencies in $N/2 + 1$ - $N - 1$. Like with the Real DFT the negative frequencies are symmetric, meaning point $N/2 + 1$ is the same as $N/2 - 1$. The same thing goes for the complex part, except that the sign is flipped. This is not very efficient to do on computers (certainly not good enough for real time analysis), which is where the next version of the Discrete Fourier Transform comes in.
Discrete Time Fourier Transform (DTFT)
The Discrete Time Fourier Transform is a form of fourier transform that takes a discrete, non-periodic input signal and returns a Complex number. To quote the DSP guide:
This is the DTFT, the Fourier transform that relates an aperiodic, discrete signal, with a periodic, continuous frequency spectrum.
The analysis equotion for the DTFT is as follows:
$$ ReX(\omega) = \sum_{n = -\infty}^{+\infty}x[n]\cos(\omega n) \newline ImX(\omega) = -\sum_{n=-\infty}^{+\infty}x[n]\sin(\omega n) $$
With the synthesis equation being
$$ x[n] = \frac{1}{\pi}\int_{0}^{\pi}ReX(\omega)\cos(\omega n) - ImX(\omega)\sin(\omega n)d\omega $$
In these formulas the following variables are used:
- $k$, which is an index that runs from 0 to $N/2$
- $f$ which is the fraction of the sampling rate, running from 0 to 0.5.
- $\omega$ which can be (and commonly is) used instead of $f$. This is the fraction of the sample rate (like $f$ is) but expressed as the natural frequency.
$\omega$ tends to be used because it’s shorter to write (you don’t have to do the whole $2\pi f$ thing).
Another thing to note is that many people use $\Omega$ to represent the frequency in the DTFT (instead of $\omega$), these function the same though.
The Inverse DTFT doesn’t need the same compensation as the Inverse DFT does. But has a similar compensation using $1/\pi$. Which you should use to compensate either in the analysis or synthesis function (note: some people split the term and put $1/\sqrt{\pi}$ in front of both functions). Because the inverse DTFT is a continuous function, there is no need to treat the start/end of the window differently though.
Because the DTFT contains infinite sums and integrals, it cannot be used in digital audio. It’s main use (according to the DSP guide)is in theoretical problems as an alternative to the DFT.
Fast Fourier Transform
Starting of the section with a quote from The Scientist and Engineer’s Guide to Digital Signal Processing (emphasis mine):
The FFT is a complicated algorithm, and its details are usually left to those that specialize in such things. This section describes the general operation of the FFT, but skirts a key issue: the use of complex numbers. If you have abackground in complex mathematics, you can read between the lines to understand the true nature of the algorithm. Don’t worry if the details elude you; few scientists and engineers that use the FFT could write the program from scratch.
The FFT works by splitting up a signal of a length $N$ (which has to be a power of 2) down recursively until it can processes $N$ signals of one point. This happens in an interlaced fashion, splitting between samples with an odd and an even index.
The same result of this can be found by bit reversal sorting the input signal.
The step after is calculating the frequency spectra of the single point signals, which, because it’s a single point, will be itself.