# FFT short time fourier transform (STFT) question

Discussion in 'Electronic Design' started by Jamie, Aug 11, 2011.

1. ### JamieGuest

Hi,

I have begun to use an FFT in software, and I started thinking about how
to turn the intensity over frequency FFT chart into a frequency over
time chart, ie, by making a strip chart, time/frequency domain "Gabor
spectrum STFT" with each XY point of the chart having an intensity based
on the power at that frequency. So I was thinking to take that a step
further (and push the limits of a modern CPU), instead of generating the
strip chart data with FFT blocks of 1000 samples, what would happen if
there was an FFT generated for each incoming sample, ie 1000 FFT's per
second for a 1kHz sample rate, with each new FFT centered on the next
incoming sample. If this data was then stripcharted (1000 new columns
per second added to the chart) for a sine wave input of 250Hz, what
would the frequency over time greyscale image look like? For a normal
Gabor spectrum STFT chart (ie. 1 FFT per 1000 samples) with a 250Hz sine
wave input and 1kHz sampling frequency, the chart should just show a
steady intensity peak from left to right half way up the chart (500Hz
Nyquist frequency would be the top of the chart). Also this might be a
good way to encode/decode data for communication, in a 2-dimensional
representation that is converted via FFT's to and from voltage/time for
each sample, as long as the hardware can do one FFT per sample.

Gabor spectrum (STFT short time fourier transform chart:
http://sonicawe.com/content/tutorials/img/tut5/image06.png

cheers,
Jamie

2. ### JamieGuest

I think the real benefit of 1FFT per sample is it gives instant impulse
response compared to 1FFT per 1000 samples. ie. if the 250Hz sine wave
is changed to 225Hz, each new per sample FFT will have a moving
intensity line and the slope of this line will be proportional to how
fast the 250Hz was changed to 225Hz.

cheers,
Jamie

3. ### Martin BrownGuest

The main effect of this if as seems likely from the question that you
don't know what you are doing is that you will see variable ripple
effects in the sidebands caused by the edge discontinuities at the start
and end of the sampling window. You would be better off doing larger
overlaps if you really need time domain resolution beyond what
sequential blocking will give you. The term for it in their application
is "Overlap". BTW such diagrams are often easier to interpret in false
colour rather than monochrome since you can distinguish several orders
of magnitude difference in signal strength more easily.
DAQARTA does this stuff out of the box and is pretty good at it too.
Why reinvent the wheel?

Regards,
Martin Brown

4. ### JamieGuest

I'm just using a rectangular FFT windowing function so far.

You would be better off doing larger
I think there must be a formula to show that there is increased
frequency resolution available based on the amount of FFT overlap. ie.
like ADC voltage oversampling gives more effective voltage resolution
bits, overlapped FFT's should give more frequency bins, up to the max
overlap of the FFT window size - 1 sample. (1 FFT per sample).
Theoretically to add the most extra frequency bins at the highest time
resolution, the FFT window size should be as large as possible (n), and
the FFT overlap should be the FFT window size - 1 sample.

I take a wild guess that for non-overlap FFT with 2^x bins, adding the
full n-1 overlap can theoretically increase the effective number of
frequency bins to 2^(x+p) where p is some small integer related to the
FFT window size n. For an FFT window size n = 2^10, p might be 5 or 10
at most. That would require some form of post processing to generate
the missing extra frequency bins from all that FFT data though, similar
to decimation of ADC sample rate. Also 1 FFT per sample will generate
data at 1024x the rate of the sample frequency for a FFT window size of
1024, and use 1024 times more CPU than normal FFT code, but its possible
to get more effective frequency bins I am pretty sure with that technique.

cheers,
Jamie

5. ### JamieGuest

DAQARTA has the limitation that there is a tradeoff between frequency
and temporal resolution in the spectrogram output.

I found this: www.comm.utoronto.ca/~dimitris/ece431/slidingdft.pdf

sliding DFT, it is designed to do 1FFT per incoming sample, and once the
first FFT is done, each successive FFT only takes "two real adds and one
complex multiply" independent of the FFT window size! (assuming the last
incremental FFT is available)

So now I just need to find an FFT decimation filter to crunch the 1 FFT
per sample data down, then can increase frequency and temporal
resolution on a spectrogram at the same time

cheers,
Jamie

6. ### Martin BrownGuest

There is *always* a tradeoff between frequency resolution and time. They
are conjugate variables in the Fourier transform, confine one and the
other spreads out - this is intrinsic to the mathematics.

If you want better frequency resolution and have decent signal to noise
then you can use maximum entropy which gives the optimum tradeoff of
resolution against signal to noise but with some risk of unfamiliar
artefacts for practitioners not skilled in Fourier interpretation.

An introduction is in Numerical Recipes (whilst I don't think much of
their algorithm or code it does sort of work)

http://homepage.univie.ac.at/mario.barbatti/papers/NRF/bookfpdf/f13-7.pdf

Better algorithms do exist. You can't use them realtime unless you have
an exceptionally fast processor, but offline it will give you the most
precise answer for the frequencies of significant components in a time
series with an accuracy that depends on their intensity.

Strong signals have a much better defined frequency than weak ones.
Overfit the system noise and they will spuriously split into two.
You do realise this only calculates a single known frequency component
don't you?

To do a whole spectrum this way would be O(N^2) not fast at all. Or
require massive hardware parallelism (as is done for sigint devices).

Regards,
Martin Brown

7. ### JamieGuest

Hi,

Thanks, I missed that line: "the kth value of the DFT". I am starting
to think the best method for what I would like to do is to use parallel
bandpass filters, with slight overlap. Setting a threshold trigger for
delta-power over time on the output of each of these filters can give
accurate time resolution of frequency events that an FFT can't do.

cheers,
Jamie

8. ### Jon KirwanGuest

http://www.maxim-ic.com/app-notes/index.mvp/id/1762

Jon

9. ### Jon KirwanGuest

That's fast, but I was doing 1024 pt, complex-in, complex-out
floating point FFTs on an integer-only ADSP-2111 and memory
says it took slightly over 3ms -- back in 1992 and before
FFTW was around. I'd have to dig up the details to be
absolutely sure, but I coded it and that's what I recall now
as the rate (we needed less than 4.5ms.) I'm not so
impressed that in 2006 on fancy and big FPGAs the rate would
be 1.2ms (and not knowing if it handled complex in/out
values.) I learned out of Brigham's Fast Fourier Transform
and Its Applications. A book I definitely can recommend,
even if it is old. Probably some better stuff around now
with more graphics to help hammer in the points. But I've
not been exposed to recent writings.

Jon

10. ### DonMackGuest

"Jamie" wrote in message
Hi,

I have begun to use an FFT in software, and I started thinking about how
to turn the intensity over frequency FFT chart into a frequency over
time chart, ie, by making a strip chart, time/frequency domain "Gabor
spectrum STFT" with each XY point of the chart having an intensity based
on the power at that frequency. So I was thinking to take that a step
further (and push the limits of a modern CPU), instead of generating the
strip chart data with FFT blocks of 1000 samples, what would happen if
there was an FFT generated for each incoming sample, ie 1000 FFT's per
second for a 1kHz sample rate, with each new FFT centered on the next
incoming sample. If this data was then stripcharted (1000 new columns
per second added to the chart) for a sine wave input of 250Hz, what
would the frequency over time greyscale image look like? For a normal
Gabor spectrum STFT chart (ie. 1 FFT per 1000 samples) with a 250Hz sine
wave input and 1kHz sampling frequency, the chart should just show a
steady intensity peak from left to right half way up the chart (500Hz
Nyquist frequency would be the top of the chart). Also this might be a
good way to encode/decode data for communication, in a 2-dimensional
representation that is converted via FFT's to and from voltage/time for
each sample, as long as the hardware can do one FFT per sample.

Gabor spectrum (STFT short time fourier transform chart:
http://sonicawe.com/content/tutorials/img/tut5/image06.png

cheers,
Jamie

--------------------
It won't do any good. There is an innate inverse relationship between
frequency and time(f = 1/t). The Fourier transform's basis functions are not
localized and cannot describe local behavior well.

Take a square wave. The Fourier transform of this has infinite bandwidth.
Yet the function is one of the most basic functions that exist. The Fourier
transform has extreme difficulties describing functions that have sharp
transitions because sharp transitions occur over short periods of time and
short periods of time have high frequencies(the f = 1/t thing).

We generally like to think of a square wave or any other period function
with period p as having a frequency of 1/p. The Fourier transform can
recover this fundamental frequency but really it can't. Remember we have to
apply a low pass filter to it(mathematically we can take the limit though).

So, if something simple as a square wave has infinite bandwidth no matter
what kinda fourier transform you use you won't be able to describe it
accurately with finite bandwidth. Your sliding windowed Fourier transform
can't do any better than a multiresolution windowed Fourier transform.

There are several transforms that take a time based signal and transform it
in a frequency and time. The wavelet transform is one such transform.

But no matter how you slice and dice it, frequency and time are inversely
correlated. You can attempt to use a basis that is more local than the
fourier basis and you'll end up with something like wavelets. You can go
further and attempt to use the signal itself to determine the basis and get
something like empirical mode decomposition.

Ultimately you'll always end up with the same problem of trying to narrow
down the frequency and time and it just can't be done.

For example, what is the frequency of some signal s(t) at time t = 0? It's a
nonsensical question. Ultimately what you have to do to answer such a
question is not use the point at t = 0 but an interval around that point.
This allows you to "zoom out" a bit and see what the signal is doing and
hence determine it's frequency. For lower frequencies you have to zoom out
more because they have a larger period(p = 1/f and when f is small p is
large).

The real issue at hand is we define frequency as the rate of something
occurring periodically. Mathematically we actually compare it to sinusoids.
That is, sinusoids is the model for frequency. But a sinusoid has no
instantaneous frequency. Therefore anything that is composed of sinusoids
will not have instantaneous frequencies. Wavelets have more of a localized
frequency but now frequency becomes more vague(in fact wavelets do not have
frequency and we are comparing apples to oranges now).

There is a way to define instantaneous frequency as a mathematical
generalization:

http://en.wikipedia.org/wiki/Instantaneous_phase

The problem is that it seems to be useless to interpret as a real frequency
or we just haven't found out how.

i.e., with the instantaneous frequency we can actually answer the above
question about the frequency of s(t) at time 0 but don't expect for it to
make a lot of sense.

The main point is that you may not get much benefit but you are limited
mathematically so no matter your algorithms are they can't overcome the
frequency/time limitations described above. It is not inherent in the
mathematics itself but in the concept of frequency.

11. ### Jon KirwanGuest

No need. I'm well aware of the technique. Used it many
times. Can describe it, even.

Anyway, now I'm mostly curious if you are saying that you
were doing _two_ 1024 pt real arrays in 1.2ms or if you were
actually doing them in 2.4ms in a complex-in complex-out 1024
pt FFT that you packed and then disassembled. Completely
different performance meanings to me, depending.

Jon

12. ### Martin BrownGuest

I saw one a long time ago ~1983-4. On "loan" to a big dish radio
observatory and it came with two US guards for the duration.
At the time it was the fastest of its kind in the world.
I think he will struggle. FFTW would be the first thing to try to see if
the ballpark performance figures are even remotely close. Then stripping
it down to a hardwired fixed length transform and choosing a CPU with
appropriate cache lines to optimise a 1024 real transform. It is
anybodies guess what the best factorisation will be these days.

Intels IPPS benchmarks about 20% faster than FFTW3 at these lengths.

Temporal resolution could be improved using sub blocks to preserve
intermediate results and then twiddled together to avoid rework. The out
of place transforms are usually faster so this should not hurt.

Regards,
Martin Brown

14. ### Jon KirwanGuest

Now I understand much better and am more impressed, overall.
Thanks very much for the detailed recollections. I really

Jon

15. ### JamieGuest

Hi,

A single FFT done over 1024 samples, will give a linear frequency bin
output on the X-axis. However theoretically the frequency bin
dispersion can be logarithmic, so that there are more frequency bins
closer together at higher frequency for the same timestep. A single FFT
can't produce this data though as it averages over the whole sample
period. That is where using multiple overlapped FFT's is a possible
advantage, as a spectrogram chart calculated using overlapped FFT's
should be able to show a logarithm Y axis with the frequency bins
getting closer together up the Y-axis, giving more time and frequency
resolution for that timestep than lower frequency bins in the same timestep.

cheers,
Jamie

16. ### Martin BrownGuest

ITYM the opposite of what you said. There is an efficient constant Q
fast transform that gives something vaguely useful for audio and studio
band equaliser type displays. But it has worse resolution with
increasing frequency in qualitatively the same was as the human ear.

There are techniques to get very high resolution frequency measurements
from limited time sampled data provided that the signal to noise ratio
is sufficiently high. Heuristically time between zero crossings on a
simple waveform will give you a feel for why this is possible. To do it
for an unknown signal requires some complex mathermatics and it isn't
really amenable to realtime displays taking typically a couple of orders
of magnitude more processing time. But if you are stuck with the
measurements you can actually make and need the answer it can be useful.

Regards,
Martin Brown

17. ### JamieGuest

Hi,

Thanks for the clarification, I find it a fairly confusing topic!
I was looking at wavelets a bit which seem like they would be quite useful:
http://www.compsensor.com/images/wavelet.gif

cheers,
Jamie

18. ### josephkkGuest

No, it does not increase the frequency bins in any way. It does add back
some temporal information so that you can see how the various frequency
lines behave versus time.

?-)