Here's how I fixed it in code.
Mono
Okay. Don't judge me, but the first thing I did was convert the incoming stereo audio down to mono. Most music tends to be well spatialised, while the voice track comes through the center channel. By converting to mono, I estimate a ~3dB drop in perceived volume during musical interludes, or, conversely, I can raise the total volume by 3dB without waking up the boys.
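As a minimal sketch of that downmix, assuming the decoder hands over interleaved stereo samples (left, right, left, right, ...) as floats: the function name `DownmixToMono` and the buffer layout are illustrative assumptions, not taken from the actual player code.

```cpp
#include <cstddef>
#include <vector>

// Downmix interleaved stereo (L, R, L, R, ...) to mono by averaging each
// left/right pair. The interleaved float layout is an assumption.
std::vector<float> DownmixToMono(const std::vector<float>& stereo) {
    std::vector<float> mono;
    mono.reserve(stereo.size() / 2);
    // A trailing unpaired sample, if any, is ignored.
    for (std::size_t i = 0; i + 1 < stereo.size(); i += 2) {
        mono.push_back(0.5f * (stereo[i] + stereo[i + 1]));
    }
    return mono;
}
```

Content that is identical in both channels (typically dialogue) comes out at its original level, while content that differs between channels partially cancels. For two uncorrelated channels of equal power, the averaged signal carries half the power of either one, which is where a figure of roughly 3dB can come from.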
Bass
The next thing to do is to cut the low frequencies. The rumbles and the explosions. The ones which reverberate throughout the house and wake up the kids and the neighbors too.
It's relatively easy to find info on constructing a digital high-pass filter on the internet, provided you can cut through the jargon. Generally there is an incoming signal, denoted $x_n$: the series of samples coming from your video decoder. And then there's an outgoing signal, $y_n$: the series of samples that you send to the sound card.
Then there's some function that links them together. Here's an example of a really simple high-pass filter you can find on the internet :
$y_n = 0.2 \cdot x_n - 0.8 \cdot y_{n-1}$
It's a simple matter to turn this into code:
|A simple high-pass filter, in C++.|
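A minimal sketch of what such a filter could look like in C++ follows. The class name `OnePoleFilter` and its `Process` method are hypothetical. The feedback coefficient is left as a parameter; the transfer function quoted further down, 0.2 / (1 + 0.8 z^-1), corresponds to `b0 = 0.2` and `a1 = -0.8`.

```cpp
// One-pole recursive filter: y[n] = b0 * x[n] + a1 * y[n-1].
// Names are illustrative; the state starts at silence.
class OnePoleFilter {
public:
    OnePoleFilter(float b0, float a1) : b0_(b0), a1_(a1), prev_(0.0f) {}

    // Feed one input sample, get one output sample.
    float Process(float x) {
        const float y = b0_ * x + a1_ * prev_;
        prev_ = y;  // remember y[n] so it becomes y[n-1] next time
        return y;
    }

private:
    float b0_;    // gain applied to the incoming sample
    float a1_;    // gain applied to the previous output
    float prev_;  // y[n-1]
};
```

With `a1 = -0.8`, a constant (0Hz) input settles at about 0.11 of its level, while a signal that alternates sign on every sample passes at full amplitude, which is the high-pass behaviour we are after.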
To work out the frequency response, we can use the Z-transform to find the transfer function, which maps our filter from the discrete time domain into the frequency domain. In our case:
$$H(z) = \frac{Y(z)}{X(z)} = \frac{0.2}{1 + 0.8 \cdot z^{-1}}$$
The cool thing is that the theory lets us treat $z$ as a complex number, even though in practice all the $x_n$ and $y_n$ will be floating-point numbers. To get the response at a given frequency, we evaluate $H$ on the unit circle, at $z = e^{i\omega}$, where $\omega$ is the frequency in radians per sample. For example, to compute the frequency response at 1000Hz, with a sample rate of 48kHz, we take:
$$i = \sqrt{-1}, \qquad \omega = \frac{2\pi \cdot 1000}{48000}$$
$$\text{FrequencyResponse}(1000\,\text{Hz}) = \left| H(e^{i\omega}) \right| = \left| \frac{0.2}{1 + 0.8 \cdot e^{-i\omega}} \right|$$
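To make the evaluation concrete, here is a small sketch using `std::complex`. It evaluates the transfer function 0.2 / (1 + 0.8 z^-1) at the point on the unit circle that corresponds to the requested frequency. The function name `FrequencyResponse` simply mirrors the notation above and is otherwise an assumption.

```cpp
#include <cmath>
#include <complex>

// Magnitude of H(z) = 0.2 / (1 + 0.8 z^-1) at z = e^{i*omega},
// where omega = 2*pi*frequency/sampleRate (radians per sample).
double FrequencyResponse(double frequencyHz, double sampleRateHz) {
    const double pi = 3.14159265358979323846;
    const double omega = 2.0 * pi * frequencyHz / sampleRateHz;
    // z^-1 = e^{-i*omega}: a unit-length complex number at angle -omega.
    const std::complex<double> zInv = std::polar(1.0, -omega);
    const std::complex<double> h = 0.2 / (1.0 + 0.8 * zInv);
    return std::abs(h);
}
```

Calling `FrequencyResponse(1000.0, 48000.0)` gives the linear gain at 1000Hz; `20 * log10(gain)` converts it to decibels.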
If we draw this on a graph, with frequency (Hz) on the X-axis, and magnitude (dB) on the Y-axis, we can get a "Bode Plot" of our filter:
|A Bode Plot for a simple filter.|
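The data behind a plot like this can be produced with a short sweep. The sketch below samples the same transfer function, 0.2 / (1 + 0.8 z^-1), at logarithmically spaced frequencies between 20Hz and 20kHz and converts each magnitude to dB. The names `BodePoint`, `ComputeBodePoints` and `PrintBodePoints` are made up for this example, and the frequency range is an assumption.

```cpp
#include <cmath>
#include <complex>
#include <cstdio>
#include <vector>

struct BodePoint {
    double frequencyHz;
    double magnitudeDb;
};

// Sample |H(e^{i*omega})| in dB at log-spaced frequencies from 20Hz to 20kHz
// for H(z) = 0.2 / (1 + 0.8 z^-1).
std::vector<BodePoint> ComputeBodePoints(double sampleRateHz, int pointCount) {
    const double pi = 3.14159265358979323846;
    const double startHz = 20.0;
    const double endHz = 20000.0;
    std::vector<BodePoint> points;
    for (int i = 0; i < pointCount; ++i) {
        const double t = (pointCount > 1) ? double(i) / (pointCount - 1) : 0.0;
        const double f = startHz * std::pow(endHz / startHz, t);
        const std::complex<double> zInv = std::polar(1.0, -2.0 * pi * f / sampleRateHz);
        const double magnitude = std::abs(0.2 / (1.0 + 0.8 * zInv));
        points.push_back({f, 20.0 * std::log10(magnitude)});
    }
    return points;
}

// Print one "frequency, dB" row per point, ready to paste into a plotting tool.
void PrintBodePoints(const std::vector<BodePoint>& points) {
    for (const BodePoint& p : points) {
        std::printf("%10.1f Hz  %8.2f dB\n", p.frequencyHz, p.magnitudeDb);
    }
}
```

The log spacing matches the usual Bode plot convention of a logarithmic frequency axis.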
Optimize
Back in the old days, we would have had to endlessly try different filter combinations and painstakingly compute Bode plots to find combinations of filters that might fit our requirements.
But now that we all have super-computers under our desks, we've got much more powerful tools to solve an old problem in a new way.
That's right, let's just use an offline optimizer to compute the optimal coefficients for us! You could use numpy or octave for this.
In pseudo-code, it looks like this :

    def GetDesiredFrequencyResponse(frequency):
        // Your desired EQ function goes here

    def EvaluateFilter(filter):
        errorSum = 0
        for frequency in 20 .. 22000:
            freq1 = GetDesiredFrequencyResponse(frequency)
            freq2 = filter.CalcFrequencyResponse(frequency)
            errorTerm = freq1 - freq2
            errorSum += errorTerm * errorTerm
        return errorSum

    bestFilter = Minimize(FilterFactory, EvaluateFilter)
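To show the shape of the idea in compilable form, here is a sketch in C++. Several things in it are assumptions made for illustration: the candidate filter has only two coefficients, the sample rate is fixed at 48kHz, the "desired" curve is simply the response of the 0.2 / (1 + 0.8 z^-1) filter from earlier, and `Minimize` is replaced by a deliberately naive exhaustive search over a coarse grid. A real optimizer from numpy or octave would replace that last function.

```cpp
#include <cmath>
#include <complex>
#include <limits>

const double kPi = 3.14159265358979323846;
const double kSampleRateHz = 48000.0;  // assumed sample rate

// Candidate filter y[n] = b0*x[n] + a1*y[n-1], i.e. H(z) = b0 / (1 - a1 z^-1).
struct Filter {
    double b0;
    double a1;
    double CalcFrequencyResponse(double frequencyHz) const {
        const double omega = 2.0 * kPi * frequencyHz / kSampleRateHz;
        const std::complex<double> zInv = std::polar(1.0, -omega);
        return std::abs(b0 / (1.0 - a1 * zInv));
    }
};

// Hypothetical target curve: here, the response of a known filter.
double GetDesiredFrequencyResponse(double frequencyHz) {
    const Filter target = {0.2, -0.8};
    return target.CalcFrequencyResponse(frequencyHz);
}

// Sum of squared differences between the desired and the actual response.
double EvaluateFilter(const Filter& filter) {
    double errorSum = 0.0;
    for (double f = 20.0; f <= 22000.0; f *= 1.1) {  // log-spaced sweep
        const double errorTerm =
            GetDesiredFrequencyResponse(f) - filter.CalcFrequencyResponse(f);
        errorSum += errorTerm * errorTerm;
    }
    return errorSum;
}

// Naive stand-in for Minimize(): try every point on a coarse grid.
Filter MinimizeByGridSearch() {
    Filter best = {0.0, 0.0};
    double bestError = std::numeric_limits<double>::infinity();
    for (int i = 0; i <= 20; ++i) {        // b0 in [0, 1]
        for (int j = -19; j <= 19; ++j) {  // a1 in (-1, 1) keeps the pole stable
            const Filter candidate = {0.05 * i, 0.05 * j};
            const double error = EvaluateFilter(candidate);
            if (error < bestError) {
                bestError = error;
                best = candidate;
            }
        }
    }
    return best;
}
```

Because the target here was generated by a filter that lies on the grid, the search recovers `b0 = 0.2` and `a1 = -0.8`. With a hand-drawn EQ curve there is no exact match, and the result is only the closest grid point.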
... And Generate the Code
|'f32' is a 32-bit float|
Here's a function which writes out the C++ directly. We can just copy and paste directly into ProcessAudio(), hit recompile, and hear the results immediately.
Notice the coding style - it's rife with buffer overflows, and makes huge unjustified assumptions about the inputs. You certainly couldn't use this code in a production environment, but as the scaffolding to bootstrap a single-use filter, the iteration speed trumps all other considerations.
Here's a sample output :
static float y1 = 0.0f, y2 = 0.0f;
float sum = 0.211989*x;
float y0 = (sum -9.433180*y1 +31.500277*y2) / 0.000017;
y2 = y1;
y1 = y0;
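A generator that emits code of this kind might look roughly like the sketch below. It is not the function used for the output above: the name `GenerateFilterCode`, the three-coefficient form and the exact text it emits are all assumptions, chosen only to show the idea of printing coefficients straight into C++ source. It expects `x` to be the current input sample at the paste site, as in the sample output.

```cpp
#include <cstdio>
#include <string>

// Emit C++ source for a small recursive filter with fixed coefficients:
//   y0 = b0*x + a1*y1 + a2*y2
// The emitted variable names (x, y0, y1, y2) follow the sample output.
std::string GenerateFilterCode(double b0, double a1, double a2) {
    char buffer[512];
    std::snprintf(buffer, sizeof(buffer),
                  "static float y1 = 0.0f, y2 = 0.0f;\n"
                  "float y0 = %ff*x + %ff*y1 + %ff*y2;\n"
                  "y2 = y1;\n"
                  "y1 = y0;\n",
                  b0, a1, a2);
    return std::string(buffer);
}
```

Printing the returned string and pasting it into the audio callback keeps the loop of tweaking the target curve, re-running the optimizer and listening very short.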
So this is part 1 of 2. In the next post I'll go into the details of the actual filters I'm using on my television, and include some sample video so you can hear the before and after. I'll try to include tips for how to ensure convergence, and some gotchas that the Bode plot won't tell you about. And of course, if there's anything you'd like to know more about, why not suggest it in the comments below?