Week 11: Shazam

Chris Tralie

Overview

As we've been discussing, one of the key steps in the Shazam algorithm, as described in Wang 2003, is to pick time/frequency bins that are local maxes within a certain window. Fill in the code below to do this. Click here to download some tunes.

%load_ext autoreload
%autoreload 2
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd
from scipy.ndimage import maximum_filter
from spectrogramtools import *
import scipy.io.wavfile  # load the wavfile submodule explicitly so sio.wavfile resolves
import scipy.io as sio
 
sr, x = sio.wavfile.read("ExampleQueries/baddayClean.wav")
x = np.array(x, dtype=float)
win_length = 2048
hop_length = 512
max_freq = 128
S = STFT(x, win_length, hop_length)
S = np.abs(S)
orig_shape = S.shape[0]
S = S[0:max_freq, :]
 
time_win = 8
freq_win = 3
 
SM = np.zeros_like(S)
## TODO: Fill this in.  Put a 1 at SM[i, j] if
## S[i, j] is greater than its neighbors in the box
## [i-freq_win, i+freq_win] x [j-time_win, j+time_win]
 
X, Y = np.meshgrid(np.arange(SM.shape[1]), np.arange(SM.shape[0]))
X = X[SM == 1]
Y = Y[SM == 1]
 
plt.figure(figsize=(8, 6))
plt.subplot(211)
plt.imshow(S, aspect='auto', cmap='magma_r')
plt.scatter(X, Y)
plt.gca().invert_yaxis()
plt.subplot(212)
plt.imshow(SM, aspect='auto')
plt.gca().invert_yaxis()
plt.tight_layout()
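
One possible way to fill in the TODO, sketched here rather than given as the official solution, is to use the maximum_filter function that the starter code already imports. The helper name local_max_mask and the toy random array are made up for illustration. The sketch marks S[i, j] whenever it equals the max over its box, so ties (for example, a flat stretch of silence) are all marked; a strictly "greater than" test would need extra care.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def local_max_mask(S, freq_win, time_win):
    """Return an array shaped like S with 1 where S[i, j] equals the max of its box.

    Hypothetical helper, not part of the starter code.
    """
    # The box [i-freq_win, i+freq_win] x [j-time_win, j+time_win]
    # has this many rows (frequency) and columns (time)
    size = (2 * freq_win + 1, 2 * time_win + 1)
    # Pad with -inf so bins outside the spectrogram can never be the max
    box_max = maximum_filter(S, size=size, mode="constant", cval=-np.inf)
    return (S == box_max).astype(S.dtype)

# Toy check on a small random "spectrogram" (made-up data, not a real tune)
rng = np.random.default_rng(0)
S_toy = rng.random((20, 50))
SM_toy = local_max_mask(S_toy, freq_win=3, time_win=8)
print(int(SM_toy.sum()), "local maxes found")
```

In the exercise itself, this would amount to something like SM = local_max_mask(S, freq_win, time_win) in place of the TODO, after which the plotting code should work unchanged.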

If it works, you should see something like this

Once you get it to work, I'm going to blow your mind with a sonification of what we have.