Week 11: Shazam
Chris Tralie
Overview
As we've been discussing, one of the key steps in the Shazam algorithm (Wang 2003) is to pick time/frequency bins that are local maxes within a certain window. Fill in the code below to do this. Click here to download some tunes.
```python
%load_ext autoreload
%autoreload 2
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd
from scipy.ndimage import maximum_filter
from spectrogramtools import *
import scipy.io as sio

sr, x = sio.wavfile.read("ExampleQueries/baddayClean.wav")
x = np.array(x, dtype=float)
win_length = 2048
hop_length = 512
max_freq = 128

S = STFT(x, win_length, hop_length)
S = np.abs(S)
orig_shape = S.shape[0]
S = S[0:max_freq, :]

time_win = 8
freq_win = 3
SM = np.zeros_like(S)
## TODO: Fill this in.  Put a 1 at SM[i, j] if
## S[i, j] is greater than its neighbors in the box
## [i-freq_win, i+freq_win] x [j-time_win, j+time_win]

X, Y = np.meshgrid(np.arange(SM.shape[1]), np.arange(SM.shape[0]))
X = X[SM == 1]
Y = Y[SM == 1]

plt.figure(figsize=(8, 6))
plt.subplot(211)
plt.imshow(S, aspect='auto', cmap='magma_r')
plt.scatter(X, Y)
plt.gca().invert_yaxis()
plt.subplot(212)
plt.imshow(SM, aspect='auto')
plt.gca().invert_yaxis()
plt.tight_layout()
```
If it works, you should see something like this
Once you get it to work, I'm going to blow your mind with a sonification of what we have
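If you get stuck on the local-max step, here is a minimal sketch of one possible approach on a small synthetic array rather than a real spectrogram. It leans on `maximum_filter`, which the starter code already imports. The helper name `local_maxes` and the toy array `S_toy` are made up for illustration. The sketch also marks a bin whenever it equals the max of its box, so exact ties would all be marked, which is slightly looser than "greater than its neighbors".

```python
import numpy as np
from scipy.ndimage import maximum_filter

def local_maxes(S, freq_win, time_win):
    """Hypothetical helper: return a 0/1 array with a 1 wherever S[i, j]
    equals the max over the box
    [i-freq_win, i+freq_win] x [j-time_win, j+time_win]."""
    size = (2 * freq_win + 1, 2 * time_win + 1)
    # Pad with -inf so bins outside the array never count as the max
    box_max = maximum_filter(S, size=size, mode="constant", cval=-np.inf)
    return (S == box_max).astype(S.dtype)

# Toy "spectrogram": low random background plus three planted peaks
rng = np.random.default_rng(0)
S_toy = rng.random((7, 20))   # background values in [0, 1)
S_toy[3, 5] = 5.0             # should be kept
S_toy[2, 6] = 4.0             # inside the box around (3, 5), so suppressed
S_toy[5, 15] = 6.0            # should be kept

SM_toy = local_maxes(S_toy, freq_win=1, time_win=2)
print(np.argwhere(SM_toy == 1))  # (row, col) = (freq bin, time bin) of each max
```

In the starter code, the same idea would be applied to `S` with `freq_win` and `time_win` to fill in `SM`. The smaller bumps in the random background will also show up as local maxes, just as quieter peaks do in a real spectrogram.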