Infinite Beyonce

One issue we have to handle in note-based version identification is that two versions may be transposed in pitch: they are in different keys, so all of the notes have been shifted up or down by a constant number of half steps. Since we're using chroma features to summarize the notes, we should explore the effect that transposition has on them.

The example below is not a different version per se, but it's a perfect example of transposition. In the song "Love on Top," Beyonce sings the chorus in five successive keys, moving up by a single half step each time. In the clips below, the chroma features look very similar from one key to the next, except that they have been shifted up by one row.
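To make the "shifted up by one row" claim concrete before looking at real audio, here is a minimal sketch on a toy chroma matrix. The matrix is synthetic (a hand-built C major triad, not librosa output), and the row ordering C=0 through B=11 is an assumption chosen to match the usual chroma layout.

```python
import numpy as np

# Toy chroma matrix: 12 pitch classes (rows, C=0 ... B=11) by 4 frames.
# This is synthetic data for illustration, not output from librosa.
chroma = np.zeros((12, 4))
chroma[[0, 4, 7], :] = 1.0  # C major triad: C, E, G

# Transposing up by one half step moves every row up by one index.
shifted = np.roll(chroma, 1, axis=0)

active_before = np.flatnonzero(chroma[:, 0])
active_after = np.flatnonzero(shifted[:, 0])
print(active_before)  # [0 4 7]  (C, E, G)
print(active_after)   # [1 5 8]  (C#, F, G#)
```

Real chroma features are noisier than this, so the rows of two transposed clips will only match approximately, but the same row-by-row pattern should be visible in the plots.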

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display
import IPython.display as ipd
import warnings
warnings.filterwarnings("ignore")
In [2]:
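c = 1  # clip number; the audio is expected at "<c>.mp3"
plt.figure(figsize=(12, 4))
y, sr = librosa.load("{}.mp3".format(c))
# 12 x n_frames chroma matrix computed from a constant-Q transform
C = librosa.feature.chroma_cqt(y=y, sr=sr)
librosa.display.specshow(C, y_axis='chroma', x_axis='time')
plt.title('{}'.format(c))
plt.colorbar()
# Last expression in the cell, so the notebook shows an audio player
ipd.Audio(y, rate=sr)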
Out[2]:
In [3]:
c = 2
plt.figure(figsize=(12, 4))
y, sr = librosa.load("{}.mp3".format(c))
C = librosa.feature.chroma_cqt(y=y, sr=sr)
librosa.display.specshow(C, y_axis='chroma', x_axis='time')
plt.title('{}'.format(c))
plt.colorbar()
ipd.Audio(y, rate=sr)
Out[3]:
In [4]:
c = 3
plt.figure(figsize=(12, 4))
y, sr = librosa.load("{}.mp3".format(c))
C = librosa.feature.chroma_cqt(y=y, sr=sr)
librosa.display.specshow(C, y_axis='chroma', x_axis='time')
plt.title('{}'.format(c))
plt.colorbar()
ipd.Audio(y, rate=sr)
Out[4]:
In [5]:
c = 4
plt.figure(figsize=(12, 4))
y, sr = librosa.load("{}.mp3".format(c))
C = librosa.feature.chroma_cqt(y=y, sr=sr)
librosa.display.specshow(C, y_axis='chroma', x_axis='time')
plt.title('{}'.format(c))
plt.colorbar()
ipd.Audio(y, rate=sr)
Out[5]:

This is the last key change that Beyonce actually performs in the song.

In [6]:
c = 5
plt.figure(figsize=(12, 4))
y, sr = librosa.load("{}.mp3".format(c))
C = librosa.feature.chroma_cqt(y=y, sr=sr)
librosa.display.specshow(C, y_axis='chroma', x_axis='time')
plt.title('{}'.format(c))
plt.colorbar()
ipd.Audio(y, rate=sr)
Out[6]:

Below, I synthesized a few more shifts to show more clearly that transposition leads to a circular shift of the chroma features: all octaves of a note are collapsed into a single equivalence class (pitch class) for that note, so energy that moves past the top row wraps around to the bottom row.
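The wrap-around can be sketched without any audio. The helper names `pitch_class` and `estimate_shift` below are hypothetical, introduced only for this illustration. The shift estimate (comparing time-averaged chroma vectors over all 12 circular shifts) is one simple approach, assumed here for demonstration and not something this notebook itself does.

```python
import numpy as np

def pitch_class(midi_note):
    """Map a MIDI note number to a pitch class 0..11 (C=0), discarding the octave."""
    return midi_note % 12

def estimate_shift(chroma_a, chroma_b):
    """Return k in 0..11 such that rolling chroma_a up by k rows best matches chroma_b.

    Hypothetical helper: compares time-averaged chroma vectors by dot product.
    """
    mean_a = chroma_a.mean(axis=1)
    mean_b = chroma_b.mean(axis=1)
    scores = [np.dot(np.roll(mean_a, k), mean_b) for k in range(12)]
    return int(np.argmax(scores))

# B3 (MIDI 59) transposed up one half step is C4 (MIDI 60):
# pitch class 11 wraps around to pitch class 0.
print(pitch_class(59), pitch_class(60))  # 11 0

# Synthetic chroma matrix (random values, not real audio features).
rng = np.random.default_rng(0)
chroma = rng.random((12, 50))

# Shifting by a full octave (12 half steps) leaves the chroma unchanged.
full_octave = np.roll(chroma, 12, axis=0)
print(np.array_equal(full_octave, chroma))  # True

# A transposition by 3 half steps is recovered as a circular shift of 3 rows.
transposed = np.roll(chroma, 3, axis=0)
print(estimate_shift(chroma, transposed))  # 3
```

On real recordings the match is never exact, so the best-scoring shift is only an estimate.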

In [7]:
c = 6
plt.figure(figsize=(12, 4))
y, sr = librosa.load("{}.mp3".format(c))
C = librosa.feature.chroma_cqt(y=y, sr=sr)
librosa.display.specshow(C, y_axis='chroma', x_axis='time')
plt.title('{}'.format(c))
plt.colorbar()
ipd.Audio(y, rate=sr)
Out[7]:
In [8]:
c = 7
plt.figure(figsize=(12, 4))
y, sr = librosa.load("{}.mp3".format(c))
C = librosa.feature.chroma_cqt(y=y, sr=sr)
librosa.display.specshow(C, y_axis='chroma', x_axis='time')
plt.title('{}'.format(c))
plt.colorbar()
ipd.Audio(y, rate=sr)
Out[8]:
In [9]:
c = 8
plt.figure(figsize=(12, 4))
y, sr = librosa.load("{}.mp3".format(c))
C = librosa.feature.chroma_cqt(y=y, sr=sr)
librosa.display.specshow(C, y_axis='chroma', x_axis='time')
plt.title('{}'.format(c))
plt.colorbar()
ipd.Audio(y, rate=sr)
Out[9]: