Aligning tab to audio
Having completed the tab parsing step, we have the chord labels from the tab file, together with the line and word numbers at which they occur. However, tab files contain no timing information, so an additional step is needed to align the chord labels to the audio file. Four algorithms by cite{mcvicar2011using} already incorporate tab information into an HMM-based system for audio chord estimation. The most promising of these four is Jump Alignment.
Jump Alignment is based on a Hidden Markov Model (HMM). An HMM models the joint probability distribution P(X, y | Theta) over the feature vectors X and the chord labels y, where Theta denotes the parameters of the model.
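As a worked form of this distribution, assuming the standard first-order HMM factorization (the text above does not spell it out, so the exact parameterization is an assumption), the joint probability over T beat-synchronized frames can be written as:

```latex
P(X, y \mid \Theta) = P(y_1 \mid \Theta)\, P(x_1 \mid y_1, \Theta) \prod_{t=2}^{T} P(y_t \mid y_{t-1}, \Theta)\, P(x_t \mid y_t, \Theta)
```

Here x_t is the feature vector of frame t and y_t its chord label, so Theta would collect the initial, transition and emission parameters.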
Preprocessing: feature extraction
First, the audio file needs to be preprocessed, for which we use the Python package librosa. We convert the audio to mono and use the HPSS function to separate its harmonic and percussive components. We then extract chroma features from the harmonic component, using a constant-Q transform with a sampling rate of 22050 Hz and a hop length of 256 samples. This gives one chroma vector per hop, but we expect the great majority of chord changes to occur on a beat. Therefore, we beat-synchronize the features: we run a beat-extraction function on the percussive component and average the chroma vectors between consecutive beat positions. The chord annotations need to be beat-synchronized as well, which we do by taking the most prevalent chord label between beats. Each mean feature vector, together with its beat-synchronized chord label, is regarded as one frame. This yields the feature vectors X and chord labels y for each song, which we feed to our HMM.
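The averaging step can be illustrated with a minimal, dependency-free sketch. It is not the decibel implementation: the function name `beat_synchronize` and the toy data are hypothetical, and the sketch assumes that the beat positions are given as strictly increasing frame indices within the range of the feature matrix. Frames before the first beat and after the last beat are simply dropped here.

```python
from typing import List, Sequence


def beat_synchronize(frames: Sequence[Sequence[float]],
                     beat_frames: Sequence[int]) -> List[List[float]]:
    """Average the feature vectors between consecutive beat positions.

    frames: one feature vector (e.g. 12 chroma bins) per analysis hop.
    beat_frames: strictly increasing frame indices at which beats occur.
    Returns one mean vector per interval [beat_frames[i], beat_frames[i + 1]).
    """
    synced = []
    for start, end in zip(beat_frames[:-1], beat_frames[1:]):
        segment = frames[start:end]
        n_bins = len(segment[0])
        # Mean of each bin over all hops that fall inside this beat interval
        synced.append([sum(vec[b] for vec in segment) / len(segment)
                       for b in range(n_bins)])
    return synced


# Toy input: 6 frames with 2 bins each, beats at frames 0, 2 and 6.
toy_frames = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 2.0], [0.0, 4.0], [0.0, 4.0]]
toy_synced = beat_synchronize(toy_frames, [0, 2, 6])
```

With real audio, librosa offers `librosa.util.sync` for the same purpose; whether decibel relies on it is not stated here.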
-
decibel.audio_tab_aligner.feature_extractor.
beat_align_ground_truth_labels
(ground_truth_labels_path: str, beat_times: numpy.ndarray) → List[str]
Beat-synchronize the reference chord annotations by assigning to each beat the chord with the longest duration within that beat.
- Parameters
ground_truth_labels_path – Path to the ground truth file
beat_times – Array of beats, measured in seconds
- Returns
List with one chord label per beat
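The longest-duration rule can be sketched as follows. This is an illustration under assumptions, not the library's code: the real function reads the annotations from a file path, whereas the hypothetical `longest_chord_per_beat` below takes already-parsed (start, end, label) segments, and the `"N"` fallback for beats without any annotation is an assumed convention.

```python
from typing import Dict, List, Sequence, Tuple

Segment = Tuple[float, float, str]  # (start_seconds, end_seconds, chord_label)


def longest_chord_per_beat(segments: Sequence[Segment],
                           beat_times: Sequence[float],
                           no_chord: str = "N") -> List[str]:
    """Return, for each interval between consecutive beats, the chord label
    that overlaps the interval for the longest total duration."""
    labels = []
    for beat_start, beat_end in zip(beat_times[:-1], beat_times[1:]):
        durations: Dict[str, float] = {}
        for seg_start, seg_end, chord in segments:
            # Length of the overlap between this segment and the beat interval
            overlap = min(seg_end, beat_end) - max(seg_start, beat_start)
            if overlap > 0:
                durations[chord] = durations.get(chord, 0.0) + overlap
        labels.append(max(durations, key=durations.get) if durations else no_chord)
    return labels


# Toy annotation: C major until 1.2 s, then G major; beats every second.
toy_segments = [(0.0, 1.2, "C:maj"), (1.2, 3.0, "G:maj")]
toy_labels = longest_chord_per_beat(toy_segments, [0.0, 1.0, 2.0, 3.0])
```

In the toy example the second beat contains 0.2 s of C major and 0.8 s of G major, so it is labeled G major.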
-
decibel.audio_tab_aligner.feature_extractor.
export_audio_features_for_song
(song: decibel.music_objects.song.Song) → None
Export the audio features of this song to a file.
The exported features are beat-synchronized chroma vectors, computed with librosa as described under "Preprocessing: feature extraction" above.
- Parameters
song – Song for which we export the audio features
Jump Alignment
Jump Alignment is an extension of the HMM that uses the chords parsed from tabs. Following cite{mcvicar2011using}, we refer to the chords parsed from a tab file as an Untimed Chord Sequence (UCS). Compared to the original HMM, Jump Alignment alters the state space and the transition probabilities so that the UCS can be aligned to the audio, while allowing jumps to the start of other lines.
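To make the altered transition structure concrete, the sketch below builds one plausible transition matrix over the positions of a UCS. It is an illustration only: the function `jump_transition_matrix`, the `p_stay` parameter and the row normalization are assumptions, and the exact weighting used by decibel and by cite{mcvicar2011using} may differ. Each state is a chord position in the UCS; from any position the model may stay, advance to the next chord, or jump to the start of another line.

```python
from typing import List


def jump_transition_matrix(n_chords: int, line_starts: List[int],
                           p_f: float = 0.05, p_b: float = 0.05,
                           p_stay: float = 0.5) -> List[List[float]]:
    """Build a row-normalized transition matrix over UCS positions.

    n_chords: number of chords in the UCS (one state per chord position).
    line_starts: index of the first chord of each tab line.
    p_f / p_b: unnormalized weight of a forward / backward jump to a line start.
    p_stay: assumed weight of staying on the same chord for another beat.
    """
    matrix = [[0.0] * n_chords for _ in range(n_chords)]
    for i in range(n_chords):
        row = matrix[i]
        row[i] += p_stay                    # the chord lasts another beat
        if i + 1 < n_chords:
            row[i + 1] += 1.0 - p_stay      # regular move to the next chord
        for start in line_starts:
            if start > i + 1:
                row[start] += p_f           # jump forward to a later line
            elif start < i:
                row[start] += p_b           # jump back to an earlier line start
        total = sum(row)
        matrix[i] = [value / total for value in row]
    return matrix


# Toy UCS with 5 chords on two lines: line 0 starts at 0, line 1 at 3.
toy_matrix = jump_transition_matrix(5, [0, 3])
```

In the toy matrix, position 1 can reach position 3 (a forward jump) and position 0 (a backward jump), whereas position 0 cannot reach position 2, because position 2 is neither the next chord nor the start of a line.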
-
decibel.audio_tab_aligner.jump_alignment.
jump_alignment
(chords_from_tab_file_path: str, audio_features_path: str, lab_write_path: str, hmm_parameters: decibel.audio_tab_aligner.hmm_parameters.HMMParameters, p_f: float = 0.05, p_b: float = 0.05) → Tuple[float, int]
Calculate the optimal alignment between the tab file and the audio.
- Parameters
chords_from_tab_file_path – Path to chords from tab file
audio_features_path – Path to audio features
lab_write_path – Path to the file to write the chord labels to
hmm_parameters – HMMParameters obtained in the training phase
p_f – Probability of jumping forward to the start of a later line
p_b – Probability of jumping backward to the start of an earlier line
- Returns
Likelihood of the best alignment and the corresponding transposition
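The returned pair suggests a search over transpositions, presumably because a tab may be notated in a different key than the recording. The sketch below shows such a search under assumptions: chords are represented as hypothetical (root pitch class, quality) pairs, the transposition is counted in semitones, and `score` stands in for whatever alignment likelihood the real function computes.

```python
from typing import Callable, List, Tuple

Chord = Tuple[int, str]  # (root pitch class 0-11, quality such as "maj" or "min")


def transpose(chords: List[Chord], semitones: int) -> List[Chord]:
    """Shift the root of every chord by the given number of semitones."""
    return [((root + semitones) % 12, quality) for root, quality in chords]


def best_transposition(chords: List[Chord],
                       score: Callable[[List[Chord]], float]) -> Tuple[float, int]:
    """Score all 12 transpositions and return (best score, best transposition).

    Ties are resolved in favor of the smallest transposition.
    """
    results = [(score(transpose(chords, k)), k) for k in range(12)]
    return max(results, key=lambda result: result[0])


# Toy scoring function: number of positions that match a reference sequence.
toy_reference = [(2, "maj"), (9, "maj")]


def toy_score(candidate: List[Chord]) -> float:
    return float(sum(a == b for a, b in zip(candidate, toy_reference)))


toy_best = best_transposition([(0, "maj"), (7, "maj")], toy_score)
```

In the toy example the tab chords C and G match the reference D and A after transposing up by two semitones.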
-
decibel.audio_tab_aligner.jump_alignment.
test_single_song
(song: decibel.music_objects.song.Song, hmm_parameters: decibel.audio_tab_aligner.hmm_parameters.HMMParameters) → None
Estimate chords for each tab matched to the song and export them to a lab file.
- Parameters
song – Song for which we estimate tab-based chords
hmm_parameters – Parameters of the trained HMM
-
decibel.audio_tab_aligner.jump_alignment.
train
(chord_vocabulary: decibel.music_objects.chord_vocabulary.ChordVocabulary, train_songs: Dict[int, decibel.music_objects.song.Song]) → decibel.audio_tab_aligner.hmm_parameters.HMMParameters
Train the HMM parameters on train_songs for the given chord_vocabulary.
- Parameters
chord_vocabulary – List of chords in our vocabulary
train_songs – Dictionary of the songs used for training, indexed by integer key
- Returns
HMM Parameters
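How the parameters are estimated is not described here. As one common approach for labeled data, and purely as an assumption, the transition probabilities could be estimated by counting chord-to-chord transitions in the beat-synchronized label sequences. The function `estimate_transitions` and its additive smoothing are hypothetical and only sketch this idea; the emission parameters are left out.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence


def estimate_transitions(label_sequences: Sequence[Sequence[str]],
                         vocabulary: Sequence[str],
                         smoothing: float = 1.0) -> Dict[str, Dict[str, float]]:
    """Estimate P(current chord | previous chord) from labeled sequences,
    using additive smoothing so that unseen transitions keep some probability."""
    counts = defaultdict(Counter)
    for labels in label_sequences:
        for previous, current in zip(labels[:-1], labels[1:]):
            counts[previous][current] += 1
    transitions = {}
    for previous in vocabulary:
        total = sum(counts[previous].values()) + smoothing * len(vocabulary)
        transitions[previous] = {
            current: (counts[previous][current] + smoothing) / total
            for current in vocabulary
        }
    return transitions


# Toy training data: beat-synchronized chord labels of two songs.
toy_transitions = estimate_transitions([["C", "C", "G"], ["G", "C"]], ["C", "G"])
```

Each row of the resulting table sums to one, so it can serve directly as a transition distribution.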