Aligning tab to audio

Having completed the tab parsing step, we have extracted the chord labels and their corresponding line and word numbers from the tab file. However, tab files contain no timing information, so an additional step is needed to align the chord labels to the audio file. McVicar et al. (2011) proposed four different algorithms that incorporate tab information into an HMM-based system for audio chord estimation. The most promising of these four algorithms is Jump Alignment.

Jump Alignment is based on a Hidden Markov Model (HMM). An HMM models the joint probability distribution P(X, y | Theta) over the feature vectors X and the chord labels y, where Theta denotes the parameters of the model.
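This joint probability factorizes into an initial-state term, one transition term per step, and one emission term per frame. As a minimal sketch (the function `hmm_log_joint` and its `emission_logpdf` interface are our own illustration, not part of DECIBEL):

```python
import numpy as np

def hmm_log_joint(X, y, init_probs, trans_probs, emission_logpdf):
    """log P(X, y | Theta) = log P(y_1) + sum_t log P(y_t | y_{t-1})
                                        + sum_t log P(x_t | y_t).

    X: sequence of feature vectors; y: sequence of state (chord) indices;
    emission_logpdf(x, state): any per-frame log-likelihood, e.g. a
    Gaussian over chroma vectors (hypothetical interface).
    """
    ll = np.log(init_probs[y[0]]) + emission_logpdf(X[0], y[0])
    for t in range(1, len(y)):
        ll += np.log(trans_probs[y[t - 1], y[t]]) + emission_logpdf(X[t], y[t])
    return ll
```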

Preprocessing: feature extraction

First, the audio file needs to be preprocessed. For this purpose, we use the Python package librosa. We start by converting the audio file to mono. Then we use the HPSS function to separate the harmonic and percussive elements of the audio, and extract chroma features from the harmonic part using a constant-Q transform with a sampling rate of 22050 Hz and a hop length of 256 samples. This gives chroma features for each sample, but we expect that the great majority of chord changes occur on a beat. Therefore, we beat-synchronize the features: we run a beat-extraction function on the percussive part of the audio and average the chroma features between consecutive beat positions. The chord annotations need to be beat-synchronized as well, which we do by taking the chord label with the longest duration within each beat. Each mean feature vector with its corresponding beat-synchronized chord label is regarded as one frame. This yields the feature vectors X and chord labels y for each song, which we feed to our HMM.

decibel.audio_tab_aligner.feature_extractor.beat_align_ground_truth_labels(ground_truth_labels_path: str, beat_times: numpy.ndarray) → List[str]

Beat-synchronize the reference chord annotations by assigning to each beat the chord with the longest duration within that beat.

Parameters
  • ground_truth_labels_path – Path to the ground truth file

  • beat_times – Array of beats, measured in seconds

Returns

List of chords within each beat
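The "longest duration within the beat" rule can be illustrated on an in-memory annotation (the triple-list input format and the function name are hypothetical; the real function reads the annotations from a ground-truth file):

```python
def beat_align_labels(annotations, beat_times):
    """For each beat interval, pick the chord with the longest total duration.

    annotations: list of (start_time, end_time, chord) triples;
    beat_times: beat positions in seconds, ascending.
    """
    beat_labels = []
    for b_start, b_end in zip(beat_times[:-1], beat_times[1:]):
        durations = {}
        for start, end, chord in annotations:
            # Duration of overlap between this annotation and this beat interval
            overlap = min(end, b_end) - max(start, b_start)
            if overlap > 0:
                durations[chord] = durations.get(chord, 0.0) + overlap
        # 'N' (no chord) if nothing overlaps this beat
        beat_labels.append(max(durations, key=durations.get) if durations else 'N')
    return beat_labels
```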

decibel.audio_tab_aligner.feature_extractor.export_audio_features_for_song(song: decibel.music_objects.song.Song) → None

Export the audio features of this song to a file.

Parameters

song – Song for which we export the audio features

decibel.audio_tab_aligner.feature_extractor.get_audio_features(audio_path: str, sampling_rate: int, hop_length: int) → Tuple[numpy.ndarray, numpy.ndarray]
decibel.audio_tab_aligner.feature_extractor.get_feature_ground_truth_matrix(full_audio_path: str, ground_truth_labs_path: str) → numpy.matrix

Jump Alignment

Jump Alignment is an extension of the HMM that utilizes the chords parsed from tabs. Following McVicar et al. (2011), we refer to these chords parsed from tab files as Untimed Chord Sequences (UCSs). Compared to the original HMM, the Jump Alignment algorithm alters the state space and transition probabilities in such a way that it can align the UCSs to audio, while allowing for jumps to the start of other lines.
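The jump-augmented transition structure can be sketched as follows. This is a deliberately simplified illustration of the idea, not DECIBEL's or McVicar et al.'s exact formulation; the state space here is just the UCS chords in order, and rows are renormalized to handle edge cases:

```python
import numpy as np

def jump_transition_matrix(n_states, line_starts, p_self=0.5, p_f=0.05, p_b=0.05):
    """Toy transition matrix with line-start jumps (illustrative only).

    States are the UCS chords in order; line_starts are the state indices
    where tab lines begin. From state i we may stay on the chord (p_self),
    advance to chord i+1, jump forward to the start of a later line
    (total mass p_f), or jump backward to the start of an earlier line
    (total mass p_b).
    """
    A = np.zeros((n_states, n_states))
    for i in range(n_states):
        A[i, i] += p_self
        if i + 1 < n_states:
            A[i, i + 1] += 1.0 - p_self - p_f - p_b
        fwd = [s for s in line_starts if s > i + 1]   # later line starts
        bwd = [s for s in line_starts if s < i]       # earlier line starts
        for s in fwd:
            A[i, s] += p_f / len(fwd)
        for s in bwd:
            A[i, s] += p_b / len(bwd)
    # Renormalize rows (last state, or states with no jump targets)
    return A / A.sum(axis=1, keepdims=True)
```

Running Viterbi decoding over a matrix like this, with chroma emission probabilities per chord, yields the alignment; the jumps taken by the best path account for repeated or skipped tab lines.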

decibel.audio_tab_aligner.jump_alignment.jump_alignment(chords_from_tab_file_path: str, audio_features_path: str, lab_write_path: str, hmm_parameters: decibel.audio_tab_aligner.hmm_parameters.HMMParameters, p_f: float = 0.05, p_b: float = 0.05) → Tuple[float, int]

Calculate the optimal alignment between tab file and audio

Parameters
  • chords_from_tab_file_path – Path to chords from tab file

  • audio_features_path – Path to audio features

  • lab_write_path – Path to the file to write the chord labels to

  • hmm_parameters – HMMParameters obtained in the training phase

  • p_f – Forward jump probability

  • p_b – Backward jump probability

Returns

Best likelihood and corresponding transposition

decibel.audio_tab_aligner.jump_alignment.test_single_song(song: decibel.music_objects.song.Song, hmm_parameters: decibel.audio_tab_aligner.hmm_parameters.HMMParameters) → None

Estimate chords for each tab matched to the song and export them to a lab file.

Parameters
  • song – Song for which we estimate tab-based chords

  • hmm_parameters – Parameters of the trained HMM

decibel.audio_tab_aligner.jump_alignment.train(chord_vocabulary: decibel.music_objects.chord_vocabulary.ChordVocabulary, train_songs: Dict[int, decibel.music_objects.song.Song]) → decibel.audio_tab_aligner.hmm_parameters.HMMParameters

Train the HMM parameters on train_songs for the given chord_vocabulary

Parameters
  • chord_vocabulary – List of chords in our vocabulary

  • train_songs – Training songs, keyed by integer ID

Returns

HMM Parameters
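Part of this training amounts to collecting counting statistics over the training songs. For instance, the transition probabilities between chords can be estimated from chord bigram counts, sketched here with additive smoothing (the function name and signature are ours; DECIBEL's actual implementation may differ in details):

```python
import numpy as np

def estimate_transitions(label_sequences, n_chords, smoothing=1.0):
    """Estimate an HMM transition matrix from beat-synchronized
    chord label sequences (one sequence of chord indices per song).

    Additive (Laplace) smoothing avoids zero probabilities for
    chord transitions never seen in the training set.
    """
    counts = np.full((n_chords, n_chords), smoothing)
    for seq in label_sequences:
        # Count each consecutive chord pair (bigram)
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    # Normalize each row into a probability distribution
    return counts / counts.sum(axis=1, keepdims=True)
```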