Pitch
The pitch of musical sounds is a perceptual attribute that allows their classification on a scale of 'high – low.' In general, the perception of pitch is determined by the fundamental frequency of the sound wave. Thus, a sound with a high fundamental frequency typically has a higher pitch than a sound with a lower fundamental frequency. In practice, pitch determines the difference between tones and is a fundamental element of melody and harmony.
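As a rough illustration of the link between fundamental frequency and pitch, the sketch below estimates the fundamental frequency of a short signal from its autocorrelation. It is a minimal example rather than the method used in this repository; the function name `estimate_f0` and the search range `fmin`/`fmax` are assumptions made for illustration.

```python
import math

def estimate_f0(samples, sample_rate, fmin=80.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) of a short, steady signal.

    Assumption: the best lag is the one with the largest unnormalized
    autocorrelation inside the search range [fmin, fmax].
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    n = len(samples)
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic test tones: a lower and a higher sine wave.
sample_rate = 8000
low_tone = [math.sin(2 * math.pi * 200.0 * i / sample_rate) for i in range(800)]
high_tone = [math.sin(2 * math.pi * 400.0 * i / sample_rate) for i in range(800)]
```

The tone with the higher estimated fundamental frequency is the one a listener would typically hear as higher in pitch. Real recordings usually call for a more robust estimator than this plain autocorrelation.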
Spectral Centroid
Spectral centroid is defined as the weighted average frequency of the spectrum, also known as the spectral center of gravity. It has been identified in the literature as one of the features with strong perceptual significance and is associated with the perception of acoustic brightness. Low values usually indicate a "dark" sound, while higher values suggest a "bright" sound. However, the centroid also increases in the presence of noise and tends to fluctuate significantly during transient sound phenomena. When it is not normalized by dividing by the fundamental frequency (for harmonic sounds), as is the case here, it is also strongly correlated with pitch.
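The weighted average described above can be sketched for a single spectral frame as follows. This is a minimal illustration, assuming the weights are the magnitudes of the spectral bins; the function name `spectral_centroid` and the optional `f0_hz` argument (for the normalized variant) are hypothetical and not taken from the repository.

```python
def spectral_centroid(freqs_hz, magnitudes, f0_hz=None):
    """Return the magnitude-weighted mean frequency of one spectral frame.

    freqs_hz:   center frequency of each spectral bin, in Hz
    magnitudes: non-negative magnitude of each bin (assumed weighting)
    f0_hz:      if given, the result is divided by the fundamental
                frequency, which yields a dimensionless, pitch-normalized value
    """
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: the centroid is undefined, report 0
    centroid = sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total
    if f0_hz:
        centroid /= f0_hz
    return centroid

# Three partials of a 100 Hz tone: energy concentrated low vs. high.
dark = spectral_centroid([100.0, 200.0, 300.0], [1.0, 0.2, 0.1])
bright = spectral_centroid([100.0, 200.0, 300.0], [0.1, 0.2, 1.0])
```

In this toy example `bright` is larger than `dark` because more of the energy sits in the upper partials.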
Inharmonicity
Inharmonicity is a measure of how far the partials of a sound deviate from the integer multiples of the fundamental frequency, which is where they would lie in a perfectly harmonic signal. There is evidence that singers' voices can be distinguished on the basis of inharmonicity.
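Several definitions of inharmonicity exist, and the text does not say which one is used here. The sketch below shows one simple possibility, assuming each measured partial is compared with the nearest integer multiple of the fundamental frequency and the relative deviations are averaged. The function name `inharmonicity` is hypothetical.

```python
def inharmonicity(f0_hz, partials_hz):
    """Mean relative deviation of partials from the ideal harmonic series.

    Assumption: each partial is matched to the nearest harmonic number
    k = round(f / f0), and its deviation is measured relative to k * f0.
    Returns 0.0 for a perfectly harmonic set of partials.
    """
    if f0_hz <= 0 or not partials_hz:
        return 0.0
    total = 0.0
    for f in partials_hz:
        k = max(1, round(f / f0_hz))  # nearest harmonic number, at least 1
        ideal = k * f0_hz             # where a perfectly harmonic partial would lie
        total += abs(f - ideal) / ideal
    return total / len(partials_hz)

harmonic = inharmonicity(100.0, [100.0, 200.0, 300.0])
stretched = inharmonicity(100.0, [100.0, 202.0, 303.0])
```

Here `harmonic` is zero, while `stretched` is positive because the second and third partials lie 1% above their ideal positions.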
Formants
Formants are local maxima in the frequency spectrum, which, in the study of voice, reflect the resonance frequencies of the human vocal tract. Typically, the positions of the first four maxima, F1, F2, F3, and F4, are calculated. Differences in these positions allow us to recognize different vowels and to distinguish between different speakers or singers. For singers, formants are important because they determine the quality and timbre of the voice.
Vibrato
Vibrato is a periodic variation in pitch that creates the sensation of a pulsating sound and is one of the key expressive tools used by singers and performers of string or wind instruments. Particularly for singers, vibrato is considered one of the elements that give a performer a unique identity. Vibrato is characterized by its rate (the frequency of the variation), its extent (the range of variation around the average value), and its regularity (the extent to which the variation pattern can be approximated by a simple sine wave). Rate is measured in Hz and extent in cents (hundredths of a semitone). Regularity is expressed as a dimensionless quantity derived from the ratio between the amplitudes of the two strongest frequency components of the variation pattern. This quantity ranges from 0 for completely irregular vibrato to 1 for vibrato that follows a simple sine wave with a single variation frequency.
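The three vibrato parameters can be sketched from a fundamental-frequency contour as shown below. This is an illustrative example under stated assumptions, not the repository's implementation: extent is taken as half the peak-to-peak deviation in cents, and regularity is assumed to be 1 - a2/a1, where a1 and a2 are the two largest amplitudes in the spectrum of the deviation (the text does not give the exact formula). The function name `vibrato_parameters` is hypothetical.

```python
import cmath
import math

def vibrato_parameters(f0_hz, frame_rate):
    """Return (rate_hz, extent_cents, regularity) for an F0 contour.

    f0_hz:      positive F0 values in Hz, one per analysis frame
    frame_rate: number of frames per second
    """
    # Express the contour in cents and remove its mean value.
    cents = [1200.0 * math.log2(f) for f in f0_hz]
    mean = sum(cents) / len(cents)
    dev = [c - mean for c in cents]

    # Assumption: extent is half the peak-to-peak deviation.
    extent = (max(dev) - min(dev)) / 2.0

    # Naive DFT of the deviation, skipping the DC bin.
    n = len(dev)
    amps = []
    for k in range(1, n // 2 + 1):
        acc = sum(d * cmath.exp(-2j * math.pi * k * i / n)
                  for i, d in enumerate(dev))
        amps.append((abs(acc), k))
    amps.sort(reverse=True)
    (a1, k1), (a2, _) = amps[0], amps[1]

    rate = k1 * frame_rate / n
    # Assumption: regularity = 1 - a2/a1, so a pure sine gives a value near 1.
    # A practical version would compare spectral peaks rather than raw bins.
    regularity = 1.0 - a2 / a1 if a1 > 0 else 0.0
    return rate, extent, regularity

# Synthetic contour: 440 Hz with a 5.5 Hz, 50-cent sinusoidal vibrato (2 s at 100 frames/s).
contour = [440.0 * 2 ** (50.0 * math.sin(2 * math.pi * 5.5 * i / 100.0) / 1200.0)
           for i in range(200)]
rate, extent, regularity = vibrato_parameters(contour, 100.0)
```

For this synthetic contour the sketch recovers a rate of about 5.5 Hz, an extent of about 50 cents, and a regularity close to 1.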
Chromagram
The chromagram in computational music analysis shows the energy distribution of the 12 pitch classes (C, C#, D, D#, E, F, F#, G, G#, A, A#, B) over time, regardless of the octave. It is particularly useful for analyzing harmony, chord progressions, key estimation, and melodic structures, because it focuses on pitch class information rather than absolute frequency.
The analysis algorithms used in this repository are based on: M. Müller and S. Ewert. Chroma Toolbox: Matlab Implementations for Extracting Variants of Chroma-Based Audio Features. In Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011, Miami, Florida, USA, October 24-28, 2011.
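The idea of discarding the octave can be illustrated for a single analysis frame as follows. This is a simplified sketch and not the Chroma Toolbox implementation: it assumes equal temperament with A4 = 440 Hz, assigns each spectral bin to its nearest pitch class, and accumulates squared magnitudes as energy. The names `pitch_class` and `chroma_vector` are hypothetical.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, a4_hz=440.0):
    """Return the pitch-class index (0 = C, ..., 11 = B) nearest to freq_hz."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))  # MIDI note number
    return midi % 12  # fold all octaves onto one

def chroma_vector(freqs_hz, magnitudes, a4_hz=440.0):
    """Sum the energy of each spectral bin into its pitch class."""
    chroma = [0.0] * 12
    for f, m in zip(freqs_hz, magnitudes):
        if f <= 0:
            continue  # skip the DC bin, which has no pitch
        chroma[pitch_class(f, a4_hz)] += m * m
    return chroma

# A3 (220 Hz) and A4 (440 Hz) fall into the same pitch class; C4 does not.
frame = chroma_vector([220.0, 440.0, 261.63], [1.0, 2.0, 3.0])
```

Stacking one such vector per frame gives a 12-row matrix over time, which is the chromagram.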
Tempo Curve
A tempogram is a visual representation of how the tempo changes over time and it helps estimate the beats per minute (BPM) and rhythmic patterns in a piece. By analyzing the intensity of beats at different speeds, a tempogram can highlight the most dominant tempi in a track.
A single tempo curve can then be derived from a tempogram. Like the tempogram, a tempo curve represents how the tempo changes over time, but instead of showing all possible tempi, it highlights the most likely tempo at each moment, creating a flowing line that follows the rhythm of the music. This helps track abrupt or gradual tempo changes, such as accelerations or slowdowns, and makes the tempo curve useful for analyzing musical performances, thematic segmentation, beat tracking, and genre (or dance) classification.
The analysis algorithms used in this repository are based on: P. Grosche and M. Müller. Extracting predominant local pulse information from music recordings. IEEE Transactions on Audio, Speech, and Language Processing, 19(6):1688–1701, 2011.
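The step from a tempogram to a tempo curve can be sketched as picking the strongest tempo in each time frame. This is a deliberately simple illustration that assumes the tempogram is available as a list of frames, each holding one strength value per candidate tempo; the cited method is more elaborate, and the function name `tempo_curve` is hypothetical.

```python
def tempo_curve(tempogram, tempi_bpm):
    """Return the most salient tempo (BPM) for each frame of a tempogram.

    tempogram: list of frames; each frame lists one strength per candidate tempo
    tempi_bpm: candidate tempi in BPM, in the same order as the frame entries
    """
    curve = []
    for frame in tempogram:
        # Index of the candidate tempo with the largest strength in this frame.
        best = max(range(len(frame)), key=frame.__getitem__)
        curve.append(tempi_bpm[best])
    return curve

# Toy tempogram with three candidate tempi and a gradual slowdown.
tempi = [60, 90, 120]
toy_tempogram = [
    [0.1, 0.2, 0.9],
    [0.1, 0.8, 0.3],
    [0.7, 0.2, 0.1],
]
curve = tempo_curve(toy_tempogram, tempi)
```

A frame-wise maximum like this can jump between a tempo and its double or half, so practical systems often smooth the curve or constrain it over time.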