AI has achieved significant advances in recent years, revolutionizing industries such as healthcare, finance, transportation, entertainment, and music.
I attended the EuroPython conference's online stream this year and was inspired by the work of Mateusz Modrzejewski in his talk on Music Information Retrieval with Python.
My wife has a passion for playing and listening to the piano. Some of the songs are classical pieces, and their scores can be found online. Others are covers of more contemporary songs, like Beyoncé - Halo (Karaoke Piano) Lower Key from Sing2Piano. Sing2Piano recreates the scores for its covers and sells them in its own shop; however, the score for this particular song is missing.
Thus, I was pondering whether it is possible to recreate the music sheet for the song using open-source AI tools. Please note that this would be undertaken solely for educational purposes.
Getting the audio
To process the song, you will need a local version of it. There are several tools available that can be used to fetch public videos. I prefer to use pytube as it provides an easy CLI for quick tasks and can be integrated into code for more complex workflows.
Install Pytube:
pip install pytube
Check Available Audio Streams:
pytube -l https://www.youtube.com/watch?v=ID | grep audio
Output
<Stream: itag="139" mime_type="audio/mp4" abr="48kbps" acodec="mp4a.40.5" progressive="False" type="audio">
<Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2" progressive="False" type="audio">
<Stream: itag="249" mime_type="audio/webm" abr="50kbps" acodec="opus" progressive="False" type="audio">
<Stream: itag="250" mime_type="audio/webm" abr="70kbps" acodec="opus" progressive="False" type="audio">
<Stream: itag="251" mime_type="audio/webm" abr="160kbps" acodec="opus" progressive="False" type="audio">
Download an Audio Stream:
pytube --itag=251 https://www.youtube.com/watch?v=ID
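The same steps can also be scripted. Below is a minimal sketch of how I would do it with pytube's Python API, assuming the `YouTube(...).streams.filter(only_audio=True)` and `Stream.download(output_path=...)` calls behave as in the pytube documentation. The helper names `bitrate_kbps`, `pick_best_audio`, and `download_best_audio` are my own and not part of pytube.

```python
from pathlib import Path


def bitrate_kbps(abr):
    """Turn an 'abr' string such as '160kbps' into the integer 160."""
    digits = "".join(ch for ch in (abr or "") if ch.isdigit())
    return int(digits) if digits else 0


def pick_best_audio(streams):
    """Return the stream object with the highest average bitrate."""
    return max(streams, key=lambda s: bitrate_kbps(s.abr))


def download_best_audio(url, output_dir="."):
    """Download the highest-bitrate audio-only stream and return its path."""
    # Imported lazily so the helpers above work without pytube installed.
    from pytube import YouTube

    audio_streams = YouTube(url).streams.filter(only_audio=True)
    best = pick_best_audio(list(audio_streams))
    # download() returns the path of the file it wrote.
    return Path(best.download(output_path=str(output_dir)))
```

Calling `download_best_audio("https://www.youtube.com/watch?v=ID")` should then fetch the same stream the CLI selects with `--itag=251` in the listing above, since that one has the highest bitrate.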
Generate a MIDI file
From the Q&A session of the talk I mentioned earlier, I learned about Basic Pitch, a lightweight yet powerful audio-to-MIDI converter with pitch bend detection, released by Spotify.
It cannot process the .webm file from the previous step, so the file must be converted to one of the following formats: .mp3, .ogg, .wav, .flac, or .m4a. This conversion can be done easily using ffmpeg.
ffmpeg -i input.webm input.wav
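If you want to run the conversion from a script, a thin wrapper around the same ffmpeg call is enough. This is only a sketch: it assumes `ffmpeg` is on your `PATH`, the function names are my own, and I added `-y` so that an existing target file is overwritten instead of triggering a prompt.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source, target=None):
    """Build the ffmpeg call; the target defaults to the source name with .wav."""
    source = Path(source)
    target = Path(target) if target else source.with_suffix(".wav")
    # -y overwrites an existing target file without asking.
    return ["ffmpeg", "-y", "-i", str(source), str(target)]


def convert_to_wav(source, target=None):
    """Run ffmpeg and return the path of the converted file."""
    command = build_ffmpeg_command(source, target)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(command, check=True)
    return Path(command[-1])
```

`convert_to_wav("input.webm")` runs the equivalent of the command above and returns `input.wav`.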
After that, you can run basic-pitch to generate the MIDI file:
basic-pitch output/ input.wav
Here are some additional options you can use:
--sonify-midi to additionally save a .wav audio rendering of the MIDI file.
--save-model-outputs to additionally save raw model outputs as an NPZ file.
--save-note-events to additionally save the predicted note events as a CSV file.
The output directory will now contain an input_basic_pitch.mid file.
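Basic Pitch can also be called from Python instead of the CLI. The sketch below assumes that `basic_pitch.inference.predict` accepts an audio path and returns a tuple of raw model output, a PrettyMIDI object, and note events, as described in the project's README. The helpers `expected_midi_path` and `transcribe` are my own names, and I mimic the CLI's `<name>_basic_pitch.mid` naming by hand.

```python
from pathlib import Path


def expected_midi_path(output_dir, audio_path):
    """Mirror the CLI naming scheme: input.wav -> output/input_basic_pitch.mid."""
    return Path(output_dir) / f"{Path(audio_path).stem}_basic_pitch.mid"


def transcribe(audio_path, output_dir="output"):
    """Transcribe an audio file to MIDI and return the path of the .mid file."""
    # Imported lazily; requires `pip install basic-pitch`.
    from basic_pitch.inference import predict

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Only the PrettyMIDI object is needed to write the MIDI file.
    _model_output, midi_data, _note_events = predict(str(audio_path))

    midi_path = expected_midi_path(output_dir, audio_path)
    midi_data.write(str(midi_path))
    return midi_path
```

With this, `transcribe("input.wav")` should produce `output/input_basic_pitch.mid`, the same file name the CLI writes.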
Creating the Music Sheet
The last step is completed using MuseScore 4 by simply opening the .mid file and creating a PDF export. You will now have a PDF file that represents the music sheet for your chosen song.
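I did this step in the GUI, but MuseScore also ships a command-line interface with an `-o` export option, so the export could probably be automated too. Treat the following as an untested sketch: the executable name `mscore` is an assumption (it differs by platform and install method, for example an AppImage on Linux or `MuseScore4.exe` on Windows), and the function names are my own. Check `--help` on your installation before relying on it.

```python
import subprocess
from pathlib import Path


def build_export_command(midi_path, pdf_path=None, executable="mscore"):
    """Build a MuseScore CLI call that exports a MIDI file to PDF.

    The default executable name is an assumption; pass the one that
    matches your installation.
    """
    midi_path = Path(midi_path)
    pdf_path = Path(pdf_path) if pdf_path else midi_path.with_suffix(".pdf")
    # -o tells MuseScore to export to the given file instead of opening the GUI.
    return [executable, "-o", str(pdf_path), str(midi_path)]


def export_pdf(midi_path, pdf_path=None, executable="mscore"):
    """Run the export and return the path of the resulting PDF."""
    command = build_export_command(midi_path, pdf_path, executable)
    subprocess.run(command, check=True)
    return Path(command[2])
```

The imported score will likely still need manual cleanup in MuseScore (key signature, splitting hands across staves), so I would use the automated export only for a first look.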
Conclusion
AI tools such as basic-pitch can provide an automated method for converting audio to MIDI. However, there may still be limitations in accuracy and in capturing all the nuances of the original performance. Nevertheless, for personal projects or learning experiences, they can serve as valuable tools in recreating music sheets from audio recordings.