Converting Instrumental Songs to Sheet Music Using Open Source AI

Posted by Harald Nezbeda on Sun 01 October 2023

AI has achieved significant advances in recent years, revolutionizing industries such as healthcare, finance, transportation, entertainment, and music.

I attended the EuroPython conference's online stream this year and was inspired by the work of Mateusz Modrzejewski in his talk on Music Information Retrieval with Python.

My wife has a passion for playing and listening to the piano. Some of the pieces are classical, and their scores can be found online. Others are covers of more contemporary songs, like Beyoncé - Halo (Karaoke Piano) Lower Key from Sing2Piano. Sing2Piano recreates the scores for its covers and sells them in its own shop; however, the score for this particular song is missing.

So I wondered whether it would be possible to recreate the sheet music for the song using open-source AI tools. Please note that this was done solely for educational purposes.

Getting the audio

To process the song, you will need a local version of it. There are several tools available that can be used to fetch public videos. I prefer to use pytube as it provides an easy CLI for quick tasks and can be integrated into code for more complex workflows.

Install Pytube:

pip install pytube

Check Available Audio Streams:

pytube -l "https://www.youtube.com/watch?v=ID" | grep audio

Output

<Stream: itag="139" mime_type="audio/mp4" abr="48kbps" acodec="mp4a.40.5" progressive="False" type="audio">
<Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2" progressive="False" type="audio">
<Stream: itag="249" mime_type="audio/webm" abr="50kbps" acodec="opus" progressive="False" type="audio">
<Stream: itag="250" mime_type="audio/webm" abr="70kbps" acodec="opus" progressive="False" type="audio">
<Stream: itag="251" mime_type="audio/webm" abr="160kbps" acodec="opus" progressive="False" type="audio">

Download an Audio Stream:

pytube --itag=251 "https://www.youtube.com/watch?v=ID"
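
The listing and download steps above can also be scripted. The following is a minimal sketch, not part of the original workflow: `pick_best_audio_itag` and `download_audio` are hypothetical helper names, the parser assumes the listing looks exactly like the output shown earlier, and the download relies on pytube's `YouTube(...).streams.get_by_itag(...)` and `Stream.download(...)` calls.

```python
import re

# Matches the itag and bitrate attributes in one line of `pytube -l` output.
STREAM_RE = re.compile(r'itag="(\d+)".*?abr="(\d+)kbps"')


def pick_best_audio_itag(listing: str) -> int:
    """Return the itag of the audio stream with the highest bitrate."""
    candidates = []
    for line in listing.splitlines():
        match = STREAM_RE.search(line)
        if match and 'type="audio"' in line:
            itag, abr = int(match.group(1)), int(match.group(2))
            candidates.append((abr, itag))
    if not candidates:
        raise ValueError("no audio streams found in listing")
    # Tuples compare by bitrate first, so max() picks the best stream.
    return max(candidates)[1]


def download_audio(url: str, itag: int, output_dir: str = ".") -> str:
    """Download one stream by itag and return the path of the saved file."""
    # Imported lazily so the parser above works without pytube installed.
    from pytube import YouTube

    stream = YouTube(url).streams.get_by_itag(itag)
    if stream is None:
        raise ValueError(f"no stream with itag {itag}")
    return stream.download(output_path=output_dir)


# The listing from the "Output" step above.
LISTING = """\
<Stream: itag="139" mime_type="audio/mp4" abr="48kbps" acodec="mp4a.40.5" progressive="False" type="audio">
<Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2" progressive="False" type="audio">
<Stream: itag="249" mime_type="audio/webm" abr="50kbps" acodec="opus" progressive="False" type="audio">
<Stream: itag="250" mime_type="audio/webm" abr="70kbps" acodec="opus" progressive="False" type="audio">
<Stream: itag="251" mime_type="audio/webm" abr="160kbps" acodec="opus" progressive="False" type="audio">
"""

best_itag = pick_best_audio_itag(LISTING)
```

For the listing above, `best_itag` is 251, the same stream chosen by hand in the download command. Passing it to `download_audio` together with the video URL would fetch the file.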

Generate a MIDI file

From the Q&A session of the talk I mentioned earlier, I learned about Basic Pitch, a lightweight yet powerful audio-to-MIDI converter with pitch bend detection, released by Spotify.

It cannot process the .webm file from the previous step, so it must be converted to one of the following formats: .mp3, .ogg, .wav, .flac, or .m4a. This conversion can be done easily using ffmpeg.

ffmpeg -i input.webm input.wav
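
If the conversion is part of a larger script, the same ffmpeg call can be wrapped in Python. This is a small sketch under the assumption that `ffmpeg` is on the `PATH`; `build_ffmpeg_command` and `convert` are hypothetical helper names, and the suffix check simply mirrors the list of formats mentioned above.

```python
import subprocess
from pathlib import Path

# Input formats listed above as accepted by basic-pitch.
SUPPORTED_SUFFIXES = {".mp3", ".ogg", ".wav", ".flac", ".m4a"}


def build_ffmpeg_command(source: str, target: str) -> list[str]:
    """Build the ffmpeg command, rejecting targets basic-pitch cannot read."""
    suffix = Path(target).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported target format: {suffix!r}")
    return ["ffmpeg", "-i", source, target]


def convert(source: str, target: str) -> None:
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(source, target), check=True)
```

Calling `convert("input.webm", "input.wav")` runs the same command as the line above.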

After that, you can run basic-pitch to generate the MIDI file:

basic-pitch output/ input.wav

Here are some additional options you can use:

  • --sonify-midi to additionally save a .wav audio rendering of the MIDI file.
  • --save-model-outputs to additionally save raw model outputs as an NPZ file.
  • --save-note-events to additionally save the predicted note events as a CSV file.

The output directory will now contain an input_basic_pitch.mid file.
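
Basic Pitch can also be called from Python instead of the command line. The sketch below assumes the `predict` function from `basic_pitch.inference` as documented in the project's README, which returns the raw model output, a PrettyMIDI object, and the note events. `expected_midi_path` and `transcribe` are hypothetical helper names, and the file name simply mirrors the naming pattern of the CLI described above.

```python
from pathlib import Path


def expected_midi_path(output_dir: str, audio_path: str) -> Path:
    """Mirror the CLI naming: <audio file name without suffix>_basic_pitch.mid."""
    return Path(output_dir) / f"{Path(audio_path).stem}_basic_pitch.mid"


def transcribe(audio_path: str, output_dir: str) -> Path:
    """Transcribe an audio file to MIDI and return the path of the MIDI file."""
    # Imported lazily so the path helper works without basic-pitch installed.
    from basic_pitch.inference import predict

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # predict returns (model_output, midi_data, note_events);
    # midi_data is a pretty_midi.PrettyMIDI object.
    _, midi_data, _ = predict(audio_path)
    midi_path = expected_midi_path(output_dir, audio_path)
    midi_data.write(str(midi_path))
    return midi_path
```

With the files from this post, `transcribe("input.wav", "output")` would write `output/input_basic_pitch.mid`.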

Creating the Sheet Music

The last step is done in MuseScore 4: open the .mid file and export it as a PDF. You will now have a PDF file containing the sheet music for your chosen song.
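
The export can probably be automated as well, since MuseScore offers a command-line converter mode through its `-o` option. The sketch below is an assumption on my part rather than something tested for this post: the executable name differs by platform and installation, so `mscore` is only a placeholder, and `build_export_command` and `export_pdf` are hypothetical helper names.

```python
import subprocess


def build_export_command(
    midi_path: str, pdf_path: str, executable: str = "mscore"
) -> list[str]:
    """Build the MuseScore converter command.

    The default executable name is a placeholder (assumption); adjust it
    to match the MuseScore binary on your system.
    """
    return [executable, "-o", pdf_path, midi_path]


def export_pdf(midi_path: str, pdf_path: str, executable: str = "mscore") -> None:
    """Run MuseScore in converter mode and fail loudly on errors."""
    subprocess.run(build_export_command(midi_path, pdf_path, executable), check=True)
```

Opening the file in the MuseScore interface remains the better choice when the score needs manual cleanup before exporting.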

[Image: the generated sheet music]

Conclusion

AI tools such as basic-pitch provide an automated way to convert audio to MIDI. The transcription may still be inaccurate in places and miss nuances of the original performance. Nevertheless, for personal projects or learning experiences, such tools are valuable for recreating sheet music from audio recordings.