"The first step is to figure out what the actual notes of the spoken words are. This usually entails slowing the video digitally and going over it and over it until I work out the 'melody' of the voice. Then, I add harmony. I try to get my videos to have a chord progression that you might expect to hear in the real world.
Without getting too in depth, any one note can be applied to virtually any chord. The note might just function differently depending on how you’re using it in a harmonic context. This gives me a lot of freedom in how I apply the melody to a harmonic progression and makes it so I can usually spit out something in the end that almost sounds like a complete song.
The hardest part is executing it all once I’ve gotten it worked out, since people tend to speak much faster than one would typically sing."
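The remark about harmony, that a single note can sit in almost any chord and simply take on a different role, can be illustrated with a little arithmetic on the twelve-tone scale. The Python sketch below is only an illustration and not the musician's actual workflow. The names `NOTE_NAMES`, `FUNCTIONS`, `function_of`, and `functions_over_roots` are hypothetical, and the role labels are one common naming convention among several.

```python
# Illustrative sketch: how one melody note functions over different chord roots.
# All names here are hypothetical; the labels follow common pop/jazz usage.

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Distance above the chord root in semitones -> usual name for that role.
FUNCTIONS = {
    0: "root",
    1: "flat 9th",
    2: "9th",
    3: "minor 3rd",
    4: "major 3rd",
    5: "11th (or suspended 4th)",
    6: "sharp 11th (or flat 5th)",
    7: "5th",
    8: "flat 13th (or sharp 5th)",
    9: "13th (or 6th)",
    10: "minor 7th",
    11: "major 7th",
}


def function_of(note: str, root: str) -> str:
    """Return the role `note` plays over a chord built on `root`."""
    semitones = (NOTE_NAMES.index(note) - NOTE_NAMES.index(root)) % 12
    return FUNCTIONS[semitones]


def functions_over_roots(note: str, roots: list[str]) -> dict[str, str]:
    """Map each candidate chord root to the role `note` plays over it."""
    return {root: function_of(note, root) for root in roots}


if __name__ == "__main__":
    # The same sung or spoken pitch, E, reinterpreted over four chord roots.
    for root, role in functions_over_roots("E", ["C", "A", "F", "D"]).items():
        print(f"E over a chord rooted on {root}: {role}")
```

Here the same E is the major 3rd of a C chord, the 5th of an A chord, the major 7th over F, and the 9th over D, which is why a fixed spoken "melody" still leaves plenty of room in choosing the chords underneath it.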