Give AI Music A Hand

September 07, 2024

Not to applaud it, but to put our hands on it--to interact and reshape it--to collaborate with it as a human-in-the-loop.

As AI music evolves, I'm still looking for ways to use it, either as a tool or as subject matter for appropriation. Some sites are clearly better at generation, but I'm less interested in that than in seeing where my own ideas go--not only in the early provisional demos, but in how they evolve over time. I'm fine with using AI in that process--especially for filling in lyrics.

Here's an A-B comparison of something generated by AI using a few random lyrics:

Function before form
Start with a basic sound
Think that they were art
Looking at them close and far
Roman or Egyptian art
Brightly-colored African art
Going back to nature
Going back to the roots

My version:

Function Before Form (8-19-2024) by meta4s

The AI-generated version:

Generated Lines:

Function before form
Basic sound in the air
Art in the making
Close and far we stare
Simple and rough
Beat rising high
Unpolished love
Touching the sky
Short lines we sing
Melody rings loud
Catchy and clear
We stand proud

The result is a soft jazz ballad with brushes and acoustic bass; you could noodle around on a guitar against it as a jam track. What's remarkable is that the generated music sounds more like "form before function" (emotion before compositional form), so it's ironic in that regard and an odd interpretation. If someone were actually going to improvise a vocal, this wouldn't be the approach with these particular lyric lines. While the production sounds good, there's no producer asking for a different take.

The chords are diatonic in F major and the melody is mostly pentatonic, but there is a particularly jarring use of a Db major triad in what it calls the "chorus". The algorithm is probably designed to "borrow" chords from the parallel key, F minor, but it does so in a way that a human wouldn't, and it reads rather as a borrowing from Bb minor. Such abrupt modulations would only be used at certain inflection points in the words, perhaps with the "Egyptian art" line--and even then it would be a bad cliche. But overall, the lines it generated in the gaps were quite good and I'd go with them. The tempo is rubato, which is appropriate for this style, but AI-generated tracks meant to be used against other tracks would have to be quantized, which to me is not worth the trouble.
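
For anyone who wants to check the harmony, here is a minimal Python sketch of why the Db major triad sticks out. It is only an illustration: the helper names (`scale`, `major_triad`, `degree_in_key`) are my own, it uses natural minor only, and it says nothing about how the generator actually picks chords. It tests whether all three chord tones belong to a key and, if so, on which scale degree the chord root sits.

```python
# Pitch classes, using flat spellings: C=0, Db=1, ... B=11
NOTE_TO_PC = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
              "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}

# Semitone offsets from the tonic for each scale type
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]
NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]


def scale(tonic, steps):
    """Return the pitch classes of a scale, in order from the tonic."""
    root = NOTE_TO_PC[tonic]
    return [(root + s) % 12 for s in steps]


def major_triad(root):
    """Return the pitch classes of a major triad (root, major 3rd, 5th)."""
    r = NOTE_TO_PC[root]
    return {r, (r + 4) % 12, (r + 7) % 12}


def degree_in_key(chord_root, triad, tonic, steps):
    """Return the 1-based scale degree of the chord root if every
    chord tone is in the key; otherwise return None."""
    pcs = scale(tonic, steps)
    if triad <= set(pcs):
        return pcs.index(NOTE_TO_PC[chord_root]) + 1
    return None


db = major_triad("Db")
print(degree_in_key("Db", db, "F", MAJOR_STEPS))           # None: not diatonic in F major
print(degree_in_key("Db", db, "F", NATURAL_MINOR_STEPS))   # 6: the bVI of F minor
print(degree_in_key("Db", db, "Bb", NATURAL_MINOR_STEPS))  # 3: the bIII of Bb minor
```

The triad fits both minor keys on paper, so the pitch content alone can't settle which one is being borrowed from; that comes down to how the chord is approached and left, which is a judgment made by ear.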

I prefer my version because the valence matches the words, and I used a stuttering rhythm contrasted with a simple 4/4 backbeat (a la John Bonham's on "Kashmir"), which is more "functional" than a smooth jazz treatment. Also, the edginess seems more appropriate to the mood of the world today. Mine is obviously not slickly produced--so AI wins there--but again, smooth jazz is the wrong approach, and the modulation is odd. But most importantly, mine isn't driven by algorithms other than my own.

I can see doing an album of songs with some AI generated material, but not as-is out of the box. For example, how would an actual jazz group perform the song? Could it be transcribed into a lead sheet with chords and melody, and what happens to it in the process?

***

Here's another AI generation based on only 3 lines chosen at random from the 8/29 entries in my diary: one about the 2004 Republican convention on 8/29/2004 and one about Hurricane Katrina on 8/29/2005. It's a song about nothing and something at the same time, built from just three lines.

Looks like 1968 again
Category 5
Climate change
History of automation
Everywhere I look
Everything I saw
Was something to be made
It all belonged to me
A factory
With its broken panes
Lines on a road map
Paper fragments
In the street
Anything goes

I actually like what was generated, and I played a bass part against it, as well as an orchestral arrangement, in order to "put my hands on it". Making something sound just like the "record" is like photorealist painting, but it can also be a form of deep-fakery with insincere intentions, such as painting a photo that fools the viewer into believing it depicts reality when, in fact, things were altered. No one would bother checking it. It raises the question of whether truth and trust should be involved in art-making. Personally, I would give full disclosure of what I did--counter to the way we seem to be exhibiting AI-generated work without saying that it is.

The vocals here are very loose and rubato, with a very human feel--more like Sprechgesang. But if a human were actually singing this in a band, the phrasing would be much different. It's interesting how it displaces the melody at random and puts in 2/4 bars.

What is remarkable is that the tempos lock, so you can throw orchestral arrangements against it. This is a way to interact with the music. I transcribed the melodic and rhythmic figures and mapped them over the main track.
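
Mapping transcribed figures over a track only works if the note onsets land on a shared grid. As a rough sketch (not what I actually did in the DAW), here is how loosely timed onsets could be snapped to the nearest grid point in Python. The function name `quantize`, the 120 bpm tempo, and the onset times are all made up for illustration, and the sketch assumes one constant tempo, which a truly rubato performance would not have.

```python
def quantize(onsets_sec, bpm, subdivisions=4):
    """Snap onset times (in seconds) to the nearest grid point.

    The grid divides each beat into `subdivisions` equal steps
    (4 gives sixteenth notes when the beat is a quarter note).
    Assumes a single constant tempo; rubato would need a tempo map.
    """
    step = 60.0 / bpm / subdivisions  # length of one grid step in seconds
    return [round(t / step) * step for t in onsets_sec]


# Hypothetical onsets played slightly off the grid at 120 bpm
loose = [0.02, 0.49, 1.07, 1.52]
print(quantize(loose, bpm=120))  # [0.0, 0.5, 1.125, 1.5]
```

A coarser grid (say `subdivisions=2`) forgives more but flattens the phrasing, which is part of why quantizing a loose vocal tends to kill the feel that made it sound human.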

Category 5 (Orchestral) by meta4s

Most AI music generation is terrible, like this: https://aimusic.fm/music/66d079eb87b45a39e9e3030c

***

The other day I consulted one of my favorite books on the philosophy of photography, Roland Barthes's Camera Lucida: Reflections on Photography. This passage relates to what we're doing with AI art, and I am replacing "photography" with "AI art".

"I call photographic [AI art] referent not the optionally real thing to which an image or a sign refers but the necessarily real thing which has been placed before the lens, without which there would be no photograph [AI art]. Painting [AI art] can fake reality without having seen it. Discourse combines signs which have referents, but these referents can be and are most often “chimeras”. Contrary to these imitations, in photography [AI art] I can never deny that the thing has been there. There is a superimposition here, of reality and of the past. And since this constraint exists only for photography [AI art], we must consider it, by reduction, as the very essence, the noeme of photography [AI art]. What I intentionalize in a photograph…is neither art nor communication, it is reference, which is the founding order of photography [AI art]."

AI generation of art and music produces "referents" in the same way cameras are an apparatus for creating referents, faking reality. Looking at a picture you took yourself reminds you of the moment, but when moments are generated and we don't know who experienced them, they should be seen as inherently fake--even if they weren't faked. I actually liked the Category 5 song, but whose voice was it? It was a human, I'm sure, but why do we accept the anonymity so easily? When we see a photo we know someone took it. When we see an AI-generated photo there are many people who could have taken it, so we're seeing a collage of experiences. That's what AI music is: a collage of experience.

Cory Doctorow on "enshittification": "OpenAI is the poster child of the AI world. Multiple 60-year-olds at my gym have told me I should use ChatGPT to write articles. I cannot think of another technology company that has created such universal hype and fear as OpenAI. Indeed, it’s arguable that our current cultural, economic, and financial obsession with AI, a technology that has existed for decades, is thanks to OpenAI. In fact, OpenAI’s meteoric rise has pushed big tech to invest hundreds of billions of dollars into AI in an attempt to catch up or ride on their coattails. So, the fact that a recent investigation found that OpenAI could be bankrupt by the end of the year is utterly terrifying, as it will hurt all of us."
