Do LLMs Have Soul?
(Dictated with some edits)
Artificial intelligence surprisingly has more soul than it used to. Ten years ago, when the first songs created with AI were released, they had a strange warbled sound to them which I thought was kind of interesting. Now many AI generations are very realistic. But there's this idea that AI isn't capable of making us feel emotions the way humans can because it isn't actually playing acoustic instruments or singing. I love playing instruments, but now I have somewhat of an affinity for music generated by AI--which I would argue can make us feel emotions.
The other day I was running generations on some of my new lyrics, and the iterations were quite good and actually started to move me emotionally. The problem with always playing music manually is that while we want to do things that sound emotional, it doesn't always turn out that way; it depends on our musical abilities. If you're not playing guitar well, or you're not playing piano well, your intentions to have soul aren't going to be very effective. It's a fine balance: music has to be both played well and produced well to create that kind of valence.
"Valence" is a new word that we haven't used so much in the past, but now it's a standard metadata field in digital music. Feeling is really in the moment: When we listen to a song for the first time it may move us in a different way than when we listen to it in the future, it all depends on the context and what's happening in your life, even your emotional valence at the time that you're listening to it. We're always at the mercy of selective attention, where we look at one element in a piece of music, and don't notice other things. But one day we start noticing them, and we connect to it in a different way. Perhaps you haven't listened to the lyrics because you've always been listening to the music, so when you make that discovery, it changes your emotions and your opinion about it. But the problem with AI music is still attribution: if the guitar part is very soulful as if it were played by a real musician, who is the real musician? And what were the emotions of the musicians when they were recording it? Very often you're playing things over and over again, and there's no emotion in it at all, or seems less genuine and forced. You just want to be done with it; It becomes tedious. Any emotions that were there to begin with are long gone once you've gotten to the 10th take of something. If your skills are good enough that you can play everything on a first take, which jazz musicians can do, I think it's something that we should aspire to–and perhaps a reason to restart jazz education. It's in the improvisations, coupled with musical ability that we can directly access soul. We understand soul as being spiritual, but it really depends on our abilities to fully express it on a first take. Perhaps there should be an option in AI music generation where you would select things that were in perfectly played, And then do an AB comparison of them to see which ones were more moving to you, and had more so-called soul, or je ne sai quoi.
If AI doesn't have consciousness, then it can't feel the same things we're feeling, so the AI that is writing this music can't be asked what its feelings were about it. The feelings that are in the music are just coming from whatever emotions were in the individual vocalists and players. But even then, as I said, that emotion might actually have been frustration and fatigue after working in the studio for many hours. Feelings can largely be an illusion. Artificial intelligence is mimicking emotions, and we buy into that illusion. In terms of the vocal performances, if an actual vocalist tried to replicate what the AI did, it would sound much different, and probably not as good. But people might decide that, given a choice between something that was perfectly emotive and something that was partially emotive, or partially true, they would choose the former. (See the Harari interview.)
There are also questions of ethics that remain beyond just using samples of vocalists and instrumentalists. What we might have to think about is whether it is authentic to ourselves--that is, could we produce something similar? Is it a matter of poaching? For example, we have to go to the rainforest for the natural materials in tires because we don't have latex-producing plants in the US. Similarly, we have to use AI as a way to get resources from other musicians in order to create our music. Perhaps we should always think locally, but after globalism, is there really any way we can live completely locally? And, by the same analogy, when generating music, can we just use our "local" skills? I would say yes, because I can play music manually, but generating fully produced music is a kind of convenience. And there's always the possibility of performing that music.
If I may quote my own work, I wrote an essay in the early aughts that referenced anthropological studies done in the 1960s in the African bush, where they would show the natives photographs and ask them what they saw. "Many of these people had never seen a photograph before in their lives. One woman, when shown a photo of her own son, failed to recognize him. However, when they were shown the same images on cloth or stone, they were able to relate to them. This may indicate that certain cultures may have invisible constructs in the brain that prevent subjective interpretation of the outside world." There's an ingrained trust in traditions, and perhaps AI is inimical to trust. Even if something moves us emotionally, it is rejected as fake and lacking "soul"--but that's similar to the feelings we might have after doing 50 takes of a guitar part. To the listener it sounds spontaneous, but it wasn't for the player.
***
In some sense, everything I've ever done could be considered to have soul because it was done manually, with all kinds of inexpensive consumer-level gear, portable cassette machines, and so on. A lot of the music that I've written is interesting to me in musical terms, but could never compete on a production level. Even if you were to invest the money in studios and session musicians, the music would sound better, but could never compete with mainstream music. Now everything can compete on an even playing field, at the expense of everything sounding the same because it's all coming from the same data sets.
To really make AI music powerful, it has to meet the needs of musicians, not listeners. The way I work is mostly in notation, jotting down riffs and rhythms as I think of them. But how do I use that in an AI context? My melodic and rhythmic ideas come from my soul, but how can I bring them into the process? Even if my musical seeds turn into something else, at least they're from me.
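One possible way to keep that seed concrete--a sketch only, assuming you're willing to capture the riff as MIDI; the notes and file name below are placeholders, and the mido library is just one option:

```python
# Capture a jotted-down riff as a small MIDI file so the human "seed"
# survives intact, whatever a generator later does with it.
from mido import Message, MidiFile, MidiTrack

# A four-note riff written as (MIDI note number, duration in ticks).
riff = [(60, 240), (63, 240), (67, 480), (65, 480)]

mid = MidiFile()          # default resolution: 480 ticks per beat
track = MidiTrack()
mid.tracks.append(track)

for note, duration in riff:
    track.append(Message("note_on", note=note, velocity=80, time=0))
    track.append(Message("note_off", note=note, velocity=0, time=duration))

# This file, not the finished generation, is the part that came from me.
mid.save("riff_seed.mid")
```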
From Todd Rundgren's The Individualist: "The second or third night in Kathmandu I am woke in my hostel room with a song in my head that I felt desperate to remember. I had no guitar with me and of course no ability to think in conventional music notation, so I spent about a half an hour visualizing a piano keyboard and how my hand should look to articulate the chords in my head. For the rest of my trip I replayed that exercise until many weeks later when I actually got in front of a piano and made my hands repeat what I had imagined."
This is how soul works in the songwriting process. Ideas are going to come to you, and there has to be some manual way of recording them, either on some form of paper or whatever you can get your hands on, or by creating some kind of mnemonic. None of this has anything to do with electronics or computers. But who's to say that the idea you had wasn't had by some other soul? And what will happen to the idea once you start to work with it? It's a challenge to extrapolate an idea so that its original soul remains intact over weeks or months.
Sometimes after I generate a piece of music, I start to reverse-engineer it, or play parts against it, to see what its essence is. Perhaps that's how we locate the "soul" of something, even in manually played music that has been recorded. [Incidentally, if you generate music with AI and you want to learn music, simply play along with it, as we did back in the day when we played along with the radio and records. If you've been generating music and have a large catalog, put it all in a playlist and play along with the playlist. The problem, though, is that the intrinsic quality of the music is lacking compared to what we were playing along with in the 1970s and 1980s, when there was still an influence from blues, jazz, and classical.] Playing it live is sometimes a different story. Perhaps what we're after is the wabi-sabi effect, which is what people react to in a live performance. Some people like to hear bands play the songs as they are on the record, and some audiences don't care and prefer things to be played in the moment–as they would be in a jazz context. Personally, that's what I like. I wouldn't want to play songs as they are on the record because that's sort of artificial. The whole process of recording is artifice to begin with. But it's what we react to emotionally. The guitar solo on "Stairway to Heaven" can be emotionally moving, yet most of the solos Page played live are terrible. But we still liked seeing them live, in a different kind of emotional context, typically under the influence of something.
The other day I played some of my AI generations for a friend, and he really liked them. But who's to say that if he listens to them tomorrow he'll have the same feelings (valence) about them? This is why music has to be played at least three times before you can really absorb what's there. Initially what hits us is what we expect in the moment, which can be different tomorrow. Tomorrow we may have new opinions about AI and be cynical about it, and we bring that to the table when we listen to things or look at art. It may start to sow contempt, and consequently the music gets drowned out by that contempt. Minimalism in art has always had that problem, where the contemptuous feeling was that anybody could do it, and what's this world coming to? We're still having the same conversation about AI. There are always people who aren't going to like it, just as they don't like modern or contemporary art, and nothing is going to change their minds. Some people still don't like computers, so AI goes nowhere with them.

We might think that recognizing soul, or doing something with soul, comes from within us, but that's not necessarily true, because it's a reaction to inputs--even inputs from yourself, if you're improvising. There have been many times when I just improvised and recorded what I played, and there were interesting parts I could take and use, but most of it was unusable. There might have been some emotion in it, but you still have to compose it, and in the composition process emotions get pushed to the background because we're using the left hemisphere to do the writing. Then, hopefully, when the piece is performed, the people who listen to it will feel the emotions the conductor evokes. Could there be an AI conductor if all the instrument sounds are generated in real time? There would probably be some interesting moments in there, and we'd be moved by them regardless.
The genre of music is also crucial to how I personally feel about it. If I'm excited about an idea and it's an electronic dance music piece, which I certainly couldn't produce on my own, I'll still like the result and like listening to it. It's music that I wouldn't turn off.
I've always liked fun and frivolous pop music. Yesterday I watched a short of "Walk Like an Egyptian" by the Bangles. Why would a person my age be moved by that? But I was. I'm moved by a lot of the pop music that I generate. As a musician, I never really go back to the things I liked 20 or 30 years ago. There's really no reason to, because you're saying to yourself, "I'm not going to be moved by this music anymore because I'm too old for it." I think AI music allows us to revisit the things that we like, and we can feel that we have actually written them, even though everything was generated. Personally, I always use my own lyrics, and if I manually composed the song on a guitar using my lyrics, which I certainly can do, it would be different, but it wouldn't be complete; it wouldn't be produced. What AI music gives us is a finished product that we can react to, which I argue is better than one played on an out-of-tune guitar. I still see the value in that as a musician, but for someone who doesn't play instruments, playing an out-of-tune guitar wouldn't be very inspiring. There are AI music contests now where people with no musical training can become stars. The songs are actually moving people emotionally, and listeners aren't too concerned about whether they're played by a real human. So it's incumbent on humans to create an equivalent experience that is just as good as or better than an AI-generated track, or to take the AI-generated music and rewrite it in a way that can be played by musicians and be equally compelling.
Sometimes we know whether something is emotionally moving through the perceived reaction of someone we know. I played one of my AI generations of lyrics in Esperanto for a friend; it was done just with an acoustic guitar and a voice. He said it was something that his 80-year-old mother, someone who never listens to music, would like. So we have a collective sense of what is moving. It wouldn't matter to her whether it was AI-generated. For someone who has no involvement with technology whatsoever, she would probably be amazed that a computer could do something like that. That wouldn't have been the case even two years ago, when AI-generated music was not as good and still sounded very artificial.
What typically happens with any disruptive technology is that it breaks things into little pieces that have to be reassembled at some point. The current litigation between the major music conglomerates and AI music startups is an example: imagine what would happen if they were to reach some kind of deal or settlement and then had to get into the trenches to work out how the terms could be implemented. The biggest problem is, again, attribution--attempting to figure out whose voice it is, or who's playing what. It will probably be very expensive because it requires forensic analysis, and it will likewise be a business opportunity for some. In any event, there will be a new music business model, and it will take many years for all of this to resolve. In the meantime, AI music will continue to evolve as more people use it and find more uses for it. The interesting uses, to me, are the ones on the edges, which not many people are exploring.
***
What AI will do is make music more polished and packaged, which I think is what people ultimately want. Take, for example, organic produce that might be compromised in some way: it doesn't look good and has various blemishes compared to genetically modified or packaged produce that looks better, which is what people ultimately choose. That's the way we look at music now, too, with polished productions generated with AI. In terms of having soul: if you were to grow your own food, it would also be blemished, but you would eat it because it's something you grew yourself. That's actual soul food, growing it and preparing it yourself--the analogue of making music manually with whatever tools you have available. Artisanal-quality foods have been an effective marketing strategy for a long time because they sell the idea of being relieved of the work involved in growing and preparing your own food while keeping the same kind of vibe. Conceivably, you could use the same strategy with AI-generated art, where you'd have actual artists doing the generation and claiming that the soul of the artist is in the work they do. It could get to the point where people won't want to do the generations themselves, even though it takes five minutes, because they've been sold on the idea that the artist has a soul--something unique and special that they're willing to pay for.