Autocompletion

Some language training sets contain hundreds of billions of words, but only a tiny percentage of those words are used in everyday speech.

I often rail against things being full of cliches and formulas, but they make sense when you can't write things down to remember them (or don't even want to remember them). Cliches are shortcuts.

You would think that simply letting AI generate permutations would result in lots of cliches if the training set were composed mostly of cliches and formulas, because the underlying syntax is set up to "auto-complete" them.

From The cliché writes back:

"These models use AI to learn what words tend to follow a given set of words, and, along the way, pick up information about meaning and syntax. They estimate likelihoods (the probability of a word appearing after a prior passage of words) for every word in their vocabulary, which is why they’re sometimes also called probabilistic language models. Given the prompt ‘The soup was …’ the language model knows that ‘… delicious’, ‘… brought out’ or ‘… the best thing on the menu’ are all more likely to ensue than ‘… moody’ or ‘… a blouse’."
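To make the "likelihoods" in that passage concrete, here is a toy sketch of the simplest possible version: a bigram counter that only looks at the single previous word. The corpus and the function names are my own inventions for illustration, and real language models are far more elaborate than this.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, invented for illustration (not from the quoted article).
CORPUS = [
    "the soup was delicious",
    "the soup was delicious",
    "the soup was brought out",
    "the bread was delicious",
    "the soup was cold",
]

def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Turn the raw counts after `prev` into probabilities."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in following.items()}

def autocomplete(counts, prev):
    """Pick the single most frequent continuation, i.e. the cliche."""
    following = counts.get(prev)
    if not following:
        return None
    return following.most_common(1)[0][0]

counts = train_bigrams(CORPUS)
probs = next_word_probs(counts, "was")
print(probs)                        # "delicious" gets the largest share
print(autocomplete(counts, "was"))  # the most common continuation
print("moody" in probs)             # never seen after "was", so no probability at all
```

Always taking the most frequent continuation is exactly how the cliche wins: whatever was said most often in the training text is what gets said again.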

Musical syntax is similarly predictive. I've noticed that even in my own writing, rhythms fall within a predictable range when lyrics are used, as the words already suggest rhythms, and I keep seeing the same rhythms emerging. Melodies in pop songs typically hover around the chord tones of triads and sometimes only the range of a third or fifth.

In the future we might be able to write our own algorithms (I won't say "AIs") to generate content, but why would we want to? We can set up ranges, for example, on how dissonant something can be, but it's just as easy to do that yourself by simply playing the music. Music already has the capacity for dissonance on demand, so there's no reason to set up algorithms to produce it, at least for music made in the flow of life. We can write algorithms wired to certain real-time contexts, like weather conditions, or other if/then associations. But what matters is which audio files are in the data set. If everything is from the classical period, then that's what the output sounds like. If your jukebox is full of Mozart, then any algorithm is going to sound like Mozart, and do we want that? Do we want all classical cliches?
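As a rough sketch of what such an algorithm might look like, the snippet below picks notes from a pool, with an if/then weather rule setting how dissonant they may be. Everything here is assumed for illustration: the dissonance scores are an arbitrary ranking I made up, not a standard measure, and the weather table, pools, and function names are hypothetical.

```python
import random

# Hypothetical dissonance score per interval (semitones above the root).
# 0 = most consonant. Illustrative numbers only, not a standard measure.
DISSONANCE = {0: 0, 7: 1, 5: 2, 4: 3, 3: 3, 9: 4, 8: 4, 2: 5, 10: 5, 11: 6, 1: 7, 6: 7}

# Hypothetical if/then rule: a real-time context sets the allowed range.
WEATHER_RANGES = {"sunny": (0, 3), "rain": (2, 5), "storm": (4, 7)}

def allowed_intervals(weather, pool):
    """Keep only the intervals in the pool that fit the weather's range."""
    low, high = WEATHER_RANGES[weather]
    return [i for i in pool if low <= DISSONANCE[i] <= high]

def generate(weather, pool, length, seed=0):
    """Pick `length` intervals at random from whatever the pool allows."""
    choices = allowed_intervals(weather, pool)
    if not choices:
        return []  # the data set has nothing that fits the rule
    rng = random.Random(seed)
    return [rng.choice(choices) for _ in range(length)]

TRIAD_POOL = [0, 4, 7]         # a "jukebox" holding only major-triad tones
FULL_POOL = list(DISSONANCE)   # every interval in the table

sunny_triad = generate("sunny", TRIAD_POOL, 8)
storm_full = generate("storm", FULL_POOL, 8)
storm_triad = generate("storm", TRIAD_POOL, 8)

print(sunny_triad)  # only triad tones can appear
print(storm_full)   # only the more dissonant intervals can appear
print(storm_triad)  # empty: the triad pool has nothing dissonant to offer
```

The rule can ask for a storm, but if the pool only holds triad tones, nothing stormy comes out. The data set decides more than the algorithm does.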

But cliche is important, as it sets the boundaries on how far we want to go in terms of commonalities. This is what we do, perhaps unwittingly, to keep our work within acceptable cultural boundaries. We want to avoid being too esoteric or exoteric. If we work from the supposition that we just don't care about originality and are going for just bullet points or takeaways because we have to get through the firehose of inputs, then AI would work for some people. We're in echo chambers anyway. We don't read more than headlines, or perhaps the first paragraph, or we skim for the phrases that confirm our beliefs and positions. An interesting AI would be one that just scans the places you usually go and delivers their contents to your doorstep. But as some of us remember from the print days, stuff would just pile up. I loved my subscription to Artforum, but I never read the issues, or I just looked at the art. However it's generated, content still accumulates.

Cliches exist because we let things be good enough for the limited time we engage with art. Commercial art in public spaces is an example of this: it's simply something to fill a space. Likewise, algorithms fill spaces without much thought or effort going into the result. It's a form of auto-complete.
