As a creative writer, I’ve been fascinated by LLM’s since they first came out. I’ve always wondered about how my writing works – when I’m “in the zone,” words and phrases seem to flow out of my brain into my fingers and onto the screen. Occasionally, sure, I stop to actual think about a particular metaphor or word choice, but most of the time the writing just flows. Often I’m shocked and amazed and I wonder how I came up with a particular turn of phrase.
It has struck me long ago that those words and phrases are not really mine – they are a comogulation of everything I have read in my life. Not necessarily the exact phrase, but the style, the pattern, the tone might be from other words, or combined from hundreds of similar things I’ve read.
When LLMs came out I quickly deduced they were working on the same principal. Of course, they have perfect recall and can sometimes regurgitate the exact text they were trained on, which my brain can’t really do. But the concept is similar.
So I’ve never been too upset by the idea that LLMs are “stealing” existing works by being trained upon them – that’s exactly what humans have done for thousands of years. Look at every writer who writes about the authors that influenced them and you can see hints of those previous works in their work. That’s how creativity works. It’s changed, modified, improved, morphed, and combined to make something new, usually without being conscious of the process.
(I’m not convinced AI does anything truly creative, since it doesn’t know what it is doing – there is zero intention – but it is mimicking the human process of creating, for sure.)
This “brain training” is one of the reason I read a lot and try to read different genres of fiction and types of non-fiction. The more I read, the better writer I become.
(If I have a worry about AI, it’s that it is running out of training material. It’s already being trained on its own output, and as more “writing” in the world is AI-generated, the various models will consume that for training. This will water down the content the way photocopies of photocopies are further and further removed from the original. Like a game of telephone, the end result may be corrupted and completely distorted, taking away whatever humanity was in the original.)