“Words behave like pixels and sentences like pictures”: An interview with Mario Klingemann
Mario Klingemann is considered a pioneer in the field of neural networks, machine learning and artificial intelligence art. He spoke to the Goethe-Institut about using AI creatively and the role of technology in a modern interpretation of Kulturtechnik.
Mario Klingemann describes himself as a sceptic with a curious mind. The Munich-based artist is not new to the technology game: he taught himself programming in the early 1980s and has wanted to train machines to display near-autonomous creative behaviour ever since.
He has been an Artist in Residence at Google Arts and Culture and has worked with prestigious institutions including The British Library, Cardiff University and New York Public Library. Klingemann spoke to Goethe-Institut ahead of the Future U exhibition at RMIT Gallery in Melbourne, where he is due to present his work Mistaken Identity.
You have stated that you are confident that machine artists will be able to create “more interesting work” than humans in the near future. Does this also apply to aesthetics and beauty?
At least in my world model it does. If you drew a Venn diagram of things that are interesting to humans, aesthetics and beauty are surely some of the bigger circles there. But not everything that we consider beautiful is also interesting; beauty on its own can become quite boring – which is what makes "interestingness" so interesting. If beauty were all it took to get our attention, to fascinate us or to evoke our emotions, then creativity would be solved – just add a flower or a pretty face and your work is done.
Creating something that the majority of people will find aesthetically pleasing is technically not difficult. You can make a permutation from a set of typical aesthetic subjects, apply a few well-known composition rules to it and you will have instant beauty – quite likely boring and predictable kitsch. Creating something that is beautiful and interesting at the same time is much harder and, due to the fleeting nature of interestingness, also something that can never be solved once and for all. It has to be attained every time anew.
"Mistaken Identity" consists of three rendered videos, based on the artist's neural glitch technique | © Mario Klingemann
Here in Australia you will show the work “Mistaken Identity”. For this installation you incorporated random elements and glitches, so essentially you are playing with mistakes. Are you sometimes genuinely surprised by the outcomes you generate with the help of neural networks?
Surprise is the essence of interestingness. We experience it if our expectations are not met – either in a positive or a negative way. When we encounter any kind of information or situation, we start making predictions based on what we know: what is likely to happen next? What else is in that picture? What will we read in the next sentence? And if whatever comes next is not what we expected, we are surprised and interested, because this promises us to expand and improve our model of the world so in the future we are better prepared.
In particular in the early phase of working with neural networks, these types of surprises were plentiful, since these models were like a ship that took you to previously unexplored territory. Imagine how it must have felt for the first settlers who came to Australia and encountered the platypus or the kangaroo for the first time – that's how I feel in my artistic experiments. But the longer I travel in these latent spaces, the more I get accustomed to their nature, and just as you probably won't get too excited anymore by spotting another marsupial, getting genuine surprises out of these models now takes effort and time. But it still happens.
AI art is a relatively young genre. Currently we are still in its pioneering phase but sooner or later applications that are easy to use will filter down to the masses. Are you excited or worried about the prospect of AI art going mainstream?
It's the way these things go, but I am happy that I had at least a few years of solitude when this way of creation was not the norm. It's again the crux with interestingness here – if something is so easy to do that everybody can do it, it becomes harder to make something with it that has not already been done by someone or seen by everyone. So, as civilisation starts creeping into these territories in the form of one-click AI art tools, it forces me to look for areas out there that I still consider "wilderness" and to learn more about what it is that we humans find truly interesting and captivating. Right now, words and storytelling are some of the areas that look promising to me.
For your recent work “Appropriate Response” you chose to focus on language and the power of words as your main medium. How was this different to working with images?
The truly fascinating part about the way neural networks work is that underneath everything is numbers. It does not matter if you are dealing with images, sound or words – once you have a way to convert them to numbers, they are all on the same playing field. So "meaning" becomes a location in a multidimensional space and you can measure, manipulate and translate it. In that sense, words behave like pixels and sentences like pictures – or to use another image – letters are like clay which can then be molded into sculptures using similar or even the same techniques that I use to make visuals.
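Klingemann's point that "meaning" becomes a location in a multidimensional space is the core idea behind word embeddings. The sketch below is a toy illustration, not his actual tooling: the four 3-dimensional vectors are invented by hand (real models such as GPT-2 learn embeddings with hundreds of dimensions), but they show how, once words are numbers, closeness of meaning can be measured and meaning can be manipulated with ordinary arithmetic.

```python
import math

# Toy hand-made 3-dimensional "embeddings", invented purely for
# illustration -- real models learn these vectors from data.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.5, 0.9, 0.1],
    "woman": [0.5, 0.1, 0.9],
}

def cosine(a, b):
    """Measure how close two 'meanings' are in the shared space."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# "Manipulating meaning" as arithmetic: king - man + woman ...
target = [k - m + w for k, m, w in zip(embeddings["king"],
                                       embeddings["man"],
                                       embeddings["woman"])]

# ... lands nearest to "queen" in this toy space.
best = max(embeddings, key=lambda word: cosine(embeddings[word], target))
print(best)  # queen
```

The classic king − man + woman ≈ queen analogy above is what "measuring and manipulating meaning" looks like in practice; the same distance measure works unchanged on embeddings of images or sound, which is exactly the level playing field described here.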
One of the big differences is that words are less forgiving than images when it comes to assembling them – images are way more redundant and indefinite and it usually does not matter if a few pixels are "in the wrong spot", whereas a single misplaced letter can already change the entire meaning of a sentence. Which is not necessarily a bad thing since it allows for a lot of surprises. However, the ratio of "neural rubbish" that you get with text is higher and it takes more work to separate the interesting from the mediocre.
One of the systems you have used is GPT-2, which was created by the research lab OpenAI, co-founded by Elon Musk. The lab itself states that there is a real danger of extremist groups misusing the system to generate synthetic propaganda. Are we playing with fire when these powerful tools are released into the community?
With these types of systems it is a little bit like with the coronavirus: yes, there might be a temporary danger due to the nature of how we spread and consume information online nowadays and what we then perceive as reality, mostly through social media and Google searches. If suddenly an information virus comes along that is able to reproduce at an enormous speed and which can trick the information replicators into believing that it is a valuable message, then this might overwhelm those systems and ultimately society. Unfortunately, we tend to believe what we see, in particular if it is set down in black and white, and our social immune system is not yet prepared for neurally optimised attacks that can abuse that.
But I believe that with more exposure to these threats we will develop "herd immunity" and be able to refine our information receptors so we can again distinguish nourishing information from empty phrases.
As an artist, you are interested in handing over more and more control to machines – as a citizen, are you concerned about the level of control big data and algorithms have on your daily life?
Oh absolutely. Not a day goes by when I am not shocked by the naivety or malignancy of certain politicians who try to convince us to give away another piece of our privacy under whatever pretext. Working with machine learning, I know how little information can be required to home in on a certain target or to separate data points into different classes, and how you can use gradient descent to optimise whatever your goal is in order to maximise your gain. So in my daily life I am very suspicious of every attempt that forces me to reveal personal data, and I try to keep control over what I am willing to share and what I am not.
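Gradient descent, which Klingemann mentions as the workhorse for optimising "whatever your goal is", can be shown in a few lines. This is a generic illustration on a toy one-dimensional loss, not tied to any of his actual models: repeatedly stepping against the gradient drives the parameter toward whatever value minimises the objective.

```python
# Gradient descent on a toy loss: find the x that minimises (x - 3)^2.
# At scale, the same loop is what tunes a network toward any chosen goal.

def loss(x):
    return (x - 3.0) ** 2

def grad(x):
    return 2.0 * (x - 3.0)  # derivative of the loss with respect to x

x = 0.0    # arbitrary starting point
lr = 0.1   # learning rate (step size)
for _ in range(100):
    x -= lr * grad(x)  # step downhill along the gradient

print(round(x, 4))  # converges toward 3.0
```

Each iteration shrinks the distance to the minimum by a constant factor (here 0.8), so after a hundred steps the answer is correct to many decimal places; the interesting and contested part in practice is choosing the goal, which is exactly the concern raised in the answer above.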
Mario Klingemann | © Onkaos
Your projects tend to be very technical and require not only coding but also engineering skills. Do you think our more traditional “Kulturtechniken” like drawing and writing are under threat as machines are trained to learn more and more of these skills?
I am not worried about that. Yes, you could say that technology is the natural enemy of tradition, but history also teaches us that it acts more as a transformer or catalyst than a destroyer. Photography did not kill painting, TV did not kill books. What technology does change is the ratio of how many people can make a living from a particular Kulturtechnik. At the same time, it opens the door for others to participate in the system and creates new opportunities for those who are willing to adapt. Which means, for our times, that besides drawing and writing, learning how to code should be seen as one of the most valuable skills to acquire to be prepared for a future in which machines are taking an active role in all our lives.
The reason I am not worried is that one of the important aspects of what makes a work interesting to us is the fact that it was made by a human hand or a human mind. The story behind the work is often just as important as the work itself. And this is something that machines will have a very hard time replicating.
Mistaken Identity consists of three rendered two-hour-long videos. Based on the artist’s “neural glitch” technique, the fully synthesised videos show visualisations generated by generative adversarial networks (GANs). Mistaken Identity will be on display as part of the Future U exhibition at RMIT Gallery in 2021.