“I don’t see it as a dystopian future”: An interview with Stephan Baumann on music
Stephan Baumann leads a double life. The artificial intelligence expert from Germany’s DFKI centre is also a keen musician. He spoke to the Goethe-Institut about AI’s future role in music and how musicians are already adapting.
Stephan Baumann could be the coolest music professor you’ll ever meet. The academic-cum-musician works for the German Research Centre for Artificial Intelligence (DFKI) by day and releases electronic music under the name MODISCH in his spare time.
Back in the early 2000s, Baumann co-founded SONICSON, a start-up specialising in music recommendation, an industry that has grown immensely over the last two decades. Since then he has lectured on the technology of AI-driven music recommendation at the German Popakademie, and he now works in Berlin and Kaiserslautern for the DFKI.
His dual identity as an academic and active musician gives him unique insight into the future of music. In conversation with the Goethe-Institut, Baumann talks about current AI-driven music trends, the influence of big tech and the new landscape for musicians.
Spotify has been accused of paying fake bands to create music for its playlists (which it denies), while Shazam now has a record label that uses its user data to sign new bands. Are we currently being played by the big tech companies, or is this new reliance on user data actually empowering music lovers?
It’s hard to stay on track and not get lost in this kind of manipulation, I agree with you. It is a little bit risky, because we now have the possibility not only to have content recommended to us which has been produced for others, but also to have content recommended to us which is made only for us and exactly customised to our tastes. We have seen AI engines being trained on certain musical styles. The music industry will, one day, deliver to me exactly what I want: for instance, the perfect 2020 house tune with disco influences from the 70s and a nice vocal by the former singer of Moloko. This potential is already there.
Is this a futuristic view or is this happening right now?
When it comes to music I’m already experiencing the future. This is because of my profession, not because I can see into the future; we are dealing with cutting-edge, state-of-the-art technology right now in our job. For example, look at OpenAI’s Jukebox project. I was really blown away by what they can generate automatically. With no human interaction, it generates a complete song as audio, with vocals and some fantasy lyrics. There are still some sonic artifacts because of the heavy computation, and I think they need 10 hours to produce 1 minute or so, but it is really awesome.
A sonic artifact is a little noisy disruption. To explain it in a simple way: the scientists go down, down, down to tiny time windows, where they get all the pieces they need to create a certain syllable or note, and then it gets rebuilt over and over again until it sounds like a Beatles singer is singing you a fantasy lyric. But at some points it’s not quite perfect. I would say this technology will be optimised within 10 years, and then it will be used to create the perfect song for you, by machine and not by a musician.
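To make the “tiny time windows” idea concrete, here is a minimal Python sketch, assuming only numpy. It is emphatically not Jukebox’s actual pipeline (Jukebox uses learned VQ-VAE codebooks and autoregressive transformers); it only illustrates the shared idea of cutting audio into small windows, compressing each one lossily and rebuilding the signal, which is where such artifacts creep in.

```python
# Illustrative sketch only: rebuilding audio from tiny, lossily
# compressed time windows, the step where artifacts are introduced.
import numpy as np

SR = 22050          # sample rate in Hz
WIN = 512           # window length in samples ("tiny time windows")
HOP = WIN // 2      # 50% overlap between windows

def lossy_compress(window, levels=32):
    """Crude stand-in for a learned codebook: quantise amplitudes
    to a small number of levels, discarding fine detail."""
    return np.round(window * levels) / levels

def rebuild(signal):
    """Cut the signal into overlapping windows, compress each one,
    then overlap-add them back together."""
    out = np.zeros_like(signal)
    norm = np.zeros_like(signal)
    hann = np.hanning(WIN)
    for start in range(0, len(signal) - WIN, HOP):
        chunk = signal[start:start + WIN] * hann
        out[start:start + WIN] += lossy_compress(chunk)
        norm[start:start + WIN] += hann
    return out / np.maximum(norm, 1e-8)

t = np.linspace(0, 1, SR, endpoint=False)
tone = 0.8 * np.sin(2 * np.pi * 440 * t)   # a clean 440 Hz "voice"
rebuilt = rebuild(tone)
print("residual artifact energy:", np.mean((tone - rebuilt) ** 2))
```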
But the point is: why? The machine doesn’t know the meaning of life or death. For us as musicians, we make music to work through our feelings or just to have a good time, but the machine doesn’t care. So, the point is, why? Why should it make sense to have machines doing the composition for us? I mean, that is a crazy idea. The only use could be that a professional composer could stop composing boring music for the mainstream and do sophisticated avant-garde material instead, to have fun. That is why I still see these things not as a dystopian future but as a very, very advanced musical tool. A drum machine, a sampler, AI-generated music: it’s okay. I’ll still have fun with my music machines, creating my own songs.

Robot band Compressorhead can play a decent song, but are they really feeling it?

Could this technology be a threat to musicians’ livelihoods in the future?
I would say no. If you talk to musicians in the avant-garde scene, or to the young ones, they are already using AI-generated material and they like it; they put their voice to it. Holly Herndon is one of the poster girls: she’s super nerdy and an independent star, and she has a PhD in computer science. She has created an entire universe, pumping out albums and doing tours by blending AI-generated sound and vocals with her own music. To me it is just another facet of their work.
If you look at Bandcamp: a few years ago nobody would have foreseen the rise of this website. We were all talking about Spotify and whether manually created playlists would outperform the recommendation algorithms, and then Bandcamp arrived. For young people it feels like folk music in the 60s again: being closer to the artist and the movement. I find it quite interesting that these counter movements come up. Each time we think, “Oh god, we are lost”, and then some counter movement comes up. Maybe this is human nature? Somehow we still keep resisting the technological singularity.
Do you think that the widespread use of AI-driven music recommendation systems by consumers is leading to more bland, homogenous modern music being created?
Unfortunately, I agree. If I look at Spotify’s top 10 or listen to what my kids are listening to, I hear lots of futuristic hip-hop with autotuned lyrics, power-pop stuff. I don’t know what to call it and, yes, I don’t like it. But what really bothers me is that I cannot distinguish between the songs; I cannot even say if it is Beyoncé or I don’t know who. It all sounds the same to me, and I think the filter bubbles of recommendation systems contribute to this.
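As an illustration of the filter-bubble mechanism Baumann points to, here is a small hypothetical Python sketch of a naive content-based recommender. The catalogue, feature names and values are all invented; real services use far richer signals, but the narrowing effect is the same in spirit.

```python
# Illustrative sketch only, not any real streaming service's algorithm:
# a naive recommender that always surfaces the nearest neighbours of
# what you already like. All feature values are made up.
import numpy as np

# Hypothetical catalogue: each track as [tempo, energy, autotune amount],
# scaled to 0..1.
catalogue = {
    "futuristic hip-hop A": np.array([0.80, 0.90, 0.95]),
    "futuristic hip-hop B": np.array([0.82, 0.88, 0.97]),
    "power-pop single":     np.array([0.78, 0.92, 0.90]),
    "modular synth piece":  np.array([0.55, 0.45, 0.05]),
    "jazz trio recording":  np.array([0.40, 0.30, 0.00]),
}

def rank_recommendations(liked):
    """Rank unheard tracks by cosine similarity to the centroid
    of the user's liked tracks."""
    centroid = np.mean([catalogue[t] for t in liked], axis=0)
    def cosine(v):
        return float(v @ centroid) / (np.linalg.norm(v) * np.linalg.norm(centroid))
    return sorted(
        (t for t in catalogue if t not in liked),
        key=lambda t: cosine(catalogue[t]),
        reverse=True,
    )

# The bubble in action: after liking two chart-alike tracks, the
# near-identical third one tops the list and the jazz trio comes last.
print(rank_recommendations(["futuristic hip-hop A", "power-pop single"]))
```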
But I see some niche counter movements here too, people who want to do things differently. That’s why I would not say that this is a fading curve we have to slide down until we have just one song for the entire population of the earth! But I still wonder about the genuine act of creativity. In the AI world we are working on genuine forms of creativity, but it is very hard. This is hard even for an established musical artist, because they are often asked about their influences and questioned about their authenticity. Where is the point where some crystal pops up and there is an idea of great art which nobody has seen before?
If we can customise music, AI-generated music which is exactly matched to what you have been socialised to, that’s something, I guess. But where does the new factor come in, the creation of something new? I can’t see it so far in AI-generated music or in the algorithms.

Customised music may be possible with AI, says Baumann, but that doesn’t mean it’s creative.

Recently, I’ve been looking into board-game-playing software by Google’s DeepMind. These systems are beating the world champions, but they also train themselves without human help. Previously, these programmes had knowledge about perfect human moves; now they play against each other and perform moves which a human would not make, but which lead to success. This is a bit scary. Garry Kasparov has been analysing this new AlphaZero software and calls it a kind of synthetic creativity: powerful moves that a human would not make. I wonder, if we could transfer this algorithmic technology to music, maybe some new musical movements could be created?
But there is one thing we should remember: with these kinds of board games, it is easy to describe the end goal. You tell the machine, “You are perfect if you reach this state: you have won or you have lost.” In music and art, though, this is open. It is much harder to describe the final goal state. You can’t say, “The perfect song is this, try to create one on your own.” That is not possible.
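Baumann’s asymmetry can be stated almost directly in code. The sketch below is a hypothetical illustration with made-up names: the game’s goal state is a trivially checkable function, while any “perfect song” objective is a stand-in that cannot actually be written down.

```python
# Illustrative sketch of the asymmetry: a board game's end goal is
# trivially checkable, a "perfect song" is not.

def chess_reward(board_state) -> float:
    """For a board game the goal is easy to state: +1 if you have
    won, -1 if you have lost, 0 otherwise. (Checkmate detection is
    stubbed out for brevity.)"""
    if board_state.get("checkmate_against_opponent"):
        return 1.0
    if board_state.get("checkmate_against_self"):
        return -1.0
    return 0.0

def song_reward(audio) -> float:
    """There is no agreed function that scores a song as 'perfect';
    any objective we pick (chart success, harmonic rules, listener
    biometrics) is a proxy, not the goal state itself."""
    raise NotImplementedError("no formal definition of a perfect song")

# A self-play learner can optimise chess_reward directly; for music
# there is nothing well-defined to optimise against.
print(chess_reward({"checkmate_against_opponent": True}))  # 1.0
```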
You have talked in other interviews about AI technology that measures a music listener’s mood and other responses and reacts accordingly. Are we there yet?
We already have the hardware to measure this. Here at the DFKI I am running a project that measures people’s reactions via heart rate, skin temperature and galvanic skin response. We know these factors, and there is some stable hardware now, so you can measure what is going on. But the interpretation is still very, very hard.
We can try to apply machine learning and AI algorithms to it, but it is still tricky to say exactly what is going on. If your heart rate races and your skin temperature changes when a certain musical chord is played, that can still mean different things for different people. So you need really excellent data, from in-depth interviews with the people you measured, asking them what was really going on. And then you have to be able to rely on them to report precisely and objectively about their subjective moods and meanings. I would say this is really the holy grail in my field.
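A toy example of why interpretation, rather than measurement, is the bottleneck: in the hypothetical Python sketch below (fabricated numbers; scikit-learn is assumed to be installed), two listeners produce nearly identical heart-rate and skin readings while reporting opposite moods, so a classifier trained on signals alone is bound to misread one of them.

```python
# Illustrative sketch with invented numbers: the same physiological
# reading can map to different self-reported moods for different
# listeners, which is why the measurements alone are not enough.
from sklearn.neighbors import KNeighborsClassifier

# Features per listening moment: [heart rate (bpm), skin temp (deg C),
# galvanic skin response (microsiemens)]; fabricated example data.
X = [
    [95, 33.1, 8.2],   # listener A, chorus hits
    [96, 33.0, 8.4],   # listener B, same chorus
    [62, 34.0, 2.1],   # listener A, ambient intro
    [64, 33.9, 2.3],   # listener B, ambient intro
]
# Ground truth only obtainable from in-depth interviews afterwards:
# near-identical signals, different reported feelings.
y = ["euphoric", "anxious", "calm", "bored"]

model = KNeighborsClassifier(n_neighbors=1).fit(X, y)
# Confidently labels the new reading "euphoric", which must be wrong
# for whichever listener it actually came from half the time.
print(model.predict([[95, 33.05, 8.3]]))
```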
Learn more about Stephan Baumann's views on the future of creative AI here.