German artist Boris Eldagsen has sent shockwaves through the world of photography by winning one of the prestigious annual Sony World Photography Awards with his work “The Electrician.” The image is part of his Pseudomnesia series, and he created it exclusively using artificial intelligence. He entered the “creative” category to test whether the organization had done its homework regarding AI-generated images. It turned out they hadn’t. Eldagsen managed to dupe the organization, as they weren’t aware of the photo’s origins. Eldagsen describes the events in detail on his website. After he rejected the award, the SWPA issued a statement.
Eldagsen sparked an important discussion intended to push prize organizers to create separate competitions for AI-generated images sooner.
In our interview, he shares his thoughts on approaching AI within the photo industry and gives tips for leveraging the new technology.
What feedback have you received since you rejected the award and caused controversy within the photo industry?
Boris Eldagsen: I have yet to see all the comments posted on social media, but I read every single message that reaches me, and 90 percent of them are positive. That surprised me because I have been communicating this topic in various presentations within the creative industry in Germany for over half a year now. There was also some criticism from professional photographers, and my engagement with AI was not always well received by the community.
AI generates many existential fears among professional photographers. With what mindset could they best approach it?
Boris Eldagsen: Pandora’s box has been opened, and I recommend approaching the topic with curiosity, trying to individually find out how to integrate AI into one’s workflow to perform tasks more quickly and cost-effectively. The speed at which the technology is developing has been continuously accelerating since August 2022, like a Big Bang that speeds up in all directions. Since January of this year, I have been dealing with nothing else and noticed how challenging it is to keep up with the pace.
How could photographers get started with AI?
Boris Eldagsen: There are many different ways to generate images with AI. One is with language, through text prompts, and another is using pre-existing photos that are then assembled, remixed, and altered. There is also an interface where you work with both language and a visual prompt. That’s the basis. Text prompts offer many possibilities and can consist of up to eleven layers.
They include details about the references, such as the medium you want to imitate, and technical parameters, such as aperture and lens. But there are also visual aspects; for example foreground, middle ground, background, and perspective. What are the lighting conditions and colors? What references are there to other artists? That is usually called style, but I see these references in a larger context, as they can be anchored to certain art historical epochs. Putting all these components together in a prompt is a craft that allows you to create something new.
What was the learning curve like for controlling all these parameters with words?
Boris Eldagsen: I just started experimenting. I have always relied on my imagination to create images in my photographic practice. Often, the challenge for me was to capture my ideas in matter. Now I work exclusively with immaterial knowledge, depending on my imagination and the visual and technical expertise I have acquired. It’s a real liberation. That’s why I see AI more as a knowledge amplifier that allows me to bring my experience as a photographer of over 30 years into the technology.
And when was there ever a technological revolution in which the older generation could benefit the most? That’s also what I convey in my courses to other photographers. All participants bring their unique expertise and can use it to their advantage when working with AI. So you’re not simply replaced by a soulless machine; instead, AI helps you get even more out of the expertise and experience you already bring.
How can photographers best utilize the potential of this new technology?
Boris Eldagsen: Above all, through experimentation and creative use of language. Because no matter what you input, the AI always delivers a result. Platforms often suggest boring things like “American football player painted by Van Gogh” or “Pizza painted by Van Gogh.” To make it more creative, I could instead input “Van Gogh photographed by a pizza” and see what comes out. I can also nest sentences so that subject and object can no longer be logically resolved.
That has something Dadaistic to it. I type in something absurd and see what comes out.
Boris Eldagsen: And if I use a platform with many manual parameters that I can adjust, then a previously generated aesthetic is repeatable. You can create a formula for a new aesthetic appearance, and this is a creative achievement.
What platforms are you currently experimenting with, and which ones do you recommend to photographers?
Boris Eldagsen: I recommend starting with DALL·E 2, then Midjourney, Stable Diffusion, and finally, open-source platforms. While DALL·E 2 is still at the level of August 2022, Midjourney has released five new versions in the meantime, which produce very photorealistic results and are more complex to use. It runs on a Discord server, widely used in the gaming industry, which initially scares off some people. But in the future, there will also be a browser-based version. And then, there are numerous open-source providers. Last week there were 3,100, all with a corresponding aesthetic and orientation. And there are over 200 parameters that can be preset.
That sounds overwhelming…
Boris Eldagsen: Stable Diffusion has new features every week and has become much easier to use. You don’t need a fast computer or a perfect graphics card. You just let it run on Google Colab, which means you use the computing power of the platform. At the moment, I also have Adobe Firefly in mind, which claims to be the first platform to compile training materials ethically, i.e., from their stock photo community and copyright-free materials. It is excellent at photorealism, but the results still have a stock-photo feel. I think, in the future, you will switch back and forth between different platforms. You combine multiple platforms based on their strengths. And everything I generated in the past year, I could feed back in as additional working material.
With “The Electrician” from the series “Pseudomnesia,” Boris Eldagsen won the creative category at the Sony World Photography Awards.
You touched on the copyright debate and mentioned that some platforms use ethically curated material. Why was it possible for AI platforms to commit copyright infringement in the first place without any consequences?
Boris Eldagsen: The current legal situation allows non-profit organizations to compile and use material on the internet for scientific purposes, as long as it is not behind a firewall. That means that both OpenAI and Stable Diffusion have established a non-profit subsidiary, enabling them to exploit this legal loophole. The training material is thus compiled and used in the subsidiary. The result of this research work is something from which the parent companies have made money. It is still legally covered but not ethically fair for those whose materials are used. There are currently numerous legal proceedings in this regard.
In another interview, you mentioned that the biggest problem in the copyright debate is how users of the platforms handle third-party material.
Boris Eldagsen: I think there will soon be a regulation about the use of training materials. But the platforms also offer numerous options to generate images by blending existing material from other photographers and artists; Midjourney in particular thrives on this option. I upload two photos, and the AI combines them into a new one in seconds. The Pope in the Balenciaga puffer jacket is such an example. I only need a picture of the Pope and the jacket, and Midjourney creates a deceptively realistic montage from them. And these functions are very hard, if not impossible, to control.
Users can also photograph a visual work with their smartphone camera, upload it to an AI platform, and distort it until the original is no longer recognizable. That makes it very hard to prevent copyright infringements. And even if there is a regulation that no longer allows specific works to train the AI, a very individual style can still be described by ChatGPT in words and then used as a text prompt in an AI image tool to emulate the style. Realistically, preventing others from using one’s image material will be impossible. Therefore, photographers must discover how they can still benefit from the technology.
For the next generation of photographers, an existential question arises: Is studying photography even worthwhile? Many young people interested in this career path are still determining whether they can launch a successful career with a photography degree.
Boris Eldagsen: I have also had this discussion with a private university in Germany, and the head asked himself: Do we now have to rename ourselves as a university if we start incorporating AI into our curriculum? Would it be better to distance ourselves from the term photography? I have given numerous lectures and workshops on AI in recent weeks at various universities. I can only hope that educational institutions will react early and integrate it into their curriculum.
But doesn’t that contradict the notion that photography and AI-generated images should be two separate art forms?
Boris Eldagsen: The relationship is complex. First and foremost, it’s essential that terms like “AI photography” not be used. I can also create visuals that look like drawings or paintings with image generators. Therefore, we need new terminology, and the most interesting suggestion from the community so far has been “promptography.” We then distinguish between promptography and photography. The commonality lies in the photographic language, which has become independent and leads a life of its own.
Whether it makes sense to combine the two areas under one roof must be addressed in an open discussion. And I would like the respective players in the photography world to do this work and draw up a position paper, just as we did in Germany with the individual associations and the Photo Council. I also rejected the Sony World Photography Award because they were unwilling to work out this position for themselves and then communicate it clearly. They always avoided clear statements on the subject and left me hanging. After I repeatedly insisted, they agreed to address the topic of AI and photography in an interview with me on their blog, but I never heard back from them. AI-generated images and photographs should not compete against each other in competitions. That does not mean there cannot be joint exhibitions under one roof in the future, but they should be shown as two separate artistic styles.
It’s also interesting that you have entered such competitions before with photographs, yet it is a picture generated by an AI that has received so much attention and struck a chord with the jury. What makes the image so strong?
Boris Eldagsen: The picture is powerful and meets all the requirements a work of art should have for me. The question of what the artist wants to express with it is not the most important thing. For me, a picture is an impulse to take a personal journey inward and to deal with one’s thoughts, memories, and feelings. The better question would therefore be what the artwork evokes in me and why? It’s a process of self-reflection that arises from looking at the picture. I have received very moving responses in reaction to the image. An Italian man told me it reminded him of his mother, who had developed dementia. He saw his mother in the woman in the picture. It’s great that it creates a connection from person to person and can mean something individual to everyone.
In a previous interview, you expressed concern about photojournalism for obvious reasons. How do you see its future?
Boris Eldagsen: Photojournalism is being attacked from several sides. One is through the fake images that anyone can produce. With the blend option of Midjourney it’s incredibly simple. How do we differentiate these images from the real ones in the future? One possibility suggested by the photojournalists of “Freelens” is to label them with A for authentic, M for manipulated, and G for generated.
However, implementing this system in practice requires immense resources. I recently spoke to a picture editor at the Spiegel publishing group who said that verifications are becoming increasingly expensive. Small publishers, therefore, need help managing this. The question is how a democratic state can set up a structure that enables the press to carry out these verifications without interfering with press freedom.
We also need to talk about how photojournalists already have a hard time. I recently heard from a photographer that for a SPIEGEL cover featuring the German Chancellor, they would receive only 600 euros. To do this job, many have to take on additional commercial jobs, which will continue to decline through the possibilities of AI, in addition to increased competition. Photojournalism is, therefore, under attack from several sides. And as much as the technology excites me as an artist, it also worries me as a citizen.
This interview has been edited for length and clarity.