RADAR Member Spotlight .001 | Squid
Stable diffusion, centaur creativity, and personal resilience with @Squid
Welcome to a new series from RADAR, highlighting some of the brilliant folks from inside our community. In this series, we’ll go deep on topics our members have been fascinated by, learning from their experiences and experiments as we dive down the rabbit hole.
Generative AI is, of course, the topic du jour – and the conversation doesn’t seem to be slowing down. Ever since the release of DALL-E 2 set off a landslide of AI siblings, each more impressive than the last, the world (and particularly our corner of Twitter) can’t seem to get enough of this space.
Today we’re talking with Squid, one of RADAR’s resident AI enthusiasts and practitioners. An early adopter, Squid has been a big part of the conversations around AI and art in the Discord, from threads on established players adopting the tech to in-depth technical analyses of how models like Midjourney came to be.
Read on for his take on the landscape, how he’s been experimenting with these new technologies, and the long road ahead for these tools in human creativity stacks.
Squid is a Cultural Strategist by day and moonlights as an AI enthusiast/advocate. You can find some of his experiments with Stable Diffusion and Disco Diffusion here.
RADAR: Let’s start off with the basics. How did you get into generative AI?
@Squid: The driver was my curiosity to try an emerging technology and scratch a creative itch that I hadn’t scratched since I left cocktail bartending and dancing to enter the world of cultural insight and strategy. [Editor’s note – Squid has a very cool and multifaceted background that includes breaking, contemporary dance and a master’s in Dance Research].
It started mid-to-late last year, when I jumped into the world of NFTs and found my way into the Tezos ecosystem. I stumbled upon three artists who use AI in their workflows: Jenni Pasanen, who uses Artbreeder to create textures for her digital paintings; Ivona Tau, who trains custom GAN models on her photography; and Ilya Shkipin, who blends AI outputs with his digital paintings. I started to research GANs, which eventually brought me to methods like diffusion, CLIP, and the synthetic media space at large.
During this exploration, I heard whispers of a then-new model called Midjourney, but I missed the early-access round. I found Disco Diffusion while looking for an alternative. I came across Ethan’s guide and spent a lot of time in the Disco Diffusion Discord, learning and experimenting. I eventually settled on an exploration of adorned fauna and flora.
Since then, I’ve experimented with DALL-E 2, Stable Diffusion and Midjourney. More recently, I’ve been using fine-tuned Stable Diffusion models trained on my Disco Diffusion outputs using Dreambooth. I’m excited to explore style transfer using outputs from various generative models to create an aesthetic that is distinctly “mine”.
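[Editor’s note – Dreambooth is a fine-tuning technique that teaches a text-to-image model a new subject or style from a small set of example images, tying it to a unique token that can then be invoked in prompts.]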
What did your journey in finding the best tools and resources look like? What were some hurdles along the way that others could learn from through your experience?
Ethan’s guide was a blessing for getting my head around Disco Diffusion settings and basic prompting; the rest came from asking questions and experimenting with others in the Disco Diffusion Discord, and from connecting with people on Instagram.
Guides from people like KaliYuga were really strong assets in learning how to prompt more effectively. But the space moves fast, and the wealth of knowledge available now far surpasses what we had access to in the middle of last year.
I started before Stable Diffusion was released, when UIs such as AUTOMATIC1111 and InvokeAI were not around yet. The open-source nature of Stable Diffusion has really changed the game and made things more accessible (though there’s still a lot of work to be done).
I’d encourage folks experimenting to learn about art, aesthetic movements, and ways to describe things – how to write descriptions of scenes to get more of what you want. Getting familiar with the nuances of each tool and with advanced prompting (e.g. weighting, negative prompting, aspect ratios, re-using seeds, init images) is key, too.
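[Editor’s note – to make a few of these concrete: in the AUTOMATIC1111 web UI, for instance, writing a term as (wildflowers:1.4) weights it more heavily, and a separate negative prompt field lists things to steer away from; in Midjourney, parameters like --ar 16:9 set the aspect ratio, while --seed fixes the random seed so a composition can be revisited.]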
The craft of prompting has been such an interesting development to watch in this space, as people build and learn in public with these tools. What’s your approach to bringing your vision to life with the tech?
On a basic level, the syntax I tend to follow for my prompts is:
[medium] [subject] [artist(s)] [details] [image repository support].
With the newer models, referencing artists isn’t always needed, but it’s worth trying. I’d also recommend trying unique combinations of artists, as the hybrid aesthetics can sometimes be far more interesting.
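[Editor’s note – a hypothetical prompt following this syntax might read: “an oil painting of a stag adorned with wildflowers, by Gustave Moreau and Ernst Haeckel, intricate detail, soft morning light, trending on ArtStation”.]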
On that note, let’s go a bit deeper into the tools – the use cases might be similar, but it seems like everyone has their own preferences and experiences with them. For you, what separates DALL-E 2 from Midjourney or Stable Diffusion?
The main tools that I’ve used, from easiest to hardest, are:
DALL-E 2: brand safe and probably the easiest tool for people in my industry, but aesthetically it’s limited.
Midjourney: has a particular flavor/aesthetic, though the integration of Stable Diffusion has changed things a bit.
Stable Diffusion: you need to know a bit more about prompt crafting, but the results can be amazing. Even better if you fine-tune a model with Dreambooth.
Disco Diffusion & Mathrock Diffusion: the hardest to learn, as you have more settings and model combinations, but probably my favorites visually; they lean into the AI-vision aesthetic (vs. the other models, which tend to favor more realism).
There’s been a lot of conversation on the ethics of generative AI, particularly around art and artists. What are your thoughts on the debates around IP, ownership and guardrails when training these models?
These conversations are incredibly complex. From what I understand, a lot of the debate centers around copyright vs. fair use. I have to preface this by saying I’m not a legal professional in any capacity, but I do have some thoughts.
Consent is really important here; I think a big source of the tension with the art community has been the lack of it. The way these datasets are collected and curated needs to include measures that allow an artist to opt in or opt out. While this may not be solved in the near future, I think it would help address some of artists’ concerns.
I’ve also noticed that when people first pick up these tools, they enter a novelty phase and try to bash their favorite celebs, brands or media properties together, or replicate existing aesthetics. So I think it’s key to ask yourself what your intention is. If it’s to make memes or imagery for personal use, I think that’s fine; the tension comes if and when outputs become commercialized.
In the advertising industry, we’ve been told not to reference living artists in our prompts if we choose to use generated assets. Other techniques include using multiple artists to create hybrid aesthetics, or prompting with art genres and movements instead of naming a specific person. While these are not perfect solutions, I do think they represent more considered approaches to using these tools.
There’s certainly a litigious future ahead for these tools – finding the right way to handle intellectual property is going to be crucial. What else do you see as the biggest challenge for generative AI as a commercial tool and a catalyst for human creativity?
The current media discourse sees generative AI through a lens of fear. I think in some ways Terminator and 2001: A Space Odyssey are probably to blame (like how Jaws created a mass fear of shark attacks).
I think one of the big challenges is finding a way to dislodge that lens of fear and help people see the opportunity. This technology can be used really well as part of a larger workflow, be it sowing seeds of inspiration, rapidly generating moodboards without having to google images, or quickly bringing storyboards to life.
I tend to view these tools as augmenting human creativity rather than replacing it. I do think that in industries that rely on photobashing or other digital assets for concepting, this will speed things up and allow people to focus on their ideas. But of course, this narrative comes from a commercial lens, and the concerns voiced by traditional artists are crucial to listen to.
If the past few months are any indicator, this is going to be a breakneck year in AI – and we’re stoked to have you in the RADAR community as we explore its possibilities. Could you tell us a little bit about your background, and how you got to us at RADAR?
It was a combination of work, general curiosity and personal exploration in the web3 space, as I started to move beyond a sole focus on NFTs and DeFi.
I’ve been a big proponent of collective knowledge and meaning-making since my dance school days and post-graduate studies in dance research, so elements of DAOs had always piqued my interest, but it wasn’t until I participated in the Futurethon and read the AFIS report that I felt like I’d found the community I was looking for.
I also wanted to find more like-minded people, as New Zealand is pretty small and I’ve struggled to find people who are interested in the same things as me, on both a personal and professional level. It’s an honor to be here, and getting to know so many amazing humans and industry leaders in the community has been a great experience so far!
We’ve focused this conversation on AI, but we’d love to hear more about your other fascinations. What are a few rabbit holes you’d recommend everyone explore?
A big one is self-inflicted adversity & building mental resilience through endurance sport. This one started when I saw two documentaries about the Barkley Marathons – The Barkley Marathons: The Race That Eats Its Young and the infamous Where Dreams Go to Die.
I was captivated by the zany premise of the race, its director, and the types of people who were drawn to the event. I was fascinated by trying to understand why people race these sorts of events or create their own limit-pushing adventures, and I dove into ultra-running culture as well as the wider world of endurance sport. People undertake these endeavors for all sorts of reasons: to find discomfort in a life otherwise full of comfort and ease, to find personal growth through adversity, and more.
It’s made me curious about starting my own journey into this world of endurance sport, and has also challenged my preconceived notions around aging, in some ways reframing it as a positive thing. Aging isn't all that bad in the context of endurance sport – and often being older can be an advantage.
A couple more rabbit holes to note: I can’t for the life of me remember why or how I stumbled on them, but MRE reviews have been a staple of my content viewing for several years. I highly recommend the creator Steve1989MREinfo, who some have called the Bob Ross of MRE reviews. I’m also a big fan of Corridor Crew’s react series, which looks at the good, the bad, and the ugly of visual storytelling through the lens of the VFX and CG explosion in popular media.
A big thank you to Squid for this chat. Suffice it to say, we’re obsessed with his work and thinking around this space – an enthusiasm for learning while doing and jumping straight into complex rabbit holes is very, well, RADAR. Readers, you can find Squid on Twitter here and follow his AI experiments on Instagram here.
Our Member Spotlight series is curated and written by @Kairon, contributor at RADAR.