Project #3: Aesthetic Selection in AI Image Generation
GitHub Repo
Built With: HPC (Slurm) | Python 3 | PyTorch | Diffusers (SDXL) | Matplotlib | X11 Forwarding
Practical Objective: Create an interactive workshop for the 2025 Envisioning AI at Yale Symposium that applies evolutionary principles to AI image generation.
Learning Objective: Build fluency with high-performance computing and deployment of pre-built generative AI models.
Generative AI as an Evolutionary System
To bridge a conceptual gap between machine learning and evolution, I designed this pipeline to apply selection to the latent space of a diffusion model. The core loop of the exhibit:
- txt2img: I feed in three prompts and generate three images per prompt (nine total), displayed on a large television
  - For example: "a beach at sunset", "a river in a mountain valley", "a futuristic cityscape at night"
- Voting: Symposium attendees use a keyboard to vote for their favorite image in each category
- img2img: Once an image reaches a vote threshold, it is fed into an img2img pass to create three new variants for that prompt (a minimal sketch of the loop follows)
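A minimal sketch of that loop, assuming the SDXL pipelines from `diffusers`; the checkpoint ID and helper names are illustrative, not the exhibit's exact code:

```python
# Illustrative sketch of the core loop: generation 0 comes from txt2img,
# and winners are bred with img2img. The checkpoint ID is an assumption.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

MODEL = "stabilityai/stable-diffusion-xl-base-1.0"  # assumed SDXL checkpoint

txt2img = StableDiffusionXLPipeline.from_pretrained(
    MODEL, torch_dtype=torch.float16).to("cuda")
img2img = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    MODEL, torch_dtype=torch.float16).to("cuda")
for pipe in (txt2img, img2img):
    pipe.enable_xformers_memory_efficient_attention()  # memory-efficient attention

prompts = ["a beach at sunset",
           "a river in a mountain valley",
           "a futuristic cityscape at night"]

# Generation 0: three candidates per prompt, straight from text (nine total).
gallery = {p: txt2img(prompt=p, num_images_per_prompt=3).images for p in prompts}

def next_generation(prompt, winner, strength=0.5):
    """Breed three variants of the winning image; `strength` is the denoising
    strength, which plays the role of mutation rate (see below)."""
    return img2img(prompt=prompt, image=winner,
                   strength=strength, num_images_per_prompt=3).images
```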
As the loop continues, we theoretically generate increasingly pleasing images (at least, according to the crowd). We can map the parameters directly to evolutionary principles:
- Genotype and Inheritance: The seed image acts as the genetic code. We use `img2img` generation to pass phenotypic traits to the next generation.
- Mutation Rate: The denoising strength functions as the mutation rate. Too low and the image is visually identical; too high and the image loses its lineage.
- Selection Pressure: Symposium attendees act as the environment. Through the voting interface, they apply selective pressure, determining which phenotypes survive to reproduce (see the sketch after this list).
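To make the mapping concrete, here is a hedged sketch of the selection step. It builds on the loop sketch above (`gallery`, `next_generation`), and `read_vote` is a hypothetical stand-in for the exhibit's keyboard handler:

```python
# Hypothetical selection step: tally keyboard votes until one image in a
# category crosses the threshold, then let that phenotype reproduce.
from collections import Counter

VOTE_THRESHOLD = 5  # assumed survival cutoff, not the exhibit's real value

def selection_round(gallery, read_vote):
    votes = Counter()
    while True:
        prompt, idx = read_vote()          # attendee keypress -> (category, image index)
        votes[(prompt, idx)] += 1
        if votes[(prompt, idx)] >= VOTE_THRESHOLD:
            winner = gallery[prompt][idx]  # the surviving phenotype
            # strength ~0.2 yields near-clones; ~0.8 loses the lineage
            gallery[prompt] = next_generation(prompt, winner, strength=0.5)
            return
```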
HPC Architecture and Headless Interaction
Generating multiple Stable Diffusion XL images in real time requires more VRAM than standard local hardware provides. I deployed this workflow on Yale’s McCleary High-Performance Computing cluster, which presented unique engineering challenges around resource allocation and interactive visualization in a headless environment.
While I built the backend, YCRC's Sam Friedman helped me set up X11 forwarding so attendees had an interactive display:
- Backend (Compute): A Python script using `accelerate` and `xformers` runs the heavy tensor operations and memory-efficient attention on A100 GPU nodes, implementing the core logic described above.
- Frontend (Visualization): A lightweight Matplotlib viewer script uses X11 forwarding over SSH to project the generated gallery from the headless cluster onto the local displays, enabling real-time audience feedback (a minimal viewer sketch follows).
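A sketch of what that frontend can look like, assuming an interactive Matplotlib backend is available over the X11 tunnel (`ssh -X`); the layout and function name are illustrative:

```python
# Hypothetical viewer: draws the 3x3 gallery in a window that X11 forwards
# from the headless cluster to the local display.
import matplotlib
matplotlib.use("TkAgg")  # an interactive backend that works over X11
import matplotlib.pyplot as plt

def show_gallery(gallery):
    """Render the current population as a 3x3 grid, one prompt per row."""
    fig, axes = plt.subplots(3, 3, figsize=(12, 12))
    for row, (prompt, images) in zip(axes, gallery.items()):
        row[0].set_title(prompt, loc="left")
        for ax, img in zip(row, images):
            ax.imshow(img)
            ax.set_axis_off()
    fig.tight_layout()
    plt.pause(0.1)  # non-blocking draw so generation can continue in the loop
```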
The biggest engineering hurdle was dependency management. Getting the "shifting sands" of modern AI libraries to play nicely with the rigid environment of an HPC cluster was a crash course in dependency hell.