A team of Japanese researchers led by Yu Takagi and Shinji Nishimoto from Osaka University’s Graduate School of Frontier Biosciences used the Stable Diffusion AI model to generate images of what people were seeing based on their brain activity, using fMRI scans as input.
The researchers relied on a latent diffusion model (LDM), Stable Diffusion, to decode fMRI data from human brains and reconstruct the subjects’ visual experiences. According to the researchers, a few earlier studies had produced high-resolution image reconstructions, but only after training and fine-tuning generative models. That approach is limited because complex model training is difficult and neuroscience datasets offer relatively few samples. According to the paper, no other researchers had used diffusion models for visual reconstruction prior to this study.
The researchers showed subjects a series of images and performed fMRI (functional magnetic resonance imaging) scans of their brains while the subjects focused on each image. The final image is built from two decoded components: an image latent decoded from fMRI signals in the early visual cortex, which is degraded with noise via the diffusion process, and text representations decoded from fMRI signals in the higher visual cortex, which condition the denoising step that produces the final reconstructed image.
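In the paper, the mapping from brain activity to Stable Diffusion’s two latent spaces is learned with simple linear models; the sketch below illustrates that idea with ridge regression. The array shapes, placeholder data, and variable names here are illustrative assumptions, not the study’s actual data or preprocessing.

```python
# A minimal, hypothetical sketch of the decoding step described above.
# Placeholder random arrays stand in for preprocessed fMRI data and for
# Stable Diffusion's image latents (from its VAE) and text embeddings.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_trials = 100
fmri_early = rng.standard_normal((n_trials, 500))    # early visual cortex
fmri_higher = rng.standard_normal((n_trials, 800))   # higher visual cortex
z_train = rng.standard_normal((n_trials, 4 * 64 * 64))   # image latents
c_train = rng.standard_normal((n_trials, 77 * 768))      # text embeddings

# Only these lightweight linear decoders are fit -- Stable Diffusion
# itself is never trained or fine-tuned, which is the paper's key point.
z_decoder = Ridge(alpha=1.0).fit(fmri_early, z_train)
c_decoder = Ridge(alpha=1.0).fit(fmri_higher, c_train)

# At test time, decode both latents from a new scan (first trial reused
# here purely as a stand-in for held-out data).
z_hat = z_decoder.predict(fmri_early[:1])    # coarse image latent
c_hat = c_decoder.predict(fmri_higher[:1])   # semantic (text) conditioning
```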
This simple framework uses Stable Diffusion to reconstruct high-resolution images from fMRI signals, eliminating the need to train or fine-tune complex deep generative models. Stable Diffusion’s text-to-image conversion process incorporates the semantic information expressed by the conditioning text while retaining the appearance of the original image.
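To make that reconstruction step concrete, here is a hedged sketch of how a noised image latent can be denoised under decoded text conditioning, using the Hugging Face diffusers library as a stand-in for the paper’s Stable Diffusion setup. The checkpoint choice, tensor shapes, and step counts are assumptions for illustration, not the paper’s exact configuration.

```python
# Hypothetical reconstruction sketch: noise the decoded image latent
# partway through the diffusion process, then denoise it back while
# conditioning on the decoded semantics -- the "retain appearance,
# inject meaning" behavior described above.
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DDIMScheduler

model_id = "runwayml/stable-diffusion-v1-5"  # illustrative checkpoint
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")

# Reshape the decoded latents (z_hat, c_hat from the sketch above) to the
# shapes Stable Diffusion expects; shapes are assumptions.
z = torch.tensor(z_hat, dtype=torch.float32).reshape(1, 4, 64, 64)
c = torch.tensor(c_hat, dtype=torch.float32).reshape(1, 77, 768)

# Add noise up to a mid-strength timestep, image-to-image style.
scheduler.set_timesteps(50)
start = scheduler.timesteps[25]
z = scheduler.add_noise(z, torch.randn_like(z), start.unsqueeze(0))

# Denoise the remaining steps, conditioned on the decoded text latent.
for t in scheduler.timesteps[25:]:
    with torch.no_grad():
        noise_pred = unet(z, t, encoder_hidden_states=c).sample
    z = scheduler.step(noise_pred, t, z).prev_sample

# Decode the final latent back to pixel space with the VAE.
with torch.no_grad():
    image = vae.decode(z / vae.config.scaling_factor).sample
```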
The sources for this piece include an article in Vice.