Decoding visual brain representations from electroencephalography through Knowledge Distillation and latent diffusion models
🔥INFO
Blog: 2025/07/23 by IgniSavium
- Title: Decoding visual brain representations from electroencephalography through Knowledge Distillation and latent diffusion models
- Authors: Matteo Ferrante, Nicola Toschi (University of Rome Tor Vergata)
- Published: September 2023
- Comment: arXiv preprint
- URL: https://arxiv.org/abs/2309.07149
🥜TLDR: Train only a Conv encoder that maps the EEG spectrogram (derived via STFT) to ImageNet class scores using CLIP knowledge distillation, then use a category-based text template to guide Stable Diffusion generation.
Motivation
This research aims to enhance EEG-based brain decoding by developing an individualized, real-time image classification and reconstruction pipeline, addressing the limitations of prior studies that relied on multi-subject models and produced only low-fidelity visual reconstructions.
Model
Architecture
- Train an EEG-based classifier via CLIP knowledge distillation: a frozen CLIP model provides soft class scores for each stimulus image, and the Conv encoder learns to match them from the EEG spectrogram (see the first sketch below).
- Use a text prompt such as "an image of a \<predicted class>" to guide SD generation starting from random Gaussian noise \(z_T\) (second sketch below).
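A minimal PyTorch sketch of the distillation setup, under assumptions: the encoder depth, STFT parameters, 40-class output, and the exact form of the CLIP teacher signal are illustrative, not the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def eeg_to_spectrogram(eeg: torch.Tensor, n_fft: int = 64) -> torch.Tensor:
    """STFT magnitude spectrogram per EEG channel.
    eeg: (batch, channels, samples) -> (batch, channels, freq, time)."""
    b, c, t = eeg.shape
    spec = torch.stft(eeg.reshape(b * c, t), n_fft=n_fft,
                      window=torch.hann_window(n_fft),
                      return_complex=True).abs()
    return spec.reshape(b, c, *spec.shape[-2:])

class EEGSpectrogramEncoder(nn.Module):
    """Hypothetical Conv encoder from EEG spectrograms to class logits."""
    def __init__(self, eeg_channels: int = 128, num_classes: int = 40):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(eeg_channels, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, spec):                      # (batch, ch, freq, time)
        return self.classifier(self.features(spec).flatten(1))

def distillation_loss(student_logits, teacher_logits, T: float = 2.0):
    """Match the student's class distribution to the CLIP teacher's
    soft scores over class-name prompts (Hinton-style KD)."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * T ** 2
```

At train time, `teacher_logits` would come from a frozen CLIP model scoring each stimulus image against the class-name prompts; only the EEG encoder receives gradients.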
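Once a class is predicted, generation is ordinary text-conditioned sampling. A sketch using Hugging Face diffusers; the checkpoint ID and prompt template are assumptions, not necessarily the paper's exact setup.

```python
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical checkpoint; the paper's exact SD version may differ.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

predicted_class = "golden retriever"        # output of the EEG classifier
prompt = f"an image of a {predicted_class}"

# Sampling starts from random Gaussian noise z_T; only the prompt
# carries EEG-derived information.
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("reconstruction.png")
```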
Evaluation
Performance
Performance is compared against other typical CLASSIFIER architectures.
🤔The last two rows of the paper's comparison table show the benefit of knowledge distillation, but the accuracy gain is VERY LIKELY exaggerated.
🤔A typical LSTM modeling the raw EEG time series directly also achieves decent performance (see the sketch after these notes).
The accuracy bar chart in the paper seems more plausible than the headline table numbers.
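For reference, a minimal LSTM baseline over the raw time series (layer sizes are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

class LSTMEEGClassifier(nn.Module):
    """Baseline: run an LSTM over the raw EEG sequence and classify
    from the final hidden state."""
    def __init__(self, eeg_channels: int = 128, hidden: int = 128,
                 num_classes: int = 40):
        super().__init__()
        self.lstm = nn.LSTM(eeg_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, eeg):               # eeg: (batch, time, channels)
        _, (h_n, _) = self.lstm(eeg)
        return self.head(h_n[-1])         # logits: (batch, num_classes)
```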
🤔Reflections
Class-wise encoding is obviously insufficient for detailed semantic reconstruction.
Other works instead map the EEG signal to the conditional text embeddings and visual latent features used by SD, as sketched below.
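A hedged sketch of that alternative route (all names and dimensions are hypothetical): project an EEG feature vector directly into the token-embedding sequence that SD v1.x cross-attention consumes, bypassing the discrete class bottleneck.

```python
import torch
import torch.nn as nn

class EEGToSDCondition(nn.Module):
    """Hypothetical projection from an EEG feature vector to the
    (77, 768)-shaped embedding sequence expected by SD v1.x."""
    def __init__(self, eeg_dim: int = 64, seq_len: int = 77, ctx_dim: int = 768):
        super().__init__()
        self.seq_len, self.ctx_dim = seq_len, ctx_dim
        self.proj = nn.Sequential(
            nn.Linear(eeg_dim, 1024), nn.GELU(),
            nn.Linear(1024, seq_len * ctx_dim),
        )

    def forward(self, eeg_feat):                  # (batch, eeg_dim)
        return self.proj(eeg_feat).view(-1, self.seq_len, self.ctx_dim)
```

Such embeddings could be fed to a diffusers pipeline via its `prompt_embeds` argument, replacing the text encoder's output.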