Opening — Why this matters now
There is a quiet bottleneck in AI that rarely makes headlines: time complexity. While large language models dominate attention, a parallel world of biosignals such as EEG is struggling with something more mundane but more consequential: scale.
EEG data is long, messy, and structurally inconsistent. Transformer-based models, elegant as they are, scale with $O(n^2)$ complexity. That’s tolerable for text. It’s disastrous for continuous brain signals.
The paper LuMamba proposes a subtle but important shift: stop forcing EEG into Transformer-shaped thinking. Instead, redesign the pipeline around linear-time sequence models and topology invariance. The result is not just faster—it changes what “generalization” means in biosignal AI.
Background — Context and prior art
EEG foundation models have followed a predictable trajectory:
| Approach | Core Idea | Problem |
|---|---|---|
| Transformers (e.g., EEGFormer, LaBraM) | Masked modeling, attention across channels/time | Quadratic complexity, memory limits |
| Contrastive SSL (e.g., BENDR) | Learn representations via similarity | Sensitive to dataset structure |
| Topology-aware models (e.g., LUNA) | Map electrode layouts into latent space | Still relies on heavy attention |
| State-space models (e.g., FEMBA) | Linear-time sequence modeling | Lacks topology invariance |
The core tension is clear:
- Transformers: expressive but computationally expensive
- SSMs (Mamba): efficient but structurally naive to electrode variation
And EEG has an additional twist: the input space itself is unstable. Different hospitals, devices, and studies use different electrode configurations. Models trained on one layout degrade on another, sometimes by 2–6%.
So the real problem is not just modeling time—it’s modeling structure that keeps changing underneath you.
Analysis — What the paper actually does
LuMamba is less a new model and more a fusion architecture. It stitches together three previously separate ideas into a coherent system:
1) Topology Invariance (LUNA-style)
Instead of treating EEG channels as fixed inputs, LuMamba projects them into a shared latent query space using cross-attention.
Implication: the model no longer “cares” whether the input has 16 or 26 electrodes; it learns a canonical representation of brain signals, not hardware layouts.
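The mechanism can be sketched in a few lines: a fixed set of learned latent queries cross-attends over however many channel embeddings arrive, producing an output whose shape is independent of the electrode count. This is a minimal numpy illustration of the idea, not the paper’s implementation; the dimensions and the single-head form are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def latent_cross_attention(channel_feats, queries):
    """Project a variable number of electrode channels (n_ch, d)
    onto a fixed set of learned latent queries (k, d)."""
    d = queries.shape[-1]
    attn = softmax(queries @ channel_feats.T / np.sqrt(d))  # (k, n_ch)
    return attn @ channel_feats                             # (k, d): n_ch is gone

rng = np.random.default_rng(0)
queries = rng.normal(size=(8, 32))       # 8 latent slots, dim 32 (toy sizes)
for n_ch in (16, 26, 64):                # different electrode layouts
    feats = rng.normal(size=(n_ch, 32))
    out = latent_cross_attention(feats, queries)
    assert out.shape == (8, 32)          # same latent shape regardless of layout
```

Whatever follows the cross-attention only ever sees the fixed `(8, 32)` latent interface, which is why downstream layers can be layout-agnostic.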
2) Linear-Time Temporal Modeling (Mamba)
The temporal backbone replaces Transformers with bidirectional Mamba (state-space models).
Key property:
$$ \text{Complexity: } O(n) \quad \text{vs Transformer } O(n^2) $$
This is not just a speedup. It changes feasibility:
- Longer sequences become tractable
- Real-time or embedded deployment becomes plausible
- Memory ceilings stop dictating model design
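The source of the $O(n)$ bound is the recurrence itself: a state-space model carries a fixed-size hidden state through the sequence, so each step costs a constant amount of work. Below is a deliberately simplified linear SSM scan; real Mamba uses selective, input-dependent parameters and a hardware-aware parallel scan, neither of which is shown here.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Linear-time state-space recurrence: one fixed-cost update per step,
    so total cost is O(n) in sequence length (vs O(n^2) for full attention)."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:                # single pass over the sequence
        h = A @ h + B * x_t      # state update (constant cost per step)
        ys.append(C @ h)         # readout
    return np.array(ys)

n, d = 1000, 4
A = 0.9 * np.eye(d)              # stable toy dynamics (assumed values)
B = np.ones(d)
C = np.ones(d) / d
y = ssm_scan(np.sin(np.linspace(0, 10, n)), A, B, C)
assert y.shape == (n,)
```

Doubling `n` doubles the work; with attention it would quadruple it.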
3) A Different View on Representation Learning (LeJEPA)
This is where things get interesting.
Most EEG models rely on masked reconstruction. LuMamba adds LeJEPA, which:
- Aligns local and global views of signals
- Regularizes embeddings toward an isotropic Gaussian
In plain terms:
- Reconstruction → structured representations
- LeJEPA → smooth, transferable representations
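The two bullets above translate directly into a two-term objective: an alignment term pulling local-view embeddings toward the global view, plus a regularizer pushing the batch covariance toward the identity. This is a toy sketch of that shape, not LuMamba’s actual loss; the weighting `lam` and the mean-squared forms are assumptions.

```python
import numpy as np

def lejepa_style_loss(local_emb, global_emb, lam=0.1):
    """Toy JEPA-style objective: (1) align local and global view embeddings,
    (2) push the batch covariance toward the identity, i.e. toward an
    isotropic Gaussian. Hypothetical simplification for illustration."""
    align = np.mean((local_emb - global_emb) ** 2)
    z = local_emb - local_emb.mean(axis=0, keepdims=True)
    cov = (z.T @ z) / (len(z) - 1)
    iso = np.mean((cov - np.eye(cov.shape[0])) ** 2)
    return align + lam * iso

rng = np.random.default_rng(1)
local_emb = rng.normal(size=(64, 16))                       # 64 samples, dim 16
global_emb = local_emb + 0.05 * rng.normal(size=(64, 16))   # nearby global views
loss = lejepa_style_loss(local_emb, global_emb)
assert loss > 0
```

The isotropy term is what makes the resulting embedding space “smooth”: no direction is privileged, so downstream tasks inherit less dataset-specific structure.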
The paper’s real contribution is not proposing LeJEPA, but showing how it behaves in biosignal space, where structure and noise are tightly entangled.
Findings — Results with visualization
1) Objective Trade-off: Structure vs Generalization
| Pre-training Strategy | Strength | Weakness |
|---|---|---|
| Reconstruction-only | Clear clusters, strong in-distribution performance | Poor cross-dataset generalization |
| LeJEPA-only | Smooth embeddings, better robustness | Weak clustering, less task-specific signal |
| Combined (LuMamba) | Balanced performance | Slight loss in visual separability |
From Table I (page 4), the combined objective achieves the best overall results, especially on unseen electrode setups.
2) Real Performance (Selected)
| Task | Metric | Result |
|---|---|---|
| TUAB (abnormal detection) | Balanced Accuracy | 80.99% |
| Alzheimer’s detection (APAVA) | AUPR | 0.97 |
| Parkinson’s (TDBrain) | AUPR | ~0.96 |
The Alzheimer’s result is particularly notable: a +20% improvement over reconstruction-only pre-training.
3) Efficiency Gains (The Quiet Killer Feature)
From Figure 2 (page 5):
| Model | FLOPs relative to LuMamba |
|---|---|
| LUNA | 26× |
| LaBraM | 377× |
| EEGFormer | 3,718× |
And more importantly:
- Supports 12× longer sequences before memory failure
This is not an optimization. It is a category shift.
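A back-of-the-envelope cost model shows why the gap widens rather than staying constant. Constants are ignored and the state size `d_state=16` is an assumption, but the scaling tells the story: attention cost grows quadratically in sequence length, the SSM cost linearly, so the ratio itself grows with `n`.

```python
def attention_cost(n, d):
    """Rough FLOP count for full self-attention: O(n^2 * d)."""
    return n * n * d

def ssm_cost(n, d, d_state=16):
    """Rough FLOP count for an SSM scan: O(n * d * d_state)."""
    return n * d * d_state

d = 256  # assumed model width
for n in (1_000, 10_000, 100_000):
    ratio = attention_cost(n, d) / ssm_cost(n, d)
    print(f"n={n}: attention is {ratio:.0f}x more expensive")  # ratio grows with n
```

Under this toy model the advantage at a 10k-sample window is already hundreds of times, and it keeps growing, which is why fixed memory budgets translate into dramatically longer tractable sequences.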
Implications — What this actually means
1) EEG is moving toward “foundation model reality”
Previously, EEG models were dataset-specific tools. LuMamba suggests something closer to:
- Pre-train once on massive unlabeled EEG
- Fine-tune across tasks and hospitals
That’s the foundation model playbook, finally applied properly.
2) Efficiency is becoming a first-class design constraint
Most AI discussions still treat efficiency as engineering detail. This paper disagrees—quietly but firmly.
In domains like healthcare:
- Data is long
- Devices are constrained
- Latency matters
A 300× reduction in FLOPs is not a benchmark win. It is the difference between:
- “Research demo”
- and “deployable system”
3) Representation geometry is now a strategic choice
The LeJEPA vs reconstruction trade-off reveals something deeper:
Good representations are not just about accuracy—they are about how transferable your mistakes are.
- Reconstruction → memorizes structure
- LeJEPA → tolerates variation
The combination is effectively a bias–variance trade-off in latent space.
4) Topology invariance hints at a broader pattern
EEG is just one example of non-stationary input structure.
This idea generalizes to:
- Multi-sensor IoT systems
- Cross-market financial signals
- Multi-source enterprise data pipelines
In all cases, the input schema changes—but the underlying signal doesn’t.
LuMamba’s approach—learn a stable latent interface—is likely to reappear elsewhere.
Conclusion — Quiet revolutions are still revolutions
LuMamba does not introduce a flashy new paradigm. It does something more dangerous: it removes constraints that everyone else quietly accepted.
- Sequence length is no longer the bottleneck
- Electrode configuration is no longer a liability
- Representation learning is no longer one-dimensional
And once those constraints disappear, the entire design space shifts.
In other words, EEG modeling just stopped thinking in squares.
Cognaptus: Automate the Present, Incubate the Future.