
No More Low-Rank Detours: GPart and the Geometry of Fine-Tuning

Adapters are supposed to make fine-tuning simple. A team takes a large pretrained model, freezes most of it, trains a small adapter for customer support, another for invoice extraction, another for compliance review, and so on. The pitch is attractive: less storage, less training cost, faster iteration, fewer excuses from the infrastructure team. Naturally, the adapter becomes the small and tidy object everyone wants to manage. ...

May 26, 2026 · 15 min · Zelina

Pooling Resources: UniPool and the MoE Budget Nobody Wanted to Audit

Opening: Why this matters now. AI infrastructure has entered its spreadsheet era. Not the glamorous spreadsheet, where revenue projections grow diagonally upward and nobody asks where the assumptions came from. The other spreadsheet: the one where compute cost, memory footprint, inference latency, training instability, and model quality all insist on appearing in the same row. ...

May 9, 2026 · 16 min · Zelina

Graph Expectations: Why Context Compression Needs Structure, Not Just Similarity

Opening: Why this matters now. The AI industry has developed a charmingly expensive habit: when models struggle with long documents, we buy them larger windows and pretend the problem has been solved. It has not. Long-context LLMs are useful, but longer context is not the same as better context. A model can accept a very large input and still miss the crucial paragraph buried in the middle, over-attend to duplicated evidence, or lose the argumentative spine of a document. The result is familiar to anyone building AI tools for legal review, finance research, policy analysis, procurement, consulting, compliance, or enterprise knowledge work: the model has “read” everything, yet somehow understands the wrong thing. Very modern. Very expensive. ...

May 1, 2026 · 12 min · Zelina

Merge Without Mayhem: How Orthogonal Deltas Could Revolutionize Model Composition

In the era of foundation models, one challenge looms increasingly large: how to safely, scalably, and reversibly compose AI systems from multiple task-specific fine-tunings. Traditional solutions — from naïve weight averaging to adapter stacking — often create interference, forgetfulness, and compliance nightmares. But a recent paper introduces a promising new direction: Modular Delta Merging with Orthogonal Constraints (MDM-OC). Rather than combining entire model weights, MDM-OC treats each task-specific fine-tuned model as a delta from a shared base. Think of these deltas as compact, focused perturbations that encode only what changed to solve a given task. The twist? Before merging, each delta is orthogonalized — projected into a subspace that doesn’t overlap with others. This creates a modular, mathematically principled structure for interference-free integration. ...
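
To make the orthogonalization step in the excerpt above concrete, here is a minimal sketch. It rests on several assumptions: each delta is flattened to a plain list of floats, Gram-Schmidt stands in for whatever projection MDM-OC actually uses, and the helper names (`orthogonalize_deltas`, `merge_deltas`) are hypothetical, not taken from the paper.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthogonalize_deltas(deltas: List[Vector], eps: float = 1e-12) -> List[Vector]:
    """Gram-Schmidt stand-in (an assumption, not the paper's procedure):
    strip from each delta its components along all earlier deltas."""
    basis: List[Vector] = []   # unit vectors spanning the earlier deltas
    result: List[Vector] = []
    for delta in deltas:
        residual = list(delta)
        for unit in basis:
            coeff = dot(residual, unit)
            residual = [r - coeff * u for r, u in zip(residual, unit)]
        norm = dot(residual, residual) ** 0.5
        if norm > eps:  # a delta that is fully redundant adds no new direction
            basis.append([r / norm for r in residual])
        result.append(residual)
    return result


def merge_deltas(base: Vector, deltas: List[Vector]) -> Vector:
    """Add the orthogonalized deltas onto the shared base weights."""
    merged = list(base)
    for delta in orthogonalize_deltas(deltas):
        merged = [m + d for m, d in zip(merged, delta)]
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0]
    task_a = [1.0, 0.0, 0.0]
    task_b = [1.0, 1.0, 0.0]  # overlaps with task_a along the first axis
    print(orthogonalize_deltas([task_a, task_b]))
    print(merge_deltas(base, [task_a, task_b]))
```

In this sketch the result depends on the order of the deltas: earlier deltas are kept intact and later ones lose whatever they share with them. A real system would need an explicit policy for that ordering.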

August 2, 2025 · 3 min · Zelina