Foundation Models

Context Is Not Free, So Stop Feeding the Whole Table

TL;DR for operators: Many tabular foundation models behave like very competent consultants with a mildly expensive habit: they want the entire labelled training set placed in front of them at inference time. That works neatly on small datasets. It becomes rather less charming when the table grows to tens or hundreds of thousands of rows and the model’s attention cost starts behaving like it has discovered compound interest. ...

LoRA’s Rank Excuse Has a Gradient Problem

TL;DR for operators: LoRA is usually sold as a rank-and-cost compromise: train a small low-rank adapter instead of updating the whole model, accept some performance gap, and enjoy the budget meeting. The paper behind SDS-LoRA argues that this explanation is incomplete. The gap is not only because the adapter is low-rank. It is also because standard LoRA can distort the training signal that flows into that adapter.1 ...

Less Label, More Light: What a 3D Microscopy Foundation Model Actually Buys

Microscopy has a labor problem. Not the photogenic kind where a scientist leans into a glowing instrument and discovers the secret architecture of life before lunch. The duller problem is that modern light sheet fluorescence microscopy can produce rich three-dimensional volumes faster than expert teams can label them. Segmentation requires voxel-level masks. Stain classification requires domain knowledge. Restoration needs paired degraded and high-quality images, which nature, unhelpfully, does not always provide in tidy folders. ...

One Pass to Forecast Them All: Toto 2.0 and the Scaling Recipe for Time-Series AI

Forecasting is where machine learning often learns humility. A language model can sound clever while being wrong. A forecasting model has fewer hiding places. Revenue arrives or it does not. CPU saturation happens or it does not. Demand spikes, latency drifts, inventories rot, turbines fail, and the spreadsheet smiles politely before punishing everyone involved. This is why time-series foundation models have been treated with a particular kind of suspicion: useful, interesting, sometimes impressive, but not yet comfortably scalable in the way large language models became scalable. ...

Heart of Scale: Why Bigger ECG Models Don’t Always Beat Better Biases

A hospital does not buy an ECG model because it enjoys leaderboard furniture. It buys one because somebody wants a cheap, reliable signal from a noisy waveform: rhythm abnormality, structural heart disease, ICU risk, mortality risk, maybe a demographic or physiological clue that was not explicitly labeled during pre-training. ...

Context Is Not a Costume: Why Strong Agents Still Fail on Contact

The agent looks ready. Then reality answers back. The current AI-agent story is conveniently simple. Take a powerful foundation model, wrap it in tools, give it a workflow, add a polite system prompt, and call the result “ready for deployment.” Reality, as usual, has poor manners. Two recent arXiv papers examine very different agent settings. One studies whether multimodal AI agents can align their behavior with the cognitive age of child users. The other studies whether behavior foundation models for imitation learning can remain robust when the physical dynamics of an environment shift after training. They do not share a benchmark, a model class, or even the same deployment domain. That is precisely why they are useful together. ...

The Heart of the Model: ECG Foundation Models Need the Right Backbone Before More Data

Cost is not always about size. That is an inconvenient sentence for anyone trying to sell a larger medical foundation model by waving parameter counts like a hospital procurement trophy. In ECG modeling, the expensive question is not simply whether one can pretrain on more recordings. The harder question is whether the model architecture and pretraining task actually match the structure of the signal. ...

Pooling Resources: UniPool and the MoE Budget Nobody Wanted to Audit

Opening: Why this matters now. AI infrastructure has entered its spreadsheet era. Not the glamorous spreadsheet, where revenue projections grow diagonally upward and nobody asks where the assumptions came from. The other spreadsheet: the one where compute cost, memory footprint, inference latency, training instability, and model quality all insist on appearing in the same row. ...

Place Your Experts, Not Your Bets

Opening: Why this matters now. The fashionable version of AI strategy still sounds suspiciously like a gym membership pitch: bigger model, more parameters, more GPUs, more everything. The operational version is less glamorous and much more important: where the computation happens, which parts of the model are actually used, how predictable demand is, and whether the system can turn those facts into lower latency, lower cost, or better decisions. ...
