Foundation Models

Mutation Impossible? How Multimodal Agents Are Rewriting Glioma Diagnostics

Report First, Diagnosis Second
A medical report usually arrives after the diagnostic work is done. It explains, records, justifies, and sometimes politely hides how messy the evidence really was. This paper asks a more interesting question: what if the report itself becomes a predictive object? In “Multimodal Oncology Agent for IDH1 Mutation Prediction in Low-Grade Glioma”, Hafsa Akebli and colleagues build a Multimodal Oncology Agent, or MOA, for predicting IDH1 mutation status in low-grade glioma using TCGA-LGG data, whole-slide histology, structured clinical variables, genomic context, and external biomedical knowledge sources.[1] The immediate headline is easy enough: the full multimodal setup reaches the best reported performance, with an F1-score of 0.912. ...

Maps, Models, and Mobility: GPT Goes for a Walk

The delivery route is not a sentence
A delivery van does not move like a sentence. It stops. It waits. It turns left because a road exists, not because grammar allows it. Its next point depends on geography, time of day, congestion, driver behavior, business constraints, and occasionally the small civic miracle of a loading bay being available. A language model sees the world as tokens arranged in sequence. A trajectory model sees movement as a sequence too, but the symbols are less polite: latitude, longitude, timestamp, region, point of interest, dwell time, elapsed time, and missing segments. ...

Map Before You Train: Data Cartography to Defuse LLM Memorization

TL;DR for operators
Training data does not become risky only after a model has memorised it. It often leaves signals while training is still happening. That is the useful idea behind Generative Data Cartography, or GenDataCarto: track how each pretraining sample behaves during early training, then use that behaviour to decide which data should be kept, up-sampled, down-weighted, or removed.[1] The method uses two signals. The first is early loss, which approximates how difficult a sample is. The second is the frequency of “forget events”, where a sample appears learned and later becomes poorly fitted again. In the paper’s framing, frequent forget events are not just training noise. They are a warning that a sample may be unusually influential, repeatedly re-entering the model’s attention like a guest who refuses to leave the meeting. ...
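The two signals described in the excerpt can be made concrete with a small sketch. Everything specific here is an assumption for illustration, not the paper's definition: the `warmup` window, the `learned_below` and `hard_above` loss thresholds, the `max_forgets` cutoff, and the toy `triage` policy that maps signals to actions are all hypothetical, as are the loss histories.

```python
from statistics import mean

def early_loss(losses, warmup=3):
    """Mean loss over the first `warmup` checkpoints: a rough difficulty proxy."""
    return mean(losses[:warmup])

def forget_events(losses, learned_below=1.0):
    """Count transitions from 'learned' (loss under threshold) back to 'not learned'."""
    count = 0
    for prev, curr in zip(losses, losses[1:]):
        if prev < learned_below and curr >= learned_below:
            count += 1
    return count

def triage(losses, warmup=3, learned_below=1.0, hard_above=2.0, max_forgets=2):
    """Hypothetical policy: map the two signals to an action label."""
    forgets = forget_events(losses, learned_below)
    if forgets >= max_forgets:
        return "down-weight"   # repeatedly learned and forgotten
    if early_loss(losses, warmup) >= hard_above:
        return "up-sample"     # hard early on, but never forgotten
    return "keep"

# Made-up per-sample loss histories, one value per checkpoint.
history = {
    "easy_stable": [0.9, 0.6, 0.4, 0.3, 0.3, 0.2],
    "hard_stable": [3.1, 2.8, 2.5, 1.9, 1.4, 0.9],
    "oscillating": [1.8, 0.7, 1.5, 0.6, 1.4, 0.5],
}
decisions = {name: triage(losses) for name, losses in history.items()}
print(decisions)
```

With these made-up histories, the oscillating sample crosses the threshold upward twice and is the only one flagged for down-weighting. A real pipeline would need thresholds calibrated against the actual loss distribution rather than fixed constants.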

MoE Money, MoE Problems? FinCast Bets Big on Foundation Models for Markets

TL;DR for operators
FinCast is a finance-specific time-series foundation model that tries to do for market forecasting what large pretrained models did for language: absorb enough diverse data that new tasks require less bespoke engineering.[1] The paper reports strong evidence on forecasting accuracy. In a zero-shot benchmark of 3,632 financial time series and more than 4.38 million scalar time points, FinCast beats general-purpose time-series foundation models on average, with roughly 20% lower MSE and 10% lower MAE. In supervised stock benchmarks, even the zero-shot version beats the listed supervised baselines; lightweight fine-tuning widens the gap further. ...

The Invisible Hand in the Machine: Rethinking AI Through a Collectivist Lens

TL;DR for operators
Users do not experience an AI product as a theorem. They experience it as a bargain. They give data, attention, labour, trust, prompts, feedback, documents, creative work, behavioural traces, and sometimes money. In return, they expect useful output, lower friction, safer decisions, visibility, compensation, privacy, or at least not being quietly turned into unpaid infrastructure. The bargain may be explicit. More often, because apparently we enjoy building planetary-scale systems on implied consent and vibes, it is not. ...

Brains with Gradients: Why Energy-Based Transformers Might Be the Future of Thinking Machines

TL;DR for operators
Energy-Based Transformers are not another prompt trick, reasoning wrapper, or RL-flavoured attempt to make a chatbot show more homework. They change the model’s job. Instead of directly predicting the next token, frame, or image patch in one forward pass, an EBT learns a scalar energy function that scores whether a candidate prediction is compatible with its context. Lower energy means “this fits better.” Inference then becomes optimisation: start with a rough or random candidate, compute the gradient of the energy with respect to that candidate, and iteratively move toward a lower-energy prediction. ...
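The inference loop described in the excerpt is ordinary gradient descent on the candidate rather than on the weights. The sketch below is a toy stand-in, not an EBT: the quadratic `energy` function, the finite-difference gradient, and the `step` and `n_steps` values are all assumptions chosen so the example runs with no dependencies. In an actual EBT the energy would be a learned transformer and the gradient would come from automatic differentiation.

```python
def energy(candidate, context):
    # Hypothetical toy energy: lowest when the candidate equals twice the context.
    return (candidate - 2.0 * context) ** 2

def energy_grad(candidate, context, eps=1e-5):
    # Central finite difference, standing in for autodiff on a learned model.
    up = energy(candidate + eps, context)
    down = energy(candidate - eps, context)
    return (up - down) / (2.0 * eps)

def refine(context, start=0.0, step=0.1, n_steps=50):
    """Start from a rough candidate and descend the energy with respect to it."""
    candidate = start
    trace = [energy(candidate, context)]
    for _ in range(n_steps):
        candidate -= step * energy_grad(candidate, context)
        trace.append(energy(candidate, context))
    return candidate, trace

prediction, trace = refine(context=1.5)
print(f"prediction={prediction:.4f}, energy {trace[0]:.4f} -> {trace[-1]:.2e}")
```

Here the candidate starts at 0.0 and moves toward 3.0, the minimum of the toy energy for a context of 1.5. The design point the excerpt makes is visible in the signature: `n_steps` is an inference-time knob, so spending more optimisation steps is one way to spend more compute on a single prediction.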

The Grammar and the Glow: Making Sense of Time-Series AI

TL;DR for operators
Time-series AI is getting better at recognising patterns across domains: energy demand, ECG signals, traffic sensors, weather readings, equipment logs, and other data streams that behave nothing like nice, polite spreadsheets. Two recent arXiv papers point to a useful combined thesis. The first argues that time-series foundation models work because they learn a kind of “language of time”: recurring temporal patches become motif tokens; motif frequencies follow long-tail patterns; motif sequences show grammar-like constraints.[1] The second tackles the adoption problem: even if a model is accurate, people still need to know why it raised a diagnosis, forecast, alarm, or recommendation. It proposes a hybrid ResNet–Transformer system that fuses local Grad-CAM heatmaps with global attention, then turns salient regions into natural-language explanations.[2] ...

Body of Proof: Why Embodied AI Needs More Than One Mind

TL;DR for operators
A robot that works alone is already expensive, brittle, and rude to your maintenance budget. A group of robots that must work together adds a different class of difficulty: timing, communication, role allocation, shared perception, physical interference, changing team composition, and the occasional human wandering into the scene with a clipboard. ...

Evolving Beyond Bottlenecks: How Agentic Workflows Revolutionize Optimization

TL;DR for operators
Optimization work usually looks technical from the outside: equations, solvers, constraints, tolerances, and someone quietly muttering about convergence. Inside the business, the real bottleneck is often less glamorous. Someone has to decide what the problem actually is, how to formulate it, which algorithm to try, which hyperparameters to tune, and whether the resulting answer is useful or merely mathematically decorative. ...

Weights and Measures: OpenAI's Innovator’s Dilemma

TL;DR for operators
OpenAI’s planned return to open-weight language models is not a charming rediscovery of its founding name. It is a market correction. The useful way to read the move is not “OpenAI becomes open source.” That is too neat, and therefore probably wrong. The more practical reading is this: OpenAI has a premium API and subscription business, but the AI market is increasingly learning to route around premium access when “good enough, controllable, and local” beats “best, metered, and remote.” ...