Greedy Enough to Win: When Loss Starts Driving the Learning Rate

Training runs rarely fail with cinematic drama. They do not burst into flames. They simply become expensive, slow, and faintly embarrassing. A fine-tuning job starts with promise, the loss descends, then progress flattens. Another run behaves well for 200 steps, then becomes jumpy after a data shard changes. A third run is rescued by lowering the learning rate, except nobody knows whether the rescue came too early, too late, or by accident. Eventually, the team does what teams do: try cosine decay again, because at least cosine looks mathematically respectable while doing whatever it was going to do anyway. ...

December 17, 2025 · 16 min · Zelina

Fault, Interrupted: How RIFT Reinvents Reliability for the LLM Hardware Era

A chip does not need to fail everywhere to fail badly. A modern AI accelerator is not fragile in the poetic sense. It is not a porcelain teacup trembling on the edge of a desk. It is much more annoying than that. It can run billions of parameters at high throughput, survive ordinary engineering noise, and still contain a few small fault locations where one carefully placed disturbance can turn a capable model into expensive decorative silicon. The problem is not that every bit matters equally. The problem is that a few bits may matter absurdly more than the rest. ...

December 11, 2025 · 17 min · Zelina

Therapy, Transcribed: How LLMs Turn Conversation Into Clinical Insight

A therapist finishes a session. The call ends, the room becomes quiet, and the notes begin. There is the obvious record: what the client said, what the therapist asked, what homework was discussed. Then there is the harder record: what pattern kept returning? Was the client describing low motivation, fear of failure, family obligation, avoidance, self-criticism, or some collision among all of them? And if several patterns appeared, which one might be upstream of the others? ...

December 8, 2025 · 16 min · Zelina

Timeline Triage: How LLMs Learn to Read Between Clinical Lines

Hospital notes are not databases that forgot to wear a spreadsheet costume. They are fragments of care: treatment names, planned cycles, delayed doses, discontinued regimens, relative dates, typos, abbreviations, and the occasional phrase that looks obvious until two clinicians disagree about what it actually means. For oncology, that mess matters. A chemotherapy timeline is not just a historical summary; it is the skeleton of a patient’s treatment journey. Get the timeline wrong, and downstream systems may misunderstand what was given, when it started, when it ended, and whether a patient fits a registry, audit, research cohort, or trial-matching rule. ...

December 7, 2025 · 16 min · Zelina

Heuristics, Meet Your Agents: How Role-Based LLMs Rewire Optimization

Trucks do not care whether your routing algorithm is elegant. They care whether the vehicle arrives, whether the route violates capacity, whether the dispatch plan survives a late order, and whether the whole thing can be recomputed before someone in operations starts calling the system “that AI toy.” Optimization has always lived in this unglamorous place: close enough to mathematics to look pure, close enough to reality to be messy. ...

December 4, 2025 · 17 min · Zelina

Roots of Understanding: When Transformers Try to Learn the Language of Numbers

Numbers look simple until you ask a model to continue them. That is the quiet trap in Testing Transformer Learnability on the Arithmetic Sequence of Rooted Trees. The paper does not ask whether a transformer can chat about prime numbers, recite factorization facts, or hallucinate Euclid with confidence. It asks a cleaner question: if we translate the natural numbers into a symbolic language whose grammar is generated by prime factorization, can a GPT-2-style transformer learn that grammar from sequence data alone? ...

December 2, 2025 · 15 min · Zelina

Forecasting the Forecasters: How Hierarchical LLM Meteorologists Rewrite Weather Reasoning

Weather reports look simple only after someone has already done the hard part. A forecast table can tell you that temperature drops, rain appears, wind direction shifts, humidity stays high, and visibility changes. That is data. A useful report tells you whether this is a mild autumn transition, a tropical shower pattern, a frontal passage, a flood warning, or merely Tuesday being dramatic again. ...

December 1, 2025 · 16 min · Zelina

Persona Non Grata: When LLMs Forget They're AI

A chatbot wearing a lab coat is still a chatbot. That sentence sounds obvious until a system prompt quietly says, “You are a renowned neurosurgeon with 25 years of experience,” and the model responds by inventing medical school, residency, fellowships, board certification, patient cases, and lifelong professional development. Not because anyone explicitly asked it to lie. Not because it lacks the ability to say “I am an AI.” Under neutral conditions, the models in this study almost always do say that. ...

November 27, 2025 · 13 min · Zelina

Tile by Tile: Why LLMs Still Can't Plan Their Way Out of a 3×3 Box

A board game should not embarrass a frontier model. That is the uncomfortable charm of the 8-puzzle. It has no hidden information, no vague user intent, no messy database schema, no ambiguous policy exception, and no client saying “just make it pop.” It is a 3×3 grid with eight tiles and one blank space. Slide adjacent tiles into the blank. Reach the goal state. Done. ...

November 27, 2025 · 15 min · Zelina

Pills, Protocols, and Parameters: When LLMs Sit the Pharmacist Exam

Exam rooms are wonderfully unsentimental. They do not care whether a model has a charming interface, a dramatic launch story, or a fan base that treats benchmark tables like sports scores. They ask a question, demand an answer, and mark it right or wrong. That makes professional licensing exams tempting AI benchmarks. A pharmacist licensure exam, in particular, looks like a clean test of whether a large language model can handle the kind of knowledge society actually cares about: drugs, laws, prescriptions, clinical judgment, and the delicate art of not confidently recommending something dangerous. Minor detail. ...

November 26, 2025 · 15 min · Zelina