Motivation Is Something Your Models Need: When Curiosity Becomes a Training Strategy

Training budgets are where elegant architecture slogans go to be audited. The usual response to a model that needs better accuracy is painfully familiar: make it larger, train it longer, feed it more data, and then pretend the GPU bill is a philosophical problem. The paper Motivation Is Something You Need takes a more interesting route. It asks whether a model needs to be large all the time, or whether extra capacity can be activated only when training signals suggest the model is “getting somewhere.”[1] ...

February 25, 2026 · 16 min · Zelina

When ERP Meets Attention: Teaching Transformers to Pack, Schedule, and Save Real Money

Furnace loading is not the glamorous side of artificial intelligence. No one gives a keynote about choosing which pile of titanium scrap should enter an induction furnace. Which is precisely why it is useful. The paper Enterprise Resource Planning Using Multi-type Transformers in Ferro-Titanium Industry applies a Multi-Type Transformer, or MTT, to two classic combinatorial optimization problems: the Knapsack Problem (KP) and the Job-Shop Scheduling Problem (JSP). It then pushes the method into a real manufacturing allocation case: selecting raw materials for a ferro-titanium furnace batch.[1] ...

January 31, 2026 · 14 min · Zelina

When SGD Remembers: The Hidden Memory Inside Training Dynamics

Reset Is the Most Honest Experiment

Resetting an optimizer sounds boring. It is the kind of engineering operation that hides inside training scripts, not the kind of thing that gets people excited at conference coffee breaks. But in this paper, reset becomes a scalpel. The authors ask a deceptively simple question: when a neural network receives the same next training intervention, does that intervention behave the same way regardless of what just happened before?[1] In a tidy Markovian story, the answer should be yes, at least once the relevant state is specified. In practical training, the answer is more inconvenient. Momentum buffers, batch overlap, augmentation choices, and short update histories can all make yesterday’s path leak into today’s update. ...

January 26, 2026 · 15 min · Zelina

Learning to Discover at Test Time: When Search Learns Back

A leaderboard usually treats an AI model like a very fast intern: give it a problem, let it try many times, keep the best answer, and politely ignore the graveyard of failed attempts. That is useful. It is also a little strange. A human engineer does not merely try 25,600 variations of a GPU kernel while keeping the same brain. After the first few failures, she learns which bottlenecks matter. After a lucky partial success, she changes how she thinks about the problem. After enough attempts, the search process is no longer just sampling. It has become learning. ...

January 24, 2026 · 18 min · Zelina

Learning the Fast Lane: When MILP Solvers Start Remembering Where the Answer Is

Queue. That is the least glamorous word in enterprise optimization, which is probably why it matters. A mixed-integer linear programming solver does not usually fail because it lacks mathematical dignity. It fails because the search tree becomes too large, the clock keeps running, and some poor planning system is still deciding which facility to open, which order to allocate, which truck route to approve, or which resource schedule to release before Monday morning starts behaving like Monday morning. ...

January 23, 2026 · 17 min · Zelina

Greedy, but Not Blind: Teaching Optimization to Listen

Budget meetings have a familiar rhythm. Someone brings the spreadsheet. Someone brings the map. Someone else brings the sentence that ruins the spreadsheet: “This district looks inefficient on paper, but the roads are worse than the data says.” Classical optimization knows what to do with numbers. It does not naturally know what to do with that sentence. In public health planning, infrastructure rollout, retail site selection, and ESG investment, those sentences are often where the real institutional knowledge lives. Unfortunately, once the sentence enters the room, the algorithm usually leaves through the back door. Or worse, the organization pretends the sentence has been “encoded” into a weight, because apparently all human judgment becomes rigorous once it is multiplied by 0.37. ...

January 19, 2026 · 14 min · Zelina

Probe, Then Commit: Why Solver Tuning Finally Grew Up

Planning is where business software goes to meet reality. A factory needs a schedule. A logistics team needs routes. A utility company needs network decisions. A hospital needs staff allocation. The model is elegant, the constraints are clear, and then the solver quietly asks the question nobody put in the PowerPoint: ...

January 19, 2026 · 13 min · Zelina

Redundancy Overload Is Optional: Finding the FDs That Actually Matter

Tables have a talent for pretending to be tidy. A customer table may have fifty columns. A transaction table may have a hundred. A log table may contain derived fields, timestamps, status codes, copied identifiers, normalized labels, and a few columns that nobody remembers creating but everybody is afraid to delete. Then a data profiling tool arrives, dutifully discovers functional dependencies, and returns several hundred thousand “valid” relationships. ...

January 18, 2026 · 19 min · Zelina

Lean LLMs, Heavy Lifting: When Workflows Beat Bigger Models

Seats are not just seats. For an airline, a seat can be sold as a cheap restricted fare, a flexible economy fare, or not sold at all. A passenger who cannot buy one fare may upgrade, switch flights, or disappear into a competitor’s booking funnel. Multiply that across routes, departure times, fare classes, demand segments, aircraft capacity, and network balance rules, and the innocent phrase “optimize ticket sales” becomes a fairly effective trap for language models. ...

January 15, 2026 · 12 min · Zelina

Speculate Smarter, Not Harder: Hierarchical Decoding Without Regret

Speed is the polite word. Cost is the less polite one. Every production LLM system eventually meets the same boring villain: the target model must generate tokens one after another, and each forward pass is expensive. Speculative decoding was supposed to soften that problem. Let a cheaper draft model run ahead, ask the expensive model to verify the draft, and accept several tokens per target-model call when the draft is good enough. Simple. Elegant. Almost suspiciously useful. ...

January 12, 2026 · 16 min · Zelina