Deep-Learning

The Viscosity Budget: Why Softmax Is Not Just a Knob

TL;DR for operators: A new paper by Jose Marie Antonio Miñoza, Erika Fille T. Legara, and Christopher P. Monterola argues that a log-sum-exp neural layer is not merely analogous to a viscous Hamilton-Jacobi equation. Under the paper’s parameterisation, it is exactly the Hopf-Cole solution of one, evaluated at the input point.[1] The operational point is not “neural networks are physics now”, although someone will certainly try to put that on a slide. The point is cleaner: one parameter, $\varepsilon$, simultaneously controls softmax temperature, PDE viscosity, and entropy-regularised convex optimisation. That makes smoothness, expressiveness, robustness, attribution sharpness, and scaling behaviour mathematically coupled. ...
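To make the role of $\varepsilon$ concrete, here is a minimal sketch of a temperature-scaled log-sum-exp and its gradient, the softmax. It assumes the common form $\varepsilon \log \sum_i \exp(x_i / \varepsilon)$; the paper's exact parameterisation is not reproduced in this excerpt, and the function names `smooth_max` and `softmax` are purely illustrative.

```python
import math


def smooth_max(xs, eps):
    """Temperature-scaled log-sum-exp: eps * log(sum(exp(x / eps)))."""
    m = max(xs)  # subtract the max first for numerical stability
    return m + eps * math.log(sum(math.exp((x - m) / eps) for x in xs))


def softmax(xs, eps):
    """Gradient of smooth_max with respect to xs: softmax at temperature eps."""
    m = max(xs)
    weights = [math.exp((x - m) / eps) for x in xs]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0]  # hypothetical inputs, for illustration only
    for eps in (5.0, 1.0, 0.1, 0.01):
        value = smooth_max(scores, eps)
        probs = softmax(scores, eps)
        print(eps, round(value, 4), [round(p, 3) for p in probs])
```

As `eps` shrinks, `smooth_max` approaches the hard maximum and the softmax weights collapse onto the largest input; as it grows, the weights flatten towards uniform. That is the sense in which a single knob trades sharpness against smoothness.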

Beyond Accuracy: When Forecasts Meet Cash Flow

Inventory is the moment when a forecast stops being a spreadsheet exercise and starts costing money. A demand model can look elegant in validation. It can shave RMSE by a few decimals, win a leaderboard, and make the data science team briefly feel like civilization has advanced. Then the warehouse over-orders slow-moving stock, the store misses fast-moving items, and the finance team discovers that “better accuracy” is not the same thing as better cash flow. ...

When One Heatmap Isn’t Enough: Layered XAI for Brain Tumour Detection

Diagnosis has a simple business problem hiding inside a clinical one: nobody wants a black box that is confident for the wrong reason. That is especially true in medical imaging. A brain MRI classifier that says “tumour” or “non-tumour” is not automatically useful because it crosses a respectable accuracy threshold. The difficult question comes next: did the model look at the clinically relevant region, or did it discover some convenient artefact in the image pipeline? A single heatmap may answer that question. It may also merely look persuasive, which is not quite the same thing. Medicine, regrettably, is one of those domains where aesthetic confidence is still not a validation method. ...

Learning the Fast Lane: When MILP Solvers Start Remembering Where the Answer Is

Queue. That is the least glamorous word in enterprise optimization, which is probably why it matters. A mixed-integer linear programming solver does not usually fail because it lacks mathematical dignity. It fails because the search tree becomes too large, the clock keeps running, and some poor planning system is still deciding which facility to open, which order to allocate, which truck route to approve, or which resource schedule to release before Monday morning starts behaving like Monday morning. ...

Clustering Without Amnesia: Why Abstraction Keeps Fighting Representation

A customer database looks harmless until someone asks for “natural segments.” Then the ritual begins. Export the data. Pick a clustering algorithm. Reduce the dimensions. Make a pretty 2D plot. Give each blob a name. “Premium convenience buyers.” “Budget explorers.” “Dormant loyalists.” Everyone nods, because blobs are comforting. Business strategy has survived on worse. ...

When Prophet Meets Perceptron: Chasing Alpha with NP‑DNN

Accuracy is a dangerous word in finance. It sounds clean. It fits nicely into a slide. It makes a model feel disciplined, measurable, and almost adult. A stock-prediction system with accuracy above 90% sounds like something a hedge fund would guard behind three NDAs and a biometric door. That is exactly why we should slow down. ...

Greedy Enough to Win: When Loss Starts Driving the Learning Rate

Training runs rarely fail with cinematic drama. They do not burst into flames. They simply become expensive, slow, and faintly embarrassing. A fine-tuning job starts with promise, the loss descends, then progress flattens. Another run behaves well for 200 steps, then becomes jumpy after a data shard changes. A third run is rescued by lowering the learning rate, except nobody knows whether the rescue came too early, too late, or by accident. Eventually, the team does what teams do: try cosine decay again, because at least cosine looks mathematically respectable while doing whatever it was going to do anyway. ...

From Trendlines to Transformers: DeepSupp Redefines Support Level Detection

TL;DR for operators: Support levels are usually treated as chart objects: a line, a zone, a Fibonacci retracement, a moving average, perhaps a hand-drawn artefact with suspicious confidence. DeepSupp reframes them as latent market states: patterns in how price, volume, VWAP, and related features move together over time.[1] The paper’s useful contribution is the pipeline, not the marketing-friendly phrase “AI technical analysis.” DeepSupp builds rolling Spearman correlation matrices from price-volume features, sends those matrices through a multi-head attention autoencoder, compresses them into latent embeddings, and then uses DBSCAN clustering to map dense market states back into median price levels. In plainer language: it tries to find support zones by learning how market relationships evolve, rather than by assuming that yesterday’s visual line still deserves respect. ...
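The first stage of that pipeline, rolling Spearman correlation matrices, can be sketched in plain Python. This is an illustration under stated assumptions, not the DeepSupp implementation: the window length, the feature names, and every function name below are hypothetical, and the attention autoencoder and DBSCAN stages are not shown.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: treat a constant window as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(a), average_ranks(b))


def rolling_spearman_matrices(features, window):
    """Build one feature-by-feature correlation matrix per rolling window.

    `features` maps a feature name to a list of observations; all lists
    are assumed to have the same length.
    """
    names = list(features)
    length = len(features[names[0]])
    matrices = []
    for start in range(length - window + 1):
        cols = [features[name][start:start + window] for name in names]
        matrices.append([[spearman(x, y) for y in cols] for x in cols])
    return matrices


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = {"close": [10, 11, 12, 11, 13, 14], "volume": [900, 850, 700, 800, 650, 600]}
    for matrix in rolling_spearman_matrices(toy, window=4):
        print([[round(v, 2) for v in row] for row in matrix])
```

Each matrix is a snapshot of how the features co-move inside one window. In the pipeline described above, a sequence of such snapshots is what would be passed on to the autoencoder stage.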