Solver-Optimization

TL;DR for operators

Financial AI is usually sold as a machine that predicts markets. This paper is about something more modest and, frankly, more useful: making the maths underneath portfolio optimisation and option pricing run faster. The authors propose a reinforcement learning controller that adjusts the block size of a preconditioner inside Flexible GMRES, an iterative solver used for large sparse or awkward linear systems. The agent is trained with PPO. Its state is the current residual vector, its action is a choice of block size, and its reward pushes the residual norm downward. In plain English: the model watches how badly the solver is still missing the answer, then changes the way the solver reorganises the problem. ...
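To make the state/action loop described above concrete, here is a minimal NumPy sketch of a Flexible GMRES iteration in which a callback picks the preconditioner block size at every step. Several things here are assumptions rather than the paper's method: the preconditioner is taken to be block-Jacobi, the solver is dense and unrestarted, the test matrix is synthetic, and `heuristic_policy` is a hand-written stand-in for the trained PPO agent. The names `block_jacobi_apply`, `fgmres`, and `heuristic_policy` are hypothetical.

```python
import numpy as np


def block_jacobi_apply(A, v, block_size):
    """Apply an (assumed) block-Jacobi preconditioner: solve each diagonal block."""
    n = A.shape[0]
    z = np.empty_like(v)
    for start in range(0, n, block_size):
        end = min(start + block_size, n)
        z[start:end] = np.linalg.solve(A[start:end, start:end], v[start:end])
    return z


def fgmres(A, b, choose_block_size, max_iter=50, tol=1e-8):
    """Unrestarted Flexible GMRES with a zero initial guess.

    choose_block_size(residual) plays the role of the controller: it sees the
    current residual vector (the state) and returns a block size (the action).
    """
    n = b.shape[0]
    beta = np.linalg.norm(b)
    V = np.zeros((n, max_iter + 1))   # orthonormal Krylov basis
    Z = np.zeros((n, max_iter))       # preconditioned directions (kept because M changes)
    H = np.zeros((max_iter + 1, max_iter))
    V[:, 0] = b / beta
    x = np.zeros(n)
    history = []                      # (block size, residual norm) per iteration
    for j in range(max_iter):
        block_size = choose_block_size(b - A @ x)
        Z[:, j] = block_jacobi_apply(A, V[:, j], block_size)
        w = A @ Z[:, j]
        for i in range(j + 1):        # modified Gram-Schmidt orthogonalisation
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        # Minimise ||beta * e1 - H y|| and form the iterate from the Z columns.
        e1 = np.zeros(j + 2)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[: j + 2, : j + 1], e1, rcond=None)
        x = Z[:, : j + 1] @ y
        res_norm = np.linalg.norm(b - A @ x)
        history.append((block_size, res_norm))
        if res_norm <= tol * beta or H[j + 1, j] < 1e-14:
            break
        V[:, j + 1] = w / H[j + 1, j]
    return x, history


def heuristic_policy(residual, sizes=(2, 4, 8)):
    """Hypothetical stand-in for the PPO agent: bigger blocks as the residual shrinks."""
    r = np.linalg.norm(residual)
    if r > 1.0:
        return sizes[0]
    return sizes[1] if r > 1e-3 else sizes[2]


# Synthetic, well-conditioned test problem (an assumption, not data from the paper).
rng = np.random.default_rng(0)
n = 32
A = 4.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
b = rng.standard_normal(n)

x, history = fgmres(A, b, heuristic_policy)
for step, (block_size, res_norm) in enumerate(history):
    print(f"iter {step}: block size {block_size}, residual norm {res_norm:.3e}")
```

The flexible variant stores the preconditioned vectors `Z` separately from the Krylov basis `V`, which is what allows the preconditioner to differ from one iteration to the next. In a learned setup, the per-iteration drop in residual norm recorded in `history` is the kind of signal a reward could be built from, though the exact reward used by the authors is not reproduced here.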