Speed Bumps and Swells: Rethinking Optimal Trading with Stochastic Volatility

TL;DR for operators

Execution desks already know that volatility matters. The useful question is less poetic: which volatility, on what time scale, and what should the trading algorithm actually do about it?

The paper by Patrick Chan, Ronnie Sircar, and Iosif Zimbidis extends the Gârleanu-Pedersen optimal trading framework, which already balances predictable returns, temporary transaction costs, and persistent price impact, from constant volatility to multiscale stochastic volatility.¹ That combination matters because it puts the model closer to the daily problem of a trading desk: alpha is changing, risk is changing, and the desk’s own trades are also moving the price. Delightful. The market is not merely adversarial; it is participatory.

The main operational insight is that stochastic volatility is not just a new value of $\sigma$ dropped into an old formula. Fast and slow volatility affect the trading rule differently. A fast mean-reverting volatility factor mostly averages out at first order, so the leading useful correction arrives only at second order. In the Heston/CIR-style example, when current volatility is above its long-run mean, the corrected strategy reduces target exposure while increasing the urgency of tracking that smaller target. Translation: move more decisively, but toward a safer inventory.

The slow volatility factor behaves differently. Its first-order correction is correlation-sensitive: the relationship between return signals and volatility shocks changes the aim portfolio. In the paper’s slow-factor example, the trading speed itself remains unchanged at the first correction level, but the target portfolio shifts. So the slow factor is less a speed bump and more a tide: it moves the destination.

The numerical evidence is promising but should be read for what it is. The paper reports a 53.27 basis point average PnL improvement in the fast-factor Monte Carlo example when $Y_0=0.4$, with lower PnL variance under the corrected strategy. For the slow-factor example, mean PnL gains are positive across tested initial volatility values $Z_0 \in \{0.1,\ldots,0.6\}$, ranging from 34.2532 bps down to 0.6804 bps. These are simulation results under the paper’s model, not a universal law of execution alpha. Markets remain tragically unwilling to obey asymptotic expansions on request.

The business takeaway is practical: desks can use volatility-timescale-aware rules to adjust trading speed and target inventory when alpha, transaction costs, and price impact interact. The boundary is equally important: this is a single-asset model with fully observable Ornstein-Uhlenbeck alpha, linear persistent impact, constant liquidity parameters, and formal asymptotics whose rigorous error bounds for this exact control problem are left for future work.

The desk problem is not volatility; it is timing

A portfolio manager receives a signal. The signal says buy. The risk system says volatility is elevated. The execution engine says trading aggressively will cost money now and may push the price against the desk later. The portfolio manager says something short, probably unsuitable for publication.

Classic optimal trading models already handle part of this mess. Gârleanu and Pedersen gave a tractable way to balance expected returns, risk, temporary trading costs, and persistent price impact. Their result can be expressed in a simple and useful form:

$$ u_t^\ast = r(\mathrm{aim}_t - q_t) $$

The trader adjusts current inventory $q_t$ toward an aim portfolio at some tracking speed $r$. This framing is why the model became influential: it turns a dynamic optimisation problem into a rule that a human can at least explain before lunch.
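To make the target-and-speed rule concrete, here is a minimal sketch that integrates it with a constant aim. The numbers (aim of 100 units, $r=2$, unit horizon) are hypothetical and chosen only for illustration; with a constant aim the rule has the closed form $q_t = \mathrm{aim} + (q_0-\mathrm{aim})e^{-rt}$, which the simulation should approximate.

```python
import math

def track_target(q0: float, aim: float, r: float, horizon: float, steps: int) -> list[float]:
    """Euler-integrate dq = r * (aim - q) dt and return the inventory path."""
    dt = horizon / steps
    q = q0
    path = [q]
    for _ in range(steps):
        u = r * (aim - q)  # trading rate from the target-tracking rule
        q += u * dt        # inventory moves at the trading rate
        path.append(q)
    return path

# Hypothetical inputs: start flat, aim at 100 units, tracking speed 2 per unit time.
path = track_target(q0=0.0, aim=100.0, r=2.0, horizon=1.0, steps=1000)

# Closed form for a constant aim: q_t = aim + (q_0 - aim) * exp(-r * t).
closed_form = 100.0 + (0.0 - 100.0) * math.exp(-2.0 * 1.0)
print(f"simulated q_T = {path[-1]:.3f}, closed form = {closed_form:.3f}")
```

The gap to the aim shrinks exponentially at rate $r$, which is why $r$ reads naturally as urgency: doubling it halves the time needed to close a given fraction of the gap.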

But the constant-volatility assumption leaves a gap. In real trading, volatility is not only “high” or “low”. Some changes are fast and mean-reverting: intraday turbulence, short-lived bursts, liquidity noise wearing a volatility hat. Other changes are slow: regime shifts, macro uncertainty, gradual risk repricing. Treating both as the same scalar risk knob is convenient. It is also how models quietly become furniture.

Chan, Sircar, and Zimbidis keep the target-and-speed intuition but let volatility move on two clocks. Their contribution is not “volatility matters”. That would be a headline from 1987 wearing a new suit. Their contribution is showing how volatility’s time scale changes the structure of the optimal trading rule while preserving enough tractability to compute corrections.

The baseline rule has three moving parts

The model begins with a single risky asset, a trading rate $u_t$, and a position $q_t$ satisfying:

$$ dq_t = u_t\,dt. $$

Temporary transaction cost is quadratic in trading speed, written as $\frac{K}{2}u_t^2$. This captures the immediate cost of trading faster. Persistent price impact is modelled separately through a price-impact state $l_t$:

$$ dl_t = (\lambda u_t - \beta l_t)dt. $$

Here $\lambda$ measures how strongly trades move the price-impact state, while $\beta$ governs how quickly that impact decays. This is the useful distinction: temporary cost hurts at execution; persistent impact lingers afterwards. Any trading model that treats those as the same inconvenience is doing accounting, not execution.
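The difference between the two cost channels is easy to see in a small Euler discretisation of the inventory and impact equations above. This is a sketch with hypothetical parameter values ($\lambda=0.5$, $\beta=2$, $K=0.1$) and a made-up schedule that buys at a constant rate and then stops; it is not calibrated to anything in the paper.

```python
def simulate_impact(trade_rates, dt, lam, beta, K):
    """Euler-step dq = u dt and dl = (lam*u - beta*l) dt, accumulating (K/2) u^2 dt."""
    q, l, temp_cost = 0.0, 0.0, 0.0
    impact_path = []
    for u in trade_rates:
        temp_cost += 0.5 * K * u * u * dt  # temporary cost: accrues only while trading
        q += u * dt                        # inventory accumulates the trading rate
        l += (lam * u - beta * l) * dt     # impact is pushed by trades and decays at rate beta
        impact_path.append(l)
    return q, temp_cost, impact_path

# Hypothetical schedule: buy at rate 1 for one unit of time, then sit still for one more.
dt = 0.01
schedule = [1.0] * 100 + [0.0] * 100
q_final, temp_cost, impact_path = simulate_impact(schedule, dt, lam=0.5, beta=2.0, K=0.1)

impact_when_trading_stops = impact_path[99]
impact_at_end = impact_path[-1]
print(f"inventory {q_final:.2f}, temporary cost {temp_cost:.4f}")
print(f"impact when trading stops {impact_when_trading_stops:.4f}, one time unit later {impact_at_end:.4f}")
```

The temporary cost stops growing the moment trading stops, while the impact state is still positive a full time unit later and only fades at rate $\beta$.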

The asset has a predictable return component $x_t$, modelled as an Ornstein-Uhlenbeck process. The trader earns from holding inventory against this signal, pays for risk exposure through a term proportional to $\gamma\sigma^2 q_t^2$, where $\gamma$ is the risk-aversion parameter, and pays trading costs. The resulting Hamilton-Jacobi-Bellman equation becomes nonlinear once persistent impact and stochastic volatility are included.
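For readers who want to see what an Ornstein-Uhlenbeck signal looks like in code, the sketch below simulates one using the exact one-step transition. It assumes the signal mean-reverts to zero, and the names `kappa` (reversion speed) and `sigma_x` (signal volatility) as well as their values are hypothetical, not taken from the paper.

```python
import math
import random

def simulate_ou(x0, kappa, sigma_x, dt, steps, seed=0):
    """Simulate dx = -kappa * x dt + sigma_x dW using the exact Gaussian transition."""
    rng = random.Random(seed)
    decay = math.exp(-kappa * dt)
    # Standard deviation of the one-step innovation for the exact discretisation.
    step_std = sigma_x * math.sqrt((1.0 - decay * decay) / (2.0 * kappa))
    x = x0
    path = [x]
    for _ in range(steps):
        x = x * decay + step_std * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

# Hypothetical parameters: alpha decays with rate 5 per unit time.
signal = simulate_ou(x0=0.0, kappa=5.0, sigma_x=0.2, dt=0.01, steps=20000, seed=42)

sample_var = sum(x * x for x in signal) / len(signal)
stationary_var = 0.2 ** 2 / (2.0 * 5.0)  # sigma_x^2 / (2 * kappa)
print(f"sample variance {sample_var:.5f}, stationary variance {stationary_var:.5f}")
```

The reversion speed sets how long a signal stays tradeable, which is what makes the trade-off against trading costs a timing problem rather than a sizing problem alone.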

Under constant volatility, the value function can be represented with a quadratic ansatz in position, impact, and signal. The optimal rule then returns to the interpretable target-tracking form:

$$ u_t^\ast = r^c(\sigma^2)(\mathrm{aim}^c_t(\sigma^2)-q_t). $$

That is the anchor. The paper’s stochastic-volatility contribution should be read as a set of corrections to this anchor, not as a demolition of it.

Stochastic volatility enters through two clocks, not one dial

The paper models volatility as $\sigma(Y_t,Z_t)$, with $Y_t$ a fast factor and $Z_t$ a slow factor. The fast factor is scaled by a small $\varepsilon$; the slow factor by a small $\delta$. The mathematical machinery is asymptotic expansion: singular perturbation for the fast factor, regular perturbation for the slow factor, and then a combined multiscale approximation.

This is where a normal summary would become a swamp of HJB notation. The mechanism is cleaner:

Volatility regime	Mathematical treatment	Operational effect	Reader trap
Constant volatility	Quadratic value function and algebraic system	Track an aim portfolio at a computed speed	Assume this remains valid after simply replacing $\sigma$
Fast stochastic volatility	Singular perturbation in $\varepsilon$	First correction vanishes; second-order term changes speed and aim	Expect fast noise to matter immediately
Slow stochastic volatility	Regular perturbation in $\sqrt{\delta}$	First-order correction shifts the aim portfolio through correlation-sensitive terms	Ignore return-volatility correlation
Multiscale volatility	Combined $\varepsilon$ and $\sqrt{\delta}$ corrections	Both aim and tracking speed can be affected	Treat “stochastic volatility” as one effect

The important surprise is the fast factor. Because it mean-reverts quickly, the zeroth-order problem uses an averaged volatility. The first correction vanishes. To see the leading fast-volatility effect, the authors have to compute the second-order term. This is not a decorative technicality. It changes how one should interpret fast volatility: short-lived turbulence does not simply become a first-order instruction to panic-trade. It mostly washes through the average, then reappears as a more subtle correction to the trading rule.

The slow factor does not wash out in the same way. It moves gradually enough that the trader must care about where the volatility regime is going. That is why correlation between return and volatility shocks enters more visibly in the slow case.

The fast factor says: move faster toward a smaller exposure

For the fast-factor example, the paper uses a Heston-style setup with $\sigma(y)=\sqrt{y}$ and a CIR process for the volatility factor. In this case, the corrected trading rule can be interpreted as modifying the constant-volatility target-tracking strategy by a term proportional to current deviation from long-run volatility.
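A rough way to build intuition for the fast factor is to simulate a CIR process with its clock sped up by $1/\varepsilon$ and watch the time average settle near the long-run mean. The sketch below assumes the scaling convention common in the multiscale literature (drift divided by $\varepsilon$, diffusion by $\sqrt{\varepsilon}$); that convention, the full-truncation Euler scheme, and every parameter value apart from the starting point $Y_0=0.4$ are assumptions for illustration rather than the paper's calibration.

```python
import math
import random

def simulate_fast_cir(y0, kappa, mu, nu, eps, dt, steps, seed=0):
    """Full-truncation Euler scheme for
    dY = (kappa/eps) * (mu - Y) dt + (nu/sqrt(eps)) * sqrt(Y) dW."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    y = y0
    path = [y]
    for _ in range(steps):
        y_pos = max(y, 0.0)  # truncate so the square root stays well defined
        drift = kappa * (mu - y_pos) / eps
        diffusion = nu * math.sqrt(y_pos / eps)
        y = y + drift * dt + diffusion * sqrt_dt * rng.gauss(0.0, 1.0)
        path.append(y)
    return path

# Hypothetical parameters; y0 above mu mimics "volatility above its long-run mean".
mu = 0.2
path = simulate_fast_cir(y0=0.4, kappa=1.0, mu=mu, nu=0.5, eps=0.01,
                         dt=1e-4, steps=50000, seed=7)

# With sigma(y) = sqrt(y), the variance is y itself, so averaging y averages sigma^2.
avg_variance = sum(max(y, 0.0) for y in path) / len(path)
print(f"time-averaged variance {avg_variance:.4f} vs long-run mean {mu}")
```

Because the factor forgets its starting point on a time scale of order $\varepsilon$, the average over a trading horizon sits close to $\mu$ even though the path started at twice that level.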

The paper’s qualitative interpretation is direct. When current volatility $y$ is above its long-run average $\mu$, the corrected strategy increases the adjusted trading speed but reduces the target portfolio. When volatility is below its long-run average, the effect reverses.

This sounds contradictory only if “trade faster” and “take more risk” are treated as the same instruction. They are not. A desk can trade faster while reducing exposure. In fact, that is often precisely what risk-aware execution should do: move decisively away from an inventory level that no longer makes sense.

That is the core mechanism:

The fast factor changes the urgency of adjustment.
It also changes the target exposure.
The economically relevant instruction is the combination, not either component alone.

This is a more useful idea than “volatility up, trade less”. Sometimes volatility up means trade faster, because the current position is now too large. Risk control is not passivity. Shocking, apparently.
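The sign pattern can be illustrated with a deliberately stylised rule. The linear adjustments and the sensitivities `speed_sens` and `aim_sens` below are invented for this sketch; they reproduce only the qualitative direction described above and are not the paper's second-order correction formulas.

```python
def corrected_rule(q, aim_c, speed_c, y, mu, speed_sens, aim_sens):
    """Stylised fast-volatility adjustment (hypothetical functional form):
    speed rises and aim falls when y is above mu, and the reverse below."""
    dev = y - mu
    speed = speed_c * (1.0 + speed_sens * dev)
    aim = aim_c * (1.0 - aim_sens * dev)
    rate = speed * (aim - q)  # same target-tracking structure as the baseline
    return rate, speed, aim

# Hypothetical desk state: holding 80 units, constant-volatility aim of 100, speed 2.
baseline_rate = 2.0 * (100.0 - 80.0)
rate, speed, aim = corrected_rule(q=80.0, aim_c=100.0, speed_c=2.0,
                                  y=0.4, mu=0.2, speed_sens=1.0, aim_sens=1.5)

print(f"baseline rate {baseline_rate:+.1f}")
print(f"corrected speed {speed:.2f}, aim {aim:.1f}, rate {rate:+.1f}")
```

With these made-up numbers the baseline rule keeps buying, while the adjusted rule sells, and does so with a higher tracking speed. The point is the sign flip, not the magnitudes.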

The slow factor says: the target depends on correlation

The slow-factor case is different. The paper derives a first-order correction in $\sqrt{\delta}$ using a regular perturbation. In the slow-only example, that first-order correction leaves the trading speed unchanged. Instead, it shifts the aim portfolio.

This is where return-volatility correlation matters. If the volatility factor and the return predictor are correlated, then a higher return estimate may arrive with higher risk. The paper’s derivation shows that the sign of this correlation affects the direction of the correction. Its numerical illustration compares the corrected aim portfolio against the constant-volatility aim under positive and negative correlation assumptions.
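Any simulation of this effect needs return-signal and volatility shocks with a chosen correlation. A standard construction, sketched here with a hypothetical $\rho=-0.6$, builds the second Brownian increment from the first plus an independent component; the function names are illustrative.

```python
import math
import random

def correlated_increments(rho, dt, steps, seed=0):
    """Return Brownian increments (dW_x, dW_z) with correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(dt)
    ortho = math.sqrt(1.0 - rho * rho)
    dw_x, dw_z = [], []
    for _ in range(steps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        dw_x.append(scale * z1)                        # shock to the return signal
        dw_z.append(scale * (rho * z1 + ortho * z2))   # shock to the slow volatility factor
    return dw_x, dw_z

def sample_correlation(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

rho = -0.6  # hypothetical: good signal news tends to arrive with falling volatility
dw_x, dw_z = correlated_increments(rho, dt=0.01, steps=20000, seed=3)
print(f"target rho {rho}, sample correlation {sample_correlation(dw_x, dw_z):.3f}")
```

Flipping the sign of `rho` and rerunning a strategy comparison is the simplest sensitivity test a desk can run before trusting any correlation-dependent aim adjustment.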

The business version is simple enough to survive a risk committee meeting:

If volatility is fast, ask how quickly current conditions should change the speed and size of inventory adjustment.
If volatility is slow, ask whether the return signal is being paid for with a volatility regime shift.
If both are present, do not pretend one scalar volatility estimate can answer both questions.

The slow factor is especially relevant for medium-horizon portfolio adjustment, where a signal may remain attractive but the risk regime is drifting underneath it. In that setting, the question is not merely “is alpha positive?” It is “is this alpha positive because the asset is compensating me for a regime I do not actually want to warehouse?”

The simulations are useful, but they do different jobs

The paper’s numerical section is not one monolithic proof. Different figures play different evidentiary roles, and confusing them would be a tidy way to misunderstand the contribution.

Paper element	Likely purpose	What it supports	What it does not prove
Figures 1–2: normalised coefficient and derivative errors under small price-impact approximation	Implementation validation	First-order small-price-impact approximation is numerically accurate enough for the simulations, with errors generally of order $O(\theta^2)$ and decreasing as volatility increases	Live-market execution performance
Figure 3: fast-factor speed and aim across transaction cost $K$	Mechanism illustration	Fast volatility above its mean increases adjusted speed and lowers target exposure	PnL dominance across assets or liquidity regimes
Figure 4: fast-factor Monte Carlo PnL	Main evidence for fast correction	Corrected strategy improves average PnL by 53.27 bps at $Y_0=0.4$ and lowers variance from 0.6515 to 0.6286	Robustness to empirical calibration error
Figure 5: slow-factor aim portfolio under correlation choices	Mechanism and sensitivity test	Slow volatility correction shifts the target portfolio, with correlation affecting direction	That the same shift is always profitable
Figure 6 and Table 2: slow-factor PnL distributions and summary statistics	Main evidence for slow correction	Mean PnL gains are positive across tested $Z_0$ values	Pathwise dominance or universal gain magnitude

The small price-impact approximation deserves its own note. In the full constant-volatility model with persistent impact, the coefficient system is nonlinear and has no general closed-form solution. Solving it at every time step and sample path would be computationally heavy. The authors introduce a small parameter $\theta$ into the price-impact dynamics and expand the coefficients recursively.

This is not the glamour result, but it is what makes the later simulations feasible. The approximation is an implementation bridge: without it, the elegant correction terms would be less useful in a system that has to run more than once before everyone goes home.
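The idea of expanding coefficients recursively in a small parameter can be shown on a toy problem. The scalar equation $c = 1 + \theta c^2$ below is a stand-in chosen because it has a known exact root; it is not the paper's coefficient system. Matching powers of $\theta$ gives the recursion $c_n = \sum_{i=0}^{n-1} c_i c_{n-1-i}$ with $c_0 = 1$.

```python
import math

def exact_root(theta):
    """Root of c = 1 + theta * c^2 that tends to 1 as theta -> 0 (needs theta < 1/4)."""
    return (1.0 - math.sqrt(1.0 - 4.0 * theta)) / (2.0 * theta)

def expansion_coeffs(order):
    """Coefficients c_0..c_order obtained by matching powers of theta recursively."""
    coeffs = [1.0]
    for n in range(1, order + 1):
        coeffs.append(sum(coeffs[i] * coeffs[n - 1 - i] for i in range(n)))
    return coeffs

def approx_root(theta, order):
    """Truncated power series in theta."""
    return sum(c * theta ** k for k, c in enumerate(expansion_coeffs(order)))

for theta in (0.01, 0.02, 0.04):
    err1 = exact_root(theta) - approx_root(theta, order=1)
    err2 = exact_root(theta) - approx_root(theta, order=2)
    print(f"theta={theta}: first-order error/theta^2 = {err1 / theta**2:.3f}, "
          f"second-order error/theta^3 = {err2 / theta**3:.3f}")
```

The scaled errors stay roughly constant as $\theta$ shrinks, which is what an $O(\theta^2)$ first-order error looks like numerically. Each extra coefficient costs a finite sum rather than a nonlinear solve, and that is the computational appeal.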

For the fast-factor Monte Carlo experiment, the reported 53.27 bps mean improvement is economically meaningful in the context of execution rules. The lower variance also matters: the corrected strategy is not merely shifting the average upward while wildly increasing dispersion in that example.

The slow-factor simulation is more nuanced. Table 2 reports positive mean PnL gains across all tested initial volatility levels:

Initial slow volatility factor $Z_0$	Mean gain (bps)	Std (bps)	95% confidence interval
0.1	34.2532	235.7638	[29.6322, 38.8741]
0.2	14.7266	79.2333	[13.1736, 16.2796]
0.3	8.3127	39.4801	[7.5389, 9.0865]
0.4	5.0165	24.3310	[4.5396, 5.4934]
0.5	2.3825	15.0484	[2.0875, 2.6774]
0.6	0.6804	10.7493	[0.4698, 0.8911]

The pattern is informative: the mean gain declines as $Z_0$ rises. The standard deviations are also large relative to the means, especially at low $Z_0$. So the result supports positive expected gains under the model, not guaranteed outperformance path by path. Anyone selling the latter should be invited to trade their own balance sheet first.
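The table can be sanity-checked with a few lines of arithmetic. The sketch below assumes the intervals are normal-approximation intervals of the form mean ± 1.96 · std / √n, which is an assumption about how they were built; under it, every row implies a path count close to 10,000, a figure inferred here rather than quoted from the table.

```python
import math

# Rows copied from the table above: (Z0, mean gain, std, CI low, CI high), all in bps.
table2 = [
    (0.1, 34.2532, 235.7638, 29.6322, 38.8741),
    (0.2, 14.7266, 79.2333, 13.1736, 16.2796),
    (0.3, 8.3127, 39.4801, 7.5389, 9.0865),
    (0.4, 5.0165, 24.3310, 4.5396, 5.4934),
    (0.5, 2.3825, 15.0484, 2.0875, 2.6774),
    (0.6, 0.6804, 10.7493, 0.4698, 0.8911),
]

Z_95 = 1.96  # two-sided 95% normal quantile (assumed construction of the intervals)

checks = []
for z0, mean, std, lo, hi in table2:
    half_width = (hi - lo) / 2.0
    implied_n = (Z_95 * std / half_width) ** 2  # invert half_width = 1.96 * std / sqrt(n)
    midpoint = (lo + hi) / 2.0                  # should reproduce the reported mean
    checks.append((z0, implied_n, midpoint, mean / std))
    print(f"Z0={z0}: implied paths ~ {implied_n:,.0f}, CI midpoint {midpoint:.4f}, "
          f"mean/std {mean / std:.3f}")
```

The mean-to-std ratio is small in every row. The mean gain is pinned down tightly because there are many paths, not because any single path is predictable.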

The paper’s direct result is tractability, not magic alpha

The direct academic contribution is precise. The authors take a model that becomes hard because stochastic volatility makes the HJB equation nonlinear, then recover tractable approximations by expanding around the constant-volatility solution.

There are three layers:

Constant-volatility extension with persistent impact. The paper analyses the algebraic system that arises when both instantaneous and persistent transaction costs are present, including a small-price-impact approximation.
Single-factor stochastic volatility corrections. It separately derives fast and slow volatility corrections, showing that fast volatility requires a second-order term while slow volatility contributes at first order.
Combined multiscale strategy. It shows how the fast and slow corrections can be combined so that both the aim and tracking speed are affected.

The business inference is not that this model should be lifted into production untouched. The inference is that an execution system should separate volatility by time scale before deciding whether to adjust speed, target, or both.

That separation is useful because it maps naturally to desk workflows. Fast volatility resembles execution-state monitoring: sudden turbulence, changing short-term risk, unstable local conditions. Slow volatility resembles allocation-state monitoring: regime drift, signal-risk co-movement, changing background risk. A single “vol forecast” field in an order management system does not capture this distinction. It merely provides a number with the confidence of a dashboard widget.

What an execution desk can actually take from this

The paper is most relevant to desks that already use model-based trading rules and want interpretable corrections rather than black-box policy search. It suggests a practical architecture:

Estimate or filter alpha signals and volatility factors.
Separate volatility into fast and slow components.
Use the fast component to adjust urgency and exposure reduction.
Use the slow component to adjust the aim portfolio through correlation-aware terms.
Keep price-impact assumptions explicit, especially the distinction between temporary trading cost and persistent impact.
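One crude way to prototype the second step of that architecture is to run two exponentially weighted variance estimates with very different half-lives. This is a heuristic proposed here for illustration only; the paper does not prescribe an estimator, and the half-lives, function names, and synthetic return series are all hypothetical.

```python
def ewma_alpha(half_life: float) -> float:
    """Smoothing weight such that a shock's influence halves every `half_life` steps."""
    return 1.0 - 0.5 ** (1.0 / half_life)

def split_variance(returns, fast_half_life=5.0, slow_half_life=100.0):
    """Return a list of (slow_level, fast_deviation) pairs, one per observation.

    slow_level     : long-half-life EWMA of squared returns (regime proxy)
    fast_deviation : short-half-life EWMA minus slow_level (turbulence proxy)
    """
    a_fast, a_slow = ewma_alpha(fast_half_life), ewma_alpha(slow_half_life)
    fast = slow = returns[0] ** 2
    states = []
    for r in returns:
        sq = r * r
        fast += a_fast * (sq - fast)
        slow += a_slow * (sq - slow)
        states.append((slow, fast - slow))
    return states

# Synthetic data: 300 calm observations followed by a 10-step burst three times larger.
calm = [0.01 if i % 2 == 0 else -0.01 for i in range(300)]
burst = [0.03, -0.03] * 5
states = split_variance(calm + burst)

slow_before, dev_before = states[299]
slow_after, dev_after = states[-1]
print(f"before burst: slow {slow_before:.2e}, fast deviation {dev_before:.2e}")
print(f"after burst:  slow {slow_after:.2e}, fast deviation {dev_after:.2e}")
```

The short burst shows up almost entirely in the fast deviation while the slow level barely moves, which is the separation the later steps rely on. A production system would more plausibly use a filtered state-space model, but the two outputs would play the same roles.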

This is not anti-machine-learning. It is anti-mysticism. Machine learning can estimate latent states, forecast volatility, or calibrate impact parameters. The trading rule itself can still remain economically interpretable. That matters in environments where model risk, execution review, and capital allocation all require explanations longer than “the network said so”.

A useful production interpretation would be:

Operational decision	Model signal	Potential desk action
How urgently should we adjust inventory?	Fast volatility deviation from long-run average	Increase or decrease tracking speed depending on whether current inventory is too risky
Where should the target inventory sit?	Fast correction and slow correlation-sensitive correction	Lower or raise the aim portfolio rather than blindly following raw alpha
Can we compute the rule repeatedly?	Small-price-impact approximation accuracy	Avoid solving the full nonlinear system at every simulation step
Is the effect worth modelling?	Monte Carlo PnL gain and variance change	Prioritise calibration experiments before production integration

The immediate ROI pathway is not “earn 53.27 bps everywhere”. Please do not put that in a board deck unless you enjoy follow-up questions. The more defensible pathway is cheaper diagnosis: the framework tells a desk whether a volatility change should alter speed, target, or both. That is valuable because many execution failures come from applying the right signal at the wrong urgency.

Boundaries that matter before production

The model is deliberately controlled. That is not a weakness; it is how mathematical finance keeps the lights on. But the boundaries matter.

First, the model is single-asset. Portfolio desks care about cross-impact, correlated signals, sector exposure, and inventory constraints across books. Extending the mechanism to multiple assets is not just adding indices until the notation gives up. Cross-impact can change the meaning of target and speed.

Second, the return predictor is fully observable and follows an Ornstein-Uhlenbeck process. Real alpha signals are noisy, partially observed, and frequently revised by teams who insist this version is “much cleaner”. The authors themselves identify partial information and expert opinions as a future extension.

Third, persistent price impact is linear with constant parameters. The paper acknowledges stochastic liquidity and nonlinear price impact as natural next steps. This matters because liquidity is not constant intraday; it often varies around the open, close, news, rebalances, and whatever else the market chooses to weaponise.

Fourth, the asymptotic results are formal in this exact control setting. The paper notes that rigorous justification exists in related option pricing and Merton problem contexts, but a similar theoretical justification for this problem is left for future work. That does not make the results unusable. It means production use should be preceded by calibration, stress testing, and sensitivity analysis rather than ceremonial admiration.

Fifth, the numerical evidence is simulated under the model. The PnL improvements show that the corrections can matter economically when the model assumptions hold. They do not show realised performance on historical order-book data, across liquidity regimes, or under adversarial market conditions.

These boundaries do not erase the contribution. They define where it can be used without pretending.

The useful correction is conceptual before it is computational

The common misconception is that stochastic volatility merely means replacing constant $\sigma$ with a better estimate. The paper shows why that is too shallow.

A fast volatility factor does not enter like a slow one. A first-order fast correction vanishes, so the important effect requires going to second order. A slow factor enters through correlation-sensitive terms that move the aim portfolio. In the combined model, both aim and tracking speed can change. Volatility is not a better dashboard number; it is a different control structure.

That is the article’s main mechanism. If the desk’s volatility process has multiple clocks, the execution rule should too.

The paper gives a tractable path toward that rule. It keeps the economic grammar of target tracking while adding the missing distinction between fast turbulence and slow regime drift. For operators, that is the useful part: not a promise that every correction prints money, but a sharper way to ask what the algorithm should actually change when volatility moves.

Sometimes the right answer to higher volatility is not “trade less”. Sometimes it is “get to a smaller position faster”. That is a small sentence with expensive implications.

Cognaptus: Automate the Present, Incubate the Future.

¹ Patrick Chan, Ronnie Sircar, and Iosif Zimbidis, “Optimal Trading under Instantaneous and Persistent Price Impact, Predictable Returns and Multiscale Stochastic Volatility,” arXiv:2507.17162, 2025. https://arxiv.org/abs/2507.17162
