Cover image

Mirror, Signal, Trade: How Self‑Reflective Agent Teams Outperform in Backtests

The Takeaway A new paper proposes TradingGroup, a five‑agent, self‑reflective trading team with a dynamic risk module and an automated data‑synthesis pipeline. In backtests on five US stocks, the framework beats rule‑based, ML, RL, and prior LLM agents. The differentiator isn’t a fancier model; it’s the workflow design: agents learn from their own trajectories, and the system continuously distills those trajectories into fine‑tuning data. What’s actually new here? Most “LLM trader” projects look similar: sentiment, fundamentals, a forecaster, and a decider. TradingGroup’s edge comes from three design choices: ...

August 26, 2025 · 5 min · Zelina
Cover image

Stackelbergs & Stakeholders: Turning Bits into Boardroom Moves

TL;DR: BusiAgent proposes a client‑centric, multi‑agent LLM framework that formalizes roles (CEO/CFO/CTO/MM/PM) with an extended Continuous‑Time MDP, coordinates them via entropy‑guided brainstorming (peer‑level) and multi‑level Stackelberg games (vertical), and squeezes extra performance from contextual Thompson sampling for prompt optimization—wrapped in a QA stack that fuses STM/LTM memories with a knowledge base. It’s a serious attempt to connect granular analytics to boardroom decisions. The big win is organizational alignment; the big risks are evaluation rigor, token economics, and ops reliability at scale. ...

August 24, 2025 · 5 min · Zelina
Cover image

IRB, API, and a PI: When Agents Run the Lab

Virtuous Machines: Towards Artificial General Science reports something deceptively simple: an agentic AI designed three psychology studies, recruited and ran 288 human participants online, built the analysis code, and generated full manuscripts—end‑to‑end. Average system runtime per study: ~17 hours (compute time, excluding data collection). The paper frames this as a step toward “artificial general science.” The more immediate story for business leaders: a new production function for knowledge work—one that shifts the bottleneck from human hours to orchestration quality, governance, and data rights. ...

August 20, 2025 · 5 min · Zelina