Green Lights, Smarter Cities: How Multi‑Agent Reinforcement Learning Is Rewiring Urban Traffic

Traffic lights are not stupid. They are obedient. That is the problem. A fixed-time signal does exactly what it was told to do: hold this green for this long, clear the junction, move to the next phase, repeat. It does not care that one lane is empty, another is spilling backward, and a third has just received a platoon of vehicles from the previous intersection. It is not being malicious. It is merely following a plan designed for a world that stopped changing five minutes ago. ...

March 14, 2026 · 17 min · Zelina

Traffic, but Make It Agentic: When Simulators Learn to Think

Traffic. A planner wants to test whether a new signal policy will reduce congestion near a hospital. A logistics operator wants to know whether a revised delivery schedule will overload a district during the evening peak. A city team wants to compare two neighborhoods, two time windows, and two control strategies before anyone touches asphalt, paint, or public patience. ...

December 25, 2025 · 18 min · Zelina

Preference Chains of Command: Making LLM Agents Pick Like People

TL;DR for operators: Cities rarely wait for perfect data. A new district still needs a transit plan, a campus still needs a shuttle model, and a developer still wants to know whether people will walk, drive, or quietly defeat the entire urban-design deck by ordering a car. The paper behind this article introduces Preference Chain, a method that uses a small sample of behavioural mobility data to guide an LLM agent’s transport choices.[1] The important bit is not that it “adds Graph RAG” to an LLM. That phrase now covers everything from serious retrieval systems to someone throwing a Neo4j logo onto a slide. The real mechanism is narrower and more useful: Preference Chain turns sparse human travel records into structured priors over likely choices, then lets the LLM adjust those priors for context. ...

August 25, 2025 · 21 min · Zelina