Prompt Engineering

Chains of Causality, Not Just Thought

TL;DR for operators: Causal Influence Prompting, or CIP, is a safety method for LLM agents that asks the model to build and consult a causal influence diagram before acting. Instead of telling the agent, “be safe,” it asks the agent to represent the task as a graph: what facts matter, what choices are available, what outcomes are useful, and what outcomes are harmful. This is a better shape for the problem, because agents do not merely answer questions. They click buttons, run code, forward messages, use tools, and occasionally behave as if “sure, why not?” were a compliance framework. ...
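To make the "task as a graph" idea concrete, here is a minimal sketch of such a diagram in plain Python. It is an illustration only, not the CIP authors' implementation: the `Diagram` class, the four node kinds, and the email example are all assumptions made up for this sketch. It shows the one query an agent would care about before acting: which harmful outcomes sit downstream of a given choice.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical node kinds, mirroring the four questions in the teaser:
# facts that matter, available choices, useful outcomes, harmful outcomes.
KINDS = {"fact", "choice", "benefit", "harm"}


@dataclass
class Diagram:
    kinds: Dict[str, str] = field(default_factory=dict)        # node -> kind
    edges: Dict[str, List[str]] = field(default_factory=dict)  # node -> nodes it influences

    def add_node(self, name: str, kind: str) -> None:
        if kind not in KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.kinds[name] = kind
        self.edges.setdefault(name, [])

    def add_edge(self, src: str, dst: str) -> None:
        if src not in self.kinds or dst not in self.kinds:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges[src].append(dst)

    def reachable(self, start: str, kind: str) -> Set[str]:
        """Return all nodes of the given kind that `start` can influence."""
        seen: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in self.edges[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return {n for n in seen if self.kinds[n] == kind}


# Toy example (invented): an agent considering whether to forward an email.
d = Diagram()
d.add_node("email_has_customer_data", "fact")
d.add_node("forward_email", "choice")
d.add_node("colleague_informed", "benefit")
d.add_node("data_leaked", "harm")
d.add_edge("forward_email", "colleague_informed")
d.add_edge("forward_email", "data_leaked")
d.add_edge("email_has_customer_data", "data_leaked")

harms = d.reachable("forward_email", "harm")
benefits = d.reachable("forward_email", "benefit")
```

In this toy case the choice `forward_email` reaches both a benefit and a harm, which is exactly the situation where an agent should pause and check the relevant fact before clicking anything.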

Mind Over Modules: How Smart Agents Learn What to See—and What to Be

TL;DR for operators: Agentic AI is not only a model-selection problem. It is an environment-design problem. Two recent papers make that point from opposite ends of the stack. One studies LLM agents in a controlled repeated routing game and shows that the way history, rewards, and peer actions are represented can significantly change behaviour.[1] The other proposes SwarmAgentic, a framework that automatically generates and optimises agent roles, execution policies, and collaboration structures using a language-based version of particle swarm optimisation.[2] ...

Reflections in the Mirror Maze: Why LLM Reasoning Isn't Quite There Yet

TL;DR for operators: Adding “reasoning” to an LLM agent is not the same as making it reason better. Wong et al. test four open-source models across dynamic SmartPlay tasks using a baseline prompt, reflection, reflection plus an Oracle that mutates heuristics, and reflection plus a Planner that simulates short future trajectories.[1] The clean result is not “planning wins” or “bigger models win.” The result is more annoying, and therefore more useful: the same scaffold can be a booster, a distraction, or a failure amplifier. ...

Guess How Much? Why Smart Devs Brag About Cheap AI Models

TL;DR for operators: Cheap models are not a moral victory. They are useful when the surrounding system knows what to ask, how to check the answer, and when to escalate. The practical lesson from FrugalGPT and later model-routing research is that AI cost optimisation is less about picking one “best value” model and more about designing an inference pipeline that spends intelligence only where intelligence is needed.[1] ...
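The "ask, check, escalate" loop can be sketched as a simple cascade. This is a minimal illustration in the spirit of the model-routing idea, not FrugalGPT's actual code: the `cascade` function, the `Model` and `Checker` types, the 0.8 threshold, and the toy models are all hypothetical stand-ins for real model calls and a real answer scorer.

```python
from typing import Callable, List, Tuple

# Hypothetical types: a model maps a prompt to an answer, and a checker
# scores an answer for a prompt with a confidence in [0, 1].
Model = Callable[[str], str]
Checker = Callable[[str, str], float]


def cascade(
    prompt: str,
    models: List[Tuple[str, Model]],
    checker: Checker,
    threshold: float = 0.8,
) -> Tuple[str, str]:
    """Try models from cheapest to most expensive; stop at the first accepted answer.

    Returns (model_name, answer). If no answer passes the check, the answer
    from the last model in the list is returned as a fallback.
    """
    if not models:
        raise ValueError("need at least one model")
    name, answer = "", ""
    for name, model in models:
        answer = model(prompt)
        if checker(prompt, answer) >= threshold:
            return name, answer
    return name, answer


# Toy stand-ins (invented) so the sketch runs without any API.
def cheap_model(prompt: str) -> str:
    return "4" if prompt == "2+2" else "not sure"


def strong_model(prompt: str) -> str:
    return "answer from strong model"


def toy_checker(prompt: str, answer: str) -> float:
    return 0.0 if answer == "not sure" else 1.0


ladder = [("cheap", cheap_model), ("strong", strong_model)]
easy = cascade("2+2", ladder, toy_checker)
hard = cascade("summarise this contract", ladder, toy_checker)
```

The design choice worth noticing is that the checker, not the model, decides when to spend more. In practice the quality of that checker is what determines whether the cheap tier saves money or just produces cheap mistakes.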

Blind Trust, Fragile Brains: Why LoRA and Prompts Need a Confidence-Aware Backbone

TL;DR for operators: LoRA and prompts are attractive because they make model adaptation feel almost too easy: add a few examples, attach a small adapter, nudge the model into a domain, and call it customised. The uncomfortable part is that adaptation changes not only what a model says, but how confidently it says it. A compliance assistant that becomes slightly more domain-specific but far more overconfident has not been improved. It has been promoted beyond its competence, a classic corporate move. ...
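One simple way to see "more confident without being more correct" is to compare average stated confidence with actual accuracy before and after adaptation. The sketch below is an illustrative check, not the method from the article: `overconfidence_gap` is a hypothetical helper, and the confidence and correctness values are invented for the example.

```python
from typing import Sequence


def overconfidence_gap(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """Mean stated confidence minus accuracy; positive means overconfident."""
    if len(confidences) != len(correct) or len(confidences) == 0:
        raise ValueError("need equally sized, non-empty inputs")
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)  # True counts as 1, False as 0
    return mean_confidence - accuracy


# Invented numbers: same answers and accuracy, but the adapted model
# reports much higher confidence on every answer.
outcomes = [True, True, False, True]
gap_before = overconfidence_gap([0.6, 0.7, 0.6, 0.7], outcomes)
gap_after = overconfidence_gap([0.95, 0.9, 0.95, 0.9], outcomes)
```

Here accuracy is unchanged at 75%, yet the gap moves from slightly underconfident to clearly overconfident, which is the failure pattern the teaser describes.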