
Bench to the Future: Why E-commerce Is the Real Final Boss for Foundation Agents

Shopping looks easy until someone has to calculate the customs duty. That is roughly the lesson of EcomBench, a new benchmark designed to evaluate foundation agents on realistic e-commerce tasks.1 The paper’s most useful finding is not that one model ranks above another. Leaderboards are entertaining, in the same way airport departure boards are entertaining when your flight is already delayed. The useful finding is the shape of failure. ...

December 10, 2025 · 15 min · Zelina

When Collusion Cuts Prices: The Counterintuitive Economics of Algorithmic Bidding

TL;DR for operators: Marketplace operators usually worry that pricing algorithms learn the oldest trick in commerce: stop undercutting each other and raise prices. That worry is real. But this paper makes a more interesting point: when sellers use algorithms to optimise both product prices and sponsored-ad bids, collusion can move through the cost side before it moves through the price side.1 ...

August 13, 2025 · 18 min · Zelina

Add to Cart, Add to Power: What Happens When AI Shops for You

TL;DR for operators: AI shopping agents do not simply “find the best product.” They convert a messy human browsing process into a model-mediated allocation system. That allocation system has its own priors, positional quirks, trust cues, and semantic blind spots. Lovely. We automated the customer and discovered a new customer. The paper introduces ACES, a controlled sandbox for testing AI shopping behaviour. It pairs a browser-use or API-style buying agent with a programmable mock e-commerce site, then randomises product order, prices, ratings, reviews, badges, and product descriptions to estimate what actually moves an AI agent’s choice.1 ...

August 5, 2025 · 23 min · Zelina

From Generic Supplier Emails to Supply Chain Outreach Intelligence

A mid-sized e-commerce company evolved a generic outreach assistant into a supply-chain-aware agent workflow that links supplier communication with inventory risk, logistics recovery, procurement judgment, and sustainability review.

June 30, 2025 · 7 min · Vox