Do They Mean It? Testing Whether AI Actually ‘Reasons’ Behind the Wheel
A car follows a cyclist on a narrow road. The double solid yellow line says: do not cross. The empty oncoming lane says: perhaps you can. The cyclist may feel uncomfortable being followed. The passenger may be late. The vehicle behind may be getting impatient. The automated vehicle must choose. A normal benchmark would ask whether the final maneuver is safe, legal, smooth, or close to a human reference trajectory. Useful, yes. Complete, no. ...