When Graphs Stop Guessing: Teaching Models to Rewrite Their Own Meaning
GES shows how graph models can improve not by becoming larger, but by using LLMs to rewrite node descriptions around task-relevant structural evidence.
A semi-supervised safety-classification paper shows why unlabeled AI interaction data becomes useful only when the training process preserves harmful intent, not just surface wording.
A practical reading of MemSinks and what it teaches AI builders about memorization, generalization, and why forgetting must be designed before deployment.
A mechanism-first reading of how programmatic policies let LLM agents condition on each other’s source code, and why the business value is inspectable coordination rather than magic cooperation.
A mechanism-first reading of SFTKey-Tag, a two-stage fine-tuning method that separates answer correctness from reasoning-format training.
FinAgent shows how agentic AI can turn grocery planning into a price-aware loop across household budgets, nutrition targets, health constraints, and food substitutions.
A practical reading of when LLM persona panels can replace field experiments for method benchmarking, and when they merely create cheaper noise.
A case-first reading of a paper showing why LLM safety fails when models respond to surface wording while missing the user's likely intent.
A mechanism-first reading of RoboSafe, a runtime safety guardrail that turns embodied-agent safety from vague refusals into executable checks over context and time.
A mechanism-first reading of TrafficSimAgent, showing why agentic traffic simulation is less about chatting with SUMO and more about turning simulation workflows into controllable, memory-aware optimization systems.