Stack Overflow for Ethics: Governing AI with Feedback, Not Faith
A control-theoretic reading of the Social Responsibility Stack, and why responsible AI needs monitors, thresholds, rollback paths, and governance authority—not just principles.
A mechanism-first reading of TOGGLE, a framework that turns LLM compression into a constrained engineering problem using temporal logic, Bayesian optimization, and explicit behavioral thresholds.
A case-first reading of PCML, a method for turning black-box agent behavior into interpretable probabilistic capability maps.
A mechanism-first reading of Artism, a dual-engine AI framework that turns generative art into a self-critical loop rather than another novelty machine.
A decision-theoretic guide to deciding when imperfectly aligned AI systems are still worth delegating to.
A comparison-based reading of SDE, a benchmark that tests whether frontier LLMs can move from science quiz performance to iterative scientific discovery.
Nemotron-Math shows that better mathematical reasoning supervision is not just more data, but a carefully engineered mix of reasoning depth, tool use, source diversity, filtering, and long-context training economics.
A mechanism-first reading of Predictive Concept Decoders and why activation-based audit layers may matter more than model self-explanations.
A close reading of Stepwise Think-Critique, a single-model approach that interleaves reasoning and self-critique to make mathematical reasoning more inspectable, without pretending that self-audit already amounts to trust.
A practical reading of CAGE, an attribution-graph method that audits not only which prompt evidence influenced an LLM answer, but how intermediate generations carried that influence forward.