Unsafe at Any Bit: Patching the Safety Gaps in Quantized LLMs

When deploying large language models (LLMs) on mobile devices, edge servers, or any other resource-constrained environment, quantization is the go-to trick. It slashes memory and compute costs by reducing model precision from 16-bit or 32-bit floating-point values to 8-bit or even 4-bit integers. But there’s a problem: this efficiency comes at a cost. Quantization can quietly erode the safety behavior of well-aligned models, leaving them more vulnerable to adversarial prompts and jailbreak attacks. ...

June 26, 2025 · 3 min · Zelina

Bias Busters: Teaching Language Agents to Think Like Scientists

In the latest paper “Language Agents Mirror Human Causal Reasoning Biases” (Chen et al., 2025), researchers uncovered a persistent issue affecting even the most advanced language model (LM) agents: a disjunctive bias—a tendency to prefer “OR”-type causal explanations over equally valid or even stronger “AND”-type ones. Surprisingly, this mirrors adult human reasoning patterns and undermines the agents’ ability to draw correct conclusions in scientific-style causal discovery tasks. ...

May 15, 2025 · 3 min

Feeling Without Feeling: How Emotive Machines Learn to Care (Functionally)

When we think of emotions, we often imagine something deeply human: joy, fear, frustration, and love, entangled with memory and meaning. But what if machines could feel too, at least functionally? A recent speculative research report by Hermann Borotschnig titled “Emotions in Artificial Intelligence” dives into this very question, offering a thought-provoking framework for how synthetic emotions might operate, and where their ethical boundaries lie.

Emotions as Heuristic Shortcuts

At its core, the paper proposes that emotions, rather than being mystical experiences, can be understood as heuristic regulators. In biology, emotions evolved not for introspective poetry but for speedy and effective action. Emotions are shortcuts, helping organisms react to threats, rewards, or uncertainties without deep calculation. ...

May 7, 2025 · 4 min