AI risk reviews still tend to begin with comforting questions. Who is the responsible developer? What policy applies? What did the model output? Was the user allowed to ask that? Did the compliance team approve the deployment checklist?
Useful questions, certainly. Also slightly late.
Two recent arXiv papers point to a less convenient lesson: some AI risks are not merely produced by bad prompts, careless users, malicious deployment, or weak legal controls. They are produced by architecture. One paper shows this at the model-training layer, where Batch Normalization can amplify memorization of atypical samples and increase privacy leakage [1]. The other shows it at the ecosystem layer, where decentralized AI can dissolve the very addressee that conventional governance assumes, forcing governance to move from policy instructions to protocol-level constraints [2].
That combination is more interesting than either paper alone. The first paper says: look inside the machinery that makes the model learn. The second says: look outside the model, at the infrastructure that determines whether governance has any leverage after deployment. Together, they suggest a business-relevant rule: AI risk management should become an architecture review, not just an output audit or legal checklist.
The mildly annoying implication is that “we have a policy” is not the same as “we have control.” Policies are useful when there is a reachable party, an observable system, and a causal path from instruction to behavior. When risk is baked into training dynamics or locked into decentralized infrastructure, policy becomes less like a steering wheel and more like a polite letter to the engine.
The shared problem: risk below the visible interface
The two papers operate at very different levels. The Batch Normalization paper is a machine-learning study: datasets, architectures, noisy labels, out-of-distribution samples, gradient norms, membership inference, and a theoretical explanation for why BN can accelerate outlier memorization. The decentralized AI paper is a governance and ethics argument: open-weight proliferation, permissionless compute, agent harnesses, identity, ownership, accountability gaps, and protocol design.
The connection is not topic similarity. The connection is control surface.
| Layer | What looks like the risk | What the papers ask us to inspect instead |
|---|---|---|
| Model training | The model memorized sensitive data | Which training mechanisms make rare samples unusually influential? |
| Privacy audit | The attack succeeded | Which architectural choices increased membership inference exposure? |
| Deployment governance | Someone failed to comply | Is there even an addressable actor who can still change the system? |
| Decentralized infrastructure | The agent behaved badly | Does the protocol make unsafe action impossible, or merely punishable later? |
This is the article’s spine: risk moves downward into the substrate. At the micro level, it appears in learning dynamics. At the macro level, it appears in the governance architecture of the AI ecosystem. The practical question is no longer only “what did the AI do?” It is “what kind of system made that behavior likely, observable, reversible, or irreversible?”
Step 1: Batch Normalization is not just an optimization convenience
Batch Normalization is usually introduced as a friendly engineering technique. It stabilizes training, speeds convergence, and appears in many deep learning architectures. In the usual product conversation, it is filed under “implementation detail,” which is corporate language for “please do not make me think about this unless something breaks.”
The BN paper makes that filing system look too relaxed.
The authors investigate whether BN affects memorization of atypical or outlier samples. This matters because memorization is not evenly distributed across a dataset. Rare, noisy, mislabeled, or low-probability samples are often the ones most likely to be memorized. In privacy-sensitive domains, those are exactly the examples that may represent distinctive medical conditions, unusual financial behavior, minority demographic patterns, or other records whose exposure is not just embarrassing but consequential.
The paper tests BN through three complementary lenses:
- Forced memorization. The authors use corrupted labels and out-of-distribution data to test whether models fit examples that cannot be explained by normal generalization.
- Per-sample influence. They examine gradient norms to see whether atypical samples exert larger influence during training.
- Membership inference attacks. They test whether models with BN make it easier to infer whether a specific data point was in the training set.
Across these settings, the reported pattern is consistent: models with BN memorize atypical samples more strongly than comparable models without BN, and this amplification translates into greater vulnerability under membership inference. The point is not that BN is “bad” in a cartoonish sense. The point is more precise: a mechanism that improves optimization can also increase the distinguishability of rare training examples.
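To make "vulnerability under membership inference" concrete, here is a minimal sketch of a loss-threshold attack, a common baseline in which low loss is taken as evidence of membership. It is not necessarily the attack used in the paper, and the loss values are hypothetical numbers chosen for illustration.

```python
from typing import Sequence


def membership_auc(member_losses: Sequence[float],
                   nonmember_losses: Sequence[float]) -> float:
    """AUC of a loss-threshold attack.

    Equals the probability that a random training member has a lower loss
    than a random non-member (ties count as half). 0.5 means the attacker
    is guessing; values near 1.0 mean members are easy to pick out.
    """
    wins = 0.0
    for m in member_losses:
        for n in nonmember_losses:
            if m < n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_losses) * len(nonmember_losses))


# Hypothetical per-sample losses (assumption, not data from the paper).
members = [0.02, 0.05, 0.10, 0.40]
nonmembers = [0.30, 0.60, 0.90, 1.20]

auc = membership_auc(members, nonmembers)
print(f"attack AUC = {auc:.4f}")
```

The more strongly a model memorizes its training points, the further member losses fall below non-member losses, and the closer this number gets to 1.0.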
That is a nasty kind of trade-off because it hides behind success. The model trains well. Accuracy looks fine. Generalization may even improve. Meanwhile, the long tail may become more exposed.
In the paper’s theoretical account, the key quantity is the ratio between BN’s learned scale parameter and the activation standard deviation. In simplified terms, the margin-growth amplification for tail samples scales with something like:
$$ \left(\frac{\gamma}{\sigma}\right)^2 $$
where $\gamma$ is the learned BN scale and $\sigma$ is the activation standard deviation. Channels with low variance or large learned scale can become strong amplifiers. A tail sample that sits far from the activation mean is not merely normalized; it can become more forceful in the training update. The paper describes a self-reinforcing loop: outlier influence can push the scale dynamics in a direction that further accelerates memorization, until the sample is fitted and the loop fades.
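As a rough illustration of that ratio, the sketch below computes $(\gamma/\sigma)^2$ per channel and flags the channels where it is large. The channel values and the flagging threshold are hypothetical. In a framework such as PyTorch, the inputs would typically come from a BN layer's `weight` and `running_var`, but that mapping is an assumption about your setup, not something the paper prescribes here.

```python
from typing import List


def bn_amplification(gamma: List[float],
                     running_var: List[float],
                     eps: float = 1e-5) -> List[float]:
    """Per-channel (gamma / sigma)^2, with sigma^2 = running_var + eps."""
    return [g * g / (v + eps) for g, v in zip(gamma, running_var)]


# Hypothetical values for a 4-channel BN layer (assumption).
gamma = [1.0, 1.0, 2.0, 0.5]
running_var = [1.0, 0.01, 1.0, 4.0]

ratios = bn_amplification(gamma, running_var)

# Hypothetical review threshold; pick one that suits your own model.
THRESHOLD = 10.0
flagged = [c for c, r in enumerate(ratios) if r > THRESHOLD]

for channel, ratio in enumerate(ratios):
    print(f"channel {channel}: (gamma/sigma)^2 = {ratio:.2f}")
print("channels to inspect:", flagged)
```

Note that channel 1 is flagged even though its scale is unremarkable: low activation variance alone is enough to make the ratio large.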
For a business reader, the important lesson is not the exact algebra. It is the governance implication: privacy risk can be created by ordinary training machinery. Nobody has to steal the dataset. Nobody has to jailbreak the model. Nobody has to write a villainous prompt in a hoodie. The architecture itself can make sensitive long-tail examples more visible to later attacks.
The paper also proposes mitigation through regularizing the relevant BN ratio, showing that memorization can be reduced while preserving much of the clean-data performance. It also notes that normalization-free architectures may offer better privacy-utility trade-offs in sensitive settings. This is not a universal prescription to remove BN everywhere. It is a demand for architecture-aware privacy evaluation.
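The exact regularizer is not reproduced in this article, so the following is only a sketch of what penalizing the BN ratio could look like: a term proportional to the mean of $(\gamma/\sigma)^2$, added to the training loss. The function name, the `strength` value, and the choice to treat the variance as a constant are all assumptions for illustration.

```python
from typing import List, Tuple


def ratio_penalty(gamma: List[float],
                  variance: List[float],
                  strength: float = 0.01,
                  eps: float = 1e-5) -> Tuple[float, List[float]]:
    """Return (penalty, gradient w.r.t. gamma).

    penalty = strength * mean_c( gamma_c^2 / (variance_c + eps) )
    The variance is treated as a constant (an assumption of this sketch).
    """
    n = len(gamma)
    terms = [g * g / (v + eps) for g, v in zip(gamma, variance)]
    penalty = strength * sum(terms) / n
    grad = [2.0 * strength * g / ((v + eps) * n)
            for g, v in zip(gamma, variance)]
    return penalty, grad


# Hypothetical two-channel layer; eps=0 keeps the arithmetic easy to follow.
penalty, grad = ratio_penalty([1.0, 2.0], [1.0, 0.25], strength=0.01, eps=0.0)
print("penalty:", penalty)
print("gradient w.r.t. gamma:", grad)
```

In this toy case the low-variance, high-scale channel receives a gradient roughly eight times larger than the other one, which is the intended effect: the penalty pushes hardest on the channels that amplify the most.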
In other words: do not ask only whether the model performs well. Ask what kind of examples it had to remember in order to perform well.
Step 2: The privacy audit must move inside the learning process
A conventional privacy review often asks whether the dataset was collected lawfully, whether consent was obtained, whether access is restricted, whether identifiers were removed, and whether the deployed model passes an output-level leakage test. Those controls matter. But the BN paper shows why they are incomplete.
If rare samples receive amplified influence during training, then average-case metrics can be misleading. A model may look acceptable on aggregate while creating disproportionate exposure for the very records that most need protection. This is the familiar long-tail problem, but here it is not just a product analytics issue. It is a privacy issue.
A better model-risk review would include questions such as:
| Review question | Why it matters |
|---|---|
| Which training components amplify rare or atypical samples? | Optimization gains may carry privacy costs. |
| Are per-sample influence metrics inspected, not only aggregate loss? | Outlier exposure may be invisible in averages. |
| Are membership inference tests run separately on long-tail slices? | The vulnerable group may not be the typical user. |
| Has the team compared BN, regularized BN, and normalization-free alternatives? | Architecture choices should be part of privacy due diligence. |
| Is the privacy-utility trade-off documented for sensitive datasets? | “The model is accurate” is not a privacy argument. |
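One way to inspect per-sample influence rather than aggregate loss is to look at per-sample gradient norms. The sketch below does this for a tiny logistic-regression model, where the gradient has a closed form. The weights, samples, and labels are hypothetical, and a real audit would compute the same quantity on the actual network.

```python
import math
from typing import Dict, List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def per_sample_grad_norm(w: List[float], x: List[float], y: int) -> float:
    """L2 norm of the logistic-loss gradient w.r.t. w for one sample.

    For logistic regression the per-sample gradient is (p - y) * x,
    so its norm is |p - y| * ||x||.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return abs(p - y) * math.sqrt(sum(xi * xi for xi in x))


# Hypothetical model weights and samples (assumption, for illustration).
w = [1.0, -1.0]
samples = [
    ("typical_a", [2.0, 0.0], 1),
    ("typical_b", [0.0, 2.0], 0),
    ("mislabeled", [2.0, 0.0], 0),  # same input as typical_a, flipped label
    ("outlier", [0.0, 6.0], 1),     # far from the rest, unexpected label
]

norms: Dict[str, float] = {
    name: per_sample_grad_norm(w, x, y) for name, x, y in samples
}
ranked = sorted(norms, key=norms.get, reverse=True)

for name in ranked:
    print(f"{name}: gradient norm = {norms[name]:.3f}")
```

The atypical samples dominate the ranking while the typical ones barely register, which is exactly the pattern an average loss curve will not show you.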
This is where the first paper becomes larger than BN itself. BN is the example, not the whole category. The broader category is risk-bearing infrastructure inside the model: layers, normalization schemes, training objectives, data mixtures, gradient dynamics, fine-tuning procedures, and evaluation protocols that determine what the model internalizes.
A serious AI risk review should therefore include an architectural memo. Not a decorative diagram. Not a slide with rounded rectangles and arrows pointing heroically to “AI.” A real memo: what mechanisms shape memorization, what failure modes they create, what tests were run, and what trade-offs were accepted.
Step 3: Decentralized AI moves the problem from model internals to governance internals
The second paper scales the same architectural logic outward. It asks whether decentralized AI is governable under existing assumptions.
The authors’ answer is not the lazy version: “blockchain makes everything impossible.” Their argument is more structured. Existing AI governance frameworks usually assume an identifiable entity—a developer, deployer, provider, operator, controller, or organization—that can be addressed by rules and compelled to comply. This is governance through what the paper calls normative address: someone receives the instruction, understands it, and can alter the system accordingly.
Decentralized AI weakens that assumption across a six-layer stack:
| Decentralization layer | Governance problem |
|---|---|
| Model weights | Open or redistributed models can proliferate beyond the creator’s recall. |
| Training | Distributed or unauthorized fine-tuning can move outside centralized oversight. |
| Compute | Inference and training can shift from regulable cloud facilities to edge devices or permissionless compute markets. |
| Harness | Prompts, tools, memory, guardrails, and execution logic can be forked and recombined quickly. |
| Identity | Agents may operate through pseudonymous or self-sovereign identities. |
| Ownership | Control can move from a single principal to DAOs, smart contracts, or no practical human override. |
The paper’s key distinction is between two governance failures. The accountability gap appears when no addressable principal can be identified. The incapacitation gap appears when someone can be identified but cannot alter or terminate the running system. That second gap is especially important. It means liability can remain while control has evaporated. Blaming a human who cannot change the system may satisfy a courtroom instinct, but it does not stop the agent.
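The two gaps can be written down as a small decision rule. This is a sketch of the distinction as described above, not a tool from the paper; the class, field names, and example systems are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DeployedSystem:
    name: str
    principal_identified: bool         # is there someone to address?
    principal_can_alter_or_stop: bool  # can that someone change or halt it?


def governance_gap(system: DeployedSystem) -> str:
    """Classify a system by which governance assumption fails first."""
    if not system.principal_identified:
        return "accountability gap"
    if not system.principal_can_alter_or_stop:
        return "incapacitation gap"
    return "addressable"


# Hypothetical examples (assumption, for illustration only).
systems = [
    DeployedSystem("hosted-assistant", True, True),
    DeployedSystem("anonymous-fork", False, False),
    DeployedSystem("dao-trading-agent", True, False),
]

for s in systems:
    print(f"{s.name}: {governance_gap(s)}")
```

The third case is the one to watch: a named party exists, so liability has somewhere to land, but the system keeps running regardless.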
This is where the paper becomes useful for businesses, especially those experimenting with agentic workflows, open models, crypto rails, decentralized compute, or autonomous commercial agents. The risk question is not only “who owns this?” It is also:
Can that owner still change it, pause it, observe it, or kill it?
If the answer is no, then governance has already moved from management policy to architecture. The system is not waiting for your committee meeting. It is running.
Step 4: Protocol becomes governance when policy loses purchase
The decentralized AI paper argues that when normative address fails, governance has to shift toward architectural constraint. Instead of telling an actor what to do, the protocol determines what actions are possible in the first place.
That is the difference between a speed-limit sign and a median barrier. The sign addresses the driver. The barrier structures the road. One assumes a responsive subject. The other works even when the subject is reckless, confused, anonymous, or not a subject in the ordinary sense.
For decentralized AI, the paper argues that protocol-level mechanisms may serve this architectural role: cryptographic verification, identity registries, attestation requirements, resource limits, reputation gates, and transaction rules that make non-compliant behavior fail at execution rather than become punishable after the fact.
This is an important shift. It does not mean policy disappears. It means policy must be translated into the substrate before the substrate becomes ungovernable. The paper calls this a movement from regulative governance to constitutive governance: not punishing invalid actions after they occur, but defining what counts as a valid action inside the system.
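The regulative and constitutive styles can be contrasted with the same rule enforced two ways. The sketch below is a toy model under stated assumptions: the spend limit, the `attested` flag, and the ledger are hypothetical stand-ins for whatever a real protocol would check.

```python
from dataclasses import dataclass, field
from typing import List

SPEND_LIMIT = 100.0  # hypothetical policy limit (assumption)


@dataclass
class Ledger:
    executed: List[float] = field(default_factory=list)
    violations: List[float] = field(default_factory=list)


def regulative_execute(ledger: Ledger, amount: float) -> None:
    """The action always runs; a violation is only recorded for later."""
    ledger.executed.append(amount)
    if amount > SPEND_LIMIT:
        ledger.violations.append(amount)


def constitutive_execute(ledger: Ledger, amount: float, attested: bool) -> bool:
    """The action counts as valid only if it satisfies the rules.

    An unattested or over-limit action never reaches the ledger.
    """
    if not attested or amount > SPEND_LIMIT:
        return False
    ledger.executed.append(amount)
    return True


regulative = Ledger()
regulative_execute(regulative, 500.0)

constitutive = Ledger()
accepted = constitutive_execute(constitutive, 500.0, attested=True)

print("regulative executed:", regulative.executed,
      "violations:", regulative.violations)
print("constitutive executed:", constitutive.executed, "accepted:", accepted)
```

In the first version the over-limit payment has already happened by the time anyone reads the violation log. In the second, it was never a valid transaction in the first place.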
That sounds clean. It is not.
The paper is careful about the political cost. Protocol governance creates a legitimacy problem. If architecture determines what agents can do, then the designers of that architecture become unusually powerful. Protocol maintainers, standards bodies, TEE vendors, L1 communities, certification authorities, and wealthy governance-token holders may become the new control points. Decentralization does not magically remove power; it often changes where power hides.
The paper therefore identifies ethical conditions for protocol governance: legitimacy, contestability, transparency, and non-domination. These are not philosophical ornaments. They are practical requirements. A protocol that can constrain autonomous agents without democratic contestability may solve one governance problem by creating another: unaccountable technical sovereignty, now with better branding.
Step 5: The combined lesson is architectural due diligence
The papers are not making the same argument. That is why the step-by-step structure above matters: each paper contributes a different link in the chain.
The BN paper shows that privacy risk can emerge inside the learning mechanism. The decentralized AI paper shows that governance may need to be embedded into the deployment substrate before the system becomes unreachable. Together, they support a broader framework:
| Due diligence layer | Main question | Business failure if ignored |
|---|---|---|
| Learning architecture | What mechanisms amplify memorization, influence, or leakage? | Sensitive long-tail records become exposed despite acceptable aggregate performance. |
| Evaluation architecture | Are tests targeted at the vulnerable slices, not only averages? | Privacy risk hides under clean dashboard metrics. |
| Deployment architecture | Who can observe, update, pause, or terminate the system? | Accountability remains on paper while control disappears in practice. |
| Protocol architecture | Are constraints enforceable at execution time? | The system can violate policy faster than the organization can respond. |
| Governance architecture | Who designs the constraints, and how can they be contested? | Technical control becomes unaccountable power. |
This is the business translation: architecture is both a source of risk and the remaining site of control.
That dual role is easy to miss. In the BN paper, architecture creates a hidden privacy risk. In the decentralized AI (DeAI) paper, architecture becomes the only practical governance channel when conventional address fails. The same word, architecture, does two jobs. It describes the machinery that can produce harm and the machinery through which harm can be constrained.
That is not a contradiction. It is the point.
What the papers show versus what businesses should do with it
A useful article should not smuggle interpretation into the papers and pretend the authors said everything. So let us separate the claims.
| Category | What the papers show | Business interpretation |
|---|---|---|
| BN and privacy | BN can amplify memorization of atypical samples and increase membership inference exposure in the studied settings. | Treat normalization and training dynamics as privacy-relevant design choices, especially for sensitive long-tail data. |
| Memorization | Rare or noisy samples may exert disproportionate influence and become more distinguishable. | Run privacy tests on vulnerable slices, not only the average user or average record. |
| DeAI governance | Decentralization across model, training, compute, harness, identity, and ownership can create accountability and incapacitation gaps. | Before deploying agentic or decentralized systems, verify that someone can still observe, update, constrain, and stop them. |
| Protocol governance | Architectural constraints can govern when normative address fails, but they raise legitimacy and domination risks. | Protocol design is not only engineering; it is institutional design with legal, ethical, and commercial consequences. |
For a business team, this suggests a practical operating rule: every high-risk AI project should have two review tracks.
The first is a model-internal risk review. It asks how the training process handles rare examples, whether architectural choices intensify memorization, whether membership inference has been tested under realistic threat models, and whether mitigation has been evaluated against utility.
The second is a deployment-control review. It asks whether the organization can monitor, update, pause, roll back, or terminate the system after release; whether the agent’s tool access is mediated by enforceable gates; whether the identity and payment rails are auditable; and whether the governance design can be contested by affected stakeholders.
Most companies do some version of the first badly and the second accidentally. That is understandable. It is also how architecture becomes policy by surprise.
The misconception to avoid: architecture is not a technical footnote
It is tempting to assume that AI privacy and governance are mainly about policy language, user permissions, compliance forms, post-hoc audits, and output filtering. These tools are necessary, but they are downstream tools. They work only when the system remains reachable through human or institutional control.
The two papers suggest a sharper model:
- If sensitive data exposure is intensified by training dynamics, privacy must be evaluated inside the architecture.
- If autonomous behavior is secured by decentralized infrastructure, governance must be embedded before the system becomes difficult to alter.
- If protocol designers decide what actions are possible, governance must also scrutinize who gave them that authority.
This does not mean every AI deployment needs blockchain-style protocol governance or that every model should abandon Batch Normalization. Please do not turn this into another checklist cult. The practical lesson is conditional: the more consequential, privacy-sensitive, autonomous, open, forkable, or decentralized the system becomes, the more architectural the governance review must be.
For a small internal classifier trained on non-sensitive data, the full framework may be excessive. For a medical AI model trained on rare patient records, BN-related memorization risk is not a decorative concern. For an autonomous agent managing payments through crypto rails, “we can ask the operator to stop” is not a control unless the operator can actually stop it.
That last sentence deserves to be printed on a compliance poster, preferably in a font less tragic than most compliance posters.
A practical architecture-review checklist
For teams building or buying AI systems, the combined lesson can be translated into a short review framework.
1. Identify long-tail sensitivity
Ask whether the dataset contains rare, high-value, legally sensitive, or personally distinctive records. Healthcare, finance, HR, insurance, education, and behavioral analytics all qualify quickly. If the answer is yes, average-case privacy testing is not enough.
2. Inspect memorization mechanisms
Document whether the architecture uses normalization layers, aggressive fine-tuning, small sensitive datasets, repeated examples, or data mixtures that may increase sample-specific influence. Compare alternatives where feasible. The decision to keep a mechanism should be explicit, not inherited from a tutorial notebook written in 2018 and spiritually maintained by Stack Overflow.
3. Test privacy where it hurts
Run membership inference or related leakage audits on vulnerable data slices, not only random samples. If rare records are the privacy concern, test rare records. Dashboards that average away the tail are just spreadsheets with manners.
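A small sketch shows how a pooled metric can understate tail exposure. It uses a fixed loss threshold and reports attack advantage (true-positive rate minus false-positive rate) per slice. The slices, losses, and threshold are hypothetical numbers picked to make the effect visible.

```python
from typing import Sequence


def attack_advantage(member_losses: Sequence[float],
                     nonmember_losses: Sequence[float],
                     threshold: float) -> float:
    """TPR - FPR for the rule 'loss below threshold means member'."""
    tpr = sum(l < threshold for l in member_losses) / len(member_losses)
    fpr = sum(l < threshold for l in nonmember_losses) / len(nonmember_losses)
    return tpr - fpr


THRESHOLD = 0.5  # hypothetical attack threshold (assumption)

# Hypothetical per-sample losses for two slices (assumption).
typical_members = [0.20, 0.30, 0.60, 0.70]
typical_nonmembers = [0.25, 0.35, 0.65, 0.75]
tail_members = [0.01, 0.02]
tail_nonmembers = [1.5, 2.0]

typical_adv = attack_advantage(typical_members, typical_nonmembers, THRESHOLD)
tail_adv = attack_advantage(tail_members, tail_nonmembers, THRESHOLD)
pooled_adv = attack_advantage(typical_members + tail_members,
                              typical_nonmembers + tail_nonmembers,
                              THRESHOLD)

print(f"typical slice advantage: {typical_adv:.2f}")
print(f"tail slice advantage:    {tail_adv:.2f}")
print(f"pooled advantage:        {pooled_adv:.2f}")
```

With these numbers the pooled figure lands between the two slices, so a report that shows only the pooled value would hide the fact that every tail record is identifiable.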
4. Map post-deployment control
For agentic systems, identify who can change the model, harness, tools, memory, identity, wallets, compute environment, and permissions. Then distinguish nominal ownership from causal control. A name on a policy document is not the same as a kill switch.
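The mapping exercise can start as something as plain as a table of components with two columns: who is named as owner, and who can actually change or stop the component. The sketch below flags the rows where those differ. The component names and teams are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ControlEntry:
    nominal_owner: str
    causal_controller: Optional[str]  # None if nobody can change or stop it


def control_mismatches(control_map: Dict[str, ControlEntry]) -> List[str]:
    """Components whose named owner is not the party with causal control."""
    return sorted(
        name for name, entry in control_map.items()
        if entry.causal_controller != entry.nominal_owner
    )


# Hypothetical inventory for an agentic deployment (assumption).
control_map = {
    "model": ControlEntry("ml-team", "ml-team"),
    "harness": ControlEntry("ml-team", "vendor"),
    "wallet": ControlEntry("finance", None),
    "permissions": ControlEntry("security", "security"),
}

mismatches = control_mismatches(control_map)
print("nominal owner lacks causal control over:", mismatches)
```

Any component that shows up in this list is a place where the policy document and the kill switch point at different people, or at no one.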
5. Convert critical policies into execution constraints
Where failure would be costly, policies should become gates: access controls, attestation requirements, transaction limits, approval workflows, sandbox boundaries, logging requirements, and reversible deployment states. The more autonomous the system, the less you should rely on after-the-fact scolding.
6. Review the legitimacy of constraints
When architecture constrains users, customers, agents, or ecosystem participants, ask who designed the constraint, how it can be challenged, how it can be revised, and whether it concentrates power. “The protocol says so” is not an ethics argument. It is the beginning of one.
Conclusion: risk has a floor plan
The old mental model of AI risk was behavioral: the model says something wrong, the user does something bad, the developer violates a rule, the regulator responds. That model still matters, but it is no longer enough.
The two papers examined here push the analysis into architecture. At the model layer, Batch Normalization can amplify the memorization of atypical samples, increasing privacy risk even while helping training. At the ecosystem layer, decentralized AI can break the assumptions that make ordinary governance work, forcing control to move into protocols, identity systems, attestation mechanisms, and execution constraints.
For businesses, the lesson is simple and slightly uncomfortable: before asking whether an AI system complies, ask whether it can be governed at all. Before asking whether a model leaks, ask which parts of training make leakage more likely. Before trusting a policy, find the control surface.
AI risk has a floor plan. Read the blueprint before admiring the lobby.
Cognaptus: Automate the Present, Incubate the Future.
[1] Ngoc Phu Doan, Chongyan Gu, and Ihsen Alouani, “Batch Normalization Amplifies Memorization and Privacy Risks,” arXiv:2605.24420, 2026. https://arxiv.org/abs/2605.24420
[2] Botao “Amber” Hu and Helena Rong, “Is Decentralized AI Governable? From Regulative Policy to Constitutive Protocol,” arXiv:2605.24538, 2026. https://arxiv.org/abs/2605.24538