Opening — Why this matters now

Everyone suddenly cares about sustainability. Corporations issue glossy ESG reports, regulators publish directives, and investors nod approvingly at any sentence containing net-zero. The problem, of course, is that words are cheap.

Greenwashing—claims that sound environmentally responsible while being misleading, partial, or outright false—has quietly become one of the most corrosive forms of corporate misinformation. Not because it is dramatic, but because it is plausible. And plausibility is exactly where today’s large language models tend to fail.

This paper introduces EmeraldMind, a system built on a blunt realization: if AI is going to judge sustainability claims, it needs evidence, structure, and the discipline to say “I don’t know.”

Background — Context and prior art

Automated fact-checking has improved rapidly, but sustainability claims sit in an awkward corner of the problem space:

  • Evidence lives in ESG reports, regulatory filings, and KPI tables—not Wikipedia.
  • Claims are often vague by design (“we reduced emissions”, “we are greener than before”).
  • Datasets are small, legally sensitive, and expensive to curate.

Generic LLMs are surprisingly accurate on the claims they do answer, but they frequently abstain or hallucinate. Fine-tuned models perform better, but only after consuming annotated datasets that barely exist.

The core question the paper asks is refreshingly pragmatic:

How do we build a greenwashing detector that works now, without retraining models, while remaining auditable and trustworthy?

Analysis — What EmeraldMind actually does

EmeraldMind is not another prompt trick. It is an end-to-end, evidence-first architecture with three design commitments:

  1. Separate evidence construction from reasoning
  2. Ground claims in structured ESG knowledge
  3. Treat abstention as a feature, not a failure

The evidence layer: EmeraldGraph + EmeraldDB

EmeraldMind builds two complementary stores:

| Component | Purpose | What it contains |
| --- | --- | --- |
| EmeraldDB | Textual retrieval | ESG report chunks, tables, figures, metadata |
| EmeraldGraph | Structured reasoning | Companies, KPIs, facilities, goals, claims, relations |

The graph is company-centered by design. Everything—emissions, targets, facilities, certifications—anchors back to a single organization node. This turns vague claims into traversable reasoning paths rather than free-form text.
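
To make the company-centered idea concrete, here is a minimal sketch of such a graph. The use of networkx and every node and relation name below are my own illustrative choices, not the paper's actual EmeraldGraph schema.

```python
# A company-anchored toy graph. networkx is an assumed library choice,
# and all node/relation names are illustrative, not EmeraldGraph's schema.
import networkx as nx

g = nx.MultiDiGraph()

# Every fact hangs off a single organization node.
g.add_node("org:company_x", type="Company")
g.add_node("kpi:scope1_co2_2023", type="KPI", unit="tCO2e", value=120_000, year=2023)
g.add_node("target:net_zero_2040", type="Target", year=2040)
g.add_node("facility:plant_a", type="Facility", country="DE")

g.add_edge("org:company_x", "kpi:scope1_co2_2023", relation="reports")
g.add_edge("org:company_x", "target:net_zero_2040", relation="commits_to")
g.add_edge("org:company_x", "facility:plant_a", relation="operates")

# A vague claim ("we cut emissions") becomes a concrete path to traverse:
# org:company_x --reports--> kpi:scope1_co2_2023 --> {value, year}.
for _, node, data in g.out_edges("org:company_x", data=True):
    print(node, data["relation"], g.nodes[node])
```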

Schema-driven realism

A key contribution is schema discipline. The system explicitly distinguishes:

  • Targets vs. actual performance
  • Claims vs. verified observations
  • Facilities vs. corporate aggregates

This prevents a common ESG sin: confusing intent with outcome.
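
A small sketch of what that discipline could look like in code follows; the class and field names are assumptions for illustration, not EmeraldMind's published schema.

```python
# Illustrative schema sketch: the type names are assumptions, not the
# paper's actual schema. The point is that intent and outcome are
# different kinds of statement and cannot be conflated.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class StatementKind(Enum):
    TARGET = "target"            # stated intent, e.g. "net zero by 2040"
    ACTUAL = "actual"            # measured performance, e.g. "-12% in 2023"
    CLAIM = "claim"              # unverified company assertion
    OBSERVATION = "observation"  # independently verified data point

class Scope(Enum):
    FACILITY = "facility"    # a single site
    CORPORATE = "corporate"  # aggregated across the organization

@dataclass
class ESGStatement:
    company: str
    kpi: str
    kind: StatementKind
    scope: Scope
    year: int
    value: Optional[float] = None

# Keeping intent and outcome as distinct kinds means a target can never be
# mistaken for evidence that the outcome was achieved.
```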

Claim grounding and retrieval

When a claim enters the system (e.g., “Company X reduced CO₂ emissions by 30% in 2023”), EmeraldMind:

  1. Extracts entities (company, KPI, value, year)

  2. Grounds them to graph nodes

  3. Retrieves:

    • A compact subgraph of relevant KPI paths
    • A filtered set of document chunks from ESG reports

Only then does the LLM get to reason.
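
To make the order of operations concrete, here is a self-contained toy sketch of grounding-before-reasoning. The regexes, the dictionary standing in for the graph, and the final comparison (which stands in for the LLM reasoning step) are all my own simplifications, not EmeraldMind's pipeline.

```python
# Toy sketch: extract -> ground -> retrieve -> decide, with abstention
# whenever grounding fails. Everything here is a simplification.
import re

# Stand-in for the knowledge graph: (company, kpi, year) -> reported change in %.
GRAPH = {("company x", "co2", 2023): -30.0}

def extract_entities(claim: str) -> dict:
    """Pull out company, KPI, value, and year with naive patterns."""
    year = re.search(r"\b(19|20)\d{2}\b", claim)
    value = re.search(r"(-?\d+(?:\.\d+)?)\s*%", claim)
    return {
        "company": "company x" if "company x" in claim.lower() else None,
        "kpi": "co2" if "co2" in claim.lower() else None,
        "year": int(year.group()) if year else None,
        "value": float(value.group(1)) if value else None,
    }

def check_claim(claim: str) -> dict:
    ents = extract_entities(claim)
    key = (ents["company"], ents["kpi"], ents["year"])
    if None in key:
        return {"verdict": "abstain", "reason": "claim could not be grounded"}
    reported = GRAPH.get(key)
    if reported is None:
        return {"verdict": "abstain", "reason": "no evidence in graph"}
    # "Reduced by 30%" corresponds to a -30% change in the store.
    claimed = -abs(ents["value"]) if ents["value"] is not None else None
    verdict = "supported" if claimed == reported else "unsupported"
    return {"verdict": verdict, "evidence": {key: reported}}

print(check_claim("Company X reduced CO2 emissions by 30% in 2023"))
```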

Three reasoning variants

EmeraldMind tests three configurations:

| Variant | Evidence used | Strength |
| --- | --- | --- |
| EM-RAG | Documents only | Broad coverage |
| EM-KGRAG | Knowledge graph only | Precision & objectivity |
| EM-HYBRID | Both (LLM-as-judge) | Best overall performance |

The hybrid model is telling: it doesn’t synthesize—it chooses. When two explanations disagree, the system acts like an auditor, not a storyteller.
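
A hedged sketch of that choose-don't-synthesize step is below; the prompt wording and the `complete` callable are assumptions, since the paper only states that an LLM judge selects between the two candidate answers.

```python
# Sketch of EM-HYBRID's "choose, don't synthesize" behavior. The prompt
# text and the `complete` callable are assumptions for illustration.
def judge(claim: str, doc_answer: dict, graph_answer: dict, complete) -> dict:
    """Ask an LLM to act as an auditor and pick the better-grounded verdict."""
    prompt = (
        "You are auditing a sustainability claim.\n"
        f"Claim: {claim}\n\n"
        f"Candidate A (document-based): {doc_answer}\n"
        f"Candidate B (graph-based): {graph_answer}\n\n"
        "Choose the candidate whose verdict is better supported by its cited "
        "evidence. Reply with exactly 'A' or 'B'."
    )
    choice = complete(prompt).strip().upper()
    return doc_answer if choice.startswith("A") else graph_answer

# Usage: judge(claim, em_rag_result, em_kgrag_result, complete=my_llm_call)
```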

Findings — Results that actually matter

Coverage beats cosmetic accuracy

A recurring theme in the results: baseline LLMs look accurate because they refuse to answer.

| Model | Accuracy | Coverage | Overall usefulness |
| --- | --- | --- | --- |
| Baseline LLM | Very high | Very low | Poor |
| EM-RAG | High | High | Strong |
| EM-KGRAG | Very high | Medium | Strong |
| EM-HYBRID | High | Highest | Best |

EmeraldMind consistently evaluates 2–4× more claims than baseline models without collapsing into guesswork.
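
A quick illustration of why accuracy alone is cosmetic: the scoring function below is my own framing and the numbers are invented, not results from the paper.

```python
# Accuracy computed only over answered claims hides abstention.
def score(predictions, labels):
    answered = [(p, y) for p, y in zip(predictions, labels) if p != "abstain"]
    coverage = len(answered) / len(labels)
    accuracy = sum(p == y for p, y in answered) / len(answered) if answered else 0.0
    return {
        "coverage": coverage,
        "accuracy_on_answered": accuracy,
        "share_of_claims_resolved_correctly": coverage * accuracy,
    }

# A model that answers only 2 of 10 claims, both correctly, still resolves
# just 20% of the workload despite its "perfect" accuracy.
print(score(["abstain"] * 8 + ["ok", "ok"], ["greenwash"] * 5 + ["ok"] * 5))
```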

Explanation quality (the quiet win)

Using LLM-based judges, the paper evaluates explanations on:

  • Logical coherence
  • Objectivity
  • Accuracy
  • Readability

Across both benchmarks, EmeraldMind explanations dominate. Not because they are eloquent, but because they are anchored. Graph paths and report excerpts quietly keep the model honest.
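
For readers who want to reproduce this style of evaluation, here is an illustrative rubric prompt over those four dimensions; the wording, the 1-to-5 scale, and the `complete` callable are assumptions, since the paper only names the criteria.

```python
# Illustrative LLM-judge rubric over the four dimensions listed above.
RUBRIC = (
    "Rate the explanation from 1 to 5 on each dimension:\n"
    "- logical coherence: do the steps follow from the cited evidence?\n"
    "- objectivity: does it avoid marketing language and speculation?\n"
    "- accuracy: are figures and entities consistent with the sources?\n"
    "- readability: could a non-expert auditor follow it?\n"
    "Return one line per dimension as `name: score`."
)

def judge_explanation(explanation: str, evidence: str, complete) -> str:
    """Score a generated explanation against its retrieved evidence."""
    prompt = f"{RUBRIC}\n\nEvidence:\n{evidence}\n\nExplanation:\n{explanation}"
    return complete(prompt)
```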

Implications — What this changes

For practitioners, EmeraldMind offers a blueprint:

  • RAG alone is not enough in regulated domains
  • Structured knowledge beats larger prompts
  • Abstention is a governance feature, not a UX bug

For regulators and auditors, the system demonstrates something rare in AI sustainability tooling: traceability. Every verdict can be walked backward through documents, entities, and relations.

For AI builders, the message is sharper: if your system cannot explain why a claim is greenwashing, it probably shouldn’t be trusted to label it.

Conclusion

EmeraldMind does not claim to solve greenwashing. It does something more valuable: it shows how to build AI systems that know when evidence is missing—and act accordingly.

In a domain flooded with confident narratives, that restraint may be the most sustainable feature of all.

Cognaptus: Automate the Present, Incubate the Future.