If AI is going to understand people, it first has to understand relationships. But when it comes to parsing character connections from narrative texts — whether news articles, biographies, or novels — even state-of-the-art language models stumble. They hallucinate links, miss cross-sentence cues, and often forget what they’ve just read.

Enter SymbolicThought, a hybrid framework that gives LLMs a logic-boosted sidekick: symbolic reasoning. Developed by researchers at King’s College London and CUHK, the system doesn’t just extract character relationships from text; it builds editable graphs, detects logical contradictions, and guides users through verification with a smart, interactive interface.

The Problem: Language Models Think in Sentences, Not Structures

Narrative understanding isn’t just about identifying named entities. It’s about connecting them into a web of meaningful, coherent relationships: who loves whom, who betrayed whom, who raised whom. But most LLMs are linear thinkers. They process words in order, not holistically. That means they often:

  • Miss symmetrical or inverse relationships (e.g., if A is B’s father, B should be A’s child).
  • Fail to infer obvious links (e.g., if A is B’s wife and B is C’s son, A is likely C’s daughter-in-law).
  • Propose contradictory edges (e.g., A is B’s daughter and also B’s father).

The result is messy, unreliable relationship graphs that require human clean-up.

The Solution: Injecting Symbolic Logic into the Loop

SymbolicThought tackles this with a two-step, human-in-the-loop system:

  1. Character Extraction:

    • An LLM proposes named entities with temperature sampling.
    • Annotators confirm, merge aliases, and disambiguate homonyms via an intuitive UI.
  2. Relationship Extraction & Refinement:

    • Another LLM suggests relationship triples.
    • A symbolic reasoning engine applies seven logical operations to infer missing edges and flag contradictions, summarized in the table below (a minimal code sketch follows the table):
| Category | Example |
| --- | --- |
| Symmetry | A is friend of B ↔ B is friend of A |
| Inversion | A is parent of B ↔ B is child of A |
| Composition | A is sibling of B, B is child of C → A is child of C |
| Hierarchy | "elder brother" is a subtype of "brother" |
| Incompatible | A is father of B rules out A is child of B |
| Asymmetric | A is boss of B → B cannot be boss of A |
| Exclusive | A is spouse of B → A cannot be spouse of C |
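To make the flavor of these operations concrete, here is a minimal Python sketch covering three of the seven (symmetry, inversion, incompatibility). It is not the paper's implementation: the relation vocabulary and rule tables are illustrative assumptions, and relationships are treated as plain (head, relation, tail) triples.

```python
# Illustrative rule tables -- assumptions for this sketch, not the paper's rule set.
SYMMETRIC = {"friend_of", "sibling_of", "spouse_of"}            # Symmetry
INVERSE = {"parent_of": "child_of", "child_of": "parent_of"}    # Inversion
INCOMPATIBLE = {("father_of", "daughter_of"),                   # Incompatible
                ("parent_of", "child_of")}

def expand(triples):
    """Add edges implied by symmetry and inversion."""
    out = set(triples)
    for h, r, t in triples:
        if r in SYMMETRIC:
            out.add((t, r, h))
        if r in INVERSE:
            out.add((t, INVERSE[r], h))
    return out

def contradictions(triples):
    """Return pairs of edges that cannot both hold between the same two people."""
    found = []
    for h, r, t in triples:
        for r_a, r_b in INCOMPATIBLE:
            if r == r_a and (h, r_b, t) in triples:
                found.append(((h, r, t), (h, r_b, t)))
    return found

# The article's own example of a contradictory proposal: A as B's daughter and B's father.
graph = expand({("A", "father_of", "B"), ("A", "daughter_of", "B"),
                ("A", "friend_of", "C")})
print(contradictions(graph))              # flags father_of vs. daughter_of between A and B
print(("C", "friend_of", "A") in graph)   # True: symmetry filled in the reverse edge
```

Composition and hierarchy would work the same way, as rules over pairs of triples and over a relation-type taxonomy respectively.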

If a contradiction is detected, SymbolicThought highlights it in red and fetches supporting context using RAG (retrieval-augmented generation). Then it reframes the conflict as a multiple-choice prompt for the LLM, improving precision.
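The sketch below shows how such a conflict could be packed into a multiple-choice prompt. It is an illustration under stated assumptions, not the released system: the retrieved passage is made up here, and the RAG retriever and the LLM call sit outside the snippet.

```python
def build_conflict_prompt(edge_a, edge_b, passages):
    """Format two contradictory edges plus retrieved context as a constrained question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Two extracted relationship edges contradict each other.\n"
        f"Relevant passages:\n{context}\n\n"
        f"(A) {' '.join(edge_a)}\n"
        f"(B) {' '.join(edge_b)}\n"
        "(C) Neither edge is supported by the text.\n"
        "Answer with A, B, or C."
    )

# In the real pipeline these passages would come from the RAG retriever.
passages = ["B raised A alone after the war."]
print(build_conflict_prompt(("A", "father_of", "B"),
                            ("A", "daughter_of", "B"),
                            passages))
```

Constraining the model to a small answer set is what turns an open-ended generation problem into a verification problem, which is where the precision gain comes from.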

The Results: More Recall, Less Guesswork

SymbolicThought was tested on 19 narrative texts (biographies, histories, and fiction). Compared with plain prompting, self-consistency, and self-reflection, it delivered consistent F1 improvements across all models tested:

| Model | Prompting F1 | SymbolicThought F1 |
| --- | --- | --- |
| GPT-4.1 | 33.4 | 37.9 |
| GPT-4o-mini | 9.9 | 18.8 |
| Qwen2.5-32B-Ins | 14.8 | 22.5 |

Even better: it outperformed human annotators in both recall and speed. For instance, on biography texts:

  • Human Recall: 67.3% vs. SymbolicThought Recall: 91.4%
  • Average annotation time: 87.2 mins (human) vs. 45.5 mins (with SymbolicThought)

That’s not a marginal gain: annotation time is nearly halved, which roughly doubles throughput.

Why This Matters

This is more than annotation optimization. SymbolicThought points toward a hybrid AI future where statistical models generate, but symbolic systems validate and complete. It’s also a compelling case study in LLM limitations: even the best models struggle with graph structure, logic directionality, and implicit family ties.

For AI systems that aim to understand literature, legal documents, or social behavior — not to mention applications in explainable AI — this blend of LLMs + logic + interface is a powerful triad. It also shows the value of human-in-the-loop systems that highlight what machines miss, rather than blindly trusting their outputs.

Cognaptus: Automate the Present, Incubate the Future.