When Meta and other tech giants scale back content moderation, the gap isn’t just technical—it’s societal. Civil society organizations (CSOs), not corporations, are increasingly on the frontlines of monitoring online extremism. But they’re often armed with clunky tools, academic prototypes, or opaque black-box models. A new initiative—highlighted in Civil Society in the Loop—challenges this status quo by co-designing a Telegram monitoring tool that embeds human feedback directly into its LLM-assisted classification system. The twist? It invites civil society into the machine learning loop, not just the results screen.

From Passive Labels to Active Learning

At the heart of the system is a dual-mode architecture: a BERT-based classifier fine-tuned on domain-specific Telegram data, and prompt-based interaction with general-purpose LLMs. The tool flags conspiracy theory (CT) content in public Telegram channels, assigning a binary label with a confidence score. But unlike typical tools, this one treats user disagreement as fuel for improvement, not noise.
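To make the dual-mode idea concrete, here is a minimal sketch of the two paths in Python, assuming a Hugging Face-style classification pipeline. The model name, prompt wording, and `llm_call` stub are illustrative stand-ins, not details from the paper.

```python
# Sketch of the dual-mode setup: a fine-tuned encoder classifier plus a
# prompt-based LLM alternative. All names here are placeholders.
from transformers import pipeline

# Mode 1: fine-tuned BERT-style classifier returning a binary label + confidence.
# "bert-base-uncased" stands in for whatever domain-tuned checkpoint is used.
clf = pipeline("text-classification", model="bert-base-uncased")

def classify_finetuned(post: str) -> tuple[str, float]:
    result = clf(post)[0]  # e.g. {'label': 'LABEL_1', 'score': 0.93}
    return result["label"], result["score"]

# Mode 2: prompt-based classification with a general-purpose LLM.
PROMPT = (
    "Decide whether the following Telegram post promotes a conspiracy theory.\n"
    "Answer with exactly one word: CT or NOT_CT.\n\nPost: {post}"
)

def classify_prompted(post: str, llm_call) -> str:
    # `llm_call` is any function that sends a prompt string to an LLM and returns text.
    return llm_call(PROMPT.format(post=post)).strip()
```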

Negative feedback becomes a training signal; silence becomes a subtle vote of agreement.

CSOs reviewing these labels can flag incorrect classifications with a click. This feedback is then integrated into a continually evolving gold-standard dataset (sketched in code after the list). Over time, this dataset supports:

  • Fine-tuning: Periodic retraining of the BERT model to capture concept drift.
  • Prompt refinement: Adjusting prompt templates and few-shot exemplars to improve in-context classification.
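As a rough illustration of that loop, the snippet below records one reviewer decision into an append-only gold-standard file; the JSONL store and field names are assumptions for illustration, not the project’s actual schema.

```python
# Hedged sketch: folding reviewer feedback into a growing gold-standard set.
import json
import time
from pathlib import Path

GOLD_PATH = Path("gold_standard.jsonl")

def record_feedback(post_id: str, text: str, model_label: str,
                    model_score: float, reviewer_label: str, reviewer: str) -> None:
    """Append one reviewed example; disagreements are the most informative rows."""
    row = {
        "post_id": post_id,
        "text": text,
        "model_label": model_label,
        "model_score": model_score,
        "gold_label": reviewer_label,
        "disagreement": reviewer_label != model_label,
        "reviewer": reviewer,
        "timestamp": time.time(),
    }
    with GOLD_PATH.open("a", encoding="utf-8") as f:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")

def load_gold() -> list[dict]:
    """Read the accumulated gold set for periodic retraining or prompt curation."""
    with GOLD_PATH.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f]
```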

Fine-Tuning vs. Prompting: A Tradeoff Table

| Category | Fine-Tuning (FT) | Prompting (P) |
| --- | --- | --- |
| Flexibility | Low | High |
| Reproducibility | High | Low |
| Hardware Needs | High (GPU-intensive) | Low (API or local LLMs) |
| Adaptability to Drift | Medium | High |
| Human Effort | High (training pipeline) | Moderate (prompt curation) |
| Multilingual Robustness | Low | High |

While fine-tuning delivers consistency and control, it’s inaccessible to most CSOs without GPU resources. Prompting offers adaptability and ease of deployment, but at the cost of unpredictability and opaque model behavior. Hence, the team experiments with both, adjusting the balance as feedback accumulates and resources allow.
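On the prompting side, one low-cost way to exploit that feedback is to refresh the few-shot exemplars from recent reviewer corrections, so the prompt path adapts without any GPU retraining. The sketch below reuses the gold-standard rows from the earlier snippet; the prompt wording and the four-exemplar budget are illustrative choices, not the project’s template.

```python
# Hedged sketch: rebuild a few-shot prompt from the latest gold-standard corrections.

def build_fewshot_prompt(gold_rows: list[dict], post: str, k: int = 4) -> str:
    # Prefer recent examples where reviewers overruled the model: they encode drift.
    corrections = [r for r in gold_rows if r["disagreement"]][-k:]
    shots = "\n\n".join(
        f"Post: {r['text']}\nLabel: {r['gold_label']}" for r in corrections
    )
    return (
        "Classify Telegram posts as CT (conspiracy theory) or NOT_CT.\n\n"
        f"{shots}\n\nPost: {post}\nLabel:"
    )
```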

Design Challenges: Feedback Isn’t Free

The real innovation isn’t the models—it’s the feedback infrastructure. Three issues stand out:

  1. Negativity Bias: Users tend to provide feedback only when they disagree. But positive confirmations are also needed to train a balanced model.
  2. Multi-User Conflict: When multiple users label the same post differently, whose judgment should prevail? The paper suggests building conflict resolution interfaces.
  3. Implied Agreement: Should we treat a user’s scroll-past or click-through as silent approval? The team is exploring implicit feedback weighting, but it opens up both technical and ethical dilemmas (one possible weighting rule is sketched below).

These are not mere UX issues—they’re questions of epistemology. Who decides what is ‘harmful’? Can disagreement be quantified into gradients of model retraining?
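One deliberately simple way to operationalize items 2 and 3 is a weighted vote: explicit flags count fully, implied agreement counts a little, and ties escalate to a human. The weights and tie-handling below are placeholder assumptions, not the paper’s design.

```python
# Hedged sketch: weighted voting over explicit and implicit feedback events.
from collections import defaultdict
from typing import Optional

def resolve_label(events: list[dict]) -> Optional[str]:
    """Each event looks like {"label": "CT" or "NOT_CT", "kind": "explicit" or "implied"}."""
    weights = {"explicit": 1.0, "implied": 0.2}   # implied agreement counts far less
    score = defaultdict(float)
    for e in events:
        score[e["label"]] += weights[e["kind"]]
    if not score:
        return None                               # no feedback yet
    ranked = sorted(score.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None                               # tie: escalate to a conflict-resolution UI
    return ranked[0][0]
```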

Toward Democratic AI Monitoring

This system is more than just a Telegram scraper with AI bolted on. It’s an experiment in democratic alignment—testing whether real-world, low-resource stakeholders like civil society groups can shape the behavior of powerful models.

Several promising directions emerge:

  • Hybrid pipelines that pre-label obvious cases and route ambiguous ones to humans (see the routing sketch after this list).
  • Federated collaboration across CSOs, pooling annotated data without centralization.
  • Beyond classification: extending prompting to summarization and question-answering.
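A confidence-threshold router is the simplest version of that first idea: let the model keep its label when it is sure, and queue everything else for CSO reviewers. The 0.9 cut-off and queue structure below are illustrative assumptions.

```python
# Hedged sketch: route posts by classifier confidence.

def route(post: str, label: str, confidence: float,
          auto_queue: list, review_queue: list, threshold: float = 0.9) -> None:
    """Pre-label obvious cases; send ambiguous ones to human reviewers."""
    item = {"post": post, "label": label, "confidence": confidence}
    if confidence >= threshold:
        auto_queue.append(item)     # obvious case: keep the model's label
    else:
        review_queue.append(item)   # ambiguous case: a human makes the call
```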

As AI governance becomes a buzzword, this project delivers something more grounded: governance by usage, not just policy. Rather than forcing humans to adapt to machine constraints, it adapts machines to the contextual wisdom of human monitors.


Cognaptus: Automate the Present, Incubate the Future.