CaTCCH
The Problem
Existing content moderation tools focus on abusive language, misinformation, and calls for violence, but they do not measure how content affects democratic health. CaTCCH expands this scope by targeting two forms of civic harm:
Dehumanization – Content that strips groups or individuals of dignity, reinforcing division.
Partisan Violence – Content that normalizes or incites political violence.
While many classifiers label harmful content, they rarely measure its actual impact on user attitudes and behaviors. CaTCCH shifts the focus from simple content tagging to outcome-based evaluation, so that classifiers are judged by whether they identify content that actually threatens democracy.
Our Approach: Building and Testing Smarter Classifiers
CaTCCH goes beyond traditional moderation approaches by developing a rigorous testing framework that evaluates how content influences real-world civic engagement.
Key Features:
Testing Classifier Effectiveness – Compares leading classifiers (e.g., Moral Foundations, Google Perspective, Toxic-bert) to see which best identifies content that affects civic health.
Experimental Field Testing – Uses browser-based experiments to measure how exposure to classified content changes user beliefs and behaviors over time.
LLM-Powered Innovations – Integrates AI-driven classification models (e.g., GPT-4) to improve detection of both harmful and prosocial content.
Outcome-Based Approach – Evaluates classifiers not just on accuracy, but on real-world civic impact, ensuring they reduce polarization without enabling censorship.
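To make the outcome-based comparison concrete, here is a minimal sketch of one way classifiers could be ranked by how closely their scores track a measured change in user attitudes. It is an illustration only, not CaTCCH's actual method: the metric (Pearson correlation), the function names `pearson` and `rank_classifiers`, the placeholder classifier names, and all of the numbers are assumptions made up for this example.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def rank_classifiers(scores_by_classifier, outcome_shift):
    """Sort classifiers by how strongly their scores track the outcome.

    scores_by_classifier: dict mapping a classifier name to its score
        for each content item (hypothetical structure).
    outcome_shift: measured change in an attitude or behavior after
        exposure to each item, in the same item order (hypothetical).
    """
    ranked = [
        (name, pearson(scores, outcome_shift))
        for name, scores in scores_by_classifier.items()
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Toy values invented for illustration; they are not experimental results.
outcome_shift = [0.1, 0.4, 0.2, 0.8, 0.5]
scores_by_classifier = {
    "classifier_a": [0.2, 0.5, 0.3, 0.9, 0.6],
    "classifier_b": [0.9, 0.1, 0.7, 0.2, 0.4],
}

ranked = rank_classifiers(scores_by_classifier, outcome_shift)
for name, r in ranked:
    print(f"{name}: r = {r:.2f}")
```

In this toy setup, the classifier whose scores rise and fall with the measured outcome ranks first, regardless of how well it matches human content labels. A real evaluation would also need experimental controls and significance testing, which this sketch leaves out.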
Why This Matters Now
With AI-generated content flooding online spaces, the risk of toxic discourse scaling beyond human moderation is growing. CaTCCH helps platforms, researchers, and policymakers understand how content impacts democracy—and how to intervene responsibly.
By moving beyond simple content labeling and toward measurable civic impact, CaTCCH ensures that online interventions actually strengthen democratic norms, rather than just removing problematic content.