It’s up to you to decide how to use the output within your broader moderation system.
Understanding the Output: For each piece of content submitted, PolicyAI provides a classification label (the determined policy category), reasoning, and a severity score indicating how severe the model judges the violation to be.
Severity scores are a useful additional signal for triage, for prioritizing human review and oversight, and for auditing automated decisions.
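As an illustration, handling a parsed result might look like the sketch below. The response shape and field names (`label`, `reasoning`, `severity`) are assumptions for illustration, not the documented schema; consult the API reference for the actual format.

```python
# A minimal sketch of reading a PolicyAI result. The field names
# ("label", "reasoning", "severity") are hypothetical placeholders,
# not the documented schema.
import json

raw = '{"label": "harassment", "reasoning": "Targeted insult directed at another user.", "severity": 0.82}'

result = json.loads(raw)
print(result["label"])      # policy category, e.g. "harassment"
print(result["reasoning"])  # the model's explanation for the label
print(result["severity"])   # assumed 0.0-1.0; higher means more severe
```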
Potential Workflow Integrations:
- Automated Actions: Integrate PolicyAI’s API to trigger actions based on classification and severity score, such as automatically removing the most severe violations; see the routing sketch after this list.
- Human Review Queueing: Use PolicyAI outputs to populate dedicated queues for human moderators, prioritizing content by predicted category and severity (also covered by the routing sketch below).
- Data Analysis and Reporting: Store PolicyAI outputs to analyze trends in violations, understand the volume of different content types, and report on moderation effectiveness; a small aggregation sketch follows the list.
- Feedback Loops: Integrate human review decisions back into your system to update golden sets and inform policy refinements; see the final sketch below.
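For the first two integrations, a minimal routing sketch might look like the following. It reuses the hypothetical result shape from earlier; the thresholds and queue names are assumptions to tune against your own policies.

```python
# A sketch of severity-based routing for automated actions and human
# review queueing. Thresholds and queue names are assumptions.
REMOVE_THRESHOLD = 0.9  # assumed cutoff for automatic removal
REVIEW_THRESHOLD = 0.5  # assumed cutoff for human review

def route(result: dict) -> str:
    severity = result["severity"]
    if severity >= REMOVE_THRESHOLD:
        return "auto_remove"  # automated action for the most severe violations
    if severity >= REVIEW_THRESHOLD:
        # per-category queue so moderators can specialize
        return f"review_queue:{result['label']}"
    return "allow"

print(route({"label": "harassment", "reasoning": "...", "severity": 0.82}))
# -> review_queue:harassment
```

Routing only the most severe content to automated action while sending mid-severity content to per-category queues keeps moderator attention on the cases where human judgment matters most.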
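For data analysis and reporting, a simple aggregation over a stored log of outputs could track violation volume per category. The JSONL log file and its fields here are illustrative assumptions.

```python
# A sketch of trend analysis over stored PolicyAI outputs, assuming each
# result was appended as one JSON object per line. Paths and field names
# are illustrative.
import json
from collections import Counter

def violation_counts(path: str = "policyai_log.jsonl") -> Counter:
    counts: Counter = Counter()
    with open(path) as f:
        for line in f:
            result = json.loads(line)
            counts[result["label"]] += 1  # volume per policy category
    return counts

# Example output: Counter({"spam": 412, "harassment": 88, "hate": 17})
```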
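Finally, for feedback loops, here is one sketch of recording reviewer decisions alongside the model’s label, assuming a plain JSONL file as the golden-set store; all names are illustrative.

```python
# A sketch of feeding human review decisions back into a golden set,
# assuming a JSONL file as the store. Function and field names are
# illustrative.
import json

def record_review(content_id: str, model_label: str, human_label: str,
                  path: str = "golden_set.jsonl") -> None:
    entry = {
        "content_id": content_id,
        "model_label": model_label,
        "human_label": human_label,
        "agreed": model_label == human_label,  # disagreements flag policy gaps
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

record_review("c_123", model_label="harassment", human_label="harassment")
```

Tracking where the model and reviewers disagree is what makes the golden set useful for policy refinement.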