It’s up to you to decide how to use the output within your broader moderation system.
Understanding the Output: For each piece of content submitted, PolicyAI provides a binary Clear or Flag decision and a classification label (the determined policy category). Additional outputs are available, including a reason, an analysis, a **severity score** indicating how severe the model judges the violation to be, and a **risk score** indicating how certain the model is that the content contains a policy violation. You can also create custom outputs tailored to your needs.
Severity and risk scores can serve as additional signals for triage, prioritizing human review and oversight, and double-checking decisions.
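As a minimal sketch of how these scores might drive triage, the snippet below routes results into hypothetical moderation queues. The field names (`decision`, `label`, `severity`, `risk`), thresholds, and queue names are illustrative assumptions, not PolicyAI's actual response schema; adapt them to the outputs you configure.

```python
# Illustrative triage logic. Field names and thresholds are assumptions,
# not PolicyAI's actual schema -- adjust to your configured outputs.

def triage(result: dict, severity_cutoff: float = 0.8, risk_cutoff: float = 0.5) -> str:
    """Route a moderation result into a hypothetical action queue."""
    if result["decision"] == "Clear":
        return "publish"
    if result["severity"] >= severity_cutoff:
        return "remove"        # high-severity violation: act immediately
    if result["risk"] < risk_cutoff:
        return "human_review"  # model is uncertain: escalate to a person
    return "restrict"          # confident but lower-severity: limit reach

example = {"decision": "Flag", "label": "spam", "severity": 0.3, "risk": 0.9}
print(triage(example))  # -> restrict
```

The key design choice here is using risk (certainty) to decide *who* reviews and severity to decide *what action* is taken, so human attention concentrates on uncertain cases.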
Potential Workflow Integrations: