Welcome to PolicyAI! This guide will help you understand how to effectively use Large Language Models (LLMs) within PolicyAI to power your content moderation policies. Using LLMs offers significant flexibility and scalability, but crafting effective policies requires understanding best practices tailored to how these models interpret instructions.
<aside> 💡
Important Disclaimer: These tips and strategies are based on current observations and performance with specific models. They may not generalize directly to new models or contexts and should always be validated through testing before being applied in production. LLM behavior can be nuanced and sometimes unpredictable.
</aside>
https://www.loom.com/share/c1600fab777a4aae853d14f4842efe02
This section covers the basic steps to get your policy text into PolicyAI.
Navigate to the Manage Policies section within the PolicyAI user interface.
Click the "Create New Policy" button. You will be prompted to name your policy and add a description. This is a good place to put the version number, date, etc. The “Preset” tab includes a simple default policy that you can use to get started, if you want.

You will see a large text area where you can paste or type your policy definition.

Your policy should be formatted using headings (e.g., ## Category Name) to clearly delineate each policy category, as instructed in the prompt introduction template. Ensure category names follow the naming conventions (no special characters). Any line formatted as a heading will populate a Moderation Category in the sidebar.
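For example, a minimal policy skeleton using this heading convention might look like the following (the category names and definition text are illustrative placeholders, not recommended wording):

```
## Hate Speech
Content violates this policy if it contains hate speech as defined below...

## Spam
Content violates this policy if it is an unsolicited commercial promotion...
```

Each heading would appear as its own Moderation Category in the sidebar.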
Periodically save your policy draft as you work.
The definitions within your policy categories need to be unambiguous for the LLM.
<aside> 💡
Example: When reviewing results for a policy that simply said “No hate speech”, we saw the LLM’s reasoning go both ways for similar content — the negative phrasing left it unclear whether matching content should or should not be flagged. Instead, phrase it positively or as a rule the content must not break (e.g., "Content violates this policy if it contains hate speech as defined below..."); a short before-and-after sketch follows this note.
</aside>
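To make the contrast concrete, here is an illustrative before-and-after for the same category (the wording is a sketch, not prescribed policy text):

```
Avoid:   "No hate speech"
Prefer:  "Content violates this policy if it contains hate speech as defined below..."
```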
In areas where your policy is unclear, vague, or doesn't explicitly cover a scenario, you will often observe the LLM giving inconsistent results for similar content. This is a helpful signal that the policy itself needs refinement to be more definitive in that scenario.
Important: Don’t instruct LLMs to make decisions that require information they cannot possibly have from reviewing a single piece of content in isolation, or that hinge on undefined values or subjective words.
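As a hypothetical illustration (the instruction below is invented for this guide, not taken from a real policy), the following asks for information the model cannot see when reviewing one piece of content in isolation:

```
Avoid:   "Flag this content if the author is a repeat offender."
         (The model has no account history or prior decisions unless you
         include that information in the input it reviews.)
Prefer:  "Content violates this policy if it contains <behavior that is
         observable in the content itself and defined explicitly below>."
```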
<aside> 💡
Examples of things LLMs struggle to assess or reference without explicit input:
</aside>
For more information on policy engineering and structuring policies, see our article here.