Guardrails grant you more control over the content that is sent to and generated by LLM models. Use guardrails to check for harmful topics, profane or undesirable words, and sensitive information like email addresses, as well as to check how relevant and accurate a generated response is.
You need an administrator or publisher role on your team to create guardrails. See Guardrail permissions.
Developer required
Your chatbot must use scripts to check content with guardrails. We recommend contacting inGenious AI support for assistance adding guardrails to your chatbot.
You can create two types of guardrails:
- Input guardrails check content before it is sent to an LLM model.
For example, checking user utterances before they're sent to the rewrite utterance task.
- Output guardrails check content that has been generated by an LLM model.
For example, checking utterances that have been rewritten by the rewrite utterance task.
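As a rough illustration of where each guardrail type sits, the sketch below wraps an LLM task with an input check and an output check. It is not the inGenious AI scripting API: `run_llm_task`, `GuardrailCheck`, and the callables passed in are hypothetical names, and the sketch assumes both guardrails are set to block detected content.

```python
from typing import Callable, Optional

# Hypothetical signature: a guardrail check returns True when the content is blocked.
GuardrailCheck = Callable[[str], bool]


def run_llm_task(
    utterance: str,
    llm_task: Callable[[str], str],
    input_guardrail: GuardrailCheck,
    output_guardrail: GuardrailCheck,
) -> Optional[str]:
    """Return the LLM output, or None if either guardrail blocks the content."""
    # Input guardrail: check the content before it is sent to the LLM.
    if input_guardrail(utterance):
        return None  # the LLM task is never called

    generated = llm_task(utterance)

    # Output guardrail: check the content that the LLM generated.
    if output_guardrail(generated):
        return None  # the generated content is discarded

    return generated
```

In this sketch a `None` result is the point at which a script could hand over to a passage that handles blocked content.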
Each guardrail has a set of filters that control what content it detects and the actions the chatbot takes. You can use different guardrails for different LLM tasks, and you can create as many guardrails as you need.
To create guardrails, you must have:
- Generative AI configurations enabled for your chatbot.
- Guardrails enabled for your chatbot.
- Scripts to connect guardrail detection to your chatbot.
Contact inGenious AI support for help with adding guardrails to your chatbot.
Guardrails must be published before they can be used by your chatbot.
You can:
- Create a guardrail.
- Duplicate an existing guardrail as a new input or output guardrail.
- Rename a guardrail.
- Delete a guardrail.
Guardrail filters
Each guardrail filter controls the action the chatbot takes when it detects a particular type of content. You can configure a filter to:
- Flag and block the content so that the chatbot can respond to it differently. Your script can start a passage designed to handle this type of content instead.
  - If the filter is on an input guardrail, the chatbot does not activate the LLM model.
  - If the filter is on an output guardrail, the chatbot discards the content it received from the LLM model.
- Flag and mask the content.
  Before processing continues, this replaces the content with placeholder text that indicates what the original content was, such as replacing an email address with "[EMAIL]".
  This option is only available for Sensitive info filters.
- Flag the content but allow processing to continue.
  This is useful if you want to detect the content for monitoring or analysis without changing the chatbot's behaviour.
- Do nothing.
  This option is only available for Denied content (harmful categories).
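The difference between the block, mask, and flag actions can be sketched for a Sensitive info filter as follows. This is an illustrative model only: `Action`, `FilterResult`, `apply_sensitive_info_filter`, and the simplified email pattern are assumptions made for the example, not the product's detector or API. The "Do nothing" action is left out because it applies only to Denied content (harmful categories).

```python
import re
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    BLOCK = "block"  # flag the content and stop processing
    MASK = "mask"    # flag the content and replace it with a placeholder
    FLAG = "flag"    # flag the content and continue unchanged


@dataclass
class FilterResult:
    flagged: bool  # True if the filter detected sensitive content
    blocked: bool  # True if processing should stop
    content: str   # content to carry forward (masked if applicable)


# Simplified email pattern for illustration only.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def apply_sensitive_info_filter(content: str, action: Action) -> FilterResult:
    """Apply the configured action when an email address is detected."""
    if EMAIL_PATTERN.search(content) is None:
        return FilterResult(flagged=False, blocked=False, content=content)
    if action is Action.BLOCK:
        return FilterResult(flagged=True, blocked=True, content=content)
    if action is Action.MASK:
        masked = EMAIL_PATTERN.sub("[EMAIL]", content)
        return FilterResult(flagged=True, blocked=False, content=masked)
    # Action.FLAG: record the detection but leave the content untouched.
    return FilterResult(flagged=True, blocked=False, content=content)
```

For example, `apply_sensitive_info_filter("Mail jane.doe@example.com now", Action.MASK).content` evaluates to `"Mail [EMAIL] now"`, while the same call with `Action.FLAG` returns the original text with `flagged` set to `True`.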
Some filters and filter configurations are only available to specific types of guardrails. You can configure filters to detect:
- Denied content (harmful categories)
- Denied content (custom topics)
- Profanity
- Denied words
- Sensitive info
- Grounding validation (output guardrails only)
- Relevance validation (output guardrails only)
Guardrail permissions
| Action | Reviewer | Editor | Publisher | Admin |
|---|---|---|---|---|
| Create a guardrail | - | - | ✓ | ✓ |
| Edit a guardrail filter | - | - | ✓ | ✓ |
| Duplicate a guardrail | - | - | ✓ | ✓ |
| Delete a guardrail | - | - | ✓ | ✓ |
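If a script or internal tool needs to mirror this table, one minimal way to express it is a role lookup. The names `GUARDRAIL_MANAGER_ROLES` and `can_manage_guardrails` are hypothetical, and the sketch assumes roles are passed as plain strings; the platform enforces the real permissions itself.

```python
# Roles that the permissions table marks with a tick for every guardrail action.
GUARDRAIL_MANAGER_ROLES = {"publisher", "admin"}


def can_manage_guardrails(role: str) -> bool:
    """Return True if the role can create, edit, duplicate, or delete guardrails."""
    return role.strip().lower() in GUARDRAIL_MANAGER_ROLES
```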