Description
Currently, the completions and chat completions endpoints only support text contents detectors. When there are detections, we set warnings to a generic message, "Unsuitable [inputs/outputs] detected...". This message is static and gives the user no additional information.
As we are expanding these endpoints to support additional detector types (for output detections), this "unsuitable content" warning message isn't always applicable and the logic to determine when a warning should be set becomes potentially complex.
For example, with Granite Guardian risks, a high/low score or a yes/no detection may be a "good" or a "bad" thing depending on the risk type and context. A high answer relevance score is "good", whereas a high PII/HAP/harm score is "bad", so we cannot use a single score threshold to decide when to apply a warning. Confidence level (provided in the metadata for Granite Guardian) also needs to be factored in, since "low" confidence could mean the results are not accurate. Because the logic for deciding when a warning applies is opinionated and varies by detector and risk type, I suggest that we deprecate the warnings field and let the client decide how to interpret and handle detection results.
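To illustrate why this decision fits better on the client side, here is a minimal sketch of a per-risk policy a client might write. Everything in it is an assumption made for illustration: the `RiskPolicy` class, the `needs_attention` helper, the risk names, the thresholds, and the `detection_type`, `score`, and `metadata.confidence` field names are hypothetical and not taken from the actual response schema.

```python
from dataclasses import dataclass


# Hypothetical client-side policy. Risk names, thresholds, and field names
# are illustrative assumptions, not the API's schema.
@dataclass(frozen=True)
class RiskPolicy:
    threshold: float
    high_is_bad: bool  # True for e.g. harm; False for e.g. answer relevance


POLICIES = {
    "harm": RiskPolicy(threshold=0.5, high_is_bad=True),
    "answer_relevance": RiskPolicy(threshold=0.5, high_is_bad=False),
}


def needs_attention(detection: dict) -> bool:
    """Return True if the client should act on this detection."""
    policy = POLICIES.get(detection["detection_type"])
    if policy is None:
        # Unknown risk type: surface it rather than silently ignore it.
        return True
    confidence = detection.get("metadata", {}).get("confidence")
    if confidence == "low":
        # The result may be inaccurate, so flag it for review.
        return True
    score = detection["score"]
    if policy.high_is_bad:
        return score >= policy.threshold
    return score < policy.threshold
```

The direction of the comparison flips per risk type, which is the part a single server-side warning message cannot express.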
Implications
Current users of the warnings field would need to update their logic to check for [input/output] detections instead. For example, instead of checking if len(response.warnings) > 0 ... (or checking whether it contains the unsuitable content warning), they would check if len(response.detections.input) > 0 or if len(response.detections.output) > 0.
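The migration could look roughly like the sketch below. The `has_detections` helper and the stand-in `response` object are hypothetical; only the `response.detections.input` and `response.detections.output` attribute paths come from the description above, and a real client would use whatever response type its SDK returns.

```python
from types import SimpleNamespace


def has_detections(response) -> bool:
    """Check input/output detections instead of the deprecated warnings field."""
    detections = getattr(response, "detections", None)
    if detections is None:
        return False
    # Treat a missing or null list the same as an empty one.
    inputs = getattr(detections, "input", None) or []
    outputs = getattr(detections, "output", None) or []
    return len(inputs) > 0 or len(outputs) > 0


# Stand-in response object for illustration only (not a real API response).
response = SimpleNamespace(
    detections=SimpleNamespace(
        input=[],
        output=[{"detection_type": "example", "score": 0.9}],
    )
)

if has_detections(response):
    print("Detections present; apply client-side handling.")
```

Guarding against a missing `detections` attribute is a defensive choice in this sketch, in case the field is omitted when there are no detections.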