Document Identification

I. NSFW (Not Safe for Work)
This section describes the controls and settings for detecting and managing NSFW content in text and images.
Detection categories: OPSWAT PDLP can detect the following categories:
- Suggestive Content: Images that are sexually suggestive but do not depict explicit sexual acts.
- Explicit Content: Images depicting explicit sexual acts or pornography.
- Animated Adult Content: Hentai and other animated images with pornographic themes.
Functionality: Enables or disables the detection of NSFW content in both textual and visual data.
Optical Character Recognition (OCR): Scans and recognizes text within images. This helps identify NSFW text that might be embedded in graphics.
- OCR Quality: Determines the accuracy and efficiency of the OCR process.
  - Normal: Detects information without pre-processing images.
  - Best: Pre-processes images before detection to improve the detection rate, at the cost of performance.
Redact Hits: Automatically redacts (blurs or blacks out) the detected NSFW content.
- Redaction Threshold: Defines the sensitivity level for redaction.
- Allow if detections at or above threshold are redacted: Lets the content continue through the flow without interruption once the detections at or above the threshold have been redacted.
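The interaction between the Redaction Threshold and the allow option can be sketched in Python. This is a minimal illustration and not OPSWAT's implementation: the `Detection` and `RedactionSettings` types, the 0.0 to 1.0 confidence score, the default threshold of 0.7, and the "allow"/"block" return values are all assumptions made for the example. It also assumes that content with no detection at or above the threshold is allowed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One NSFW hit. `score` is a hypothetical 0.0-1.0 confidence value."""
    label: str
    score: float
    redacted: bool = False


@dataclass
class RedactionSettings:
    """Hypothetical mirror of the Redact Hits options described above."""
    redact_hits: bool = True
    redaction_threshold: float = 0.7   # assumed scale and default
    allow_if_redacted: bool = True


def apply_redaction(detections: List[Detection], settings: RedactionSettings) -> str:
    """Mark qualifying hits as redacted and return "allow" or "block"."""
    hits = [d for d in detections if d.score >= settings.redaction_threshold]
    if not hits:
        return "allow"   # assumption: nothing at or above the threshold
    if not settings.redact_hits:
        return "block"   # assumption: unredacted hits stop the content
    for d in hits:
        d.redacted = True   # stand-in for blurring or blacking out the region
    return "allow" if settings.allow_if_redacted else "block"
```

With the assumed defaults, a file containing one hit scored 0.9 and one scored 0.4 has only the first hit redacted and is then allowed.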
Example
The following example shows toxic text detection in action, flagging words that are not safe for work.

This feature uses AI functionality to analyze both images and text to determine whether the content is toxic or not safe for work. You have the option to enable or disable the AI-powered feature within the configuration settings. By default, this feature is disabled but can be activated or deactivated at any time based on user preference. If you enable the AI-powered feature, OPSWAT will not use your content to train or fine-tune its services. You should not rely on any results generated from AI-based functionality without verifying them.
II. Personal Document Configuration
This section covers document identification: determining whether uploaded items are personal documents, such as passports, ID cards, or driver's licenses. Based on the detection result, customers can choose to block or allow the content.
Functionality: Enables or disables the automatic detection of personal documents, such as IDs, driver's licenses, and other sensitive ID cards.
Default Behaviour: Defines the action taken when a personal document is detected
- Allow: Permits identified images. Only images verified as personal documents are allowed.
- Block: Restricts identified images. Images identified as personal documents are blocked.
Example

With Personal Document Identification enabled and the Default Behaviour set to Allow:
United States of America Passport Card: Allowed
United States of America Dog Passport Card: Blocked
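The two outcomes above can be reproduced with a small decision function. This is a sketch under stated assumptions and not the product's code: `DefaultBehaviour`, `decide`, and the boolean `is_personal_document` input are hypothetical names, and the result for a non-personal document under Block is an assumption, since the text only describes what happens to identified personal documents.

```python
from enum import Enum


class DefaultBehaviour(Enum):
    """Hypothetical representation of the Default Behaviour setting."""
    ALLOW = "allow"
    BLOCK = "block"


def decide(is_personal_document: bool, behaviour: DefaultBehaviour) -> str:
    """Return "Allowed" or "Blocked" for one uploaded image."""
    if behaviour is DefaultBehaviour.ALLOW:
        # Only images verified as personal documents are allowed.
        return "Allowed" if is_personal_document else "Blocked"
    # Block: personal documents are blocked.
    # Assumption: anything else passes through.
    return "Blocked" if is_personal_document else "Allowed"


# A real passport card is verified as a personal document; a novelty
# dog passport card is not.
print(decide(True, DefaultBehaviour.ALLOW))
print(decide(False, DefaultBehaviour.ALLOW))
```

In this sketch the verification step itself (the AI-based classification) is reduced to a boolean so that only the allow/block logic is shown.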
This feature uses AI functionality to differentiate between various documents and identify whether a document contains a person's government-issued identification. You have the option to enable or disable the AI-powered feature within the configuration settings. By default, this feature is disabled but can be activated or deactivated at any time based on user preference. If you enable the AI-powered feature, OPSWAT will not use your content to train or fine-tune its services. You should not rely on any results generated from AI-based functionality without verifying them.