Proactive DLP release notes
v3.0.0
Release date: 07/14/2025
- Tagging system: DLP can now add metadata tags to processed PDF, Word, Excel, and PowerPoint files, as well as images
- AI scan is now available for Word and PowerPoint files as well
- DLP now supports redaction for JSON files
- MDB and ACCDB file types are now supported for scanning
- Indian PAN and Aadhaar cards have been added as predefined options
v2.23.1
Release date: 05/12/2025
- Added support for Amazon Linux 2023 and Oracle Linux
- Resolved DLL loading issue on Windows Server 2016
- Fixed failure in NSFW scan under specific conditions
- Improved stability and error handling for large PDF scans
v2.23.0
Release date: 04/08/2025
- Hits can now be hashed in the output document for Word, PDF, and plain text files
- Toxic text detection is now available in more languages, including French, Spanish, Turkish, Italian, Russian, and Portuguese
- More NSFW categories introduced, including guns and violence
- More PII types are available via AI scanning, including NHS Number and UK Electoral Roll Number
- Resolved an issue where enabling AI scan previously disabled regex scan. Both scans now function concurrently
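As a rough illustration of what hashing hits in plain text could look like, here is a minimal sketch. The SSN-like pattern, the `hash_hits` helper, the choice of SHA-256, and the truncated digest length are all assumptions made for the example and do not describe the product's actual behavior.

```python
import hashlib
import re

# Hypothetical pattern for illustration only: a US-SSN-like number.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def hash_hits(text: str, pattern: re.Pattern = SSN_LIKE, digest_chars: int = 12) -> str:
    """Replace every match of `pattern` with a truncated SHA-256 hex digest."""

    def _replace(match: re.Match) -> str:
        digest = hashlib.sha256(match.group(0).encode("utf-8")).hexdigest()
        return digest[:digest_chars]

    return pattern.sub(_replace, text)


print(hash_hits("Employee SSN: 123-45-6789"))
```

Because the same input always yields the same digest, identical hits remain correlatable in the output without exposing the original value.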
v2.22.1
Release date: 02/05/2025
- DLP no longer uses "/tmp" directory in order to fully support non-root Docker
- The result UI page is now more comprehensive, detailing scan types and processing results
- Switching on AI scan no longer interferes with other features
v2.22.0
Release date: 01/13/2025
Intentional Leakage Detection
- Small Font Size Recognition: Detect intentional data leaks by finding text hidden using very small font sizes in PDF, Word, and Excel files
- Invisible Text Recognition: Detect intentional data leaks by identifying hidden text where the text color and background color are the same or very similar in PDF, Word, and Excel files.
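The two checks above can be sketched as follows. This is an illustration only: the `TextRun` structure, the 2 pt font threshold, and the Euclidean RGB distance threshold are hypothetical choices, not the thresholds or method the engine uses.

```python
from dataclasses import dataclass
from typing import Tuple

RGB = Tuple[int, int, int]


@dataclass
class TextRun:
    """Hypothetical description of a run of text extracted from a document."""
    text: str
    font_size_pt: float
    text_rgb: RGB
    background_rgb: RGB


def color_distance(a: RGB, b: RGB) -> float:
    """Euclidean distance between two RGB colors (0 means identical)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def is_suspicious(run: TextRun, min_font_pt: float = 2.0,
                  max_color_distance: float = 30.0) -> bool:
    """Flag text that is very small or nearly the same color as its background."""
    too_small = run.font_size_pt < min_font_pt
    invisible = color_distance(run.text_rgb, run.background_rgb) <= max_color_distance
    return too_small or invisible
```

For example, white text on a white background has a color distance of 0 and would be flagged regardless of font size.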
New predefined sensitive information types are now supported for detection and redaction:
- Turkish Passport Numbers
- Turkish Phone Numbers
- Turkish ID Numbers (TC Kimlik)
- IMEI/IMEISV (International Mobile Equipment Identity)
- Israeli ID Numbers
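Identifiers such as these are commonly validated with a checksum after a pattern match. As one example, a 15-digit IMEI ends in a Luhn check digit. The sketch below shows that check; the function names are hypothetical, it does not cover the 16-digit IMEISV (which carries no check digit), and it is not a description of how the engine validates hits.

```python
def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def looks_like_imei(candidate: str) -> bool:
    """Check length and Luhn checksum of a 15-digit IMEI candidate."""
    digits = candidate.replace("-", "").replace(" ", "")
    return len(digits) == 15 and luhn_valid(digits)
```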
Fixed an issue where AI scanning disrupted regex-based scanning
v2.21.1
Release date: 11/04/2024
- Debian 12 and Rocky Linux 9.4 are now supported
- Fixed AI scan failure when a PDF file contains an empty page
- Fixed hit validation failure when only a TSV file is used for PDLP
v2.21.0
Release date: 10/02/2024
Anonymization is now supported for PCAP and PCAPNG files
PII detection using Artificial Intelligence
- Supported PII: Driver’s license, passport number, and national ID number
- Languages Supported: English, French, Spanish, German, Italian, and Portuguese
- Supported File Types: PDF and text files
Fixed an issue with detection policy updates
v2.20.0
Release date: 07/25/2024
Support the .jp2 file type.
New predefined sensitive information types are now supported for detection and redaction:
- Australia Medicare Number
- Australia Company Number
- Australia Business Number
- Australia Tax File Number
- UK NHS Number
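To show why a checksum helps reduce false positives for number types like these, here is a sketch of the Modulus 11 check used by UK NHS Numbers. The function name is hypothetical and the code is an illustration, not the engine's validator.

```python
def nhs_number_valid(candidate: str) -> bool:
    """Validate a 10-digit NHS Number with the Modulus 11 check digit."""
    digits = candidate.replace(" ", "").replace("-", "")
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weights 10 down to 2 are applied to the first nine digits.
    total = sum(int(d) * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    if check == 10:
        # A computed check digit of 10 means the number is invalid.
        return False
    return check == int(digits[9])
```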
The localization select list in the UI has been replaced with a country-specific sensitive information select list.
The US-SSN has been divided into US-ITIN and US-SSN (they are still referred to as SSN in the result JSON).
The JP-SSN has been renamed to JP-MYNUM (it is still referred to as SSN in the result JSON).
v2.19.1
Release date: 06/13/2024
- Improve PDF metadata dictionary processing.
- Detection Policy improvements: operator precedence, new operators, and support for parentheses.
- Custom regex file updates no longer require an engine restart.
- Fix date regex issue in Excel files using TSV files.
v2.19.0
Release date: 04/11/2024
- Support multi-frame GIF files with OCR.
- Support .loc files, FDF files, and Acrobat Forms.
- Support recursive PDF processing (for embedded PDF only).
- Improve Microsoft Office 2007 object processing, including linked objects, tracked changes, background images, SmartArt.
- Improve PDF stamp annotation and form processing.
- Fix watermark scan bug.
v2.18.1
Release date: 02/15/2024
- Implement more sophisticated TSV file handling related to the “custom regexes from file” feature.
- Refine detection of SWIFT codes, reducing both false positives and false negatives.
- Resolved interruption issues related to hit limits during embedded image processing.
- Removed duplicate results that occur when using the best quality OCR.
- Fix redact settings so that they are independent of TSV file loading.
- Fix a memory leak
v2.18
Release date: 01/04/2024
Importing custom regular expressions from a file is now supported.
New predefined sensitive information types are now supported for detection and redaction:
- ABA Routing Number
- U.S. Bank Account
- International Banking Account Number (IBAN)
- International Securities Identification Number (ISIN)
- SWIFT Code
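Two of these types have well-known public checksums, sketched below for illustration: the ABA routing number uses a 3-7-1 weighted sum modulo 10, and the IBAN uses a mod 97 check. The function names are hypothetical, the IBAN check ignores country-specific lengths, and neither function describes the engine's own validation.

```python
def aba_routing_valid(candidate: str) -> bool:
    """Validate a 9-digit ABA routing number with the 3-7-1 weighted checksum."""
    if len(candidate) != 9 or not (candidate.isascii() and candidate.isdigit()):
        return False
    d = [int(c) for c in candidate]
    total = 3 * (d[0] + d[3] + d[6]) + 7 * (d[1] + d[4] + d[7]) + (d[2] + d[5] + d[8])
    return total % 10 == 0


def iban_valid(candidate: str) -> bool:
    """Validate an IBAN with the mod 97 check (no country-specific length rules)."""
    iban = candidate.replace(" ", "").upper()
    if not (15 <= len(iban) <= 34) or not (iban.isascii() and iban.isalnum()):
        return False
    # Move the country code and check digits to the end, then map A..Z to 10..35.
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(char, 36)) for char in rearranged)
    return int(numeric) % 97 == 1
```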
OCR capabilities have been improved.
SSE4.2 is now also accepted by Proactive DLP (in addition to SSE4.1) as the required CPU instruction set for OCR.
File metadata handling for scan requests has been improved.
Proactive DLP’s UI workflow settings are now unified across all officially supported MetaDefender Core versions.
Proactive DLP’s UI workflow settings for detection have been redesigned to provide a more compact look.
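For the OCR CPU requirement noted in this release (SSE4.1 or SSE4.2), a quick way to check a Linux x86 host is to inspect the `flags` line of `/proc/cpuinfo`, where these instruction sets appear as `sse4_1` and `sse4_2`. The helper below is a hypothetical sketch, not a tool shipped with the product.

```python
def ocr_cpu_supported(cpuinfo_text: str) -> bool:
    """Return True if the cpuinfo text lists SSE4.1 or SSE4.2 in its flags."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            return "sse4_1" in flags or "sse4_2" in flags
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as handle:
            print("SSE4.1/4.2 available:", ocr_cpu_supported(handle.read()))
    except FileNotFoundError:
        print("/proc/cpuinfo not found; this check only applies to Linux hosts")
```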
v2.17
Release date: 10/20/2023
Announcing Document Identification
Not Safe For Work (NSFW)
- Detect "Not Safe For Work" content in text and images
- Redact textual hits and blur images with NSFW content
Personal document classification
- Detect personal ID on images
Detection of Generic Password and Generic API Key secrets has been improved
v2.16
Release date: 07/18/2023
Announcing DICOM Anonymization:
- Anonymize patient information according to the Basic Attribute Confidentiality Profile
- Remove sensitive burned-in annotations from DICOM images
Patterns for US SSN numbers have been updated
Fixed an issue where the DLP engine crashed when files were sent for scanning right after engine initialization
v2.15.1
Release date: 05/22/2023
- Improved generic password and generic API token detection (secret detection)
- Fixed an issue where processing large text files took a very long time
- Fixed an issue where CCN hits contained an extra character at the end
- Fixed sensitive info substitution in hyperlinks
- Fixed issue when PDLP was unable to detect certain software secrets in XML files
- Allowlist and custom validators now work properly with XML files
v2.15
Release date: 03/31/2023
Support more secrets:
- Generic passwords
- Generic API tokens
- PostgreSQL credentials
- MySQL credentials
DOCM and XLSM file types are now supported in scanning
Redact sensitive information in CSV and XML files
Text substitution can be configured instead of redaction in the following document types:
- MS Office documents (Word, Excel, PowerPoint)
- PDF files
- Text files
Fixed an issue where valid hits could be incorrectly invalidated
Fixed an issue with the duplicate character validator that resulted in more false positives
v2.14
Release date: 12/19/2022
Support more secrets:
- Private keys (PEM, PPK)
- IBM Cloud key
- IBM API Connect Credentials
- IBM COS HMAC Credentials
Reorganize the workflow configurations
Improve MS Word processing to reduce false positive detection
Fixed issue when valid hits could be lost when other hits are redacted during PDF processing
Fixed log retention
v2.13.1
Release date: 11/2/2022
- Allowlist feature has been added to all sensitive info types
- Log retention period can now be configured from the engine configuration
- Encoding detection can be configured in the Workflow rule setting instead of the engine configuration
- Fixed issue where DLP engine updates were failing permanently due to unrecognized OS
- Fixed issue when encoding detection settings were lost during configuration export and import
- Fixed issue where an empty regular expression field in the workflow rule settings caused DLP engine update to fail permanently
v2.13
Release date: 9/31/2022
Secret detection in text files supporting AWS, Azure and GCP secrets
New optional validators for sensitive data
- Exclude prefix
- Exclude suffix
- Exclude beginning characters
- Exclude ending characters
- Duplicate characters (only for custom regexes)
DLP log level is now selectable in the engine configuration
External dependencies are checked before engine initialization
Some descriptions have been streamlined in the workflow rule settings
Fixed issue when local scan fails on read-only PDF files
Fixed issue when SSN localization could be saved without value
v2.12
Release date: 7/5/2022
- Improved PDF processing speed
- General improvements in performance due to update from .NET Core 3.1 to .NET 6.
- Improved OCR capabilities
- Improved product logging to enhance product diagnostics
- Fixed issue when the quality of jpeg images dropped after metadata removal
- Fixed issue when DLP fails to process EMF images embedded in a document
- Fixed issue when embedded files can't be opened during recursive PPT processing
- Fixed issue when xls files could not be processed during MD Core local scan
v2.11.1
Release date: 5/11/2022
- A configuration to choose a default fallback encoding
- Optimize system resource usage for PDF processing
- PDF Watermark: Added line breaks to long texts
v2.11
Release date: 3/30/2022
- Support watermark feature for MS Word
- Allow customers to set the certainty level for regular expressions
- Support encoding detection (for Japanese, Hebrew and UTF8 encodings)
- Processing all annotation object types in PDFs
- Processing all stamp object types in PDFs
- Keeping the original image quality for output files when removing Metadata
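Encoding detection combined with a default fallback encoding (the latter added in v2.11.1) can be approximated by trial decoding. The sketch below is an assumption-laden illustration: the candidate list, the `latin-1` fallback, and the `decode_with_fallback` name are hypothetical, and strict trial decoding is a simpler technique than real encoding detection.

```python
from typing import Sequence, Tuple


def decode_with_fallback(
    data: bytes,
    candidates: Sequence[str] = ("utf-8", "shift_jis", "cp1255"),
    fallback: str = "latin-1",
) -> Tuple[str, str]:
    """Try each candidate encoding strictly, then fall back to a default.

    Returns the decoded text and the name of the encoding that was used.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Nothing decoded cleanly: use the configured fallback and replace bad bytes.
    return data.decode(fallback, errors="replace"), fallback
```

The order of candidates matters, because a byte sequence can be valid in more than one encoding; the first one that decodes without error wins.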
v2.10.1
Release date: 2/16/2022
- Enhance PPT file processing
- Improve metadata processing for image file types
- Fix line break issue with DOC/DOCX when applying watermark
- Fix metadata removal with TIFF file format
- More plain text file support
- Improved image processing (performance)
v2.10
Release date: 12/22/2021
- Add more supported encodings (for Email Security Gateway use case)
- A configuration to set a file size limit per workflow
- Scan and remove Metadata recursively
- Support processing cropped images for XLS/XLSX/PPT/PPTX
- Fixed memory leak issue
v2.9.1
Release date: 12/1/2021
- Detection Policy is available on MetaDefender Core v5 or newer
- A configuration to allow a file if it is redacted (MetaDefender Core v5 or newer)
- Optimize memory usage
- Better encoding handling between Email Security Gateway and Proactive DLP
v2.9
Release date: 10/13/2021
Support watermarks for EMF/WMF/SVG
Detect and Redact sensitive info in several objects
- MS Excel sheet name
- Defined Name object in MS Excel
- Image alternative text in MS Office
- Comment Author in MS Word and Excel
- Header and Footer in MS Word and Excel
- Track changes in MS Word and Excel
- Alternative images in PDF
- Form fields in PDF
Improve recursive processing
- Support RTF as an embedded file
- More details about hit location
Upgraded Qt framework to version 5.15.2
v2.8.1
Release date: 8/18/2021
- Improve performance for several file types (MS Excel, text, etc.)
- Fix MS Excel redaction failure in some cases
v2.8
Release date: 7/8/2021
- A configuration to remove embedded objects if recursive processing fails
- Fixed FILE SIZE LIMIT configuration issue
- Fixed embedded image OCR processing
- Improve large plain text file processing
v2.7.1
Release date: 6/3/2021
- Support unlimited depth and number of objects in recursive document processing
- Allow users to limit the number of returned sensitive info hits
- Added "," to the default delimiter list
- Improve Excel processing
- Fixed Chinese regular expression in text files (CSV, TXT, ...)
- Fixed missing keyword configuration when upgrading to DLP 2.7
v2.7
Release date: 4/27/2021
- Drop support for MetaDefender Core versions older than 4.17.1
- Recursive scan and redaction of embedded files in MS Office files
- Localization support for Japanese SSN
- Support watermark, metadata detection, OCR for BMP format
- Support "Delimiter" as an optional validator
- Context detection around hits in PDF files has been improved
- Chart detection and redaction has been introduced in Excel
- Improve OCR detection quality
- Improve redaction function for MS Word files
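The "Delimiter" optional validator is named here without further detail. One plausible reading, sketched below purely as an assumption, is that a hit counts only when it is bounded on both sides by a delimiter character or by the start or end of the text. The delimiter set and the `delimited_hits` helper are hypothetical.

```python
import re
from typing import List, Set

# Hypothetical default delimiter set, for illustration only.
DEFAULT_DELIMITERS = set(" \t\r\n,;")


def delimited_hits(text: str, pattern: re.Pattern,
                   delimiters: Set[str] = DEFAULT_DELIMITERS) -> List[str]:
    """Keep only matches bounded by delimiters or by the text boundaries."""
    hits = []
    for match in pattern.finditer(text):
        before_ok = match.start() == 0 or text[match.start() - 1] in delimiters
        after_ok = match.end() == len(text) or text[match.end()] in delimiters
        if before_ok and after_ok:
            hits.append(match.group(0))
    return hits
```

Under this reading, a four-digit pattern would match `1234` in `1234,abc` but not in `x1234y`, which is one way such a validator could cut false positives.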
v2.6.1
Release date: February 8, 2021
- Fix detection issue when an empty cell has a comment (Excel)
- Improve MS Office validation in some regular expression cases
v2.6.0
Release date: January 11, 2021
- Process the hidden areas in a cropped image (DOCX, DOC)
- Support OCR for standalone image file types (JPG, PNG, TIFF)
- Support OCR for embedded images in DOC, DOCX, XLS, XLSX
- Support metadata removal for document files (PDF, DOC, DOCX, XLS, XLSX)
- Support redaction for RTF
v2.5.1
Release date: November 12, 2020
- Improve Japanese string detection/redaction in PDF
- Fix a detection issue when a regular expression contains a Hebrew string
- Fix a crash issue when scanning DOC file on Linux
v2.5
Release date: October 1, 2020
Metadata removal, Watermark, OCR are available on Linux
Advanced watermark configurations: font size, text opacity, text position
New configurations
- Stop processing once enough sensitive info has been found
- Quality configurations for OCR (Normal, Best)
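The idea of stopping once enough sensitive info has been found can be illustrated with lazy matching: scanning ends as soon as the limit is reached instead of collecting every hit first. This is a generic sketch with a hypothetical `collect_hits` helper, not the engine's implementation.

```python
import re
from itertools import islice
from typing import List


def collect_hits(text: str, pattern: re.Pattern, max_hits: int) -> List[str]:
    """Return at most `max_hits` matches, stopping the scan at the limit."""
    matches = pattern.finditer(text)  # lazy iterator: nothing is scanned yet
    return [match.group(0) for match in islice(matches, max_hits)]
```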
Support HTML and TXT redaction
10x faster text file processing
v2.4.1
Release date: August 8, 2020
- Improved memory usage
- Improved IPv4 and CIDR search
- Added threaded comment search and redaction in Excel files
- Up to 40% speedup when scanning Excel files
v2.4
Release date: July 7, 2020
- Utilize column and row header to improve certainty level in Excel
- Detect sensitive info in file properties with regular expressions
- Custom keyword list for regular expression
- Support redaction feature on Linux
- Performance improvement: faster processing, less resource usage
- New system requirements on Linux
- End of support for CentOS 6 and Debian 8
v2.3.2
Release date: May 20, 2020
- Better context calculation for Excel and PDF
- Improve IPv4 detection in TXT
- Distinguish between "Failed to detect" and others
v2.3.1
Release date: April 21, 2020
- Threaded comment redaction in Excel files.
- Slightly increased PDF scan performance.
- Improved certainty calculation for MS Office and PDF files.
- Fixed wrong context when a single cell in an Excel file contained the same hit multiple times.
v2.3.0
Release date: April 7, 2020
- Support Optical Character Recognition (OCR) for PDF (Windows only)
- Redact sensitive information for Microsoft Office Excel (XLS/XLSX)
- Better detection method that reduces false positives
v2.2.1
Release date: Feb 12, 2020
- Improve IPv4/CIDR detection performance
- Better handling of temp files
- Remove "Parse Binary" option
v2.2
Release date: Jan 6, 2020
- Supports watermark addition for PDF
- Redact sensitive information for Microsoft Office Word (DOC/DOCX)
- Support DLP on Linux with limited functionality (works with MetaDefender Core 4.17.1 or newer)
- Redact sensitive information based on certainty level (works with MetaDefender Core 4.17.1 or newer)
- Sample Regular expressions to detect Personally identifiable information (PII): email, address, full name, date of birth, driver license, phone number, bank account number
v2.1.2
Release date: November 27, 2019
- Better error message when an input PDF file is corrupted
v2.1.1
Release date: October 31, 2019
- Better display of the words before and after a hit in PDF
v2.1
Release date: September 8, 2019
- Supports IPv4, Classless Inter-Domain Routing (CIDR) detection
- Supports metadata removal for TIFF and GIF files
- Better CCN detection
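IPv4 and CIDR detection is commonly done in two steps: a loose pattern finds candidates, then a strict parser rejects impossible values such as `999.1.1.1`. The sketch below uses Python's standard `ipaddress` module for the second step. The pattern and the `find_ipv4_and_cidr` helper are illustrative assumptions, not the engine's detection logic.

```python
import ipaddress
import re
from typing import List

# Loose candidate pattern: dotted quad with an optional /prefix, not embedded
# in a longer run of digits and dots.
CANDIDATE = re.compile(r"(?<![\d.])\d{1,3}(?:\.\d{1,3}){3}(?:/\d{1,2})?(?![\d.])")


def find_ipv4_and_cidr(text: str) -> List[str]:
    """Return IPv4 addresses and CIDR blocks that parse as valid."""
    hits = []
    for match in CANDIDATE.finditer(text):
        token = match.group(0)
        try:
            if "/" in token:
                ipaddress.IPv4Network(token, strict=False)
            else:
                ipaddress.IPv4Address(token)
        except ValueError:
            continue  # out-of-range octet or prefix length
        hits.append(token)
    return hits
```

A known limitation of this pattern is that an address directly followed by a sentence-ending period is skipped, since the trailing dot is treated as part of a longer dotted sequence.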
v2.0.1
Release date: August 15, 2019
- Better watermark and redaction handling when a system is under high load
- Improve CCN detection
v2.0
Release date: June 28, 2019
- Proactive DLP is the new product name
- Certainty score for sensitive data detection
- Redact sensitive information for text-based PDF files
- Watermark addition for JPEG, TIFF, PNG, GIF
- Supports metadata removal for JPG and PNG files
v1.0.3
Release date: February 18, 2019
- Improve detection for Microsoft Access format
- Improve context for hits
- Improve processing speed (20%)