Proactive DLP release notes

v3.0.0

Release date: 07/14/2025

  • Tagging system - DLP is now able to add metadata tags to processed PDF, Word, Excel, PowerPoint files and images
  • AI scan is now available for Word and PowerPoint files as well
  • DLP supports redaction for Json files from now on
  • MDB and ACCDB file types are now supported for scanning
  • Indian PAN and Aadhar cards are added as pre-defined options

v2.23.1

Release date: 05/12/2025

  • Added support for Amazon Linux 2023 and Oracle Linux
  • Resolved DLL loading issue on Windows Server 2016
  • Fixed failure in NSFW scan under specific conditions
  • Improved stability and error handling for large PDF scans

v2.23.0

Release date: 04/08/2025

  • It is now possible to HASH the hits in the output document for Word, PDF and plain text files
  • Toxic text detection is now available in more languages, including French, Spanish, Turkish, Italian, Russian, and Portuguese
  • More NSFW categories introduced, including guns and violence
  • More PIIs are available via AI scanning, including NHS and UK Electoral Roll Number
  • Resolved an issue where enabling AI scan previously disabled regex scan. Both scans now function concurrently

v2.22.1

Release date: 02/05/2025

  • DLP no longer uses "/tmp" directory in order to fully support non-root Docker
  • Result UI page became more comprehensive, detailing scan types and processing results
  • Switching on AI scan won’t interfere with other features anymore

v2.22.0

Release date: 01/13/2025

  • Intentional Leakage Detection

    • Small Font Size Recognition: Detect intentional data leaks by finding text hidden using very small font sizes in PDF, Word, and Excel files
    • Invisible Text Recognition: Detect intentional data leaks by identifying hidden text where the text color and background color are the same or very similar in PDF, Word, and Excel files.
  • New predefined sensitive information types are now supported for detection and redaction:

    • Turkish Passport Numbers
    • Turkish Phone Numbers
    • Turkish ID Numbers (TC Kimlik)
    • IMEI/IMEISV (International Mobile Equipment Identity)
    • Israeli ID Numbers
  • Fixed an issue where AI scanning disrupted regex-based scanning

v2.21.1

Release date: 11/04/2024

  • Debian 12 and Rocky Linux 9.4 are now supported
  • AI scan failure in case a PDF file has an empty page is now fixed
  • Hit validation failure when only a TSV file is used for PDLP is now fixed

v2.21.0

Release date: 10/02/2024

  • Anonymization is now supported for PCAP and PCAPNG files

  • PII detection using Artificial Intelligence

    • Supported PII: Driver’s license, passport number, and national ID number
    • Languages Supported: English, French, Spanish, German, Italian, and Portuguese
    • Supported File Types: PDF and text files
  • Detection policy update fixed

v2.20.0

Release date: 07/25/2024

  • Support the .jp2 file type.

  • New predefined sensitive information types are now supported for detection and redaction:

    • Australia Medicare Number
    • Australia Company Number
    • Australia Business Number
    • Australia Tax File Number
    • UK NHS Number
  • The localization select list in the UI has been replaced with a country-specific sensitive information select list.

  • The US-SSN has been divided into US-ITIN and US-SSN (they are still referred to as SSN in the result JSON).

  • The JP-SSN has been renamed to JP-MYNUM (it is still referred to as SSN in the result JSON).

v2.19.1

Release date: 06/13/2024

  • Improve PDF metadata dictionary processing.
  • Detection Policy improvements include: precedence, new operators, and parenthesis usage.
  • Custom regex file updates no longer require an engine restart.
  • Fix date regex issue in Excel files using TSV files.

v2.19.0

Release date: 04/11/2024

  • Support multi-frame GIF files with OCR.
  • Support .loc files, FDF files, and Acrobat Forms.
  • Support recursive PDF processing (for embedded PDF only).
  • Improve Microsoft Office 2007 object processing, including linked objects, tracked changes, background images, SmartArt.
  • Improve PDF stamp annotation and form processing.
  • Fix watermark scan bug.

v2.18.1

Release date: 02/15/2024

  • Implement more sophisticated TSV file handling related to the “custom regexes from file” feature.
  • Refine detection of SWIFT codes, reducing both false positives and false negatives.
  • Resolved interruption issues related to hit limits during embedded image processing.
  • Removed duplicate results that occur when using the best quality OCR.
  • Fix the independence of redact settings from TSV file loading.
  • Fix a memory leak

v2.18

Release date: 01/04/2024

  • Importing custom regular expressions from a file is now supported.

  • New predefined sensitive information types are now supported for detection and redaction:

    • ABA Routing Number
    • U.S. Bank Account
    • International Banking Account Number (IBAN)
    • International Securities Identification Number (ISIN)
    • SWIFT Code
  • OCR capabilities have been improved.

  • SSE4.2 is now also accepted by Proactive DLP (in addition to SSE4.1) as the required CPU instruction set for OCR.

  • File metadata handling for scan requests have been improved.

  • Proactive DLP’s UI workflow settings are now unified across all officially supported MetaDefender Core versions.

  • Proactive DLP’s UI workflow settings for detection have been redesigned to provide a more compact look.

v2.17

Release date: 10/20/2023

  • Announcing Document Identification

  • Not Safe For Work (NSFW)

    • Detect "Not Safe For Work" content in text and images
    • Redact textual hits and blur images with NSFW content
  • Personal document classification

    • Detect personal ID on images
  • Detection of Generic Password and Generic API Key secrets have been improved

v2.16

Release date: 07/18/2023

  • Announcing DICOM Anonymization:

    • Anonymize patient information according to the Basic Attribute Confidentiality Profile
    • Remove sensitive burned-in annotations from DICOM images
  • Patterns for US SSN numbers have been updated

  • Fixed issue when DLP engine crashes when files are sent for scanning right after engine initialization

v2.15.1

Release date: 05/22/2023

  • Improved generic password and generic API token detection (secret detection)
  • Fixed issue when processing of large text files took very long time
  • Fixed issue when CCN hits contained extra character at the end
  • Fixed sensitive info substitution in hyperlinks
  • Fixed issue when PDLP was unable to detect certain software secrets in XML files
  • Allowlist and custom validators now properly working with XML files

v2.15

Release date: 03/31/2023

  • Support more secrets:

    • Generic passwords
    • Generic API tokens
    • PostgreSQL credentials
    • MYSQL credentials
  • DOCM and XLSM file types are now supported in scanning

  • Redact sensitive information in CSV and XML files

  • Text substitution can be configured instead of redaction in the following document types:

    • MS office documents (word, excel, slides)
    • PDF files
    • Text files
  • Fixed issue when valid hits can be invalidated due to a bug

  • Fixed issue with duplicate character validator that resulted in more false positive

v2.14

Release date: 12/19/2022

  • Support more secrets:

    • Private keys (PEM, PPK)
    • IBM Cloud key
    • IBM API Connect Credentials
    • IBM COS HMAC Credentials
  • Reorganize the workflow configurations

  • Improve MS Word processing to reduce false positive detection

  • Fixed issue when valid hits could be lost when other hits are redacted during PDF processing

  • Fixed log retention

v2.13.1

Release date: 11/2/2022

  • Allowlist feature has been added to all sensitive info types
  • Log retention period can now be configured from the engine configuration
  • Encoding detection can be configured in the Workflow rule setting instead of the engine configuration
  • Fixed issue where DLP engine updates were failing permanently due to unrecognized OS
  • Fixed issue when encoding detection settings were lost during configuration export and import
  • Fixed issue where an empty regular expression field in the workflow rule settings caused DLP engine update to fail permanently

v2.13

Release date: 9/31/2022

  • Secret detection in text files supporting AWS, Azure and GCP secrets

  • New optional validators for sensitive data

    • Exclude prefix
    • Exclude suffix
    • Exclude beginning characters
    • Exclude ending characters
    • Duplicate characters (only for custom regexes)
  • DLP log level is now selectable in the engine configuration

  • External dependencies are checked before engine initialization

  • Some descriptions have been streamlined in the workflow rule settings

  • Fixed issue when local scan fails on read-only PDF files

  • Fixed issue when SSN localization could be saved without value

v2.12

Release date: 7/5/2022

  • Improved PDF processing speed
  • General improvements in performance due to update from .NET Core 3.1 to .NET 6.
  • Improved OCR capabilities
  • Improved product logging to enhance product diagnostics
  • Fixed issue when the quality of jpeg images dropped after metadata removal
  • Fixed issue when DLP fails to process EMF images embedded in a document
  • Fixed issue when embedded files can't be opened during recursive PPT processing
  • Fixed issue when xls files could not be processed during MD Core local scan

v2.11.1

Release date: 5 /11/ 2022

  • A configuration to choose a default fallback encoding
  • Optimize system resource usage for PDF processing
  • PDF Watermark: Added line breaks to long texts

v2.11

Release date: 3/30/2022

  • Support watermark feature for MS Word
  • Allow the customers to set certainty for the regular expression
  • Support encoding detection (for Japanese, Hebrew and UTF8 encodings)
  • Processing all annotation object types in PDFs
  • Processing all stamp object types in PDFs
  • Keeping the original image quality for output files when removing Metadata

v2.10.1

Release date: 2/16/2022

  • Enhance PPT file processing
  • Improve metadata processing for image file types
  • Fix line break issue with DOC/DOCX when applying watermark
  • Fix metadata removal with TIFF file format
  • More plain text file support
  • Improved image processing (performance)

v2.10

Release date: 12/22/2021

  • Add more supported encodings (for Email Security Gateway use case)
  • A configuration to set limit file size per workflow
  • Scan and remove Metadata recursively
  • Support processing cropped images for XLS/XLSX/PPT/PPTX
  • Fixed memory leak issue

v2.9.1

Release date 12/1/2021

  • Detection Policy is available on MetaDefender Core v5 or newer
  • A configuration to allow a file if it is redacted (MetaDefender Core v5 or newer)
  • Optimize memory usage
  • Better encode handling between Email Security Gateway and Proactive DLP

v2.9

Release date 10/13/2021

  • Support watermark EMF/WMF/SVG

  • Detect and Redact sensitive info in several objects

    • MS Excel sheet name
    • Defined Name object in MS Excel
    • Image alternative text in MS Office
    • Comment Author in MS Word and Excel
    • Header and Footer in MS Word and Excel
    • Track changes in MS Word and Excel
    • Alternative images in PDF
    • Form fields in PDF
  • Improve recursive processing

    • Support RTF as an embedded file
    • More details about hit location
  • Upgraded Qt framework to version 5.15.2

v2.8.1

Release date: 8/18/2021

  • Improve performance for several file types (MS Excel, text, etc ...)
  • Fix MS Excel redaction failure in some cases

v2.8

Release date: 7/8/2021

  • A configuration to remove embed objects if recursive processing fails
  • Fixed FILE SIZE LIMIT configuration issue
  • Fixed embedded image OCR processing
  • Improve large plain text file processing

v2.7.1

Release date: 6/3/2021

  • Support unlimited depth and number objects in recursive document processing
  • Allow users to set a limit number of returned sensitive info
  • Added "," to the default delimiter list
  • Improve Excel processing
  • Fixed Chinese regular expression in text files (CSV, TXT, ...)
  • Fixed missing keyword configuration when upgrading to DLP 2.7

v2.7

Release date: 4/27/2021

  • Drop supporting MetaDefender Core older than 4.17.1
  • Recursive scan and redaction of embedded files in MS Office files
  • Localization support for Japanese SSN
  • Support watermark, metadata detection, OCR for BMP format
  • Support "Delimiter" as an optional validator
  • Context detection around hits in PDF files has been improved
  • Chart detection and redaction has been introduced in Excel
  • Improve OCR detection quality
  • Improve redaction function for MS Word files

v2.6.1

Release date: February 8, 2021

  • Fix detection issue when an empty cell has a comment (Excel)
  • Improve MS Office validation in some regular expression cases

v2.6.0

Release date: January 11, 2021

  • Process the hidden areas in a cropped image (DOCX, DOC)
  • Support OCR for standalone image file types (JPG, PNG, TIFF)
  • Support OCR for embedded images in DOC, DOCX, XLS, XLSX
  • Support remove metadata for document files (PDF, DOC, DOCX, XLS, XLSX)
  • Support redaction for RTF

v2.5.1

Release date: November 12, 2020

  • Improve Japanese string detection/redaction in PDF
  • Fix a detection issue when a regular expression contains a Hebrew string
  • Fix a crash issue when scanning DOC file on Linux

v2.5

Release date: October 1, 2020

  • Metadata removal, Watermark, OCR are available on Linux

  • Advanced watermark configurations: font size, text opacity, text position

  • New configurations

    • Stop the process if found enough sensitive info
    • Quality configurations for OCR (Normal, Best)
  • Support HTML and TXT redaction

  • 10x faster when processing text file

v2.4.1

Release date: August 8, 2020

  • Improved memory usage
  • Improved IPv4 and CIDR search
  • Added threaded comment search and redaction in Excel files
  • Up to 40% speedup when scanning Excel files

v2.4

Release date: July 7, 2020

  • Utilize column and row header to improve certainty level in Excel
  • Detect sensitive info in file properties with regular expressions
  • Custom keyword list for regular expression
  • Support redaction feature on Linux
  • Performance improvement: faster processing, less resource usage
  • New system requirements on Linux
  • End of support Centos 6, Debian 8

v2.3.2

Release date: May 20, 2020

  • Better context calculation for Excel and PDF
  • Improve IPv4 detection in TXT
  • Distinguish between "Failed to detect" and others

v2.3.1

Release date: April 21, 2020

  • Threaded comment redaction in Excel files.
  • Slightly increased PDF scan performance.
  • Improved certainty calculation for MS Office and PDF files.
  • Fixed wrong context when a single cell in an Excel file contained the same hit multiple times.

v2.3.0

Release date: April 7, 2020

  • Support Optical Character Recognition (OCR) for PDF (Windows only)
  • Redact sensitive information for Microsoft Office Excel (XLS/XLSX)
  • Better detection method, reduce false positive

v2.2.1

Release date: Feb 12, 2020

  • Improve IPv4/CIDR detection performance
  • Better handling temp files
  • Remove "Parse Binary" option

v2.2

Release date: Jan 6, 2020

  • Supports watermark addition for PDF
  • Redact sensitive information for Microsoft Office Word (DOC/DOCX)
  • Support DLP in Linux with limited functions (work with MetaDefender Core 4.17.1 or newer)
  • Redact sensitive information based on certainty level (work with MetaDefender Core 4.17.1 or newer)
  • Sample Regular expressions to detect Personally identifiable information (PII): email, address, full name, date of birth, driver license, phone number, bank account number

v2.1.2

Release date: November 27, 2019

  • Better error message when an input PDF file is corrupted

v2.1.1

Release date: October 31, 2019

  • Better displaying the words before and after a hit in PDF

v2.1

Release date: September 8, 2019

  • Supports IPv4, Classless Inter-Domain Routing (CIDR) detection
  • Supports remove metadata for TIFF, GIF file
  • Better CCN detection

v2.0.1

Release date: August 15, 2019

  • Better watermark and redaction handling when a system is under high load
  • Improve CCN detection

v2.0

Release date: June 28, 2019

  • Proactive DLP as new name
  • Certainty score for sensitive data detection
  • Redact sensitive information for text-based PDF file
  • Watermark addition for JPEG, TIFF, PNG, GIF
  • Supports remove metadata for JPG, PNG file

v1.0.3

Release date: February 18, 2019

  • Improve detection for Microsoft Access format
  • Improve context for hits
  • Improve processing speed (20%)
Type to search, ESC to discard
Type to search, ESC to discard
Type to search, ESC to discard