Configurations

To enable Deep CDR, go to Workflow Management > Workflows > [Workflow name] > Enable Deep CDR.

To enable file types to be processed

To enable Archive file types sanitization, go to the Compression tab

Update the "Enable for archive compression filetypes" setting.

Advanced Configurations

Deep CDR can be configured per workflow or rule in the administrator management console. For each file type, the module configuration lets you choose which objects to remove. For example, you can remove macros while keeping hyperlinks.

The engine configuration is accessible under Workflow Management > Workflows > {Workflow name} > Deep CDR > Advanced configuration > File type handling > {File type}.

PDF

  • Remove Macro: Remove JavaScript and document open action.

  • Remove Embedded Object: Remove embedded objects including attachments, embedded files, etc. Applicable when recursive sanitization is not performed.

  • Recursive Level: Enable/disable recursive sanitization. Default is 1 level.

  • Process Hyperlink Behavior: Remove hyperlink annotations and change hyperlinks in text (plain text strings that PDF readers recognize as hyperlinks and highlight automatically).

    • Remove hyperlink annotations only.
    • Remove hyperlink annotations and text links.
    • Remove hyperlink annotations and prevent clickable links auto-tuning.
    • Remove hyperlink annotations and leave them in text form.
    • Add hyperlink prefix.
    • Return list of hyperlinks.
    • Do nothing (Keep as is).
  • Process Image:

    • Process Raw Image: Raw images are images that cannot be extracted to be standalone like normal images. They usually require large resources to process.
  • Remove Metadata

  • Remove Embedded Font: Remove all embedded fonts; may break some non-English content (like Hebrew or Arabic).

  • Process Form: Process operational form fields.

  • Remove Web Capture Content: Remove Web Capture content database objects (/IDS, /URLS) and related data (SpiderInfo and capture command).

  • Strict Validation: Strictly report standard compliance issues, or ignore slight errors in the file structure.

  • Skip Signed File: Skip processing PDF files that have a digital signature.

    • Deep CDR only validates the file integrity and the signature; it cannot determine whether a signature is self-signed.
  • Validate digital signature

  • Process 3D object

  • Process PostScript Object: PostScript object data are set to empty and related streams are removed.

  • Remove User-Defined Named Action: Named actions are predefined actions that can force the PDF viewer application to invoke an action that might adversely impact the system. User-Defined Named Actions object data are set to empty and related streams are removed.

  • Allowed Named Actions: List of named actions that should be allowed.

    • Note: NextPage, PrevPage, FirstPage, and LastPage are always allowed as they just navigate around a document.
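
The interaction between Remove User-Defined Named Action and Allowed Named Actions can be pictured as an allowlist check. The following is only an illustrative sketch, not the engine's implementation: the function names are hypothetical, and `Print` and `SaveAs` are used purely as example action names.

```python
# Navigation actions that the note above says are always allowed.
ALWAYS_ALLOWED = frozenset({"NextPage", "PrevPage", "FirstPage", "LastPage"})


def is_named_action_allowed(action, allowed=frozenset()):
    """Return True if a named action should be kept.

    `allowed` stands in for the configured "Allowed Named Actions" list
    (hypothetical representation).
    """
    return action in ALWAYS_ALLOWED or action in allowed


def filter_named_actions(actions, allowed=frozenset()):
    """Split actions into (kept, removed) lists, preserving input order."""
    kept, removed = [], []
    for action in actions:
        if is_named_action_allowed(action, allowed):
            kept.append(action)
        else:
            removed.append(action)
    return kept, removed
```

For example, `filter_named_actions(["NextPage", "Print", "SaveAs"], {"Print"})` keeps `NextPage` and `Print` and marks `SaveAs` for removal.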

Office Document

Microsoft Word/Excel/PowerPoint document

  • Process Macro: Process macro, DDE protocol and document open action in DOC/DOT/DOCM/DOTM/XLS/XLT/XLSM/XLTM/XLSB/XLAM/PPT/PPS/POT/PPTM/PPSM/SLDX/SLDM.

    • Remove all macros.
    • Skip allowlisted macros.
    • Do nothing (Keep as is).
  • Skip Signed Macro: Signed macros are detected and validated against their signatures, and are excluded from removal if valid. Applicable to DOCM, DOTM, PPTM, PPSM, POTM, PPAM, SLDM, XLSM, XLTM and XLAM.

  • Remove Custom XML

  • Remove Chart

  • Remove Embedded Object: Remove all embedded objects including attachments, embedded files, etc.

  • Process Hyperlink Behavior:

    • Remove hyperlink.
    • Add hyperlink prefix.
    • Return list of hyperlinks.
    • Do nothing (Keep as is).
  • Process Image

  • Sanitize QR Code: Hyperlinks in QR codes are detected and processed based on the Image configuration.

  • Remove Comment: Applies to the DOCX family (DOCX, DOTX, DOCM, etc.), the XLSX family (XLSX, XLSM, XLSB, etc.) and the PPTX family (PPTX, PPTM, PPSX, etc.).

  • Remove Revision: Applies to the DOCX family (DOCX, DOTX, DOCM, etc.) only.

  • Remove Metadata

  • Remove Embedded Font

  • Remove Hidden Text: Remove text that is hidden in the document.

  • Cleanup Unused Resources: Remove unused styles and lists from DOC/DOT/XLS/XLT files.

  • Recursive Level: Enable/disable recursive sanitization. Default is 1 level.

  • Remove Smart Tag: Remove Smart Tags, an Office feature that associates specific actions with text content matching a certain pattern. Applicable to the DOCX, XLSX and PPTX families.

  • Skip Signed File: Skip processing files that have a digital signature.

    • Deep CDR only validates the file integrity and the signature; it cannot determine whether a signature is self-signed.
  • Validate Document Properties: Enable validation of Document Properties. Applicable to the DOCX, XLSX and PPTX families, XML-DOCX and XML-PPTX.

    • Validation Effect on Sanitization Result: Choose how a validation failure affects the sanitization result. Default is Success.

Microsoft OneNote document

  • Remove Embedded Object: Remove all embedded objects including attachments, embedded files, etc.
  • Process Image
  • Process Hyperlink Behavior:
    • Remove hyperlink.
    • Add hyperlink prefix.
    • Do nothing (Keep as is).

RTF

  • Remove Embedded Object: Remove all embedded objects including attachments, embedded files, etc.

  • Process Hyperlink Behavior:

    • Remove hyperlink.
    • Add hyperlink prefix.
    • Do nothing (Keep as is).
  • Remove Embedded HTML: Remove HTML tags containing malicious HTML nodes.

    • Remove HTML tags containing malicious nodes.
    • Remove all HTML tags.
    • Do nothing.
  • Process Image

  • Process Font Table: Remove the embedded font table. A font table whose font count exceeds the configured limit is removed.

Microsoft Visio Drawing

  • Process Macro

  • Remove Embedded Object: Remove all embedded objects including attachments, embedded files, etc.

  • Process Hyperlink Behavior:

    • Remove hyperlink.
    • Add hyperlink prefix.
    • Do nothing (Keep as is).
  • Process Image

  • Remove Metadata

  • Recursive Level: Enable/disable recursive sanitization. Default is 1 level.

Image

Deep CDR supports sanitization for AutoCAD, raster and vector image files.

Raster image files

  • Sanitize QR Code: Hyperlinks in QR codes are detected and processed according to the selected Process Hyperlink Behavior:

    • Add hyperlink prefix.
    • Return list of hyperlinks.
  • Remove Metadata

  • Remove ICC Profile: Remove ICC profile in JPG files.

  • Remove GeoTIFF tags: Remove or preserve private TIFF tags which store georeferencing information.

  • Image Quality: Applies to JPG images (as original or balanced).

  • GIF Layer Count Limit: Maximum normalized layer count of GIF, applicable to GIF only. GIF files with layer count exceeding this value will be skipped from processing.

  • TIFF Size Limit: Maximum decompressed image size on memory for TIFF, set to 0 for unlimited size.

SVG

  • Remove JavaScript: Remove JavaScript code that may harm the system.
  • Remove CDATA: Remove character data that may embed harmful programs.
  • Remove Injection: Remove SVG injection to prevent execution of harmful programs.
  • Process Image: Process raster images included inside SVG file.
  • Process Embedded Image: Process raster images and vector graphics within the SVG file.
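
To illustrate what the Remove JavaScript option targets in an SVG, the sketch below strips `<script>` elements, `on*` event-handler attributes and `javascript:` links using only the Python standard library. It is a simplified illustration under stated assumptions, not a description of how Deep CDR works internally: the function name is hypothetical, and a real sanitizer would also need a parser hardened against malicious XML.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def strip_svg_scripts(svg_text):
    """Return the SVG markup without script elements or script-bearing attributes."""
    # Serialize SVG elements without an "ns0:" prefix.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)

    # Remove <script> elements (namespaced or not) from every parent.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in ("{%s}script" % SVG_NS, "script"):
                parent.remove(child)

    # Drop event-handler attributes (onload, onclick, ...) and javascript: links.
    for element in root.iter():
        for name in list(element.attrib):
            local = name.rsplit("}", 1)[-1].lower()
            value = element.attrib[name].strip().lower()
            if local.startswith("on") or (
                local == "href" and value.startswith("javascript:")
            ):
                del element.attrib[name]

    return ET.tostring(root, encoding="unicode")
```

Passing an SVG whose root carries `onload="..."` and contains a `<script>` child returns markup with both removed, while shapes such as `<rect>` are preserved.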

DWF

  • Remove Embedded Font
  • Process Hyperlink Behavior: Remove all hyperlinks, Add hyperlink prefix or Do nothing (Keep as is).
  • Process Image: Process raster images included inside the DWF file.

DWG

  • Remove Macro

Text

TXT

  • Remove Invisible Character: Remove non-printable UTF-8 characters in these categories:
    • Format character (cf): U+00AD (Soft Hyphen - SHY), U+200E (Left-to-right mark), U+200F (Right-to-left mark), ...
    • Control character (cc): U+0000 (NULL), U+0001 (Start of heading), U+0002 (Start of text)
    • Private use character (co): U+E000 to U+F8FF, U+F0000 to U+FFFFD, U+100000 to U+10FFFD
    • Unassigned character (cn): code points that have not been assigned characters by the Unicode standard.
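
The four categories above correspond to the Unicode general categories Cf, Cc, Co and Cn, so the effect of this option can be approximated with Python's `unicodedata` module. This is a minimal sketch rather than the product's algorithm; in particular, keeping newlines, carriage returns and tabs (which are technically control characters) is an assumption made here so that the text layout survives.

```python
import unicodedata

# General categories named in the list above:
# format (Cf), control (Cc), private use (Co), unassigned (Cn).
INVISIBLE_CATEGORIES = {"Cf", "Cc", "Co", "Cn"}

# Assumption: ordinary line breaks and tabs are kept even though
# they belong to the control (Cc) category.
KEEP = {"\n", "\r", "\t"}


def remove_invisible(text):
    """Return text with characters in the invisible categories removed."""
    return "".join(
        ch
        for ch in text
        if ch in KEEP or unicodedata.category(ch) not in INVISIBLE_CATEGORIES
    )
```

For instance, a string containing a soft hyphen (U+00AD), a left-to-right mark (U+200E), a NULL (U+0000) and a private use character (U+E000) comes back with only its visible characters and line breaks.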

Markup

HTML

  • Remove Script: Remove all script.

  • Remove Object

  • Remove Applet

  • Process Form: Remove actions in forms

  • Remove Comment: Remove all comments from the HTML file.

    • Remove Conditional Comment: Remove conditional comments from the HTML file.
  • Remove Iframe

  • Process Hyperlink Behavior:

    • Do nothing.
    • Display hyperlink in text form.
    • Remove hyperlink.
    • Display hyperlink with only domain.
    • Return list of hyperlinks.
    • Add hyperlink prefix.
    • Replace hyperlink by text.
    • Do nothing except remove JavaScript.
      • Process Text Link: If enabled, add the prefix to text links as well as hyperlinks. Applicable only when 'Add hyperlink prefix' is selected.
      • Hyperlink Prefix: Prefix text that is added to hyperlinks. Applicable when 'Add hyperlink prefix' is selected.
      • Replaced Text: Hyperlinks are replaced with this text (English characters only). Applicable when 'Replace hyperlink by text' is selected.
  • Display Image Source: If enabled, copy image links ('http(s)://', 'ftp://', 'file://', 'www.', 'ms-its:') in 'src' attribute and place them outside

  • Process Embedded Image: Process images that are embedded directly in HTML files in Base64 format

  • Accept Multiple HTML Tags: Accept documents with multiple separate HTML tags

  • Process Zero Font Text: Process node with zero font text

  • Use Numeric Entity: HTML special characters are escaped with numeric entities

  • Prefer Pretty Print: Create HTML output with spacing in markup code where possible. Spacing visible to users when displayed will not be affected.
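
Several of the Process Hyperlink Behavior options listed above amount to a rewrite of each link target. The sketch below shows what three of them might do to a single URL. The mode strings, function name and parameters are hypothetical labels chosen for this example; they are not configuration keys of the product.

```python
from urllib.parse import urlsplit


def rewrite_hyperlink(url, mode, prefix="", replaced_text=""):
    """Return the value that would replace `url` under the given mode.

    Modes (hypothetical names):
      "do_nothing"      - keep the URL unchanged
      "add_prefix"      - prepend the configured Hyperlink Prefix
      "domain_only"     - keep only the host part of the URL
      "replace_by_text" - substitute the configured Replaced Text
    """
    if mode == "do_nothing":
        return url
    if mode == "add_prefix":
        return prefix + url
    if mode == "domain_only":
        # Fall back to the original string if no host can be parsed.
        return urlsplit(url).hostname or url
    if mode == "replace_by_text":
        return replaced_text
    raise ValueError("unknown mode: %r" % mode)
```

With `mode="domain_only"`, `https://example.com/a?b=1` becomes `example.com`; with `mode="add_prefix"` and a prefix such as `https://gate.example/?u=`, the original URL is appended to the prefix unchanged.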

JSON

  • Preserve Format: Preserve formats like whitespaces, newlines, etc.
  • Support Newline Delimited: Support Newline Delimited JSON
  • Process Base64 encoded data
    • Data following "data" URL scheme
    • Data at certain JSON path
    • Action on data when sanitization not applied
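
As a rough picture of the JSON options above, the sketch below parses newline-delimited JSON and walks a document looking for string values that follow the "data" URL scheme with a Base64 payload, reporting the JSON path of each match. It is an illustrative sketch under stated assumptions: the function names, the `$.key[0]` path notation and the `text/plain` fallback media type are choices made for this example, not product behavior.

```python
import base64
import binascii
import json
import re

# Matches values such as "data:image/png;base64,iVBORw0..."
DATA_URL = re.compile(
    r"^data:(?P<mime>[\w.+-]+/[\w.+-]+)?(?:;[\w-]+=[\w.-]+)*;base64,(?P<payload>.*)$",
    re.DOTALL,
)


def find_base64_data(node, path="$"):
    """Yield (json_path, media_type, decoded_bytes) for each Base64 data URL."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_base64_data(value, "%s.%s" % (path, key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_base64_data(value, "%s[%d]" % (path, index))
    elif isinstance(node, str):
        match = DATA_URL.match(node)
        if match:
            try:
                data = base64.b64decode(match.group("payload"), validate=True)
            except binascii.Error:
                return  # Not valid Base64: leave the value untouched.
            yield path, match.group("mime") or "text/plain", data


def parse_ndjson(text):
    """Parse newline-delimited JSON into a list of documents, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Each decoded payload could then be handed to the sanitizer for its media type, with the "Action on data when sanitization not applied" setting deciding what happens to payloads that cannot be processed.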