What Is the Recommended Skip Hash List Size in MetaDefender Core Without Impacting Performance?
This article applies to MetaDefender Core v5.12.0 and later.
The Skip By Hash feature in MetaDefender Core allows administrators to define Allowlist, Blocklist, and Skip Engines rules using file hashes.
This article clarifies system limits, performance behavior, and recommended sizing guidance.
- Is There a Maximum Number of Hashes?
There is no hard limit on the total number of hashes that can be stored in the Skip Hash list. However, CSV imports are limited to a maximum of 500 MB per file.
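If an export exceeds the 500 MB per-file ceiling, it can be split into smaller CSVs before upload. The sketch below is a generic helper, not part of MetaDefender Core; the column layout (`hash` first) is an assumption you should match to your actual export format:

```python
import csv

MAX_BYTES = 500 * 1024 * 1024  # MetaDefender Core's 500 MB per-file import limit

def split_csv(path, max_bytes=MAX_BYTES):
    """Split a hash CSV into part_0.csv, part_1.csv, ... each under max_bytes.

    The header row is repeated in every part so each file imports on its own.
    """
    parts = []
    with open(path, newline="") as src:
        reader = csv.reader(src)
        header_line = ",".join(next(reader)) + "\n"
        out, written, idx = None, 0, 0
        for row in reader:
            line = ",".join(row) + "\n"
            # Start a new part file before this row would exceed the limit.
            if out is None or written + len(line) > max_bytes:
                if out:
                    out.close()
                part_path = f"part_{idx}.csv"
                out = open(part_path, "w", newline="")
                out.write(header_line)
                written = len(header_line)
                parts.append(part_path)
                idx += 1
            out.write(line)
            written += len(line)
        if out:
            out.close()
    return parts
```

Splitting also keeps individual import transactions smaller, which makes large migrations easier to schedule inside a maintenance window.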
Performance testing shows:
- Importing 1 million hashes took approximately 115 seconds
- Test environment:
  - MetaDefender Core 5.17.1 standalone
  - Windows VM
  - 8 CPUs
  - 16 GB RAM
This confirms that MetaDefender Core can support very large skip lists.
- How Skip Hash Checking Works
When a file is scanned:
- The file hash is calculated.
- MetaDefender Core checks whether the hash exists in the Skip List stored in the local database.
The hash comparison itself is lightweight and does not impact Core performance.
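The flow above can be sketched conceptually. This is an illustration of the lookup logic only, not MetaDefender Core's actual implementation (in the product the Skip List lives in the local database rather than an in-memory set), but it shows why the check stays cheap: hash membership is a constant-time lookup regardless of list size.

```python
import hashlib

def file_sha256(path):
    """Stream the file in chunks so large files never need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def should_skip(path, skip_hashes):
    """Return True if the file's hash appears in the skip set.

    Set membership is O(1) on average, so the comparison cost does not
    grow with the number of skip entries.
    """
    return file_sha256(path) in skip_hashes
```

The per-file cost is dominated by computing the hash (which scales with file size), not by the lookup itself.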
- Performance Impact by Hash Volume
While there is no fixed maximum, practical performance behavior typically follows:
| Total Skip Hashes | Expected Performance Impact (Well-Sized DB) |
|---|---|
| < 500K | Negligible impact |
| ~1M | Very low impact (validated test) |
| 1M–5M | Generally safe with proper indexing |
| 5M–10M | Increased DB dependency; tuning required |
| >10M | Requires strong database sizing and monitoring |
Important: If the PostgreSQL instance is:
- Shared with other applications
- Under-provisioned
- Running on slow disk
Performance degradation may appear earlier (even at 2–3 million entries).
- Import Behavior and Runtime Impact
During CSV import:
- Scan service is not interrupted
- Duplicate cleanup increases processing time: if the CSV contains hashes that already exist in the Skip Hash list, the system must check and update those entries, so imports with many duplicates take longer.
Large imports increase database write activity but do not stop scanning.
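Because duplicate handling is what slows imports down, it can help to deduplicate the CSV before uploading it. The helper below is a hypothetical pre-processing step (not a MetaDefender Core feature); it assumes the hash is in the first column and optionally skips hashes already present in an export of the current Skip Hash list:

```python
import csv

def dedupe_hash_csv(src_path, dst_path, existing_hashes=frozenset()):
    """Write dst_path with each hash at most once.

    Hashes found in existing_hashes (e.g. loaded from an export of the
    current Skip Hash list) are dropped as well, so the import only
    carries genuinely new entries.
    """
    seen = set(h.lower() for h in existing_hashes)
    kept = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # copy the header row through
        for row in reader:
            h = row[0].strip().lower()  # normalize case before comparing
            if h not in seen:
                seen.add(h)
                writer.writerow(row)
                kept += 1
    return kept
```

Running this first shifts the duplicate-detection work off the Core server and onto a workstation, where it is cheap.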
- Recommended Safe Practice
For stable and predictable performance:
- Keep the Skip Hash list ideally under 5 million entries
- Use a dedicated SQL instance
- Ensure proper indexing
- Use SSD/NVMe storage
- Monitor:
  - DB CPU usage
  - Disk I/O latency
  - Query execution time
- Perform large imports during maintenance windows
- Avoid parallel CSV uploads
- Final Recommendation
There is no official maximum number of hashes, but performance is infrastructure-dependent.
For most production environments:
1–5 million hashes can be handled without noticeable performance impact when properly sized. Above that, performance depends heavily on database capacity and tuning.
If planning to scale beyond 5–10 million entries, performance testing in a staging environment is strongly recommended.
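For such a staging test, synthetic data is usually enough. The sketch below generates a CSV of random SHA-256-length hex strings that can stand in for real skip hashes during an import/performance test; the column layout is an assumption to be matched to your real export format:

```python
import csv
import os
import secrets

def generate_test_hashes(path, count):
    """Write a CSV of `count` random 64-char hex strings as synthetic
    Skip Hash entries for a staging import test. Returns the file size
    in bytes, so you can verify it stays under the 500 MB import limit."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["hash", "comment"])
        for i in range(count):
            # token_hex(32) yields 64 hex characters, the length of a SHA-256 digest.
            writer.writerow([secrets.token_hex(32), f"load-test-{i}"])
    return os.path.getsize(path)
```

Timing imports of, say, 1M, 5M, and 10M synthetic entries against a staging database with production-like sizing gives a realistic picture before committing to a large production list.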
If further assistance is required, please log a support case or chat with one of our support engineers.
