Daily Operations

This chapter describes the day-to-day cadence for running MetaDefender NDR. It covers the checklist operators work through at the start of each shift, the typical daily workflow from ingestion review through tuning, and the lower-frequency weekly and monthly activities that keep the deployment healthy. The chapter intentionally stays at the routine-operations level: deep investigation procedures live in the Investigation Runbooks, and administrative procedures live on the Administration Page.

This guide is written for Tier 1 and Tier 2 Security Operations Center (SOC) analysts, shift leads, and on-call engineers. It assumes a running MetaDefender NDR deployment with at least one active sensor and a user account with hunting and health-viewing permissions.

First-use acronym expansions in this chapter: SOC (Security Operations Center), NDR (Network Detection and Response), SPAN (Switched Port Analyzer), KPI (Key Performance Indicator), IOC (Indicator of Compromise), IDS / IPS (Intrusion Detection / Prevention System), ML (machine learning), PCAP (packet capture), SLA (service-level agreement), TLS (Transport Layer Security), RBAC (role-based access control).

Daily Checklist

At the beginning of each shift, operators work through the following checklist. The goal is to confirm the platform is healthy, the detection pipeline is producing events at the expected rate, and no high-severity alerts are sitting un-triaged.

  • Platform Health
    What to check: All sensors and the Manager are online, reporting fresh heartbeats, and ingesting traffic at the expected rate.
    How to check: Open (Link Removed) and confirm a green status for the Manager and every sensor. Review the sensor heartbeat column, the capture-rate and drop-rate indicators, and the pipeline lag metrics.

  • Detection Pipeline
    What to check: Detections are arriving and the severity distribution is consistent with the prior day. Sudden drops or spikes warrant investigation.
    How to check: Open the Dashboard. Compare the Recent Severities donut and the Top Signature Hits widget against the previous 24-hour window. Scan Recent Alerts for any activity since the last shift.

  • High-Severity Alerts
    What to check: Every Critical and High alert from the overnight window has been acknowledged, assigned, or closed with a triage note.
    How to check: Open the Hunt Page, switch to the All Alerts tab, apply a severity filter of Critical or High, and set the time range to cover the gap since the last shift. Work the list from top to bottom.

  • Updates Currency
    What to check: Suricata signatures, InSights feeds, and command-and-control (C2) intelligence are current, and no failed update jobs are sitting in an error state.
    How to check: Open Administration → Updates and confirm the last successful update timestamp for each feed. Review any distribution errors surfaced on the sensor targets.

Operators who find a failing item on the checklist record the observation, apply the relevant runbook, and escalate per the site's incident response procedure. The checklist is intentionally short so it takes under ten minutes when nothing is out of place.
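
The checklist also lends itself to light automation. The sketch below shows one way the platform-health pass could be scripted; it assumes a hypothetical REST endpoint (/api/v1/sensors), a bearer-token scheme, and response fields (status, last_heartbeat, drop_rate) that are illustrative, not taken from the product's documented API.

    # Hypothetical morning health check. Endpoint, auth scheme, and field
    # names are assumptions for illustration, not the documented NDR API.
    import datetime
    import requests

    MANAGER = "https://ndr-manager.example.internal"   # hypothetical Manager URL
    TOKEN = "REDACTED"                                 # hypothetical API token
    HEARTBEAT_MAX_AGE = datetime.timedelta(minutes=5)  # site-specific threshold
    DROP_RATE_MAX = 0.01                               # flag >1% packet drops

    def check_sensors() -> list[str]:
        """Return human-readable findings; an empty list means healthy."""
        resp = requests.get(
            f"{MANAGER}/api/v1/sensors",
            headers={"Authorization": f"Bearer {TOKEN}"},
            timeout=10,
        )
        resp.raise_for_status()
        findings = []
        now = datetime.datetime.now(datetime.timezone.utc)
        for sensor in resp.json():  # assumed: a list of sensor objects
            name = sensor["name"]
            if sensor["status"] != "online":
                findings.append(f"{name}: status is {sensor['status']}")
            # assumed: last_heartbeat is a timezone-aware ISO 8601 string
            age = now - datetime.datetime.fromisoformat(sensor["last_heartbeat"])
            if age > HEARTBEAT_MAX_AGE:
                findings.append(f"{name}: heartbeat is {age} old")
            if sensor["drop_rate"] > DROP_RATE_MAX:
                findings.append(f"{name}: drop rate {sensor['drop_rate']:.2%}")
        return findings

    if __name__ == "__main__":
        for line in check_sensors() or ["all sensors healthy"]:
            print(line)

A failing check from a script like this feeds the same path as a manual finding: record the observation, apply the runbook, escalate.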

Typical Daily Workflow

The routine daily workflow follows the checklist with investigation, tuning, and handoff activities woven in. Each step is self-contained; analysts may skip steps that the checklist has already cleared.

  1. Review platform health and ingestion. Confirm sensor heartbeats, capture statistics, and pipeline lag from (Link Removed). If a sensor is offline or dropping packets, operators raise the appropriate ticket before beginning alert triage, because gaps in ingestion affect the reliability of the rest of the day's work.
  2. Scan the Dashboard for anomalies. Open the Dashboard and review the Recent Severities donut, Top Signature Hits, and Top Source, Destination, and Port widgets against the prior 24-hour baseline. Any signature or entity that is suddenly dominant and any widget that has gone quiet are both worth a note.
  3. Triage high-severity alerts first. Apply a Critical + High severity filter on the Hunt Page All Alerts tab and work the results oldest-to-newest; a scripted version of this pull appears after this list. For each alert, the operator either escalates to full investigation, downgrades after confirming benign activity, or annotates with a triage note so the shift lead can see progress.
  4. Investigate priority alerts using the runbook library. For Critical alerts and any High alert with credible evidence of compromise, analysts switch from triage to investigation. Common entry points live in the Investigation runbooks: critical-alert triage, C2 beacon investigation, data-exfiltration investigation, and malicious-file investigation.
  5. Correlate file-related alerts with MetaDefender Core. When a detection references an extracted file, operators review the MetaDefender Core enrichment on the alert's detail sidebar. If the file does not yet have a scan result, the analyst coordinates with MetaDefender Core operators to submit it and revisits the alert once the verdict is available. The Malicious File Investigation runbook covers the full procedure.
  6. Work the medium and low queues. After Critical and High are cleared, analysts rotate through Medium and Low severities as time permits. Medium items routinely require correlation with other telemetry; Low items are mostly tuning candidates.
  7. Record tuning candidates. When an alert is consistently benign in the environment, operators open a tuning ticket against the relevant detection family. Repeated false positives on a specific signature, C2 indicator, or behavioral rule feed the weekly tuning cadence; operators do not silently suppress alerts from the shift view.
  8. Hand off to the next shift. The outgoing shift posts a short handover note covering open investigations, any alerts that require continued attention, any platform-health anomalies observed, and any tuning tickets raised during the shift. The handover note is what lets the incoming shift skip steps the outgoing shift has already completed.
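
Step 3's severity filter can likewise be expressed as an API query so that the oldest un-triaged alerts come back first. The sketch below assumes a hypothetical /api/v1/alerts search endpoint and query-parameter names (severity, status, from, sort); the real Hunt Page filters may be named differently.

    # Hypothetical triage pull for step 3: open Critical/High alerts since
    # the last shift change, oldest first. Endpoint and parameters are
    # illustrative assumptions, not the documented NDR API.
    import requests

    MANAGER = "https://ndr-manager.example.internal"  # hypothetical Manager URL
    TOKEN = "REDACTED"                                # hypothetical API token

    def untriaged_high_severity(since_iso: str) -> list[dict]:
        """Fetch open Critical/High alerts from since_iso onward, oldest first."""
        resp = requests.get(
            f"{MANAGER}/api/v1/alerts",
            headers={"Authorization": f"Bearer {TOKEN}"},
            params={
                "severity": "critical,high",  # mirrors the Hunt Page severity filter
                "status": "open",             # exclude acknowledged/closed alerts
                "from": since_iso,            # cover the gap since the last shift
                "sort": "timestamp:asc",      # oldest-to-newest, as in step 3
            },
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()

    # Example: review everything since a 06:00 UTC shift change.
    for alert in untriaged_high_severity("2025-01-15T06:00:00Z"):
        print(alert["timestamp"], alert["severity"], alert["signature"])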

Weekly Cadence

Some activities are cheaper to run once a week than once a day, because they require aggregating observations across multiple shifts.

  • Rule-update review. Analysts review the previous week's Suricata-signature and InSights-feed updates from Administration → Updates. The review covers release notes for new detection categories, severity changes, and deprecations, and logs any of those changes that may shift alert volumes or break existing runbooks.
  • Tuning review. Shift leads consolidate the week's tuning tickets — benign alerts that recur, high-false-positive signatures, behavioral rules that are too noisy in the environment — and apply the adjustments that the tuning process approves. Policy edits flow through the change-management process rather than being applied ad hoc.
  • Sensor-health review. Operators review sensor capture-rate and drop-rate trends from (Link Removed) for the previous seven days; a trend-check sketch follows this list. A sustained fall in capture rate, a sustained rise in drop rate, or a sustained increase in pipeline lag raises a capacity or configuration question that the administrator addresses before it becomes a coverage gap.
  • Backlog review. Shift leads review open investigations that have been carried forward for more than one shift and either close them, escalate them, or assign them an explicit owner.
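
The sensor-health review largely reduces to comparing the back half of the week against the front half. The sketch below assumes a hypothetical per-sensor metrics endpoint that returns daily drop-rate samples; the endpoint, parameters, and the 1.5x threshold are illustrative choices, not product defaults.

    # Hypothetical weekly drop-rate trend check. Endpoint, parameters, and
    # field names are assumptions for illustration only.
    import statistics
    import requests

    MANAGER = "https://ndr-manager.example.internal"  # hypothetical Manager URL
    TOKEN = "REDACTED"                                # hypothetical API token

    def weekly_drop_trend(sensor_id: str) -> str:
        resp = requests.get(
            f"{MANAGER}/api/v1/sensors/{sensor_id}/metrics",
            headers={"Authorization": f"Bearer {TOKEN}"},
            params={"metric": "drop_rate", "window": "7d", "step": "1d"},
            timeout=30,
        )
        resp.raise_for_status()
        samples = [point["value"] for point in resp.json()]
        if len(samples) < 2:
            return f"{sensor_id}: not enough samples to trend"
        first_half = statistics.mean(samples[: len(samples) // 2])
        second_half = statistics.mean(samples[len(samples) // 2 :])
        # A sustained rise in the back half of the week is the signal the
        # weekly review looks for before it becomes a coverage gap.
        if second_half > first_half * 1.5 and second_half > 0.01:
            return (f"{sensor_id}: drop rate trending up "
                    f"({first_half:.2%} -> {second_half:.2%})")
        return f"{sensor_id}: steady ({second_half:.2%})"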

Monthly Cadence

A small number of activities are appropriate once a month. The goal is to keep deployment design, detection coverage, and data retention aligned with the environment.

  • Sensor-placement review. Network and security engineering review the current SPAN, tap, and sensor layout against the segments that carry high-value traffic. Any segment that has become important since the last review is either added to the sensor footprint or flagged for the next deployment cycle.
  • Detection-tuning audit. The cumulative effect of a month of tuning changes is reviewed against the baseline detection library. Tuning entries that are no longer needed are removed; tuning entries that materially shifted alert volumes are documented in the site's detection runbooks.
  • Retention review. Administrators confirm that the (Link Removed) settings still match the environment's storage footprint and any compliance obligations. Adjustments to retention windows for alerts, flows, session records, extracted files, and packet captures are made through the administration interface and recorded in the change log.
  • Role and access review. The shift lead and an administrator review the active user list and role-based access control (RBAC) assignments from (Link Removed), remove accounts for personnel who no longer require access, and confirm that group-based asset ownership still reflects the current organizational structure; a stale-account sketch follows this list.
  • Release-notes check. For deployments upgraded since the last review, operators scan the release notes for features that have moved from preview to general availability and adjust local runbooks where labels or behavior have changed.
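
The role and access review can start from a scripted list of accounts with no recent logins. The sketch below assumes a hypothetical /api/v1/users endpoint with last_login and roles fields and a site-chosen 90-day staleness policy; none of these names come from the product's actual RBAC interface.

    # Hypothetical stale-account pass for the monthly access review.
    # Endpoint, field names, and the 90-day policy are illustrative.
    import datetime
    import requests

    MANAGER = "https://ndr-manager.example.internal"  # hypothetical Manager URL
    TOKEN = "REDACTED"                                # hypothetical API token
    STALE_AFTER = datetime.timedelta(days=90)         # site-specific policy

    def stale_accounts() -> list[str]:
        """List accounts with no login inside the stale window, for review."""
        resp = requests.get(
            f"{MANAGER}/api/v1/users",
            headers={"Authorization": f"Bearer {TOKEN}"},
            timeout=30,
        )
        resp.raise_for_status()
        now = datetime.datetime.now(datetime.timezone.utc)
        flagged = []
        for user in resp.json():  # assumed: a list of user objects
            last = user.get("last_login")  # assumed: tz-aware ISO 8601 or None
            if last is None or now - datetime.datetime.fromisoformat(last) > STALE_AFTER:
                flagged.append(f"{user['username']} (roles: {', '.join(user['roles'])})")
        return flagged

Flagged accounts still go through the human review; the script only orders the work.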

See Also

  • Dashboard — the first checkpoint on the daily checklist.
  • Hunt Page — the primary triage and investigation surface referenced throughout the daily workflow.
  • Investigation Runbooks — deep procedures for the investigation step of the daily workflow.
  • (Link Removed) — backing detail for the platform-health checks at the start of each shift and the sensor-health review each week.
  • (Link Removed) — the surface operators use to confirm update currency daily and to plan rule reviews weekly.