Security Theatre: Busy Metrics in the SOC are a Great Show, but Terrible Defense
The first in a series of security risk articles by Craig Jones, chief security officer, Ontinue
Unfortunately, too many security professionals, at all levels and disciplines, still hold the old-school view that metrics from a noisy SOC are the best benchmarks for strong security performance. I call this “activity theater”: a lot of show from alerts, but no real substance. In fact, high alert volumes are mostly noise, and high noise levels degrade, rather than enhance, true detection capability. A show of alert activity may look impressive, but it is illusory noise that camouflages exposure points and widens attack opportunities.
This article presents a new way for CISOs to evaluate, explain and leverage metrics so they can accurately and effectively inform their board of directors, CEO and other key stakeholders about the status of security in their organization, and why risk reduction is the best metric to measure.
Drowning in Metrics
We all know that security teams – in-house and at managed SOCs – are drowning in metrics. The industry has no shortage of alert quantity. But a busy SOC is not a healthy SOC; it is a sign of poor tuning and unattended protocols, which produce distractions that keep defenders from detecting, responding to and stopping what matters most: actual cyberattacks in motion.
Well-tuned SOCs should be quiet, so when something happens and alerts fire, defenders know how to prioritize the activity and quickly attend to what needs immediate attention. When alerts are rare, precise, and meaningful, defenders can immediately recognize anomalies and respond decisively instead of spending their time sifting through low-value noise. In our experience, true positive cases outside of the phishing landscape tend to be highly unique, previously unseen, and rarely repeated across customer environments. When detections are tuned correctly, roughly 90% of cases across our customers represent unique true positive incidents.
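As a rough illustration of how that “quiet and precise” quality can be tracked, here is a minimal Python sketch, assuming a simple log of closed alerts with a disposition field (the field names are illustrative, not drawn from any particular SIEM schema), that computes the share of closed alerts that turned out to be true positives:

```python
# Minimal sketch: measuring how precise a SOC's detections are.
# Field names (disposition, true_positive) are illustrative assumptions.
from collections import Counter

def alert_precision(closed_alerts: list[dict]) -> float:
    """Fraction of closed alerts that were true positives.

    A well-tuned SOC should see this number rise as noisy
    detections are suppressed or refined.
    """
    dispositions = Counter(a["disposition"] for a in closed_alerts)
    true_positives = dispositions.get("true_positive", 0)
    total = sum(dispositions.values())
    return true_positives / total if total else 0.0

# Example: 9 of 10 closed alerts were real incidents -> 90%,
# in line with the figure cited above.
sample = [{"disposition": "true_positive"}] * 9 + [{"disposition": "false_positive"}]
print(f"alert precision: {alert_precision(sample):.0%}")
```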
Risk reduction metrics should be at the forefront of any CISO’s report to their board. This approach may require a fresh mindset and take time for the industry to adopt, but shifting to evaluating and executing on less will result in stronger, more resilient security.
Noise Increases Security Risk
Simply put, overwhelming amounts of reactive “busy work” create risk and leave no time to properly manage transformation or to enable and secure technology for critical business processes.
The collective goal, shared by in-house security teams and managed extended detection and response (MXDR) service providers, is to reduce business risk. This is done through focused metrics that allow SOCs to analyze critical alerts that fired or didn’t fire, work fast and with accuracy, and continuously refine and improve detections.
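“Work fast” can itself be made measurable. Below is a minimal sketch, assuming hypothetical detected_at and contained_at timestamps on each incident record, that computes containment speed; medians resist skew from a single long-running incident, which makes them a fairer speed metric than averages:

```python
# Minimal sketch: computing containment speed from incident timestamps.
# Field names (detected_at, contained_at) are assumptions for illustration.
from datetime import datetime
from statistics import median

def time_to_contain_minutes(incidents: list[dict]) -> dict:
    """Median and worst-case time-to-contain, in minutes."""
    durations = [
        (i["contained_at"] - i["detected_at"]).total_seconds() / 60
        for i in incidents
    ]
    return {"median_min": median(durations), "max_min": max(durations)}

incidents = [
    {"detected_at": datetime(2025, 3, 1, 9, 0), "contained_at": datetime(2025, 3, 1, 9, 25)},
    {"detected_at": datetime(2025, 3, 2, 14, 0), "contained_at": datetime(2025, 3, 2, 16, 10)},
]
print(time_to_contain_minutes(incidents))  # {'median_min': 77.5, 'max_min': 130.0}
```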
It’s easy for organizations, especially those with smaller security teams, to get overwhelmed by inaccurate detections that add layers of noise to SOC metrics and erode visibility. When we onboard new customers, one of the first steps is to clean up incompatible legacy security tooling to simplify the environment. This gives us clear visibility into what’s truly happening across the entire estate and lets us diagnose weak points.
For example, in some onboarding cases our experts have seen security controls designed decades ago (mis)applied to AWS and other modern cloud environments. In other situations, we’ve unknowingly taken over systems that attackers had already compromised, thanks to blind spots created by an overabundance of alerts. Messy SOCs also make it easy to fall behind on patching and other security basics, and once alerts are out of control, it’s even harder to put preventive measures in place.
In addition to increasing security risk, activity theater has a human dark side. These busy environments contribute to alert fatigue and burnout, very real factors in the security industry that affect not only the quality of defenders’ work but also the longevity of their careers and their mental health.
The Ideal State: A Quiet SOC
Our goal as defenders is to reduce alert volume while increasing detection quality. We’re constantly monitoring and adapting so that our response to suspicious activity is strategic, predictable, and fast. We consider every detection a security failure, because it means prevention fell short; this is why a quiet SOC is the ideal state.
To achieve this, we focus on metrics that measure containment speed, risk reduction, detection quality, and automation effectiveness. Measuring the right components requires sustained tuning, correlation, automation, and contextual analysis – work that often competes with day-to-day incident response – which is why having artificial intelligence and automation built into a SOC is critical. These technologies can observe, orient, decide, and act (the OODA loop) on benign and allow-listed alerts at machine speed, so security analysts and threat hunters can focus on the notifications deemed urgent for the SOC.
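To make the OODA framing concrete, here is a deliberately simplified sketch of an automated triage pass. The allow-list, severity field, and stage logic are hypothetical placeholders, not Ontinue’s actual implementation; a production SOC would back each stage with real enrichment and deterministic playbooks:

```python
# Minimal sketch of an OODA-style triage loop over incoming alerts.
# ALLOW_LIST entries and field names are hypothetical examples.
ALLOW_LIST = {"approved_backup_agent", "vuln_scanner"}

def ooda_triage(alert: dict) -> str:
    observed = alert                                   # observe: raw alert
    context = {                                        # orient: add context
        "known_benign": observed["source"] in ALLOW_LIST,
        "severity": observed.get("severity", "low"),
    }
    if context["known_benign"]:                        # decide + act:
        return "auto_closed"                           # close at machine speed
    if context["severity"] in {"high", "critical"}:
        return "escalated_to_analyst"                  # human attention
    return "queued_for_enrichment"                     # needs more signal

alerts = [
    {"source": "vuln_scanner", "severity": "low"},
    {"source": "unknown_host", "severity": "critical"},
]
for a in alerts:
    print(a["source"], "->", ooda_triage(a))
```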
As an example, experts in our Cyber Defense Center spent time on only about 3% of total customer incidents in 2025; our deterministic automation resolved the remaining 97%, as outlined in our whitepaper, Cutting Through the Hype: What Agentic AI Really Means and the Future of Security Operations. Metrics from this 3%, the “ideal quiet state,” are what CISOs should focus on when measuring and presenting the value and status of their SOC to stakeholders.
As mentioned above, a key role for security experts is to constantly calibrate detection and tuning in the SOC. Think of the SOC as a living, dynamic environment that changes daily, if not hourly: employees come and go (possibly leaving behind vulnerable ghost accounts and credentials); local admin access goes stale; default or weak configurations sit exposed to the internet without proper patching or segmentation; and human errors creep in, such as plugging a tainted personal USB drive into a corporate-managed workstation. A sketch of one such calibration check follows.
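As one concrete example of that calibration work, a minimal sketch for flagging dormant “ghost” accounts might look like the following. The field names and the 90-day threshold are assumptions for illustration, not a specific directory API:

```python
# Minimal sketch: flagging "ghost" accounts with no recent sign-in.
# Field names and the 90-day threshold are illustrative assumptions.
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=90)

def find_ghost_accounts(accounts: list[dict], now: datetime) -> list[str]:
    """Return enabled usernames whose last sign-in predates the staleness window."""
    return [
        a["username"]
        for a in accounts
        if a["enabled"] and now - a["last_sign_in"] > STALE_AFTER
    ]

accounts = [
    {"username": "jdoe", "enabled": True, "last_sign_in": datetime(2024, 11, 1)},
    {"username": "asmith", "enabled": True, "last_sign_in": datetime(2025, 6, 1)},
]
print(find_ghost_accounts(accounts, now=datetime(2025, 6, 15)))  # ['jdoe']
```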
Metric Altitudes
Let’s take a step back and look at the other 97% of alerts. At first glance, these may seem destined for the cutting-room floor, but they do have a place in metrics reporting; they’re just not fit for page one of a status summary for the board and CEO.
The chart below shows how we’ve categorized select metrics as strategic, operational and/or tactical, by audience, and by what they measure. This is an example of what we call Metric Altitudes. Every alert in the SOC matters, even those in the 97%. But because they carry very distinct meanings and call for distinct actions, it’s smarter to organize them into these categories. Note: for the purpose of introducing and explaining the topic, this chart contains only a small sampling of the hundreds of metrics that Ontinue tracks. We will provide deeper information on Metric Altitudes in subsequent articles, including a complete library of metrics to build upon or use as a template.
| Metric | Strategic / Operational / Tactical | Board / CISO / Mgmt | What it measures | How to calculate (pseudo-measurement entry) | Measurement points (old / baseline) | Measurement points (new / AI-instrumented) |
| --- | --- | --- | --- | --- | --- | --- |
| Detection change failure rate | Tactical | Mgmt | Stability of detection engineering | rollbacks / deployments | deployment_id, rollback_flag | Same |
| Time-to-fix noisy detections | Tactical | Mgmt | Responsiveness to noise | fix_deployed – noise_logged | noise_logged_time, fix_deployed_time | Same |
| Detection coverage growth | Strategic, Tactical | CISO, Mgmt | Coverage expansion over time | new detections per period + mapped scope | detection_added_time, detection_scope | Same |
| Telemetry health / ingestion uptime | Operational | Mgmt | Whether monitoring has blind spots | % required sources healthy | source_status, last_ingested_time | Same |
| Playbook success rate | Operational, Tactical | Mgmt | Reliability of response automation | successful_runs / total_runs | playbook_run_id, status | Same + include AI-triggered runs separately |
| Automation utilization rate | Operational, Tactical | Mgmt | How often automation meaningfully contributes | % cases with successful automation step | automation_step_executed=true | Same + ai_triggered_automation=true |
| Automation “hours returned” | Operational, Tactical | CISO | Effort reduction (provider + customer) | estimated time saved vs. baseline | baseline task timing estimates | Use measured: ai_time_saved_estimate, workflow durations |
| Utilization rate (internal) | Operational | Mgmt | Capacity consumption | consumed_hours / available_hours | staffing schedules + time tracking | Same |
| Forecast accuracy (case volume) | Operational, Tactical | Mgmt | Predictability of workload | % error / MAPE | forecast_volume, actual_volume | Same |
| Noise growth rate | Operational, Tactical | CISO, Mgmt | Whether alert fatigue is trending up/down | MoM change in low-value alerts | low_value_alert_count_by_month | Same |
| Stability index (service variance) | Operational, Tactical | CISO, Mgmt | Whether performance is consistent | variance of TTN/MTTI/MTTR over time | KPI time series | Same |
| Investigation Quality Score (IQS) | Operational, Tactical | CISO, Mgmt | Quality of investigations (QC sampled) | sampled checklist score | qc_sample_id, qc_checklist_scores | Same + include “AI-assisted” attribute |
| Severity downgrade/upgrade rate | Operational, Tactical | Mgmt | Triage calibration | % that shift severity materially | initial_severity, final_severity | Same |
| Root-cause recurrence (30/60/90) | Strategic | Board, CISO | Whether fixes stick | repeats / total | root_cause_tag, incident dates | Same |
| Time-to-improve (gap → fix deployed) | Strategic, Operational, Tactical | CISO, Mgmt | Continuous improvement velocity | fix_deployed – gap_logged | gap_logged_time, fix_deployed_time | Same |
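To show how the pseudo-measurement entries in the table translate into numbers, here is a minimal sketch computing three of the metrics above (detection change failure rate, forecast accuracy via MAPE, and noise growth rate). The variable names mirror the table’s measurement points, and the sample values are illustrative only:

```python
# Minimal sketch of three metrics from the table, on illustrative sample data.

# Detection change failure rate = rollbacks / deployments
deployments = [{"deployment_id": i, "rollback_flag": f}
               for i, f in enumerate([False, False, True, False])]
change_failure_rate = sum(d["rollback_flag"] for d in deployments) / len(deployments)

# Forecast accuracy via MAPE = mean(|forecast - actual| / actual)
pairs = [(120, 110), (95, 100), (140, 150)]  # (forecast_volume, actual_volume)
mape = sum(abs(f - a) / a for f, a in pairs) / len(pairs)

# Noise growth rate = month-over-month change in low-value alerts
low_value_alert_count_by_month = {"2025-01": 4200, "2025-02": 3800}
prev, curr = low_value_alert_count_by_month.values()
noise_growth = (curr - prev) / prev

print(f"change failure rate: {change_failure_rate:.0%}")  # 25%
print(f"MAPE: {mape:.1%}")                                # ~6.9%
print(f"noise growth MoM: {noise_growth:+.1%}")           # -9.5% (improving)
```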