Ontinue’s Practical SOC Metrics Library: Measuring What Actually Matters
In security operations, you can’t improve what you don’t measure. Yet many security teams struggle with metrics that are either too abstract for decision-making or too tactical to demonstrate business value.
Ontinue has developed a comprehensive metrics library that bridges this gap. It provides security leaders with a practical framework to measure performance across the entire detection and response lifecycle—and, crucially, to quantify the impact of AI on SOC efficacy.
This is not about AI theater. These metrics aren’t designed to create impressive dashboards or justify technology purchases with vanity numbers. They’re built to measure real operational outcomes: faster detection, higher-quality investigations, reduced customer burden, and demonstrable risk reduction. Every metric ties back to one of four fundamentals of security operations:
- Speed
- Quality
- Governance
- Business impact
A Measurement Framework, Not Just Definitions
What makes Ontinue’s library uniquely practical is that it doesn’t just define what to measure—it explains how to measure it.
Each metric includes:
- Specific calculation methods
- The exact measurement points needed from existing systems
This transforms abstract concepts like Mean Time to Investigate into concrete data collection requirements (for example: alert_created_time, validated_time, and the timestamps in between).
The library distinguishes between:
- Baseline measurement points – what traditional SOCs can track today
- AI-instrumented measurement points – additional telemetry needed to understand AI’s contribution
This dual-track approach lets organizations measure AI impact through direct comparison, rather than assumption.
Built for Every Stakeholder
The Ontinue metrics library recognizes a fundamental truth: different audiences need different views of security performance.
- Board members care about strategic outcomes and risk reduction.
- CISOs need operational insights into service quality and continuous improvement.
- Security managers require tactical metrics to optimize workflows and resource allocation.
The library addresses this by mapping 50+ metrics across three “altitude levels” and clearly identifying which metrics matter to which audience:
- Strategic
- Operational
- Tactical
This ensures boards aren’t drowning in tactical details while frontline managers get the operational visibility they need.
The Core: Detection and Response Fundamentals
At the foundation are time-based metrics that form the common language of modern security operations.
Mean Time to Detect (MTTD)
Definition: How quickly threats are identified relative to attacker activity.
Calculation: detection_time – incident_start_time
Because incident_start_time is often unknown, the library provides practical proxies (for example, file creation time for malware incidents).
AI-instrumented extension:
- Adds `ai_correlation_start` and `ai_correlation_end` to track when AI accelerates the correlation process.
Mean Time to Investigate (MTTI)
Definition: The speed from alert creation to investigation start.
Calculation: updated_time – alert_created_time
AI-instrumented extension:
- Adds timestamps such as `ai_enrichment_start/end` and `ai_summary_start/end`, plus `ai_confidence` scores, to measure whether AI enrichment actually speeds the investigator’s ability to act.
Mean Time to Qualified Response (MTTQR)
Definition: The full cycle from alert creation to a validated conclusion (final state change).
Calculation: validated_time – alert_created_time
This is often where AI’s impact is most visible, as automated enrichment and summarization can compress the validation phase from hours to minutes.
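The three time-based calculations above reduce to timestamp arithmetic. A minimal Python sketch over hypothetical case records, using the library’s measurement-point names (the record layout itself is an assumption):

```python
from datetime import datetime
from statistics import mean

# Hypothetical case records keyed by the library's measurement-point names.
cases = [
    {"incident_start_time": datetime(2024, 5, 1, 9, 0),
     "detection_time":      datetime(2024, 5, 1, 9, 20),
     "alert_created_time":  datetime(2024, 5, 1, 9, 21),
     "updated_time":        datetime(2024, 5, 1, 9, 35),   # investigation started
     "validated_time":      datetime(2024, 5, 1, 10, 1)},
    {"incident_start_time": datetime(2024, 5, 2, 14, 0),
     "detection_time":      datetime(2024, 5, 2, 14, 5),
     "alert_created_time":  datetime(2024, 5, 2, 14, 6),
     "updated_time":        datetime(2024, 5, 2, 14, 16),
     "validated_time":      datetime(2024, 5, 2, 14, 36)},
]

def mean_minutes(cases, end_field, start_field):
    """Average (end - start) across cases, in minutes."""
    return mean((c[end_field] - c[start_field]).total_seconds() / 60 for c in cases)

mttd  = mean_minutes(cases, "detection_time", "incident_start_time")
mtti  = mean_minutes(cases, "updated_time",   "alert_created_time")
mttqr = mean_minutes(cases, "validated_time", "alert_created_time")
print(f"MTTD: {mttd:.1f} min, MTTI: {mtti:.1f} min, MTTQR: {mttqr:.1f} min")
```

In practice the same helper works for any pair of measurement points, which is why capturing consistent timestamps matters more than the formulas themselves.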
Quality Metrics: Speed Without Noise
Speed without quality is just noise. The library includes robust quality metrics with clear calculation methods.
Benign Positive Rate
Calculation: benign_positives ÷ total_alerts
AI-instrumented extension:
- Flags whether AI recommended the disposition.
- Tracks `ai_disposition` and `ai_confidence` alongside human validation to understand where AI excels and where it struggles.
Alert-to-Incident Conversion Rate
Definition: Measures signal quality.
Calculation: validated_incidents ÷ total_alerts
A low rate suggests noise; a high rate indicates effective filtering.
Measurement points:
- `alert_id`
- `incident_linked` flag
- `validated` status
Precision by Severity
Definition: Ensures accuracy where it matters most.
Calculation (for high severity): true_high ÷ all_high_flagged_alerts
Measurement points:
- `initial_severity`
- `final_validated_outcome`
This creates a feedback loop that can tune both human and AI severity assessments.
Reopen Rate
Definition: Tracks quality of closures.
Calculation: reopened_cases ÷ closed_cases
Measurement points:
- `case_closed_time`
- `case_reopen_time`
- `reopen_reason`
AI-instrumented extension:
- Splits reopens between AI-assisted and non‑AI‑assisted cases to identify whether automation is creating incomplete resolutions.
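As a sketch, the quality ratios above reduce to simple divisions over counters most ticketing systems already expose (the counts below are invented for illustration):

```python
# Invented counters pulled from a hypothetical ticketing system.
total_alerts        = 1200
benign_positives    = 300
validated_incidents = 90
closed_cases        = 85
reopened_cases      = 4

benign_positive_rate = benign_positives / total_alerts     # share of alerts that were benign
conversion_rate      = validated_incidents / total_alerts  # alert-to-incident signal quality
reopen_rate          = reopened_cases / closed_cases       # closure quality

print(f"Benign positive rate: {benign_positive_rate:.1%}")
print(f"Alert-to-incident conversion: {conversion_rate:.1%}")
print(f"Reopen rate: {reopen_rate:.1%}")
```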
Customer-Facing Metrics
The framework also rigorously addresses often-overlooked customer-facing metrics.
Time to Notify (TTN)
Definition: Communication speed once an incident is validated.
Calculation: customer_notified_time – validated_time
AI-instrumented extension:
- Adds `ai_draft_start/end` timestamps to quantify whether AI-generated notifications actually accelerate customer communication.
Time to Actionable Summary (TTAS)
Definition: How fast customers receive usable narratives.
Calculation: actionable_summary_sent – validated_time
Measurement points:
- `ai_summary_start/end` for AI-generated drafts
- `human_edit_time` to capture human revisions
This reveals whether AI produces drafts that need heavy revision or are close to “ready to send.”
Customer Involvement Rate
Definition: Tracks workload shift to the customer.
Calculation: Percentage of incidents requiring customer action.
Measurement points:
- `customer_action_required` flag
- `action_count`
AI-instrumented extension:
- Shows whether AI can reduce customer burden by autonomously handling more resolution steps.
The Game-Changer: AI SOC Efficacy Metrics That Matter
What distinguishes Ontinue’s metrics library is its sophisticated approach to measuring AI’s impact on security operations.
These are not “bot optimization” metrics. The library deliberately avoids measuring AI activity for its own sake (for example, number of API calls, tokens consumed, or how “busy” the AI appears).
Instead, it focuses ruthlessly on operational outcomes:
- Did the investigation complete faster?
- Was the analysis more complete?
- Did the customer receive better service?
AI Utilization Rate
Definition: Measures AI’s meaningful contribution.
Calculation: Percentage of cases where ai_action_type != "none".
Measurement points:
- `ai_assist_flag`
- `ai_action_type` taxonomy (enrich / summarize / disposition / recommend)
This prevents gaming the metric with trivial AI touches—only material contributions count.
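A sketch of the utilization calculation, counting only cases where `ai_action_type` records a material contribution (the case data is illustrative):

```python
# Illustrative cases; "none" means AI made no material contribution.
cases = [
    {"case_id": 1, "ai_action_type": "enrich"},
    {"case_id": 2, "ai_action_type": "none"},
    {"case_id": 3, "ai_action_type": "summarize"},
    {"case_id": 4, "ai_action_type": "disposition"},
]

ai_utilization_rate = sum(c["ai_action_type"] != "none" for c in cases) / len(cases)
print(f"AI Utilization Rate: {ai_utilization_rate:.0%}")  # 75%
```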
AI-Attributed MTTI Reduction
Definition: The “crown jewel” for proving ROI.
Calculation: Compare MTTI for non‑AI cases vs. AI‑assisted cases within matched cohorts.
The library specifies what “matched” means:
- Same alert types
- Same severity levels
- Comparable complexity
Measurement points:
- All baseline timestamps
- `ai_assist_flag`
- Case-type matching fields
This rigor ensures any speed improvement is real, not an artifact of AI being applied only to easier cases.
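One way to sketch the matched-cohort comparison is to group cases by alert type and severity, then compare MTTI only where a cohort contains both AI-assisted and baseline cases. The data and any field names beyond the library’s measurement points are assumptions:

```python
from collections import defaultdict
from statistics import mean

# Illustrative cases; cohort key = (alert_type, severity) per the matching rules.
cases = [
    {"alert_type": "phishing", "severity": "high", "ai_assist_flag": True,  "mtti_min": 8},
    {"alert_type": "phishing", "severity": "high", "ai_assist_flag": False, "mtti_min": 22},
    {"alert_type": "phishing", "severity": "high", "ai_assist_flag": True,  "mtti_min": 10},
    {"alert_type": "phishing", "severity": "high", "ai_assist_flag": False, "mtti_min": 18},
    {"alert_type": "malware",  "severity": "low",  "ai_assist_flag": True,  "mtti_min": 5},
]

cohorts = defaultdict(lambda: {"ai": [], "baseline": []})
for c in cases:
    key = (c["alert_type"], c["severity"])
    cohorts[key]["ai" if c["ai_assist_flag"] else "baseline"].append(c["mtti_min"])

for key, groups in cohorts.items():
    # Only compare cohorts that contain both populations; unmatched
    # cohorts (like malware/low here) are excluded, which is exactly
    # what prevents AI looking fast because it only saw easy cases.
    if groups["ai"] and groups["baseline"]:
        reduction = mean(groups["baseline"]) - mean(groups["ai"])
        print(key, f"AI-attributed MTTI reduction: {reduction:.1f} min")
```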
AI Summary Latency & Human Edit Time
These two metrics work as a pair.
- AI Summary Latency: `ai_summary_end – ai_summary_start`
- Human Edit Time: `final_sent – ai_draft_end`, or explicit `edit_start/end` timestamps
Together, they show whether AI drafts save time overall:
- If AI takes 30 seconds but humans spend 15 minutes editing, the net benefit is questionable.
- If AI takes 30 seconds and humans spend 2 minutes reviewing, that’s genuine productivity.
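A back-of-the-envelope sketch of the pairing; the baseline writing time here is an assumption you would replace with your own task-timing estimate:

```python
# Illustrative timings in minutes.
ai_summary_latency  = 0.5   # ai_summary_end - ai_summary_start
human_edit_time     = 2.0   # final_sent - ai_draft_end
baseline_write_time = 12.0  # assumed unaided writing time per summary

net_minutes_saved = baseline_write_time - (ai_summary_latency + human_edit_time)
print(f"Net minutes saved per summary: {net_minutes_saved:.1f}")  # 9.5
```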
Human Validation Rate
Definition: Measures governance and human-in-the-loop oversight.
Calculation: Percentage of AI suggestions requiring human approval.
Measurement points:
- `ai_action_type`
- `approval_required` flag
- `approval_time`
- `approved/rejected` status
This creates transparency about where humans remain in the loop and whether those approval gates are calibrated appropriately.
AI Suggestion Acceptance Rate
Definition: Measures usefulness of AI recommendations.
Calculation: accepted_suggestions ÷ total_suggestions
Measurement points:
- `ai_suggestion_id`
- `accepted` flag
A high acceptance rate indicates that AI recommendations align with human judgment. A low rate signals the need for model tuning or highlights nuances AI is missing.
AI Error Escape Rate
Definition: Critical safety metric.
Calculation: material_QC_failures_in_AI_cases ÷ AI_cases_sampled
Measurement points:
- Quality-control sampling results
- Linked `ai_assist_flag` and `ai_action_type`
This directly measures whether AI is making mistakes that slip through and provides data to set appropriate confidence thresholds and approval gates.
AI Confidence Distribution
Definition: Measures calibration of AI self‑assessment.
Approach:
- Build a histogram of `ai_confidence_score` values.
- Compare confidence scores against actual accuracy.
An overconfident AI (high confidence, low accuracy) is dangerous. A well‑calibrated AI (confidence aligned with accuracy) can be trusted with greater autonomy.
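A sketch of the calibration check: bucket `ai_confidence_score` into 0.1-wide bins and compare each bin’s stated confidence with its observed accuracy (the sample pairs are invented):

```python
from collections import defaultdict

# Invented (ai_confidence_score, was_correct) pairs from QC sampling.
samples = [(0.95, True), (0.92, True), (0.90, False), (0.85, True),
           (0.60, True), (0.55, False), (0.50, False), (0.30, False)]

def bucket(conf):
    """Map a confidence score to a 0.1-wide bin index (0-9)."""
    return min(int(conf * 10 + 1e-9), 9)  # epsilon guards against float error

bins = defaultdict(list)
for conf, correct in samples:
    bins[bucket(conf)].append(correct)

for idx in sorted(bins):
    accuracy = sum(bins[idx]) / len(bins[idx])
    # Well calibrated: accuracy should track the bin's confidence level.
    print(f"confidence ~0.{idx}: accuracy {accuracy:.0%} over {len(bins[idx])} samples")
```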
AI Fallback Rate
Definition: Tracks reliability of AI execution.
Calculation: Percentage of AI runs where ai_run_status equals fail or fallback.
Measurement points:
- `ai_run_status`
- `fallback_reason`
A high fallback rate suggests infrastructure issues or model limitations that require attention.
Customer Involvement for AI-Resolved Cases
Two important metrics extend Customer Involvement analysis:
- Customer Involvement Rate for AI‑Resolved Cases
- Self-Contained Resolution Rate (AI‑Assisted)
Both split the standard Customer Involvement Rate by ai_assist_flag, comparing baseline case outcomes against AI‑assisted outcomes. This reveals whether AI enables the SOC to resolve more incidents without customer data or actions—a key value proposition for managed security services.
Measuring Operations, Not Theater
The library also includes operational health metrics with the same level of rigor.
SLA Attainment for High/Critical Cases
Definition: Percentage of high/critical cases meeting SLA targets.
Measurement points:
- `sla_target` values
- Case timestamps (open, notify, contain)
AI-instrumented extension:
- Splits metrics by AI‑assisted vs. non‑AI‑assisted cases to show whether AI helps cases meet SLAs more reliably.
Peak-Hour Performance
Definition: Compares performance during peak windows vs. baseline.
Approach:
- Calculate 90th percentile metrics across relevant timestamps.
- Use a `peak_window_id` marker.
AI-instrumented extension:
- Adds “AI assist rate during peak” to understand whether AI helps maintain quality when volume spikes.
Containment Time
Definition: Time to containment, split by who leads the effort.
- Provider-led containment: `containment_complete – validated_time`
- Customer-led containment: `handoff_sent – validated_time`, plus tracking `customer_complete_time` separately
AI-instrumented extension:
- `automation_exec_start/end`
- `human_approval_time`
- `ai_recommendation_start/end`
- `ai_confidence`
- `handoff_quality_score`
This granularity reveals where AI speeds containment and where human judgment or customer action remains the bottleneck.
Evidence Completeness Rate
Definition: Percentage of cases meeting an evidence checklist.
Measurement points (examples):
- `logs_attached`
- `timeline_present`
- `ioc_list`
AI-instrumented extension:
- Flags `ai_evidence_summary_generated` to track whether AI helps ensure investigations are thorough and well documented.
Automation “Hours Returned”
Definition: Quantifies effort reduction.
Approach:
- Compare measured workflow durations against baseline task timing estimates.
AI-instrumented extension:
- Uses an `ai_time_saved_estimate` derived from actual timestamps rather than assumptions, providing defensible ROI calculations.
From Measurement to Improvement
Metrics exist to drive decisions, not just to populate dashboards. The library includes forward‑looking metrics with actionable calculations.
Detection Coverage Growth
Definition: Tracks how coverage expands over time.
Measurement points:
- `detection_added_time`
- `detection_scope`
This quantifies whether the SOC is expanding visibility or merely treading water.
Time to Improve
Definition: Measures continuous-improvement velocity.
Calculation: fix_deployed_time – gap_logged_time
This creates accountability for how quickly the SOC addresses identified weaknesses.
Root-Cause Recurrence
Definition: Tracks whether fixes “stick.”
Calculation: repeat_incidents ÷ total_incidents within 30/60/90‑day windows.
Measurement points:
- `root_cause_tag`
- Incident dates
This identifies chronic issues that keep resurfacing.
Noise Growth Rate
Definition: Monitors alert fatigue.
Calculation: Month‑over‑month change in low_value_alert_count.
This acts as an early warning system to catch detection drift before it overwhelms analysts.
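A sketch of the month-over-month calculation over an illustrative `low_value_alert_count` series:

```python
# Illustrative monthly low_value_alert_count values.
monthly_counts = {"2024-03": 400, "2024-04": 460, "2024-05": 552}

months = sorted(monthly_counts)
for prev, curr in zip(months, months[1:]):
    growth = (monthly_counts[curr] - monthly_counts[prev]) / monthly_counts[prev]
    print(f"{curr}: noise growth {growth:+.0%}")
```

A sustained positive trend here is the early-warning signal the metric is meant to surface.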
Making AI Transparent and Trustworthy
Ontinue’s approach makes AI measurable, not magical. Every traditional metric has AI‑instrumented measurement points that track:
- When AI contributed
- How long it took
- What confidence level it assigned
- Whether human approval was required
This transparency builds trust.
- A CISO can show the board that AI‑assisted cases resolve 40% faster while maintaining equivalent Investigation Quality Scores (measured via `qc_sample_id` and `qc_checklist_scores` with an AI‑assisted attribute).
- A security manager can demonstrate that AI suggestions are accepted 85% of the time for enrichment but only 60% for disposition recommendations—actionable data for tuning confidence thresholds and approval workflows.
- When quality control reveals AI Error Escape Rate trending up as confidence thresholds increase, teams gain the feedback loop needed to balance speed and safety.
A Framework for Operational Excellence
Ontinue’s metrics library represents a maturation of security operations measurement. It doesn’t just tell you to measure “AI impact”; it specifies:
- Exactly which timestamps to capture
- Which flags to track
- How to calculate meaningful comparisons
The framework explicitly rejects vanity metrics. It doesn’t measure “number of AI enrichments performed” or “percentage of alerts that touched AI,” because those numbers can look impressive while delivering zero operational value.
Instead, every metric ties to:
- Speed
- Quality
- Governance
- Business impact
For security leaders evaluating AI investments, this library provides the measurement blueprint to demand operational proof from vendors. For teams already using AI, it offers the instrumentation needed to identify:
- What’s working
- What needs tuning
- Where human expertise remains essential
In an era where security teams face mounting pressure to do more with less, having the right metrics isn’t just helpful—it’s essential. And having honest, measurable metrics that reveal real value rather than activity is what separates genuine operational excellence from expensive theater.