Changelog

What’s new at Roark

Weekly product updates from the team. New features, integrations, and the occasional behind-the-scenes note.

Nº 29 · April 3, 2026

🧮 Formula Metrics

You can now create Formula metrics that combine your existing metrics into composite scores and rules — no code required.

What you can do:

Weighted scores — (Empathy * 0.4) + (Clarity * 0.3) + (Resolution * 0.3)
Pass/fail gates — Compliance AND Greeting
Custom benchmarks — (CSAT + NPS) / 2
Comparisons — Sentiment == "Positive" AND Empathy > 3
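
As a rough illustration, the four example formulas above behave like the plain Python expressions below. This is only a sketch of the arithmetic and logic. Formulas themselves are built in the inline builder without code, and the function names here are hypothetical.

```python
def weighted_score(empathy: float, clarity: float, resolution: float) -> float:
    """(Empathy * 0.4) + (Clarity * 0.3) + (Resolution * 0.3)"""
    return empathy * 0.4 + clarity * 0.3 + resolution * 0.3


def pass_fail_gate(compliance: bool, greeting: bool) -> bool:
    """Compliance AND Greeting: passes only when both source metrics pass."""
    return compliance and greeting


def custom_benchmark(csat: float, nps: float) -> float:
    """(CSAT + NPS) / 2"""
    return (csat + nps) / 2


def comparison(sentiment: str, empathy: float) -> bool:
    """Sentiment == "Positive" AND Empathy > 3"""
    return sentiment == "Positive" and empathy > 3
```

The weights in the first formula sum to 1.0, so the composite score stays on the same scale as its inputs.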

How it works:

Create a new metric in your Metric Library and select the Formula calc type
Build your formula using the inline builder — start typing to search and insert metrics
Formulas are evaluated automatically during call analysis

Under the hood:

Dependency-aware evaluation — Source metrics are always computed before formulas that reference them
Deletion protection — Metrics used in formulas cannot be deleted until the formula is updated
Cycle detection — Circular dependencies are caught at creation time
Type safety — Math operators only accept numeric metrics; logical operators only accept boolean and classification metrics
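
The ordering, deletion, and cycle rules above can be pictured as operations on a dependency graph. The sketch below uses Python's standard `graphlib` to show the idea. It is not Roark's implementation, and the function names and the shape of the `dependencies` mapping are assumptions made for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def evaluation_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Order metrics so each one comes after every metric its formula references.

    `dependencies` maps a metric name to the set of metrics it uses;
    source metrics map to an empty set. Raises ValueError on a cycle.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # graphlib stores the offending cycle as the second element of args.
        cycle = " -> ".join(exc.args[1])
        raise ValueError(f"circular dependency: {cycle}") from exc


def formulas_blocking_deletion(metric: str, dependencies: dict[str, set[str]]) -> list[str]:
    """Return the formulas that still reference `metric` (empty means safe to delete)."""
    return sorted(name for name, deps in dependencies.items() if metric in deps)
```

Running the cycle check when a formula is saved, rather than during call analysis, means a bad formula is rejected before it can affect any call.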

Nº 28 · March 18, 2026

🌍 Accent Detection

A new analysis package that identifies English accents per participant across every segment of a call using ML-based classification.

Metric	Type	What it measures
Accent	Classification	Detected accent per segment and dominant accent at call level, with full probability distribution
Accent Stability	Numeric (0–1)	How consistent the detected accent is across segments

Highlights:

Per-segment probability distributions — See the full accent breakdown per segment, not just the top-1 prediction
Stacked probability chart — Visualize accent probabilities over time in the segment view
16 English accent variants — American, British, Australian, Canadian, Indian, Irish, Scottish, Welsh, and more
Threshold support — Set a threshold on Accent Stability to flag calls where the agent's TTS accent drifted
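
To make the relationship between per-segment distributions and a stability threshold concrete, here is a hypothetical sketch. It assumes stability is the share of segments whose top-1 accent matches the dominant accent of the call. The actual Accent Stability computation is not described in this changelog, so treat the definition and all names below as assumptions.

```python
from collections import Counter


def top_accents(segments: list[dict[str, float]]) -> list[str]:
    """Top-1 accent for each segment's probability distribution."""
    return [max(dist, key=dist.get) for dist in segments]


def dominant_accent(segments: list[dict[str, float]]) -> str:
    """Most frequent top-1 accent across the call."""
    return Counter(top_accents(segments)).most_common(1)[0][0]


def accent_stability(segments: list[dict[str, float]]) -> float:
    """ASSUMED definition: fraction of segments that match the dominant accent (0 to 1)."""
    top1 = top_accents(segments)
    return top1.count(dominant_accent(segments)) / len(top1)


def drifted(segments: list[dict[str, float]], threshold: float) -> bool:
    """Flag the call when stability falls below the chosen threshold."""
    return accent_stability(segments) < threshold
```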

👉 Recipe: Accent Detection & TTS Drift Monitoring

Nº 27 · March 7, 2026

🛡️ Compliance Analysis Package

A new analysis package that evaluates whether your AI agents comply with regulatory requirements, safety boundaries, and organizational policies — across healthcare, finance, and legal verticals.

9 compliance metrics out of the box:

Metric	Type	What it measures
Regulatory Adherence	Scale (1–5)	Compliance with industry-specific regulations (HIPAA, PCI-DSS, GDPR, etc.)
Consent & Disclosure	Boolean	Whether the agent obtained required consent and provided necessary disclosures
Prompt Injection Resistance	Boolean	Whether the agent resisted manipulation attempts to override its instructions
Identity Consistency	Boolean	Whether the agent maintained its assigned identity throughout the call
Hallucination Boundary	Scale (1–5)	Whether the agent avoided fabricating information and deferred when unsure
Unauthorized Commitment	Boolean	Whether the agent made promises or commitments outside its authority
Sensitive Data Handling	Scale (1–5)	Whether the agent properly handled PII, PHI, and financial data
Escalation Protocol	Boolean	Whether the agent correctly escalated when required by policy
Scope Adherence	Scale (1–5)	Whether the agent stayed within its defined role and topic boundaries

Key features:

Segment-level findings — For 5 metrics (prompt injection, identity, unauthorized commitment, escalation, consent), results include the specific agent statements where issues were detected
Customizable prompts — Every metric accepts optional additional evaluation criteria so you can tailor compliance checks to your organization's specific policies
Works with policies — Add compliance metrics to metric policies to automatically evaluate every production call

Also in this update:

Multi-select metric picker — The metric selector now stays open for multi-select with checkboxes, and supports "Select all" at the package level
View-only metric settings — System metric output configuration (boolean labels, scale ranges) is now visible in the metric library in a read-only mode
Optional/Required prompt labels — Metric settings now clearly indicate whether the LLM prompt is optional or required

Nº 26 · March 6, 2026

🔭 OpenTelemetry Tracing — See Inside Every Agent Turn

You can now send OpenTelemetry traces to Roark and see exactly what happens inside every turn of your voice AI agent — every STT transcription, every LLM generation, every TTS synthesis, every tool call — with full timing, hierarchy, and context.

[Figure: Roark Traces view showing agent turns with STT, LLM, and TTS spans]

Zero-config for Vapi. One function call for LiveKit. Works with anything.

Vapi — If you have a Vapi integration, traces are collected automatically. No code changes, no exporters to configure. Just make sure Public Logs are enabled in your Vapi dashboard and traces will appear alongside your calls.
LiveKit — Add a single configure_roark_tracing() call before your agent starts and every span — STT, LLM, TTS, tool calls — flows into Roark automatically.
Custom / Any platform — Point any OpenTelemetry OTLP HTTP exporter at https://api.roark.ai/v1/otel/v1/traces with your API key. We support TypeScript, Python, Go, and any language with an OTel SDK.
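
For the custom route, most OpenTelemetry SDKs can be pointed at an endpoint through the standard `OTEL_EXPORTER_OTLP_TRACES_*` environment variables. The sketch below builds those variables for the endpoint above. The `Authorization: Bearer` header format and the `http/protobuf` protocol choice are assumptions, since this changelog does not say how the API key is passed, and `roark_otlp_env` is a hypothetical helper name.

```python
from urllib.parse import quote

ROARK_TRACES_ENDPOINT = "https://api.roark.ai/v1/otel/v1/traces"


def roark_otlp_env(api_key: str) -> dict[str, str]:
    """Build standard OTLP exporter environment variables for trace export."""
    # ASSUMPTION: the API key is sent as a Bearer token. Check the Roark docs
    # for the real header name and format before relying on this.
    # Header values in this variable are percent-encoded, so the space
    # becomes %20.
    header_value = quote(f"Bearer {api_key}", safe="")
    return {
        # A signal-specific endpoint is used as given, with no path appended.
        "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT": ROARK_TRACES_ENDPOINT,
        # ASSUMPTION: protobuf over HTTP. The changelog only says "OTLP HTTP".
        "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL": "http/protobuf",
        "OTEL_EXPORTER_OTLP_TRACES_HEADERS": f"Authorization={header_value}",
    }
```

Setting these variables before the process starts (for example via `os.environ.update(roark_otlp_env(key))`) keeps exporter configuration out of application code, which is convenient when the same agent runs in several environments.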

What you get:

Full turn-by-turn visibility — See exactly how STT, LLM, and TTS are used in each agent turn with span timings and hierarchy
Latency debugging — Instantly spot slow LLM responses, TTS bottlenecks, or tool call delays
Tool call inspection — See which tools were invoked, what arguments were passed, and how long they took
Correlated with your calls — Traces appear on the Tracing tab of every call detail page, right next to transcripts and metrics
Project-level trace explorer — Browse and search all traces from Observability → Traces

Roark acts as a full OpenTelemetry Collector: send your traces and we handle ingestion, storage, and visualization.

Nº 25 · February 26, 2026

📈 Simulation Results Report & Threshold Metrics

We've completely revamped the simulation results experience with a new results report, metric overview, and built-in threshold pass/fail tracking.

What's New:

Results report — When a simulation run completes, you now get a full report with an overview section (total calls, completion rate, pass rate), a metrics breakdown, and a per-call results summary table
Threshold results — A dedicated section in the report shows your pass/fail rate across all threshold metrics with a clear visual breakdown of which calls passed and which didn't
Metric overview — See how every metric performed across your simulation runs with averages, distributions, and per-call breakdowns
Thresholds in run plans — When building a simulation run plan, select which metrics to evaluate and configure thresholds inline (e.g., Customer Satisfaction >= 7, Response Time < 1000ms). After the run, see exactly which calls passed
Thresholds on call detail — Threshold metrics now appear on individual call pages with a dedicated Thresholds section on the Metrics tab and pass/fail cards on the Overview tab
Metric collection banner — A live banner shows when metrics are actively being collected for a call, with automatic polling so you don't need to refresh

Threshold Configuration:

Numeric/Scale/Count metrics: all comparison operators (>=, >, <=, <, =, !=)
Boolean metrics: equals/not-equals
Classification metrics: text matching with equals/not-equals
Aggregation modes: Each, Average, Min, Max, Median, Sum, P95, P99, Count
Participant role filtering: All, Agent, or Customer
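
One way to read the configuration options above: an aggregation mode reduces the collected values to a single number, and the operator compares that number with the target. The sketch below illustrates this under two assumptions that the changelog does not confirm: "Each" means every individual value must satisfy the comparison, and percentiles use the nearest-rank definition. All names are hypothetical.

```python
import operator
import statistics

OPERATORS = {
    ">=": operator.ge, ">": operator.gt, "<=": operator.le,
    "<": operator.lt, "=": operator.eq, "!=": operator.ne,
}


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile (ASSUMED definition), using integer rank arithmetic."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceil(pct * n / 100)
    return ordered[rank - 1]


AGGREGATIONS = {
    "Average": statistics.fmean,
    "Min": min,
    "Max": max,
    "Median": statistics.median,
    "Sum": sum,
    "P95": lambda values: percentile(values, 95),
    "P99": lambda values: percentile(values, 99),
    "Count": len,
}


def passes(values: list[float], mode: str, op: str, target: float) -> bool:
    """Evaluate one threshold against a non-empty list of metric values."""
    compare = OPERATORS[op]
    if mode == "Each":
        # ASSUMPTION: "Each" requires every value to satisfy the comparison.
        return all(compare(value, target) for value in values)
    return compare(AGGREGATIONS[mode](values), target)
```

With scores of 8, 9, and 6, a threshold of `>= 7` passes under Average but fails under Each, which is why the aggregation mode matters as much as the operator.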

Nº 24 · February 24, 2026

📊 Metric Policies

Automate metric collection across your calls with conditions-based rules. Instead of manually triggering metrics, policies evaluate incoming calls and automatically collect the metrics you care about.

Key Features:

Conditions-based targeting — Filter by agent, call source (Vapi, Retell, etc.), or custom call properties to control which calls a policy applies to
Threshold support — Add pass/fail criteria inline when selecting metrics (e.g., Customer Satisfaction >= 7, Response Time < 1000ms)
System + User policies — Roark auto-creates system policies for core metrics; you create your own for custom evaluations
Full SDK support — Create, update, list, and delete policies programmatically via the Node.js SDK
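
Conceptually, a policy pairs a set of conditions with a list of metrics, and each incoming call collects the metrics of every policy whose conditions it meets. The sketch below models that idea in plain Python. It is not the Node.js SDK, and the `Policy` class, its fields, and the exact-match condition semantics are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical in-memory model of a metric policy."""
    name: str
    metrics: list[str]
    # Call properties that must match exactly; empty means "every call".
    conditions: dict[str, str] = field(default_factory=dict)

    def applies_to(self, call: dict[str, str]) -> bool:
        return all(call.get(key) == value for key, value in self.conditions.items())


def metrics_for_call(call: dict[str, str], policies: list[Policy]) -> list[str]:
    """Collect metrics from all matching policies, without duplicates, in policy order."""
    selected: list[str] = []
    for policy in policies:
        if policy.applies_to(call):
            for metric in policy.metrics:
                if metric not in selected:
                    selected.append(metric)
    return selected
```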

Use Cases:

Run compliance checks on every production call automatically
Collect different metrics for different agents or call sources
Set quality thresholds that flag underperforming calls without manual review

Nº 23 · February 22, 2026

🔀 Scenario Variables

Create reusable scenario templates with dynamic values that change between simulation runs. Instead of duplicating scenarios for different test data, define {{variableName}} placeholders that get replaced at runtime.

Key Features:

Inline variable editor — Type {{ in any scenario step to create or reference variables with autocomplete
Three-stage lifecycle — Define placeholders in scenarios, optionally pre-set defaults on run plans, and provide final values at runtime
Multiple instances — Add the same scenario multiple times to a run plan, each with different variable values, to create a test matrix
API support — Pre-set variables on run plans and pass them at runtime via the SDK, with global or per-scenario modes
Reserved variables — System variables like {{persona.name}} and {{phoneNumberToDial}} are automatically resolved
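
The three-stage lifecycle above amounts to layered substitution: run plan defaults fill the placeholders, and runtime values override them. A minimal sketch of that behavior follows. The function name and the error raised for missing values are assumptions, and reserved variables such as {{persona.name}} are resolved by Roark itself, so here they would need to be supplied by the caller like any other value.

```python
import re

# Matches {{variableName}} and dotted names such as {{persona.name}}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def resolve_step(step: str, plan_defaults: dict[str, str], runtime_values: dict[str, str]) -> str:
    """Replace placeholders in a scenario step; runtime values win over defaults."""
    values = {**plan_defaults, **runtime_values}

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            # ASSUMPTION: a missing value is treated as an error in this sketch.
            raise KeyError("no value provided for variable " + name)
        return values[name]

    return PLACEHOLDER.sub(substitute, step)
```

Calling `resolve_step` once per variable set turns a single scenario template into the test matrix described above.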

Use Cases:

Test appointment booking with different patient names, dates, and insurance providers
Run the same support scenario with different order numbers and claim types
Parameterize scenarios for CI/CD pipelines without creating duplicates

Nº 22 · February 20, 2026

📞 Customer DTMF Testing

You can now simulate DTMF keypad input in your scenarios — perfect for testing IVR menu navigation, phone trees, and any flow that requires touchtone input.

How it works:

Add a Customer DTMF node to your scenario graph
Specify the DTMF digits to send (0-9, *, #, w/W for pauses)
The Roark agent will send the tones without speaking, just like a real caller navigating an IVR
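
If you generate scenarios programmatically, it can help to validate digit strings before they reach a Customer DTMF node. The sketch below checks a sequence against the characters listed above (0-9, *, #, and w/W for pauses) and splits it into tone groups. The helper names are hypothetical, and the duration of a pause is not specified in this changelog.

```python
import re

# Allowed characters per the list above: digits, *, #, and w/W for pauses.
DTMF_PATTERN = re.compile(r"[0-9*#wW]+")


def is_valid_dtmf(sequence: str) -> bool:
    """True when the sequence is non-empty and uses only allowed characters."""
    return DTMF_PATTERN.fullmatch(sequence) is not None


def tone_groups(sequence: str) -> list[str]:
    """Split a valid sequence into groups of tones separated by pauses."""
    if not is_valid_dtmf(sequence):
        raise ValueError("invalid DTMF sequence: " + repr(sequence))
    return [group for group in re.split(r"[wW]+", sequence) if group]
```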

Use Cases:

Test IVR menu navigation and phone tree flows
Validate your agent handles DTMF input correctly at each menu level
Combine DTMF steps with regular conversation turns to test end-to-end flows that start with an IVR and transition to a live agent

Nº 21 · February 13, 2026

🧪 Metric Playground

Test and iterate on your metrics in a dedicated sandbox environment — without affecting your production configuration.

Key Features:

Run metrics on existing calls — select calls from your history and run any combination of metrics against them
Upload new audio — drag and drop MP3, WAV, or MP4 files to test metrics on fresh recordings
Edit metrics inline — tweak prompts, labels, scales, and classification options to create draft versions without impacting live metrics
Real-time results — watch metrics compute live with per-call expandable result cards showing values and reasoning
Preview calls side-by-side — review transcripts, tools, and properties alongside metric results
Publish when ready — promote your draft metric changes to production once you're satisfied

Use Cases:

Build and validate new metrics before rolling them out
Debug unexpected metric scores by running them against known calls
Test prompt changes on a curated set of calls before publishing
Upload sample audio to verify metrics work correctly on new scenarios

Find it under Playground in the left navigation.

Nº 20 · February 5, 2026

📊 Reports V2

We've rebuilt the reports experience from the ground up to make it faster and easier to go from question to insight.

What's New:

Multi-metric reports — add multiple metrics to a single report and compare them side-by-side with individual configurations
Inline call details — click any call in your report to open a resizable side panel with the full transcript, tools, and properties without leaving the builder
Recent reports — your most recently edited reports appear at the top so you can pick up right where you left off
Unified builder — creating and editing reports now lives in a single, streamlined interface
One-click dashboard add — save a report and add it to a dashboard in one step

Improvements:

Cleaner sidebar layout that guides you step-by-step through metric selection, configuration, filters, and breakdowns
Per-metric filters and aggregation options for more precise analysis
Baseline comparisons with multiple display modes (value, percentage, custom baseline)
Resizable workspace that remembers your preferred layout
