Calls Metrics and Monitoring

This guide provides detailed information on monitoring Mattermost Calls performance and health through metrics and observability tools. Effective monitoring is essential for maintaining optimal call quality and quickly addressing any issues that arise.

Metrics Overview

Mattermost Calls provides metrics through Prometheus for both the Calls plugin and the RTCD service. These metrics help track:

  • Active call sessions and participants

  • Media track statistics

  • Connection states and errors

  • Resource utilization (CPU, memory, network)

  • WebSocket connections and events

The metrics are exposed through HTTP endpoints:

  • Calls Plugin: /plugins/com.mattermost.calls/metrics

  • RTCD Service: /metrics (default) or a configured endpoint

Resource utilization metrics (CPU, memory, network) are mainly provided by an external service (node-exporter).

Metrics for the Calls plugin are exposed through the /plugins/com.mattermost.calls/metrics subpath under the existing Mattermost server metrics endpoint. This endpoint is controlled by the Listen address for performance configuration setting and defaults to port 8067. For example: http://localhost:8067/plugins/com.mattermost.calls/metrics. The RTCD service exposes its /metrics endpoint on its HTTP API (e.g., http://localhost:8045/metrics).
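A quick way to confirm both endpoints respond is curl. Each returns metrics in the plain-text Prometheus exposition format; the sample below is a sketch (the metric name appears later in this guide, but the value and HELP/TYPE lines are illustrative):

```shell
# Live checks (adjust hosts and ports to your deployment):
#   curl -s http://localhost:8067/plugins/com.mattermost.calls/metrics
#   curl -s http://localhost:8045/metrics
#
# Both return the Prometheus text exposition format, e.g.:
cat <<'EOF'
# HELP rtcd_rtc_sessions_total Total number of active RTC sessions
# TYPE rtcd_rtc_sessions_total gauge
rtcd_rtc_sessions_total 42
EOF
```

If the command returns an empty response or an error, re-check the relevant listen-address settings before moving on to Prometheus configuration.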

Setting Up Monitoring

For instructions on deploying Prometheus and Grafana for Mattermost, please refer to the Deploy Prometheus and Grafana for Performance Monitoring guide.

Once Prometheus and Grafana are set up, you will need to configure Prometheus to scrape metrics from the Calls-related services.

Prometheus Scrape Configuration

Add the following jobs to your prometheus.yml configuration:

scrape_configs:
  - job_name: 'calls-plugin'
    metrics_path: /plugins/com.mattermost.calls/metrics
    static_configs:
      - targets: ['MATTERMOST_SERVER_IP:8067']
        labels:
          service_name: 'calls-plugin'

  - job_name: 'rtcd'
    metrics_path: /metrics
    static_configs:
      - targets: ['RTCD_SERVER_IP:8045']
        labels:
          service_name: 'rtcd'

  - job_name: 'rtcd-node-exporter'
    metrics_path: /metrics
    static_configs:
      - targets: ['RTCD_SERVER_IP:9100']
        labels:
          service_name: 'rtcd'

  - job_name: 'calls-offloader-node-exporter'
    metrics_path: /metrics
    static_configs:
      - targets: ['CALLS_OFFLOADER_SERVER_IP:9100']
        labels:
          service_name: 'offloader'

Replace the placeholder IP addresses with your actual server addresses:

  • MATTERMOST_SERVER_IP: IP address of your Mattermost server

  • RTCD_SERVER_IP: IP address of your RTCD server

  • CALLS_OFFLOADER_SERVER_IP: IP address of your calls-offloader server (if deployed)
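After editing prometheus.yml, it helps to validate the file before reloading. A sketch using promtool, which ships with Prometheus (the config path is an example; adjust to your installation):

```shell
# Validate the scrape configuration syntax:
promtool check config /etc/prometheus/prometheus.yml

# Reload Prometheus without a restart (only works if Prometheus runs
# with the --web.enable-lifecycle flag):
#   curl -X POST http://localhost:9090/-/reload
```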

Important

Metrics Configuration Notice: Use the service_name labels as shown in the configuration above. These labels help organize metrics in dashboards and enable proper service identification.

Note

  • node_exporter: Optional but recommended for system-level metrics (CPU, memory, disk, network). See node_exporter setup guide for installation instructions.

  • calls-offloader: Only needed if you have call recording/transcription enabled.

Mattermost Calls Grafana Dashboard

You can use the official Mattermost Calls Performance Monitoring dashboard to visualize these metrics.

Key Metrics to Monitor

RTCD Metrics

Process Metrics

These metrics help monitor the health and resource usage of the RTCD process:

  • rtcd_process_cpu_seconds_total: Total CPU time spent

  • rtcd_process_open_fds: Number of open file descriptors

  • rtcd_process_max_fds: Maximum number of file descriptors

  • rtcd_process_resident_memory_bytes: Memory usage in bytes

  • rtcd_process_virtual_memory_bytes: Virtual memory used

WebRTC Connection Metrics

These metrics track the WebRTC connections and media flow:

  • rtcd_rtc_conn_states_total{state="X"}: Count of connections in different states

  • rtcd_rtc_errors_total{type="X"}: Count of RTC errors by type

  • rtcd_rtc_rtp_tracks_total{direction="X"}: Count of RTP tracks (incoming/outgoing)

  • rtcd_rtc_sessions_total: Total number of active RTC sessions

WebSocket Metrics

These metrics track the signaling channel:

  • rtcd_ws_connections_total: Total number of active WebSocket connections. These are connections between the RTCD service and the Mattermost server, so the count should match the number of Mattermost nodes in the cluster.

  • rtcd_ws_messages_total{direction="X"}: Count of WebSocket messages (sent/received)
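Counters such as rtcd_ws_messages_total only ever increase, so dashboards typically graph their per-second rate rather than the raw value. As a sketch of the arithmetic behind PromQL's rate() function, the rate between two scrapes is simply the delta divided by the scrape interval (the sample values below are illustrative):

```shell
# Two hypothetical samples of rtcd_ws_messages_total, taken 15s apart:
prev=12000
curr=12450
interval=15

# Per-second message rate between the two scrapes:
awk -v a="$prev" -v b="$curr" -v i="$interval" \
  'BEGIN { printf "%.1f messages/s\n", (b - a) / i }'
# prints: 30.0 messages/s
```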

Calls Plugin Metrics

Similar metrics are available for the Calls plugin with the following prefixes:

  • Process metrics: mattermost_plugin_calls_process_*

  • WebRTC connection metrics: mattermost_plugin_calls_rtc_*

  • WebSocket metrics: mattermost_plugin_calls_websocket_*

  • Store metrics: mattermost_plugin_calls_store_ops_total
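To see which of these metrics your server actually exposes, you can filter the plugin endpoint by prefix. A sketch, assuming the default port 8067 and local access:

```shell
# List the first 20 Calls plugin metric samples by prefix:
curl -s http://localhost:8067/plugins/com.mattermost.calls/metrics \
  | grep -E '^mattermost_plugin_calls_' | head -n 20
```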

Performance Baselines

The following performance benchmarks provide baseline metrics for RTCD deployments under various load conditions and configurations.

Deployment specifications

  • 1x r6i.large nginx proxy

  • 3x c5.large MM app nodes (HA)

  • 2x db.x2g.xlarge RDS Aurora MySQL v8 (one writer, one reader)

  • 1x (c7i.xlarge, c7i.2xlarge, c7i.4xlarge) RTCD

  • 2x c7i.2xlarge load-test agents

App specifications

  • Mattermost v9.6

  • Mattermost Calls v0.28.0

  • RTCD v0.16.0

  • load-test agent v0.28.0

Media specifications

  • Speech sample bitrate: 80Kbps

  • Screen sharing sample bitrate: 1.6Mbps

Results

Below are the detailed benchmarks based on internal performance testing:

Calls  Participants/call  Unmuted/call  Screen sharing  CPU (avg)  Memory (avg)  Bandwidth (in/out)   Instance type (RTCD)
1      1000               2             no              47%        1.46GB        1Mbps / 194Mbps      c7i.xlarge
1      800                1             yes             64%        1.43GB        2.7Mbps / 1.36Gbps   c7i.xlarge
1      1000               1             yes             79%        1.54GB        2.9Mbps / 1.68Gbps   c7i.xlarge
10     100                1             yes             74%        1.56GB        18.2Mbps / 1.68Gbps  c7i.xlarge
100    10                 2             no              49%        1.46GB        18.7Mbps / 175Mbps   c7i.xlarge
100    10                 1             yes             84%        1.73GB        171Mbps / 1.53Gbps   c7i.xlarge
1      1000               2             no              20%        1.44GB        1.4Mbps / 194Mbps    c7i.2xlarge
1      1000               2             yes             49%        1.53GB        3.6Mbps / 1.79Gbps   c7i.2xlarge
2      1000               1             yes             73%        2.38GB        5.7Mbps / 3.06Gbps   c7i.2xlarge
100    10                 2             yes             60%        1.74GB        181Mbps / 1.62Gbps   c7i.2xlarge
150    10                 1             yes             72%        2.26GB        257Mbps / 2.30Gbps   c7i.2xlarge
150    10                 2             yes             79%        2.34GB        271Mbps / 2.41Gbps   c7i.2xlarge
250    10                 2             no              58%        2.66GB        47Mbps / 439Mbps     c7i.2xlarge
1000   2                  2             no              78%        2.31GB        178Mbps / 195Mbps    c7i.2xlarge
2      1000               2             yes             41%        2.6GB         7.23Mbps / 3.60Gbps  c7i.4xlarge
3      1000               2             yes             63%        3.53GB        10.9Mbps / 5.38Gbps  c7i.4xlarge
4      1000               2             yes             83%        4.40GB        14.5Mbps / 7.17Gbps  c7i.4xlarge
250    10                 2             yes             79%        3.49GB        431Mbps / 3.73Gbps   c7i.4xlarge
500    2                  2             yes             71%        2.54GB        896Mbps / 919Mbps    c7i.4xlarge

Troubleshooting Metrics Collection

Verify RTCD Metrics are Being Collected

To verify that Prometheus is successfully collecting RTCD metrics, use this command:

curl -s http://PROMETHEUS_IP:9090/api/v1/label/__name__/values | jq '.' | grep rtcd

This command queries Prometheus for all available metric names and filters for RTCD-related metrics.

If no RTCD metrics appear, check:

  1. RTCD is running

  2. Prometheus is configured to scrape the RTCD metrics endpoint

  3. RTCD metrics port is accessible from Prometheus (default: 8045)
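Step 3 can be checked directly from the Prometheus host. A sketch (replace RTCD_SERVER_IP with your server's address):

```shell
# Confirm the RTCD metrics endpoint is reachable from the Prometheus host:
if curl -sf --max-time 5 http://RTCD_SERVER_IP:8045/metrics > /dev/null; then
  echo "RTCD metrics endpoint reachable"
else
  echo "RTCD metrics endpoint NOT reachable"
fi
```

If the endpoint is not reachable, check firewalls and security groups between the two hosts before changing any Prometheus configuration.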

Check Prometheus Scrape Targets

To verify all Calls-related services are being scraped successfully:

  1. Open the Prometheus web interface (typically http://PROMETHEUS_IP:9090)

  2. Navigate to Status > Targets

  3. Look for your configured Calls services:

    • Mattermost server (for Calls plugin metrics)

    • RTCD service

Each target should show status “UP” in green. If a target shows “DOWN” or errors:

  • Verify the service is running

  • Check network connectivity between Prometheus and the target

  • Verify the metrics endpoint is accessible
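The same check can be scripted against the Prometheus targets API, where each entry in data.activeTargets carries a health field. A live check would be `curl -s http://PROMETHEUS_IP:9090/api/v1/targets`; the JSON below is a trimmed, illustrative response:

```shell
# Extract each target's job and health from a (trimmed, illustrative) response:
cat <<'EOF' | grep -oE '"(job|health)":"[^"]*"'
{"status":"success","data":{"activeTargets":[{"labels":{"job":"rtcd"},"health":"up"},{"labels":{"job":"calls-plugin"},"health":"up"}]}}
EOF
```

Any target whose health is not "up" corresponds to a "DOWN" entry on the Status > Targets page.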


Note: Configure Prometheus storage retention to balance disk usage against your needs: use a shorter retention period when storage is limited, and a longer one when storage is plentiful.
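The retention trade-off maps directly to Prometheus's storage flags. A sketch of a launch command (the path and values are examples to adapt):

```shell
# Cap retention by time and by total on-disk size; Prometheus removes the
# oldest data once either limit is reached. Tune both to your storage budget.
prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.retention.time=15d \
  --storage.tsdb.retention.size=50GB
```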