# Monitor Kafka Connect in Kubernetes

You can monitor the health of Kafka Connect with metrics that are exported through a Prometheus endpoint at the default port 9404. You can use Grafana to visualize the metrics and set up alerts.

> **Note:** The Connectors Helm chart is a community-supported artifact. Redpanda Data does not provide enterprise support for this chart. For support, reach out to the Redpanda team in Redpanda Community Slack.

## Prerequisites

- A Kubernetes cluster, and kubectl version 1.27.0-0 or later. To check whether kubectl is installed:

  ```bash
  kubectl version --client
  ```

- Helm version 3.10.0 or later. To check whether Helm is installed:

  ```bash
  helm version
  ```

- Kafka Connect deployed with monitoring enabled.

## Limitations

The connectors dashboard renders metrics that are exported by managed connectors. However, when a connector does not create a task (for example, because it has an empty topic list), the dashboard does not show metrics for that connector.

## Verify that monitoring is enabled

Before configuring Prometheus, verify that Kafka Connect is exposing metrics.

Check that the connectors Service exposes port 9404:

```bash
kubectl get service --namespace <namespace> <release-name>-connectors
```

The output should include `9404/TCP`:

```
NAME                  TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
redpanda-connectors   ClusterIP   10.96.78.129   <none>        8083/TCP,9404/TCP   1h
```

Test that metrics are being exported:

```bash
POD_NAME=$(kubectl get pod -l app.kubernetes.io/name=connectors --namespace <namespace> -o jsonpath='{.items[0].metadata.name}')
kubectl exec $POD_NAME --namespace <namespace> -- curl -s localhost:9404/metrics | head -10
```

The output should show Prometheus metrics:

```
# HELP jmx_exporter_build_info A metric with a constant '1' value labeled with the version of the JMX exporter.
# TYPE jmx_exporter_build_info gauge
jmx_exporter_build_info{version="0.20.0",name="jmx_prometheus_javaagent",} 1.0
# HELP kafka_producer_iotime_total Kafka producer JMX metric type producer
# TYPE kafka_producer_iotime_total gauge
```

## Configure Prometheus

Prometheus is a system monitoring and alerting tool. It collects and stores metrics as time-series data, identified by metric name and key/value pairs.

To configure Prometheus to monitor Kafka Connect metrics in Kubernetes, you can use the Prometheus Operator, which provides Kubernetes-native deployment and management of Prometheus and related monitoring components.

Follow the steps to deploy the Prometheus Operator, then configure the Prometheus resource to target your Pods running Kafka Connect:

`prometheus.yaml`

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: <namespace>
spec:
  serviceAccountName: prometheus
  podMonitorNamespaceSelector:
    matchLabels:
      name: <namespace>
  podMonitorSelector:
    matchLabels:
      app.kubernetes.io/name: connectors
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
```

- `podMonitorNamespaceSelector.matchLabels.name`: The namespace where Redpanda is deployed.
- `podMonitorSelector.matchLabels.app.kubernetes.io/name`: Must match the label on your connectors Pods (default is `connectors`).
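The Prometheus resource above runs under a `prometheus` ServiceAccount. If your Prometheus Operator installation does not already provide one, the following is a minimal sketch based on the upstream Prometheus Operator getting-started examples. The names are assumptions, and your cluster's RBAC requirements may differ, so treat this as a starting point rather than a definitive manifest:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: <namespace>
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  # Allow Prometheus to discover and scrape Pods, Services, and Endpoints.
  - apiGroups: [""]
    resources: ["nodes", "nodes/metrics", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: <namespace>
```

Apply these with `kubectl apply -f` before creating the Prometheus resource.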
Deploy the standalone Kafka Connect chart with monitoring enabled. This automatically creates a PodMonitor resource. You can enable monitoring either with a values file or with `--set` flags.

**Option 1: use a values file (`--values`)**

`connectors-monitoring.yaml`

```yaml
monitoring:
  enabled: true
  scrapeInterval: 30s
```

```bash
helm upgrade --install redpanda-connectors redpanda/connectors \
  --namespace <namespace> \
  --create-namespace \
  --values connectors-monitoring.yaml
```

**Option 2: use `--set` flags**

```bash
helm upgrade --install redpanda-connectors redpanda/connectors \
  --namespace <namespace> \
  --create-namespace \
  --set monitoring.enabled=true \
  --set monitoring.scrapeInterval=30s
```

Wait until the connectors Deployment is ready:

```bash
kubectl --namespace <namespace> rollout status deployment <release-name>-connectors --watch
```

Verify that the PodMonitor was automatically created:

```bash
kubectl get podmonitor --namespace <namespace>
```

Expected output:

```
NAME                  AGE
redpanda-connectors   30s
```

Verify that Prometheus is scraping the metrics:

```bash
kubectl port-forward svc/prometheus-operator-kube-p-prometheus 9090:9090 --namespace <namespace>
```

Then navigate to http://localhost:9090/targets to verify that the Kafka Connect targets are being scraped.

## Important metrics to monitor

The most important Kafka Connect metrics to monitor and alert on are:

- Connector and task status: `kafka_connect_connector_status` and `kafka_connect_task_status`
- Connector task errors: `kafka_connect_task_error_count`
- Consumer lag: `kafka_consumer_lag_max` (for sink connectors)
- Producer record send rate: `kafka_producer_record_send_rate` (for source connectors)
- Failed tasks: `kafka_connect_connector_failed_task_count`

## Example queries

Useful Prometheus queries for monitoring:

```promql
# Failed connector tasks
kafka_connect_connector_failed_task_count > 0

# Connector status (should be "running")
kafka_connect_connector_status{status!="running"}

# High consumer lag
kafka_consumer_lag_max > 1000

# Low throughput for source connectors
rate(kafka_producer_record_send_total[5m]) < 1
```
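If you manage alerting through the Prometheus Operator, queries like these can be wrapped in a PrometheusRule resource. The following is a minimal sketch that fires on failed tasks. The resource name and labels are assumptions, the `for` duration and severity are examples, and the rule is only loaded if it matches the `ruleSelector` configured on your Prometheus resource:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: connectors-alerts        # hypothetical name
  namespace: <namespace>
  labels:
    role: alert-rules            # must match your Prometheus ruleSelector
spec:
  groups:
    - name: kafka-connect
      rules:
        - alert: ConnectorTasksFailed
          # Fires when any connector has reported failed tasks for 5 minutes.
          expr: kafka_connect_connector_failed_task_count > 0
          for: 5m
          labels:
            severity: critical
          annotations:
            # Assumes the metric carries a `connector` label; check your
            # exporter's output if the alert renders without a connector name.
            summary: "Kafka Connect connector {{ $labels.connector }} has failed tasks"
```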
## Import the Grafana dashboard

You can use Grafana to query, visualize, and generate alerts for metrics. Redpanda provides a Grafana dashboard for connectors.

To create and use the Grafana dashboard to gather telemetry for your managed connectors, import the connectors dashboard JSON file (`Connectors.json`).

## Managed connector metrics

You can monitor the following metrics for your Redpanda managed connectors.

### Connector tasks

Number of tasks for a specific connector, grouped by status:

- `running`: Tasks that are healthy and running.
- `paused`: Tasks that were paused by a user request.
- `failed`: Tasks that failed during execution.

Expect only running and paused tasks. Create an alert for failed tasks.

### Sink connector lag

The number of records still to be processed by a connector, calculated as `last_offset - current_offset`. This metric is emitted for sink connectors only. For newly created connectors, the metric is high until the connector sinks all historical data. Expect the lag not to increase over time.

### MM2 replication latency

Age of the last record written to the target cluster by the MirrorMaker 2 connector. This metric is emitted for each partition. For newly created connectors, the metric is high until the connector processes all historical data. Expect the latency not to increase over time.

### Count of records sent to target (by topic)

Count of records sent to the cluster by source connectors, for each topic.

### Redpanda consumer latency

The Redpanda consumer fetch latency for sink connectors.

### Redpanda producer latency

The Redpanda producer request latency for source connectors.

### Bytes in

Bytes per second (throughput) of data from Redpanda to managed connectors.

### Bytes out

Bytes per second (throughput) of data from managed connectors to Redpanda.

### Record error rate

- record errors: Total number of record errors seen in connector tasks.
- record failures: Total number of record failures seen in connector tasks.
- record skipped: Total number of records skipped by connector tasks.

### Producer record rate

- record sent: Total number of records sent by connector producers.
- record retry: Total number of record send retries by connector producers.

### Producer record error rate

Rate of producer errors when producing records to Redpanda.

## Connectors support

Redpanda monitors the managed connector infrastructure 24/7 to ensure the service is available. Monitoring of individual connectors is the responsibility of the end user. If an incident occurs, Redpanda Support follows an incident response process to mitigate it quickly.

### Consumer lag

A connector generally performs below expectations when it is underprovisioned. Increase Max Tasks (`tasks.max`) in the connector configuration for a given number of instances and instance types.

Other causes of increasing consumer lag:

- Available memory for the connector is too low.
- Insufficient number of instances. Autoscaling is based on the total running task count for connectors.

### Sink connector lag rate metric

The sink connector lag rate metric shows the difference between the rate at which a topic's max offset grows and the rate at which the sink connector commits offsets. When the message rate for the topic is greater than the sink connector's consume rate, the lag rate metric is positive. Expect the metric to drop below 0 regularly: this means progress is being made and the connector can catch up with the produce rate.

Contact Redpanda support to align connector instances with your needs.

### Connector in a failed state

If a connector is in a failed state, first check the connector configuration and logs. When a connector fails, it typically happens immediately after a configuration change.

1. Check the exception details and stack trace by clicking **Show Error**.
2. Check the connector logs in the **Logs** tab.
3. Restart the connector by clicking **Restart**.

The following table lists the most frequent connector configuration issues that cause a failed status:

| Issue | Action |
|---|---|
| External system connectivity issue | Check that the external system is up and running. Check that the external system is available. Check the connector configuration to confirm that external system properties (URL, table name, bucket name) are correct. |
| External system authentication issue | Check that the given account exists in the external system. Check the credentials defined in the connector configuration. |
| Incorrect topic name or topic name pattern | Check that the expected topic is created. Check that the given topic name pattern matches at least one topic name. |
| Out-of-memory error | Change the connector configuration: lower the connector cache buffer size and decrease the maximum records allowed in a batch. Limit the number of topics set in Topics to export (`topics`) or Topics regex (`topics.regex`). Decrease Max Tasks (`tasks.max`). Contact Redpanda support. |
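The **Show Error** and **Restart** actions above refer to the Redpanda Console UI. If you are running the standalone chart, you can perform the same checks against the Kafka Connect REST API, which the connectors Service exposes on port 8083 (as shown in the Service output earlier). A sketch, using a hypothetical connector named `my-connector`:

```bash
# Forward the Kafka Connect REST port to your machine.
kubectl port-forward svc/<release-name>-connectors 8083:8083 --namespace <namespace> &

# Show connector and task status, including the stack trace for failed tasks.
curl -s localhost:8083/connectors/my-connector/status

# Restart the connector.
curl -s -X POST localhost:8083/connectors/my-connector/restart

# Restart a single task by ID (task IDs are listed in the status output).
curl -s -X POST localhost:8083/connectors/my-connector/tasks/0/restart
```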
## Next steps

- Create and manage Kafka Connect connectors
- Set up Grafana dashboards using the Redpanda connectors dashboard
- Configure alerting rules for critical connector metrics
- Learn about scaling connector deployments