Apache Kafka Monitoring
Features for Apache Kafka
Germain comes preconfigured to monitor the uptime and performance of Apache Kafka, giving you insight into the health of your Kafka infrastructure.
Some of the key monitoring features for Apache Kafka in Germain include:
Uptime Monitoring
Germain tracks the availability and uptime of your Apache Kafka instances and alerts you promptly to downtime or service interruptions.
Performance Monitoring
Germain monitors the performance of Apache Kafka, covering:
Topics and messages
Broker and client metrics (see below)
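As a rough illustration of what the simplest form of an uptime check involves, the sketch below tests whether a broker's listener accepts TCP connections. It is not Germain's implementation. The function name `broker_is_reachable` is hypothetical, and the host and port are placeholders you would replace with your own broker address.

```python
import socket


def broker_is_reachable(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s.

    Illustrative only: a successful connect shows that the listener is up,
    not that the broker is healthy or able to serve requests.
    """
    try:
        # create_connection resolves the host and opens a TCP connection;
        # the context manager closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False
```

Calling `broker_is_reachable("localhost", 9092)` would probe a broker on the conventional default listener port. A real monitor would typically go further, for example by issuing a metadata request, since an open port alone does not prove the service is working.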
Metrics
Type | Metric | Description |
Connect Worker Metrics | connector-count | The number of connectors run in this worker. |
Connect Worker Metrics | connector-startup-attempts-total | The total number of connector startups that this worker has attempted. |
Connect Worker Metrics | connector-startup-failure-percentage | The average percentage of this worker's connector starts that failed. |
Connect Worker Metrics | connector-startup-failure-total | The total number of connector starts that failed. |
Connect Worker Metrics | connector-startup-success-percentage | The average percentage of this worker's connector starts that succeeded. |
Connect Worker Metrics | connector-startup-success-total | The total number of connector starts that succeeded. |
Connect Worker Metrics | task-count | The number of tasks run in this worker. |
Connect Worker Metrics | task-startup-attempts-total | The total number of task startups that this worker has attempted. |
Connect Worker Metrics | task-startup-failure-percentage | The average percentage of this worker's task starts that failed. |
Connect Worker Metrics | task-startup-failure-total | The total number of task starts that failed. |
Connect Worker Metrics | task-startup-success-percentage | The average percentage of this worker's task starts that succeeded. |
Connect Worker Metrics | task-startup-success-total | The total number of task starts that succeeded. |
Connect Worker Connector Metrics | connector-destroyed-task-count | The number of destroyed tasks of the connector on the worker. |
Connect Worker Connector Metrics | connector-failed-task-count | The number of failed tasks of the connector on the worker. |
Connect Worker Connector Metrics | connector-paused-task-count | The number of paused tasks of the connector on the worker. |
Connect Worker Connector Metrics | connector-restarting-task-count | The number of restarting tasks of the connector on the worker. |
Connect Worker Connector Metrics | connector-running-task-count | The number of running tasks of the connector on the worker. |
Connect Worker Connector Metrics | connector-total-task-count | The number of tasks of the connector on the worker. |
Connect Worker Connector Metrics | connector-unassigned-task-count | The number of unassigned tasks of the connector on the worker. |
Connect Worker Rebalance Metrics | completed-rebalances-total | The total number of rebalances completed by this worker. |
Connect Worker Rebalance Metrics | connect-protocol | The Connect protocol used by this cluster. |
Connect Worker Rebalance Metrics | epoch | The epoch or generation number of this worker. |
Connect Worker Rebalance Metrics | leader-name | The name of the group leader. |
Connect Worker Rebalance Metrics | rebalance-avg-time-ms | The average time in milliseconds spent by this worker to rebalance. |
Connect Worker Rebalance Metrics | rebalance-max-time-ms | The maximum time in milliseconds spent by this worker to rebalance. |
Connect Worker Rebalance Metrics | rebalancing | Whether this worker is currently rebalancing. |
Connect Worker Rebalance Metrics | time-since-last-rebalance-ms | The time in milliseconds since this worker completed the most recent rebalance. |
Connector Metrics | connector-class | The name of the connector class. |
Connector Metrics | connector-type | The type of the connector. One of 'source' or 'sink'. |
Connector Metrics | connector-version | The version of the connector class, as reported by the connector. |
Connector Metrics | status | The status of the connector. One of 'unassigned', 'running', 'paused', 'failed', or 'destroyed'. |
Connector Task Metrics | batch-size-avg | The average size of the batches processed by the connector. |
Connector Task Metrics | batch-size-max | The maximum size of the batches processed by the connector. |
Connector Task Metrics | offset-commit-avg-time-ms | The average time in milliseconds taken by this task to commit offsets. |
Connector Task Metrics | offset-commit-failure-percentage | The average percentage of this task's offset commit attempts that failed. |
Connector Task Metrics | offset-commit-max-time-ms | The maximum time in milliseconds taken by this task to commit offsets. |
Connector Task Metrics | offset-commit-success-percentage | The average percentage of this task's offset commit attempts that succeeded. |
Connector Task Metrics | pause-ratio | The fraction of time this task has spent in the pause state. |
Connector Task Metrics | running-ratio | The fraction of time this task has spent in the running state. |
Connector Task Metrics | status | The status of the connector task. One of 'unassigned', 'running', 'paused', 'failed', or 'destroyed'. |
Sink Task Metrics | offset-commit-completion-rate | The average per-second number of offset commits that completed successfully. |
Sink Task Metrics | offset-commit-completion-total | The total number of offset commits that completed successfully. |
Sink Task Metrics | offset-commit-seq-no | The current sequence number for offset commits. |
Sink Task Metrics | offset-commit-skip-rate | The average per-second number of offset commit completions that were received too late and skipped/ignored. |
Sink Task Metrics | offset-commit-skip-total | The total number of offset commit completions that were received too late and skipped/ignored. |
Sink Task Metrics | partition-count | The number of topic partitions assigned to this task belonging to the named sink connector in this worker. |
Sink Task Metrics | put-batch-avg-time-ms | The average time in milliseconds taken by this task to put a batch of sink records. |
Sink Task Metrics | put-batch-max-time-ms | The maximum time in milliseconds taken by this task to put a batch of sink records. |
Sink Task Metrics | sink-record-active-count | The number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task. |
Sink Task Metrics | sink-record-active-count-avg | The average number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task. |
Sink Task Metrics | sink-record-active-count-max | The maximum number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task. |
Sink Task Metrics | sink-record-lag-max | The maximum lag, in number of records, that the sink task is behind the consumer's position for any topic partition. |
Sink Task Metrics | sink-record-read-rate | The average per-second number of records read from Kafka for this task belonging to the named sink connector in this worker. This is before transformations are applied. |
Sink Task Metrics | sink-record-read-total | The total number of records read from Kafka by this task belonging to the named sink connector in this worker, since the task was last restarted. |
Sink Task Metrics | sink-record-send-rate | The average per-second number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations. |
Sink Task Metrics | sink-record-send-total | The total number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker, since the task was last restarted. |
Source Task Metrics | poll-batch-avg-time-ms | The average time in milliseconds taken by this task to poll for a batch of source records. |
Source Task Metrics | poll-batch-max-time-ms | The maximum time in milliseconds taken by this task to poll for a batch of source records. |
Source Task Metrics | source-record-active-count | The number of records that have been produced by this task but not yet completely written to Kafka. |
Source Task Metrics | source-record-active-count-avg | The average number of records that have been produced by this task but not yet completely written to Kafka. |
Source Task Metrics | source-record-active-count-max | The maximum number of records that have been produced by this task but not yet completely written to Kafka. |
Source Task Metrics | source-record-poll-rate | The average per-second number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker. |
Source Task Metrics | source-record-poll-total | The total number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker. |
Source Task Metrics | source-record-write-rate | The average per-second number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations. |
Source Task Metrics | source-record-write-total | The number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker, since the task was last restarted. |
Task Error Metrics | deadletterqueue-produce-failures | The number of failed writes to the dead letter queue. |
Task Error Metrics | deadletterqueue-produce-requests | The number of attempted writes to the dead letter queue. |
Task Error Metrics | last-error-timestamp | The epoch timestamp when this task last encountered an error. |
Task Error Metrics | total-errors-logged | The number of errors that were logged. |
Task Error Metrics | total-record-errors | The number of record processing errors in this task. |
Task Error Metrics | total-record-failures | The number of record processing failures in this task. |
Task Error Metrics | total-records-skipped | The number of records skipped due to errors. |
Task Error Metrics | total-retries | The number of operations retried. |
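To show how several of the metrics above might be combined into simple health signals, here is a minimal sketch. It assumes a snapshot of metric values is already available as a plain dictionary keyed by the metric names in the table; how the values are collected is outside its scope. The function names and the threshold defaults are hypothetical and do not reflect Germain's alerting rules.

```python
from typing import List, Mapping, Optional


def failure_percentage(failures: float, attempts: float) -> Optional[float]:
    """Percentage of attempts that failed, or None when nothing was attempted."""
    if attempts <= 0:
        return None
    return 100.0 * failures / attempts


def find_warnings(
    snapshot: Mapping[str, float],
    max_failure_pct: float = 5.0,   # hypothetical threshold
    max_lag_records: float = 10_000,  # hypothetical threshold
) -> List[str]:
    """Return human-readable warnings derived from a metric snapshot."""
    warnings: List[str] = []

    # Share of task startups that failed on this worker.
    startup_pct = failure_percentage(
        snapshot.get("task-startup-failure-total", 0),
        snapshot.get("task-startup-attempts-total", 0),
    )
    if startup_pct is not None and startup_pct > max_failure_pct:
        warnings.append(f"task startup failures at {startup_pct:.1f}%")

    # Share of dead letter queue writes that failed.
    dlq_pct = failure_percentage(
        snapshot.get("deadletterqueue-produce-failures", 0),
        snapshot.get("deadletterqueue-produce-requests", 0),
    )
    if dlq_pct is not None and dlq_pct > max_failure_pct:
        warnings.append(f"dead letter queue write failures at {dlq_pct:.1f}%")

    # Sink task falling behind the consumer's position.
    lag = snapshot.get("sink-record-lag-max", 0)
    if lag > max_lag_records:
        warnings.append(f"sink record lag at {lag:.0f} records")

    return warnings
```

Passing a snapshot such as `{"task-startup-attempts-total": 20, "task-startup-failure-total": 2}` to `find_warnings` flags the startup failure rate, because 2 of 20 attempts is 10%, above the assumed 5% threshold. Metrics missing from the snapshot are treated as zero, so an empty snapshot produces no warnings.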
Configuration
For details on setting up and configuring Apache Kafka monitoring in Germain, please contact us. We will provide guidance tailored to your Apache Kafka monitoring requirements.
Component: Engine
Feature Availability: 8.6.0 or later