Apache Kafka is an open-source stream-processing software platform developed by the Apache Software Foundation, written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.
Prerequisites
- Install Kafka and set the JMX port using the JMX_PORT environment variable when starting the broker.
export JMX_PORT=5199 && ./bin/kafka-server-start.sh -daemon config/server.properties
- For Virtual Machines, install the Linux Agent.
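Before configuring the agent, it can help to confirm that the broker is actually listening on the JMX port. The following is a minimal sketch, not part of the OpsRamp tooling: `port_open` and `report_port` are hypothetical helper names, and the check assumes `bash` and the coreutils `timeout` command are available on the host.

```shell
# port_open HOST PORT: succeed if a TCP connection to HOST:PORT opens within 3 seconds.
# Relies on bash's /dev/tcp redirection and on the coreutils `timeout` command.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# report_port HOST PORT: print a one-line status for the given port.
report_port() {
  if port_open "$1" "$2"; then
    echo "port $2: open"
  else
    echo "port $2: closed"
  fi
}

# 5199 is the JMX_PORT value exported above; 9092 is the broker port used on this page.
report_port 127.0.0.1 5199
report_port 127.0.0.1 9092
```

If the JMX port is reported as closed, a likely cause is that JMX_PORT was not exported in the same shell that started the broker.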
Configuring the credentials
Configure the credentials in the file /opt/opsramp/agent/conf/app.d/creds.yaml
kafka:
- name: kafka
  user: <username>
  pwd: <Password>
  encoding-type: plain
  labels:
    key1: val1
    key2: val2
Configuring the application
Virtual machine
Configure the application in the file /opt/opsramp/agent/conf/app/discovery/auto-detection.yaml
- name: kafka
  instance-checks:
    service-check:
      - kafka
    process-check:
      - kafka
    port-check:
      - 9092
  mon-type: "jmx"
  misc:
    jmx-port: "5199"
Docker environment
Configure the application in the file /opt/opsramp/agent/conf/app/discovery/auto-container-detection.yaml
- name: kafka
  container-checks:
    image-check:
      - kafka
    port-check:
      - 9092
  mon-type: "jmx"
  misc:
    jmx-port: "5199"
Kubernetes environment
Configure the application in config.yaml
- name: kafka
  container-checks:
    image-check:
      - kafka
    port-check:
      - 9092
  mon-type: "jmx"
  misc:
    jmx-port: "5199"
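The examples on this page use the same value, 5199, for JMX_PORT on the broker and for jmx-port in the agent configuration, which suggests the two are expected to agree. The sketch below compares them for whichever configuration file applies. The function names `jmx_port_in` and `check_jmx_port` are hypothetical, and the `sed` pattern assumes jmx-port sits alone on its line as in the snippets above.

```shell
# jmx_port_in FILE: print the first jmx-port value found in FILE, with any quotes removed.
jmx_port_in() {
  sed -n 's/^[[:space:]]*jmx-port:[[:space:]]*"\{0,1\}\([0-9]\{1,\}\)"\{0,1\}[[:space:]]*$/\1/p' "$1" | head -n 1
}

# check_jmx_port FILE EXPECTED: compare the configured value with the port the broker exports.
check_jmx_port() {
  configured="$(jmx_port_in "$1")"
  if [ "$configured" = "$2" ]; then
    echo "jmx-port matches: $configured"
  else
    echo "jmx-port mismatch: config has '${configured}', broker uses '$2'"
    return 1
  fi
}

# Example call, using the virtual machine path from this page and the exported JMX_PORT:
# check_jmx_port /opt/opsramp/agent/conf/app/discovery/auto-detection.yaml "${JMX_PORT:-5199}"
```

The example call is left commented out so the block can be sourced on a host where the agent is not yet installed.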
Validate
Go to Resources under the Infrastructure tab to check if your resources are onboarded and the metrics are collected.
Metrics
OpsRamp Metric | Metric Display Name | Unit | Description |
---|---|---|---|
kafka_channel_request_queue_size | Request Queue Size | Requests | Number of queued requests |
kafka_channel_response_queue_size | Response Queue Size | Responses | Number of queued responses |
kafka_controller_active_controller_count | Active Controller Count | Controllers | Number of active controllers on the broker |
kafka_fetch_requests_delayed | Fetch Delayed Requests | Requests | Requests delayed in the fetch purgatory |
kafka_fetch_requests_purgatory_delayed | Fetch Delayed Requests | Requests | Requests delayed in the fetch purgatory |
kafka_fetch_requests_purgatory_size | Fetch Purgatory Size | Requests | Requests waiting in the fetch purgatory |
kafka_fetch_requests_size | Fetch Purgatory Size | Requests | Requests waiting in the fetch purgatory |
kafka_jvm_gc_collection_count | Kafka JVM GC collection_count | Objects collected | Number of garbage objects collected |
kafka_jvm_gc_collection_time | Kafka JVM GC collection_time | Seconds | Time taken for collection of the garbage objects |
kafka_jvm_mem_heap_committed | Kafka JVM Mem heap_committed | megabytes | Heap memory committed for the server |
kafka_jvm_mem_heap_used | Kafka JVM Mem heap_used | megabytes | Heap memory usage of the server |
kafka_jvm_mem_non_heap_committed | Kafka JVM Mem non_heap_committed | megabytes | Non-heap memory committed for the server |
kafka_jvm_mem_non_heap_used | Kafka JVM Mem non_heap_used | megabytes | Non-heap memory usage of the server |
kafka_jvm_open_fds | Kafka JVM OpenFDs | Open FDs | Number of Open file descriptors of the server |
kafka_jvm_threads | Kafka JVM Threads | Threads | Number of threads |
kafka_jvm_uptime | Uptime | Minutes | Uptime of the server |
kafka_log_flush_rate | LogFlush Rate And Time | Log flushes | Log flush rate and time |
kafka_metrics_controlled_shutdown_req_queue_time | Controlled Shutdown Request Queue Time | milliseconds | Refers to the time for which the request is waiting in the request queue |
kafka_metrics_controlled_shutdown_requests | Controlled Shutdown Requests | Requests / second | Request rate |
kafka_metrics_controlled_shutdown_resp_send_time | Controlled Shutdown Response Send Time | milliseconds | Time to send the response |
kafka_metrics_controlled_shutdown_total_time | Controlled Shutdown Total Time | milliseconds | Request total time |
kafka_metrics_fetch_consumer_req_queue_time | Fetch Consumer Request Queue Time | milliseconds | Refers to the time for which the request is waiting in the request queue |
kafka_metrics_fetch_consumer_resp_send_time | Fetch Consumer Response Send Time | milliseconds | Time to send the response |
kafka_metrics_fetch_consumer_total_time | Fetch Consumer Total Time | milliseconds | Request total time |
kafka_metrics_fetch_follower_local_time | Fetch Follower Local Time | milliseconds | Time taken to process the request at the leader |
kafka_metrics_fetch_follower_resp_queue_time | Fetch Follower Response Queue Time | milliseconds | Time taken by the request to wait in the response queue |
kafka_metrics_fetch_requests | Fetch Requests | Requests / second | Request rate |
kafka_metrics_leader_isr_local_time | Leader And Isr Local Time | milliseconds | Time taken to process the request at the leader |
kafka_metrics_leader_isr_remote_time | Leader And Isr Remote Time | milliseconds | Time taken by the request to wait for the follower |
kafka_metrics_metadata_req_queue_time | Metadata Request Queue Time | milliseconds | Time taken by the request to wait in the request queue |
kafka_metrics_offset_commit_remote_time | Offset Commit Remote Time | milliseconds | Time taken by the request to wait for the follower |
kafka_metrics_offset_commit_resp_queue_time | Offset Commit Response Queue Time | milliseconds | Time taken by the request to wait in the response queue |
kafka_metrics_offsets_req_queue_time | Offsets Request Queue Time | milliseconds | Time taken by the request to wait in the request queue |
kafka_metrics_offsets_resp_queue_time | Offsets Response Queue Time | milliseconds | Time taken by the request to wait in the response queue |
kafka_metrics_produce_remote_time | Producer Remote Time | milliseconds | Time taken by the request to wait for the follower |
kafka_metrics_stop_replica_total_time | Stop Replica Total Time | milliseconds | Request total time |
kafka_metrics_update_metadata_remote_time | Update Metadata Remote Time | milliseconds | Time taken by the request to wait for the follower |
kafka_metrics_update_metadata_requests | Update Metadata Requests | Requests / second | Request rate |
kafka_net_bytes_in | Bytes In | bytes / second | Bytes in rate |
kafka_net_bytes_out | Bytes Out | bytes / second | Bytes out rate |
kafka_net_bytes_rejected | Bytes Rejected | bytes / second | Bytes Rejected |
kafka_net_messages_in | Messages In | Messages / second | Messages in rate |
kafka_producer_requests_delayed | Producer Delayed Requests | Requests | Requests delayed in the producer purgatory |
kafka_producer_requests_purgatory_delayed | Producer Delayed Requests | Requests | Requests delayed in the producer purgatory |
kafka_producer_requests_purgatory_size | Producer Purgatory Size | Requests | Requests waiting in the producer purgatory |
kafka_producer_requests_size | Producer Purgatory Size | Requests | Requests waiting in the producer purgatory |
kafka_replication_isr_expands | ISR Expands | Expands / second | ISR expansion rate |
kafka_replication_isr_shrinks | ISR Shrinks | Shrinks / second | ISR shrink rate |
kafka_replication_leader_count | Leader Count | Total Leaders | Leader replica counts |
kafka_replication_leader_elections | Leader Election Rate And Time | Leader elections | Leader election rate |
kafka_replication_max_lag | Replication Max Lag | Messages | Maximum lag in messages between follower and leader replicas |
kafka_replication_partitions | Partition Count | Total Partitions | Partition counts |
kafka_replication_unclean_leader_elections | Unclean Leader Elections | Elections / second | Unclean leader election rate |
kafka_replication_under_replicated_partitions | Under Replicated Partitions | Under Replicated Partitions | Number of under replicated partitions (\|ISR\| < \|all replicas\|) |
kafka_request_fetch_failed | Failed Fetch Requests | Requests / second | Failed fetch requests rate |
kafka_request_fetch_time_99percentile | Fetch Total Time 99percentile | milliseconds | Time for fetch requests for 99th percentile |
kafka_request_fetch_time_avg | Fetch Total Time | milliseconds | Request total time |
kafka_request_handler_avg_idle_pct | Request Handler Threads Idle Time | Fraction | Average fraction of time when the request handler threads are idle |
kafka_request_metadata_time_99percentile | Metadata 99percentile Time | milliseconds | Time for metadata requests for 99th percentile |
kafka_request_metadata_time_avg | Metadata Total Time | milliseconds | Request total time |
kafka_request_offsets_time_99percentile | Offsets Request Time 99percentile | milliseconds | Time for offset requests for 99th percentile |
kafka_request_offsets_time_avg | Offsets Request Time | milliseconds | Average time for an offset request |
kafka_request_produce_failed | Failed Produce Requests | Requests / second | Failed producer requests rate |
kafka_request_produce_time_99percentile | Produce Request Time 99percentile | milliseconds | Time for produce requests for 99th percentile |
kafka_request_produce_time_avg | Produce Request Time | milliseconds | Average time for a produce request |
kafka_request_update_metadata_time_99percentile | Update Metadata 99percentile Time | milliseconds | Time for update metadata requests for 99th percentile |
kafka_request_update_metadata_time_avg | Update Metadata Total Time | milliseconds | Request total time |