Apache Kafka is a widely recognized open source event store and stream processing platform. It has become the de facto standard for data streaming, with over 80% of Fortune 500 companies using it. All major cloud providers offer managed data streaming services to meet this growing demand.
One key advantage of opting for managed Kafka services is the delegation of responsibility for broker and operational metrics, allowing users to focus solely on metrics specific to applications. In this article, Product Manager Uche Nwankwo provides guidance on a set of producer and consumer metrics that customers should monitor for optimal performance.
With Kafka, monitoring typically involves metrics related to topics, partitions, brokers and consumer groups. Standard Kafka metrics include information on throughput, latency, replication and disk usage. Refer to the Kafka documentation and relevant monitoring tools to understand the specific metrics available for your version of Kafka and how to interpret them effectively.
Why is it important to monitor Kafka clients?
Monitoring your IBM® Event Streams for IBM Cloud® instance is crucial to ensure optimal functionality and overall health of your data pipeline. Monitoring your Kafka clients helps to identify early signs of application failure, such as high resource usage, lagging consumers and bottlenecks. Identifying these warning signs early enables a proactive response to potential issues, which minimizes downtime and prevents disruption to business operations.
Kafka clients (producers and consumers) have their own set of metrics to monitor their performance and health. In addition, the Event Streams service supports a rich set of metrics produced by the server. For more information, see Monitoring Event Streams metrics by using IBM Cloud Monitoring.
Client metrics to monitor
Producer metrics
Metric | Description |
Record-error-rate | This metric measures the average per-second number of records sent that resulted in errors. A high (or an increase in) record-error-rate might indicate a loss in data or data not being processed as expected. All these effects might compromise the integrity of the data you are processing and storing in Kafka. Monitoring this metric helps to ensure that data being sent by producers is accurately and reliably recorded in your Kafka topics. |
Request-latency-avg | This is the average latency for each produce request in ms. An increase in latency impacts performance and might signal an issue. Measuring the request-latency-avg metric can help to identify bottlenecks within your instance. For many applications, low latency is crucial to ensure a high-quality user experience and a spike in request-latency-avg might indicate that you are reaching the limits of your provisioned instance. You can fix the issue by changing your producer settings, for example, by batching or scaling your plan to optimize performance. |
Byte-rate | The average number of bytes sent per second for a topic is a measure of your throughput. If you stream data regularly, a drop in throughput can indicate an anomaly in your Kafka instance. The Event Streams Enterprise plan starts from 150 MB per second, split one-to-one between ingress and egress, and it is important to know how much of that you are consuming for effective capacity planning. Stay below two-thirds of the maximum throughput to account for the possible impact of operational actions, such as internal updates or failure modes (for example, the loss of an availability zone). |
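As a rough illustration of how the producer metrics in the table might be turned into alerts, the sketch below checks a snapshot of metric values against simple thresholds. The snapshot is assumed to be a plain dictionary keyed by metric name; how you obtain it depends on your client library. The function name, the latency multiplier and the baseline latency are hypothetical, not recommendations from Event Streams.

```python
def check_producer(snapshot, baseline_latency_ms, latency_factor=2.0):
    """Return warnings for a producer metrics snapshot.

    snapshot: dict keyed by metric name (assumed shape).
    baseline_latency_ms: latency seen under normal load (you supply this).
    latency_factor: hypothetical multiplier that counts as a spike.
    """
    warnings = []

    # Any sustained record errors mean some records did not reach the topic.
    error_rate = snapshot.get("record-error-rate", 0.0)
    if error_rate > 0:
        warnings.append(f"record-error-rate is {error_rate:.2f}/s")

    # Compare latency with the baseline rather than with a fixed number.
    latency = snapshot.get("request-latency-avg", 0.0)
    if latency > baseline_latency_ms * latency_factor:
        warnings.append(
            f"request-latency-avg is {latency:.1f} ms, "
            f"more than {latency_factor:g}x the baseline"
        )
    return warnings


healthy = {"record-error-rate": 0.0, "request-latency-avg": 12.0}
degraded = {"record-error-rate": 0.5, "request-latency-avg": 40.0}

print(check_producer(healthy, baseline_latency_ms=10.0))
print(check_producer(degraded, baseline_latency_ms=10.0))
```

Comparing latency with a measured baseline, rather than a fixed number, keeps the check meaningful across plans and workloads with very different normal latencies.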
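The two-thirds guideline for byte-rate can be worked through as simple arithmetic. The sketch below assumes the 150 MB-per-second Enterprise figure quoted in the table, split evenly so that 75 MB per second is available for ingress, and treats 1 MB as 1,000,000 bytes (an assumption). The function name is hypothetical, and because byte-rate is reported by each producer client, the input should be the sum across all producers writing to the instance.

```python
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def ingress_headroom(total_byte_rate, plan_mb_per_s=150, safe_fraction=2 / 3):
    """Compare summed producer byte-rate with the recommended ingress ceiling.

    total_byte_rate: sum of byte-rate (bytes/s) across all producers.
    plan_mb_per_s: total plan throughput, split one-to-one ingress/egress.
    safe_fraction: share of the ingress limit to stay under.
    """
    ingress_limit = plan_mb_per_s / 2 * BYTES_PER_MB
    ceiling = ingress_limit * safe_fraction
    return {
        "ceiling_mb_per_s": ceiling / BYTES_PER_MB,
        "used_fraction_of_limit": total_byte_rate / ingress_limit,
        "within_guideline": total_byte_rate <= ceiling,
    }


# Three hypothetical producers sending 12, 15 and 18 MB/s respectively.
rates = [12_000_000, 15_000_000, 18_000_000]
print(ingress_headroom(sum(rates)))
```

With these sample rates the producers use 45 MB per second of the 75 MB-per-second ingress share, which is under the ceiling of roughly 50 MB per second.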
Consumer metrics
Metric | Description |
Fetch-rate, fetch-size-avg | The number of fetch requests per second (fetch-rate) and the average number of bytes fetched per request (fetch-size-avg) are key indicators for how well your Kafka consumers are performing. A high fetch-rate might signal inefficiency, especially over a small number of messages, as it means insufficient (possibly no) data is being received each time. The fetch-rate and fetch-size-avg are affected by three settings: fetch.min.bytes, fetch.max.bytes and fetch.max.wait.ms. Tune these settings to achieve the desired overall latency, while minimizing the number of fetch requests and potentially the load on the broker CPU. Monitoring and optimizing both metrics ensures that you are processing data efficiently for current and future workloads. |
Commit-latency-avg | This metric measures the average time between a commit request being sent and the commit response being received. Similar to the request-latency-avg as a producer metric, a stable commit-latency-avg means that your offset commits happen in a timely manner. A high commit latency might indicate problems within the consumer that prevent it from committing offsets quickly, which directly impacts the reliability of data processing. It might lead to duplicate processing of messages if a consumer must restart and reprocess messages from a previously uncommitted offset. A high commit latency also means spending more time in administrative operations than in actual message processing. This issue might lead to backlogs of messages waiting to be processed, especially in high-volume environments. |
Bytes-consumed-rate | This is a consumer-fetch metric that measures the average number of bytes consumed per second. Similar to the byte-rate as a producer metric, this should be a stable and expected metric. A sudden change in the expected trend of the bytes-consumed-rate might represent an issue with your applications. A low rate might be a signal of efficiency in data fetches or over-provisioned resources. A higher rate might overwhelm the consumers’ processing capability and thus require scaling, creating more consumers to balance out the load or changing consumer configurations, such as fetch sizes. |
Rebalance-rate-per-hour | The number of group rebalances that the consumer participated in per hour. Rebalancing occurs every time a consumer joins or leaves the group, and it causes a delay in processing because partitions are reassigned. Frequent rebalances therefore make Kafka consumers less efficient. A higher rebalance rate per hour can be caused by misconfigurations that lead to unstable consumer behavior. Rebalancing can increase latency and might result in applications crashing. Ensure that your consumer groups are stable by checking that rebalance-rate-per-hour stays low and steady. |
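To make the relationship between fetch-rate, fetch-size-avg and the fetch settings more concrete, the sketch below derives approximate consumer throughput from the two metrics. It also flags an average fetch size below fetch.min.bytes, which suggests that at least some fetches are completing on the fetch.max.wait.ms timeout rather than because enough data arrived. The function name and the sample values are hypothetical, and the snapshot is assumed to be a plain dictionary keyed by metric name.

```python
def summarize_fetches(snapshot, fetch_min_bytes):
    """Summarize fetch efficiency from a consumer metrics snapshot.

    snapshot: dict with "fetch-rate" (requests/s) and
              "fetch-size-avg" (bytes per request); assumed shape.
    fetch_min_bytes: the consumer's configured fetch.min.bytes.
    """
    fetch_rate = snapshot["fetch-rate"]
    fetch_size_avg = snapshot["fetch-size-avg"]

    # Requests per second times bytes per request approximates bytes per second.
    approx_bytes_per_s = fetch_rate * fetch_size_avg

    # An average below fetch.min.bytes means at least some fetches returned
    # because the wait timeout expired, not because enough data was ready.
    returning_on_timeout = fetch_size_avg < fetch_min_bytes

    return {
        "approx_bytes_per_s": approx_bytes_per_s,
        "returning_on_timeout": returning_on_timeout,
    }


busy = {"fetch-rate": 5.0, "fetch-size-avg": 200_000.0}
quiet = {"fetch-rate": 2.0, "fetch-size-avg": 512.0}

print(summarize_fetches(busy, fetch_min_bytes=65_536))
print(summarize_fetches(quiet, fetch_min_bytes=65_536))
```

Fetches that return on the timeout are not necessarily a problem on a quiet topic; the flag is most useful when it appears alongside a high fetch-rate.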
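As one possible way to check that commit-latency-avg or rebalance-rate-per-hour stays stable over time, the sketch below keeps a short window of samples and reports when a new value drifts away from the recent median. The class name, window size and tolerance are hypothetical choices, not values taken from Kafka or Event Streams.

```python
from collections import deque
from statistics import median


class StabilityTracker:
    """Track one metric and flag samples that drift from its recent median."""

    def __init__(self, window=12, tolerance=0.5):
        # window: number of recent samples kept (hypothetical default).
        # tolerance: allowed relative deviation from the median (0.5 = 50%).
        self.samples = deque(maxlen=window)
        self.tolerance = tolerance

    def add(self, value):
        """Record a sample; return True if it is stable versus the window."""
        stable = True
        if len(self.samples) >= 3:  # need a few samples before judging
            mid = median(self.samples)
            allowed = max(abs(mid) * self.tolerance, 1e-9)
            stable = abs(value - mid) <= allowed
        self.samples.append(value)
        return stable


commit_latency = StabilityTracker()
for sample_ms in [8.0, 9.0, 8.5, 9.5]:
    commit_latency.add(sample_ms)

print(commit_latency.add(9.0))   # close to the recent median
print(commit_latency.add(30.0))  # far above the recent median
```

For a metric whose recent median is zero, such as rebalance-rate-per-hour in a group that normally never rebalances, any non-zero sample is reported as unstable.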
These metrics should cover a wide variety of applications and use cases. Event Streams on IBM Cloud provides a rich set of metrics, described in its documentation, that offer further insights depending on the domain of your application. Take the next step: learn more about Event Streams for IBM Cloud.
What's next?
You now know the essential Kafka client metrics to monitor. You're invited to put these points into practice and try out the fully managed Kafka offering on IBM Cloud. For any challenges in setup, see the Getting Started Guide and FAQs.
Learn more about Kafka and its use cases
Provision an instance of Event Streams on IBM Cloud
Source: https://www.ibm.com/blog/getting-started-with-kafka-client-metrics/