What do you really know about how to monitor a Kafka cluster for problems? Is your most reliable monitoring your users telling you there’s something broken? Are you capturing more metrics than the actual data being produced?
Sure, we all know how to monitor disk and network, but when it comes to the state of the brokers, many of us are still unsure of which metrics we should be watching, and what their patterns mean for the state of the cluster.
Kafka has hundreds of measurements, from the high-level numbers that are often meaningless to the per-partition metrics that stack up by the thousands as our data grows.
We will thoroughly explore three key broker monitoring concepts that will leave you an expert at identifying problems with the least amount of pain:
1. Under-replicated Partitions: The mother of all metrics
2. Request Latencies: Why your users complain
3. Thread Pool Utilization: How could 80% be a problem?
We will also discuss the necessity of availability monitoring and how to use it to get a true picture of what your users see, before they come beating down your door!
We all know and love SQL, right? Well, Confluent KSQL is almost like SQL, only it's for Apache Kafka. KSQL lets you build complex stream processing systems without writing Java or Scala (sic!) code! But the most interesting part comes when we use the KSQL integration in Arcadia Data to visualize real-time streams of tweets!
In this presentation you will learn:
1. KSQL basics
2. How to build a streaming pipeline from the Twitter firehose using Apache Kafka and Kafka Connect
3. How to build visualizations using Arcadia Instant
Todd is a Staff Site Reliability Engineer at LinkedIn, tasked with keeping the largest deployment of Apache Kafka, ZooKeeper, and Samza fed and watered. He is responsible for architecture, day-to-day operations, and tools development, including the creation of an advanced monitoring and notification system. Todd is the developer of the open source project Burrow, a Kafka consumer monitoring tool, and can be found sharing his experience on Apache Kafka at industry conferences and tech talks. Todd has spent over 20 years in the technology industry running infrastructure services, most recently as a Systems Engineer at Verisign, developing service management automation for DNS, networking, and hardware management, as well as managing hardware and software standards across the company.
Shant is the co-founder and CTO of Arcadia Data, where he is responsible for the company’s long-term innovation and technical direction. Previously, Shant was a member of the engineering team at Teradata, which he joined through the acquisition of Aster Data. Shant spent time at Google, where he worked on optimizing the AdWords database, and was a graduate student in computer science at UCLA. He is the coauthor of publications in the areas of modular database design and high-performance storage systems. Follow Shant on Twitter @superdupershant.
Viktor is a Solution Architect at Confluent, the company behind the popular Apache Kafka streaming platform. Viktor has comprehensive knowledge and expertise in enterprise application architecture leveraging open source technologies and enjoys helping different organizations build low-latency, scalable, and highly available distributed systems. He is also the co-author of O’Reilly’s “Enterprise Web Development.” He is a professional conference speaker on distributed systems, Java, and JS topics, and is a regular at the most prestigious events, including JavaOne, Devoxx, OSCON, QCon, and others (http://lanyrd.com/gamussa). He blogs at http://gamov.io and produces the podcasts “Razbor Poletov” (in Russian) and DevRelRad.io. Follow Viktor on Twitter @gamussa.
