When learning a new software system, it is best to start with a high-level view of it.
In this article, we introduce the basics of Apache Kafka (written in Scala; does not use JMS). From here, you may continue to explore, say, how to configure Kafka components or how to monitor Kafka performance metrics.
In the last few years, there has been significant growth in the adoption of Apache Kafka. Current users of Kafka include Uber, Twitter, Netflix, LinkedIn, Yahoo, Cisco, and Goldman Sachs.[12]
What is Kafka?
Kafka is a message bus that achieves
- a high level of parallelism
Key Concepts
Kafka is a distributed messaging system providing fast, highly scalable and redundant messaging through a pub-sub model. It is organized around a few key terms:
- Topics
- Producers
- Consumers
- Messages
- Brokers
- Kafka
  - Topics
    - Kafka maintains feeds of messages in categories called topics.
    - All Kafka messages are organized into topics.
  - Cluster
    - As a distributed system, Kafka runs in a cluster.
    - Each node in the cluster is called a Kafka broker.
    - Each broker holds a number of partitions, and each of these partitions can be either a leader or a replica for a topic.
    - Brokers load balance by partition.
  - Clients
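The cluster bullets above can be illustrated with a small sketch. It assumes a simplified round-robin placement in which the first broker in each replica list is the partition leader; the function name `assign_replicas` and its parameters are hypothetical, and Kafka's real assignment logic is more involved.

```python
from collections import Counter


def assign_replicas(num_partitions, broker_ids, replication_factor):
    """Place each partition's replicas on consecutive brokers, round-robin.

    Assumption for illustration: the first broker in each replica list is
    the partition leader and the remaining brokers hold follower replicas.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        assignment[partition] = [
            broker_ids[(partition + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
    return assignment


assignment = assign_replicas(num_partitions=6, broker_ids=[0, 1, 2], replication_factor=2)

# Count how many partitions each broker leads to see the load spread.
leaders_per_broker = Counter(replicas[0] for replicas in assignment.values())

for partition, replicas in assignment.items():
    print(f"partition {partition}: leader=broker {replicas[0]}, followers={replicas[1:]}")
print(dict(leaders_per_broker))
```

With six partitions over three brokers, each broker ends up leading two partitions, which is the sense in which load is balanced by partition.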
Messages
Each message in a Kafka cluster can be uniquely identified by a tuple consisting of the message’s:
- Topic
- Partition
  - Topics are broken up into ordered commit logs called partitions.
- Offset (within the partition)
Kafka makes the following guarantees about data consistency and availability:
- Messages sent to a topic partition will be appended to the commit log in the order they are sent,
- A single consumer instance will see messages in the order they appear in the log,
- A message is ‘committed’ when all in-sync replicas have applied it to their log, and
- Any committed message will not be lost, as long as at least one in-sync replica is alive.
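As a rough illustration of the (topic, partition, offset) identity and the ordering guarantees above, here is a toy append-only log. The class `PartitionLog`, its methods, and the topic name `page-views` are made up for this sketch; it is not Kafka's client API and it ignores replication entirely.

```python
class PartitionLog:
    """Toy append-only commit log for one topic partition (not Kafka's API)."""

    def __init__(self, topic, partition):
        self.topic = topic
        self.partition = partition
        self._messages = []

    def append(self, value):
        """Append a message and return its (topic, partition, offset) identity."""
        offset = len(self._messages)  # offsets grow by one per appended message
        self._messages.append(value)
        return (self.topic, self.partition, offset)

    def read_from(self, offset):
        """Yield (offset, value) pairs in log order, starting at `offset`."""
        for current in range(offset, len(self._messages)):
            yield current, self._messages[current]


log = PartitionLog(topic="page-views", partition=0)
ids = [log.append(value) for value in ("a", "b", "c")]

# A consumer that resumes at offset 1 sees the remaining messages in log order.
consumed = list(log.read_from(1))

print(ids)
print(consumed)
```

Because the offset is simply the position in the log, a consumer only needs to remember one number per partition to know where to resume.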
ZooKeeper
Kafka servers require ZooKeeper. Brokers, producers, and consumers use ZooKeeper to manage and share state.
However, the way ZooKeeper is used differs between Kafka v0.8 and Kafka v0.9.[11] So the first thing you need to know is which version of Kafka you are running. To find out the version of Kafka, run:
- cd $KAFKA_HOME
- find ./libs/ -name 'kafka_*' | head -1 | grep -o 'kafka_.*'
For example, if the above command prints:
- kafka_2.10-0.9.0.2.4.2.0-258-javadoc.jar
it means that the following versions are installed:
- Scala version: 2.10
- Kafka version: 0.9.0.2.4.2.0-258
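The same split can be done programmatically. The sketch below assumes the jar is named `kafka_<scala version>-<kafka version>[-<classifier>].jar`, as in the example above; the function name `parse_kafka_jar` and the list of classifiers are assumptions made for illustration.

```python
import re

# Assumed naming pattern: kafka_<scala version>-<kafka version>[-<classifier>].jar
JAR_PATTERN = re.compile(
    r"^kafka_(?P<scala>\d+\.\d+(?:\.\d+)?)-(?P<kafka>.+?)"
    r"(?:-(?:javadoc|sources|scaladoc|test-sources|test))?\.jar$"
)


def parse_kafka_jar(filename):
    """Return (scala_version, kafka_version) parsed from a Kafka jar file name."""
    match = JAR_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a recognised Kafka jar name: {filename!r}")
    return match.group("scala"), match.group("kafka")


scala_version, kafka_version = parse_kafka_jar("kafka_2.10-0.9.0.2.4.2.0-258-javadoc.jar")
print("Scala version:", scala_version)
print("Kafka version:", kafka_version)
```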
In summary, Kafka uses ZooKeeper for the following:[5]
- Electing a controller
  - The controller is one of the brokers and is responsible for maintaining the leader/follower relationship for all the partitions.
  - When a node shuts down, it is the controller that tells other replicas to become partition leaders, replacing the partition leaders on the node that is going away.
  - ZooKeeper is used to elect a controller, make sure there is only one, and elect a new one if it crashes.
- Cluster membership
  - Tells which brokers are alive and still part of the cluster.
- Topic configuration
  - Tells which topics exist, how many partitions each has, where the replicas are, who the preferred leader is, and what configuration overrides are set for each topic.
- (0.9.0)
  - Quotas: how much data each client is allowed to read and write
  - ACLs: who is allowed to read and write to which topic
- (old high-level consumer)
  - Tells which consumer groups exist, who their members are, and the latest offset each group got from each partition.
  - This functionality is going away.
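The controller election described in the list above can be sketched with a toy in-memory coordinator. It assumes a "first broker to create the node wins" rule and that a broker's nodes vanish when its session ends; `ToyCoordinator`, `elect_controller`, and the `/controller` path as used here are illustrative stand-ins, not the ZooKeeper client API.

```python
class ToyCoordinator:
    """In-memory stand-in for ephemeral-node behaviour (not ZooKeeper's API)."""

    def __init__(self):
        self._ephemeral = {}  # path -> id of the broker that owns the node

    def try_create(self, path, owner):
        """Create `path` for `owner` if it is absent; return True only for the winner."""
        if path in self._ephemeral:
            return False
        self._ephemeral[path] = owner
        return True

    def owner_of(self, path):
        return self._ephemeral.get(path)

    def expire_session(self, owner):
        """Drop every node owned by a broker whose session has ended."""
        self._ephemeral = {p: o for p, o in self._ephemeral.items() if o != owner}


def elect_controller(coordinator, live_brokers):
    """Every live broker tries to claim /controller; at most one succeeds."""
    for broker_id in live_brokers:
        coordinator.try_create("/controller", broker_id)
    return coordinator.owner_of("/controller")


coordinator = ToyCoordinator()
first = elect_controller(coordinator, [1, 2, 3])
unchanged = elect_controller(coordinator, [1, 2, 3])  # re-running does not unseat the controller

coordinator.expire_session(first)  # the controller broker crashes
second = elect_controller(coordinator, [2, 3])

print(first, unchanged, second)
```

The point of the sketch is the invariant: there is never more than one owner of the controller node, and a new owner appears only after the old one's session is gone.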
Diagram Credit
- www.michael-noll.com
References
1. Apache Kafka
2. Configuration of Kafka
3. Kafka in a Nutshell
4. Why do Kafka consumers connect to zookeeper, and producers get metadata from brokers
5. What is the actual role of ZooKeeper in Kafka?
6. How to choose the number of topics/partitions in a Kafka cluster?
7. Message Hub Kafka Java API
8. Introduction to Apache Kafka
9. Kafka Controller Redesign
10. Log4j Appender
11. Apache Kafka 0.8 Basic Training (Michael G. Noll, Verisign)
   - ZooKeeper
     - v0.8: used by brokers and consumers, but not by producers
     - v0.9: used by brokers only
       - Consumers will use special topics instead of ZooKeeper
       - Will substantially reduce the load on ZooKeeper for large deployments
   - G1 Tuning (JDK 7u51 or later; slide 66)
     - java -Xms4g -Xmx4g -XX:PermSize=48m -XX:MaxPermSize=48m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35
     - Note that PermGen has been removed in JDK 8
12. The value of Apache Kafka in Big Data ecosystem
13. Kafka Security Specific Features (06/03/2014)
14. Kafka FAQ
15. Kafka Operations
16. Kafka System Tools
17. Kafka Replication Tools
18. Monitoring Kafka performance metrics
19. Collecting Kafka performance metrics
20. Spark Streaming + Kafka Integration Guide (Spark 1.6.1)
21. All Cloud-related articles on Xml and More