
Using Apache Kafka to implement event-driven microservices

When talking about microservices architecture, most people think of a network of stateless services which communicate through HTTP (one may call it RESTful or not, depending on how much of a nitpicker one is).

But there is another way, which may be more suitable depending on the use case at hand.

I am talking about event-driven microservices, where, in addition to the classic request-response pattern, services publish messages that represent events (facts) and subscribe to topics (or queues, depending on the terminology) to receive events/messages.

Fully understanding and embracing this new software design paradigm is not straightforward, but it is well worth it (or at least worth looking into).

Several interconnected concepts need to be explored in order to appreciate the advantages of event-driven design and the evolutionary path that led to it, for example:
  • Log (including log-structured storage engine and write-ahead log)
  • Materialized View
  • Event Sourcing
  • Command Query Responsibility Segregation (CQRS)
  • Stream processing
  • "Inside-out" databases (a.k.a. "un-bundled" databases)
I'd like to point you to the following books to get familiar with those topics:
I read those three books and then started building a simple PoC, since learning new design ideas is great, but the learning is not complete until you also put them into practice.
Also, I was not happy with the examples of event-driven applications/services available online; I found them too simplistic and not properly explained, so I decided to create my own.

The proof of concept

The source code is split into two GitHub repositories (as per the Clean Architecture):
The proof of concept service keeps track of the balance available in bank accounts (like a ledger).
It listens for Transfer messages on a Kafka topic and, when one is received, updates the balance of the related account by publishing a new AccountBalance message on another Kafka topic.

Please note that each entity type is represented by two different classes: 

  • one is generated by Apache Avro and is used for serialization and deserialization (so entities can be sent to and received from Kafka) → see the avro directory.
  • the other is a POJO which may contain some convenience constructors and does not depend on Avro → see the net.devaction.entity package.
The net.devaction.kafka.avro.util package holds converters to move back and forth from one data representation to the other.
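As an illustration of this dual representation, a converter could look roughly like the following sketch (the classes and field names here are hypothetical stand-ins, not the actual PoC classes; in the real code the "wire" class is generated by Avro):

public class TransferConverter {

    // Stand-in for the Avro-generated class used only for Kafka (de)serialization.
    public static class TransferAvro {
        public String accountId;
        public double amount;
    }

    // Plain POJO used by the business logic, with no Avro dependency.
    public static class TransferEntity {
        private final String accountId;
        private final double amount;

        public TransferEntity(String accountId, double amount) {
            this.accountId = accountId;
            this.amount = amount;
        }

        public String getAccountId() { return accountId; }
        public double getAmount() { return amount; }
    }

    public static TransferEntity toEntity(TransferAvro avro) {
        return new TransferEntity(avro.accountId, avro.amount);
    }

    public static TransferAvro toAvro(TransferEntity entity) {
        TransferAvro avro = new TransferAvro();
        avro.accountId = entity.getAccountId();
        avro.amount = entity.getAmount();
        return avro;
    }
}

Keeping the Avro classes at the edges this way means the rest of the service does not depend on the serialization technology.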
At first, Apache Kafka may seem overwhelming: even though it resembles a classic message broker such as ActiveMQ or RabbitMQ, it is much more than that and it works very differently internally.
Also, there are several Kafka client APIs, which adds to the confusion for the learner.
We are going to focus on the following three:
  • the Producer API
  • the Consumer API
  • the Streams API
The Producer and Consumer APIs are lower level, and the Streams API is built on top of them.
Both sets of APIs have advantages and disadvantages.
The Producer/Consumer APIs give the application developer finer control at the cost of higher complexity.
On the other hand, the Streams API is not as flexible, but it allows some standard operations to be implemented more easily and it requires much less code.
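To give a feel for the lower-level APIs, here is a minimal Producer example; the String/Double key and value types and the hard-coded values are simplifications of this sketch (the actual PoC publishes Avro-generated classes):

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerApiSketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.DoubleSerializer");

        // With the Producer API, serialization, partitioning (via the record key),
        // batching and error handling are all explicit concerns of the application.
        try (KafkaProducer<String, Double> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("account-balances", "account-42", 100.0));
            producer.flush();
        }
    }
}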


The "transfers recording" example/PoC service can be started in one of the following two modes:

The two modes provide exactly the same functionality, which is quite convenient for comparison purposes.

The (explicit) polling mode

It has four main components (a simplified sketch of the whole flow follows the list): 
  • A consumer which listens on the "transfers" topic → see TransferConsumer.java
  • A ReadOnlyKeyValueStore (which is part of the Streams API) to materialize the "account-balances" topic data into a queryable view, so we can use the accountId value to retrieve the latest/current balance of a specific account → see AccountBalanceRetrieverImpl.java.
    Please note that the accountId value is extracted from the "transfer" data message received by the consumer.
  • The business logic which creates a new/updated AccountBalanceEntity object from the received TransferEntity object and the current AccountBalanceEntity present in Kafka → see NewAccountBalanceProvider.java
  • A producer which publishes the updated balance by sending a message to the "account-balances" topic, which in turn causes the local data store to be updated accordingly.
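To make the flow concrete, here is a simplified sketch of the polling loop; it uses String account ids and Double amounts instead of the PoC's Avro entities, and a hypothetical currentBalanceFor() method standing in for the ReadOnlyKeyValueStore lookup (AccountBalanceRetrieverImpl in the PoC):

import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PollingModeSketch {

    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "transfers-recording");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.DoubleDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.DoubleSerializer");

        try (KafkaConsumer<String, Double> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, Double> producer = new KafkaProducer<>(producerProps)) {

            // 1. Listen on the "transfers" topic (keyed by account id).
            consumer.subscribe(List.of("transfers"));

            while (true) {
                ConsumerRecords<String, Double> transfers = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, Double> transfer : transfers) {
                    String accountId = transfer.key();

                    // 2. Retrieve the current balance (in the PoC this comes from the
                    //    ReadOnlyKeyValueStore that materializes "account-balances").
                    double currentBalance = currentBalanceFor(accountId);

                    // 3. Business logic: compute the new balance.
                    double newBalance = currentBalance + transfer.value();

                    // 4. Publish the updated balance to the "account-balances" topic.
                    producer.send(new ProducerRecord<>("account-balances", accountId, newBalance));
                }
            }
        }
    }

    // Hypothetical placeholder for the materialized-view lookup.
    private static double currentBalanceFor(String accountId) {
        return 0.0; // assume no previous balance in this sketch
    }
}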

The "join streams" mode

As we said before, this second operating mode uses only the Streams API/DSL, which lets us code at a higher level of abstraction.
We can see that the code is much more compact than in the previous mode.
We do not need to explicitly map the KStream key to the KTable key; that is exactly what the join does (see the join in the snippet below). Hence, we need to choose the Kafka keys accordingly.
In this case, both message keys represent the account id.
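A minimal sketch of such a topology could look like this; it uses String account ids and Double amounts instead of the Avro entities of the PoC, and a left join so that an account without a previous balance starts from zero (both are assumptions of this sketch, not necessarily what the actual code does):

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class JoinStreamsModeSketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "transfers-recording-streams");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // Stream of transfers, keyed by account id; the value is the transfer amount.
        KStream<String, Double> transfers =
                builder.stream("transfers", Consumed.with(Serdes.String(), Serdes.Double()));

        // Table holding the latest known balance per account id.
        KTable<String, Double> balances =
                builder.table("account-balances", Consumed.with(Serdes.String(), Serdes.Double()));

        // Because both the stream and the table are keyed by the account id,
        // the join pairs each incoming transfer with the current balance of that account.
        transfers.leftJoin(balances,
                        (amount, currentBalance) ->
                                (currentBalance == null ? 0.0 : currentBalance) + amount)
                 .to("account-balances", Produced.with(Serdes.String(), Serdes.Double()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

Note how the join, the business logic and the write back to the "account-balances" topic fit in a handful of lines, compared with the explicit consumer/producer plumbing of the polling mode.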





Diagram

Running the code

To build and run the PoC application, in addition to Maven and Java, we also need a Kafka broker.
I decided to install the Confluent Platform, which includes a Kafka broker (or a cluster, depending on the chosen configuration) with some example topics and pre-configured integration with Elasticsearch and Kibana. More importantly, it also includes an admin Web UI called "Control Center", which comes in very handy.

I hit a few bumps when running the Confluent Platform for the first time on my Fedora 30 computer.
Namely, I had to manually install a couple of software packages ("jot" and "jq").
I also had to install the Confluent CLI separately.
Finally, I had to make several changes to some properties files and bash scripts to be able to run the Confluent Platform as a non-root user; here are the changes, please adjust the configuration values as per your environment.

Watch the following YouTube video to get all the details including starting the Confluent Platform and running the example Streams application:

 
