Axon Metrics with Spring Boot 2 + Prometheus + Grafana

Included in the Axon framework is a module that will collect metrics on the different Axon message types (commands, events and queries). In this blog, I will explain how to use the Axon metrics module in combination with Prometheus and visualize the metrics in Grafana. 

Full disclosure: I contribute to the Axon Metrics module myself, so I’m not completely impartial 😉

Intro Axon

Axon is a Java framework for writing applications using the CQRS paradigm. Compared to the traditional CRUD/Active Record style of writing applications, CQRS focuses more on business interactions. In a CRUD-based system the focus lies on the creation, reading, updating and deletion (hence CRUD) of records, while CQRS focuses on the business interactions by modeling them as explicit actions (commands).

For example, instead of an update on the customer information record when the customer moves to a different address, CQRS also models the intent of the change. There will be an explicit MoveCustomer command. When using CQRS with event sourcing this command will also result in an explicit event that indicates that the customer has moved. Probably something like a CustomerMoved event. This explicit modeling of business interactions and state changes has great benefits in applications with complex business processes.
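To make this concrete, here is a minimal plain-Java sketch of the explicit command and event from the example above. It deliberately leaves out Axon’s annotations and infrastructure. The class layout, the field names and the handle method are assumptions made purely for illustration.

```java
import java.util.Objects;

final class CustomerMoveSketch {

    // Hypothetical command: expresses the intent "this customer moves".
    static final class MoveCustomerCommand {
        final String customerId;
        final String newAddress;

        MoveCustomerCommand(String customerId, String newAddress) {
            this.customerId = Objects.requireNonNull(customerId);
            this.newAddress = Objects.requireNonNull(newAddress);
        }
    }

    // Hypothetical event: records the fact that the customer has moved.
    static final class CustomerMovedEvent {
        final String customerId;
        final String newAddress;

        CustomerMovedEvent(String customerId, String newAddress) {
            this.customerId = customerId;
            this.newAddress = newAddress;
        }
    }

    // Handling the command results in an explicit event instead of a silent record update.
    static CustomerMovedEvent handle(MoveCustomerCommand command) {
        return new CustomerMovedEvent(command.customerId, command.newAddress);
    }

    public static void main(String[] args) {
        CustomerMovedEvent event = handle(new MoveCustomerCommand("customer-1", "New Street 1"));
        System.out.println(event.customerId + " moved to " + event.newAddress);
    }
}
```

Because the reason for the change is part of the message type, a correction of a misspelled address could get its own command and event, and both could be counted separately.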

In a traditional CRUD application this information is usually lost, because it’s often impossible to deduce why, in this example, the customer’s address was updated (was it a misspelling, or did the customer actually move to a new address?). When correctly applying event sourcing we can record and react to this information.

Another advantage of having the business interactions modeled in explicit commands and events is that with Axon metrics we can measure the actual business itself. Visualizing metrics collected on these messages holds great value for businesses. Instead of measuring that certain (obscure) technical calls and updates are made to a system, why not measure the number of OrderPlaced events and put a thermometer in the core business process itself?

When the number of OrderPlaced events suddenly goes down, there is a problem that affects the bottom line of the business. That is different from a purely technical problem, such as some obscure REST API call or internal service call that no longer works, where the impact on the business is unclear.

Example application

I’ve created an example application to demonstrate the capabilities of using Axon with the metrics module. The code can be found at https://github.com/luminis-ams/axon-metrics-example. The example application models a flight ticket booking domain. A component simulates user interaction by generating commands that automatically create flights and book seats on them.

I’ve instrumented the message processors in such a way that it is possible to monitor individual message payload types (for example BookSeatCommand and SeatBookedEvent) and individual message handlers (for example the SlowEventListener component). I’ve deliberately put some interesting problems in the application, so that I can demonstrate how you can detect problems in your Axon application by monitoring it.

The Axon Metric configuration code can be found here.

A small side note: the current metrics module uses Dropwizard Metrics. To expose the metrics to Prometheus, I’ve configured an exporter from Dropwizard Metrics to Prometheus with this line:

collectorRegistry.register(new DropwizardExports(metricRegistry));

(see MetricConfig line 33)

Micrometer Metrics in Axon

I’ve sent in a pull request to Axon to add support for Micrometer metrics. Micrometer is the default metrics library in Spring Boot 2.0. If you check out the axon-micrometer_beta branch on the GitHub repository you can see how to use the new module (for now you can only use it if you run a Maven install of my pull request branch).

In the configuration class on the Micrometer branch, you can see it’s not necessary to export the Dropwizard metrics to Prometheus anymore. Micrometer provides an abstraction of Prometheus which we now use directly. Micrometer can be used with a lot of different metric implementations. For an overview see https://micrometer.io/docs

Dashboard with Grafana

Now that the metrics are exposed via Prometheus we can use Grafana to visualize them in a dashboard. (For more information about Prometheus and Grafana see the excellent blog of my colleague Jeroen Reijn about “Monitoring Spring Boot applications with Prometheus and Grafana”)

Here are some screenshots from Grafana.

CapacityMonitor on events

This one displays the capacity of the event listeners.

As you can see in the above image, one event listener is close to full capacity. With a single-threaded processor this means that the event listener is busy almost 100% of the time, so events are probably queuing up for this event processor. The capacity is not always exactly 1 because it doesn’t include the overhead of the Axon Framework itself: Axon has to retrieve event messages from a database and store them once they’re processed.
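The idea behind the capacity number can be sketched in a few lines of plain Java. This is a simplified illustration of "fraction of time spent handling messages", not the actual code of the CapacityMonitor. The window length and the handling times are made-up numbers.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.List;

final class CapacitySketch {

    // Capacity = total time spent handling messages in the window / length of the window.
    // For a single-threaded processor, 1.0 means the handler was busy for the whole window.
    static double capacity(List<Duration> handlingTimes, Duration window) {
        if (window.isZero() || window.isNegative()) {
            throw new IllegalArgumentException("window must be positive");
        }
        long busyNanos = 0;
        for (Duration handlingTime : handlingTimes) {
            busyNanos += handlingTime.toNanos();
        }
        return (double) busyNanos / window.toNanos();
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 19 events of 500 ms each within a 10 second window.
        List<Duration> times = Collections.nCopies(19, Duration.ofMillis(500));
        System.out.println(capacity(times, Duration.ofSeconds(10))); // prints 0.95
    }
}
```

In this made-up example the listener is busy for 9.5 of the 10 seconds. The remaining time is where the framework overhead mentioned above would show up.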

MessageTimerMonitor on events

This one displays the timings of the different event processors.

You can see that for two of the four event processors, half of the events sometimes take more than 4 seconds to process. This is a problem and should be looked at.
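"Half of the events" refers to the median (the 0.5 quantile) of the measured handling times. As a rough illustration of what such a number means, here is a small nearest-rank quantile calculation in plain Java. It is not how the metrics library computes its quantiles internally, and the sample timings are invented.

```java
import java.util.Arrays;

final class MedianSketch {

    // Nearest-rank quantile: the smallest sample for which at least a fraction q
    // of all samples is less than or equal to it.
    static double quantile(double[] samples, double q) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(q * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Hypothetical handling times in seconds for one event processor.
        double[] seconds = {0.2, 4.5, 0.3, 5.0, 4.1};
        System.out.println(quantile(seconds, 0.5)); // prints 4.1
    }
}
```

A median of 4.1 seconds in this made-up sample means that roughly half of the events took at least that long, even though some were handled in a fraction of a second.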

MessageCountingMonitor on BookSeatCommand

This graph displays the success, failure and ingested counters of the BookSeatCommand.

As you can see, the command handler of the BookSeatCommand sometimes throws an exception, which is counted as a failure in the graph.
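The relation between the three counters can be shown with a small plain-Java sketch. It only mimics the counting behaviour described above and is not the implementation of the MessageCountingMonitor. The monitor method and the use of a Supplier as a stand-in for a command handler are assumptions for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

final class CountingSketch {

    final AtomicLong ingested = new AtomicLong();
    final AtomicLong success = new AtomicLong();
    final AtomicLong failure = new AtomicLong();

    // Counts the message as ingested, runs the (hypothetical) handler,
    // then counts the outcome as either a success or a failure.
    <R> R monitor(Supplier<R> handler) {
        ingested.incrementAndGet();
        try {
            R result = handler.get();
            success.incrementAndGet();
            return result;
        } catch (RuntimeException e) {
            failure.incrementAndGet();
            throw e;
        }
    }

    public static void main(String[] args) {
        CountingSketch counters = new CountingSketch();
        counters.monitor(() -> "seat booked");
        try {
            counters.<String>monitor(() -> {
                throw new IllegalStateException("seat already taken");
            });
        } catch (IllegalStateException expected) {
            // The failure has been counted; nothing else to do in this sketch.
        }
        System.out.println(counters.ingested + " ingested, "
                + counters.success + " succeeded, " + counters.failure + " failed");
    }
}
```

Every handled message ends up in exactly one of the success or failure counters, so a growing gap between ingested and success is a sign that handlers are throwing exceptions.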

Run the example application

I’ve created a Docker setup if you want to run the full stack yourself. You can check out the code of the example application here:

https://github.com/luminis-ams/axon-metrics-example

Then first build the app project to create the application Docker image. From the project root run:

./mvnw clean install

After that run:

docker-compose -f docker/docker-compose.yml up

As mentioned, the metrics are exposed using Prometheus. The application’s Prometheus endpoint at http://localhost:8080/actuator/prometheus shows the metrics in raw form.
The Prometheus server itself, where you can query the metrics, is available at http://localhost:9090
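The raw form is the Prometheus text format: comment lines starting with #, followed by one sample per line. If you want to inspect that output from code, a simplified parser could look like the sketch below. It assumes you have already fetched the response body as a String with an HTTP client of your choice. The metric names in the example are hypothetical and not the names the example application exposes, and the parsing ignores edge cases such as escaped characters in label values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PrometheusTextSketch {

    // Returns "name{labels}" -> value for every sample whose metric name starts with the prefix.
    static Map<String, Double> samples(String body, String namePrefix) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (String rawLine : body.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and HELP/TYPE comments
            }
            int close = line.lastIndexOf('}');
            int split = close >= 0 ? close + 1 : line.indexOf(' ');
            if (split <= 0 || !line.startsWith(namePrefix)) {
                continue;
            }
            String key = line.substring(0, split);
            // The first token after the name is the value; an optional timestamp is ignored.
            String value = line.substring(split).trim().split("\\s+")[0];
            if (value.isEmpty()) {
                continue;
            }
            if (value.endsWith("Inf")) {
                value += "inity"; // Prometheus writes +Inf/-Inf, Java expects Infinity
            }
            result.put(key, Double.parseDouble(value));
        }
        return result;
    }

    public static void main(String[] args) {
        String body = "# TYPE demo_total counter\n"
                + "demo_total{type=\"success\"} 42.0\n"
                + "demo_total{type=\"failure\"} 3.0\n"
                + "other_metric 12.0\n";
        samples(body, "demo_").forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```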

You can import the example metrics dashboard in Grafana in the following way:

  1. Log in at http://localhost:3000 (credentials admin:password)
  2. On the left side of the screen click on the + icon and select Import
  3. Copy the contents of grafana/dashboard.json into the bottom text field
  4. Select the Prometheus data source in the dropdown

Conclusion

Because of Axon’s strong focus on modeling business interactions, metrics on those business interactions contain a lot more useful information than purely technical metrics. They are useful not only for troubleshooting but also for measuring the performance of the business itself.