Analyzing a performance issue


Recently, a customer reported that the performance of the web application we are building, an AngularJS front end using REST services from a Java back end, was unsatisfactory. I had a fun time finding out what the problem was, understanding what was causing it, and thinking up solutions. I’d like to share the experience with you.

What is the problem?

“Bad performance” can mean many things. Actual response times are only part of the story; an application that shows feedback while gathering a response can make you wait longer before it is perceived as slow. When users know their bandwidth is limited, for instance on a mobile device on a slower network, they anticipate a slower site. But of course, it is always possible that the services on which your site relies are just too slow. So, first order of business: ask the reporter what he means. What action did he perform, and how long did it take? How long did he expect it to take? It turned out there were several problem areas, not all of which were actually related to performance. I will describe two of these areas in more detail:

  • loading of the AngularJS application
  • loading of list views

Loading of the AngularJS application

Our AngularJS application has to perform several tasks before it can render the first view. This takes time, and during this time the end user is staring at a blank page.

Finding out what happens

To figure out which tasks are executed and which of them can most easily be sped up, I opened a fresh Chrome window, cleared all caches, opened the network panel and loaded the main page.

The network panel showed something interesting: a flurry of activity at the start, followed by about 300 ms of silence, after which a lot more is loaded. What does it mean? Among the first items to be loaded is our JavaScript library, including AngularJS and other components. This is a large file, so it takes several hundred milliseconds to load. After that, the AngularJS application initializes (nothing happens on the network), and then additional data is downloaded as the application builds its first view. The largest of these downloads is a translations file. The application won’t show anything until the translations are loaded, so the limiting path is:

  1. Loading of the JavaScript libraries
  2. Initialization of the AngularJS application
  3. Retrieving the translations file

Improvements

The initialization of the AngularJS application is not easy to speed up, so I’ll concentrate on the other two parts.

The time it takes to download a JavaScript library is largely determined by its size. The library was already minified during the build. However, when a project has been running for some time, it is not uncommon to have solved the same problem in two different ways, using two different libraries, or to end up with unused libraries. As it turned out, we were using two different libraries for file uploads. Choosing a single library for the uploads and removing libraries that were no longer used reduced the size of the final, minified JavaScript file and improved the load time.

The contents of the translations file were all used. However, since this is a plain text file, it was a prime candidate for compressed transfer. The browser decompresses gzipped responses transparently, so the only thing we needed to do was add an @org.apache.cxf.annotations.GZIP annotation to the Apache CXF resource in the back end, which reduced the download time from several hundred milliseconds to roughly 10 ms.
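
Enabling gzip compression on a CXF JAX-RS resource comes down to a single class-level annotation. A minimal sketch of what that might look like (the resource below is illustrative, not our actual translations service):

import java.util.Collections;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

import org.apache.cxf.annotations.GZIP;

// The class-level @GZIP annotation tells CXF to gzip the responses of this resource.
@GZIP
@Path("/translations")
public class TranslationsResource {

  @GET
  @Produces(MediaType.APPLICATION_JSON)
  public Response getTranslations() {
    // Stand-in payload; the real resource returns the translation bundle.
    return Response.ok(Collections.singletonMap("greeting", "Hello")).build();
  }
}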

List views

List views are always tricky, especially if there is no limit to the number of elements that may show up in the list, as was the case in some of our services. However, that turned out not even to be the biggest problem.

Feedback: always

When opening a page with a list, the application would show a spinner to indicate that the contents of the list were still being retrieved. The spinner would be removed, and the list shown, when the variable containing the list data was present on the scope:

 
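Something like the following simplified sketch (the actual markup and CSS classes differ, and project.name is just an illustrative field):

<div class="spinner" ng-hide="projects"></div>
<ul ng-show="projects">
  <li ng-repeat="project in projects">{{project.name}}</li>
</ul>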

However, the controller that retrieves the projects had no error handling:

projectService.query()
.then(function (projects) {
  $scope.projects = projects;
});

So, if the back end service returned any error, the projects variable would not be set and the spinner would keep spinning – which made the testers conclude that the application was very slow! You should always provide feedback on the outcome of a back end call. At the very least, a spinner suggesting that data is being retrieved should always be removed when a response is received. Showing the spinner should depend on whether a call is being performed, not on the presence of the result:

 

$scope.loadingProjects = true;
projectService.query()
.then(function(projects) {
  $scope.projects = projects;
})
.finally(function() {
  $scope.loadingProjects = false;
});

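The view can then bind the spinner to the state of the call rather than to the result (again a simplified sketch):

<div class="spinner" ng-show="loadingProjects"></div>
<ul ng-hide="loadingProjects">
  <li ng-repeat="project in projects">{{project.name}}</li>
</ul>
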
Even better would be to provide an error message if the back end returned an error:

$scope.loadingProjects = true;
projectService.query()
.then(function(projects) {
  $scope.projects = projects;
})
.catch(function(error) {
  // Process this to show an appropriate error message in the view
})
.finally(function() {
  $scope.loadingProjects = false;
});

Slow REST services

Of course, there’s still the problem of REST services simply taking a long time to return a response. To monitor the response times of these services, we added simple metrics to all the REST services using Java Simon, a simple framework for gathering performance data. You request a Stopwatch and use it to measure the call durations on the resource; it will then provide you with minimum, maximum and mean call durations:

 

@GET
@Produces(MediaType.APPLICATION_JSON)
public Response getProjects() {
  final Split split = SimonManager.getStopwatch("getProjects").start();
  final Response response;
  try {
    response = Response.ok(projectManagement.findAll()).build();
  } finally {
    split.stop();
  }
  return response;
}

The gathered statistics can then be read back from the same stopwatch:

Stopwatch stopwatch = SimonManager.getStopwatch("getProjects");
long minimumDuration = stopwatch.min(); // minimum duration in nanoseconds
long maximumDuration = stopwatch.max(); // maximum duration in nanoseconds
long meanDuration = stopwatch.mean(); // mean duration in nanoseconds

We then used Prometheus to provide access to the measurements. There are of course many reasons why a service can be slow, so I won’t go into details here, but this setup showed which services were consistently slow, so we could focus our improvement efforts on those.
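
Exposing the stopwatches to Prometheus is mostly plumbing, but the general idea is a small bridge that periodically copies the stopwatch statistics into Prometheus metrics. A rough sketch using the Prometheus Java client (the class and metric names are illustrative, not our production setup):

import io.prometheus.client.Gauge;
import org.javasimon.SimonManager;
import org.javasimon.Stopwatch;

// Hypothetical bridge that publishes Java Simon stopwatch statistics as Prometheus gauges.
public class SimonPrometheusBridge {

  private static final Gauge MAX_DURATION = Gauge.build()
      .name("rest_call_max_duration_nanoseconds")
      .help("Maximum observed call duration per REST resource method")
      .labelNames("resource")
      .register();

  private static final Gauge MEAN_DURATION = Gauge.build()
      .name("rest_call_mean_duration_nanoseconds")
      .help("Mean call duration per REST resource method")
      .labelNames("resource")
      .register();

  // Call this periodically (for example from a scheduled task) for each monitored stopwatch.
  public static void update(String name) {
    Stopwatch stopwatch = SimonManager.getStopwatch(name);
    MAX_DURATION.labels(name).set(stopwatch.max());
    MEAN_DURATION.labels(name).set(stopwatch.mean());
  }
}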