Circuit breaker

In microservice architecture, it's common that various services running on different machines are connected to each other through remote calls. If one of the services becomes unreachable somehow such as due to network issues, the client which connects to that service will take long to get a response from it or to fail to get one. Situation will get even worse if there are many remote calls to such unresponsive service. You can configure an Armeria client with a circuit breaker to prevent this circumstance. The circuit breaker can automatically detect failures by continuously updating success and failure events. If the remote service is unresponsive, it will immediately respond with an error and not make remote calls. Please refer to CircuitBreaker wiki page by Martin Fowler and LINE Engineering blog post about circuit breaker for more information.

State of CircuitBreaker

A CircuitBreaker can be one of the following three states:

  • CLOSED
    • Initial state. All requests are treated normally.
  • OPEN
    • The state machine enters the OPEN state once the number of failures divided by the total number of requests exceeds a certain threshold. All requests are blocked off responding with FailFastException.
  • HALF_OPEN.
    • After a certain amount of time in the OPEN state, the state machine enters the HALF_OPEN state which sends a request to find out if the service is still unavailable or not. If the request succeeds, it enters the CLOSED state. If it fails, it enters the OPEN state again.
failed again under threshold exceeded threshold timed out back to normal request succeeded OPEN CLOSED HALF OPEN

CircuitBreakerClient

Armeria provides two different client implementations depending on the Request and Response types:

Let's use CircuitBreakerClient to find out what we can do. You can use the decorator() method in WebClientBuilder to build a CircuitBreakerClient:

import com.linecorp.armeria.client.WebClient;
import com.linecorp.armeria.client.circuitbreaker.CircuitBreaker;
import com.linecorp.armeria.client.circuitbreaker.CircuitBreakerClient;
import com.linecorp.armeria.client.circuitbreaker.CircuitBreakerRule;
import com.linecorp.armeria.common.AggregatedHttpResponse;
import com.linecorp.armeria.common.HttpRequest;
import com.linecorp.armeria.common.HttpResponse;

CircuitBreakerRule rule = CircuitBreakerRule.builder()
                                            .onServerErrorStatus()
                                            .onException()
                                            .thenFailure();

WebClient client = WebClient
        .builder(...)
        .decorator(CircuitBreakerClient.builder(rule)
                                       .newDecorator())
        .build();

AggregatedHttpResponse res = client.execute(...).aggregate().join(); // Send requests on and on.

Now, the WebClient can track the number of success or failure events depending on the Responses. The CircuitBreaker will enter OPEN, when the number of failures divided by the total number of Requests exceeds the failure rate. Then the WebClient will immediately get FailFastException by the CircuitBreaker.

CircuitBreakerRule

How does a CircuitBreaker know whether a Response is successful or not? CircuitBreakerRule does the job. In the example above, if the status of a Response is 5xx or an Exception is raised during the call, the count of failure is increased. You can have your own rule by building CircuitBreakerRule. The following example builds a CircuitBreakerRule that fails when an Exception is raised or the status is 5xx, succeeds when the status is 2xx and ignores others.

import com.linecorp.armeria.client.ClientRequestContext;
import com.linecorp.armeria.client.UnprocessedRequestException;
import com.linecorp.armeria.common.HttpStatus;
import com.linecorp.armeria.common.HttpStatusClass;

final CircuitBreakerRule myRule =
        CircuitBreakerRule.of(
                // A failure if an Exception is raised.
                CircuitBreakerRule.onException(),
                // Neither a success nor a failure because the request has not been handled by the server.
                CircuitBreakerRule.builder()
                                  .onUnprocessed()
                                  .thenIgnore(),
                // A failure if the response is 5xx.
                CircuitBreakerRule.onServerErrorStatus(),
                // A success if the response is 2xx.
                CircuitBreakerRule.builder()
                                  .onStatusClass(HttpStatusClass.SUCCESS)
                                  .thenSuccess(),
                // Neither a success nor a failure. Do not take this response into account.
                CircuitBreakerRule.builder().thenIgnore());

If you need to determine whether the request was successful by looking into the response content, you should build CircuitBreakerRuleWithContent and specify it when you create an WebClient using CircuitBreakerClientBuilder:

import com.linecorp.armeria.client.circuitbreaker.CircuitBreakerRuleWithContent;

final CircuitBreakerRuleWithContent<HttpResponse> myRule =
        CircuitBreakerRuleWithContent.of(
                CircuitBreakerRuleWithContent.<HttpResponse>builder()
                                             .onUnprocessed()
                                             .thenIgnore(),
                CircuitBreakerRuleWithContent.<HttpResponse>builder()
                                             .onException()
                                             .thenFailure(),
                CircuitBreakerRuleWithContent.<HttpResponse>builder()
                                             .onResponse(response -> response.aggregate().thenApply(content -> {
                                                 return content.equals("Failure");
                                             }))
                                             .thenFailure());

WebClient client = WebClient
        .builder(...)
        .decorator(CircuitBreakerClient.builder(myRule) // Specify the rule
                                       .newDecorator())
        .build();

AggregatedHttpResponse res = client.execute(...).aggregate().join();

Grouping CircuitBreakers

In the very first example above, a single CircuitBreaker was used to track the events. However, in many cases, you will want to use different CircuitBreaker for different endpoints. For example, there might be an API which performs heavy calculation which fails often. On the other hand, there can be another API which is not resource hungry and this is not likely to fail. Having one CircuitBreaker that tracks all the success and failure does not make sense in this scenario. It's even worse if the client connects to the services on different machines. When one of the remote services is down, your CircuitBreaker will probably be OPEN state although you can connect to other services. Therefore, Armeria provides various ways that let users group the range of circuit breaker instances.

  • Group by host: a single CircuitBreaker is used for each remote host.

    import com.linecorp.armeria.client.circuitbreaker.CircuitBreakerRpcClient;
    import com.linecorp.armeria.common.RpcResponse;
    
    final CircuitBreakerRule httpRule = CircuitBreakerRule.builder()
                                                          .onServerErrorStatus()
                                                          .onException()
                                                          .thenFailure();
    final CircuitBreakerRule rpcRule = CircuitBreakerRule.onException();
    
    // Create a CircuitBreaker with the key name
    final Function<String, CircuitBreaker> factory = key -> CircuitBreaker.of("my-cb-" + key);
    
    // Create CircuitBreakers per host (a.com, b.com ...)
    CircuitBreakerClient.newPerHostDecorator(factory, httpRule);
    CircuitBreakerRpcClient.newPerHostDecorator(factory, rpcRule);
    // The names of the created CircuitBreaker: my-cb-a.com, my-cb-b.com, ...
  • Group by method: a single CircuitBreaker is used for each method.

    // Create CircuitBreakers per method
    CircuitBreakerClient.newPerMethodDecorator(factory, httpRule);
    // The names of the created CircuitBreaker: my-cb-GET, my-cb-POST, ...
    
    CircuitBreakerRpcClient.newPerMethodDecorator(factory, rpcRule);
    // The names of the created CircuitBreaker: my-cb-methodA, my-cb-methodB, ...
  • Group by host and method: a single CircuitBreaker is used for each method and host combination.

    // Create a CircuitBreaker with the host and method name
    final BiFunction<String, String, CircuitBreaker> factory =
            (host, method) -> CircuitBreaker.of("my-cb-" + host + '#' + method);
    // Create CircuitBreakers per host and method
    CircuitBreakerClient.newDecorator(CircuitBreakerMapping.perHostAndMethod(factory), httpRule);
    // The names of the created CircuitBreaker: my-cb-a.com#GET,
    // my-cb-a.com#POST, my-cb-b.com#GET, my-cb-b.com#POST, ...
    
    CircuitBreakerRpcClient.newDecorator(CircuitBreakerMapping.perHostAndMethod(factory), rpcRule);
    // The names of the created CircuitBreaker: my-cb-a.com#methodA,
    // my-cb-a.com#methodB, my-cb-b.com#methodA, my-cb-b.com#methodB, ...

CircuitBreakerBuilder

We have used static factory methods in CircuitBreaker interface to create a CircuitBreaker so far. If you use CircuitBreakerBuilder, you can configure the parameters which decide CircuitBreaker's behavior. Here are the parameters:

  • name:

  • failureRateThreshold:

    • The threshold that changes CircuitBreaker's state to OPEN when the number of failed Requests divided by the number of total Requests exceeds it. The default value is 0.5.
  • minimumRequestThreshold:

    • The minimum number of Requests to detect failures. The default value is 10.
  • trialRequestInterval:

    • The duration that a CircuitBreaker remains in HALF_OPEN state. All requests are blocked off responding with FailFastException during this state. The default value is 3 seconds.
  • circuitOpenWindow:

  • counterSlidingWindow:

    • The duration of a sliding window that a CircuitBreaker counts successful and failed Requests in it. The default value is 20 seconds.
  • counterUpdateInterval:

    • The duration that a CircuitBreaker stores events in a bucket. The default value is 1 second.
  • listeners:

    • The listeners which are notified by a CircuitBreaker when an event occurs. The events are one of stateChanged, eventCountUpdated and requestRejected. You can use CircuitBreakerListener.metricCollecting() to export metrics:
    import com.linecorp.armeria.client.circuitbreaker.CircuitBreakerListener;
    
    import com.linecorp.armeria.common.Flags;
    
    final CircuitBreakerListener listener =
            CircuitBreakerListener.metricCollecting(Flags.meterRegistry());
    final CircuitBreakerBuilder builder = CircuitBreaker.builder()
                                                        .listener(listener);

Using CircuitBreaker with non-Armeria client

CircuitBreaker API is designed to be framework-agnostic and thus can be used with any non-Armeria clients:

  1. Create a CircuitBreaker.
  2. Guard your client calls with CircuitBreaker.tryRequest().
  3. Update the state of CircuitBreaker by calling CircuitBreaker.onSuccess() or CircuitBreaker.onFailure() depending on the outcome of the client call.

For example:

private final CircuitBreaker cb = CircuitBreaker.of("some-client");
private final SomeClient client = ...;

public void sendRequestWithCircuitBreaker() {
    if (!cb.tryRequest()) {
        throw new RuntimeException();
    }

    boolean success = false;
    try {
        success = client.sendRequest();
    } finally {
        if (success) {
            cb.onSuccess();
        } else {
            cb.onFailure();
        }
    }
}