Showing posts with label hystrix. Show all posts
Showing posts with label hystrix. Show all posts

Sunday, June 17, 2018

In the shoes of the consumer: do you really need to provide the client libraries for your APIs?

The beauty of the RESTful web services and APIs is that any consumer which speaks HTTP protocol will be able to understand and use it. Nonetheless, the same dilemma pops up over and over again: should you accompany your web APis with the client libraries or not? If yes, what languages or/and frameworks should you support?

Quite often this is not really an easy question to answer. So let us take a step back and think about the overall idea for a moment: what are the values which client libraries may bring to the consumers?

Someone may say to lower the barrier for adoption. Indeed, specifically in the case of strongly typed languages, exploring the API contracts from your favorite IDE (syntax highlighting and auto-completion please!) is quite handy. But by and large, the RESTful web APIs are simple enough to start with and good documentation would be certainly more valuable here.

Others may say it is good to shield the consumers from dealing with multiple API versions or rough edges. Also kind of make sense but I would argue it just hides the flaws with the way the web APIs in question are designed and evolve over time.

All in all, no matter how many clients you decide to bundle, the APIs are still going to be accessible by any generic HTTP consumer (curl, HttpClient, RestTemplate, you name it). Giving a choice is great but the price to pay for maintenance could be really high. Could we do it better? And as you may guess already, we certainly have quite a few options hence this post.

The key ingredient of the success here is to maintain an accurate specification of your RESTful web APIs, using OpenAPI v3.0 or even its predecessor, Swagger/OpenAPI 2.0 (or RAML, API Blueprint, does not really matter much). In case of OpenAPI/Swagger, the tooling is the king: one could use Swagger Codegen, a template-driven engine, to generate API clients (and even server stubs) in many different languages, and this is what we are going to talk about in this post.

To simplify the things, we are going to implement the consumer of the people management web API which we have built in the previous post. To begin with, we need to get its OpenAPI v3.0 specification in the YAML (or JSON) format.

java -jar server-openapi/target/server-openapi-0.0.1-SNAPSHOT.jar

And then:

wget http://localhost:8080/api/openapi.yaml

Awesome, the half of the job is done, literally. Now, let us allow Swagger Codegen to take a lead. In order to not complicate the matter, let's assume that consumer is also Java application, so we could understand the mechanics without any difficulties (but Java is only one of the options, the list of supported languages and frameworks is astonishing).

Along this post we are going to use OpenFeign, one of the most advanced Java HTTP client binders. Not only it is exceptionally simple to use, it offers quite a few integrations we are going to benefit from soon.

<dependency>
    <groupId>io.github.openfeign</groupId>
    <artifactId>feign-core</artifactId>
    <version>9.7.0</version>
</dependency>

<dependency>
    <groupId>io.github.openfeign</groupId>
    <artifactId>feign-jackson</artifactId>
    <version>9.7.0</version>
</dependency>

The Swagger Codegen could be run as stand-alone application from command-line, or Apache Maven plugin (the latter is what we are going to use).

<plugin>
    <groupId>io.swagger</groupId>
    <artifactId>swagger-codegen-maven-plugin</artifactId>
    <version>3.0.0-rc1</version>
    <executions>
        <execution>
            <goals>
                <goal>generate</goal>
            </goals>
            <configuration>
                <inputSpec>/contract/openapi.yaml</inputSpec>
                <apiPackage>com.example.people.api</apiPackage>
                <language>java</language>
                <library>feign</library>
                <modelPackage>com.example.people.model</modelPackage>
                <generateApiDocumentation>false</generateApiDocumentation>
                <generateSupportingFiles>false</generateSupportingFiles>
                <generateApiTests>false</generateApiTests>
                <generateApiDocs>false</generateApiDocs>
                <addCompileSourceRoot>true</addCompileSourceRoot>
                <configOptions>
                    <sourceFolder>/</sourceFolder>
                    <java8>true</java8>
                    <dateLibrary>java8</dateLibrary>
                    <useTags>true</useTags>
                </configOptions>
            </configuration>
        </execution>
    </executions>
</plugin>

If some of the options are not very clear, the Swagger Codegen has pretty good documentation to look for clarifications. The important ones to pay attention to is language and library, which are set to java and feign respectively. The one thing to note though, the support of the OpenAPI v3.0 specification is mostly complete but you may encounter some issues nonetheless (as you noticed, the version is 3.0.0-rc1).

What you will get when your build finishes is the plain old Java interface, PeopleApi, annotated with OpenFeign annotations, which is direct projection of the people management web API specification (which comes from /contract/openapi.yaml). Please notice that all model classes are generated as well.

@javax.annotation.Generated(
    value = "io.swagger.codegen.languages.java.JavaClientCodegen",
    date = "2018-06-17T14:04:23.031Z[Etc/UTC]"
)
public interface PeopleApi extends ApiClient.Api {
    @RequestLine("POST /api/people")
    @Headers({"Content-Type: application/json", "Accept: application/json"})
    Person addPerson(Person body);

    @RequestLine("DELETE /api/people/{email}")
    @Headers({"Content-Type: application/json"})
    void deletePerson(@Param("email") String email);

    @RequestLine("GET /api/people/{email}")
    @Headers({"Accept: application/json"})
    Person findPerson(@Param("email") String email);

    @RequestLine("GET /api/people")
    @Headers({"Accept: application/json"})
    List<Person> getPeople();
}

Let us compare it with Swagger UI interpretation of the same specification, available at http://localhost:8080/api/api-docs?url=/api/openapi.json:

It looks right at the first glance but we have better ensure things work out as expected. Once we have OpenFeign-annotated interface, it could be made functional (in this case, implemented through proxies) using the family of Feign builders, for example:

final PeopleApi api = Feign
    .builder()
    .client(new OkHttpClient())
    .encoder(new JacksonEncoder())
    .decoder(new JacksonDecoder())
    .logLevel(Logger.Level.HEADERS)
    .options(new Request.Options(1000, 2000))
    .target(PeopleApi.class, "http://localhost:8080/");

Great, fluent builder style rocks. Assuming our people management web APIs server is up and running (by default, it is going to be available at http://localhost:8080/):

java -jar server-openapi/target/server-openapi-0.0.1-SNAPSHOT.jar

We could communicate with it by calling freshly built PeopleApi instance methods, as in the code snippet below.:

final Person person = api.addPerson(
        new Person()
            .email("[email protected]")
            .firstName("John")
            .lastName("Smith"));

It is really cool, if we rewind it back a bit, we actually did nothing. Everything is given to us for free with only web API specification available! But let us not stop here and remind ourselves that using Java interfaces will not eliminate the reality that we are dealing with remote systems. And things are going to fail here, sooner or later, no doubts.

Not so long ago we have learned about circuit breakers and how useful they are when properly applied in the context of distributed systems. It would be really awesome to somehow introduce this feature into our OpenFeign-based client. Please welcome another member of the family, HystrixFeign builder, the seamless integration with Hytrix library:

final PeopleApi api = HystrixFeign
    .builder()
    .client(new OkHttpClient())
    .encoder(new JacksonEncoder())
    .decoder(new JacksonDecoder())
    .logLevel(Logger.Level.HEADERS)
    .options(new Request.Options(1000, 2000))
    .target(PeopleApi.class, "http://localhost:8080/");

The only thing we need to do is just to add these two dependencies (strictly speaking hystrix-core is not really needed if you do not mind to stay on older version) to consumer's pom.xml file.

<dependency>
    <groupId>io.github.openfeign</groupId>
    <artifactId>feign-hystrix</artifactId>
    <version>9.7.0</version>
</dependency>

<dependency>
    <groupId>com.netflix.hystrix</groupId>
    <artifactId>hystrix-core</artifactId>
    <version>1.5.12</version>
</dependency>

Arguably, this is one of the best examples of how easy and straightforward integration could be. But even that is not the end of the story. Observability in the distributed systems is as important as never and as we have learned a while ago, distributed tracing is tremendously useful in helping us out here. And again, OpenFeign has support for it right out of the box, let us take a look.

OpenFeign fully integrates with OpenTracing-compatible tracer. The Jaeger tracer is one of those, which among other things has really nice web UI front-end to explore traces and dependencies. Let us run it first, luckily it is fully Docker-ized.

docker run -d -e \
    COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
    -p 5775:5775/udp \
    -p 6831:6831/udp \
    -p 6832:6832/udp \
    -p 5778:5778 \
    -p 16686:16686 \
    -p 14268:14268 \
    -p 9411:9411 \
    jaegertracing/all-in-one:latest

A couple of additional dependencies have to be introduced in order to OpenFeign client to be aware of the OpenTracing capabilities.

<dependency>
    <groupId>io.github.openfeign.opentracing</groupId>
    <artifactId>feign-opentracing</artifactId>
    <version>0.1.0</version>
</dependency>

<dependency>
    <groupId>io.jaegertracing</groupId>
    <artifactId>jaeger-core</artifactId>
    <version>0.29.0</version>
</dependency>

From the Feign builder side, the only change (besides the introduction of the tracer instance) is to wrap up the client into TracingClient, like the snippet below demonstrates:

final Tracer tracer = new Configuration("consumer-openapi")
    .withSampler(
        new SamplerConfiguration()
            .withType(ConstSampler.TYPE)
            .withParam(new Float(1.0f)))
    .withReporter(
        new ReporterConfiguration()
            .withSender(
                new SenderConfiguration()
                    .withEndpoint("http://localhost:14268/api/traces")))
    .getTracer();
            
final PeopleApi api = Feign
    .builder()
    .client(new TracingClient(new OkHttpClient(), tracer))
    .encoder(new JacksonEncoder())
    .decoder(new JacksonDecoder())
    .logLevel(Logger.Level.HEADERS)
    .options(new Request.Options(1000, 2000))
    .target(PeopleApi.class, "http://localhost:8080/");

On the server-side we also need to integrate with OpenTracing as well. The Apache CXF has first-class support for it, bundled into cxf-integration-tracing-opentracing module. Let us include it as the dependency, this time to server's pom.xml.

<dependency>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-integration-tracing-opentracing</artifactId>
    <version>3.2.4</version>
</dependency>

Depending on the way your configure your applications, there should be an instance of the tracer available which should be passed later on to the OpenTracingFeature, for example.

// Create tracer
final Tracer tracer = new Configuration(
        "server-openapi", 
        new SamplerConfiguration(ConstSampler.TYPE, 1),
        new ReporterConfiguration(new HttpSender("http://localhost:14268/api/traces"))
    ).getTracer();

// Include OpenTracingFeature feature
final JAXRSServerFactoryBean factory = new JAXRSServerFactoryBean();
factory.setProvider(new OpenTracingFeature(tracer()));
...
factory.create()

From now on, the invocation of the any people management API endpoint through generated OpenFeign client will be fully traceable in the Jaeger web UI, available at http://localhost:16686/search (assuming your Docker host is localhost).

Our scenario is pretty simple but imagine the real applications where dozen of external service calls could happen while the single request travels through the system. Without distributed tracing in place, every issue has a chance to turn into a mystery.

As a side note, if you look closer to the trace from the picture, you may notice that server and consumer use different versions of the Jaeger API. This is not a mistake since the latest released version of Apache CXF is using older OpenTracing API version (and as such, older Jaeger client API) but it does not prevent things to work as expected.

With that, it is time to wrap up. Hopefully, the benefits of contract-based (or even better, contract-first) development in the world of RESTful web services and APIs become more and more apparent: generation of the smart clients, consumer-driven contract test, discoverability and rich documentation are just a few to mention. Please make use of it!

The complete project sources are available on Github.

Monday, June 20, 2016

When things may get out of control: circuit breakers in practice. Hystrix.

It is amazing how tightly interconnected modern software systems are. Mostly every simple application has dependency on some external service or component, not to mention emerging at a great pace Internet of Things (or simply IoT) movement. It is good and not so at the same time, let us see why ...

There are many use cases when relying on other services, provided by someone externally or internally, makes a lot of sense (messaging, billing, taxes, payments, analytics, logistics, ...) but under the hood every such integration poses risks to our applications: they become dependent on availability and operationability of those services. Network latency, spikes of load, just banal software defects, each of these unknowns can bring our applications on its knees, making users and partners dissatisfied, to say it mildly.

The good news are there is a pattern we can employ to mitigate the risks: circuit breaker. Firstly explained in great details in the Release It! book by Michael T. Nygard, circuit breakers became the de-facto solution for dealing with external services. The idea is pretty simple: track the state of the external service on a given time interval to collect the knowledge about its availability. If the failure is being detected, circuit breaker opens, signalling that external service should better not be invoked for some time.

There are plenty of circuit breaker implementations available but because we are on JVM, we are going to talk about three of those: Netflix Hystrix, Akka and Apache Zest. To keep the posts considerably short, the topic of our discussion is going to be split in two parts: Netflix Hystrix followed by Akka and Apache Zest.

To show off circuit breakers in action, we are going to build a simple client around https://freegeoip.net/: public HTTP web API for software developers to search the geolocation of IP addresses. The client will return just brief geo-details about particular IP or hostname, wrapped into GeoIpDetails class:

@JsonIgnoreProperties(ignoreUnknown = true)
public final class GeoIpDetails {
    private String ip;
    @JsonProperty("country_code") private String countryCode;
    @JsonProperty("country_name") private String countryName;
    private double latitude;
    private double longitude;
}
So let us get started ...

Undoubtedly, Netflix Hystrix is the most advanced and thoroughly battle-tested circuit breaker implementation at the disposal of Java developers. It is built from the ground up to support asynchronous programming paradigm (heavily utilizing RxJava for that) and to have a very low overhead. It is more than just circuit breaker, it is full-fledged library to tolerate latency and failures in distributed systems, but we will touch upon basic Netflix Hystrix concepts only.

Netflix Hystrix has surprisingly simple design and is built on top of command pattern, with HystrixCommand in its core. Commands are identified by keys and are organized in groups. Before we are going to implement our own command, it is worth to talk about how Hystrix isolates the external service integrations.

Essentially, there are two basic strategies which Hystrix supports: offload the work somewhere else (using dedicated thread pool) or do the work in the current thread (relying on semaphores). Using dedicated thread pools, also known as the bulkhead pattern, is the right strategy to use in most use cases: the calling thread is unblocked, plus the timeout expectations could be set as well. With semaphores, the current thread are going to be busy till the work is completed, successfully or not (timeouts are claimed to be also supported since 1.4.x release branch but there are certain side effects).

Enough theory for now, let us jump into the code by creating our own Hystrix command class to access https://freegeoip.net/ using Apache HttpClient library:

public class GeoIpHystrixCommand extends HystrixCommand<String> {
    // Template: http://freegeoip.net/{format}/{host}
    private static final String URL = "http://freegeoip.net/";
    private final String host;
 
    public GeoIpHystrixCommand(final String host) {
        super(
            Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("GeoIp"))
                .andCommandKey(HystrixCommandKey.Factory.asKey("GetDetails"))
                .andCommandPropertiesDefaults(
                    HystrixCommandProperties.Setter()
                        .withExecutionTimeoutInMilliseconds(5000)
                        .withMetricsHealthSnapshotIntervalInMilliseconds(1000)
                        .withMetricsRollingStatisticalWindowInMilliseconds(20000)
                        .withCircuitBreakerSleepWindowInMilliseconds(10000)
                    )
                .andThreadPoolKey(HystrixThreadPoolKey.Factory.asKey("GeoIp"))
                .andThreadPoolPropertiesDefaults(
                    HystrixThreadPoolProperties.Setter()
                        .withCoreSize(4)
                        .withMaxQueueSize(10)
                )
        );
        this.host = host;
    }
 
    @Override
    protected String run() throws Exception {
        return Request
            .Get(new URIBuilder(URL).setPath("/json/" + host).build())
            .connectTimeout(1000)
            .socketTimeout(3000)
            .execute()
            .returnContent()
            .asString();
    }
}

The first thing to get from this snippet is that Hystrix commands have a myriad of different properties which are initialized in the constructor. Command group and key, set to "GeoIp" and "GetDetails" respectively, we have already mentioned. Thread pool key, set to "GeoIp", and thread pool properties (for example, core pool size and maximum queue size) allow to tune thread pool configuration, the default execution isolation strategy used by Hystrix. Please notice that multiple commands may refer to the same thread pool (shedding the load for example), but semaphores are not shared.

Other GeoIpHystrixCommand command properties, arguably most important ones, would need some explanation:

  • executionTimeoutInMilliseconds sets the hard limit on overall command execution before timing out
  • metricsHealthSnapshotIntervalInMilliseconds indicates how often the status of the underlying circuit breaker should be recalculated
  • metricsRollingStatisticalWindowInMilliseconds defines the duration of rolling window to keep the metrics for the circuit breaker
  • circuitBreakerSleepWindowInMilliseconds sets the amount of time to reject requests for opened circuit breaker before trying again

It is worth to mention that Hystrix has sensible default value for every property so you are not obliged to provide them. However, those defaults are quite aggressive (in a very good sense) so you may need to relax some. Hystrix has a terrific documentation which talks about all the properties (and their default values) in details.

Another option which Hystrix incorporates is fallback in case the command execution was not successful, timed out or circuit breaker is tripped. Although fallback is optional, it is very good idea to have one, in case of https://freegeoip.net/ we may just return an empty response.

    @Override
    protected String getFallback() {
        return "{}"; /* empty response */
    }

Great, we have our command, and now what? There are multiple ways Hystrix command could be invoked. The most straightforward one is just synchronous execution using execute() method, for example:

public class GeoIpService {
    private final ObjectMapper mapper = new ObjectMapper();
 
    public GeoIpDetails getDetails(final String host) throws IOException {
        return mapper.readValue(new GeoIpHystrixCommand(host).execute(), 
            GeoIpDetails.class);
    }
}

In case of asynchronous execution, Hystrix has a couple of options, ranging from bare Java's Future to RxJava's Observable, for example:

public Observable<GeoIpDetails> getDetailsObservable(final String host) {
    return new GeoIpHystrixCommand(host)
        .observe()
        .map(result -> {
             try {
                 return mapper.readValue(result, GeoIpDetails.class);
              } catch(final IOException ex) {
                  throw new RuntimeException(ex);
              }
        });
}

The complete sources of the project example is available on Github.

If your project is built on top of very popular Spring Framework, there is a terrific out-of-the box Hystrix support using convenient (auto)configuration and annotations. Let us take a quick look on the same command implementation using Spring Cloud Netflix project (certainly, along with Spring Boot):

@Component
public class GeoIpClient {
    @Autowired private RestTemplate restTemplate;

    @HystrixCommand(
        groupKey = "GeoIp",
        commandKey = "GetDetails",
        fallbackMethod = "getFallback",
        threadPoolKey = "GeoIp",  
        commandProperties = {
            @HystrixProperty(
                name = "execution.isolation.thread.timeoutInMilliseconds", 
                value = "5000"
            ),
            @HystrixProperty(
                name = "metrics.healthSnapshot.intervalInMilliseconds", 
                value = "1000"
            ),
            @HystrixProperty(
                name = "metrics.rollingStats.timeInMilliseconds", 
                value = "20000"
            ),
            @HystrixProperty(
                name = "circuitBreaker.sleepWindowInMilliseconds", 
                value = "10000"
            )
        },
        threadPoolProperties = {
            @HystrixProperty(name = "coreSize", value = "4"),
            @HystrixProperty(name = "maxQueueSize", value = "10")
        }
    )
    public GeoIpDetails getDetails(final String host) {
        return restTemplate.getForObject(
            UriComponentsBuilder
                .fromHttpUrl("http://freegeoip.net/{format}/{host}")
                .buildAndExpand("json", host)
                .toUri(), 
            GeoIpDetails.class);
    }
 
    public GeoIpDetails getFallback(final String host) {
        return new GeoIpDetails();
    }
}

In this case the presence of Hystrix command is really hidden so the client just dials with a plain, injectable Spring bean, annotated with @HystrixCommand and instrumented using @EnableCircuitBreaker annotation.

And last, but not least, there are quite a few additional contributions for Hystrix, available as part of the Hystrix Contrib project. The one we are going to talk about first is hystrix-servo-metrics-publisher which exposes a lot of very useful metrics over JMX. It is essentially a plugin which should be explicitly registered with Hystrix, for example here is one of the ways to do that:

HystrixPlugins
    .getInstance()
    .registerMetricsPublisher(HystrixServoMetricsPublisher.getInstance());

When our application is up and running, here is how it looks like in JVisualVM (please notice that the com.netflix.servo MBean is going to appear only after the first Hystrix command execution or instrumented method invocation so you may not see it immediately on application start):

When talking about Hystrix, it is impossible not to mention Hystrix Dashboard: terrific web UI to monitor Hystrix metrics in real time.

Thanks again to Spring Cloud Netflix, it is very easy to integrate it into your applications using just @EnableHystrixDashboard annotation and another project from Hystrix Contrib portfolio, hystrix-metrics-event-stream which exposes Hystrix metrics over event stream. The complete version of the Spring-based project example is available on Github.

Hopefully at this point you would agree that, essentially, every integration with external services (which are most of the time just a black boxes) introduces instability into our applications and may cause cascading failures and serious outages. With this regards, Netflix Hystrix could be a life saver, worth adopting.

In the next part we are going to look at another circuit breaker implementations, namely the one available as part of Akka toolkit and Apache Zest.

All projects are available under Github repository.