16-Aug-17 (Created: 16-Aug-17)

Hystrix: Netflix API stack at Netflix OSS

Hystrix is one of the API-related tools that Netflix released as part of Netflix OSS. It lets you compose auto-scaled, container-based services in a fault-tolerant way using RxJava. Along with RxJava, Eureka, Archaius, Ribbon, Zuul, Turbine, Karyon, and Governator, it provides a PaaS-like environment for all of Netflix's API and website needs. You will find here my research and a bit of a summary.

A simple proposition

If I have a database, writing an API against that database should be no more complicated than writing a select statement. This has been done more than once, quite successfully. Container-managed stored procedures are one example. An iPaaS like Dell Boomi is another. More recently the IoT platform NodeRed comes to mind. I am sure there are countless more examples, and I am sure there are a few drawbacks in each of these solutions.

Then there is bending the world in this pursuit. Someone mentioned Hystrix. I got curious and looked around. Here is a summary of this newfangled landscape. Let me start with the ecosystem and the related names I have run into:

Hystrix: Fronts calls to APIs to provide parallelism, scale, etc

RxJava: Programming model used by Hystrix

Eureka: Service discovery to tell Hystrix where the API is in a changing environment

Archaius: Distributed Configuration management

Ribbon: How all these components talk to each other over the network (IPC)

Zuul: All requests, web and API, enter here. Monitoring, security, routing

Turbine: Realtime metrics

Karyon: Template for the body of an API construction

Governator: Java utilities that make up Karyon

Google Guice: Java utilities that are the base for Governator

Kubernetes: A competing technology for service discovery

OpenShift: Along with Kubernetes, competes with much of this stack

Why the madness?

What am I missing? For what requirements are these tools necessary? Let me quickly go over the conceptual problem space of APIs. The following facilities are desirable, depending on the level of need:

1. They must run in auto-scaled environments. This means they cannot keep any local state.

2. They must be able to read their runtime configuration from a distributed configuration service

3. The clients must be able to discover where the services are currently running

4. They must be able to compose other services either in parallel or in a reactive manner

5. Incoming calls should be cut off if the service takes too long

6. It must be possible to apply security externally to the service

7. They must be monitored for real-time analysis so that traffic can be rerouted

8. Documented

9. Mocked

10. Discoverable

11. Isolation for debugging

How does the Netflix OSS API stack cover this landscape?

Many of these needs boil down to scaling the back ends of websites. Yes, that gets you into APIs as well. Let me regurgitate conceptually what these technologies are trying to accomplish.

When you write an API you want to write it in such a way that it is cloud ready. This means the code can be deployed and undeployed at a moment's notice on dynamically changing servers (containers). So any client that calls these services cannot be sure where they live, and the services cannot rely on local state. They need to register themselves when they come up so that they can be discovered. (This is done through Eureka, a distributed registry for services. In the case of OpenShift, Kubernetes does this through infrastructure tooling.) The services also need to read configuration values at run time. This is done through Archaius. (Some competing technologies are memcached, Redis, etcd.)
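The register-then-discover cycle can be sketched with an in-memory map. This is only a toy illustration under my own assumptions, not Eureka's API: the names ServiceRegistry, register, deregister, and lookup are hypothetical, and a real registry would add heartbeats, leases, and replication across nodes. It needs Java 10 or later.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Toy in-memory registry (hypothetical names; not the Eureka API).
final class ServiceRegistry {
    // service name -> addresses of the instances that are currently up
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();

    // Called by a service instance when it starts.
    void register(String service, String address) {
        instances.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    // Called when an instance shuts down or is scaled away.
    void deregister(String service, String address) {
        List<String> list = instances.get(service);
        if (list != null) {
            list.remove(address);
        }
    }

    // Called by a client to find where the service lives right now.
    List<String> lookup(String service) {
        return List.copyOf(instances.getOrDefault(service, List.of()));
    }

    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        registry.register("orders", "10.0.0.5:8080");
        registry.register("orders", "10.0.0.6:8080");
        registry.deregister("orders", "10.0.0.5:8080"); // instance scaled away
        System.out.println(registry.lookup("orders"));  // prints [10.0.0.6:8080]
    }
}
```

The point of the sketch is that the client never hard-codes an address; it asks the registry each time, so instances can come and go underneath it.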

As every service needs to do these things, they can use some common libraries. RxJava, Karyon, Governator, and Google Guice are all utilities available for writing such cloud-ready services. Competing technologies include Spring Cloud.

Also, the Netflix stack is very JVM centric, so it is especially suited to Java, Scala, and Groovy. On the other hand, similar features could be expected for generic polyglot REST services through OpenShift.

Once such an API is constructed, a client may discover and call it, but the client may choose to a) call multiple APIs in parallel, b) disconnect if a call is taking too long, or c) provide an alternative implementation if the API is unavailable, etc. These aspects are handled by Hystrix: when a client calls an API through Hystrix, all of these concerns are handled without the client knowing about it. One can imagine Hystrix to be similar to an API manager, except that the scripting necessary for each API is custom.
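To make the three concerns (parallel calls, timeouts, fallbacks) concrete, here is a minimal sketch using only the JDK's CompletableFuture. It is not the Hystrix API; it is my own approximation of the idea, and the names ResilientCalls, guarded, and slowRecommendations are hypothetical. It assumes Java 9 or later for orTimeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Stdlib-only sketch of the three client concerns; not the Hystrix API.
final class ResilientCalls {
    // Run the call asynchronously, give up waiting after timeoutMillis,
    // and substitute the fallback on timeout or failure.
    // Note: the underlying task is abandoned, not interrupted.
    static <T> CompletableFuture<T> guarded(Supplier<T> call, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(call)
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                .exceptionally(error -> fallback);
    }

    // Hypothetical slow backend, used only for the demo.
    static String slowRecommendations() {
        try {
            Thread.sleep(2_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "personalized recommendations";
    }

    public static void main(String[] args) {
        // a) both calls start in parallel
        CompletableFuture<String> profile =
                guarded(() -> "profile for user 42", 500, "anonymous profile");
        // b) and c) the slow call is cut off after 200 ms and replaced by the fallback
        CompletableFuture<String> recs =
                guarded(ResilientCalls::slowRecommendations, 200, "top 10 titles");

        String page = profile.thenCombine(recs, (p, r) -> p + " | " + r).join();
        System.out.println(page); // prints profile for user 42 | top 10 titles
    }
}
```

The caller only sees a value arriving within the time budget; whether it came from the live service or from the fallback is hidden, which is the "without the client knowing about it" part.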

Once the APIs are "hystrified", requests can come in. But you want someone to monitor these incoming requests, apply security, route them to servers based on load, collect stats, etc. This is done through Zuul, which itself uses Hystrix to do its job.
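Conceptually, such a front door is a chain of filters followed by a routing decision. The sketch below is a toy under my own assumptions and not the Zuul API: EdgeGateway, handle, and the token check are hypothetical, the load figures are made up, and monitoring and stats collection are left out. It needs Java 9 or later.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;

// Toy edge gateway (hypothetical names; not the Zuul API).
final class EdgeGateway {
    // Each filter sees (path, token) and returns a rejection reason, or empty to pass.
    private final List<BiFunction<String, String, Optional<String>>> filters;
    // server address -> number of in-flight requests
    private final Map<String, Integer> load;

    EdgeGateway(List<BiFunction<String, String, Optional<String>>> filters,
                Map<String, Integer> load) {
        this.filters = filters;
        this.load = load;
    }

    String handle(String path, String token) {
        // Security and similar checks are applied outside the service itself.
        for (BiFunction<String, String, Optional<String>> filter : filters) {
            Optional<String> rejection = filter.apply(path, token);
            if (rejection.isPresent()) {
                return "rejected: " + rejection.get();
            }
        }
        // Route to the server with the fewest in-flight requests.
        return load.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(e -> "routed " + path + " to " + e.getKey())
                .orElse("rejected: no server available");
    }

    public static void main(String[] args) {
        BiFunction<String, String, Optional<String>> requireToken =
                (path, token) -> token == null
                        ? Optional.of("missing token")
                        : Optional.<String>empty();
        EdgeGateway gateway = new EdgeGateway(
                List.of(requireToken),
                Map.of("server-a", 7, "server-b", 2));

        System.out.println(gateway.handle("/titles", null));     // prints rejected: missing token
        System.out.println(gateway.handle("/titles", "secret")); // prints routed /titles to server-b
    }
}
```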

You also don't want to log the traffic patterns in Hadoop and analyze them later. You want to do this in real time and alter the traffic. This is done through Turbine.

Should API writing have to be this complicated?

No. A developer should be able to write very straightforward Java code that manipulates other APIs or data. The underlying framework or platform should do the rest. This complexity need not be, and should not be, exposed to productive developers. It unnecessarily tells graduates how complicated programming is. Hopefully clever developers can wrap these tools in their environments and provide that deep abstraction to end developers whom you can hire from most colleges.

Implications to containers like OpenShift

We have got to see whether there are answers to Zuul in OpenShift. Or will Zuul coexist with OpenShift? Is there an opportunity for Zuul to take over frontend security in most companies? Can Hystrix be adapted for OpenShift? Or is there something in OpenShift that already does what Hystrix does? Do we need Hystrix for ALL our APIs?

Implications to Boomi/NodeRed

Imagine: writing an API in this manner requires one to know, again, lambda expressions, closures, RxJava, Google Guice, Archaius, Redis, and Spring Cloud. That is not simplicity! Do we need that for all cases? I don't know! I hope not. But I could be wrong.

Dell Boomi and NodeRed are showing some good ways to write APIs. These frameworks encompass much of the wiring behind the scenes and in many cases provide a great compromise. Something to think about.

Implications to API Managers

Some of these requirements are also being addressed by API managers like Apigee, Mashery, Mulesoft, etc. However, those concerns seem to be more around a) public APIs, b) metering, c) discovery catalogues, d) security, e) documentation, f) mocking, g) monitoring, etc. They specifically do not address scalability, circuit breakers, parallelism, etc. This is a smaller subset. But in highly scalable scenarios these end up being primary concerns in production.
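For readers who have not met the term, a circuit breaker stops calling a backend that keeps failing and answers with a fallback instead, then tries the backend again after a pause. The sketch below is a simplified, count-based version under my own assumptions; it is not how Hystrix implements it, and the class name, threshold, and timings are hypothetical.

```java
import java.util.function.Supplier;

// Minimal count-based circuit breaker (illustrative only; not Hystrix's implementation).
final class CircuitBreaker {
    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long openMillis;      // how long to stay open before a trial call
    private int consecutiveFailures = 0;
    private long openedAt = -1;         // -1 means the circuit is closed

    CircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Synchronized for simplicity; this serializes calls, which a real breaker would avoid.
    synchronized <T> T call(Supplier<T> action, T fallback) {
        boolean open = openedAt >= 0 && System.currentTimeMillis() - openedAt < openMillis;
        if (open) {
            return fallback; // short-circuit: do not touch the failing backend
        }
        try {
            T result = action.get();
            consecutiveFailures = 0; // success closes the circuit again
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = System.currentTimeMillis(); // open (or re-open) the circuit
            }
            return fallback;
        }
    }

    public static void main(String[] args) {
        CircuitBreaker breaker = new CircuitBreaker(2, 60_000);
        int[] backendCalls = {0};
        Supplier<String> failing = () -> {
            backendCalls[0]++;
            throw new IllegalStateException("backend down");
        };
        for (int i = 0; i < 5; i++) {
            System.out.println(breaker.call(failing, "cached response"));
        }
        // Only the first two attempts reached the backend; the rest were short-circuited.
        System.out.println("backend was called " + backendCalls[0] + " times"); // prints 2 times
    }
}
```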

Are there actionable items?

OK, I have read all this. I am struggling to translate it into actionable items. Here are some thoughts:

1. See if you have an existing container platform like OpenShift and see how much of this can be accommodated in OpenShift. You then already have service discovery, distributed configuration management, and auto-scaling

2. Find out whether our services will be simplified by doing so

3. How do we best integrate or accomplish Hystrix-like functionality on top of OpenShift?

4. Is there an equivalent to Zuul in OpenShift?

5. What will our simple, medium, and complex API coding templates look like? Can we teach the corresponding coding techniques to a broader audience with examples and templates?

6. Can you write a library which allows a developer the same level of speed to market where the developer can move "10 lines of code" as an API instantaneously?

7. Maybe someone will build a platform where these technologies are behind the scenes and exposes

For now

It is likely that someone will emerge to provide an iPaaS that takes these into consideration.

Short of that, senior architects in companies can strive for a few months to abstract these requirements into a framework on top of an existing container platform like OpenShift and provide that almost-serverless computing model, where any computational logic can be deployed as an API in practically no time and written by novices.