Lead Image by Ricardo Gomez Angel on Unsplash.com

Service mesh for Kubernetes microservices

Mesh Design

Article from ADMIN 50/2019
Enable free service mesh functionality on your Kubernetes microservice apps with Istio.

The ease with which microservices can be deployed, upgraded, and scaled makes them a compelling way to build an application, but as a microservice application grows in complexity and scale, so does the demand on the underlying network that is its lifeblood. A service mesh offers a straightforward way to implement a number of common and useful microservice patterns with no development effort, as well as providing advanced routing and telemetry functions for your microservice application straight out of the box.

Service Mesh Concepts

Unlike a monolithic application, in which the network is only a matter of concern at a limited number of ingress and egress points, a microservices application depends on IP-based networking for all its internal communications, and the more the application scales, the less "ideal" the behavior of that internal network becomes. As interactions between services become more complex, more voluminous, and more difficult to visualize, it's no longer safe for microservice app developers to assume that those interactions will happen transparently and reliably. It can become difficult to see whether a bug in the application is caused by one of the microservices (and if so, which one) or by the configuration of the network.

One way to mitigate non-ideal behavior in the microservice network is to build allowances for it into the services themselves. Developers can carefully calculate how long request timeouts should be, allowing for dependent requests cascading down the chain. Another approach is for service developers to supply a client library that dependent services use to interact with their API in a resilient manner. Both approaches have serious drawbacks. The first duplicates effort: each team spends resources solving essentially the same problem. The second runs counter to the point of microservices: if the client library changes, every service that uses it has to be upgraded.

Enter the service mesh. In the domain of service-oriented architectures, including microservices, the meaning of this term has been elevated from its uninspiring literal interpretation – that of a fully interconnected collection of services – to a specialized operating environment in which individual microservices can be written and deployed as though the underlying network really were ideal and in which interservice communications can be managed and monitored in great detail. Lee Calcote, in his book on microservices [1], refers to this concept as "decoupling at Layer 5," the implication being that a service mesh frees the developer of any concerns below the Session Layer of the OSI model. The "decoupling" function of the service mesh can be referred to as its data plane. The data plane of a service mesh can be used to implement a number of common and useful microservice patterns, such as:

  • Authentication: Can service A be sure that it is actually sending a request to service B, and not to an impostor?
  • Authorization or role-based access control: Is service A allowed to send a particular request type to service B, and is service B allowed to respond to that request?
  • Encryption of interservice traffic, to prevent man-in-the-middle attacks and allow the application to be deployed on a zero-trust network.
  • Circuit breaking, which prevents a request timeout (caused by a failed service or pod) from cascading back up the request chain and creating a catastrophic failure of the application.
  • Service discovery and load balancing.

Clearly, having all of these requirements taken care of by a freely downloadable service mesh is a far more appealing idea than having to implement and maintain them natively in each individual microservice.
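
To make one of these patterns concrete, the following sketch shows how circuit breaking might be declared in Istio (one of the meshes introduced below) rather than coded into each service. This is a hedged illustration, not a recipe: the service name reviews and the threshold values are assumptions, not taken from a real deployment.

kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-circuit-breaker
spec:
  host: reviews                # the Kubernetes service to protect
  trafficPolicy:
    outlierDetection:
      consecutiveErrors: 5     # errors before a pod is ejected
      interval: 10s            # how often pods are scanned
      baseEjectionTime: 30s    # minimum ejection duration
EOF

A pod that trips the breaker is removed from the load-balancing pool and readmitted automatically after the ejection time, all without any change to application code.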

If the service mesh's data plane is equipped with some sort of out-of-band communication abilities (the ability to receive control commands and to send logging and status information), then a control plane can be added, affording the opportunity to implement a number of high-level management and monitoring functions for the application using the mesh. These typically include:

  • Traffic management – routing user requests to services according to criteria such as user identity, request URL, weightings (for A/B and canary testing; see the sketch after this list), quota enforcement, and fault injection.
  • Configuration – of all the data plane functions mentioned earlier.
  • Tracing, monitoring, and telemetry – aggregating logs, status information, and request metrics from every service into a single place makes it easy for the service mesh control plane to find out anything, from the HTTP headers of a single given request to how the application's response time is affected by a given increase in usage rates.
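
As an illustration of weighted routing, the hedged sketch below splits traffic 90/10 between two versions of a service, as you might for a canary test. The host and subset names are assumptions, and the subsets must be defined in a matching DestinationRule.

kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews-canary
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1             # current version
      weight: 90
    - destination:
        host: reviews
        subset: v2             # canary version
      weight: 10
EOF

Shifting the weights gradually toward v2 completes the rollout without redeploying either version.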

In this article, I'll describe some service mesh architectures, provide a brief overview of popular sidecar-based service meshes that are currently available, and demonstrate Istio, which is a full-featured service mesh that can be easily deployed on a Kubernetes cluster.

Common Service Mesh Architectures

The first option is a library built into the microservice code itself. Not to be confused with the service-specific client libraries mentioned above, libraries such as Hystrix and Ribbon, developed by Netflix, can be linked into each microservice and provide some of the data plane benefits described above. Hystrix provides resilience against network latency and circuit-breaking behavior with the aim of preventing cascading failure in a microservice application; Ribbon provides client-side load balancing.

The second option is a node agent that runs on each hardware node. The application's pods within a node communicate with each other as usual, but internode communication, which has to use a "real" network, is routed through an agent that provides the data plane benefits described above. Because a node agent has to be tightly coupled with the platform of the node itself, node agents are usually specific to the particular cloud vendor or hardware platform that is hosting the application (Figure 1).

Figure 1: Cluster network showing a node agent proxying each node's pod traffic.

The third option, a sidecar, is an additional container inserted into each pod that proxies all the network traffic flowing into and out of that pod by applying new iptables rules inside the pod that redirect the traffic to and from the application's container. The sidecar provides all the data plane functions, and this is currently the preferred architecture for implementing a full-featured service mesh: because the entire mesh is deployed in containers, it's easy to make a single mesh work on a wide range of platforms and cloud providers.

Microservices are unaware of the sidecar cohabiting their pod, and because each pod has its own sidecar, the granularity of control and observability is much higher than can be achieved with the node agent model. Figure 2 shows pods containing sidecars alongside the application container. Sidecars are typically written in native code to run as fast as possible, minimizing any extra network latency, and are the subject of the remainder of this article.

Figure 2: Sidecar service mesh architecture showing in-band application traffic (blue) and out-of-band control traffic (green).
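
How does the sidecar get into the pod in the first place? Taking Istio as the example, there are two documented routes: automatic injection, in which a mutating admission webhook adds the proxy container to every pod created in a labeled namespace, and manual injection with the istioctl CLI. The namespace and filename below are assumptions for illustration.

# Automatic injection for all pods subsequently created in "default":
kubectl label namespace default istio-injection=enabled

# Manual alternative: rewrite a manifest to include the sidecar:
istioctl kube-inject -f my-app.yaml | kubectl apply -f -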

Sidecar Service Mesh Implementations

The most common open source sidecar-based service meshes currently available include the following:

Linkerd 1.x [2] was created by Buoyant and is now an open source project of the Cloud Native Computing Foundation. Notable users include Salesforce, PayPal, and Expedia. Linkerd 1.x is a data plane-only service mesh that supports a number of environments, including Kubernetes, Mesos, Consul, and ZooKeeper. Consider choosing this if you only want data plane functions and you aren't using Kubernetes.

Linkerd 2.x [3], formerly Conduit, is a faster, Kubernetes-only alternative to Linkerd 1.x. This complete service mesh features a native code sidecar proxy written in Rust and a control plane with a web user interface and command-line interface (CLI). Its emphasis is on light weight, ease of deployment, and dedication to Kubernetes.

Istio [4] is a complete (control and data plane) service mesh. Although it uses the Envoy proxy (written in C++) as its default sidecar, it can optionally use Linkerd as its data plane. Version 1.1.0 was released in March 2019, bringing significant reductions in CPU usage and latency over Istio 1.0 and incremental improvements to all the main feature groups. Istio is marketed as platform independent (examples include Kubernetes, GCP, Consul, or services running directly on virtual or physical servers). You might choose Istio if you want a comprehensive set of control plane features that you can extend by means of a plugin API. Istio presents its feature set as traffic management, security, telemetry, and policy enforcement.

In this article, I focus on Istio's core functions and components and deploy them in a sample microservice application: a WordPress website on a Kubernetes cluster. In case you were wondering, istio is the Greek word for "sail," which places it pretty firmly within the Kubernetes (helmsman), Helm, and Tiller nomenclature. Despite being platform independent, its configuration APIs and container-based nature make it a natural fit with Kubernetes, and as you'll see below, you can deploy Istio to and remove it from your Kubernetes cluster quickly, with minimal effect on your application.
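
As a minimal sketch of that deploy-and-remove cycle, assuming the Istio 1.1 release archive has been unpacked into the current directory and that the bundled demo profile is acceptable, the whole control plane can be installed and torn down with a handful of commands; application pods only gain or lose their sidecars when they are next recreated.

# Install Istio's custom resource definitions, then the demo profile:
for f in install/kubernetes/helm/istio-init/files/crd*.yaml; do
  kubectl apply -f "$f"
done
kubectl apply -f install/kubernetes/istio-demo.yaml

# Removal is the same manifest in reverse:
kubectl delete -f install/kubernetes/istio-demo.yaml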
