Enterprises of every size and in every industry are adopting microservices to accelerate innovation. Designing and building distributed systems has become the norm for many organizations rather than something limited to web-scale companies. While it is widely accepted that microservices architectures give development teams agility, they also introduce operational complexity. SREs and DevOps teams often struggle during the migration to microservices. When a monolith is broken into tens or even hundreds of microservices, how do these components discover and communicate with each other? Where should debugging begin? What is the service dependency graph? Who is calling whom? These are questions that SREs did not have to answer in the monolithic world. In a microservices world, where the architecture can look like a distributed service mess, they must have these answers and more.
From Service Mess to Service Mesh
Converting metrics, logs, and traces from an entire fleet of disparate microservices into a cohesive, manageable observability system that identifies, debugs, and resolves performance issues is the responsibility of DevOps or SRE teams. The sheer number of components and the variability in data formats make this an extremely complex, ever-changing challenge.
It’s no wonder that monitoring is cited as one of the biggest challenges in adopting distributed architectures, according to the latest survey of cloud-native practitioners conducted by the Cloud Native Computing Foundation.
What is a Service Mesh, and Why Should You Use One?
Fundamentally, a service mesh is a policy-driven proxy layer that channels all communication between microservices. Service meshes, such as Istio, incorporate a sidecar proxy with each instance of a microservice application. Each application communicates only with its local sidecar proxy, while the proxies communicate among themselves to form a mesh of services.
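The sidecar pattern described above can be illustrated with a small toy model in Python. This is a minimal sketch, not Istio or Envoy code: the `SidecarProxy` class, its `outbound` and `inbound` methods, and the shared `mesh` registry are all hypothetical names invented for illustration, and the service names are placeholders.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Toy model only: these classes are hypothetical and are not Istio or Envoy APIs.


@dataclass
class SidecarProxy:
    service: str
    handler: Callable[[str], str]        # the local application behind this proxy
    mesh: Dict[str, "SidecarProxy"]      # shared registry of every proxy in the mesh
    log: List[Tuple[str, str]] = field(default_factory=list)

    def outbound(self, destination: str, request: str) -> str:
        """Called by the local app; forwards the request to the destination's proxy."""
        self.log.append((self.service, destination))
        return self.mesh[destination].inbound(request)

    def inbound(self, request: str) -> str:
        """Called by a peer proxy; hands the request to the local app."""
        return self.handler(request)


mesh: Dict[str, SidecarProxy] = {}
mesh["reviews"] = SidecarProxy("reviews", lambda req: f"reviews for {req}", mesh)
mesh["productpage"] = SidecarProxy("productpage", lambda req: f"page: {req}", mesh)

# productpage never addresses reviews directly; it only talks to its own sidecar.
response = mesh["productpage"].outbound("reviews", "book-1")
```

Because every call passes through a proxy, the `log` on each sidecar already records who called whom, which is the property the rest of this post relies on.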
Following good design principles, service meshes are designed with loosely coupled components with separate functional responsibilities. What follows is a review of the architecture and components of the Istio service mesh and how it is logically split into a data plane and a control plane.
Istio Data Plane
The data plane is made up of a number of high-performance sidecar proxies. For Istio, Envoy Proxy is used. These sidecar proxies are responsible for intercepting network communications between microservices. Because Envoy proxies sit between every microservices interaction controlling both ingress and egress traffic going into and out of each service, they have complete visibility into the traffic and support various use cases such as Layer 3/4 filtering, packet inspection, header inspection and manipulation, access logging, rate limiting, statistics capture, and distributed tracing.
Envoy supports many of the protocols that modern distributed applications use: REST, gRPC, HTTP/1.1, HTTP/2, Kafka, Redis, and MongoDB, to name a few. Understanding the native protocol helps the proxies provide contextual insights, such as how many write operations were executed against MongoDB in a given time period.
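As a rough illustration of that kind of protocol-aware insight, the sketch below counts MongoDB write operations inside a time window. It assumes the proxy has already decoded traffic into simple `(timestamp, protocol, operation)` tuples; that event shape, the `WRITE_OPS` set, and the `count_mongo_writes` function are hypothetical and do not reflect Envoy's actual statistics output.

```python
from typing import Iterable, Tuple

# Hypothetical decoded events: (timestamp_seconds, protocol, operation).
Event = Tuple[float, str, str]

# Assumed classification of write operations for this sketch.
WRITE_OPS = {"insert", "update", "delete"}


def count_mongo_writes(events: Iterable[Event], start: float, end: float) -> int:
    """Count MongoDB write operations with start <= timestamp < end."""
    return sum(
        1
        for ts, protocol, op in events
        if protocol == "mongodb" and op in WRITE_OPS and start <= ts < end
    )


events = [
    (0.5, "mongodb", "insert"),
    (1.2, "mongodb", "find"),    # a read, so not counted
    (2.0, "redis", "set"),       # a different protocol, so not counted
    (2.7, "mongodb", "update"),
    (9.9, "mongodb", "delete"),  # outside the first five seconds
]
writes_in_first_five_seconds = count_mongo_writes(events, 0.0, 5.0)
```

A proxy that only saw opaque TCP bytes could not make the read/write distinction; decoding the protocol is what makes the count possible.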
Istio Control Plane
The control plane sets up the configuration while also controlling the dos and don’ts in the system such as access policies, route tables, traffic shaping, circuit breaker policies, quotas, etc.
Istio has three services and an API that form the control plane – Pilot provides service discovery and traffic management for Envoy sidecars, Mixer enforces access controls/usage policy and collects telemetry data, and Citadel provides TLS certificates to the proxies for authentication and identity management.
If we were to compare with an air transportation system, proxies, and hence the data plane, would be analogous to planes flying in the sky while the control plane would be like air traffic control keeping a check on when and where the planes should be flying.
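The division of labor between the two planes can also be sketched in a few lines of Python. This is a toy model under stated assumptions: `ControlPlane`, `Proxy`, `push`, and `admit` are hypothetical names, and the access policy is reduced to a set of allowed callers per service. It does not represent how Pilot, Mixer, or Citadel actually distribute configuration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Toy model only: hypothetical classes, not Pilot, Mixer, Citadel, or Envoy APIs.


@dataclass
class Proxy:
    """Data plane: enforces whatever policy it was last given."""
    service: str
    allowed_callers: Set[str] = field(default_factory=set)

    def apply_config(self, allowed_callers: Set[str]) -> None:
        self.allowed_callers = set(allowed_callers)

    def admit(self, caller: str) -> bool:
        return caller in self.allowed_callers


@dataclass
class ControlPlane:
    """Control plane: owns the policy and pushes it to every registered proxy."""
    policies: Dict[str, Set[str]]
    proxies: List[Proxy] = field(default_factory=list)

    def register(self, proxy: Proxy) -> None:
        self.proxies.append(proxy)

    def push(self) -> None:
        for proxy in self.proxies:
            proxy.apply_config(self.policies.get(proxy.service, set()))


control = ControlPlane(policies={"reviews": {"productpage"}})
reviews_proxy = Proxy("reviews")
control.register(reviews_proxy)

before_push = reviews_proxy.admit("productpage")  # no policy yet, so denied
control.push()
after_push = reviews_proxy.admit("productpage")   # allowed once config arrives
```

The proxy makes the per-request decision, but it never decides what the policy is; that separation is the point of the air traffic control analogy.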
Istio Mixer and Adapter
In keeping with the principles of separation of concerns and loose coupling, the Istio Mixer provides an abstraction layer between Istio and an open-ended set of external components such as monitoring and logging systems.
Additionally, Mixer acts as an intermediary between application services running within the Istio mesh and external components. It can thus enforce communication policies such as backend authentication and data access control.
Because the backends can be of many types (telemetry, billing, quota enforcement, access control, and more), Istio uses adapters to address these different interfaces.
Adapters use handlers to get the configuration for a particular backend infrastructure service, for example, SignalFx for telemetry.
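The relationship between Mixer, adapters, and handlers can be sketched as a plain adapter pattern. The code below is an illustration only: `ToyMixer`, `Handler`, `Adapter`, and `InMemoryTelemetryAdapter` are hypothetical names, the endpoint is a placeholder, and none of this mirrors the real Mixer adapter interface or the SignalFx adapter's configuration.

```python
from typing import Any, Dict, List, Protocol

Attributes = Dict[str, Any]


class Adapter(Protocol):
    """Anything that knows how to talk to one kind of backend."""

    def handle(self, config: Dict[str, Any], attributes: Attributes) -> None: ...


class InMemoryTelemetryAdapter:
    """Stands in for a telemetry backend; records what it would have sent."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, Any]] = []

    def handle(self, config: Dict[str, Any], attributes: Attributes) -> None:
        self.sent.append({"target": config["endpoint"], "datapoint": dict(attributes)})


class Handler:
    """An adapter bound to the configuration for one specific backend."""

    def __init__(self, adapter: Adapter, config: Dict[str, Any]) -> None:
        self.adapter = adapter
        self.config = config

    def dispatch(self, attributes: Attributes) -> None:
        self.adapter.handle(self.config, attributes)


class ToyMixer:
    """Fans incoming attributes out to every configured handler."""

    def __init__(self) -> None:
        self.handlers: List[Handler] = []

    def add_handler(self, handler: Handler) -> None:
        self.handlers.append(handler)

    def report(self, attributes: Attributes) -> None:
        for handler in self.handlers:
            handler.dispatch(attributes)


telemetry = InMemoryTelemetryAdapter()
mixer = ToyMixer()
# Placeholder endpoint; a real deployment would point at an actual backend.
mixer.add_handler(Handler(telemetry, {"endpoint": "telemetry.example.invalid"}))
mixer.report({"service": "reviews", "latency_ms": 12})
```

Adding a second backend means adding another adapter and handler; the code that emits attributes does not change.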
Envoy emits a rich set of attributes: key-value data describing service- or environment-specific properties. Depending on the configuration, Mixer can process these attributes and send them to the monitoring system. The monitoring tool, however, must be able to search, analyze, group, and filter a large number of attributes to answer insightful questions about performance.
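To make "group and filter" concrete, here is a minimal sketch that counts requests per service after filtering on a response code. The attribute names used as dictionary keys are illustrative placeholders rather than a statement of the exact Istio attribute vocabulary, and `group_count` is a hypothetical helper.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping


def group_count(
    records: Iterable[Mapping[str, Any]],
    group_by: str,
    filters: Mapping[str, Any],
) -> Dict[Any, int]:
    """Count records per value of `group_by`, keeping only records that match `filters`."""
    counts: Counter = Counter()
    for record in records:
        if all(record.get(key) == value for key, value in filters.items()):
            counts[record.get(group_by)] += 1
    return dict(counts)


# Illustrative attribute records, one per request.
records = [
    {"destination.service": "reviews", "response.code": 200},
    {"destination.service": "reviews", "response.code": 500},
    {"destination.service": "ratings", "response.code": 500},
    {"destination.service": "reviews", "response.code": 500},
]

# "Which services are returning 500s, and how many?"
errors_by_service = group_count(records, "destination.service", {"response.code": 500})
```

With only four records this is trivial; the challenge the paragraph above points to is doing the same thing when every attribute combination produces its own time series.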
Introducing SignalFx Telemetry Adapter for Istio
SignalFx provides comprehensive monitoring for your application services deployed on an Istio service mesh. SignalFx gets telemetry data from the Istio Mixer via an adapter that reports stats, metrics, and traces to SignalFx.
The SignalFx adapter runs out-of-process, independent of other Istio components and services, and can be seamlessly deployed in your Istio environments. It is fully compatible with Istio 1.0.4 and above. To get started, refer to the installation docs and sign up for a free 14-day trial.
SignalFx instantly discovers all deployed microservices and their interdependencies in real time and builds a service map out of the box to visualize every interaction, data flow, and service's health.
For example, when you deploy the bookinfo app, you can instantly see the service map with interactions, health metrics, and traffic flow.
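How SignalFx derives its service map internally is not described here. As a general illustration of the idea, a dependency graph can be recovered from trace data by linking each span's service to its parent span's service. In the sketch below, the `Span` structure and the `service_edges` function are hypothetical, and the span data is made up for the example.

```python
from typing import Iterable, NamedTuple, Optional, Set, Tuple


class Span(NamedTuple):
    span_id: str
    parent_id: Optional[str]  # None for the root span of a trace
    service: str


def service_edges(spans: Iterable[Span]) -> Set[Tuple[str, str]]:
    """Return (caller, callee) pairs for every cross-service parent/child span."""
    span_list = list(spans)
    by_id = {span.span_id: span for span in span_list}
    edges: Set[Tuple[str, str]] = set()
    for span in span_list:
        parent = by_id.get(span.parent_id) if span.parent_id is not None else None
        if parent is not None and parent.service != span.service:
            edges.add((parent.service, span.service))
    return edges


# Made-up trace: one request entering productpage and fanning out.
trace = [
    Span("a", None, "productpage"),
    Span("b", "a", "reviews"),
    Span("c", "b", "ratings"),
    Span("d", "a", "details"),
]
edges = service_edges(trace)
```

Each edge answers the "who is calling whom" question from the introduction; aggregating edges across many traces yields the dependency graph.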
The trace view shows the traces and their associated spans across the distributed services. Application performance insights correlated with host metrics help you quickly narrow down the root cause of a performance issue.
High Cardinality Analytics
Use the attribute vocabulary to get more actionable and valuable insights into the performance of microservices and Istio environments. SignalFx lets you query, group, and filter across tens of thousands of dimensions and metric time series, which yields richer visualizations and more precise alerting.
Istio provides huge benefits for developers and SREs alike with minimal instrumentation. If you are deploying applications on Istio environments, be sure to check out SignalFx for real-time insights into infrastructure and application performance.
Meet SignalFx at KubeCon/CloudNativeCon, Seattle
If you are attending KubeCon, be sure to stop by our booth, S46, to learn how our customers are accelerating their adoption of Kubernetes and Istio with SignalFx.
Additionally, I will be presenting at the AWS booth at 11:15 AM on Wednesday, December 12th on monitoring distributed services deployed on Amazon EKS. Learn how our customers are reducing MTTR with guided troubleshooting using SignalFx Microservices APM™.
See you at KubeCon.