🔭An Introduction to Service Mesh with Linkerd for teams using Kubernetes

Introducing Linkerd, a service mesh tool well-suited for smaller teams using Kubernetes.

Hello, and welcome to Aiden's Lab Notebook.

It's no exaggeration to say we're living in the age of microservices.

As container and cloud technologies have advanced, services are increasingly broken down into functional units for deployment to ensure reliability. Container orchestration tools like Kubernetes are frequently used to implement these microservice architectures.

While microservice architecture offers increased flexibility in development and deployment by separating each function into an independent service, it also sharply increases the amount of network communication between services. In such distributed environments, overall system complexity rises, making individual service calls difficult to manage and trace.

This is where the concept of a Service Mesh comes into play.

Why Do We Need a Service Mesh in the Age of Microservices?

When it comes to inter-microservice communication, here are the key challenges we need to address:

  1. Security

    • In a microservice environment, countless services communicate with each other over the internal network, making the security of these communication channels paramount.

  2. Traffic Management

    • Patterns that enhance reliability (like retries, timeouts, etc.) are needed to prevent a failure in one service from cascading to others.

  3. Observability

    • Essential for understanding the operational status and performance of the entire system and for quickly diagnosing problems.

So, can a Service Mesh help with all of these challenges? It can, because this is exactly what it is designed for.

A Service Mesh is a dedicated infrastructure layer for making service-to-service communication safe, fast, and reliable, without requiring changes to application code.

Typically, it's deployed as a sidecar proxy alongside each service instance (Pod), and it intercepts and controls all network traffic between services.

This allows developers to focus on business logic, while the Service Mesh handles the complexities of inter-service communication.
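To make that concrete, here is a minimal Python sketch of the kind of retry logic a service would otherwise have to carry in its own code. The names `call_with_retries`, `max_attempts`, and `backoff_seconds` are made up for this illustration; they are not part of any mesh or library API.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    max_attempts: int = 3,
    backoff_seconds: float = 0.0,
) -> T:
    """Run `call`, retrying on ConnectionError up to `max_attempts` times in total."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error: Optional[ConnectionError] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ConnectionError as err:
            # Treat connection errors as transient; anything else propagates.
            last_error = err
            if attempt < max_attempts:
                # Simple linear backoff between attempts.
                time.sleep(backoff_seconds * attempt)

    assert last_error is not None
    raise last_error
```

Multiply this by every service and every language in your stack, and it becomes clearer why moving this concern into a shared proxy layer is appealing.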

Let's connect the main features a Service Mesh provides at a platform level with the challenges we just discussed:

  1. mTLS encryption, authentication/authorization <- Security

  2. Routing, load balancing <- Traffic Management

  3. Metrics, traces, logging <- Observability

These features enable consistent application of security policies across services, improve communication reliability, and make it easier to understand the overall system behavior.

Therefore, adopting a Service Mesh can significantly reduce the complexity of operating microservices and greatly enhance management efficiency.

Why is Linkerd a Great Choice for Adopting a Service Mesh?

With the emergence of the Service Mesh concept, various related tools have appeared. Istio and Linkerd are among the most prominent.

The reason we're looking at Linkerd this time is due to its distinct advantages:

  1. Ease of Adoption

    • Unlike the feature-rich Istio, Linkerd focuses on core functionality and follows a 'just works' principle.

    • You can get started with a Service Mesh quickly, without complex CRDs (Custom Resource Definitions) or configuration files.

  2. Resource Efficiency

    • Linkerd's proxy is implemented in Rust, a language known for its low memory footprint and high performance.

    • The proxy minimizes its impact on the performance of the main application and can operate reliably with fewer resources.

  3. Operational Simplicity

    • It provides an intuitive CLI and a highly readable dashboard, allowing users to easily understand and utilize its key features.

Thanks to these characteristics, Linkerd is a less burdensome option for teams that use Kubernetes but aren't operating at a massive scale.

Furthermore, like Istio, Linkerd is a CNCF Graduated project, recognized for its stability and sustainability.

A Glimpse into Linkerd's Core Architecture

Linkerd's Data Plane consists of proxies deployed as sidecar containers in each Pod.

Linkerd's components are broadly divided into the Control Plane and the Data Plane:

  • Control Plane:

    • Manages policies for the entire Service Mesh.

    • Provides configuration information.

    • Collects aggregated telemetry data.

  • Data Plane:

    • Composed of sidecar proxies that handle actual application traffic.

    • Operates based on instructions from the Control Plane.

The Control Plane provides all the necessary information for the Data Plane proxies to function correctly. This includes service discovery information, routing policies, security policies (e.g., mTLS certificates), and more.

The Control Plane is also responsible for aggregating telemetry data, such as success rates and latencies, collected from each proxy.

Users interact with the Control Plane via the CLI or dashboard to manage the mesh.

Linkerd's Data Plane consists of linkerd-proxy, an ultra-lightweight sidecar proxy deployed alongside each application's Pod.

This proxy intercepts all TCP traffic (HTTP, gRPC, etc.) going into and out of its Pod, performing tasks like mTLS encryption, load balancing, retries, timeouts, and metrics collection.

Can you believe all these diverse features are available without modifying your application code at all?

Linkerd offers a feature called Automatic Proxy Injection. All you need to do is set an annotation (linkerd.io/inject: enabled) on a Kubernetes Namespace, or add it directly to the Pod template of a Workload manifest such as a Deployment.

Then, when new Pods are created in the annotated Namespace or Workload, Linkerd's sidecar proxy is automatically injected alongside them.
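As a rough illustration of where that annotation lives, the Python sketch below builds a Namespace manifest and a Deployment manifest as plain dictionaries and prints them as JSON (which kubectl also accepts). The helper names, the `demo` Namespace, and the `nginx` image are placeholders chosen for this example; only the `linkerd.io/inject: enabled` annotation comes from the text above.

```python
import json

# The annotation that asks Linkerd to inject its sidecar proxy.
INJECT_ANNOTATION = {"linkerd.io/inject": "enabled"}


def namespace_manifest(name: str) -> dict:
    """Namespace-level opt-in: new Pods in this Namespace get the proxy."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "annotations": dict(INJECT_ANNOTATION)},
    }


def deployment_manifest(name: str, image: str) -> dict:
    """Workload-level opt-in: the annotation sits on the Pod template."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {
                    "labels": labels,
                    "annotations": dict(INJECT_ANNOTATION),
                },
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(namespace_manifest("demo"), indent=2))
    print(json.dumps(deployment_manifest("web", "nginx"), indent=2))
```

Note that in the Deployment case the annotation is placed under spec.template.metadata rather than on the Deployment's own metadata, since injection happens when Pods are created.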

Now that we've covered Linkerd's architecture and how it works, it's time to look at its core features.

A Preview of Linkerd's Core Features

1. Security Aspects

Through the Control Plane's certificate management capabilities, Linkerd can enable mutual TLS (mTLS) encryption by default for all communication between services within the Service Mesh.

This ensures data confidentiality and integrity without requiring separate configurations, protecting services from threats like man-in-the-middle attacks. In essence, it allows you to easily implement Zero-Trust network security principles.

What is the Zero-Trust Security Principle?


The Zero-Trust security principle is designed around the philosophy of 'trust nothing.'
It treats every user, device, and component as a potential threat.
Unlike traditional 'perimeter-based security,' it aims to strengthen security by applying strict authentication and authorization to all connections, regardless of whether they originate from inside or outside the network.
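To show what the "mutual" in mTLS means, here is a small sketch using Python's standard ssl module. It is only a conceptual illustration: Linkerd's proxy handles all of this for you, and the function names here are invented for the example. Certificate loading is left out, so these contexts are not ready to serve real traffic.

```python
import ssl


def server_side_mtls_context() -> ssl.SSLContext:
    """Server context that refuses clients which do not present a valid certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Plain TLS leaves this at CERT_NONE on the server side.
    # Requiring a client certificate is what makes the handshake mutual.
    ctx.verify_mode = ssl.CERT_REQUIRED
    # A real setup would also call ctx.load_cert_chain(...) for the server's
    # own identity and ctx.load_verify_locations(...) for the trusted CA.
    return ctx


def client_side_mtls_context() -> ssl.SSLContext:
    """Client context that verifies the server, as in ordinary TLS."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # A real mTLS client would also call ctx.load_cert_chain(...) so it can
    # present its own certificate when the server asks for one.
    return ctx
```

With a mesh, both sides of every connection get this kind of verification without the application creating or rotating any certificates itself.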

2. Traffic Management Aspects

Intermittent request failures due to temporary network issues or service overloads can jeopardize the stability of the entire system.

Therefore, Linkerd, through a resource called HTTPRoute (from the Gateway API), provides features to automatically retry requests or set response timeouts for specific services or paths.
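Here is a hedged sketch of what such a route could look like, again built as a Python dictionary that can be dumped to JSON. The annotation keys (`retry.linkerd.io/http`, `retry.linkerd.io/limit`, `timeout.linkerd.io/request`) and the apiVersion reflect my understanding of recent Linkerd releases and should be treated as assumptions; please check them against the documentation for your Linkerd version. The function name, service name, port, and path are hypothetical.

```python
import json


def http_route_with_retries(
    service: str, namespace: str, port: int, path_prefix: str
) -> dict:
    """Build an HTTPRoute that retries 5xx responses and caps request time."""
    return {
        # Assumption: the Gateway API group/version served by your cluster.
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {
            "name": f"{service}-route",
            "namespace": namespace,
            "annotations": {
                # Assumption: annotation keys as used by recent Linkerd releases.
                "retry.linkerd.io/http": "5xx",
                "retry.linkerd.io/limit": "2",
                "timeout.linkerd.io/request": "1s",
            },
        },
        "spec": {
            # Attach the route to the Service whose traffic it should govern.
            "parentRefs": [
                {"group": "core", "kind": "Service", "name": service, "port": port}
            ],
            # Apply the policy only to requests under this path prefix.
            "rules": [
                {"matches": [{"path": {"type": "PathPrefix", "value": path_prefix}}]}
            ],
        },
    }


if __name__ == "__main__":
    route = http_route_with_retries("orders", "demo", 8080, "/api")
    print(json.dumps(route, indent=2))
```

The point to take away is that the retry and timeout policy lives in a Kubernetes resource, not in the application code.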

Additionally, Linkerd can implement advanced routing rules like traffic splitting and support various deployment strategies (e.g., canary deployments, Blue/Green deployments).
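The idea behind traffic splitting can be simulated in a few lines of Python. This is not how Linkerd's proxy is implemented; it is only a sketch of weighted routing, with a hypothetical `split_traffic` function and made-up backend names, to show how a 90/10 canary split plays out over many requests.

```python
import random
from collections import Counter
from typing import Dict


def split_traffic(weights: Dict[str, int], requests: int, seed: int = 0) -> Counter:
    """Assign each request to a backend with probability proportional to its weight."""
    rng = random.Random(seed)  # Seeded so repeated runs give the same result.
    backends = list(weights)
    picks = rng.choices(
        backends, weights=[weights[b] for b in backends], k=requests
    )
    return Counter(picks)


if __name__ == "__main__":
    # Send roughly 10% of traffic to the canary version.
    counts = split_traffic({"web-stable": 90, "web-canary": 10}, requests=1000)
    for backend, count in counts.items():
        print(f"{backend}: {count} requests")
```

In a canary deployment you would start with a small weight on the new version, watch its metrics, and shift the weights step by step if everything looks healthy.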

3. Observability Aspects

Linkerd's Data Plane proxies automatically collect Golden Metrics for all requests and responses, such as success rates (SR), requests per second (RPS), and latency (latency distribution percentiles).

What are Latency Distribution Percentiles?


This metric takes the request latencies observed over a specific period and reports the latency value at a given percentile of that distribution.
For example, the 99th percentile (P99) is the latency at or below which 99% of requests complete. Only the slowest 1% of requests take longer, which makes P99 a useful indicator of tail latency.
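If you'd like to see the arithmetic, here is a small Python sketch that computes percentiles with the nearest-rank method, plus a simple success rate. The proxy's actual aggregation may differ (for instance, it may use histogram buckets), and counting only 5xx responses as failures is an assumption made for this example.

```python
import math
from typing import Sequence


def percentile(latencies_ms: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(latencies_ms)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def success_rate(status_codes: Sequence[int]) -> float:
    """Fraction of responses that are not server errors (assumption: 5xx = failure)."""
    if not status_codes:
        raise ValueError("need at least one response")
    ok = sum(1 for code in status_codes if code < 500)
    return ok / len(status_codes)


if __name__ == "__main__":
    # 98 fast requests and 2 slow outliers.
    samples = [10] * 98 + [500, 900]
    print("P50:", percentile(samples, 50), "ms")
    print("P99:", percentile(samples, 99), "ms")
    print("success rate:", success_rate([200, 200, 500, 404]))
```

In this sample the median stays at 10 ms while P99 jumps to 500 ms, which is why percentiles reveal slow outliers that an average or median would hide.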

These metrics are aggregated by the Control Plane and can be easily viewed via the dashboard or CLI (in recent Linkerd releases, both are provided by the viz extension). This enables developers and operators to monitor application health in real time and quickly diagnose problems.

How Will Adopting Linkerd Change Our Team?

Based on what we've covered so far, here's a summary of the changes Linkerd can bring to a team. Think of this as a quick review of the topic.

  1. Adopting Linkerd helps developers focus more on business logic.

    • Since Linkerd handles complex infrastructure concerns like inter-service communication, security, and observability, developers can concentrate more on the core business logic of their applications, leading to improved development productivity.

  2. Linkerd allows for gradual adoption and expansion, enabling effective utilization of the Service Mesh.

    • It's designed to allow for incremental adoption, starting with specific Namespaces or services. This means it can be stably introduced and scaled in existing operational systems.

  3. Linkerd improves service visibility for the operations team, reducing the time spent on diagnosing failures.

    • Thanks to rich telemetry data and an intuitive dashboard, the operations team can quickly identify the root cause of problems and resolve them.

  4. Linkerd enhances the overall security posture of the cluster.

    • Features like Linkerd's mTLS can encrypt all communication within the Service Mesh, minimizing the risk of data breaches and tampering.

  5. Linkerd helps build a more stable and resilient microservice operating environment.

    • Features like Linkerd's automatic retries and timeouts protect the system from transient failures and enhance service stability.

Wrapping Up

So, we've explored Linkerd's architecture and key features. What do you think? Are you a bit more intrigued by Linkerd now?

I've tried to break down Linkerd's core components and functionalities as clearly as possible, and I hope this has been helpful for those curious about Service Mesh and Linkerd.

In this article, we've discussed Linkerd conceptually. However, it's too important a topic to cover only in theory. So, in the next post, we'll dive into a hands-on lab where we'll use Linkerd's core features directly.

The lab guide will also focus on the core functionalities in a compact way, so you should be able to follow along without feeling overwhelmed.

See you in the next post!

Thanks.

✨Enjoyed this issue?

How did you like this newsletter? I'd love to hear your thoughts about this issue in the form below!

👉 Feedback Form

Your input will help improve Aiden's Lab Notebook and make it more useful for you.

Thanks again for reading this notebook from Aiden’s Lab :)