When Is Service Mesh Worth It?

Service mesh is attracting a lot of interest these days, especially as new meshes enter the market to join Istio, Linkerd and Kuma as established open source options. The technology provides a common networking, policy and observability layer for microservices architectures. Because of its significant overhead, though, the perception is that it’s only relevant for very large-scale enterprise deployments with many teams. But is that truly the case? When is adopting service mesh actually worth it?

I recently met with Zack Butcher, founding engineer at Tetrate, and one of the original Istio builders at Google, to learn what organizational size is best suited for service mesh. Though economies of scale have a more apparent advantage when implementing the technology, Butcher believes that framing it as a question of organizational size is the wrong way to look at things.

According to Butcher, operational gains such as company-wide security enforcement or the replacement of traditional API gateways could recoup the upfront investment. Paired with ongoing usability improvements, service mesh is primed to benefit many more organizations, regardless of their size.

Service Mesh for the Rest of Us

“Service mesh can be a force multiplier for large efforts,” says Butcher. Economically, it’s easier to justify for larger teams; yet, even for a smaller group, Butcher believes it can streamline operations such as metering, authorization, authentication and encryption in transit.

“I don’t believe that adopting service mesh is a function of scale,” Butcher says. “Rather, the gate for adopting a service mesh is operational expense.” Organizations should be looking at the cost of an alternative approach, not the scale of their company. This boils down to what the organization is attempting to achieve, and many of these points are independent of size, says Butcher.

For example, service mesh can enforce the encryption and security policies that most organizations require. Also, for Envoy-based meshes, Envoy plugins could help expose services to external consumers directly from the mesh, thus avoiding investment in separate API gateways.

Rapid Encryption and IAM for All

Butcher shared how FICO uses service mesh to encrypt all data in transit and enforce it at the proxy level. Of course, this is a large company, but the scale is not the driving factor — it’s the operational impact of reusable functionality with unified configurations. “You can have a central team take on and pay that cost, as opposed to spreading cost across organization,” he says.
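To make the idea of proxy-level enforcement concrete, here is a minimal sketch of what a single, centrally owned encryption policy can look like. It assumes an Istio mesh with the default root namespace (istio-system) and is not a description of FICO's actual setup; the helper name strict_mtls_policy is hypothetical. The script only builds the manifest and prints it as JSON, which a central team could review and apply.

```python
import json


def strict_mtls_policy(namespace: str = "istio-system") -> dict:
    """Build an Istio PeerAuthentication manifest that requires mutual TLS.

    Assumption: `namespace` is the mesh's root namespace. A policy named
    "default" there, with no workload selector, applies mesh-wide.
    """
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        # STRICT means sidecars reject plaintext traffic between workloads.
        "spec": {"mtls": {"mode": "STRICT"}},
    }


# Print the manifest as JSON so it can be reviewed before being applied.
print(json.dumps(strict_mtls_policy(), indent=2))
```

The point of the sketch is the shape of the work: one small object, owned by one team, rather than TLS handling repeated in every service.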

Identity and access management (IAM) is another area where service mesh shines. Butcher describes how, at Google, implementing a centralized identity and access management solution for all cloud services was a significant ordeal involving many teams’ input. But after adopting sidecar proxies for each service, he and one other engineer were able to deploy IAM across every single object in one quarter.

API-Centric Service Mesh

Perhaps the most exciting effect is the impact on API strategy. When folks discuss service mesh and API gateways, they typically keep them in separate camps. However, Butcher sees a convergence. “We see API gateways as an inherent part of service mesh functionality.”

The old mindset of needing an API gateway for north-south traffic and a service mesh for east-west traffic no longer reflects a meaningful distinction, Butcher believes. “The problem is north-south and east-west doesn’t exist,” he adds.

As the number of services grows, the distinction between “external” and “internal” starts to disappear. Whether they are public, private or partner-facing, all services may encounter high traffic, all need SLAs and all require the same zero-trust security awareness.

Instead of using a separate API gateway, API providers could place sidecars around the microservices they intend to expose and extend them with management features like rate limiting, identity control and request transformation. Operators could then apply different authentication policies on a per-service basis from the mesh control plane. Repurposing service mesh for external communications could thus enable smaller shops to offer their APIs as a product.
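As a rough illustration of per-service authentication managed from the control plane, the sketch below generates a pair of Istio policies for one exposed service: a RequestAuthentication that validates JWTs and an AuthorizationPolicy that rejects requests without a valid token. It assumes an Istio mesh and workloads labeled with app; the helper name jwt_policies, the service name, the namespace and the issuer URLs are all hypothetical.

```python
import json


def jwt_policies(service: str, namespace: str, issuer: str, jwks_uri: str) -> list:
    """Return Istio manifests that require a valid JWT for one service.

    Assumption: the workload's pods carry the label app=<service>.
    """
    selector = {"matchLabels": {"app": service}}
    # Tells the sidecar how to validate tokens from this issuer.
    request_authn = {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "RequestAuthentication",
        "metadata": {"name": f"{service}-jwt", "namespace": namespace},
        "spec": {
            "selector": selector,
            "jwtRules": [{"issuer": issuer, "jwksUri": jwks_uri}],
        },
    }
    # Allows only requests that carry an authenticated request principal.
    require_jwt = {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": f"{service}-require-jwt", "namespace": namespace},
        "spec": {
            "selector": selector,
            "action": "ALLOW",
            "rules": [{"from": [{"source": {"requestPrincipals": ["*"]}}]}],
        },
    }
    return [request_authn, require_jwt]


# Hypothetical example: a partner-facing "orders" service in namespace "shop".
for manifest in jwt_policies(
    "orders", "shop", "https://issuer.example.com", "https://issuer.example.com/jwks.json"
):
    print(json.dumps(manifest, indent=2))
```

Calling the helper with a different issuer for each service is one way an operator could give public, private and partner APIs different authentication rules without deploying a separate gateway.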

Usability Improvements to Ease Adoption

Of course, the cost of running a mesh itself contributes to overhead, but an even greater impediment to adoption is usability. Getting over accessibility hurdles could be a substantial initial cost for small shops. And Istio doesn’t have the best reputation when it comes to usability.

Butcher acknowledges that many of the components that led to Istio originally resulted from decomposing a monolithic API gateway while he was working at Google. “We were building out Istio in a cave,” he says. “No one could use it, and when handled, they cut themselves on it.”

Since then, Istio maintainers have significantly improved how developers interact with the mesh, with out-of-the-box configurations, recipe books and improved documentation on Istio.io. Yet, there is still more to do to improve usability. Improving the developer experience associated with the technology is the last step needed to meaningfully “chip away that cost of adoption,” says Butcher.

Another promising area to improve usability is WebAssembly. As I’ve covered in the past, much effort is centered around WebAssembly as an extensibility mechanism for Envoy. Butcher notes that a “pretty vibrant ecosystem [is] emerging based on WebAssembly.” Ready-made Envoy plugins could help organizations more readily utilize service mesh, thus lowering the barrier to entry for custom needs.

When Service Mesh Is Worth It

Service mesh brings a unified configuration to an inherently disparate architecture. And as microservices rise in use, even small organizations may feel the need to adopt a central mechanism for universal control over their ingress and egress policies.

If operational returns and time savings outweigh the upfront effort and ongoing maintenance cost, service mesh could be highly valuable, and not just for enterprise use cases.

Further impact analysis will be vital to determine a break-even point. As they have with other cloud-native technologies, cloud economists must turn their attention to service mesh in the ongoing effort to rein in rising cloud costs.
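A break-even estimate of this kind can start as simple arithmetic. The sketch below is a deliberately simplified model, not a method Butcher proposed: it assumes a one-time adoption cost, a flat monthly saving from centralized operations and a flat monthly cost of running the mesh. The function name and every figure in the example are hypothetical.

```python
import math
from typing import Optional


def break_even_month(
    upfront_cost: float, monthly_savings: float, monthly_run_cost: float
) -> Optional[int]:
    """Return the first month in which cumulative net savings cover the upfront cost.

    Returns None when the mesh costs at least as much to run as it saves,
    in which case the investment never pays back under this model.
    """
    net_monthly = monthly_savings - monthly_run_cost
    if net_monthly <= 0:
        return None
    return math.ceil(upfront_cost / net_monthly)


# Hypothetical figures: 60,000 to adopt, 9,000 saved and 4,000 spent per month.
print(break_even_month(60_000, 9_000, 4_000))  # prints 12
```

With these made-up numbers the mesh pays for itself after a year. If the monthly saving drops below the running cost, the function returns None, which is the case the article's "operational expense" gate is meant to catch.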

According to Butcher, “API gateways are an inherent part of service mesh.” Thus, he predicts an incoming surge of more API-centric mesh configurations. “It’s the last domino to fall for service mesh to really break through,” he says.

Bill Doerrfeld

Bill Doerrfeld is a tech journalist and analyst. His beat is cloud technologies, specifically the web API economy. He began researching APIs as an Associate Editor at ProgrammableWeb, and since 2015 has been the Editor at Nordic APIs, a high-impact blog on API strategy for providers. He loves discovering new trends, interviewing key contributors, and researching new technology. He also gets out into the world to speak occasionally.
