Exploring SMI, the Service Mesh Industry Spec

Service mesh options for Kubernetes keep multiplying. There are now Istio, Linkerd, Consul and new entrants including AWS App Mesh. Naturally, each service mesh solution takes its own approach. As a result, problems could arise if organizations attempt to move between service meshes or operate multiple service meshes simultaneously across multi-cloud environments.

Enter SMI. Solo.io recently partnered with Microsoft and others to introduce Service Mesh Interface (SMI), a vendor-agnostic standard interface for Kubernetes service meshes. We’ve already seen SMI’s use in Solo.io’s SuperGloo project and Service Mesh Hub. But is the rest of the industry ready for a standard service mesh API specification?

Let’s see why Kubernetes service meshes need an interface standard at all. In this post, we go a little deeper to pick apart the details of SMI and explore some sample code. We’ll outline potential benefits and drawbacks and consider how others in the service mesh space are responding.

Introducing SMI: The Service Mesh Interface

In mid-2019, Solo.io announced SMI, a joint effort to encourage standardization within the service mesh industry. SMI is a collaboration between Solo.io, Microsoft, Buoyant and HashiCorp. As of April, SMI is a sandbox project within the Cloud Native Computing Foundation, released under an Apache-2.0 license.

 

The objective behind SMI is to create a universal standard that can be implemented by many Kubernetes service mesh providers. As detailed in the specification on GitHub, SMI covers the basic features most commonly shared among service mesh APIs, namely metrics, security preferences and routing. SMI uses Kubernetes CustomResourceDefinitions (CRDs) and extension API servers. It’s segmented into three main areas using the following APIs:

  • Traffic policy: Apply policies such as identity and transport encryption across services. Includes Traffic Access Control and Traffic Specs resources.
  • Traffic telemetry: Capture key metrics such as error rate and latency between services. Includes the Traffic Metrics resource.
  • Traffic management: Shift traffic between different services. Includes the Traffic Split resource.

To see what the SMI specification looks like in practice, let’s consider the TrafficMetrics resource. It specifies a standard way to retrieve data on latency and request volumes for Kubernetes pods.
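As a rough illustration of what a TrafficMetrics payload might look like and how a dashboard could consume it, here is a minimal Python sketch. The field names follow the v1alpha1 draft of the SMI metrics API as best understood here and should be checked against the spec; the pod name, namespace, metric values and the helpers `parse_quantity` and `summarize` are hypothetical.

```python
# Hypothetical TrafficMetrics payload, shaped after the v1alpha1 draft of the
# SMI metrics API. Names and values are illustrative, not real measurements.
sample_traffic_metrics = {
    "apiVersion": "metrics.smi-spec.io/v1alpha1",
    "kind": "TrafficMetrics",
    "resource": {"kind": "Pod", "namespace": "demo", "name": "web-1"},
    "timestamp": "2019-04-08T22:25:55Z",
    "window": "30s",
    "metrics": [
        {"name": "p99_response_latency", "unit": "seconds", "value": "10m"},
        {"name": "success_count", "value": "100"},
        {"name": "failure_count", "value": "25"},
    ],
}


def parse_quantity(value):
    """Parse the small subset of Kubernetes quantities used here ("10m", "100")."""
    text = str(value)
    if text.endswith("m"):  # milli suffix: "10m" means 0.010
        return float(text[:-1]) / 1000
    return float(text)


def summarize(traffic_metrics):
    """Return p99 latency (seconds), request volume and error rate for one resource."""
    values = {m["name"]: parse_quantity(m["value"]) for m in traffic_metrics["metrics"]}
    total = values["success_count"] + values["failure_count"]
    return {
        "pod": traffic_metrics["resource"]["name"],
        "p99_latency_seconds": values["p99_response_latency"],
        "requests": total,
        "error_rate": values["failure_count"] / total if total else 0.0,
    }


summary = summarize(sample_traffic_metrics)
print(summary)
```

Because every mesh that implements the resource would return the same shape, a function like `summarize` would not need to know which mesh produced the numbers.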

Such information could help populate metric dashboards that span multiple meshes. Another example is the Traffic Access Control resource.
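To make the access control idea concrete, here is a hedged Python sketch of a TrafficTarget paired with an HTTPRouteGroup. The manifest layout follows the v1alpha1 drafts as best understood here; the service account names, route names and the `is_allowed` helper are hypothetical, and the helper is only a toy evaluation, not how any particular mesh enforces policy.

```python
import json
import re

# Hypothetical manifests shaped after the v1alpha1 SMI drafts; every name is illustrative.
route_group = {
    "apiVersion": "specs.smi-spec.io/v1alpha1",
    "kind": "HTTPRouteGroup",
    "metadata": {"name": "api-routes", "namespace": "default"},
    "matches": [
        {"name": "metrics", "pathRegex": "/metrics", "methods": ["GET"]},
    ],
}

traffic_target = {
    "apiVersion": "access.smi-spec.io/v1alpha1",
    "kind": "TrafficTarget",
    "metadata": {"name": "allow-metrics-scrape", "namespace": "default"},
    "destination": {"kind": "ServiceAccount", "name": "api", "namespace": "default"},
    "specs": [{"kind": "HTTPRouteGroup", "name": "api-routes", "matches": ["metrics"]}],
    "sources": [{"kind": "ServiceAccount", "name": "prometheus", "namespace": "default"}],
}


def is_allowed(target, routes, source_account, method, path):
    """Toy check: may this source call this method and path on the destination?"""
    # The caller must be listed as a source on the TrafficTarget.
    if not any(s["name"] == source_account for s in target["sources"]):
        return False
    # Collect the route names the TrafficTarget references in this route group.
    allowed_names = {
        name
        for spec in target["specs"]
        if spec["name"] == routes["metadata"]["name"]
        for name in spec["matches"]
    }
    for match in routes["matches"]:
        if match["name"] not in allowed_names:
            continue
        method_ok = "*" in match["methods"] or method in match["methods"]
        if method_ok and re.fullmatch(match["pathRegex"], path):
            return True
    return False


print(json.dumps(traffic_target, indent=2))  # the manifest in JSON form
print(is_allowed(traffic_target, route_group, "prometheus", "GET", "/metrics"))
print(is_allowed(traffic_target, route_group, "frontend", "GET", "/metrics"))
```

In this sketch only the `prometheus` service account may issue `GET /metrics`; any other source, method or path is denied.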

With TrafficTargets, access can be controlled by route and source. To do so, the Traffic Access Control resource specifies sources, destinations, paths and methods for HTTP traffic.
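The third area, traffic management, can be sketched in a few lines of Python. The example below assumes the v1alpha2 shape of the Traffic Split resource with integer weights (earlier drafts used a different weight format, so check the spec for the version your mesh supports). The service names and the helpers `traffic_shares` and `shift_weight` are hypothetical.

```python
import copy

# Hypothetical TrafficSplit manifest; service names and weights are illustrative.
traffic_split = {
    "apiVersion": "split.smi-spec.io/v1alpha2",
    "kind": "TrafficSplit",
    "metadata": {"name": "checkout-rollout"},
    "spec": {
        "service": "checkout",
        "backends": [
            {"service": "checkout-v1", "weight": 90},
            {"service": "checkout-v2", "weight": 10},
        ],
    },
}


def traffic_shares(split):
    """Return the fraction of traffic each backend receives."""
    backends = split["spec"]["backends"]
    total = sum(b["weight"] for b in backends)
    return {b["service"]: b["weight"] / total for b in backends}


def shift_weight(split, from_service, to_service, step):
    """Return a copy of the split with up to `step` weight moved between two backends."""
    updated = copy.deepcopy(split)
    weights = {b["service"]: b for b in updated["spec"]["backends"]}
    moved = min(step, weights[from_service]["weight"])  # never go below zero
    weights[from_service]["weight"] -= moved
    weights[to_service]["weight"] += moved
    return updated


print(traffic_shares(traffic_split))
next_step = shift_weight(traffic_split, "checkout-v1", "checkout-v2", 20)
print(traffic_shares(next_step))
```

A progressive delivery tool could apply a sequence of such weight shifts to move traffic gradually from one version of a service to the next.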

Reasons for a Service Mesh Specification

So, why exactly do Kubernetes service meshes need an interface standard? There are a few reasons why some industry leaders are encouraging SMI.

Avoid Vendor Lock-In

The service mesh industry is still evolving. “In the service mesh space, there is no clear winner,” says Idit Levine, founder and CEO of Solo.io, “and, in my opinion, there won’t be a clear winner.” Consumer choice and tooling competition seem likely, especially considering how many competing container styles there are on the market still.

Software architects could easily bet on the wrong technology stack (and regret it later when it faces obsolescence). Or, if there is no prevailing solution, they could face interoperability issues between meshes. In both scenarios, Levine argues that open standards and vendor-agnostic buffers can help avoid lock-in. “SMI will provide order for the customer in an ecosystem that is not stable,” she says.

Interoperability

According to Christian Posta, Global Field CTO at Solo.io, there are clear signs that the “industry needs a standard service mesh API.” In a blog post, Posta explains how a standard specification is needed to unify the federation between different implementation types. “Having a unified API allows you to hedge your bets and start with one implementation and possibly migrate to a different one if needed,” he writes. SMI could enable the formation of an agnostic tooling ecosystem, allowing users to port supporting tools to any mesh.

Extensibility

Another concern is the developer experience. Levine argues that since a service mesh is a lower-layer tool, it is not ideal to expose it directly to developers. Instead, additional tools should extend service meshes. This extensibility, however, will require a stable specification. A niche developer-facing onboarding platform, for example, would require a specification to support multiple meshes. With a unified data model, customers could take two service meshes and create a virtual service mesh for both.

Here we see SMI in use within the open source Service Mesh Hub by Solo.io, being used as a control plane to aggregate service mesh data from AWS App Mesh, Istio and Linkerd.

Drawbacks and Room for Improvement

Other voices in the service mesh space see potential downsides of introducing a service mesh specification.

Stunting Growth and Competition

According to Lin Sun, STSM and master inventor at Istio, and Daniel Berg, IBM Distinguished Engineer, introducing standards too early could negatively affect the fledgling service mesh community. “We believe it is too early in the evolution of service mesh solutions and user adoption to conclude that a standard API specification would be helpful for the community,” say Sun and Berg. “Trying to force a standard API specification at this point may stifle innovation amongst the projects.” Instead, the duo believes that a little friendly competition could drive better solutions.

Lowest Common Denominator

SMI currently only covers traffic policy, traffic telemetry, and traffic management. Levine acknowledges that presently, the specification is rather bare. “It’s limited currently because we’ve focused on the lowest common denominator,” Levine says. This is a practical limitation of trying to accommodate shared functionality offerings among various approaches.

Levine acknowledges that the community desires better support for new versions of meshes. Perhaps, in the future, SMI will re-evaluate its “lowest common denominator” process and add support for more advanced service mesh features. Sun and Berg also acknowledge how this approach negatively limits features.

Standardized … or Confused?

Sun and Berg believe a specification could complicate things for developers, who are already inundated with tech decisions. Take the container orchestrator rivalry of previous years (Kubernetes, Docker Swarm, Mesos). “Envision if we ever had a standard API among container orchestration systems,” say Sun and Berg. “Would that be helpful to users? We think the answer is clearly ‘no.’” Similarly, a specification could just be one more thing to learn: an obfuscation layer that introduces new support headaches and debugging complications.

Lack of Implementations

Service meshes such as Linkerd, HashiCorp Consul Connect and Maesh have adopted the SMI spec to implement their API. However, SMI is still in its infancy. Its proponents must sell the greater market on the benefits of inter-mesh standardization through SMI. Its adoption by the surrounding tooling community will show whether a specification has actual value in practice. We’re already seeing SMI in practice with Flagger, a tool that automates progressive delivery.

Futile With Consolidation

The New Stack recently depicted Istio as a leader in the service mesh race, having doubled in usage from 2018 to 2019. If this hegemony escalates, an industry specification might be unnecessary. “Like Kubernetes having emerged as the de facto standard for container orchestration,” note Sun and Berg, “having a service mesh solution emerge as the de facto standard would be equally valuable for users and the community.”

Istio has a close lead over other service mesh technologies, as found in a StackRox 2020 report.

Final Thoughts

A StackRox 2020 report found that Kubernetes adoption stands at 86%. Though Kubernetes is well-defined and well-adopted, the service mesh ecosystem is still emerging. “Service meshes were just beginning,” says Levine.

SMI is an exciting proposition for a few reasons. It seems useful to have meshes speak a common language. A unified service mesh specification could bring interoperability and extensibility benefits and alleviate lock-in concerns.

In theory, Sun and Berg recognize the benefit of a global standard: “A standard interface has the theoretical benefit of providing a single set of concepts and APIs that you have to learn and simply plug in the implementation of choice.” Overall, this could increase developer confidence in the cloud-native ecosystem.

However, although service mesh implementations share similar concepts, they have vastly different APIs and user experiences. Because unification yields a lowest-common-denominator API, some players view SMI as bringing a net negative value to developer workflows.

SMI will have little use if the industry coalesces on a single mesh solution. Conversely, if various service meshes attain a similar market share, SMI could become quite helpful. Regardless, it will be interesting to track SMI’s adoption as well as its practical output. Seeing more use cases in practice will help to better gauge its value.

Bill Doerrfeld

Bill Doerrfeld is a tech journalist and analyst. His beat is cloud technologies, specifically the web API economy. He began researching APIs as an Associate Editor at ProgrammableWeb, and since 2015 has been the Editor at Nordic APIs, a high-impact blog on API strategy for providers. He loves discovering new trends, interviewing key contributors, and researching new technology. He also gets out into the world to speak occasionally.
