Can Your Kubernetes CNI Do These Three Things?
As a cluster architect or operator for large enterprises or telco clouds, relying on a basic container network interface (CNI) for advanced cloud networking is like using hand tools for construction: accessible and practical in a small workshop, but lacking the efficiency needed for large-scale projects. If cluster sprawl, multi-cluster networking and complex security rules have outpaced your staff, what do you do when ‘basic’ just isn’t good enough? In this article, we’ll review CNI fundamentals with a focus on the key areas that push networking beyond a basic CNI and toward a full Kubernetes SDN.
CNI, a Cloud Native Computing Foundation project, consists of a specification and libraries that provide basic container networking and clean up when a container is deleted. Due largely to its simplicity, CNI has been widely adopted, and there are many open source CNI plug-ins available for Kubernetes. But some aspects of Kubernetes networking are complicated and extend well beyond kube-proxy and iptables, which can start to look primitive and inadequate. The primary role of a CNI plug-in is pod interconnectivity, but many plug-ins also implement Service networking and NetworkPolicy access control, with a focus on intra-cluster pod traffic, working together with the built-in kube-proxy. The problem? Most large enterprise and service provider cloud services are constructed of many clusters (e.g., 5GC), meaning a run-of-the-mill CNI falls short, even for very basic networking and security needs.
Escape 2D Kubernetes Flat Networking
The most fundamental challenge with default Kubernetes networking is that it’s flat: there is just one subnet for all pods and another subnet for all intra-cluster services’ virtual IPs.
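As a minimal sketch of that flatness, here is how the two cluster-wide CIDRs are typically declared at bootstrap with kubeadm; the CIDR values below are illustrative defaults, not a recommendation:

```yaml
# kubeadm ClusterConfiguration (illustrative values): every pod in the cluster
# draws its IP from one podSubnet, and every Service virtual IP comes from one
# serviceSubnet. There is no further segmentation in the default model.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 10.244.0.0/16      # one flat address space for all pods
  serviceSubnet: 10.96.0.0/12   # one flat address space for all ClusterIPs
```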
The flat networking model presents problems that are elegantly solved by the more flexible virtual-network model used by many SDNs, though few of them apply to Kubernetes. For one thing, default Kubernetes security assumes open pod-to-pod connectivity, resulting in a large web of access-control policies, called NetworkPolicy objects in Kubernetes, to restrict reachability and isolate workloads. As a result, the complexity of managing network security increases sharply with each additional pod deployment. Imagine a data center network managed this way, with ACLs manually applied to each individual interface across the top-of-rack switches. Yikes! Using SDN tools and protocols like routing instances and virtual networks (e.g., VLANs or VXLANs), pods are deployed with specific and layered network membership so that security and isolation are programmatic and innate. Given that access-control policies are expensive in packet processing and unwieldy to manage at scale, SDN layers simplify operations and conserve expensive computing resources.
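To make the policy-web problem concrete, below is a sketch of the pairing that flat networking forces on every namespace: a default-deny rule plus an explicit allow for each permitted flow. The namespace, labels and port are hypothetical; a real cluster accumulates many such objects as workloads multiply.

```yaml
# Default-deny ingress for every pod in the (hypothetical) "shop" namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: shop
spec:
  podSelector: {}     # selects all pods in the namespace
  policyTypes:
  - Ingress           # no ingress rules listed, so all ingress is denied
---
# One explicit allow: frontend pods may reach backend pods on port 8080.
# Every additional permitted flow needs another object like this one.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: shop
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
```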
In the data center, the pitfalls of trying to bolt on security after the fact are well known, often resulting in rigid solutions that restrict platform teams’ ability to bring up services and deliver at the pace today’s demanding business operations require. The results can be frustrating; however, given the sophistication of modern-day threat actors, you ignore security at your own peril.
Therefore, security must be designed into the network from the beginning with a layered approach while maintaining the agility, scale and performance of Kubernetes to realize the benefits of the cloud.
The benefits of a layered approach to virtual network isolation go beyond multi-dimensional security segmentation. When extending routing in and out of the cluster, finer-grained networks and subnets protect workloads and data from incidental exposure and limit the blast radius of an east-west security breach. Finally, in the newfangled world of VMs running in Kubernetes alongside containers using KubeVirt, this model of many virtual networks is required to support standard workloads, and especially the multi-network-interface workloads that expect different networks on different interfaces.
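As one illustration of how multi-network-interface workloads are commonly expressed, the sketch below assumes a cluster running a Multus-style meta-plug-in that implements the NetworkAttachmentDefinition API; the network name, parent interface and CIDR are hypothetical.

```yaml
# A secondary network definition; the embedded CNI config is illustrative.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: vlan100-net
  namespace: shop
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth1",
      "mode": "bridge",
      "ipam": { "type": "host-local", "subnet": "192.168.100.0/24" }
    }
---
# A pod requesting a second interface on that network via annotation.
apiVersion: v1
kind: Pod
metadata:
  name: multi-nic-pod
  namespace: shop
  annotations:
    k8s.v1.cni.cncf.io/networks: vlan100-net
spec:
  containers:
  - name: app
    image: nginx
```

KubeVirt virtual machines can reference the same NetworkAttachmentDefinition objects in their network and interface specs, which is how a VM ends up with distinct networks on distinct interfaces.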
External Load Balancing Into the Cluster
Public access to container applications is a two-stage process. First, a service inside of a Kubernetes cluster must be defined as reachable by the external world. Second, networking must advertise the external IPs of the service and load-balance access requests across those IPs to assure availability and performance.
In the first stage, there are generally two ways to expose a service outside the cluster: the NodePort and LoadBalancer types of Kubernetes services. NodePort is a contrived solution with well-known limitations and inefficiencies. LoadBalancer services are superior, but Kubernetes only defines the API; something beyond a basic CNI, typically a cloud provider integration or an advanced SDN, has to actually implement it.
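For reference, here is a minimal sketch of the second form; the service name, labels and ports are hypothetical. Until something implements the LoadBalancer type, the external IP simply stays pending:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: storefront
  namespace: shop
spec:
  type: LoadBalancer    # requests an externally reachable IP
  selector:
    app: storefront
  ports:
  - port: 443           # external port on the load-balanced IP
    targetPort: 8443    # container port on the backing pods
```

Whatever implements the type, whether a cloud provider’s load balancer or an SDN, allocates the IP and records it in the service’s status.loadBalancer.ingress field; with nothing in place, `kubectl get svc storefront` shows EXTERNAL-IP as `<pending>` indefinitely.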
A service defined as a LoadBalancer type requires both IPAM to allocate external IPs and something to advertise those IPs to the external network, so that routing knows where the underlying pods are located in the cluster. Even Kubernetes L4-L7 ingress services, which front applications and distribute traffic across the available microservice instances, need to be externally reachable. But ingresses sit inside the cluster and, as such, don’t address the networking external to the cluster that is used to reach these higher-order ingress load-balancer instances. What’s left as a workaround?
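As a sketch of why the ingress alone doesn’t solve external reachability (the host, class and backend names are hypothetical): the Ingress object below only describes L7 routing once traffic has already arrived at the ingress controller, and that controller is itself typically exposed through a LoadBalancer service with exactly the IP-allocation and advertisement needs described above.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: storefront-ingress
  namespace: shop
spec:
  ingressClassName: nginx        # hypothetical ingress controller class
  rules:
  - host: shop.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: storefront     # in-cluster service behind the ingress
            port:
              number: 443
```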
In the cloud, network load-balancing services fill the gap, sending traffic to the externally routable IPs allocated by the LoadBalancer type of service, which may front your ingress or other directly exposed applications. If you’re inside your own private data center, there may be solutions available from physical load balancer vendors, but any router or group of routers, probably already acting as your data center network border or gateway, knows how to fan out traffic across a set of next hops for a route. While a basic CNI falls short here, an advanced SDN used as the CNI can automate this process, extending BGP reachability to interwork the external world with the multiple customized virtual networks and subnets described above. A modern cloud-native SDN provides not just efficient connectivity, but also the orchestration and management of it all.
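To show the moving parts of BGP-advertised service IPs without assuming any particular SDN, here is a hedged sketch using the standalone MetalLB project’s resources (MetalLB 0.13+ CRDs; the address pool, ASNs and peer address are hypothetical). An SDN acting as the CNI would automate the same allocation and advertisement as part of its own control plane:

```yaml
# Pool of externally routable IPs for LoadBalancer services (illustrative range).
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: external-pool
  namespace: metallb-system
spec:
  addresses:
  - 203.0.113.0/28
---
# BGP session to the data center border router (hypothetical ASNs and peer).
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: dc-border
  namespace: metallb-system
spec:
  myASN: 64512
  peerASN: 64500
  peerAddress: 192.0.2.1
---
# Advertise the allocated service IPs over that BGP session.
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: advertise-external
  namespace: metallb-system
spec:
  ipAddressPools:
  - external-pool
```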
Multi-Cluster Networking and Management
New business, security and operational requirements lead toward an expansive Kubernetes footprint. There’s a whole world of networking possibilities here depending on the use cases, and there is a good chance that clusters will need to communicate. If they do, your CNI is not making your life as easy as it could be. Even if clusters do not need to communicate at the outset, it’s likely you need to apply common security policies across a diverse set of clusters, and, chances are, the way you manage them could be improved at the very least.
While there are some solutions here for tunnels between clusters (e.g., Submariner), the CNIs available today miss the opportunity to use standards-based routing protocols to facilitate networking federation. Then again, if you assume the flat network model, there may be little need for it because you’ll be managing routing through the underlay or physical network between clusters. That’s probably far less agile than you’d like it to be because it’s not part of your Kubernetes provisioning.
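For inter-cluster connectivity specifically, projects like Submariner implement the Kubernetes Multi-Cluster Services API, so exporting a service across joined clusters is a one-object affair; a sketch is below, with a hypothetical service name and namespace, and assuming the clusters are already connected through Submariner’s tunnels.

```yaml
# Exports the existing "storefront" service so peer clusters can resolve it
# (with Submariner's Lighthouse, as storefront.shop.svc.clusterset.local).
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: storefront
  namespace: shop
```

Note that this rides the tunnel mesh between clusters rather than the standards-based routing federation described above as the missed opportunity.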
Another highly useful and important capability for operational efficiency is the ability to use one SDN as the CNI (and beyond) for many clusters simultaneously, from a single SDN instance anchored in one primary cluster among the set. This obviates the need for network federation, at least among those clusters, and makes networking and security policy even more seamless between them. It isn’t a panacea for all multi-cluster Kubernetes use cases, but it is a good fit for simplifying networking when you have, for example, co-located and dependent applications architected into their own clusters. That is generally the case with 5G telco applications, and it strongly resembles common enterprise Kubernetes multi-cluster regional deployments.
To hear more about cloud-native topics, join the Cloud Native Computing Foundation and the cloud-native community at KubeCon+CloudNativeCon North America 2022 – October 24-28, 2022