Deploying and Running BCC in Your Kubernetes Cluster

In 2021, extended Berkeley Packet Filter (eBPF) is becoming an increasingly popular tool for DevOps professionals and backend engineers alike, and rightly so. With eBPF, you can deliver features quickly by instrumenting directly from the kernel, without modifying the applications you are observing. Fortunately, the kernels in wide use today are recent enough to support eBPF, which makes it easier for engineers to deliver these solutions to the masses.

Tools like the BPF Compiler Collection (BCC) have made it easier to get started with eBPF. BCC is a toolkit that leverages eBPF for kernel tracing. However, it can be a bit tricky to deploy and run BCC in your Kubernetes cluster. In this article, I’ll walk through how to deploy BCC in your Kubernetes clusters and highlight some important lessons and common mistakes to avoid along the way. We will also explore an example repo with a sample BCC application that you can modify. The repo has all the Dockerfiles and Kubernetes deployment files that you’ll need to get up and running.

Getting Started

The first step is to set up the init container (see Dockerfile). Its purpose is to install the necessary Linux headers on your nodes. Since we’re running this as a DaemonSet, it will install the headers on every node in your cluster, including any newly created nodes.

The kubectl-trace team put together a simple Bash script that grabs the current Linux kernel version and installs the Linux headers package matching that version. After testing, we can confirm that this works across major managed Kubernetes services such as GKE, AKS and EKS.
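
As a rough illustration of the idea behind such a script, the sketch below reads the running kernel release and derives a headers package name from it. The `linux-headers-<release>` naming follows the Debian/Ubuntu convention and is an assumption here; the actual script may resolve headers differently, and other node images may name or ship headers in other ways. The helper name `headers_package` is hypothetical.

```python
import platform


def headers_package(kernel_release: str) -> str:
    """Return a Debian/Ubuntu-style headers package name for a kernel release.

    Assumption: the node image uses the `linux-headers-<release>` naming scheme.
    """
    return f"linux-headers-{kernel_release.strip()}"


# platform.release() reports the same string as `uname -r` for the running kernel.
current_release = platform.release()
print(headers_package(current_release))
```

Because containers share the node's kernel, running this inside a container reports the host's kernel release, which is why an init container can work out which headers the node needs.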

Setting Up Necessary BCC Containers

The next step is to set up the BCC container (see Dockerfile), where you’re actually going to run the BCC application. From there, you could run one of BCC’s bundled tools or your own custom BCC application.

The Dockerfile starts by installing the Ubuntu packages needed to get everything up and running. Next, it clones BCC and builds it from source. The last step sets the command that runs the Python script. If you want to extend this, you could implement a multistage build, where you build BCC in one image and then copy only the necessary files over to a smaller image like Alpine.
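
For orientation, a custom BCC script of the kind the container runs might look like the minimal sketch below. It is not the sample application from the repo; the traced syscall (`clone`) and the names `BPF_PROGRAM`, `trace_clone` and `run` are chosen only for illustration.

```python
# Minimal BCC tracer: prints a line each time a process calls clone().
BPF_PROGRAM = r"""
int trace_clone(void *ctx) {
    bpf_trace_printk("clone() called\n");
    return 0;
}
"""


def run() -> None:
    # Imported lazily so this module still loads where BCC is not installed.
    from bcc import BPF

    # Compiling the C program is the step that needs the kernel headers.
    bpf = BPF(text=BPF_PROGRAM)
    # Resolve the architecture-specific kernel symbol for the syscall.
    bpf.attach_kprobe(event=bpf.get_syscall_fnname("clone"), fn_name="trace_clone")
    # Stream lines from the kernel trace pipe until interrupted.
    bpf.trace_print()
```

Calling `run()` needs root privileges, an installed BCC and matching kernel headers, which is what the two containers described in this article are meant to provide.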

Deploying on Kubernetes

The next step is setting up the Kubernetes deployment file (see file here). We’re going to run this as a DaemonSet so that it automatically instruments all nodes in your cluster.

The DaemonSet will consist of two containers: the init container, which installs the Linux headers, and the BCC container, which runs one of BCC’s tools or your own custom tool.
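
The overall shape of such a DaemonSet can be sketched as a Python dictionary and printed as JSON. This is an illustrative skeleton, not the deployment file from the repo: the image names, the `bcc-example` name and label, and the `bcc_daemonset` helper are all hypothetical.

```python
import json


def bcc_daemonset(init_image: str, bcc_image: str) -> dict:
    """Build a minimal DaemonSet manifest with one init container and one main container."""
    labels = {"app": "bcc-example"}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "bcc-example", "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Runs to completion first and installs the Linux headers.
                    "initContainers": [{"name": "install-headers", "image": init_image}],
                    # Starts afterwards and runs the BCC tool.
                    "containers": [{"name": "bcc", "image": bcc_image}],
                },
            },
        },
    }


manifest = bcc_daemonset("example.com/headers-init:latest", "example.com/bcc-app:latest")
print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f` accepts JSON as well as YAML, the printed output could be saved to a file and applied directly once real image names are filled in.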

The next step is to set the BCC container to run as privileged. If you are using a newer kernel version, you might be able to grant just the BPF capability (CAP_BPF) instead, but I haven’t tested this yet.
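
The two options can be sketched as container `securityContext` fragments. The privileged variant is the one described above. The capability-based variant is an untested assumption: `CAP_BPF` was introduced in Linux 5.8, and tracing workloads may need further capabilities (possibly `PERFMON`) as well as a container runtime that recognizes them. The helper name `security_context` is hypothetical.

```python
def security_context(privileged: bool = True) -> dict:
    """Return a container securityContext fragment for running BCC."""
    if privileged:
        # The approach used in this article: full privileges on the node.
        return {"privileged": True}
    # Untested alternative: grant only the BPF capability.
    # Kubernetes expects capability names without the CAP_ prefix.
    return {"privileged": False, "capabilities": {"add": ["BPF"]}}


print(security_context())
print(security_context(privileged=False))
```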

Next, make sure you set the appropriate resource requests for the containers.

It’s especially important to set appropriate ephemeral storage requests. Installing all of the necessary packages and the Linux headers will eat up a lot of disk space. I learned this the hard way when running it on a test cluster: the pod ate up most of the disk space on the node, leading to pod evictions and other cascading failures. You are also going to want to set the necessary CPU and memory requests. See here for a more in-depth guide on how to properly set resource requirements.
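
A container's resource requests, including ephemeral storage, can be sketched as follows. The quantities are placeholders picked for illustration, not measured values or recommendations, and `resource_requests` is a hypothetical helper.

```python
def resource_requests(cpu: str, memory: str, ephemeral_storage: str) -> dict:
    """Return a container `resources` fragment with CPU, memory and disk requests."""
    return {
        "requests": {
            "cpu": cpu,
            "memory": memory,
            # Scratch disk for installed packages and kernel headers.
            "ephemeral-storage": ephemeral_storage,
        }
    }


# Placeholder quantities; size these from what you observe on your own nodes.
resources = resource_requests(cpu="100m", memory="256Mi", ephemeral_storage="2Gi")
print(resources)
```

With an ephemeral storage request in place, the scheduler only places the pod on nodes that have that much local disk available.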

Finally, you are going to need to mount the proper volumes onto your containers. You will need to set up host path volumes for the modules directory, the Linux headers and the release file. You can see all of the volume mounts at the bottom of the deployment file here.
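
As a sketch of what those volume definitions could look like, the snippet below builds matching `volumes` and `volumeMounts` lists. The host paths are assumptions based on common Linux conventions (`/lib/modules` for modules, `/usr/src` for headers, `/etc/os-release` as the release file); the deployment file in the repo is the authority on the actual paths. The names `HOST_PATHS` and `host_path_volumes` are hypothetical.

```python
# Assumed host paths; check the deployment file for the real ones.
HOST_PATHS = {
    "lib-modules": "/lib/modules",
    "usr-src": "/usr/src",
    "os-release": "/etc/os-release",
}


def host_path_volumes(paths: dict) -> tuple:
    """Return (volumes, volumeMounts) that expose each host path at the same path in the container."""
    volumes = [{"name": name, "hostPath": {"path": path}} for name, path in paths.items()]
    mounts = [{"name": name, "mountPath": path} for name, path in paths.items()]
    return volumes, mounts


volumes, mounts = host_path_volumes(HOST_PATHS)
print(volumes)
print(mounts)
```

The `volumes` list belongs in the pod spec, and the `mounts` list goes under `volumeMounts` on each container that needs the files.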

I hope this article was helpful. We’re looking forward to KubeCon and we hope to see you there!


Matt Lenhard

Matt Lenhard is the Co-founder & CTO of ContainIQ. Matt is an experienced technology founder having founded multiple tech startups, twice with Nate Matherson. In his previous roles, Matt built a number of internal tools and software to help internal teams improve productivity and optimize resources. Matt is a full-stack developer with extensive experience in Kubernetes. Outside of work, Matt is an angel investor focusing primarily on early stage software companies.
