Using Containers to Enable AI at the Edge
When the COVID-19 crisis struck, medical professionals quickly realized that this unprecedented situation required them to think differently and use cutting-edge technology; they soon recognized the value of applying artificial intelligence (AI) to aid in disease detection and patient categorization. In response, researchers at the University of Waterloo worked with a growing AI vendor to create a neural network solution called COVID-Net.
The system applies AI to chest x-rays to help medical professionals better diagnose and treat COVID-19. Collaboration between Boston Children’s Hospital and the open source community took the solution further, enhancing it with a graphical user interface enabled by a Kubernetes platform that supports deployment across hybrid and multi-cloud infrastructure.
This innovation exemplifies two converging trends: proliferating AI use cases at the edge of the network and increasingly heterogeneous infrastructures that include hybrid cloud platforms powered by containers and Kubernetes.
What do those trends mean for your organization? Do you have the right infrastructure to support AI at the edge? Will a hybrid environment of bare metal servers, virtual machines (VMs) and containers complicate your AI objectives?
The good news is that you should be able to support edge-based AI even as your use of containers expands. But you need to understand the implications–and implement best practices that optimize AI at the edge.
Balancing Core and Edge
AI–in most cases, in the form of machine learning (ML)–is increasingly being deployed in real-world applications across industries. Energy companies use AI in oil and gas exploration. Financial services firms rely on AI to assess risk and detect fraud. Retailers apply AI to analyze sales and make real-time product recommendations.
Training AI algorithms requires robust computing power. But once an AI model is deployed, it often needs to operate at hundreds or even thousands of edge locations–in bank branches, say, or on oil pipelines–where processing power and network bandwidth are highly limited.
A typical ML lifecycle involves five stages that span core, cloud and edge:
1. Gather and prepare the data that will inform the ML models.
2. Apply the data to develop and train the ML models.
3. Integrate the ML models into application development processes.
4. Deploy the ML-powered applications to analyze data and make predictions or recommendations.
5. Monitor and manage the ML models to ensure their ongoing accuracy.
All those stages can take place in a data center or in the cloud. But stages one, four and five increasingly occur at the edge.
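To make the hand-off between those stages concrete, here is a minimal Python sketch of a model trained centrally (stage two) and then loaded at an edge location for scoring (stage four). The scikit-learn stack, the synthetic dataset and the file paths are assumptions for illustration only.

```python
# Minimal sketch: train centrally (stage 2), deploy and score at the edge (stage 4).
# scikit-learn, joblib, the synthetic data and the file name are illustrative assumptions.
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
import joblib

# --- Core / cloud: gather data and train the model ---
X, y = make_classification(n_samples=1_000, n_features=8, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)

# Serialize the trained model so it can be packaged into a container image
joblib.dump(model, "model.joblib")

# --- Edge: load the model and score live data points ---
edge_model = joblib.load("model.joblib")
live_sample = X[:1]                      # stand-in for a reading captured at the edge
print(edge_model.predict(live_sample))   # output used locally, logged for monitoring (stage 5)
```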
Your edge might run on lightweight, off-the-shelf servers, but for edge-based AI, the application could just as easily run in a container. Containers allow you to quickly spin up a microservice, run the ML algorithm and then spin back down. In fact, this approach is a hallmark of Knative, an open source project that extends Kubernetes with components for deploying, running and managing serverless applications.
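As a rough illustration of the kind of workload Knative can scale up and down on demand, here is a minimal, containerizable inference microservice. Flask, the model file and the request format are assumptions made for this sketch, not requirements of Knative itself.

```python
# Minimal containerizable inference service (illustrative; Flask and the model file are assumptions).
# Packaged as a container image and deployed as a Knative Service, it can scale to zero when idle.
import os
from flask import Flask, jsonify, request
import joblib

app = Flask(__name__)
model = joblib.load("model.joblib")  # model produced by the central training stage

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]      # e.g. {"features": [[0.1, 0.2, ...]]}
    prediction = model.predict(features).tolist()
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    # Knative tells the container which port to listen on via the PORT environment variable.
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```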
Event-Driven Architectures
Some edge-based AI deployments, however, need to continuously capture data, perform analysis and output predictions or recommendations. That means the AI application is stateful. It needs to be linked to your centralized applications so they can use that output to make decisions or take actions in real time. This is where your core and edge architectures converge.
In the past, enterprises would batch their data and transmit it to a centralized data warehouse. If a retail store wanted to conduct sales analysis, say, it would send the data to company headquarters for processing. Today, that store needs to do real-time processing at the edge and make decisions on the fly. For example, it might identify customers based on their frequent-buyer app, combine purchase history with external data such as seasonal events and deliver personalized offers in real time.
An event-driven architecture (EDA) can help. EDA is a way to design applications and services to respond to real-time information based on the sending and receiving of data about individual events. Because communication between event producers and event consumers is asynchronous and non-blocking, resources are not tied up waiting for a response to return. And because EDA supports a wider range of communication patterns, multiple consumers can receive the same events, with lower latency and higher throughput.
Apache Kafka, an open source distributed event streaming platform, takes a similar approach. It is designed to handle real-time data feeds with high throughput and low latency, so applications can process records as they occur.
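As a rough sketch of that pattern, the snippet below uses the kafka-python client (an assumed choice, along with the broker address, topic name and payload) to publish prediction events from an edge application and consume them centrally.

```python
# Event-driven sketch with Apache Kafka via the kafka-python client.
# Broker address, topic name and event payload are illustrative assumptions.
import json
from kafka import KafkaProducer, KafkaConsumer

# --- Edge: publish each prediction as an event, without blocking on a response ---
producer = KafkaProducer(
    bootstrap_servers="broker.example.com:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("edge-predictions", {"store": "midtown-042", "offer": "seasonal-promo", "score": 0.91})
producer.flush()

# --- Core: any number of consumers can subscribe to the same event stream ---
consumer = KafkaConsumer(
    "edge-predictions",
    bootstrap_servers="broker.example.com:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)
for event in consumer:
    print(event.value)  # e.g. feed into downstream decisioning or model monitoring
```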
Optimizing Edge-Based AI With Kubernetes
Maybe you’ve enabled your AI deployments to efficiently run at the edge and transmit data from edge to core. That still leaves the issue of making sure your AI applications continue to function as your core architecture evolves. And that’s where Kubernetes comes in.
On-premises data centers will never go away, even if they decrease in importance. The public cloud will grow in significance, though your workloads might move from one cloud provider to another or between the public cloud and on-premises infrastructure. Meanwhile, more of your development and production environments will be deployed as containers. It’s a hybrid, heterogeneous world.
Kubernetes gives you a means of abstracting away the differences among these platforms. It provides a substrate of consistency for your AI applications at the core or at the edge. You can leverage open source tools to build data pipelines that run on top of Kubernetes. And you can move smoothly from bare metal servers to cloud platforms.
Although training AI algorithms requires the compute power of the core, running live data points through an algorithm to calculate outputs is a fairly lightweight process that can take place at the edge. Kubernetes microservices are likewise well-suited to the edge, because they’re lightweight and can be dynamically started up and shut down as needed.
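One lightweight way to run that scoring step inside an edge microservice is to export the trained model to a portable format and evaluate live data points with a small inference runtime. The sketch below assumes an ONNX export ("model.onnx") and an eight-feature input; both are placeholders for illustration.

```python
# Lightweight edge scoring sketch with ONNX Runtime.
# "model.onnx" and the eight-feature input shape are illustrative placeholders.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")          # load the exported model once at startup
input_name = session.get_inputs()[0].name

live_sample = np.random.rand(1, 8).astype(np.float32)  # stand-in for a reading captured at the edge
outputs = session.run(None, {input_name: live_sample})
print(outputs[0])                                       # prediction consumed locally or published as an event
```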
Not every AI deployment will be lifesaving like COVID-Net, of course. But many of your AI initiatives will be high-stakes for your business–enabling safe and efficient supply chain routing, providing automated cybersecurity protections, saving significant operational costs or optimizing customer experiences. By understanding how AI fits into your increasingly containerized IT landscape, you can help make sure you get the most from your investments in AI.