Software Dependencies and Containerized Applications

September 21, 2022 | Gilad David Mayaan | cloud-native apps, containerized applications, dependencies, dependency mapping

by Gilad David Mayaan

Modern software projects (cloud-native applications built using microservices and containers and running on platforms like Kubernetes) use code written by other developers, whether third-party vendors or open source contributors. This reuse can accelerate development and let developers focus on creating unique functionality, but it can also make a codebase opaque to its own developers. Dependencies can have quality issues, security vulnerabilities, or performance problems, and those problems carry over to the software project that uses them. The issue is exacerbated in cloud-native architectures.

What is Application Dependency Mapping?

Application dependency mapping (ADM) is the process of identifying all the elements of a software project and understanding how they work together. It gives software teams a view of the health of their environment, insights about application performance, and critical information for managing software operations. When a problem occurs, information about software dependencies can help quickly identify the root cause.

Most modern applications are built from hundreds or even thousands of components. If there are performance issues, a dependency map of the application can tell you where the bottlenecks are, which resources might be overloaded, and how to troubleshoot them. If an application is malfunctioning, a dependency map can help you understand which dependency is responsible for the problem and how it can be fixed or replaced.

Automated dependency mapping tools can help prepare a dependency map and update it when there are changes to the application. This can help discover the full structure of an application, including the contextual network topology.
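
To make the idea concrete, here is a minimal sketch of a dependency map held as a plain Python dictionary, with two helpers: one lists everything a component depends on, and the other lists every component affected when a given dependency fails. The component names and the functions transitive_dependencies and dependents_of are hypothetical, invented for illustration. A real mapping tool would discover this structure automatically.

```python
from collections import deque

# Hypothetical dependency map: each component lists what it calls directly.
DEPENDENCY_MAP = {
    "web-frontend": ["orders-api", "auth-service"],
    "orders-api": ["orders-db", "payment-gateway"],
    "auth-service": ["user-db"],
    "payment-gateway": [],
    "orders-db": [],
    "user-db": [],
}


def transitive_dependencies(graph, component):
    """Return every component reachable from `component`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque(graph.get(component, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited through another path
        seen.add(current)
        order.append(current)
        queue.extend(graph.get(current, []))
    return order


def dependents_of(graph, target):
    """Return the components that directly or indirectly depend on `target`."""
    return sorted(
        name for name in graph
        if target in transitive_dependencies(graph, name)
    )


if __name__ == "__main__":
    # Which components are at risk if the orders database is slow or down?
    print(dependents_of(DEPENDENCY_MAP, "orders-db"))
```

When a problem occurs, a lookup like dependents_of narrows the search to the components that can actually be affected by the failing dependency.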

Software Dependencies in Containerized Applications

The first step required to port an existing application to a container-based environment is to identify the dependencies that need to be packaged together in a container image. For small-scale applications developed in-house, this is usually straightforward, because all relevant information is known by internal teams.

However, for larger-scale applications or those developed by third parties, it can be more complex to identify dependencies. To make matters worse, container base images are often optimized to include only the bare essentials, omitting components that exist in a standard installation of the same operating system. This may result in gaps in documentation, because some dependencies exist on developer machines but will not exist in a production environment with a stripped-down container operating system.

Analyzing dependencies by comparing the original host to the container environment

In the past, a natural approach to identifying application dependencies was to perform the analysis directly on the host operating system. Once the dependencies were identified, developers would verify that they also existed in the target container environment. This made it possible to use familiar tools and workflows, including graphical tools in Windows.

The problem with this approach is that comparing host and container environments can be very labor-intensive, and it is easy to overlook small system configuration details that might be critical for running the application.
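
The comparison step can be sketched as a diff between two inventories. The example below assumes each environment has already been summarized as a dictionary mapping package name to version. How those inventories are collected is out of scope here, and the package names, versions, and the function missing_in_container are made up for illustration.

```python
def missing_in_container(host_packages, container_packages):
    """Report host packages that are absent from, or differ in version in, the container."""
    report = {}
    for name, host_version in sorted(host_packages.items()):
        container_version = container_packages.get(name)
        if container_version is None:
            report[name] = ("missing", host_version, None)
        elif container_version != host_version:
            report[name] = ("version mismatch", host_version, container_version)
    return report


# Hypothetical inventories, for illustration only.
host = {"openssl": "1.1.1", "libpq": "14.5", "tzdata": "2022c"}
container = {"openssl": "3.0.2", "libpq": "14.5"}

if __name__ == "__main__":
    for package, finding in missing_in_container(host, container).items():
        print(package, finding)
```

Even with a helper like this, the diff only covers what the inventory captures. Configuration files, environment variables, and registry settings still have to be checked by other means, which is where the manual effort accumulates.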

Analyzing dependencies directly in the container environment

Frustration with the comparison approach led to the development of tools that perform the analysis directly in the container environment. This is now considered the most efficient way to identify the dependencies an application requires inside a container.

Container-based analysis involves running a tool that causes the container operating system or application to print an error message saying that a required runtime library or configuration data is missing. Instead of obtaining a complete list of dependencies and comparing it with the container environment, you can simply run the application in the container, then identify and resolve missing dependencies as they surface. Because Dockerfiles are self-documenting, each dependency can be recorded as it is identified.
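
As a rough illustration of this loop on Linux, the sketch below scans captured output for two message formats that commonly signal a missing shared library: the dynamic loader's "error while loading shared libraries" message and the "=> not found" lines printed by ldd. The function find_missing_libraries and the sample text are hypothetical, and the patterns are assumptions that may need adjusting for other platforms or tools.

```python
import re

# Assumed message formats (Linux dynamic loader and ldd); adjust for your platform.
LOADER_ERROR = re.compile(
    r"error while loading shared libraries: (\S+): cannot open shared object file"
)
LDD_NOT_FOUND = re.compile(r"^\s*(\S+) => not found", re.MULTILINE)


def find_missing_libraries(output):
    """Extract the names of missing shared libraries from captured tool output."""
    names = LOADER_ERROR.findall(output) + LDD_NOT_FOUND.findall(output)
    # Drop duplicates while preserving first-seen order.
    return list(dict.fromkeys(names))


# Hypothetical captured output, for illustration only.
sample_output = """\
./app: error while loading shared libraries: libssl.so.1.1: cannot open shared object file: No such file or directory
\tlibssl.so.1.1 => not found
\tlibpq.so.5 => not found
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000000000)
"""

if __name__ == "__main__":
    for library in find_missing_libraries(sample_output):
        print("missing:", library)
```

Each library found this way can then be added to the Dockerfile as an explicit install step, so the image definition doubles as the dependency record.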

The container-based approach also has limitations. It relies on the availability of appropriate analysis tools that can function in a container environment. This is usually not an issue with Linux containers because they can run almost any workload, including graphical analysis tools. However, Windows containers cannot display GUI elements, so many common analysis tools cannot be used within them. In that case, the analysis must be performed with command-line tools.

Understanding and Monitoring Dependencies in Cloud Applications

Cloud Operations (also known as CloudOps) involves managing the delivery, performance, and optimization of IT services and workloads running in cloud environments. It defines best practices for operational processes in the cloud, extending the DevOps philosophy to cloud applications.

Most modern applications have dependencies that are accessed through APIs. Understanding the dependencies that affect performance requires visibility into those APIs, including observing object domains, viewing connection logs, evaluating embedded services, and ensuring the APIs are secure.

The next step after deciding what components require monitoring is to collect data. The monitoring tools should:

Log failed API requests.
Track error patterns over time.
Continuously monitor infrastructure and API servers.
Apply regular, automated tests.

These practices will help establish a performance baseline. Adding a correlation and analytics engine can then help explain the relationships between different events.
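
The first two practices, logging failed API requests and tracking error patterns over time, can be sketched with a small in-memory tracker. The class ApiErrorTracker, its method names, and the rule that any status code of 400 or above counts as a failure are all assumptions made for this illustration. A production setup would normally rely on a dedicated monitoring tool.

```python
from collections import Counter


class ApiErrorTracker:
    """Record API request outcomes and count failures per fixed time window."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.totals = Counter()  # window start -> number of requests
        self.failures = {}       # window start -> Counter of (endpoint, status)

    def record(self, timestamp, endpoint, status_code):
        """Log one request; status codes >= 400 are treated as failures (an assumption)."""
        window = int(timestamp // self.window_seconds) * self.window_seconds
        self.totals[window] += 1
        if status_code >= 400:
            self.failures.setdefault(window, Counter())[(endpoint, status_code)] += 1

    def error_rate(self, window):
        """Return the share of failed requests in the window that starts at `window`."""
        total = self.totals[window]
        if total == 0:
            return 0.0
        return sum(self.failures.get(window, Counter()).values()) / total


if __name__ == "__main__":
    tracker = ApiErrorTracker(window_seconds=60)
    tracker.record(0, "/orders", 200)
    tracker.record(10, "/orders", 500)
    tracker.record(30, "/pay", 503)
    tracker.record(65, "/orders", 200)
    for start in sorted(tracker.totals):
        print(start, round(tracker.error_rate(start), 2))
```

Keeping the failures keyed by endpoint and status code makes it possible to see which dependency an error pattern points to, not only that the overall error rate went up.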

Cloud applications rely on various underlying services and APIs, so monitoring their performance and availability is essential to keep an application functioning. The following steps can help build optimized cloud apps:

Infrastructure mapping—Including cloud infrastructure and APIs.
Dependency testing—Use logs and monitoring to keep track of critical service and operational dependencies.
Performance validation—Use event correlation, baselines and alerts to evaluate performance.
Performance optimization—Choose the right architectures and vendors for long-term performance based on monitoring data.
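
The performance validation step can be sketched as a comparison of new measurements against a baseline built from historical samples. This is only one simple way to do it, assuming latency samples in milliseconds and an alert threshold of three standard deviations above the mean. The function names and the sample values are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical latency samples (ms) as mean and sample standard deviation."""
    # statistics.stdev needs at least two samples.
    return {"mean": mean(samples), "stdev": stdev(samples)}


def exceeds_baseline(baseline, value, tolerance=3.0):
    """Return True when `value` is more than `tolerance` standard deviations above the mean."""
    return value > baseline["mean"] + tolerance * baseline["stdev"]


if __name__ == "__main__":
    # Hypothetical historical latencies for one API dependency.
    history = [100, 110, 90, 105, 95]
    baseline = build_baseline(history)
    for latency in (120, 130):
        status = "ALERT" if exceeds_baseline(baseline, latency) else "ok"
        print(latency, status)
```

A fixed multiple of the standard deviation is easy to reason about but sensitive to outliers in the history, so the tolerance usually needs tuning per dependency.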

Conclusion

In this article, I explained the basics of software dependencies and how they affect two major elements of cloud-native environments:

Containerized applications—When porting legacy applications to containers, you must be aware of dependencies and the interaction between them. Automating dependency mapping can save major trouble in production.
Cloud applications—In a cloud environment, most dependencies revolve around APIs. Ensure you fully understand what APIs your applications interact with and that they are resilient to failures, rate limitation or other circumstances affecting remote APIs.

I hope this will help you gain better visibility into your cloud-native environment and architect more resilient and robust cloud-native applications.