NVIDIA Makes Microservices Framework for AI Apps Generally Available
NVIDIA today made generally available a microservices framework based on containers that promises to make building complex artificial intelligence (AI) applications simpler.
Joey Conway, NVIDIA senior director for generative AI software for the enterprise, said the NVIDIA NeMo microservices framework will make it simpler for developers to, for example, create the data flywheels that will drive workflows spanning multiple AI agents.
NVIDIA had previously made the NVIDIA NeMo framework available under an early access program. As the pace at which organizations incorporate AI agents into applications accelerates, developers need a framework that makes it simpler to collect data in a way that improves the accuracy of AI agent output by enabling continuous training, said Conway.
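The continuous-training loop Conway describes is often called a data flywheel: production interactions are logged, curated by quality, and fed back into fine-tuning. The sketch below illustrates that loop in miniature; all names (Flywheel, log_interaction, curate) are hypothetical stand-ins, not NeMo APIs.

```python
# A minimal data-flywheel sketch: capture production exchanges with a
# quality signal, then keep only well-rated ones as training examples.
# Names and rating scheme are illustrative, not part of any NeMo API.
from dataclasses import dataclass, field


@dataclass
class Flywheel:
    raw_log: list = field(default_factory=list)
    curated: list = field(default_factory=list)

    def log_interaction(self, prompt: str, response: str, rating: int) -> None:
        # Every production exchange is captured with a user rating.
        self.raw_log.append({"prompt": prompt, "response": response, "rating": rating})

    def curate(self, min_rating: int = 4) -> None:
        # Only highly rated exchanges become training data, which is what
        # keeps continuous training from amplifying noise.
        self.curated = [x for x in self.raw_log if x["rating"] >= min_rating]

    def training_batch(self) -> list:
        return [(x["prompt"], x["response"]) for x in self.curated]


fw = Flywheel()
fw.log_interaction("reset my password", "Use the account portal.", 5)
fw.log_interaction("weather today?", "I cannot help with that.", 1)
fw.curate()
print(len(fw.training_batch()))  # only the well-rated exchange survives
```

Filtering before retraining is the step that distinguishes a flywheel from simply dumping logs back into the model.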
That capability will reduce the drift that might otherwise be introduced as AI models are haphazardly exposed to additional uncurated data sets, he noted.
The NVIDIA NeMo framework will also make it simpler to expose AI models to new data sources, he noted. For example, an AI agent trained to respond to customer questions by querying a PostgreSQL database could be retrained using a dataset that now resides in a MongoDB database.
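One common way to make that kind of retargeting painless is to put the retrieval layer behind a common interface, so the retraining pipeline never depends on a specific store. The sketch below uses in-memory stubs in place of real PostgreSQL and MongoDB clients; every name here is illustrative.

```python
# Illustrative sketch: swap an agent's backing store without touching the
# retraining code. In-memory stubs stand in for real database clients.
from typing import Protocol


class DataSource(Protocol):
    def fetch_answers(self, topic: str) -> list: ...


class PostgresSource:
    def __init__(self, rows):  # rows stands in for SQL query results
        self.rows = rows

    def fetch_answers(self, topic):
        return [r["text"] for r in self.rows if r["topic"] == topic]


class MongoSource:
    def __init__(self, docs):  # docs stands in for a Mongo collection
        self.docs = docs

    def fetch_answers(self, topic):
        return [d["text"] for d in self.docs if topic in d["tags"]]


def build_retraining_set(source: DataSource, topics: list) -> list:
    # The pipeline only sees the interface, so moving the data from
    # PostgreSQL to MongoDB does not change the agent-side code.
    out = []
    for t in topics:
        out.extend(source.fetch_answers(t))
    return out


pg = PostgresSource([{"topic": "billing", "text": "Invoices post monthly."}])
mongo = MongoSource([{"tags": ["billing"], "text": "Invoices now post weekly."}])
print(build_retraining_set(pg, ["billing"]))
print(build_retraining_set(mongo, ["billing"]))
```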
Finally, frameworks such as NVIDIA NeMo will play a crucial role in keeping infrastructure costs under control, said Conway. When an AI agent evaluates actions, it queries multiple sources and validates outputs; each interaction can require 5x to 10x more compute than a standard model inference. The NVIDIA NeMo framework makes it simpler to instead employ smaller models that improve accuracy and reduce latency in a way that also reduces the total cost of ownership (TCO), he noted.
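The cost argument boils down to routing: send easy queries to a cheap small model and reserve the large one for hard cases. The sketch below uses a deliberately naive length heuristic and made-up relative costs to show the arithmetic; real routers use trained classifiers, and the figures are not NVIDIA's.

```python
# Hedged sketch of cost-aware model routing. Costs are illustrative
# relative compute units, not measured figures.
SMALL_COST, LARGE_COST = 1, 8


def small_model(query):
    return "short answer"


def large_model(query):
    return "detailed answer"


def route(query):
    # Trivial difficulty heuristic for illustration only: short queries
    # go to the small model, everything else to the large one.
    if len(query.split()) <= 8:
        return small_model(query), SMALL_COST
    return large_model(query), LARGE_COST


queries = [
    "reset password",
    "explain the discrepancy between the Q3 invoices and the ledger entries posted in October",
]
routed = sum(route(q)[1] for q in queries)
all_large = LARGE_COST * len(queries)
print(f"routed cost {routed} vs all-large {all_large}")
```

Even with this crude heuristic, half the traffic avoids the expensive model, which is the TCO lever Conway is pointing at.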
Agentic AI applications, almost by definition, are going to be based on microservices-based architectures that enable AI agents to leverage application programming interfaces (APIs) to communicate with each other and the applications they need to invoke.
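What that architecture buys you is loose coupling: one agent invokes another by endpoint and JSON payload, never by direct reference. The minimal in-process sketch below mimics that boundary with a dispatch table standing in for a real HTTP service mesh; the agents and endpoints are invented for illustration.

```python
# Minimal sketch of agents coupled only through a JSON "API" boundary,
# mirroring how microservice-based agents call one another over HTTP.
import json


def billing_agent(payload):
    # Hypothetical downstream agent exposing a lookup endpoint.
    return {"status": "ok", "balance_due": 42.0}


ENDPOINTS = {"/billing/lookup": billing_agent}


def call(endpoint, payload):
    # Round-trip through JSON both ways, as a network hop would.
    request = json.loads(json.dumps(payload))
    return json.loads(json.dumps(ENDPOINTS[endpoint](request)))


def support_agent(payload, call):
    # The support agent knows only the endpoint name and payload shape,
    # never the billing agent's implementation.
    bill = call("/billing/lookup", {"customer": payload["customer"]})
    return {"reply": f"Your balance is ${bill['balance_due']:.2f}"}


print(support_agent({"customer": "acme"}, call)["reply"])
```

Because each agent sees only serialized payloads, any one of them can be reimplemented or rescaled independently, which is the core appeal of microservices for agentic workloads.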
It’s not clear at what pace organizations are building and deploying agentic AI applications, but nearly every organization that builds software is at the very least experimenting with them. The challenge is not so much building AI agents as continuously optimizing them to run in production environments at scale. Creating the workflows needed to achieve that goal represents a new domain for software development, said Conway.
Of course, as is the case with any fundamental change to the way software is engineered, the need for application development teams to acquire new skills is pressing. IT leaders are under intense pressure to realize the operational benefit of AI as quickly as possible. Unfortunately, achieving those goals is often more difficult than the average business leader might initially appreciate.
In the meantime, software engineering teams would be well-advised to consider not just what is required to build one AI application, but what it will take to build hundreds that may one day need to interoperate in ways that cannot be easily foreseen at the time they are constructed and deployed.