Pentaho Simplifies Containerized Big Data Analytics with New Tools

Pentaho, the open source big data company, is capitalizing on containers by building a new deployment tool for launching Docker-based big data servers.

Pentaho, which is owned by Hitachi, develops a platform based on open source tools, including Apache Hadoop, for big data analytics and integration.

On June 20, the company announced at DockerCon that it has released a new set of open source tools for setting up the Pentaho data analytics platform on containerized infrastructure. The tools are available on GitHub.

The company says it developed the utilities to simplify container-based deployment of its platform. It sees this innovation as key to making the containerized version of the platform feasible for organizations to set up at scale.

“As the complexity of application deployments increase, developers must work rapidly to add new software components and expand their software footprint,” Pentaho said. “Due to this, many developers are shifting away from traditional installation methods for more automated and scriptable approaches.”

The company added that its open source deployment utilities “complement what developers are already doing by providing a way to unlock the power of analytics within containers to reduce testing and development time, automate the setup and configuration of a given environment and have more control over their big data deployments.”

Pentaho’s DockerCon announcement was not exactly radical. The company’s big data platform was already available in containerized form before this announcement. The new tools just make it easier to deploy.

That said, the news is interesting as another sign of how vendors are using Docker containers to distribute their software for production deployment.

It’s especially notable to see Pentaho doing this, given that the big data ecosystem has so far seen little convergence with containerized infrastructure and the DevOps scene. The company clearly believes it’s time to containerize big data.

Christopher Tozzi

Christopher Tozzi has covered technology and business news for nearly a decade, specializing in open source, containers, big data, networking and security. He is currently Senior Editor and DevOps Analyst with and
