big data Archives

Best of 2023: Using Airflow to Run Spark on Kubernetes

Here's how to set up a Spark application on Kubernetes using the Airflow scheduler for data-intensive workloads ...

Chandan Pandey | January 2, 2024 | Airflow, Apache Spark, big data, container orchestration, data processing, kubernetes, Spark

Making the Most of Kubernetes for Big Data

Kubernetes (and not Hadoop) has emerged as the most logical containerization system for enterprise-grade big data ...

Ash Munshi | October 10, 2023 | big data, cloud native application development, data analytics, KubeCon, kubernetes, Pepperdata

Best of 2022: Big Data on Kubernetes: The End For Hadoop?

As we close out 2022, we at Container Journal wanted to highlight the most popular articles of the year. Following is the latest in our series of the Best of 2022. When ...

Jessica Day | December 27, 2022 | apache, big data, hadoop, kubernetes

Cloudera’s Data Management Platform Comes to OpenShift

Cloudera today announced that it plans to make an instance of its data management platform based on Hadoop generally available this summer on Red Hat OpenShift, which is based on Kubernetes. Arun ...

Mike Vizard | June 11, 2020 | big data, cloud services, data management, data warehouse, hadoop, kubernetes, microservices, on-premises infrastructure

Actifio Simplifies Copying Data Using Containers

Actifio today made available an update to its namesake copy management software to make it easier to clone databases consisting of terabytes of data using containers. Chandra Reddy, senior vice president of ...

Mike Vizard | December 17, 2019 | Actifio, big data, containers, data, production database, storage

Google Aims to Wed Apache Spark to Kubernetes

Google today announced it will extend Google Cloud Dataproc, a managed service for accessing the Apache Spark in-memory computing framework, to Kubernetes. James Malone, a Google senior product manager, says Google will ...

Mike Vizard | September 10, 2019 | Apache Spark, big data, google, Google Cloud, in-memory computing, kubernetes

MapR Embraces Kubernetes for Big Data

MapR Technologies announced it has integrated its distribution of the open source Apache Spark framework and the Drill query engine platform with Kubernetes. Suzy Visvanathan, senior director for product management at MapR, ...

Mike Vizard | April 2, 2019 | Apache Spark, big data, big data frameworks, kubernetes, MapR, MapR Drill

MapR Adds Support for Container Storage Interface

MapR Technologies this week announced it has added support for the Container Storage Interface (CSI) to its platform for running big data analytics applications. Suzy Visvanathan, director of product management for MapR, ...

Mike Vizard | February 8, 2019 | big data, container storage interface, containers, CSI, kubernetes, storage

Paxata Employs Kubernetes to Extend Data Prep Tool

Paxata has extended the reach of its data preparation software for big data into the realm of the cloud by employing Kubernetes clusters to create a runtime to process jobs in batch ...

Mike Vizard | August 15, 2018 | big data, kubernetes, Paxata

Pepperdata Project Hosts HDFS on Kubernetes

Pepperdata has launched a project aiming to enable the Apache Spark in-memory computing framework for big data analytics applications. Pepperdata CTO Sean Suchter says the Hadoop Distributed File System (HDFS) on Kubernetes open-source ...