The Next Revolution: From On-Premises Databases to DBaaS

We used to take it for granted that databases run on hardware we buy and manage ourselves. That assumption is changing profoundly, and it will keep changing. The next generation of database engineers will probably be less familiar with hardware such as CPUs and hard disks and more reliant on cloud services such as Amazon S3.

This change is driven by a new kind of database solution: database as a service (DBaaS), in which a service provider installs, configures and maintains a database in the cloud. Companies that want to use these services subscribe to them. They don’t run a database on-premises, and they don’t have to worry about major hardware and software investments or about scaling their solution. They can increase or decrease the services they use based on their business needs, and the service provider handles the details. Best of all, the company’s staff can focus on higher-priority tasks than maintaining the database.

In this article, I’ll discuss why adopting DBaaS is a must for most companies. I’ll also explain why I think it will replace traditional on-premises databases in the future.

Why DBaaS Is the Future

Both database technology and the business model of the entire database industry are undergoing major transformations. Technically, databases are evolving from a standalone architecture to a cloud-native one. As shown in the diagram below, companies initially ran on-premises databases built on standalone technology. Then the “shared-nothing” architecture emerged, which laid the foundation for distributed NoSQL and NewSQL databases. We are now in the middle of another transition, to cloud-native databases. Business models are changing along with the technology. Traditional database vendors sold licenses for on-premises deployment, which becomes a bottleneck when they try to scale the business. With DBaaS, users can subscribe to a flexible, fully managed service, and database vendors can scale their business with ease.

The best example of a successful DBaaS provider is MongoDB. Its market value doubles every year and is currently more than $30 billion. Its DBaaS product, MongoDB Atlas, has maintained a compound annual growth rate (CAGR) of more than 100%. This shows the enthusiasm with which companies, some of which may be your competitors, are adopting DBaaS. That is why cloud services matter.
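To make the growth figure concrete, here is a minimal Python sketch of how a compound annual growth rate is computed and what a rate of 100% implies. The function names and the revenue figures are hypothetical, chosen only for illustration; they are not MongoDB's actual numbers.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Return the compound annual growth rate as a fraction (1.0 means 100%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value: float, rate: float, years: int) -> float:
    """Grow start_value at a constant annual rate for the given number of years."""
    return start_value * (1.0 + rate) ** years


# Hypothetical revenue, in arbitrary units: growing from 10 to 80 over 3 years.
growth = cagr(start_value=10.0, end_value=80.0, years=3)
print(f"CAGR: {growth:.0%}")

# A CAGR of 100% means the figure doubles every year: 10 -> 20 -> 40 -> 80.
for year in range(4):
    print(year, project(10.0, 1.0, year))
```

A CAGR above 100% therefore means the figure more than doubles every year, which is why the number stands out.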

Benefits of DBaaS

It’s not difficult to conclude from this trend that the industry is shifting from on-premises databases to DBaaS. Only cloud services can break through geographical restrictions and give users virtually unlimited computing resources. The benefits of DBaaS are significant in both the technical and the business sense. Here, we will explore the technical side.

Cost Reduction With a Decoupled Architecture 

Cost reduction is the ultimate goal of cloud-native technology. Take TiDB, an open source distributed database, as an example. As shown on the left side of the diagram below, before TiDB was deployed in the cloud, it had a coprocessing engine in both its computing and storage layers, which blurred the boundary between computing and storage and made it hard to handle scenarios with different workloads. If you wanted more storage capacity, you had to add storage nodes. But because each node is a fixed bundle of hardware, you also added CPU and network bandwidth at the same time, whether or not you needed them. This wastes resources.
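The waste described above can be sketched with a small capacity model. This is only an illustration under stated assumptions: the node shapes (2 TB and 16 vCPUs per node) and the workload are hypothetical values, not TiDB or AWS figures, and the function names are made up for this example.

```python
import math
from dataclasses import dataclass

# Hypothetical node shapes, for illustration only.
COUPLED_NODE_TB = 2.0      # storage bundled into each coupled node
COUPLED_NODE_VCPUS = 16    # CPU bundled into each coupled node
COMPUTE_NODE_VCPUS = 16    # CPU in each compute-only node


@dataclass(frozen=True)
class Workload:
    storage_tb: float
    vcpus: int


def coupled_plan(w: Workload) -> dict:
    """Storage and CPU come bundled, so the larger requirement sets the node count."""
    nodes = max(
        math.ceil(w.storage_tb / COUPLED_NODE_TB),
        math.ceil(w.vcpus / COUPLED_NODE_VCPUS),
    )
    return {"nodes": nodes, "vcpus": nodes * COUPLED_NODE_VCPUS}


def decoupled_plan(w: Workload) -> dict:
    """Storage is provisioned separately, so compute is sized by CPU demand alone."""
    nodes = math.ceil(w.vcpus / COMPUTE_NODE_VCPUS)
    return {"nodes": nodes, "vcpus": nodes * COMPUTE_NODE_VCPUS}


def idle_vcpus(plan: dict, w: Workload) -> int:
    """vCPUs that are paid for but not needed by the workload."""
    return plan["vcpus"] - w.vcpus


# A storage-heavy workload: a lot of data, modest CPU demand.
workload = Workload(storage_tb=20.0, vcpus=32)
print("coupled idle vCPUs:  ", idle_vcpus(coupled_plan(workload), workload))
print("decoupled idle vCPUs:", idle_vcpus(decoupled_plan(workload), workload))
```

In this sketch, the coupled layout needs ten nodes just to hold the data and leaves most of their CPU idle, while the decoupled layout sizes compute for the CPU demand only.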

As shown on the right side of the diagram, deploying TiDB in the cloud changes this. The latest gp3 volumes of Amazon Elastic Block Store (EBS), a block-storage service, can be attached to different machines and deliver the same input/output operations per second (IOPS) at the same cost. So, if TiDB is deployed on EBS, we can move the boundary between computing and storage downward, and TiDB nodes and TiKV nodes can flexibly share the computing workload.

The cloud can also save computing resources. CPU is the most expensive resource in the cloud, and the bottleneck for a database there is computing, not capacity. With databases in the cloud, it’s possible to optimize clusters and spot instances based on shared resource pools, select storage services on demand and deliver different Amazon EC2 instance combinations for different scenarios. Serverless and elastic computing resources become possible as well.

Database deployment on the cloud also enables better resource isolation among the storage, network, memory and even CPU cache. This is because different software programs, especially distributed ones, require different hardware resources and are used by different businesses. With the cloud, you can select and combine resources on demand and reduce the cost even more. 

Data Security 

Data security is another important benefit of DBaaS. DBaaS users can use their own virtual private cloud (VPC) accounts to access their business assets in the cloud, while the database provider cannot access that data.

Security in the cloud is completely different from, and more complicated than, security outside the cloud. For example, when you build an on-premises database, you only need to consider role-based access control (RBAC) inside the database. In the cloud, that is not enough. To guarantee data security, a complete set of security systems is built that covers every layer of the data flow, from network to storage.

Automatic Operation and Maintenance 

One of the most annoying pain points for on-premises database providers is the intensive labor that operation and maintenance require during delivery. Sometimes, providers must send 20 or more staff members to support a single customer. This is unsustainable in the long run. DBaaS automates operation and maintenance, which makes it possible to scale your business with light support and a smaller delivery team.

Summary

I truly believe that DBaaS is the future of databases. It is cost-effective, secure, regulation-compliant and capable of automating operation and maintenance. I hope we can all enjoy and appreciate DBaaS soon, anywhere and at any scale.



Ed Huang

Ed Huang is the co-founder and CTO of PingCAP, a distributed systems expert, a senior software engineer, and an architect who has worked at Microsoft Research Asia, NetEase Youdao and WandouLabs. He is an active open source enthusiast and open source software author, whose representative work includes Codis, a distributed Redis caching solution, and TiDB, a distributed relational database. At present, the TiDB project has accumulated more than 29,000 stars on GitHub, making it one of the world's top open source projects in this field. He is one of the "Top 10 Outstanding Contributors to Open Source in China in 2020", one of the "33 Pioneers of Open Source in China in 2020" and one of the "OSCAR Open Source Vanguards". His first-author paper, TiDB: A Raft-based HTAP Database, is the first paper in the industry on the Raft-based implementation of a real-time HTAP distributed database.
