Database Operators Bring Stateful Workloads to Kubernetes

Jan 22nd, 2019 7:50am by Joab Jackson

Last year, Red Hat formally launched the Operator Framework, a way to customize the APIs of Kubernetes for specific applications. With the Operator Framework, originally developed by CoreOS before the company was purchased by Red Hat, users can spin up an application, with all of its specific settings and dependencies, and have Kubernetes handle all the provisioning and scaling.

Now we are starting to see more independent software vendors (ISVs) using the Operator Framework to package their own applications. In particular, database management system providers are using the technology to tackle one of the hardest problems in working with Kubernetes: managing stateful applications. Kubernetes has offered StatefulSets and Persistent Volumes for this purpose, but Operators make managing backend databases much easier, we keep hearing.

An operator turns a complex program, with intricate provisioning and maintenance issues, into an easy-to-run service, noted Sebastien Pahl, who gave a presentation on the technology at the All Things Open conference in Raleigh, North Carolina, last October.

In essence, the operator format gives the software developer a template that tells Kubernetes how to deploy and maintain an application, including detailed instructions on how to build and deploy it. The operator, shipped as a stand-alone container, captures the configuration details in a YAML-defined custom resource definition (CRD) that Kubernetes can understand. An associated controller ensures that the resulting cluster is maintained according to the operator specification.
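The controller's job, comparing a declared specification with what is actually running and then correcting the difference, can be sketched in a few lines of Python. This is only an illustration of the pattern, not code from the Operator Framework: `DatabaseSpec`, `ClusterState` and `reconcile` are hypothetical names, and the "cluster" is an in-memory dictionary rather than a real Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseSpec:
    """Desired state, as a user might declare it in a custom resource (hypothetical fields)."""
    name: str
    replicas: int
    version: str


@dataclass
class ClusterState:
    """Observed state: the version each running pod reports, keyed by pod name."""
    pods: dict = field(default_factory=dict)


def reconcile(spec: DatabaseSpec, state: ClusterState) -> list:
    """Compare desired and observed state, apply corrections, and return the actions taken."""
    actions = []
    # Upgrade pods that run a version other than the declared one.
    for pod, version in sorted(state.pods.items()):
        if version != spec.version:
            state.pods[pod] = spec.version
            actions.append(f"upgrade {pod} to {spec.version}")
    # Create pods until the declared replica count is reached.
    index = 0
    while len(state.pods) < spec.replicas:
        pod = f"{spec.name}-{index}"
        if pod not in state.pods:
            state.pods[pod] = spec.version
            actions.append(f"create {pod}")
        index += 1
    # Remove surplus pods, highest name first.
    surplus = max(0, len(state.pods) - spec.replicas)
    for pod in sorted(state.pods, reverse=True)[:surplus]:
        del state.pods[pod]
        actions.append(f"delete {pod}")
    return actions


spec = DatabaseSpec(name="pg", replicas=3, version="11.1")
state = ClusterState(pods={"pg-0": "10.6"})
print(reconcile(spec, state))  # corrective actions on the first pass
print(reconcile(spec, state))  # nothing left to do once the states match
```

Running the loop a second time against an unchanged specification yields no actions, which is the property that lets a controller run continuously without disturbing a healthy cluster.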

The operator covers dependencies and security permissions (through the kernel’s built-in namespaces support). An associated Operator Lifecycle Manager, also developed by Red Hat, provides the administrator with the ability to manage multiple operators. The framework also includes the ability to monitor and collect metrics from operator-built clusters.

Operators can be written with the Go programming language, with Helm or in Red Hat’s Ansible automation software.

Late last year, Couchbase released its Couchbase Autonomous Operator for Kubernetes, which eases the process of managing its distributed database. And we recently spoke with Crunchy Data, which has been using Operators since the CoreOS days to manage the PostgreSQL database in Kubernetes deployments, as well as PingCAP, which uses the operator pattern to package its TiDB database so it can be consumed as a service from cloud providers.

Smooth Operators

One database system purveyor getting in on Operators is Couchbase, with the Couchbase Autonomous Operator for Kubernetes it released late last year.

Couchbase is a distributed data platform that has carved out a market in supporting social media engagement, tuned for use cases such as chat, profile maintenance and mobile capabilities, said Couchbase vice president of engineering Dave Finlay in an interview.

A few years back, most users were hesitant to run databases in containers because of the difficulty of maintaining state. This view is quickly being overturned as more enterprises and start-ups containerize their data platforms, Finlay said.

“Databases are the most stateful applications,” Finlay said. “When people write their data, they expect it to be there when they go back to retrieve it.”

When you lose a node in a database cluster, how do you recover the data? By itself, Kubernetes has no natural way of doing that, Finlay said.

Couchbase has built-in replication, with data distributed across multiple nodes. It also has the native capability of failing over to the backup nodes should the main one fail. But Kubernetes, using the Autonomous Operator, can find new hardware within the system to bring this cluster back up to its full strength, in much less time than it would take to have an admin do it.

“If there are very significant faults in the system, Kubernetes can respond much more quickly than humans,” Finlay said.

Having a Kubernetes-controlled Couchbase deployment also helps an organization standardize on a microservices architecture, added Anil Kumar, Couchbase Server’s director of product management. In many cases, a company may have moved all the stateless components to such a containerized architecture, isolating the database in its own silo.

“With this operator, we are bridging that gap,” he said.

Crunchy Data has been offering a containerized version of the PostgreSQL relational database management system for five years, said Jonathan Katz, Crunchy Data director of communications. The Crunchy Container Suite is an open source set of about 20 containers that have all the microservices needed for running production-level Postgres deployments, including monitoring and high availability.

About two years ago, Crunchy started looking at extending this package to Kubernetes. The company didn’t go with StatefulSets, which were thought at the time to be the best approach to working with stateful applications. Instead, Crunchy investigated a then-experimental pattern being developed by CoreOS.

Today, thousands of deployments use the Crunchy PostgreSQL For Kubernetes package, which was introduced in March 2017, built on the company’s PostgreSQL operator, one of the first operators in deployment.

It’s not just about having a fixed reference to send and query data. An operator solves a lot of complex deployment issues. It can handle complex tasks such as making and keeping a replica, along with a log of transactions. It can arrange replication tasks such as ensuring both copies of the database aren’t on the same node.
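The placement rule described above, keeping copies of the same database on different nodes, can be illustrated with a short Python sketch. Kubernetes expresses this kind of constraint declaratively through pod anti-affinity rules; the function below is only a hypothetical model of the decision, not how the Crunchy operator implements it. The names `schedule_copy`, `placements` and the node names are assumptions made for the example.

```python
def schedule_copy(database, placements, nodes):
    """Pick a node for a new copy of `database` that hosts no other copy of it.

    `placements` maps a node name to the list of database names already running there
    and is updated in place with the new assignment.
    """
    candidates = [n for n in nodes if database not in placements.get(n, [])]
    if not candidates:
        raise RuntimeError(f"no node available without a copy of {database}")
    # Prefer the least-loaded candidate; ties break by node name for determinism.
    chosen = min(candidates, key=lambda n: (len(placements.get(n, [])), n))
    placements.setdefault(chosen, []).append(database)
    return chosen


nodes = ["node-a", "node-b"]
placements = {}
primary = schedule_copy("orders", placements, nodes)  # lands on the first free node
replica = schedule_copy("orders", placements, nodes)  # forced onto a different node
print(primary, replica)
```

With only two nodes, a request for a third copy raises an error rather than doubling up, which mirrors the intent that losing one node should never take out every copy.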

“Maybe you want to have one instance run on a node with a super fast disk array while having another connected to a slower storage volume — this is something different than what stateful sets allow,” Katz said.

“Really, what the operator does is give us the ability to execute mass-deploy commands across the entire Kubernetes cluster,” Katz said.

Kubernetes namespaces can offer security benefits as well, Crunchy has found. User access can be isolated to particular namespaces, i.e. database systems, through the use of Kubernetes labels. A new database administrator can be restricted to just the user applications he or she oversees. Namespaces also help with upgrades.

You don’t need a commercial support contract from Crunchy to enjoy the operator. One Fortune 100 company was using the Crunchy operator to run 200+ node clusters of PostgreSQL. It contacted the company only when it needed additional support.

“You still need to evaluate,” Katz said. “But the technology has matured. It should be in the discussion.”

Database-as-a-Service

PingCAP is in the process of rolling out TiDB to cloud providers. TiDB is an open source, distributed, scalable hybrid transactional and analytical processing (HTAP) database. The company wanted to make the deployment pattern predictable and easy across different clouds, so it turned to Kubernetes and its own custom operator pattern. The Google Cloud Platform, for instance, just started offering TiDB on its marketplace.

TiDB itself can simplify the system environment insofar as it eliminates the need to set up a data warehouse and ETL system to pass the data around. But with many moving parts of its own, it can still be a bit complex to set up, admitted Kevin Xu, PingCAP general manager of global strategy and operations.

The company actually started the job of easing and automating deployment on Kubernetes a few years back, using StatefulSets and Persistent Volumes, though “we had to do a lot of hacks to make it work,” Xu said.

Enter the TiDB operator.

“We think this operator layer of abstraction helps deliver a good customer experience,” Xu said. Notably, it provides a consistent user experience across different clouds. Because TiDB is a hybrid with both real-time analytical capabilities and online transaction processing, operators offer the nuance of scaling each component to its own optimal level, independently of the others.
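Scaling each component independently can be pictured with a small Python sketch. It is a hedged illustration only: the component names `sql` and `storage`, the load figures and the per-replica capacities are invented for the example and are not taken from the TiDB operator.

```python
import math


def desired_replicas(load, capacity_per_replica, minimum=1):
    """Smallest replica count that covers the load, never dropping below `minimum`."""
    return max(minimum, math.ceil(load / capacity_per_replica))


def plan_scaling(current, load, capacity):
    """Return {component: (current, desired)} for components whose replica count must change."""
    plan = {}
    for component, replicas in current.items():
        target = desired_replicas(load[component], capacity[component])
        if target != replicas:
            plan[component] = (replicas, target)
    return plan


current = {"sql": 2, "storage": 3}        # replicas running today (hypothetical)
load = {"sql": 900, "storage": 250}       # observed demand per component (hypothetical units)
capacity = {"sql": 300, "storage": 100}   # demand one replica can absorb (hypothetical)
print(plan_scaling(current, load, capacity))
```

In this example only the query layer needs another replica; the storage layer already has enough, so it is left alone.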

Operators are “essential for delivering our products and our experience in the right way,” Xu said. “The reason the operator shines is its ability to reduce the complexity of stateful applications where you do have to keep track of a lot of states.”

Joab Jackson is Editor-in-Chief for The New Stack, assuring that the TNS website gets a fresh batch of cloud native news, tutorials and perspectives each day. He has reported on infrastructure IT for over 25 years, including stints at IDG...