Friday, January 13, 2017

Deploy a Highly Available WordPress Instance as a StatefulSet in Kubernetes 1.5

At The New Stack, we have covered various strategies for running stateful workloads on the Kubernetes container orchestration engine. This article takes a practical, hands-on approach to deploying a highly available WordPress application in Kubernetes based on the strategies and best practices that were discussed earlier. We will cover everything from setting up the Kubernetes cluster and configuring shared storage to deploying the stateful application and configuring autoscaling.

Deploying and managing traditional workloads such as databases and content management systems in a containerized environment calls for a different approach. While it may be easy to package, deploy, manage, and scale contemporary, cloud native applications in Kubernetes, managing a MySQL cluster or a fleet of WordPress containers requires an understanding of how storage, networking, and service discovery work in a Kubernetes environment. We will explore these concepts in the context of running a reliable, highly available content management system (CMS) powered by MySQL and WordPress.

Attributes of a Highly Available WordPress Deployment

WordPress is a stateful application that relies on two persistence backends: a file system and a MySQL database. To ensure high availability of the application, we need to maximize the uptime of the core PHP application, the underlying storage layer backing the file system, and the data tier powered by MySQL.

When traffic increases, the core PHP application should elastically scale to meet the load requirements. Each new instance should have read/write access to the file system to access and update the content. The file system capacity should increase proportionally as the content grows. Though not completely elastic, the data tier should also be able to scale in and out on demand.

Kubernetes and other container management platforms support scaling the stateless PHP application effortlessly. However, scaling the storage backend and the MySQL data tier is not an easy task. The concept of StatefulSets, introduced in Kubernetes 1.5, precisely addresses the issue of scaling stateful workloads. We will leverage this concept in designing our deployment strategy.

A Peek at The Stack

The data tier of our application will be powered by Percona XtraDB Cluster. It is one of the most popular open source MySQL clustering technologies, built on Percona Server and Codership's Galera. The best thing about Percona's XtraDB Cluster is its support for synchronous, multi-master replication, which delivers high availability. The MySQL Docker images, maintained by Percona, are available on Docker Hub.

The XtraDB cluster relies on etcd for discovering new nodes. The open source etcd is a popular key-value store for shared configuration and service discovery. It powers some of the well-known distributed computing systems such as Kubernetes, Cloud Foundry, locksmith, vulcand, and Doorman. Coming from CoreOS Inc., the company behind the Container Linux distribution, etcd is highly optimized for container scheduling and management.

For shared storage, we will be configuring a Network File System (NFS) backend that is accessible from all the Kubernetes nodes of the cluster. The static content uploaded through WordPress will be stored in the distributed file system. This approach ensures that the WordPress container is entirely stateless, allowing it to scale rapidly.

We will use the official WordPress image from Docker Hub to create the Kubernetes Pods. The only endpoint exposed to the outside world will be the HTTP/HTTPS service associated with the WordPress Pods.

Readying Kubernetes to Run Stateful Workloads

The Persistent Volumes and Claims created in Kubernetes can be based on NFS. They become the storage backbone for MySQL and WordPress Pods. In production scenarios, it is recommended that the NFS share is based on SSD storage running in a network-optimized instance. For this proof of concept, we will configure NFS on the Kubernetes Master.

For higher availability, we will run a minimum of three instances of etcd. We will use the concept of Node Affinity to schedule exactly one etcd Pod per Node.

The Percona XtraDB Cluster will be configured as a Kubernetes StatefulSet to ensure high availability. StatefulSet mimics the workflow involved in deploying and managing a virtual machine-based cluster. Each Pod gets a stable, unique identifier associated with dedicated persistent storage. It brings the flexibility of ReplicaSet to stateful Pods.

For elastic scaling and rapid scheduling of WordPress Pods, we will configure a ReplicaSet with a minimum of three replicas. Each Pod in the ReplicaSet will be associated with a Volume mounted on NFS. This approach makes WordPress almost stateless. We will also configure Horizontal Pod Autoscaling for the ReplicaSet to enable elasticity.

Setting up the Kubernetes Infrastructure

This walkthrough helps you configure a Kubernetes cluster on a set of Vagrant boxes running locally. With a few modifications, the same setup can be used with mainstream public cloud providers.

Assuming you have a Mac with at least 8GB RAM and 256GB HDD running VirtualBox and Vagrant, you can easily spin up a fully configured three-node Kubernetes cluster in less than 20 minutes. Just run the following commands to get started.

$ export KUBERNETES_PROVIDER=vagrant
$ export NUM_NODES=3
$ curl -sS https://get.k8s.io | bash
$ cd kubernetes
$ ./cluster/kube-up.sh

This would provision four Fedora virtual machines: a Kubernetes Master and three Nodes. It would also configure kubectl, the Kubernetes command-line interface, to work with the cluster.

Typing kubectl get nodes shows the following output.

$ kubectl get nodes
NAME                STATUS    AGE
kubernetes-node-1   Ready     1d
kubernetes-node-2   Ready     1d
kubernetes-node-3   Ready     1d

Once this is done, the next step is to set up shared storage based on NFS. We will configure the Master as the NFS server with the mount point available on all the Nodes.

SSH into the Kubernetes Master (10.245.1.2) and run the commands to configure an NFS share:

cd kubernetes
vagrant ssh master
sudo -i
mkdir -p /opt/data
chmod 777 /opt/data
echo "/opt/data 10.245.1.2/24(rw,sync,no_root_squash,no_all_squash)" >> /etc/exports
systemctl enable --now rpcbind
systemctl enable --now nfs-server
systemctl start rpcbind
systemctl start nfs-server
mkdir -p /opt/data/vol/0
mkdir -p /opt/data/vol/1
mkdir -p /opt/data/vol/2
mkdir -p /opt/data/content

SSH into each Node and run the following commands to automount NFS at boot time:

sudo -i
systemctl start rpcbind nfs-mountd
systemctl enable rpcbind nfs-mountd
echo "10.245.1.2:/opt/data /mnt/data nfs rw,sync,hard,intr 0 0" >> /etc/fstab
dnf -y install autofs
echo "/- /etc/auto.mount" >> /etc/auto.master
echo "/mnt/data -fstype=nfs,rw 10.245.1.2:/opt/data" >> /etc/auto.mount
systemctl start autofs
systemctl enable autofs

This step completes provisioning the cluster with shared storage available at /mnt/data on the Nodes.

Creating Persistent Volumes and Claims

Before we go any further with the setup, let's create Persistent Volumes (PV) and Persistent Volume Claims (PVC) that will be used by the MySQL cluster.

We will first provision three Persistent Volumes that are based on NFS. Notice that the PV definition contains a pointer to the NFS server. The path /opt/data/vol/0 will be explicitly assigned to the PV called mysql-pv0. The remaining two PVs also have an ordinal index of 1 and 2 attached to them. The significance of this convention will be explained when we create the MySQL StatefulSet:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv0
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /opt/data/vol/0
    server: 10.245.1.2

Each PV will be claimed by a PVC, which will be mapped to the Pod Volume of the StatefulSet:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: db-mysql-0
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

Execute the following command to provision the storage infrastructure:

$ kubectl create -f https://github.com/janakiramm/wp-statefulset/blob/master/volumes.yml

Let's ensure that the PVs and PVCs are in place:

# For brevity we are only printing a few columns
$ kubectl get pv | awk {'print $1" " $2" " $5" "$6'} | column -t
NAME       CAPACITY  STATUS  CLAIM
mysql-pv0  1Gi       Bound   default/db-mysql-0
mysql-pv1  1Gi       Bound   default/db-mysql-1
mysql-pv2  1Gi       Bound   default/db-mysql-2

# For brevity we are only printing a few columns
$ kubectl get pvc | awk {'print $1" " $2" " $3'} | column -t
NAME        STATUS  VOLUME
db-mysql-0  Bound   mysql-pv0
db-mysql-1  Bound   mysql-pv1
db-mysql-2  Bound   mysql-pv2

With the infrastructure set up, we are all set to deploy the stateful application.

Deploying etcd

We will configure three instances of the distributed key-value store, etcd. Since each instance requires a unique configuration, each one will be packaged as a Pod with a dedicated Service. The Service endpoint will be used by the Raft consensus protocol for internal communication. We will also expose an internal endpoint for the MySQL cluster to talk to the etcd cluster. To test the etcd deployment, we will also expose a NodePort.
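For reference, the test endpoint exposed through the NodePort might look roughly like the sketch below. This is an illustrative approximation rather than the exact definition from the repository's etcd.yml; the selector label is an assumption, while the port numbers match the service listing shown later.

apiVersion: v1
kind: Service
metadata:
  name: etcd-client
spec:
  type: NodePort
  selector:
    app: etcd          # assumed label shared by the etcd Pods
  ports:
    - name: client
      port: 2379       # etcd client port
      nodePort: 30163  # matches the NodePort used in the curl test later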

To ensure that no two etcd instances are placed on the same node, we will use node affinity. Before that, we need to add a label to each Kubernetes node. The following commands will assign labels to the nodes.

$ kubectl label nodes kubernetes-node-1 name=node-1
$ kubectl label nodes kubernetes-node-2 name=node-2
$ kubectl label nodes kubernetes-node-3 name=node-3

The Pod definition of each etcd instance will have the node selector parameter that enforces node affinity.
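As an illustration, the Pod definition for etcd-0 might pin the Pod to the first node roughly as follows. This is a sketch that assumes an app: etcd label and a generic etcd image; the authoritative definition is in the etcd.yml file used below.

apiVersion: v1
kind: Pod
metadata:
  name: etcd-0
  labels:
    app: etcd            # assumed label
spec:
  nodeSelector:
    name: node-1         # matches the label assigned to kubernetes-node-1 above
  containers:
    - name: etcd
      image: quay.io/coreos/etcd   # illustrative image; see etcd.yml for the exact image and flags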

Run the following command to create the etcd cluster:

$ kubectl create -f https://github.com/janakiramm/wp-statefulset/blob/master/etcd.yml

We should now have three Pods and five Services. The etcd-client Service is meant only for testing the etcd cluster from the host machine and can be safely deleted later.

$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
etcd-0    1/1       Running   0          10m
etcd-1    1/1       Running   0          10m
etcd-2    1/1       Running   0          10m

# For brevity we are only printing a few columns
$ kubectl get svc | awk {'print $1" " $2" " $4'} | column -t
NAME         CLUSTER-IP      PORT(S)
etcd         10.247.97.132   2379/TCP,4001/TCP,7001/TCP
etcd-0       10.247.188.152  2379/TCP,2380/TCP,4001/TCP,7001/TCP
etcd-1       10.247.213.44   2379/TCP,2380/TCP,4001/TCP,7001/TCP
etcd-2       10.247.0.30     2379/TCP,2380/TCP,4001/TCP,7001/TCP
etcd-client  10.247.180.33   2379:30163/TCP
kubernetes   10.247.0.1      443/TCP

Let's verify the etcd configuration by storing and retrieving a value:

$ curl -L -X PUT http://10.245.1.2:30163/v2/keys/message -d value="Hello"
{"action":"set","node":{"key":"/message","value":"Hello","modifiedIndex":4,"createdIndex":4}}

$ curl -L http://10.245.1.2:30163/v2/keys/message
{"action":"get","node":{"key":"/message","value":"Hello","modifiedIndex":4,"createdIndex":4}}

This step successfully configured the etcd cluster with node affinity. We will use this to configure the MySQL cluster.

Deploying Percona Cluster

Let's go ahead and deploy three instances of MySQL as a StatefulSet. Before we launch the cluster, take a look at the StatefulSet definition. Notice how the volumeMounts parameter is associated with the claim.

During the scheduling of the StatefulSet, Kubernetes will ensure that each Pod is mapped to a Claim based on the ordinal index. The Pod mysql-0 will be mapped to db-mysql-0 which was created earlier. Take a minute to explore the StatefulSet YAML file.
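To make the ordinal mapping concrete, here is a rough sketch of how such a StatefulSet could be structured. The image, labels, and mount path are assumptions for illustration only; the authoritative definition is in the mysql.yml file used below. With the StatefulSet named mysql and a volume claim template named db, the Pods resolve their claims to db-mysql-0, db-mysql-1, and db-mysql-2, which are exactly the PVCs provisioned earlier.

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql               # the headless Service described later
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql                 # assumed label
    spec:
      containers:
        - name: mysql
          image: percona/percona-xtradb-cluster   # illustrative; see mysql.yml for the exact image and settings
          volumeMounts:
            - name: db
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: db                   # yields claims named db-mysql-0, db-mysql-1, db-mysql-2
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 1Gi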

You will also notice that the Percona cluster relies on the etcd service for discovery, which we created in the previous step.

Let's now provision a three-node Percona XtraDB Cluster for MySQL:

$ kubectl create -f https://github.com/janakiramm/wp-statefulset/blob/master/mysql.yml

After a few minutes, you should see three additional Pods created for the MySQL cluster.

$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
etcd-0    1/1       Running   0          10m
etcd-1    1/1       Running   0          10m
etcd-2    1/1       Running   0          10m
mysql-0   1/1       Running   0          4m
mysql-1   1/1       Running   0          4m
mysql-2   1/1       Running   0          4m

Inspect one of the Pods in the StatefulSet to see the storage and network configuration. We can also check the logs to see that the replication is configured among the MySQL instances:

$ kubectl describe pod mysql-0
$ kubectl logs mysql-0

This step also creates an internal and an external MySQL Service:

# For brevity we are only printing a few columns
$ kubectl get svc | awk {'print $1" " $2" " $4'} | column -t
NAME          CLUSTER-IP      PORT(S)
etcd          10.247.97.132   2379/TCP,4001/TCP,7001/TCP
etcd-0        10.247.188.152  2379/TCP,2380/TCP,4001/TCP,7001/TCP
etcd-1        10.247.213.44   2379/TCP,2380/TCP,4001/TCP,7001/TCP
etcd-2        10.247.0.30     2379/TCP,2380/TCP,4001/TCP,7001/TCP
etcd-client   10.247.180.33   2379:30163/TCP
mysql         None            3306/TCP
mysql-client  10.247.104.122  3306:32236/TCP
kubernetes    10.247.0.1      443/TCP

The MySQL endpoint is a headless service used for routing the requests to one of the Pods of the StatefulSet. This will be used by WordPress to talk to the MySQL cluster.
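A headless Service is an ordinary Service with clusterIP set to None, so DNS resolves the service name directly to the Pod IPs. A minimal sketch, with an assumed selector label, looks like this:

apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None        # headless: no virtual IP, DNS returns the Pod IPs
  selector:
    app: mysql           # assumed label on the MySQL Pods
  ports:
    - port: 3306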

The mysql-client endpoint is created to test the service. It can be deleted after the initial setup. Let's see the MySQL cluster in action by connecting the CLI to the mysql-client endpoint:

# Ignoring the warnings generated by the CLI
$ while true; do mysql -h 10.245.1.3 -P 32236 -u root -pk8spassword -NBe 'select @@wsrep_node_address' 2>/dev/null; sleep 1; done
10.246.24.7
10.246.28.4
10.246.28.4
10.246.76.3
10.246.24.7
10.246.76.3
10.246.24.7

The CLI is routed to one of the Pods by the headless service. Each unique IP represents the address of a stateful Pod.

This verifies that the MySQL cluster is up and running. You can also SSH into the Kubernetes Master Vagrant box and check the folders (/opt/data/vol/[0,1,2]) that contain the MySQL data and log files.

Deploying WordPress

Since the state has already been moved to NFS and MySQL, we can configure the WordPress Pods as a ReplicaSet. This will give us flexibility in scaling the application.
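The essential parts of such a ReplicaSet definition are sketched below: the official wordpress image, a database host pointing at the headless mysql Service, and an NFS-backed volume mounted at the WordPress content directory. The labels, environment variables, and volume details are assumptions; the authoritative version is the wordpress.yml file used in the next step.

apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: wordpress
spec:
  replicas: 3                        # initial replica count; the autoscaler adjusts it later
  template:
    metadata:
      labels:
        app: wordpress               # assumed label
    spec:
      containers:
        - name: wordpress
          image: wordpress           # official WordPress image from Docker Hub
          env:
            - name: WORDPRESS_DB_HOST
              value: mysql           # the headless MySQL Service
            - name: WORDPRESS_DB_PASSWORD
              value: k8spassword     # root password used in the MySQL test above; use Secrets in production
          volumeMounts:
            - name: content
              mountPath: /var/www/html
      volumes:
        - name: content
          nfs:
            server: 10.245.1.2       # the NFS server configured on the Master
            path: /opt/data/content  # the shared content directory created earlier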

Let's create the WordPress ReplicaSet:

$ kubectl create -f https://github.com/janakiramm/wp-statefulset/blob/master/wordpress.yml
service "wordpress" created
horizontalpodautoscaler "wordpress-scaler" created
replicaset "wordpress" created

Let's verify the Pod creation:

$ kubectl get pods
NAME              READY     STATUS    RESTARTS   AGE
etcd-0            1/1       Running   0          10m
etcd-1            1/1       Running   0          10m
etcd-2            1/1       Running   0          10m
mysql-0           1/1       Running   0          4m
mysql-1           1/1       Running   0          4m
mysql-2           1/1       Running   0          4m
wordpress-1rzcr   1/1       Running   0          1m
wordpress-2zdql   1/1       Running   0          1m
wordpress-8h68h   1/1       Running   0          1m
wordpress-gpg8s   1/1       Running   0          1m
wordpress-jxhfk   1/1       Running   0          1m

Each Pod of the ReplicaSet mounts the same file system share with read/write access. This will ensure that the content uploaded to WordPress is instantly available to all the Pods.

The wordpress.yml file also contains a definition for a Horizontal Pod Autoscaler (HPA) that will automatically scale the Pods.
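A minimal sketch of such an HPA definition, matching the minimum, maximum, and CPU target visible in the output below (the exact form in wordpress.yml may differ):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: wordpress-scaler
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: ReplicaSet
    name: wordpress            # scales the WordPress ReplicaSet created above
  minReplicas: 5
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50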

$ kubectl get hpa
NAME               REFERENCE              TARGET    CURRENT   MINPODS   MAXPODS   AGE
wordpress-scaler   ReplicaSet/wordpress   50%       0%        5         10        5m

With the discovery backend (etcd), database (MySQL), and frontend (WordPress) in place, let's go ahead and access the application from the browser. Before that, let's get the NodePort of the WordPress Service endpoint.

# For brevity we are only printing a few columns
$ kubectl get svc | awk {'print $1" " $2" " $4'} | column -t
NAME          CLUSTER-IP      PORT(S)
etcd          10.247.97.132   2379/TCP,4001/TCP,7001/TCP
etcd-0        10.247.188.152  2379/TCP,2380/TCP,4001/TCP,7001/TCP
etcd-1        10.247.213.44   2379/TCP,2380/TCP,4001/TCP,7001/TCP
etcd-2        10.247.0.30     2379/TCP,2380/TCP,4001/TCP,7001/TCP
etcd-client   10.247.180.33   2379:30163/TCP
mysql         None            3306/TCP
mysql-client  10.247.104.122  3306:32236/TCP
kubernetes    10.247.0.1      443/TCP
wordpress     10.247.85.101   80:31362/TCP

$ open http://10.245.1.3:31362

Disclaimers
  • Since StatefulSet is currently in beta, this architecture is intended as a proof of concept; it is not production-ready.
  • For deploying etcd, consider the etcd Operator from CoreOS.
  • NFS may not be the ideal distributed storage for I/O-intensive workloads. Gluster or Ceph is recommended for this type of deployment.
  • Redis or Memcached is preferred to store PHP sessions. Moving the session state out of WordPress will make the deployment more scalable.
  • Usernames and passwords are hardcoded into the YAML definitions. For production deployments, consider using Kubernetes Secrets.
  • The Cloud Native Computing Foundation, which manages Kubernetes, is a sponsor of The New Stack.

    Feature image: The Eiffel Tower, taken by Peter Y. Chuang, via Unsplash.


    Source: Deploy a Highly Available WordPress Instance as a StatefulSet in Kubernetes 1.5
