Kubernetes – Persistent Volumes
March 4, 2019
Containers are often short lived. They might scale based on need, and they will redeploy when issues occur. This behavior is welcome, but sometimes we have state to worry about, and state is not meant to be short lived. Kubernetes persistent volumes can help resolve this discrepancy.
Volumes – The Theory
In the Kubernetes world, persistent storage is broken down into two kinds of objects: a Persistent Volume (PV) and a Persistent Volume Claim (PVC). First, let's tackle the Persistent Volume.
Persistent Volumes
Persistent Volumes are simply pieces of storage in your cluster. Similar to how a server has a disk resource, a persistent volume provides storage resources for objects in the cluster. In the simplest terms, you can think of a PV as a disk drive. It should be noted that this storage resource exists independently from any pods that may consume it, meaning that if the pod dies, the storage should remain intact, assuming the reclaim policies are appropriate. Persistent Volumes are provisioned in two ways: statically or dynamically.
Static Volumes – A static PV simply means that a Kubernetes administrator provisioned the persistent volume in the cluster ahead of time, and it's ready to be consumed by other resources.
Dynamic Volumes – In some circumstances a pod could require a persistent volume that doesn't exist yet. In those cases it is possible to have Kubernetes provision the volume as needed, as long as a StorageClass has been configured to describe how and where the dynamic PVs should be built (a rough example follows below). This post will focus on static volumes for now.
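Purely as a sketch of what dynamic provisioning involves (we won't use this in the lab), a StorageClass might look something like the manifest below. The provisioner and parameters depend entirely on your environment; this example assumes AWS EBS.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/aws-ebs #Assumes an AWS cluster; swap in your own provisioner
parameters:
  type: gp2 #EBS volume type

A PVC that requests storageClassName: fast would then cause a matching PV to be created on demand.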
Persistent Volume Claims
Pods that need access to persistent storage obtain that access through the use of a Persistent Volume Claim. A PVC binds a Persistent Volume to the pod that requested it.
When a pod wants access to a persistent disk, it references the claim, which specifies the size, access mode, and/or storage class it needs from a Persistent Volume. The pod gets access to the PV indirectly, only through the use of a PVC.
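To make that relationship concrete, here is the rough shape of how a pod spec references a claim. The names here (app, datastorage, myclaim, and the mount path) are placeholders for illustration; the real versions appear in the database deployment later in this post.

spec:
  containers:
  - name: app
    volumeMounts:
    - name: datastorage #Matches the volume defined below
      mountPath: /var/lib/data #Where the storage shows up inside the container
  volumes:
  - name: datastorage
    persistentVolumeClaim:
      claimName: myclaim #The pod asks for the claim, never the PV directly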
Claim Policies
We also referenced reclaim policies earlier. A Persistent Volume can have one of several reclaim policies associated with it, including:
Retain – When the claim is deleted, the volume and its data remain, and an administrator must manually reclaim the volume before it can be reused.
Recycle – When the claim is deleted, the volume is scrubbed of its data and becomes available for a new claim. (This policy is deprecated in favor of dynamic provisioning.)
Delete – The persistent volume, along with its underlying storage, is deleted when the claim is deleted.
The reclaim policy (set on the PV, not the PVC) determines what happens to the volume and its data when the claim has been deleted.
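The policy lives in the PV spec as the persistentVolumeReclaimPolicy field:

spec:
  persistentVolumeReclaimPolicy: Retain #Retain, Recycle, or Delete

If you ever need to change it on an existing volume, a patch along these lines should work (mysqlvol is the volume we'll create below):

kubectl patch pv mysqlvol -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'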
Volumes – In Action
For the demonstration in the lab, we'll begin by deploying something that looks like the diagram below. The application service and pods won't change from what we've done before; they'll still provide the front end for our application. The database pod, however, will use a volume claim and a persistent volume to store the database for our application. Also, if you're following my example exactly, I'm using an ingress controller for the application, but however you present your application outside of the Kubernetes cluster is fine.
First, we’ll start by deploying a persistent volume through a manifest file. Remember that you can deploy these manifest files by running:
kubectl apply -f [manifest file].yml
Here is a sample manifest file for the persistent volume. This is a static persistent volume.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysqlvol
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi #Size of the volume
  accessModes:
    - ReadWriteOnce #Type of access
  hostPath:
    path: "/mnt/data" #Host location
After you deploy your persistent volume, you can view it by running:
kubectl get pv
Now that the volume has been deployed, we can deploy our claim.
NOTE: you can deploy the PV, PVC, pods, services, etc. within the same manifest file, but for the purposes of this blog I'll often break them up so we can focus on one part at a time.
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysqlvol
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
Once your claim has been created, you can look for it by running:
kubectl get pvc
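The output should look roughly like the following; the names, sizes, and ages will of course differ in your cluster. The important part is that the STATUS shows Bound, meaning the claim found and bound to our persistent volume (kubectl get pv will now show the PV as Bound as well).

NAME       STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mysqlvol   Bound    mysqlvol   10Gi       RWO            manual         1m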
Great, the volume is set up and a claim is ready to be used. Now we can deploy our database pod and service. The database pod will mount the volume via the claim, and we're specifying in the pod spec that the volume will be mounted at the /var/lib/mysql directory so it can store the database for MySQL.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hollowdb
  labels:
    app: hollowdb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hollowdb
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: hollowdb
    spec:
      containers:
      - name: mysql
        image: theithollow/hollowapp-blog:dbv1
        imagePullPolicy: Always
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: mysqlstorage
          mountPath: /var/lib/mysql #Mount the volume where MySQL stores its data
      volumes:
      - name: mysqlstorage
        persistentVolumeClaim:
          claimName: mysqlvol #Reference the PVC, not the PV
---
apiVersion: v1
kind: Service
metadata:
  name: hollowdb
spec:
  ports:
  - name: mysql
    port: 3306
    targetPort: 3306
    protocol: TCP
  selector:
    app: hollowdb
And now that we've got a working MySQL container with persistent storage for the database, we can deploy our app.
NOTE: In this example, my application container checks to see if a database for the app has already been created. If there is one, it will use that database; if there isn't, it will create a new database on the MySQL server.
Also, I'm using a secret for the connection string, as we discussed in a previous post.
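If you haven't created that secret yet, a command along these lines will do it. The db.string key matches the secretKeyRef in the manifest below; the connection string itself is purely illustrative and depends on how your application expects to connect to MySQL.

kubectl create secret generic hollow-secret \
  --from-literal=db.string='mysql+pymysql://user:password@hollowdb:3306/hollow'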
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hollowapp
  name: hollowapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hollowapp
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: hollowapp
    spec:
      containers:
      - name: hollowapp
        image: theithollow/hollowapp-blog:allin1
        imagePullPolicy: Always
        ports:
        - containerPort: 5000
        env:
        - name: SECRET_KEY
          value: "my-secret-key"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: hollow-secret #Secret containing the DB connection string
              key: db.string
---
apiVersion: v1
kind: Service
metadata:
  name: hollowapp
  labels:
    app: hollowapp
spec:
  type: ClusterIP
  ports:
  - port: 5000
    protocol: TCP
    targetPort: 5000
  selector:
    app: hollowapp
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: hollowapp
  labels:
    app: hollowapp
spec:
  rules:
  - host: hollowapp.hollow.local #Hostname used to reach the app through the ingress controller
    http:
      paths:
      - path: /
        backend:
          serviceName: hollowapp
          servicePort: 5000
Once the application has been deployed, it should connect to the database pod, set up the database, and then start presenting our application. Let's check by accessing it through our ingress controller.
As you can see from the screenshot below, my app came up and I’m registering a user within my application as a test. This proves that I can submit data to my form and have it stored in the database pod.
After I register a user, I can then submit a post just to show that we're posting data and displaying it through the application. You can see that the first post is successful, and it foreshadows our next step.
Now that the app works, let's test the database resiliency. Remember that with a ReplicaSet, Kubernetes will make sure that a certain number of pods is always running. If one fails, it will be rebuilt. That's great when there is no state involved, but now we have a persistent volume with our database in it. We should be able to kill the database pod, and a new one will take its place and attach to the persistent storage. The net result will be a brief outage, but when the pod comes back up, our data should still be there. The diagram below demonstrates what will happen.
So let's break some stuff!
Let's kill our database pod from the command line.
kubectl delete pod [database pod name]
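If you want to watch the replacement happen in real time, something like this works (the -l selector assumes the app: hollowdb label we set in the Deployment earlier):

kubectl get pods -l app=hollowdb -w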
In the screenshot above, you can see that the pod was deleted, and then I ran a "get pod" command to see what's happening. My DB pod is Terminating, and a new one is already in a Running status.
Let's check the state of our application. NOTE: depending on what app you have here, it may or may not handle the loss of a database connection gracefully. Mine did fine in this case.
Back in my application, I'm able to log in with the user that I registered earlier, which is a good sign.
And once I'm logged in, I can see my previous post, which means my database is functioning even though it's in a new pod. The volume retained the correct data and was re-attached to the new pod.
Summary
Well, persistent volumes aren't the most interesting topic to cover around Kubernetes, but if state is involved, they are critical to the resiliency of your applications. If you're designing your applications, consider whether a pod with persistent volumes will suffice, or whether an external service like a cloud database is the right choice for your applications.