Install and configure Spark History Server (SHS) on Kubernetes (K8s)
Category: Kubernetes, Spark
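Installing and configuring the Spark History Server (SHS) on Kubernetes with event logs stored in Google Cloud Storage (GCS) is a common stumbling block. Here is one working setup on GKE.
Create a values file named shs-gcs.yaml, which is passed to Helm to deploy the SHS service. PVC and NFS storage are disabled; GCS is enabled with the secret name, the key file name, and the event log directory.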
pvc:
  enablePVC: false
  existingClaimName: nfs-pvc
  eventsDir: "/"
nfs:
  enableExampleNFS: false
  pvName: nfs-pv
  pvcName: nfs-pvc
gcs:
  enableGCS: true
  secret: history-secrets
  key: tc-sc-bi-bigdata-ifwk-new-dev-48a2f0a984bb.json
  logDirectory: gs://tc-sc-bi-bigdata-ingestion-dev-spark-on-k8s/eventsLogs/
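The history server can only show applications whose event logs end up in logDirectory, so the Spark jobs themselves have to write there. A minimal sketch, assuming the jobs are launched with spark-submit and the GCS connector is already available to them; the helper name event_log_conf is hypothetical, while spark.eventLog.enabled and spark.eventLog.dir are standard Spark properties:

```shell
#!/bin/sh
# Hypothetical helper: print the spark-submit flags that make a job write
# its event log to the directory the history server reads from.
event_log_conf() {
  log_dir="$1"   # should equal gcs.logDirectory in shs-gcs.yaml
  printf '%s\n' \
    "--conf spark.eventLog.enabled=true" \
    "--conf spark.eventLog.dir=${log_dir}"
}

# Example: print the flags for the bucket used in this post.
event_log_conf "gs://tc-sc-bi-bigdata-ingestion-dev-spark-on-k8s/eventsLogs/"
```

The printed flags can be appended to an existing spark-submit command line.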
******************************** Step 1 ********************************
(base) saurabhkumar@Saurabhs-MacBook-Pro stats % gcloud container clusters get-credentials spark-on-gke
Fetching cluster endpoint and auth data.
kubeconfig entry generated for spark-on-gke.
(base) saurabhkumar@Saurabhs-MacBook-Pro stats % kubectl cluster-info
Kubernetes master is running at https://10.2.4.110
GLBCDefaultBackend is running at https://10.2.4.110/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy
KubeDNS is running at https://10.2.4.110/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://10.2.4.110/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
******************************** Step 2 ********************************
(base) saurabhkumar@Saurabhs-MacBook-Pro stats % kubectl get secrets
NAME                  TYPE                                  DATA   AGE
default-token-2v6p5   kubernetes.io/service-account-token   3      71d
spark-sa              Opaque                                1      70d
(base) saurabhkumar@Saurabhs-MacBook-Pro spark-3.1.1-bin-hadoop2.7 % kubectl create secret generic history-secrets --from-file=gcp-project-48a2f0a984bb.json
secret/history-secrets created
(base) saurabhkumar@Saurabhs-MacBook-Pro spark-3.1.1-bin-hadoop2.7 % kubectl get secrets
NAME                                                    TYPE                                  DATA   AGE
default-token-2v6p5                                     kubernetes.io/service-account-token   3      71d
history-secrets                                         Opaque                                1      5s
sh.helm.release.v1.spark-history-server-1624358382.v1   helm.sh/release.v1                    1      11m
spark-history-server-1624358382-token-mlh5j             kubernetes.io/service-account-token   3      11m
spark-sa                                                Opaque                                1      70d
(base) saurabhkumar@Saurabhs-MacBook-Pro spark-3.1.1-bin-hadoop2.7 % kubectl describe secrets/history-secrets
Name: history-secrets
Namespace: default
Labels: <none>
Annotations: <none>
Type: Opaque
Data
====
gcp-project-48a2f0a984bb.json: 2358 bytes
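The chart appears to locate the credentials by file name inside the mounted secret, so the gcs.key value in shs-gcs.yaml should equal the name of the file passed to --from-file. The two names shown in this post differ, presumably because they were redacted differently. A minimal sketch of a pre-install check, assuming a flat values file like the one above; the helper name key_matches_values is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the first "key:" entry in the values
# file equals the base name of the JSON key file given to --from-file.
key_matches_values() {
  values_file="$1"
  key_file="$2"
  # Drop trailing spaces, then strip the "key:" label from the matching line.
  wanted="$(sed -n -e 's/[[:space:]]*$//' \
                   -e 's/^[[:space:]]*key:[[:space:]]*//p' "$values_file" | head -n 1)"
  [ "$wanted" = "$(basename "$key_file")" ]
}

# Usage (run from the directory that holds both files):
#   key_matches_values shs-gcs.yaml gcp-project-48a2f0a984bb.json \
#     || echo "gcs.key does not match the file name stored in the secret"
```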
******************************** Step 3 ********************************
(base) saurabhkumar@Saurabhs-MacBook-Pro stats % helm repo add stable https://charts.helm.sh/stable
"stable" already exists with the same configuration, skipping
(base) saurabhkumar@Saurabhs-MacBook-Pro spark-3.1.1-bin-hadoop2.7 % helm list -n ifw-reloaded
NAME                              NAMESPACE      REVISION   UPDATED                                STATUS     CHART                        APP VERSION
spark-history-server-1616415984   ifw-reloaded   1          2021-03-22 17:56:34.463601 +0530 IST   deployed   spark-history-server-1.4.3   2.4.0
(base) saurabhkumar@Saurabhs-MacBook-Pro spark-3.1.1-bin-hadoop2.7 % helm install stable/spark-history-server --values shs-gcs.yaml --generate-name
WARNING: This chart is deprecated
NAME: spark-history-server-1624360585
LAST DEPLOYED: Tue Jun 22 16:46:32 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Get the application URL by running the following commands. Note that the UI would take a minute or two to show up after the pods and services are ready.
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status by running 'kubectl -n default get svc -w spark-history-server-1624360585'
export SERVICE_IP=$(kubectl get svc --namespace default spark-history-server-1624360585 -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
NOTE: If on OpenShift, run the following command instead:
export SERVICE_IP=$(oc get svc --namespace default spark-history-server-1624360585 -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
echo http://$SERVICE_IP:map[name:http-historyport number:18080]
******************************** Step 4 ********************************
(base) saurabhkumar@Saurabhs-MacBook-Pro spark-3.1.1-bin-hadoop2.7 % kubectl get svc
NAME                              TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)           AGE
kubernetes                        ClusterIP      10.1.0.1      <none>        443/TCP           71d
spark-history-server-1624360585   LoadBalancer   10.1.255.20   <pending>     18080:31739/TCP   17s
(base) saurabhkumar@Saurabhs-MacBook-Pro spark-3.1.1-bin-hadoop2.7 % kubectl get svc
NAME                              TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)           AGE
kubernetes                        ClusterIP      10.1.0.1      <none>        443/TCP           71d
spark-history-server-1624360585   LoadBalancer   10.1.255.20   10.1.0.113    18080:31739/TCP   54s
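The echo line in the chart notes from Step 3 prints the port as map[name:http-historyport number:18080], so the URL has to be assembled by hand once EXTERNAL-IP is assigned. A minimal sketch, assuming the service port 18080 shown above; the helper name shs_url is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: build the history server URL from the service's
# external IP. 18080 is the service port shown in the kubectl get svc output.
shs_url() {
  ip="$1"
  port="${2:-18080}"
  printf 'http://%s:%s\n' "$ip" "$port"
}

# With cluster access, the IP can be read from the service once it is assigned:
#   SERVICE_IP=$(kubectl get svc --namespace default spark-history-server-1624360585 \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
shs_url "10.1.0.113"
```

Opening the printed URL in a browser should show the SHS UI, provided the address is reachable from your network.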
******************************** Step 5 ********************************
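To uninstall SHS in one go, run helm uninstall with the release name and namespace. In the run below the uninstall completed with two errors: the ClusterRole and ClusterRoleBinding created by the chart could not be deleted, because the service account in use has no permission to delete cluster-scoped RBAC resources.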
(base) saurabhkumar@Saurabhs-MacBook-Pro spark-3.1.1-bin-hadoop2.7 % helm uninstall spark-history-server-1616415984 -n ifw-reloaded
Error: uninstallation completed with 2 error(s): clusterrolebindings.rbac.authorization.k8s.io "spark-history-server-1616415984-crb" is forbidden: User "system:serviceaccount:default:ifw-team" cannot delete resource "clusterrolebindings" in API group "rbac.authorization.k8s.io" at the cluster scope; clusterroles.rbac.authorization.k8s.io "spark-history-server-1616415984-cr" is forbidden: User "system:serviceaccount:default:ifw-team" cannot delete resource "clusterroles" in API group "rbac.authorization.k8s.io" at the cluster scope
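The two objects named in the error stay behind in the cluster and need to be removed by someone with cluster-admin rights. A minimal sketch that prints the cleanup commands, assuming the release-name suffixes -crb and -cr seen in the error message; the helper name leftover_rbac_cleanup is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the kubectl commands that remove the two
# cluster-scoped objects named in the error above. It only prints them, so
# a cluster admin can review the commands before running them.
leftover_rbac_cleanup() {
  release="$1"
  printf 'kubectl delete clusterrolebinding %s-crb\n' "$release"
  printf 'kubectl delete clusterrole %s-cr\n' "$release"
}

leftover_rbac_cleanup "spark-history-server-1616415984"
```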
Please feel free to give your valuable feedback.