Use the following commands to run a health check on Rok or to investigate a problem.
1. Check if the Rok cluster is initialized:
$ kubectl get job -n rok rok-init -o jsonpath='{.status.succeeded}'
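For scripting, the value printed by the command above can be turned into a pass/fail result. The following is a minimal sketch, assuming that a completed rok-init Job reports 1 (or more) in .status.succeeded, which is how Kubernetes counts successfully finished Pods of a Job. The helper name job_succeeded is hypothetical and not part of Rok or kubectl:

```shell
# job_succeeded: hypothetical helper, not part of Rok or kubectl.
# Takes the value of .status.succeeded as its first argument and
# returns 0 if it is a positive integer, 1 otherwise. An empty value
# (printed by kubectl when the field is not set yet) counts as failure.
job_succeeded() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain number
    esac
    [ "$1" -ge 1 ]
}

# Usage (requires access to the cluster):
#   job_succeeded "$(kubectl get job -n rok rok-init -o jsonpath='{.status.succeeded}')" \
#       && echo OK || echo FAIL
```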
2. Check the status of the RokCluster custom resource:
$ kubectl get rokcluster -n rok rok -o wide
3. Inspect the events on the RokCluster custom resource:
$ kubectl describe rokcluster -n rok rok
4. Check the status of the Rok DaemonSet:
$ kubectl get daemonset -n rok -o wide
5. Inspect the events on the Rok DaemonSet:
$ kubectl describe daemonset -n rok rok
6. Check the status of the Rok Pods:
$ kubectl get pods -n rok -l app=rok -o wide
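With many nodes, the Pod listing above can be filtered so that only Pods needing attention remain. This is a rough sketch, assuming the standard kubectl column order (NAME, READY, STATUS as the first three columns) and input produced with --no-headers. The function name unhealthy_pods is hypothetical:

```shell
# unhealthy_pods: hypothetical helper that reads the output of
# `kubectl get pods --no-headers` on stdin and prints the name of
# every Pod that is not in the Running state or whose READY column
# (for example 0/1) shows fewer ready containers than expected.
unhealthy_pods() {
    awk '{
        split($2, ready, "/")
        if ($3 != "Running" || ready[1] != ready[2]) print $1
    }'
}

# Usage (requires access to the cluster):
#   kubectl get pods -n rok -l app=rok --no-headers | unhealthy_pods
```

An empty result means every listed Pod is Running and fully ready.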
7. Inspect the events and logs of a specific Rok Pod:
$ kubectl describe pod -n rok rok-XYZ
$ kubectl logs -n rok rok-XYZ -f
8. Check if the Rok cluster has a master:
$ ROK_MASTER=$(kubectl get pod -n rok --selector rok-cluster=rok,role=master \
      -o=jsonpath='{.items[0].metadata.name}') && echo OK || echo FAIL
9. Verify that Rok is the default storage provider:
$ kubectl get storageclass rok
rok (default) rok.arrikto.com Delete WaitForFirstConsumer false 2d
10. Verify that Rok is the only snapshot provider:
$ kubectl get volumesnapshotclass rok
rok 2d
11. (Advanced) Check that Rok daemons are running in the master Pod:
$ kubectl exec -ti -n rok $ROK_MASTER -- rok-daemon status
[ ok ] controllerd.0 is running.
[ ok ] filed.0 is running.
[ ok ] s3d.0 is running.
[ ok ] composerd.0 is running.
[ ok ] hasherd.0 is running.
[ ok ] controllerd.root is running.
[ ok ] throwerd.0 is running.
[ ok ] gwd.0 is running.
[ ok ] gw-policyd.0 is running.
[ ok ] gw-taskd.0 is running.
[ ok ] gw-statsd.0 is running.
[ ok ] electiond.0 is running.
[ ok ] masterd.0 is running.
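To spot a problem quickly in the daemon list, one option is to hide every line that reports "[ ok ]" and look at what is left. The sketch below assumes only the "[ ok ]" prefix shown in the sample output above. How rok-daemon reports a failed daemon is not covered here, so any other line is simply printed as is. The function name not_ok_daemons is hypothetical:

```shell
# not_ok_daemons: hypothetical helper that reads `rok-daemon status`
# output on stdin and prints every non-empty line that does not start
# with "[ ok ]". It always returns 0, even when nothing is printed.
not_ok_daemons() {
    grep -v -e '^\[ ok ]' -e '^$' || true
}

# Usage (requires access to the cluster; -ti is omitted because the
# output is piped instead of shown on a terminal):
#   kubectl exec -n rok "$ROK_MASTER" -- rok-daemon status | not_ok_daemons
```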
12. (Advanced) Check the list of Rok cluster members:
$ kubectl exec -ti -n rok $ROK_MASTER -- rok-cluster --cluster rok.rok.svc.cluster.local --etcd-endpoint http://rok-etcd.rok:2379 member-list
--------------------------------------------------------------------
Member ID      Membership Status      Management IP
--------------------------------------------------------------------
rok-node-1     joined                 10.124.1.48
rok-node-2     joined                 10.124.0.59
--------------------------------------------------------------------
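The member list can also be checked automatically. The following sketch assumes the table layout shown above and treats "joined" as the only healthy membership status, since it is the only status that appears in the sample output. The function name members_not_joined is hypothetical:

```shell
# members_not_joined: hypothetical helper that reads the output of
# `rok-cluster ... member-list` on stdin and prints the ID of every
# member whose membership status is anything other than "joined".
members_not_joined() {
    awk '
        $1 ~ /^--*$/ { next }                    # separator lines
        $1 == "Member" && $2 == "ID" { next }    # header row
        NF >= 2 && $2 != "joined" { print $1 }
    '
}

# Usage (requires access to the cluster; -ti is omitted because the
# output is piped instead of shown on a terminal):
#   kubectl exec -n rok "$ROK_MASTER" -- rok-cluster \
#       --cluster rok.rok.svc.cluster.local \
#       --etcd-endpoint http://rok-etcd.rok:2379 member-list | members_not_joined
```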
13. (Advanced) Check if the config of a Rok cluster member is up-to-date:
$ kubectl exec -ti -n rok rok-XYZ -- rok-config \
      --cluster rok.rok.svc.cluster.local \
      --etcd-endpoint http://rok-etcd.rok.svc.cluster.local:2379 \
      check
2022-03-10T13:51:37.384756+0000 rok-config pid=12403/tid=12403/pytid=140634629121472 status:140 [INFO] Member `rok-node-1' is up to date
14. Check the config version of your Rok cluster:
$ kubectl exec -ti -n rok svc/rok -- rok-config \
      --cluster rok.rok.svc.cluster.local \
      --etcd-endpoint http://rok-etcd.rok.svc.cluster.local:2379 \
      version
Current version [Cluster-wide]: v010300_0003
Desired version [This appliance]: v010300_0003
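A mismatch between the two versions is easy to miss by eye. The sketch below compares them, assuming the two-line "Current version" / "Desired version" format shown above, with the version string as the last word on each line. The function name versions_match is hypothetical:

```shell
# versions_match: hypothetical helper that reads the output of
# `rok-config ... version` on stdin. It returns 0 if both a current
# and a desired version were found and they are identical, and 1
# otherwise.
versions_match() {
    awk '
        /Current version/ { current = $NF }
        /Desired version/ { desired = $NF }
        END {
            if (current != "" && current == desired) exit 0
            exit 1
        }
    '
}

# Usage (requires access to the cluster; -ti is omitted because the
# output is piped instead of shown on a terminal):
#   kubectl exec -n rok svc/rok -- rok-config \
#       --cluster rok.rok.svc.cluster.local \
#       --etcd-endpoint http://rok-etcd.rok.svc.cluster.local:2379 \
#       version | versions_match && echo OK || echo MISMATCH
```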
15. View the default configuration of Rok’s automatic garbage collection (GC):
15a. View the cron schedule for garbage collection of Rok tasks:
$ kubectl exec -ti -n rok svc/rok -- rok-config --cluster rok.rok.svc.cluster.local \
      --etcd-endpoint http://rok-etcd.rok.svc.cluster.local:2379 \
      get gw.task_gc.all.cronspec
0 3 * * *
15b. View the time interval after which a Rok task is deleted:
$ kubectl exec -ti -n rok svc/rok -- rok-config --cluster rok.rok.svc.cluster.local \
      --etcd-endpoint http://rok-etcd.rok.svc.cluster.local:2379 \
      get gw.task_gc.all.max_age
1 month
15c. View the cron schedule for garbage collection of successful Rok tasks:
$ kubectl exec -ti -n rok svc/rok -- rok-config --cluster rok.rok.svc.cluster.local \
      --etcd-endpoint http://rok-etcd.rok.svc.cluster.local:2379 \
      get gw.task_gc.success.cronspec
0 4 * * *
15d. View the time interval after which a successful Rok task is deleted:
$ kubectl exec -ti -n rok svc/rok -- rok-config --cluster rok.rok.svc.cluster.local \
      --etcd-endpoint http://rok-etcd.rok.svc.cluster.local:2379 \
      get gw.task_gc.success.max_age
1 week
16. Verify that Rok’s automatic GC is active by inspecting the corresponding log file. The content of the log file might vary depending on the usage of your Rok cluster and its current configuration:
$ kubectl exec -ti -n rok service/rok -c rok -- cat /var/log/rok/gc.log
2022-04-12T11:10:04.840975+0000 rok-gc pid=10621/tid=10621/pytid=140302586820032 gc:293 [INFO] Starting Garbage Collection at epoch 13
...
2022-04-12T11:10:04.850722+0000 rok-gc pid=10621/tid=10621/pytid=140302586820032 gc:410 [INFO] Completed GC of chocks
2022-04-12T11:10:04.863778+0000 rok-gc pid=10621/tid=10621/pytid=140302586820032 gc:528 [INFO] All done! Exiting.
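Instead of reading the whole log, you can check whether the most recent GC run finished. This is a minimal sketch, assuming that the "Starting Garbage Collection" and "All done! Exiting." messages from the sample above mark the start and the end of a run. Since the log content may vary, treat the result as a hint rather than a definitive check. The function name gc_run_completed is hypothetical:

```shell
# gc_run_completed: hypothetical helper that reads gc.log on stdin.
# It returns 0 if the last "Starting Garbage Collection" message is
# followed by an "All done! Exiting." message, and 1 otherwise
# (including when the log contains no GC run at all).
gc_run_completed() {
    awk '
        /Starting Garbage Collection/ { started = 1; finished = 0 }
        /All done! Exiting/           { finished = 1 }
        END {
            if (started && finished) exit 0
            exit 1
        }
    '
}

# Usage (requires access to the cluster; -ti is omitted because the
# output is piped instead of shown on a terminal):
#   kubectl exec -n rok service/rok -c rok -- cat /var/log/rok/gc.log \
#       | gc_run_completed && echo OK || echo FAIL
```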