Introduction
The Arrikto Enterprise Kubeflow (EKF) distribution extends the capabilities of the OSS Kubeflow platform with additional automation, reproducibility, portability, and security features. To achieve this, EKF has to perform additional system activity behind the scenes. For example, this activity serves to:
- Improve Pipeline Resiliency.
- Manage Storage Options & Optimizations.
- Facilitate Environment Reproducibility and Portability.
- Share Experiments at Scale.
As a result, you must consider the following resource groups when sizing your EKF Clusters:
- CPU
- RAM
- Local Disk Space
CPU
For a full EKF deployment, you should reserve 10.252 cores, plus 1.935 cores per node.
RAM
For a full EKF deployment, you should reserve 8.04 GiB, plus 3.338 GiB per node.
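The CPU and RAM figures above can be turned into a quick estimate for a given cluster size. The following is a minimal sketch, not an official sizing tool. It assumes that the first figure in each formula is a fixed, cluster-wide amount and that the "per node" figure is multiplied by the number of nodes in the cluster. The function name `ekf_reservation` and the constant names are hypothetical and used only for illustration.

```python
# Reservation figures quoted in this article for a full EKF deployment.
CPU_BASE_CORES = 10.252      # fixed CPU reservation (assumed cluster-wide)
CPU_PER_NODE_CORES = 1.935   # additional CPU reservation for each node
RAM_BASE_GIB = 8.04          # fixed RAM reservation (assumed cluster-wide)
RAM_PER_NODE_GIB = 3.338     # additional RAM reservation for each node


def ekf_reservation(node_count: int) -> tuple[float, float]:
    """Return (cpu_cores, ram_gib) to reserve for EKF on node_count nodes.

    Assumes the per-node figures scale linearly with the node count.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    cpu = CPU_BASE_CORES + CPU_PER_NODE_CORES * node_count
    ram = RAM_BASE_GIB + RAM_PER_NODE_GIB * node_count
    # Round to three decimals to match the precision of the quoted figures.
    return round(cpu, 3), round(ram, 3)


if __name__ == "__main__":
    for nodes in (1, 3, 5):
        cpu, ram = ekf_reservation(nodes)
        print(f"{nodes} node(s): reserve {cpu} cores and {ram} GiB for EKF")
```

Under these assumptions, a three-node cluster would set aside about 16.057 cores and 18.054 GiB for EKF. Whatever remains on the nodes after this reservation is what is available to your own workloads.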
Local Disk Space
Arrikto's documentation contains extensive information on how to approach Rok local disk space management. For more information, please refer to this section of our documentation: https://docs.arrikto.com/develop/user/rdm.html
Summary
For more information on environment management, please refer to our detailed documentation, and consider using "Scale-in Protection."