ML Lab Installation¶
Preparation¶
Install Docker
ML Lab requires Docker to be installed on your host machine.
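A quick way to confirm this prerequisite is to check that the `docker` CLI is on the PATH. The snippet below is a minimal sketch; the helper name `have_cmd` is an illustrative assumption and not part of ML Lab.

```shell
# Hypothetical helper (not part of ML Lab): succeeds if a command is on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd docker; then
  # Print the installed Docker version.
  docker --version
else
  echo "Docker not found: install Docker before installing ML Lab."
fi
```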
Choose Service Runtime¶
ML Lab installs and orchestrates various services in the form of Docker containers. At the moment, we offer the following installation modes:
- Docker Local Mode: Deploys all services on the same machine as ML Lab. Easy to set up and manage, but does not scale.
- Kubernetes Mode: Distributes all services across a cluster of nodes via Kubernetes. For more information about Kubernetes, please refer to the official guide.
Tip
If you are not sure which mode to use, we recommend the local mode.
Installation¶
If you have special requirements (e.g. data persistence, SSL, hardware restrictions), please consult the configuration section before installing ML Lab.
Configuration¶
Parameters¶
The container can be configured with the following environment variables (--env):
Variable | Description | Default |
---|---|---|
LAB_ACTION | Available actions: install, uninstall, serve, update, update-full | install |
LAB_PORT | Main port that the ML Lab instance is accessible from. | 8091 |
LAB_BASE_URL | If you deploy ML Lab behind a proxy, you can define a base url. It must not end with a slash. For example, if the web app should be accessible behind /test/app instead of /app, the LAB_BASE_URL would be defined as /test. | |
SERVICES_RUNTIME | Determines the technology used for container orchestration. Currently supported: local, kubernetes | local |
JWT_SECRET | The secret used for the authentication layer. Must be at least 32 characters long. | (required) |
SERVICE_SSL_ENABLED | Set to true to enable SSL encryption (HTTPS support). If no certificate is provided, a self-signed certificate is generated; this certificate expires within a year, so the ML Lab container has to be re-created for fresh certificates. | false |
LAB_SSL_ROOT | The path on the host system to the folder containing a custom SSL certificate. The folder must contain a cert.crt and a cert.key file. This folder is mounted into the ML Lab container, so to renew the certificate it must be replaced on the host system and the ML Lab container has to be restarted. If the LAB_SSL_ROOT variable is not set, ML Lab will look for a volume named lab_ssl. If no such named volume exists, a self-signed certificate will be generated. The LAB_SSL_ROOT variable is only valid for Docker local mode, as secrets are used on Kubernetes. | (optional) |
LAB_NAMESPACE | The namespace used for the ML Lab installation. At the moment, we suggest not changing this value. | lab |
SERVICES_MEMORY_LIMIT | The memory limit (in GB) that every ML Lab managed service, including workspaces, is restricted to. | 100 |
SERVICES_CPU_LIMIT | The CPU limit (number of CPUs) that every ML Lab managed service, including workspaces, is restricted to. | 8 |
SERVICES_STORAGE_LIMIT | The storage limit (in GB) that every ML Lab managed service, including workspaces, is restricted to. For Docker-local and Kubernetes custom cluster setup, there is only a soft enforcement in the workspace. In case of the Kubernetes managed cluster setup, a volume with this size is created on the respective infrastructure and mounted into the service pod (check this as this can lead to substantial costs!). | 100 |
MAX_CONTAINER_SIZE | The maximum size (in GB) that an ML Lab managed container is allowed to grow to. If you add data to a container's writeable layer - basically a path where no volume is mounted - the container size grows. If not set, a container can theoretically consume all of the host's storage. For Docker-local, ML Lab contains a REST-method (/containers/shutdown-disk-exceeding) that will remove all non-core containers that exceed this limit (in Docker-local mode, currently only workspaces are removed). In Kubernetes mode, the native functionality of ephemeral-storage limit is used (for all non-core pods). Set to '-1' to disable it. | 100 |
LAB_DATA_ROOT | Basic mount path where all data is stored. | (optional) |
LAB_DATA_WORKSPACE_ROOT | Basic mount path where all workspace data is stored. Overwrites the LAB_DATA_ROOT variable for workspace mount. | (optional) |
LAB_DEBUG | If true, ML Lab will expose all ports and print out debug logs. | false |
LAB_SSH_ENABLED | Enables the SSH jumphost: if true, ML Lab publishes port 22 on startup and starts an SSH server. Users can use the jumphost to SSH into their own workspace. SSHing into the workspace container itself is not possible. | true |
ALLOW_SELF_REGISTRATIONS | If true, ML Lab will allow user self-registration via the register dialog, or automatically create users if external OIDC authentication is enabled. | true |
WORKSPACE_BACKUP | If true, workspaces will be automatically backed up every day to the ML Lab Storage and restored if necessary. | false |
WORKSPACE_IMAGE | Docker image used for user workspaces. Should be built on top of the ml-workspace base image. | (optional) |
LAB_MANAGED_KUBERNETES | Specifies whether it is running on a managed Kubernetes cluster instance. For more information, please have a look at the Section about Kubernetes managed cluster setup. | false |
LAB_STORAGE_CLASS | Only relevant for Kubernetes managed cluster setup. The storage class name that is used by the persistent volume claims for issuing the volumes mounted into the pods of Minio, Mongo, and the Workspaces. | lab-storageclass |
LAB_PVC_MINIO_STORAGE_LIMIT | Only relevant for Kubernetes managed cluster setup. It defines the size of the volume created and mounted into the Minio pod in GB. Minio is the storage for uploaded files such as datasets and models. | 100 |
LAB_PVC_MONGO_STORAGE_LIMIT | Only relevant for Kubernetes managed cluster setup. It defines the size of the volume created and mounted into the Mongo pod in GB. Mongo contains the data for experiments and users. | 5 |
LAB_IMAGE_REGISTRY | The registry prefix from where the images lab-service and lab-model-service are loaded. If you, for example, deploy it in Azure you might not have access to the internal Artifactory registry. Make sure to push the images to the defined registry. If you don't use the one-click model-deployment feature, you don't have to push the lab-model-service image. | Default DockerHub registry |
LAB_EXTERNAL_OIDC_AUTH_URL | The authorization endpoint used for external OIDC authentication. The client will be redirected to this page to authenticate with the external authentication provider. For detailed information see: External OIDC Authentication | (optional) |
LAB_EXTERNAL_OIDC_TOKEN_URL | The token endpoint used for external OIDC authentication. It will be used by the backend to obtain the OIDC identity token in exchange for an authorization code. For detailed information see: External OIDC Authentication | (optional) |
LAB_EXTERNAL_OIDC_CLIENT_ID | The OAuth 2.0 client identifier used for external OIDC authentication. For detailed information see: External OIDC Authentication | (optional) |
LAB_EXTERNAL_OIDC_CLIENT_SECRET | The OAuth 2.0 client secret used for external OIDC authentication. For detailed information see: External OIDC Authentication | (optional) |
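Two of the constraints in the table are easy to check before installing: JWT_SECRET must be at least 32 characters long, and LAB_BASE_URL must not end with a slash. The following is a minimal sketch of such a pre-flight check; the helper `valid_base_url` and the example value `/test` are illustrative assumptions, not part of ML Lab.

```shell
# Generate a random 64-character hex string (32 random bytes) to use as JWT_SECRET.
JWT_SECRET=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')

# Hypothetical check (not part of ML Lab): LAB_BASE_URL must not end with a slash.
valid_base_url() {
  case "$1" in
    */) return 1 ;;
    *)  return 0 ;;
  esac
}

LAB_BASE_URL=/test
if valid_base_url "$LAB_BASE_URL" && [ "${#JWT_SECRET}" -ge 32 ]; then
  echo "Settings look valid; pass them to docker run via --env JWT_SECRET=... --env LAB_BASE_URL=$LAB_BASE_URL"
fi
```

Keep the generated secret somewhere safe so that the same value can be passed again on later runs.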
Proxy¶
If a proxy is required, you can pass the proxy configuration via the http_proxy and no_proxy environment variables. For example: --env http_proxy=http://myproxy:1234
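As a sketch of how both variables could be combined, the snippet below builds the extra flags and prints the resulting command instead of running it. The `no_proxy` list and the variable names `HTTP_PROXY_URL`, `NO_PROXY_LIST` and `PROXY_FLAGS` are assumptions for illustration.

```shell
# Example values (assumptions); replace them with your own proxy and exclusion list.
HTTP_PROXY_URL="http://myproxy:1234"
NO_PROXY_LIST="localhost,127.0.0.1"

# Extra flags to add to the docker run command that installs ML Lab.
PROXY_FLAGS="--env http_proxy=$HTTP_PROXY_URL --env no_proxy=$NO_PROXY_LIST"

# Dry run: print the flags instead of starting a container.
echo "docker run --rm $PROXY_FLAGS <other options> lab-service:latest"
```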
Install with Docker Local Mode¶
Note: The install commands here currently assume that you have built the Docker images locally yourself. To see how to use prebuilt images, please check the README of the GitHub repository.
To start ML Lab in a single-host (local) deployment, execute:
docker run --rm --env LAB_PORT=8091 -v /var/run/docker.sock:/var/run/docker.sock --env LAB_ACTION=install lab-service:latest
After the installation is finished (after several minutes depending on internet speed), visit http://<HOSTIP>:8091 and log in with admin:admin.
Enable SSL
For SSL setup, create the certificate (the files must be called cert.crt and cert.key) and specify their path on the host machine via the LAB_SSL_ROOT environment variable. Additionally, you need to set SERVICE_SSL_ENABLED to true:
docker run --rm --env LAB_PORT=8091 \
--env SERVICE_SSL_ENABLED=true \
--env LAB_SSL_ROOT=/workspace/ssl \
-v /var/run/docker.sock:/var/run/docker.sock \
lab-service:latest
Alternatively, instead of specifying LAB_SSL_ROOT, the certificate can be provided in a Docker volume named lab_ssl.
If you don't provide a custom certificate, a self-signed certificate is generated and used. Be aware that applications such as your browser might not trust the certificate.
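Because the certificate files must be named exactly cert.crt and cert.key, it can help to verify the folder before starting the container. The check below is a minimal sketch; the function `check_ssl_root` is a hypothetical helper, and the openssl command in the comment is one common way to create a test certificate, not something prescribed by ML Lab.

```shell
# Hypothetical pre-flight check (not part of ML Lab): the folder passed as
# LAB_SSL_ROOT must contain both cert.crt and cert.key.
check_ssl_root() {
  [ -f "$1/cert.crt" ] && [ -f "$1/cert.key" ]
}

# A self-signed pair for testing could be created with openssl, for example:
#   openssl req -x509 -newkey rsa:2048 -nodes -days 365 -subj "/CN=localhost" \
#     -keyout /workspace/ssl/cert.key -out /workspace/ssl/cert.crt

if check_ssl_root /workspace/ssl; then
  echo "Certificate files found."
else
  echo "cert.crt and/or cert.key missing in /workspace/ssl."
fi
```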
Install with Kubernetes Mode (Own Cluster)¶
These steps are for a custom cluster that does not use a hyperscaler / automated infrastructure in the background. If you have a managed cluster (such as AWS EKS), then please follow the steps in the next Section.
Install Kubernetes
For Kubernetes Mode, please make sure that your server/cluster is installed with Docker and Kubernetes (Version >=1.11). You can find more information about Kubernetes installation here.
Preparation
For the Kubernetes deployment, a few minor steps are needed to prepare the host, such as creating the directory where the data is stored, since Kubernetes does not have the concept of Docker's named volumes. The Kubernetes version of ML Lab needs the kube config of the cluster as well as access to the Docker socket.
# label the master node with role=master, as we use it in ML Lab
kubectl label nodes <name-of-master-node> role=master
# install nfs-common for mounting nfs-service in workspace
apt-get install nfs-common
# create the data root directory that will be used by Kubernetes
# Default is `/workspace/lab/<namespace>/data`
# Hence, when creating a Lab for default namespace 'lab':
mkdir -p /workspace/lab/lab/data
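If you use a namespace other than the default, the data root follows the same `/workspace/lab/<namespace>/data` pattern. The snippet below is a small sketch of deriving that path; `lab_data_root` is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: derive the default data root for a given ML Lab namespace,
# following the pattern /workspace/lab/<namespace>/data.
lab_data_root() {
  printf '/workspace/lab/%s/data\n' "$1"
}

LAB_NAMESPACE=lab
LAB_DATA_ROOT=$(lab_data_root "$LAB_NAMESPACE")

# This is the directory to create with mkdir -p and to pass as LAB_DATA_ROOT.
echo "$LAB_DATA_ROOT"
```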
Installation
# On Mac: in ~/.kube/config for 'server' field replace 'localhost' with 'docker.for.mac.localhost'
docker run --rm --env LAB_PORT=30001 \
-v /root/.kube/config:/root/.kube/config \
-v /var/run/docker.sock:/var/run/docker.sock \
--env SERVICES_RUNTIME=k8s \
--env LAB_DATA_ROOT=/workspace/lab/lab/data \
lab-service:latest
After the installation is finished (after several minutes depending on internet speed), visit http://<HOSTIP>:30001 and log in with admin:admin.
Tip
When a container, e.g. the Workspace container, is launched on a node for the first time, its Docker image has to be pulled to that node. This can take some time, during which the user sees a loading screen. To prevent this, you can pull the images manually on each node beforehand.
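One possible way to pre-pull images is to loop over the nodes and run `docker pull` on each of them over SSH. The sketch below only prints the commands (a dry run). The node names, SSH access to the nodes and the image list are assumptions; add your workspace image (for example the value of WORKSPACE_IMAGE) to the list as needed.

```shell
# Assumptions: placeholder node host names reachable via SSH, and an example image list.
NODES="node-1 node-2"
IMAGES="lab-service:latest lab-model-service:latest"

# Print one pull command per node and image; pipe the output to sh to execute it.
prepull_commands() {
  for node in $NODES; do
    for image in $IMAGES; do
      echo "ssh $node docker pull $image"
    done
  done
}

prepull_commands
```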
Enable SSL
For SSL setup, create the certificates, mount them into the container's directory at /resources/ssl (-v /workspace/ssl:/resources/ssl:ro) and start with --env SERVICE_SSL_ENABLED=true:
docker run --rm --env LAB_PORT=30001 \
-v /root/.kube/config:/root/.kube/config \
-v /var/run/docker.sock:/var/run/docker.sock \
--env SERVICES_RUNTIME=k8s \
--env LAB_DATA_ROOT=/workspace/lab/lab/data \
--env SERVICE_SSL_ENABLED=true \
-v /workspace/ssl:/resources/ssl:ro \
lab-service:latest
The files have to be named cert.crt and cert.key.
Install with Kubernetes (Managed Cluster such as AWS EKS)¶
TODO
Difference to Custom Cluster Setup¶
In the Managed Cluster scenario, the biggest difference is that you don't have to take care of the volumes as here we leverage persistent volume (claims) to automatically issue volumes in the background. Hence, no need for the NFS service here.
In this setup mode, ML Lab accesses the cluster via its ServiceAccount permissions and not via a mounted kube-config.
Costs
As volumes are created automatically on the infrastructure based on the Kubernetes persistent volume claims, make sure to keep an eye on your costs and set the volume sizes via the respective environment variables accordingly. Even after deleting the Kubernetes PersistentVolume and PersistentVolumeClaim resources, the actual volumes might still exist on the cloud provider and have to be deleted manually.
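To get a rough feeling for the storage that might be provisioned, the default sizes from the parameter table can be added up. This is a back-of-the-envelope sketch that assumes one Minio volume, one Mongo volume and one volume per workspace; other services with their own volumes are ignored, and `estimate_storage_gb` is a hypothetical helper.

```shell
# Defaults taken from the parameter table (all in GB).
LAB_PVC_MINIO_STORAGE_LIMIT=100
LAB_PVC_MONGO_STORAGE_LIMIT=5
SERVICES_STORAGE_LIMIT=100

# Hypothetical estimate: Minio + Mongo + one volume per workspace.
estimate_storage_gb() {
  echo $(( LAB_PVC_MINIO_STORAGE_LIMIT + LAB_PVC_MONGO_STORAGE_LIMIT + $1 * SERVICES_STORAGE_LIMIT ))
}

estimate_storage_gb 10   # prints 1105
```

To see what actually exists in the cluster, `kubectl get pvc --all-namespaces` lists the claims and `kubectl get pv` lists the volumes; volumes left behind on the cloud provider have to be checked in the provider's own console.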
After Installation¶
Please make sure to change the admin password (User-menu -> Change Password) after the installation was successful. We also recommend activating a data backup job as described here. Check out the administration section in this documentation for more information on how to update/uninstall Lab, or manage services.