Deployment Requirements
The following document describes different sizing scenarios for the core platform components. They can be used as a first estimate of the hardware and software needed to run the data platform.
Keep in mind that these sizing recommendations can only be a starting point. The concrete figures depend on the use cases executed on the platform, the load those use cases generate, and the hardware architecture and generation on which the platform runs.
General requirements
A few requirements have to be met or prepared before the installation can start:
- A running Kubernetes cluster; we recommend a managed Kubernetes service.
- An Ingress controller installed on the cluster as well as a cert-manager.
- A general admin kubeconfig with full access rights.
- A (wildcard) domain name for the Ingress controller.
- An external email server that can be used as an email relay with authentication.
- At least one usable RWO storage class.
This tutorial shows how to set up a minimal Kubernetes testing environment.
It is advised to create the configuration (inventory) first, so that you understand which namespace(s) need to be prepared by the k8s hoster and which additional information you will need.
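A minimal sketch for verifying these prerequisites against a cluster is shown below; the namespace names ingress-nginx and cert-manager are assumptions and may differ in your environment.

# At least one RWO-capable storage class should be listed (ideally one marked as default)
kubectl get storageclass
# An IngressClass for the Ingress NGINX Controller should exist
kubectl get ingressclass
# Ingress NGINX Controller and Cert-Manager pods should be Running (namespace names are assumptions)
kubectl get pods -n ingress-nginx
kubectl get pods -n cert-manager
# The admin kubeconfig in use should have full access rights
kubectl auth can-i '*' '*' --all-namespaces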
Software and Version requirements
Basic Kubernetes knowledge (what a namespace, a storage class, and an operator are) is needed to follow this document.
The installation is designed to run on a more or less vanilla Kubernetes; we do not expect any vendor-specific services.
What the installation does expect is:
- Ingress NGINX Controller - Traefik is currently not supported
- Make sure that snippet annotations are activated for the Ingress NGINX Controller; if not, run a helm upgrade:
helm upgrade ingress-nginx --repo https://kubernetes.github.io/ingress-nginx --namespace ingress-nginx --create-namespace ingress-nginx --set controller.allowSnippetAnnotations=true
- Cert-Manager
- Important: Different environments may require additional steps to get Cert-Manager working. See this list of tutorials; a basic installation and issuer sketch follows below this list.
- CoreDNS
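If Cert-Manager is not yet present, the following sketch shows one common way to install it via Helm and to create a Let's Encrypt ClusterIssuer for the NGINX ingress class; the issuer name, the e-mail address, and the chart options are assumptions and must be adapted to your environment.

helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true
# Example ClusterIssuer (name and e-mail address are placeholders)
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
    - http01:
        ingress:
          ingressClassName: nginx
EOF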
The currently tested Kubernetes versions are (a version check is sketched below the list):
- 1.25
- 1.26
- 1.27
- 1.28
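A quick way to confirm which Kubernetes version the cluster is actually running:

# Server (control plane) version
kubectl version
# Kubelet version per node is shown in the VERSION column
kubectl get nodes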
Infrastructure requirements
The data platform has no direct dependencies on any hardware-specific functionality. The only requirement is an x86 64-bit CPU architecture, because at the time of writing all container images used are available for x86 CPUs. Many of them are also available for ARM-based CPUs, but not all. This may change in the future.
The platform can run as a sandbox environment for evaluation scenarios on a single host, using the host's filesystem directly. This is only suitable for feature testing, not for running a scalable production platform.
The minimum number of hosts for a full Kubernetes environment is three. Three or more nodes can provide a highly available environment for running the platform. This requirement is defined by Kubernetes itself, because etcd and other control plane services need at least three nodes to run in an HA scenario.
In a managed Kubernetes scenario, the control plane and etcd can be provided as shared services by the provider. In this case, only worker nodes must be provided - the recommendation here is three or more as well.
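A minimal sketch for checking the number of worker nodes and their CPU architecture (the platform images require amd64/x86-64):

kubectl get nodes -o wide
# Architecture per node (should report amd64)
kubectl get nodes -o custom-columns=NAME:.metadata.name,ARCH:.status.nodeInfo.architecture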
Hosting Requirements
The following chapters define hardware requirements for different scenarios of platform usage.
Sandbox (1 Node Cluster)
For installing and running a sandbox, a one-node-cluster can be used.
In this case, we recommend the following hardware specs (a capacity check is sketched below the list):
- 8-10 vCPUs (can be shared)
- min. 32 GB RAM
- min. 600 GB SSD Storage
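A hedged sketch for comparing these recommendations with what a node actually offers; the node name is a placeholder and kubectl top requires the metrics-server:

# Allocatable CPU, memory and ephemeral storage of a node (replace the node name)
kubectl describe node <node-name> | grep -A 6 Allocatable
# Current usage, if the metrics-server is installed
kubectl top nodes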
Minimum (3 Node Cluster)
For installing and running the minimum platform, a setup with three or more nodes can be used. Any managed Kubernetes solution will work, too. We recommend starting with three nodes and adding more when needed.
In this case, we recommend the following hardware specs per Node:
- 8-10 vCPUs (can be shared, dedicated preferred)
- min. 32 GB RAM
- min. 300 GB SSD Storage
Additionally, network storage of at least 500 GB with SSD performance should be provided. The network storage must support the Kubernetes ReadWriteOnce and ReadWriteMany access modes; the node-internal storage is used for system storage. The amount of network storage needed depends on the implemented use cases and must be decided on a per-project basis.
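Whether the provided storage class actually supports ReadWriteMany can be verified with a short test; the storage class name network-storage is a placeholder:

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwx-test
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: network-storage
  resources:
    requests:
      storage: 1Gi
EOF
# Should reach status Bound (some provisioners bind only when a pod mounts the volume)
kubectl get pvc rwx-test
kubectl delete pvc rwx-test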
Standard (3 Node Cluster)
For installing and running the standard platform, a setup with three or more nodes can be used. Any managed Kubernetes solution will work, too. We recommend starting with three nodes and adding more when needed.
In this case, we recommend the following hardware specs per Node:
- min. 12 dedicated Cores
- min. 64 GB RAM
- min. 300 GB SSD Storage
Additionally, network storage of at least 500 GB with SSD performance should be provided; more may be required depending on the use cases. The network storage must support the Kubernetes ReadWriteOnce and ReadWriteMany access modes; the node-internal storage is used for system storage. The amount of network storage needed must be decided on a per-project basis.
Additional requirements
Each use case can require additional resources. Because most use cases generate peak resource requests but use only a fraction of those peak resources over time, there should be an efficient way to provide more resources on demand.
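One common mechanism for this is horizontal pod autoscaling, sketched below for a hypothetical use-case deployment named use-case-worker; it assumes the metrics-server is installed, and the actual target and thresholds are project-specific:

kubectl apply -f - <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: use-case-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: use-case-worker
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
EOF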
Stages
For Proof-of-Concept setups only one sandbox installation is recommended.
For Implementation Projects, we recommend at least two stages for the development process:
- Staging: For integration and testing work
- Production: For the productive usage
Additional central DEV stages can be used for early testing but are not needed in every case.
Development work is normally done on decentralized DEV instances per developer.
Final Recommendations
Overall, we recommend a hardware sizing that follows Green IT principles, which means over-provisioning should be avoided. In most hosting scenarios, additional resources can be booked at short notice, so static up-scaling can be covered this way. When the use cases are not yet clear at the beginning, the Minimum sizing is a good starting point.
The best way to scale is dynamic scaling, but this depends on the capabilities of the hosting provider.