The following sequence diagram describes the high level steps that are carried out in sequence during the provisioning process. Customers can optionally automate the entire sequence using Rafay's APIs or automation tools.
Watch a video of provisioning a "Converged, Multi Master" Rafay MKS cluster on CentOS with only local storage.
STEP 1: Select Cluster Configuration¶
Review the supported cluster configurations and select your desired cluster configuration. This will determine the number of nodes you need to prepare to initiate cluster provisioning.
| Type | Number of Initial Nodes |
|------|-------------------------|
| Converged, Single Master | 1 Node (1 Master/Worker) |
| Dedicated, Single Master | 2 Nodes (1 Master + 1 Worker) |
| Converged, Multi Master | 3 Nodes (3 Masters/Workers) |
| Dedicated, Multi Master | 4 Nodes (3 Masters + 1 Worker) |
STEP 2: Prepare Nodes¶
Create VMs or bare metal instances compatible with the infrastructure requirements.
- Ensure that you have SSH access to all the instances/VMs
Ensure you have the exact number of nodes for initial provisioning as per the cluster configuration selected in the previous step. Additional worker nodes should be added after the cluster is successfully provisioned.
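Before moving on, it can be worth confirming SSH access to every node in one pass. A minimal sketch is shown below; the key file name, login user, and node IPs are placeholders (TEST-NET addresses), so substitute the values for the instances/VMs you just created.

```shell
#!/usr/bin/env bash
# Check SSH reachability of every prepared node.
# NODE_IPS, KEY, and SSH_USER are hypothetical placeholders.
NODE_IPS=("192.0.2.10" "192.0.2.11" "192.0.2.12")
KEY="keypairfile.pem"   # assumed name of your SSH key pair file
SSH_USER="centos"       # assumed login user for the node OS image

RESULTS=()
for ip in "${NODE_IPS[@]}"; do
  if ssh -i "$KEY" -o BatchMode=yes -o StrictHostKeyChecking=no \
      -o ConnectTimeout=5 "${SSH_USER}@${ip}" 'true' 2>/dev/null; then
    RESULTS+=("OK: ${ip}")
  else
    RESULTS+=("UNREACHABLE: ${ip}")
  fi
done
printf '%s\n' "${RESULTS[@]}"
```

Any node reported as `UNREACHABLE` should be fixed before proceeding, since the remaining steps all require SSH access.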
STEP 3: Create a Cluster¶
- Log in to the Rafay Console.
- Navigate to the Project where you would like the cluster provisioned.
- Click on New Cluster
- Select "Create a New Cluster" and click Continue
- Select "Environment" as "Data center/Edge"
- Select "Linux Installer"
- Provide a unique name for the cluster and click Continue
- Select a location for the cluster from the drop down list
- Select cluster blueprint from the drop down.
- If you created a custom blueprint, select it and select the blueprint version.
- If not, accept the default blueprint provided by Rafay
- Select the Kubernetes version that you want to deploy
- Select the OS and Version you used for the nodes
- Select GlusterFS if you require distributed storage.
- If selecting multiple storage types, select the default storage class.
- Enable "Approve Nodes Automatically" if you do not require an approval gate for nodes to join the cluster
- Enable Install GPU drivers if your nodes support GPUs and you want Rafay to provision required drivers
- Select Multi Master if you selected this cluster configuration
- Select Dedicated Master if you selected this cluster configuration
Auto Approval of nodes helps streamline the cluster provisioning and expansion workflows by eliminating the "manual" approval gate for nodes to join the cluster.
STEP 4: Download Conjurer and Secrets¶
- Review the Node Installation Instructions section on the Rafay Console
- Download the cluster bootstrap binary (i.e. Rafay Conjurer)
- Download the cluster activation secrets (i.e. passphrase and credential files)
- SCP the three (3) files to the nodes you created in the previous step
Note that the activation secrets (passphrase and credentials) are unique per cluster and cannot be reused for other clusters.
An illustrative example is provided below. This assumes that you have the three downloaded files in the current working directory. The three files will be securely uploaded to the “/tmp” folder on the instance.
$ scp -i <keypairfile.pem> * <username>@<Node's External IP Address>:/tmp
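For a multi-node cluster, the same copy can be repeated for each node with a short loop. This is a sketch under the assumption that the three downloaded files are in the current working directory; the key file, user, and node IPs are placeholders.

```shell
#!/usr/bin/env bash
# Copy the conjurer binary and activation secrets to every node's /tmp.
# KEY, SSH_USER, NODE_IPS, and file names are hypothetical placeholders --
# use the exact file names shown in your Rafay Console download step.
KEY="keypairfile.pem"
SSH_USER="centos"
NODE_IPS=("192.0.2.10" "192.0.2.11" "192.0.2.12")
FILES=(conjurer-linux-amd64.tar.bz2 \
       onpremcluster-passphrase.txt \
       onpremcluster-credentials.pem)

for ip in "${NODE_IPS[@]}"; do
  scp -i "$KEY" -o StrictHostKeyChecking=no -o ConnectTimeout=5 \
      "${FILES[@]}" "${SSH_USER}@${ip}:/tmp" \
      || echo "copy to ${ip} failed"
done
```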
STEP 5: Perform Preflight Checks¶
It is strongly recommended to perform the automated preflight tests on every node to ensure that the node has compatible hardware, software and configuration. The following preflight tests are currently performed:
| # | Description and Type of Preflight Check |
|---|-----------------------------------------|
| 1 | Is the node running a compatible OS and version? |
| 2 | Does the node have minimum CPU resources? |
| 3 | Does the node have minimum memory resources? |
| 4 | Does the node have outbound Internet connectivity? |
| 5 | Is the node able to connect to the Rafay Controller? |
| 6 | Is the node able to perform a DNS lookup of the Rafay Controller? |
| 7 | Is the node able to establish an mTLS connection to the Rafay Controller? |
| 8 | Is the node's time synchronized with NTP? |
| 9 | Does the node have minimum and compatible storage? |
| 10 | Is Docker already installed on the node? |
| 11 | Is Kubernetes already installed on the node? |
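Several of these checks can also be spot-checked by hand on a node before running the tool, which is handy when debugging a failed preflight. The sketch below covers a few of them; the controller hostname is a hypothetical placeholder, and the exact resource thresholds are enforced by the tool itself.

```shell
#!/usr/bin/env bash
# Manual spot checks mirroring a few of the automated preflight tests.
# CONTROLLER is a hypothetical placeholder for your Rafay Controller FQDN.
CONTROLLER="console.rafay.example.com"

# Check 1: OS and version
echo "OS: $(grep -m1 PRETTY_NAME /etc/os-release 2>/dev/null || uname -s)"
# Checks 2 and 3: CPU and memory resources
echo "CPU cores: $(nproc)"
echo "Memory (GB): $(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)"
# Check 6: DNS lookup of the controller
getent hosts "$CONTROLLER" >/dev/null && echo "DNS: ok" || echo "DNS: failed"
# Check 8: NTP sync (timedatectl may be absent on minimal images)
timedatectl show -p NTPSynchronized 2>/dev/null || echo "NTPSynchronized=unknown"
# Checks 10 and 11: pre-existing Docker / Kubernetes packages
command -v docker  >/dev/null && echo "docker: present"  || echo "docker: absent"
command -v kubelet >/dev/null && echo "kubelet: present" || echo "kubelet: absent"
```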
- From the node installation instructions, copy the preflight check command
- SSH into each node and run the preflight check using the provided passphrase and credential files
An illustrative example is shown below, in which the preflight checks detected an incompatible node for provisioning.
```
$ tar -xjf conjurer-linux-amd64.tar.bz2 && sudo ./conjurer -edge-name="onpremcluster" \
    -passphrase-file="onpremcluster-passphrase.txt" \
    -creds-file="onpremcluster-credentials.pem" -t

[+] Performing pre-tests
[+] Operating System check
[+] CPU check
[+] Memory check
[+] Internet connectivity check
[+] Connectivity check to rafay registry
[+] DNS Lookup to the controller
[+] Connectivity check to the Controller
!INFO: Attempting mTLS connection to salt.core.stage.rafay-edge.net:443
[+] Multiple default routes check
[+] Time Sync check
[+] Storage check
!WARNING: No raw unformatted volume detected with more than 50GB. Cannot configure node as a master or storage node.
[+] Detected following errors during the above checks
!ERROR: System Memory 28GB is less than the required 32GB.
!ERROR: Detected a previously installed version of Docker on this node. Please remove the prior Docker package and retry.
!ERROR: Detected a previously installed version of Kubernetes on this node. Please remove the prior Kubernetes packages (kubectl, kubeadm, kubelet, kubernetes-cni, etc.) and retry.
```
- If there are no errors, proceed to the next step
- If there are warnings or errors, fix the issues and rerun the preflight check before proceeding to the next step
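The preflight check can also be run on all nodes from a workstation over SSH rather than logging in to each node individually. This is a sketch under the same placeholder assumptions as before (key, user, IPs, and file names should come from your own node installation instructions).

```shell
#!/usr/bin/env bash
# Run the preflight check (conjurer -t) on every prepared node over SSH.
# KEY, SSH_USER, NODE_IPS, and file names are hypothetical placeholders.
KEY="keypairfile.pem"
SSH_USER="centos"
NODE_IPS=("192.0.2.10" "192.0.2.11" "192.0.2.12")

# Preflight command as copied from the Rafay Console instructions
PREFLIGHT='cd /tmp && tar -xjf conjurer-linux-amd64.tar.bz2 && \
  sudo ./conjurer -edge-name="onpremcluster" \
    -passphrase-file="onpremcluster-passphrase.txt" \
    -creds-file="onpremcluster-credentials.pem" -t'

for ip in "${NODE_IPS[@]}"; do
  echo "--- preflight on ${ip} ---"
  ssh -i "$KEY" -o BatchMode=yes -o StrictHostKeyChecking=no \
      -o ConnectTimeout=5 "${SSH_USER}@${ip}" "$PREFLIGHT" \
      || echo "preflight failed on ${ip}"
done
```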
STEP 6: Run Conjurer¶
- From the node installation instructions, copy the provided command to run the Rafay Conjurer binary
- SSH into the nodes and run the installer using the provided passphrase and credentials.
An illustrative example is provided below.
```
$ sudo ./conjurer -edge-name="onpremcluster" \
    -passphrase-file="onpremcluster-passphrase.txt" \
    -creds-file="onpremcluster-credentials.pem"

[+] Initiating edge node install
[+] Provisioning node
[+] Step 1. Installing node-agent
[+] Step 2. Setting hostname to node-72djl2g-192-168-0-20-onpremcluster
[+] Step 3. Installing credentials on node
[+] Step 4. Configuring node-agent
[+] Step 5. Starting node-agent
[+] Successfully provisioned node
```
The conjurer is a “cluster bootstrap agent” that connects and registers the nodes with the Rafay Controller. Information about the Controller and authentication credentials for registration is available in the activation secrets files.
Once this step is complete, the node will show up on the Rafay Console as DISCOVERED.
STEP 7: Approve Node¶
This is an optional approval step that acts as a security control to ensure that administrators can inspect and approve a node before it can become part of the cluster.
- Click on Approve button to approve the node to this cluster
- In a few seconds, you will see the status of the node updated to "Approved" in the Rafay Console
- Once approved, the node is automatically probed and all information about the node is presented to the administrator on the Rafay Console.
STEP 8: Configure Node¶
This is a mandatory configuration step that allows the infrastructure administrator to specify the “role” for the node. The administrator also provides critical information such as the ingress IP address and storage details for the node.
Without the configuration step, cluster provisioning cannot be initiated.
- Click on Configure and follow the instructions provided by the wizard
- Provide at least one Ingress IP for the cluster
- Select the “Storage” role and select the storage location (unformatted, raw block device) from the drop down list.
- Click Save
STEP 9: Initiate Provisioning¶
At this point, everything the Rafay Controller needs to start provisioning Kubernetes and all required software add-ons has been provided. These are automatically provisioned and configured to operationalize the cluster.
- Click on Provision
- A progress bar is displayed showing progress as the software is downloaded, installed and configured on all the nodes.
- Once provisioning is complete, the cluster will report itself as “READY" to accept workloads.
The end-to-end provisioning process can take 10-40 mins. This is dependent on the number of nodes in your cluster and the Internet bandwidth available to your nodes.
Once provisioning is complete, users will be presented with a cluster card on the Rafay Console.
- Click on the cluster name and select the Configuration tab to view the provisioned cluster's configuration details
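Once the cluster reports READY, it can also be verified from the command line with kubectl. The kubeconfig file path below is an assumption; use wherever you saved the kubeconfig downloaded from the Rafay Console.

```shell
#!/usr/bin/env bash
# Verify the provisioned cluster with kubectl.
# KUBECONFIG path is a hypothetical placeholder for the file
# downloaded from the Rafay Console.
export KUBECONFIG="${HOME}/Downloads/onpremcluster-kubeconfig.yaml"

# All masters and workers should show as Ready
kubectl get nodes -o wide || echo "kubectl not configured yet"
# Control-plane and add-on pods should be Running
kubectl get pods -n kube-system || echo "kubectl not configured yet"
```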