
# Overview

## Introduction

In environments where clusters contain a very high number of namespaces and workloads, sometimes hundreds or thousands, certain system applications such as rafay-connector can consume more memory than they are allocated. This can lead to stability issues, including unexpected termination of the application when it exceeds its memory limits.

To address this, the CPU and memory allocations for these applications can be adjusted by manually editing their deployment configurations. This manual process is tedious and error-prone, however, especially when repeated across many clusters. To simplify it, a workaround allows resource overrides to be configured for specific applications delivered through the default blueprint. This approach provides greater flexibility and improves stability for clusters operating at scale. For detailed steps, see Workaround for Resource Overrides.
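For context, the manual approach amounts to patching the resource requests and limits on each affected deployment. The sketch below shows one way to do this with the official Kubernetes Python client; the namespace, container name, and resource values are illustrative assumptions, and the blueprint may reconcile such manual edits away, which is why the supported override mechanism described above is preferable at scale.

```python
# A minimal sketch of the manual approach, using the official Kubernetes
# Python client (pip install kubernetes). The namespace, container name,
# and resource values below are assumptions for illustration only.
from kubernetes import client, config


def patch_deployment_resources(name: str, namespace: str, container: str,
                               cpu_limit: str, memory_limit: str) -> None:
    """Raise the CPU/memory limits of one container in a deployment."""
    config.load_kube_config()  # or config.load_incluster_config()
    apps = client.AppsV1Api()
    # Strategic merge patch: containers are merged by name, so only the
    # named container's resources are changed.
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [{
                        "name": container,
                        "resources": {
                            "limits": {"cpu": cpu_limit, "memory": memory_limit},
                        },
                    }],
                },
            },
        },
    }
    apps.patch_namespaced_deployment(name=name, namespace=namespace, body=patch)


if __name__ == "__main__":
    # Hypothetical values; tune them to the usage observed on your clusters.
    patch_deployment_resources("rafay-connector-v3", "rafay-system",
                               "connector", "500m", "1Gi")
```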

This workaround is currently applicable only to the following applications:

- rafay-prometheus-adapter
- rafay-prometheus-helm-exporter
- rafay-prometheus-kube-state-metrics
- rafay-prometheus-metrics-server
- rafay-prometheus-node-exporter
- rafay-prometheus-server
- controller-manager-v3
- edge-client
- ingress-controller-v1-controller
- rafay-connector-v3
- velero
- opa-gatekeeper
- rook-ceph

```mermaid
sequenceDiagram
    participant Cluster
    participant SystemApp
    participant Admin
    participant Blueprint

    Cluster->>SystemApp: High number of workloads
    SystemApp-->>Cluster: Increased memory usage

    Admin->>Blueprint: Configure resource overrides
    Blueprint->>SystemApp: Apply CPU/memory limits

    SystemApp-->>Cluster: Runs more reliably
```