
Infrastructure Templates for Generative AI on AWS

We constantly hear from our customers that they want their developers to experiment with Generative AI. No organization wants to be left behind, and all of them are looking for ways to empower their developers and application teams to experiment with use cases powered by Generative AI.

According to recent Gartner research, more than 80% of enterprises will have used Generative AI APIs or deployed Generative AI-enabled applications by 2026.

We have been listening to our customers and are happy to announce Rafay's Templates for AI & Generative AI. Platform teams can now provide their developers with a self-service experience for Gen AI infrastructure, enabling developers to experiment with new and innovative Generative AI use cases.



Customer Requirements

In our conversations with platform teams, developers and key technology partners, a few requirements consistently came up as critical to providing this self-service experience with transparent enforcement of critical controls.

  1. Self Service
    This was emphasized as the most important requirement. Organizations want a frictionless experience for their developers so that nothing bottlenecks experimentation. Platform and Ops teams are keenly aware that they are already swamped supporting other critical priorities.

  2. Cost
    With potentially hundreds or thousands of active developer environments for Gen AI, it is paramount that the cost of every environment is kept extremely low and that under-utilized environments are deprovisioned to save money.

  3. Powered by Standards-based IaC
    Organizations have made significant investments in Infrastructure as Code (IaC), and they want these environments to be backed by their preferred IaC tool, such as Terraform.

  4. Infrastructure Provider
    Most of the organizations that we spoke with were on either AWS or Azure. Many of them have usage commitments and would like to leverage them.

  5. Access to Multiple Models
    We consistently heard that organizations would like to experiment with different models for different use cases. Given how fast the Generative AI landscape is evolving, it is sensible to not be locked into a provider that can only support a single model.

  6. Customize the Model
    Organizations mentioned that they need the ability to further tune or train a foundation model with custom data so that it can be optimized for their use case.

  7. Security
    Organizations said they were uncomfortable using public/open models until they have clear guarantees that their data will not be used publicly.

As we looked at these requirements, we decided to prioritize our first version of the templates for Gen AI on AWS. We will be releasing a version of the templates for Azure in a few weeks.


Typical Steps for Users

Using the Gen AI infrastructure templates is a simple two-step process. In the first step, the platform engineer imports the templates into their Rafay Org. In the second step, the developer or data scientist consumes the templates to provision environments that they can then use. The diagram below shows the high-level steps.

sequenceDiagram
    autonumber
    participant admin as Platform Team
    participant rafay as Rafay
    participant user as Developer

    rect rgb(191, 223, 255)
    Note over admin,rafay: Step 1: Setup Environment Template
    admin->>admin: Clone Git Repo
    admin->>rafay: Setup Environment Template 
    admin->>rafay: Provide Credentials <br>(Infrastructure)
    end

    rect rgb(191, 223, 255)
    Note over rafay,user: Step 2: Use Environment Template 
    user->>rafay: Create Environment <br> based on Environment Template 
    user->>rafay: Use Environment
    user->>rafay: Destroy Environment
    end

Gen AI on Amazon ECS

This Generative AI template provisions an Amazon ECS cluster inside a VPC and deploys a task running an example Gen AI application. The ECS cluster is automatically configured with IAM policies that allow it to make API calls to an LLM in Amazon Bedrock, Amazon's Generative AI service. The high-level architecture is shown in the following image.

Gen AI on Amazon ECS
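
To make the IAM piece more concrete, here is a minimal sketch of the kind of policy document that lets a workload invoke models in Amazon Bedrock. It is not the policy shipped with the template, which this post does not show. The function name `bedrock_invoke_policy`, the `Sid`, the region, and the model ID are illustrative assumptions. The `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` actions and the foundation-model ARN format are standard IAM conventions for Bedrock.

```python
import json


def bedrock_invoke_policy(region: str, model_ids: list[str]) -> dict:
    """Build an IAM policy document allowing invocation of specific Bedrock models.

    Hypothetical helper for illustration; the template's real policy may differ.
    """
    # Foundation-model ARNs have an empty account field (note the "::").
    resources = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowBedrockInvoke",  # assumed statement ID
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    # Region and model ID are placeholders; substitute the ones you use.
    policy = bedrock_invoke_policy("us-east-1", ["anthropic.claude-v2"])
    print(json.dumps(policy, indent=2))
```

Scoping `Resource` to specific model ARNs rather than `*` keeps each developer environment limited to the models the platform team has approved.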

This ECS-based environment will cost approximately $9 per developer per month, making it an extremely affordable development environment.
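
As a rough back-of-the-envelope check, the per-developer figure can be scaled to the fleet sizes mentioned earlier in this post. The sketch below assumes one environment per developer and a cost that scales linearly. Both are simplifying assumptions, and the post does not itemize what the ~$9 figure includes.

```python
# Approximate USD figure quoted in this post for the ECS-based environment.
COST_PER_DEVELOPER_PER_MONTH = 9


def monthly_cost(developers: int, unit_cost: float = COST_PER_DEVELOPER_PER_MONTH) -> float:
    """Estimate total monthly cost, assuming one environment per developer."""
    return developers * unit_cost


if __name__ == "__main__":
    for n in (100, 1000):
        print(f"{n} developers: ~${monthly_cost(n):,.0f}/month")
```

Under these assumptions, 100 developers come to roughly $900 per month and 1,000 developers to roughly $9,000 per month.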

End-to-end provisioning of an ECS-based environment from this template takes approximately 6 minutes.

ECS Environment

Watch a video of the developer experience with this template.


Gen AI on Amazon EKS

This Generative AI template is built on a shared Kubernetes cluster running on Amazon EKS. Every developer gets access to a Kubernetes namespace on the shared EKS cluster. As part of environment creation, an IAM role for service accounts (IRSA) is automatically set up for the namespace with the policies that applications need to make API calls to an LLM in Amazon Bedrock, Amazon's Generative AI service.

Two sample Generative AI applications that the developer can use as a starting point are also deployed to the namespace. The high-level architecture is shown in the following image.

Gen AI on Amazon EKS
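
For readers unfamiliar with IRSA, the sketch below shows the general shape of the IAM trust policy that binds a role to one Kubernetes service account in one namespace. It illustrates the standard IRSA pattern and not the exact resources this template creates. The function name `irsa_trust_policy`, the account ID, the OIDC provider ID, the namespace, and the service account name are all hypothetical placeholders.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy letting one Kubernetes service account assume a role.

    `oidc_provider` is the cluster's OIDC issuer without the https:// prefix.
    Hypothetical helper for illustration only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to a single namespace/service account.
                        f"{oidc_provider}:sub":
                            f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders, not values used by the template.
    policy = irsa_trust_policy(
        account_id="111122223333",
        oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
        namespace="dev-alice",
        service_account="genai-app",
    )
    print(json.dumps(policy, indent=2))
```

Because the `sub` condition names a specific namespace and service account, pods in other developers' namespaces on the same shared cluster cannot assume the role.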

End-to-end provisioning of an EKS-based environment from this template takes approximately 6 minutes.

EKS Environment

Watch a video of the developer experience with this template.


Learn More/Try It

Are you interested in learning more about Rafay's "Templates for AI and Gen AI"?

  1. Read through the documentation for the Templates
  2. Schedule a demo
  3. Meet us and watch a live demo at upcoming conferences and industry events
  4. If you would like to try this yourself, you can sign up for a Free Org.

Important

The Infrastructure Templates for Gen AI on AWS require [Rafay Environment Manager](https://rafay.co/platform/environment-manager/). Please make sure that this product is enabled in your Rafay Org. Contact us for details.