Use

At this point, the developer can deploy and deprovision environments based on the shared environment template. Note that the developer:

  • Does not need to have any knowledge of Terraform
  • Does not need access to privileged credentials for AWS
  • Does not need any help from the Platform team to deploy their environment

Use Gen AI Environment

Once the developer logs into the Rafay Org (SSO using an Identity Provider is recommended), they will only have access to the specific Project they have been authorized to use. Their level of access in the newly created project is controlled using RBAC (role-based access control). It is recommended that they be given only the "Environment Template User" role, which allows them to use the provided Environment Templates, but nothing more.

Important

Although the recommended workflow uses an integration with an Identity Provider (IdP) to provide a Single Sign-On (SSO) experience, organizations can also use locally managed users.

sequenceDiagram
    participant dev as Developer
    participant rafay as Rafay <br> Environment Manager
    participant csp as ECS Cluster
    participant idp as Identity Provider 

    dev->>idp: Access Environment 
    idp-->>dev: Redirect to Rafay 
    dev-->>rafay: SSO to Rafay with <br> RBAC (Env Template User)

    dev->>rafay: Create Environment <br>based on Env Template 
    rafay->>csp: Provision new ECS Cluster w/VPC, subnets and Gen AI App  
    rafay-->>dev: Environment Ready
    dev->>csp: Uses GenAI Environment 
    dev-->>csp: Explore 1st Gen AI App 
    dev->>rafay: Deploy 2nd Gen AI App 
    dev-->>rafay: Deploy Custom Gen AI App  

Step 1: Create Application Environment Resource

In this step, a second user, such as a developer, will create an environment resource in the controller based on the previously created environment template. The environment resource is used to create the VPC, ECS cluster and Generative AI application, and to control the lifecycle of the application environment.

  • Log into the controller and select your project
  • Navigate to Environments -> Environments
  • Click New Environment
  • Enter gen-ai-ecs for the name
  • Select the existing application environment template
  • Select the environment template version
  • Click Create
  • Navigate to Input Variables
  • Click Add Variable
  • Enter image_location for the variable name
  • Select Text for the value type
  • Enter the image location public.ecr.aws/rafay-dev/gen-ai-sample-chat-app for the value
  • Click Add Variable
  • Enter container_port for the variable name
  • Select Text for the value type
  • Enter the container port number 8000 for the value
  • Click Save

Configure Environment
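
The two input variables entered in this step can be pictured as a small mapping. The Python sketch below is purely illustrative: the dictionary layout and the `validate_inputs` helper are assumptions made for this example and are not part of the Rafay product or its API. It records the values from this step and checks that the port, which is entered as Text, parses to a valid TCP port number.

```python
# Hypothetical representation of the Step 1 input variables.
# Names and values come from the steps above; the structure is an assumption.
INPUT_VARIABLES = {
    "image_location": "public.ecr.aws/rafay-dev/gen-ai-sample-chat-app",
    "container_port": "8000",  # entered as Text in the UI, so kept as a string
}


def validate_inputs(variables):
    """Return a list of problems found in the variables (empty if none)."""
    problems = []
    image = variables.get("image_location", "")
    if not image or " " in image:
        problems.append("image_location must be a non-empty image reference without spaces")
    port = variables.get("container_port", "")
    if not port.isdigit() or not 1 <= int(port) <= 65535:
        problems.append("container_port must be a number between 1 and 65535")
    return problems


if __name__ == "__main__":
    print(validate_inputs(INPUT_VARIABLES) or "inputs look valid")
```

A check like this can catch a mistyped port or a stray space in the image location before the environment is published.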


Step 2: Deploy Application Environment

In this step, the developer user will now deploy the previously created application environment. Deploying the environment will create a VPC, ECS Cluster and deploy a generative AI application onto the cluster.

  • Log into the controller and select your project
  • Navigate to Environments -> Environments
  • Click on the gen-ai-ecs environment
  • Click Publish

The environment will begin to publish. Publishing can take about 5 minutes to complete.


Step 3: Access Application

We have provided two Gen AI example applications in a public ECR repository. The environment template will deploy one of the Gen AI example applications as part of the environment creation.

Once the environment has finished deploying, the user can use the environment output to find the application endpoint. The endpoint can be entered into a browser to test the application.

  • Log into the controller and select your project
  • Navigate to Environments -> Environments
  • Click on the gen-ai-ecs environment
  • Click Resource
  • Expand the resource named gen-ai-aws-app to reveal the public endpoint

Access App

  • Copy the endpoint and enter it into a browser

You will now see the first application. This application uses Amazon Bedrock to act as an intelligent chatbot. You can enter text into the chat and the engine will respond.

App1
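
Besides opening the endpoint in a browser, a quick scripted check can confirm that it responds. The following is a minimal sketch using only the Python standard library. The helper names and the placeholder endpoint are hypothetical, and the sketch assumes the endpoint is served over plain HTTP when the copied value has no scheme; adjust it if your environment output differs.

```python
import urllib.error
import urllib.request


def normalize_endpoint(endpoint):
    """Add an http:// scheme if the copied endpoint has none (an assumption)."""
    endpoint = endpoint.strip()
    if not endpoint.startswith(("http://", "https://")):
        endpoint = "http://" + endpoint
    return endpoint


def is_reachable(endpoint, timeout=5.0):
    """Return True if the endpoint answers an HTTP GET with a status below 400."""
    try:
        url = normalize_endpoint(endpoint)
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False


if __name__ == "__main__":
    # Placeholder: replace with the endpoint shown under gen-ai-aws-app.
    print(is_reachable("ENDPOINT-FROM-RESOURCE-OUTPUT"))
```

If the check returns False right after publishing, the application may still be starting, so it can be worth retrying after a short wait.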


Step 4: Update Application

We will now deploy the second GenAI application to ECS using the Environment resource that was previously created.

  • Log into the controller and select your project
  • Navigate to Environments -> Environments
  • Click on the gen-ai-ecs environment
  • Click Edit Configuration
  • Navigate to Input Variables
  • Change the image_location value to public.ecr.aws/rafay-dev/genai:latest
  • Change the container_port value to 80
  • Click Save

Update App
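
To make the edit in this step explicit, the sketch below compares the input variables of the first and second applications and reports what changes. The dictionaries and the `changed_variables` helper are illustrative assumptions, not a Rafay API; the values themselves are taken from Step 1 and Step 4.

```python
# Input variables for the first and second example applications.
# The values are from this guide; the dictionary layout is an assumption.
FIRST_APP = {
    "image_location": "public.ecr.aws/rafay-dev/gen-ai-sample-chat-app",
    "container_port": "8000",
}
SECOND_APP = {
    "image_location": "public.ecr.aws/rafay-dev/genai:latest",
    "container_port": "80",
}


def changed_variables(current, updated):
    """Return {name: (old, new)} for every variable whose value differs."""
    return {
        name: (current.get(name), value)
        for name, value in updated.items()
        if current.get(name) != value
    }


if __name__ == "__main__":
    for name, (old, new) in changed_variables(FIRST_APP, SECOND_APP).items():
        print(f"{name}: {old} -> {new}")
```

Both variables change together here: the second image listens on a different port, so updating only the image location would leave the environment pointing at the wrong port.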


Step 5: Deploy Application Environment

In this step, the developer user will now deploy the updated application environment. Deploying the environment will update the generative AI application with a new container image.

  • Log into the controller and select your project
  • Navigate to Environments -> Environments
  • Click on the gen-ai-ecs environment
  • Click Publish

The environment will begin to publish. Publishing can take about 5 minutes to complete.


Step 6: Access Application

Once the environment has finished deploying, the user can use the environment output to find the application endpoint. The endpoint can be entered into a browser to test the application.

  • Log into the controller and select your project
  • Navigate to Environments -> Environments
  • Click on the gen-ai-ecs environment
  • Click Resource
  • Expand the resource named gen-ai-aws-app to reveal the public endpoint

Access App

  • Copy the endpoint and enter it into a browser

You will now see the second application. This application takes a text file as input and uses Amazon Bedrock to produce a summary of its content.

App2


Develop & Deploy Your Containers

At this point, the developer is ready to develop and test their own containerized Gen AI applications. They are welcome to use the source code for the two example applications as a starting point. The typical steps are as follows:

  • Build the new GenAI container image
  • Upload the container image to a container registry such as ECR
  • Deploy their Gen AI application by updating the image location within the Environment resource
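
The build and upload steps above can be sketched as a short helper that assembles the usual Docker and AWS CLI commands. This is a minimal sketch under stated assumptions: it targets a private Amazon ECR registry, and the registry host, repository name, tag and region shown are hypothetical placeholders. The helper only builds the command strings so they can be reviewed before being run in a shell.

```python
import shlex


def build_and_push_commands(registry, repository, tag, region, context="."):
    """Return the full image reference and the shell commands to publish it."""
    image = f"{registry}/{repository}:{tag}"
    # Authenticate Docker to a private ECR registry.
    login = (
        f"aws ecr get-login-password --region {shlex.quote(region)} | "
        f"docker login --username AWS --password-stdin {shlex.quote(registry)}"
    )
    # Build the image from the given build context and tag it.
    build = shlex.join(["docker", "build", "-t", image, context])
    # Upload the tagged image to the registry.
    push = shlex.join(["docker", "push", image])
    return image, [login, build, push]


if __name__ == "__main__":
    # All four values below are hypothetical placeholders.
    image, commands = build_and_push_commands(
        registry="123456789012.dkr.ecr.us-west-2.amazonaws.com",
        repository="my-genai-app",
        tag="v1",
        region="us-west-2",
    )
    for command in commands:
        print(command)
    print("Set image_location to:", image)
```

The image reference that is printed last is the value to enter for image_location in the Environment resource, together with the port the new container listens on for container_port.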

In summary, with Rafay, developers can develop, deploy and validate their Generative AI applications on Amazon ECS clusters, using Amazon Bedrock for the foundation models.