cloud native notes

Technology, musings and anything else that comes to mind.

Agentic Enterprise: OCI GenAI Agents Are Secure, Observable, and Scalable (A Step-by-Step Guide)

2026-03-23 · Matt Ferguson

We are entering an era where applications and infrastructure will be designed around agents that can use tools and compose sophisticated workflows. This “agentic” world is more than a wrapper around an LLM; it’s a foundation that connects models to enterprise data and actions.

This post is a step-by-step walkthrough of deploying OCI’s Generative AI Agent Service as part of a complete enterprise stack.

The use case: Enterprise RAG with guardrails

Building enterprise agents that answer complex questions grounded in your company’s proprietary documents (RAG) requires a production-grade architecture with security, observability, and governance as top priorities.

The architecture we are building handles the heavy lifting of an enterprise deployment:

  • Authentication & authorization: only the right users can query the agent.
  • Session orchestration: maintaining multi-turn conversation state across interactions.
  • Pre/post processing: data is cleaned and validated before it reaches the model and before it returns to the user.
  • Guardrails: built-in protection against PII leaks, prompt injection, and inappropriate content.

What you will be building

In this walkthrough, we will create a complete end-to-end stack:

  1. Web Front-end (NGINX + TypeScript UI): A user-facing chat interface built in TypeScript (compiled to vanilla JS) served by NGINX. The UI handles markdown rendering, session continuity, and citation display.
  2. OCI Load Balancer: Distributing traffic to ensure the front-end remains responsive.
  3. OCI API Gateway: The security and traffic management layer that monitors and protects our HTTP endpoints.
  4. OCI Serverless Function (Python application): The execution logic that bridges the API Gateway with the Generative AI Agent service.
  5. OCI Generative AI Agent: A managed Generative AI agent for inferencing with guardrails, RAG, reranking, and a vector embedding knowledge base.

OCI Generative AI Agent: RAG Use Case

Step-by-Step: from zero to agentic

We aren’t just clicking buttons in the console today. We are building a repeatable, automated pipeline. Now might be a good time to review the GitHub repository and clone it to your local environment.

👉 GitHub: OCI Generative AI Agent Demo (step-by-step) Repo

1. the environment setup

Before we touch Terraform, we need our local environment ready. You’ll need a Python virtual environment to manage the OCI SDK dependencies and a dedicated .tfvars file for secrets.

# Create and activate a clean environment
python3 -m venv venv-oci
source venv-oci/bin/activate

# Install the OCI SDK
pip install oci
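With the SDK installed, it’s worth a quick sanity check before going further. A minimal sketch: `check_config` and its key list are illustrative helpers (not part of the SDK), though the five keys are the ones an OCI config profile requires.

```python
# Sanity-check the local OCI setup. check_config() is a small illustrative
# helper; the __main__ block needs the `oci` package installed above.
REQUIRED_KEYS = ("user", "fingerprint", "key_file", "tenancy", "region")

def check_config(profile: dict) -> list:
    """Return the required keys missing from a parsed ~/.oci/config profile."""
    return [k for k in REQUIRED_KEYS if not profile.get(k)]

if __name__ == "__main__":
    import oci  # installed via `pip install oci`

    cfg = oci.config.from_file()          # DEFAULT profile in ~/.oci/config
    missing = check_config(cfg)
    if missing:
        raise SystemExit("~/.oci/config is missing: %s" % ", ".join(missing))
    oci.config.validate_config(cfg)       # SDK-level validation
    print("OCI config OK, region:", cfg["region"])
```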

2. download and configure the OCI CLI

The OCI CLI is our primary interface for testing and lightweight automation. Ensure your config file (~/.oci/config) is properly mapped to your tenancy.

👉 GitHub: OCI CLI Installation

# Verify your connectivity
oci iam compartment list --compartment-id-in-subtree true

3. creating the OCI Generative AI Agent

Before creating the agent itself, you need to give it something to reason over. The RAG pipeline is grounded in documents you supply, stored in OCI Object Storage and indexed into a vector store. We’ll set that up first, then wire it into the agent.

A. create an Object Storage bucket

The agent’s knowledge base pulls documents from an OCI Object Storage bucket. Create a dedicated bucket to hold your RAG source files:

  • Go to Storage → Object Storage & Archive Storage → Buckets.
  • Select Create Bucket, give it a name (e.g., genai-agent-docs), and leave the defaults (Standard tier, Oracle-managed encryption).
  • Click Create.

B. upload your knowledge base documents

The agent will index whatever you put in this bucket. For this demo we’ll build a RAG knowledge base grounded in your professional background (like my own CareerStack.app agent). We’ll write structured, plain-text Markdown files that will ultimately be chunked and embedded by a pipeline managed by the OCI Generative AI Agent Service.

Create one or more .md files covering the content you want the agent to know. Some examples:

  • profile.md: your professional summary, skills, and career history
  • projects.md: descriptions of notable work, architectures you’ve built, and outcomes
  • certifications.md: certifications, training, and areas of expertise

A sample profile.md might look like (you can create anything that reflects your own background, projects, or certifications):

# Professional Profile

## Summary
Jane Smith is a cloud architect with 15+ years of experience in enterprise
infrastructure, cloud platforms, and AI/ML workloads.

## Core skills
- Cloud Platforms: Oracle Cloud Infrastructure (OCI), AWS, Azure
- Infrastructure as Code: Terraform, Ansible
- AI/ML: LLM orchestration, RAG pipelines, vector search
- Networking: SD-WAN, BGP, data center design

## Recent experience
### Acme corp — senior cloud architect
Leading cloud modernization initiatives, GenAI platform adoption,
and cloud-native architecture design for enterprise customers.

Upload the files to your bucket:

  • Open your genai-agent-docs bucket.
  • Select Upload, drag in your markdown files, and confirm.

The more structured and specific your documents, the more precise the agent’s answers will be. Headers, bullet points, and clear section labels help the chunking and retrieval pipeline significantly.
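Console uploads work fine, but the same step can be scripted with the OCI Python SDK. A sketch, using the bucket name from this post; `pick_markdown` is an illustrative helper, and the `docs/` folder is an assumption about where you keep the files locally.

```python
# Script the knowledge-base upload with the OCI Python SDK instead of the
# console. pick_markdown() is an illustrative helper; the bucket name is the
# example from this post.
from pathlib import Path

def pick_markdown(names):
    """Keep only the .md files, in a stable order."""
    return sorted(n for n in names if n.endswith(".md"))

if __name__ == "__main__":
    import oci

    cfg = oci.config.from_file()
    client = oci.object_storage.ObjectStorageClient(cfg)
    namespace = client.get_namespace().data
    folder = Path("docs")  # assumed local folder holding your .md files
    for name in pick_markdown(p.name for p in folder.iterdir()):
        client.put_object(
            namespace_name=namespace,
            bucket_name="genai-agent-docs",
            object_name=name,
            put_object_body=(folder / name).read_bytes(),
            content_type="text/markdown",
        )
        print("uploaded", name)
```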

C. create the knowledge base

With documents in Object Storage, create a Knowledge Base to index them:

  • Go to Analytics and AI → AI Services → Generative AI Agents.
  • Select Knowledge Bases → Create Knowledge Base.
  • Give it a name (e.g., career-stack-kb) and select your compartment.
  • Under Data sources, select Add data source and choose OCI Object Storage.
  • Point it at your genai-agent-docs bucket and set the file type to Markdown.
  • Click Create. OCI will ingest and vectorize the documents into an OpenSearch index — this takes a few minutes depending on document volume.

Wait for the Knowledge Base status to show Active before proceeding.
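If you prefer to script the wait, a generic poller does the job. A sketch: the control-plane client name and the `lifecycle_state` field are assumptions based on the OCI Python SDK’s usual patterns, and the knowledge base OCID is a placeholder.

```python
# Poll a resource until it reaches the wanted lifecycle state. The poller is
# generic and testable; the __main__ wiring assumes the SDK's Generative AI
# Agents control-plane client and uses a placeholder OCID.
import time

def wait_for_state(get_state, wanted="ACTIVE", interval=0.0, max_polls=60):
    """Call get_state() until it returns `wanted`; fail fast on FAILED."""
    for _ in range(max_polls):
        state = get_state()
        if state == wanted:
            return state
        if state == "FAILED":
            raise RuntimeError("resource entered FAILED state")
        time.sleep(interval)
    raise TimeoutError("state never reached %s" % wanted)

if __name__ == "__main__":
    import oci

    cfg = oci.config.from_file()
    client = oci.generative_ai_agent.GenerativeAiAgentClient(cfg)
    kb_id = "ocid1.genaiagentknowledgebase.oc1..YOUR_KB_OCID"  # placeholder
    wait_for_state(
        lambda: client.get_knowledge_base(kb_id).data.lifecycle_state,
        interval=30,
    )
    print("knowledge base is ACTIVE")
```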

D. create the Agent and endpoint

Now create the agent and attach the knowledge base as a tool:

  • Go to Analytics and AI → AI Services → Generative AI Agents.
  • Select Agents → Create Agent.
  • Give it a name, select your compartment, and choose a model (Meta Llama 3, OpenAI OSS, xAI Grok, or Cohere Command R are good defaults for RAG workloads).
  • Under Tools, select Add tool → RAG and attach the Knowledge Base you just created.
  • Complete the wizard and click Create.

Once the agent is Active, create an endpoint:

  • Open the agent and select Endpoints → Create Endpoint.
  • Give it a name and click Create.

Once completed, make note of your GenAI endpoint OCID. It will look something like:

ocid1.genaiagentendpoint.oc1.us-chicago-1.amaaaaaaEXAMPLE

You will need this OCID when configuring terraform.tfvars in a later step.
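With the endpoint OCID in hand, you can already exercise the agent from Python. The session-reuse logic is the part worth sketching; the `OciRuntime` adapter under `__main__` assumes the SDK’s agent-runtime client and model classes (mirroring the CLI call shown later in this post), so treat those names as assumptions.

```python
# One chat turn against the agent endpoint, creating a session on first use.
# ask_agent() is duck-typed so the logic stays testable; OciRuntime assumes
# the SDK's agent-runtime client/models, and the endpoint OCID is a placeholder.
def ask_agent(runtime, endpoint_id, message, session_id=None):
    """Return (answer, session_id); pass session_id back on later turns."""
    if session_id is None:
        session_id = runtime.create_session(endpoint_id)
    return runtime.chat(endpoint_id, message, session_id), session_id

if __name__ == "__main__":
    import oci
    from oci.generative_ai_agent_runtime import GenerativeAiAgentRuntimeClient
    from oci.generative_ai_agent_runtime.models import ChatDetails, CreateSessionDetails

    class OciRuntime:
        def __init__(self):
            self.client = GenerativeAiAgentRuntimeClient(oci.config.from_file())

        def create_session(self, endpoint_id):
            details = CreateSessionDetails(display_name="smoke-test")
            return self.client.create_session(details, endpoint_id).data.id

        def chat(self, endpoint_id, message, session_id):
            details = ChatDetails(
                user_message=message, session_id=session_id, should_stream=False
            )
            return self.client.chat(endpoint_id, details).data.message.content.text

    ep = "ocid1.genaiagentendpoint.oc1.us-chicago-1.YOUR_ENDPOINT_OCID"
    rt = OciRuntime()
    answer, sid = ask_agent(rt, ep, "Who is Jane Smith?")
    print(answer)
    # Follow-up turn reuses the same session for conversational context:
    print(ask_agent(rt, ep, "What are her key skills?", sid)[0])
```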


4. the serverless Python application: step-by-step

The OCI Function ociGenAI-Agents/genaiAgent/func.py is the brain connector — it receives a request from the API Gateway, calls the Generative AI Agent service, and returns the response. Here’s how to build and deploy it from your local machine.

A. set up your OCI tenancy

Before writing any code, make sure your tenancy is configured for OCI Functions.

Step 1: Create groups, users, and a compartment

If you don’t already have a dedicated compartment and user group for Functions:

  • Sign in to the Console as a tenancy administrator.
  • Go to Identity & Security → Identity → Domains → User management.
  • Create a group (e.g., functions-developers) and a user, then add the user to the group.
  • Go to Identity & Security → Identity → Compartments and create a compartment (e.g., dev).
Step 2: Create a VCN and subnets

OCI Functions requires a Virtual Cloud Network (VCN):

  • Go to Networking → Virtual Cloud Networks.
  • Select Start VCN Wizard and choose Create VCN with Internet Connectivity.
  • Give it a name and complete the wizard. This creates the VCN, public subnet, internet gateway, and route table automatically.
Step 3: Create an IAM policy

Your Functions user group needs explicit permissions to manage and invoke functions. Go to Identity & Security → Policies, create a policy at the root compartment level, and add the following statements (replace <group-name> and <compartment-name> with your values):

Allow group <group-name> to manage functions-family in compartment <compartment-name>
Allow group <group-name> to use virtual-network-family in compartment <compartment-name>
Allow group <group-name> to manage repos in tenancy
Allow group <group-name> to manage logging-family in compartment <compartment-name>
Allow group <group-name> to read metrics in compartment <compartment-name>
Allow service faas to use apm-domains in tenancy
Allow service faas to read repos in tenancy where request.operation='ListContainerImageSignatures'

You also need a policy to let the function call the Generative AI Agent service. This is done via a Dynamic Group:

  • Create a Dynamic Group (e.g., genai-func-dynamic-group) with the rule:
    ALL {resource.type = 'fnfunc', resource.compartment.id = '<compartment-ocid>'}
    
  • Then add a policy:
    Allow dynamic-group genai-func-dynamic-group to manage genai-agent-family in compartment <compartment-name>
    

B. create a Functions application in the console

  • Go to Developer Services → Functions → Applications.
  • Select Create application, name it genai-demo-app, and select the VCN and public subnet you created above.

C. set up your local development environment

Step 1: Install and verify Docker

OCI Functions packages code as Docker images. Confirm Docker is installed and running:

docker version
docker run hello-world

If Docker is not installed, follow the Docker installation docs for your platform.

Step 2: Set up your OCI API signing key
  • In the OCI Console, go to your User Settings → Tokens and keys → API keys.
  • Select Add API key → Generate API key pair.
  • Download the private key (.pem file) to ~/.oci/.
  • Copy the generated configuration snippet into ~/.oci/config.
  • Update the key_file path in the config to point to your downloaded .pem file.
  • Lock down file permissions:
chmod go-rwx ~/.oci/oci_api_key.pem
Step 3: Install the Fn Project CLI

The Fn CLI is the local toolchain for building and deploying OCI Functions:

# Linux or macOS
curl -LSs https://raw.githubusercontent.com/fnproject/cli/master/install | sh

# macOS with Homebrew
brew update && brew install fn

# Verify the installation
fn version
Step 4: Configure the Fn CLI context for OCI
# Create a new context using the Oracle provider
fn create context genai-demo --provider oracle
fn use context genai-demo

# Point it at your compartment
fn update context oracle.compartment-id <your-compartment-ocid>

# Set the OCI Functions API endpoint for your region
fn update context api-url https://functions.us-chicago-1.oci.oraclecloud.com

# Configure OCIR (OCI Container Registry) — format: <region-key>.ocir.io/<tenancy-namespace>/<repo-name>
fn update context registry ord.ocir.io/<your-tenancy-namespace>/genaifunction

# Optional: set the image compartment
fn update context oracle.image-compartment-id <your-compartment-ocid>
Step 5: Generate an Auth Token and log in to OCIR

OCI Functions pushes Docker images to OCIR. You need an auth token to authenticate:

  • In the Console, go to User Settings → Tokens and keys → Auth Tokens → Generate token.
  • Copy the token immediately (you will not see it again).

Log in to the registry:

docker login -u '<tenancy-namespace>/<your-username>' ord.ocir.io
# When prompted for a password, paste the auth token you just copied

D. deploy the Python Function

The repository includes a ready-to-deploy Python function under genaiAgent/. This function wraps the OCI Generative AI Agent SDK and handles session management.

Key dependencies (requirements.txt):

oci>=2.112.0
fdk>=0.1.105

A notable design detail: func.py uses resource principals for authentication when running inside OCI, with an automatic fallback to the local ~/.oci/config file for development. This means the same code works both locally and in production without any code changes — only the IAM policy distinguishes environments.
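That fallback pattern can be sketched as follows. The environment-variable check reflects how resource principals are advertised inside OCI Functions; the client construction is illustrative.

```python
# The dual-auth pattern from func.py, sketched. Inside OCI Functions the
# runtime sets OCI_RESOURCE_PRINCIPAL_VERSION; locally we fall back to
# ~/.oci/config. The client wiring under make_runtime_client() is illustrative.
import os

def auth_mode(env=None):
    """Pick the auth path: resource principal in OCI, config file locally."""
    env = os.environ if env is None else env
    return "resource_principal" if "OCI_RESOURCE_PRINCIPAL_VERSION" in env else "config_file"

def make_runtime_client():
    import oci
    from oci.generative_ai_agent_runtime import GenerativeAiAgentRuntimeClient

    if auth_mode() == "resource_principal":
        # Running inside OCI: the dynamic-group policy grants access
        signer = oci.auth.signers.get_resource_principals_signer()
        return GenerativeAiAgentRuntimeClient(config={}, signer=signer)
    # Local development: fall back to the API-key config
    return GenerativeAiAgentRuntimeClient(oci.config.from_file())
```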

Deploy the function:

cd genaiAgent

# Build the Docker image and push to OCIR, then register it with OCI Functions
fn -v deploy --app genai-demo-app

E. testing the Function with fn invoke

Once deployed, test the function end-to-end before wiring up Terraform. The function accepts a JSON body with a userMessage field and an optional sessionId for stateful multi-turn conversations.

Single-turn invocation:

echo '{"userMessage": "What is OCI Generative AI?"}' | fn invoke genai-demo-app genaiagent

Multi-turn stateful conversation:

The function returns a sessionId in its response. Pass that back on subsequent calls to maintain conversation context across turns:

# First turn — no sessionId, agent creates a new session
echo '{"userMessage": "Who is Matt Ferguson?"}' | fn invoke genai-demo-app genaiagent

# Second turn — include the sessionId from the first response
echo '{"userMessage": "What are his key skills?", "sessionId": "<SESSION_ID_FROM_FIRST_RESPONSE>"}' | fn invoke genai-demo-app genaiagent

A successful response will look like:

{
  "response": "Matt Ferguson is a cloud architect and technology leader...",
  "sessionId": "ocid1.genaiagentsession.oc1.us-chicago-1...."
}

You should see a JSON response from your Generative AI Agent. Once this works, your function OCID is ready to feed into Terraform.
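The function’s JSON contract can be captured in two small helpers that an FDK handler would sit on top of. A sketch: the helper names are illustrative, not the repo’s actual code.

```python
# The request/response contract of the function, as two testable helpers.
# An FDK handler (from the `fdk` package) would call parse_request() on the
# incoming body and return build_response() as its JSON payload.
import json

def parse_request(raw):
    """Extract userMessage (required) and sessionId (optional) from the body."""
    body = json.loads(raw or b"{}")
    message = body.get("userMessage")
    if not message:
        raise ValueError("userMessage is required")
    return message, body.get("sessionId")

def build_response(answer, session_id):
    """Shape the JSON that fn invoke and the chat UI expect."""
    return json.dumps({"response": answer, "sessionId": session_id})
```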

F. testing directly with the OCI CLI: myAgent.py

The repository also includes myAgent.py — a lightweight bash script (despite the .py extension) that bypasses the Function and API Gateway entirely, hitting the OCI Generative AI Agent Runtime endpoint directly using the OCI CLI. This is useful for validating your agent configuration in isolation before deploying the full stack.

#!/bin/bash
ENDPOINT="https://agent-runtime.generativeai.us-chicago-1.oci.oraclecloud.com"

oci --profile "GENAI" \
  --endpoint "$ENDPOINT" \
  -- generative-ai-agent-runtime agent-endpoint chat \
  --user-message "Who is Jane Smith" \
  --should-stream false \
  --session-id "ocid1.genaiagentsession.oc1.us-chicago-1.<YOUR_SESSION_OCID>" \
  --agent-endpoint-id "ocid1.genaiagentendpoint.oc1.us-chicago-1.<YOUR_ENDPOINT_OCID>"

To use it, update the two OCID placeholders with your actual values: the agent endpoint OCID from Step 3, and a session OCID (for example, the sessionId returned by an earlier function call). Then run:

chmod +x myAgent.py
./myAgent.py

If the agent responds correctly here, you know the problem isn’t the agent itself — it’s isolated to the Function or API Gateway layer. This makes it the right debugging tool when the full stack isn’t behaving as expected.


5. deploying the infrastructure: Terraform Plan, Init, and Apply

With your OCI Function deployed, it’s time to stand up the rest of the stack — Load Balancer, NGINX compute instance, API Gateway, networking, and IAM — all defined in Terraform.

OCI Generative AI Agent Architecture: RAG Use Case

What the Terraform code creates

The main.tf file orchestrates six modules that together form the complete ingress stack:

  • networking: VCN, public subnet, internet gateway, route table, and NSGs for the LB and NGINX
  • iam: dynamic group and IAM policies so the Function can call the Generative AI Agent
  • functions: OCI Functions application wiring (links to the function image you deployed above)
  • api_gateway: API Gateway plus a deployment with a route that proxies /ask calls to the Function
  • compute_nginx: a compute instance running NGINX as a reverse proxy and static front-end host
  • load_balancer: a flexible-shape Load Balancer fronting the NGINX instance

When applied, traffic flows: User → Load Balancer → NGINX → API Gateway → OCI Function → Generative AI Agent.

Step 1: configure terraform.tfvars

Copy the example file and fill in your values:

cd terraform
cp terraform.tfvars.example terraform.tfvars

Open terraform.tfvars and update every placeholder:

# OCI Identity
compartment_ocid     = "ocid1.compartment.oc1..YOUR_COMPARTMENT_OCID"
tenancy_ocid         = "ocid1.tenancy.oc1..YOUR_TENANCY_OCID"
user_ocid            = "ocid1.user.oc1..YOUR_USER_OCID"
api_fingerprint      = "aa:bb:cc:dd:ee:ff:11:22:33:44:55:66:77:88:99:00"
api_private_key_path = "~/.oci/oci_api_key.pem"

# Region & Compute
region              = "us-chicago-1"
availability_domain = "Uocm:CHICAGO-1-AD-1"
instance_image_ocid = "ocid1.image.oc1.us-chicago-1.YOUR_IMAGE_OCID"
ssh_authorized_keys = "ssh-ed25519 AAAA... you@yourmachine"

# Agent & Function
agent_endpoint_ocid  = "ocid1.genaiagentendpoint.oc1.us-chicago-1.YOUR_ENDPOINT_OCID"
agent_runtime_region = "us-chicago-1"
function_image       = "ord.ocir.io/<your-tenancy-namespace>/genaifunction/genaiagent:latest"
ocir_repo_name       = "genaifunction"

label_prefix = "genai-demo"

Where to find these values:

  • compartment_ocid: OCI Console → Identity & Security → Compartments
  • tenancy_ocid: OCI Console → Profile menu → Tenancy
  • user_ocid: OCI Console → Profile menu → User settings
  • api_fingerprint: OCI Console → User settings → API keys
  • availability_domain: oci iam availability-domain list --compartment-id <compartment-ocid>
  • instance_image_ocid: OCI Console → Compute → Images (choose Oracle Linux 8 or Ubuntu 22.04)
  • agent_endpoint_ocid: The OCID from Step 3 above
  • function_image: The OCIR path from your fn deploy step

Step 2: initialize Terraform

terraform init downloads all required providers (the OCI provider) and initializes the module tree:

terraform init

Expected output:

Initializing modules...
- networking in ./modules/networking
- iam in ./modules/iam
- functions in ./modules/functions
- api_gateway in ./modules/api_gateway
- compute_nginx in ./modules/compute_nginx
- load_balancer in ./modules/load_balancer

Initializing the backend...
Initializing provider plugins...
- Finding hashicorp/oci versions matching "~> 6.0"...
- Installing hashicorp/oci v6.x.x...

Terraform has been successfully initialized!

Step 3: review the Plan

Before creating anything, review what Terraform intends to do:

terraform plan

You should see roughly 30–40 resources planned for creation, including:

  • 1 VCN + 1 public subnet + internet gateway + route tables + NSGs
  • 1 API Gateway + 1 deployment + 1 route
  • 1 OCI Functions application
  • 1 Compute instance (NGINX)
  • 1 Load Balancer + backend set + listener
  • IAM dynamic group + policies

Review the output carefully. If you see any Error: lines at this stage, they are almost always misconfigured values in terraform.tfvars (wrong OCID format, missing region, etc.).

Step 4: Apply

When the plan looks correct, apply it:

terraform apply

Type yes when prompted. Provisioning typically takes 5–10 minutes. The Load Balancer and compute instance take the longest.

Expected outputs

Once apply completes, Terraform prints the following outputs:

Outputs:

api_gateway_invoke_url    = "https://abcdef1234.apigateway.us-chicago-1.oci.customer-oci.com/v1"
compute_public_ip         = "129.xx.xx.xx"
demo_base_url             = "http://xxx.xxx.xxx.xxx"
function_id               = "ocid1.fnfunc.oc1.us-chicago-1..."
functions_application_id  = "ocid1.fnapp.oc1.us-chicago-1..."
load_balancer_public_ip_addresses = ["xxx.xxx.xxx.xxx"]

Open the demo_base_url in your browser to verify the end-to-end stack is working.
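Beyond the browser check, the stack can be smoke-tested with only the Python standard library. The /ask route and JSON shape follow the function contract from section 4; the base URL placeholder stands in for the Terraform demo_base_url output.

```python
# Smoke-test the deployed stack over HTTP using only the standard library.
# build_ask() constructs the same POST /ask request the chat UI sends.
import json
import urllib.request

def build_ask(base_url, message, session_id=None):
    """Build the POST /ask request; include sessionId on follow-up turns."""
    payload = {"userMessage": message}
    if session_id:
        payload["sessionId"] = session_id
    return urllib.request.Request(
        base_url.rstrip("/") + "/ask",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    # Placeholder: substitute the demo_base_url Terraform output
    req = build_ask("http://xxx.xxx.xxx.xxx", "What is OCI Generative AI?")
    with urllib.request.urlopen(req, timeout=60) as resp:
        print(json.load(resp))
```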

Troubleshooting

Error: 404 NotAuthorizedOrNotFound during apply
The Terraform service account or user doesn’t have permission to create one of the resources. Double-check the IAM policies for your compartment and make sure Allow service faas to read repos in tenancy is in place.

Load Balancer stuck in CREATING state
This usually resolves itself within a few minutes. If it persists, check the subnet’s security list rules — port 80 must be open for ingress.

Function invocation returns 502 Bad Gateway from API Gateway
The function deployed but isn’t responding correctly. Check the function logs:

fn logs get genai-demo-app genaiagent

Common cause: agent_endpoint_ocid in terraform.tfvars is incorrect or the dynamic group policy is missing.

terraform plan shows no changes but the agent still isn’t working
If you deployed the function manually with fn deploy before running Terraform, and you’re using manual_function_id in tfvars, confirm that OCID is correct. Terraform won’t re-deploy functions it doesn’t manage.

NGINX returning a blank page
SSH into the compute instance (compute_public_ip) and check the NGINX configuration:

sudo systemctl status nginx
sudo cat /etc/nginx/conf.d/app.conf

The api_gateway_url should be embedded correctly in the NGINX config by the Terraform compute_nginx module.


6. the ingress stack: LB → NGINX → API gateway

To make this agentic, it must be accessible. We wrap the agent in a standard enterprise ingress pattern:

  1. OCI Load Balancer: Entry point for all traffic. The flexible-shape load balancer distributes requests to the NGINX backend set over port 80.
  2. NGINX (Web Layer): Handles two jobs — serving the static TypeScript chat UI from public/ and acting as a reverse proxy that forwards /ask POST requests upstream to the API Gateway. Terraform injects the API Gateway invoke URL directly into the NGINX config via cloud-init at boot, so there’s no manual configuration step.
  3. OCI API Gateway: The guardrail layer — validating requests, enforcing rate limits, emitting access and execution logs at INFO level, and routing POST /v1/ask to the OCI Function backend. This is where you would add JWT authentication or OAuth policies for production workloads.

Traffic flows in one direction: Browser → Load Balancer → NGINX → API Gateway → OCI Function → Generative AI Agent Runtime. Each hop adds a layer of control without coupling them together — you can swap the front-end or tighten the API Gateway policy without touching the function code.

The chat UI

The TypeScript source in src/app_client.ts compiles down to a single self-contained JavaScript file served by NGINX. The UI handles:

  • Markdown rendering — bold, italics, ordered/unordered lists, headings, and hyperlinks in agent responses
  • Session tracking — stores the sessionId returned by the function and passes it on every subsequent request, giving you multi-turn conversation state in the browser
  • Citation display — renders source annotations returned by the RAG pipeline alongside responses
  • Stop button — uses the browser’s AbortController API to cancel in-flight requests

The three pre-built prompt chips wired into the UI demonstrate the agent’s capabilities on load, making it easy for someone landing on the demo to see the agent in action without typing anything.

Tracing the agentic workflow

In this architecture, the flow changes from a simple query to a multi-step execution. With the RAG knowledge base attached, here’s what happens when a user submits a question:

The browser sends POST /ask with the user’s message and session ID. NGINX proxies it to API Gateway, which forwards it to the OCI Function. func.py checks for an existing session — if there isn’t one, it calls the GenAI Agent Runtime to create one and stores the returned sessionId. The agent then queries the OpenSearch vector store, retrieves the most relevant chunks from the knowledge base, and passes them to the LLM (Llama 3 or Cohere Command R) to ground the response. The function returns the answer and sessionId as JSON, and the chat UI renders the markdown with any citations inline.

Because the session ID is preserved across turns, follow-up questions like “What certifications does she hold?” carry full conversation context — the agent knows what was just asked and can answer coherently without re-explaining the premise.

Why this matters: the middle layer

By building this on OCI, we are using a high-performance substrate that connects the agent directly to the data via a high-speed, multiplanar network. This isn’t just a chatbot; it’s a piece of infrastructure that understands your business and acts with low-latency access to your primary data sources.

What’s next?

We’ve moved from “Packets” to “Agentic Infrastructure.” The next step is scaling this across multiple departments using agent-to-agent (A2A) collaboration. Imagine a Supply Chain Agent talking to a Finance Agent to optimize logistics in real time.

If you’re ready to build the future of enterprise automation, the tools are ready on OCI.

Follow me on LinkedIn for more updates as we continue to explore the Internet of Agents.