Why Your Infrastructure AI Sucks (And How to Fix It)

Here’s the harsh reality: your AI agent is completely useless for infrastructure management, and you probably don’t even realize it yet.

You’ve probably tried throwing ChatGPT or Claude at your DevOps problems, thinking AI will magically solve your infrastructure challenges. Maybe you got some generic responses that looked helpful on the surface. But when you actually tried to implement those suggestions, you discovered the painful truth - the AI has no clue about your environment, your standards, or your constraints.

Most organizations are making the same critical mistake: they’re treating AI like a search engine instead of building it into their platform properly. They ask vague questions, get generic answers, and wonder why their “AI transformation” isn’t working.

But here’s what changes everything: when you build AI into your Internal Developer Platform the right way, with proper context management, organizational learning, and intelligent workflows, you get something completely different. You get an AI system that actually understands your infrastructure, follows your patterns, enforces your policies, and delivers solutions that work in your specific environment.

In this video, I’m going to show you the five fundamental problems that make most AI implementations worthless, and then walk you through building the essential components that solve every single one of them. By the end, you’ll have a complete blueprint for an AI-powered IDP that actually works.

Setup

This demo uses Claude Code as the coding agent. With a few modifications, it should work with other coding agents like Cursor, GitHub Copilot, and so on.

The project we’ll explore currently supports only Anthropic Sonnet models. Please open an issue in the dot-ai repo if you’d like support for other models, have a feature request, or want to report a bug.

Install Node.js if you don’t have it already.

npm install -g @anthropic-ai/claude-code

git clone https://github.com/vfarcic/dot-ai-demo

cd dot-ai-demo

git pull

git fetch

git switch components

Make sure that Docker is up-and-running. We’ll use it to create a KinD cluster.
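
If you want to verify that Docker is indeed up before moving on, a quick check like the one below will fail if the daemon is not running:

docker info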

Watch Nix for Everyone: Unleash Devbox for Simplified Development if you are not familiar with Devbox. Alternatively, you can skip Devbox and install all the tools listed in devbox.json yourself.

devbox shell

./dot.nu setup --dot-ai-tag 0.79.0 \
    --qdrant-run false --qdrant-tag 0.5.0

source .env

claude

Why Your AI Agent Is Useless

Today we’re exploring the essential components you need to build when integrating AI into Internal Developer Platforms. We’re not just talking about throwing ChatGPT at your infrastructure problems and hoping for the best. We’re talking about creating a proper AI-powered system that actually understands your organization, follows your patterns, and enforces your policies.

So let me start with a simple question: what happens when you give an AI agent a vague request without proper context or guidance?

Well, I’ll show you with what seems like an “obvious” example that actually demonstrates everything that’s wrong with how most people approach AI in DevOps.

Watch what happens when I give the AI agent this seemingly straightforward request:

[user]

> Create a PostgreSQL database in AWS. Do NOT use MCP.

[agent]

⏺ I'll help you create a PostgreSQL database in AWS. Let me
  create a todo list to track this task and then provide you
   with the steps.

⏺ Update Todos
  ⎿  ☐ Research AWS PostgreSQL deployment options (RDS vs
       self-managed)
     ☐ Provide step-by-step instructions for creating
       PostgreSQL database in AWS

⏺ Update Todos
  ⎿  ☐ Research AWS PostgreSQL deployment options (RDS vs
       self-managed)
     ☐ Provide step-by-step instructions for creating
       PostgreSQL database in AWS

⏺ For creating a PostgreSQL database in AWS, you have two
  main options:

  Option 1: Amazon RDS (Recommended)

  Amazon RDS for PostgreSQL is a managed service that
  handles maintenance, backups, and scaling.

  Steps:
  1. AWS Console: Go to RDS in AWS Console
  2. Create Database: Click "Create database"
  3. Engine: Select PostgreSQL
  4. Template: Choose production, dev/test, or free tier
  5. Settings:
    - DB instance identifier (e.g., my-postgres-db)
    - Master username (e.g., postgres)
    - Master password
  6. Instance Configuration: Choose instance class
  (db.t3.micro for free tier)
  7. Storage: Set allocated storage (20GB minimum)
  8. Connectivity:
    - VPC and security groups
    - Public access (if needed)
  9. Database Options: Initial database name
  10. Create Database

  Option 2: EC2 Self-Managed

  Install PostgreSQL on an EC2 instance for full control.

  Steps:
  1. Launch EC2 instance (Ubuntu/Amazon Linux)
  2. SSH into instance
  3. Install PostgreSQL:
  sudo apt update
  sudo apt install postgresql postgresql-contrib
  4. Configure PostgreSQL
  5. Set up security groups for port 5432

⏺ Update Todos
  ⎿  ☒ Research AWS PostgreSQL deployment options (RDS vs
       self-managed)
     ☒ Provide step-by-step instructions for creating
       PostgreSQL database in AWS

⏺ Would you like detailed instructions for either approach,
  or do you need help with AWS CLI commands instead of the
  console?

Look at this response. On the surface, it seems helpful, right? The AI gave me two options for creating a PostgreSQL database in AWS.

But here’s where it gets interesting. This seemingly helpful response actually demonstrates five critical flaws that make most AI implementations completely useless in production environments. Let me break down exactly what went wrong here.

Notice what the AI didn’t do. It didn’t ask me for more information about my specific requirements, my organization’s standards, or my environment constraints. That’s the first issue: there’s no workflow guiding this interaction toward the right solution.

Here’s the second major issue: the AI has no idea which services are actually available in my environment. Sure, it mentioned RDS and EC2, but what if I’m running everything on Kubernetes? What if I have custom operators or Crossplane providers already deployed? The AI should be using the Kubernetes API to discover what’s available.

But here’s the catch: you can’t do semantic search against the Kubernetes API directly. If you want an AI to intelligently find the right resources for a given intent, you need to convert those Kubernetes API definitions and CRDs into embeddings and store them in a Vector database. I call these capabilities.
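
To make that concrete, here’s a rough sketch of what the conversion can look like, assuming a local Qdrant instance on port 6333, the jq CLI, and OpenAI’s embeddings endpoint purely as an illustration. The toolkit we’ll use later does all of this for you, so none of these commands are needed to follow along.

# Create a collection sized for the embedding model (illustrative dimensions).
curl -s -X PUT http://localhost:6333/collections/capabilities \
    -H "Content-Type: application/json" \
    -d '{"vectors": {"size": 1536, "distance": "Cosine"}}'

# Describe one resource definition in plain text.
TEXT="sqls.devopstoolkit.live: multi-cloud PostgreSQL database service"

# Convert that description into an embedding (any embedding API works).
EMBEDDING=$(curl -s https://api.openai.com/v1/embeddings \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"text-embedding-3-small\", \"input\": \"$TEXT\"}" \
    | jq -c '.data[0].embedding')

# Store the embedding together with the resource metadata.
curl -s -X PUT http://localhost:6333/collections/capabilities/points \
    -H "Content-Type: application/json" \
    -d "{\"points\": [{\"id\": 1, \"vector\": $EMBEDDING, \"payload\": {\"resource\": \"sqls.devopstoolkit.live\"}}]}"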

Please watch Stop Blaming AI: Vector DBs + RAG = Game Changer if you’d like more info about Vector DBs.

The third major issue: this AI doesn’t know anything about the patterns my organization uses. It has no clue about our naming conventions, our preferred architectures, our deployment strategies, or any of the tribal knowledge that lives in our documentation, wiki pages, code repositories, and Slack conversations.

If you want AI to be truly useful in your organization, you need to capture these patterns from wherever they live, convert them into embeddings, and store them in a Vector database. I call these patterns.

The fourth issue: the AI is completely oblivious to my company’s policies. It doesn’t know that we require all databases to run in specific regions for compliance, that we mandate resource limits on all containers, or that we prohibit the use of latest tags in production.

You need to handle policies the same way as patterns: capture them, convert them to embeddings, and store them in a Vector database. But here’s the key difference: you also want to convert these policies into enforceable rules using Kubernetes Validating Admission Policies, Kyverno, or similar tools as your last line of defense. I call these policies.

And here’s the fifth issue: context quickly becomes garbage, especially when you’re working with massive amounts of data. If you keep accumulating everything in the conversation context, you’ll end up with a bloated mess that the AI can’t effectively process. You need to manage context properly by keeping only the things that actually matter for the specific task at hand.

So what do we actually need to solve these problems? Three fundamental components: proper context management, workflows, and learning.

Let me be clear about what “proper” context management means. It’s not the constant accumulation of everything that’s ever been said or done. That’s a recipe for disaster. Instead, it means starting with a fresh context for every interaction, but populating that context with all the relevant data that specific task needs and nothing more.

Workflows should guide both people and AI towards the right solution instead of relying on incomplete user intents and the AI’s unpredictable decision-making process. You can’t just throw a vague request at an AI and expect it to magically understand what you really need.

Learning is how you teach AI about your organizational data: patterns, policies, best practices, and everything else that makes your environment unique. But here’s the catch: AI models can’t actually learn in the traditional sense. Everything you teach it gets lost when the conversation ends. AI is like a person with severe short-term memory loss, or like a goldfish that forgets everything after a few seconds.

So teaching it everything upfront is a complete waste of time. Instead, you should teach it only the parts that are relevant to specific situations, based on the user’s intent, the current workflow step, and other contextual factors. Think of it as temporary, just-in-time learning.

What we’re covering today is really the culmination of subjects we’ve explored in previous videos. This is where we’re putting quite a few hard-learned lessons together into a cohesive system.

Just to be clear about scope: this video is focused on creation and initial setup. We’ll cover updates and observability in a different video.

We’ll explore these concepts using the DevOps AI Toolkit. Now, I’m not trying to sell you on this specific project. Think of it as a reference implementation, a set of components that demonstrates what you might need to build in your own Internal Developer Platform.

All in all, we’ll explore three types of learning that are crucial for IDPs: capabilities, patterns, and policies. We’ll also dive into context management and workflows. When you combine all these components properly, you get a complete, AI-powered Internal Developer Platform that actually works.

So let’s start with the first piece of the puzzle.

Kubernetes API Discovery That Actually Works

So what exactly are capabilities? Let me explain this concept because it’s absolutely fundamental to building AI that actually understands your infrastructure.

If you notice that the dot-ai MCP is not being used by the agent, add “with dot-ai MCP” to your prompts.

Here’s the thing: the capabilities we need are already there. The Kubernetes API acts as a single, unified control plane that can manage resources not just inside the cluster itself, but also external resources in AWS, Google Cloud, Azure, and pretty much anywhere else you can think of.

This is crucial for two reasons. First, it gives AI a single API to work with instead of having to learn dozens of different cloud provider APIs, tools, and interfaces. Instead of the AI needing to understand AWS CLI, Azure CLI, Google Cloud CLI, Terraform, Pulumi, and who knows what else, it just needs to understand one thing: the Kubernetes API.

Second, and this is equally important: by controlling which API endpoints and resource types are available in your Kubernetes cluster, you’re defining the scope of what can and should be done. You’re not giving AI access to everything under the sun. You’re curating a specific set of capabilities that align with your organization’s standards and policies.
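
To make this a bit more tangible, here’s roughly what managing an AWS database through that single API looks like, using one of the Crossplane managed resources we’ll encounter later in this video. Treat it as a sketch: the exact fields depend on the provider version, a real instance needs more configuration than shown here, and nothing is created thanks to the client-side dry run.

kubectl apply --dry-run=client -f - <<EOF
apiVersion: rds.aws.m.upbound.io/v1beta1
kind: Instance
metadata:
  name: my-db
  namespace: a-team
spec:
  forProvider:
    region: us-east-1        # the same field our policies will care about later
    engine: postgres
    instanceClass: db.t3.micro
EOF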

But here’s where we hit a problem: the AI agent can’t figure out which resource definitions might match a user’s intent. What’s it supposed to do, go through every single resource definition in your cluster every time someone asks for something? That would be insane. There are potentially hundreds or thousands of custom resources, and there’s no semantic search capability in the Kubernetes API.

So here’s the solution: if we convert the relevant information from the Kubernetes API into embeddings and store them in a Vector database, then the AI can perform semantic search and actually find what it’s looking for. Instead of blindly iterating through every resource definition, it can intelligently search for resources that match the intent.
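
Continuing the earlier sketch, a semantic lookup boils down to “embed the intent, then ask the Vector database for its nearest neighbors”. Again, this is only an illustration of the mechanics with Qdrant and an arbitrary embedding endpoint; the toolkit performs the equivalent steps for us behind the scenes.

# Embed the user's intent with the same model used for the capabilities.
QUERY=$(curl -s https://api.openai.com/v1/embeddings \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model": "text-embedding-3-small", "input": "PostgreSQL database in AWS"}' \
    | jq -c '.data[0].embedding')

# Ask Qdrant for the capabilities closest to that intent.
curl -s -X POST http://localhost:6333/collections/capabilities/points/search \
    -H "Content-Type: application/json" \
    -d "{\"vector\": $QUERY, \"limit\": 5, \"with_payload\": true}"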

We’ll dive deeper into the semantic search mechanics later. For now, let’s take a look at some of the data that’s already in the database and see how we can create embeddings and push data into the system.

Let me start by listing the capabilities that are already available in our system:

[user]

> List capabilities

[agent]

⏺ I'll list the available capabilities using the dot-ai
  system.

⏺ dot-ai - manageOrgData (MCP)(dataType: "capabilities", operation:
                              "list")
  ⎿  {
       "success": true,
       "operation": "list",
     … +199 lines (ctrl+r to expand)

⏺ Here are the available capabilities (showing 10 of 344 total):

  ID: 014e4600-f3c1-c91d-b0b0-04242927cbfd
  - Resource: clusterusages.protection.crossplane.io
  - Capabilities: resource protection, deletion blocking, dependency
   management, resource lifecycle, crossplane
  - Description: A Crossplane resource that defines deletion
  blocking relationships between resources to prevent acci...

  ID: 0188da80-6f2e-ff3f-273e-004bc50c9eed
  - Resource: subscriptions.azure.m.upbound.io
  - Capabilities: azure subscription, subscription management,
  billing, azure account, tenant management
  - Description: Manages Azure subscriptions with billing scope
  configuration and workload type classification...

  ID: 03a6109b-a52e-19f3-13ab-633f36b9c1d4
  - Resource: providerconfigs.aws.m.upbound.io
  - Capabilities: aws provider, credential management,
  authentication, iam roles, web identity, assume role, crossplane
  provider, aws configuration
  - Description: Crossplane AWS provider configuration that manages
  authentication and credentials for AWS resources...

  ID: 03f61362-8904-f64c-6110-e0acf759278e
  - Resource: pullrequests.repo.github.upbound.io
  - Capabilities: pull request, github, git workflow, code review,
  version control
  - Description: Manages GitHub pull requests with automated branch
  and repository references through Crossplane...

  ID: 04dc1a16-dd38-061e-b7c0-383c168bd6be
  - Resource: networks.network.upcloud.com
  - Capabilities: private networking, sdn, vpc, subnet, dhcp,
  routing, ip networking
  - Description: UpCloud SDN private network resource that provides
  isolated networking with DHCP, routing, and subnet...

  ID: 054993e1-e0fd-96ad-6b66-077f7c1468e3
  - Resource: ReplicaSet
  - Capabilities: pod replication, scaling, high availability, pod
  management, container orchestration
  - Description: Ensures a specified number of identical pod
  replicas are running at any given time with automatic re...

  ID: 06eb2cc6-bd3f-6b04-7dc8-7e00a1a8289b
  - Resource: teammemberships.team.github.upbound.io
  - Capabilities: github team membership, team management, user
  management, access control, github integration
  - Description: Manages GitHub team membership by adding users to
  teams with specified roles...

  ID: 06f2c5b3-3c62-8e40-ad2b-8f0b9c114cdf
  - Resource: defaultroutetables.ec2.aws.m.upbound.io
  - Capabilities: route table, vpc routing, network routing, aws
  ec2, default route table management
  - Description: Manages the default route table of an AWS VPC,
  allowing configuration of routes, gateway association...

  ID: 0710fc19-88b2-f489-583f-5a6a3441ab7d
  - Resource: runnergroups.actions.github.upbound.io
  - Capabilities: github actions, runner groups, ci/cd, workflow
  management, repository access control
  - Description: Manages GitHub Actions Runner Groups within
  organizations, controlling which repositories and workfl...

  ID: 0a40069f-b501-5786-6cf1-12393bed31c3
  - Resource:
  managedobjectstorageuseraccesskeys.objectstorage.upcloud.com
  - Capabilities: object storage, access keys, user access
  management, s3 compatible, credentials management
  - Description: Manages access keys for UpCloud Managed Object
  Storage users, providing S3-compatible API credential...

As you can see, we’ve got 344 capabilities stored in the database. Each one represents a Kubernetes resource type with its associated metadata: what it can do, which providers it works with, and a description of its functionality. This is exactly the kind of information an AI needs to match user intents with the right infrastructure components.

AI is non-deterministic. Outputs might differ, so if you’re following along, you might need to adapt to what you see on your screen.

Let me show you a specific example by looking at a database-related capability:

[user]

> Show sqls.devopstoolkit.live capability

[agent]

⏺ I'll retrieve the details for the sqls.devopstoolkit.live
  capability.

  dot-ai - manageOrgData (MCP)(dataType: "capabilities", operation:
                              "search", id:
                              "sqls.devopstoolkit.live")
  ⎿  {
       "success": true,
       "operation": "search",
     … +672 lines (ctrl+r to expand)

⏺ Search Results for "sqls.devopstoolkit.live"

  Found 25 matching capabilities. Here's the top match:

  Rank 1 - Score: 0.35
  ID: 17c5105f-e215-ae9b-b3bf-58b608abb3b7
  - Resource: sqls.devopstoolkit.live
  - Capabilities: postgresql, database, schema management,
  multi-cloud database, crossplane composite, secret management,
  database migration
  - Providers: aws, azure, gcp, google cloud, upcloud
  - Complexity: medium
  - Description: Multi-cloud PostgreSQL database service with
  cross-cloud portability, schema management, and integrated secret
  handling across AWS, Azure, GCP, and UpCloud.
  - Use Case: PostgreSQL database deployment with schema management
  and multi-cloud flexibility
  - Confidence: 0.9
  - Analyzed: 2025-08-30T15:58:15.555Z

  This is a Crossplane composite resource that provides a unified
  interface for deploying PostgreSQL databases across multiple cloud
   providers with built-in schema management and secret handling
  capabilities.

Perfect! This shows you exactly what the AI gets when it searches for database-related capabilities. It finds not just the resource name, but also the semantic tags, supported providers, complexity level, and a detailed description. This is what enables the AI to make intelligent recommendations instead of just throwing random Kubernetes resources at you.

Now let me show you how we can add new capabilities to the system:

[user]

> Scan capabilities

[agent]

⏺ I'll scan for new capabilities in the cluster.

  dot-ai - manageOrgData (MCP)(dataType: "capabilities", operation:
                              "scan")
  ⎿  {
       "success": true,
       "operation": "scan",
     … +52 lines (ctrl+r to expand)

⏺ Would you like to scan all cluster resources or specify a subset?

  1. all - Scan all available cluster resources
  2. specific - Specify particular resource types to scan

  Please choose option 1 or 2.

This scanning process is what discovers all the Custom Resource Definitions in your cluster, analyzes their schemas, converts them into embeddings, and stores them in the Vector database. It’s the foundation that makes intelligent capability discovery possible.
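
If you’re wondering what raw material that scan works from, it’s the resource definitions and schemas the cluster already exposes. Commands like these show the same data the scanner analyzes before turning it into embeddings; they’re here only for illustration and aren’t part of the demo flow.

# List every resource type the cluster knows about, CRDs included.
kubectl api-resources

# Inspect the full schema of one of them (the fully qualified name avoids ambiguity).
kubectl explain sqls.devopstoolkit.live --recursive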

It might take a while to scan all Kubernetes CRDs, create embeddings, and store them in the Vector DB. The demo comes with a database that already has the capabilities baked in, so we can skip this step. Press ctrl+c to cancel the rest of the process if you’re using Claude Code. If you’re using a different agent, cancel the MCP execution in whatever way that agent supports.

Please watch Why Kubernetes Discovery Sucks for AI (And How Vector DBs Fix It) if you’d like more details about scanning Kubernetes resources and converting them to embeddings.

So that’s capabilities: teaching AI what infrastructure resources are available. But knowing what’s available is only part of the equation. The AI also needs to understand how your organization actually uses those resources.

Organizational Knowledge AI Can Actually Use

Now let’s talk about patterns. Here’s something important to understand: AI is already perfectly capable of using Kubernetes, assembling solutions in AWS, GCP, Azure, and handling other tasks that are public knowledge. We don’t need to teach it those things.

What we need to teach AI are the things that are specific to our company: the patterns that represent how we do things, our standards, our preferred approaches, and the organizational wisdom that isn’t documented anywhere on the public internet.

AI already knows how to assemble resources based on an intent - that’s public knowledge. But patterns teach it how to assemble resources according to your organization’s specific know-how. Maybe your company always pairs databases with specific monitoring setups, or has a standard way of handling ingress with particular security configurations. These organizational assembly patterns are what we need to capture.

So where do these organizational patterns live? They can be scattered across existing code repositories, documentation, Wiki pages, Slack conversations, or anywhere else you store institutional knowledge. But here’s the problem: a lot of these patterns aren’t written down anywhere. They exist in people’s heads, in the collective experience of your team members who know “how we do things around here.”

So what’s the AI supposed to do with all these scattered patterns? Should it go through every single document, every Slack conversation, every code repository every time someone asks for something? That’s not practical or efficient.

The solution is similar to what we did with capabilities. We need to identify the actual patterns first - and let me be clear, not everything is a pattern worth capturing. Then we create embeddings from those patterns and store them in a Vector database. The logic is exactly the same as with capabilities, but the sources of data are different.

Once the patterns are stored in a Vector database, AI agents can perform semantic search to find the patterns that match specific intents. Now instead of randomly guessing how to implement something, the AI can follow your organization’s established patterns and best practices.

We’ll explore how to use these patterns in AI workflows later. For now, let’s focus on how to create and manage them. I’ll show you how to capture patterns from people’s knowledge - the stuff that’s in their heads. The same logic applies when you’re extracting patterns from other sources like documentation or code.

Let me start by showing you what patterns we already have in the system:

[user]

> List patterns

[agent]

⏺ I'll list the organizational patterns in the system.

  dot-ai - manageOrgData (MCP)(dataType: "pattern", operation:
                              "list")
  ⎿  {
       "success": true,
       "operation": "list",
     … +33 lines (ctrl+r to expand)

⏺ Here are the available organizational patterns:

  ID: 9cdb8e4f-afa9-414f-a1c0-196b2bd894c5
  - Description: DevOps Toolkit DB
  - Triggers: 11
  - Resources: 1
  - Created: 2025-08-25T23:39:03.636Z
  - Created By: a-team

  ID: dedcafc2-f488-4295-95ad-46086b5bf048
  - Description: Azure ResourceGroup
  - Triggers: 1
  - Resources: 1
  - Created: 2025-08-15T10:32:40.505Z
  - Created By: a-team

  Total: 2 patterns found (semantic+keyword search available)

As you can see, we currently have just 2 patterns in the system. Each pattern has triggers - these are the keywords that help the AI understand when to apply this pattern. The patterns also track which resources they recommend and who created them.

Now let me show you how to create a new pattern. This is where we capture organizational knowledge and turn it into something the AI can use:

[user]

> Create a pattern

[agent]

⏺ I'll help you create a new organizational pattern. Let me start
  the pattern creation workflow.

  dot-ai - manageOrgData (MCP)(dataType: "pattern", operation:
                              "create")
  ⎿  {
       "success": true,
       "operation": "create",
     … +19 lines (ctrl+r to expand)

⏺ What deployment capability does this pattern provide? I need a
  capability name (2-4 words).

  Examples:
  - Specific: "Horizontal scaling", "Database persistence", "SSL
  termination"
  - Broad/Organizational: "Application networking", "General
  security", "Basic monitoring"

  Both specific and broad patterns are fine. What capability
  describes your pattern?

I’ll specify “AWS public services” as the capability:

[user]

> AWS public services

[agent]

  dot-ai - manageOrgData (MCP)(dataType: "pattern", operation:
                              "create", sessionId:
                              "pattern-1756581858189-909c1cbe",
                              step: "triggers", response: "AWS
                              public services")
  ⎿  {
       "success": true,
       "operation": "create",
     … +25 lines (ctrl+r to expand)

⏺ What types of infrastructure should this apply to? Think about the
   categories of resources and systems this covers, not technical
  implementation details.

  Examples by domain:
  - Applications: web applications, microservices, APIs, frontends,
  backends
  - Databases: databases, SQL databases, NoSQL, data stores,
  persistent storage
  - Networking: load balancers, ingresses, services, network
  policies, DNS
  - Storage: persistent volumes, storage classes, backup systems
  - Security: secrets, certificates, authentication, authorization,
  RBAC
  - Infrastructure: clusters, nodes, compute resources, cloud
  resources

  Please provide infrastructure type keywords separated by commas
  (avoid field names like 'image', 'tag', or specific values like
  'latest').

I’ll provide the same answer to specify the infrastructure types:

[user]

> AWS public services

[agent]

⏺ dot-ai - manageOrgData (MCP)(dataType: "pattern", operation:
                              "create", sessionId:
                              "pattern-1756581858189-909c1cbe",
                              step: "trigger-expansion", response:
                              "AWS public services")
  ⎿  {
       "success": true,
       "operation": "create",
     … +53 lines (ctrl+r to expand)

⏺ Here's the expanded list of infrastructure types for your "AWS
  public services" pattern:

  AWS public services, public cloud services, managed services,
  serverless services, cloud-native services, hosted services,
  platform services, SaaS offerings, managed infrastructure, cloud
  resources, AWS managed services, public cloud resources, external
  services, third-party services, cloud APIs, managed databases,
  serverless functions, container services, analytics services, ML
  services, AI services, compute services, storage services,
  networking services, security services, monitoring services,
  integration services

  Please select which ones you want to keep for your pattern
  triggers. You can:
  - Choose any combination from the list above
  - Add your own custom triggers
  - Keep them all

  What triggers would you like for this pattern?

I’ll keep all the suggested triggers:

[user]

> All

[agent]

⏺ Which Kubernetes resources should be suggested for AWS public
  services? Please list the resource types you want this pattern to
  suggest, separated by commas. For example: Deployment, Service,
  ConfigMap or StatefulSet, PersistentVolumeClaim, Secret.

I’ll specify an internet gateway resource as the pattern recommendation:

[user]

> internetgateways.ec2.aws.m.upbound.io

[agent]

⏺ dot-ai - manageOrgData (MCP)(dataType: "pattern", operation:
                              "create", sessionId:
                              "pattern-1756581858189-909c1cbe",
                              step: "rationale", response: "internet
                              gateways.ec2.aws.m.upbound.io")
  ⎿  {
       "success": true,
       "operation": "create",
     … +14 lines (ctrl+r to expand)

⏺ Why does this combination of resources work well together for AWS
  public services? This helps others understand when and why to use
  this pattern.

I’ll ask the AI to generate the rationale for me:

[user]

> Write it for me

[agent]

⏺ dot-ai - manageOrgData (MCP)(dataType: "pattern", operation:
                              "create", sessionId:
                              "pattern-1756581858189-909c1cbe",
                              step: "created-by", response: "This
                              pattern provides the foundational AWS
                              networking infrastructure needed for
                              public-facing services. Internet
                              gateways enable outbound internet
                              connectivity for resources in public
                              subnets and allow inbound traffic from
                               the internet to reach publicly
                              accessible services. This is essential
                               for web applications, APIs, load
                              balancers, and any AWS services that
                              need to communicate with external
                              systems or be accessible from the
                              internet. The pattern ensures proper
                              network routing and connectivity for
                              cloud-native applications deployed on
                              AWS.")

╭───────────────────────────────────────────────────────────────────────╮
│ Tool use                                                              │
│                                                                       │
│   dot-ai - manageOrgData(dataType: "pattern", operation: "create",    │
│   sessionId: "pattern-1756581858189-909c1cbe", step: "created-by",    │
│   response: "This pattern provides the foundational AWS networking    │
│   infrastructure needed for public-facing services. Internet          │
│   gateways enable outbound internet connectivity for resources in     │
│   public subnets and allow inbound traffic from the internet to       │
│   reach publicly accessible services. This is essential for web       │
│   applications, APIs, load balancers, and any AWS services that       │
│   need to communicate with external systems or be accessible from     │
│   the internet. The pattern ensures proper network routing and        │
│   connectivity for cloud-native applications deployed on AWS.")       │
│   (MCP)                                                               │
│   Unified tool for managing cluster data: organizational patterns,    │
│   policy intents, and resource capabilities. For patterns and         │
│   policies: supports create, list, get, delete, deleteAll, and        │
│   search operations (patterns also support step-by-step creation      │
│   workflow). For capabilities: supports scan, list, get, delete,      │
│   deleteAll, and progress operations for cluster resource             │
│   capability discovery and management. Use dataType parameter to      │
│   specify what to manage: "pattern" for organizational patterns,      │
│   "policy" for policy intents, "capabilities" for resource            │
│   capabilities.                                                       │
│                                                                       │
│ Do you want to proceed?                                               │
│ ❯ 1. Yes                                                              │
│   2. Yes, and don't ask again for dot-ai - manageOrgData commands in  │
│   /Users/viktorfarcic/code/dot-ai-demo                                │
│   3. No, and tell Claude what to do differently (esc)                 │
│                                                                       │
╰───────────────────────────────────────────────────────────────────────╯

[agent]

⏺ What is your name or team identifier? This helps track pattern
  ownership and allows others to contact you with questions.

I’ll specify “a-team” as the creator:

[user]

> a-team

[agent]

⏺ dot-ai - manageOrgData (MCP)(dataType: "pattern", operation:
                              "create", sessionId:
                              "pattern-1756581858189-909c1cbe",
                              step: "review", response: "a-team")
  ⎿  {
       "success": true,
       "operation": "create",
     … +84 lines (ctrl+r to expand)

⏺ Please review your pattern:

  Description: AWS public services
  Triggers: AWS public services, public cloud services, managed
  services, serverless services, cloud-native services, hosted
  services, platform services, SaaS offerings, managed
  infrastructure, cloud resources, AWS managed services, public
  cloud resources, external services, third-party services, cloud
  APIs, managed databases, serverless functions, container services,
   analytics services, ML services, AI services, compute services,
  storage services, networking services, security services,
  monitoring services, integration services
  Suggested Resources: internetgateways.ec2.aws.m.upbound.io
  Rationale: This pattern provides the foundational AWS networking
  infrastructure needed for public-facing services. Internet
  gateways enable outbound internet connectivity for resources in
  public subnets and allow inbound traffic from the internet to
  reach publicly accessible services. This is essential for web
  applications, APIs, load balancers, and any AWS services that need
   to communicate with external systems or be accessible from the
  internet. The pattern ensures proper network routing and
  connectivity for cloud-native applications deployed on AWS.
  Created By: a-team

  Does this look correct? Type 'confirm' to create the pattern, or
  'modify' to make changes.

The pattern looks good, so I’ll confirm it:

[user]

> Confirm

[agent]

  dot-ai - manageOrgData (MCP)(dataType: "pattern", operation:
                              "create", sessionId:
                              "pattern-1756581858189-909c1cbe",
                              step: "complete", response: "confirm")
  ⎿  {
       "success": true,
       "operation": "create",
     … +77 lines (ctrl+r to expand)

⏺ Pattern Created Successfully!

  Pattern ID: 70645609-c3e4-4139-bb56-cfb85e7e0536
  Description: AWS public services
  Triggers: AWS public services, public cloud services, managed
  services, serverless services, cloud-native services, hosted
  services, platform services, SaaS offerings, managed
  infrastructure, cloud resources, AWS managed services, public
  cloud resources, external services, third-party services, cloud
  APIs, managed databases, serverless functions, container services,
   analytics services, ML services, AI services, compute services,
  storage services, networking services, security services,
  monitoring services, integration services
  Resources: internetgateways.ec2.aws.m.upbound.io

  The pattern is now ready to enhance AI recommendations. When users
   ask for deployments matching your triggers, this pattern will
  suggest the specified Kubernetes resources.

And that’s how you capture organizational patterns from people’s knowledge. What we just saw was the process for extracting patterns that exist in people’s heads - the tribal knowledge and experience that isn’t written down anywhere.

The same process works for patterns stored in documentation, wikis, Slack conversations, or any other source. The only difference is you’d need an additional step at the beginning to extract and identify the patterns from those sources before you can define them in the system.

But patterns are just one piece of the puzzle. The AI also needs to understand what it’s not allowed to do.

Stop Breaking Production With AI

Now let’s talk about policies. While patterns teach AI how to assemble resources according to your organizational know-how, policies are about what values are allowed or required in the fields of those resources.

For example, policies define constraints like “all databases must run in the us-east-1 region”, “container images cannot use the 'latest' tag”, or “all pods must have resource limits defined”. These are field-level constraints that ensure compliance and security.

Here’s the key insight: solutions like Kyverno, OPA, or Kubernetes Validating Admission Policies can enforce these policies, but they don’t teach AI or people how to do “the right thing” from the start. Without policy learning, you end up with a trial-and-error approach where you keep hitting enforcement barriers until all the checks finally pass. That’s inefficient and frustrating.

What we’re building here teaches the AI the policies upfront, so it can create compliant resources from the beginning instead of learning through rejection.

The process for handling policies is mostly the same as with patterns. You identify policies from various sources, create embeddings, and store them in a database. But here’s the key difference: we can also convert those policies into enforceable rules using Kubernetes Validating Admission Policies, Kyverno, OPA, or whatever policy implementation you’re using. This gives you both proactive guidance for the AI and reactive enforcement in the cluster.
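
To give you a sense of what that enforcement side can look like, here’s a minimal sketch of a Kyverno policy for the “no latest tags” constraint mentioned a moment ago. It’s deliberately simplified; later in this video the system will generate a far more complete policy for us, so there’s no need to apply this one.

kubectl apply -f - <<EOF
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce
  rules:
  - name: require-pinned-image-tags
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Container images must not use the 'latest' tag."
      pattern:
        spec:
          containers:
          - image: "!*:latest"
EOF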

This creates a powerful two-layer system: the AI can use data in the Vector database to learn which policies apply to a specific intent and create compliant resources from the start, while Kyverno and similar implementations serve as the last line of defense, just as they always have. Best of both worlds: proactive compliance and enforcement backup.

We’ll explore how to use these policies in workflows later. For now, let’s focus on how to create and manage them. I’ll demonstrate capturing policies from people’s knowledge - the compliance requirements and constraints that exist in their heads. The same approach works when extracting policies from other sources like compliance documents or existing policy configurations.

Let me start by showing you what policies we already have in the system:

[user]

> List policies

[agent]

⏺ I'll list the available policy intents in the system.

  dot-ai - manageOrgData (MCP)(dataType: "policy", operation:
                              "list")
  ⎿  {
       "success": true,
       "operation": "list",
     … +65 lines (ctrl+r to expand)

⏺ Here are the available policy intents:

  ID: 65c8d83d-f377-4664-b55c-89566a225784
  - Description: Databases in Azure should always run in the
  us-east1 region
  - Triggers: db, postgresql, azure, databases, data store, sql
  database, relational database, rdbms, azure database, managed
  database, database instance, azure database for postgresql,
  postgresql instance, postgres
  - Rationale: Ensures data residency compliance and reduces latency
   for US-based applications. Running databases in us-east1 provides
   optimal performance for primary workloads and meets regulatory
  requirements for data locality.
  - Created: 2025-08-25T23:12:32.824Z by a-team
  - Deployed Policy: policy-65c8d83d-f377-4664-b55c-89566a225784
  (applied 2025-08-25T23:12:36.304Z)

  ID: 7a95808e-1829-4084-b06f-0ef538fd48eb
  - Description: Applications must NEVER use the latest tag.
  - Triggers: app, application, applications, app deployment,
  application deployment, microservice, microservices
  - Rationale: Using the 'latest' tag creates unpredictable
  deployments and makes rollbacks difficult. Specific version tags
  ensure reproducible builds and enable proper change management.
  - Created: 2025-08-30T18:10:47.591Z by a-team
  - Deployed Policy: policy-7a95808e-1829-4084-b06f-0ef538fd48eb
  (applied 2025-08-30T18:10:51.459Z)

  Total: 2 policy intents found

Perfect! These are great examples of policy constraints. The first policy enforces that Azure databases must run in the us-east1 region, and the second prevents the use of 'latest' tags in applications. Notice how each policy has triggers, rationale, and, importantly, a “Deployed Policy” reference - that means these policies have been converted into actual Kyverno enforcement rules.

Now let me show you how to create a new policy:

[user]

> Create a policy

[agent]

⏺ Please describe the policy intent you want to create. What should
  this policy enforce or ensure in your Kubernetes deployments?

  Examples:
  - "All containers must have resource limits defined"
  - "Images must be from trusted registries only"
  - "Pods must not run as root user"
  - "All services must have network policies"

  What policy requirement do you want to enforce?

I’ll create a policy for AWS database region compliance:

[user]

> Databases in AWS should always run in the us-east-1 region

[agent]

  dot-ai - manageOrgData (MCP)(dataType: "policy", operation:
                              "create", sessionId:
                              "policy-1756582341290-78820a05", step:
                               "triggers", response: "Databases in
                              AWS should always run in the us-east-1
                               region")
  ⎿  {
       "success": true,
       "operation": "create",
     … +25 lines (ctrl+r to expand)

⏺ What types of infrastructure should this apply to? Think about the
   categories of resources and systems this covers, not technical
  implementation details.

  Examples by domain:
  - Applications: web applications, microservices, APIs, frontends,
  backends
  - Databases: databases, SQL databases, NoSQL, data stores,
  persistent storage
  - Networking: load balancers, ingresses, services, network
  policies, DNS
  - Storage: persistent volumes, storage classes, backup systems
  - Security: secrets, certificates, authentication, authorization,
  RBAC
  - Infrastructure: clusters, nodes, compute resources, cloud
  resources

  Please provide infrastructure type keywords separated by commas
  (avoid field names like 'image', 'tag', or specific values like
  'latest').

I’ll specify AWS databases as the target:

[user]

> AWS DB

[agent]

⏺ Here's the expanded list of infrastructure types for your "AWS
  databases" policy:

  AWS DB, RDS, database, databases, data store, SQL database, NoSQL
  database, Aurora, DynamoDB, ElastiCache, DocumentDB, Neptune,
  Redshift, persistent storage, data warehouse, cache, in-memory
  database, graph database, time series database, backup systems,
  database clusters, read replicas

  Please select which ones you want to keep for your policy
  triggers. You can:
  - Choose any combination from the list above
  - Add your own custom triggers
  - Keep them all

  What triggers would you like for this policy?

I’ll keep all the expanded triggers:

[user]

> All of those

[agent]

⏺ dot-ai - manageOrgData (MCP)(dataType: "policy", operation:
                              "create", sessionId:
                              "policy-1756582341290-78820a05", step:
                               "rationale", response: "AWS DB, RDS,
                              database, databases, data store, SQL
                              database, NoSQL database, Aurora,
                              DynamoDB, ElastiCache, DocumentDB,
                              Neptune, Redshift, persistent storage,
                               data warehouse, cache, in-memory
                              database, graph database, time series
                              database, backup systems, database
                              clusters, read replicas")
  ⎿  {
       "success": true,
       "operation": "create",
     … +15 lines (ctrl+r to expand)

⏺ Why is this policy important for your organization? Please explain
   the rationale behind this policy requirement - what risks does it
   mitigate or what benefits does it provide?

I’ll ask the AI to generate the rationale:

[user]

> Write it for me

[agent]

⏺ dot-ai - manageOrgData (MCP)(dataType: "policy", operation:
                              "create", sessionId:
                              "policy-1756582341290-78820a05", step:
                               "created-by", response: "Ensures data
                               residency compliance and reduces
                              latency for US-based applications.
                              Running AWS databases in us-east-1
                              provides optimal performance for
                              primary workloads, meets regulatory
                              requirements for data locality, and
                              maintains consistency with existing
                              infrastructure deployments. This
                              policy helps control costs by keeping
                              data within the same region as compute
                               resources and simplifies disaster
                              recovery planning.")

╭───────────────────────────────────────────────────────────────────────╮
│ Tool use                                                              │
│                                                                       │
│   dot-ai - manageOrgData(dataType: "policy", operation: "create",     │
│   sessionId: "policy-1756582341290-78820a05", step: "created-by",     │
│   response: "Ensures data residency compliance and reduces latency    │
│   for US-based applications. Running AWS databases in us-east-1       │
│   provides optimal performance for primary workloads, meets           │
│   regulatory requirements for data locality, and maintains            │
│   consistency with existing infrastructure deployments. This policy   │
│    helps control costs by keeping data within the same region as      │
│   compute resources and simplifies disaster recovery planning.")      │
│   (MCP)                                                               │
│   Unified tool for managing cluster data: organizational patterns,    │
│   policy intents, and resource capabilities. For patterns and         │
│   policies: supports create, list, get, delete, deleteAll, and        │
│   search operations (patterns also support step-by-step creation      │
│   workflow). For capabilities: supports scan, list, get, delete,      │
│   deleteAll, and progress operations for cluster resource             │
│   capability discovery and management. Use dataType parameter to      │
│   specify what to manage: "pattern" for organizational patterns,      │
│   "policy" for policy intents, "capabilities" for resource            │
│   capabilities.                                                       │
│                                                                       │
│ Do you want to proceed?                                               │
│ ❯ 1. Yes                                                              │
│   2. Yes, and don't ask again for dot-ai - manageOrgData commands in  │
│   /Users/viktorfarcic/code/dot-ai-demo                                │
│   3. No, and tell Claude what to do differently (esc)                 │
│                                                                       │
╰───────────────────────────────────────────────────────────────────────╯

[agent]

⏺ Who should be credited as the creator of this policy intent? (This
   can be your name, team name, or organization.)

I’ll specify “a-team” as the creator:

[user]

> a-team

[agent]

⏺ Policy Namespace Scope

  Your policy can be applied cluster-wide or limited to specific
  namespaces.

  Available Namespaces in Your Cluster:

  a-team, atlas-operator, b-team, crossplane-system, default,
  ingress-nginx, kube-node-lease, kube-public, kube-system, kyverno,
   local-path-storage

  Choose the scope for your policy:

  1. Apply to all namespaces (cluster-wide enforcement)
    - Type: all or 1
  2. Apply only to specific namespaces (inclusive list)
    - Type: include: namespace1, namespace2, namespace3
    - Example: include: production, staging
  3. Apply to all namespaces EXCEPT specific ones (exclusion list)
    - Type: exclude: namespace1, namespace2
    - Example: exclude: kube-system, kube-public

  Your choice: [Type your selection]

  ---
  Examples:

  For cluster-wide policy:
  all

  To apply only to production and staging:
  include: production, staging

  To exclude system namespaces:
  exclude: kube-system, kube-public, kube-node-lease

  Note

  System namespaces (kube-system, kube-public, kube-node-lease) are
  often excluded from policies to prevent conflicts with Kubernetes
  core functionality. Consider whether your policy should apply to
  these system namespaces.

I’ll choose option 2 and specify specific namespaces:

[user]

> 2. a-team, b-team

[agent]

  dot-ai - manageOrgData (MCP)(dataType: "policy", operation:
                              "create", sessionId:
                              "policy-1756582341290-78820a05", step:
                               "kyverno-generation", response:
                              "include: a-team, b-team")
  ⎿  {
       "success": true,
       "operation": "create",
     … +311 lines (ctrl+r to expand)

⏺ Please review your policy intent:

  Description: Databases in AWS should always run in the us-east-1
  region
  Triggers: AWS DB, RDS, database, databases, data store, SQL
  database, NoSQL database, Aurora, DynamoDB, ElastiCache,
  DocumentDB, Neptune, Redshift, persistent storage, data warehouse,
   cache, in-memory database, graph database, time series database,
  backup systems, database clusters, read replicas
  Rationale: Ensures data residency compliance and reduces latency
  for US-based applications. Running AWS databases in us-east-1
  provides optimal performance for primary workloads, meets
  regulatory requirements for data locality, and maintains
  consistency with existing infrastructure deployments. This policy
  helps control costs by keeping data within the same region as
  compute resources and simplifies disaster recovery planning.
  Created By: a-team

  I've also generated a Kyverno ClusterPolicy that enforces this
  requirement:

  Generated Kyverno Policy:
  apiVersion: kyverno.io/v1
  kind: ClusterPolicy
  metadata:
    name: require-aws-databases-us-east-1-region
    labels:
      policy-intent/id: 62cb97c8-e75a-4ec0-bec8-404dfbebf4e2
    annotations:
      policy-intent/description: "Databases in AWS should always run
   in the us-east-1 region"
      policy-intent/rationale: "Ensures data residency compliance
  and reduces latency for US-based applications..."
  spec:
    background: false
    validationFailureAction: Enforce
    rules:
    - name: require-aws-rds-resources-us-east-1
      match:
        any:
        - resources:
            kinds:
            - rds.aws.m.upbound.io/v1beta1/Cluster
            - rds.aws.m.upbound.io/v1beta1/GlobalCluster
            - rds.aws.m.upbound.io/v1beta1/Instance
            # ... (additional RDS resource types)
            namespaces:
            - a-team
            - b-team
            operations:
            - CREATE
            - UPDATE
      validate:
        cel:
          expressions:
          - expression: >-
              has(object.spec.forProvider.region) &&
  object.spec.forProvider.region == 'us-east-1'
            message: "AWS RDS resources must be deployed in the
  us-east-1 region..."

  Choose what to do:

  1. Apply Kyverno policy to cluster - Store policy intent AND
  deploy enforcement to cluster
  2. Store policy intent only - Save for AI guidance without cluster
   enforcement
  3. Cancel - Do nothing

  ⚠️ Warning: Option 1 will deploy active policy enforcement to your
   cluster.

Let me take a look at the complete Kyverno policy that was generated:

[user]

> Show complete Kyverno policy

[agent]

⏺ Here's the complete Kyverno policy YAML:

  # MANDATORY SCHEMA-BY-SCHEMA ANALYSIS
  #
  # clusters.rds.aws.m.upbound.io: HAS spec.forProvider.region →
  MUST generate rule
  # globalclusters.rds.aws.m.upbound.io: HAS spec.forProvider.region
   → MUST generate rule
  # instances.rds.aws.m.upbound.io: HAS spec.forProvider.region →
  MUST generate rule
  # clusterinstances.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # dbsnapshotcopies.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # clustersnapshots.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # dbinstanceautomatedbackupsreplications.rds.aws.m.upbound.io: HAS
   spec.forProvider.region → MUST generate rule
  # optiongroups.rds.aws.m.upbound.io: HAS spec.forProvider.region →
   MUST generate rule
  # parametergroups.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # snapshots.rds.aws.m.upbound.io: HAS spec.forProvider.region →
  MUST generate rule
  # clusterparametergroups.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # clusterendpoints.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # subnetgroups.rds.aws.m.upbound.io: HAS spec.forProvider.region →
   MUST generate rule
  # instancestates.rds.aws.m.upbound.io: HAS spec.forProvider.region
   → MUST generate rule
  # clusteractivitystreams.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # instanceroleassociations.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # proxies.rds.aws.m.upbound.io: HAS spec.forProvider.region → MUST
   generate rule
  # eventsubscriptions.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # databaseinstances.sql.gcp.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule (but uses us-east1
  format)
  # clusterroleassociations.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # proxydefaulttargetgroups.rds.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # proxyendpoints.rds.aws.m.upbound.io: HAS spec.forProvider.region
   → MUST generate rule
  # ebssnapshotcopies.ec2.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  # proxytargets.rds.aws.m.upbound.io: HAS spec.forProvider.region →
   MUST generate rule
  # availabilityzonegroups.ec2.aws.m.upbound.io: HAS
  spec.forProvider.region → MUST generate rule
  #
  # RESOURCES REQUIRING VALIDATION RULES: All AWS RDS/database
  resources
  #
  apiVersion: kyverno.io/v1
  kind: ClusterPolicy
  metadata:
    name: require-aws-databases-us-east-1-region
    labels:
      policy-intent/id: 62cb97c8-e75a-4ec0-bec8-404dfbebf4e2
    annotations:
      policy-intent/description: "Databases in AWS should always run
   in the us-east-1 region"
      policy-intent/rationale: "Ensures data residency compliance
  and reduces latency for US-based applications. Running AWS
  databases in us-east-1 provides optimal performance for primary
  workloads, meets regulatory requirements for data locality, and
  maintains consistency with existing infrastructure deployments.
  This policy helps control costs by keeping data within the same
  region as compute resources and simplifies disaster recovery
  planning."
  spec:
    background: false
    validationFailureAction: Enforce
    rules:
    - name: require-aws-rds-resources-us-east-1
      match:
        any:
        - resources:
            kinds:
            - rds.aws.m.upbound.io/v1beta1/Cluster
            - rds.aws.m.upbound.io/v1beta1/GlobalCluster
            - rds.aws.m.upbound.io/v1beta1/Instance
            - rds.aws.m.upbound.io/v1beta1/ClusterInstance
            - rds.aws.m.upbound.io/v1beta1/DBSnapshotCopy
            - rds.aws.m.upbound.io/v1beta1/ClusterSnapshot
            -
  rds.aws.m.upbound.io/v1beta1/DBInstanceAutomatedBackupsReplication
            - rds.aws.m.upbound.io/v1beta1/OptionGroup
            - rds.aws.m.upbound.io/v1beta1/ParameterGroup
            - rds.aws.m.upbound.io/v1beta1/Snapshot
            - rds.aws.m.upbound.io/v1beta1/ClusterParameterGroup
            - rds.aws.m.upbound.io/v1beta1/ClusterEndpoint
            - rds.aws.m.upbound.io/v1beta1/SubnetGroup
            - rds.aws.m.upbound.io/v1beta1/InstanceState
            - rds.aws.m.upbound.io/v1beta1/ClusterActivityStream
            - rds.aws.m.upbound.io/v1beta1/InstanceRoleAssociation
            - rds.aws.m.upbound.io/v1beta1/Proxy
            - rds.aws.m.upbound.io/v1beta1/EventSubscription
            - rds.aws.m.upbound.io/v1beta1/ClusterRoleAssociation
            - rds.aws.m.upbound.io/v1beta1/ProxyDefaultTargetGroup
            - rds.aws.m.upbound.io/v1beta1/ProxyEndpoint
            - rds.aws.m.upbound.io/v1beta1/ProxyTarget
            namespaces:
            - a-team
            - b-team
            operations:
            - CREATE
            - UPDATE
      validate:
        cel:
          expressions:
          - expression: >-
              has(object.spec.forProvider.region) &&
  object.spec.forProvider.region == 'us-east-1'
            message: "AWS RDS resources must be deployed in the
  us-east-1 region for data residency compliance and optimal
  performance"
          - expression: >-
              has(object.spec.initProvider.region) ?
  object.spec.initProvider.region == 'us-east-1' : true
            message: "AWS RDS resources must be deployed in the
  us-east-1 region for data residency compliance and optimal
  performance"
    - name: require-aws-gcp-sql-resources-us-east-1
      match:
        any:
        - resources:
            kinds:
            - sql.gcp.m.upbound.io/v1beta1/DatabaseInstance
            namespaces:
            - a-team
            - b-team
            operations:
            - CREATE
            - UPDATE
      validate:
        cel:
          expressions:
          - expression: >-
              has(object.spec.forProvider.region) &&
  object.spec.forProvider.region == 'us-east1'
            message: "GCP SQL database instances must be deployed in
   the us-east1 region for data residency compliance and optimal
  performance"
          - expression: >-
              has(object.spec.initProvider.region) ?
  object.spec.initProvider.region == 'us-east1' : true
            message: "GCP SQL database instances must be deployed in
   the us-east1 region for data residency compliance and optimal
  performance"
    - name: require-aws-ec2-resources-us-east-1
      match:
        any:
        - resources:
            kinds:
            - ec2.aws.m.upbound.io/v1beta1/EBSSnapshotCopy
            - ec2.aws.m.upbound.io/v1beta1/AvailabilityZoneGroup
            namespaces:
            - a-team
            - b-team
            operations:
            - CREATE
            - UPDATE
      validate:
        cel:
          expressions:
          - expression: >-
              has(object.spec.forProvider.region) &&
  object.spec.forProvider.region == 'us-east-1'
            message: "AWS EC2 resources must be deployed in the
  us-east-1 region for data residency compliance and optimal
  performance"
          - expression: >-
              has(object.spec.initProvider.region) ?
  object.spec.initProvider.region == 'us-east-1' : true
            message: "AWS EC2 resources must be deployed in the
  us-east-1 region for data residency compliance and optimal
  performance"

  Choose what to do:

  1. Apply Kyverno policy to cluster - Store policy intent AND
  deploy enforcement to cluster
  2. Store policy intent only - Save for AI guidance without cluster
   enforcement
  3. Cancel - Do nothing

  ⚠️ Warning: Option 1 will deploy active policy enforcement to your
   cluster.

This is impressive - the system generated a complete Kyverno policy with rules for multiple AWS services and database types. Even if something was wrong and we only got 80% of the policy correct, that would still be incredibly helpful. We could save it, update it ourselves or with another agent, and then apply it.

For this demo, I’ll apply the policy to the cluster:

[user]

> Apply Kyverno policy to cluster

[agent]

  dot-ai - manageOrgData (MCP)(dataType: "policy", operation:
                              "create", sessionId:
                              "policy-1756582341290-78820a05", step:
                               "complete", response:
                              "apply-to-cluster")
  ⎿  {
       "success": true,
       "operation": "create",
     … +226 lines (ctrl+r to expand)

⏺ Policy Applied to Cluster Successfully!

  Policy ID: 62cb97c8-e75a-4ec0-bec8-404dfbebf4e2
  Description: Databases in AWS should always run in the us-east-1
  region
  Deployed Policy: policy-62cb97c8-e75a-4ec0-bec8-404dfbebf4e2
  Kyverno File: /app/tmp/sessions/policy-sessions/62cb97c8-e75a-4ec0
  -bec8-404dfbebf4e2-kyverno.yaml

  Apply Output:
  clusterpolicy.kyverno.io/require-aws-databases-us-east-1-region
  created

  The policy intent has been stored in the database and the Kyverno
  policy has been successfully applied to your cluster. This policy
  will now enforce that all AWS database resources in the a-team and
   b-team namespaces must be deployed in the us-east-1 region.

And that’s the power of policy learning. We’ve captured organizational compliance requirements from people’s knowledge, stored them in a searchable format for AI guidance, and automatically generated Kyverno policies for cluster-level enforcement. Now the AI knows to create databases in the us-east-1 region from the start, and if something slips through, Kyverno will catch it.
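
To make the “searchable format” part more concrete, here’s a minimal TypeScript sketch of what a stored policy intent might look like. The shape is an assumption for illustration; only the id, description, and rationale mirror the policy-intent annotations in the Kyverno policy above.

// A sketch of a stored policy intent. The interface is an assumption;
// the id, description, and rationale come from the policy-intent/*
// annotations in the generated Kyverno policy.
interface PolicyIntent {
  id: string;              // "62cb97c8-e75a-4ec0-bec8-404dfbebf4e2"
  description: string;     // short statement used for semantic search
  rationale: string;       // why the policy exists; also searchable
  triggers: string[];      // hypothetical keywords that match user intents
  deployedPolicy?: string; // Kyverno ClusterPolicy name, if enforcement was applied
}

const awsDatabaseRegionPolicy: PolicyIntent = {
  id: "62cb97c8-e75a-4ec0-bec8-404dfbebf4e2",
  description: "Databases in AWS should always run in the us-east-1 region",
  rationale: "Ensures data residency compliance and reduces latency for US-based applications",
  triggers: ["database", "aws", "rds", "region"],
  deployedPolicy: "require-aws-databases-us-east-1-region",
};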

The same process works for extracting policies from compliance documents, existing policy configurations, or any other source where your organizational constraints are documented.

So we’ve covered capabilities, patterns, and policies. But there’s still one crucial piece missing: managing all this information efficiently.

The Context Window Disaster Nobody Talks About

Here’s a critical problem that can make or break your AI-powered infrastructure: context management. What exactly is context, and why does it matter so much?

We’re dealing with massive amounts of data: hundreds of Kubernetes resources, each with potentially enormous schemas, plus all our patterns, policies, user intents, and everything else. If you keep piling all of this into your AI’s context, it quickly becomes garbage. And that garbage gets compacted into even bigger garbage until the whole system becomes completely useless.
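
To get a feel for the scale, here’s a rough back-of-the-envelope calculation (the numbers are illustrative assumptions, not measurements): if a cluster exposes 300 resource kinds and each schema averages 2,000 tokens, that’s around 600,000 tokens of schemas alone - roughly three times a typical 200,000-token context window, before we add a single pattern, policy, or line of conversation history.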

This is a fundamental problem with how most people approach AI in infrastructure. They dump everything into the context and wonder why performance degrades, costs skyrocket, and responses become increasingly inaccurate.

But wait until you see the elegant solution that completely eliminates this problem.

Here’s the solution: instead of building on top of previous context, each interaction in this MCP system starts with a completely fresh context. The agent inside the MCP gets exactly the relevant information it needs for the specific task at hand, no matter when that information was originally fetched or created.

No accumulated garbage. No bloated context windows. No degraded performance. Just clean, relevant data for each interaction.

And here’s a crucial optimization: use code, not agents, to fetch information in predictable situations. When you know exactly what data you need and where to get it, don’t waste time and money asking an AI to fetch it. Direct code execution is faster, less expensive, more reliable, and completely deterministic.
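
As a minimal illustration of “code, not agents” - and not the project’s actual implementation - here’s a TypeScript sketch that fetches the cluster’s resource kinds directly with kubectl instead of asking the AI to do it:

// Deterministic data gathering: plain code, no AI call, no tokens spent,
// and the same result every time. This is a sketch, not dot-ai's code.
import { execFileSync } from "node:child_process";

function listClusterResources(): string[] {
  const output = execFileSync(
    "kubectl",
    ["api-resources", "--verbs=list", "-o", "name"],
    { encoding: "utf8" }
  );
  return output.trim().split("\n");
}

// Only the AI analysis step sees this data, embedded into a prompt template.
const availableResources = listClusterResources();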

Let me show you what this looks like in practice. Here’s an actual prompt template used in the system:

# Solution Assembly and Ranking Prompt

You are a Kubernetes expert. Given this user intent, available resources, and organizational patterns, create and rank complete solutions that address the user's needs.

## User Intent
{intent}

## Available Resources
{resources}

## Organizational Patterns
{patterns}

## Instructions

## 🏆 PATTERN PRIORITIZATION (HIGHEST PRIORITY)

**Pattern-Aware Resource Selection:**
- **Pattern resources are included** in the Available Resources list below with source marked as "organizational-pattern"
- **Golden Path Priority** - Pattern resources represent approved organizational standards and should rank higher than alternatives
- **Higher-Level Abstractions** - Pattern resources often provide better user experience than low-level cloud provider resources

**SOLUTION ASSEMBLY APPROACH:**

1. **Analyze Available Resources**: Review capabilities, providers, complexity, and use cases
2. **Apply Pattern Logic**: Read pattern rationales to understand when they apply
3. **Create Complete Solutions**: Assemble resources into working combinations
4. **Rank by Effectiveness**: Score based on capability match, pattern compliance, and user intent

**CRITICAL: Pattern Conditional Logic**
- **Read each pattern's "Rationale" field carefully** - it specifies WHEN the pattern applies
- **Apply patterns conditionally** - only include pattern resources when their technical conditions are met
- **Resource compatibility analysis**: Before including pattern resources in a solution, verify the pattern's rationale matches the resources you're selecting
- **API group dependencies**: If a pattern rationale mentions specific API groups (e.g., "solutions using X.api"), only apply that pattern when the solution actually uses resources from those API groups
- **Multi-provider abstractions**: Higher-level abstractions that work across providers should not automatically trigger provider-specific auxiliary patterns unless technically required
- **Pattern compliance increases solution score** - solutions following organizational patterns should rank higher, but only when patterns are correctly applied based on technical compatibility

Create multiple alternative solutions. Consider:
- **🥇 FIRST: Pattern-based solutions** - Complete solutions using organizational patterns when applicable
- **🥈 SECOND: Technology-focused solutions** - Solutions optimized for specific technologies or providers  
- **🥉 THIRD: Complexity variations** - Simple vs comprehensive approaches
- Direct relevance to the user's needs (applications, infrastructure, operators, networking, storage)  
- Common Kubernetes patterns and best practices
- Resource relationships and combinations
- Production deployment patterns
- Complex multi-component solutions
- **Custom Resource Definitions (CRDs)** that might provide higher-level abstractions or simpler alternatives
- Platform-specific resources (e.g., Crossplane, Knative, Istio, ArgoCD) that could simplify the deployment
- **Infrastructure components**: networking (Ingress, Service, NetworkPolicy), storage (PVC, StorageClass), security (RBAC, ServiceAccount)
- **Database operators**: PostgreSQL, MongoDB, MySQL, Redis operators that provide managed database services
- **Monitoring and observability**: Prometheus, Grafana, AlertManager, logging operators
- **Operator patterns**: Look for operators that provide simplified management of complex infrastructure
- **CRD Selection Priority**: If you see multiple CRDs from the same group with similar purposes (like "App" and "AppClaim"), include the namespace-scoped ones (marked as "Namespaced: true") rather than cluster-scoped ones, as they're more appropriate for application deployments

**Generate 2-5 different solutions** that genuinely address the user intent. Prioritize relevance over quantity - it's better to provide 2-3 high-quality, relevant solutions than to include irrelevant alternatives just to reach a target number.

## Response Format

Respond with ONLY a JSON object containing an array of complete solutions. Each solution should include resources, description, scoring, and pattern compliance:

**CRITICAL**: For each resource in your solutions, you MUST include the `resourceName` field. This field contains the correct plural, lowercase resource name used for kubectl explain calls. Extract this from the Available Resources list - each resource shows its `resourceName` field that you should copy exactly.

{
  "solutions": [
    {
      "type": "combination",
      "resources": [
        {
          "kind": "Deployment",
          "apiVersion": "apps/v1",
          "group": "apps",
          "resourceName": "deployments.apps"
        },
        {
          "kind": "Service",
          "apiVersion": "v1",
          "group": "",
          "resourceName": "services"
        }
      ],
      "score": 95,
      "description": "Complete web application deployment with networking",
      "reasons": ["High capability match for web applications", "Includes essential networking"],
      "patternInfluences": [
        {
          "patternId": "web-app-pattern-123",
          "description": "Web application deployment pattern",
          "influence": "high",
          "matchedTriggers": ["web application", "frontend"]
        }
      ],
      "usedPatterns": true
    }
  ]
}

IMPORTANT: Your response must be ONLY the JSON object, nothing else.

## Selection Philosophy

- **Be inclusive** - It's better to analyze too many than miss important ones
- **Think holistically** - Consider complete solutions, not just individual components
- **Consider dependencies** - If you select one resource, include its typical dependencies
- **Include supporting resources** - ConfigMaps, Secrets, ServiceAccounts often needed
- **Evaluate custom resources** - CRDs often provide simpler, higher-level interfaces than raw Kubernetes resources
- **Prefer namespace-scoped CRDs** - When multiple similar CRDs exist from the same group (e.g., "App" vs "AppClaim"), prefer namespace-scoped ones as they're more user-friendly and require fewer permissions
- **Don't assume user knowledge** - Users may not know about available platforms/operators in their cluster
- **Use exact identifiers** - Include full apiVersion and group to avoid ambiguity

This is the context for a single step in the workflow. Think about it: we might have accumulated thousands, tens of thousands, or even hundreds of thousands of tokens in previous interactions, but all of that is gone. Wiped clean.

Instead, this template has placeholders that get replaced with only the relevant data needed for this specific step. {intent} becomes the enhanced user intent, {resources} gets populated with the specific list of Kubernetes resources and schemas that might be involved in assembling a solution, and {patterns} gets replaced with the relevant patterns found through semantic search in the Vector database - not all patterns, just the ones that matter for this particular request.

The prompt outputs clean JSON that can be used, along with other data, in subsequent workflow steps. Whatever else the AI might have generated gets discarded immediately. No bloat, no accumulation, no context pollution.
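
As a rough sketch of how such a step might be wired together - the file path, function names, and helpers are assumptions, not the project’s actual API - the template gets filled with only the data this step needs, the prompt is sent as a single stateless request, and only the JSON survives:

// A sketch of a stateless, template-driven step. Names and paths are
// assumptions for illustration only.
import { readFileSync } from "node:fs";

// Fill the template with just the data this step needs. Nothing from
// earlier interactions is attached to the request.
function buildPrompt(intent: string, resources: string, patterns: string): string {
  const template = readFileSync("prompts/solution-assembly.md", "utf8"); // hypothetical path
  return template
    .replaceAll("{intent}", intent)
    .replaceAll("{resources}", resources)
    .replaceAll("{patterns}", patterns);
}

// Keep only the JSON from the model's response; everything else is discarded.
function extractSolutions(modelResponse: string): unknown {
  return JSON.parse(modelResponse.trim());
}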

This approach keeps the system fast, cost-effective, and accurate throughout even the most complex multi-step workflows.

Smart Conversations That Get Results

Now let’s talk about workflows. What exactly are workflows in the context of AI-powered infrastructure management?

Workflows are structured, semi-predictable sequences of steps designed to accomplish something complex. They break down big tasks into manageable pieces, guide users through decision points, and ensure all necessary information is gathered before taking action.

In our example, a workflow is a combination of fetching information from various sources and analyzing that information with AI. Each step in the workflow can involve data gathering, AI analysis, or both. We might fetch information from users to enhance their intent, pull relevant patterns and policies from the Vector database, get current schemas from Kubernetes clusters, gather operational data, and source information from anywhere else that’s relevant. Then AI analyzes all this collected data to make intelligent decisions about the next step.
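
One way to picture this - purely as an illustrative sketch, with types that are assumptions rather than the project’s actual code - is a step that gathers data deterministically, optionally runs AI analysis over it with a fresh context, and then decides what comes next:

// An illustrative model of a workflow step; the shape is an assumption.
interface WorkflowStep {
  name: string;
  // Deterministic gathering: user answers, Vector DB lookups, cluster schemas.
  gather: () => Promise<Record<string, unknown>>;
  // Optional AI analysis over the gathered data, started with a fresh context.
  analyze?: (data: Record<string, unknown>) => Promise<unknown>;
  // Decide which step (or MCP tool) should run next based on the result.
  next: (result: unknown) => string | null;
}

async function runStep(step: WorkflowStep): Promise<string | null> {
  const data = await step.gather();
  const result = step.analyze ? await step.analyze(data) : data;
  return step.next(result);
}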

We’ve already seen workflows in action when we were managing capabilities, patterns, and policies. But this next workflow is more important because it guides users toward the right solution while leveraging all the capabilities, patterns, and policies we’ve created.

Now here’s where everything comes together. Watch what happens when I make the same PostgreSQL request from the beginning, but this time with all our components working together:

[user]

> Create a PostgreSQL database in AWS.

[agent]

⏺ I'll help you create a PostgreSQL database in AWS using
  Kubernetes. Let me use the dot-ai recommendation tool to get the
  best approach for deploying PostgreSQL.

⏺ dot-ai - recommend (MCP)(intent: "create PostgreSQL database in
                          AWS")
  ⎿  {
       "status": "clarification_available",
       "intent": "create PostgreSQL database in AWS",
     … +172 lines (ctrl+r to expand)

⏺ I can help you create a PostgreSQL database in AWS! To provide the
   best recommendations, I need to clarify a few key details about
  your requirements:

  Primary Questions:
  1. Deployment Type: Do you want PostgreSQL deployed as containers
  on Kubernetes (like EKS), or are you looking for a managed AWS
  service like RDS?
  2. Environment: What environment is this for - development,
  staging, or production?
  3. Performance: What are your expected performance requirements
  (concurrent connections, query volume, data size)?
  4. Availability: Do you need high availability with automatic
  failover, or is a single instance acceptable?

  Additional Considerations:
  - Data sensitivity and compliance requirements
  - Network accessibility (internal only vs external access)
  - Storage requirements and expected growth
  - PostgreSQL version preferences
  - Applications that will connect to the database
  - Monitoring and alerting needs
  - Budget constraints
  - Timeline requirements

  Could you provide details on these key points so I can give you
  the most appropriate solution for your PostgreSQL database setup
  in AWS?

Notice what just happened. Instead of immediately suggesting a random PostgreSQL solution, the workflow intelligently recognized that my vague request needed clarification. It asked targeted questions to understand my specific requirements, deployment preferences, and constraints. This is the workflow guiding the conversation toward a better outcome.

The output from the MCP server contains, among other things, the next step in the workflow, including which MCP tool the agent should call next.
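
In other words, the response is not just text for the user; it carries machine-readable guidance for the agent. Here’s a sketch of what that shape could look like - only the status value comes from the output above, while the remaining fields are assumptions for illustration:

// A sketch of the response shape the agent receives. Only "status" is
// taken from the transcript above; the other fields are assumptions.
const recommendResponse = {
  status: "clarification_available",
  intent: "create PostgreSQL database in AWS",
  questions: [
    "Containers on Kubernetes or a managed service like RDS?",
    "Which environment: development, staging, or production?",
  ],
  // Hypothetical guidance telling the agent which MCP tool to call next.
  nextStep: {
    tool: "recommend",
    arguments: { final: true },
    instruction: "Call recommend again with the clarified intent and final: true",
  },
};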

Now I’ll provide the clarification it requested:

[user]

> It should be managed PostgreSQL DB in AWS (RDS), there should be DB schema management, and it should be publicly accessible.

[agent]

⏺ I'll help you create a managed PostgreSQL database in AWS RDS with
   schema management and public accessibility. Let me get specific
  recommendations for this setup.

  dot-ai - recommend (MCP)(intent: "create managed PostgreSQL RDS
                          database in AWS with schema management and
                           public accessibility", final: true)
  ⎿  {
       "intent": "create managed PostgreSQL RDS database in AWS with
     schema management and public accessibility",
     … +249 lines (ctrl+r to expand)

⏺ I found 4 solutions for creating a managed PostgreSQL RDS database
   in AWS with schema management and public accessibility. Here are
  your options:

  Solution Options:

  1. DevOps Toolkit Managed PostgreSQL (Score: 95) ⭐ Uses
  Organizational Patterns

  - Resources: SQLS (devopstoolkit.live/v1beta1)
  - Benefits: Complete PostgreSQL solution with built-in schema
  management, integrated secret management, and multi-cloud
  portability
  - Pattern Used: DevOps Toolkit DB pattern for simplified database
  deployments

  2. AWS RDS with Crossplane Schema Management (Score: 90) Uses
  Organizational Patterns

  - Resources: RDS Instance, Schema, ProviderConfig, InternetGateway
  - Benefits: Direct AWS RDS control with dedicated schema
  management and role-based access
  - Pattern Used: AWS public services pattern

  3. High-Availability Aurora PostgreSQL (Score: 88) Uses
  Organizational Patterns

  - Resources: Aurora Cluster, ClusterInstance, AtlasSchema,
  InternetGateway
  - Benefits: Enterprise-grade solution with advanced schema
  versioning, high availability, and scalability
  - Pattern Used: AWS public services pattern with serverless
  scaling

  4. Comprehensive RDS with Full Lifecycle Management (Score: 85)

  - Resources: RDS Instance, Database, Role, Grant
  - Benefits: Complete database administration with role-based
  access control and fine-grained security
  - Pattern Used: None (custom implementation)

  Which solution would you prefer? Please let me know the number
  (1-4) of your choice, and I'll proceed with the configuration.

This is brilliant! The AI analyzed my refined requirements and presented me with multiple solutions. Notice how it found organizational patterns and ranked them higher (scores 95, 90, 88 vs 85). I can choose a “golden path” in the form of DevOps Toolkit Managed PostgreSQL or go with a custom solution. The AI gives me options, but it’s up to me to choose which one fits my needs.

I’ll select the top-ranked organizational pattern:

[user]

> 1

[agent]

  dot-ai - chooseSolution (MCP)(solutionId:
                               "sol_2025-08-30T195545_4aeee04d77be")
  ⎿  {
       "status": "stage_questions",
       "solutionId": "sol_2025-08-30T195545_4aeee04d77be",
     … +74 lines (ctrl+r to expand)

⏺ Perfect! I've selected the DevOps Toolkit Managed PostgreSQL
  solution. Now I need some required configuration details:

  Required Configuration:

  1. Database Name: What name would you like for your PostgreSQL
  database? (lowercase letters, numbers, and hyphens only)
  2. Namespace: Which namespace should the database be deployed to?
    - Options: a-team, atlas-operator, b-team, crossplane-system,
  default, ingress-nginx, kube-node-lease, kube-public, kube-system,
   kyverno, local-path-storage
  3. Size: What size database do you need?
    - Options: small, medium, large
  4. Region: ⚠️ Policy requirement: Databases in AWS should run in
  us-east-1 region
    - Options: us-east-1, us-west-2, eu-west-1, ap-southeast-1

  Please provide your choices for these required settings.

Perfect! Notice how the workflow is already enforcing the policy we created earlier. It’s telling me that databases in AWS should run in the us-east-1 region - that’s our policy in action, guiding the user toward compliant choices.

[user]

> 1. my-db, 2. a-team, 3. small, 4. us-east-1

[agent]

⏺ Great! Now for the basic configuration options:

  Basic Configuration (Optional):

  1. Databases: Which databases should be created in your PostgreSQL
   instance?
    - Options: app, users, inventory, analytics, logs
    - You can select multiple databases
  2. PostgreSQL Version: Which version would you prefer?
    - Options: 15 (recommended), 14, 13, 12
  3. Secret Management: Do you want to enable automated secret
  management for database credentials?
    - Recommended for production environments

  You can provide your choices for these settings, or say "skip" to
  proceed to advanced configuration.

I’ll specify basic configuration options:

[user]

> 1. db-01, db-02; 2. 15; 3. No

[agent]

⏺ Perfect! Now for the advanced configuration options:

  Advanced Configuration (Optional):

  1. Crossplane Composition: Do you want to use a specific
  composition for your database?
    - Leave empty for default composition
  2. Composition Updates: How should composition updates be handled?
    - Options: Automatic, Manual
  3. External Secret Store: Which external secret store should be
  used for credential management?
    - e.g., vault-store, aws-secrets-manager
  4. Secret Pull Cluster: Should secrets be pulled to a specific
  cluster?
  5. Secret Pull Namespace: Which namespace should receive the
  pulled secrets?
    - Options: a-team, atlas-operator, b-team, etc.
  6. UpCloud Network: If using UpCloud, what network address should
  be configured?
    - e.g., 192.168.1.0/24
  7. Schema Definitions: Do you have specific SQL schemas to
  initialize with your databases?
    - Provide SQL statements for database initialization

  You can provide your choices for these settings, or say "skip" to
  proceed to the final open configuration stage.

Here’s an interesting observation: the question about which Crossplane Composition to use is an example of a potentially missing policy. We should probably create one that instructs the system which Compositions to use for AWS, which ones for Google, and so on. But that’s a task for another day.

For now, let me ask the AI to generate a sample schema:

[user]

> 7. Write a sample schema for me.

[agent]

  dot-ai - answerQuestion (MCP)(solutionId:
                               "sol_2025-08-30T195545_4aeee04d77be",
                                stage: "advanced", answers:
                               {"schema-definitions":"-- Sample
                               schema for db-01\nCREATE SCHEMA IF
                               NOT EXISTS app;\n\nCREATE TABLE
                               app.users (\n    id SERIAL PRIMARY
                               KEY,\n    username VARCHAR(50) UNIQUE
                                NOT NULL,\n    email VARCHAR(100)
                               UNIQUE NOT NULL,\n    created_at
                               TIMESTAMP DEFAULT
                               CURRENT_TIMESTAMP\n);\n\nCREATE TABLE
                                app.products (\n    id SERIAL
                               PRIMARY KEY,\n    name VARCHAR(100)
                               NOT NULL,\n    description TEXT,\n
                                price DECIMAL(10,2) NOT NULL,\n
                               created_at TIMESTAMP DEFAULT
                               CURRENT_TIMESTAMP\n);\n\n-- Sample
                               schema for db-02\nCREATE SCHEMA IF
                               NOT EXISTS analytics;\n\nCREATE TABLE
                                analytics.events (\n    id SERIAL
                               PRIMARY KEY,\n    event_type
                               VARCHAR(50) NOT NULL,\n    user_id
                               INTEGER,\n    event_data JSONB,\n
                               timestamp TIMESTAMP DEFAULT
                               CURRENT_TIMESTAMP\n);"})
  ⎿  {
       "status": "stage_questions",
       "solutionId": "sol_2025-08-30T195545_4aeee04d77be",
     … +107 lines (ctrl+r to expand)

⏺ Perfect! I've added a sample schema for your databases. The schema
   includes:

  For db-01:
  - app schema with users and products tables
  - Users table with id, username, email, and timestamps
  - Products table with id, name, description, price, and timestamps

  For db-02:
  - analytics schema with an events table
  - Events table for tracking user activities with JSONB data
  storage

  Now, would you like to configure the remaining advanced settings,
  or say "skip" to proceed to the final open configuration stage?

  Remaining Advanced Options:
  1. Crossplane composition (leave empty for default)
  2. Composition updates (Automatic/Manual)
  3. External secret store
  4. Secret pull cluster
  5. Secret pull namespace
  6. UpCloud network address

I’ll skip the advanced configuration:

[user]

> skip

[agent]

  dot-ai - answerQuestion (MCP)(solutionId:
                               "sol_2025-08-30T195545_4aeee04d77be",
                                stage: "advanced", answers: {})
  ⎿  {
       "status": "stage_questions",
       "solutionId": "sol_2025-08-30T195545_4aeee04d77be",
     … +37 lines (ctrl+r to expand)

⏺ Final Configuration Stage - Open Requirements:

  Is there anything else about your PostgreSQL requirements or
  constraints that would help us provide better recommendations?

  Examples:
  - Specific performance requirements
  - Backup needs
  - Network security constraints
  - Compliance requirements
  - Connection limits
  - Monitoring preferences

  You can provide any additional requirements, or say "N/A" if you
  have no additional requirements to proceed with generating the
  manifests.

I don’t have additional requirements:

[user]

> N/A

[agent]

 dot-ai - answerQuestion (MCP)(solutionId:
                               "sol_2025-08-30T195545_4aeee04d77be",
                                stage: "open", answers:
                               {"open":"N/A"})
  ⎿  {
       "status": "ready_for_manifest_generation",
       "solutionId": "sol_2025-08-30T195545_4aeee04d77be",
     … +53 lines (ctrl+r to expand)

⏺ Perfect! Configuration is complete. Now I'll generate the
  Kubernetes manifests for your PostgreSQL database.

⏺ dot-ai - generateManifests (MCP)(solutionId: "sol_2025-08-30T19554
                                  5_4aeee04d77be")

╭───────────────────────────────────────────────────────────────────────╮
│ Tool use                                                              │
│                                                                       │
│   dot-ai - generateManifests(solutionId:                              │
│   "sol_2025-08-30T195545_4aeee04d77be") (MCP)                         │
│   Generate final Kubernetes manifests from fully configured           │
│   solution (ONLY after completing ALL stages: required, basic,        │
│   advanced, and open)                                                 │
│                                                                       │
│ Do you want to proceed?                                               │
│ ❯ 1. Yes                                                              │
│  2. Yes, and don't ask again for dot-ai - generateManifests commands  │
│  in /Users/viktorfarcic/code/dot-ai-demo                              │
│   3. No, and tell Claude what to do differently (esc)                 │
│                                                                       │
╰───────────────────────────────────────────────────────────────────────╯

[agent]

⏺ Excellent! Your PostgreSQL database manifests have been generated
  successfully. Here's what was created:

  Generated Resources:

  Primary Resource: SQL Database

  - Name: my-db
  - Namespace: a-team
  - Type: DevOps Toolkit SQL resource
  - Size: small
  - Region: us-east-1
  - PostgreSQL Version: 15

  Database Configuration:

  - Databases: db-01, db-02
  - Schema Management: Built-in with sample schemas
  - Secret Management: Disabled (as requested)
  - Public Accessibility: Enabled

  Sample Schemas Included:

  - db-01: app schema with users and products tables
  - db-02: analytics schema with events table

  The manifests are ready for deployment. Would you like me to
  deploy them to your cluster now, or would you prefer to review
  them first?

  To deploy, I can run: kubectl apply with the generated manifests.

This is the culmination of everything we’ve built! The agentic MCP has assembled the solution by combining the workflow (guided user interaction), capabilities (available Kubernetes resources), patterns (organizational best practices), and policies (compliance requirements like the us-east-1 region rule). It did the right thing because it got all the information it needed through the structured workflow.

Of course, it might not have done the right thing if we had missed providing sufficient patterns and policies - the quality of the AI’s decisions directly depends on the organizational knowledge we’ve fed into the system.

Now I could save these manifests for GitOps, review them first, or do something else entirely with the assembled solution. It’s up to me to choose whether to let the agentic MCP deploy directly or handle the deployment through my organization’s preferred process. For this demo, I’ll let it deploy directly:

[user]

> Deploy them

[agent]

  dot-ai - deployManifests (MCP)(solutionId: "sol_2025-08-30T195545_
                                4aeee04d77be")
  ⎿  {
       "success": true,
       "solutionId": "sol_2025-08-30T195545_4aeee04d77be",
     … +41 lines (ctrl+r to expand)

⏺ 🎉 Deployment Successful!

  Your PostgreSQL database has been deployed successfully to the
  a-team namespace. Here's what was created:

  Deployed Resources:

  - ✅ ConfigMaps: Created for deployment tracking
  - ✅ SQL Database: my-db resource deployed
  - ✅ All system components: Ready and operational

  Database Details:

  - Name: my-db
  - Location: AWS RDS in us-east-1
  - Databases: db-01, db-02
  - Version: PostgreSQL 15
  - Size: Small
  - Public Access: Enabled
  - Schema Management: Built-in with sample schemas

  The database is now provisioning in AWS. You can monitor its
  status with:
  kubectl get sql my-db -n a-team

  The DevOps Toolkit will handle the AWS RDS provisioning, schema
  creation, and make the database publicly accessible as requested.

Your Complete AI-Powered IDP Blueprint

We’ve built something that most organizations only dream about: an AI system that actually understands your infrastructure and works within your constraints.

Here are the five essential components that make this possible:

Capabilities teach your AI what infrastructure resources are actually available in your environment. No more generic suggestions that don’t match your setup.

Patterns encode your organizational wisdom - the tribal knowledge that transforms AI from giving generic solutions to following your specific standards.

Policies ensure compliance from the start instead of learning through rejection. Your AI creates compliant resources immediately.

Context Management keeps your system fast and accurate by starting each interaction with fresh context instead of accumulating garbage.

Workflows guide intelligent conversations toward the right solutions instead of relying on vague requests and unpredictable responses.

The result? AI that deploys infrastructure correctly the first time, follows your patterns, respects your policies, and actually works in your organization.

Most of your work will be in creating patterns and policies. The technical infrastructure is straightforward, but capturing your organizational knowledge and compliance requirements? That’s where you’ll spend your time.

If you choose to experiment with the DevOps AI Toolkit or build your own system, remember that this is an iterative process. Generate recommendations, inspect them carefully, identify gaps in your patterns and policies, add what’s missing, and repeat. We saw a perfect example of this earlier with the composition selector - the AI didn’t know which Crossplane compositions to use for different cloud providers because we hadn’t created that policy yet.

The system is only as good as the organizational knowledge you feed into it. But when you get it right, you’ll have AI that truly understands how your organization does infrastructure.

Destroy

Press ctrl+c twice to exit Claude Code

./dot.nu destroy

exit