AI Coding Agents Are Blind to Your Company Knowledge (Here's the Fix)
AI coding agents are incredibly capable. They can write code, generate configurations, and solve complex problems. But they have a blind spot. They only know what is publicly available. They have no idea how your company actually does things. Your internal standards, your custom abstractions, your architecture decisions, your policies. None of that exists in any AI model’s training data.
So what happens? Developers use AI, get perfectly reasonable output, and then spend time fixing it to match how things are actually done in their organization.
Today I am going to show you how to close that gap. We will start with the most common approaches people try, see why they fall short, and then build a proper solution using a RAG pipeline that gives AI access to your entire company knowledge base. The example we will use is Kubernetes deployments, but the concept applies to anything: coding standards, security policies, architecture patterns, onboarding docs, you name it.
Setup
This demo uses Claude Code as the coding agent. With a few modifications, it should work with any other coding agent, such as Cursor or GitHub Copilot. The main change you might need to make is to convert .mcp-kubernetes.json to whichever format and location your agent expects for its MCP config.
If you don’t have Claude Code already and would like to install it, please follow the Setup Claude Code instructions.
git clone https://github.com/vfarcic/dot-ai
cd dot-ai
git pull
git fetch
git switch demo/knowledgebase
Make sure that Docker is up and running. We’ll use it to create a KinD cluster.
Watch Nix for Everyone: Unleash Devbox for Simplified Development if you are not familiar with Devbox. Alternatively, you can skip Devbox and install all the tools listed in devbox.json yourself.
devbox shell
./dot.nu setup \
--stack-version 0.45.0 \
--kyverno-enabled false \
--atlas-enabled false
source .env
claude --strict-mcp-config
AI Without Company Context
Let’s start with a simple test. We’ll ask an AI coding agent to create Kubernetes manifests for an application. Nothing fancy. Just a straightforward request.
[user]
Create manifests for the Kubernetes app with the image ghcr.io/vfarcic/silly-demo:v1.5.235 in the ./tmp dir. It should be accessible through Ingress host silly-demo.127.0.0.1.nip.io.
[agent]
⏺ Write(/Users/viktorfarcic/code/dot-ai/tmp/deployment.yaml)
  ⎿  Wrote 22 lines to /Users/viktorfarcic/code/dot-ai/tmp/deployment.yaml
     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: silly-demo
       labels:
         app: silly-demo
     spec:
       replicas: 1
       selector:
         matchLabels:
           app: silly-demo
       template:
         metadata:
           labels:
             app: silly-demo
         spec:
           containers:
             - name: silly-demo
               image: ghcr.io/vfarcic/silly-demo:v1.5.235
               ports:
                 - containerPort: 8080
  ⎿  Running PostToolUse hooks…
⏺ Write(/Users/viktorfarcic/code/dot-ai/tmp/service.yaml)
  ⎿  Wrote 13 lines to /Users/viktorfarcic/code/dot-ai/tmp/service.yaml
     apiVersion: v1
     kind: Service
     metadata:
       name: silly-demo
       labels:
         app: silly-demo
     spec:
       selector:
         app: silly-demo
       ports:
         - port: 80
           targetPort: 8080
  ⎿  Running PostToolUse hooks…
⏺ Write(/Users/viktorfarcic/code/dot-ai/tmp/ingress.yaml)
  ⎿  Wrote 19 lines to /Users/viktorfarcic/code/dot-ai/tmp/ingress.yaml
     apiVersion: networking.k8s.io/v1
     kind: Ingress
     metadata:
       name: silly-demo
       labels:
         app: silly-demo
     spec:
       rules:
         - host: silly-demo.127.0.0.1.nip.io
           http:
             paths:
               - path: /
                 pathType: Prefix
                 backend:
                   service:
                     name: silly-demo
                     port:
                       number: 80
  ⎿  Running PostToolUse hooks…
⏺ Created three manifests in ./tmp/:   11:00 AM claude-opus-4-6
  - deployment.yaml - Deploys ghcr.io/vfarcic/silly-demo:v1.5.235 with port 8080
  - service.yaml - ClusterIP service routing port 80 → 8080
  - ingress.yaml - Ingress with host silly-demo.127.0.0.1.nip.io routing to the service
  Apply them with:
  kubectl apply -f ./tmp/deployment.yaml -f ./tmp/service.yaml -f ./tmp/ingress.yaml
AI created a Deployment, a Service, and an Ingress. That’s perfectly reasonable work. Any Kubernetes tutorial would teach you exactly this. The problem is, that’s not how we deploy apps in this organization. We have internal standards, custom abstractions, and specific ways of doing things that no public AI model could possibly know about.
And that’s the core issue. AI models only know public knowledge. They have no idea about your company’s internal docs, architecture decisions, runbooks, or policies. As far as they’re concerned, the standard Kubernetes way is the only way.
Now, there is already a company document that specifies exactly how we should deploy apps. The question is, how do we get that knowledge into AI? Let’s try a few approaches, starting with the most common one.
The first approach is what I call tribal knowledge. Someone on the team, who may or may not have actually read the docs, types what they remember. It’s the “just tell AI what you know” approach. So let’s do exactly that. We’ll tell AI that our company uses a custom resource for deployments.
[user]
In our company, we use apps.devopstoolkit.live for deploying apps to Kubernetes. The CRD should be in the cluster we're connected to.
[agent]
⏺ Let me look up the CRD to understand the custom resource schema.   11:15 AM claude-opus-4-6
⏺ Bash(kubectl get crd -l "group=apps.devopstoolkit.live" 2>/dev/null || kubectl get crd | grep devopstoolkit 2>/dev/null)
  ⎿  (No content)
⏺ Bash(kubectl get crd --no-headers 2>/dev/null | grep -i devops)
  ⎿  apps.devopstoolkit.live   2026-02-09T15:44:01Z
...
⏺ Bash(kubectl get crd apps.devopstoolkit.live -o json | jq '.spec.versions[0].schema.openAPIV3Schema')
...
⏺ Bash(kubectl get crd apps.devopstoolkit.live -o json | jq '{group: .spec.group, kind: .spec.names.kind, plural: .spec.names.plural, version: .spec.versions[0].name}')
...
⏺ Now I understand the CRD. Let me replace the three separate manifests with a single custom resource.   11:15 AM claude-opus-4-6
⏺ Write(/Users/viktorfarcic/code/dot-ai/tmp/app.yaml)
  ⎿  Wrote 10 lines to /Users/viktorfarcic/code/dot-ai/tmp/app.yaml
     apiVersion: devopstoolkit.live/v1beta1
     kind: App
     metadata:
       name: silly-demo
     spec:
       image: ghcr.io/vfarcic/silly-demo
       tag: v1.5.235
       host: silly-demo.127.0.0.1.nip.io
       port: 8080
  ⎿  Running PostToolUse hooks…
⏺ Bash(rm /Users/viktorfarcic/code/dot-ai/tmp/deployment.yaml /Users/viktorfarcic/code/dot-ai/tmp/service.yaml /Users/viktorfarcic/code/dot-ai/tmp/ingress.yaml)
  ⎿  (No content)
⏺ Replaced the three separate manifests with a single ./tmp/app.yaml using the App CRD from the cluster. It sets:   11:16 AM claude-opus-4-6
  - image: ghcr.io/vfarcic/silly-demo
  - tag: v1.5.235
  - host: silly-demo.127.0.0.1.nip.io
  - port: 8080
  Apply it with:
  kubectl apply -f ./tmp/app.yaml
AI figured it out. It found the CRD in the cluster, inspected the schema, and generated a proper App resource instead of raw Deployments and Services. That worked, but only because the person typing happened to remember the right CRD name. What if they got the API group wrong? What if they forgot about optional fields like scaling or database configuration? What if a new team member has never even heard of this CRD? Tribal knowledge is unreliable, incomplete, and doesn’t scale. You can’t expect every developer on the team to memorize every internal standard.
So let’s try a second approach. Instead of typing from memory, we show AI the actual documentation. In this case, we have a Markdown file that describes the App CRD in detail. Let’s take a look at what’s in it.
Do not copy and paste the leading ! from the instruction that follows. If using Claude Code, ! must be typed to indicate that it should execute a command.
[user]
!cat examples/knowledgebase/apps-devopstoolkit-live.md
[agent]
All applications deployed to Kubernetes MUST use the `apps.devopstoolkit.live` Custom
Resource Definition (CRD). Direct usage of Kubernetes Deployments, StatefulSets, or other
built-in workload resources for application deployments is not permitted.
## Required CRD
- **API Group**: `devopstoolkit.live`
- **Version**: `v1beta1`
- **Kind**: `App`
This is a Crossplane Composite Resource (XR). It abstracts away underlying Kubernetes
workload resources (Deployments, Services, Ingress, HPA) into a single declarative
interface.
## Spec Fields
### Required
| Field | Type | Description |
|---------|--------|------------------------------------------------|
| `image` | string | The container image (e.g., `ghcr.io/vfarcic/silly-demo`) |
| `tag` | string | The tag of the image (e.g., `1.2.3`) |
### Optional
| Field                 | Type    | Default              | Description                                   |
|-----------------------|---------|----------------------|-----------------------------------------------|
| `host`                | string  | `devopstoolkit.live` | The host address of the application           |
| `port`                | integer | `80`                 | The application port                          |
| `ingressClassName`    | string  | `contour`            | The Ingress class name                        |
| `scaling.enabled`     | boolean | `false`              | Whether to enable scaling                     |
| `scaling.min`         | integer | `1`                  | Minimum number of replicas                    |
| `scaling.max`         | integer | `10`                 | Maximum number of replicas                    |
| `db.name`             | string  |                      | The name of the database                      |
| `db.secret`           | string  |                      | The name of the Secret with DB authentication |
| `db.storeName`        | string  |                      | The name of the secret store to use           |
| `frontend.backendUrl` | string  |                      | The URL of the backend service                |
## Example
```yaml
apiVersion: devopstoolkit.live/v1beta1
kind: App
metadata:
  name: silly-demo
  namespace: a-team
spec:
  image: ghcr.io/vfarcic/silly-demo
  tag: "1.4.307"
  port: 8080
  host: silly-demo.devopstoolkit.live
  scaling:
    enabled: true
    min: 2
    max: 5
```
## Rationale
The `apps.devopstoolkit.live` CRD provides a standardized abstraction for application
deployments that:
- Enforces organizational conventions (naming, labels, annotations) automatically
- Integrates with Crossplane compositions for multi-cloud portability
- Simplifies application manifests by hiding infrastructure complexity
- Ensures consistent networking, scaling, and observability configuration
- Enables policy enforcement through a single resource type
## What NOT to Do
Do not deploy applications using raw Kubernetes resources such as:
- `Deployment` (`apps/v1`)
- `StatefulSet` (`apps/v1`)
- `ReplicaSet` (`apps/v1`)
- `Pod` (`v1`)
These resources should only be created as children of the `App` CRD by the Crossplane
composition, never directly by users.
That is a proper document. It has the API group, the version, required and optional fields, examples, and even a “what not to do” section. Now, we could copy-paste this into every AI conversation, or point the agent to the file location. That is better than tribal knowledge because AI gets the actual spec, not someone’s fuzzy memory of it. But it is still manual. You have to know which doc to include, add it to every conversation, and hope it is up to date.
And here is the thing. Company knowledge is not just a handful of docs. It is scattered across Git repos, Notion pages, Slack threads, Zoom transcripts, wikis, and who knows where else. There could be thousands of documents, often with overlapping or even contradictory information spread across multiple places. Copy-pasting does not scale.
RAG Pipeline Explained
So we need a third approach, and this is the real solution. Vector databases and embeddings. If those words sound intimidating, do not worry. The concept is simpler than it sounds.
It starts with chunking. You take your documents and break them into smaller pieces. Why? Because a single document might cover ten different topics, and when AI asks a question, you do not want to feed it everything. You want just the relevant parts. Chunks are those relevant parts.
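To make chunking a bit more concrete, here is a minimal sketch that splits a Markdown document at its headings so that each chunk covers one topic. The function name chunk_markdown and the heading-based strategy are assumptions made for this illustration; the DevOps AI Toolkit may well chunk documents differently.

```python
import re


def chunk_markdown(text: str) -> list[str]:
    """Split Markdown into chunks, starting a new chunk at each heading."""
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        # A heading line (e.g. "## Spec Fields") closes the previous chunk.
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that ended up empty (e.g. only blank lines).
    return [chunk for chunk in chunks if chunk]


doc = """All applications MUST use the App CRD.

## Required CRD
- Kind: App

## Rationale
Enforces organizational conventions.
"""

for number, chunk in enumerate(chunk_markdown(doc), start=1):
    print(f"--- chunk {number} ---")
    print(chunk)
```

Splitting at headings is only one option. Fixed-size windows with some overlap are another common choice when documents have little structure.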
Next, you create embeddings. An embedding is just a way of turning text into a list of numbers that captures its meaning. Think of it as a fingerprint for a piece of text. Two chunks that talk about the same concept will have similar fingerprints, even if they use completely different words.
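The "similar fingerprints" idea is usually measured with cosine similarity. The sketch below uses hand-written three-dimensional vectors as stand-ins for embeddings. The numbers are invented for illustration; a real embedding model would produce them, with hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, made up for this example.
deploy_question = [0.9, 0.1, 0.0]  # "How should I deploy my app?"
app_crd_chunk = [0.8, 0.2, 0.1]    # "All applications must use the App CRD"
secrets_chunk = [0.1, 0.1, 0.9]    # "Rotate API keys every 90 days"

related = cosine_similarity(deploy_question, app_crd_chunk)
unrelated = cosine_similarity(deploy_question, secrets_chunk)
print(f"question vs App CRD chunk: {related:.2f}")
print(f"question vs secrets chunk: {unrelated:.2f}")
```

The question and the App CRD chunk share no words, yet their vectors point in nearly the same direction, so they score much higher than the unrelated pair.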
Those embeddings get stored in a vector database. A vector database is optimized for one thing: finding items that are similar to a given input. You give it a query, and it finds the chunks whose meaning is closest to what you asked.
If you have heard the term RAG, or Retrieval-Augmented Generation, this is exactly what it is.
There are two pipelines at play here. The ingestion pipeline takes your (1) documents, breaks them into (2) chunks, creates (3) embeddings, and stores them in the (4) vector database.
The retrieval pipeline starts when a (5) user asks a question. That query (6) goes to a (7) coding agent, which sends a (8) request to an (9) MCP server. The MCP server converts it into an (10) embedding, performs a (11) similarity search against the vector database, and retrieves the (12) relevant chunks. Those chunks (13) flow back to the coding agent, which sends (14) the original query along with the chunks to the (15) LLM. The LLM generates an answer using both its general knowledge and your private context, and sends (16) that answer back to the coding agent, which (17) presents it to the user.
It is not matching keywords. It is matching meaning. So even if you ask “how should I deploy my app?” and the document says “all applications must use the App CRD,” the system understands those are related.
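Steps (10) through (14) of the retrieval pipeline can be sketched in a few lines. Everything here is a simplified assumption: the vectors are invented, fake_embed is a lookup table standing in for a real embedding model, the "vector database" is a plain list, and the function names are hypothetical rather than taken from the DevOps AI Toolkit.

```python
import math
from typing import Callable

Vector = list[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(query: str, embed: Callable[[str], Vector],
             index: list[tuple[Vector, str]], top_k: int = 1) -> list[str]:
    query_vector = embed(query)  # (10) turn the query into an embedding
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[0]),
                    reverse=True)  # (11) similarity search
    return [text for _, text in ranked[:top_k]]  # (12) relevant chunks


def build_prompt(query: str, chunks: list[str]) -> str:
    # (14) the original query travels to the LLM together with the chunks
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Use this company context:\n{context}\n\nQuestion: {query}"


# Hypothetical precomputed vectors standing in for a real embedding model.
FAKE_VECTORS = {"How should I deploy my app?": [0.9, 0.1, 0.0]}


def fake_embed(text: str) -> Vector:
    return FAKE_VECTORS[text]


index = [
    ([0.8, 0.2, 0.1], "All applications must use the App CRD."),
    ([0.1, 0.1, 0.9], "Rotate API keys and tokens at least every 90 days."),
]

query = "How should I deploy my app?"
chunks = retrieve(query, fake_embed, index)
print(build_prompt(query, chunks))
```

The sorted list is a brute-force search over every stored vector, which is fine for a handful of chunks. Vector databases typically rely on approximate nearest-neighbor indexes to keep that lookup fast at scale.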
graph LR
    subgraph Ingestion Pipeline
        A["(1) Documents"] --> B["(2) Chunks"]
        B --> C["(3) Embeddings"]
        C --> D["(4) Vector DB"]
    end
    subgraph Retrieval Pipeline
        E["(5) User"] -->|"(6) query"| F["(7) Coding Agent"]
        F -->|"(8) request"| G["(9) MCP Server"]
        G --> H["(10) Embedding"]
        H --> I["(11) Similarity Search"]
        I --> D
        D --> J["(12) Relevant Chunks"]
        J --> G
        G -->|"(13) chunks"| F
        F -->|"(14) query + chunks"| K["(15) LLM"]
        K -->|"(16) answer"| F
        F -->|"(17) answer"| E
    end
    style A fill:#3b82f6,stroke:#333,color:#fff
    style B fill:#3b82f6,stroke:#333,color:#fff
    style C fill:#3b82f6,stroke:#333,color:#fff
    style D fill:#22c55e,stroke:#333,color:#fff
    style E fill:#a78bfa,stroke:#333,color:#fff
    style F fill:#a78bfa,stroke:#333,color:#fff
    style G fill:#06b6d4,stroke:#333,color:#fff
    style H fill:#a78bfa,stroke:#333,color:#fff
    style I fill:#a78bfa,stroke:#333,color:#fff
    style J fill:#a78bfa,stroke:#333,color:#fff
    style K fill:#f97316,stroke:#333,color:#fff
Knowledge Ingestion Setup
I added this as a proof of concept to DevOps AI Toolkit. It consists of an MCP server that handles the knowledge management side of things: chunking documents, creating embeddings, storing them, and performing semantic search. There is also a Kubernetes controller that automates document ingestion from Git repos using a custom resource called GitKnowledgeSource.
Why Kubernetes? Because this pipeline needs to run continuously, react to changes, and stay in sync with your actual documentation. Kubernetes gives you CRDs for declarative configuration, controllers for event-driven automation, and scheduled reconciliation. And it is already where your workloads live. Knowledge ingestion should be a first-class citizen in your cluster, not a cron job on someone’s laptop.
Let me show you how this works in practice. First, we need to ingest some documents. Here is the definition of a GitKnowledgeSource custom resource.
Exit Claude Code by typing exit or pressing ctrl+c twice.
cat examples/knowledgebase/git-knowledge-source.yaml
apiVersion: dot-ai.devopstoolkit.live/v1alpha1
kind: GitKnowledgeSource
metadata:
  name: company-standards
  namespace: dot-ai
spec:
  repository:
    url: https://github.com/vfarcic/dot-ai.git
    branch: main
  paths:
    - "examples/knowledgebase/**/*.md"
  mcpServer:
    url: http://dot-ai.dot-ai.svc:3456
    authSecretRef:
      name: dot-ai-secrets
      key: auth-token
It points to a Git repository, specifies which file paths to ingest using glob patterns, and tells the controller where the mcpServer is running. That is all you need. The controller clones the repo and sends new or changed documents to the server, which in turn chunks the documents, creates embeddings, and stores everything in the vector database. Let’s apply it.
kubectl --namespace dot-ai apply --filename examples/knowledgebase/git-knowledge-source.yaml
Here is the beautiful part. The GitKnowledgeSource runs on a schedule. It detects changes via git diff and only re-ingests what actually changed. Your knowledge base stays current without any manual effort. Let’s check the status.
kubectl --namespace dot-ai get gitknowledgesources
NAME                PHASE    ACTIVE   DOCUMENTS   LAST SYNC   ERRORS   AGE
company-standards   Synced   true     5           1m47s                2m12s
It synced 5 documents and there are no errors. Now, this is a demo, so five documents is all we have. In a real-world scenario, you would be ingesting hundreds or thousands of documents from multiple repositories, Notion exports, or wherever your company keeps its knowledge. The process is the same.
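The "only re-ingest what changed" behavior mentioned earlier can be sketched as follows. The controller detects changes via git diff; to keep this example self-contained, it compares content hashes instead, which is a substitute technique and not the controller's actual implementation. The function names and file paths are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(documents: dict[str, str],
              last_synced: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (paths to ingest, paths to delete) since the previous sync.

    documents:   path -> current content
    last_synced: path -> content hash recorded at the previous sync
    """
    to_ingest = sorted(
        path for path, text in documents.items()
        if last_synced.get(path) != content_hash(text)
    )
    to_delete = sorted(path for path in last_synced if path not in documents)
    return to_ingest, to_delete


# Hypothetical state recorded after the previous sync.
previous = {
    "docs/security-policies.md": content_hash("Containers must not run as root."),
    "docs/old-runbook.md": content_hash("Deprecated."),
}
# Hypothetical current state of the repository.
current = {
    "docs/security-policies.md": "Containers must not run as root.",  # unchanged
    "docs/naming-conventions.md": "Namespaces represent teams.",      # new
}

to_ingest, to_delete = plan_sync(current, previous)
print("ingest:", to_ingest)
print("delete:", to_delete)
```

Unchanged documents are skipped entirely, which matters because every re-ingested document means new chunks and new embedding calls.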
Now let’s start Claude Code again, this time with the MCP configuration that connects it to the DevOps AI Toolkit.
claude --mcp-config .mcp-kubernetes.json
Let’s first confirm that everything is up and running.
[user]
Show dot-ai status
[agent]
⏺ dot-ai - version (MCP)
  ⎿  {
       "status": "success",
       "system": {
     … +105 lines (ctrl+o to expand)
  ⎿  IMPORTANT: After presenting the main response data, also display the visualizationUrl field to the user.
⏺ dot-ai System Status: Healthy
┌─────────────┬────────────────────────────────────┐
│ Component   │ Status                             │
├─────────────┼────────────────────────────────────┤
│ Version     │ 1.1.0 (Node v22.22.0, linux/arm64) │
├─────────────┼────────────────────────────────────┤
│ Overall     │ healthy                            │
├─────────────┼────────────────────────────────────┤
│ Kubernetes  │ Connected (v1.35.0, in-cluster)    │
├─────────────┼────────────────────────────────────┤
│ AI Provider │ Anthropic (claude-haiku-4-5)       │
├─────────────┼────────────────────────────────────┤
│ Embedding   │ OpenAI text-embedding-3-small      │
├─────────────┼────────────────────────────────────┤
│ Vector DB   │ Connected                          │
├─────────────┼────────────────────────────────────┤
│ Tracing     │ Enabled (OTLP → Jaeger)            │
├─────────────┼────────────────────────────────────┤
│ Kyverno     │ Not installed                      │
└─────────────┴────────────────────────────────────┘
Vector DB Collections:
┌───────────────┬───────────┐
│ Collection    │ Documents │
├───────────────┼───────────┤
│ patterns      │ 0         │
├───────────────┼───────────┤
│ policies      │ 0         │
├───────────────┼───────────┤
│ capabilities  │ 347       │
├───────────────┼───────────┤
│ resources     │ 1,332     │
├───────────────┼───────────┤
│ knowledgeBase │ 13        │
└───────────────┴───────────┘
Plugins: 1 plugin (agentic-tools v1.0.0) with 32 tools
Capabilities: pattern-management, policy-intent-management, capability-scanning, semantic-search, ai-recommendations, kubernetes-integration
Visualization: http://dot-ai-ui.127.0.0.1.nip.io/v/ver-1770662892124-075b129c
The system is healthy. The vector database is connected and the knowledgeBase collection has 13 entries, which are the chunks created from our five documents.
Now let’s put it to the test. We will ask AI a question that requires private, company-specific knowledge.
[user]
What are best practices for running apps in Kubernetes. Use dot-ai MCP.
[agent]
⏺ Based on your organization's knowledge base, here are the best practices for running apps in Kubernetes:
---
1. Use the App CRD -- Not Raw Resources
All applications must be deployed using the App CRD (apps.devopstoolkit.live/v1beta1). Do
not create raw Deployments, StatefulSets, ReplicaSets, or Pods directly. The CRD is a
Crossplane Composite Resource that abstracts workloads into a single declarative interface
and enforces organizational conventions automatically.
apiVersion: devopstoolkit.live/v1beta1
kind: App
metadata:
  name: silly-demo
  namespace: a-team
spec:
  image: ghcr.io/vfarcic/silly-demo
  tag: "1.4.307"
  port: 8080
  host: silly-demo.devopstoolkit.live
  scaling:
    enabled: true
    min: 2
    max: 5
2. Container Security
- Containers must not run as root (runAsNonRoot: true)
- Drop all Linux capabilities; add back only what's needed
- Use read-only root filesystem (readOnlyRootFilesystem: true)
- Disable privilege escalation (allowPrivilegeEscalation: false)
3. Network Policies
- Every namespace must have a default-deny ingress NetworkPolicy
- Explicitly allow only required traffic sources per service
- Egress to the internet is denied by default (exceptions need platform team approval)
4. Secrets Management
- Never store secrets in Git, even encrypted
- Use External Secrets Operator with Vault for runtime injection
- Database credentials are managed automatically via the SQL CRD
- Rotate API keys and tokens at least every 90 days
5. Image Provenance
- Only images from approved registries: ghcr.io/vfarcic/, docker.io/library/,
registry.internal.company.com
- All images must have valid Cosign signatures
- Always use explicit version tags (e.g., 1.4.307) -- the latest tag is prohibited
6. Scaling per Environment
┌─────────────┬──────────────┬──────────────┬─────────┐
│ Environment │ Min Replicas │ Max Replicas │ DB Size │
├─────────────┼──────────────┼──────────────┼─────────┤
│ Dev         │ 1            │ 2            │ small   │
├─────────────┼──────────────┼──────────────┼─────────┤
│ Staging     │ 2            │ 5            │ medium  │
├─────────────┼──────────────┼──────────────┼─────────┤
│ Production  │ 3            │ 10           │ large   │
└─────────────┴──────────────┴──────────────┴─────────┘
Use Kustomize overlays (not Helm value overrides) for per-environment configuration.
7. Naming Conventions
- Namespaces represent teams (e.g., a-team, b-team)
- Resource names: lowercase with hyphens, max 63 chars, descriptive (e.g., orders-api)
- All resources must include app and tier labels
8. Databases
Use the SQL CRD (sqls.devopstoolkit.live/v1beta1) -- never provision databases directly via
cloud consoles, Terraform, or raw operators. Multi-cloud portability is handled by
switching the compositionRef.
9. GitOps Promotion
Promotion flows automatically through Argo CD: main merge -> CI build -> dev (auto) ->
staging (auto after tests) -> production (manual approval gate).
---
Sources:
- https://github.com/vfarcic/dot-ai/blob/main/examples/knowledgebase/apps-devopstoolkit-live.md
- https://github.com/vfarcic/dot-ai/blob/main/examples/knowledgebase/security-policies.md
- https://github.com/vfarcic/dot-ai/blob/main/examples/knowledgebase/naming-conventions.md
- https://github.com/vfarcic/dot-ai/blob/main/examples/knowledgebase/environments.md
- https://github.com/vfarcic/dot-ai/blob/main/examples/knowledgebase/databases.md
Look at that. AI pulled information from multiple internal documents and synthesized a comprehensive answer. It knows about the App CRD, the security policies, the naming conventions, the environment-specific scaling rules, and even the GitOps promotion flow. None of this is public knowledge. It all came from the vector database.
Now let’s go back to our original request and see what happens this time.
[user]
Create manifests for the Kubernetes app with the image ghcr.io/vfarcic/silly-demo:v1.5.235 in the ./tmp dir. It should be accessible through Ingress host silly-demo.127.0.0.1.nip.io.
[agent]
⏺ Read 1 file (ctrl+o to expand)
⏺ Based on the knowledge base best practices, the app should use the App CRD rather than raw Kubernetes resources.
⏺ Write(tmp/app.yaml)
  ⎿  Error writing file
⏺ Read 1 file (ctrl+o to expand)
⏺ There's already a file there with similar content. Let me update it with the namespace and scaling settings per org standards.
⏺ Write(tmp/app.yaml)
  ⎿  Added 7 lines, removed 2 lines
       kind: App
       metadata:
         name: silly-demo
     +   namespace: a-team
       spec:
         image: ghcr.io/vfarcic/silly-demo
     -   tag: v1.5.235
     -   host: silly-demo.127.0.0.1.nip.io
     +   tag: "v1.5.235"
         port: 8080
     +   host: silly-demo.127.0.0.1.nip.io
     +   scaling:
     +     enabled: true
     +     min: 2
     +     max: 5
  ⎿  Running PostToolUse hooks…
⏺ Written to ./tmp/app.yaml. Following the org's knowledge base standards:
  - Uses the App CRD (apps.devopstoolkit.live/v1beta1) instead of raw Deployments/Services/Ingress
  - Placed in the a-team namespace (approved team namespace)
  - Ingress host set to silly-demo.127.0.0.1.nip.io
  - Scaling enabled with staging-level defaults (min: 2, max: 5)
  - Explicit image tag (v1.5.235) -- no latest
This time, without anyone telling it anything, AI used the App CRD, set the namespace to a-team, enabled scaling with the correct staging-level defaults, and used an explicit image tag. It did all of this because it queried the knowledge base, found the relevant chunks, and applied the organization’s standards automatically. That is the difference between tribal knowledge and a proper RAG pipeline.
Now, what is next for DevOps AI Toolkit? The goal is to incorporate this knowledge directly into specialized agents for things like remediation, recommendations, and querying. Instead of just answering questions, AI would proactively apply organizational knowledge when performing tasks.
If you want to try this out, check out DevOps AI Toolkit. Star it, fork it, give it a spin.
RAG for Company Knowledge
Now, this is not without trade-offs. Let’s talk about what did not work so well.
RAG adds infrastructure complexity. You need a vector database, an embedding model, an MCP server, and a controller to keep everything in sync. That is a lot of moving parts compared to just copy-pasting a document into a chat window.
Chunk quality matters. If your documents are poorly structured or contradictory, the retrieval results will be unreliable. RAG retrieves what is closest in meaning, but it can still pull irrelevant chunks or miss important context if the knowledge base is incomplete or outdated. Garbage in, garbage out.
Staleness is a real concern. Even with automated ingestion, you need a mechanism to distinguish current documents from outdated ones. If old chunks stay in the vector database after a document is updated, AI might retrieve conflicting information. You need to make sure your pipeline replaces or removes stale entries, not just appends new ones.
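One way to picture "replace, not append" is a store that groups chunks by the document they came from, so that re-ingesting a document overwrites its old chunks. This is a toy in-memory sketch with hypothetical class and method names and invented document content, not how the DevOps AI Toolkit or any particular vector database implements it.

```python
class ChunkStore:
    """Toy store that groups chunks by their source document."""

    def __init__(self) -> None:
        self._chunks_by_source: dict[str, list[str]] = {}

    def replace_document(self, source: str, chunks: list[str]) -> None:
        # Overwrite instead of appending, so chunks from the previous
        # version of the document can no longer be retrieved.
        self._chunks_by_source[source] = list(chunks)

    def delete_document(self, source: str) -> None:
        # Called when a document disappears from the repository.
        self._chunks_by_source.pop(source, None)

    def all_chunks(self) -> list[str]:
        return [chunk for chunks in self._chunks_by_source.values()
                for chunk in chunks]


store = ChunkStore()
# Hypothetical first version of a document.
store.replace_document("environments.md", ["Production max replicas: 8"])
# The document is updated; the old chunk must not linger.
store.replace_document("environments.md", ["Production max replicas: 10"])
print(store.all_chunks())
```

With append-only ingestion, both versions would remain in the store and a query about production scaling could retrieve either one.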
Now for what worked well.
AI finally understands your organization as much as anyone can. Instead of producing generic output that violates internal standards, it applies your actual policies, conventions, and architecture decisions automatically. No one has to remember which CRD to use or what the scaling defaults are.
That being said, if your docs are a mess, knowledge ingested into AI will be a mess as well. Neither you nor AI can understand a mess.
The solution scales. Unlike tribal knowledge or copy-pasting docs, a RAG pipeline works whether you have five documents or five thousand. Add new docs, and they get ingested automatically.
And it works across all types of knowledge. We used Kubernetes deployments as the example, but this applies to coding standards, security policies, onboarding guides, runbooks, anything your organization has written down.
Destroy
Exit Claude Code by typing exit or pressing ctrl+c twice.
./dot.nu destroy
git switch main
exit