In short: A demo environment behaved differently from production because a Kubernetes CronJob existed only in the production overlay.
The infrastructure repo had little documentation, and the environment names were ambiguous: stage and staging both existed.
An AI agent helped me trace the CronJob, map stage to the demo environment, and identify the two IaC files that needed changing.
The useful part was not that the agent “knew Kubernetes.” It could inspect the repo faster than I could, compare patterns across overlays, and keep the environment-specific details visible while I reviewed the result.
Note: I’ve kept the high-level logic intact, but I’ve altered specific internal names and insights to respect project confidentiality.
I was working with a client that had 5 application repositories and one Kubernetes infrastructure-as-code repo.
The IaC repo managed application deployments, environment config, and operational automation through GitOps. It had very little documentation.
I had changed one of the applications and needed to test the behavior on the demo environment. The change depended on a CronJob called archive_lapsed_memberships.
Locally, I could run the task manually. In production, the CronJob existed. On demo, the behavior never happened.
I am not a Kubernetes specialist. Waiting for the part-time DevOps team would have meant a long turnaround, so I opened the IaC repo in my agent and started asking questions.
Finding the missing CronJob
The first step was locating the CronJob. I did not know whether it lived in the application repo, the GitOps repo, or some older Kubernetes directory.
Finding the CronJob was the easy part. The harder question was whether the demo environment used the same overlay. The repo had both stage and staging, which is not exactly an intention-revealing naming scheme.
Once the agent mapped stage to the demo domain, I had enough context to ask for the actual change.
What follows is a redacted chat I had with the agent. You can skip through the transcript sections if you only want the final change.
can you tell me where the "add archive_lapsed_memberships" cronjob is defined?
{"pattern":"archive_lapsed_memberships"}
gitops/taskhub-core/overlays/prod/timers.yaml
kubernetes/prod-cluster/default/timers.yaml
opslane-bundle.xml
bundle-output.xml
{"pattern":"archive_lapsed_memberships","path":"gitops/taskhub-core/overlays/prod/timers.yaml","output_mode":"content","-B":5,"-A":5}
427- command:
428- - bundle
429- - exec
430- - rake
431: - taskhub:archive_lapsed_memberships
432- envFrom:
433- - configMapRef:
434- name: env
435- env:
436- - name: APP_MASTER_KEY
{"pattern":"archive_lapsed_memberships","path":"gitops/taskhub-core/overlays/prod/timers.yaml","output_mode":"content","-B":20}
412-apiVersion: batch/v1
413-kind: CronJob
414-metadata:
415- name: archive-lapsed-membership
416-spec:
417- suspend: False # Enable when prod
418- schedule: "30 11 *"
419- jobTemplate:
420- spec:
421- template:
422- spec:
423- containers:
424- - name: archive-lapsed-membership
425- image: northstarlabs/taskhub:stable
426- imagePullPolicy: Always
427- command:
428- - bundle
429- - exec
430- - rake
431: - taskhub:archive_lapsed_memberships
The cronjob is defined in timers.yaml:412-431. It's a Kubernetes CronJob named archive-lapsed-membership that:
- Runs daily at 11:30 UTC (`30 11 *`)
- Is active (not suspended)
- Executes `bundle exec rake taskhub:archive_lapsed_memberships`
There's also an older copy at kubernetes/prod-cluster/default/timers.yaml.
does it run on all stages?
{"pattern":"archive_lapsed_memberships","path":"gitops/taskhub-core/overlays"}
gitops/taskhub-core/overlays/prod/timers.yaml
No — it only runs in prod. There's no definition for it in any other overlay (e.g., staging).
what overlays are defined for that app?
{"command":"ls /home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/","description":"List overlay directories"}
stage
staging
Three overlays: prod, stage, and staging.
The poor naming (I dare you to intuitively know the difference between stage and staging) drove my next question.
what's the difference between stage and staging?
{"command":"ls /home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/stage/","description":"List stage overlay contents"}
{"command":"ls /home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/staging/","description":"List staging overlay contents"}
settingsmap.yaml
cronjobs
cloud-secrets.enc.yaml
debugkit.yaml
gateway.yaml
manifest.yaml
hostbackupsecret.enc.yaml
patch-service.yaml
blob-sync-job.yaml
appsecret.yaml
taskhub-api-service.yaml
taskhub-hpa.yaml
processor-hpa.yaml
settings.yaml
cronjobs
cloud-secrets.enc.yaml
gateway.yaml
manifest.yaml
hostbackupsecret.enc.yaml
patch-service.yaml
appsecrets.yaml
taskhub-api-service.yaml
taskhub-hpa.yaml
taskhub-oneoff.yaml
processor-hpa.yaml
They're very similar — both are staging-like overlays with mostly the same files. Let me check the key differences.
{"pattern":"namespace|host","path":"/home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/stage/manifest.yaml","output_mode":"content"}
{"pattern":"namespace|host","path":"/home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/staging/manifest.yaml","output_mode":"content"}
The key difference is the namespace:
- stage → namespace `taskhub-core` (shares the default namespace)
- staging → namespace `taskhub-core-staging` (has its own dedicated namespace)
They appear to be two separate staging environments deployed to different Kubernetes namespaces. The staging overlay looks like a more isolated environment with its own namespace.
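For readers less familiar with Kustomize: an overlay's kustomization file can pin all of its resources to a namespace, and that is the difference the agent is describing here. A minimal sketch of what that implies, assuming the namespace is set in each overlay's manifest.yaml (the transcript does not show the actual file contents):

```yaml
# overlays/stage/manifest.yaml (sketch, not the real file)
kind: Kustomization
namespace: taskhub-core

# overlays/staging/manifest.yaml (sketch, not the real file)
kind: Kustomization
namespace: taskhub-core-staging    # dedicated, more isolated namespace
```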
The agent pointed out some differences in how those overlays were set up, but my focus wasn't on refactoring that. It was to understand which overlay the cronjob should be added to.
what domains are linked to those?
{"path":"/home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/stage/gateway.yaml"}
{"path":"/home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/staging/gateway.yaml"}
11→kind: Ingress
12→metadata:
13→ name: taskhub-core-ingress
14→ annotations:
15→ service.beta.kubernetes.io/linode-loadbalancer-default-proxy-protocol: v1
16→ nginx.ingress.kubernetes.io/proxy-body-size: "100m"
17→ nginx.org/client-max-body-size: "100m"
18→ cert-manager.io/cluster-issuer: acme-prod
19→ namespace: default
20→spec:
21→ ingressClassName: nginx
22→ tls:
23→ - hosts:
24→ - demo.taskhub-center.net
25→ - api.demo.taskhub-center.net
26→ secretName: k8s-check-taskhub-center-net
27→ rules:
28→ - host: demo.taskhub-center.net
29→ http:
30→ paths:
31→ - pathType: Prefix
32→ backend:
33→ service:
34→ name: app
35→ port:
36→ number: 3100
37→ path: /
38→ - host: api.demo.taskhub-center.net
39→ http:
40→ paths:
41→ - pathType: Prefix
42→ backend:
43→ service:
44→ name: edge-gateway
45→ port:
46→ number: 8180
47→ path: /
11→kind: Ingress
12→metadata:
13→ name: taskhub-core-staging-ingress
14→ annotations:
15→ service.beta.kubernetes.io/linode-loadbalancer-default-proxy-protocol: v1
16→ nginx.ingress.kubernetes.io/proxy-body-size: "100m"
17→ nginx.org/client-max-body-size: "100m"
18→ cert-manager.io/cluster-issuer: acme-prod
19→spec:
20→ ingressClassName: nginx
21→ tls:
22→ - hosts:
23→ - staging.taskhub-center.net
24→ - api.staging.taskhub-center.net
25→ secretName: k8s-check-taskhub-center-net
26→ rules:
27→ - host: staging.taskhub-center.net
28→ http:
29→ paths:
30→ - pathType: Prefix
31→ backend:
32→ service:
33→ name: app
34→ port:
35→ number: 3100
36→ path: /
37→ - host: api.staging.taskhub-center.net
38→ http:
39→ paths:
40→ - pathType: Prefix
41→ backend:
42→ service:
43→ name: edge-gateway
44→ port:
45→ number: 8180
46→ path: /
| Overlay | App Domain | API Domain |
|---|---|---|
| stage | demo.taskhub-center.net | api.demo.taskhub-center.net |
| staging | staging.taskhub-center.net | api.staging.taskhub-center.net |
So stage is the "demo" environment and staging is the actual staging environment.
So the archive_lapsed_memberships cronjob runs in prod but not on demo correct? How can I add it?
Correct — it only runs in prod. Let me check how cronjobs are set up in the stage overlay.
{"command":"ls /home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/stage/cronjobs/","description":"List stage cronjobs directory"}
{"command":"ls /home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/prod/cronjobs/ 2>/dev/null || echo \"no cronjobs subdirectory in prod\"","description":"Check if prod has cronjobs subdirectory"}
staff_movement_index.yaml
resend_verification.yaml
manifest.yaml
view_refreshers.yaml
link_scan.yaml
build_summary_refresh.yaml
taskhub_db_backup.yaml
{"path":"/home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/stage/cronjobs/manifest.yaml"}
11→kind: Kustomization
12→# metadata:
13→# name: taskhub-core--cronjobs
14→resources:
15→- attach_match_terms.yaml
16→# - scrapers.yaml
17→- build_summary_refresh.yaml
18→- resend_verification.yaml
19→- view_refreshers.yaml
20→- link_scan.yaml
21→- taskhub_db_backup.yaml
22→- staff_movement_index.yaml
Let me grab an existing cronjob from stage as a template.
{"path":"/home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/stage/cronjobs/resend_verification.yaml"}
11→kind: CronJob
12→metadata:
13→ name: resend-verification-mails-demo
14→ # namespace: default
15→spec:
16→ suspend: false
17→ schedule: "15 1 *"
18→ concurrencyPolicy: Forbid
19→ jobTemplate:
20→ spec:
21→ template:
22→ spec:
23→ containers:
24→ - name: resend-verification-mails
25→ image: northstarlabs/taskhub:demo
26→ imagePullPolicy: Always
27→ command:
28→ - bundle
29→ - exec
30→ - bin/rails
31→ - runner
32→ - "ResendVerificationMessageJob.perform_later"
33→ envFrom:
34→ - configMapRef:
35→ name: env
36→ env:
37→ - name: APP_SECRET_BASE
38→ valueFrom:
39→ secretKeyRef:
40→ name: app-secrets
41→ key: app_secret_base
42→ - name: APP_MASTER_KEY
43→ valueFrom:
44→ secretKeyRef:
45→ name: app-key
46→ key: demo.key
47→ - name: PRIMARY_DB_URL
48→ valueFrom:
49→ secretKeyRef:
50→ name: sql-url
51→ key: primary_url
52→ optional: false
53→ restartPolicy: OnFailure
54→ imagePullSecrets:
55→ - name: registry-auth
56→
Now let me check how the prod version looks for the env vars.
{"path":"/home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/prod/timers.yaml"}
412→apiVersion: batch/v1
413→kind: CronJob
414→metadata:
415→ name: archive-lapsed-membership
416→spec:
417→ suspend: False # Enable when prod
418→ schedule: "30 11 *"
419→ jobTemplate:
420→ spec:
421→ template:
422→ spec:
423→ containers:
424→ - name: archive-lapsed-membership
425→ image: northstarlabs/taskhub:stable
426→ imagePullPolicy: Always
427→ command:
428→ - bundle
429→ - exec
430→ - rake
431→ - taskhub:archive_lapsed_memberships
432→ envFrom:
433→ - configMapRef:
434→ name: env
435→ env:
436→ - name: APP_MASTER_KEY
437→ valueFrom:
438→ secretKeyRef:
439→ name: app-key
440→ key: prod.key
441→ optional: false
442→ - name: APP_SECRET_BASE
443→ valueFrom:
444→ secretKeyRef:
445→ name: app-secrets
446→ key: app_secret_base
447→ optional: false
448→ - name: PRIMARY_DB_URL
449→ valueFrom:
450→ secretKeyRef:
451→ name: sql-url
452→ key: primary_url
453→ optional: false
454→ restartPolicy: OnFailure
455→ imagePullSecrets:
456→ - name: registry-auth
457→
458→---
459→apiVersion: batch/v1
460→kind: CronJob
You need two changes:
- Create the cronjob YAML file
- Add it to the kustomization
{"file_path":"/home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/stage/cronjobs/archive_lapsed_memberships.yaml","content":"apiVersion: batch/v1\nkind: CronJob\nmetadata:\n name: archive-lapsed-membership\nspec:\n suspend: false\n schedule: \"30 11 * * *\"\n concurrencyPolicy: Forbid\n jobTemplate:\n spec:\n template:\n spec:\n containers:\n - name: archive-lapsed-membership\n image: northstarlabs/taskhub:demo\n imagePullPolicy: Always\n command:\n - bundle\n - exec\n - rake\n - taskhub:archive_lapsed_memberships\n envFrom:\n - configMapRef:\n name: env\n env:\n - name: APP_MASTER_KEY\n valueFrom:\n secretKeyRef:\n name: app-key\n key: demo.key\n optional: false\n - name: APP_SECRET_BASE\n valueFrom:\n secretKeyRef:\n name: app-secrets\n key: app_secret_base\n optional: false\n - name: PRIMARY_DB_URL\n valueFrom:\n secretKeyRef:\n name: sql-url\n key: primary_url\n optional: false\n restartPolicy: OnFailure\n imagePullSecrets:\n - name: registry-auth\n"}
{"replace_all":false,"file_path":"/home/devnode/dev_workspace/aurora/opslane/gitops/taskhub-core/overlays/stage/cronjobs/manifest.yaml","old_string":"- staff_movement_index.yaml","new_string":"- staff_movement_index.yaml\n- archive_lapsed_memberships.yaml"}
Done. I created archive_lapsed_memberships.yaml and added it to the manifest.yaml.
Key differences from the prod version:
- Uses image `northstarlabs/taskhub:demo` (instead of `:stable`)
- References `demo.key` for `APP_MASTER_KEY` (instead of `prod.key`)
- Adds `concurrencyPolicy: Forbid` (matching the stage convention)
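If you don't want to decode the escaped JSON in the write call, here is the same manifest rendered as YAML. The content is taken from that tool call; only the formatting is condensed, and the inline comments are mine:

```yaml
# gitops/taskhub-core/overlays/stage/cronjobs/archive_lapsed_memberships.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: archive-lapsed-membership
spec:
  suspend: false
  schedule: "30 11 * * *"          # 11:30 UTC daily, same time as prod
  concurrencyPolicy: Forbid        # stage convention, not present in the prod copy
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: archive-lapsed-membership
            image: northstarlabs/taskhub:demo    # demo tag instead of :stable
            imagePullPolicy: Always
            command: ["bundle", "exec", "rake", "taskhub:archive_lapsed_memberships"]
            envFrom:
            - configMapRef:
                name: env
            env:                   # same three secrets as prod, but the demo master key
            - name: APP_MASTER_KEY
              valueFrom:
                secretKeyRef: {name: app-key, key: demo.key, optional: false}
            - name: APP_SECRET_BASE
              valueFrom:
                secretKeyRef: {name: app-secrets, key: app_secret_base, optional: false}
            - name: PRIMARY_DB_URL
              valueFrom:
                secretKeyRef: {name: sql-url, key: primary_url, optional: false}
          restartPolicy: OnFailure
          imagePullSecrets:
          - name: registry-auth
```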
The change
The fix was two files:
- Add `gitops/taskhub-core/overlays/stage/cronjobs/archive_lapsed_memberships.yaml`
- Add that file to `gitops/taskhub-core/overlays/stage/cronjobs/manifest.yaml`

The demo CronJob used the demo image and demo secret:
- `northstarlabs/taskhub:demo` instead of `northstarlabs/taskhub:stable`
- `demo.key` instead of `prod.key`
- `concurrencyPolicy: Forbid`, matching the existing stage CronJob convention
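For completeness, this is what the stage cronjobs kustomization looks like after the change (resource list as shown in the transcript, with the new line appended). Kustomize only renders resources that are listed here, which is why the second edit matters:

```yaml
# gitops/taskhub-core/overlays/stage/cronjobs/manifest.yaml (after the change)
kind: Kustomization
resources:
- attach_match_terms.yaml
# - scrapers.yaml
- build_summary_refresh.yaml
- resend_verification.yaml
- view_refreshers.yaml
- link_scan.yaml
- taskhub_db_backup.yaml
- staff_movement_index.yaml
- archive_lapsed_memberships.yaml   # new entry
```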
In the end I was able to commit the infrastructure-as-code changes and run the same expiration task on demo that production already had.
What I still had to review
I did not blindly commit the generated YAML.
The agent helped identify the shape of the change, but I still had to check:
- whether `stage` was really the demo environment
- whether the image tag matched the environment
- whether the secret keys were environment-specific
- whether the CronJob belonged in the stage `cronjobs/manifest.yaml`
- whether the schedule made sense for testing
The agent reduced the search space. It did not remove the need to understand what I was changing.
Why this worked
This worked because infrastructure-as-code is still code. The answer was already encoded in file paths, YAML manifests, overlay names, image tags, secret keys, and existing CronJob conventions.
The agent was useful at four things:
- finding where a concept was defined
- comparing environment overlays
- spotting repo conventions from nearby files
- turning those conventions into a small patch
That is a good fit for an agent. It is not Kubernetes magic. It is repository navigation plus pattern matching, with a human checking the assumptions.
This conversation happened inside the infrastructure-as-code repository. If you do not have the repo checked out, you can still get a weaker first pass by giving the agent a repomix file of the infrastructure-as-code repo. The workflow is better when the agent can inspect the actual files.
There was another time when I asked the agent to investigate an environment variable used to connect to a Google Drive folder that failed to operate on demo. The agent pointed me to the problem: a typo in the Drive URL caused the feature to fail on demo.
Documenting infrastructure
The same workflow applies beyond one CronJob.
Once the agent had proven useful for a narrow operational question, I used it for a broader one: what is even in this infrastructure repo?
I asked the agent to create an architecture diagram of the infrastructure-as-code repo, then used the response to create a Mermaid chart.
Here is the prompt I used. I explicitly asked for all the application names and moving parts I knew about:
I need to create a comprehensive architecture diagram. Please explore the repository to map out ALL the applications, infrastructure components, and their dependencies/connections.
Specifically, I need to understand:
- All Applications and their components:
- AppA (main app, workers, API gateway)
- AppB (backend, celery workers, beat)
- AppC
- AppD
- AppE (workers, flower)
- AppF
- Any other apps
- Databases - which app uses which database:
- Look at database connection strings, secrets, environment variables
- Map PostgreSQL clusters to their consuming applications
- Message brokers:
- RabbitMQ - which apps use it
- Redis - which apps use it (for caching, sessions, or as broker)
- External services:
- S3/Linode storage
- Google Sheets integrations
- Slack webhooks
- Sentry
- New Relic
- Any APIs
- Internal service communication:
- Which services talk to each other
- API gateway routing
- Internal service URLs
- Infrastructure components:
- Ingress/load balancers
- Cert-manager
- ArgoCD
- OpenSearch/logging
- Monitoring (Prometheus, Grafana)
Look at:
- gitops/*/base and overlays for deployments and secrets
- kubernetes/ manifests
- helm/ values files
- argocd/ application definitions
Provide detailed findings about all connections and dependencies.
Closing
Agents are useful in infrastructure-as-code repos when the answer is already somewhere in the repo, but the repo is too undocumented or sprawling to navigate quickly.
They help with search, comparison, and pattern extraction. They can tell you that stage maps to the demo domain, that production has a CronJob missing from demo, and that stage CronJobs use a different image tag and secret key.
They cannot know whether your team intended the job to run on demo. They cannot know whether stage and staging should both exist. They cannot replace reviewing the patch.
For me, that is still useful. The agent turned an undocumented Kubernetes question into a small infrastructure-as-code change I could review.