
Linting SOUL.md Files: Catch Prompt Bugs Before Your Users Do

I spent three days debugging an agent that would randomly switch between formal and casual tone mid-conversation. Turns out the SOUL.md had two conflicting instructions 40 lines apart. Line 12: "Always maintain a professional, formal tone." Line 53: "Be conversational and approachable, like talking to a friend." The agent was dutifully following both, alternating based on whichever instruction was closer to the current context window position.

This is a prompt bug. It's exactly like a code bug, except there's no compiler to catch it and no stack trace when it manifests. The agent just behaves weirdly and you have to manually read through the entire SOUL.md to figure out why.

SOUL.md files need linting, just like code. Here's what a linter catches and how to set one up.

## The most common SOUL.md bugs

I've reviewed about 200 SOUL.md files over the past year. The same bugs come up over and over.

### Contradictory instructions

This is the most common one and the hardest to spot manually. It's not always as obvious as "be formal" vs. "be casual." Sometimes it's:

- "Never make assumptions about the user's intent" combined with "Proactively suggest solutions based on the user's likely needs"
- "Keep responses concise, under 3 sentences" combined with "Always provide thorough explanations with examples"
- "Escalate to a human if you're unsure" combined with "Always provide an answer, never leave the user without a response"

Each instruction makes sense in isolation. Together, they create an agent that flips between behaviors unpredictably. The agent isn't broken. It's confused, in the same way a new employee would be confused if their manager gave them contradictory directions.

A linter detects these by building a semantic map of instructions and flagging pairs that pull in opposite directions. It won't catch every subtle contradiction, but it catches the obvious ones that humans miss because they wrote the instructions on different days.
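A real semantic map needs embedding-based comparison, but the core idea can be sketched with a hand-picked antonym table. Everything below (the word pairs, the function name) is illustrative, not ClawProd's actual rule set:

```python
# Toy contradiction check: flag instruction pairs that contain opposing
# keywords. The antonym table is a hand-picked illustration; a real
# linter would compare instructions semantically, not by substring.
ANTONYM_PAIRS = [
    ({"formal", "professional"}, {"casual", "conversational"}),
    ({"concise", "brief"}, {"thorough", "detailed"}),
]

def find_contradictions(instructions):
    """Return (i, j) index pairs of instructions that pull in opposite directions."""
    hits = []
    lowered = [text.lower() for text in instructions]
    for i, a in enumerate(lowered):
        for j in range(i + 1, len(lowered)):
            b = lowered[j]
            for left, right in ANTONYM_PAIRS:
                if (any(w in a for w in left) and any(w in b for w in right)) or \
                   (any(w in b for w in left) and any(w in a for w in right)):
                    hits.append((i, j))
                    break  # one flag per pair is enough
    return hits

instructions = [
    "Always maintain a professional, formal tone.",
    "Summarize the user's request before answering.",
    "Be conversational and approachable, like talking to a friend.",
]
print(find_contradictions(instructions))  # [(0, 2)]
```

Even this crude version catches the formal-vs-conversational bug from the opening anecdote; the payoff of the embedding-based version is catching pairs that don't share obvious keywords.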

### Missing constraints

A SOUL.md that says "Help users with their email" without specifying limits is asking for trouble. Help how? Draft emails? Read their inbox? Delete messages? Forward things to other people?

Missing constraints are permissions bugs. The agent will try to do whatever the user asks because nothing in the SOUL.md says it shouldn't. A linter checks for common unconstrained patterns:

- Action verbs without explicit scope ("manage," "handle," "process" without specifying what's in/out of bounds)
- Permission-adjacent capabilities without explicit permission rules
- User-facing features without error/fallback behavior defined
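The first of those checks is simple enough to sketch: flag any line that uses a broad action verb with no scoping language on the same line. Both word lists here are illustrative guesses, not a real linter's rules:

```python
# Sketch of an "unconstrained action" check. A line with a broad verb
# and no scope marker gets flagged; both lists are illustrative.
BROAD_VERBS = {"manage", "handle", "process", "help with"}
SCOPE_MARKERS = {"only", "never", "except", "must not", "read-only"}

def unconstrained_lines(lines):
    flagged = []
    for number, line in enumerate(lines, start=1):
        text = line.lower()
        if any(verb in text for verb in BROAD_VERBS) and \
           not any(marker in text for marker in SCOPE_MARKERS):
            flagged.append(number)
    return flagged

soul = [
    "Handle user scheduling requests.",
    "Read the calendar only; never create or delete events.",
]
print(unconstrained_lines(soul))  # [1]
```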

### Formatting issues that break parsing

OpenClaw's SOUL.md parser is more forgiving than it should be. It'll accept markdown that technically renders but doesn't parse the way you'd expect.

Common formatting bugs:

- **Nested headers that create wrong hierarchy.** A ## section inside another ## section doesn't create nesting. The parser treats them as siblings, which can change how instruction priority works.
- **Bullet points without blank lines before them.** Some markdown parsers handle this fine. OpenClaw's parser sometimes merges the preceding paragraph with the first bullet. Your carefully separated instruction becomes one run-on sentence.
- **Code blocks inside instructions.** If you include example formats in fenced code blocks, the parser might treat the contents as literal text the agent should output, not as a template to follow. Use description instead: "Format the response as a JSON object with fields: name, email, status" rather than a code block.
- **Curly braces and template syntax.** If you use {variable_name} as placeholder syntax, some OpenClaw versions try to interpolate it. Escape your braces or use a different placeholder format like [VARIABLE_NAME].
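Two of these checks are mechanical enough to sketch in a few lines. The rules below are simplified stand-ins for whatever the real parser requires:

```python
import re

# Simplified versions of two formatting checks: a bullet that directly
# follows a non-blank paragraph line, and {curly} placeholder syntax
# that some parser versions may try to interpolate.
def formatting_warnings(text):
    warnings = []
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.lstrip().startswith("- ") and index > 0:
            prev = lines[index - 1].strip()
            if prev and not prev.startswith("- "):
                warnings.append((index + 1, "bullet not preceded by blank line"))
        for match in re.finditer(r"\{\w+\}", line):
            warnings.append((index + 1, f"template-like placeholder {match.group()}"))
    return warnings

sample = "Rules for replies:\n- greet the user as {user_name}"
print(formatting_warnings(sample))
```

Running this on the two-line sample flags both bugs on line 2: the merged bullet and the interpolatable placeholder.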

### Dead instructions

Instructions that reference skills, capabilities, or configurations that don't exist anymore. "When using the email-v2 skill, always..." but you migrated to email-v3 six months ago. The agent sees the instruction, can't find the referenced skill, and either ignores the instruction entirely or gets confused trying to follow it.

A linter cross-references SOUL.md instructions against your installed skills and AGENTS.md configuration. If an instruction references something that doesn't exist, it flags it.
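The cross-reference check reduces to a set difference. The phrase pattern and skill names below are assumptions for illustration; a real implementation would read the skill list from AGENTS.md:

```python
import re

# Sketch of a dead-reference check: pull "the <name> skill" mentions out
# of the SOUL.md text and diff them against the installed skill list.
def dead_references(soul_text, installed_skills):
    mentioned = set(re.findall(r"the ([\w-]+) skill", soul_text))
    return sorted(mentioned - set(installed_skills))

soul_text = "When using the email-v2 skill, always confirm before sending."
print(dead_references(soul_text, ["email-v3", "calendar"]))  # ['email-v2']
```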

## Setting up a SOUL.md linter

### Option 1: ClawProd's built-in linter

If you're already using ClawProd for CI/CD, the SOUL.md linter runs as part of your pipeline. Push a change to your SOUL.md, and the linter checks for contradictions, missing constraints, formatting issues, and dead references before the agent deploys.

The linter output looks like this:

```
WARNING line 12 + line 53: Potential contradiction
  L12: "Always maintain a professional, formal tone"
  L53: "Be conversational and approachable"
  Suggestion: Choose one tone or define conditions for each

ERROR line 87: Dead reference
  "When using the email-v2 skill" - skill not found in AGENTS.md
  Installed email skills: email-v3

WARNING line 34: Unconstrained action
  "Handle user scheduling requests" - no boundaries defined
  Suggestion: Add explicit scope (e.g., "read calendar only" or "create events but never delete")
```

Warnings don't block deployment. Errors do. You can configure which checks are warnings vs. errors based on your risk tolerance.

### Option 2: Pre-commit hook

For a lighter-weight approach, run the linter as a pre-commit hook. Every time you commit a SOUL.md change, the linter runs locally and shows you issues before they reach your repo. Faster feedback loop, but you miss the cross-reference checks against your deployed configuration (since those need access to your live AGENTS.md and skill list).
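A minimal hook can just run the linter and block on errors. This sketch assumes the `clawprod lint` CLI shown later in this post; treat the exact flags and output format as assumptions, and save it as `.git/hooks/pre-commit` (executable):

```python
#!/usr/bin/env python3
# Sketch of a pre-commit hook: run the ClawProd linter and block the
# commit if the output contains ERROR lines. Warnings pass through.
import subprocess
import sys

def has_errors(lint_output):
    """True if any line of linter output is an ERROR (warnings don't block)."""
    return any(line.startswith("ERROR") for line in lint_output.splitlines())

def lint_blocks_commit():
    try:
        result = subprocess.run(
            ["clawprod", "lint", "soul.md", "--verbose"],
            capture_output=True, text=True,
        )
    except FileNotFoundError:
        return False  # linter not installed locally; don't block commits
    print(result.stdout)
    return has_errors(result.stdout)

if __name__ == "__main__":
    if lint_blocks_commit():
        print("pre-commit: SOUL.md has lint errors, commit blocked", file=sys.stderr)
        sys.exit(1)
```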

### Option 3: Manual review with the linter CLI

Not ready to automate? Run the linter manually when you make changes:

```
clawprod lint soul.md --verbose
```

This gives you the full report. Good for initial cleanup of an existing SOUL.md that's never been linted. I ran this on a client's SOUL.md last week and found 14 warnings and 3 errors in a 200-line file. Two of the errors were causing user-reported bugs they'd been trying to track down for a month.

## Writing lint-friendly SOUL.md files

A few practices that make your SOUL.md easier to lint and maintain:

**One instruction per paragraph.** Don't combine "be professional" and "never discuss competitors" in the same paragraph. Separate concerns make contradiction detection more accurate and make it easier to update one instruction without accidentally affecting another.

**Use consistent structure.** Group instructions by category: tone, scope, permissions, error handling, escalation. Consistent structure helps both the linter and human reviewers scan the file.

**Be specific about scope.** Instead of "help with email," write "read the user's last 10 inbox messages when asked, draft reply suggestions, but never send emails or modify existing messages without explicit user confirmation." Verbose, but unambiguous. The linter won't flag it as unconstrained.

**Version your SOUL.md.** Track changes in git like code. When the agent starts behaving differently, `git diff` on the SOUL.md is the fastest way to find what changed. The linter can also compare the current version against the last-deployed version to show you exactly which new issues your changes introduced.

Prompt engineering is engineering. Treat SOUL.md files with the same rigor you'd give application code, and your agents will behave more predictably.
