Most teams drop an AI coding assistant into their workflow and call it done. They get autocomplete, they get the occasional boilerplate generation, and after six months their “AI-assisted development” looks identical to what they had before — except the engineers are slightly faster at writing tests they already knew how to write.
The problem is not the tool. It is the absence of a designed workflow. Hermes Agent was built for a different model: one where the agent accumulates knowledge about your codebase, not just the training distribution it shipped with. That distinction only pays off if you structure how you interact with it. This post is that structure.
I will cover how to pre-seed Hermes with the skills that matter for software development, how to frame coding tasks for best results, how to teach the agent your codebase conventions, when to reach for Loop 8 parallel sub-agents, and how to measure whether any of it is working.
The Baseline: What Hermes Needs to Know Before the First Commit
Before you write a single line of code with Hermes, run a setup pass. The goal is to give Loop 6 (skill retrieval) something useful to find from day one. Without this, Hermes is drawing from its training distribution — which includes every style and pattern ever written in your language, not the subset your team has agreed on.
The five skills worth creating upfront:
1. Code style skill — captures linting rules, naming conventions, and patterns your codebase enforces.
# ~/.hermes/skills/code-style-typescript.md
---
name: TypeScript Code Style
tags: [lang/typescript, domain/code-style, scope/all-tasks]
---
- Use `const` by default; `let` only when reassignment is required
- Prefer explicit return types on exported functions
- No `any` — use `unknown` with type guards or a named interface
- Error handling: always wrap external I/O in try/catch; log with `logger.error({ err }, 'message')`
- Import order: stdlib → third-party → internal (`~/`) → relative
- File naming: kebab-case for modules, PascalCase for React components
2. Architecture patterns skill — records the decisions your team has made about layering, service boundaries, and data flow.
# ~/.hermes/skills/architecture-patterns.md
---
name: Service Architecture Patterns
tags: [domain/architecture, scope/backend, scope/all-tasks]
---
- HTTP handlers live in `src/handlers/`; they parse input and delegate to services
- Services in `src/services/` contain business logic; they do not import from `src/handlers/`
- DB queries in `src/db/`; services import from `src/db/`, not directly from the ORM
- Use command/query separation: mutations return `{ ok: true }` or throw; reads return typed data
- Feature flags via `src/flags.ts` — no inline env var reads outside that module
3. Test writing skill — documents which test library you use, how you structure fixtures, and what coverage gates you enforce.
# ~/.hermes/skills/test-patterns.md
---
name: Test Writing Patterns
tags: [domain/testing, lang/typescript, scope/all-tasks]
---
- Framework: Vitest; file naming `*.test.ts` co-located with source
- Use `vi.mock()` for modules, `vi.spyOn()` for methods — never mutate globals
- Factory helpers in `tests/factories/`; each factory returns a typed object with sane defaults
- Unit tests: pure functions only, no DB or network. Integration tests: use test DB via `createTestDb()`
- Arrange / Act / Assert structure in every test block — no assertions in beforeEach
- Coverage gate: 80% line coverage enforced in CI via `--coverage --coverage.thresholds.lines 80`
4. PR description skill: the template your team uses, so that Hermes writes PR descriptions that pass review without back-and-forth.
# ~/.hermes/skills/pr-description-template.md
---
name: PR Description Template
tags: [domain/git, domain/review, scope/all-tasks]
---
## What changed
One paragraph, past tense, no jargon.
## Why
Link to ticket or one-sentence motivation.
## Testing done
Bullet list of: unit tests added/updated, manual test steps, edge cases checked.
## Deployment notes
Any migration, flag, or config change needed. If none, write "None."
5. Review checklist skill — the things you want flagged before a commit goes up.
# ~/.hermes/skills/review-checklist.md
---
name: Pre-commit Review Checklist
tags: [domain/review, domain/git, scope/all-tasks]
---
Before committing, verify:
- [ ] No hardcoded secrets or API keys
- [ ] Error paths are handled and logged
- [ ] New public functions have JSDoc comments
- [ ] No new `// TODO` without a ticket reference
- [ ] `console.log` debug lines removed
- [ ] Migration files include both `up` and `down` methods
Save these to ~/.hermes/skills/ before your first session. Loop 6 will load the relevant ones at task start based on tag matching.
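How Loop 6 matches tags internally is not specified here, so treat the following as a mental model rather than Hermes's implementation. It is a minimal TypeScript sketch that parses the front matter format used in the skill files above and ranks skills by how many tags they share with a task. The names `parseSkill` and `rankSkills`, and the idea of describing a task by a list of tags, are assumptions made for illustration.

```typescript
// Hypothetical shape: only the fields visible in the skill files above.
interface Skill {
  name: string;
  tags: string[];
}

// Parse the `---` delimited header with `name:` and `tags: [a, b]` lines.
function parseSkill(source: string): Skill {
  const header = /^---\n([\s\S]*?)\n---$/m.exec(source);
  if (!header) {
    throw new Error("skill file has no front matter");
  }
  const body = header[1] ?? "";
  const name = /^name:\s*(.+)$/m.exec(body)?.[1]?.trim() ?? "";
  const tagList = /^tags:\s*\[(.*)\]\s*$/m.exec(body)?.[1] ?? "";
  const tags = tagList
    .split(",")
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0);
  return { name, tags };
}

// Assumed matching rule: a skill is relevant if it shares at least one tag
// with the task; skills with more shared tags come first.
function rankSkills(library: Skill[], taskTags: string[]): Skill[] {
  const wanted = new Set(taskTags);
  return library
    .map((skill) => ({
      skill,
      overlap: skill.tags.filter((tag) => wanted.has(tag)).length,
    }))
    .filter((entry) => entry.overlap > 0)
    .sort((a, b) => b.overlap - a.overlap)
    .map((entry) => entry.skill);
}
```

Under this model a skill with no tag overlap is never loaded, which is one reason to keep the tag vocabulary small and consistent across skills.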
The Ideal Coding Loop
Here is the full development cycle I run with Hermes, from task intake to merged PR:
flowchart TD
A[Ticket / task description] --> B[Hermes: understand task\nand load relevant skills]
B --> C[Write implementation plan\nas inline comments]
C --> D{Plan approved?}
D -- No --> E[Clarify requirements\nor refine plan]
E --> C
D -- Yes --> F[Hermes: generate implementation]
F --> G[Hermes: generate tests\nin parallel via Loop 8]
F --> H[Hermes: self-review\nvia review-checklist skill]
G --> I[Run test suite]
H --> I
I --> J{All tests pass?}
J -- No --> K[Hermes: diagnose failure\nand patch]
K --> I
J -- Yes --> L[Hermes: generate PR description]
L --> M[Push branch and open PR]
M --> N[Loop 3: save new patterns\nas skills]
N --> O[Human review and merge]
The loop has three properties that compound over time. First, the plan-before-code step at C catches the most expensive class of error (building the wrong thing) before any implementation is written. Second, the parallel test generation at G means you do not pay for test coverage with extra turns. Third, the skill save at N means the next similar task starts with the patterns from this one already loaded.
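The test-and-patch cycle in the diagram (I, J, K) has no exit other than a green test run. The following is a small sketch of that cycle with an attempt cap added, so a stuck patch loop hands control back to you. The cap, the `runUntilGreen` name, and both callbacks are my assumptions; they stand in for whatever runs your test suite and asks Hermes for a patch.

```typescript
// Hypothetical result shape; a real runner would wrap your test command.
interface TestResult {
  passed: boolean;
  failures: string[];
}

interface LoopOutcome {
  passed: boolean;
  attempts: number;
}

// Run tests, patch on failure, and stop after maxAttempts test runs.
function runUntilGreen(
  runTests: () => TestResult,
  patch: (failures: string[]) => void,
  maxAttempts: number,
): LoopOutcome {
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts++;
    const result = runTests();
    if (result.passed) {
      return { passed: true, attempts };
    }
    // Only patch if another test run will follow.
    if (attempts < maxAttempts) {
      patch(result.failures);
    }
  }
  return { passed: false, attempts };
}
```

When the outcome comes back with `passed: false`, that is the signal to step in rather than let the agent keep patching.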
How to Frame Coding Tasks
The framing of a task prompt is the highest-leverage thing you control. Four rules that consistently improve output quality:
Give context about where, not just what. “Add rate limiting” generates generic middleware. “Add rate limiting to the /api/v1/webhooks/:id/trigger handler in src/handlers/webhooks.ts, following our retry pattern in src/services/queue.ts” generates code that fits.
State the constraint explicitly. “Without breaking existing tests” and “without introducing a new dependency” are constraints Hermes will respect if you say them, and ignore if you do not.
Reference existing code by path. Hermes can read src/ via its workspace tool. When you say “follow the pattern in src/services/email.ts”, it will actually read that file rather than inventing a pattern from training data.
End with a deliverable list. A prompt ending in “Deliver: implementation file, test file, and a short migration plan” structures the output before Hermes writes a single line. You will spend far less time asking for additions.
Example prompt that works well:
Task: Add soft-delete support to the `users` table.
Context:
- Schema at `src/db/schema.ts`, existing users service at `src/services/users.ts`
- We already have soft-delete on `posts` (see `src/db/migrations/0014_soft_delete_posts.ts`) — follow the same pattern
- Auth middleware at `src/middleware/auth.ts` should continue to treat soft-deleted users as inactive (they should get 401, not 404)
Constraints:
- No new ORM methods — use raw queries like the rest of `src/db/`
- Migration must include a down method
Deliver:
1. Migration file
2. Updated `users.ts` service with `softDelete` and `isActive` methods
3. Updated auth middleware
4. Unit tests for the service methods
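If you want every prompt to follow that shape, one option is to build it from a typed object so a missing deliverable list fails loudly. This is a sketch of my own, not a Hermes feature; `TaskPrompt` and `renderTaskPrompt` are hypothetical names.

```typescript
// The four parts of the example prompt above.
interface TaskPrompt {
  task: string;
  context: string[];
  constraints: string[];
  deliverables: string[];
}

// Render the prompt in the Task / Context / Constraints / Deliver layout.
function renderTaskPrompt(prompt: TaskPrompt): string {
  if (prompt.deliverables.length === 0) {
    throw new Error("a task prompt needs at least one deliverable");
  }
  const lines = [
    `Task: ${prompt.task}`,
    "Context:",
    ...prompt.context.map((item) => `- ${item}`),
    "Constraints:",
    ...prompt.constraints.map((item) => `- ${item}`),
    "Deliver:",
    ...prompt.deliverables.map((item, index) => `${index + 1}. ${item}`),
  ];
  return lines.join("\n");
}
```

Making the deliverable list mandatory encodes the fourth framing rule: the output is structured before any code is written.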
Teaching Hermes Your Codebase Conventions
Skills handle the what of conventions; the where requires a different approach. Hermes needs to know the shape of your project so it can navigate it without being told every time.
Create a project-structure skill:
# ~/.hermes/skills/project-structure-myapp.md
---
name: MyApp Project Structure
tags: [domain/project-structure, scope/all-tasks, project/myapp]
---
src/
  handlers/       # HTTP route handlers (one file per resource)
  services/       # business logic (imported by handlers)
  db/             # database layer
    migrations/   # Drizzle migration files (naming: NNNN_description.ts)
    schema.ts     # table definitions
  middleware/     # Express middleware (auth, rate-limit, logging)
  lib/            # shared utilities with no business logic
  flags.ts        # feature flag definitions
tests/
  factories/      # typed test data factories
  integration/    # integration tests (require test DB)
Update this file whenever you add a significant new directory. It takes thirty seconds and saves minutes per task.
For conventions that are too complex to describe in a skill, consider a codebase-tour skill that links to canonical examples:
# ~/.hermes/skills/codebase-tour.md
---
name: Codebase Tour — Canonical Examples
tags: [domain/project-structure, scope/all-tasks, project/myapp]
---
When writing a new service: see src/services/email.ts (simple) or src/services/billing.ts (with external API)
When writing a new handler: see src/handlers/users.ts
When writing a migration: see src/db/migrations/0014_soft_delete_posts.ts
When writing tests: see src/services/email.test.ts (unit) or tests/integration/users.test.ts (integration)
This gives Loop 6 a map it can use when it needs a concrete reference point.
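A tour like this goes stale when a canonical file is renamed or deleted. The sketch below pulls the `src/` and `tests/` paths out of the tour text and reports the ones missing from a set of known files. It is my own illustration: `extractPaths` and `findStalePaths` are hypothetical helpers, and you would have to supply the set of existing paths yourself (for example from a file listing).

```typescript
// Collect unique src/... and tests/... paths mentioned in the tour text.
function extractPaths(tour: string): string[] {
  const matches = tour.match(/\b(?:src|tests)\/[\w./-]+/g) ?? [];
  // Drop a trailing period picked up from the end of a sentence.
  const cleaned = matches.map((path) => path.replace(/\.+$/, ""));
  return Array.from(new Set(cleaned));
}

// Return the referenced paths that are not in the set of existing files.
function findStalePaths(tour: string, existing: ReadonlySet<string>): string[] {
  return extractPaths(tour).filter((path) => !existing.has(path));
}
```

An empty result means every canonical example the tour points to is still where the skill says it is.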
Using Loop 8 for Parallel Coding Tasks
Loop 8 is Hermes’s sub-agent orchestration layer. For most single-feature tasks it is not necessary — a single-threaded session is faster to manage. But there are three coding scenarios where parallel sub-agents deliver real time savings:
Write + test + describe simultaneously. After the implementation plan is approved, you can dispatch three sub-agents: one to write the implementation, one to write the tests against the agreed interface, and one to draft the PR description from the plan. They run in parallel and their outputs are merged.
# hermes sub-agent config for parallel coding
agents:
- name: implementer
task: "Write the implementation for {feature} per the plan in context"
output: src/{path}
- name: tester
task: "Write tests for {feature} assuming the interface in the plan"
output: src/{path}.test.ts
- name: describer
task: "Draft a PR description for {feature} per our PR template skill"
output: .hermes/pr-draft.md
merge_strategy: review_then_combine
Cross-cutting changes. When a refactor touches ten files with the same mechanical change (rename a method, update an import path, add a field to every API response), split the files across sub-agents and merge the results. Hermes’s Loop 8 handles the fan-out; you review the diff once.
Audit pass before PR. A dedicated review sub-agent that runs against the full diff using the review-checklist skill, producing a structured list of issues before the human reviewer sees it. This catches the category of comments that are purely mechanical — missing JSDoc, debug logs, unhandled error paths — and resolves them before the review cycle starts.
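For the cross-cutting case, the fan-out step amounts to splitting a file list into one batch per sub-agent. How Loop 8 does this internally is not described here; the following is a minimal sketch of a round-robin split, with `partitionFiles` as a hypothetical helper name.

```typescript
// Split files into at most agentCount batches of near-equal size.
// Sorting first makes the assignment deterministic between runs.
function partitionFiles(files: string[], agentCount: number): string[][] {
  if (!Number.isInteger(agentCount) || agentCount < 1) {
    throw new RangeError("agentCount must be a positive integer");
  }
  // Never create more batches than there are files.
  const batchCount = Math.min(agentCount, files.length);
  const batches: string[][] = Array.from({ length: batchCount }, () => []);
  files
    .slice()
    .sort()
    .forEach((file, index) => {
      batches[index % batchCount].push(file);
    });
  return batches;
}
```

Each file lands in exactly one batch, so the merged result can be reviewed as a single diff without overlapping edits.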
The Review-Before-Commit Skill in Practice
The review-checklist skill becomes significantly more useful if you wire it into a Hermes workflow step rather than remembering to invoke it manually. Here is the pattern I use:
# ~/.hermes/workflows/pre-commit.md
---
name: Pre-commit Review
trigger: manual
tags: [domain/git, domain/review]
---
steps:
1. Load review-checklist skill
2. Read all staged files via `git diff --cached`
3. For each checklist item, scan the diff and report: PASS, FAIL, or N/A
4. If any FAIL: list the specific lines and the fix required
5. If all PASS or N/A: output "Ready to commit"
Running this before git commit costs about fifteen seconds. In my experience it catches something meaningful in roughly one in three commits — not because I write bad code, but because the checklist covers the mechanical class of issues that are easy to introduce and easy to miss in a quick self-review.
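To make step 3 of that workflow concrete, here is a rough sketch of scanning the added lines of a diff for three of the checklist items. It is not how Hermes evaluates the checklist; the regular expressions are crude heuristics, the `ABC-123` ticket format is an assumption, and `reviewDiff` is a hypothetical name.

```typescript
type Verdict = "PASS" | "FAIL";

interface CheckResult {
  item: string;
  verdict: Verdict;
  lines: string[];
}

// Heuristic patterns for three checklist items (assumed, not exhaustive).
const CHECKS: { item: string; pattern: RegExp }[] = [
  { item: "console.log debug lines removed", pattern: /\bconsole\.log\(/ },
  {
    // Assumes ticket references look like ABC-123.
    item: "No new TODO without a ticket reference",
    pattern: /\/\/\s*TODO(?!.*\b[A-Z]+-\d+\b)/,
  },
  {
    item: "No hardcoded secrets or API keys",
    pattern: /(api[_-]?key|secret|token)\s*[:=]\s*["'][^"']+["']/i,
  },
];

// Check only lines added by the diff (prefix "+", excluding "+++" headers).
function reviewDiff(diff: string): CheckResult[] {
  const added = diff
    .split("\n")
    .filter((line) => line.startsWith("+") && !line.startsWith("+++"))
    .map((line) => line.slice(1));
  return CHECKS.map(({ item, pattern }): CheckResult => {
    const hits = added.filter((line) => pattern.test(line));
    return { item, verdict: hits.length === 0 ? "PASS" : "FAIL", lines: hits };
  });
}
```

Feeding it the output of `git diff --cached` gives one PASS or FAIL entry per check, with the offending lines attached to each FAIL.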
The comparison with a full CI run:
| Check | Without pre-commit (CI time or stage) | Hermes pre-commit | Shift |
|---|---|---|---|
| Linting | 2–4 min | 10 sec | Left |
| Secret detection | 1–2 min | 5 sec | Left |
| Missing tests | Post-PR | 15 sec | Left |
| PR description quality | Human review | 10 sec | Left |
| Migration completeness | Post-deploy | 10 sec | Left |
“Shift left” is overused, but the underlying point holds: the earlier in the cycle you catch something, the cheaper it is to fix. The pre-commit workflow does not replace CI; it filters the noise before CI runs.
Measuring Coding Velocity Improvement
The worst way to evaluate an AI coding workflow is vibes. The second-worst way is lines of code. Here are four metrics that actually track whether the Hermes coding loop is working:
1. First-pass acceptance rate. Of the code Hermes generates, what fraction gets committed without modification? Track this per skill set. If your architecture patterns skill is well-written, the implementations should fit your codebase without structural rewrites. If the acceptance rate is low, suspect the skill first and update it.
2. Review round trips. How many round trips does a PR take before approval? A well-formed PR description (from the template skill) and a clean diff (from the pre-commit review) should reduce this. Baseline two weeks of PRs before you start, measure two weeks after.
3. Test coverage delta. Does new code ship with tests? Track coverage per PR over time. If Hermes is generating tests alongside implementation, the coverage delta per feature should stay positive. A drop indicates tasks where test generation is being skipped — usually because the task framing did not include “Deliver: tests.”
4. Skill reuse rate. What fraction of tasks load a skill from the library (a Loop 6 hit) rather than having Hermes generate from scratch (a miss)? Hermes logs this. A growing hit rate means your skill library is accumulating useful, well-tagged patterns. A flat hit rate means skills are being saved but not found, so check your tagging taxonomy.
A simple dashboard:
Week | First-pass% | Review rounds | Coverage delta | Skill hit rate
-----|-------------|---------------|----------------|---------------
W1 | 42% | 2.4 | +3.1% | 18%
W4 | 61% | 1.8 | +4.7% | 34%
W8 | 74% | 1.4 | +5.2% | 51%
The numbers compound. A higher skill hit rate means better first-pass acceptance, which means fewer review rounds, which means more time to write well-structured tasks, which means the next skill saved is higher quality.
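Computing one dashboard row is simple arithmetic once you record a few fields per task. The sketch below assumes one record per task or PR with the four fields shown; the `TaskRecord` shape and the `summarize` function are hypothetical, and how you collect the records is up to you.

```typescript
// One record per completed task or PR (assumed shape).
interface TaskRecord {
  committedUnmodified: boolean; // generated code committed as-is
  reviewRounds: number; // review round trips before approval
  coverageDelta: number; // change in line coverage, percentage points
  skillLoaded: boolean; // at least one skill was loaded for the task
}

interface WeeklyMetrics {
  firstPassRate: number; // 0..1
  avgReviewRounds: number;
  avgCoverageDelta: number;
  skillHitRate: number; // 0..1
}

// Aggregate a week's records into the four dashboard columns.
function summarize(records: TaskRecord[]): WeeklyMetrics {
  if (records.length === 0) {
    throw new Error("cannot summarize an empty week");
  }
  const total = records.length;
  const share = (predicate: (r: TaskRecord) => boolean): number =>
    records.filter(predicate).length / total;
  const mean = (pick: (r: TaskRecord) => number): number =>
    records.reduce((sum, r) => sum + pick(r), 0) / total;
  return {
    firstPassRate: share((r) => r.committedUnmodified),
    avgReviewRounds: mean((r) => r.reviewRounds),
    avgCoverageDelta: mean((r) => r.coverageDelta),
    skillHitRate: share((r) => r.skillLoaded),
  };
}
```

Run it once per week over that week's records and append the result to the table; the trend across rows matters more than any single value.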
Putting It Together
The workflow I have described is not a set of optional enhancements on top of a standard coding process. It is a different model of what a coding session is. The shift is: time spent upfront on skills and task framing reduces total time over the lifetime of the feature and beyond, because the patterns compound.
The practical starting sequence:
- Create the five foundational skills (code style, architecture patterns, tests, PR template, review checklist) in a single setup session.
- Run three tasks using the full coding loop. Do not optimize anything yet — just observe.
- After those three tasks, check which skills were loaded (Loop 6 logs) and which checklist items fired most often. Those are the signals for what to improve.
- Add a project-structure skill and a codebase-tour skill specific to your project.
- Try one Loop 8 parallel run on a feature that has a clear interface contract you can commit to before implementation starts.
After two weeks, pull the four metrics. The numbers will tell you where the leverage is. In most codebases I have worked with, the highest-return investment is improving the architecture patterns skill — because it is the skill that, when wrong, generates the most expensive class of rework.
The skill library is the asset. Every task either draws from it or contributes to it. Design the workflow around that, and the velocity improvement is not a one-time gain — it compounds with every feature you ship.