Agile + Spec Driven Development + GitOps Documentation Framework
This document defines the ASG (Agile, Spec Driven Development, GitOps) documentation framework for a software project. It establishes a structured approach to creating, maintaining, and evolving technical documentation in alignment with GitOps principles, so that documentation is versioned, auditable, and continuously validated alongside the codebase.
The ASG Framework: Eight Pillars of Documentation
| Document | Purpose | Primary Audience | Format / Content | Example (SaaS Context) | Measurement (KPI) |
|---|---|---|---|---|---|
| Requirements | Capture the business intent — why we're building this and what success looks like. Defines boundaries and user-visible outcomes. | Stakeholders, Product Owners, Lead Developers | User stories, PRDs, acceptance criteria, non-functional constraints. | "System must process tabular data from Julia to SvelteKit UI with <200ms latency for 5-member teams." | 95% of requests complete <200ms (synthetic monitoring). |
| Specification | The technical contract: precise rules for inputs, outputs, and data shape. Ensures consistency across dev and test. | Developers, QA Engineers, CI/CD pipelines | OpenAPI, Protobuf, AsyncAPI. Endpoint definitions, schemas, error codes. | contract.yaml defining a NATS subject that accepts Arrow streams with snake_case headers. | 100% of messages validated against spec (CI block rate). |
| UI Specification | The design contract — precise rules for UI components, interactions, and visual patterns. Ensures consistency across design and implementation. | Product Designers, Frontend Developers, QA Engineers | Component libraries, style guides, interaction specs, design tokens. | Atomic design system with Figma tokens synced to CSS variables. | 100% of UI components match design spec (visual regression tests). |
| Walkthrough | The end-to-end trace — maps the entire system flow from startup to cycle completion. Provides a "big picture" view that aligns users and developers, exposing architectural gaps before coding begins. | Product Owners, Architects, Developers | End-to-end user journey, sequence diagrams, state machine diagrams. Explicitly links to specification.md and ui-specification.md. | "User logs in → selects wine → edits details → saves → backend validates against spec → UI updates with error states." | 100% of user journeys traced across all layers before implementation starts. |
| Architecture | The blueprint — how components fit together, interact, and scale. Guides system structure and trade-offs. | Architects, Senior Developers, DevOps | C4 diagrams, Mermaid.js, component/network/storage models. | Diagram showing 6-node cluster routing traffic via Caddy → Node.js API → Julia pods. | 100% of major decisions logged with trade-off analysis. |
| Implementation | The real code — business logic, helpers, tests, configs. Where design becomes executable. | Developers, Code Reviewers | Source code, README.md, unit tests, setup scripts. | Julia function for matrix calculation + SvelteKit component rendering table. | >80% unit test coverage, <5% drift from spec. |
| Validation | The enforcer — ensures implementation matches the spec. Blocks drift and human error. | Automation servers, QA, Lead Developers | CI jobs, contract tests, linting, integration checks. | CI job rejects PR with camelCase field not allowed by YAML spec. | <1% of PRs bypass validation gates. |
| Runbook | The operational manual — how the system lives in production, scales, and recovers. Guides on-call engineers. | DevOps, SREs, On-call Developers | K8s manifests, Helm charts, Markdown guides. Deployment, scaling, backup/restore, troubleshooting. | GitOps manifest ensuring 6 Julia replicas restart if memory >80%. | MTTR <15 minutes for P1 incidents. |
Detailed Document Descriptions
1. Requirements
requirements.md
Purpose: Capture the business intent — why we're building this and what success looks like. Defines boundaries and user-visible outcomes.
Why It Matters:
- Aligns engineering efforts with business goals
- Provides a north star for feature development
- Establishes acceptance criteria before implementation begins
- Creates a contract between product and engineering
Content Guidelines:
- User stories with clear acceptance criteria (As a X, I want Y so that Z)
- Product Requirements Documents (PRDs) with success metrics
- Non-functional requirements (performance, security, scalability)
- Boundary definitions (what's in scope vs. out of scope)
- Do not cite other framework documents: requirements.md is the first document created in the ASG Framework, so there is nothing upstream for it to reference
Best Practices:
- Link each requirement to a measurable KPI
- Keep requirements testable and verifiable
- Maintain backward compatibility with existing requirements
- Review and update requirements as business context changes
2. Specification
specification.md
Purpose: The technical contract — precise rules for inputs, outputs, and data shape. Ensures consistency across dev and test.
Why It Matters:
- Prevents implementation drift between components
- Enables contract testing in CI/CD pipelines
- Provides a single source of truth for data structures
- Facilitates integration between teams
- Enables traceability from business requirements to technical implementation
Content Guidelines:
- API endpoint definitions (methods, paths, parameters)
- Request/response schemas (JSON, XML, Protobuf, AsyncAPI)
- Error codes and their meanings
- Data validation rules and constraints
- Rate limiting and quota definitions
- Each specification item must cite the specific requirement ID(s) from requirements.md (for example, FR-001, NFR-201). Link each endpoint, schema, validation rule, and error case to the exact requirement(s) it implements or satisfies.
Best Practices:
- Use formal specification languages (OpenAPI 3.0+, AsyncAPI)
- Version specifications alongside code
- Generate client SDKs from specifications
- Block CI on specification violations
- Document edge cases and error scenarios
3. UI Specification
ui-specification.md
Purpose: The design contract — precise rules for UI components, interactions, and visual patterns. Ensures consistency across design and implementation.
Why It Matters:
- Prevents visual inconsistency across the application
- Enables design system scalability and maintainability
- Provides a single source of truth for UI components and interactions
- Facilitates collaboration between designers and developers
- Enables traceability from business requirements to UI implementation
Content Guidelines:
- Component specifications (atomic design principles)
- Style guide (colors, typography, spacing, icons)
- Interaction patterns (animations, transitions, states)
- Design tokens (CSS variables, Figma tokens)
- Accessibility standards (WCAG compliance)
- Responsive design breakpoints and layouts
- Each UI specification artifact must cite the specific requirement ID(s) from requirements.md (for example FR-001, NFR-201) and/or the exact specification ID(s) or spec path(s) from specification.md that it implements or depends on.
Best Practices:
- Use design systems with versioned component libraries
- Sync design tokens from Figma to CSS variables automatically
- Document component states (normal, hover, disabled, error)
- Include accessibility requirements for each component
- Create visual regression tests for critical UI flows
- Version UI specifications alongside code
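The token sync mentioned above can be sketched as a small transformation from a token file to CSS custom properties. This is a minimal sketch under stated assumptions: the nested JSON shape, the `tokens_to_css` helper, and the example values are all hypothetical, and a real Figma export is likely to have a richer structure that needs its own mapping.

```python
import json


def tokens_to_css(tokens: dict, selector: str = ":root") -> str:
    """Render a nested token mapping as CSS custom properties."""
    lines = []

    def walk(node: dict, prefix: str) -> None:
        for key, value in node.items():
            name = f"{prefix}-{key}" if prefix else key
            if isinstance(value, dict):
                walk(value, name)  # nested group, e.g. color -> primary
            else:
                lines.append(f"  --{name}: {value};")

    walk(tokens, "")
    return selector + " {\n" + "\n".join(lines) + "\n}"


# Hypothetical token export (assumption, not a real Figma payload).
example_tokens = json.loads(
    '{"color": {"primary": "#3b82f6"}, "spacing": {"md": "16px"}, "font": {"body": "Inter"}}'
)
css = tokens_to_css(example_tokens)
print(css)
```

Running a script like this in CI and committing its output keeps the generated variables versioned alongside the components that use them.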
4. Walkthrough
walkthrough.md
Purpose: The end-to-end trace — maps the entire system flow from startup to cycle completion, incorporating both internal logic and external interactions. Provides a "big picture" view that aligns users and developers, exposing architectural gaps before coding begins.
Why It Matters:
- Serves as a single source of truth for system behavior, bridging user experience and technical implementation
- Validates that all components (UI, API, backend logic) work together cohesively before implementation begins
- Exposes architectural gaps and integration risks early in the design phase
- Aligns stakeholders (product, design, development) on the complete user journey
- Provides a reference for verifying implementation against intended behavior
Content Guidelines:
- End-to-end user journey from system startup to task completion
- External interactions (user actions, API calls, third-party integrations)
- Internal flow (state transitions, business logic, data transformations)
- Error paths and edge cases with their handling
- Sequence diagrams showing request/response patterns across all layers
- State machine diagrams for complex workflows
- Each major walkthrough step must cite the exact specification ID(s) or spec path(s) in specification.md and/or the UI-spec artifact ID(s) or path(s) in ui-specification.md that it implements or depends on.
Best Practices:
- Write walkthroughs from the user's perspective first, then trace the technical implementation
- Include concrete examples with realistic data values
- Link each step to its corresponding specification and UI spec sections
- Document failure modes and recovery paths
- Keep walkthroughs updated as requirements evolve
- Review walkthroughs against architecture to verify feasibility
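A walkthrough's state machine can also be written down as a transition table and replayed against a sequence of events, which makes missing error paths visible before any code exists. The sketch below is illustrative only: the state and event names are assumptions loosely based on the "edits details → saves → backend validates" example, not part of the framework.

```python
from enum import Enum


class State(Enum):
    VIEWING = "viewing"
    EDITING = "editing"
    VALIDATING = "validating"
    SAVED = "saved"
    ERROR = "error"


# (current state, event) -> next state; event names are illustrative.
TRANSITIONS = {
    (State.VIEWING, "edit"): State.EDITING,
    (State.EDITING, "save"): State.VALIDATING,
    (State.VALIDATING, "valid"): State.SAVED,
    (State.VALIDATING, "invalid"): State.ERROR,
    (State.ERROR, "edit"): State.EDITING,
    (State.SAVED, "edit"): State.EDITING,
}


def run(events, start=State.VIEWING):
    """Replay events and return every state visited, including the start."""
    state = start
    trace = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            # An undefined transition is a gap in the walkthrough.
            raise ValueError(f"no transition from {state.value} on {event!r}")
        state = TRANSITIONS[key]
        trace.append(state)
    return trace


trace = run(["edit", "save", "invalid", "edit", "save", "valid"])
print(" -> ".join(s.value for s in trace))
```

Any event sequence that raises here points at a step the walkthrough has not described yet.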
5. Architecture
architecture.md
Purpose: The blueprint — how components fit together, interact, and scale. Guides system structure and trade-offs.
Why It Matters:
- Provides a mental model for system design
- Guides technical decision-making and trade-off analysis
- Facilitates onboarding of new architects and senior developers
- Documents scaling and performance considerations
- Enables traceability from technical specifications and UI specifications to architectural decisions
Content Guidelines:
- C4 diagrams (Context, Container, Component levels)
- Mermaid.js flowcharts and sequence diagrams
- Component interaction diagrams
- Network topology and data flow
- Storage and caching strategies
- Scaling and resilience patterns
- Each architecture component must cite the exact specification ID(s) or spec path(s) in specification.md and/or the UI-spec artifact ID(s) or path(s) in ui-specification.md that it implements or depends on.
Best Practices:
- Use diagrams that are easy to update (Mermaid.js over static images)
- Document trade-off decisions with Rationale Documents
- Include scaling considerations for each component
- Document failure modes and recovery strategies
- Keep architecture diagrams versioned with code
6. Implementation
Purpose: The real code — business logic, helpers, tests, configs. Where design becomes executable.
Why It Matters:
- This is the actual artifact that runs in production
- Code is the ultimate source of truth (when it matches spec)
- Tests validate correctness and prevent regressions
- Configuration files define runtime behavior
Content Guidelines:
- Business logic implementation
- Helper functions and utilities
- Unit and integration tests
- Configuration files (YAML, JSON, environment)
- Setup and development scripts
- Code organization and module structure
Best Practices:
- Follow consistent code style and conventions
- Write tests before or alongside implementation (TDD/BDD)
- Document complex logic with inline comments
- Keep configuration externalized and versioned
- Use type annotations where applicable
7. Validation
validation.md
Purpose: The enforcer — ensures implementation matches the spec. Blocks drift and human error.
Why It Matters:
- Prevents breaking changes from reaching production
- Catches specification violations early in the CI pipeline
- Maintains data integrity and API consistency
- Reduces manual QA effort through automation
Content Guidelines:
- CI/CD pipeline configurations
- Contract testing scripts
- Linting rules and configurations
- Integration test suites
- Schema validation jobs
- Security scanning and audit jobs
- Each validation must include explicit references to requirements.md (e.g., FR-001, NFR-201), specification.md (exact spec IDs or paths — API contracts, data schemas, e.g., sections 3.1, 4.2), and ui-specification.md (components, interactions, visual states, e.g., sections 15.1, 15.3) as applicable.
Best Practices:
- Fail CI on specification violations
- Run validation jobs on every commit and PR
- Use automated code review tools
- Maintain validation job health dashboard
- Document validation failure remediation steps
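As one possible shape for a validation job, the check below walks a decoded message and reports every field name that is not snake_case, in the spirit of the "CI job rejects PR with camelCase field" example. It is a sketch, not a prescribed implementation: the `find_non_snake_case` helper, the regular expression, and the sample payload are assumptions, and a real pipeline would more likely validate against the schema in the specification.

```python
import re

# Lowercase words separated by single underscores, e.g. wine_name.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def find_non_snake_case(payload, path=""):
    """Return the paths of all dict keys that are not snake_case."""
    violations = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            here = f"{path}.{key}" if path else key
            if not SNAKE_CASE.match(key):
                violations.append(here)
            violations.extend(find_non_snake_case(value, here))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            violations.extend(find_non_snake_case(item, f"{path}[{index}]"))
    return violations


# Hypothetical message with two camelCase fields.
message = {"wine_name": "x", "vintageYear": 2019, "tags": [{"tagName": "a"}]}
violations = find_non_snake_case(message)
print(violations)
```

A CI step can exit with a non-zero status whenever the returned list is non-empty, which blocks the PR.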
8. Runbook
runbook.md
Purpose: The operational manual — how the system lives in production, scales, and recovers. Guides on-call engineers.
Why It Matters:
- Reduces Mean Time To Recovery (MTTR) for incidents
- Provides step-by-step guidance for common issues
- Documents scaling and deployment procedures
- Ensures operational knowledge is not siloed
Content Guidelines:
- Deployment procedures (manual and automated)
- Scaling instructions (horizontal/vertical)
- Backup and restore procedures
- Troubleshooting guides for common issues
- Runbook entries for specific error codes
- Contact information and escalation paths
- Explicit references to specification.md (API contracts, data schemas; e.g., sections 3.1, 4.2)
- Explicit references to ui-specification.md (components, interactions, visual states; e.g., sections 15.1, 15.3)
Best Practices:
- Write runbooks for every P1/P2 incident
- Include exact commands and configuration snippets
- Test runbooks periodically (chaos engineering)
- Link runbook entries to relevant documentation
- Keep runbooks updated when system changes
How to Use This Approach Effectively
1. Start with Requirements
Before writing any code or documentation, establish clear requirements. Ask:
- What business problem are we solving?
- How will we measure success?
- What are the non-negotiable constraints?
Action: Create a docs/requirements/ directory and start with PRD.md and KPIs.md.
2. Define the Specification First
Once requirements are stable, define the technical specification. This becomes the contract for implementation.
Action: Create docs/specification/ with contract.yaml (or appropriate format) and error-codes.md.
3. Define UI Specification Early
Create UI specifications before implementing frontend components. This ensures design consistency and provides a contract for implementation.
Action: Create docs/ui-specification/ with component-library.md, style-guide.md, and design-tokens.json.
4. Create Walkthroughs Early
As soon as the UI specification is defined, create walkthroughs. This helps identify gaps in the flow and provides onboarding material. The walkthrough should be a single source of truth that explicitly links to specification.md and ui-specification.md.
Action: Create docs/walkthrough/ with TOUR.md, sequence diagrams, and cross-references to specification and UI spec sections.
5. Design the Architecture
With requirements, specification, UI spec, and walkthrough in place, design the architecture. Document trade-off decisions explicitly.
Action: Create docs/architecture/ with Mermaid diagrams and trade-offs.md.
6. Implement with Validation in Mind
Write implementation code that adheres to the specification. Build validation into the CI pipeline from day one.
Action: Ensure test files are co-located with implementation and run on every commit.
7. Automate Validation
Build automated validation that runs in CI/CD. This ensures spec compliance and prevents drift.
Action: Configure CI jobs to validate against specification and block PRs on violations.
8. Document Operations from Day One
Create runbook entries as soon as deployment procedures are established. Update them when incidents occur.
Action: Create docs/runbook/ with entries for deployment, scaling, and common issues.
The Gap-Check Process
Since all docs defined in the ASG Framework are living documents that evolve throughout the project lifecycle, this Gap-Check process ensures each documentation stage validates the previous one before moving forward.
| Stage Transition | Goal | Gap-Check Question | Example Gap | Result |
|---|---|---|---|---|
| Requirements → Specification | Turn a "Wish" into a "Rule" | Does the Specification define all edge cases and conflict scenarios from the Requirements? | Requirement says "invite a teammate" but Specification doesn't define what happens if the teammate already has an account | Add a UserConflict rule to the Specification |
| Specification → UI Specification | Turn a "Rule" into a "Button" | Does the UI Specification expose all the data and states defined in the Specification? | Specification says device must "Handshake" within 5 seconds, but UI Specification has no connection status indicator | Add a Connection Status component to UI Specification |
| UI Specification → Walkthrough | Turn "Screens" into a "Story" | Does the Walkthrough reflect the complete flow including error states and timing? | Walkthrough shows "Success" screen, but Specification says backend process takes 2 minutes | Add a Processing state to UI Specification and JobStatus field to Specification |
| Walkthrough → Architecture | Turn the "Story" into "Steel" | Does the Architecture support the performance and integration requirements defined in the Walkthrough? | Walkthrough shows 10 IoT sensors sending real-time updates, but Architecture uses REST API | Switch to NATS/MQTT for high-frequency data flow |
Why Gap-Checks Matter
- Prevent rework: Catch missing requirements before coding begins
- Ensure completeness: Verify all scenarios are covered across all layers
- Validate feasibility: Confirm architectural decisions support the intended user flow
- Maintain alignment: Keep all stakeholders (product, design, dev) on the same page
How to Run Gap-Checks
- Requirements Review: After writing requirements, ask "What happens if X?" for each user story
- Specification Review: After writing the spec, verify every requirement has at least one rule with a corresponding requirement ID reference
- UI Specification Review: After writing UI Specification, verify every spec rule has at least one UI representation with a corresponding requirement ID reference
- Walkthrough Review: After writing walkthrough, verify every UI step has a corresponding backend flow
- Architecture Review: After designing architecture, verify every walkthrough flow is technically feasible and every architecture decision references specification and UI specification sections
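The review steps above amount to two set comparisons between adjacent stages: which upstream IDs nobody cites, and which citations point at IDs that do not exist. The following sketch assumes the IDs have already been extracted into plain Python collections; the function names and sample data are hypothetical.

```python
def uncovered(upstream_ids, downstream_refs):
    """Upstream IDs that no downstream item cites, sorted."""
    cited = set()
    for refs in downstream_refs.values():
        cited.update(refs)
    return sorted(set(upstream_ids) - cited)


def dangling(upstream_ids, downstream_refs):
    """Downstream items that cite IDs missing from the upstream stage."""
    known = set(upstream_ids)
    result = {}
    for item, refs in downstream_refs.items():
        unknown = sorted(set(refs) - known)
        if unknown:
            result[item] = unknown
    return result


# Hypothetical Requirements -> Specification check.
requirements = ["FR-001", "FR-002", "NFR-201"]
spec_refs = {
    "POST /invite": ["FR-001"],
    "UserConflict rule": ["FR-001", "FR-009"],
}
print(uncovered(requirements, spec_refs))
print(dangling(requirements, spec_refs))
```

The same two functions apply unchanged to the later transitions, for example with specification IDs as the upstream set and UI specification artifacts as the downstream items.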
The Pre-Code Debugging Habit
This habit ensures that each documentation stage is complete and validated before moving forward. These checklists serve as mental guards against building features that don't meet the requirements or architecture.
| Stage | Debug Checklist | Why It Matters |
|---|---|---|
| Requirements | If I can't state the Value, I'm building a feature nobody needs. | Clarifies the business impact before technical discussion begins |
| Specification | If I can't write a Rule with a requirement ID reference, I haven't thought the logic through. | Ensures traceability from technical rules to business requirements |
| UI Specification | If I can't See it with a requirement ID reference, the user has no way to trigger the logic. | Ensures traceability from UI elements to business requirements |
| Walkthrough | If I can't Trace the full flow, the system has a 'broken bridge'. | Identifies gaps between components before implementation begins |
| Architecture | If the Walkthrough doesn't Require it, or if I can't trace spec/UI-spec references, I'm over-engineering or missing requirements. | Ensures traceability from architecture to spec/UI-spec and requirements |
How to Use This Habit:
- Before writing each doc, run through the checklist for that stage
- Before moving to the next doc, run through the checklist for the current stage
- When reviewing docs, use the checklist to verify completeness
This habit turns documentation into a self-validating process—each stage catches gaps from the previous one before they become bugs in production.
GitOps Integration
This documentation framework aligns with GitOps principles:
| GitOps Principle | Documentation Alignment |
|---|---|
| Versioned | All documentation lives in git, with history and audit trail |
| Declarative | Specifications and architecture are declarative contracts |
| Automated | Validation jobs automate spec compliance checks |
| Self-Service | Walkthroughs and runbooks enable self-service onboarding and operations |
| Observability | KPIs and metrics are defined for each documentation artifact |
Metrics and Continuous Improvement
Each documentation artifact has associated KPIs. Track these to ensure quality:
| Document | KPI | Target |
|---|---|---|
| Requirements | Requirement coverage | 100% of features have associated requirements |
| Specification | Specification compliance rate | 100% of messages validate against spec |
| UI Specification | UI spec compliance | 100% of components match design spec (visual regression tests) |
| Walkthrough | New dev time-to-first-PR | <2 days from onboarding to first contribution |
| Architecture | Decision documentation | 100% of major decisions logged with trade-offs |
| Implementation | Test coverage | >80% unit test coverage |
| Validation | Bypass rate | <1% of PRs bypass validation gates |
| Runbook | MTTR | <15 minutes for P1 incidents |
Review Cadence:
- Weekly: Review KPI dashboards and documentation gaps
- Monthly: Update documentation based on incident learnings
- Quarterly: Full framework review and improvement
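To make the weekly KPI review repeatable, the targets can be evaluated by a small script. The sketch below covers three of the targets from the table (request latency, validation bypass rate, and MTTR); the function names and the way the input data is collected are assumptions, and the thresholds are taken from the targets stated in this document.

```python
from statistics import mean


def share_under(latencies_ms, threshold_ms=200.0):
    """Fraction of requests that completed strictly under the threshold."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    return sum(1 for value in latencies_ms if value < threshold_ms) / len(latencies_ms)


def bypass_rate(total_prs, bypassed_prs):
    """Fraction of PRs that were merged without passing validation gates."""
    if total_prs <= 0:
        raise ValueError("total_prs must be positive")
    return bypassed_prs / total_prs


def kpi_report(latencies_ms, total_prs, bypassed_prs, p1_recovery_minutes):
    """Return True per KPI when the target from the table is met."""
    return {
        "latency": share_under(latencies_ms) >= 0.95,        # 95% under 200 ms
        "bypass": bypass_rate(total_prs, bypassed_prs) < 0.01,  # <1% of PRs
        "mttr": mean(p1_recovery_minutes) < 15,               # <15 minutes
    }


# Hypothetical sample data for one review period.
report = kpi_report([100] * 19 + [300], total_prs=200, bypassed_prs=1,
                    p1_recovery_minutes=[10, 12])
print(report)
```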
Template Examples
Requirements Template
# PRD: Feature Name
## Business Goal
[What problem are we solving?]
## Success Metrics
- [Metric 1]: Target [value]
- [Metric 2]: Target [value]
## User Stories
- As a [role], I want [feature] so that [benefit]
- Acceptance Criteria: [details]
## Non-Functional Requirements
- Performance: [details]
- Security: [details]
- Scalability: [details]
## Out of Scope
- [What's explicitly excluded]
Specification Template
# contract.yaml
openapi: 3.0.0
info:
  title: NATSBridge API
  version: 1.0.0
  description: Technical specification with requirements traceability
paths:
  /api/v1/endpoint:
    post:
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Request'
      responses:
        '200':
          description: Success
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Response'
      x-requirement-id: FR-001  # Reference to Requirements document
components:
  schemas:
    Request:
      type: object
      properties:
        data:
          type: string
      x-requirement-id: FR-002  # Reference to Requirements document
    Response:
      type: object
      properties:
        status:
          type: string
      x-requirement-id: FR-003  # Reference to Requirements document
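Once the contract carries `x-requirement-id` extensions, a traceability report can be produced by walking the parsed document. The sketch below assumes the YAML has already been loaded into a plain dictionary (here written out by hand as a trimmed copy of the template); the `collect_requirement_ids` helper is hypothetical, and locations are rendered as JSON Pointer style paths with `/` in keys escaped as `~1`.

```python
def collect_requirement_ids(node, path="#"):
    """Return (location, requirement ID) pairs for every x-requirement-id."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "x-requirement-id":
                found.append((path, value))
            else:
                # Escape per JSON Pointer so path keys stay unambiguous.
                escaped = str(key).replace("~", "~0").replace("/", "~1")
                found.extend(collect_requirement_ids(value, f"{path}/{escaped}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(collect_requirement_ids(item, f"{path}/{index}"))
    return found


# Trimmed, hand-written equivalent of the template above (assumption).
contract = {
    "paths": {
        "/api/v1/endpoint": {
            "post": {
                "responses": {"200": {"description": "Success"}},
                "x-requirement-id": "FR-001",
            }
        }
    },
    "components": {
        "schemas": {
            "Request": {"type": "object", "x-requirement-id": "FR-002"},
            "Response": {"type": "object", "x-requirement-id": "FR-003"},
        }
    },
}
trace_rows = collect_requirement_ids(contract)
for location, requirement_id in trace_rows:
    print(requirement_id, location)
```

The resulting pairs can be compared against the IDs in requirements.md to find requirements with no contract element, or contract elements with no requirement.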
UI Specification Template
# UI Specification: Component Name
## Component Overview
**Name**: [Component Name]
**Status**: [Draft/Approved/Deprecated]
**Version**: 1.0.0
## Requirements Traceability
| UI Element | Requirement ID | Description |
|------------|----------------|-------------|
| Login form | FR-001 | Username/password authentication |
| Sidebar menu | NFR-202 | Collapsible sidebar for usability |
| Wine table | FR-101 | Sortable, searchable inventory table |
## Design Tokens
- **Color**: `--color-primary`
- **Typography**: `--font-body`
- **Spacing**: `--spacing-md`
## Component States
| State | Description | Example | Requirement ID |
|-------|-------------|---------|----------------|
| Default | Normal state | `<Button />` | FR-001 |
| Hover | Mouse over | `<Button hover />` | NFR-201 |
| Disabled | Disabled state | `<Button disabled />` | NFR-203 |
| Error | Error state | `<Button error />` | EH-004 |
## Accessibility
- [ ] WCAG 2.1 AA compliant
- [ ] Keyboard navigation support
- [ ] Screen reader compatible
## Implementation
- Component file: `src/components/[Component].svelte`
- Test file: `src/components/[Component].test.js`
- Visual regression test: `tests/visual/[Component].spec.js`
## References
- Figma: [Link to design file]
- Storybook: [Link to component story]
- Requirements: [`docs/requirements.md`](../requirements.md)
Architecture Template
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#3b82f6'}}}%%
flowchart TD
    A[Client] --> B[Caddy]
    B --> C[Node.js API]
    C --> D[Julia Worker]
    D --> E[NATS Cluster]
    E --> F[Storage]
    style A fill:#f9f9f9,stroke:#333
    style E fill:#e0e7ff,stroke:#3b82f6
Specification Traceability:
| Architecture Component | Specification Section | UI Specification Section | Requirement ID |
|---|---|---|---|
| NATS Bridge (smartsend) | specification.md:3.1 | - | FR-301, NFR-201 |
| Login Form | specification.md:10.1 | ui-specification.md:15.2.1 | FR-001, NFR-201 |
| Wine Inventory Table | specification.md:4.2 | ui-specification.md:15.3.3 | FR-101, NFR-205 |
| Sidebar Menu | specification.md:11.1 | ui-specification.md:15.3.2 | NFR-202 |
Component Details:
- Caddy: Static file serving (specification.md:2.2, ui-specification.md:15.1)
- Node.js API: NATS bridge client (specification.md:3.1)
- Julia Worker: Backend processing (specification.md:4.1)
- NATS Cluster: Message bus (specification.md:3.3)
- Storage: Data persistence (specification.md:5.1)
Runbook Template
# Runbook: Service Restart
**Severity**: P2
**Estimated Time**: 5 minutes
## Symptoms
- Service is unresponsive
- Health checks are failing
## Steps
1. SSH to the host
2. Run: `kubectl rollout restart deployment/natsbridge`
3. Monitor: `kubectl get pods -l app=natsbridge -w`
## Rollback
- Run: `kubectl rollout undo deployment/natsbridge`
## Post-Incident
- [ ] Review logs for root cause
- [ ] Update runbook if needed
Conclusion
This ASG Documentation Framework ensures that documentation is:
- Structured: Eight distinct artifacts with clear purposes
- Automated: Validation and CI/CD integration
- Versioned: All documentation in git with history
- Measurable: KPIs for quality and effectiveness
- Actionable: Practical templates and examples
Use this framework as a living document—update it as your team's needs evolve.