# Agile + Spec Driven Development + GitOps Documentation Framework

This document defines the Agile + Spec Driven Development + GitOps (ASG) documentation framework for a software project. It establishes a structured approach to creating, maintaining, and evolving technical documentation in alignment with GitOps principles, so that documentation is versioned, auditable, and continuously validated alongside the codebase.

---

## The ASG Framework: Eight Pillars of Documentation

| Document | Purpose | Primary Audience | Format / Content | Example (SaaS Context) | Measurement (KPI) |
|----------|---------|-----------------|------------------|------------------------|-------------------|
| **Requirements** | Capture the **business intent**: why we're building this and what success looks like. Defines boundaries and user-visible outcomes. | Stakeholders, Product Owners, Lead Developers | User stories, PRDs, acceptance criteria, non-functional constraints. | "System must process tabular data from Julia to SvelteKit UI with <200 ms latency for 5-member teams." | 95% of requests complete in <200 ms (synthetic monitoring). |
| **Solution Design** | Translate requirements into a **viable solution**: how the system solves the user problem. Defines approach, components, and trade-offs before technical details. | Product Owners, Architects, Developers | Problem decomposition, solution approach, alternatives considered, high-level component diagram, decision rationale. | "Problem: Users need to compare wines. Solution: Build a comparison table with filtering, sorting, and visual comparison. Components: Data layer (Wine API), UI layer (Table component), Logic layer (Filter/Sort engine)." | 100% of user problems mapped to solution components before spec writing. |
| **Specification** | The **technical contract**: precise rules for inputs, outputs, and data shape. Ensures consistency across dev and test. | Developers, QA Engineers, CI/CD pipelines | OpenAPI, Protobuf, AsyncAPI. Endpoint definitions, schemas, error codes. | `contract.yaml` defining a NATS subject that accepts Arrow streams with snake_case headers. | 100% of messages validated against spec (CI block rate). |
| **UI Specification** | The **design contract**: precise rules for UI components, interactions, and visual patterns. Ensures consistency across design and implementation. | Product Designers, Frontend Developers, QA Engineers | Component libraries, style guides, interaction specs, design tokens. | Atomic design system with Figma tokens synced to CSS variables. | 100% of UI components match design spec (visual regression tests). |
| **Walkthrough** | The **end-to-end trace**: maps the entire system flow from startup to cycle completion. Provides a "big picture" view that validates solution design before technical specification is locked. | Product Owners, Architects, Developers | End-to-end user journey, sequence diagrams, state machine diagrams. Explicitly links to specification.md and ui-specification.md. | "User logs in → selects wine → edits details → saves → backend validates against spec → UI updates with error states." | 100% of user journeys traced across all layers before implementation starts. |
| **Implementation** | The **real code**: business logic, helpers, tests, configs. Where design becomes executable. | Developers, Code Reviewers | Source code, README.md, unit tests, setup scripts. | Julia function for matrix calculation + SvelteKit component rendering table. | >80% unit test coverage, <5% drift from spec. |
| **Validation** | The **enforcer**: ensures implementation matches the spec. Blocks drift and human error. | Automation servers, QA, Lead Developers | CI jobs, contract tests, linting, integration checks. | CI job rejects PR with camelCase field not allowed by YAML spec. | <1% of PRs bypass validation gates. |
| **Runbook** | The **operational manual**: how the system lives in production, scales, and recovers. Guides on-call engineers. | DevOps, SREs, On-call Developers | K8s manifests, Helm charts, Markdown guides. Deployment, scaling, backup/restore, troubleshooting. | GitOps manifest ensuring 6 Julia replicas restart if memory >80%. | MTTR <15 minutes for P1 incidents. |

---

## Detailed Document Descriptions

### 1. Requirements

`requirements.md`

**Purpose**: Capture the *business intent*: why we're building this and what success looks like. Defines boundaries and user-visible outcomes.

**Why It Matters**:

- Aligns engineering efforts with business goals
- Provides a north star for feature development
- Establishes acceptance criteria before implementation begins
- Creates a contract between product, engineering, security, and operations

**Content Guidelines**:

- User stories with clear acceptance criteria (As a X, I want Y so that Z)
- Functional Requirements Documents with clear success metrics and KPIs.
- Non-Functional Requirements covering performance, scalability, availability, reliability, and privacy.
- Boundary definitions that state what is in scope and out of scope.
- Security requirements including threat model outcomes, authentication and authorization expectations, data classification, encryption requirements, and compliance controls.
- Observability requirements specifying required telemetry, metrics, traces, logs, alerting thresholds, and retention policies.
- Traceability Rule: Requirements must be self-contained (no cross-references to other docs as justification), but must include IDs (e.g., FR-001) so downstream artifacts (spec, UI spec, architecture) can trace back. All downstream documents MUST cite requirement IDs, but requirements themselves stand alone.
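
The Traceability Rule lends itself to an automated check. The following is a minimal sketch, assuming requirement IDs follow the `FR-001` / `NFR-201` pattern used in this document's examples; the function names and the sample texts are hypothetical and stand in for the contents of `requirements.md` and a downstream document.

```python
import re

# Assumed ID convention: FR-001 / NFR-201, as in this document's examples.
REQ_ID = re.compile(r"\b(?:FR|NFR)-\d{3}\b")


def defined_ids(requirements_text: str) -> set[str]:
    """Collect every requirement ID that appears in the requirements text."""
    return set(REQ_ID.findall(requirements_text))


def dangling_citations(requirements_text: str, downstream_text: str) -> set[str]:
    """Return IDs cited downstream that the requirements text never defines."""
    return set(REQ_ID.findall(downstream_text)) - defined_ids(requirements_text)


# Hypothetical stand-ins for requirements.md and specification.md.
requirements = (
    "- **FR-001**: Invite teammates by email\n"
    "- **NFR-201**: p95 latency under 200 ms\n"
)
specification = "POST /invites implements FR-001. Rate limit satisfies NFR-305.\n"

print(sorted(dangling_citations(requirements, specification)))
```

Here `NFR-305` is reported because the specification cites an ID that the requirements never define. A real check would read the files from the repository instead of inline strings.
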

**Best Practices**:

- Link each requirement to a measurable KPI
- Keep requirements testable and verifiable
- Maintain backward compatibility with existing requirements
- Review and update requirements as business context changes

---

### 2. Solution Design

`solution-design.md`

**Purpose**: Translate requirements into a *viable solution*: how the system solves the user problem. Defines approach, components, and trade-offs before technical details.

**Why It Matters**:

- Prevents technical-first thinking that skips design decisions
- Ensures the solution addresses the actual user problem, not just the written requirements
- Enables architects to evaluate alternatives before committing to technical details
- Creates alignment between product, design, and engineering on the approach
- Provides a reference for evaluating specification choices against the intended solution

**Content Guidelines**:

- Problem decomposition: Break down the user problem into smaller, solvable pieces
- Solution approach: Describe the high-level approach (e.g., "Use event sourcing for audit trail", "Implement caching layer for performance")
- Alternatives considered: Document at least two alternative approaches, with the rationale for rejecting each
- High-level component diagram: Show major components and their relationships (not technical details, just the big picture)
- Decision rationale: Explain why each major decision was made (e.g., "Chose NATS over Kafka for simpler ops", "Used server-side rendering for SEO")
- Risk assessment: Identify potential risks and how they will be mitigated
- Traceability: Link each solution component to specific requirement ID(s) (e.g., FR-001, NFR-201) that it addresses

**Best Practices**:

- Write solution design from the user's perspective first, then justify technical choices
- Keep it high-level; avoid technical specifics (API endpoints, data schemas, etc.)
- Use diagrams that are easy to update (Mermaid.js)
- Document trade-off decisions explicitly
- Review solution design against requirements to verify completeness
- Get sign-off from product, architecture, and engineering before moving to specification

**Gap-Check Question**: Does the Solution Design clearly explain how the system solves the user problem, not just what it does?

**Example Gap**: Requirements say "users can compare wines", but solution design only mentions "wine comparison API" without explaining how users will actually use the comparison feature.

---

### 3. Specification

`specification.md`

**Purpose**: The *technical contract*: precise rules for inputs, outputs, and data shape. Ensures consistency across dev and test.

**Why It Matters**:

- Prevents implementation drift between components
- Enables contract testing in CI/CD pipelines
- Provides a single source of truth for data structures
- Facilitates integration between teams
- Enables traceability from business requirements to technical implementation

**Content Guidelines**:

- API endpoint definitions (methods, paths, parameters)
- Request/response schemas (JSON, XML, Protobuf, AsyncAPI)
- Error codes and their meanings
- Data validation rules and constraints
- Rate limiting and quota definitions
- Each specification item must cite the specific requirement ID(s) from requirements.md (for example, FR-001, NFR-201). Link each endpoint, schema, validation rule, and error case to the exact requirement(s) it implements or satisfies.

**Best Practices**:

- Use formal specification languages (OpenAPI 3.0+, AsyncAPI)
- Version specifications alongside code
- Generate client SDKs from specifications
- Block CI on specification violations
- Document edge cases and error scenarios

---

### 4. UI Specification

`ui-specification.md`

**Purpose**: The *design contract*: precise rules for UI components, interactions, and visual patterns. Ensures consistency across design and implementation.

**Why It Matters**:

- Prevents visual inconsistency across the application
- Enables design system scalability and maintainability
- Provides a single source of truth for UI components and interactions
- Facilitates collaboration between designers and developers
- Enables traceability from business requirements to UI implementation

**Content Guidelines**:

- Component specifications (atomic design principles)
- Style guide (colors, typography, spacing, icons)
- Interaction patterns (animations, transitions, states)
- Design tokens (CSS variables, Figma tokens)
- Accessibility standards (WCAG compliance)
- Responsive design breakpoints and layouts
- Each UI specification artifact must cite the specific requirement ID(s) from requirements.md (for example, FR-001, NFR-201) and/or the exact specification ID(s) or spec path(s) from specification.md that it implements or depends on.

**Best Practices**:

- Use design systems with versioned component libraries
- Sync design tokens from Figma to CSS variables automatically
- Document component states (normal, hover, disabled, error)
- Include accessibility requirements for each component
- Create visual regression tests for critical UI flows
- Version UI specifications alongside code

---

### 5. Walkthrough

`walkthrough.md`

**Purpose**: The *end-to-end trace*: maps the entire system flow from startup to cycle completion, incorporating both internal logic and external interactions. Provides a "big picture" view that aligns users and developers, validating the solution design before technical specification is locked.

**Unified Lifecycle Review**

The walkthrough functions as a collaborative audit, mapping the proposed solution directly onto the user's workflow to establish a shared single source of truth. It synchronizes the user's operational needs with the developer's technical implementation.

**For the User: Validating Operational Utility**

- **Visual Fidelity**: Concrete demonstration of the UI and user-facing components
- **Workflow Integration**: Clear mapping of how the tool fits into existing day-to-day operations
- **Operational Control**: Practical exposure to system controls and interaction patterns
- **Outcome Verification**: Direct observation of the system's outputs and deliverables
- **Impact Assessment**: Data-driven evaluation of the solution's ROI and efficiency gains
- **Expectation Alignment**: Early discovery of discrepancies between intended functionality and actual experience

**For the Developer: Aligning Architecture with Reality**

- **Requirement Validation**: Confirming the solution satisfies core business objectives
- **Impact Analysis**: Assessing the solution's real-world behavioral performance
- **Knowledge Synthesis**: Identifying disconnects between domain intent and technical interpretation
- **Specification Refinement**: Spotting missing technical dependencies or edge cases
- **System Harmony**: Observing how the technical workflow and the user process interleave
- **Code-to-Process Mapping**: Refining abstraction layers to better reflect the user's actual sequence of events

**Why It Matters**:

- Serves as a single source of truth for system behavior, bridging user experience and technical implementation
- Validates that all components (UI, API, backend logic) work together cohesively before implementation begins
- Exposes solution design gaps and integration risks early in the design phase
- Aligns stakeholders (product, design, development) on the complete user journey
- Provides a reference for verifying implementation against intended behavior and for tracing requirements through to architecture

**Content Guidelines**:

- End-to-end user journey from system startup to task completion
- External interactions (user actions, API calls, third-party integrations)
- Internal flow (state transitions, business logic, data transformations)
- Error paths and edge cases with their handling
- Sequence diagrams showing request/response patterns across all layers
- State machine diagrams for complex workflows
- Each major walkthrough step must cite the exact specification ID(s) or spec path(s) in specification.md and/or the UI-spec artifact ID(s) or path(s) in ui-specification.md that it implements or depends on.
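
A walkthrough's state machine can also be written down in executable form so that missing transitions surface early. The sketch below is illustrative only: it models the wine-editing journey from the overview table, and the state names, event names, and `trace` function are hypothetical, not part of any specification.

```python
from enum import Enum, auto


class Step(Enum):
    LOGGED_OUT = auto()
    BROWSING = auto()
    EDITING = auto()
    VALIDATING = auto()
    SAVED = auto()
    ERROR = auto()


# Allowed transitions keyed by (current step, event). Event names are hypothetical.
TRANSITIONS = {
    (Step.LOGGED_OUT, "log_in"): Step.BROWSING,
    (Step.BROWSING, "select_wine"): Step.EDITING,
    (Step.EDITING, "save"): Step.VALIDATING,
    (Step.VALIDATING, "valid"): Step.SAVED,
    (Step.VALIDATING, "invalid"): Step.ERROR,
    (Step.ERROR, "edit_again"): Step.EDITING,
}


def trace(events, start=Step.LOGGED_OUT):
    """Replay events and return the visited steps.

    Raises ValueError when the walkthrough defines no transition for an event.
    """
    state = start
    path = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"walkthrough gap: no transition for {event!r} from {state.name}")
        state = TRANSITIONS[key]
        path.append(state)
    return path


happy_path = trace(["log_in", "select_wine", "save", "valid"])
print([step.name for step in happy_path])
```

Replaying the error path (`"invalid"` followed by `"edit_again"`) exercises the recovery route, and an undefined event such as saving before selecting a wine raises an error that points at the gap.
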

**Best Practices**:

- Write walkthroughs from the user's perspective first, then trace the technical implementation
- Include concrete examples with realistic data values
- Link each step to its corresponding specification and UI spec sections
- Document failure modes and recovery paths
- Keep walkthroughs updated as requirements evolve
- Review walkthroughs against the solution design to verify completeness

---

### 6. Implementation

**Purpose**: The *real code*: business logic, helpers, tests, configs. Where design becomes executable.

**Why It Matters**:

- This is the actual artifact that runs in production
- Code is the ultimate source of truth (when it matches spec)
- Tests validate correctness and prevent regressions
- Configuration files define runtime behavior

**Content Guidelines**:

- Business logic implementation
- Helper functions and utilities
- Unit and integration tests
- Configuration files (YAML, JSON, environment)
- Setup and development scripts
- Code organization and module structure

**Best Practices**:

- Follow consistent code style and conventions
- Write tests before or alongside implementation (TDD/BDD)
- Document complex logic with inline comments
- Keep configuration externalized and versioned
- Use type annotations where applicable

---

### 7. Validation

`validation.md`

**Purpose**: The *enforcer*: ensures implementation matches the spec. Blocks drift and human error.

**Why It Matters**:

- Prevents breaking changes from reaching production
- Catches specification violations early in the CI pipeline
- Maintains data integrity and API consistency
- Reduces manual QA effort through automation

**Content Guidelines**:

- CI/CD pipeline configurations
- Contract testing scripts
- Linting rules and configurations
- Integration test suites
- Schema validation jobs
- Security scanning and audit jobs
- Each validation must include explicit references to requirements.md (e.g., FR-001, NFR-201), specification.md (exact spec IDs or paths for API contracts and data schemas, e.g., sections 3.1, 4.2), and ui-specification.md (components, interactions, visual states, e.g., sections 15.1, 15.3) as applicable.
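
As an illustration of a validation gate, the sketch below mirrors the overview example in which CI rejects a camelCase field. It assumes the spec requires snake_case keys throughout a JSON-like payload; the function names and the sample payload are hypothetical.

```python
import re

# snake_case: lowercase words separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$")


def non_snake_case_fields(payload, path=""):
    """Recursively collect paths of dict keys that are not snake_case."""
    bad = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            here = f"{path}.{key}" if path else key
            if not SNAKE_CASE.match(key):
                bad.append(here)
            bad.extend(non_snake_case_fields(value, here))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            bad.extend(non_snake_case_fields(item, f"{path}[{index}]"))
    return bad


def check(payload):
    """Print each violation and return a process-style exit code (0 = pass)."""
    violations = non_snake_case_fields(payload)
    for field in violations:
        print(f"spec violation: field '{field}' is not snake_case")
    return 1 if violations else 0


# Hypothetical message body with two camelCase keys.
sample = {"wine_id": 7, "tastingNotes": [{"score": 91, "reviewerName": "A"}]}
exit_code = check(sample)
```

In a CI job, the return value would typically be passed to `sys.exit` so that a non-zero code blocks the PR. A production gate would more likely validate against the formal contract (for example, an OpenAPI or AsyncAPI schema) than against a naming rule alone.
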

**Best Practices**:

- Fail CI on specification violations
- Run validation jobs on every commit and PR
- Use automated code review tools
- Maintain validation job health dashboard
- Document validation failure remediation steps

---

### 8. Runbook

`runbook.md`

**Purpose**: The *operational manual*: how the system lives in production, scales, and recovers. Guides on-call engineers.

**Why It Matters**:

- Reduces Mean Time To Recovery (MTTR) for incidents
- Provides step-by-step guidance for common issues
- Documents scaling and deployment procedures
- Ensures operational knowledge is not siloed

**Content Guidelines**:

- Deployment procedures (manual and automated)
- Scaling instructions (horizontal/vertical)
- Backup and restore procedures
- Troubleshooting guides for common issues
- Runbook entries for specific error codes
- Contact information and escalation paths
- Explicit references to [`specification.md`](specification.md) (API contracts, data schemas, specific endpoint sections)
- Explicit references to [`ui-specification.md`](ui-specification.md) (components, interactions, visual states, specific component sections)

**Best Practices**:

- Write runbooks for every P1/P2 incident
- Include exact commands and configuration snippets
- Test runbooks periodically (chaos engineering)
- Link runbook entries to relevant documentation
- Keep runbooks updated when system changes

---

## How to Use This Approach Effectively

### 1. Start with Requirements

Before writing any code or documentation, establish clear requirements. Ask:

- What business problem are we solving?
- How will we measure success?
- What are the non-negotiable constraints?

**Action**: Create a `docs/requirements/` directory and start with `PRD.md` and `KPIs.md`.

### 2. Define Solution Design

Once requirements are stable, translate them into a viable solution. This ensures the technical approach solves the user problem, not just the written requirements.

**Action**: Create `docs/solution-design/` with `solution-design.md`, problem decomposition, alternatives considered, high-level component diagram, and decision rationale.

### 3. Define the Specification

With solution design in place, define the technical specification. This becomes the contract for implementation.

**Action**: Create `docs/specification/` with `contract.yaml` (or appropriate format) and `error-codes.md`.

### 4. Define UI Specification Early

Create UI specifications before implementing frontend components. This ensures design consistency and provides a contract for implementation.

**Action**: Create `docs/ui-specification/` with `component-library.md`, `style-guide.md`, and `design-tokens.json`.

### 5. Create Walkthroughs Early

As soon as the UI specification is defined, create walkthroughs. This helps identify gaps in the flow and provides onboarding material. The walkthrough should be a single source of truth that explicitly links to specification.md and ui-specification.md.

**Action**: Create `docs/walkthrough/` with `TOUR.md`, sequence diagrams, and cross-references to specification and UI spec sections.

### 6. Implement with Validation in Mind

Write implementation code that adheres to the specification. Build validation into the CI pipeline from day one.

**Action**: Ensure test files are co-located with implementation and run on every commit.

### 7. Automate Validation

Build automated validation that runs in CI/CD. This ensures spec compliance and prevents drift.

**Action**: Configure CI jobs to validate against specification and block PRs on violations.

### 8. Document Operations from Day One

Create runbook entries as soon as deployment procedures are established. Update them when incidents occur.

**Action**: Create `docs/runbook/` with entries for deployment, scaling, and common issues.

---

## The Gap-Check Process

Since all docs defined in the ASG Framework are **living documents** that evolve throughout the project lifecycle, this Gap-Check process ensures each documentation stage validates the previous one before moving forward.

| Stage Transition | Goal | Gap-Check Question | Example Gap | Result |
|------------------|------|-------------------|-------------|--------|
| **Requirements → Solution Design** | Turn a "Wish" into a "Solution" | Does the Solution Design clearly explain how the system solves the user problem, not just what it does? | Requirements say "users can compare wines", but solution design only mentions "wine comparison API" without explaining how users will actually use the comparison feature | Add UI flow and user interaction patterns to the solution design |
| **Solution Design → Specification** | Turn a "Solution" into a "Contract" | Does the Specification define all technical details that the solution approach requires? | Solution design says "use caching for performance", but Specification doesn't define cache invalidation strategy | Add cache TTL, invalidation rules, and error handling to Specification |
| **Specification → UI Specification** | Turn a "Rule" into a "Button" | Does the UI Specification expose all the data and states defined in the Specification? | Specification says device must "Handshake" within 5 seconds, but UI Specification has no connection status indicator | Add a `Connection Status` component to UI Specification |
| **UI Specification → Walkthrough** | Turn "Screens" into a "Story" | Does the Walkthrough reflect the complete flow including error states and timing? | Walkthrough shows "Success" screen, but Specification says backend process takes 2 minutes | Add a `Processing` state to UI Specification and `JobStatus` field to Specification |
| **Walkthrough → Specification** | Turn the "Story" into "Rules" | Does the Specification cover all the behaviors defined in the Walkthrough? | Walkthrough shows 10 IoT sensors sending real-time updates, but Specification only defines REST API | Add NATS/MQTT protocol definitions to Specification |

### Why Gap-Checks Matter

- **Prevent rework**: Catch missing requirements before coding begins
- **Ensure completeness**: Verify all scenarios are covered across all layers
- **Validate feasibility**: Confirm technical approach supports the intended user flow
- **Maintain alignment**: Keep all stakeholders (product, design, dev) on the same page

### How to Run Gap-Checks

1. **Requirements Review**: After writing requirements, ask "What happens if X?" for each user story
2. **Solution Design Review**: After writing solution design, verify every requirement is addressed by at least one solution component with a corresponding requirement ID reference
3. **Specification Review**: After writing the spec, verify every requirement has at least one rule with a corresponding requirement ID reference
4. **UI Specification Review**: After writing UI Specification, verify every spec rule has at least one UI representation with a corresponding requirement ID reference
5. **Walkthrough Review**: After writing walkthrough, verify every UI step has a corresponding backend flow
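
The coverage reviews in steps 2 to 4 reduce to a set difference: every upstream ID should be cited by at least one downstream item. A minimal sketch follows, assuming traceability is kept as a mapping from downstream item to the requirement IDs it cites; the function name and the sample data are hypothetical.

```python
def uncovered_requirements(requirement_ids, traceability):
    """Return, sorted, the requirement IDs that no downstream item cites."""
    cited = {req for reqs in traceability.values() for req in reqs}
    return sorted(set(requirement_ids) - cited)


# Hypothetical data: requirement IDs and a partial solution-design mapping.
requirement_ids = ["FR-101", "FR-102", "FR-103", "NFR-201"]
solution_components = {
    "Comparison table UI": ["FR-101"],
    "Filtering": ["FR-102"],
    "Real-time updates": ["NFR-201"],
}

gaps = uncovered_requirements(requirement_ids, solution_components)
print(gaps)  # FR-103 has no solution component yet
```

The same function applies at each transition by swapping the inputs, for example specification rule IDs against the UI components that cite them.
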

---

## The Pre-Code Debugging Habit

This habit ensures that each documentation stage is complete and validated before moving forward. These checklists serve as mental guards against building features that don't meet the requirements or architecture.

| Stage | Debug Checklist | Why It Matters |
|-------|-----------------|----------------|
| **Requirements** | If I can't state the Value, I'm building a feature nobody needs. | Clarifies the business impact before technical discussion begins |
| **Solution Design** | If I can't Imagine the solution, I'm building without a compass. | Ensures the technical approach solves the user problem before spec writing |
| **Specification** | If I can't write a Rule with a requirement ID reference, I haven't thought the logic through. | Ensures traceability from technical rules to business requirements |
| **UI Specification** | If I can't See it with a requirement ID reference, the user has no way to trigger the logic. | Ensures traceability from UI elements to business requirements |
| **Walkthrough** | If I can't Trace the full flow, the system has a 'broken bridge'. | Identifies gaps between components before implementation begins |

**How to Use This Habit:**

1. **Before writing each doc**, run through the checklist for that stage
2. **Before moving to the next doc**, run through the checklist for the current stage
3. **When reviewing docs**, use the checklist to verify completeness

This habit turns documentation into a self-validating process: each stage catches gaps from the previous one before they become bugs in production.

---

## GitOps Integration

This documentation framework aligns with GitOps principles:

| GitOps Principle | Documentation Alignment |
|------------------|------------------------|
| **Versioned** | All documentation lives in git, with history and audit trail |
| **Declarative** | Specifications and architecture are declarative contracts |
| **Automated** | Validation jobs automate spec compliance checks |
| **Self-Service** | Walkthroughs and runbooks enable self-service onboarding and operations |
| **Observability** | KPIs and metrics are defined for each documentation artifact |

---

## Metrics and Continuous Improvement

Each documentation artifact has associated KPIs. Track these to ensure quality:

| Document | KPI | Target |
|----------|-----|--------|
| Requirements | Requirement coverage | 100% of features have associated requirements |
| Solution Design | Solution completeness | 100% of user problems mapped to solution components before spec writing |
| Specification | Specification compliance rate | 100% of messages validate against spec |
| UI Specification | UI spec compliance | 100% of components match design spec (visual regression tests) |
| Architecture | Decision documentation | 100% of major decisions logged with trade-offs |
| Walkthrough | New dev time-to-first-PR | <2 days from onboarding to first contribution |
| Implementation | Test coverage | >80% unit test coverage |
| Validation | Bypass rate | <1% of PRs bypass validation gates |
| Runbook | MTTR | <15 minutes for P1 incidents |
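
Several of these targets are simple ratios that a dashboard or scheduled job can compute. The sketch below is one possible reading, not a prescribed method: it checks the "95% of requests under 200 ms" example from the overview table and the validation bypass target. The function names and sample numbers are hypothetical.

```python
def share_within(latencies_ms, threshold_ms=200.0):
    """Fraction of requests that finished strictly under the threshold."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    return sum(1 for ms in latencies_ms if ms < threshold_ms) / len(latencies_ms)


def bypass_rate(total_prs, bypassed_prs):
    """Fraction of PRs that were merged without passing the validation gates."""
    if total_prs <= 0:
        raise ValueError("total_prs must be positive")
    return bypassed_prs / total_prs


# Hypothetical measurements for one review period.
samples = [120, 95, 180, 210, 150, 199, 130, 90, 175, 160]
latency_ok = share_within(samples) >= 0.95  # 9 of 10 under 200 ms, so the target is missed
gate_ok = bypass_rate(250, 2) < 0.01        # 2 of 250 PRs bypassed, so the target is met

print(f"latency target met: {latency_ok}, bypass target met: {gate_ok}")
```
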

**Review Cadence**:

- Weekly: Review KPI dashboards and documentation gaps
- Monthly: Update documentation based on incident learnings
- Quarterly: Full framework review and improvement

---

## Template Examples

### Requirements Template

```markdown
# PRD: Feature Name

## 1. Business Context & Success Metrics
- Business Goal
- User Stories (with acceptance criteria)
- KPIs & Targets (e.g., "99.95% availability", "<200ms p95 latency")

## 2. Technical Boundaries
- In Scope
- Out of Scope
- Dependencies (e.g., "Requires Stripe API v2023-08")

## 3. Functional Requirements (FR)
- **FR-XXX**: [Requirement ID] - [Clear, testable functional requirement]
- **FR-XXX**: [Requirement ID] - [Clear, testable functional requirement]
- *Example: FR-001 - System shall allow users to invite teammates via email address*

## 4. Non-Functional Requirements (NFRs)
### 4.1 Performance & Scalability
- [e.g., Support 10K TPS, scale horizontally to 100 nodes]

### 4.2 Availability & Reliability
- [e.g., SLO: 99.9% monthly uptime, MTTR < 10min]

### 4.3 Privacy & Security
- Data Classification: [e.g., PII, PHI]
- Threat Model Outcomes: [e.g., "Mitigates replay attacks via nonce + timestamp"]
- AuthN/AuthZ Expectations: [e.g., RBAC with 3 roles: viewer, editor, admin]
- Encryption: [e.g., TLS 1.3+, AES-256 at rest]
- Compliance: [e.g., GDPR Art. 32, SOC 2 Type II]

### 4.4 Observability & Telemetry
- Required Logs: [e.g., `user_id`, `request_id`, `status`, `latency_ms`]
- Critical Metrics: [e.g., `auth_failures_total`, `api_latency_seconds{quantile=0.99}`]
- Tracing: [e.g., Zipkin/B3 propagation, 10% sampling]
- Alerting: [e.g., `auth_failure_rate > 5%/min` triggers PagerDuty]
- Retention: [e.g., Logs: 30 days, Metrics: 1 year]

## 5. Acceptance Conditions
- [List verifiable conditions for sign-off, including validation gates]
```

### Solution Design Template

````markdown
# Solution Design: Feature Name

## 1. Problem Decomposition

Break down the user problem into smaller, solvable pieces.

| Problem | Description | User Impact |
|---------|-------------|-------------|
| [Problem ID] | Brief description of the problem | How this affects the user |

**Example**:
- **P-001**: Users cannot compare multiple wines side-by-side → They must navigate between wine pages, losing context and making comparisons difficult

## 2. Solution Approach

Describe the high-level approach to solving the problem.

**Approach**: [e.g., "Implement a comparison table with filtering, sorting, and visual comparison features"]

**Key Principles**:
- [e.g., "Keep comparison data in memory for instant updates"]
- [e.g., "Use server-side rendering for SEO"]
- [e.g., "Support comparisons of up to 10 wines for power users"]

## 3. Alternatives Considered

Document at least two alternative approaches, with the rationale for accepting or rejecting each.

| Alternative | Pros | Cons | Decision |
|-------------|------|------|----------|
| Client-side comparison | Fast updates, no server load | Memory-intensive, no sharing | Rejected - memory constraints on mobile |
| Server-side comparison | Centralized, scalable | Latency, complex state sync | Accepted - matches performance requirements |
| Hybrid approach | Best of both worlds | Complex, hard to maintain | Rejected - over-engineering |

## 4. High-Level Component Diagram

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#3b82f6'}}}%%
flowchart TD
    A[User] --> B[Comparison UI]
    B --> C[Comparison State]
    B --> D[Wine API]
    C --> D
```

**Component Descriptions**:
- **[Component Name]**: [Brief description of responsibilities and how it solves part of the problem]
  - Links to requirements: FR-001, NFR-201

## 5. Decision Rationale

Explain why each major decision was made.

| Decision | Rationale | Alternatives Rejected |
|----------|-----------|----------------------|
| Use WebSockets for real-time updates | Required for <100ms feedback | REST polling - too slow, too many requests |
| Store comparison list in localStorage | Allows offline comparison editing | Redux - overkill for this feature |
| Server-side rendering for comparison export | SEO for shared comparisons | Client-side rendering - no SEO benefit |

## 6. Risk Assessment

Identify potential risks and mitigation strategies.

| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| Performance degradation with >5 wines | High | Medium | Implement performance profiling, limit to 10 wines max |
| Race conditions in real-time updates | Medium | Low | Use optimistic updates with rollback |
| Memory leaks from comparison state | Medium | Medium | Implement cleanup on unmount, use weak references |

## 7. Requirements Traceability

Link each solution component to specific requirements.

| Solution Component | Requirement ID | Description |
|-------------------|----------------|-------------|
| Comparison table UI | FR-101 | Display wines in table format |
| Filtering | FR-102 | Filter wines by criteria |
| Sorting | FR-103 | Sort wines by any column |
| Real-time updates | NFR-201 | Updates appear within 100ms |
| Export comparison | FR-201 | Export comparison to PDF/CSV |
````

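The traceability table in the template lends itself to an automated check. The following is a minimal sketch, not part of the framework itself: it assumes requirement IDs follow the `FR-`, `NFR-`, and `EH-` prefixes seen in this document, and the function names (`referenced_ids`, `untraced`, `uncovered`) are hypothetical. Python is used purely for illustration.

```python
import re

# Requirement ID shapes used in this document, e.g. FR-101, NFR-201, EH-004.
REQ_ID = re.compile(r"\b(?:FR|NFR|EH)-\d+\b")


def referenced_ids(markdown: str) -> set[str]:
    """Return every requirement ID mentioned anywhere in a Markdown document."""
    return set(REQ_ID.findall(markdown))


def untraced(design_md: str, known_ids: set[str]) -> set[str]:
    """IDs cited by the design that the requirements document does not define."""
    return referenced_ids(design_md) - known_ids


def uncovered(design_md: str, known_ids: set[str]) -> set[str]:
    """Known requirements that no solution component links to."""
    return known_ids - referenced_ids(design_md)


# Illustrative inputs: a shortened traceability table and a set of known IDs.
design = """
| Solution Component | Requirement ID | Description |
|--------------------|----------------|-------------|
| Comparison table UI | FR-101 | Display wines in table format |
| Filtering | FR-102 | Filter wines by criteria |
| Real-time updates | NFR-201 | Updates appear within 100ms |
"""
requirements = {"FR-101", "FR-102", "FR-103", "NFR-201"}

print(sorted(untraced(design, requirements)))   # []
print(sorted(uncovered(design, requirements)))  # ['FR-103']
```

A check like this could run in CI and fail the build when either set is non-empty, which keeps the solution design and the requirements document from drifting apart.
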
### Specification Template

```yaml
# contract.yaml
openapi: 3.0.0
info:
  title: NATSBridge API
  version: 1.0.0
  description: Technical specification with requirements traceability
paths:
  /api/v1/endpoint:
    post:
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Request'
      responses:
        '200':
          description: Success
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Response'
      x-requirement-id: FR-001  # Reference to Requirements document
components:
  schemas:
    Request:
      type: object
      properties:
        data:
          type: string
          x-requirement-id: FR-002  # Reference to Requirements document
    Response:
      type: object
      properties:
        status:
          type: string
          x-requirement-id: FR-003  # Reference to Requirements document
```

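The `x-requirement-id` annotations in the contract can be collected mechanically. The sketch below is one possible approach, written in Python for illustration only. It assumes the contract has already been parsed into nested dictionaries (for example with a YAML library) and that the annotations sit on the operation and on individual schema properties. The helper names `find_requirement_ids` and `unannotated_operations` are hypothetical.

```python
from typing import Any, Iterator

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

# The template contract, written as the dict a YAML parser would produce.
contract: dict[str, Any] = {
    "openapi": "3.0.0",
    "paths": {
        "/api/v1/endpoint": {
            "post": {
                "responses": {"200": {"description": "Success"}},
                "x-requirement-id": "FR-001",
            }
        }
    },
    "components": {
        "schemas": {
            "Request": {
                "type": "object",
                "properties": {
                    "data": {"type": "string", "x-requirement-id": "FR-002"}
                },
            },
            "Response": {
                "type": "object",
                "properties": {
                    "status": {"type": "string", "x-requirement-id": "FR-003"}
                },
            },
        }
    },
}


def find_requirement_ids(
    node: Any, path: tuple[str, ...] = ()
) -> Iterator[tuple[str, str]]:
    """Yield (location, requirement ID) for every x-requirement-id in the tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "x-requirement-id":
                yield " > ".join(path), value
            else:
                yield from find_requirement_ids(value, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_requirement_ids(item, path + (str(index),))


def unannotated_operations(doc: dict[str, Any]) -> list[str]:
    """List operations (e.g. 'POST /route') that carry no x-requirement-id."""
    return [
        f"{method.upper()} {route}"
        for route, item in doc.get("paths", {}).items()
        for method, operation in item.items()
        if method in HTTP_METHODS and "x-requirement-id" not in operation
    ]


for location, requirement in find_requirement_ids(contract):
    print(f"{requirement}: {location}")
print(unannotated_operations(contract))  # []
```

Walking the parsed tree rather than searching the raw text keeps the location of each annotation, so a CI report can point at the exact operation or property that lacks a requirement link.
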
### UI Specification Template

```markdown
# UI Specification: Component Name

## Component Overview
**Name**: [Component Name]
**Status**: [Draft/Approved/Deprecated]
**Version**: 1.0.0

## Requirements Traceability
| UI Element | Requirement ID | Description |
|------------|----------------|-------------|
| Login form | FR-001 | Username/password authentication |
| Sidebar menu | NFR-202 | Collapsible sidebar for usability |
| Wine table | FR-101 | Sortable, searchable inventory table |

## Design Tokens
- **Color**: `--color-primary`
- **Typography**: `--font-body`
- **Spacing**: `--spacing-md`

## Component States
| State | Description | Example | Requirement ID |
|-------|-------------|---------|----------------|
| Default | Normal state | `<Button />` | FR-001 |
| Hover | Mouse over | `<Button hover />` | NFR-201 |
| Disabled | Disabled state | `<Button disabled />` | NFR-203 |
| Error | Error state | `<Button error />` | EH-004 |

## Accessibility
- [ ] WCAG 2.1 AA compliant
- [ ] Keyboard navigation support
- [ ] Screen reader compatible

## Implementation
- Component file: `src/components/[Component].svelte`
- Test file: `src/components/[Component].test.js`
- Visual regression test: `tests/visual/[Component].spec.js`

## References
- Figma: [Link to design file]
- Storybook: [Link to component story]
- Requirements: [`docs/requirements.md`](../requirements.md)
```

### Architecture Template

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#3b82f6'}}}%%
flowchart TD
    A[Client] --> B[Caddy]
    B --> C[Node.js API]
    C --> D[Julia Worker]
    D --> E[NATS Cluster]
    E --> F[Storage]

    style A fill:#f9f9f9,stroke:#333
    style E fill:#e0e7ff,stroke:#3b82f6
```

**Specification Traceability:**

| Architecture Component | Specification Section | UI Specification Section | Requirement ID |
|------------------------|----------------------|--------------------------|----------------|
| NATS Bridge (smartsend) | specification.md:3.1 | - | FR-301, NFR-201 |
| Login Form | specification.md:10.1 | ui-specification.md:15.2.1 | FR-001, NFR-201 |
| Wine Inventory Table | specification.md:4.2 | ui-specification.md:15.3.3 | FR-101, NFR-205 |
| Sidebar Menu | specification.md:11.1 | ui-specification.md:15.3.2 | NFR-202 |

**Component Details:**
- **Caddy**: Static file serving (specification.md:2.2, ui-specification.md:15.1)
- **Node.js API**: NATS bridge client (specification.md:3.1)
- **Julia Worker**: Backend processing (specification.md:4.1)
- **NATS Cluster**: Message bus (specification.md:3.3)
- **Storage**: Data persistence (specification.md:5.1)

### Runbook Template

```markdown
# Runbook: Service Restart

**Severity**: P2
**Estimated Time**: 5 minutes

## Symptoms
- Service is unresponsive
- Health checks are failing

## Steps
1. Connect to a host that has `kubectl` access to the cluster
2. Run: `kubectl rollout restart deployment/natsbridge`
3. Monitor: `kubectl get pods -l app=natsbridge -w`

## Rollback
- Run: `kubectl rollout undo deployment/natsbridge`

## Post-Incident
- [ ] Review logs for root cause
- [ ] Update runbook if needed
```

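The restart and rollback steps in the runbook can also be scripted. The sketch below is illustrative rather than a prescribed tool: the function names are hypothetical, the 120-second timeout is an assumed value, and it swaps the interactive `kubectl get pods -w` watch for `kubectl rollout status`, which blocks until the rollout completes or the timeout expires. The `runner` parameter exists so the logic can be exercised without a cluster.

```python
import subprocess

DEPLOYMENT = "deployment/natsbridge"  # deployment name taken from the runbook


def restart_commands(timeout_s: int = 120) -> list[list[str]]:
    """kubectl invocations for a restart followed by a blocking rollout wait."""
    return [
        ["kubectl", "rollout", "restart", DEPLOYMENT],
        # Blocks until the rollout finishes; exits non-zero if the timeout expires.
        ["kubectl", "rollout", "status", DEPLOYMENT, f"--timeout={timeout_s}s"],
    ]


def rollback_command() -> list[str]:
    """The runbook's rollback step."""
    return ["kubectl", "rollout", "undo", DEPLOYMENT]


def run_restart(runner=subprocess.run, timeout_s: int = 120) -> bool:
    """Restart the deployment; roll back only if the new rollout fails."""
    restart, wait = restart_commands(timeout_s)
    if runner(restart).returncode != 0:
        # The restart never started, so there is nothing to roll back.
        return False
    if runner(wait).returncode != 0:
        runner(rollback_command())
        return False
    return True


# Print the planned commands without executing anything against a cluster.
for command in restart_commands():
    print(" ".join(command))
```

Rolling back only after a failed rollout wait is a deliberate choice in this sketch: running `kubectl rollout undo` when the restart itself never happened would revert the deployment to an older revision.
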
---

## Conclusion

This ASG Documentation Framework ensures that documentation is:
- **Structured**: Distinct artifacts, each with a clear purpose
- **Automated**: Validation and CI/CD integration
- **Versioned**: All documentation in git with history
- **Measurable**: KPIs for quality and effectiveness
- **Actionable**: Practical templates and examples

Use this framework as a living document and update it as your team's needs evolve.