From f9aa6bc9f63e4b844727eb1c547d1d5bffae0a41 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 07:49:51 +0700 Subject: [PATCH 01/29] add sdd file --- docs/SDD_FRAMEWORK.md | 152 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 152 insertions(+) create mode 100644 docs/SDD_FRAMEWORK.md diff --git a/docs/SDD_FRAMEWORK.md b/docs/SDD_FRAMEWORK.md new file mode 100644 index 0000000..fde8c7d --- /dev/null +++ b/docs/SDD_FRAMEWORK.md @@ -0,0 +1,152 @@ +# Software Design Document (SDD) Framework + +A structured taxonomy of artifacts for comprehensive software documentation and design. + +--- + +## Overview + +This framework categorizes Software Design Document artifacts by their purpose in the software development lifecycle. Each artifact type serves a specific role in defining **what** we build, **how** it works, and **how** we verify and document it. + +--- + +## Artifact Taxonomy + +| Document Type | "Artifact" Type | Purpose in SDD | Tooling Examples | +|---------------|-----------------|----------------|------------------| +| Requirements | User Stories / ADRs | Defines the "Business Contract" — why are we building this? | GitHub Issues, Jira, Architecture Decision Records (ADRs) | +| The Spec | Schema Definition | The Source of Truth — defines data structures, types, and endpoints | OpenAPI (YAML), Protobuf, Apache Arrow Schemas | +| Architecture | System Blueprint | Defines how the specs connect (e.g., Service A talks to B via NATS) | Mermaid.js (Diagrams-as-code), IcePanel, Structurizr | +| Implementation | Generated Code | The code "fills in" the logic defined by the Spec | OpenAPI Generator, TypeSafe clients (Zod, TypeScript) | +| Validation | Contract Tests | Automatically checks if the code matches the Spec | Prism (Mocking), Dredd, Schemathesis | +| Tutorial | Interactive Sandbox | Allows others to "play" with the spec without writing code | Swagger UI, Redoc, Postman Collections | + +--- + +## Purpose by Artifact Type + +### 1. 
Requirements (User Stories / ADRs) + +**Purpose:** Defines the "Business Contract" — why are we building this? + +**Key Questions Answered:** +- What problem are we solving? +- Who are the stakeholders? +- What are the success criteria? + +**Examples:** +- GitHub Issues +- Jira tickets +- Architecture Decision Records (ADRs) + +--- + +### 2. The Spec (Schema Definition) + +**Purpose:** The Source of Truth — defines data structures, types, and endpoints. + +**Key Questions Answered:** +- What data structures do we use? +- What are the API endpoints? +- What are the message formats? + +**Examples:** +- OpenAPI (YAML) specifications +- Protocol Buffers (Protobuf) +- Apache Arrow Schemas + +--- + +### 3. Architecture (System Blueprint) + +**Purpose:** Defines how the specs connect — the high-level structure and communication patterns. + +**Key Questions Answered:** +- How do services communicate? +- What is the deployment topology? +- What are the system boundaries? + +**Examples:** +- Mermaid.js diagrams (Diagrams-as-code) +- IcePanel +- Structurizr + +--- + +### 4. Implementation (Generated Code) + +**Purpose:** The code "fills in" the logic defined by the Spec. + +**Key Questions Answered:** +- How do we implement the spec? +- What libraries/frameworks do we use? +- How do we ensure type safety? + +**Examples:** +- OpenAPI Generator +- TypeSafe clients (Zod, TypeScript) + +--- + +### 5. Validation (Contract Tests) + +**Purpose:** Automatically checks if the code matches the Spec. + +**Key Questions Answered:** +- Does the implementation match the spec? +- Are there breaking changes? +- Is the contract upheld? + +**Examples:** +- Prism (Mocking) +- Dredd +- Schemathesis + +--- + +### 6. Tutorial (Interactive Sandbox) + +**Purpose:** Allows others to "play" with the spec without writing code. + +**Key Questions Answered:** +- How do I use this API? +- What are the expected inputs/outputs? +- How can I experiment safely? 
+ +**Examples:** +- Swagger UI +- Redoc +- Postman Collections + +--- + +## Usage Guidelines + +1. **Start with Requirements** — Define the business context and decision records +2. **Define The Spec** — Create the source of truth for data and endpoints +3. **Design Architecture** — Visualize how components connect +4. **Implement** — Generate or write code based on the spec +5. **Validate** — Run contract tests to ensure alignment +6. **Tutorial** — Provide interactive documentation for users + +--- + +## Relationship to NATSBridge + +This NATSBridge project implements a NATS-based messaging bridge that can be used with all artifact types in this framework: + +- **Architecture:** NATS as the communication backbone +- **Spec:** Message schemas for bridge operations +- **Implementation:** Generated code for NATS message handling +- **Validation:** Contract tests for message format compliance +- **Tutorial:** Interactive examples for bridge configuration + +--- + +## References + +- Architecture Decision Records (ADRs): [http://adr.cascadely.com/](http://adr.cascadely.com/) +- OpenAPI Specification: [https://spec.openapis.org/](https://spec.openapis.org/) +- Protocol Buffers: [https://protobuf.dev/](https://protobuf.dev/) +- Mermaid.js: [https://mermaid.js.org/](https://mermaid.js.org/) +- Schemathesis: [https://schemathesis.io/](https://schemathesis.io/) \ No newline at end of file From 64796ff0a35ea6bcf386648e6508e4c849a8b2b1 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 08:24:54 +0700 Subject: [PATCH 02/29] update --- SDD_FRAMEWORK.md | 279 ++++++++++++++++++++++++++++++++++++++++++ docs/SDD_FRAMEWORK.md | 152 ----------------------- 2 files changed, 279 insertions(+), 152 deletions(-) create mode 100644 SDD_FRAMEWORK.md delete mode 100644 docs/SDD_FRAMEWORK.md diff --git a/SDD_FRAMEWORK.md b/SDD_FRAMEWORK.md new file mode 100644 index 0000000..bf5eae7 --- /dev/null +++ b/SDD_FRAMEWORK.md @@ -0,0 +1,279 @@ +# SDD + GitOps Documentation Stack + +A 
comprehensive documentation strategy for modern software development that aligns different types of documentation with their specific purposes, audiences, and tooling. + +## The Big Picture + +This framework ensures that every piece of documentation serves a clear purpose and reaches the right audience. It emphasizes: + +- **Machine-readable truths** as the foundation for automation +- **Separation of concerns** between human-facing docs and machine-consumable contracts +- **GitOps integration** where deployment and configuration are version-controlled +- **Multi-role audience targeting** from stakeholders to DevOps + +--- + +## Documentation Matrix + +| Document | Purpose ("The Why") | Primary Audience | Format / Tooling | Example (SaaS Context) | +|----------|---------------------|------------------|------------------|------------------------| +| **Requirements** | Define business goals & user needs | Stakeholders, PM, Lead Dev | GitHub Issues, Notion | "System must support 5-member teams with real-time sync." | +| **The Spec** | The Contract. Machine-readable truth. | Developers, QA, Machines | OpenAPI, Protobuf, YAML | A `.yaml` file defining `user_id` as a UUID in snake_case. | +| **Architecture** | High-level structural blueprint | Senior Devs, DevOps | Mermaid.js, IcePanel | Diagram of SvelteKit ↔ NATS ↔ Julia 6-node cluster. | +| **Walkthrough** | The Intuition. The "Big Picture" narrative. | New Devs, The Team | Recorded Video, TOUR.md | "Why we use a Claim-Check pattern for large Arrow data." | +| **Implementation** | The actual logic & generated code | Developers | SvelteKit, Julia, Node.js | Auto-generated TypeScript types from the OpenAPI spec. | +| **Validation** | Automated "Contract" enforcement | CI/CD Pipelines, QA | GitHub Actions, Prism | A test that fails if the Julia API returns camelCase keys. | +| **Runbook** | Deployment, Scaling, & Recovery | DevOps, SRE | K8s Manifests, Flux | `git push` to update the replica count from 3 to 6. 
| + +--- + +## Detailed Explanations + +### 1. Requirements + +**Purpose**: Define business goals & user needs. + +**Why it matters**: Before writing code, we need to understand *why* we're building something. Requirements capture the business context, user pain points, and success criteria. + +**Primary Audience**: +- **Stakeholders**: Business owners who need to approve the direction +- **Product Managers**: Translate requirements into features +- **Lead Developers**: Understand scope and technical constraints + +**Format / Tooling**: +- **GitHub Issues**: Simple, version-controlled, integrated with code +- **Notion**: Rich text, collaborative, good for initial brainstorming + +**Best Practices**: +- Write in user story format: "As a [role], I want [feature] so that [benefit]" +- Include acceptance criteria as checklist items +- Link to related specs and architecture decisions + +**Example**: "System must support 5-member teams with real-time sync." + +--- + +### 2. The Spec (The Contract) + +**Purpose**: Machine-readable truth that defines the API contract. + +**Why it matters**: The spec is the single source of truth for how systems communicate. It enables code generation, automated testing, and ensures consistency across services. + +**Primary Audience**: +- **Developers**: Implement the API according to the spec +- **QA Engineers**: Create test cases based on the spec +- **Machines**: Used for code generation, validation, and documentation + +**Format / Tooling**: +- **OpenAPI (Swagger)**: REST API specifications +- **Protobuf**: gRPC service definitions +- **YAML/JSON**: Configuration and data schema definitions + +**Best Practices**: +- Use snake_case for consistency +- Define all fields with types and constraints +- Include examples for complex data structures +- Keep specs versioned alongside code + +**Example**: A `.yaml` file defining `user_id` as a UUID in snake_case. + +--- + +### 3. 
Architecture + +**Purpose**: High-level structural blueprint showing how components interact. + +**Why it matters**: Architecture diagrams help everyone understand the system's structure without drowning in implementation details. They're crucial for onboarding, design reviews, and long-term maintainability. + +**Primary Audience**: +- **Senior Developers**: Design decisions and component responsibilities +- **DevOps**: Understand deployment topology and service dependencies +- **Technical Leads**: Evaluate trade-offs and scalability concerns + +**Format / Tooling**: +- **Mermaid.js**: Code-based diagrams that are version-controlled +- **IcePanel**: Interactive, automated architecture visualization +- **C4 Model**: Standardized approach to architectural diagrams + +**Best Practices**: +- Focus on *relationships* between components, not implementation details +- Include technology choices (e.g., NATS vs WebSocket) +- Show data flow direction with arrows +- Update diagrams when architecture changes + +**Example**: Diagram of SvelteKit ↔ NATS ↔ Julia 6-node cluster. + +--- + +### 4. Walkthrough + +**Purpose**: The intuition and "Big Picture" narrative. + +**Why it matters**: Code alone doesn't explain *why* decisions were made. Walkthroughs provide context, historical decisions, and architectural intuition that helps new developers become productive quickly. 
+ +**Primary Audience**: +- **New Developers**: Understand the system's philosophy and patterns +- **The Team**: Share context and reasoning behind design choices +- **Code Reviewers**: Evaluate design decisions alongside implementation + +**Format / Tooling**: +- **Recorded Video**: Personal, engaging, good for complex explanations +- **TOUR.md**: Markdown file with narrative walk-through of the codebase +- **Architecture Decision Records (ADRs)**: Formal documentation of key decisions + +**Best Practices**: +- Explain *why* more than *how* +- Include anti-patterns to avoid +- Link to related documentation +- Keep walkthroughs updated with architecture changes + +**Example**: "Why we use a Claim-Check pattern for large Arrow data." + +--- + +### 5. Implementation + +**Purpose**: The actual logic and generated code. + +**Why it matters**: This is the executable truth of the system. Well-structured implementation code should be clear, maintainable, and follow established patterns. + +**Primary Audience**: +- **Developers**: Read, modify, and extend the code +- **Reviewers**: Verify correctness and adherence to standards +- **CI/CD**: Run tests and builds + +**Format / Tooling**: +- **SvelteKit**: Frontend framework with server-side rendering +- **Julia**: High-performance numerical computing +- **Node.js**: Backend services and tooling + +**Best Practices**: +- Generate code from specs to ensure consistency +- Use consistent naming conventions (snake_case, camelCase appropriately) +- Include unit tests alongside implementation +- Document complex algorithms with inline comments + +**Example**: Auto-generated TypeScript types from the OpenAPI spec. + +--- + +### 6. Validation + +**Purpose**: Automated "Contract" enforcement. + +**Why it matters**: Automated tests ensure that the system behaves as specified and prevent regressions. Validation in CI/CD pipelines catches issues before they reach production. 
+ +**Primary Audience**: +- **CI/CD Pipelines**: Run tests automatically on every commit +- **QA Engineers**: Verify system behavior against requirements +- **Developers**: Get immediate feedback on changes + +**Format / Tooling**: +- **GitHub Actions**: Automated testing and validation workflows +- **Prism (Stoplight)**: OpenAPI spec validation in CI +- **Jest/Vitest**: JavaScript testing frameworks +- **Pytest**: Python testing framework + +**Best Practices**: +- Test the contract (spec), not just implementation details +- Use contract testing (Pact) for service-to-service validation +- Fail fast: tests should run quickly and provide clear error messages +- Include negative test cases (invalid inputs, edge cases) + +**Example**: A test that fails if the Julia API returns camelCase keys. + +--- + +### 7. Runbook + +**Purpose**: Deployment, scaling, and recovery procedures. + +**Why it matters**: Runbooks ensure that deployments are consistent, repeatable, and recoverable. In GitOps, the runbook *is* the configuration, version-controlled alongside the code. + +**Primary Audience**: +- **DevOps Engineers**: Execute deployments and scaling operations +- **SREs**: Manage system reliability and incident response +- **Developers**: Deploy feature branches for testing + +**Format / Tooling**: +- **Kubernetes Manifests**: Declarative deployment configurations +- **Flux**: GitOps operator for Kubernetes +- **Helm Charts**: Package management for Kubernetes +- **Docker Compose**: Local development environments + +**Best Practices**: +- Use Git as the source of truth (GitOps) +- Make deployments idempotent (running twice has the same effect) +- Include rollback procedures +- Document scaling procedures for different load levels + +**Example**: `git push` to update the replica count from 3 to 6. 
+ +--- + +## How the Stack Fits Together + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Requirements │ +│ (Business goals, user needs) │ +└───────────────────┬─────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ The Spec │ +│ (Machine-readable contract: OpenAPI, Protobuf) │ +└───────────────────┬─────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Architecture │ +│ (Structural blueprint: Mermaid, IcePanel) │ +└───────────────────┬─────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Walkthrough │ +│ (Intuition, big picture narrative) │ +└───────────────────┬─────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Implementation │ +│ (Actual code: SvelteKit, Julia, Node.js) │ +└───────────────────┬─────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Validation │ +│ (Automated tests: GitHub Actions, Prism) │ +└───────────────────┬─────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Runbook │ +│ (Deployment, scaling: K8s, Flux) │ +└─────────────────────────────────────────────────────────────┘ +``` + +## Key Principles + +1. **Machine-Readable Truth**: Specs and configurations should be machine-readable to enable automation +2. **Separation of Concerns**: Different audiences need different types of information +3. **Version Control**: All documentation should be in Git, just like code +4. **Automation-First**: Validation should be automated and integrated into CI/CD +5. **Living Documentation**: Documentation should evolve with the codebase + +## Getting Started + +To adopt this stack in your project: + +1. 
Start with requirements in GitHub Issues or Notion +2. Create a spec file (OpenAPI/Protobuf) as the contract +3. Add architecture diagrams using Mermaid.js +4. Write a walkthrough explaining the "why" behind decisions +5. Implement code following the spec +6. Add automated tests that validate the spec +7. Create runbooks for deployment and scaling + +This framework ensures that every piece of documentation serves a clear purpose and reaches the right audience. \ No newline at end of file diff --git a/docs/SDD_FRAMEWORK.md b/docs/SDD_FRAMEWORK.md deleted file mode 100644 index fde8c7d..0000000 --- a/docs/SDD_FRAMEWORK.md +++ /dev/null @@ -1,152 +0,0 @@ -# Software Design Document (SDD) Framework - -A structured taxonomy of artifacts for comprehensive software documentation and design. - ---- - -## Overview - -This framework categorizes Software Design Document artifacts by their purpose in the software development lifecycle. Each artifact type serves a specific role in defining **what** we build, **how** it works, and **how** we verify and document it. - ---- - -## Artifact Taxonomy - -| Document Type | "Artifact" Type | Purpose in SDD | Tooling Examples | -|---------------|-----------------|----------------|------------------| -| Requirements | User Stories / ADRs | Defines the "Business Contract" — why are we building this? 
| GitHub Issues, Jira, Architecture Decision Records (ADRs) | -| The Spec | Schema Definition | The Source of Truth — defines data structures, types, and endpoints | OpenAPI (YAML), Protobuf, Apache Arrow Schemas | -| Architecture | System Blueprint | Defines how the specs connect (e.g., Service A talks to B via NATS) | Mermaid.js (Diagrams-as-code), IcePanel, Structurizr | -| Implementation | Generated Code | The code "fills in" the logic defined by the Spec | OpenAPI Generator, TypeSafe clients (Zod, TypeScript) | -| Validation | Contract Tests | Automatically checks if the code matches the Spec | Prism (Mocking), Dredd, Schemathesis | -| Tutorial | Interactive Sandbox | Allows others to "play" with the spec without writing code | Swagger UI, Redoc, Postman Collections | - ---- - -## Purpose by Artifact Type - -### 1. Requirements (User Stories / ADRs) - -**Purpose:** Defines the "Business Contract" — why are we building this? - -**Key Questions Answered:** -- What problem are we solving? -- Who are the stakeholders? -- What are the success criteria? - -**Examples:** -- GitHub Issues -- Jira tickets -- Architecture Decision Records (ADRs) - ---- - -### 2. The Spec (Schema Definition) - -**Purpose:** The Source of Truth — defines data structures, types, and endpoints. - -**Key Questions Answered:** -- What data structures do we use? -- What are the API endpoints? -- What are the message formats? - -**Examples:** -- OpenAPI (YAML) specifications -- Protocol Buffers (Protobuf) -- Apache Arrow Schemas - ---- - -### 3. Architecture (System Blueprint) - -**Purpose:** Defines how the specs connect — the high-level structure and communication patterns. - -**Key Questions Answered:** -- How do services communicate? -- What is the deployment topology? -- What are the system boundaries? - -**Examples:** -- Mermaid.js diagrams (Diagrams-as-code) -- IcePanel -- Structurizr - ---- - -### 4. 
Implementation (Generated Code) - -**Purpose:** The code "fills in" the logic defined by the Spec. - -**Key Questions Answered:** -- How do we implement the spec? -- What libraries/frameworks do we use? -- How do we ensure type safety? - -**Examples:** -- OpenAPI Generator -- TypeSafe clients (Zod, TypeScript) - ---- - -### 5. Validation (Contract Tests) - -**Purpose:** Automatically checks if the code matches the Spec. - -**Key Questions Answered:** -- Does the implementation match the spec? -- Are there breaking changes? -- Is the contract upheld? - -**Examples:** -- Prism (Mocking) -- Dredd -- Schemathesis - ---- - -### 6. Tutorial (Interactive Sandbox) - -**Purpose:** Allows others to "play" with the spec without writing code. - -**Key Questions Answered:** -- How do I use this API? -- What are the expected inputs/outputs? -- How can I experiment safely? - -**Examples:** -- Swagger UI -- Redoc -- Postman Collections - ---- - -## Usage Guidelines - -1. **Start with Requirements** — Define the business context and decision records -2. **Define The Spec** — Create the source of truth for data and endpoints -3. **Design Architecture** — Visualize how components connect -4. **Implement** — Generate or write code based on the spec -5. **Validate** — Run contract tests to ensure alignment -6. 
**Tutorial** — Provide interactive documentation for users - ---- - -## Relationship to NATSBridge - -This NATSBridge project implements a NATS-based messaging bridge that can be used with all artifact types in this framework: - -- **Architecture:** NATS as the communication backbone -- **Spec:** Message schemas for bridge operations -- **Implementation:** Generated code for NATS message handling -- **Validation:** Contract tests for message format compliance -- **Tutorial:** Interactive examples for bridge configuration - ---- - -## References - -- Architecture Decision Records (ADRs): [http://adr.cascadely.com/](http://adr.cascadely.com/) -- OpenAPI Specification: [https://spec.openapis.org/](https://spec.openapis.org/) -- Protocol Buffers: [https://protobuf.dev/](https://protobuf.dev/) -- Mermaid.js: [https://mermaid.js.org/](https://mermaid.js.org/) -- Schemathesis: [https://schemathesis.io/](https://schemathesis.io/) \ No newline at end of file From 0fb132555bc03c3c0efb2aecfcc6155f46911ba0 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 08:26:02 +0700 Subject: [PATCH 03/29] update --- SDD_FRAMEWORK.md => docs/SDD_FRAMEWORK.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename SDD_FRAMEWORK.md => docs/SDD_FRAMEWORK.md (100%) diff --git a/SDD_FRAMEWORK.md b/docs/SDD_FRAMEWORK.md similarity index 100% rename from SDD_FRAMEWORK.md rename to docs/SDD_FRAMEWORK.md From fbd061b25300d7c08d899d0c3ae7fcbf0e04c2b9 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 09:15:47 +0700 Subject: [PATCH 04/29] update --- docs/SDD_FRAMEWORK.md | 307 +++++++++++++++--------------------------- 1 file changed, 108 insertions(+), 199 deletions(-) diff --git a/docs/SDD_FRAMEWORK.md b/docs/SDD_FRAMEWORK.md index bf5eae7..3a2418f 100644 --- a/docs/SDD_FRAMEWORK.md +++ b/docs/SDD_FRAMEWORK.md @@ -1,279 +1,188 @@ -# SDD + GitOps Documentation Stack +# SDD + GitOps Documentation Framework -A comprehensive documentation strategy for modern software 
development that aligns different types of documentation with their specific purposes, audiences, and tooling. +## Overview -## The Big Picture +The **SDD + GitOps Documentation Framework** is a comprehensive, structured approach to software development documentation that aligns technical work with business outcomes through clear separation of concerns. -This framework ensures that every piece of documentation serves a clear purpose and reaches the right audience. It emphasizes: - -- **Machine-readable truths** as the foundation for automation -- **Separation of concerns** between human-facing docs and machine-consumable contracts -- **GitOps integration** where deployment and configuration are version-controlled -- **Multi-role audience targeting** from stakeholders to DevOps +This framework ensures that every piece of documentation serves a specific purpose, reaches the right audience, and can be measured for effectiveness. It's designed to prevent common pitfalls like feature creep, communication gaps, and operational fragility. --- -## Documentation Matrix +## The Documentation Matrix -| Document | Purpose ("The Why") | Primary Audience | Format / Tooling | Example (SaaS Context) | -|----------|---------------------|------------------|------------------|------------------------| -| **Requirements** | Define business goals & user needs | Stakeholders, PM, Lead Dev | GitHub Issues, Notion | "System must support 5-member teams with real-time sync." | -| **The Spec** | The Contract. Machine-readable truth. | Developers, QA, Machines | OpenAPI, Protobuf, YAML | A `.yaml` file defining `user_id` as a UUID in snake_case. | -| **Architecture** | High-level structural blueprint | Senior Devs, DevOps | Mermaid.js, IcePanel | Diagram of SvelteKit ↔ NATS ↔ Julia 6-node cluster. | -| **Walkthrough** | The Intuition. The "Big Picture" narrative. | New Devs, The Team | Recorded Video, TOUR.md | "Why we use a Claim-Check pattern for large Arrow data." 
| -| **Implementation** | The actual logic & generated code | Developers | SvelteKit, Julia, Node.js | Auto-generated TypeScript types from the OpenAPI spec. | -| **Validation** | Automated "Contract" enforcement | CI/CD Pipelines, QA | GitHub Actions, Prism | A test that fails if the Julia API returns camelCase keys. | -| **Runbook** | Deployment, Scaling, & Recovery | DevOps, SRE | K8s Manifests, Flux | `git push` to update the replica count from 3 to 6. | +| Document | Purpose & Rationale (The "Why") | Audience | Format / Content | Measurement (KPI/SLO) | Example (SaaS Context) | +|----------|----------------------------------|----------|------------------|----------------------|------------------------| +| **Requirements** | The Business North Star. Defines exactly what problem the user has and what success looks like. It prevents "feature creep" by setting hard boundaries on what we will NOT build. | Founder, Team, PM | Format: Shared Wiki (Notion/GitHub Wiki). Content: User stories, business constraints, competitive context, and success metrics. | **KPI**: Business Outcomes. Measured by User Retention, Conversion Rates, and Monthly Recurring Revenue (MRR). | "The system must process high-volume math so clients see reports instantly. Goal: 15% increase in daily active users." | +| **Spec** | The Technical Contract. A machine-readable, strictly typed definition of all data interfaces. It is the "Single Source of Truth" that prevents bugs caused by communication gaps between services. | Developers, QA, Automation | Format: OpenAPI/YAML or Protobuf. Content: API endpoints, snake_case key naming, data validation rules, and error response codes. | **SLA/SLO**: System Performance. Measured by API Uptime (99.9%), Response Latency (<100ms), and Error Rates. | A `contract.yaml` defining exactly how Julia sends Arrow data to Node.js. It forces `user_id` to be a UUID. | +| **Architecture** | The Structural Blueprint. 
A visual map of how the components (services, DBs, networks) fit together. It shows how the data flows through the 6-node cluster and where bottlenecks live. | Senior Devs, DevOps | Format: Diagrams-as-code (Mermaid.js). Content: System Context diagrams, Database ERDs, Network Security Policies, and Infrastructure maps. | **Efficiency Metrics**: Resource utilization. Measured by CPU Load (<70%), RAM per pod, and internal network throughput. | A diagram showing the data path: Caddy (Proxy) → Node.js (API) → NATS (Queue) → Julia (Math Engine). | +| **Walkthrough** | The Intuition & Logic. A narrative guide that explains the "steps" and "rationale" behind end-to-end flows. It's about building a mental model so devs understand why the sequence matters. | The Team, New Hires | Format: TOUR.md file or Loom Video. Content: Step-by-step traces of core features, explanation of architectural trade-offs, and "The Big Picture" flow. | **Quality**: Developer Velocity. Measured by "Time-to-First-Commit" for new hires and reduction in conceptual bugs. | "End-to-End Trace": 1. UI sends JSON. 2. API wraps it in Claim-Check. 3. Julia pulls it. Rationale: To avoid NATS memory spikes. | +| **Implementation** | The Functional Reality. The actual code that does the work. In SDD, the "boring" parts (types/routes) are auto-generated from the Spec to ensure the code never lies. | Developers, Reviewers | Format: Git Repository. Content: Business logic, internal helper functions, Unit Tests, and a README.md for local environment setup. | **Code Health**: Internal Quality. Measured by Test Coverage (90%+), Linting compliance, and Cyclomatic Complexity. | The SvelteKit frontend components and the specific Julia math-processing functions. | +| **Validation** | The Enforcement Layer. Automated gates that prove the Implementation matches the Spec. It prevents human error (like changing a key name) from reaching production. | CI/CD Pipeline, QA | Format: GitHub Actions / Tests. 
Content: Contract tests (Dredd/Prism), Integration tests, and Security scans that run on every pull request. | **Compliance**: Safety Metrics. Measured by Build Success Rate and 0 "Contract Violations" in the production logs. | A CI job that blocks a Pull Request because a developer used camelCase in a database field instead of snake_case. | +| **Maintenance** | The Health & Evolution. Defines how to upgrade dependencies, manage technical debt, and rotate secrets. It's the guide for "future-proofing" the software over time. | The Team, DevOps | Format: MAINTENANCE.md. Content: Dependency update schedules, Secret rotation steps, DB Migration logs, and Tech Debt "Graveyard" tracking. | **Sustainability**: System Longevity. Measured by "Package Age", "Security Vulnerabilities Found", and "Migration Success Rate". | "Steps to upgrade the Julia version across all 6 nodes without downtime using a Blue-Green deployment strategy." | +| **Runbook** | The Operational Life-Support. The instructions for when the system is alive (or dying). In GitOps, this is the "Desired State" of the infrastructure. | DevOps, SRE, On-call Devs | Format: K8s Manifests (Flux/Argo). Content: Deployment steps, Scaling triggers, Backup/Restore procedures, and "3:00 AM" troubleshooting guides. | **Reliability**: Operational Health. Measured by MTTR (Mean Time to Recovery) and Error-Free Deployments. | A Flux manifest that ensures 6 replicas of the Julia service are always healthy and restarts them if they hit 80% RAM. | --- -## Detailed Explanations +## Detailed Document Descriptions ### 1. Requirements -**Purpose**: Define business goals & user needs. +**Purpose**: Establish the Business North Star. -**Why it matters**: Before writing code, we need to understand *why* we're building something. Requirements capture the business context, user pain points, and success criteria. 
+**Why It Matters**: Without clear requirements, teams drift into "feature creep" - building things that don't solve the actual problem. This document anchors the project in business outcomes. -**Primary Audience**: -- **Stakeholders**: Business owners who need to approve the direction -- **Product Managers**: Translate requirements into features -- **Lead Developers**: Understand scope and technical constraints - -**Format / Tooling**: -- **GitHub Issues**: Simple, version-controlled, integrated with code -- **Notion**: Rich text, collaborative, good for initial brainstorming +**Key Elements**: +- **User Stories**: What the user needs to accomplish +- **Business Constraints**: Budget, timeline, regulatory requirements +- **Competitive Context**: What competitors do and how you differentiate +- **Success Metrics**: Quantifiable goals that define "done" **Best Practices**: -- Write in user story format: "As a [role], I want [feature] so that [benefit]" -- Include acceptance criteria as checklist items -- Link to related specs and architecture decisions - -**Example**: "System must support 5-member teams with real-time sync." +- Keep it in a shared wiki (Notion, GitHub Wiki) for collaborative editing +- Focus on outcomes, not solutions +- Explicitly state what you will NOT build --- -### 2. The Spec (The Contract) +### 2. Spec (Specification) -**Purpose**: Machine-readable truth that defines the API contract. +**Purpose**: Create a machine-readable technical contract. -**Why it matters**: The spec is the single source of truth for how systems communicate. It enables code generation, automated testing, and ensures consistency across services. +**Why It Matters**: Communication gaps between services cause bugs. A strict, typed spec prevents these by being the Single Source of Truth. 
-**Primary Audience**: -- **Developers**: Implement the API according to the spec -- **QA Engineers**: Create test cases based on the spec -- **Machines**: Used for code generation, validation, and documentation - -**Format / Tooling**: -- **OpenAPI (Swagger)**: REST API specifications -- **Protobuf**: gRPC service definitions -- **YAML/JSON**: Configuration and data schema definitions +**Key Elements**: +- **API Endpoints**: All routes with HTTP methods +- **Data Types**: Strict typing with validation rules +- **Error Codes**: Comprehensive error response definitions +- **Naming Conventions**: snake_case keys, consistent patterns **Best Practices**: -- Use snake_case for consistency -- Define all fields with types and constraints -- Include examples for complex data structures -- Keep specs versioned alongside code - -**Example**: A `.yaml` file defining `user_id` as a UUID in snake_case. +- Use OpenAPI (YAML/JSON) for REST APIs or Protobuf for gRPC +- Automate generation of client/server code from the spec +- Run contract tests against the spec in CI/CD --- ### 3. Architecture -**Purpose**: High-level structural blueprint showing how components interact. +**Purpose**: Visualize the system structure and data flow. -**Why it matters**: Architecture diagrams help everyone understand the system's structure without drowning in implementation details. They're crucial for onboarding, design reviews, and long-term maintainability. +**Why It Matters**: Complex systems (like your 6-node cluster) need clear maps. Without them, teams can't identify bottlenecks or make informed decisions. 
-**Primary Audience**: -- **Senior Developers**: Design decisions and component responsibilities -- **DevOps**: Understand deployment topology and service dependencies -- **Technical Leads**: Evaluate trade-offs and scalability concerns - -**Format / Tooling**: -- **Mermaid.js**: Code-based diagrams that are version-controlled -- **IcePanel**: Interactive, automated architecture visualization -- **C4 Model**: Standardized approach to architectural diagrams +**Key Elements**: +- **System Context Diagram**: Shows the system and its external dependencies +- **Database ERD**: Entity-Relationship diagrams for data model +- **Network Security Policies**: Firewall rules, service mesh configs +- **Infrastructure Maps**: Cloud resources, scaling groups **Best Practices**: -- Focus on *relationships* between components, not implementation details -- Include technology choices (e.g., NATS vs WebSocket) -- Show data flow direction with arrows +- Use Mermaid.js for diagrams-as-code (versionable, diffable) - Update diagrams when architecture changes - -**Example**: Diagram of SvelteKit ↔ NATS ↔ Julia 6-node cluster. +- Focus on data flow and decision points --- ### 4. Walkthrough -**Purpose**: The intuition and "Big Picture" narrative. +**Purpose**: Build a mental model through narrative. -**Why it matters**: Code alone doesn't explain *why* decisions were made. Walkthroughs provide context, historical decisions, and architectural intuition that helps new developers become productive quickly. +**Why It Matters**: Code doesn't explain *why*. Walkthroughs capture the reasoning behind architectural trade-offs, making onboarding faster and reducing conceptual bugs. 
-**Primary Audience**: -- **New Developers**: Understand the system's philosophy and patterns -- **The Team**: Share context and reasoning behind design choices -- **Code Reviewers**: Evaluate design decisions alongside implementation - -**Format / Tooling**: -- **Recorded Video**: Personal, engaging, good for complex explanations -- **TOUR.md**: Markdown file with narrative walk-through of the codebase -- **Architecture Decision Records (ADRs)**: Formal documentation of key decisions +**Key Elements**: +- **Step-by-step traces**: End-to-end flow of user actions +- **Trade-off explanations**: Why you chose option A over B +- **The Big Picture**: How components fit together conceptually **Best Practices**: -- Explain *why* more than *how* -- Include anti-patterns to avoid -- Link to related documentation -- Keep walkthroughs updated with architecture changes - -**Example**: "Why we use a Claim-Check pattern for large Arrow data." +- Write in a TOUR.md file or record Loom videos +- Focus on intuition, not just mechanics +- Include "Rationale" sections for each major decision --- ### 5. Implementation -**Purpose**: The actual logic and generated code. +**Purpose**: The functional reality - the actual code. -**Why it matters**: This is the executable truth of the system. Well-structured implementation code should be clear, maintainable, and follow established patterns. +**Why It Matters**: This is what runs in production. In SDD, the spec-driven approach ensures boring parts are generated automatically, so developers focus on business logic. 
-**Primary Audience**: -- **Developers**: Read, modify, and extend the code -- **Reviewers**: Verify correctness and adherence to standards -- **CI/CD**: Run tests and builds - -**Format / Tooling**: -- **SvelteKit**: Frontend framework with server-side rendering -- **Julia**: High-performance numerical computing -- **Node.js**: Backend services and tooling +**Key Elements**: +- **Business Logic**: The unique value you provide +- **Unit Tests**: Covering edge cases and error paths +- **README.md**: Local environment setup instructions **Best Practices**: -- Generate code from specs to ensure consistency -- Use consistent naming conventions (snake_case, camelCase appropriately) -- Include unit tests alongside implementation -- Document complex algorithms with inline comments - -**Example**: Auto-generated TypeScript types from the OpenAPI spec. +- Generate boilerplate (types, routes) from the Spec +- Maintain 90%+ test coverage +- Keep README.md up-to-date for local development --- ### 6. Validation -**Purpose**: Automated "Contract" enforcement. +**Purpose**: Automated quality gates. -**Why it matters**: Automated tests ensure that the system behaves as specified and prevent regressions. Validation in CI/CD pipelines catches issues before they reach production. +**Why It Matters**: Human error happens. Validation layers catch mistakes before they reach production, preventing contract violations and security issues. 
-**Primary Audience**: -- **CI/CD Pipelines**: Run tests automatically on every commit -- **QA Engineers**: Verify system behavior against requirements -- **Developers**: Get immediate feedback on changes - -**Format / Tooling**: -- **GitHub Actions**: Automated testing and validation workflows -- **Prism (ReadMe)**: OpenAPI spec validation in CI -- **Jest/Vitest**: JavaScript testing framework -- **Pytest**: Python testing framework +**Key Elements**: +- **Contract Tests**: Verify implementation matches spec (Dredd, Prism) +- **Integration Tests**: Test service-to-service interactions +- **Security Scans**: SAST/SBOM analysis on every PR **Best Practices**: -- Test the contract (spec) not just implementation details -- Use contract testing (PACT) for service-to-service validation -- Fail fast: tests should run quickly and provide clear error messages -- Include negative test cases (invalid inputs, edge cases) - -**Example**: A test that fails if the Julia API returns camelCase keys. +- Run validation on every pull request +- Block merges on contract violations +- Track build success rate as a KPI --- -### 7. Runbook +### 7. Maintenance -**Purpose**: Deployment, scaling, and recovery procedures. +**Purpose**: Guide for long-term health and evolution. -**Why it matters**: Runbooks ensure that deployments are consistent, repeatable, and recoverable. In GitOps, the runbook *is* the configuration, version-controlled alongside the code. +**Why It Matters**: Software decays. Without a maintenance plan, dependency upgrades become risky, secrets accumulate, and technical debt piles up. 
-**Primary Audience**: -- **DevOps Engineers**: Execute deployments and scaling operations -- **SREs**: Manage system reliability and incident response -- **Developers**: Deploy feature branches for testing - -**Format / Tooling**: -- **Kubernetes Manifests**: Declarative deployment configurations -- **Flux**: GitOps operator for Kubernetes -- **Helm Charts**: Package management for Kubernetes -- **Docker Compose**: Local development environments +**Key Elements**: +- **Dependency Update Schedule**: When and how to upgrade packages +- **Secret Rotation Steps**: How to rotate credentials securely +- **DB Migration Logs**: History of schema changes +- **Tech Debt "Graveyard"**: Documented technical debt with remediation plans **Best Practices**: -- Use Git as the source of truth (GitOps) -- Make deployments idempotent (running twice has same effect) -- Include rollback procedures -- Document scaling procedures for different load levels - -**Example**: `git push` to update the replica count from 3 to 6. +- Document the "how" for common maintenance tasks +- Track package age and security vulnerabilities +- Schedule regular tech debt reviews --- -## How the Stack Fits Together +### 8. 
Runbook -``` -┌─────────────────────────────────────────────────────────────┐ -│ Requirements │ -│ (Business goals, user needs) │ -└───────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ The Spec │ -│ (Machine-readable contract: OpenAPI, Protobuf) │ -└───────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ Architecture │ -│ (Structural blueprint: Mermaid, IcePanel) │ -└───────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ Walkthrough │ -│ (Intuition, big picture narrative) │ -└───────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ Implementation │ -│ (Actual code: SvelteKit, Julia, Node.js) │ -└───────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ Validation │ -│ (Automated tests: GitHub Actions, Prism) │ -└───────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ Runbook │ -│ (Deployment, scaling: K8s, Flux) │ -└─────────────────────────────────────────────────────────────┘ -``` +**Purpose**: Operational life-support for production systems. -## Key Principles +**Why It Matters**: When production is down, teams need clear instructions. In GitOps, the runbook is the "desired state" that the system constantly works toward. -1. **Machine-Readable Truth**: Specs and configurations should be machine-readable to enable automation -2. **Separation of Concerns**: Different audiences need different types of information -3. **Version Control**: All documentation should be in Git, just like code -4. **Automation-First**: Validation should be automated and integrated into CI/CD -5. 
**Living Documentation**: Documentation should evolve with the codebase +**Key Elements**: +- **Deployment Steps**: How to deploy new versions +- **Scaling Triggers**: When and how to scale up/down +- **Backup/Restore Procedures**: Disaster recovery steps +- **"3:00 AM" Troubleshooting**: Quick fixes for common failures -## Getting Started +**Best Practices**: +- Store in K8s manifests (Flux/Argo) for GitOps +- Automate as much as possible +- Test runbook procedures regularly -To adopt this stack in your project: +--- -1. Start with requirements in GitHub Issues or Notion -2. Create a spec file (OpenAPI/Protobuf) as the contract -3. Add architecture diagrams using Mermaid.js -4. Write a walkthrough explaining the "why" behind decisions -5. Implement code following the spec -6. Add automated tests that validate the spec -7. Create runbooks for deployment and scaling +## How to Use This Framework -This framework ensures that every piece of documentation serves a clear purpose and reaches the right audience. \ No newline at end of file +1. **Start with Requirements** - Define the business problem and success criteria +2. **Create the Spec** - Translate requirements into machine-readable contracts +3. **Design Architecture** - Visualize how the system will work +4. **Write Walkthrough** - Document the logic and trade-offs +5. **Implement** - Build the actual code +6. **Set up Validation** - Add automated tests and gates +7. **Document Maintenance** - Plan for long-term health +8. **Create Runbook** - Define operational procedures + +This framework ensures that every document serves a clear purpose and that your project remains maintainable, scalable, and aligned with business goals. 
\ No newline at end of file From 437ca81e76b12fcfe4ba54405a1984cf3f98c7b9 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 09:47:10 +0700 Subject: [PATCH 05/29] update --- docs/SDD_FRAMEWORK.md | 305 ++++++++++++++++++++++++++++-------------- 1 file changed, 206 insertions(+), 99 deletions(-) diff --git a/docs/SDD_FRAMEWORK.md b/docs/SDD_FRAMEWORK.md index 3a2418f..099a17b 100644 --- a/docs/SDD_FRAMEWORK.md +++ b/docs/SDD_FRAMEWORK.md @@ -2,187 +2,294 @@ ## Overview -The **SDD + GitOps Documentation Framework** is a comprehensive, structured approach to software development documentation that aligns technical work with business outcomes through clear separation of concerns. +The **SDD (Software Design Documentation) + GitOps Documentation Framework** is a comprehensive, structured approach to software development documentation that aligns technical work with business outcomes through clear separation of concerns. -This framework ensures that every piece of documentation serves a specific purpose, reaches the right audience, and can be measured for effectiveness. It's designed to prevent common pitfalls like feature creep, communication gaps, and operational fragility. +This framework ensures that every piece of documentation serves a specific purpose, reaches the right audience, and is measurable through clear KPIs and SLOs. --- ## The Documentation Matrix | Document | Purpose & Rationale (The "Why") | Audience | Format / Content | Measurement (KPI/SLO) | Example (SaaS Context) | -|----------|----------------------------------|----------|------------------|----------------------|------------------------| -| **Requirements** | The Business North Star. Defines exactly what problem the user has and what success looks like. It prevents "feature creep" by setting hard boundaries on what we will NOT build. | Founder, Team, PM | Format: Shared Wiki (Notion/GitHub Wiki). Content: User stories, business constraints, competitive context, and success metrics. 
| **KPI**: Business Outcomes. Measured by User Retention, Conversion Rates, and Monthly Recurring Revenue (MRR). | "The system must process high-volume math so clients see reports instantly. Goal: 15% increase in daily active users." | -| **Spec** | The Technical Contract. A machine-readable, strictly typed definition of all data interfaces. It is the "Single Source of Truth" that prevents bugs caused by communication gaps between services. | Developers, QA, Automation | Format: OpenAPI/YAML or Protobuf. Content: API endpoints, snake_case key naming, data validation rules, and error response codes. | **SLA/SLO**: System Performance. Measured by API Uptime (99.9%), Response Latency (<100ms), and Error Rates. | A `contract.yaml` defining exactly how Julia sends Arrow data to Node.js. It forces `user_id` to be a UUID. | -| **Architecture** | The Structural Blueprint. A visual map of how the components (services, DBs, networks) fit together. It shows how the data flows through the 6-node cluster and where bottlenecks live. | Senior Devs, DevOps | Format: Diagrams-as-code (Mermaid.js). Content: System Context diagrams, Database ERDs, Network Security Policies, and Infrastructure maps. | **Efficiency Metrics**: Resource utilization. Measured by CPU Load (<70%), RAM per pod, and internal network throughput. | A diagram showing the data path: Caddy (Proxy) → Node.js (API) → NATS (Queue) → Julia (Math Engine). | -| **Walkthrough** | The Intuition & Logic. A narrative guide that explains the "steps" and "rationale" behind end-to-end flows. It's about building a mental model so devs understand why the sequence matters. | The Team, New Hires | Format: TOUR.md file or Loom Video. Content: Step-by-step traces of core features, explanation of architectural trade-offs, and "The Big Picture" flow. | **Quality**: Developer Velocity. Measured by "Time-to-First-Commit" for new hires and reduction in conceptual bugs. | "End-to-End Trace": 1. UI sends JSON. 2. 
API wraps it in Claim-Check. 3. Julia pulls it. Rationale: To avoid NATS memory spikes. | -| **Implementation** | The Functional Reality. The actual code that does the work. In SDD, the "boring" parts (types/routes) are auto-generated from the Spec to ensure the code never lies. | Developers, Reviewers | Format: Git Repository. Content: Business logic, internal helper functions, Unit Tests, and a README.md for local environment setup. | **Code Health**: Internal Quality. Measured by Test Coverage (90%+), Linting compliance, and Cyclomatic Complexity. | The SvelteKit frontend components and the specific Julia math-processing functions. | -| **Validation** | The Enforcement Layer. Automated gates that prove the Implementation matches the Spec. It prevents human error (like changing a key name) from reaching production. | CI/CD Pipeline, QA | Format: GitHub Actions / Tests. Content: Contract tests (Dredd/Prism), Integration tests, and Security scans that run on every pull request. | **Compliance**: Safety Metrics. Measured by Build Success Rate and 0 "Contract Violations" in the production logs. | A CI job that blocks a Pull Request because a developer used camelCase in a database field instead of snake_case. | -| **Maintenance** | The Health & Evolution. Defines how to upgrade dependencies, manage technical debt, and rotate secrets. It's the guide for "future-proofing" the software over time. | The Team, DevOps | Format: MAINTENANCE.md. Content: Dependency update schedules, Secret rotation steps, DB Migration logs, and Tech Debt "Graveyard" tracking. | **Sustainability**: System Longevity. Measured by "Package Age", "Security Vulnerabilities Found", and "Migration Success Rate". | "Steps to upgrade the Julia version across all 6 nodes without downtime using a Blue-Green deployment strategy." | -| **Runbook** | The Operational Life-Support. The instructions for when the system is alive (or dying). In GitOps, this is the "Desired State" of the infrastructure. 
| DevOps, SRE, On-call Devs | Format: K8s Manifests (Flux/Argo). Content: Deployment steps, Scaling triggers, Backup/Restore procedures, and "3:00 AM" troubleshooting guides. | **Reliability**: Operational Health. Measured by MTTR (Mean Time to Recovery) and Error-Free Deployments. | A Flux manifest that ensures 6 replicas of the Julia service are always healthy and restarts them if they hit 80% RAM. | +|----------|---------------------------------|----------|------------------|----------------------|------------------------| +| **Requirements** | The Business North Star. Defines exactly what problem the user has and what success looks like. It prevents "feature creep" by setting hard boundaries on what we will NOT build. | Founder, Team, PM | Format: Shared Wiki (Notion/GitHub Wiki). Content: User stories, business constraints, competitive context, and success metrics. | KPI: Business Outcomes. Measured by User Retention, Conversion Rates, and Monthly Recurring Revenue (MRR). | "The system must process high-volume math so clients see reports instantly. Goal: 15% increase in daily active users." | +| **Spec** | The Technical Contract. A machine-readable, strictly typed definition of all data interfaces. It is the "Single Source of Truth" that prevents bugs caused by communication gaps between services. | Developers, QA, Automation | Format: OpenAPI/YAML or Protobuf. Content: API endpoints, snake_case key naming, data validation rules, and error response codes. | SLA/SLO: System Performance. Measured by API Uptime (99.9%), Response Latency (<100ms), and Error Rates. | A `contract.yaml` defining exactly how Julia sends Arrow data to Node.js. It forces `user_id` to be a UUID. | +| **Architecture** | The Structural Blueprint. A visual map of how the components (services, DBs, networks) fit together. It shows how the data flows through the 6-node cluster and where bottlenecks live. | Senior Devs, DevOps | Format: Diagrams-as-code (Mermaid.js). 
Content: System Context diagrams, Database ERDs, Network Security Policies, and Infrastructure maps. | Efficiency Metrics: Resource utilization. Measured by CPU Load (<70%), RAM per pod, and internal network throughput. | A diagram showing the data path: Caddy (Proxy) → Node.js (API) → NATS (Queue) → Julia (Math Engine). | +| **Walkthrough** | The Intuition & Logic. A narrative guide that explains the "steps" and "rationale" behind end-to-end flows. It's about building a mental model so devs understand why the sequence matters. | The Team, New Hires | Format: TOUR.md file or Loom Video. Content: Step-by-step traces of core features, explanation of architectural trade-offs, and "The Big Picture" flow. | Quality: Developer Velocity. Measured by "Time-to-First-Commit" for new hires and reduction in conceptual bugs. | "End-to-End Trace:" 1. UI sends JSON. 2. API wraps it in Claim-Check. 3. Julia pulls it. Rationale: To avoid NATS memory spikes. | +| **Implementation** | The Functional Reality. The actual code that does the work. In SDD, the "boring" parts (types/routes) are auto-generated from the Spec to ensure the code never lies. | Developers, Reviewers | Format: Git Repository. Content: Business logic, internal helper functions, Unit Tests, and a README.md for local environment setup. | Code Health: Internal Quality. Measured by Test Coverage (90%+), Linting compliance, and Cyclomatic Complexity. | The SvelteKit frontend components and the specific Julia math-processing functions. | +| **Validation** | The Enforcement Layer. Automated gates that prove the Implementation matches the Spec. It prevents human error (like changing a key name) from reaching production. | CI/CD Pipeline, QA | Format: GitHub Actions / Tests. Content: Contract tests (Dredd/Prism), Integration tests, and Security scans that run on every pull request. | Compliance: Safety Metrics. Measured by Build Success Rate and 0 "Contract Violations" in the production logs. 
| A CI job that blocks a Pull Request because a developer used camelCase in a database field instead of snake_case. | +| **Maintenance** | The Health & Evolution. Defines how to upgrade dependencies, manage technical debt, and rotate secrets. It's the guide for "future-proofing" the software over time. | The Team, DevOps | Format: MAINTENANCE.md. Content: Dependency update schedules, Secret rotation steps, DB Migration logs, and Tech Debt "Graveyard" tracking. | Sustainability: System Longevity. Measured by "Package Age," "Security Vulnerabilities Found," and "Migration Success Rate." | "Steps to upgrade the Julia version across all 6 nodes without downtime using a Blue-Green deployment strategy." | +| **Runbook** | The Operational Life-Support. The instructions for when the system is alive (or dying). In GitOps, this is the "Desired State" of the infrastructure. | DevOps, SRE, On-call Devs | Format: K8s Manifests (Flux/Argo). Content: Deployment steps, Scaling triggers, Backup/Restore procedures, and "3:00 AM" troubleshooting guides. | Reliability: Operational Health. Measured by MTTR (Mean Time to Recovery) and Error-Free Deployments. | A Flux manifest that ensures 6 replicas of the Julia service are always healthy and restarts them if they hit 80% RAM. | --- -## Detailed Document Descriptions +## Detailed Breakdown of Each Document Type ### 1. Requirements -**Purpose**: Establish the Business North Star. +**Purpose**: Establish the Business North Star -**Why It Matters**: Without clear requirements, teams drift into "feature creep" - building things that don't solve the actual problem. This document anchors the project in business outcomes. +The Requirements document is your anchor point. It answers the fundamental question: "What problem are we solving, and how do we know we've succeeded?" 
-**Key Elements**: -- **User Stories**: What the user needs to accomplish -- **Business Constraints**: Budget, timeline, regulatory requirements -- **Competitive Context**: What competitors do and how you differentiate -- **Success Metrics**: Quantifiable goals that define "done" +**Key Characteristics**: +- **Business-Focused**: Written in business terms, not technical jargon +- **Boundary-Setting**: Explicitly defines what we will NOT build +- **Outcome-Oriented**: Focuses on user outcomes, not features **Best Practices**: -- Keep it in a shared wiki (Notion, GitHub Wiki) for collaborative editing -- Focus on outcomes, not solutions -- Explicitly state what you will NOT build +- Include user stories that describe the user's perspective +- Document business constraints (regulatory, legal, compliance) +- Define competitive context and market positioning +- Establish clear success metrics from day one + +**Common Pitfalls to Avoid**: +- Vague descriptions like "improve user experience" +- Changing requirements without updating the document +- Not defining what's out of scope --- ### 2. Spec (Specification) -**Purpose**: Create a machine-readable technical contract. +**Purpose**: Create the Technical Contract -**Why It Matters**: Communication gaps between services cause bugs. A strict, typed spec prevents these by being the Single Source of Truth. +The Spec serves as the Single Source of Truth for all data interfaces. It's a machine-readable definition that ensures consistency across services. 
-**Key Elements**: -- **API Endpoints**: All routes with HTTP methods -- **Data Types**: Strict typing with validation rules -- **Error Codes**: Comprehensive error response definitions -- **Naming Conventions**: snake_case keys, consistent patterns +**Key Characteristics**: +- **Machine-Readable**: Can be parsed by tools for validation and code generation +- **Strictly Typed**: Enforces data types and validation rules +- **Comprehensive**: Covers all endpoints, request/response formats, and error codes **Best Practices**: -- Use OpenAPI (YAML/JSON) for REST APIs or Protobuf for gRPC -- Automate generation of client/server code from the spec -- Run contract tests against the spec in CI/CD +- Use OpenAPI/Swagger for REST APIs or Protobuf for gRPC +- Enforce consistent naming conventions (e.g., snake_case) +- Define validation rules for all data fields +- Document all possible error responses + +**Common Pitfalls to Avoid**: +- Letting the spec diverge from the implementation +- Incomplete error handling documentation +- Not versioning the API spec --- ### 3. Architecture -**Purpose**: Visualize the system structure and data flow. +**Purpose**: Visualize the System Structure -**Why It Matters**: Complex systems (like your 6-node cluster) need clear maps. Without them, teams can't identify bottlenecks or make informed decisions. +The Architecture document provides a visual map of how components fit together. It helps identify bottlenecks and understand data flow. 
-**Key Elements**: -- **System Context Diagram**: Shows the system and its external dependencies -- **Database ERD**: Entity-Relationship diagrams for data model -- **Network Security Policies**: Firewall rules, service mesh configs -- **Infrastructure Maps**: Cloud resources, scaling groups +**Key Characteristics**: +- **Visual**: Uses diagrams to represent complex relationships +- **Comprehensive**: Covers system context, data flow, and infrastructure +- **Living Document**: Updated as the system evolves **Best Practices**: -- Use Mermaid.js for diagrams-as-code (versionable, diffable) -- Update diagrams when architecture changes -- Focus on data flow and decision points +- Use Mermaid.js for diagrams-as-code (versionable in Git) +- Include multiple views: System Context, C4 model, ERDs, network topology +- Document trade-offs and architectural decisions +- Show data flow through the system + +**Common Pitfalls to Avoid**: +- Over-engineering diagrams with unnecessary detail +- Not updating diagrams when the architecture changes +- Using static images instead of diagrams-as-code --- ### 4. Walkthrough -**Purpose**: Build a mental model through narrative. +**Purpose**: Build Mental Models -**Why It Matters**: Code doesn't explain *why*. Walkthroughs capture the reasoning behind architectural trade-offs, making onboarding faster and reducing conceptual bugs. +The Walkthrough document explains the "why" behind the "how." It helps developers understand the rationale behind design decisions. 
-**Key Elements**: -- **Step-by-step traces**: End-to-end flow of user actions -- **Trade-off explanations**: Why you chose option A over B -- **The Big Picture**: How components fit together conceptually +**Key Characteristics**: +- **Narrative-Driven**: Tells a story about how the system works +- **Context-Rich**: Explains trade-offs and decisions +- **End-to-End**: Traces flows from user input to system output **Best Practices**: -- Write in a TOUR.md file or record Loom videos -- Focus on intuition, not just mechanics -- Include "Rationale" sections for each major decision +- Document step-by-step traces of core features +- Explain architectural trade-offs and why you chose them +- Include "The Big Picture" context +- Use real examples and data flows + +**Common Pitfalls to Avoid**: +- Only documenting the happy path +- Assuming developers will figure out the "why" +- Not explaining the rationale behind decisions --- ### 5. Implementation -**Purpose**: The functional reality - the actual code. +**Purpose**: The Functional Reality -**Why It Matters**: This is what runs in production. In SDD, the spec-driven approach ensures boring parts are generated automatically, so developers focus on business logic. +The Implementation is the actual code that does the work. In SDD, the "boring" parts are auto-generated from the Spec to ensure consistency. 
-**Key Elements**:
-- **Business Logic**: The unique value you provide
-- **Unit Tests**: Covering edge cases and error paths
-- **README.md**: Local environment setup instructions
+**Key Characteristics**:
+- **Machine-Generated**: Types and routes auto-generated from Spec
+- **Human-Written**: Business logic and helper functions
+- **Tested**: Includes unit and integration tests
 
 **Best Practices**:
-- Generate boilerplate (types, routes) from the Spec
-- Maintain 90%+ test coverage
-- Keep README.md up-to-date for local development
+- Auto-generate boring parts (types, routes) from the Spec
+- Keep business logic separate from boilerplate
+- Maintain comprehensive test coverage
+- Document the local development setup
+
+**Common Pitfalls to Avoid**:
+- Hand-writing types that should be auto-generated
+- Inconsistent code style
+- Insufficient test coverage
 
 ---
 
 ### 6. Validation
 
-**Purpose**: Automated quality gates.
+**Purpose**: Enforce the Contract
 
-**Why It Matters**: Human error happens. Validation layers catch mistakes before they reach production, preventing contract violations and security issues.
+The Validation layer provides automated gates that ensure the Implementation matches the Spec. It prevents human error from reaching production.
 
-**Key Elements**:
-- **Contract Tests**: Verify implementation matches spec (Dredd, Prism)
-- **Integration Tests**: Test service-to-service interactions
-- **Security Scans**: SAST/SBOM analysis on every PR
+**Key Characteristics**:
+- **Automated**: Runs on every commit/Pull Request
+- **Comprehensive**: Covers contract tests, integration tests, and security scans
+- **Blocking**: Prevents merges that violate the contract
 
 **Best Practices**:
-- Run validation on every pull request
-- Block merges on contract violations
-- Track build success rate as a KPI
+- Use contract testing tools (Dredd, Prism) to validate API contracts
+- Run integration tests on every commit
+- Include security scans in the CI pipeline
+- Fail builds on contract violations
+
+**Common Pitfalls to Avoid**:
+- Not running tests on every commit
+- Allowing manual overrides of validation gates
+- Not updating tests when the Spec changes
 
 ---
 
 ### 7. Maintenance
 
-**Purpose**: Guide for long-term health and evolution.
+**Purpose**: Ensure Long-Term Health
 
-**Why It Matters**: Software decays. Without a maintenance plan, dependency upgrades become risky, secrets accumulate, and technical debt piles up.
+The Maintenance document defines how to upgrade dependencies, manage technical debt, and rotate secrets. It's the guide for "future-proofing" the software.
 
-**Key Elements**:
-- **Dependency Update Schedule**: When and how to upgrade packages
-- **Secret Rotation Steps**: How to rotate credentials securely
-- **DB Migration Logs**: History of schema changes
-- **Tech Debt "Graveyard"**: Documented technical debt with remediation plans
+**Key Characteristics**:
+- **Procedural**: Step-by-step instructions for common tasks
+- **Scheduled**: Includes regular maintenance windows
+- **Documented**: Tracks technical debt and migration history
 
 **Best Practices**:
-- Document the "how" for common maintenance tasks
-- Track package age and security vulnerabilities
-- Schedule regular tech debt reviews
+- Document dependency update schedules
+- Create secret rotation procedures
+- Track technical debt in a "Graveyard"
+- Document migration history and rollback procedures
+
+**Common Pitfalls to Avoid**:
+- Ad-hoc upgrades without documentation
+- Ignoring technical debt until it becomes critical
+- Not testing upgrades in staging first
 
 ---
 
 ### 8. Runbook
 
-**Purpose**: Operational life-support for production systems.
+**Purpose**: Operational Life-Support
 
-**Why It Matters**: When production is down, teams need clear instructions. In GitOps, the runbook is the "desired state" that the system constantly works toward.
+The Runbook provides instructions for when the system is alive (or dying). In GitOps, this is the "Desired State" of the infrastructure.
 
-**Key Elements**:
-- **Deployment Steps**: How to deploy new versions
-- **Scaling Triggers**: When and how to scale up/down
-- **Backup/Restore Procedures**: Disaster recovery steps
-- **"3:00 AM" Troubleshooting**: Quick fixes for common failures
+**Key Characteristics**:
+- **Action-Oriented**: Step-by-step instructions for common operations
+- **Automated**: Infrastructure as code defines the desired state
+- **Crisis-Ready**: Includes "3:00 AM" troubleshooting guides
 
 **Best Practices**:
-- Store in K8s manifests (Flux/Argo) for GitOps
-- Automate as much as possible
-- Test runbook procedures regularly
+- Document deployment procedures
+- Define scaling triggers and procedures
+- Include backup and restore procedures
+- Create troubleshooting guides for common issues
+
+**Common Pitfalls to Avoid**:
+- Not documenting procedures for common issues
+- Not testing runbook procedures
+- Not versioning runbooks with the infrastructure
 
 ---
 
-## How to Use This Framework
+## How to Use This Approach Effectively
 
-1. **Start with Requirements** - Define the business problem and success criteria
-2. **Create the Spec** - Translate requirements into machine-readable contracts
-3. **Design Architecture** - Visualize how the system will work
-4. **Write Walkthrough** - Document the logic and trade-offs
-5. **Implement** - Build the actual code
-6. **Set up Validation** - Add automated tests and gates
-7. **Document Maintenance** - Plan for long-term health
-8. **Create Runbook** - Define operational procedures
+### Phase 1: Foundation (Week 1-2)
 
-This framework ensures that every document serves a clear purpose and that your project remains maintainable, scalable, and aligned with business goals.
\ No newline at end of file
+1. **Create Requirements Document**
+   - Define the Business North Star
+   - Establish success metrics
+   - Define out-of-scope items
+
+2. **Write the Spec**
+   - Define all data interfaces
+   - Establish naming conventions
+   - Document validation rules
+
+3. **Design Architecture**
+   - Create system diagrams
+   - Document data flow
+   - Identify potential bottlenecks
+
+### Phase 2: Development (Week 3+)
+
+4. **Write Walkthrough**
+   - Document end-to-end flows
+   - Explain architectural trade-offs
+   - Create mental models for developers
+
+5. **Implement Code**
+   - Auto-generate boring parts from Spec
+   - Write business logic
+   - Implement tests
+
+### Phase 3: Quality Assurance
+
+6. **Set Up Validation**
+   - Configure CI/CD pipeline
+   - Set up contract testing
+   - Configure security scans
+
+7. **Create Runbook**
+   - Document deployment procedures
+   - Define scaling triggers
+   - Create troubleshooting guides
+
+### Phase 4: Maintenance
+
+8. **Document Maintenance**
+   - Create dependency update schedule
+   - Document secret rotation
+   - Track technical debt
+
+---
+
+## Key Principles for Success
+
+1. **Separation of Concerns**: Keep business concerns separate from technical concerns
+2. **Machine-Readable Contracts**: Use OpenAPI/Protobuf for specs to enable automation
+3. **Automation**: Automate boring parts and validation to reduce human error
+4. **Measurability**: Every document should have measurable outcomes
+5. **Version Control**: Keep all documentation in Git for history and collaboration
+6. **Living Documents**: Update documentation as the system evolves
+7. **Audience-Focused**: Write for the intended audience's needs and knowledge level
+
+---
+
+## Conclusion
+
+The SDD + GitOps Documentation Framework provides a comprehensive, structured approach to software development documentation. By following this framework, teams can ensure that:
+
+- Business goals are clearly defined and measurable
+- Technical contracts are machine-readable and enforced
+- System architecture is visualized and understood
+- Developers have clear mental models of the system
+- Code quality is maintained through automation
+- Operations are reliable and repeatable
+
+This framework is not just about documentation—it's about creating a shared understanding across the entire team and ensuring that every decision is aligned with business goals.
\ No newline at end of file

From e974dc5fdbb076eeacba64a4036c7c202c711893 Mon Sep 17 00:00:00 2001
From: narawat
Date: Fri, 13 Mar 2026 13:15:01 +0700
Subject: [PATCH 06/29] update

---
 AI_prompt.txt | 103 --------
 DO_NOT_READ_AI_prompt.txt | 154 ++++++++++++
 docs/SDD_FRAMEWORK.md | 509 +++++++++++++++++++++++---------------
 3 files changed, 462 insertions(+), 304 deletions(-)
 delete mode 100644 AI_prompt.txt
 create mode 100644 DO_NOT_READ_AI_prompt.txt

diff --git a/AI_prompt.txt b/AI_prompt.txt
deleted file mode 100644
index 2df5aab..0000000
--- a/AI_prompt.txt
+++ /dev/null
@@ -1,103 +0,0 @@
-Consider the following scenarios:
-Scenario 1: The "Command & Control" Loop (Low Latency)Focus: Small payloads, Core NATS, bi-directional JSON.The Action: A user on a JavaScript dashboard clicks a "Start Simulation" button. This sends a JSON configuration (parameters like step_size and iterations) to Julia.The Flow: * JS (Sender): Recognizes the message is small ($< 10KB$). Packages it as a direct transport JSON envelope.Julia (Receiver): Listens on the NATS subject, decodes the JSON, and immediately acknowledges receipt with a "Running" status.Project Requirement Met: Fast, low-overhead communication for control signals without involving the fileserver.
-Scenario 2: The "Deep Dive" Analysis (High Bandwidth)Focus: Large Arrow tables, Claim-Check pattern, Julia to JS.The Action: Julia finishes a heavy computation and produces a 500MB DataFrame with 10 million rows. It needs to send this to the JS frontend for visualization (e.g., using Perspective.js or D3).The Flow:Julia (Sender): Converts the DataFrame to an Arrow IPC stream. It sees the size is $> 1MB$, so it uploads the bytes to the HTTP fileserver. It then publishes a NATS message with transport: "link" and the URL.JS (Receiver): Receives the URL, fetches the data via fetch(), and uses tableFromIPC() to load the data into memory with zero-copy.Project Requirement Met: Handling massive datasets that exceed NATS message limits while maintaining data integrity across languages.
-Scenario 3: Live Audio/Signal Processing (Multimedia & Metadata)Focus: Raw binary, bi-directional streaming, NATS Headers.The Action: The JS client captures a 2-second "chunk" of microphone audio. It needs Julia to perform a Fast Fourier Transform (FFT) or AI transcription.The Flow:JS (Sender): Sends the raw binary WAV/PCM data. It uses NATS Headers to store the metadata ($fs = 44.1kHz$, $channels = 1$) to keep the payload purely binary.Julia (Receiver): Processes the audio and sends back a JSON result (the transcription) and an Arrow Table (the frequency spectrum data).Project Requirement Met: Bi-directional flow involving mixed media (Audio) and technical results (Arrow).
-Scenario 4: The "Catch-Up" (Persistence & JetStream)Focus: NATS JetStream, late-joining consumers, state sync.The Action: Julia is constantly publishing "System Health" updates. The JS dashboard is closed for 10 minutes. When the user re-opens the dashboard, they need to see the last 10 minutes of history.The Flow:NATS (Server): Uses a JetStream with a Limits retention policy.JS (Consumer): Connects and requests a "Replay" from the last 10 minutes. It receives a mix of direct (small updates) and link (historical snapshots) messages.Project Requirement Met: Temporal decoupling—consumers can receive data that was sent while they were offline.
-
-Role: Principal Systems Architect & Lead Software Engineer.Objective: Implement a high-performance, bi-directional data bridge between a Julia service and a JavaScript (Node.js) service using NATS (Core & JetStream).⚠️ STRICT ARCHITECTURAL CONSTRAINTS (Non-Negotiable)Transport Strategy (Claim-Check Pattern):Direct Path: If payload is < 1MB, send data directly via NATS inside the message envelope (Base64 encoded).Link Path: If payload is > 1MB, upload to a shared HTTP fileserver/store. The NATS message must only contain the metadata and the download URL.Tabular Data Format: * MUST use Apache Arrow IPC Stream for all tables/DataFrames. No CSV or standard JSON-serialization of tables allowed.System Symmetry: * Both services must function as Producers AND Consumers.Modular Elegance: * Implementation must be abstracted into a SmartSend function and a SmartReceive handler. The developer calling these functions should not need to care if the data is going via NATS direct or HTTP link.Technical Stack & Use CasesJulia: NATS.jl, Arrow.jl, JSON3.jl, HTTP.jl.Node.js: nats.js, apache-arrow.Scenarios to Support: * Large Data: Sending a 500MB Arrow table from Julia $\rightarrow$ JS.Media: Sending a 5MB WAV file from JS $\rightarrow$ Julia.Signals: Sending small JSON control commands ($< 10KB$) directly via NATS.Implementation Requirements1. Unified JSON Envelope:Define a schema containing: correlation_id (UUID), type (table/binary/json), transport (direct/link), payload (if direct), and url (if link).2. The Julia Module:Implement SmartSend(subject, data, type): Handles Arrow serialization to an IOBuffer, checks size, and manages HTTP uploads for large blobs.Implement SmartReceive(msg): Parses envelope, handles the HTTP fetch with Exponential Backoff (to avoid race conditions), and restores the DataFrame.Include a basic HTTP.listen server to serve as the temporary storage.3. The JavaScript Module:Implement a symmetric SmartSend using nats.js and apache-arrow.Implement a JetStream Pull Consumer for SmartReceive to ensure backpressure and memory safety.4. Performance & Reliability:Demonstrate "Zero-Copy" reading of the Arrow IPC stream on the JS side.Log the correlation_id at every stage for distributed tracing.
-
-
-
-
-
-
-
-Create a walkthrough for Julia service-A service sending a mix-content chat message to Julia service-B. the chat message must includes
-
-
-
-
-
-I updated the following:
-- NATSBridge.jl. Essentially I add NATS_connection keyword and new publish_message function to support the keyword.
-Use them and ONLY them as ground truth.
-Then update the following files accordingly:
-- architecture.md
-- implementation.md
-
-All API should be semantically consistent and naming should be consistent across the board.
-
-
-
-
-
-
-
-
-
-Task: Update NATSBridge.js to reflect recent changes in NATSBridge.jl and docs
-Context: NATSBridge.jl and docs has been updated.
-Requirements:
-Source of Truth: Treat the updated NATSBridge.jl and docs as the definitive source.
-API Consistency: Ensure the Main Package API (e.g., smartsend(), publish_message()) uses consistent naming across all three supported languages.
-Ecosystem Variance: Low-level native functions (e.g., NATS.connect(), JSON.read()) should follow the conventions of the specific language ecosystem and do not require cross-language consistency.
-
-
-
-
-
-
-
-I'm expanding this Julia package (NATSBridge) into a cross-platform project by adding a JavaScript and Python/MicroPython implementation. To ensure accuracy, the Julia src directory will serve as the ground truth, as the documentation may be outdated.
-
-My goal is to maintain interface parity at the high-level API for a consistent user experience, while ensuring the low-level implementation adheres strictly to the idiomatic conventions of each respective language (e.g., multiple dispatch in Julia vs. asynchronous, prototype, or class-based patterns in JS and Python/MicroPython)
-
-Now, help me do the following:
-1) check architecture.md for any mistake.
-
-
-
-
-Help me expands this Julia package (NATSBridge) into a cross-platform project by adding a JavaScript and Python/MicroPython implementation. To ensure accuracy, NATSBridge.jl will serve as the ground truth, as the documentation may be outdated.
-
-My goal is to maintain interface parity at the high-level API for a consistent user experience, while ensuring the low-level implementation adheres strictly to the idiomatic conventions of each respective language (e.g., multiple dispatch in Julia vs. asynchronous, prototype, or class-based patterns in JS and Python/MicroPython)
-
-Now do the following:
-1) check docs to see if there is any mistake.
-
-
-
-
-
-I'm expanding this Julia package (NATSBridge) into a cross-platform project by adding
-a JavaScript, Python and MicroPython implementation.
-The following will serve as the ground truth:
-- test_julia_mix_payloads_sender.jl
-- NATSBridge.jl
-- test_julia_mix_payloads_receiver.jl
-- architecture.md
-
-My goal is to maintain interface parity at the high-level API for a consistent user experience,
-while ensuring the low-level implementation adheres strictly to the idiomatic conventions of each
-respective language (e.g., multiple dispatch in Julia vs. asynchronous, prototype, or class-based
-patterns in JS, Python and MicroPython)
-
-Now, help me do the following:
-1) Check whether natsbridge.js needs update or it already up to date.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
diff --git a/DO_NOT_READ_AI_prompt.txt b/DO_NOT_READ_AI_prompt.txt
new file mode 100644
index 0000000..4318e8c
--- /dev/null
+++ b/DO_NOT_READ_AI_prompt.txt
@@ -0,0 +1,154 @@
+Consider the following scenarios:
+Scenario 1: The "Command & Control" Loop (Low Latency)Focus: Small payloads, Core NATS, bi-directional JSON.The Action: A user on a JavaScript dashboard clicks a "Start Simulation" button. This sends a JSON configuration (parameters like step_size and iterations) to Julia.The Flow: * JS (Sender): Recognizes the message is small ($< 10KB$). Packages it as a direct transport JSON envelope.Julia (Receiver): Listens on the NATS subject, decodes the JSON, and immediately acknowledges receipt with a "Running" status.Project Requirement Met: Fast, low-overhead communication for control signals without involving the fileserver.
+Scenario 2: The "Deep Dive" Analysis (High Bandwidth)Focus: Large Arrow tables, Claim-Check pattern, Julia to JS.The Action: Julia finishes a heavy computation and produces a 500MB DataFrame with 10 million rows. It needs to send this to the JS frontend for visualization (e.g., using Perspective.js or D3).The Flow:Julia (Sender): Converts the DataFrame to an Arrow IPC stream. It sees the size is $> 1MB$, so it uploads the bytes to the HTTP fileserver. It then publishes a NATS message with transport: "link" and the URL.JS (Receiver): Receives the URL, fetches the data via fetch(), and uses tableFromIPC() to load the data into memory with zero-copy.Project Requirement Met: Handling massive datasets that exceed NATS message limits while maintaining data integrity across languages.
+Scenario 3: Live Audio/Signal Processing (Multimedia & Metadata)Focus: Raw binary, bi-directional streaming, NATS Headers.The Action: The JS client captures a 2-second "chunk" of microphone audio. It needs Julia to perform a Fast Fourier Transform (FFT) or AI transcription.The Flow:JS (Sender): Sends the raw binary WAV/PCM data. It uses NATS Headers to store the metadata ($fs = 44.1kHz$, $channels = 1$) to keep the payload purely binary.Julia (Receiver): Processes the audio and sends back a JSON result (the transcription) and an Arrow Table (the frequency spectrum data).Project Requirement Met: Bi-directional flow involving mixed media (Audio) and technical results (Arrow).
+Scenario 4: The "Catch-Up" (Persistence & JetStream)Focus: NATS JetStream, late-joining consumers, state sync.The Action: Julia is constantly publishing "System Health" updates. The JS dashboard is closed for 10 minutes. When the user re-opens the dashboard, they need to see the last 10 minutes of history.The Flow:NATS (Server): Uses a JetStream with a Limits retention policy.JS (Consumer): Connects and requests a "Replay" from the last 10 minutes. It receives a mix of direct (small updates) and link (historical snapshots) messages.Project Requirement Met: Temporal decoupling—consumers can receive data that was sent while they were offline.
+
+Role: Principal Systems Architect & Lead Software Engineer.Objective: Implement a high-performance, bi-directional data bridge between a Julia service and a JavaScript (Node.js) service using NATS (Core & JetStream).⚠️ STRICT ARCHITECTURAL CONSTRAINTS (Non-Negotiable)Transport Strategy (Claim-Check Pattern):Direct Path: If payload is < 1MB, send data directly via NATS inside the message envelope (Base64 encoded).Link Path: If payload is > 1MB, upload to a shared HTTP fileserver/store. The NATS message must only contain the metadata and the download URL.Tabular Data Format: * MUST use Apache Arrow IPC Stream for all tables/DataFrames. No CSV or standard JSON-serialization of tables allowed.System Symmetry: * Both services must function as Producers AND Consumers.Modular Elegance: * Implementation must be abstracted into a SmartSend function and a SmartReceive handler. The developer calling these functions should not need to care if the data is going via NATS direct or HTTP link.Technical Stack & Use CasesJulia: NATS.jl, Arrow.jl, JSON3.jl, HTTP.jl.Node.js: nats.js, apache-arrow.Scenarios to Support: * Large Data: Sending a 500MB Arrow table from Julia $\rightarrow$ JS.Media: Sending a 5MB WAV file from JS $\rightarrow$ Julia.Signals: Sending small JSON control commands ($< 10KB$) directly via NATS.Implementation Requirements1. Unified JSON Envelope:Define a schema containing: correlation_id (UUID), type (table/binary/json), transport (direct/link), payload (if direct), and url (if link).2. The Julia Module:Implement SmartSend(subject, data, type): Handles Arrow serialization to an IOBuffer, checks size, and manages HTTP uploads for large blobs.Implement SmartReceive(msg): Parses envelope, handles the HTTP fetch with Exponential Backoff (to avoid race conditions), and restores the DataFrame.Include a basic HTTP.listen server to serve as the temporary storage.3. The JavaScript Module:Implement a symmetric SmartSend using nats.js and apache-arrow.Implement a JetStream Pull Consumer for SmartReceive to ensure backpressure and memory safety.4. Performance & Reliability:Demonstrate "Zero-Copy" reading of the Arrow IPC stream on the JS side.Log the correlation_id at every stage for distributed tracing.
+
+
+
+
+
+
+
+Create a walkthrough for Julia service-A service sending a mix-content chat message to Julia service-B. the chat message must includes
+
+
+
+
+
+I updated the following:
+- NATSBridge.jl. Essentially I add NATS_connection keyword and new publish_message function to support the keyword.
+Use them and ONLY them as ground truth.
+Then update the following files accordingly:
+- architecture.md
+- implementation.md
+
+All API should be semantically consistent and naming should be consistent across the board.
+
+
+
+
+
+
+
+
+
+Task: Update NATSBridge.js to reflect recent changes in NATSBridge.jl and docs
+Context: NATSBridge.jl and docs has been updated.
+Requirements:
+Source of Truth: Treat the updated NATSBridge.jl and docs as the definitive source.
+API Consistency: Ensure the Main Package API (e.g., smartsend(), publish_message()) uses consistent naming across all three supported languages.
+Ecosystem Variance: Low-level native functions (e.g., NATS.connect(), JSON.read()) should follow the conventions of the specific language ecosystem and do not require cross-language consistency.
+
+
+
+
+
+
+
+I'm expanding this Julia package (NATSBridge) into a cross-platform project by adding a JavaScript and Python/MicroPython implementation. To ensure accuracy, the Julia src directory will serve as the ground truth, as the documentation may be outdated.
+
+My goal is to maintain interface parity at the high-level API for a consistent user experience, while ensuring the low-level implementation adheres strictly to the idiomatic conventions of each respective language (e.g., multiple dispatch in Julia vs. asynchronous, prototype, or class-based patterns in JS and Python/MicroPython)
+
+Now, help me do the following:
+1) check architecture.md for any mistake.
+
+
+
+
+Help me expands this Julia package (NATSBridge) into a cross-platform project by adding a JavaScript and Python/MicroPython implementation. To ensure accuracy, NATSBridge.jl will serve as the ground truth, as the documentation may be outdated.
+
+My goal is to maintain interface parity at the high-level API for a consistent user experience, while ensuring the low-level implementation adheres strictly to the idiomatic conventions of each respective language (e.g., multiple dispatch in Julia vs. asynchronous, prototype, or class-based patterns in JS and Python/MicroPython)
+
+Now do the following:
+1) check docs to see if there is any mistake.
+
+
+
+
+
+I'm expanding this Julia package (NATSBridge) into a cross-platform project by adding
+a JavaScript, Python and MicroPython implementation.
+The following will serve as the ground truth:
+- test_julia_mix_payloads_sender.jl
+- NATSBridge.jl
+- test_julia_mix_payloads_receiver.jl
+- architecture.md
+
+My goal is to maintain interface parity at the high-level API for a consistent user experience,
+while ensuring the low-level implementation adheres strictly to the idiomatic conventions of each
+respective language (e.g., multiple dispatch in Julia vs. asynchronous, prototype, or class-based
+patterns in JS, Python and MicroPython)
+
+Now, help me do the following:
+1) Check whether natsbridge.js needs update or it already up to date.
+
+
+
+
+
+
+
+# ---------------------------------------------- 100 --------------------------------------------- #
+
+Got it — let’s rebuild your table in my own teaching style, keeping it crisp, intuitive, and easy for students to grasp. I’ll emphasize **purpose, audience, format, example, and KPI** in a way that flows like a story of how projects move from idea → contract → design → code → review → operations.
+
+---
+
+### SDD + GitOps Documentation Framework
+
+| Document | Purpose (Rationale) | Primary Audience | Format / Content | Example (SaaS Context) | Measurement (KPI) |
+|-----------------|---------------------|-----------------|------------------|------------------------|-------------------|
+| **Requirements** | Capture the **business intent** — why we’re building this and what success looks like. Defines boundaries and user‑visible outcomes. | Stakeholders, Product Owners, Lead Developers | User stories, PRDs, acceptance criteria, non‑functional constraints. | “System must process tabular data from Julia to SvelteKit UI with <200ms latency for 5‑member teams.” | 95% of requests complete <200ms (synthetic monitoring). |
+| **Specification** | The **technical contract** — precise rules for inputs, outputs, and data shape. Ensures consistency across dev and test. | Developers, QA Engineers, CI/CD pipelines | OpenAPI, Protobuf, AsyncAPI. Endpoint definitions, schemas, error codes. | `contract.yaml` defining a NATS subject that accepts Arrow streams with snake_case headers. | 100% of messages validated against spec (CI block rate). |
+| **Architecture** | The **blueprint** — how components fit together, interact, and scale. Guides system structure and trade‑offs. | Architects, Senior Developers, DevOps | C4 diagrams, Mermaid.js, component/network/storage models. | Diagram showing 6‑node cluster routing traffic via Caddy → Node.js API → Julia pods. | 100% of major decisions logged with trade‑off analysis. |
+| **Walkthrough** | The **story of flow** — shows how pieces connect end‑to‑end and why steps are sequenced. Builds intuition for new devs. | New Developers, Team Members | TOUR.md, Loom videos, sequence diagrams. Step‑by‑step traces with rationale. | “UI sends JSON → Node.js wraps Claim‑Check → Julia pulls Arrow data (prevents NATS overflow).” | New developers ship feature in <2 days (PR timeline). |
+| **Implementation** | The **real code** — business logic, helpers, tests, configs. Where design becomes executable. | Developers, Code Reviewers | Source code, README.md, unit tests, setup scripts. | Julia function for matrix calculation + SvelteKit component rendering table. | >80% unit test coverage, <5% drift from spec. |
+| **Validation** | The **enforcer** — ensures implementation matches the spec. Blocks drift and human error. | Automation servers, QA, Lead Developers | CI jobs, contract tests, linting, integration checks. | CI job rejects PR with camelCase field not allowed by YAML spec. | <1% of PRs bypass validation gates. |
+| **Runbook** | The **operational manual** — how the system lives in production, scales, and recovers. Guides on‑call engineers. | DevOps, SREs, On‑call Developers | K8s manifests, Helm charts, Markdown guides. Deployment, scaling, backup/restore, troubleshooting. | GitOps manifest ensuring 6 Julia replicas restart if memory >80%. | MTTR <15 minutes for P1 incidents. |
+
+
+
+
+
+
+
+
+# ---------------------------------------------- 100 --------------------------------------------- #
+
+SDD + GitOps Documentation Stack
+Document,"Purpose (The ""Rationale"")",Primary Audience,Format / Content,Example (SaaS Context),"Measurement (KPI)"
+Requirements,"Defines the ""Why"" and the Business Boundary. It sets the constraints and success criteria so the team knows when a feature is ""done"" from a user's perspective.","Stakeholders, Product Owners, Lead Developers","Format: User Stories, PRDs. Content: Functional goals, non-functional requirements (latency, scale), and explicit ""out-of-scope"" items.","""The system must process high-volume tabular data from Julia to the SvelteKit UI with <200ms latency for 5-member teams."",""Pass/Fail: 95% of requests complete <200ms (measured via synthetic monitoring)""
+The Spec,"The Technical Contract. It serves as the single source of truth that defines the shape of data. In SDD, this file drives code generation and automated testing.","Developers, QA Engineers, CI/CD Pipelines","Format: OpenAPI (YAML), Protobuf, AsyncAPI. Content: Endpoint definitions, strict data types, error codes, and request/response schemas.",A contract.yaml defining a NATS subject that accepts an Apache Arrow stream with specific snake_case headers.",""Schema Validation Rate: 100% of messages validated against spec (CI block rate)""
+Architecture,"The Structural Blueprint. It explains how the ""pieces"" are arranged in the cluster. It defines the relationships between services, databases, and external providers.","System Architects, Senior Developers, DevOps","Format: C4 Model Diagrams, Mermaid.js. Content: Component diagrams, network flow, storage strategy, and technology stack definitions.",A diagram showing how the 6-node cluster routes traffic through Caddy to the Node.js API and offloads heavy math to Julia pods.",""Architecture Decision Log: 100% of major decisions documented with trade-off analysis""
+Walkthrough,"The Intuition & Flow. It connects multiple APIs/services into a cohesive end-to-end story. It explains the ""steps"" and the ""rationale"" behind the sequence of operations.","New Developers, Current Team Members","Format: TOUR.md, Loom videos, Sequence Diagrams. Content: Step-by-step trace of a feature, explanation of state changes, and the ""why"" behind complex logic.","""End-to-End Trace:"" 1. UI sends JSON to Node.js. 2. Node.js wraps it in a Claim-Check. 3. Julia pulls the Arrow data. Rationale: This prevents NATS memory overflow.",""Onboarding Velocity: New developers deploy feature in <2 days (tracked via PR timeline)""
+Implementation,"The Functional Reality. This is the actual execution of the logic. In SDD, parts of this are auto-generated to ensure it never drifts from the Spec.","Developers, Code Reviewers","Format: Source Code (Git), README.md. Content: Business logic, internal helper functions, unit tests, and local setup instructions.",The Julia function that performs the matrix calculation and the SvelteKit component that renders the resulting table.",""Code Coverage: >80% unit test coverage, <5% test drift from spec""
+Validation,"The Enforcement Layer. It ensures that the ""Reality"" (Code) actually matches the ""Contract"" (Spec). It prevents human error from breaking the system.","Automation Servers, QA, Lead Developers","Format: GitHub Actions, Dredd, Prism. Content: Contract tests, linting rules, and integration tests that check API compliance.",A CI job that blocks a Pull Request because a developer added a camelCase field that isn't allowed in the shared YAML spec.",""Block Rate: <1% of PRs reach production without validation (CI gate pass rate)""
+Runbook,"The Operational Life-Support. It defines how the system lives in production and how to fix it. In GitOps, the ""State"" is declared here.","DevOps, SREs, On-call Developers","Format: K8s Manifests, Helm Charts, Markdown. Content: Deployment steps, scaling triggers, backup/restore commands, and troubleshooting guides.",A GitOps manifest in Flux that ensures 6 replicas of the Julia service are always running and restarts them if memory hits 80%.",""MTTR: <15 minutes for P1 incidents (tracked via incident management system)""
+
+Do you understand the provided text? Don't fucking change the table content. I want you to add "Measurement (KPI)" column. it is only example of course. This table will be used for consult and teaching.
+
+
+# ---------------------------------------------- 100 --------------------------------------------- #
+
+Can you write the table and explain this approach and each doc in details then save to docs/SDD_FRAMEWORK.md so I can consult it later.
+Don't forget to add How to use this approach effectively.
+
+
+# ---------------------------------------------- 100 --------------------------------------------- #
+
+Since I develop src folder before I adopt SDD_FRAMEWORK.md approach, can you check src folder and my current doc files then write docs/requirements.md according to SDD framework? Treat src as ground truth.
+
+# ---------------------------------------------- 100 --------------------------------------------- #
+
+
+
+
+
+
+
+
+
+
diff --git a/docs/SDD_FRAMEWORK.md b/docs/SDD_FRAMEWORK.md
index 099a17b..f278dcf 100644
--- a/docs/SDD_FRAMEWORK.md
+++ b/docs/SDD_FRAMEWORK.md
@@ -1,295 +1,402 @@
 # SDD + GitOps Documentation Framework
 
-## Overview
-
-The **SDD (Software Design Documentation) + GitOps Documentation Framework** is a comprehensive, structured approach to software development documentation that aligns technical work with business outcomes through clear separation of concerns.
-
-This framework ensures that every piece of documentation serves a specific purpose, reaches the right audience, and is measurable through clear KPIs and SLOs.
+This document defines the documentation framework for the NATSBridge project. It establishes a structured approach to creating, maintaining, and evolving technical documentation in alignment with GitOps principles—ensuring that documentation is versioned, auditable, and continuously validated alongside the codebase.
 
 ---
 
-## The Documentation Matrix
+## The SDD Framework: Seven Pillars of Documentation
 
-| Document | Purpose & Rationale (The "Why") | Audience | Format / Content | Measurement (KPI/SLO) | Example (SaaS Context) |
-|----------|---------------------------------|----------|------------------|----------------------|------------------------|
-| **Requirements** | The Business North Star. Defines exactly what problem the user has and what success looks like. It prevents "feature creep" by setting hard boundaries on what we will NOT build. | Founder, Team, PM | Format: Shared Wiki (Notion/GitHub Wiki). Content: User stories, business constraints, competitive context, and success metrics. | KPI: Business Outcomes. Measured by User Retention, Conversion Rates, and Monthly Recurring Revenue (MRR). | "The system must process high-volume math so clients see reports instantly. Goal: 15% increase in daily active users." |
-| **Spec** | The Technical Contract. A machine-readable, strictly typed definition of all data interfaces. It is the "Single Source of Truth" that prevents bugs caused by communication gaps between services. | Developers, QA, Automation | Format: OpenAPI/YAML or Protobuf. Content: API endpoints, snake_case key naming, data validation rules, and error response codes. | SLA/SLO: System Performance. Measured by API Uptime (99.9%), Response Latency (<100ms), and Error Rates. | A `contract.yaml` defining exactly how Julia sends Arrow data to Node.js. It forces `user_id` to be a UUID. |
-| **Architecture** | The Structural Blueprint. A visual map of how the components (services, DBs, networks) fit together. It shows how the data flows through the 6-node cluster and where bottlenecks live. | Senior Devs, DevOps | Format: Diagrams-as-code (Mermaid.js). Content: System Context diagrams, Database ERDs, Network Security Policies, and Infrastructure maps. | Efficiency Metrics: Resource utilization. Measured by CPU Load (<70%), RAM per pod, and internal network throughput. | A diagram showing the data path: Caddy (Proxy) → Node.js (API) → NATS (Queue) → Julia (Math Engine). |
-| **Walkthrough** | The Intuition & Logic. A narrative guide that explains the "steps" and "rationale" behind end-to-end flows. It's about building a mental model so devs understand why the sequence matters. | The Team, New Hires | Format: TOUR.md file or Loom Video. Content: Step-by-step traces of core features, explanation of architectural trade-offs, and "The Big Picture" flow. | Quality: Developer Velocity. Measured by "Time-to-First-Commit" for new hires and reduction in conceptual bugs. | "End-to-End Trace:" 1. UI sends JSON. 2. API wraps it in Claim-Check. 3. Julia pulls it. Rationale: To avoid NATS memory spikes. |
-| **Implementation** | The Functional Reality. The actual code that does the work. In SDD, the "boring" parts (types/routes) are auto-generated from the Spec to ensure the code never lies. | Developers, Reviewers | Format: Git Repository. Content: Business logic, internal helper functions, Unit Tests, and a README.md for local environment setup. | Code Health: Internal Quality. Measured by Test Coverage (90%+), Linting compliance, and Cyclomatic Complexity. | The SvelteKit frontend components and the specific Julia math-processing functions. |
-| **Validation** | The Enforcement Layer. Automated gates that prove the Implementation matches the Spec. It prevents human error (like changing a key name) from reaching production. | CI/CD Pipeline, QA | Format: GitHub Actions / Tests. Content: Contract tests (Dredd/Prism), Integration tests, and Security scans that run on every pull request. | Compliance: Safety Metrics. Measured by Build Success Rate and 0 "Contract Violations" in the production logs. | A CI job that blocks a Pull Request because a developer used camelCase in a database field instead of snake_case. |
-| **Maintenance** | The Health & Evolution. Defines how to upgrade dependencies, manage technical debt, and rotate secrets. It's the guide for "future-proofing" the software over time. | The Team, DevOps | Format: MAINTENANCE.md. Content: Dependency update schedules, Secret rotation steps, DB Migration logs, and Tech Debt "Graveyard" tracking. | Sustainability: System Longevity. Measured by "Package Age," "Security Vulnerabilities Found," and "Migration Success Rate." | "Steps to upgrade the Julia version across all 6 nodes without downtime using a Blue-Green deployment strategy." |
-| **Runbook** | The Operational Life-Support. The instructions for when the system is alive (or dying). In GitOps, this is the "Desired State" of the infrastructure. | DevOps, SRE, On-call Devs | Format: K8s Manifests (Flux/Argo). Content: Deployment steps, Scaling triggers, Backup/Restore procedures, and "3:00 AM" troubleshooting guides.
| Reliability: Operational Health. Measured by MTTR (Mean Time to Recovery) and Error-Free Deployments. | A Flux manifest that ensures 6 replicas of the Julia service are always healthy and restarts them if they hit 80% RAM. | +| Document | Purpose (Rationale) | Primary Audience | Format / Content | Example (SaaS Context) | Measurement (KPI) | +|----------|---------------------|-----------------|------------------|------------------------|-------------------| +| **Requirements** | Capture the **business intent** — why we're building this and what success looks like. Defines boundaries and user-visible outcomes. | Stakeholders, Product Owners, Lead Developers | User stories, PRDs, acceptance criteria, non-functional constraints. | "System must process tabular data from Julia to SvelteKit UI with <200ms latency for 5-member teams." | 95% of requests complete <200ms (synthetic monitoring). | +| **Specification** | The **technical contract** — precise rules for inputs, outputs, and data shape. Ensures consistency across dev and test. | Developers, QA Engineers, CI/CD pipelines | OpenAPI, Protobuf, AsyncAPI. Endpoint definitions, schemas, error codes. | `contract.yaml` defining a NATS subject that accepts Arrow streams with snake_case headers. | 100% of messages validated against spec (CI block rate). | +| **Architecture** | The **blueprint** — how components fit together, interact, and scale. Guides system structure and trade-offs. | Architects, Senior Developers, DevOps | C4 diagrams, Mermaid.js, component/network/storage models. | Diagram showing 6-node cluster routing traffic via Caddy → Node.js API → Julia pods. | 100% of major decisions logged with trade-off analysis. | +| **Walkthrough** | The **story of flow** — shows how pieces connect end-to-end and why steps are sequenced. Builds intuition for new devs. | New Developers, Team Members | TOUR.md, Loom videos, sequence diagrams. Step-by-step traces with rationale. 
| "UI sends JSON → Node.js wraps Claim-Check → Julia pulls Arrow data (prevents NATS overflow)." | New developers ship feature in <2 days (PR timeline). | +| **Implementation** | The **real code** — business logic, helpers, tests, configs. Where design becomes executable. | Developers, Code Reviewers | Source code, README.md, unit tests, setup scripts. | Julia function for matrix calculation + SvelteKit component rendering table. | >80% unit test coverage, <5% drift from spec. | +| **Validation** | The **enforcer** — ensures implementation matches the spec. Blocks drift and human error. | Automation servers, QA, Lead Developers | CI jobs, contract tests, linting, integration checks. | CI job rejects PR with camelCase field not allowed by YAML spec. | <1% of PRs bypass validation gates. | +| **Runbook** | The **operational manual** — how the system lives in production, scales, and recovers. Guides on-call engineers. | DevOps, SREs, On-call Developers | K8s manifests, Helm charts, Markdown guides. Deployment, scaling, backup/restore, troubleshooting. | GitOps manifest ensuring 6 Julia replicas restart if memory >80%. | MTTR <15 minutes for P1 incidents. | --- -## Detailed Breakdown of Each Document Type +## Detailed Document Descriptions ### 1. Requirements -**Purpose**: Establish the Business North Star +**Purpose**: Capture the *business intent* — why we're building this and what success looks like. Defines boundaries and user-visible outcomes. -The Requirements document is your anchor point. It answers the fundamental question: "What problem are we solving, and how do we know we've succeeded?" 
+**Why It Matters**: +- Aligns engineering efforts with business goals +- Provides a north star for feature development +- Establishes acceptance criteria before implementation begins +- Creates a contract between product and engineering -**Key Characteristics**: -- **Business-Focused**: Written in business terms, not technical jargon -- **Boundary-Setting**: Explicitly defines what we will NOT build -- **Outcome-Oriented**: Focuses on user outcomes, not features +**Content Guidelines**: +- User stories with clear acceptance criteria (As a X, I want Y so that Z) +- Product Requirements Documents (PRDs) with success metrics +- Non-functional requirements (performance, security, scalability) +- Boundary definitions (what's in scope vs. out of scope) **Best Practices**: -- Include user stories that describe the user's perspective -- Document business constraints (regulatory, legal, compliance) -- Define competitive context and market positioning -- Establish clear success metrics from day one - -**Common Pitfalls to Avoid**: -- Vague descriptions like "improve user experience" -- Changing requirements without updating the document -- Not defining what's out of scope +- Link each requirement to a measurable KPI +- Keep requirements testable and verifiable +- Maintain backward compatibility with existing requirements +- Review and update requirements as business context changes --- -### 2. Spec (Specification) +### 2. Specification -**Purpose**: Create the Technical Contract +**Purpose**: The *technical contract* — precise rules for inputs, outputs, and data shape. Ensures consistency across dev and test. -The Spec serves as the Single Source of Truth for all data interfaces. It's a machine-readable definition that ensures consistency across services. 
+**Why It Matters**: +- Prevents implementation drift between components +- Enables contract testing in CI/CD pipelines +- Provides a single source of truth for data structures +- Facilitates integration between teams -**Key Characteristics**: -- **Machine-Readable**: Can be parsed by tools for validation and code generation -- **Strictly Typed**: Enforces data types and validation rules -- **Comprehensive**: Covers all endpoints, request/response formats, and error codes +**Content Guidelines**: +- API endpoint definitions (methods, paths, parameters) +- Request/response schemas (JSON, XML, Protobuf, AsyncAPI) +- Error codes and their meanings +- Data validation rules and constraints +- Rate limiting and quota definitions **Best Practices**: -- Use OpenAPI/Swagger for REST APIs or Protobuf for gRPC -- Enforce consistent naming conventions (e.g., snake_case) -- Define validation rules for all data fields -- Document all possible error responses - -**Common Pitfalls to Avoid**: -- Letting the spec diverge from the implementation -- Incomplete error handling documentation -- Not versioning the API spec +- Use formal specification languages (OpenAPI 3.0+, AsyncAPI) +- Version specifications alongside code +- Generate client SDKs from specifications +- Block CI on specification violations +- Document edge cases and error scenarios --- ### 3. Architecture -**Purpose**: Visualize the System Structure +**Purpose**: The *blueprint* — how components fit together, interact, and scale. Guides system structure and trade-offs. -The Architecture document provides a visual map of how components fit together. It helps identify bottlenecks and understand data flow. 
+**Why It Matters**: +- Provides a mental model for system design +- Guides technical decision-making and trade-off analysis +- Facilitates onboarding of new architects and senior developers +- Documents scaling and performance considerations -**Key Characteristics**: -- **Visual**: Uses diagrams to represent complex relationships -- **Comprehensive**: Covers system context, data flow, and infrastructure -- **Living Document**: Updated as the system evolves +**Content Guidelines**: +- C4 diagrams (Context, Container, Component levels) +- Mermaid.js flowcharts for sequence diagrams +- Component interaction diagrams +- Network topology and data flow +- Storage and caching strategies +- Scaling and resilience patterns **Best Practices**: -- Use Mermaid.js for diagrams-as-code (versionable in Git) -- Include multiple views: System Context, C4 model, ERDs, network topology -- Document trade-offs and architectural decisions -- Show data flow through the system - -**Common Pitfalls to Avoid**: -- Over-engineering diagrams with unnecessary detail -- Not updating diagrams when the architecture changes -- Using static images instead of diagrams-as-code +- Use diagrams that are easy to update (Mermaid.js over static images) +- Document trade-off decisions with Rationale Documents +- Include scaling considerations for each component +- Document failure modes and recovery strategies +- Keep architecture diagrams versioned with code --- ### 4. Walkthrough -**Purpose**: Build Mental Models +**Purpose**: The *story of flow* — shows how pieces connect end-to-end and why steps are sequenced. Builds intuition for new devs. -The Walkthrough document explains the "why" behind the "how." It helps developers understand the rationale behind design decisions. 
+**Why It Matters**: +- Reduces onboarding time for new developers +- Provides context that code comments alone cannot convey +- Explains the "why" behind architectural decisions +- Helps identify gaps in the system design -**Key Characteristics**: -- **Narrative-Driven**: Tells a story about how the system works -- **Context-Rich**: Explains trade-offs and decisions -- **End-to-End**: Traces flows from user input to system output +**Content Guidelines**: +- Step-by-step flow descriptions with rationale +- Sequence diagrams showing request/response patterns +- "Tour of the codebase" guides +- Video walkthroughs (Loom, internal recordings) +- Debugging and tracing examples **Best Practices**: -- Document step-by-step traces of core features -- Explain architectural trade-offs and why you chose them -- Include "The Big Picture" context -- Use real examples and data flows - -**Common Pitfalls to Avoid**: -- Only documenting the happy path -- Assuming developers will figure out the "why" -- Not explaining the rationale behind decisions +- Walk through real user journeys, not just technical flows +- Include "what could go wrong" scenarios +- Link walkthroughs to relevant code locations +- Keep walkthroughs updated with architecture changes +- Make walkthroughs interactive where possible --- ### 5. Implementation -**Purpose**: The Functional Reality +**Purpose**: The *real code* — business logic, helpers, tests, configs. Where design becomes executable. -The Implementation is the actual code that does the work. In SDD, the "boring" parts are auto-generated from the Spec to ensure consistency. 
+**Why It Matters**: +- This is the actual artifact that runs in production +- Code is the ultimate source of truth (when it matches spec) +- Tests validate correctness and prevent regressions +- Configuration files define runtime behavior -**Key Characteristics**: -- **Machine-Generated**: Types and routes auto-generated from Spec -- **Human-Written**: Business logic and helper functions -- **Tested**: Includes unit and integration tests +**Content Guidelines**: +- Business logic implementation +- Helper functions and utilities +- Unit and integration tests +- Configuration files (YAML, JSON, environment) +- Setup and development scripts +- Code organization and module structure **Best Practices**: -- Auto-generate boring parts (types, routes) from the Spec -- Keep business logic separate from boilerplate -- Maintain comprehensive test coverage -- Document the local development setup - -**Common Pitfalls to Avoid**: -- Hand-writing types that should be auto-generated -- Inconsistent code style -- Insufficient test coverage +- Follow consistent code style and conventions +- Write tests before or alongside implementation (TDD/BDD) +- Document complex logic with inline comments +- Keep configuration externalized and versioned +- Use type annotations where applicable --- ### 6. Validation -**Purpose**: Enforce the Contract +**Purpose**: The *enforcer* — ensures implementation matches the spec. Blocks drift and human error. -The Validation layer provides automated gates that ensure the Implementation matches the Spec. It prevents human error from reaching production. 
+**Why It Matters**: +- Prevents breaking changes from reaching production +- Catches specification violations early in the CI pipeline +- Maintains data integrity and API consistency +- Reduces manual QA effort through automation -**Key Characteristics**: -- **Automated**: Runs on every commit/Pull Request -- **Comprehensive**: Covers contract tests, integration tests, and security scans -- **Blocking**: Prevents merges that violate the contract +**Content Guidelines**: +- CI/CD pipeline configurations +- Contract testing scripts +- Linting rules and configurations +- Integration test suites +- Schema validation jobs +- Security scanning and audit jobs **Best Practices**: -- Use contract testing tools (Dredd, Prism) to validate API contracts -- Run integration tests on every commit -- Include security scans in the CI pipeline -- Fail builds on contract violations - -**Common Pitfalls to Avoid**: -- Not running tests on every commit -- Allowing manual overrides of validation gates -- Not updating tests when the Spec changes +- Fail CI on specification violations +- Run validation jobs on every commit and PR +- Use automated code review tools +- Maintain validation job health dashboard +- Document validation failure remediation steps --- -### 7. Maintenance +### 7. Runbook -**Purpose**: Ensure Long-Term Health +**Purpose**: The *operational manual* — how the system lives in production, scales, and recovers. Guides on-call engineers. -The Maintenance document defines how to upgrade dependencies, manage technical debt, and rotate secrets. It's the guide for "future-proofing" the software. 
+**Why It Matters**: +- Reduces Mean Time To Recovery (MTTR) for incidents +- Provides step-by-step guidance for common issues +- Documents scaling and deployment procedures +- Ensures operational knowledge is not siloed -**Key Characteristics**: -- **Procedural**: Step-by-step instructions for common tasks -- **Scheduled**: Includes regular maintenance windows -- **Documented**: Tracks technical debt and migration history +**Content Guidelines**: +- Deployment procedures (manual and automated) +- Scaling instructions (horizontal/vertical) +- Backup and restore procedures +- Troubleshooting guides for common issues +- Runbook entries for specific error codes +- Contact information and escalation paths **Best Practices**: -- Document dependency update schedules -- Create secret rotation procedures -- Track technical debt in a "Graveyard" -- Document migration history and rollback procedures - -**Common Pitfalls to Avoid**: -- Ad-hoc upgrades without documentation -- Ignoring technical debt until it becomes critical -- Not testing upgrades in staging first - ---- - -### 8. Runbook - -**Purpose**: Operational Life-Support - -The Runbook provides instructions for when the system is alive (or dying). In GitOps, this is the "Desired State" of the infrastructure. 
- -**Key Characteristics**: -- **Action-Oriented**: Step-by-step instructions for common operations -- **Automated**: Infrastructure as code defines the desired state -- **Crisis-Ready**: Includes "3:00 AM" troubleshooting guides - -**Best Practices**: -- Document deployment procedures -- Define scaling triggers and procedures -- Include backup and restore procedures -- Create troubleshooting guides for common issues - -**Common Pitfalls to Avoid**: -- Not documenting procedures for common issues -- Not testing runbook procedures -- Not versioning runbooks with the infrastructure +- Write runbooks for every P1/P2 incident +- Include exact commands and configuration snippets +- Test runbooks periodically (chaos engineering) +- Link runbook entries to relevant documentation +- Keep runbooks updated when system changes --- ## How to Use This Approach Effectively -### Phase 1: Foundation (Week 1-2) +### 1. Start with Requirements -1. **Create Requirements Document** - - Define the Business North Star - - Establish success metrics - - Define out-of-scope items +Before writing any code or documentation, establish clear requirements. Ask: +- What business problem are we solving? +- How will we measure success? +- What are the non-negotiable constraints? -2. **Write the Spec** - - Define all data interfaces - - Establish naming conventions - - Document validation rules +**Action**: Create a `docs/requirements/` directory and start with `PRD.md` and `KPIs.md`. -3. **Design Architecture** - - Create system diagrams - - Document data flow - - Identify potential bottlenecks +### 2. Define the Specification First -### Phase 2: Development (Week 3+) +Once requirements are stable, define the technical specification. This becomes the contract for implementation. -4. 
**Write Walkthrough** - - Document end-to-end flows - - Explain architectural trade-offs - - Create mental models for developers +**Action**: Create `docs/specification/` with `contract.yaml` (or appropriate format) and `error-codes.md`. -5. **Implement Code** - - Auto-generate boring parts from Spec - - Write business logic - - Implement tests +### 3. Design the Architecture -### Phase 3: Quality Assurance +With requirements and specification in place, design the architecture. Document trade-off decisions explicitly. -6. **Set Up Validation** - - Configure CI/CD pipeline - - Set up contract testing - - Configure security scans +**Action**: Create `docs/architecture/` with Mermaid diagrams and `trade-offs.md`. -7. **Create Runbook** - - Document deployment procedures - - Define scaling triggers - - Create troubleshooting guides +### 4. Create Walkthroughs Early -### Phase 4: Maintenance +As soon as the architecture is defined, create walkthroughs. This helps identify gaps and provides onboarding material. -8. **Document Maintenance** - - Create dependency update schedule - - Document secret rotation - - Track technical debt +**Action**: Create `docs/walkthrough/` with `TOUR.md` and sequence diagrams. + +### 5. Implement with Validation in Mind + +Write implementation code that adheres to the specification. Build validation into the CI pipeline from day one. + +**Action**: Ensure test files are co-located with implementation and run on every commit. + +### 6. Automate Validation + +Build automated validation that runs in CI/CD. This ensures spec compliance and prevents drift. + +**Action**: Configure CI jobs to validate against specification and block PRs on violations. + +### 7. Document Operations from Day One + +Create runbook entries as soon as deployment procedures are established. Update them when incidents occur. + +**Action**: Create `docs/runbook/` with entries for deployment, scaling, and common issues. 
--- -## Key Principles for Success +## GitOps Integration -1. **Separation of Concerns**: Keep business concerns separate from technical concerns -2. **Machine-Readable Contracts**: Use OpenAPI/Protobuf for specs to enable automation -3. **Automation**: Automate boring parts and validation to reduce human error -4. **Measurability**: Every document should have measurable outcomes -5. **Version Control**: Keep all documentation in Git for history and collaboration -6. **Living Documents**: Update documentation as the system evolves -7. **Audience-Focused**: Write for the intended audience's needs and knowledge level +This documentation framework aligns with GitOps principles: + +| GitOps Principle | Documentation Alignment | +|-----------------|------------------------| +| **Versioned** | All documentation lives in git, with history and audit trail | +| **Declarative** | Specifications and architecture are declarative contracts | +| **Automated** | Validation jobs automate spec compliance checks | +| **Self-Service** | Walkthroughs and runbooks enable self-service onboarding and operations | +| **Observability** | KPIs and metrics are defined for each documentation artifact | + +**Git Structure**: +``` +docs/ +├── requirements/ # PRDs, user stories, KPIs +├── specification/ # OpenAPI, Protobuf, AsyncAPI specs +├── architecture/ # C4 diagrams, Mermaid, trade-off docs +├── walkthrough/ # TOUR.md, sequence diagrams +├── implementation/ # Source code (in src/) +├── validation/ # CI configs, test suites +└── runbook/ # Deployment, scaling, troubleshooting +``` + +--- + +## Metrics and Continuous Improvement + +Each documentation artifact has associated KPIs. 
Track these to ensure quality: + +| Document | KPI | Target | +|----------|-----|--------| +| Requirements | Requirement coverage | 100% of features have associated requirements | +| Specification | Spec compliance rate | 100% of messages validate against spec | +| Architecture | Decision documentation | 100% of major decisions logged with trade-offs | +| Walkthrough | New dev time-to-first-PR | <2 days from onboarding to first contribution | +| Implementation | Test coverage | >80% unit test coverage | +| Validation | Bypass rate | <1% of PRs bypass validation gates | +| Runbook | MTTR | <15 minutes for P1 incidents | + +**Review Cadence**: +- Weekly: Review KPI dashboards and documentation gaps +- Monthly: Update documentation based on incident learnings +- Quarterly: Full framework review and improvement + +--- + +## Template Examples + +### Requirements Template +```markdown +# PRD: Feature Name + +## Business Goal +[What problem are we solving?] + +## Success Metrics +- [Metric 1]: Target [value] +- [Metric 2]: Target [value] + +## User Stories +- As a [role], I want [feature] so that [benefit] + - Acceptance Criteria: [details] + +## Non-Functional Requirements +- Performance: [details] +- Security: [details] +- Scalability: [details] + +## Out of Scope +- [What's explicitly excluded] +``` + +### Specification Template +```yaml +# contract.yaml +openapi: 3.0.0 +info: + title: NATSBridge API + version: 1.0.0 +paths: + /api/v1/endpoint: + post: + requestBody: + content: + application/json: + schema: + $ref: '#/components/schemas/Request' + responses: + '200': + description: Success + content: + application/json: + schema: + $ref: '#/components/schemas/Response' +``` + +### Architecture Template +```mermaid +%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#3b82f6'}}}%% +flowchart TD + A[Client] --> B[Caddy] + B --> C[Node.js API] + C --> D[Julia Worker] + D --> E[NATS Cluster] + E --> F[Storage] + + style A fill:#f9f9f9,stroke:#333 + style E 
fill:#e0e7ff,stroke:#3b82f6 +``` + +### Runbook Template +```markdown +# Runbook: Service Restart + +**Severity**: P2 +**Estimated Time**: 5 minutes + +## Symptoms +- Service is unresponsive +- Health checks are failing + +## Steps +1. SSH to the host +2. Run: `kubectl rollout restart deployment/natsbridge` +3. Monitor: `kubectl get pods -l app=natsbridge -w` + +## Rollback +- Run: `kubectl rollout undo deployment/natsbridge` + +## Post-Incident +- [ ] Review logs for root cause +- [ ] Update runbook if needed +``` --- ## Conclusion -The SDD + GitOps Documentation Framework provides a comprehensive, structured approach to software development documentation. By following this framework, teams can ensure that: +This SDD + GitOps Documentation Framework ensures that documentation is: +- **Structured**: Seven distinct artifacts with clear purposes +- **Automated**: Validation and CI/CD integration +- **Versioned**: All documentation in git with history +- **Measurable**: KPIs for quality and effectiveness +- **Actionable**: Practical templates and examples -- Business goals are clearly defined and measurable -- Technical contracts are machine-readable and enforced -- System architecture is visualized and understood -- Developers have clear mental models of the system -- Code quality is maintained through automation -- Operations are reliable and repeatable - -This framework is not just about documentation—it's about creating a shared understanding across the entire team and ensuring that every decision is aligned with business goals. \ No newline at end of file +Use this framework as a living document—update it as your team's needs evolve. 
\ No newline at end of file From 7c83c06d6c68db69ffa39881c9e744ae428cbea7 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 13:35:49 +0700 Subject: [PATCH 07/29] update --- docs/requirements.md | 420 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 420 insertions(+) create mode 100644 docs/requirements.md diff --git a/docs/requirements.md b/docs/requirements.md new file mode 100644 index 0000000..02d6d64 --- /dev/null +++ b/docs/requirements.md @@ -0,0 +1,420 @@ +# Requirements Document: NATSBridge + +**Version**: 1.0.0 +**Date**: 2026-03-13 +**Status**: Active +**Ground Truth**: [`src/NATSBridge.jl`](../src/NATSBridge.jl) + +--- + +## Executive Summary + +NATSBridge is a cross-platform, bi-directional data bridge that enables seamless communication between **Julia**, **JavaScript**, **Python**, and **MicroPython** applications using NATS as the message bus. The system implements the **Claim-Check pattern** for efficient handling of large payloads (≥1MB) by uploading them to an HTTP file server instead of sending raw binary data over NATS. + +--- + +## Business Goals + +### Primary Objectives + +1. **Cross-Platform Interoperability**: Enable seamless data exchange between Julia, JavaScript, Python, and MicroPython applications without platform-specific barriers. + +2. **Efficient Large Payload Handling**: Implement intelligent transport selection based on payload size: + - **Direct Transport**: Small payloads (<1MB) sent directly via NATS + - **Link Transport**: Large payloads (≥1MB) uploaded to HTTP file server, URL sent via NATS + +3. **Unified API Across Platforms**: Provide consistent `smartsend()` and `smartreceive()` functions across all supported platforms while maintaining idiomatic implementations. + +4. **Developer Productivity**: Reduce onboarding time and simplify integration through comprehensive documentation and test examples. 
+ +### Success Metrics + +| Metric | Target | Measurement Method | +|--------|--------|-------------------| +| 95% of messages complete within 200ms | 95% | Synthetic monitoring | +| <2 days from onboarding to first PR | 2 days | PR timeline tracking | +| 100% of messages validate against spec | 100% | CI block rate | +| >80% unit test coverage | 80% | Test coverage tools | +| <1% of PRs bypass validation gates | 1% | CI gate analysis | +| MTTR <15 minutes for P1 incidents | 15 minutes | Incident tracking | + +--- + +## User Stories + +### Core Functionality + +| Story | Priority | Acceptance Criteria | +|-------|----------|---------------------| +| **As a Julia developer**, I want to send text messages to JavaScript applications | P1 | Text messages are serialized, encoded, and received correctly across platforms | +| **As a Python developer**, I want to send tabular data to Julia applications | P1 | DataFrame exchange works with both Arrow IPC and JSON formats | +| **As a JavaScript developer**, I want to send large files (>1MB) to other applications | P1 | Large files are automatically uploaded to file server and URLs are sent via NATS | +| **As a MicroPython developer**, I want to send sensor data with minimal memory usage | P1 | Direct transport works for payloads <100KB on memory-constrained devices | + +### Multi-Payload Support + +| Story | Priority | Acceptance Criteria | +|-------|----------|---------------------| +| **As a developer**, I want to send mixed-content messages (text + image + file) | P1 | NATSBridge accepts list of (dataname, data, type) tuples and handles each payload appropriately | +| **As a developer**, I want to receive multi-payload messages | P1 | NATSBridge returns payloads as list of tuples with correct types preserved | + +### File Server Integration + +| Story | Priority | Acceptance Criteria | +|-------|----------|---------------------| +| **As a developer**, I want to use Plik as the file server | P2 | Plik one-shot upload mode 
is supported with upload ID and token handling | +| **As a developer**, I want to use custom HTTP file servers | P2 | Handler function abstraction allows plugging in AWS S3 or custom implementations | + +### Reliability Features + +| Story | Priority | Acceptance Criteria | +|-------|----------|---------------------| +| **As a developer**, I want automatic retry on file server download failures | P1 | Exponential backoff with configurable retries (default: 5, base_delay: 100ms, max_delay: 5000ms) | +| **As a developer**, I want message tracing across distributed systems | P1 | Correlation ID is propagated through all message processing steps | + +--- + +## Non-Functional Requirements + +### Performance Requirements + +| Requirement | Specification | Test Method | +|-------------|---------------|-------------| +| Message serialization overhead | <50ms for 10KB payload | Benchmark tests | +| Message deserialization overhead | <50ms for 10KB payload | Benchmark tests | +| NATS connection establishment | <100ms | Connection pool benchmarks | +| File upload latency | <1s for 1MB file | Integration tests | +| File download latency | <1s for 1MB file | Integration tests | + +### Scalability Requirements + +| Requirement | Specification | +|-------------|---------------| +| Concurrent connections | Support 100+ simultaneous NATS connections | +| Message throughput | Handle 1000+ messages/second per instance | +| File server scalability | Support horizontal scaling of file server backend | + +### Reliability Requirements + +| Requirement | Specification | +|-------------|---------------| +| Message delivery | At-least-once delivery semantics via NATS | +| File server availability | Graceful degradation when file server is unavailable | +| Connection recovery | Auto-reconnect on NATS connection failure | + +### Security Requirements + +| Requirement | Specification | +|-------------|---------------| +| Payload integrity | SHA-256 checksum support via metadata | +| Transport 
security | TLS support for NATS connections | +| File server security | Authentication token for file uploads | + +### Compatibility Requirements + +| Platform | Minimum Version | Notes | +|----------|-----------------|-------| +| Julia | 1.7+ | Arrow.jl required for arrowtable support | +| Node.js | 16+ | nats.js required | +| Python | 3.8+ | pyarrow required for arrowtable support | +| MicroPython | 1.19+ | Limited to direct transport | + +--- + +## Out of Scope + +### Phase 1 (Current Implementation) + +| Feature | Reason | +|---------|--------| +| NATS JetStream support | Core NATS sufficient for current use cases | +| Message compression | Compression adds complexity without clear benefit | +| Message encryption | Payload encryption is application-layer concern | +| Persistent message queues | NATS request-reply pattern sufficient | +| Advanced routing rules | Simple NATS subject matching sufficient | + +### Future Considerations + +| Feature | Future Phase | +|---------|--------------| +| JetStream streams and consumers | Phase 2 | +| Message TTL and dead-letter queues | Phase 3 | +| Message tracing with OpenTelemetry | Phase 3 | +| Rate limiting and quota management | Phase 4 | + +--- + +## Boundary Definitions + +### What NATSBridge Handles + +| Function | Description | +|----------|-------------| +| Message serialization | Converts data types to binary format | +| Message encoding | Base64, JSON, Arrow IPC encoding | +| Transport selection | Direct vs link based on size threshold | +| NATS publishing | Publishes messages to NATS subjects | +| NATS subscription | Receives and processes NATS messages | +| File server upload | Uploads large payloads to HTTP server | +| File server download | Downloads payloads from HTTP server with retry | +| Correlation ID generation | Creates and propagates UUIDs | +| Data deserialization | Converts binary format back to native types | + +### What NATSBridge Does NOT Handle + +| Function | Handled By | 
+|----------|------------| +| NATS server management | External NATS deployment | +| File server management | External HTTP server deployment | +| Application business logic | Application code using NATSBridge | +| Message encryption | Application layer | +| Message compression | Application layer | +| Authentication/Authorization | NATS server configuration | + +--- + +## Payload Type Requirements + +### Supported Payload Types + +| Type | Julia | JavaScript | Python | MicroPython | Description | +|------|-------|------------|--------|-------------|-------------| +| `text` | `String` | `string` | `str` | `str` | Plain text strings | +| `dictionary` | `Dict`, `NamedTuple` | `Object`, `Array` | `dict`, `list` | `dict` | JSON-serializable data | +| `arrowtable` | `DataFrame`, `Arrow.Table` | `Array` | `pandas.DataFrame` | ❌ | Tabular data (Arrow IPC) | +| `jsontable` | `Vector{NamedTuple}` | `Array` | `list[dict]` | ⚠️ | Tabular data (JSON) | +| `image` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes` | `bytearray` | Image binary data | +| `audio` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes` | `bytearray` | Audio binary data | +| `video` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes` | `bytearray` | Video binary data | +| `binary` | `Vector{UInt8}`, `IOBuffer` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | `bytearray` | Generic binary data | + +### Encoding Requirements + +| Payload Type | Encoding Method | Notes | +|--------------|-----------------|-------| +| `text` | UTF-8 → Base64 | Text must be String type | +| `dictionary` | JSON → Base64 | JSON.jl for Julia | +| `arrowtable` | Arrow IPC → Base64 | Requires Arrow.jl/pyarrow | +| `jsontable` | JSON → Base64 | Human-readable format | +| `image`/`audio`/`video`/`binary` | Direct → Base64 | Binary data preserved | + +--- + +## Size Threshold Requirements + +### Direct Transport Threshold + +| Platform | Threshold | Notes | +|----------|-----------|-------| +| Desktop (Julia/JS/Python) | 1MB 
(1,048,576 bytes) | Default size threshold | +| MicroPython | 100KB (102,400 bytes) | Lower threshold for memory constraints | + +### Maximum Payload Size + +| Platform | Maximum | Notes | +|----------|---------|-------| +| Desktop | Unlimited | Limited by NATS server configuration | +| MicroPython | 50KB | Hard limit due to 256KB-1MB memory | + +--- + +## Message Envelope Requirements + +### Required Fields + +| Field | Type | Purpose | +|-------|------|---------| +| `correlation_id` | String (UUID) | Track message flow across systems | +| `msg_id` | String (UUID) | Unique message identifier | +| `timestamp` | String (ISO 8601) | Message publication timestamp | +| `send_to` | String | NATS subject to publish to | +| `msg_purpose` | String | ACK, NACK, updateStatus, shutdown, chat | +| `sender_name` | String | Sender application name | +| `sender_id` | String (UUID) | Sender unique identifier | +| `receiver_name` | String | Receiver application name (empty = broadcast) | +| `receiver_id` | String (UUID) | Receiver unique identifier (empty = broadcast) | +| `reply_to` | String | Topic for reply messages | +| `reply_to_msg_id` | String | Message ID being replied to | +| `broker_url` | String | NATS server URL | +| `metadata` | Dict | Message-level metadata | +| `payloads` | Array | List of payload objects | + +### Payload Fields + +| Field | Type | Purpose | +|-------|------|---------| +| `id` | String (UUID) | Unique payload identifier | +| `dataname` | String | Name of the payload | +| `payload_type` | String | Type: text, dictionary, arrowtable, etc. 
| +| `transport` | String | direct or link | +| `encoding` | String | none, json, base64, arrow-ipc | +| `size` | Integer | Payload size in bytes | +| `data` | Any | Base64 string or URL | +| `metadata` | Dict | Payload-level metadata | + +--- + +## Error Handling Requirements + +### Error Codes + +| Error | Condition | Response | +|-------|-----------|----------| +| `Unknown payload_type` | Unsupported type | Throw error | +| `Failed to upload` | File server error | Throw error | +| `Failed to fetch` | File server unavailable | Retry with exponential backoff | +| `Unknown transport` | Invalid transport type | Throw error | +| `NATS connection failed` | NATS unavailable | Throw error | + +### Exception Handling + +| Scenario | Handler | +|----------|---------| +| File server unavailable | Retry up to 5 times with exponential backoff | +| NATS publish failure | Connection auto-reconnect | +| Deserialization error | Log correlation ID and throw error | +| Memory overflow (MicroPython) | Reject payloads >50KB | + +--- + +## Testing Requirements + +### Unit Tests + +| Test Category | Coverage | Files | +|---------------|----------|-------| +| Serialization | All payload types | `test/test_*_sender.*` | +| Deserialization | All payload types | `test/test_*_receiver.*` | +| Transport selection | Direct vs link | `test/test_*_mix_payloads.*` | +| File server upload | Plik integration | Platform-specific | +| File server download | Exponential backoff | Platform-specific | + +### Integration Tests + +| Test Scenario | Success Criteria | +|-------------|-----------------| +| Cross-platform text message | Julia ↔ JavaScript ↔ Python | +| Cross-platform tabular data | Arrow IPC round-trip | +| Large file transfer | File server upload/download | +| Multi-payload mixed content | All payload types in one message | + +--- + +## API Contract + +### smartsend Signature + +```julia +function smartsend( + subject::String, + data::AbstractArray{Tuple{String, Any, String}}; + 
broker_url::String = "nats://localhost:4222", + fileserver_url::String = "http://localhost:8080", + fileserver_upload_handler::Function = plik_oneshot_upload, + size_threshold::Int = 1_000_000, + correlation_id::String = string(uuid4()), + msg_purpose::String = "chat", + sender_name::String = "NATSBridge", + receiver_name::String = "", + receiver_id::String = "", + reply_to::String = "", + reply_to_msg_id::String = "", + is_publish::Bool = true, + NATS_connection::Union{NATS.Connection, Nothing} = nothing, + msg_id::String = string(uuid4()), + sender_id::String = string(uuid4()) +)::Tuple{msg_envelope_v1, String} +``` + +### smartreceive Signature + +```julia +function smartreceive( + msg::NATS.Msg; + fileserver_download_handler::Function = _fetch_with_backoff, + max_retries::Int = 5, + base_delay::Int = 100, + max_delay::Int = 5000 +)::JSON.Object{String, Any} +``` + +--- + +## Dependencies + +### Required Dependencies + +| Platform | Package | Version | +|----------|---------|---------| +| Julia | NATS.jl | Latest stable | +| Julia | JSON.jl | Latest stable | +| Julia | Arrow.jl | Latest stable | +| Julia | HTTP.jl | Latest stable | +| Julia | UUIDs.jl | Latest stable | +| Node.js | nats | Latest stable | +| Node.js | node-fetch | Latest stable | +| Python | nats-py | Latest stable | +| Python | aiohttp | Latest stable | +| Python | pyarrow | Latest stable | + +### Optional Dependencies + +| Platform | Package | Use Case | +|----------|---------|----------| +| Julia | DataFrames.jl | DataFrame support for arrowtable | +| Python | pandas | DataFrame support for arrowtable | + +--- + +## Deployment Requirements + +### Minimum Infrastructure + +| Component | Minimum | Notes | +|-----------|---------|-------| +| NATS Server | 1 instance | Single node for development | +| File Server | 1 instance | HTTP server for large payloads | +| Client Memory | 50MB | Desktop platforms | +| Client Memory | 256KB | MicroPython devices | + +### Environment Variables + +| Variable | 
Default | Description | +|----------|---------|-------------| +| `NATS_URL` | `nats://localhost:4222` | NATS server URL | +| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | +| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes | + +--- + +## Versioning + +### Current Version + +- **Major**: 1 (Breaking changes require major version bump) +- **Minor**: 0 (Feature additions) +- **Patch**: 0 (Bug fixes) + +### Version Compatibility + +| Version | Supported Platforms | +|---------|---------------------| +| v1.0.x | Julia 1.7+, Node.js 16+, Python 3.8+, MicroPython 1.19+ | + +--- + +## Change Log + +| Date | Version | Changes | +|------|---------|---------| +| 2026-03-13 | 1.0.0 | Initial requirements document | + +--- + +## References + +- [`src/NATSBridge.jl`](../src/NATSBridge.jl) - Ground truth implementation +- [`README.md`](../README.md) - Project overview +- [`docs/architecture.md`](./architecture.md) - Architecture documentation +- [`docs/implementation.md`](./implementation.md) - Implementation details +- [`docs/walkthrough.md`](./walkthrough.md) - Usage examples \ No newline at end of file From ba567f21fc8822ad20403ca6e07a069ed6d3eb88 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 13:43:18 +0700 Subject: [PATCH 08/29] update --- docs/requirements.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/requirements.md b/docs/requirements.md index 02d6d64..d778662 100644 --- a/docs/requirements.md +++ b/docs/requirements.md @@ -9,7 +9,7 @@ ## Executive Summary -NATSBridge is a cross-platform, bi-directional data bridge that enables seamless communication between **Julia**, **JavaScript**, **Python**, and **MicroPython** applications using NATS as the message bus. The system implements the **Claim-Check pattern** for efficient handling of large payloads (>1MB) by uploading them to an HTTP file server instead of sending raw binary data over NATS. 
+NATSBridge is a cross-platform, bi-directional data bridge that enables seamless communication between **Julia**, **JavaScript**, **Python**, and **MicroPython** applications using NATS as the message bus. The system implements the **Claim-Check pattern** for efficient handling of large payloads (>0.5MB) by uploading them to an HTTP file server instead of sending raw binary data over NATS. --- @@ -20,8 +20,8 @@ NATSBridge is a cross-platform, bi-directional data bridge that enables seamless 1. **Cross-Platform Interoperability**: Enable seamless data exchange between Julia, JavaScript, Python, and MicroPython applications without platform-specific barriers. 2. **Efficient Large Payload Handling**: Implement intelligent transport selection based on payload size: - - **Direct Transport**: Small payloads (<1MB) sent directly via NATS - - **Link Transport**: Large payloads (≥1MB) uploaded to HTTP file server, URL sent via NATS + - **Direct Transport**: Small payloads (<0.5MB) sent directly via NATS + - **Link Transport**: Large payloads (≥0.5MB) uploaded to HTTP file server, URL sent via NATS 3. **Unified API Across Platforms**: Provide consistent `smartsend()` and `smartreceive()` functions across all supported platforms while maintaining idiomatic implementations. 
@@ -48,7 +48,7 @@ NATSBridge is a cross-platform, bi-directional data bridge that enables seamless |-------|----------|---------------------| | **As a Julia developer**, I want to send text messages to JavaScript applications | P1 | Text messages are serialized, encoded, and received correctly across platforms | | **As a Python developer**, I want to send tabular data to Julia applications | P1 | DataFrame exchange works with both Arrow IPC and JSON formats | -| **As a JavaScript developer**, I want to send large files (>1MB) to other applications | P1 | Large files are automatically uploaded to file server and URLs are sent via NATS | +| **As a JavaScript developer**, I want to send large files (>0.5MB) to other applications | P1 | Large files are automatically uploaded to file server and URLs are sent via NATS | | **As a MicroPython developer**, I want to send sensor data with minimal memory usage | P1 | Direct transport works for payloads <100KB on memory-constrained devices | ### Multi-Payload Support @@ -83,8 +83,8 @@ NATSBridge is a cross-platform, bi-directional data bridge that enables seamless | Message serialization overhead | <50ms for 10KB payload | Benchmark tests | | Message deserialization overhead | <50ms for 10KB payload | Benchmark tests | | NATS connection establishment | <100ms | Connection pool benchmarks | -| File upload latency | <1s for 1MB file | Integration tests | -| File download latency | <1s for 1MB file | Integration tests | +| File upload latency | <1s for 0.5MB file | Integration tests | +| File download latency | <1s for 0.5MB file | Integration tests | ### Scalability Requirements @@ -206,7 +206,7 @@ NATSBridge is a cross-platform, bi-directional data bridge that enables seamless | Platform | Threshold | Notes | |----------|-----------|-------| -| Desktop (Julia/JS/Python) | 1MB (1,048,576 bytes) | Default size threshold | +| Desktop (Julia/JS/Python) | 0.5MB (1,048,576 bytes) | Default size threshold | | MicroPython | 100KB 
(102,400 bytes) | Lower threshold for memory constraints | ### Maximum Payload Size From ceda1b770902d4d993fec26590a5b940472d0eb1 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 13:44:20 +0700 Subject: [PATCH 09/29] update --- docs/requirements.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/requirements.md b/docs/requirements.md index d778662..f7452d3 100644 --- a/docs/requirements.md +++ b/docs/requirements.md @@ -206,8 +206,8 @@ NATSBridge is a cross-platform, bi-directional data bridge that enables seamless | Platform | Threshold | Notes | |----------|-----------|-------| -| Desktop (Julia/JS/Python) | 0.5MB (1,048,576 bytes) | Default size threshold | -| MicroPython | 100KB (102,400 bytes) | Lower threshold for memory constraints | +| Desktop (Julia/JS/Python) | 0.5MB | Default size threshold | +| MicroPython | 100KB | Lower threshold for memory constraints | ### Maximum Payload Size From a8887b1fb620e9d6f48bce34643c486c5c1ba10e Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 13:53:59 +0700 Subject: [PATCH 10/29] update --- docs/requirements.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/requirements.md b/docs/requirements.md index f7452d3..dc6c0e4 100644 --- a/docs/requirements.md +++ b/docs/requirements.md @@ -17,7 +17,7 @@ NATSBridge is a cross-platform, bi-directional data bridge that enables seamless ### Primary Objectives -1. **Cross-Platform Interoperability**: Enable seamless data exchange between Julia, JavaScript, Python, and MicroPython applications without platform-specific barriers. +1. **Cross-Platform Interoperability**: Enable seamless data exchange between Julia, JavaScript (both server-side and client-side rendered web apps), Python, and MicroPython applications without platform-specific barriers. 2.
**Efficient Large Payload Handling**: Implement intelligent transport selection based on payload size: - **Direct Transport**: Small payloads (<0.5MB) sent directly via NATS @@ -46,9 +46,9 @@ NATSBridge is a cross-platform, bi-directional data bridge that enables seamless | Story | Priority | Acceptance Criteria | |-------|----------|---------------------| -| **As a Julia developer**, I want to send text messages to JavaScript applications | P1 | Text messages are serialized, encoded, and received correctly across platforms | +| **As a Julia developer**, I want to send text messages to JavaScript applications that run on a server or in a browser | P1 | Text messages are serialized, encoded, and received correctly across platforms | | **As a Python developer**, I want to send tabular data to Julia applications | P1 | DataFrame exchange works with both Arrow IPC and JSON formats | -| **As a JavaScript developer**, I want to send large files (>0.5MB) to other applications | P1 | Large files are automatically uploaded to file server and URLs are sent via NATS | +| **As a JavaScript developer**, I want to send large files (>0.5MB) from JavaScript applications that run on a server or in a browser to other applications | P1 | Large files are automatically uploaded to file server and URLs are sent via NATS | | **As a MicroPython developer**, I want to send sensor data with minimal memory usage | P1 | Direct transport works for payloads <100KB on memory-constrained devices | ### Multi-Payload Support From 5369df71489ca3f41aaf128eb0f91390ac9e014e Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 14:20:13 +0700 Subject: [PATCH 11/29] add spec.md --- docs/spec.md | 1025 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1025 insertions(+) create mode 100644 docs/spec.md diff --git a/docs/spec.md b/docs/spec.md new file mode 100644 index 0000000..fb79304 --- /dev/null +++ b/docs/spec.md @@ -0,0 +1,1025 @@ +# Specification: NATSBridge
+ +**Version**: 1.0.0 +**Date**: 2026-03-13 +**Status**: Active +**Ground Truth**: [`src/NATSBridge.jl`](../src/NATSBridge.jl) +**Specification Format**: JSON Schema + AsyncAPI + +--- + +## Executive Summary + +This document defines the **technical contract** for NATSBridge - the cross-platform bi-directional data bridge that enables seamless communication between **Julia**, **JavaScript**, **Python**, and **MicroPython** applications using NATS as the message bus. + +This specification serves as the single source of truth for: +- **Inputs**: What data structures are accepted by `smartsend()` +- **Outputs**: What data structures are returned by `smartreceive()` +- **Data Shapes**: Exact field names, types, and constraints +- **Error Codes**: Standardized error responses for failure scenarios + +--- + +## Specification Versioning + +| Component | Version | Notes | +|-----------|---------|-------| +| Specification | 1.0.0 | Initial release | +| Protocol | v1 | Message envelope protocol version | + +--- + +## Message Envelope Schema + +### Envelope Structure (JSON) + +```json +{ + "correlation_id": "string (UUID)", + "msg_id": "string (UUID)", + "timestamp": "string (ISO 8601 UTC)", + "send_to": "string", + "msg_purpose": "string", + "sender_name": "string", + "sender_id": "string (UUID)", + "receiver_name": "string", + "receiver_id": "string (UUID)", + "reply_to": "string", + "reply_to_msg_id": "string", + "broker_url": "string", + "metadata": "object", + "payloads": [ + { + "id": "string (UUID)", + "dataname": "string", + "payload_type": "string", + "transport": "string", + "encoding": "string", + "size": "integer", + "data": "string or URL", + "metadata": "object" + } + ] +} +``` + +### Field Definitions + +| Field | Type | Required | Validation | Description | +|-------|------|----------|------------|-------------| +| `correlation_id` | `string` | Yes | UUID v4 format | Track message flow across distributed systems | +| `msg_id` | `string` | Yes | UUID v4 format | 
Unique identifier for this specific message | +| `timestamp` | `string` | Yes | ISO 8601 UTC | Message publication timestamp (e.g., `2026-03-13T07:02:50.443Z`) | +| `send_to` | `string` | Yes | Non-empty string | NATS subject/topic to publish the message to | +| `msg_purpose` | `string` | Yes | Enum | Purpose of the message (see `msg_purpose` enum) | +| `sender_name` | `string` | Yes | Non-empty string | Name of the sender application | +| `sender_id` | `string` | Yes | UUID v4 format | Unique identifier for the sender | +| `receiver_name` | `string` | Yes | Any string | Name of the receiver (empty = broadcast) | +| `receiver_id` | `string` | Yes | Any string | UUID of the receiver (empty = broadcast) | +| `reply_to` | `string` | Yes | Any string | Topic where receiver should reply (empty = no reply expected) | +| `reply_to_msg_id` | `string` | Yes | Any string | Message ID this message is replying to | +| `broker_url` | `string` | Yes | Valid URL | NATS broker URL | +| `metadata` | `object` | No | Any JSON object | Message-level metadata | +| `payloads` | `array` | Yes | Non-empty array | List of payload objects | + +--- + +## Payload Schema + +### Payload Structure (JSON) + +```json +{ + "id": "string (UUID)", + "dataname": "string", + "payload_type": "string", + "transport": "string", + "encoding": "string", + "size": "integer", + "data": "string or URL", + "metadata": "object" +} +``` + +### Payload Field Definitions + +| Field | Type | Required | Validation | Description | +|-------|------|----------|------------|-------------| +| `id` | `string` | Yes | UUID v4 format | Unique identifier for this payload | +| `dataname` | `string` | Yes | Non-empty string | Name of the payload (e.g., `login_image`, `user_data`) | +| `payload_type` | `string` | Yes | Enum | Type of payload (see `payload_type` enum) | +| `transport` | `string` | Yes | Enum | Transport method: `direct` or `link` | +| `encoding` | `string` | Yes | Enum | Encoding method (see `encoding` enum) | +| 
`size` | `integer` | Yes | Positive integer | Size of the payload in bytes | +| `data` | `string` or `URL` | Yes | Base64 string or URL | Payload data (base64 for direct, URL for link) | +| `metadata` | `object` | No | Any JSON object | Payload-level metadata | + +--- + +## Enumerations + +### `msg_purpose` Enum + +| Value | Description | +|-------|-------------| +| `ACK` | Acknowledgment of successful message processing | +| `NACK` | Negative acknowledgment of message processing failure | +| `updateStatus` | Status update message | +| `shutdown` | Graceful shutdown request | +| `chat` | Chat/message payload | +| `command` | Command payload | +| `event` | Event payload | + +### `payload_type` Enum + +| Value | Description | Supported Platforms | Encoding Options | +|-------|-------------|---------------------|------------------| +| `text` | Plain text string | All | `base64` | +| `dictionary` | JSON object/dictionary | All | `base64`, `json` | +| `arrowtable` | Apache Arrow IPC table | Desktop (Julia/JS/Python) | `base64`, `arrow-ipc` | +| `jsontable` | JSON array of objects | All | `base64`, `json` | +| `image` | Binary image data | All | `base64` | +| `audio` | Binary audio data | All | `base64` | +| `video` | Binary video data | All | `base64` | +| `binary` | Generic binary data | All | `base64` | + +### `transport` Enum + +| Value | Description | Data Format | Use Case | +|-------|-------------|-------------|----------| +| `direct` | Payload sent directly via NATS | Base64-encoded string | Payloads < size_threshold | +| `link` | Payload uploaded to file server | HTTP URL | Payloads ≥ size_threshold | + +### `encoding` Enum + +| Value | Description | Payload Types | +|-------|-------------|---------------| +| `none` | No additional encoding | Link transport URLs | +| `base64` | Base64 encoding | Text, binary, image, audio, video | +| `json` | JSON encoding | Dictionary, jsontable | +| `arrow-ipc` | Apache Arrow IPC format | Arrowtable | + +--- + +## Transport 
Protocols + +### Direct Transport Protocol + +When `transport = "direct"`, the `data` field contains a Base64-encoded string of the serialized payload. + +**Flow**: +1. Serialize payload according to `payload_type` +2. Encode serialized bytes as Base64 +3. Include Base64 string in `data` field + +**Example**: +```json +{ + "transport": "direct", + "encoding": "base64", + "size": 11, + "data": "SGVsbG8gV29ybGQ=" +} +``` + +### Link Transport Protocol + +When `transport = "link"`, the `data` field contains a URL pointing to the uploaded payload. + +**Flow**: +1. Serialize payload according to `payload_type` +2. Upload to HTTP file server (e.g., Plik) +3. Include returned URL in `data` field + +**Example**: +```json +{ + "transport": "link", + "encoding": "none", + "size": 1000000, + "data": "http://localhost:8080/file/3F62E/4AgGT/data.zip" +} +``` + +--- + +## Size Thresholds + +### Desktop Platforms (Julia/JS/Python) + +| Platform | Size Threshold | Notes | +|----------|----------------|-------| +| Desktop | 500,000 bytes (0.5MB) | Default threshold | + +### MicroPython Platform + +| Platform | Size Threshold | Maximum Payload | Notes | +|----------|----------------|-----------------|-------| +| MicroPython | 100,000 bytes (100KB) | 50,000 bytes | Hard limit due to memory constraints | + +--- + +## NATS Subject Convention + +### Subject Naming Pattern + +``` +/// +``` + +**Examples**: +- `/agent/wine/api/v1/prompt` - AI agent prompt endpoint +- `/chat/user/v1/message` - User chat message +- `/system/worker/v1/status` - Worker status update + +### Subject Wildcards + +| Wildcard | Description | Example | +|----------|-------------|---------| +| `*` | Single-level wildcard | `/chat/user/v1/*` matches `/chat/user/v1/message` | +| `>` | Multi-level wildcard | `/chat/user/v1/>` matches all `/chat/user/v1/*` subjects | + +--- + +## Error Handling + +### Error Response Format + +```json +{ + "error": { + "code": "string", + "message": "string", + "details": "object" + } +} 
+``` + +### Error Codes + +| Code | HTTP Status | Description | Recovery | +|------|-------------|-------------|----------| +| `INVALID_ENVELOPE` | 400 | Message envelope validation failed | Fix envelope structure | +| `INVALID_PAYLOAD_TYPE` | 400 | Unsupported payload type | Use supported payload_type | +| `INVALID_TRANSPORT` | 400 | Unsupported transport type | Use `direct` or `link` | +| `UPLOAD_FAILED` | 500 | File server upload failed | Retry or use direct transport | +| `DOWNLOAD_FAILED` | 503 | File server download failed | Retry with exponential backoff | +| `NATS_CONNECTION_FAILED` | 503 | NATS connection failed | Check NATS server availability | +| `DESERIALIZATION_ERROR` | 500 | Payload deserialization failed | Check payload_type matches data | +| `SIZE_EXCEEDED` | 413 | Payload exceeds maximum size | Split payload or use link transport | + +### Exception Handling + +| Scenario | Handler | Retry Policy | +|----------|---------|--------------| +| File server unavailable | Retry up to 5 times | Exponential backoff (100ms → 5000ms) | +| NATS publish failure | Connection auto-reconnect | TCP-level reconnection | +| Deserialization error | Log correlation ID and throw | No retry (data corruption) | +| Memory overflow (MicroPython) | Reject payloads >50KB | No retry (client-side check) | + +--- + +## Serialization Rules + +### Text Serialization + +| Platform | Input Type | Serialization | Encoding | +|----------|------------|---------------|----------| +| All | `String` | UTF-8 bytes | Base64 | + +### Dictionary Serialization + +| Platform | Input Type | Serialization | Encoding | +|----------|------------|---------------|----------| +| All | `Object`/`Dict` | JSON string | Base64 or direct JSON | + +### Arrow Table Serialization + +| Platform | Input Type | Serialization | Encoding | +|----------|------------|---------------|----------| +| Desktop | `DataFrame` | Arrow IPC stream | Base64 or arrow-ipc | +| Desktop | `Arrow.Table` | Arrow IPC stream | Base64 
or arrow-ipc | +| MicroPython | ❌ | Not supported | N/A | + +### JSON Table Serialization + +| Platform | Input Type | Serialization | Encoding | +|----------|------------|---------------|----------| +| All | `Vector{Dict}`/`Array` | JSON array | Base64 or direct JSON | + +### Binary Serialization + +| Platform | Input Type | Serialization | Encoding | +|----------|------------|---------------|----------| +| All | `Uint8Array`/`Buffer`/`bytes` | Raw bytes | Base64 | + +--- + +## API Contract + +### `smartsend` Function Signature + +#### Julia + +```julia +function smartsend( + subject::String, + data::AbstractArray{Tuple{String, Any, String}}; + broker_url::String = "nats://localhost:4222", + fileserver_url::String = "http://localhost:8080", + fileserver_upload_handler::Function = plik_oneshot_upload, + size_threshold::Int = 500_000, + correlation_id::String = string(uuid4()), + msg_purpose::String = "chat", + sender_name::String = "NATSBridge", + receiver_name::String = "", + receiver_id::String = "", + reply_to::String = "", + reply_to_msg_id::String = "", + is_publish::Bool = true, + NATS_connection::Union{NATS.Connection, Nothing} = nothing, + msg_id::String = string(uuid4()), + sender_id::String = string(uuid4()) +)::Tuple{msg_envelope_v1, String} +``` + +#### Python + +```python +async def smartsend( + subject: str, + data: List[Tuple[str, Any, str]], + broker_url: str = "nats://localhost:4222", + fileserver_url: str = "http://localhost:8080", + fileserver_upload_handler: Callable = plik_oneshot_upload, + size_threshold: int = 500_000, + correlation_id: Optional[str] = None, + msg_purpose: str = "chat", + sender_name: str = "NATSBridge", + receiver_name: str = "", + receiver_id: str = "", + reply_to: str = "", + reply_to_msg_id: str = "", + is_publish: bool = True, + nats_connection: Any = None, + msg_id: Optional[str] = None, + sender_id: Optional[str] = None +) -> Tuple[Dict, str]: +``` + +#### JavaScript (Node.js) + +```typescript +async function smartsend( + subject: string, + data:
Array<[string, any, string]>, + options?: { + broker_url?: string; + fileserver_url?: string; + fileserver_upload_handler?: Function; + size_threshold?: number; + correlation_id?: string; + msg_purpose?: string; + sender_name?: string; + receiver_name?: string; + receiver_id?: string; + reply_to?: string; + reply_to_msg_id?: string; + is_publish?: boolean; + nats_connection?: NATS.Connection; + msg_id?: string; + sender_id?: string; + } +): Promise<[Object, string]>; +``` + +#### JavaScript (Browser) + +```typescript +async function smartsend( + subject: string, + data: Array<[string, any, string]>, + options?: { + broker_url?: string; + fileserver_url?: string; + fileserver_upload_handler?: Function; + size_threshold?: number; + correlation_id?: string; + msg_purpose?: string; + sender_name?: string; + receiver_name?: string; + receiver_id?: string; + reply_to?: string; + reply_to_msg_id?: string; + is_publish?: boolean; + nats_connection?: NATS.Connection; + msg_id?: string; + sender_id?: string; + } +): Promise<[Object, string]>; +``` + +#### MicroPython + +```python +def smartsend( + subject: str, + data: List[Tuple[str, Any, str]], + **kwargs +) -> Tuple[Dict, str]: +``` + +### `smartreceive` Function Signature + +#### Julia + +```julia +function smartreceive( + msg::NATS.Msg; + fileserver_download_handler::Function = _fetch_with_backoff, + max_retries::Int = 5, + base_delay::Int = 100, + max_delay::Int = 5000 +)::JSON.Object{String, Any} +``` + +#### Python + +```python +async def smartreceive( + msg: Any, + fileserver_download_handler: Callable = fetch_with_backoff, + max_retries: int = 5, + base_delay: int = 100, + max_delay: int = 5000 +) -> Dict[str, Any]: +``` + +#### JavaScript (Node.js) + +```typescript +async function smartreceive( + msg: Object, + options?: { + fileserver_download_handler?: Function; + max_retries?: number; + base_delay?: number; + max_delay?: number; + } +): Promise; +``` + +#### JavaScript (Browser) + +```typescript +async function 
smartreceive( + msg: Object, + options?: { + fileserver_download_handler?: Function; + max_retries?: number; + base_delay?: number; + max_delay?: number; + } +): Promise; +``` + +#### MicroPython + +```python +def smartreceive(msg: Any, **kwargs) -> Dict[str, Any]: +``` + +--- + +## File Server Interface + +### Upload Handler Contract + +**Function Signature**: +```julia +function fileserver_upload_handler( + file_server_url::String, + dataname::String, + data::Vector{UInt8} +)::Dict{String, Any} +``` + +**Return Format**: +```json +{ + "status": 200, + "uploadid": "string", + "fileid": "string", + "url": "string" +} +``` + +**Required Keys**: +| Key | Type | Description | +|-----|------|-------------| +| `status` | `integer` | HTTP response status code | +| `uploadid` | `string` | Upload session identifier | +| `fileid` | `string` | File identifier within session | +| `url` | `string` | Full download URL | + +### Download Handler Contract + +**Function Signature**: +```julia +function fileserver_download_handler( + url::String, + max_retries::Int, + base_delay::Int, + max_delay::Int, + correlation_id::String +)::Vector{UInt8} +``` + +**Retry Policy**: +- Initial delay: `base_delay` milliseconds +- Maximum delay: `max_delay` milliseconds +- Multiplier: 2x per retry +- Maximum retries: `max_retries` + +--- + +## Platform-Specific Constraints + +### Desktop (Julia/JS/Python) + +| Feature | Status | Notes | +|---------|--------|-------| +| Arrow IPC | ✅ Supported | Requires Arrow.jl/pyarrow | +| JSON table | ✅ Supported | Human-readable format | +| File server upload | ✅ Supported | HTTP/HTTPS | +| File server download | ✅ Supported | HTTP/HTTPS | +| Size threshold | 500KB | Configurable | + +### MicroPython + +| Feature | Status | Notes | +|---------|--------|-------| +| Arrow IPC | ❌ Not supported | Memory constraints | +| JSON table | ⚠️ Limited | Only direct transport | +| File server upload | ❌ Not implemented | Placeholder only | +| File server download | ❌ Not 
implemented | Placeholder only | +| Size threshold | 100KB | Hard limit enforced | +| Max payload | 50KB | Hard limit enforced | + +--- + +## Message Flow + +### Sending Flow + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ 1. User calls smartsend(subject, data) │ +└─────────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ 2. For each payload: │ +│ - Serialize data according to payload_type │ +│ - Calculate serialized size │ +└─────────────────────────────────────────────────────────────────┘ + │ + ├─ Size < Threshold ────────────────►┐ + │ │ + ▼ ▼ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 3. Direct Transport: │ │ +│ - Encode as Base64 │ │ +│ - Include in payload.data │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ │ + ▼ │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 4. Build envelope with metadata │ │ +│ - correlation_id, msg_id, timestamp │ │ +│ - sender/receiver info │ │ +│ - payloads array │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ │ + ▼ │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 5. Convert envelope to JSON string │ │ +│ 6. Publish to NATS subject │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 7. Return envelope and JSON string to caller │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ +``` + +### Receiving Flow + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ 1. NATS message arrives │ +└─────────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ 2. 
Parse JSON envelope │ +└─────────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ 3. For each payload: │ +│ - Check transport type │ +└─────────────────────────────────────────────────────────────────┘ + │ + ├─ transport == "direct" ──────────►┐ + │ │ + ▼ ▼ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 4. Direct Transport: │ │ +│ - Extract Base64 data │ │ +│ - Decode Base64 │ │ +│ - Deserialize based on payload_type │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ │ + ▼ │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 5. Link Transport: │ │ +│ - Extract URL from data │ │ +│ - Fetch with exponential backoff │ │ +│ - Deserialize based on payload_type │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ │ + ▼ │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 6. Replace payloads array with deserialized tuples │ │ +│ - [(dataname, data, type), ...] │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ +┌─────────────────────────────────────────────────────────────────┐ +│ 7. 
Return envelope with processed payloads │ +└─────────────────────────────────────────────────────────────────┘ +``` + +--- + +## Validation Rules + +### Envelope Validation + +| Rule | Condition | Error Code | +|------|-----------|------------| +| Required fields present | `correlation_id`, `msg_id`, `timestamp`, `send_to`, `payloads` | `INVALID_ENVELOPE` | +| Valid UUID format | `correlation_id`, `msg_id`, `sender_id`, `receiver_id` | `INVALID_ENVELOPE` | +| Valid timestamp format | ISO 8601 UTC | `INVALID_ENVELOPE` | +| Non-empty payloads array | `length(payloads) > 0` | `INVALID_ENVELOPE` | + +### Payload Validation + +| Rule | Condition | Error Code | +|------|-----------|------------| +| Valid payload_type | Must be in `payload_type` enum | `INVALID_PAYLOAD_TYPE` | +| Valid transport | Must be `direct` or `link` | `INVALID_TRANSPORT` | +| Valid encoding | Must match payload_type and transport | `INVALID_TRANSPORT` | +| Positive size | `size > 0` | `INVALID_PAYLOAD` | +| Valid Base64 for direct | `data` matches Base64 pattern | `DESERIALIZATION_ERROR` | +| Valid URL for link | `data` matches HTTP(S) URL pattern | `DOWNLOAD_FAILED` | + +--- + +## Test Contracts + +### Unit Test Validation + +| Test | Input | Expected Output | +|------|-------|-----------------| +| Text round-trip | `("msg", "Hello", "text")` | `("msg", "Hello", "text")` | +| Dictionary round-trip | `("data", {"key": "value"}, "dictionary")` | `("data", {"key": "value"}, "dictionary")` | +| Arrow table round-trip | `("table", DataFrame(...), "arrowtable")` | `("table", DataFrame(...), "arrowtable")` | +| Mixed payloads | `[("text", "Hello", "text"), ("img", bytes, "image")]` | `[("text", "Hello", "text"), ("img", bytes, "image")]` | +| Large payload | `("data", rand(10_000_000), "arrowtable")` | `("data", URL, "arrowtable")` with link transport | + +### Integration Test Scenarios + +| Scenario | Platforms | Expected Result | +|----------|-----------|-----------------| +| Julia ↔ JavaScript | 
Text, dictionary | Round-trip successful | +| Python ↔ Julia | Arrow table | Arrow IPC round-trip | +| JavaScript ↔ Python | Mixed content | All payloads preserved | +| Large file transfer | All platforms | File server upload/download | + +--- + +## Dependencies + +### Required Dependencies by Platform + +| Platform | Package | Version | Purpose | +|----------|---------|---------|---------| +| Julia | NATS.jl | Latest | NATS client | +| Julia | JSON.jl | Latest | JSON serialization | +| Julia | Arrow.jl | Latest | Arrow IPC support | +| Julia | HTTP.jl | Latest | HTTP file server | +| Julia | UUIDs.jl | Latest | UUID generation | +| Node.js | nats | Latest | NATS client | +| Node.js | node-fetch | Latest | HTTP file server | +| Python | nats-py | Latest | NATS client | +| Python | aiohttp | Latest | HTTP file server | +| Python | pyarrow | Latest | Arrow IPC support | +| MicroPython | builtin | N/A | Limited implementation | + +### Optional Dependencies + +| Platform | Package | Purpose | +|----------|---------|---------| +| Julia | DataFrames.jl | DataFrame support | +| Python | pandas | DataFrame support | + +--- + +## Change Log + +| Date | Version | Changes | +|------|---------|---------| +| 2026-03-13 | 1.0.0 | Initial specification | +| - | - | Message envelope schema defined | +| - | - | Payload schema with transport modes | +| - | - | Enumerations for payload_type, transport, encoding | +| - | - | Size thresholds for desktop/MicroPython | +| - | - | Error codes and validation rules | +| - | - | API contracts for all platforms | + +--- + +## References + +- [`docs/requirements.md`](./requirements.md) - Business requirements and user stories +- [`docs/architecture.md`](./architecture.md) - System architecture diagrams +- [`docs/implementation.md`](./implementation.md) - Implementation details +- [`src/NATSBridge.jl`](../src/NATSBridge.jl) - Ground truth implementation +- [`README.md`](../README.md) - Project overview + +--- + +## Appendix + +### A. 
Complete JSON Schema + +```json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "title": "NATSBridge Envelope", + "type": "object", + "properties": { + "correlation_id": { + "type": "string", + "pattern": "^[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}$", + "description": "UUID v4 format for tracking message flow" + }, + "msg_id": { + "type": "string", + "pattern": "^[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}$", + "description": "Unique message identifier" + }, + "timestamp": { + "type": "string", + "pattern": "^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{3}Z$", + "description": "ISO 8601 UTC timestamp" + }, + "send_to": { + "type": "string", + "minLength": 1, + "description": "NATS subject to publish to" + }, + "msg_purpose": { + "type": "string", + "enum": ["ACK", "NACK", "updateStatus", "shutdown", "chat", "command", "event"], + "description": "Purpose of the message" + }, + "sender_name": { + "type": "string", + "minLength": 1, + "description": "Sender application name" + }, + "sender_id": { + "type": "string", + "pattern": "^[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}$", + "description": "Sender UUID" + }, + "receiver_name": { + "type": "string", + "description": "Receiver name (empty = broadcast)" + }, + "receiver_id": { + "type": "string", + "pattern": "^[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}$|^$", + "description": "Receiver UUID (empty = broadcast)" + }, + "reply_to": { + "type": "string", + "description": "Topic for reply messages" + }, + "reply_to_msg_id": { + "type": "string", + "description": "Message ID being replied to" + }, + "broker_url": { + "type": "string", + "pattern": "^nats://[^\\s]+$", + "description": "NATS broker URL" + }, + "metadata": { + "type": "object", + "description": "Message-level metadata" + }, + "payloads": { + "type": "array", + "minItems": 1, + "items": { + "$ref": "#/definitions/Payload" + } + } + }, + "required": ["correlation_id", 
"msg_id", "timestamp", "send_to", "msg_purpose", "sender_name", "sender_id", "receiver_name", "receiver_id", "reply_to", "reply_to_msg_id", "broker_url", "payloads"], + "definitions": { + "Payload": { + "type": "object", + "properties": { + "id": { + "type": "string", + "pattern": "^[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}$" + }, + "dataname": { + "type": "string", + "minLength": 1 + }, + "payload_type": { + "type": "string", + "enum": ["text", "dictionary", "arrowtable", "jsontable", "image", "audio", "video", "binary"] + }, + "transport": { + "type": "string", + "enum": ["direct", "link"] + }, + "encoding": { + "type": "string", + "enum": ["none", "base64", "json", "arrow-ipc"] + }, + "size": { + "type": "integer", + "minimum": 1 + }, + "data": { + "anyOf": [ + { + "type": "string", + "pattern": "^(https?://[^\\s]+)$" + }, + { + "type": "string", + "pattern": "^[A-Za-z0-9+/]+=*$" + } + ] + }, + "metadata": { + "type": "object" + } + }, + "required": ["id", "dataname", "payload_type", "transport", "encoding", "size", "data"] + } + } +} +``` + +### B. 
AsyncAPI Specification (NATS) + +```yaml +asyncapi: '2.6.0' +info: + title: NATSBridge API + version: '1.0.0' + description: Cross-platform bi-directional data bridge using NATS + contact: + name: NATSBridge Team + url: https://github.com/your-org/NATSBridge + license: + name: MIT + url: https://opensource.org/licenses/MIT +channels: + /agent/{service}/api/v{version}/{operation}: + address: /agent/{service}/api/v{version}/{operation} + parameters: + service: + schema: + type: string + version: + schema: + type: string + enum: ['v1'] + operation: + schema: + type: string + publish: + summary: Publish message to NATS + operationId: publishMessage + message: + $ref: '#/components/message' + subscribe: + summary: Subscribe to NATS messages + operationId: subscribeMessage + message: + $ref: '#/components/message' +components: + message: + payload: + $ref: '#/components/schemas/Envelope' + schemas: + Envelope: + type: object + properties: + correlation_id: + type: string + format: uuid + msg_id: + type: string + format: uuid + timestamp: + type: string + format: date-time + send_to: + type: string + msg_purpose: + type: string + enum: [ACK, NACK, updateStatus, shutdown, chat, command, event] + sender_name: + type: string + sender_id: + type: string + format: uuid + receiver_name: + type: string + receiver_id: + type: string + format: uuid + reply_to: + type: string + reply_to_msg_id: + type: string + broker_url: + type: string + metadata: + type: object + payloads: + type: array + items: + $ref: '#/components/schemas/Payload' + required: + - correlation_id + - msg_id + - timestamp + - send_to + - msg_purpose + - sender_name + - sender_id + - receiver_name + - receiver_id + - reply_to + - reply_to_msg_id + - broker_url + - payloads + Payload: + type: object + properties: + id: + type: string + format: uuid + dataname: + type: string + payload_type: + type: string + enum: [text, dictionary, arrowtable, jsontable, image, audio, video, binary] + transport: + type: string + 
enum: [direct, link] + encoding: + type: string + enum: [none, base64, json, arrow-ipc] + size: + type: integer + minimum: 1 + data: + type: string + metadata: + type: object + required: + - id + - dataname + - payload_type + - transport + - encoding + - size + - data +``` + +--- + +*This specification is versioned and maintained in git alongside the codebase. All implementations must adhere to this specification.* From f045c2faef3f133b3f6cff6f37dc43c0bc4f9bc7 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 15:47:04 +0700 Subject: [PATCH 12/29] update --- docs/spec.md | 115 +++++++++++++++------------------------------------ 1 file changed, 33 insertions(+), 82 deletions(-) diff --git a/docs/spec.md b/docs/spec.md index fb79304..208ddf3 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -572,92 +572,43 @@ function fileserver_download_handler( ### Sending Flow -``` -┌─────────────────────────────────────────────────────────────────┐ -│ 1. User calls smartsend(subject, data) │ -└─────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ 2. For each payload: │ -│ - Serialize data according to payload_type │ -│ - Calculate serialized size │ -└─────────────────────────────────────────────────────────────────┘ - │ - ├─ Size < Threshold ────────────────►┐ - │ │ - ▼ ▼ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 3. Direct Transport: │ │ -│ - Encode as Base64 │ │ -│ - Include in payload.data │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ │ - ▼ │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 4. Build envelope with metadata │ │ -│ - correlation_id, msg_id, timestamp │ │ -│ - sender/receiver info │ │ -│ - payloads array │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ │ - ▼ │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 5. 
Convert envelope to JSON string │ │ -│ 6. Publish to NATS subject │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 7. Return envelope and JSON string to caller │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ +```mermaid +flowchart TD + A[User calls smartsend(subject, data)] --> B[Serialize payload according to payload_type] + B --> C[Calculate serialized size] + C --> D{Size < Threshold?} + D -->|Yes| E[Direct Transport: Base64 encode] + D -->|No| F[Link Transport: Upload to file server] + E --> G[Build envelope with metadata] + F --> G + G --> H[Convert to JSON string] + H --> I[Publish to NATS subject] + I --> J[Return envelope and JSON string] + + style A fill:#f9f9f9,stroke:#333 + style D fill:#e0e7ff,stroke:#3b82f6 + style I fill:#e0e7ff,stroke:#3b82f6 ``` ### Receiving Flow -``` -┌─────────────────────────────────────────────────────────────────┐ -│ 1. NATS message arrives │ -└─────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ 2. Parse JSON envelope │ -└─────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ 3. For each payload: │ -│ - Check transport type │ -└─────────────────────────────────────────────────────────────────┘ - │ - ├─ transport == "direct" ──────────►┐ - │ │ - ▼ ▼ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 4. Direct Transport: │ │ -│ - Extract Base64 data │ │ -│ - Decode Base64 │ │ -│ - Deserialize based on payload_type │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ │ - ▼ │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 5. 
Link Transport: │ │ -│ - Extract URL from data │ │ -│ - Fetch with exponential backoff │ │ -│ - Deserialize based on payload_type │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ │ - ▼ │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 6. Replace payloads array with deserialized tuples │ │ -│ - [(dataname, data, type), ...] │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ -┌─────────────────────────────────────────────────────────────────┐ -│ 7. Return envelope with processed payloads │ -└─────────────────────────────────────────────────────────────────┘ +```mermaid +flowchart TD + A[NATS message arrives] --> B[Parse JSON envelope] + B --> C[Process each payload] + C --> D{Check transport type} + D -->|direct| E[Extract Base64 data] + D -->|link| F[Extract URL from data] + E --> G[Decode Base64] + F --> H[Fetch with exponential backoff] + G --> I[Deserialize based on payload_type] + H --> I + I --> J[Build payloads array] + J --> K[Return envelope with processed payloads] + + style A fill:#f9f9f9,stroke:#333 + style D fill:#e0e7ff,stroke:#3b82f6 + style K fill:#e0e7ff,stroke:#3b82f6 ``` --- From 42fffb8a4f166640e4c2136f97648270a35395d2 Mon Sep 17 00:00:00 2001 From: ton Date: Fri, 13 Mar 2026 08:49:38 +0000 Subject: [PATCH 13/29] revert f045c2faef3f133b3f6cff6f37dc43c0bc4f9bc7 revert update --- docs/spec.md | 115 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 82 insertions(+), 33 deletions(-) diff --git a/docs/spec.md b/docs/spec.md index 208ddf3..fb79304 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -572,43 +572,92 @@ function fileserver_download_handler( ### Sending Flow -```mermaid -flowchart TD - A[User calls smartsend(subject, data)] --> B[Serialize payload according to payload_type] - B --> C[Calculate serialized size] - C --> D{Size < Threshold?} - D -->|Yes| E[Direct Transport: Base64 encode] - D -->|No| F[Link Transport: Upload to file server] - E 
--> G[Build envelope with metadata] - F --> G - G --> H[Convert to JSON string] - H --> I[Publish to NATS subject] - I --> J[Return envelope and JSON string] - - style A fill:#f9f9f9,stroke:#333 - style D fill:#e0e7ff,stroke:#3b82f6 - style I fill:#e0e7ff,stroke:#3b82f6 +``` +┌─────────────────────────────────────────────────────────────────┐ +│ 1. User calls smartsend(subject, data) │ +└─────────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ 2. For each payload: │ +│ - Serialize data according to payload_type │ +│ - Calculate serialized size │ +└─────────────────────────────────────────────────────────────────┘ + │ + ├─ Size < Threshold ────────────────►┐ + │ │ + ▼ ▼ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 3. Direct Transport: │ │ +│ - Encode as Base64 │ │ +│ - Include in payload.data │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ │ + ▼ │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 4. Build envelope with metadata │ │ +│ - correlation_id, msg_id, timestamp │ │ +│ - sender/receiver info │ │ +│ - payloads array │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ │ + ▼ │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 5. Convert envelope to JSON string │ │ +│ 6. Publish to NATS subject │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 7. 
Return envelope and JSON string to caller │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ ``` ### Receiving Flow -```mermaid -flowchart TD - A[NATS message arrives] --> B[Parse JSON envelope] - B --> C[Process each payload] - C --> D{Check transport type} - D -->|direct| E[Extract Base64 data] - D -->|link| F[Extract URL from data] - E --> G[Decode Base64] - F --> H[Fetch with exponential backoff] - G --> I[Deserialize based on payload_type] - H --> I - I --> J[Build payloads array] - J --> K[Return envelope with processed payloads] - - style A fill:#f9f9f9,stroke:#333 - style D fill:#e0e7ff,stroke:#3b82f6 - style K fill:#e0e7ff,stroke:#3b82f6 +``` +┌─────────────────────────────────────────────────────────────────┐ +│ 1. NATS message arrives │ +└─────────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ 2. Parse JSON envelope │ +└─────────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ 3. For each payload: │ +│ - Check transport type │ +└─────────────────────────────────────────────────────────────────┘ + │ + ├─ transport == "direct" ──────────►┐ + │ │ + ▼ ▼ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 4. Direct Transport: │ │ +│ - Extract Base64 data │ │ +│ - Decode Base64 │ │ +│ - Deserialize based on payload_type │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ │ + ▼ │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 5. Link Transport: │ │ +│ - Extract URL from data │ │ +│ - Fetch with exponential backoff │ │ +│ - Deserialize based on payload_type │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ │ + ▼ │ +┌─────────────────────────────────────────────────────────────────┐ │ +│ 6. 
Replace payloads array with deserialized tuples │ │ +│ - [(dataname, data, type), ...] │ │ +└─────────────────────────────────────────────────────────────────┘ │ + │ +┌─────────────────────────────────────────────────────────────────┐ +│ 7. Return envelope with processed payloads │ +└─────────────────────────────────────────────────────────────────┘ ``` --- From e99fb09298b188673e602b8214bcde38a16bbd5f Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 15:57:27 +0700 Subject: [PATCH 14/29] mermaid diagram --- docs/spec.md | 116 +++++++++++++++------------------------------------ 1 file changed, 34 insertions(+), 82 deletions(-) diff --git a/docs/spec.md b/docs/spec.md index fb79304..003de16 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -572,92 +572,44 @@ function fileserver_download_handler( ### Sending Flow -``` -┌─────────────────────────────────────────────────────────────────┐ -│ 1. User calls smartsend(subject, data) │ -└─────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ 2. For each payload: │ -│ - Serialize data according to payload_type │ -│ - Calculate serialized size │ -└─────────────────────────────────────────────────────────────────┘ - │ - ├─ Size < Threshold ────────────────►┐ - │ │ - ▼ ▼ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 3. Direct Transport: │ │ -│ - Encode as Base64 │ │ -│ - Include in payload.data │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ │ - ▼ │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 4. Build envelope with metadata │ │ -│ - correlation_id, msg_id, timestamp │ │ -│ - sender/receiver info │ │ -│ - payloads array │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ │ - ▼ │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 5. Convert envelope to JSON string │ │ -│ 6. 
Publish to NATS subject │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 7. Return envelope and JSON string to caller │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ +```mermaid +flowchart TD + A[User calls smartsend\(subject, data\)] --> B[Serialize payload according to payload_type] + B --> C{Calculate serialized size} + C -->|Size < Threshold| D[Direct Transport: Encode as Base64] + C -->|Size >= Threshold| E[Link Transport: Upload to file server] + D --> F[Build envelope with metadata] + E --> F + F --> G[Convert envelope to JSON string] + G --> H[Publish to NATS subject] + H --> I[Return envelope and JSON string to caller] + + style A fill:#f9f9f9,stroke:#333 + style I fill:#e0e7ff,stroke:#3b82f6 + style D fill:#d1fae5,stroke:#10b981 + style E fill:#fef3c7,stroke:#f59e0b ``` ### Receiving Flow -``` -┌─────────────────────────────────────────────────────────────────┐ -│ 1. NATS message arrives │ -└─────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ 2. Parse JSON envelope │ -└─────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ 3. For each payload: │ -│ - Check transport type │ -└─────────────────────────────────────────────────────────────────┘ - │ - ├─ transport == "direct" ──────────►┐ - │ │ - ▼ ▼ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 4. Direct Transport: │ │ -│ - Extract Base64 data │ │ -│ - Decode Base64 │ │ -│ - Deserialize based on payload_type │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ │ - ▼ │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 5. 
Link Transport: │ │ -│ - Extract URL from data │ │ -│ - Fetch with exponential backoff │ │ -│ - Deserialize based on payload_type │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ │ - ▼ │ -┌─────────────────────────────────────────────────────────────────┐ │ -│ 6. Replace payloads array with deserialized tuples │ │ -│ - [(dataname, data, type), ...] │ │ -└─────────────────────────────────────────────────────────────────┘ │ - │ -┌─────────────────────────────────────────────────────────────────┐ -│ 7. Return envelope with processed payloads │ -└─────────────────────────────────────────────────────────────────┘ +```mermaid +flowchart TD + A[NATS message arrives] --> B[Parse JSON envelope] + B --> C[For each payload: Check transport type] + C -->|transport == direct| D[Direct Transport: Extract Base64] + C -->|transport == link| E[Link Transport: Fetch from URL] + D --> F[Decode Base64] + E --> G[Fetch with exponential backoff] + F --> H[Deserialize based on payload_type] + G --> H + H --> I[Build payloads array] + I --> J[Replace payloads array with deserialized tuples] + J --> K[Return envelope with processed payloads] + + style A fill:#f9f9f9,stroke:#333 + style K fill:#e0e7ff,stroke:#3b82f6 + style D fill:#d1fae5,stroke:#10b981 + style E fill:#fef3c7,stroke:#f59e0b ``` --- From e4d668cebb541ceaa7afd451e101a56589be873d Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 16:02:39 +0700 Subject: [PATCH 15/29] fix Sending Flow mermaid code --- docs/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/spec.md b/docs/spec.md index 003de16..27a268c 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -574,7 +574,7 @@ function fileserver_download_handler( ```mermaid flowchart TD - A[User calls smartsend\(subject, data\)] --> B[Serialize payload according to payload_type] + A[User calls smartsend subject data] --> B[Serialize payload according to payload_type] B --> C{Calculate serialized size} C -->|Size < 
Threshold| D[Direct Transport: Encode as Base64] C -->|Size >= Threshold| E[Link Transport: Upload to file server] From d345ddbe86c0a86b7cca34d30dfcec20929b2a6e Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 16:27:49 +0700 Subject: [PATCH 16/29] update --- docs/spec.md | 24 +++++++++++++++++------- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/docs/spec.md b/docs/spec.md index 27a268c..07b40d4 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -306,6 +306,7 @@ When `transport = "link"`, the `data` field contains a URL pointing to the uploa | Platform | Input Type | Serialization | Encoding | |----------|------------|---------------|----------| | All | `Vector{Dict}`/`Array` | JSON array | Base64 or direct JSON | +| Desktop | `pandas.DataFrame` | JSON array | Base64 or direct JSON | ### Binary Serialization @@ -642,13 +643,20 @@ flowchart TD ### Unit Test Validation -| Test | Input | Expected Output | -|------|-------|-----------------| -| Text round-trip | `("msg", "Hello", "text")` | `("msg", "Hello", "text")` | -| Dictionary round-trip | `("data", {"key": "value"}, "dictionary")` | `("data", {"key": "value"}, "dictionary")` | -| Arrow table round-trip | `("table", DataFrame(...), "arrowtable")` | `("table", DataFrame(...), "arrowtable")` | -| Mixed payloads | `[("text", "Hello", "text"), ("img", bytes, "image")]` | `[("text", "Hello", "text"), ("img", bytes, "image")]` | -| Large payload | `("data", rand(10_000_000), "arrowtable")` | `("data", URL, "arrowtable")` with link transport | +| Test | Input | Expected Output | Notes | +|------|-------|-----------------|-------| +| Text round-trip | `("msg", "Hello", "text")` | `("msg", "Hello", "text")` | String serialization | +| Dictionary round-trip | `("data", {"key": "value"}, "dictionary")` | `("data", {"key": "value"}, "dictionary")` | JSON object round-trip | +| Arrow table round-trip | `("table", arrow_table_data, "arrowtable")` | `("table", arrow_table_data, "arrowtable")` | Arrow 
IPC round-trip | +| JSON table round-trip | `("table", [{"a":1},{"b":2}], "jsontable")` | `("table", [{"a":1},{"b":2}], "jsontable")` | JSON array of objects | +| Mixed payloads | `[("text", "Hello", "text"), ("img", bytes, "image")]` | `[("text", "Hello", "text"), ("img", bytes, "image")]` | Multiple payload types | +| Large payload | `("data", rand(10_000_000), "arrowtable")` | `("data", URL, "arrowtable")` with link transport | File server upload | + +**Platform-Specific Notes:** +- **Julia**: Use `Dict`, `Vector{Dict}`, or convert `DataFrame` to dictionary for testing +- **Python**: Use `dict`, `list[dict]`, or convert `pandas.DataFrame` to dictionary for testing +- **JavaScript**: Use plain objects `{}` and arrays `[]` +- **MicroPython**: Use plain `dict` and `list` (limited to JSON table and text types) ### Integration Test Scenarios @@ -658,6 +666,8 @@ flowchart TD | Python ↔ Julia | Arrow table | Arrow IPC round-trip | | JavaScript ↔ Python | Mixed content | All payloads preserved | | Large file transfer | All platforms | File server upload/download | +| Cross-platform JSON table | All platforms | Dictionary array round-trip | +| MicroPython ↔ Desktop | Text, dictionary only | Limited payload types | --- From 1b41d2d3e619a24da2216f767daf287c36f0fd44 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 17:05:45 +0700 Subject: [PATCH 17/29] updata --- docs/spec.md | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 68 insertions(+), 1 deletion(-) diff --git a/docs/spec.md b/docs/spec.md index 07b40d4..e9eac22 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -116,6 +116,73 @@ This specification serves as the single source of truth for: --- +## Payload Format + +### Tuple Format for `smartsend()` + +The `smartsend()` function accepts data as an array of tuples with the format: + +``` +("data_name", data, "data_type") +``` + +| Position | Type | Description | Example | +|----------|------|-------------|---------| +| 1 | `string` | 
Data name - identifier for the payload | `"msg"`, `"login_image"`, `"user_data"` | +| 2 | `any` | Actual data - content to be serialized | `"Hello"`, `{"key": "value"}`, `DataFrame(...)` | +| 3 | `string` | Data type - must be in `payload_type` enum | `"text"`, `"dictionary"`, `"arrowtable"` | + +### Single Payload Example + +```julia +# Julia +smartsend("/chat/user/v1/message", [("msg", "Hello World", "text")]) +``` + +```python +# Python +await smartsend("/chat/user/v1/message", [("msg", "Hello World", "text")]) +``` + +```typescript +// JavaScript +await smartsend("/chat/user/v1/message", [["msg", "Hello World", "text"]]); +``` + +### Multiple Payloads Example + +```julia +# Julia - Mixed text and binary data +data = [ + ("msg", "Hello", "text"), + ("img", binary_data, "image") +] +smartsend("/agent/v1/process", data) +``` + +```python +# Python - Mixed types +data = [ + ("msg", "Hello", "text"), + ("img", binary_data, "image") +] +await smartsend("/agent/v1/process", data) +``` + +### Data Type Mapping + +| Platform | Input Type | Data Type String | +|----------|------------|------------------| +| All | `String` | `"text"` | +| All | `Dict`/`Object` | `"dictionary"` | +| Desktop | `DataFrame` | `"arrowtable"` or `"jsontable"` | +| All | `Array` of objects | `"jsontable"` | +| All | `Uint8Array`/`Buffer`/`bytes` | `"binary"` | +| Desktop | `Arrow.Table` | `"arrowtable"` | +| All | Image/Audio/Video binary | `"image"`, `"audio"`, `"video"` | + +--- + ## Enumerations ### `msg_purpose` Enum @@ -649,7 +716,7 @@ flowchart TD | Dictionary round-trip | `("data", {"key": "value"}, "dictionary")` | `("data", {"key": "value"}, "dictionary")` | JSON object round-trip | | Arrow table round-trip | `("table", arrow_table_data, "arrowtable")` | `("table", arrow_table_data, "arrowtable")` | Arrow IPC round-trip | | JSON table round-trip | `("table", [{"a":1},{"b":2}], "jsontable")` | `("table", [{"a":1},{"b":2}], "jsontable")` | JSON array of objects | -| Mixed payloads | 
`[("text", "Hello", "text"), ("img", bytes, "image")]` | `[("text", "Hello", "text"), ("img", bytes, "image")]` | Multiple payload types | +| Mixed payloads | `[("msg", "Hello", "text"), ("imgname", bytes, "binary")]` | `[("msg", "Hello", "text"), ("imgname", bytes, "binary")]` | Multiple payload types | | Large payload | `("data", rand(10_000_000), "arrowtable")` | `("data", URL, "arrowtable")` with link transport | File server upload | **Platform-Specific Notes:** From 24d818bfe144cd235bb41c5e48f13c086570200d Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 17:19:11 +0700 Subject: [PATCH 18/29] update test --- docs/spec.md | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/docs/spec.md b/docs/spec.md index e9eac22..20b113f 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -727,14 +727,18 @@ flowchart TD ### Integration Test Scenarios -| Scenario | Platforms | Expected Result | -|----------|-----------|-----------------| -| Julia ↔ JavaScript | Text, dictionary | Round-trip successful | -| Python ↔ Julia | Arrow table | Arrow IPC round-trip | -| JavaScript ↔ Python | Mixed content | All payloads preserved | -| Large file transfer | All platforms | File server upload/download | -| Cross-platform JSON table | All platforms | Dictionary array round-trip | -| MicroPython ↔ Desktop | Text, dictionary only | Limited payload types | +| Scenario | Platforms | Payload Type | Size | Transport | Expected Result | +|----------|-----------|--------------|------|-----------|-----------------| +| Single text payload | All | text | Small | direct | Round-trip successful | +| Single dictionary payload | All | dictionary | Small | direct | Round-trip successful | +| Single arrow table | Julia/JS/Python | arrowtable | Small | direct | Arrow IPC round-trip | +| Single JSON table | All | jsontable | Small | direct | Dictionary array round-trip | +| Mixed payloads (text + image) | All | text + image | Small | direct | All payloads 
preserved | +| Mixed payloads (dict + binary) | All | dictionary + binary | Small | direct | All payloads preserved | +| Large file (link transport) | All | arrowtable/image | Large | link | File server upload/download | +| Cross-platform JSON table | All | jsontable | Small | direct | Dictionary array round-trip | +| MicroPython ↔ Desktop | MicroPython ↔ Desktop | text/dictionary | Small | direct | Limited payload types | +| Desktop ↔ Desktop (all combos) | Julia↔JS↔Python | All types | Small/Large | direct/link | Full compatibility | --- From 6b9d175e827a89d2b6ec38fe247dd3b97180860e Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 17:29:22 +0700 Subject: [PATCH 19/29] update --- docs/spec.md | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/docs/spec.md b/docs/spec.md index 20b113f..d4da533 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -727,17 +727,19 @@ flowchart TD ### Integration Test Scenarios -| Scenario | Platforms | Payload Type | Size | Transport | Expected Result | -|----------|-----------|--------------|------|-----------|-----------------| -| Single text payload | All | text | Small | direct | Round-trip successful | -| Single dictionary payload | All | dictionary | Small | direct | Round-trip successful | -| Single arrow table | Julia/JS/Python | arrowtable | Small | direct | Arrow IPC round-trip | -| Single JSON table | All | jsontable | Small | direct | Dictionary array round-trip | -| Mixed payloads (text + image) | All | text + image | Small | direct | All payloads preserved | -| Mixed payloads (dict + binary) | All | dictionary + binary | Small | direct | All payloads preserved | -| Large file (link transport) | All | arrowtable/image | Large | link | File server upload/download | -| Cross-platform JSON table | All | jsontable | Small | direct | Dictionary array round-trip | -| MicroPython ↔ Desktop | MicroPython ↔ Desktop | text/dictionary | Small | direct | Limited payload types | +| 
Scenario | Platforms | Payloads | Size Mix | Transport | Expected Result | +|----------|-----------|----------|----------|-----------|-----------------| +| Single text (small) | All | `text` | Small | direct | Round-trip successful | +| Single dictionary (small) | All | `dictionary` | Small | direct | Round-trip successful | +| Single arrow table (small) | Julia/JS/Python | `arrowtable` | Small | direct | Arrow IPC round-trip | +| Single JSON table (small) | All | `jsontable` | Small | direct | Dictionary array round-trip | +| Single text (large) | All | `text` | Large | link | File server upload/download | +| Single JSON table (large) | All | `jsontable` | Large | link | File server upload/download | +| Mixed payloads (small) | All | `text` + `dictionary` + `image` | All small | direct | All payloads preserved | +| Mixed payloads (large) | All | `text` + `dictionary` + `image` | All large | link | All payloads via file server | +| Mixed payloads (combo) | All | `text` (small) + `image` (large) | Mixed | direct/link | Correct transport per payload | +| Cross-platform JSON table | All | `jsontable` | Small | direct | Dictionary array round-trip | +| MicroPython ↔ Desktop | MicroPython ↔ Desktop | `text`/`dictionary` | Small | direct | Limited payload types | | Desktop ↔ Desktop (all combos) | Julia↔JS↔Python | All types | Small/Large | direct/link | Full compatibility | --- From 8d31c5829b62b01248ff75d96409ddbfd63869b7 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 17:37:21 +0700 Subject: [PATCH 20/29] update --- docs/spec.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/docs/spec.md b/docs/spec.md index d4da533..099c8df 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -733,11 +733,17 @@ flowchart TD | Single dictionary (small) | All | `dictionary` | Small | direct | Round-trip successful | | Single arrow table (small) | Julia/JS/Python | `arrowtable` | Small | direct | Arrow IPC round-trip | | Single JSON table (small) 
| All | `jsontable` | Small | direct | Dictionary array round-trip | +| Single image (small) | All | `image` | Small | direct | Binary round-trip | +| Single audio (small) | All | `audio` | Small | direct | Binary round-trip | +| Single video (small) | All | `video` | Small | direct | Binary round-trip | +| Single binary (small) | All | `binary` | Small | direct | Binary round-trip | | Single text (large) | All | `text` | Large | link | File server upload/download | | Single JSON table (large) | All | `jsontable` | Large | link | File server upload/download | -| Mixed payloads (small) | All | `text` + `dictionary` + `image` | All small | direct | All payloads preserved | -| Mixed payloads (large) | All | `text` + `dictionary` + `image` | All large | link | All payloads via file server | -| Mixed payloads (combo) | All | `text` (small) + `image` (large) | Mixed | direct/link | Correct transport per payload | +| Single image (large) | All | `image` | Large | link | File server upload/download | +| **Ultimate Test** | Julia/JS/Python | `text` (small) + `dictionary` (small) + `arrowtable` (small) + `jsontable` (small) + `image` (small) + `audio` (small) + `video` (small) + `binary` (small) + `text` (large) + `dictionary` (large) + `arrowtable` (large) + `jsontable` (large) + `image` (large) | Mixed | direct/link | All payloads preserved with correct transport | +| **Ultimate Test** | Python | `text` (small) + `dictionary` (small) + `jsontable` (small) + `image` (small) + `audio` (small) + `video` (small) + `binary` (small) + `text` (large) + `dictionary` (large) + `jsontable` (large) + `image` (large) | Mixed | direct/link | All payloads preserved with correct transport | +| **Ultimate Test** | JavaScript | `text` (small) + `dictionary` (small) + `jsontable` (small) + `image` (small) + `audio` (small) + `video` (small) + `binary` (small) + `text` (large) + `dictionary` (large) + `jsontable` (large) + `image` (large) | Mixed | direct/link | All payloads preserved with 
correct transport | +| **Ultimate Test** | MicroPython | `text` (small) + `dictionary` (small) + `text` (large) + `dictionary` (large) | Mixed | direct | Limited to text/dictionary with direct transport only | | Cross-platform JSON table | All | `jsontable` | Small | direct | Dictionary array round-trip | | MicroPython ↔ Desktop | MicroPython ↔ Desktop | `text`/`dictionary` | Small | direct | Limited payload types | | Desktop ↔ Desktop (all combos) | Julia↔JS↔Python | All types | Small/Large | direct/link | Full compatibility | From 3e6ac1430aa0c6fb496dc9af772d3d2c19d2ba00 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 17:40:15 +0700 Subject: [PATCH 21/29] update --- docs/spec.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/spec.md b/docs/spec.md index 099c8df..7908319 100644 --- a/docs/spec.md +++ b/docs/spec.md @@ -741,8 +741,6 @@ flowchart TD | Single JSON table (large) | All | `jsontable` | Large | link | File server upload/download | | Single image (large) | All | `image` | Large | link | File server upload/download | | **Ultimate Test** | Julia/JS/Python | `text` (small) + `dictionary` (small) + `arrowtable` (small) + `jsontable` (small) + `image` (small) + `audio` (small) + `video` (small) + `binary` (small) + `text` (large) + `dictionary` (large) + `arrowtable` (large) + `jsontable` (large) + `image` (large) | Mixed | direct/link | All payloads preserved with correct transport | -| **Ultimate Test** | Python | `text` (small) + `dictionary` (small) + `jsontable` (small) + `image` (small) + `audio` (small) + `video` (small) + `binary` (small) + `text` (large) + `dictionary` (large) + `jsontable` (large) + `image` (large) | Mixed | direct/link | All payloads preserved with correct transport | -| **Ultimate Test** | JavaScript | `text` (small) + `dictionary` (small) + `jsontable` (small) + `image` (small) + `audio` (small) + `video` (small) + `binary` (small) + `text` (large) + `dictionary` (large) + `jsontable` (large) + `image` 
(large) | Mixed | direct/link | All payloads preserved with correct transport | | **Ultimate Test** | MicroPython | `text` (small) + `dictionary` (small) + `text` (large) + `dictionary` (large) | Mixed | direct | Limited to text/dictionary with direct transport only | | Cross-platform JSON table | All | `jsontable` | Small | direct | Dictionary array round-trip | | MicroPython ↔ Desktop | MicroPython ↔ Desktop | `text`/`dictionary` | Small | direct | Limited payload types | From 7bc3e4992a0694936b120362b8c3a4307996630e Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 18:41:18 +0700 Subject: [PATCH 22/29] update architecture.md --- docs/architecture.md | 990 ++++++++++++++++++++++------------- docs/earlier_architecture.md | 475 +++++++++++++++++ 2 files changed, 1091 insertions(+), 374 deletions(-) create mode 100644 docs/earlier_architecture.md diff --git a/docs/architecture.md b/docs/architecture.md index b1f7929..340679b 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -1,379 +1,605 @@ -# Cross-Platform Architecture Documentation: Bi-Directional Data Bridge +# Architecture Documentation: NATSBridge -## Overview - -This document describes the architecture for a high-performance, bi-directional data bridge using **NATS (Core & JetStream)**, implementing the Claim-Check pattern for large payloads. The system is implemented across three platforms with **high-level API parity** while maintaining **idiomatic implementations** for each language. - -**Supported Platforms:** -- **Julia** - Ground truth implementation with full feature set -- **JavaScript** - Node.js and browser-compatible implementation -- **Python/MicroPython** - Desktop and embedded-compatible implementation - -### Cross-Platform Design Principles - -1. **High-Level API Parity**: All three platforms expose the same `smartsend()` and `smartreceive()` functions with identical signatures and behavior -2. 
**Idiomatic Implementations**: Each platform uses its native patterns (multiple dispatch in Julia, async/prototype in JS, class-based in Python) -3. **Message Format Consistency**: The `msg_envelope_v1` and `msg_payload_v1` JSON schemas are identical across all platforms -4. **Handler Function Abstraction**: File server operations are abstracted through handler functions for backend flexibility +**Version**: 1.0.0 +**Date**: 2026-03-13 +**Status**: Active +**Ground Truth**: [`src/NATSBridge.jl`](../src/NATSBridge.jl) +**Architecture Level**: C4 Container Level --- -## High-Level API Standard (Cross-Platform) +## Executive Summary -### Unified API Signature +This document defines the **blueprint** for NATSBridge - the cross-platform bi-directional data bridge that enables seamless communication between **Julia**, **JavaScript**, **Python**, and **MicroPython** applications using NATS as the message bus. -All three platforms expose the same high-level API: - -**Input Format (smartsend):** -``` -[(dataname1, data1, type1), (dataname2, data2, type2), ...] -``` - -**Output Format (smartreceive):** -``` -{ - "correlation_id": "...", - "msg_id": "...", - "timestamp": "...", - "send_to": "...", - "msg_purpose": "...", - "sender_name": "...", - "sender_id": "...", - "receiver_name": "...", - "receiver_id": "...", - "reply_to": "...", - "reply_to_msg_id": "...", - "broker_url": "...", - "metadata": {...}, - "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...] 
-} -``` - -### Supported Payload Types - -| Type | Julia | JavaScript | Python/MicroPython | -|------|-------|------------|-------------------| -| `text` | `String` | `string` | `str` | -| `dictionary` | `Dict`, `NamedTuple` | `Object`, `Array` | `dict`, `list` | -| `arrowtable` | `DataFrame`, `Arrow.Table` | `Array` (input) → `Buffer` (Arrow IPC) | `pandas.DataFrame`, `bytes` (Arrow IPC) | -| `jsontable` | `Vector{NamedTuple}`, `Vector{Dict}` | `Array` | `list[dict]`, `list` | -| `table` | ❌ | ❌ | `pandas.DataFrame`, `bytes` (Arrow IPC) | -| `image` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | -| `audio` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | -| `video` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | -| `binary` | `Vector{UInt8}`, `IOBuffer` | `Uint8Array`, `Buffer` | `bytes`, `bytearray`, `io.BytesIO` | - -**Note on MicroPython:** MicroPython does not support table types (`arrowtable` or `jsontable`) due to memory constraints. Use `dictionary` or `binary` instead. 
- -### Cross-Platform API Examples - -**Julia:** -```julia -using NATSBridge - -# Send -env, env_json_str = smartsend( - "/chat", - [("message", "Hello!", "text"), ("image", image_bytes, "image")], - broker_url="nats://localhost:4222" -) - -# Receive - returns JSON.Object{String, Any} -env = smartreceive(msg; fileserver_download_handler=_fetch_with_backoff) -# env is a JSON.Object{String, Any} with "payloads" field containing Vector{Tuple{String, Any, String}} -# Access payloads: for (dataname, data, type) in env["payloads] -``` - -**JavaScript:** -```javascript -const NATSBridge = require('natsbridge'); - -// Send -const [env, env_json_str] = await NATSBridge.smartsend( - "/chat", - [ - ["message", "Hello!", "text"], - ["image", imageBuffer, "image"] - ], - { broker_url: "nats://localhost:4222" } -); - -// Receive - returns Promise -const env = await NATSBridge.smartreceive(msg, { - fileserver_download_handler: fetchWithBackoff -}); -// env is an object with "payloads" field containing Array of arrays -// Access payloads: for (const [dataname, data, type] of env.payloads) -``` - -**Python:** -```python -from natsbridge import NATSBridge - -# Send -env, env_json_str = NATSBridge.smartsend( - "/chat", - [("message", "Hello!", "text"), ("image", image_bytes, "image")], - broker_url="nats://localhost:4222" -) - -# Receive - returns Tuple[Dict, str] -env = NATSBridge.smartreceive( - msg, - fileserver_download_handler=fetch_with_backoff -) -# env is a Dict with "payloads" key containing List[Tuple[str, Any, str]] -# Access payloads: for dataname, data, type_ in env["payloads"] -``` - -**MicroPython:** -```python -from natsbridge import NATSBridge - -# Send (limited to direct transport due to memory constraints) -env, env_json_str = NATSBridge.smartsend( - "/chat", - [("message", "Hello!", "text")], - broker_url="nats://localhost:4222" -) -``` +This architecture document serves as the single source of truth for: +- **System Structure**: How components fit together and 
interact +- **Scaling Considerations**: How the system scales horizontally and vertically +- **Failure Modes**: How the system handles failures and recovers +- **Trade-off Decisions**: The rationale behind architectural decisions --- -## Architecture Diagram (Cross-Platform) +## Architecture Overview + +### C4 Context Diagram ```mermaid flowchart TD - subgraph Client - App[Julia/JS/Python/MicroPython Application] + subgraph "External Systems" + NATS_Server[NATS Server] + File_Server[HTTP File Server
Plik/AWS S3/Custom] end - subgraph Server - Julia/JS/Python/MicroPython[Julia/JS/Python/MicroPython Service] - NATS[NATS Server] - FileServer[HTTP File Server] + subgraph "Client Applications" + Julia_App[Julia Application] + JS_App[JavaScript Application
Node.js/Browser] + Python_App[Python Application
Desktop] + MicroPython_App[MicroPython Device] end - App -->|NATS| NATS - NATS -->|NATS| Julia/JS/Python/MicroPython - Julia/JS/Python/MicroPython -->|NATS| NATS - Julia/JS/Python/MicroPython -->|HTTP POST| FileServer + Julia_App -->|NATS| NATS_Server + JS_App -->|NATS| NATS_Server + Python_App -->|NATS| NATS_Server + MicroPython_App -->|NATS| NATS_Server - style App fill:#e8f5e9 - style Julia/JS/Python/MicroPython fill:#e8f5e9 - style NATS fill:#fff3e0 - style FileServer fill:#f3e5f5 + Julia_App -->|HTTP| File_Server + JS_App -->|HTTP| File_Server + Python_App -->|HTTP| File_Server + MicroPython_App -->|HTTP| File_Server + + style NATS_Server fill:#fff3e0,stroke:#f57c00 + style File_Server fill:#f3e5f5,stroke:#9c27b4 + style Julia_App fill:#e8f5e9,stroke:#4caf50 + style JS_App fill:#e3f2fd,stroke:#2196f3 + style Python_App fill:#e3f2fd,stroke:#2196f3 + style MicroPython_App fill:#fce4ec,stroke:#e91e63 +``` + +### C4 Container Diagram + +```mermaid +flowchart TD + subgraph "Client Container" + Julia_Module[Julia NATSBridge Module] + JS_Module[JavaScript NATSBridge Module] + Python_Module[Python NATSBridge Module] + MicroPython_Module[MicroPython NATSBridge Module] + end + + subgraph "NATS Container" + NATS_Client[NATS Client] + NATS_Broker[NATS Broker] + end + + subgraph "File Server Container" + File_Client[HTTP Client] + File_Server[File Server] + end + + Julia_Module --> NATS_Client + JS_Module --> NATS_Client + Python_Module --> NATS_Client + MicroPython_Module --> NATS_Client + + NATS_Client --> NATS_Broker + + Julia_Module --> File_Client + JS_Module --> File_Client + Python_Module --> File_Client + MicroPython_Module --> File_Client + + File_Client --> File_Server + + style Julia_Module fill:#e8f5e9,stroke:#4caf50 + style JS_Module fill:#e3f2fd,stroke:#2196f3 + style Python_Module fill:#e3f2fd,stroke:#2196f3 + style MicroPython_Module fill:#fce4ec,stroke:#e91e63 + style NATS_Broker fill:#fff3e0,stroke:#f57c00 + style File_Server fill:#f3e5f5,stroke:#9c27b4 
+``` + +### C4 Component Diagram (Julia Implementation) + +```mermaid +flowchart TD + subgraph "NATSBridge Module" + SmartSend[smartsend Function] + SmartReceive[smartreceive Function] + + Serialize[_serialize_data] + Deserialize[_deserialize_data] + + BuildEnvelope[build_envelope] + BuildPayload[build_payload] + + PublishMessage[publish_message] + + FileServerUpload[fileserver_upload_handler] + FileServerDownload[fileserver_download_handler] + end + + subgraph "Data Models" + Payload[MsgPayloadV1 Struct] + Envelope[MsgEnvelopeV1 Struct] + end + + SmartSend --> Serialize + SmartSend --> BuildEnvelope + SmartSend --> BuildPayload + SmartSend --> PublishMessage + SmartSend --> FileServerUpload + + SmartReceive --> Deserialize + SmartReceive --> FileServerDownload + + Serialize --> Payload + BuildEnvelope --> Envelope + BuildPayload --> Payload + + style SmartSend fill:#d1fae5,stroke:#10b981 + style SmartReceive fill:#d1fae5,stroke:#10b981 + style PublishMessage fill:#fef3c7,stroke:#f59e0b + style FileServerUpload fill:#fef3c7,stroke:#f59e0b + style FileServerDownload fill:#fef3c7,stroke:#f59e0b ``` --- -## System Components +## High-Level Architecture -### 1. 
msg_envelope_v1 - Message Envelope +### System Components + +| Component | Purpose | Platform Support | +|-----------|---------|------------------| +| **smartsend** | Send data via NATS with automatic transport selection | All | +| **smartreceive** | Receive and process NATS messages | All | +| **_serialize_data** | Serialize data according to payload type | All | +| **_deserialize_data** | Deserialize bytes to native data types | All | +| **_build_envelope** | Build message envelope from payloads | All | +| **_build_payload** | Build payload object from serialized data | All | +| **publish_message** | Publish message to NATS subject | All | +| **fileserver_upload_handler** | Upload large payloads to HTTP server | Desktop | +| **fileserver_download_handler** | Download payloads from HTTP server | Desktop | + +### Data Flow + +```mermaid +flowchart TD + A[User calls smartsend subject data] --> B[Process each payload] + B --> C{Calculate serialized size} + C -->|Size < Threshold| D[Direct Transport] + C -->|Size >= Threshold| E[Link Transport] + + D --> F[Serialize data] + F --> G[Base64 encode] + G --> H[Build payload object] + + E --> I[Serialize data] + I --> J[Upload to file server] + J --> K[Get download URL] + K --> H + + H --> L[Build envelope] + L --> M[Convert to JSON] + M --> N[Publish to NATS] + + style A fill:#f9f9f9,stroke:#333 + style N fill:#e0e7ff,stroke:#3b82f6 + style D fill:#d1fae5,stroke:#10b981 + style E fill:#fef3c7,stroke:#f59e0b +``` + +--- + +## Message Envelope Architecture + +### msg_envelope_v1 Structure (Julia) + +```julia +struct msg_envelope_v1 + correlation_id::String # UUID v4 for distributed tracing + msg_id::String # UUID v4 for this message + timestamp::String # ISO 8601 UTC timestamp + + send_to::String # NATS subject to publish to + msg_purpose::String # ACK, NACK, updateStatus, shutdown, chat + sender_name::String # Sender application name + sender_id::String # UUID v4 of sender + receiver_name::String # Receiver application 
name (empty = broadcast) + receiver_id::String # UUID v4 of receiver (empty = broadcast) + + reply_to::String # Topic for reply messages + reply_to_msg_id::String # Message ID being replied to + broker_url::String # NATS broker URL + + metadata::Dict{String, Any} # Message-level metadata + payloads::Vector{msg_payload_v1} # List of payloads +end +``` + +### msg_payload_v1 Structure (Julia) + +```julia +struct msg_payload_v1 + id::String # UUID v4 for this payload + dataname::String # Name of the payload + payload_type::String # text, dictionary, arrowtable, etc. + transport::String # direct or link + encoding::String # none, json, base64, arrow-ipc + size::Integer # Size in bytes + data::Any # Base64 string or URL + metadata::Dict{String, Any} # Payload-level metadata +end +``` + +### JSON Schema (Cross-Platform) -**JSON Schema (Identical Across All Platforms):** ```json { - "correlation_id": "uuid-v4-string", - "msg_id": "uuid-v4-string", - "timestamp": "2024-01-15T10:30:00Z", - - "send_to": "topic/subject", - "msg_purpose": "ACK | NACK | updateStatus | shutdown | chat", - "sender_name": "agent-wine-web-frontend", - "sender_id": "uuid4", - "receiver_name": "agent-backend", - "receiver_id": "uuid4", - "reply_to": "topic", - "reply_to_msg_id": "uuid4", - "broker_url": "nats://localhost:4222", - - "metadata": { - "content_type": "application/octet-stream", - "content_length": 123456 - }, - + "correlation_id": "string (UUID v4)", + "msg_id": "string (UUID v4)", + "timestamp": "string (ISO 8601 UTC)", + "send_to": "string", + "msg_purpose": "string", + "sender_name": "string", + "sender_id": "string (UUID v4)", + "receiver_name": "string", + "receiver_id": "string (UUID v4)", + "reply_to": "string", + "reply_to_msg_id": "string", + "broker_url": "string", + "metadata": "object", "payloads": [ { - "id": "uuid4", - "dataname": "login_image", - "payload_type": "image", - "transport": "direct", - "encoding": "base64", - "size": 15433, - "data": "base64-encoded-string", - 
"metadata": { - "checksum": "sha256_hash" - } - }, - { - "id": "uuid4", - "dataname": "large_arrow_table", - "payload_type": "arrowtable", - "transport": "link", - "encoding": "arrow-ipc", - "size": 524288, - "data": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/data.arrow", - "metadata": {} + "id": "string (UUID v4)", + "dataname": "string", + "payload_type": "string", + "transport": "string", + "encoding": "string", + "size": "integer", + "data": "string or URL", + "metadata": "object" } ] } ``` -### 2. msg_payload_v1 - Payload Structure +--- -**JSON Schema (Identical Across All Platforms):** -```json -{ - "id": "uuid4", - "dataname": "login_image", - "payload_type": "image | dictionary | arrowtable | jsontable | table | text | audio | video | binary", - "transport": "direct | link", - "encoding": "none | json | base64 | arrow-ipc", - "size": 15433, - "data": "base64-encoded-string | http-url | json-string", - "metadata": { - "checksum": "sha256_hash" - } -} -``` +## Payload Type Architecture -### 3. Transport Strategy Decision Logic (Cross-Platform) +### Supported Payload Types -``` -┌─────────────────────────────────────────────────────────────┐ -│ smartsend Function (All Platforms) │ -│ Accepts: [(dataname1, data1, type1), ...] │ -│ (Type is per payload, not standalone) │ -└─────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ For each payload: │ -│ 1. Extract type from tuple/array │ -│ 2. Serialize based on type │ -│ 3. 
Check payload size │ -└─────────────────────────────────────────────────────────────┘ - │ - ┌───────────┴────────────┐ - ▼ ▼ - ┌──────────────┐ ┌──────────────┐ - │ Direct Path │ │ Link Path │ - │ (< 1MB) │ │ (>= 1MB) │ - │ │ │ │ - │ • Serialize │ │ • Serialize │ - │ to buffer │ │ to buffer │ - │ • Base64/JSON│ │ • Upload to │ - │ encode │ │ HTTP Server│ - │ • Publish to │ │ • Publish to │ - │ NATS │ │ NATS with │ - │ (in msg) │ │ URL │ - └──────────────┘ └──────────────┘ +| Type | Description | Serialization | Encoding | Platforms | +|------|-------------|---------------|----------|-----------| +| `text` | Plain text string | UTF-8 bytes | Base64 | All | +| `dictionary` | JSON object | JSON string | Base64/JSON | All | +| `arrowtable` | Apache Arrow IPC | Arrow IPC stream | Base64/arrow-ipc | Desktop | +| `jsontable` | JSON array of objects | JSON string | Base64/json | All | +| `image` | Binary image data | Raw bytes | Base64 | All | +| `audio` | Binary audio data | Raw bytes | Base64 | All | +| `video` | Binary video data | Raw bytes | Base64 | All | +| `binary` | Generic binary data | Raw bytes | Base64 | All | + +### Serialization Logic + +```mermaid +flowchart TD + A[Input data + payload_type] --> B{Payload Type} + + B -->|"text"| C[UTF-8 encode] + B -->|"dictionary"| D[JSON serialize] + B -->|"arrowtable"| E[Arrow IPC serialize] + B -->|"jsontable"| F[JSON serialize] + B -->|"image"| G[Raw bytes] + B -->|"audio"| H[Raw bytes] + B -->|"video"| I[Raw bytes] + B -->|"binary"| J[Raw bytes] + + C --> K[Return bytes] + D --> K + E --> K + F --> K + G --> K + H --> K + I --> K + J --> K + + style A fill:#f9f9f9,stroke:#333 + style K fill:#e0e7ff,stroke:#3b82f6 ``` --- -## Platform Comparison Matrix +## Transport Strategy Architecture -| Feature | Julia | JavaScript | Python | MicroPython | -|---------|-------|------------|--------|-------------| -| **Multiple Dispatch** | ✅ Native | ❌ (Prototypes) | ❌ (Overload via `@overload`) | ❌ | -| **Async/Await** | ❌ (Tasks) 
| ✅ Native | ✅ Native | ⚠️ (uasyncio) | -| **Type Safety** | ✅ Strong | ⚠️ (TypeScript) | ✅ (Type hints) | ❌ | -| **Memory Management** | ✅ GC | ✅ GC | ✅ GC | ⚠️ (Manual) | -| **Arrow IPC** | ✅ Native | ✅ (arrow package) | ✅ (pyarrow) | ❌ | -| **JSON Serialization** | ✅ (JSON.jl) | ✅ (native) | ✅ (json) | ✅ (json) | -| **arrowtable Support** | ✅ | ✅ | ✅ | ❌ | -| **jsontable Support** | ✅ | ✅ | ✅ | ❌ | -| **Direct Transport** | ✅ | ✅ | ✅ | ✅ | -| **Link Transport** | ✅ | ✅ | ✅ | ⚠️ (Limited) | -| **Handler Functions** | ✅ | ✅ | ✅ | ✅ | -| **Cross-Platform API** | ✅ | ✅ | ✅ | ✅ | +### Size Threshold Decision Logic + +| Platform | Size Threshold | Notes | +|----------|----------------|-------| +| Desktop (Julia/JS/Python) | 500,000 bytes (0.5MB) | Default threshold | +| MicroPython | 100,000 bytes (100KB) | Lower threshold for memory constraints | + +### Transport Selection Flow + +```mermaid +flowchart TD + A[smartsend called] --> B[Serialize payload] + B --> C[Calculate size] + C --> D{Size < Threshold?} + + D -->|Yes| E[Direct Transport] + D -->|No| F[Link Transport] + + E --> G[Base64 encode] + G --> H[Build payload with direct transport] + + F --> I[Upload to file server] + I --> J[Get download URL] + J --> K[Build payload with link transport] + + H --> L[Build envelope] + K --> L + + style A fill:#f9f9f9,stroke:#333 + style L fill:#e0e7ff,stroke:#3b82f6 + style E fill:#d1fae5,stroke:#10b981 + style F fill:#fef3c7,stroke:#f59e0b +``` + +### Direct Transport Protocol + +When `transport = "direct"`, the `data` field contains a Base64-encoded string of the serialized payload. + +**Encoding Rules**: +- `text`: UTF-8 → Base64 +- `dictionary`: JSON → Base64 (or direct JSON) +- `arrowtable`: Arrow IPC → Base64 (or arrow-ipc) +- `jsontable`: JSON → Base64 (or direct JSON) +- `image`/`audio`/`video`/`binary`: Raw bytes → Base64 + +### Link Transport Protocol + +When `transport = "link"`, the `data` field contains a URL pointing to the uploaded payload. 
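The threshold check and the direct/link payload shapes described above can be sketched in Python. This is a minimal illustration rather than the NATSBridge implementation: `serialize`, `build_payload`, and the `upload` callable are hypothetical names, `arrowtable` is left out because it would need `pyarrow`, and the `encoding` value chosen for link transport is an assumption, since the text only fixes it for the direct case.

```python
import base64
import json
import uuid

SIZE_THRESHOLD = 500_000  # desktop default from the threshold table


def serialize(data, payload_type):
    """Turn a value into bytes per payload type (arrowtable omitted in this sketch)."""
    if payload_type == "text":
        return data.encode("utf-8")
    if payload_type in ("dictionary", "jsontable"):
        return json.dumps(data).encode("utf-8")
    if payload_type in ("image", "audio", "video", "binary"):
        return bytes(data)
    raise ValueError(f"unsupported payload_type in this sketch: {payload_type}")


def build_payload(dataname, data, payload_type, upload=None, threshold=SIZE_THRESHOLD):
    """Build a msg_payload_v1-shaped dict, picking direct or link transport by size."""
    raw = serialize(data, payload_type)
    if len(raw) < threshold:
        # Direct transport: the serialized bytes travel inside the message as Base64.
        transport, encoding = "direct", "base64"
        body = base64.b64encode(raw).decode("ascii")
    else:
        # Link transport: bytes go to the file server and only the URL is published.
        if upload is None:
            raise ValueError("link transport needs an upload handler")
        transport = "link"
        # Assumption: JSON-based types report "json", everything else "none".
        encoding = "json" if payload_type in ("dictionary", "jsontable") else "none"
        body = upload(dataname, raw)  # hypothetical handler returning a download URL
    return {
        "id": str(uuid.uuid4()),
        "dataname": dataname,
        "payload_type": payload_type,
        "transport": transport,
        "encoding": encoding,
        "size": len(raw),
        "data": body,
        "metadata": {},
    }
```

Passing `threshold=100_000` would mirror the MicroPython row of the threshold table; the upload handler is injected so that the file server backend stays swappable.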
+ +**Upload Flow**: +1. Serialize payload according to `payload_type` +2. Upload to HTTP file server (e.g., Plik) +3. Include returned URL in `data` field + +**Download Flow**: +1. Extract URL from payload +2. Fetch with exponential backoff (max 5 retries) +3. Deserialize based on `payload_type` --- -## Platform-Specific Architecture Patterns +## Platform-Specific Architecture -### Julia: Multiple Dispatch Pattern +### Julia Architecture Julia leverages multiple dispatch for type-specific implementations: -- **Function overloading** based on argument types -- **Struct-based data models** with explicit types -- **Native Arrow IPC** support via Arrow.jl +- **Multiple Dispatch**: Function overloading based on argument types +- **Struct-based Data Models**: Explicit type definitions with `struct` +- **Native Arrow IPC**: Support via `Arrow.jl` +- **Async/Await**: Tasks for non-blocking I/O -### JavaScript: Prototype + Async Pattern +```julia +# Multiple dispatch for serialization +function _serialize_data(data::String, payload_type::String) + # Text serialization +end + +function _serialize_data(data::Dict, payload_type::String) + # Dictionary serialization +end + +function _serialize_data(data::DataFrame, payload_type::String) + # Arrow table serialization +end +``` + +### JavaScript Architecture JavaScript uses async/await for non-blocking I/O: -- **Class-based NATS client** for connection management -- **Module-level utility functions** for serialization -- **Native ArrayBuffer** for binary data handling +- **Class-based NATS Client**: Connection management +- **Module-level Utilities**: Serialization functions +- **Native ArrayBuffer**: Binary data handling +- **Fetch API**: HTTP file server communication -### Python: Class-Based Pattern +```javascript +// Class-based NATS client +class NATSClient { + constructor(url) { + this.url = url; + this.connection = null; + } + + async connect() { + this.connection = await nats.connect({ servers: this.url }); + } +} +``` + 
+### Python Architecture Python uses classes for stateful operations: -- **Class-based NATSBridge** with type hints -- **Dataclasses** for structured data (MsgPayloadV1, MsgEnvelopeV1) -- **Async/await** for I/O operations +- **Class-based NATSBridge**: Encapsulated API +- **Dataclasses**: Structured data (MsgPayloadV1, MsgEnvelopeV1) +- **Async/await**: I/O operations +- **pyarrow**: Arrow IPC support -### MicroPython: Synchronous Pattern +```python +class NATSBridge: + DEFAULT_SIZE_THRESHOLD = 500_000 + + def __init__(self, broker_url=None, fileserver_url=None): + self.broker_url = broker_url or self.DEFAULT_BROKER_URL + self.fileserver_url = fileserver_url or self.DEFAULT_FILESERVER_URL +``` + +### MicroPython Architecture MicroPython has significant constraints: -- **Synchronous API** (no async/await) -- **Memory-constrained** (256KB - 1MB) -- **Limited payload support** (no tables, max 50KB) +- **Synchronous API**: No async/await +- **Memory-constrained**: 256KB - 1MB +- **Limited payload support**: No tables, max 50KB +- **Simplified UUID generation**: Custom implementation + +```python +# MicroPython constraints +DEFAULT_SIZE_THRESHOLD = 100_000 # 100KB +MAX_PAYLOAD_SIZE = 50_000 # 50KB hard limit +``` --- -## Cross-Platform Compatibility Notes +## Scaling Architecture -### 1. 
Payload Type Consistency +### Horizontal Scaling -All platforms use the same payload type values for tabular data: +| Component | Scaling Strategy | +|-----------|------------------| +| **NATS Server** | Cluster deployment with multiple nodes | +| **File Server** | Load balancer + multiple instances | +| **Client Applications** | Deploy multiple instances behind load balancer | -| Platform | Table Types | -|----------|-------------| -| Julia | `"arrowtable"`, `"jsontable"` | -| JavaScript | `"arrowtable"`, `"jsontable"` | -| Python | `"arrowtable"`, `"jsontable"` | -| MicroPython | Not supported | +### Vertical Scaling +| Component | Scaling Strategy | +|-----------|------------------| +| **NATS Server** | Increase memory, CPU, disk I/O | +| **File Server** | Increase memory, CPU, disk capacity | +| **Client Applications** | Increase heap size (Python/JS) | -### 2. Direct Transport Encoding Field +### Performance Considerations -The encoding field in direct transport payloads differs between platforms: - -| Platform | Encoding for Direct Transport | -|----------|-------------------------------| -| Julia | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | -| JavaScript | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | -| Python | Always `"base64"` for all direct transport payloads | -| MicroPython | Always `"base64"` for all direct transport payloads | - -**Impact:** The encoding field may not accurately reflect the original serialization format when using Python or MicroPython. - -### 3. 
MicroPython Limitations - -MicroPython has significant constraints that affect feature support: - -| Feature | Desktop Platforms | MicroPython | -|---------|-------------------|-------------| -| `arrowtable` | ✅ | ❌ (not supported - memory constraints) | -| `jsontable` | ✅ | ❌ (not supported - memory constraints) | -| `table` | ✅ | ❌ (not supported - memory constraints) | -| Async/await | ✅ | ❌ (synchronous only) | -| File upload/download | ✅ | ⚠️ (placeholder implementations) | -| MAX_PAYLOAD_SIZE | 1MB+ | 50KB (hard limit) | -| DEFAULT_SIZE_THRESHOLD | 1MB | 100KB | - -**Impact:** MicroPython should only be used for small payloads with direct transport. File server operations are not fully implemented. +| Metric | Target | Notes | +|--------|--------|-------| +| Message serialization overhead | <50ms | For 10KB payload | +| Message deserialization overhead | <50ms | For 10KB payload | +| NATS connection establishment | <100ms | Connection pool recommended | +| File upload latency | <1s | For 0.5MB file | +| File download latency | <1s | For 0.5MB file | --- -## Configuration +## Failure Modes and Recovery + +### NATS Connection Failure + +**Scenario**: NATS server unavailable + +**Handler**: +- Connection auto-reconnect via TCP-level reconnection +- Retry with exponential backoff for publish operations + +**Recovery**: +- NATS client automatically attempts reconnection +- Application can check connection status before publishing + +### File Server Unavailable + +**Scenario**: HTTP file server unavailable during upload/download + +**Handler**: +- Retry up to 5 times with exponential backoff (100ms → 5000ms) +- Fallback to direct transport for upload (MicroPython) + +**Recovery**: +- Exponential backoff: `delay = min(delay * 2, max_delay)` +- After max retries, throw error with correlation ID + +### Deserialization Error + +**Scenario**: Payload type mismatch or corrupted data + +**Handler**: +- Log correlation ID and throw error +- No retry (data corruption) + 
+**Recovery**: +- Application must validate payload_type matches data type +- Use proper serialization before sending + +### Memory Overflow (MicroPython) + +**Scenario**: Payload exceeds maximum size (50KB) + +**Handler**: +- Reject payloads >50KB with MemoryError +- No retry (client-side check) + +**Recovery**: +- Application must split large payloads +- Use direct transport only for small payloads + +--- + +## Trade-off Decisions + +### Decision 1: Direct vs Link Transport Threshold + +**Trade-off**: Memory vs Network I/O + +**Decision**: Use 0.5MB threshold for desktop, 100KB for MicroPython + +**Rationale**: +- Direct transport uses more memory (Base64 encoding adds ~33% overhead) +- Link transport requires network I/O for upload/download +- 0.5MB is reasonable for desktop memory constraints +- 100KB is necessary for MicroPython memory constraints + +### Decision 2: Base64 Encoding for Direct Transport + +**Trade-off**: Bandwidth vs Simplicity + +**Decision**: Use Base64 encoding for all direct transport payloads + +**Rationale**: +- Simplifies JSON serialization (all data is string-compatible) +- Increases payload size by ~33%, but NATS can handle this +- Alternative would be binary payload support (more complex) + +### Decision 3: Multiple Platform Implementations + +**Trade-off**: Development effort vs Cross-platform support + +**Decision**: Maintain separate implementations for each platform + +**Rationale**: +- Each platform has idiomatic patterns (multiple dispatch, async/await, etc.) 
+- Maintains developer productivity and code quality +- API parity ensures cross-platform compatibility + +### Decision 4: Handler Function Abstraction + +**Trade-off**: Flexibility vs Simplicity + +**Decision**: Abstract file server operations through handler functions + +**Rationale**: +- Allows support for different file server implementations (Plik, AWS S3, custom) +- Maintains simplicity for common use cases +- Enables plug-in architecture for custom backends + +--- + +## Deployment Architecture + +### Minimum Infrastructure + +| Component | Minimum | Notes | +|-----------|---------|-------| +| NATS Server | 1 instance | Single node for development | +| File Server | 1 instance | HTTP server for large payloads | +| Client Memory | 50MB | Desktop platforms | +| Client Memory | 256KB | MicroPython devices | ### Environment Variables @@ -381,95 +607,111 @@ MicroPython has significant constraints that affect feature support: |----------|---------|-------------| | `NATS_URL` | `nats://localhost:4222` | NATS server URL | | `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | -| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes (1MB) | +| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes | -### MicroPython-Specific Configuration +### Container Deployment -```python -# micropython.conf -NATS_URL = "nats://broker.local:4222" -FILESERVER_URL = "http://fileserver.local:8080" -SIZE_THRESHOLD = 100000 # Lower threshold for memory-constrained devices -MAX_PAYLOAD_SIZE = 50000 # Hard limit for MicroPython +```mermaid +flowchart TD + subgraph "Docker Network" + NATS_Container[NATS Server] + FileServer_Container[Plik File Server] + App_Container[Application Container] + end + + App_Container -->|NATS| NATS_Container + App_Container -->|HTTP| FileServer_Container + + style NATS_Container fill:#fff3e0,stroke:#f57c00 + style FileServer_Container fill:#f3e5f5,stroke:#9c27b4 + style App_Container fill:#e3f2fd,stroke:#2196f3 ``` --- -## Performance 
Considerations +## Security Considerations -### Zero-Copy Reading +### Payload Integrity -| Platform | Strategy | -|----------|----------| -| **Julia** | `Arrow.read()` with memory-mapped files | -| **JavaScript** | `ArrayBuffer` with `DataView` | -| **Python** | `pyarrow` memory mapping | -| **MicroPython** | Not available (streaming only) | +**Mechanism**: SHA-256 checksum via metadata -### Exponential Backoff +**Implementation**: +- Sender calculates checksum and stores in payload metadata +- Receiver validates checksum on receipt -All platforms implement exponential backoff for HTTP downloads: +### Transport Security -``` -delay = base_delay -for attempt in 1:max_retries: - try: - response = fetch(url) - if success: return response - except: - if attempt < max_retries: - sleep(delay) - delay = min(delay * 2, max_delay) -``` +**Mechanism**: TLS support for NATS connections -### Correlation ID Logging +**Implementation**: +- Use `nats://` URL for plain text +- Use `tls://` URL for TLS-encrypted connections -All platforms use correlation IDs for distributed tracing: +### File Server Security -``` -[timestamp] [Correlation: abc123] Message published to subject -``` +**Mechanism**: Authentication token for file uploads -### Serialization Performance Comparison - -| Format | Use Case | Pros | Cons | -|--------|----------|------|------| -| `arrowtable` | Large tabular data | Fast, zero-copy, schema-preserving | Binary format, requires Arrow library | -| `jsontable` | Small/medium tabular data | Human-readable, universal support | Slower, larger size, no schema | -| `table` (Python) | Large tabular data | Fast, zero-copy, schema-preserving | Python-specific, requires pyarrow | +**Implementation**: +- Plik uses upload token in `X-UploadToken` header +- Application can implement custom authentication --- -## Summary +## Testing Architecture -This cross-platform NATS bridge provides: +### Unit Test Coverage -1. 
**High-Level API Parity**: Identical `smartsend()` and `smartreceive()` signatures across Julia, JavaScript, and Python/MicroPython -2. **Idiomatic Implementations**: - - Julia: Multiple dispatch and struct-based design - - JavaScript: Async/await and prototype-based utilities - - Python: Class-based design with type hints - - MicroPython: Synchronous API with memory constraints -3. **Message Format Consistency**: Identical `msg_envelope_v1` and `msg_payload_v1` JSON schemas -4. **Handler Abstraction**: File server operations abstracted through configurable handlers -5. **Platform-Specific Optimizations**: - - **Arrow IPC** (`arrowtable`): Efficient binary format for large tabular data - - **JSON** (`jsontable`): Universal human-readable format for smaller tables - - **Python table**: Unified table type for Python-specific implementations - - Streaming support in MicroPython +| Test Category | Coverage | Files | +|---------------|----------|-------| +| Serialization | All payload types | `test/test_*_sender.*` | +| Deserialization | All payload types | `test/test_*_receiver.*` | +| Transport selection | Direct vs link | `test/test_*_mix_payloads.*` | +| File server upload | Plik integration | Platform-specific | +| File server download | Exponential backoff | Platform-specific | -The Julia implementation serves as the **ground truth** for API design and behavior, while JavaScript and Python implementations maintain interface parity while leveraging their respective language idioms. 
+### Integration Test Scenarios -### Datatype Summary +| Scenario | Platforms | Payloads | Transport | Expected Result | +|----------|-----------|----------|-----------|-----------------| +| Cross-platform text | Julia ↔ JS ↔ Python | text | direct | Round-trip successful | +| Arrow IPC round-trip | Julia ↔ JS ↔ Python | arrowtable | direct | Arrow IPC preserved | +| Large file transfer | All | image/audio/video | link | File server upload/download | +| Multi-payload mixed | All | text + image + file | direct/link | All payloads preserved | -| Datatype | Serialization | Use Case | Encoding | Supported Platforms | -|----------|---------------|----------|----------|---------------------| -| `text` | UTF-8 bytes | Text messages, chat content | `utf-8` → `base64` | All | -| `dictionary` | JSON | Structured key-value data, config | `json` → `base64` | All | -| `arrowtable` | Apache Arrow IPC | Large tabular data, schema-preserving | `arrow-ipc` → `base64` | Julia, JavaScript, Python | -| `jsontable` | JSON | Small/medium tabular data, human-readable | `json` → `base64` | Julia, JavaScript, Python | -| `table` | Apache Arrow IPC | Python's unified table type | `arrow-ipc` → `base64` | Python | -| `image` | Binary | Image files (JPEG, PNG, etc.) | `binary` → `base64` | All | -| `audio` | Binary | Audio files (WAV, MP3, etc.) | `binary` → `base64` | All | -| `video` | Binary | Video files (MP4, AVI, etc.) 
| `binary` → `base64` | All | -| `binary` | Binary | Generic binary data, files | `binary` → `base64` | All | +--- + +## Versioning + +### Architecture Versioning + +| Component | Version | Notes | +|-----------|---------|-------| +| Architecture | 1.0.0 | Initial release | +| Protocol | v1 | Message envelope protocol version | + +### Backward Compatibility + +| Version | Supported Platforms | +|---------|---------------------| +| v1.0.x | Julia 1.7+, Node.js 16+, Python 3.8+, MicroPython 1.19+ | + +--- + +## Change Log + +| Date | Version | Changes | +|------|---------|---------| +| 2026-03-13 | 1.0.0 | Initial architecture documentation | + +--- + +## References + +- [`docs/requirements.md`](./requirements.md) - Business requirements and user stories +- [`docs/spec.md`](./spec.md) - Technical specification and contracts +- [`src/NATSBridge.jl`](../src/NATSBridge.jl) - Ground truth implementation +- [`README.md`](../README.md) - Project overview + +--- + +*This architecture document is versioned and maintained in git alongside the codebase. All implementations must adhere to this architecture.* diff --git a/docs/earlier_architecture.md b/docs/earlier_architecture.md new file mode 100644 index 0000000..b1f7929 --- /dev/null +++ b/docs/earlier_architecture.md @@ -0,0 +1,475 @@ +# Cross-Platform Architecture Documentation: Bi-Directional Data Bridge + +## Overview + +This document describes the architecture for a high-performance, bi-directional data bridge using **NATS (Core & JetStream)**, implementing the Claim-Check pattern for large payloads. The system is implemented across three platforms with **high-level API parity** while maintaining **idiomatic implementations** for each language. + +**Supported Platforms:** +- **Julia** - Ground truth implementation with full feature set +- **JavaScript** - Node.js and browser-compatible implementation +- **Python/MicroPython** - Desktop and embedded-compatible implementation + +### Cross-Platform Design Principles + +1. 
**High-Level API Parity**: All three platforms expose the same `smartsend()` and `smartreceive()` functions with identical signatures and behavior +2. **Idiomatic Implementations**: Each platform uses its native patterns (multiple dispatch in Julia, async/prototype in JS, class-based in Python) +3. **Message Format Consistency**: The `msg_envelope_v1` and `msg_payload_v1` JSON schemas are identical across all platforms +4. **Handler Function Abstraction**: File server operations are abstracted through handler functions for backend flexibility + +--- + +## High-Level API Standard (Cross-Platform) + +### Unified API Signature + +All three platforms expose the same high-level API: + +**Input Format (smartsend):** +``` +[(dataname1, data1, type1), (dataname2, data2, type2), ...] +``` + +**Output Format (smartreceive):** +``` +{ + "correlation_id": "...", + "msg_id": "...", + "timestamp": "...", + "send_to": "...", + "msg_purpose": "...", + "sender_name": "...", + "sender_id": "...", + "receiver_name": "...", + "receiver_id": "...", + "reply_to": "...", + "reply_to_msg_id": "...", + "broker_url": "...", + "metadata": {...}, + "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...] 
+} +``` + +### Supported Payload Types + +| Type | Julia | JavaScript | Python/MicroPython | +|------|-------|------------|-------------------| +| `text` | `String` | `string` | `str` | +| `dictionary` | `Dict`, `NamedTuple` | `Object`, `Array` | `dict`, `list` | +| `arrowtable` | `DataFrame`, `Arrow.Table` | `Array` (input) → `Buffer` (Arrow IPC) | `pandas.DataFrame`, `bytes` (Arrow IPC) | +| `jsontable` | `Vector{NamedTuple}`, `Vector{Dict}` | `Array` | `list[dict]`, `list` | +| `table` | ❌ | ❌ | `pandas.DataFrame`, `bytes` (Arrow IPC) | +| `image` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | +| `audio` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | +| `video` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | +| `binary` | `Vector{UInt8}`, `IOBuffer` | `Uint8Array`, `Buffer` | `bytes`, `bytearray`, `io.BytesIO` | + +**Note on MicroPython:** MicroPython does not support table types (`arrowtable` or `jsontable`) due to memory constraints. Use `dictionary` or `binary` instead. 
+ +### Cross-Platform API Examples + +**Julia:** +```julia +using NATSBridge + +# Send +env, env_json_str = smartsend( + "/chat", + [("message", "Hello!", "text"), ("image", image_bytes, "image")], + broker_url="nats://localhost:4222" +) + +# Receive - returns JSON.Object{String, Any} +env = smartreceive(msg; fileserver_download_handler=_fetch_with_backoff) +# env is a JSON.Object{String, Any} with "payloads" field containing Vector{Tuple{String, Any, String}} +# Access payloads: for (dataname, data, type) in env["payloads"] +``` + +**JavaScript:** +```javascript +const NATSBridge = require('natsbridge'); + +// Send +const [env, env_json_str] = await NATSBridge.smartsend( + "/chat", + [ + ["message", "Hello!", "text"], + ["image", imageBuffer, "image"] + ], + { broker_url: "nats://localhost:4222" } +); + +// Receive - returns Promise +const env = await NATSBridge.smartreceive(msg, { + fileserver_download_handler: fetchWithBackoff +}); +// env is an object with "payloads" field containing Array of arrays +// Access payloads: for (const [dataname, data, type] of env.payloads) +``` + +**Python:** +```python +from natsbridge import NATSBridge + +# Send +env, env_json_str = NATSBridge.smartsend( + "/chat", + [("message", "Hello!", "text"), ("image", image_bytes, "image")], + broker_url="nats://localhost:4222" +) + +# Receive - returns Dict +env = NATSBridge.smartreceive( + msg, + fileserver_download_handler=fetch_with_backoff +) +# env is a Dict with "payloads" key containing List[Tuple[str, Any, str]] +# Access payloads: for dataname, data, type_ in env["payloads"] +``` + +**MicroPython:** +```python +from natsbridge import NATSBridge + +# Send (limited to direct transport due to memory constraints) +env, env_json_str = NATSBridge.smartsend( + "/chat", + [("message", "Hello!", "text")], + broker_url="nats://localhost:4222" +) +``` + +--- + +## Architecture Diagram (Cross-Platform) + +```mermaid +flowchart TD + subgraph Client + 
App[Julia/JS/Python/MicroPython Application] + end + + subgraph Server + Julia/JS/Python/MicroPython[Julia/JS/Python/MicroPython Service] + NATS[NATS Server] + FileServer[HTTP File Server] + end + + App -->|NATS| NATS + NATS -->|NATS| Julia/JS/Python/MicroPython + Julia/JS/Python/MicroPython -->|NATS| NATS + Julia/JS/Python/MicroPython -->|HTTP POST| FileServer + + style App fill:#e8f5e9 + style Julia/JS/Python/MicroPython fill:#e8f5e9 + style NATS fill:#fff3e0 + style FileServer fill:#f3e5f5 +``` + +--- + +## System Components + +### 1. msg_envelope_v1 - Message Envelope + +**JSON Schema (Identical Across All Platforms):** +```json +{ + "correlation_id": "uuid-v4-string", + "msg_id": "uuid-v4-string", + "timestamp": "2024-01-15T10:30:00Z", + + "send_to": "topic/subject", + "msg_purpose": "ACK | NACK | updateStatus | shutdown | chat", + "sender_name": "agent-wine-web-frontend", + "sender_id": "uuid4", + "receiver_name": "agent-backend", + "receiver_id": "uuid4", + "reply_to": "topic", + "reply_to_msg_id": "uuid4", + "broker_url": "nats://localhost:4222", + + "metadata": { + "content_type": "application/octet-stream", + "content_length": 123456 + }, + + "payloads": [ + { + "id": "uuid4", + "dataname": "login_image", + "payload_type": "image", + "transport": "direct", + "encoding": "base64", + "size": 15433, + "data": "base64-encoded-string", + "metadata": { + "checksum": "sha256_hash" + } + }, + { + "id": "uuid4", + "dataname": "large_arrow_table", + "payload_type": "arrowtable", + "transport": "link", + "encoding": "arrow-ipc", + "size": 524288, + "data": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/data.arrow", + "metadata": {} + } + ] +} +``` + +### 2. 
msg_payload_v1 - Payload Structure + +**JSON Schema (Identical Across All Platforms):** +```json +{ + "id": "uuid4", + "dataname": "login_image", + "payload_type": "image | dictionary | arrowtable | jsontable | table | text | audio | video | binary", + "transport": "direct | link", + "encoding": "none | json | base64 | arrow-ipc", + "size": 15433, + "data": "base64-encoded-string | http-url | json-string", + "metadata": { + "checksum": "sha256_hash" + } +} +``` + +### 3. Transport Strategy Decision Logic (Cross-Platform) + +``` +┌─────────────────────────────────────────────────────────────┐ +│ smartsend Function (All Platforms) │ +│ Accepts: [(dataname1, data1, type1), ...] │ +│ (Type is per payload, not standalone) │ +└─────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ For each payload: │ +│ 1. Extract type from tuple/array │ +│ 2. Serialize based on type │ +│ 3. Check payload size │ +└─────────────────────────────────────────────────────────────┘ + │ + ┌───────────┴────────────┐ + ▼ ▼ + ┌──────────────┐ ┌──────────────┐ + │ Direct Path │ │ Link Path │ + │ (< 1MB) │ │ (>= 1MB) │ + │ │ │ │ + │ • Serialize │ │ • Serialize │ + │ to buffer │ │ to buffer │ + │ • Base64/JSON│ │ • Upload to │ + │ encode │ │ HTTP Server│ + │ • Publish to │ │ • Publish to │ + │ NATS │ │ NATS with │ + │ (in msg) │ │ URL │ + └──────────────┘ └──────────────┘ +``` + +--- + +## Platform Comparison Matrix + +| Feature | Julia | JavaScript | Python | MicroPython | +|---------|-------|------------|--------|-------------| +| **Multiple Dispatch** | ✅ Native | ❌ (Prototypes) | ❌ (Overload via `@overload`) | ❌ | +| **Async/Await** | ❌ (Tasks) | ✅ Native | ✅ Native | ⚠️ (uasyncio) | +| **Type Safety** | ✅ Strong | ⚠️ (TypeScript) | ✅ (Type hints) | ❌ | +| **Memory Management** | ✅ GC | ✅ GC | ✅ GC | ⚠️ (Manual) | +| **Arrow IPC** | ✅ Native | ✅ (arrow package) | ✅ (pyarrow) | ❌ | +| **JSON Serialization** | ✅ 
(JSON.jl) | ✅ (native) | ✅ (json) | ✅ (json) | +| **arrowtable Support** | ✅ | ✅ | ✅ | ❌ | +| **jsontable Support** | ✅ | ✅ | ✅ | ❌ | +| **Direct Transport** | ✅ | ✅ | ✅ | ✅ | +| **Link Transport** | ✅ | ✅ | ✅ | ⚠️ (Limited) | +| **Handler Functions** | ✅ | ✅ | ✅ | ✅ | +| **Cross-Platform API** | ✅ | ✅ | ✅ | ✅ | + +--- + +## Platform-Specific Architecture Patterns + +### Julia: Multiple Dispatch Pattern + +Julia leverages multiple dispatch for type-specific implementations: + +- **Function overloading** based on argument types +- **Struct-based data models** with explicit types +- **Native Arrow IPC** support via Arrow.jl + +### JavaScript: Prototype + Async Pattern + +JavaScript uses async/await for non-blocking I/O: + +- **Class-based NATS client** for connection management +- **Module-level utility functions** for serialization +- **Native ArrayBuffer** for binary data handling + +### Python: Class-Based Pattern + +Python uses classes for stateful operations: + +- **Class-based NATSBridge** with type hints +- **Dataclasses** for structured data (MsgPayloadV1, MsgEnvelopeV1) +- **Async/await** for I/O operations + +### MicroPython: Synchronous Pattern + +MicroPython has significant constraints: + +- **Synchronous API** (no async/await) +- **Memory-constrained** (256KB - 1MB) +- **Limited payload support** (no tables, max 50KB) + +--- + +## Cross-Platform Compatibility Notes + +### 1. Payload Type Consistency + +All platforms use the same payload type values for tabular data: + +| Platform | Table Types | +|----------|-------------| +| Julia | `"arrowtable"`, `"jsontable"` | +| JavaScript | `"arrowtable"`, `"jsontable"` | +| Python | `"arrowtable"`, `"jsontable"` | +| MicroPython | Not supported | + + +### 2. 
Direct Transport Encoding Field + +The encoding field in direct transport payloads differs between platforms: + +| Platform | Encoding for Direct Transport | +|----------|-------------------------------| +| Julia | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | +| JavaScript | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | +| Python | Always `"base64"` for all direct transport payloads | +| MicroPython | Always `"base64"` for all direct transport payloads | + +**Impact:** The encoding field may not accurately reflect the original serialization format when using Python or MicroPython. + +### 3. MicroPython Limitations + +MicroPython has significant constraints that affect feature support: + +| Feature | Desktop Platforms | MicroPython | +|---------|-------------------|-------------| +| `arrowtable` | ✅ | ❌ (not supported - memory constraints) | +| `jsontable` | ✅ | ❌ (not supported - memory constraints) | +| `table` | ✅ | ❌ (not supported - memory constraints) | +| Async/await | ✅ | ❌ (synchronous only) | +| File upload/download | ✅ | ⚠️ (placeholder implementations) | +| MAX_PAYLOAD_SIZE | 1MB+ | 50KB (hard limit) | +| DEFAULT_SIZE_THRESHOLD | 1MB | 100KB | + +**Impact:** MicroPython should only be used for small payloads with direct transport. File server operations are not fully implemented. 
+ +--- + +## Configuration + +### Environment Variables + +| Variable | Default | Description | +|----------|---------|-------------| +| `NATS_URL` | `nats://localhost:4222` | NATS server URL | +| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | +| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes (1MB) | + +### MicroPython-Specific Configuration + +```python +# micropython.conf +NATS_URL = "nats://broker.local:4222" +FILESERVER_URL = "http://fileserver.local:8080" +SIZE_THRESHOLD = 100000 # Lower threshold for memory-constrained devices +MAX_PAYLOAD_SIZE = 50000 # Hard limit for MicroPython +``` + +--- + +## Performance Considerations + +### Zero-Copy Reading + +| Platform | Strategy | +|----------|----------| +| **Julia** | `Arrow.read()` with memory-mapped files | +| **JavaScript** | `ArrayBuffer` with `DataView` | +| **Python** | `pyarrow` memory mapping | +| **MicroPython** | Not available (streaming only) | + +### Exponential Backoff + +All platforms implement exponential backoff for HTTP downloads: + +``` +delay = base_delay +for attempt in 1:max_retries: + try: + response = fetch(url) + if success: return response + except: + if attempt < max_retries: + sleep(delay) + delay = min(delay * 2, max_delay) +``` + +### Correlation ID Logging + +All platforms use correlation IDs for distributed tracing: + +``` +[timestamp] [Correlation: abc123] Message published to subject +``` + +### Serialization Performance Comparison + +| Format | Use Case | Pros | Cons | +|--------|----------|------|------| +| `arrowtable` | Large tabular data | Fast, zero-copy, schema-preserving | Binary format, requires Arrow library | +| `jsontable` | Small/medium tabular data | Human-readable, universal support | Slower, larger size, no schema | +| `table` (Python) | Large tabular data | Fast, zero-copy, schema-preserving | Python-specific, requires pyarrow | + +--- + +## Summary + +This cross-platform NATS bridge provides: + +1. 
**High-Level API Parity**: Identical `smartsend()` and `smartreceive()` signatures across Julia, JavaScript, and Python/MicroPython +2. **Idiomatic Implementations**: + - Julia: Multiple dispatch and struct-based design + - JavaScript: Async/await and prototype-based utilities + - Python: Class-based design with type hints + - MicroPython: Synchronous API with memory constraints +3. **Message Format Consistency**: Identical `msg_envelope_v1` and `msg_payload_v1` JSON schemas +4. **Handler Abstraction**: File server operations abstracted through configurable handlers +5. **Platform-Specific Optimizations**: + - **Arrow IPC** (`arrowtable`): Efficient binary format for large tabular data + - **JSON** (`jsontable`): Universal human-readable format for smaller tables + - **Python table**: Unified table type for Python-specific implementations + - Streaming support in MicroPython + +The Julia implementation serves as the **ground truth** for API design and behavior, while JavaScript and Python implementations maintain interface parity while leveraging their respective language idioms. + +### Datatype Summary + +| Datatype | Serialization | Use Case | Encoding | Supported Platforms | +|----------|---------------|----------|----------|---------------------| +| `text` | UTF-8 bytes | Text messages, chat content | `utf-8` → `base64` | All | +| `dictionary` | JSON | Structured key-value data, config | `json` → `base64` | All | +| `arrowtable` | Apache Arrow IPC | Large tabular data, schema-preserving | `arrow-ipc` → `base64` | Julia, JavaScript, Python | +| `jsontable` | JSON | Small/medium tabular data, human-readable | `json` → `base64` | Julia, JavaScript, Python | +| `table` | Apache Arrow IPC | Python's unified table type | `arrow-ipc` → `base64` | Python | +| `image` | Binary | Image files (JPEG, PNG, etc.) | `binary` → `base64` | All | +| `audio` | Binary | Audio files (WAV, MP3, etc.) | `binary` → `base64` | All | +| `video` | Binary | Video files (MP4, AVI, etc.) 
| `binary` → `base64` | All | +| `binary` | Binary | Generic binary data, files | `binary` → `base64` | All | From 8a5eef6b13bc1fa11b818b15c641427d026eed15 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 20:53:35 +0700 Subject: [PATCH 23/29] update --- docs/earlier_architecture.md | 475 -------- docs/implementation.md | 1859 ------------------------------ docs/walkthrough.md | 2087 ++++++++++++++-------------------- 3 files changed, 837 insertions(+), 3584 deletions(-) delete mode 100644 docs/earlier_architecture.md delete mode 100644 docs/implementation.md diff --git a/docs/earlier_architecture.md b/docs/earlier_architecture.md deleted file mode 100644 index b1f7929..0000000 --- a/docs/earlier_architecture.md +++ /dev/null @@ -1,475 +0,0 @@ -# Cross-Platform Architecture Documentation: Bi-Directional Data Bridge - -## Overview - -This document describes the architecture for a high-performance, bi-directional data bridge using **NATS (Core & JetStream)**, implementing the Claim-Check pattern for large payloads. The system is implemented across three platforms with **high-level API parity** while maintaining **idiomatic implementations** for each language. - -**Supported Platforms:** -- **Julia** - Ground truth implementation with full feature set -- **JavaScript** - Node.js and browser-compatible implementation -- **Python/MicroPython** - Desktop and embedded-compatible implementation - -### Cross-Platform Design Principles - -1. **High-Level API Parity**: All three platforms expose the same `smartsend()` and `smartreceive()` functions with identical signatures and behavior -2. **Idiomatic Implementations**: Each platform uses its native patterns (multiple dispatch in Julia, async/prototype in JS, class-based in Python) -3. **Message Format Consistency**: The `msg_envelope_v1` and `msg_payload_v1` JSON schemas are identical across all platforms -4. 
**Handler Function Abstraction**: File server operations are abstracted through handler functions for backend flexibility - ---- - -## High-Level API Standard (Cross-Platform) - -### Unified API Signature - -All three platforms expose the same high-level API: - -**Input Format (smartsend):** -``` -[(dataname1, data1, type1), (dataname2, data2, type2), ...] -``` - -**Output Format (smartreceive):** -``` -{ - "correlation_id": "...", - "msg_id": "...", - "timestamp": "...", - "send_to": "...", - "msg_purpose": "...", - "sender_name": "...", - "sender_id": "...", - "receiver_name": "...", - "receiver_id": "...", - "reply_to": "...", - "reply_to_msg_id": "...", - "broker_url": "...", - "metadata": {...}, - "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...] -} -``` - -### Supported Payload Types - -| Type | Julia | JavaScript | Python/MicroPython | -|------|-------|------------|-------------------| -| `text` | `String` | `string` | `str` | -| `dictionary` | `Dict`, `NamedTuple` | `Object`, `Array` | `dict`, `list` | -| `arrowtable` | `DataFrame`, `Arrow.Table` | `Array` (input) → `Buffer` (Arrow IPC) | `pandas.DataFrame`, `bytes` (Arrow IPC) | -| `jsontable` | `Vector{NamedTuple}`, `Vector{Dict}` | `Array` | `list[dict]`, `list` | -| `table` | ❌ | ❌ | `pandas.DataFrame`, `bytes` (Arrow IPC) | -| `image` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | -| `audio` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | -| `video` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | -| `binary` | `Vector{UInt8}`, `IOBuffer` | `Uint8Array`, `Buffer` | `bytes`, `bytearray`, `io.BytesIO` | - -**Note on MicroPython:** MicroPython does not support table types (`arrowtable` or `jsontable`) due to memory constraints. Use `dictionary` or `binary` instead. 
- -### Cross-Platform API Examples - -**Julia:** -```julia -using NATSBridge - -# Send -env, env_json_str = smartsend( - "/chat", - [("message", "Hello!", "text"), ("image", image_bytes, "image")], - broker_url="nats://localhost:4222" -) - -# Receive - returns JSON.Object{String, Any} -env = smartreceive(msg; fileserver_download_handler=_fetch_with_backoff) -# env is a JSON.Object{String, Any} with "payloads" field containing Vector{Tuple{String, Any, String}} -# Access payloads: for (dataname, data, type) in env["payloads] -``` - -**JavaScript:** -```javascript -const NATSBridge = require('natsbridge'); - -// Send -const [env, env_json_str] = await NATSBridge.smartsend( - "/chat", - [ - ["message", "Hello!", "text"], - ["image", imageBuffer, "image"] - ], - { broker_url: "nats://localhost:4222" } -); - -// Receive - returns Promise -const env = await NATSBridge.smartreceive(msg, { - fileserver_download_handler: fetchWithBackoff -}); -// env is an object with "payloads" field containing Array of arrays -// Access payloads: for (const [dataname, data, type] of env.payloads) -``` - -**Python:** -```python -from natsbridge import NATSBridge - -# Send -env, env_json_str = NATSBridge.smartsend( - "/chat", - [("message", "Hello!", "text"), ("image", image_bytes, "image")], - broker_url="nats://localhost:4222" -) - -# Receive - returns Tuple[Dict, str] -env = NATSBridge.smartreceive( - msg, - fileserver_download_handler=fetch_with_backoff -) -# env is a Dict with "payloads" key containing List[Tuple[str, Any, str]] -# Access payloads: for dataname, data, type_ in env["payloads"] -``` - -**MicroPython:** -```python -from natsbridge import NATSBridge - -# Send (limited to direct transport due to memory constraints) -env, env_json_str = NATSBridge.smartsend( - "/chat", - [("message", "Hello!", "text")], - broker_url="nats://localhost:4222" -) -``` - ---- - -## Architecture Diagram (Cross-Platform) - -```mermaid -flowchart TD - subgraph Client - 
App[Julia/JS/Python/MicroPython Application] - end - - subgraph Server - Julia/JS/Python/MicroPython[Julia/JS/Python/MicroPython Service] - NATS[NATS Server] - FileServer[HTTP File Server] - end - - App -->|NATS| NATS - NATS -->|NATS| Julia/JS/Python/MicroPython - Julia/JS/Python/MicroPython -->|NATS| NATS - Julia/JS/Python/MicroPython -->|HTTP POST| FileServer - - style App fill:#e8f5e9 - style Julia/JS/Python/MicroPython fill:#e8f5e9 - style NATS fill:#fff3e0 - style FileServer fill:#f3e5f5 -``` - ---- - -## System Components - -### 1. msg_envelope_v1 - Message Envelope - -**JSON Schema (Identical Across All Platforms):** -```json -{ - "correlation_id": "uuid-v4-string", - "msg_id": "uuid-v4-string", - "timestamp": "2024-01-15T10:30:00Z", - - "send_to": "topic/subject", - "msg_purpose": "ACK | NACK | updateStatus | shutdown | chat", - "sender_name": "agent-wine-web-frontend", - "sender_id": "uuid4", - "receiver_name": "agent-backend", - "receiver_id": "uuid4", - "reply_to": "topic", - "reply_to_msg_id": "uuid4", - "broker_url": "nats://localhost:4222", - - "metadata": { - "content_type": "application/octet-stream", - "content_length": 123456 - }, - - "payloads": [ - { - "id": "uuid4", - "dataname": "login_image", - "payload_type": "image", - "transport": "direct", - "encoding": "base64", - "size": 15433, - "data": "base64-encoded-string", - "metadata": { - "checksum": "sha256_hash" - } - }, - { - "id": "uuid4", - "dataname": "large_arrow_table", - "payload_type": "arrowtable", - "transport": "link", - "encoding": "arrow-ipc", - "size": 524288, - "data": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/data.arrow", - "metadata": {} - } - ] -} -``` - -### 2. 
msg_payload_v1 - Payload Structure - -**JSON Schema (Identical Across All Platforms):** -```json -{ - "id": "uuid4", - "dataname": "login_image", - "payload_type": "image | dictionary | arrowtable | jsontable | table | text | audio | video | binary", - "transport": "direct | link", - "encoding": "none | json | base64 | arrow-ipc", - "size": 15433, - "data": "base64-encoded-string | http-url | json-string", - "metadata": { - "checksum": "sha256_hash" - } -} -``` - -### 3. Transport Strategy Decision Logic (Cross-Platform) - -``` -┌─────────────────────────────────────────────────────────────┐ -│ smartsend Function (All Platforms) │ -│ Accepts: [(dataname1, data1, type1), ...] │ -│ (Type is per payload, not standalone) │ -└─────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ For each payload: │ -│ 1. Extract type from tuple/array │ -│ 2. Serialize based on type │ -│ 3. Check payload size │ -└─────────────────────────────────────────────────────────────┘ - │ - ┌───────────┴────────────┐ - ▼ ▼ - ┌──────────────┐ ┌──────────────┐ - │ Direct Path │ │ Link Path │ - │ (< 1MB) │ │ (>= 1MB) │ - │ │ │ │ - │ • Serialize │ │ • Serialize │ - │ to buffer │ │ to buffer │ - │ • Base64/JSON│ │ • Upload to │ - │ encode │ │ HTTP Server│ - │ • Publish to │ │ • Publish to │ - │ NATS │ │ NATS with │ - │ (in msg) │ │ URL │ - └──────────────┘ └──────────────┘ -``` - ---- - -## Platform Comparison Matrix - -| Feature | Julia | JavaScript | Python | MicroPython | -|---------|-------|------------|--------|-------------| -| **Multiple Dispatch** | ✅ Native | ❌ (Prototypes) | ❌ (Overload via `@overload`) | ❌ | -| **Async/Await** | ❌ (Tasks) | ✅ Native | ✅ Native | ⚠️ (uasyncio) | -| **Type Safety** | ✅ Strong | ⚠️ (TypeScript) | ✅ (Type hints) | ❌ | -| **Memory Management** | ✅ GC | ✅ GC | ✅ GC | ⚠️ (Manual) | -| **Arrow IPC** | ✅ Native | ✅ (arrow package) | ✅ (pyarrow) | ❌ | -| **JSON Serialization** | ✅ 
(JSON.jl) | ✅ (native) | ✅ (json) | ✅ (json) | -| **arrowtable Support** | ✅ | ✅ | ✅ | ❌ | -| **jsontable Support** | ✅ | ✅ | ✅ | ❌ | -| **Direct Transport** | ✅ | ✅ | ✅ | ✅ | -| **Link Transport** | ✅ | ✅ | ✅ | ⚠️ (Limited) | -| **Handler Functions** | ✅ | ✅ | ✅ | ✅ | -| **Cross-Platform API** | ✅ | ✅ | ✅ | ✅ | - ---- - -## Platform-Specific Architecture Patterns - -### Julia: Multiple Dispatch Pattern - -Julia leverages multiple dispatch for type-specific implementations: - -- **Function overloading** based on argument types -- **Struct-based data models** with explicit types -- **Native Arrow IPC** support via Arrow.jl - -### JavaScript: Prototype + Async Pattern - -JavaScript uses async/await for non-blocking I/O: - -- **Class-based NATS client** for connection management -- **Module-level utility functions** for serialization -- **Native ArrayBuffer** for binary data handling - -### Python: Class-Based Pattern - -Python uses classes for stateful operations: - -- **Class-based NATSBridge** with type hints -- **Dataclasses** for structured data (MsgPayloadV1, MsgEnvelopeV1) -- **Async/await** for I/O operations - -### MicroPython: Synchronous Pattern - -MicroPython has significant constraints: - -- **Synchronous API** (no async/await) -- **Memory-constrained** (256KB - 1MB) -- **Limited payload support** (no tables, max 50KB) - ---- - -## Cross-Platform Compatibility Notes - -### 1. Payload Type Consistency - -All platforms use the same payload type values for tabular data: - -| Platform | Table Types | -|----------|-------------| -| Julia | `"arrowtable"`, `"jsontable"` | -| JavaScript | `"arrowtable"`, `"jsontable"` | -| Python | `"arrowtable"`, `"jsontable"` | -| MicroPython | Not supported | - - -### 2. 
Direct Transport Encoding Field - -The encoding field in direct transport payloads differs between platforms: - -| Platform | Encoding for Direct Transport | -|----------|-------------------------------| -| Julia | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | -| JavaScript | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | -| Python | Always `"base64"` for all direct transport payloads | -| MicroPython | Always `"base64"` for all direct transport payloads | - -**Impact:** The encoding field may not accurately reflect the original serialization format when using Python or MicroPython. - -### 3. MicroPython Limitations - -MicroPython has significant constraints that affect feature support: - -| Feature | Desktop Platforms | MicroPython | -|---------|-------------------|-------------| -| `arrowtable` | ✅ | ❌ (not supported - memory constraints) | -| `jsontable` | ✅ | ❌ (not supported - memory constraints) | -| `table` | ✅ | ❌ (not supported - memory constraints) | -| Async/await | ✅ | ❌ (synchronous only) | -| File upload/download | ✅ | ⚠️ (placeholder implementations) | -| MAX_PAYLOAD_SIZE | 1MB+ | 50KB (hard limit) | -| DEFAULT_SIZE_THRESHOLD | 1MB | 100KB | - -**Impact:** MicroPython should only be used for small payloads with direct transport. File server operations are not fully implemented. 
- ---- - -## Configuration - -### Environment Variables - -| Variable | Default | Description | -|----------|---------|-------------| -| `NATS_URL` | `nats://localhost:4222` | NATS server URL | -| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | -| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes (1MB) | - -### MicroPython-Specific Configuration - -```python -# micropython.conf -NATS_URL = "nats://broker.local:4222" -FILESERVER_URL = "http://fileserver.local:8080" -SIZE_THRESHOLD = 100000 # Lower threshold for memory-constrained devices -MAX_PAYLOAD_SIZE = 50000 # Hard limit for MicroPython -``` - ---- - -## Performance Considerations - -### Zero-Copy Reading - -| Platform | Strategy | -|----------|----------| -| **Julia** | `Arrow.read()` with memory-mapped files | -| **JavaScript** | `ArrayBuffer` with `DataView` | -| **Python** | `pyarrow` memory mapping | -| **MicroPython** | Not available (streaming only) | - -### Exponential Backoff - -All platforms implement exponential backoff for HTTP downloads: - -``` -delay = base_delay -for attempt in 1:max_retries: - try: - response = fetch(url) - if success: return response - except: - if attempt < max_retries: - sleep(delay) - delay = min(delay * 2, max_delay) -``` - -### Correlation ID Logging - -All platforms use correlation IDs for distributed tracing: - -``` -[timestamp] [Correlation: abc123] Message published to subject -``` - -### Serialization Performance Comparison - -| Format | Use Case | Pros | Cons | -|--------|----------|------|------| -| `arrowtable` | Large tabular data | Fast, zero-copy, schema-preserving | Binary format, requires Arrow library | -| `jsontable` | Small/medium tabular data | Human-readable, universal support | Slower, larger size, no schema | -| `table` (Python) | Large tabular data | Fast, zero-copy, schema-preserving | Python-specific, requires pyarrow | - ---- - -## Summary - -This cross-platform NATS bridge provides: - -1. 
**High-Level API Parity**: Identical `smartsend()` and `smartreceive()` signatures across Julia, JavaScript, and Python/MicroPython -2. **Idiomatic Implementations**: - - Julia: Multiple dispatch and struct-based design - - JavaScript: Async/await and prototype-based utilities - - Python: Class-based design with type hints - - MicroPython: Synchronous API with memory constraints -3. **Message Format Consistency**: Identical `msg_envelope_v1` and `msg_payload_v1` JSON schemas -4. **Handler Abstraction**: File server operations abstracted through configurable handlers -5. **Platform-Specific Optimizations**: - - **Arrow IPC** (`arrowtable`): Efficient binary format for large tabular data - - **JSON** (`jsontable`): Universal human-readable format for smaller tables - - **Python table**: Unified table type for Python-specific implementations - - Streaming support in MicroPython - -The Julia implementation serves as the **ground truth** for API design and behavior, while JavaScript and Python implementations maintain interface parity while leveraging their respective language idioms. - -### Datatype Summary - -| Datatype | Serialization | Use Case | Encoding | Supported Platforms | -|----------|---------------|----------|----------|---------------------| -| `text` | UTF-8 bytes | Text messages, chat content | `utf-8` → `base64` | All | -| `dictionary` | JSON | Structured key-value data, config | `json` → `base64` | All | -| `arrowtable` | Apache Arrow IPC | Large tabular data, schema-preserving | `arrow-ipc` → `base64` | Julia, JavaScript, Python | -| `jsontable` | JSON | Small/medium tabular data, human-readable | `json` → `base64` | Julia, JavaScript, Python | -| `table` | Apache Arrow IPC | Python's unified table type | `arrow-ipc` → `base64` | Python | -| `image` | Binary | Image files (JPEG, PNG, etc.) | `binary` → `base64` | All | -| `audio` | Binary | Audio files (WAV, MP3, etc.) | `binary` → `base64` | All | -| `video` | Binary | Video files (MP4, AVI, etc.) 
| `binary` → `base64` | All | -| `binary` | Binary | Generic binary data, files | `binary` → `base64` | All | diff --git a/docs/implementation.md b/docs/implementation.md deleted file mode 100644 index 8467f78..0000000 --- a/docs/implementation.md +++ /dev/null @@ -1,1859 +0,0 @@ -# Cross-Platform Implementation Guide: Bi-Directional Data Bridge - -## Overview - -This document describes the detailed implementation of the high-performance, bi-directional data bridge using **NATS (Core & JetStream)**, implementing the Claim-Check pattern for large payloads. The system is implemented across three platforms with **high-level API parity** while maintaining **idiomatic implementations** for each language. - -**Supported Platforms:** -- **Julia** - Ground truth implementation (reference) -- **JavaScript** - Node.js and browser implementation -- **Python/MicroPython** - Desktop and embedded implementation - ---- - -## Cross-Platform Compatibility Notes - -### 1. Python Payload Type Naming - -The Python implementation uses `"table"` as a single payload type for both Arrow and JSON table serialization, while Julia and JavaScript use separate types (`"arrowtable"` and `"jsontable"`): - -| Platform | Table Types | -|----------|-------------| -| Julia | `"arrowtable"`, `"jsontable"` | -| JavaScript | `"arrowtable"`, `"jsontable"` | -| Python | `"table"` (single type) | -| MicroPython | Not supported | - -**Impact:** When exchanging data between Python and Julia/JavaScript, the payload type will differ. Python code should use `"table"` while Julia/JavaScript code should use `"arrowtable"` or `"jsontable"`. - -### 2. 
Direct Transport Encoding Field - -The encoding field in direct transport payloads differs between platforms: - -| Platform | Encoding for Direct Transport | -|----------|-------------------------------| -| Julia | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | -| JavaScript | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | -| Python | Always `"base64"` for all direct transport payloads | -| MicroPython | Always `"base64"` for all direct transport payloads | - -**Impact:** The encoding field may not accurately reflect the original serialization format when using Python or MicroPython. - -### 3. MicroPython Limitations - -MicroPython has significant constraints that affect feature support: - -| Feature | Desktop Platforms | MicroPython | -|---------|-------------------|-------------| -| `arrowtable` | ✅ | ❌ (not supported - memory constraints) | -| `jsontable` | ✅ | ❌ (not supported - memory constraints) | -| `table` | ✅ | ❌ (not supported - memory constraints) | -| Async/await | ✅ | ❌ (synchronous only) | -| File upload/download | ✅ | ⚠️ (placeholder implementations) | -| MAX_PAYLOAD_SIZE | 1MB+ | 50KB (hard limit) | -| DEFAULT_SIZE_THRESHOLD | 1MB | 100KB | - -**Impact:** MicroPython should only be used for small payloads with direct transport. File server operations are not fully implemented. 
- ---- - -## Implementation Files - -| Language | Implementation File | Description | -|----------|---------------------|-------------| -| **Julia** | [`src/NATSBridge.jl`](../src/NATSBridge.jl) | Full Julia implementation with Arrow IPC support | -| **JavaScript** | `src/natsbridge.js` | Node.js/browser implementation | -| **Python** | `src/natsbridge.py` | Desktop Python implementation | -| **MicroPython** | `src/natsbridge_mpy.py` | MicroPython implementation (limited features) | - ---- - -## File Server Handler Architecture - -The system uses **handler functions** to abstract file server operations, allowing support for different file server implementations (e.g., Plik, AWS S3, custom HTTP server). - -### Handler Function Signatures - -#### Julia - -```julia -# Upload handler - uploads data to file server and returns URL -fileserver_upload_handler( - fileserver_url::String, - dataname::String, - data::Vector{UInt8} -)::Dict{String, Any} - -# Download handler - fetches data from file server URL with exponential backoff -fileserver_download_handler( - url::String, - max_retries::Int, - base_delay::Int, - max_delay::Int, - correlation_id::String -)::Vector{UInt8} -``` - -#### JavaScript - -```javascript -// Upload handler - async function -async function fileserver_upload_handler( - fileserver_url, - dataname, - data // Uint8Array -) { - // Returns: { status, uploadid, fileid, url } -} - -// Download handler - async function -async function fileserver_download_handler( - url, - max_retries, - base_delay, - max_delay, - correlation_id -) { - // Returns: Uint8Array -} -``` - -#### Python - -```python -# Upload handler - async function -async def fileserver_upload_handler( - fileserver_url: str, - dataname: str, - data: bytes -) -> Dict[str, Any]: - """ - Upload data to file server. 
- - Returns: - Dict with keys: 'status', 'uploadid', 'fileid', 'url' - """ - pass - -# Download handler - async function -async def fileserver_download_handler( - url: str, - max_retries: int, - base_delay: int, - max_delay: int, - correlation_id: str -) -> bytes: - """ - Download data from URL with exponential backoff. - - Returns: - Downloaded bytes - """ - pass -``` - -#### MicroPython - -```python -# Upload handler - synchronous (no async in MicroPython) -def fileserver_upload_handler( - fileserver_url: str, - dataname: str, - data: bytearray -) -> Dict: - """ - Upload data to file server (synchronous). - - Returns: - Dict with keys: 'status', 'url' - """ - pass - -# Download handler - synchronous -def fileserver_download_handler( - url: str, - max_retries: int, - base_delay: int, - max_delay: int, - correlation_id: str -) -> bytearray: - """ - Download data from URL with exponential backoff (synchronous). - - Returns: - Downloaded bytes - """ - pass -``` - ---- - -## Multi-Payload Support (Standard API) - -The system uses a **standardized list-of-tuples format** for all payload operations across all platforms. - -### API Standard - -``` -# Input format for smartsend (always a list of tuples with type info) -[(dataname1, data1, type1), (dataname2, data2, type2), ...] - -# Output format for smartreceive (returns a dictionary with payloads field containing list of tuples) -{ - "correlation_id": "...", - "msg_id": "...", - "timestamp": "...", - "send_to": "...", - "msg_purpose": "...", - "sender_name": "...", - "sender_id": "...", - "receiver_name": "...", - "receiver_id": "...", - "reply_to": "...", - "reply_to_msg_id": "...", - "broker_url": "...", - "metadata": {...}, - "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...] 
-} -``` - -### Supported Types - -| Type | Julia | JavaScript | Python | MicroPython | -|------|-------|------------|--------|-------------| -| `text` | `String` | `string` | `str` | `str` | -| `dictionary` | `Dict`, `NamedTuple` | `Object`, `Array` | `dict`, `list` | `dict` | -| `arrowtable` | `DataFrame`, `Arrow.Table` | `Array` (input) → `Buffer` (Arrow IPC) | `pandas.DataFrame`, `bytes` (Arrow IPC) | ❌ (not supported) | -| `jsontable` | `Vector{NamedTuple}`, `Vector{Dict}` | `Array` | `list[dict]`, `list` | ⚠️ (limited) | -| `table` | ❌ | ❌ | `pandas.DataFrame`, `bytes` (Arrow IPC) | ❌ | -| `image` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes` | `bytearray` | -| `audio` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes` | `bytearray` | -| `video` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes` | `bytearray` | -| `binary` | `Vector{UInt8}`, `IOBuffer` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | `bytearray` | - -**Note:** Python uses `"table"` as a single type for both Arrow and JSON table serialization. When exchanging data between Python and Julia/JavaScript, ensure the payload type is correctly translated (`"table"` ↔ `"arrowtable"` or `"jsontable"`). 
- ---- - -## Platform-Specific Implementations - -### Julia Implementation - -#### Module Structure - -```julia -module NATSBridge - using NATS, JSON, Arrow, HTTP, UUIDs, Dates, Base64 - - # Constants - const DEFAULT_SIZE_THRESHOLD = 1_000_000 # 1MB - const DEFAULT_BROKER_URL = "nats://localhost:4222" - const DEFAULT_FILESERVER_URL = "http://localhost:8080" - - # Structs - struct msg_payload_v1 - id::String - dataname::String - payload_type::String - transport::String - encoding::String - size::Integer - data::Any - metadata::Dict{String, Any} - end - - struct msg_envelope_v1 - correlation_id::String - msg_id::String - timestamp::String - send_to::String - msg_purpose::String - sender_name::String - sender_id::String - receiver_name::String - receiver_id::String - reply_to::String - reply_to_msg_id::String - broker_url::String - metadata::Dict{String, Any} - payloads::Vector{msg_payload_v1} - end - - # Main functions - function smartsend(...) end - function smartreceive(...) end - - # Utility functions - function _serialize_data(...) end - function _deserialize_data(...) end - function envelope_to_json(...) end - function log_trace(...) end - - # File server handlers - function plik_oneshot_upload(...) end - function _fetch_with_backoff(...) end - function publish_message(...) end - - # Internal helpers - function _get_payload_bytes(...) 
end -end -``` - -#### Multiple Dispatch Pattern - -Julia leverages multiple dispatch for type-specific implementations: - -```julia -# publish_message has two overloads based on argument types -function publish_message(broker_url::String, subject::String, message::String, correlation_id::String) - conn = NATS.connect(broker_url) - publish_message(conn, subject, message, correlation_id) -end - -function publish_message(conn::NATS.Connection, subject::String, message::String, correlation_id::String) - try - NATS.publish(conn, subject, message) - log_trace(correlation_id, "Message published to $subject") - finally - NATS.drain(conn) - end -end - -# Type-specific serialization -function _serialize_data(data::String, payload_type::String) - # Text handling - return Vector{UInt8}(data) -end - -function _serialize_data(data::Dict, payload_type::String) - # Dictionary handling - json_str = JSON.json(data) - return Vector{UInt8}(json_str) -end - -function _serialize_data(data::DataFrame, payload_type::String) - # Table handling - arrowtable - io = IOBuffer() - Arrow.write(io, data) - return take!(io) -end -``` - -#### smartsend Implementation - -```julia -function smartsend( - subject::String, - data::AbstractArray{Tuple{String, T1, String}, 1}; - broker_url::String = DEFAULT_BROKER_URL, - fileserver_url = DEFAULT_FILESERVER_URL, - fileserver_upload_handler::Function = plik_oneshot_upload, - size_threshold::Int = DEFAULT_SIZE_THRESHOLD, - correlation_id::String = string(uuid4()), - msg_purpose::String = "chat", - sender_name::String = "NATSBridge", - receiver_name::String = "", - receiver_id::String = "", - reply_to::String = "", - reply_to_msg_id::String = "", - is_publish::Bool = true, - NATS_connection::Union{NATS.Connection, Nothing} = nothing, - msg_id::String = string(uuid4()), - sender_id::String = string(uuid4()) -)::Tuple{msg_envelope_v1, String} where {T1<:Any} - - log_trace(correlation_id, "Starting smartsend for subject: $subject") - - # Process each payload in 
the list - payloads = msg_payload_v1[] - for (dataname, payload_data, payload_type) in data - # Serialize data based on type - payload_bytes = _serialize_data(payload_data, payload_type) - - payload_size = length(payload_bytes) - log_trace(correlation_id, "Serialized payload '$dataname' size: $payload_size bytes") - - # Decision: Direct vs Link - if payload_size < size_threshold - # Direct path - Base64 encode and send via NATS - payload_b64 = Base64.base64encode(payload_bytes) - log_trace(correlation_id, "Using direct transport for $payload_size bytes") - - payload = msg_payload_v1( - payload_b64, - payload_type; - id = string(uuid4()), - dataname = dataname, - transport = "direct", - encoding = "base64", - size = payload_size, - metadata = Dict{String, Any}("payload_bytes" => payload_size) - ) - push!(payloads, payload) - else - # Link path - Upload to HTTP server, send URL via NATS - log_trace(correlation_id, "Using link transport, uploading to fileserver") - - response = fileserver_upload_handler(fileserver_url, dataname, payload_bytes) - - if response["status"] != 200 - error("Failed to upload data to fileserver: $(response["status"])") - end - - url = response["url"] - log_trace(correlation_id, "Uploaded to URL: $url") - - payload = msg_payload_v1( - url, - payload_type; - id = string(uuid4()), - dataname = dataname, - transport = "link", - encoding = "none", - size = payload_size, - metadata = Dict{String, Any}() - ) - push!(payloads, payload) - end - end - - # Create msg_envelope_v1 with all payloads - # Note: First positional argument is "send_to" (the NATS subject), not "subject" - env = msg_envelope_v1( - subject, # send_to: NATS subject to publish to - payloads; - correlation_id = correlation_id, - msg_id = msg_id, - msg_purpose = msg_purpose, - sender_name = sender_name, - sender_id = sender_id, - receiver_name = receiver_name, - receiver_id = receiver_id, - reply_to = reply_to, - reply_to_msg_id = reply_to_msg_id, - broker_url = broker_url, - metadata 
= Dict{String, Any}(), - ) - - env_json_str = envelope_to_json(env) - - if is_publish == false - # skip publish - elseif is_publish == true && NATS_connection === nothing - publish_message(broker_url, subject, env_json_str, correlation_id) - elseif is_publish == true && NATS_connection !== nothing - publish_message(NATS_connection, subject, env_json_str, correlation_id) - end - - return (env, env_json_str) -end -``` - -#### smartreceive Implementation - -```julia -function smartreceive( - msg::NATS.Msg; - fileserver_download_handler::Function = _fetch_with_backoff, - max_retries::Int = 5, - base_delay::Int = 100, - max_delay::Int = 5000 -)::JSON.Object{String, Any} - # Parse the JSON envelope - env_json_obj = JSON.parse(String(msg.payload)) - log_trace(env_json_obj["correlation_id"], "Processing received message") - - # Process all payloads in the envelope - payloads_list = Tuple{String, Any, String}[] - - num_payloads = length(env_json_obj["payloads"]) - - for i in 1:num_payloads - payload = env_json_obj["payloads"][i] - transport = String(payload["transport"]) - dataname = String(payload["dataname"]) - - if transport == "direct" - log_trace(env_json_obj["correlation_id"], "Direct transport - decoding payload '$dataname'") - - # Extract base64 payload from the payload - payload_b64 = String(payload["data"]) - - # Decode Base64 payload - payload_bytes = Base64.base64decode(payload_b64) - - # Deserialize based on type - data_type = String(payload["payload_type"]) - data = _deserialize_data(payload_bytes, data_type, env_json_obj["correlation_id"]) - - push!(payloads_list, (dataname, data, data_type)) - elseif transport == "link" - # Extract download URL from the payload - url = String(payload["data"]) - log_trace(env_json_obj["correlation_id"], "Link transport - fetching '$dataname' from URL: $url") - - # Fetch with exponential backoff using the download handler - downloaded_data = fileserver_download_handler(url, max_retries, base_delay, max_delay, 
env_json_obj["correlation_id"]) - - # Deserialize based on type - data_type = String(payload["payload_type"]) - data = _deserialize_data(downloaded_data, data_type, env_json_obj["correlation_id"]) - - push!(payloads_list, (dataname, data, data_type)) - else - error("Unknown transport type for payload '$dataname': $(transport)") - end - end - env_json_obj["payloads"] = payloads_list - return env_json_obj -end -``` - -#### _serialize_data Implementation - -```julia -function _serialize_data(data::Any, payload_type::String) - if payload_type == "text" - if isa(data, String) - data_bytes = Vector{UInt8}(data) - return data_bytes - else - error("Text data must be a String") - end - elseif payload_type == "dictionary" - json_str = JSON.json(data) - json_str_bytes = Vector{UInt8}(json_str) - return json_str_bytes - elseif payload_type == "arrowtable" - # Serialize DataFrame to Arrow IPC format - io = IOBuffer() - Arrow.write(io, data) - return take!(io) - elseif payload_type == "jsontable" - # Serialize to JSON - # data is Vector{NamedTuple} or Vector{Dict} - json_str = JSON.json(data) - return Vector{UInt8}(json_str) - elseif payload_type == "image" - if isa(data, Vector{UInt8}) - return data - else - error("Image data must be Vector{UInt8}") - end - elseif payload_type == "audio" - if isa(data, Vector{UInt8}) - return data - else - error("Audio data must be Vector{UInt8}") - end - elseif payload_type == "video" - if isa(data, Vector{UInt8}) - return data - else - error("Video data must be Vector{UInt8}") - end - elseif payload_type == "binary" - if isa(data, IOBuffer) - return take!(data) - elseif isa(data, Vector{UInt8}) - return data - else - error("Binary data must be binary (Vector{UInt8} or IOBuffer)") - end - else - error("Unknown payload_type: $payload_type") - end -end -``` - -#### _deserialize_data Implementation - -```julia -function _deserialize_data( - data::Vector{UInt8}, - payload_type::String, - correlation_id::String -) - if payload_type == "text" - 
return String(data) - elseif payload_type == "dictionary" - json_str = String(data) - return JSON.parse(json_str) - elseif payload_type == "arrowtable" - # Deserialize from Arrow IPC format - io = IOBuffer(data) - arrow_table = Arrow.Table(io) - return arrow_table - elseif payload_type == "jsontable" - # Deserialize from JSON format - # Returns Vector{NamedTuple} or Vector{Dict} - json_str = String(data) - parsed = JSON.parse(json_str) - return parsed - elseif payload_type == "image" - return data - elseif payload_type == "audio" - return data - elseif payload_type == "video" - return data - elseif payload_type == "binary" - return data - else - error("Unknown payload_type: $payload_type") - end -end -``` - -#### _fetch_with_backoff Implementation - -```julia -function _fetch_with_backoff( - url::String, - max_retries::Int, - base_delay::Int, - max_delay::Int, - correlation_id::String -) - delay = base_delay - for attempt in 1:max_retries - try - response = HTTP.request("GET", url) - if response.status == 200 - log_trace(correlation_id, "Successfully fetched data from $url on attempt $attempt") - return response.body - else - error("Failed to fetch: $(response.status)") - end - catch e - log_trace(correlation_id, "Attempt $attempt failed: $(typeof(e))") - - if attempt < max_retries - sleep(delay / 1000.0) - delay = min(delay * 2, max_delay) - end - end - end - - error("Failed to fetch data after $max_retries attempts") -end -``` - -#### plik_oneshot_upload Implementation - -**Overload 1: Upload from binary data** - -```julia -function plik_oneshot_upload(file_server_url::String, dataname::String, data::Vector{UInt8}) - # Get upload id - url_getUploadID = "$file_server_url/upload" - headers = ["Content-Type" => "application/json"] - body = """{ "OneShot" : true }""" - http_response = HTTP.request("POST", url_getUploadID, headers, body; body_is_form=false) - response_json = JSON.parse(http_response.body) - uploadid = response_json["id"] - uploadtoken = 
response_json["uploadToken"] - - # Upload file - file_multipart = HTTP.Multipart(dataname, IOBuffer(data), "application/octet-stream") - url_upload = "$file_server_url/file/$uploadid" - headers = ["X-UploadToken" => uploadtoken] - - form = HTTP.Form(Dict( - "file" => file_multipart - )) - - http_response = nothing - try - http_response = HTTP.post(url_upload, headers, form) - catch e - @error "Request failed" exception=e - rethrow() # no response to parse below, so propagate the failure - end - response_json = JSON.parse(http_response.body) - fileid = response_json["id"] - - url = "$file_server_url/file/$uploadid/$fileid/$dataname" - - return Dict("status" => http_response.status, "uploadid" => uploadid, "fileid" => fileid, "url" => url) -end -``` - -**Overload 2: Upload from file path** - -```julia -function plik_oneshot_upload(file_server_url::String, filepath::String) - # Get upload id - filename = basename(filepath) - url_getUploadID = "$file_server_url/upload" - headers = ["Content-Type" => "application/json"] - body = """{ "OneShot" : true }""" - http_response = HTTP.request("POST", url_getUploadID, headers, body; body_is_form=false) - response_json = JSON.parse(http_response.body) - - uploadid = response_json["id"] - uploadtoken = response_json["uploadToken"] - - # Upload file - url_upload = "$file_server_url/file/$uploadid" - headers = ["X-UploadToken" => uploadtoken] - http_response = open(filepath, "r") do file_stream - form = HTTP.Form(Dict("file" => file_stream)) - - # Adding status_exception=false prevents 4xx/5xx from triggering 'catch' - HTTP.post(url_upload, headers, form; status_exception = false) - end - - if !isnothing(http_response) && http_response.status == 200 - # Success - response already logged by caller - else - error("Failed to upload file: server returned status $(http_response.status)") - end - response_json = JSON.parse(http_response.body) - fileid = response_json["id"] - - # url of the uploaded data e.g. 
"http://192.168.1.20:8080/file/3F62E/4AgGT/test.zip" - url = "$file_server_url/file/$uploadid/$fileid/$filename" - - return Dict("status" => http_response.status, "uploadid" => uploadid, "fileid" => fileid, "url" => url) -end -``` - ---- - -### JavaScript Implementation - -#### Module Structure - -```javascript -// natsbridge.js -const nats = require('nats'); -const crypto = require('crypto'); -const fetch = require('node-fetch'); - -// UUID generation using built-in crypto module -const uuidv4 = () => crypto.randomUUID(); - -const DEFAULT_SIZE_THRESHOLD = 1_000_000; -const DEFAULT_BROKER_URL = 'nats://localhost:4222'; -const DEFAULT_FILESERVER_URL = 'http://localhost:8080'; - -class NATSClient { - constructor(url) { - this.url = url; - this.connection = null; - } - - async connect() { - this.connection = await nats.connect({ servers: this.url }); - return this.connection; - } - - async publish(subject, message) { - if (!this.connection) { - await this.connect(); - } - await this.connection.publish(subject, message); - } - - async close() { - if (this.connection) { - this.connection.close(); - } - } -} - -async function smartsend(subject, data, options = {}) { - // Implementation -} - -async function smartreceive(msg, options = {}) { - // Implementation -} - -module.exports = { - NATSClient, - smartsend, - smartreceive, - plikOneshotUpload, - fetchWithBackoff -}; -``` - -#### smartsend Implementation - -```javascript -const nats = require('nats'); -const crypto = require('crypto'); -const fetch = require('node-fetch'); -const arrow = require('apache-arrow'); - -// UUID generation using built-in crypto module -const uuidv4 = () => crypto.randomUUID(); - -const DEFAULT_SIZE_THRESHOLD = 1_000_000; -const DEFAULT_BROKER_URL = 'nats://localhost:4222'; -const DEFAULT_FILESERVER_URL = 'http://localhost:8080'; - -async function smartsend(subject, data, options = {}) { - const { - broker_url = DEFAULT_BROKER_URL, - fileserver_url = DEFAULT_FILESERVER_URL, - 
fileserver_upload_handler = plikOneshotUpload, - size_threshold = DEFAULT_SIZE_THRESHOLD, - correlation_id = uuidv4(), - msg_purpose = 'chat', - sender_name = 'NATSBridge', - receiver_name = '', - receiver_id = '', - reply_to = '', - reply_to_msg_id = '', - is_publish = true, - nats_connection = null, - msg_id = uuidv4(), - sender_id = uuidv4() - } = options; - - console.log(`[Correlation: ${correlation_id}] Starting smartsend for subject: ${subject}`); - - // Process payloads - const payloads = []; - for (const [dataname, payloadData, payloadType] of data) { - const payloadBytes = await serializeData(payloadData, payloadType); - const payloadSize = payloadBytes.byteLength; - - console.log(`[Correlation: ${correlation_id}] Serialized payload '${dataname}' (type: ${payloadType}) size: ${payloadSize} bytes`); - - if (payloadSize < size_threshold) { - // Direct path - const payloadB64 = bufferToBase64(payloadBytes); - console.log(`[Correlation: ${correlation_id}] Using direct transport for ${payloadSize} bytes`); - - payloads.push({ - id: uuidv4(), - dataname, - payload_type: payloadType, - transport: 'direct', - encoding: 'base64', - size: payloadSize, - data: payloadB64, - metadata: { payload_bytes: payloadSize } - }); - } else { - // Link path - console.log(`[Correlation: ${correlation_id}] Using link transport, uploading to fileserver`); - - const response = await fileserver_upload_handler(fileserver_url, dataname, payloadBytes); - - if (response.status !== 200) { - throw new Error(`Failed to upload data to fileserver: ${response.status}`); - } - - console.log(`[Correlation: ${correlation_id}] Uploaded to URL: ${response.url}`); - - payloads.push({ - id: uuidv4(), - dataname, - payload_type: payloadType, - transport: 'link', - encoding: 'none', - size: payloadSize, - data: response.url, - metadata: {} - }); - } - } - - // Build envelope - const env = { - correlation_id, - msg_id, - timestamp: new Date().toISOString(), - send_to: subject, - msg_purpose, - 
sender_name, - sender_id, - receiver_name, - receiver_id, - reply_to, - reply_to_msg_id, - broker_url, - metadata: {}, - payloads - }; - - const env_json_str = JSON.stringify(env); - - if (is_publish) { - if (nats_connection) { - await publishMessage(nats_connection, subject, env_json_str, correlation_id); - } else { - await publishMessage(broker_url, subject, env_json_str, correlation_id); - } - } - - return [env, env_json_str]; -} -``` - -#### serializeData Implementation - -```javascript -const arrow = require('apache-arrow'); - -async function serializeData(data, payload_type) { - if (payload_type === 'text') { - if (typeof data === 'string') { - return Buffer.from(data, 'utf8'); - } else { - throw new Error('Text data must be a string'); - } - } else if (payload_type === 'dictionary') { - const jsonStr = JSON.stringify(data); - return Buffer.from(jsonStr, 'utf8'); - } else if (payload_type === 'arrowtable') { - // Convert Array to Arrow IPC - if (!Array.isArray(data) || data.length === 0) { - throw new Error('arrowtable data must be a non-empty array of objects'); - } - - // Infer the schema from the row objects and build an Arrow table - const table = arrow.tableFromJSON(data); - - // Encode the table as Arrow IPC (stream format) - return Buffer.from(arrow.tableToIPC(table, 'stream')); - } else if (payload_type === 'jsontable') { - // Serialize directly to JSON - const jsonStr = JSON.stringify(data); - return Buffer.from(jsonStr, 'utf8'); - } else if (payload_type === 'image') { - if (data instanceof Uint8Array || Buffer.isBuffer(data)) { - return Buffer.from(data); - } else { - throw new Error('Image data must be Uint8Array or Buffer'); - } - } else if (payload_type === 'audio') { - if (data instanceof 
Uint8Array || Buffer.isBuffer(data)) { - return Buffer.from(data); - } else { - throw new Error('Audio data must be Uint8Array or Buffer'); - } - } else if (payload_type === 'video') { - if (data instanceof Uint8Array || Buffer.isBuffer(data)) { - return Buffer.from(data); - } else { - throw new Error('Video data must be Uint8Array or Buffer'); - } - } else if (payload_type === 'binary') { - if (data instanceof Uint8Array || Buffer.isBuffer(data)) { - return Buffer.from(data); - } else { - throw new Error('Binary data must be Uint8Array or Buffer'); - } - } else { - throw new Error(`Unknown payload_type: ${payload_type}`); - } -} - -function bufferToBase64(buffer) { - return buffer.toString('base64'); -} -``` - -#### deserializeData Implementation - -```javascript -const arrow = require('apache-arrow'); - -async function deserializeData(data, payload_type, correlation_id) { - if (payload_type === 'text') { - return Buffer.from(data).toString('utf8'); - } else if (payload_type === 'dictionary') { - const jsonStr = Buffer.from(data).toString('utf8'); - return JSON.parse(jsonStr); - } else if (payload_type === 'arrowtable') { - // Deserialize from Arrow IPC - const buffer = Buffer.from(data); - const table = arrow.tableFromIPC(buffer); - return table; - } else if (payload_type === 'jsontable') { - // Deserialize from JSON - returns Array - const jsonStr = Buffer.from(data).toString('utf8'); - return JSON.parse(jsonStr); - } else if (payload_type === 'image') { - return Buffer.from(data); - } else if (payload_type === 'audio') { - return Buffer.from(data); - } else if (payload_type === 'video') { - return Buffer.from(data); - } else if (payload_type === 'binary') { - return Buffer.from(data); - } else { - throw new Error(`Unknown payload_type: ${payload_type}`); - } -} -``` - -#### fetchWithBackoff Implementation - -```javascript -async function fetchWithBackoff(url, max_retries, base_delay, max_delay, correlation_id) { - let delay = base_delay; - - for (let 
attempt = 1; attempt <= max_retries; attempt++) { - try { - const response = await fetch(url); - - if (response.status === 200) { - console.log(`[Correlation: ${correlation_id}] Successfully fetched data from ${url} on attempt ${attempt}`); - return await response.arrayBuffer(); - } else { - throw new Error(`Failed to fetch: ${response.status}`); - } - } catch (e) { - console.log(`[Correlation: ${correlation_id}] Attempt ${attempt} failed: ${e.constructor.name}`); - - if (attempt < max_retries) { - await new Promise(resolve => setTimeout(resolve, delay)); - delay = Math.min(delay * 2, max_delay); - } - } - } - - throw new Error(`Failed to fetch data after ${max_retries} attempts`); -} -``` - -#### plikOneshotUpload Implementation - -```javascript -async function plikOneshotUpload(file_server_url, dataname, data) { - // Get upload id - const url_getUploadID = `${file_server_url}/upload`; - const headers = { 'Content-Type': 'application/json' }; - const body = JSON.stringify({ OneShot: true }); - - const http_response = await fetch(url_getUploadID, { - method: 'POST', - headers, - body - }); - - const response_json = await http_response.json(); - const uploadid = response_json.id; - const uploadtoken = response_json.uploadToken; - - // Upload file - const url_upload = `${file_server_url}/file/${uploadid}`; - const form = new FormData(); - const blob = new Blob([data]); - form.append('file', blob, dataname); - - const upload_headers = { - 'X-UploadToken': uploadtoken - }; - - const upload_response = await fetch(url_upload, { - method: 'POST', - headers: upload_headers, - body: form - }); - - const upload_json = await upload_response.json(); - const fileid = upload_json.id; - - const url = `${file_server_url}/file/${uploadid}/${fileid}/${dataname}`; - - return { - status: upload_response.status, - uploadid, - fileid, - url - }; -} -``` - ---- - -### Python Implementation - -#### Module Structure - -```python -# natsbridge.py -import asyncio -import base64 -import json 
-import uuid -import time -from typing import Any, Dict, List, Tuple, Union, Callable -from dataclasses import dataclass, field -from datetime import datetime - -try: - import pyarrow as arrow - import pyarrow.parquet as pq - ARROW_AVAILABLE = True -except ImportError: - ARROW_AVAILABLE = False - -try: - import aiohttp - import nats - from nats.aio.client import Client as NATSClient - NATS_AVAILABLE = True -except ImportError: - NATS_AVAILABLE = False - - -DEFAULT_SIZE_THRESHOLD = 1_000_000 -DEFAULT_BROKER_URL = "nats://localhost:4222" -DEFAULT_FILESERVER_URL = "http://localhost:8080" - - -@dataclass -class MsgPayloadV1: - """Message payload structure.""" - id: str - dataname: str - payload_type: str - transport: str - encoding: str - size: int - data: Union[str, bytes] - metadata: Dict[str, Any] = field(default_factory=dict) - - -@dataclass -class MsgEnvelopeV1: - """Message envelope structure.""" - correlation_id: str - msg_id: str - timestamp: str - send_to: str - msg_purpose: str - sender_name: str - sender_id: str - receiver_name: str - receiver_id: str - reply_to: str - reply_to_msg_id: str - broker_url: str - metadata: Dict[str, Any] = field(default_factory=dict) - payloads: List[MsgPayloadV1] = field(default_factory=list) - - -class NATSBridge: - """Cross-platform NATS bridge implementation.""" - - def __init__(self, broker_url: str = None, fileserver_url: str = None): - self.broker_url = broker_url or DEFAULT_BROKER_URL - self.fileserver_url = fileserver_url or DEFAULT_FILESERVER_URL - self._nats_client: NATSClient = None - - async def smartsend(self, subject: str, data: List[Tuple[str, Any, str]], **kwargs) -> Tuple[Dict, str]: - """Send data via NATS.""" - pass - - async def smartreceive(self, msg: Any, **kwargs) -> Dict: - """Receive and process NATS message.""" - pass -``` - -#### smartsend Implementation - -```python -import asyncio -import base64 -import json -import uuid -from typing import Any, Dict, List, Tuple, Union, Callable -from datetime 
import datetime - -DEFAULT_SIZE_THRESHOLD = 1_000_000 -DEFAULT_BROKER_URL = "nats://localhost:4222" -DEFAULT_FILESERVER_URL = "http://localhost:8080" - - -async def smartsend( - subject: str, - data: List[Tuple[str, Any, str]], - broker_url: str = DEFAULT_BROKER_URL, - fileserver_url: str = DEFAULT_FILESERVER_URL, - fileserver_upload_handler: Callable = plik_oneshot_upload, - size_threshold: int = DEFAULT_SIZE_THRESHOLD, - correlation_id: str = None, - msg_purpose: str = "chat", - sender_name: str = "NATSBridge", - receiver_name: str = "", - receiver_id: str = "", - reply_to: str = "", - reply_to_msg_id: str = "", - is_publish: bool = True, - nats_connection: Any = None, - msg_id: str = None, - sender_id: str = None -) -> Tuple[Dict, str]: - """ - Send data via NATS with automatic transport selection. - - Args: - subject: NATS subject to publish to - data: List of (dataname, data, type) tuples - **kwargs: Additional options - - Returns: - Tuple of (env, env_json_str) - """ - if correlation_id is None: - correlation_id = str(uuid.uuid4()) - if msg_id is None: - msg_id = str(uuid.uuid4()) - if sender_id is None: - sender_id = str(uuid.uuid4()) - - print(f"[Correlation: {correlation_id}] Starting smartsend for subject: {subject}") - - # Process payloads - payloads = [] - for dataname, payload_data, payload_type in data: - payload_bytes = _serialize_data(payload_data, payload_type) - payload_size = len(payload_bytes) - - print(f"[Correlation: {correlation_id}] Serialized payload '{dataname}' (type: {payload_type}) size: {payload_size} bytes") - - if payload_size < size_threshold: - # Direct path - payload_b64 = base64.b64encode(payload_bytes).decode('utf-8') - print(f"[Correlation: {correlation_id}] Using direct transport for {payload_size} bytes") - - payloads.append({ - 'id': str(uuid.uuid4()), - 'dataname': dataname, - 'payload_type': payload_type, - 'transport': 'direct', - 'encoding': 'base64', - 'size': payload_size, - 'data': payload_b64, - 'metadata': 
{'payload_bytes': payload_size} - }) - else: - # Link path - print(f"[Correlation: {correlation_id}] Using link transport, uploading to fileserver") - - response = await fileserver_upload_handler(fileserver_url, dataname, payload_bytes) - - if response['status'] != 200: - raise Exception(f"Failed to upload data to fileserver: {response['status']}") - - print(f"[Correlation: {correlation_id}] Uploaded to URL: {response['url']}") - - payloads.append({ - 'id': str(uuid.uuid4()), - 'dataname': dataname, - 'payload_type': payload_type, - 'transport': 'link', - 'encoding': 'none', - 'size': payload_size, - 'data': response['url'], - 'metadata': {} - }) - - # Build envelope - env = { - 'correlation_id': correlation_id, - 'msg_id': msg_id, - 'timestamp': datetime.utcnow().isoformat() + 'Z', - 'send_to': subject, - 'msg_purpose': msg_purpose, - 'sender_name': sender_name, - 'sender_id': sender_id, - 'receiver_name': receiver_name, - 'receiver_id': receiver_id, - 'reply_to': reply_to, - 'reply_to_msg_id': reply_to_msg_id, - 'broker_url': broker_url, - 'metadata': {}, - 'payloads': payloads - } - - env_json_str = json.dumps(env) - - if is_publish: - if nats_connection: - await publish_message(nats_connection, subject, env_json_str, correlation_id) - else: - await publish_message(broker_url, subject, env_json_str, correlation_id) - - return env, env_json_str -``` - -#### serializeData Implementation - -```python -import base64 -import json -from typing import Any - -try: - import pyarrow as arrow - import pyarrow.feather as feather - import pyarrow.ipc as ipc - ARROW_AVAILABLE = True -except ImportError: - ARROW_AVAILABLE = False - - -def _serialize_data(data: Any, payload_type: str) -> bytes: - """ - Serialize data to bytes based on type. - - Note: Python uses "table" as a single type for both Arrow and JSON table - serialization. Julia/JavaScript use separate "arrowtable" and "jsontable" types. 
- """ - if payload_type == 'text': - if isinstance(data, str): - return data.encode('utf-8') - else: - raise ValueError('Text data must be a string') - elif payload_type == 'dictionary': - json_str = json.dumps(data) - return json_str.encode('utf-8') - elif payload_type == 'table': - # Python uses "table" for both arrowtable and jsontable - if not ARROW_AVAILABLE: - raise RuntimeError('pyarrow not available for table serialization') - - import io - buf = io.BytesIO() - import pandas as pd - if isinstance(data, pd.DataFrame): - # Serialize DataFrame to Arrow - table = arrow.Table.from_pandas(data) - sink = ipc.new_file(buf, table.schema) - ipc.write_table(table, sink) - sink.close() - return buf.getvalue() - elif isinstance(data, arrow.Table): - sink = ipc.new_file(buf, data.schema) - ipc.write_table(data, sink) - sink.close() - return buf.getvalue() - else: - raise ValueError('Table data must be a pandas DataFrame or pyarrow Table') - elif payload_type in ('image', 'audio', 'video', 'binary'): - if isinstance(data, (bytes, bytearray)): - return bytes(data) - else: - raise ValueError(f'{payload_type} data must be bytes') - else: - raise ValueError(f'Unknown payload_type: {payload_type}') -``` - -#### deserializeData Implementation - -```python -import base64 -import json -from typing import Any - -try: - import pyarrow as arrow - import pyarrow.feather as feather - import pyarrow.ipc as ipc - ARROW_AVAILABLE = True -except ImportError: - ARROW_AVAILABLE = False - - -def _deserialize_data(data: bytes, payload_type: str, correlation_id: str) -> Any: - """ - Deserialize bytes to data based on type. - - Note: Python uses "table" as a single type for both Arrow and JSON table - deserialization. Julia/JavaScript use separate "arrowtable" and "jsontable" types. 
- """ - if payload_type == 'text': - return data.decode('utf-8') - elif payload_type == 'dictionary': - json_str = data.decode('utf-8') - return json.loads(json_str) - elif payload_type == 'table': - # Python uses "table" for both arrowtable and jsontable - if not ARROW_AVAILABLE: - raise RuntimeError('pyarrow not available for table deserialization') - - import io - buf = io.BytesIO(data) - reader = ipc.open_file(buf) - return reader.read_all().to_pandas() - elif payload_type in ('image', 'audio', 'video', 'binary'): - return data - else: - raise ValueError(f'Unknown payload_type: {payload_type}') -``` - -#### fetchWithBackoff Implementation - -```python -import asyncio -import aiohttp -from typing import Callable - - -async def fetch_with_backoff( - url: str, - max_retries: int, - base_delay: int, - max_delay: int, - correlation_id: str -) -> bytes: - """Fetch URL with exponential backoff.""" - delay = base_delay - - for attempt in range(1, max_retries + 1): - try: - async with aiohttp.ClientSession() as session: - async with session.get(url) as response: - if response.status == 200: - print(f"[Correlation: {correlation_id}] Successfully fetched data from {url} on attempt {attempt}") - return await response.read() - else: - raise Exception(f"Failed to fetch: {response.status}") - except Exception as e: - print(f"[Correlation: {correlation_id}] Attempt {attempt} failed: {type(e).__name__}") - - if attempt < max_retries: - await asyncio.sleep(delay / 1000.0) - delay = min(delay * 2, max_delay) - - raise Exception(f"Failed to fetch data after {max_retries} attempts") -``` - -#### plikOneshotUpload Implementation - -```python -import aiohttp -import json -from typing import Dict, Any - - -async def plik_oneshot_upload( - file_server_url: str, - dataname: str, - data: bytes -) -> Dict[str, Any]: - """Upload data to plik server in one-shot mode.""" - - # Get upload id - async with aiohttp.ClientSession() as session: - url_getUploadID = f"{file_server_url}/upload" - 
headers = {'Content-Type': 'application/json'} - body = json.dumps({"OneShot": True}) - - async with session.post(url_getUploadID, headers=headers, data=body) as response: - response_json = await response.json() - uploadid = response_json['id'] - uploadtoken = response_json['uploadToken'] - - # Upload file - url_upload = f"{file_server_url}/file/{uploadid}" - headers = {'X-UploadToken': uploadtoken} - - form = aiohttp.FormData() - form.add_field('file', data, filename=dataname, content_type='application/octet-stream') - - async with session.post(url_upload, headers=headers, data=form) as upload_response: - upload_json = await upload_response.json() - fileid = upload_json['id'] - - url = f"{file_server_url}/file/{uploadid}/{fileid}/{dataname}" - - return { - 'status': upload_response.status, - 'uploadid': uploadid, - 'fileid': fileid, - 'url': url - } -``` - ---- - -### MicroPython Implementation - -#### Limitations - -MicroPython has significant constraints compared to desktop implementations: - -| Feature | Desktop | MicroPython | -|---------|---------|-------------| -| Memory | Unlimited | ~256KB - 1MB | -| Arrow IPC | ✅ | ❌ (not supported) | -| Async/Await | ✅ | ❌ (synchronous only) | -| Large payloads (>1MB) | ✅ | ❌ (enforced limit) | -| arrowtable | ✅ | ❌ (not supported) | -| jsontable | ✅ | ❌ (not supported) | -| Multiple payloads | ✅ | ⚠️ (limited) | - -**Note:** MicroPython does NOT support table types (`arrowtable` or `jsontable`) due to memory constraints. 
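
The constraints in the table above can be checked up front, before any serialization work is done. The sketch below is illustrative only: `check_payload`, `SUPPORTED_TYPES`, and the 50000-byte `MAX_PAYLOAD_SIZE` value are assumptions made for this example (the size mirrors the hard limit used by the MicroPython module in this document), and none of them is a confirmed part of the NATSBridge API.

```python
# Hypothetical guard, not part of the original module: rejects payloads that
# the limitations table rules out before any serialization is attempted.
SUPPORTED_TYPES = ("text", "dictionary", "image", "audio", "video", "binary")
TABLE_TYPES = ("arrowtable", "jsontable", "table")
MAX_PAYLOAD_SIZE = 50000  # assumed hard limit in bytes


def check_payload(dataname, payload_bytes, payload_type):
    """Raise early if a payload cannot be handled on a constrained device."""
    if payload_type in TABLE_TYPES:
        # Table types are listed as unsupported on MicroPython
        raise ValueError("Table type '%s' is not supported on MicroPython" % payload_type)
    if payload_type not in SUPPORTED_TYPES:
        raise ValueError("Unknown payload_type: %s" % payload_type)
    if len(payload_bytes) > MAX_PAYLOAD_SIZE:
        # Same exception type the module raises for oversized payloads
        raise MemoryError("Payload %s exceeds max size %d" % (dataname, MAX_PAYLOAD_SIZE))
    return True
```

Calling a guard like this at the top of a send routine would fail fast with a clear message; `%` formatting is used here only because it is the most conservative choice for constrained interpreters.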
- -#### Module Structure - -```python -# natsbridge_mpy.py (MicroPython) -import network -import time -import json -import base64 -import uos -import struct -import random - -# Constants -DEFAULT_SIZE_THRESHOLD = 100000 # 100KB for MicroPython -DEFAULT_BROKER_URL = "nats://localhost:4222" -DEFAULT_FILESERVER_URL = "http://localhost:8080" -MAX_PAYLOAD_SIZE = 50000 # Hard limit (lower than threshold for safety) - -# Note: MicroPython does NOT support table types (arrowtable/jsontable) -# Only supports: text, dictionary, image, audio, video, binary - - -class NATSBridge: - """MicroPython NATS bridge implementation.""" - - def __init__(self, broker_url=None, fileserver_url=None): - self.broker_url = broker_url or DEFAULT_BROKER_URL - self.fileserver_url = fileserver_url or DEFAULT_FILESERVER_URL - self._nats_conn = None - - def smartsend(self, subject, data, **kwargs): - """Send data (synchronous).""" - correlation_id = self._generate_uuid() - msg_id = self._generate_uuid() - sender_id = self._generate_uuid() - - print(f"[Correlation: {correlation_id}] Starting smartsend") - - payloads = [] - for dataname, payload_data, payload_type in data: - payload_bytes = self._serialize_data(payload_data, payload_type) - payload_size = len(payload_bytes) - - if payload_size > MAX_PAYLOAD_SIZE: - raise MemoryError(f"Payload {dataname} exceeds max size {MAX_PAYLOAD_SIZE}") - - if payload_size < DEFAULT_SIZE_THRESHOLD: - # Direct path - payload_b64 = base64.b64encode(payload_bytes).decode('ascii') - payloads.append({ - 'id': self._generate_uuid(), - 'dataname': dataname, - 'payload_type': payload_type, - 'transport': 'direct', - 'encoding': 'base64', - 'size': payload_size, - 'data': payload_b64 - }) - else: - # Link path (limited support) - response = self._sync_fileserver_upload(self.fileserver_url, dataname, payload_bytes) - payloads.append({ - 'id': self._generate_uuid(), - 'dataname': dataname, - 'payload_type': payload_type, - 'transport': 'link', - 'encoding': 'none', - 
'size': payload_size, - 'data': response['url'] - }) - - env = { - 'correlation_id': correlation_id, - 'msg_id': msg_id, - 'timestamp': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.localtime()), - 'send_to': subject, - 'msg_purpose': kwargs.get('msg_purpose', 'chat'), - 'sender_name': kwargs.get('sender_name', 'NATSBridge'), - 'sender_id': sender_id, - 'receiver_name': kwargs.get('receiver_name', ''), - 'receiver_id': kwargs.get('receiver_id', ''), - 'reply_to': kwargs.get('reply_to', ''), - 'reply_to_msg_id': kwargs.get('reply_to_msg_id', ''), - 'broker_url': self.broker_url, - 'metadata': {}, - 'payloads': payloads - } - - env_json_str = json.dumps(env) - - # Publish - self._publish(subject, env_json_str, correlation_id) - - return env, env_json_str - - def smartreceive(self, msg, **kwargs): - """Receive and process message (synchronous).""" - env_json_obj = json.loads(msg.payload) - correlation_id = env_json_obj['correlation_id'] - - payloads_list = [] - for payload in env_json_obj['payloads']: - transport = payload['transport'] - dataname = payload['dataname'] - - if transport == 'direct': - payload_b64 = payload['data'] - payload_bytes = base64.b64decode(payload_b64) - data_type = payload['payload_type'] - data = self._deserialize_data(payload_bytes, data_type) - payloads_list.append((dataname, data, data_type)) - elif transport == 'link': - url = payload['data'] - downloaded_data = self._sync_fileserver_download( - url, - kwargs.get('max_retries', 3), - kwargs.get('base_delay', 100), - kwargs.get('max_delay', 1000), - correlation_id - ) - data_type = payload['payload_type'] - data = self._deserialize_data(downloaded_data, data_type) - payloads_list.append((dataname, data, data_type)) - - env_json_obj['payloads'] = payloads_list - return env_json_obj - - def _serialize_data(self, data, payload_type): - """ - Serialize data (MicroPython version). - - Note: MicroPython does NOT support table types (arrowtable/jsontable). 
- Only supports: text, dictionary, image, audio, video, binary - """ - if payload_type == 'text': - if isinstance(data, str): - return data.encode('utf-8') - else: - raise ValueError('Text data must be a string') - elif payload_type == 'dictionary': - json_str = json.dumps(data) - return json_str.encode('utf-8') - elif payload_type in ('image', 'audio', 'video', 'binary'): - if isinstance(data, (bytes, bytearray, memoryview)): - return bytes(data) - else: - raise ValueError(f'{payload_type} data must be bytes') - else: - raise ValueError(f'Unknown payload_type: {payload_type}') - - def _deserialize_data(self, data, payload_type): - """ - Deserialize data (MicroPython version). - - Note: MicroPython does NOT support table types (arrowtable/jsontable). - Only supports: text, dictionary, image, audio, video, binary - """ - if payload_type == 'text': - return data.decode('utf-8') - elif payload_type == 'dictionary': - json_str = data.decode('utf-8') - return json.loads(json_str) - elif payload_type in ('image', 'audio', 'video', 'binary'): - return data - else: - raise ValueError(f'Unknown payload_type: {payload_type}') - - def _generate_uuid(self): - """Generate simple UUID (MicroPython compatible).""" - return 'mp-%04x%04x-%04x-%04x-%04x-%04x%04x%04x' % ( - time.time_ns() // (10**6) % 0xFFFFFFFF, - time.time_ns() % 0xFFFFFFFF, - time.time_ns() >> 32 & 0xFFFF, - time.time_ns() >> 48 & 0xFFFF, - time.time_ns() >> 64 & 0xFFFF, - time.time_ns() >> 80 & 0xFFFF, - time.time_ns() >> 96 & 0xFFFF, - time.time_ns() >> 112 & 0xFFFF - ) - - def _sync_fileserver_upload(self, url, dataname, data): - """Synchronous file upload (limited).""" - # Simplified implementation for MicroPython - # In practice, would use network.HTTP or similar - raise NotImplementedError("File upload not implemented in MicroPython") - - def _sync_fileserver_download(self, url, max_retries, base_delay, max_delay, correlation_id): - """Synchronous file download with backoff.""" - # Simplified implementation 
for MicroPython - raise NotImplementedError("File download not implemented in MicroPython") - - def _publish(self, subject, message, correlation_id): - """Publish message to NATS.""" - # Simplified implementation for MicroPython - raise NotImplementedError("NATS publishing not implemented in MicroPython") -``` - ---- - -## Configuration - -### Environment Variables - -| Variable | Default | Description | -|----------|---------|-------------| -| `NATS_URL` | `nats://localhost:4222` | NATS server URL | -| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | -| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes (1MB) | - -### MicroPython Configuration - -```python -# micropython.conf -NATS_URL = "nats://broker.local:4222" -FILESERVER_URL = "http://fileserver.local:8080" -SIZE_THRESHOLD = 100000 # Lower threshold for memory-constrained devices -MAX_PAYLOAD_SIZE = 50000 # Hard limit for MicroPython -``` - ---- - -## Performance Considerations - -### Zero-Copy Reading - -| Platform | Strategy | -|----------|----------| -| **Julia** | `Arrow.Table()` with memory-mapped files | -| **JavaScript** | `ArrayBuffer` with `DataView` | -| **Python** | `pyarrow` memory mapping | -| **MicroPython** | Not available (streaming only) | - -### Exponential Backoff - -The Julia, JavaScript, and Python implementations use exponential backoff for HTTP downloads (the MicroPython download helper is still a stub): - -```python -# Python -async def fetch_with_backoff(url, max_retries, base_delay, max_delay, correlation_id): - delay = base_delay - for attempt in range(1, max_retries + 1): - try: - async with aiohttp.ClientSession() as session: - async with session.get(url) as response: - if response.status == 200: - return await response.read() - else: - # Treat any non-200 status as a failed attempt so the backoff applies - raise Exception(f"Failed to fetch: {response.status}") - except Exception as e: - if attempt < max_retries: - await asyncio.sleep(delay / 1000.0) - delay = min(delay * 2, max_delay) - raise Exception("Failed to fetch after max retries") -``` - -### Correlation ID Logging - -All platforms use correlation IDs for distributed tracing: - -``` -[timestamp] [Correlation: 
abc123] Message published to subject - ``` - -### Serialization Performance - -| Format | Use Case | Pros | Cons | -|--------|----------|------|------| -| `arrowtable` | Large tabular data | Fast, zero-copy, schema-preserving | Binary format, requires Arrow library, not supported in MicroPython | -| `jsontable` | Small/medium tabular data | Human-readable, needs only a JSON library | Slower, larger size, no schema enforcement, not supported in MicroPython | - ---- - -## Testing - -### Test File Organization - -| Platform | Sender Tests | Receiver Tests | -|----------|--------------|----------------| -| **Julia** | `test/test_julia_*_sender.jl` | `test/test_julia_*_receiver.jl` | -| **JavaScript** | `test/test_js_*_sender.js` | `test/test_js_*_receiver.js` | -| **Python** | `test/test_py_*_sender.py` | `test/test_py_*_receiver.py` | - -### Run Tests - -```bash -# Julia -julia test/test_julia_text_sender.jl -julia test/test_julia_text_receiver.jl - -# JavaScript (Node.js) -node test/test_js_text_sender.js -node test/test_js_text_receiver.js - -# Python -python3 test/test_py_text_sender.py -python3 test/test_py_text_receiver.py -``` - ---- - -## Troubleshooting - -### Common Issues - -1. **NATS Connection Failed** - - Ensure NATS server is running - - Check `broker_url` configuration - -2. **HTTP Upload Failed** - - Ensure file server is running - - Check `fileserver_url` configuration - - Verify upload permissions - -3. **Arrow IPC Deserialization Error** - - Ensure data is properly serialized to Arrow format - - Check Arrow version compatibility - - MicroPython doesn't support Arrow IPC - -4. **Memory Constraints (MicroPython)** - - Reduce `size_threshold` - - Use direct transport only (payloads above `MAX_PAYLOAD_SIZE`, 50000 bytes, are rejected) - - Avoid large payloads - - Do not send table types (neither `arrowtable` nor `jsontable` is supported) - ---- - -## Summary - -This cross-platform NATS bridge provides: - -1. **High-Level API Parity**: Identical `smartsend()` and `smartreceive()` signatures across all platforms -2. 
**Idiomatic Implementations**: - - **Julia**: Multiple dispatch, struct-based design, native Arrow IPC - - **JavaScript**: Async/await, prototype-based utilities, class-based NATS client - - **Python**: Class-based design with dataclasses, type hints, async/await - - **MicroPython**: Synchronous API, memory-constrained optimizations -3. **Message Format Consistency**: Identical JSON schemas across all platforms -4. **Handler Abstraction**: File server operations abstracted through configurable handlers -5. **Platform-Specific Optimizations**: - - **Arrow IPC** (`arrowtable`): Efficient binary format for large tabular data (not supported in MicroPython) - - **JSON** (`jsontable`): Universal human-readable format for smaller tables (works in Julia, JavaScript, Python; NOT supported in MicroPython) - -The Julia implementation in [`src/NATSBridge.jl`](src/NATSBridge.jl:1) serves as the ground truth for API design and behavior. - -### Datatype Summary - -| Datatype | Serialization | Use Case | Encoding | Supported Platforms | -|----------|---------------|----------|----------|---------------------| -| `arrowtable` | Apache Arrow IPC | Large tabular data, schema-preserving | `arrow-ipc` → `base64` | Julia, JavaScript, Python | -| `jsontable` | JSON | Small/medium tabular data, human-readable | `json` → `base64` | Julia, JavaScript, Python | -| `table` | Apache Arrow IPC (Python only) | Python's unified table type | `arrow-ipc` → `base64` | Python | diff --git a/docs/walkthrough.md b/docs/walkthrough.md index 39b7d1b..07a7595 100644 --- a/docs/walkthrough.md +++ b/docs/walkthrough.md @@ -1,1378 +1,965 @@ -# Cross-Platform NATSBridge Walkthrough +# Walkthrough: NATSBridge -A comprehensive guide to building real-world applications with NATSBridge across **Julia**, **JavaScript**, and **Python/MicroPython**. - -## Table of Contents - -1. [Introduction](#introduction) -2. [Architecture Overview](#architecture-overview) -3. 
[Building a Chat Application](#building-a-chat-application) -4. [Building a File Transfer System](#building-a-file-transfer-system) -5. [Building a Streaming Data Pipeline](#building-a-streaming-data-pipeline) -6. [Performance Optimization](#performance-optimization) -7. [Best Practices](#best-practices) +**Version**: 1.0.0 +**Date**: 2026-03-13 +**Status**: Active +**Ground Truth**: [`src/NATSBridge.jl`](../src/NATSBridge.jl) --- -## Introduction +## Executive Summary -This walkthrough will guide you through building several real-world applications using NATSBridge. We'll cover: +This document tells the **story of flow** for NATSBridge: how data moves through the cross-platform, bi-directional data bridge that connects **Julia**, **JavaScript**, **Python**, and **MicroPython** applications, using NATS as the message bus. -- Chat applications with rich media support -- File transfer systems with claim-check pattern -- Streaming data pipelines - -Each section builds on the previous one, gradually increasing in complexity. 
+This walkthrough serves as the primary onboarding guide for new developers and explains: +- **How the system works** - Step-by-step flow of data transmission and reception +- **Why steps are sequenced** - The rationale behind architectural decisions +- **What could go wrong** - Common failure scenarios and recovery strategies --- -## Architecture Overview +## Overview: The Big Picture -### Cross-Platform System Components +NATSBridge implements the **Claim-Check pattern** for efficient handling of large payloads (>0.5MB): + +``` +┌─────────────────────────────────────────────────────────────────────┐ +│ NATSBridge Architecture │ +├─────────────────────────────────────────────────────────────────────┤ +│ │ +│ ┌──────────────┐ ┌──────────────┐ │ +│ │ Sender │ │ Receiver │ │ +│ │ │ │ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │smartsend │◀─────────┤ │smartreceive│ │ │ +│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ +│ │ │ │ │ │ │ │ +│ │ ▼ │ │ ▼ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │Serialize │◀─────────┤ │Deserialize│ │ │ +│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ +│ │ │ │ │ │ │ │ +│ │ ▼ │ │ ▼ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │Transport │◀─────────┤ │Transport │ │ │ +│ │ │Selection │ │ │ │Selection │ │ │ +│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ +│ │ │ │ │ │ │ │ +│ │ ▼ │ │ ▼ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │ NATS │◀─────────┤ │ NATS │ │ │ +│ │ │Publish │ │ │ │Subscribe │ │ │ +│ │ └──────────┘ │ │ └──────────┘ │ │ +│ │ │ │ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │File Server│◀─────────┤ │File Server│ │ │ +│ │ │Upload │ │ │ │Download │ │ │ +│ │ └──────────┘ │ │ └──────────┘ │ │ +│ └──────────────┘ └──────────────┘ │ +│ │ +└─────────────────────────────────────────────────────────────────────┘ +``` + +### Key Design Principles + +| Principle | Description | Rationale | +|-----------|-------------|-----------| +| **Claim-Check Pattern** | Large payloads uploaded to HTTP server, URL sent via NATS | NATS has message size limits; 
avoids NATS overflow | +| **Automatic Transport Selection** | Direct (< threshold) vs Link (≥ threshold) based on size | Optimizes memory vs network I/O trade-off | +| **Cross-Platform API** | Consistent `smartsend()`/`smartreceive()` across all platforms | Simplifies developer experience | +| **Exponential Backoff** | Retry downloads with increasing delays | Handles transient failures gracefully | + +--- + +## The Sending Flow: `smartsend()` + +### Step-by-Step Journey ```mermaid -flowchart TB - subgraph JuliaApp["Julia Application"] - JuliaAppCode[App Code] - JuliaBridge[NATSBridge.jl] - JuliaNATS[NATS.jl] - end - - subgraph JSApp["JavaScript Application"] - JSAppCode[App Code] - JSBridge[NATSBridge.js] - JSNATS[nats.js] - end - - subgraph PythonApp["Python Application"] - PythonAppCode[App Code] - PythonBridge[NATSBridge.py] - PythonNATS[nats-py] - end - - subgraph Infrastructure["Infrastructure"] - NATS[NATS Server
Message Broker] - FileServer[HTTP File Server
Upload/Download] - end - - JuliaAppCode --> JuliaBridge - JuliaBridge --> JuliaNATS - JSAppCode --> JSBridge - JSBridge --> JSNATS - PythonAppCode --> PythonBridge - PythonBridge --> PythonNATS - - JuliaNATS --> NATS - JSNATS --> NATS - PythonNATS --> NATS - - NATS --> JuliaNATS - NATS --> JSNATS - NATS --> PythonNATS - - JuliaBridge -.->|HTTP POST upload| FileServer - JSBridge -.->|HTTP POST upload| FileServer - PythonBridge -.->|HTTP POST upload| FileServer - - FileServer -.->|HTTP GET download| JuliaBridge - FileServer -.->|HTTP GET download| JSBridge - FileServer -.->|HTTP GET download| PythonBridge - - style JuliaApp fill:#c5e1a5 - style JSApp fill:#bbdefb - style PythonApp fill:#f8bbd0 - style NATS fill:#fff3e0 - style FileServer fill:#f3e5f5 +flowchart TD + A[User calls smartsend subject data] --> B[Process each payload] + B --> C{Parse payload tuple} + C --> D[Extract: dataname, data, payload_type] + + D --> E[_serialize_data] + E --> F{payload_type} + + F -->|"text"| G[UTF-8 encode] + F -->|"dictionary"| H[JSON serialize] + F -->|"arrowtable"| I[Arrow IPC serialize] + F -->|"jsontable"| J[JSON serialize] + F -->|"image"| K[Raw bytes] + F -->|"audio"| L[Raw bytes] + F -->|"video"| M[Raw bytes] + F -->|"binary"| N[Raw bytes] + + G --> O[Return bytes] + H --> O + I --> O + J --> O + K --> O + L --> O + M --> O + N --> O + + O --> P[Calculate serialized size] + P --> Q{Size < Threshold?} + + Q -->|Yes| R[Direct Transport] + Q -->|No| S[Link Transport] + + R --> T[Base64 encode] + T --> U[Build payload with direct] + + S --> V[Upload to file server] + V --> W[Get download URL] + W --> U + + U --> X[Build envelope] + X --> Y[Convert to JSON] + Y --> Z[Publish to NATS] + + style A fill:#f9f9f9,stroke:#333 + style Z fill:#e0e7ff,stroke:#3b82f6 + style R fill:#d1fae5,stroke:#10b981 + style S fill:#fef3c7,stroke:#f59e0b ``` -### Message Flow +### Detailed Walkthrough -1. **Sender** creates a message envelope with payloads -2. 
**NATSBridge** serializes and encodes payloads -3. **Transport Decision**: Small payloads go directly to NATS, large payloads are uploaded to file server -4. **NATS** routes messages to subscribers -5. **Receiver** fetches payloads (from NATS or file server) -6. **NATSBridge** deserializes and decodes payloads - ---- - -## Building a Chat Application - -Let's build a full-featured chat application that supports text, images, and file attachments. - -### Step 1: Set Up the Project - -```bash -# Create project directory -mkdir -p chat-app/src -cd chat-app - -# Create configuration file -cat > config.json << 'EOF' -{ - "nats_url": "nats://localhost:4222", - "fileserver_url": "http://localhost:8080", - "size_threshold": 1048576 -} -EOF -``` - -### Step 2: Create the Chat Interface - -#### Julia +#### Step 1: User Calls `smartsend()` ```julia -# src/chat_ui.jl -using NATSBridge, NATS - -struct ChatUI - messages::Vector{Dict} - current_room::String -end - -function ChatUI() - ChatUI(Dict[], "") -end - -function send_message(ui::ChatUI, message_input::String, selected_file::Union{Nothing, String}) - data = [] - - # Add text message - if !isempty(message_input) - push!(data, ("text", message_input, "text")) - end - - # Add file if selected - if selected_file !== nothing - file_data = read(selected_file) - file_type = get_file_type(selected_file) - push!(data, ("attachment", file_data, file_type)) - end - - return data -end - -function get_file_type(filename::String)::String - if endswith(filename, ".png") || endswith(filename, ".jpg") - return "image" - elseif endswith(filename, ".mp3") || endswith(filename, ".wav") - return "audio" - elseif endswith(filename, ".mp4") || endswith(filename, ".avi") - return "video" - else - return "binary" - end -end - -function add_message(ui::ChatUI, user::String, text::String, attachment::Union{Nothing, Dict}) - push!(ui.messages, Dict( - "user" => user, - "text" => text, - "attachment" => attachment - )) -end -``` - -#### JavaScript - 
-```javascript -// src/chat_ui.js -const NATSBridge = require('./src/natsbridge.js'); - -class ChatUI { - constructor() { - this.messages = []; - this.currentRoom = ""; - } - - sendMessage(messageInput, selectedFile = null) { - const data = []; - - // Add text message - if (messageInput.length > 0) { - data.push(["text", messageInput, "text"]); - } - - // Add file if selected - if (selectedFile !== null) { - const fileData = fs.readFileSync(selectedFile); - const fileType = this.getFileType(selectedFile); - data.push(["attachment", fileData, fileType]); - } - - return data; - } - - getFileType(filename) { - if (filename.endsWith('.png') || filename.endsWith('.jpg')) { - return 'image'; - } else if (filename.endsWith('.mp3') || filename.endsWith('.wav')) { - return 'audio'; - } else if (filename.endsWith('.mp4') || filename.endsWith('.avi')) { - return 'video'; - } else { - return 'binary'; - } - } - - addMessage(user, text, attachment = null) { - this.messages.push({ - user, - text, - attachment - }); - } -} - -module.exports = ChatUI; -``` - -#### Python - -```python -# src/chat_ui.py -from typing import List, Dict, Optional, Union - -class ChatUI: - def __init__(self): - self.messages: List[Dict] = [] - self.current_room: str = "" - - def send_message(self, message_input: str, selected_file: Optional[str] = None) -> List[tuple]: - data = [] - - # Add text message - if message_input: - data.append(("text", message_input, "text")) - - # Add file if selected - if selected_file: - with open(selected_file, "rb") as f: - file_data = f.read() - file_type = self.get_file_type(selected_file) - data.append(("attachment", file_data, file_type)) - - return data - - def get_file_type(self, filename: str) -> str: - if filename.endswith(('.png', '.jpg')): - return "image" - elif filename.endswith(('.mp3', '.wav')): - return "audio" - elif filename.endswith(('.mp4', '.avi')): - return "video" - else: - return "binary" - - def add_message(self, user: str, text: str, attachment: 
Optional[Dict] = None): - self.messages.append({ - "user": user, - "text": text, - "attachment": attachment - }) -``` - -### Step 3: Create the Message Handler - -#### Julia - -```julia -# src/chat_handler.jl -using NATSBridge, NATS - -struct ChatHandler - nats::NATS.Connection - ui::ChatUI -end - -function ChatHandler(nats_connection::NATS.Connection) - ChatHandler(nats_connection, ChatUI()) -end - -function start(handler::ChatHandler) - # Subscribe to chat rooms - rooms = ["general", "tech", "random"] - - for room in rooms - NATS.subscribe(handler.nats, "/chat/$room") do msg - handle_message(handler, msg) - end - end - - println("Chat handler started") -end - -function handle_message(handler::ChatHandler, msg::NATS.Msg) - env = smartreceive(msg, fileserver_download_handler=_fetch_with_backoff) - - # Extract sender info from envelope - sender = get(env, "sender_name", "Anonymous") - - # Process each payload - for (dataname, data, type) in env["payloads"] - if type == "text" - add_message(handler.ui, sender, data, nothing) - elseif type == "image" - # Convert to data URL for display - base64_data = base64encode(data) - attachment = Dict( - "type" => "image", - "data" => "data:image/png;base64,$base64_data" - ) - add_message(handler.ui, sender, "", attachment) - else - # For other types, use file server URL - attachment = Dict("type" => type, "data" => data) - add_message(handler.ui, sender, "", attachment) - end - end -end -``` - -#### JavaScript - -```javascript -// src/chat_handler.js -const NATSBridge = require('./src/natsbridge.js'); -const nats = require('nats'); - -class ChatHandler { - constructor(natsConnection) { - this.nats = natsConnection; - this.ui = new (require('./chat_ui.js'))(); - } - - async start() { - // Subscribe to chat rooms - const rooms = ['general', 'tech', 'random']; - - for (const room of rooms) { - this.nats.subscribe(`/chat/${room}`, async (msg) => { - await this.handleMessage(msg); - }); - } - - console.log('Chat handler started'); - 
} - - async handleMessage(msg) { - const env = await NATSBridge.smartreceive(msg, { - fileserver_download_handler: NATSBridge.fetchWithBackoff - }); - - // Extract sender info from envelope - const sender = env.sender_name || 'Anonymous'; - - // Process each payload - for (const [dataname, data, type] of env.payloads) { - if (type === 'text') { - this.ui.addMessage(sender, data, null); - } else if (type === 'image') { - // Convert to data URL for display - const base64Data = Buffer.from(data).toString('base64'); - const attachment = { - type: 'image', - data: `data:image/png;base64,${base64Data}` - }; - this.ui.addMessage(sender, '', attachment); - } else { - // For other types, use file server URL - const attachment = { type, data }; - this.ui.addMessage(sender, '', attachment); - } - } - } -} - -module.exports = ChatHandler; -``` - -#### Python - -```python -# src/chat_handler.py -import asyncio -from typing import Optional -from natsbridge import smartreceive, fetch_with_backoff - -class ChatHandler: - def __init__(self, nats_connection): - self.nats = nats_connection - self.ui = ChatUI() - - async def start(self): - # Subscribe to chat rooms - rooms = ['general', 'tech', 'random'] - - for room in rooms: - await self.nats.subscribe( - f'/chat/{room}', - callback=self.handle_message - ) - - print('Chat handler started') - - async def handle_message(self, msg): - env = await smartreceive( - msg, - fileserver_download_handler=fetch_with_backoff - ) - - # Extract sender info from envelope - sender = env.get('sender_name', 'Anonymous') - - # Process each payload - for dataname, data, type_ in env['payloads']: - if type_ == 'text': - self.ui.add_message(sender, data, None) - elif type_ == 'image': - # Convert to data URL for display - import base64 - base64_data = base64.b64encode(data).decode('utf-8') - attachment = { - 'type': 'image', - 'data': f'data:image/png;base64,{base64_data}' - } - self.ui.add_message(sender, '', attachment) - else: - # For other types, use 
file server URL or data - attachment = {'type': type_, 'data': data} - self.ui.add_message(sender, '', attachment) -``` - -### Step 4: Run the Application - -```bash -# Start NATS -docker run -p 4222:4222 nats:latest - -# Start file server -mkdir -p /tmp/fileserver -python3 -m http.server 8080 --directory /tmp/fileserver - -# Run chat app # Julia -julia src/chat_ui.jl -julia src/chat_handler.jl - -# JavaScript -node src/chat_ui.js -node src/chat_handler.js +data = [ + ("msg", "Hello World", "text"), + ("img", binary_data, "image") +] +env, msg_json = smartsend("/chat/user/v1/message", data) +``` +```python # Python -python3 src/chat_ui.py -python3 src/chat_handler.py +data = [ + ("msg", "Hello World", "text"), + ("img", binary_data, "image") +] +env, msg_json = await smartsend("/chat/user/v1/message", data) +``` + +```javascript +// JavaScript +const data = [ + ["msg", "Hello World", "text"], + ["img", binaryData, "image"] +]; +const [env, msgJson] = await smartsend("/chat/user/v1/message", data); +``` + +**What happens**: +- User provides a list of tuples: `(dataname, data, payload_type)` +- `dataname`: Identifier for the payload (e.g., "msg", "login_image") +- `data`: The actual data to send +- `payload_type`: Type string determining serialization method + +#### Step 2: Serialization (`_serialize_data`) + +Each payload is serialized based on its type: + +| Payload Type | Julia | Python | JavaScript | Encoding | +|--------------|-------|--------|------------|----------| +| `text` | UTF-8 bytes | UTF-8 bytes | UTF-8 bytes | Base64 | +| `dictionary` | JSON string | JSON string | JSON string | Base64 | +| `arrowtable` | Arrow IPC | Arrow IPC | Arrow IPC | Base64/arrow-ipc | +| `jsontable` | JSON array | JSON array | JSON array | Base64/json | +| `image`/`audio`/`video`/`binary` | Raw bytes | Raw bytes | Raw bytes | Base64 | + +**Example**: +```julia +# Text serialization +text_bytes = Vector{UInt8}("Hello World") # 11 bytes + +# Dictionary serialization +dict_bytes = 
Vector{UInt8}("{\"key\":\"value\"}") # 15 bytes + +# Arrow table serialization +io = IOBuffer() +Arrow.write(io, data_frame) +arrow_bytes = take!(io) # Binary Arrow IPC stream +``` + +#### Step 3: Transport Selection + +The serialized size determines the transport method: + +| Platform | Threshold | Notes | +|----------|-----------|-------| +| Desktop (Julia/JS/Python) | 500,000 bytes (0.5MB) | Default threshold | +| MicroPython | 100,000 bytes (100KB) | Lower threshold for memory constraints | + +**Decision Logic**: +```julia +if payload_size < size_threshold + # Direct transport: send via NATS +else + # Link transport: upload to file server +end +``` + +#### Step 4: Direct Transport Path + +For payloads < threshold: + +1. **Base64 Encode**: Convert binary data to ASCII string +2. **Build Payload**: Create `msg_payload_v1` with `transport="direct"` + +```julia +# Encode as Base64 +payload_b64 = Base64.base64encode(payload_bytes) + +# Build payload +payload = msg_payload_v1( + payload_b64, + payload_type; + transport = "direct", + encoding = "base64", + size = payload_size +) +``` + +#### Step 5: Link Transport Path + +For payloads ≥ threshold: + +1. **Upload to File Server**: Use `plik_oneshot_upload()` +2. **Get Download URL**: Server returns URL for the uploaded file +3. 
**Build Payload**: Create `msg_payload_v1` with `transport="link"` + +```julia +# Upload to Plik server +response = fileserver_upload_handler(fileserver_url, dataname, payload_bytes) + +# Extract URL +url = response["url"] + +# Build payload +payload = msg_payload_v1( + url, + payload_type; + transport = "link", + encoding = "none", + size = payload_size +) +``` + +**File Server Handler Contract**: +```julia +function fileserver_upload_handler( + file_server_url::String, + dataname::String, + data::Vector{UInt8} +)::Dict{String, Any} + # Returns: Dict("status" => 200, "uploadid" => "...", "fileid" => "...", "url" => "...") +end +``` + +#### Step 6: Build Envelope + +All payloads are wrapped in a message envelope: + +```julia +env = msg_envelope_v1( + subject, + payloads; + correlation_id = correlation_id, + msg_id = msg_id, + msg_purpose = msg_purpose, + sender_name = sender_name, + sender_id = sender_id, + receiver_name = receiver_name, + receiver_id = receiver_id, + reply_to = reply_to, + reply_to_msg_id = reply_to_msg_id, + broker_url = broker_url +) +``` + +**Envelope Fields**: +| Field | Purpose | +|-------|---------| +| `correlation_id` | Track message flow across distributed systems | +| `msg_id` | Unique identifier for this message | +| `timestamp` | ISO 8601 UTC timestamp | +| `send_to` | NATS subject to publish to | +| `msg_purpose` | ACK, NACK, updateStatus, shutdown, chat, command, event | +| `sender_name`/`sender_id` | Sender identification | +| `receiver_name`/`receiver_id` | Receiver identification (empty = broadcast) | +| `reply_to` | Topic for reply messages | +| `broker_url` | NATS server URL | +| `metadata` | Message-level metadata | +| `payloads` | Array of payload objects | + +#### Step 7: Publish to NATS + +The envelope is converted to JSON and published to NATS: + +```julia +env_json_str = envelope_to_json(env) + +# Publish with existing connection +publish_message(nats_connection, subject, env_json_str, correlation_id) + +# Or publish by 
creating new connection +publish_message(broker_url, subject, env_json_str, correlation_id) ``` --- -## Building a File Transfer System +## The Receiving Flow: `smartreceive()` -Let's build a file transfer system that handles large files efficiently. +### Step-by-Step Journey -### Step 1: File Upload Service +```mermaid +flowchart TD + A[NATS message arrives] --> B[Parse JSON envelope] + B --> C[Extract payloads array] + C --> D{Iterate through payloads} + + D --> E[Get payload transport] + E --> F{transport == direct?} + + F -->|Yes| G[Extract Base64 data] + G --> H[Decode Base64] + H --> I[_deserialize_data] + + F -->|No| J[Extract download URL] + J --> K[Fetch with exponential backoff] + K --> L[_deserialize_data] + + I --> M[Build payload tuple] + L --> M + + M --> N{More payloads?} + N -->|Yes| D + N -->|No| O[Replace payloads array] + O --> P[Return envelope] + + style A fill:#f9f9f9,stroke:#333 + style P fill:#e0e7ff,stroke:#3b82f6 + style G fill:#d1fae5,stroke:#10b981 + style J fill:#fef3c7,stroke:#f59e0b +``` -#### Julia +### Detailed Walkthrough + +#### Step 1: NATS Message Arrives + +The receiver gets a message from NATS: ```julia -# src/file_upload_service.jl -using NATSBridge, HTTP - -struct FileUploadService - broker_url::String - fileserver_url::String -end - -function FileUploadService(broker_url::String, fileserver_url::String) - FileUploadService(broker_url, fileserver_url) -end - -function upload_file(service::FileUploadService, file_path::String, recipient::String)::Dict - file_data = read(file_path) - file_name = basename(file_path) - - data = [("file", file_data, "binary")] - - env, env_json_str = smartsend( - "/files/$recipient", - data, - broker_url=service.broker_url, - fileserver_url=service.fileserver_url - ) - - return env -end - -function upload_large_file(service::FileUploadService, file_path::String, recipient::String)::Dict - file_size = stat(file_path).size - - if file_size > 100 * 1024 * 1024 # > 100MB - println("File too large for 
direct upload, using streaming...") - return stream_upload(service, file_path, recipient) - end - - return upload_file(service, file_path, recipient) -end - -function stream_upload(service::FileUploadService, file_path::String, recipient::String)::Dict - # Implement streaming upload to file server - # This would require a more sophisticated file server - # For now, we'll use the standard upload - return upload_file(service, file_path, recipient) -end +# Julia +msg = nats_subscription.next() # Get next message +env = smartreceive(msg) ``` -#### JavaScript - -```javascript -// src/file_upload_service.js -const NATSBridge = require('./src/natsbridge.js'); -const fs = require('fs'); - -class FileUploadService { - constructor(brokerUrl, fileserverUrl) { - this.broker_url = brokerUrl; - this.fileserver_url = fileserverUrl; - } - - async uploadFile(filePath, recipient) { - const fileData = fs.readFileSync(filePath); - const fileName = require('path').basename(filePath); - - const data = [["file", fileData, "binary"]]; - - const [env, env_json_str] = await NATSBridge.smartsend( - `/files/${recipient}`, - data, - { - broker_url: this.broker_url, - fileserver_url: this.fileserver_url - } - ); - - return env; - } - - async uploadLargeFile(filePath, recipient) { - const stats = fs.statSync(filePath); - const fileSize = stats.size; - - if (fileSize > 100 * 1024 * 1024) { // > 100MB - console.log('File too large for direct upload, using streaming...'); - return this.streamUpload(filePath, recipient); - } - - return this.uploadFile(filePath, recipient); - } - - async streamUpload(filePath, recipient) { - // Implement streaming upload to file server - // This would require a more sophisticated file server - // For now, we'll use the standard upload - return this.uploadFile(filePath, recipient); - } -} - -module.exports = FileUploadService; -``` - -#### Python - ```python -# src/file_upload_service.py -from natsbridge import smartsend -import os - -class FileUploadService: - def 
__init__(self, broker_url: str, fileserver_url: str): - self.broker_url = broker_url - self.fileserver_url = fileserver_url - - async def upload_file(self, file_path: str, recipient: str) -> tuple: - with open(file_path, "rb") as f: - file_data = f.read() - file_name = os.path.basename(file_path) - - data = [("file", file_data, "binary")] - - env, env_json_str = await smartsend( - f"/files/{recipient}", - data, - broker_url=self.broker_url, - fileserver_url=self.fileserver_url - ) - - return env, env_json_str - - async def upload_large_file(self, file_path: str, recipient: str) -> tuple: - file_size = os.path.getsize(file_path) - - if file_size > 100 * 1024 * 1024: # > 100MB - print("File too large for direct upload, using streaming...") - return await self.stream_upload(file_path, recipient) - - return await self.upload_file(file_path, recipient) - - async def stream_upload(self, file_path: str, recipient: str) -> tuple: - # Implement streaming upload to file server - # This would require a more sophisticated file server - # For now, we'll use the standard upload - return await self.upload_file(file_path, recipient) +# Python +msg = await nats_consumer.next() # Get next message +env = await smartreceive(msg) ``` -### Step 2: File Download Service +```javascript +// JavaScript +const msg = await natsSubscription.next(); +const env = await smartreceive(msg); +``` -#### Julia +#### Step 2: Parse JSON Envelope + +The message payload is parsed as JSON: ```julia -# src/file_download_service.jl -using NATSBridge +env_json_obj = JSON.parse(String(msg.payload)) +``` -struct FileDownloadService - nats_url::String -end +**Expected Structure**: +```json +{ + "correlation_id": "abc123...", + "msg_id": "def456...", + "timestamp": "2026-03-13T07:02:50.443Z", + "send_to": "/chat/user/v1/message", + "msg_purpose": "chat", + "sender_name": "sender-app", + "sender_id": "sender-uuid...", + "receiver_name": "receiver-app", + "receiver_id": "receiver-uuid...", + "reply_to": 
"reply.subject", + "reply_to_msg_id": "msg-id...", + "broker_url": "nats://localhost:4222", + "metadata": {}, + "payloads": [ + { + "id": "payload-uuid...", + "dataname": "msg", + "payload_type": "text", + "transport": "direct", + "encoding": "base64", + "size": 11, + "data": "SGVsbG8gV29ybGQ=", + "metadata": {"payload_bytes": 11} + } + ] +} +``` -function FileDownloadService(nats_url::String) - FileDownloadService(nats_url) -end +#### Step 3: Process Each Payload -function download_file(service::FileDownloadService, msg::NATS.Msg, sender::String, download_id::String) - env = smartreceive(msg, fileserver_download_handler=fetch_from_url) +For each payload in the envelope: + +```julia +num_payloads = length(env_json_obj["payloads"]) + +for i in 1:num_payloads + payload = env_json_obj["payloads"][i] + transport = String(payload["transport"]) + dataname = String(payload["dataname"]) - # Process each payload - for (dataname, data, type) in env["payloads"] - if type == "binary" - file_path = "/downloads/$dataname" - write(file_path, data) - println("File saved to $file_path") + if transport == "direct" + # Direct transport path + elseif transport == "link" + # Link transport path + else + error("Unknown transport type: $transport") + end +end +``` + +#### Step 4: Direct Transport Path + +For payloads with `transport == "direct"`: + +1. **Extract Base64 Data**: Get the Base64-encoded string +2. **Decode Base64**: Convert to binary data +3. 
**Deserialize**: Convert bytes to native data type + +```julia +# Extract Base64 payload +payload_b64 = String(payload["data"]) + +# Decode Base64 +payload_bytes = Base64.base64decode(payload_b64) + +# Deserialize based on type +data_type = String(payload["payload_type"]) +data = _deserialize_data(payload_bytes, data_type, env_json_obj["correlation_id"]) +``` + +**Deserialization Logic**: +| Payload Type | Deserialization | +|--------------|-----------------| +| `text` | UTF-8 bytes → String | +| `dictionary` | UTF-8 bytes → JSON string → Julia object | +| `arrowtable` | Bytes → Arrow IPC stream → DataFrame | +| `jsontable` | UTF-8 bytes → JSON string → Vector{Dict} → DataFrame | +| `image`/`audio`/`video`/`binary` | Bytes directly | + +#### Step 5: Link Transport Path + +For payloads with `transport == "link"`: + +1. **Extract URL**: Get the download URL from payload +2. **Fetch with Backoff**: Download data with retry logic +3. **Deserialize**: Convert bytes to native data type + +```julia +# Extract download URL +url = String(payload["data"]) + +# Fetch with exponential backoff +downloaded_data = fileserver_download_handler( + url, + max_retries, + base_delay, + max_delay, + env_json_obj["correlation_id"] +) + +# Deserialize based on type +data_type = String(payload["payload_type"]) +data = _deserialize_data(downloaded_data, data_type, env_json_obj["correlation_id"]) +``` + +**Download Handler Contract**: +```julia +function fileserver_download_handler( + url::String, + max_retries::Int, + base_delay::Int, + max_delay::Int, + correlation_id::String +)::Vector{UInt8} + # Returns: Vector{UInt8} (downloaded bytes) +end +``` + +#### Step 6: Build Payload List + +Each processed payload is added to the result list: + +```julia +payloads_list = Tuple{String, Any, String}[] + +# After processing each payload +push!(payloads_list, (dataname, data, data_type)) +``` + +**Result Format**: +```julia +[ + ("msg", "Hello World", "text"), + ("img", binary_data, "image") +] +``` 
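The receive steps for the direct transport path (parse the envelope, Base64-decode each payload, deserialize by `payload_type`, collect tuples) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: `decode_direct_payloads` is a hypothetical helper, not part of the NATSBridge API; it handles `text` and `dictionary` only, leaves every other type as raw bytes, and does not implement link transport.

```python
import base64
import json


def decode_direct_payloads(envelope_json):
    """Hypothetical helper (not NATSBridge API): decode direct-transport payloads.

    Returns a list of (dataname, data, payload_type) tuples, mirroring the
    result format described in Step 6.
    """
    env = json.loads(envelope_json)
    result = []
    for payload in env["payloads"]:
        if payload["transport"] != "direct":
            # Link transport needs an HTTP download; out of scope for this sketch.
            raise ValueError("unsupported transport: " + payload["transport"])
        raw = base64.b64decode(payload["data"])
        ptype = payload["payload_type"]
        if ptype == "text":
            data = raw.decode("utf-8")
        elif ptype == "dictionary":
            data = json.loads(raw.decode("utf-8"))
        else:
            # Assumption: tabular and media types are left as raw bytes here.
            data = raw
        result.append((payload["dataname"], data, ptype))
    return result


# Minimal envelope reusing the "Hello World" payload from the expected structure.
envelope = json.dumps({
    "correlation_id": "abc123",
    "payloads": [{
        "dataname": "msg",
        "payload_type": "text",
        "transport": "direct",
        "encoding": "base64",
        "size": 11,
        "data": "SGVsbG8gV29ybGQ=",
    }],
})
print(decode_direct_payloads(envelope))  # [('msg', 'Hello World', 'text')]
```

A real receiver would also branch on `transport == "link"` and pass the URL to a download handler with retry logic, as described in Step 5.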
+ +#### Step 7: Return Envelope + +The envelope is updated with the processed payloads and returned: + +```julia +env_json_obj["payloads"] = payloads_list +return env_json_obj +``` + +--- + +## File Server Integration + +### Plik One-Shot Upload + +NATSBridge uses **Plik** as the default HTTP file server for link transport: + +```julia +# Upload handler +function plik_oneshot_upload( + file_server_url::String, + dataname::String, + data::Vector{UInt8} +)::Dict{String, Any} +``` + +**Upload Flow**: +1. **Create One-Shot Session**: POST `/upload` with `{"OneShot": true}` +2. **Get Upload ID**: Server returns `uploadid` and `uploadtoken` +3. **Upload File**: POST `/file/{uploadid}` with multipart form data +4. **Get File ID**: Server returns `fileid` +5. **Return URL**: Construct download URL + +```julia +# Step 1: Create one-shot session +POST /upload +Headers: Content-Type: application/json +Body: {"OneShot": true} + +Response: +{ + "id": "UPLOAD_ID", + "uploadToken": "UPLOAD_TOKEN", + "status": 200 +} + +# Step 2: Upload file +POST /file/UPLOAD_ID +Headers: X-UploadToken: UPLOAD_TOKEN +Body: multipart/form-data (file) + +Response: +{ + "id": "FILE_ID", + "status": 200 +} + +# Final URL: http://localhost:8080/file/UPLOAD_ID/FILE_ID/filename.ext +``` + +### Exponential Backoff for Downloads + +File downloads use exponential backoff for resilience: + +```julia +function _fetch_with_backoff( + url::String, + max_retries::Int, + base_delay::Int, + max_delay::Int, + correlation_id::String +)::Vector{UInt8} +``` + +**Retry Policy**: +- Initial delay: `base_delay` milliseconds (default: 100ms) +- Multiplier: 2x per retry +- Maximum delay: `max_delay` milliseconds (default: 5000ms) +- Maximum retries: `max_retries` (default: 5) + +**Delay Calculation**: +```julia +delay = base_delay # Start with 100ms + +for attempt in 1:max_retries + try + # Try to fetch + response = HTTP.request("GET", url) + if response.status == 200 + return response.body + end + catch e + if attempt < 
max_retries + sleep(delay / 1000.0) # Sleep before retry + delay = min(delay * 2, max_delay) # Double delay, cap at max end end end -function fetch_from_url(url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String)::Vector{UInt8} - # Fetch data from URL with exponential backoff - # Return downloaded data as Vector{UInt8} -end +error("Failed after $max_retries attempts") ``` -#### JavaScript +**Example Delays**: +| Attempt | Delay | +|---------|-------| +| 1 | 100ms | +| 2 | 200ms | +| 3 | 400ms | +| 4 | 800ms | +| 5 | 1600ms (capped at 5000ms) | -```javascript -// src/file_download_service.js -const NATSBridge = require('./src/natsbridge.js'); -const fs = require('fs'); +--- -class FileDownloadService { - constructor(natsUrl) { - this.nats_url = natsUrl; - } - - async downloadFile(msg, sender, downloadId) { - const env = await NATSBridge.smartreceive(msg, { - fileserver_download_handler: NATSBridge.fetchWithBackoff - }); - - // Process each payload - for (const [dataname, data, type] of env.payloads) { - if (type === 'binary') { - const filePath = `/downloads/${dataname}`; - fs.writeFileSync(filePath, data); - console.log(`File saved to ${filePath}`); - } - } +## Cross-Platform Compatibility + +### Platform-Specific Implementations + +| Platform | File | Key Features | +|----------|------|--------------| +| Julia | `src/NATSBridge.jl` | Multiple dispatch, Arrow.jl support | +| Python | `src/natsbridge.py` | Async/await, pyarrow support | +| Node.js | `src/natsbridge_ssr.js` | Buffer, nats.js | +| Browser | `src/natsbridge_csr.js` | Uint8Array, nats.ws, Web Crypto | +| MicroPython | `src/natsbridge_mpy.py` | Synchronous, limited payload types | + +### API Parity + +All platforms implement the same core API: + +| Function | Julia | Python | JavaScript | MicroPython | +|----------|-------|--------|------------|-------------| +| `smartsend()` | ✅ | ✅ | ✅ | ✅ | +| `smartreceive()` | ✅ | ✅ | ✅ | ✅ | +| `plik_oneshot_upload()` | ✅ | ✅ | ✅ | 
⚠️ (placeholder) | +| `fetch_with_backoff()` | ✅ | ✅ | ✅ | ⚠️ (placeholder) | + +### Payload Type Support by Platform + +| Type | Julia | Python | Node.js | Browser | MicroPython | +|------|-------|--------|---------|---------|-------------| +| `text` | ✅ | ✅ | ✅ | ✅ | ✅ | +| `dictionary` | ✅ | ✅ | ✅ | ✅ | ✅ | +| `arrowtable` | ✅ | ✅ | ✅ | ✅ | ❌ | +| `jsontable` | ✅ | ✅ | ✅ | ✅ | ⚠️ | +| `image` | ✅ | ✅ | ✅ | ✅ | ✅ | +| `audio` | ✅ | ✅ | ✅ | ✅ | ✅ | +| `video` | ✅ | ✅ | ✅ | ✅ | ✅ | +| `binary` | ✅ | ✅ | ✅ | ✅ | ✅ | + +--- + +## Error Handling + +### Common Error Scenarios + +| Scenario | Error Code | Recovery | +|----------|------------|----------| +| **Unknown payload_type** | `INVALID_PAYLOAD_TYPE` | Use supported payload_type | +| **Failed to upload** | `UPLOAD_FAILED` | Retry or use direct transport | +| **Failed to fetch** | `DOWNLOAD_FAILED` | Retry with exponential backoff | +| **Unknown transport** | `INVALID_TRANSPORT` | Check payload transport field | +| **NATS connection failed** | `NATS_CONNECTION_FAILED` | Check NATS server availability | +| **Deserialization error** | `DESERIALIZATION_ERROR` | Validate payload_type matches data | + +### Error Response Format + +```json +{ + "correlation_id": "abc123...", + "msg_id": "def456...", + "timestamp": "2026-03-13T07:02:50.443Z", + "send_to": "/chat/user/v1/message", + "error": { + "code": "DOWNLOAD_FAILED", + "message": "Failed to fetch data after 5 attempts", + "details": { + "url": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/filename.ext", + "correlation_id": "abc123..." 
} + } } - -module.exports = FileDownloadService; ``` -#### Python - -```python -# src/file_download_service.py -from natsbridge import smartreceive, fetch_with_backoff -import os - -class FileDownloadService: - def __init__(self, nats_url: str): - self.nats_url = nats_url - - async def download_file(self, msg, sender: str, download_id: str): - env = await smartreceive( - msg, - fileserver_download_handler=fetch_with_backoff - ) - - # Process each payload - for dataname, data, type_ in env['payloads']: - if type_ == 'binary': - file_path = f'/downloads/{dataname}' - os.makedirs('/downloads', exist_ok=True) - with open(file_path, 'wb') as f: - f.write(data) - print(f"File saved to {file_path}") -``` - -### Step 3: File Transfer CLI - -#### Julia +### Exception Handling Examples ```julia -# src/cli.jl -using NATSBridge - -function main() - println("File Transfer System") - println("====================") - println("1. Upload file") - println("2. Download file") - println("3. List pending downloads") - - print("Enter choice: ") - choice = readline() - - if choice == "1" - upload_file_cli() - elseif choice == "2" - download_file_cli() - end +# File server unavailable +try + env, msg_json = smartsend("/subject", data) +catch e + # Retry with direct transport or use smaller payloads end -function upload_file_cli() - print("Enter file path: ") - file_path = readline() - - print("Enter recipient: ") - recipient = readline() - - file_service = FileUploadService("nats://localhost:4222", "http://localhost:8080") - - try - env = upload_file(file_service, file_path, recipient) - println("Upload successful!") - println("File ID: $(env["payloads"][1][1])") - catch error - println("Upload failed: $(error)") - end +# Deserialization error +try + env = smartreceive(msg) +catch e + # env is never assigned when smartreceive throws, so log the exception with its backtrace + @error "Deserialization failed" exception=(e, catch_backtrace()) end - -function download_file_cli() - print("Enter sender: 
") - sender = readline() - - 
file_service = FileDownloadService("nats://localhost:4222") - - try - download_file(file_service, sender) - println("Download complete!") - catch error - println("Download failed: $(error)") - end -end - -main() ``` --- -## Building a Streaming Data Pipeline +## Debugging and Tracing -Let's build a data pipeline that processes streaming data from sensors. +### Correlation ID Tracking -### Step 1: Sensor Data Model - -#### Julia +Every message includes a `correlation_id` for distributed tracing: ```julia -# src/sensor_data.jl -using Dates, DataFrames +# Generate correlation ID at start of request +correlation_id = string(uuid4()) -struct SensorReading - sensor_id::String - timestamp::String - value::Float64 - unit::String - metadata::Dict{String, Any} -end - -function SensorReading(sensor_id::String, value::Float64, unit::String, metadata::Dict{String, Any}=Dict()) - SensorReading( - sensor_id, - ISODateTime(now(), Dates.Second) |> string, - value, - unit, - metadata - ) -end - -struct SensorBatch - readings::Vector{SensorReading} -end - -function SensorBatch() - SensorBatch(SensorReading[]) -end - -function add_reading(batch::SensorBatch, reading::SensorReading) - push!(batch.readings, reading) -end - -function to_dataframe(batch::SensorBatch)::DataFrame - data = Dict{String, Any}() - data["sensor_id"] = [r.sensor_id for r in batch.readings] - data["timestamp"] = [r.timestamp for r in batch.readings] - data["value"] = [r.value for r in batch.readings] - data["unit"] = [r.unit for r in batch.readings] - - return DataFrame(data) -end +# Use throughout the request flow +log_trace(correlation_id, "Starting smartsend for subject: $subject") +log_trace(correlation_id, "Serialized payload '$dataname' size: $payload_size bytes") +log_trace(correlation_id, "Using direct transport for $payload_size bytes") ``` -#### JavaScript - -```javascript -// src/sensor_data.js -const NATSBridge = require('./src/natsbridge.js'); - -class SensorReading { - constructor(sensorId, value, 
unit, metadata = {}) { - this.sensor_id = sensorId; - this.timestamp = new Date().toISOString(); - this.value = value; - this.unit = unit; - this.metadata = metadata; - } -} - -class SensorBatch { - constructor() { - this.readings = []; - } - - addReading(reading) { - this.readings.push(reading); - } - - toDataFrame() { - return { - sensor_id: this.readings.map(r => r.sensor_id), - timestamp: this.readings.map(r => r.timestamp), - value: this.readings.map(r => r.value), - unit: this.readings.map(r => r.unit) - }; - } -} - -module.exports = { SensorReading, SensorBatch }; +**Log Format**: +``` +[2026-03-13T07:02:50.443Z] [Correlation: abc123...] Starting smartsend for subject: /chat/user/v1/message +[2026-03-13T07:02:50.445Z] [Correlation: abc123...] Serialized payload 'msg' (type: text) size: 11 bytes +[2026-03-13T07:02:50.446Z] [Correlation: abc123...] Using direct transport for 11 bytes ``` -#### Python +### Logging in All Implementations -```python -# src/sensor_data.py -from datetime import datetime -from dataclasses import dataclass, field -from typing import List, Dict, Any +| Platform | Logging Method | +|----------|----------------| +| Julia | `@info` macro | +| Python | `print()` with timestamp | +| JavaScript | `console.log()` | +| MicroPython | `print()` | -@dataclass -class SensorReading: - sensor_id: str - timestamp: str - value: float - unit: str - metadata: Dict[str, Any] = field(default_factory=dict) +--- - @classmethod - def create(cls, sensor_id: str, value: float, unit: str, metadata: Dict[str, Any] = None): - return cls( - sensor_id=sensor_id, - timestamp=datetime.utcnow().isoformat(), - value=value, - unit=unit, - metadata=metadata or {} - ) +## Testing the Flow -class SensorBatch: - def __init__(self): - self.readings: List[SensorReading] = [] - - def add_reading(self, reading: SensorReading): - self.readings.append(reading) - - def to_dataframe(self): - import pandas as pd - return pd.DataFrame({ - 'sensor_id': [r.sensor_id for r in 
self.readings], - 'timestamp': [r.timestamp for r in self.readings], - 'value': [r.value for r in self.readings], - 'unit': [r.unit for r in self.readings] - }) -``` - -### Step 2: Sensor Sender - -#### Julia +### Example: End-to-End Test ```julia -# src/sensor_sender.jl -using NATSBridge, Dates, Random +# Sender side +data = [ + ("msg", "Hello", "text"), + ("img", image_data, "image") +] +env, msg_json = smartsend("/chat/user/v1/message", data) -struct SensorSender - broker_url::String - fileserver_url::String -end +# Receiver side +msg = nats_subscription.next() +env = smartreceive(msg) -function SensorSender(broker_url::String, fileserver_url::String) - SensorSender(broker_url, fileserver_url) -end - -function send_reading(sender::SensorSender, sensor_id::String, value::Float64, unit::String) - reading = SensorReading(sensor_id, value, unit) - - data = [("reading", reading.metadata, "dictionary")] - - # Default: is_publish=True (automatically publishes to NATS) - smartsend( - "/sensors/$sensor_id", - data, - broker_url=sender.broker_url, - fileserver_url=sender.fileserver_url - ) -end - -function send_batch(sender::SensorSender, readings::Vector{SensorReading}) - batch = SensorBatch() - for reading in readings - add_reading(batch, reading) - end - - df = to_dataframe(batch) - - # Convert to Arrow IPC format - import Arrow - table = Arrow.Table(df) - - # Serialize to Arrow IPC - import IOBuffer - buf = IOBuffer() - Arrow.write(buf, table) - - arrow_data = take!(buf) - - # Send based on size (auto-selected by smartsend) - data = [("batch", arrow_data, "arrowtable")] - smartsend( - "/sensors/batch", - data, - broker_url=sender.broker_url, - fileserver_url=sender.fileserver_url - ) +# Verify payloads +for (dataname, data, type_) in env["payloads"] + println("$dataname: $data (type: $type_)") end ``` -#### JavaScript +### Test Scenarios -```javascript -// src/sensor_sender.js -const NATSBridge = require('./src/natsbridge.js'); -const { SensorReading, SensorBatch } = 
require('./sensor_data.js'); +| Scenario | Payloads | Transport | Expected Result | +|----------|----------|-----------|-----------------| +| Single text (small) | `text` | direct | Round-trip successful | +| Single dictionary (small) | `dictionary` | direct | Round-trip successful | +| Single arrow table (small) | `arrowtable` | direct | Arrow IPC round-trip | +| Single image (large) | `image` | link | File server upload/download | +| Mixed payloads | `text` + `image` | direct + link | All payloads preserved | -class SensorSender { - constructor(brokerUrl, fileserverUrl) { - this.broker_url = brokerUrl; - this.fileserver_url = fileserverUrl; - } +--- - async sendReading(sensorId, value, unit) { - const reading = new SensorReading(sensorId, value, unit); - - const data = [["reading", reading.metadata, "dictionary"]]; - - await NATSBridge.smartsend( - `/sensors/${sensorId}`, - data, - { - broker_url: this.broker_url, - fileserver_url: this.fileserver_url - } - ); - } +## Deployment Considerations - async sendBatch(readings) { - const batch = new SensorBatch(); - for (const reading of readings) { - batch.addReading(reading); - } - - const df = batch.toDataFrame(); - - // Convert to Arrow IPC - const arrow = require('apache-arrow'); - const schema = new arrow.Schema([ - new arrow.Field('sensor_id', arrow.string()), - new arrow.Field('timestamp', arrow.string()), - new arrow.Field('value', arrow.float64()), - new arrow.Field('unit', arrow.string()) - ]); - - const arrays = { - sensor_id: new arrow.StringArray(df.sensor_id.map(s => String(s))), - timestamp: new arrow.StringArray(df.timestamp), - value: new arrow.Float64Array(df.value), - unit: new arrow.StringArray(df.unit) - }; - - const recordBatch = arrow.RecordBatch.fromArrays(schema, arrays, df.value.length); - const buffer = arrow.tableFromBatches([recordBatch]).toBuffer(); - const arrow_data = new Uint8Array(buffer); - - // Send based on size (auto-selected by smartsend) - const data = [["batch", arrow_data, 
"arrowtable"]]; - await NATSBridge.smartsend( - "/sensors/batch", - data, - { - broker_url: this.broker_url, - fileserver_url: this.fileserver_url - } - ); - } -} +### Minimum Infrastructure -module.exports = SensorSender; -``` +| Component | Minimum | Notes | +|-----------|---------|-------| +| NATS Server | 1 instance | Single node for development | +| File Server | 1 instance | HTTP server for large payloads | +| Client Memory | 50MB | Desktop platforms | +| Client Memory | 256KB | MicroPython devices | -#### Python +### Environment Variables -```python -# src/sensor_sender.py -from natsbridge import smartsend -from sensor_data import SensorReading, SensorBatch +| Variable | Default | Description | +|----------|---------|-------------| +| `NATS_URL` | `nats://localhost:4222` | NATS server URL | +| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | +| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes | -class SensorSender: - def __init__(self, broker_url: str, fileserver_url: str): - self.broker_url = broker_url - self.fileserver_url = fileserver_url +### Container Deployment - async def send_reading(self, sensor_id: str, value: float, unit: str): - reading = SensorReading.create(sensor_id, value, unit) - - data = [("reading", reading.metadata, "dictionary")] - - await smartsend( - f"/sensors/{sensor_id}", - data, - broker_url=self.broker_url, - fileserver_url=self.fileserver_url - ) - - async def send_batch(self, readings): - batch = SensorBatch() - for reading in readings: - batch.add_reading(reading) - - df = batch.to_dataframe() - - # Convert to Arrow IPC - import pyarrow as arrow - import pyarrow.ipc as ipc - import io - - table = arrow.Table.from_pandas(df) - buf = io.BytesIO() - sink = ipc.new_file(buf, table.schema) - ipc.write_table(table, sink) - sink.close() - arrow_data = buf.getvalue() - - # Send based on size (auto-selected by smartsend) - data = [("batch", arrow_data, "arrowtable")] - await smartsend( - "/sensors/batch", - 
data, - broker_url=self.broker_url, - fileserver_url=self.fileserver_url - ) +```yaml +# docker-compose.yml +version: '3' +services: + nats: + image: nats:latest + ports: + - "4222:4222" + + plik: + image: rootfs/plik:latest + ports: + - "8080:8080" + volumes: + - plik-data:/data + + app: + image: my-app:latest + depends_on: + - nats + - plik ``` --- -## Performance Optimization +## Common Pitfalls -### 1. Batch Processing +### Pitfall 1: Payload Size Threshold -#### Julia +**Issue**: Payloads just above threshold may cause unnecessary file server uploads + +**Solution**: Monitor payload sizes and adjust threshold based on: +- Network latency to file server +- Memory constraints +- File server performance ```julia -# Batch multiple readings into a single message -function send_batch_readings(sender::SensorSender, readings::Vector{Tuple{String, Float64, String}}) - batch = SensorBatch() - - for (sensor_id, value, unit) in readings - reading = SensorReading(sensor_id, value, unit) - add_reading(batch, reading) - end - - df = to_dataframe(batch) - - # Convert to Arrow IPC - import Arrow - table = Arrow.Table(df) - - # Serialize to Arrow IPC - import IOBuffer - buf = IOBuffer() - Arrow.write(buf, table) - - arrow_data = take!(buf) - - # Send as single message - smartsend( - "/sensors/batch", - [("batch", arrow_data, "arrowtable")], - broker_url=sender.broker_url - ) +# Adjust threshold based on use case +env, msg_json = smartsend("/subject", data; size_threshold = 1_000_000) # 1MB +``` + +### Pitfall 2: File Server Availability + +**Issue**: File server down during upload/download + +**Solution**: Implement fallback strategies: +- Fall back to direct transport for uploads +- Use smaller payloads to avoid link transport +- Implement application-level retries + +```julia +# Fallback to direct transport if file upload fails +try + response = fileserver_upload_handler(fileserver_url, dataname, payload_bytes) +catch e + # Fall back to direct transport + payload_b64 = 
Base64.base64encode(payload_bytes) + # Build payload with direct transport end ``` -#### JavaScript +### Pitfall 3: Payload Type Mismatch -```javascript -// Batch multiple readings into a single message -async function sendBatchReadings(sender, readings) { - const batch = new SensorBatch(); - - for (const [sensorId, value, unit] of readings) { - const reading = new SensorReading(sensorId, value, unit); - batch.addReading(reading); - } - - const df = batch.toDataFrame(); - - // Convert to Arrow IPC - const arrow = require('apache-arrow'); - const schema = new arrow.Schema([ - new arrow.Field('sensor_id', arrow.string()), - new arrow.Field('timestamp', arrow.string()), - new arrow.Field('value', arrow.float64()), - new arrow.Field('unit', arrow.string()) - ]); - - const arrays = { - sensor_id: new arrow.StringArray(df.sensor_id), - timestamp: new arrow.StringArray(df.timestamp), - value: new arrow.Float64Array(df.value), - unit: new arrow.StringArray(df.unit) - }; - - const recordBatch = arrow.RecordBatch.fromArrays(schema, arrays, df.value.length); - const buffer = arrow.tableFromBatches([recordBatch]).toBuffer(); - const arrow_data = new Uint8Array(buffer); - - // Send as single message - const data = [["batch", arrow_data, "arrowtable"]]; - await NATSBridge.smartsend( - "/sensors/batch", - data, - { broker_url: sender.broker_url } - ); -} -``` +**Issue**: Receiver deserializes with wrong payload_type -### 2. 
Connection Reuse - -#### Julia +**Solution**: Always validate payload_type matches data: +- Sender and receiver must agree on payload types +- Use consistent payload_type strings across platforms ```julia -# Reuse NATS connections -function create_connection_pool() - connections = Dict{String, NATS.Connection}() - - function get_connection(nats_url::String)::NATS.Connection - if !haskey(connections, nats_url) - connections[nats_url] = NATS.connect(nats_url) - end - return connections[nats_url] - end - - function close_all() - for conn in values(connections) - NATS.drain(conn) - end - empty!(connections) - end - - return (get_connection=get_connection, close_all=close_all) -end -``` +# Sender +smartsend("/subject", [("data", data, "arrowtable")]) -#### Python - -```python -# Reuse NATS connections -import asyncio -import nats - -class ConnectionPool: - def __init__(self): - self.connections = {} - - async def get_connection(self, nats_url: str): - if nats_url not in self.connections: - self.connections[nats_url] = await nats.connect(nats_url) - return self.connections[nats_url] - - async def close_all(self): - for conn in self.connections.values(): - await conn.drain() - self.connections.clear() -``` - -### 3. 
Caching - -#### Julia - -```julia -# Cache file server responses -using Base.Threads - -const file_cache = Dict{String, Vector{UInt8}}() - -function fetch_with_caching(url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String)::Vector{UInt8} - if haskey(file_cache, url) - return file_cache[url] - end - - # Fetch from file server - data = _fetch_with_backoff(url, max_retries, base_delay, max_delay, correlation_id) - - # Cache the result - file_cache[url] = data - - return data -end -``` - -#### Python - -```python -# Cache file server responses -import asyncio -import threading -from natsbridge import fetch_with_backoff - -file_cache = {} -cache_lock = threading.Lock() - -async def fetch_with_caching(url, max_retries, base_delay, max_delay, correlation_id): - with cache_lock: - if url in file_cache: - return file_cache[url] - - # Fetch from file server - data = await fetch_with_backoff(url, max_retries, base_delay, max_delay, correlation_id) - - # Cache the result - with cache_lock: - file_cache[url] = data - - return data +# Receiver (must use same payload_type) +env = smartreceive(msg) +# env["payloads"][1][3] == "arrowtable" ``` --- -## Best Practices +## Performance Considerations -### 1. Error Handling +### Optimization Strategies -#### Julia +| Strategy | Description | When to Use | +|----------|-------------|-------------| +| **Pre-create NATS connection** | Reuse connection for multiple sends | High-throughput scenarios | +| **Batch small payloads** | Combine multiple small payloads | Reduce NATS overhead | +| **Adjust size threshold** | Increase threshold if file server slow | File server bottleneck | +| **Use direct transport** | Avoid file server for small payloads | Low latency requirements | + +### Benchmarking ```julia -function safe_smartsend(subject::String, data::Vector{Tuple}, kwargs...) - try - return smartsend(subject, data; kwargs...) 
- catch error - println("Failed to send message: $(error)") - return nothing - end -end -``` +# Benchmark direct vs link transport +using BenchmarkTools -#### JavaScript +# Direct transport +@btime smartsend("/subject", [("data", rand(1000), "arrowtable")]) -```javascript -async function safeSmartSend(subject, data, options = {}) { - try { - return await NATSBridge.smartsend(subject, data, options); - } catch (error) { - console.error(`Failed to send message: ${error}`); - return null; - } -} -``` - -#### Python - -```python -from typing import List, Tuple, Optional, Union - -async def safe_smartsend( - subject: str, - data: List[Tuple[str, Any, str]], - **kwargs -) -> Optional[Tuple[dict, str]]: - try: - return await smartsend(subject, data, **kwargs) - except Exception as error: - print(f"Failed to send message: {error}") - return None -``` - -### 2. Logging - -#### Julia - -```julia -using Logging - -function log_send(subject::String, data::Vector{Tuple}, correlation_id::String) - @info "Sending to $subject: $(length(data)) payloads, correlation_id=$correlation_id" -end - -function log_receive(correlation_id::String, num_payloads::Int) - @info "Received message: $num_payloads payloads, correlation_id=$correlation_id" -end -``` - -#### Python - -```python -import logging -from typing import List, Tuple, Any - -logger = logging.getLogger(__name__) - -def log_send(subject: str, data: List[Tuple[str, Any, str]], correlation_id: str): - logger.info(f"Sending to {subject}: {len(data)} payloads, correlation_id={correlation_id}") - -def log_receive(correlation_id: str, num_payloads: int): - logger.info(f"Received message: {num_payloads} payloads, correlation_id={correlation_id}") +# Link transport (with file server) +@btime smartsend("/subject", [("data", rand(1_000_000), "arrowtable")]) ``` --- -## Conclusion +## Versioning -This walkthrough covered: +### Current Version -- Building a chat application with rich media support -- Building a file transfer system with 
claim-check pattern -- Building a streaming data pipeline for sensor data +- **Major**: 1 (Breaking changes require major version bump) +- **Minor**: 0 (Feature additions) +- **Patch**: 0 (Bug fixes) -For more information, check the [API documentation](../src/README.md) and [test examples](../test/). +### Version Compatibility + +| Version | Supported Platforms | +|---------|---------------------| +| v1.0.x | Julia 1.7+, Node.js 16+, Python 3.8+, MicroPython 1.19+ | --- -## License +## Change Log -MIT +| Date | Version | Changes | +|------|---------|---------| +| 2026-03-13 | 1.0.0 | Initial walkthrough documentation | + +--- + +## References + +- [`docs/requirements.md`](./requirements.md) - Business requirements and user stories +- [`docs/spec.md`](./spec.md) - Technical specification and contracts +- [`docs/architecture.md`](./architecture.md) - System architecture diagrams +- [`src/NATSBridge.jl`](../src/NATSBridge.jl) - Ground truth implementation +- [`README.md`](../README.md) - Project overview + +--- + +*This walkthrough document is versioned and maintained in git alongside the codebase. 
All implementations must adhere to this documentation.* + + + + +[x] Analyze existing documentation (requirements.md, spec.md, architecture.md) +[x] Read all source files in src/ folder +[x] Write docs/walkthrough.md according to SDD framework \ No newline at end of file From 824468336da570aae5f3c8fe77ab4d4832bdfbd5 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 20:57:08 +0700 Subject: [PATCH 24/29] The Big Picture mermaid --- docs/walkthrough.md | 79 ++++++++++++++++++++++++--------------------- 1 file changed, 43 insertions(+), 36 deletions(-) diff --git a/docs/walkthrough.md b/docs/walkthrough.md index 07a7595..028b9e9 100644 --- a/docs/walkthrough.md +++ b/docs/walkthrough.md @@ -22,42 +22,49 @@ This walkthrough serves as the primary onboarding guide for new developers and e NATSBridge implements the **Claim-Check pattern** for efficient handling of large payloads (>0.5MB): -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ NATSBridge Architecture │ -├─────────────────────────────────────────────────────────────────────┤ -│ │ -│ ┌──────────────┐ ┌──────────────┐ │ -│ │ Sender │ │ Receiver │ │ -│ │ │ │ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │smartsend │◀─────────┤ │smartreceive│ │ │ -│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ -│ │ │ │ │ │ │ │ -│ │ ▼ │ │ ▼ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │Serialize │◀─────────┤ │Deserialize│ │ │ -│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ -│ │ │ │ │ │ │ │ -│ │ ▼ │ │ ▼ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │Transport │◀─────────┤ │Transport │ │ │ -│ │ │Selection │ │ │ │Selection │ │ │ -│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ -│ │ │ │ │ │ │ │ -│ │ ▼ │ │ ▼ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │ NATS │◀─────────┤ │ NATS │ │ │ -│ │ │Publish │ │ │ │Subscribe │ │ │ -│ │ └──────────┘ │ │ └──────────┘ │ │ -│ │ │ │ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │File Server│◀─────────┤ │File Server│ │ │ -│ │ │Upload │ │ │ │Download │ │ │ -│ │ └──────────┘ │ │ 
└──────────┘ │ │ -│ └──────────────┘ └──────────────┘ │ -│ │ -└─────────────────────────────────────────────────────────────────────┘ +```mermaid +flowchart TD + subgraph "Sender Application" + A1[User Code] + A2[smartsend Function] + A3[Serialize Data] + A4[Transport Selection] + A5[Publish to NATS] + A6[File Server Upload] + end + + subgraph "NATS Broker" + B1[NATS Server] + end + + subgraph "Receiver Application" + C1[Subscribe to NATS] + C2[smartreceive Function] + C3[Deserialize Data] + C4[Transport Selection] + C5[Fetch from File Server] + end + + A1 --> A2 + A2 --> A3 + A3 --> A4 + A4 -->|Direct| A5 + A4 -->|Link| A6 + A6 --> A5 + A5 --> B1 + B1 --> C1 + C1 --> C2 + C2 --> C3 + C2 --> C4 + C4 -->|Fetch| C5 + C5 --> C3 + C3 --> C4 + + style A1 fill:#e8f5e9,stroke:#4caf50 + style B1 fill:#fff3e0,stroke:#f57c00 + style C1 fill:#e3f2fd,stroke:#2196f3 + style A6 fill:#fef3c7,stroke:#f59e0b + style C5 fill:#fef3c7,stroke:#f59e0b ``` ### Key Design Principles From 46f024df4c6e772f62b801da68bdf59f43dc7137 Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 21:04:37 +0700 Subject: [PATCH 25/29] update big picture mermaid --- docs/walkthrough.md | 26 +++++++++++--------------- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/docs/walkthrough.md b/docs/walkthrough.md index 028b9e9..589bf39 100644 --- a/docs/walkthrough.md +++ b/docs/walkthrough.md @@ -26,11 +26,11 @@ NATSBridge implements the **Claim-Check pattern** for efficient handling of larg flowchart TD subgraph "Sender Application" A1[User Code] - A2[smartsend Function] - A3[Serialize Data] + A2[smartsend] + A3[_serialize_data] A4[Transport Selection] - A5[Publish to NATS] - A6[File Server Upload] + A5[Build Envelope] + A6[Publish to NATS] end subgraph "NATS Broker" @@ -39,10 +39,10 @@ flowchart TD subgraph "Receiver Application" C1[Subscribe to NATS] - C2[smartreceive Function] - C3[Deserialize Data] - C4[Transport Selection] - C5[Fetch from File Server] + C2[smartreceive] + C3[Parse 
Envelope] + C4[Process Payloads] + C5[Deserialize Data] end A1 --> A2 @@ -50,21 +50,17 @@ flowchart TD A3 --> A4 A4 -->|Direct| A5 A4 -->|Link| A6 - A6 --> A5 - A5 --> B1 + A6 --> B1 B1 --> C1 C1 --> C2 C2 --> C3 - C2 --> C4 - C4 -->|Fetch| C5 - C5 --> C3 C3 --> C4 + C4 --> C5 style A1 fill:#e8f5e9,stroke:#4caf50 style B1 fill:#fff3e0,stroke:#f57c00 style C1 fill:#e3f2fd,stroke:#2196f3 - style A6 fill:#fef3c7,stroke:#f59e0b - style C5 fill:#fef3c7,stroke:#f59e0b + style A4 fill:#fef3c7,stroke:#f59e0b ``` ### Key Design Principles From d888e679c50faae0f7a9dbeabf310634e6cc5ae5 Mon Sep 17 00:00:00 2001 From: narawat Date: Sat, 14 Mar 2026 06:28:06 +0700 Subject: [PATCH 26/29] user walkthrough --- docs/walkthrough.md | 1264 +++++++++++++++++-------------------------- 1 file changed, 508 insertions(+), 756 deletions(-) diff --git a/docs/walkthrough.md b/docs/walkthrough.md index 589bf39..c243035 100644 --- a/docs/walkthrough.md +++ b/docs/walkthrough.md @@ -12,7 +12,7 @@ This document provides the **story of flow** for NATSBridge - the cross-platform bi-directional data bridge that enables seamless communication between **Julia**, **JavaScript**, **Python**, and **MicroPython** applications using NATS as the message bus. 
This walkthrough serves as the primary onboarding guide for new developers and explains: -- **How the system works** - Step-by-step flow of data transmission and reception +- **User scenarios** - Real-world use cases from developer perspective - **Why steps are sequenced** - The rationale behind architectural decisions - **What could go wrong** - Common failure scenarios and recovery strategies @@ -22,45 +22,42 @@ This walkthrough serves as the primary onboarding guide for new developers and e NATSBridge implements the **Claim-Check pattern** for efficient handling of large payloads (>0.5MB): -```mermaid -flowchart TD - subgraph "Sender Application" - A1[User Code] - A2[smartsend] - A3[_serialize_data] - A4[Transport Selection] - A5[Build Envelope] - A6[Publish to NATS] - end - - subgraph "NATS Broker" - B1[NATS Server] - end - - subgraph "Receiver Application" - C1[Subscribe to NATS] - C2[smartreceive] - C3[Parse Envelope] - C4[Process Payloads] - C5[Deserialize Data] - end - - A1 --> A2 - A2 --> A3 - A3 --> A4 - A4 -->|Direct| A5 - A4 -->|Link| A6 - A6 --> B1 - B1 --> C1 - C1 --> C2 - C2 --> C3 - C3 --> C4 - C4 --> C5 - - style A1 fill:#e8f5e9,stroke:#4caf50 - style B1 fill:#fff3e0,stroke:#f57c00 - style C1 fill:#e3f2fd,stroke:#2196f3 - style A4 fill:#fef3c7,stroke:#f59e0b +``` +┌─────────────────────────────────────────────────────────────────────┐ +│ NATSBridge Architecture │ +├─────────────────────────────────────────────────────────────────────┤ +│ │ +│ ┌──────────────┐ ┌──────────────┐ │ +│ │ Sender │ │ Receiver │ │ +│ │ │ │ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │smartsend │◀─────────┤ │smartreceive│ │ │ +│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ +│ │ │ │ │ │ │ │ +│ │ ▼ │ │ ▼ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │Serialize │◀─────────┤ │Deserialize│ │ │ +│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ +│ │ │ │ │ │ │ │ +│ │ ▼ │ │ ▼ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │Transport │◀─────────┤ │Transport │ │ │ +│ │ │Selection │ │ │ │Selection 
│ │ │ +│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ +│ │ │ │ │ │ │ │ +│ │ ▼ │ │ ▼ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │ NATS │◀─────────┤ │ NATS │ │ │ +│ │ │Publish │ │ │ │Subscribe │ │ │ +│ │ └──────────┘ │ │ └──────────┘ │ │ +│ │ │ │ │ │ +│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ +│ │ │File Server│◀─────────┤ │File Server│ │ │ +│ │ │Upload │ │ │ │Download │ │ │ +│ │ └──────────┘ │ │ └──────────┘ │ │ +│ └──────────────┘ └──────────────┘ │ +│ │ +└─────────────────────────────────────────────────────────────────────┘ ``` ### Key Design Principles @@ -74,332 +71,82 @@ flowchart TD --- -## The Sending Flow: `smartsend()` +## User Scenario 1: Chat Webapp ↔ Julia Backend -### Step-by-Step Journey +### Scenario Description -```mermaid -flowchart TD - A[User calls smartsend subject data] --> B[Process each payload] - B --> C{Parse payload tuple} - C --> D[Extract: dataname, data, payload_type] - - D --> E[_serialize_data] - E --> F{payload_type} - - F -->|"text"| G[UTF-8 encode] - F -->|"dictionary"| H[JSON serialize] - F -->|"arrowtable"| I[Arrow IPC serialize] - F -->|"jsontable"| J[JSON serialize] - F -->|"image"| K[Raw bytes] - F -->|"audio"| L[Raw bytes] - F -->|"video"| M[Raw bytes] - F -->|"binary"| N[Raw bytes] - - G --> O[Return bytes] - H --> O - I --> O - J --> O - K --> O - L --> O - M --> O - N --> O - - O --> P[Calculate serialized size] - P --> Q{Size < Threshold?} - - Q -->|Yes| R[Direct Transport] - Q -->|No| S[Link Transport] - - R --> T[Base64 encode] - T --> U[Build payload with direct] - - S --> V[Upload to file server] - V --> W[Get download URL] - W --> U - - U --> X[Build envelope] - X --> Y[Convert to JSON] - Y --> Z[Publish to NATS] - - style A fill:#f9f9f9,stroke:#333 - style Z fill:#e0e7ff,stroke:#3b82f6 - style R fill:#d1fae5,stroke:#10b981 - style S fill:#fef3c7,stroke:#f59e0b -``` +A JavaScript chat webapp wants to send mixed payloads (text message + user avatar image) to a Julia backend, and receive mixed payloads (text response + 
AI-generated image) back. -### Detailed Walkthrough +### Step-by-Step Flow -#### Step 1: User Calls `smartsend()` - -```julia -# Julia -data = [ - ("msg", "Hello World", "text"), - ("img", binary_data, "image") -] -env, msg_json = smartsend("/chat/user/v1/message", data) -``` - -```python -# Python -data = [ - ("msg", "Hello World", "text"), - ("img", binary_data, "image") -] -env, msg_json = await smartsend("/chat/user/v1/message", data) -``` +#### Step 1: JavaScript Webapp Sends Mixed Payloads ```javascript -// JavaScript -const data = [ - ["msg", "Hello World", "text"], - ["img", binaryData, "image"] -]; -const [env, msgJson] = await smartsend("/chat/user/v1/message", data); +// JavaScript (Browser or Node.js) +const [env, msgJson] = await NATSBridge.smartsend( + "/agent/wine/api/v1/prompt", + [ + ["msg", "Hello! I'm Ton.", "text"], + ["avatar", avatarImageData, "image"] + ], + { + broker_url: "ws://localhost:4222", + receiver_name: "agent-backend", + msg_purpose: "chat" + } +); ``` -**What happens**: -- User provides a list of tuples: `(dataname, data, payload_type)` -- `dataname`: Identifier for the payload (e.g., "msg", "login_image") -- `data`: The actual data to send -- `payload_type`: Type string determining serialization method +**Rationale**: +- **Why mixed payloads?** Real chat apps often send both text and images together +- **Why text first?** Text is smaller, sent via direct transport (fast, no file server needed) +- **Why image second?** Images may trigger link transport if >0.5MB -#### Step 2: Serialization (`_serialize_data`) +#### Step 2: Transport Selection -Each payload is serialized based on its type: +For each payload, NATSBridge determines transport: -| Payload Type | Julia | Python | JavaScript | Encoding | -|--------------|-------|--------|------------|----------| -| `text` | UTF-8 bytes | UTF-8 bytes | UTF-8 bytes | Base64 | -| `dictionary` | JSON string | JSON string | JSON string | Base64 | -| `arrowtable` | Arrow IPC | Arrow IPC | 
Arrow IPC | Base64/arrow-ipc | -| `jsontable` | JSON array | JSON array | JSON array | Base64/json | -| `image`/`audio`/`video`/`binary` | Raw bytes | Raw bytes | Raw bytes | Base64 | +| Payload | Size | Transport | Reason | +|---------|------|-----------|--------| +| `"msg"` (text) | ~20 bytes | direct | < 0.5MB threshold | +| `"avatar"` (image) | ~150KB | direct | < 0.5MB threshold | -**Example**: -```julia -# Text serialization -text_bytes = Vector{UInt8}("Hello World") # 11 bytes +**Rationale**: +- Direct transport is faster for small payloads (no file server round-trip) +- Link transport is used when payload ≥ 0.5MB (avoids NATS size limits) -# Dictionary serialization -dict_bytes = Vector{UInt8}("{\"key\":\"value\"}") # 17 bytes +#### Step 3: Serialization and Encoding -# Arrow table serialization -io = IOBuffer() -Arrow.write(io, data_frame) -arrow_bytes = take!(io) # Binary Arrow IPC stream -``` +Each payload is serialized: -#### Step 3: Transport Selection +| Payload | Type | Serialization | Encoding | +|---------|------|---------------|----------| +| `"msg"` | `text` | UTF-8 bytes | Base64 | +| `"avatar"` | `image` | Raw bytes | Base64 | -The serialized size determines the transport method: +**Rationale**: +- Text uses UTF-8 encoding for human-readable data +- Images use raw bytes to preserve binary data integrity +- All payloads encoded as Base64 for JSON compatibility -| Platform | Threshold | Notes | -|----------|-----------|-------| -| Desktop (Julia/JS/Python) | 500,000 bytes (0.5MB) | Default threshold | -| MicroPython | 100,000 bytes (100KB) | Lower threshold for memory constraints | +#### Step 4: Envelope Building -**Decision Logic**: -```julia -if payload_size < size_threshold - # Direct transport: send via NATS -else - # Link transport: upload to file server -end -``` +NATSBridge builds the message envelope: -#### Step 4: Direct Transport Path - -For payloads < threshold: - -1. **Base64 Encode**: Convert binary data to ASCII string -2. 
**Build Payload**: Create `msg_payload_v1` with `transport="direct"` - -```julia -# Encode as Base64 -payload_b64 = Base64.base64encode(payload_bytes) - -# Build payload -payload = msg_payload_v1( - payload_b64, - payload_type; - transport = "direct", - encoding = "base64", - size = payload_size -) -``` - -#### Step 5: Link Transport Path - -For payloads ≥ threshold: - -1. **Upload to File Server**: Use `plik_oneshot_upload()` -2. **Get Download URL**: Server returns URL for the uploaded file -3. **Build Payload**: Create `msg_payload_v1` with `transport="link"` - -```julia -# Upload to Plik server -response = fileserver_upload_handler(fileserver_url, dataname, payload_bytes) - -# Extract URL -url = response["url"] - -# Build payload -payload = msg_payload_v1( - url, - payload_type; - transport = "link", - encoding = "none", - size = payload_size -) -``` - -**File Server Handler Contract**: -```julia -function fileserver_upload_handler( - file_server_url::String, - dataname::String, - data::Vector{UInt8} -)::Dict{String, Any} - # Returns: Dict("status" => 200, "uploadid" => "...", "fileid" => "...", "url" => "...") -end -``` - -#### Step 6: Build Envelope - -All payloads are wrapped in a message envelope: - -```julia -env = msg_envelope_v1( - subject, - payloads; - correlation_id = correlation_id, - msg_id = msg_id, - msg_purpose = msg_purpose, - sender_name = sender_name, - sender_id = sender_id, - receiver_name = receiver_name, - receiver_id = receiver_id, - reply_to = reply_to, - reply_to_msg_id = reply_to_msg_id, - broker_url = broker_url -) -``` - -**Envelope Fields**: -| Field | Purpose | -|-------|---------| -| `correlation_id` | Track message flow across distributed systems | -| `msg_id` | Unique identifier for this message | -| `timestamp` | ISO 8601 UTC timestamp | -| `send_to` | NATS subject to publish to | -| `msg_purpose` | ACK, NACK, updateStatus, shutdown, chat, command, event | -| `sender_name`/`sender_id` | Sender identification | -| 
`receiver_name`/`receiver_id` | Receiver identification (empty = broadcast) | -| `reply_to` | Topic for reply messages | -| `broker_url` | NATS server URL | -| `metadata` | Message-level metadata | -| `payloads` | Array of payload objects | - -#### Step 7: Publish to NATS - -The envelope is converted to JSON and published to NATS: - -```julia -env_json_str = envelope_to_json(env) - -# Publish with existing connection -publish_message(nats_connection, subject, env_json_str, correlation_id) - -# Or publish by creating new connection -publish_message(broker_url, subject, env_json_str, correlation_id) -``` - ---- - -## The Receiving Flow: `smartreceive()` - -### Step-by-Step Journey - -```mermaid -flowchart TD - A[NATS message arrives] --> B[Parse JSON envelope] - B --> C[Extract payloads array] - C --> D{Iterate through payloads} - - D --> E[Get payload transport] - E --> F{transport == direct?} - - F -->|Yes| G[Extract Base64 data] - G --> H[Decode Base64] - H --> I[_deserialize_data] - - F -->|No| J[Extract download URL] - J --> K[Fetch with exponential backoff] - K --> L[_deserialize_data] - - I --> M[Build payload tuple] - L --> M - - M --> N{More payloads?} - N -->|Yes| D - N -->|No| O[Replace payloads array] - O --> P[Return envelope] - - style A fill:#f9f9f9,stroke:#333 - style P fill:#e0e7ff,stroke:#3b82f6 - style G fill:#d1fae5,stroke:#10b981 - style J fill:#fef3c7,stroke:#f59e0b -``` - -### Detailed Walkthrough - -#### Step 1: NATS Message Arrives - -The receiver gets a message from NATS: - -```julia -# Julia -msg = nats_subscription.next() # Get next message -env = smartreceive(msg) -``` - -```python -# Python -msg = await nats_consumer.next() # Get next message -env = await smartreceive(msg) -``` - -```javascript -// JavaScript -const msg = await natsSubscription.next(); -const env = await smartreceive(msg); -``` - -#### Step 2: Parse JSON Envelope - -The message payload is parsed as JSON: - -```julia -env_json_obj = JSON.parse(String(msg.payload)) -``` - 
-**Expected Structure**: ```json { - "correlation_id": "abc123...", - "msg_id": "def456...", - "timestamp": "2026-03-13T07:02:50.443Z", - "send_to": "/chat/user/v1/message", + "correlation_id": "a1b2c3d4...", + "msg_id": "e5f6g7h8...", + "timestamp": "2026-03-13T16:30:00.000Z", + "send_to": "/agent/wine/api/v1/prompt", "msg_purpose": "chat", - "sender_name": "sender-app", + "sender_name": "chat-webapp", "sender_id": "sender-uuid...", - "receiver_name": "receiver-app", - "receiver_id": "receiver-uuid...", - "reply_to": "reply.subject", - "reply_to_msg_id": "msg-id...", - "broker_url": "nats://localhost:4222", + "receiver_name": "agent-backend", + "receiver_id": "", + "reply_to": "/agent/wine/api/v1/response", + "reply_to_msg_id": "", + "broker_url": "ws://localhost:4222", "metadata": {}, "payloads": [ { @@ -408,271 +155,446 @@ env_json_obj = JSON.parse(String(msg.payload)) "payload_type": "text", "transport": "direct", "encoding": "base64", - "size": 11, - "data": "SGVsbG8gV29ybGQ=", - "metadata": {"payload_bytes": 11} + "size": 15, + "data": "SGVsbG8hIEknbSBUb24u", + "metadata": {"payload_bytes": 15} + }, + { + "id": "payload-uuid...", + "dataname": "avatar", + "payload_type": "image", + "transport": "direct", + "encoding": "base64", + "size": 150000, + "data": "iVBORw0KGgoAAAANSUhEUgAA...", + "metadata": {"payload_bytes": 150000} } ] } ``` -#### Step 3: Process Each Payload +**Rationale**: +- **correlation_id**: Tracks this chat session across all systems +- **reply_to**: Tells backend where to send response +- **payloads array**: Contains all data with metadata for proper handling -For each payload in the envelope: +#### Step 5: Publish to NATS -```julia -num_payloads = length(env_json_obj["payloads"]) - -for i in 1:num_payloads - payload = env_json_obj["payloads"][i] - transport = String(payload["transport"]) - dataname = String(payload["dataname"]) - - if transport == "direct" - # Direct transport path - elseif transport == "link" - # Link
transport path - else - error("Unknown transport type: $transport") - end -end +```javascript +await NATSBridge.NATSClient.connect("ws://localhost:4222"); +await NATSBridge.NATSClient.publish("/agent/wine/api/v1/prompt", msgJson); ``` -#### Step 4: Direct Transport Path +**Rationale**: +- NATS provides low-latency message delivery +- JSON format ensures cross-platform compatibility -For payloads with `transport == "direct"`: - -1. **Extract Base64 Data**: Get the Base64-encoded string -2. **Decode Base64**: Convert to binary data -3. **Deserialize**: Convert bytes to native data type +#### Step 6: Julia Backend Receives Message ```julia -# Extract Base64 payload -payload_b64 = String(payload["data"]) +# Julia backend +msg = NATS.subscription.next() # Get message from NATS +env = smartreceive(msg) -# Decode Base64 -payload_bytes = Base64.base64decode(payload_b64) - -# Deserialize based on type -data_type = String(payload["payload_type"]) -data = _deserialize_data(payload_bytes, data_type, env_json_obj["correlation_id"]) +# env["payloads"] is now: +# [ +# ("msg", "Hello! I'm Ton.", "text"), +# ("avatar", binary_data, "image") +# ] ``` -**Deserialization Logic**: -| Payload Type | Deserialization | -|--------------|-----------------| -| `text` | UTF-8 bytes → String | -| `dictionary` | UTF-8 bytes → JSON string → Julia object | -| `arrowtable` | UTF-8 bytes → Arrow IPC → DataFrame | -| `jsontable` | UTF-8 bytes → JSON string → Vector{Dict} → DataFrame | -| `image`/`audio`/`video`/`binary` | Bytes directly | +**Rationale**: +- `smartreceive()` handles both transport types automatically +- Deserialization is type-aware based on `payload_type` +- Returns consistent tuple format regardless of transport -#### Step 5: Link Transport Path - -For payloads with `transport == "link"`: - -1. **Extract URL**: Get the download URL from payload -2. **Fetch with Backoff**: Download data with retry logic -3. 
**Deserialize**: Convert bytes to native data type +#### Step 7: Julia Backend Sends Response ```julia -# Extract download URL -url = String(payload["data"]) +# Julia backend processes the message +response_text = "Hello Ton! I'm the AI assistant." +generated_image = generate_ai_image(response_text) -# Fetch with exponential backoff -downloaded_data = fileserver_download_handler( - url, - max_retries, - base_delay, - max_delay, - env_json_obj["correlation_id"] +env, msg_json = smartsend( + "/agent/wine/api/v1/response", + [ + ("response", response_text, "text"), + ("generated_image", generated_image, "image") + ], + reply_to = "/chat/user/v1/message", + reply_to_msg_id = env["msg_id"] ) - -# Deserialize based on type -data_type = String(payload["payload_type"]) -data = _deserialize_data(downloaded_data, data_type, env_json_obj["correlation_id"]) ``` -**Download Handler Contract**: -```julia -function fileserver_download_handler( - url::String, - max_retries::Int, - base_delay::Int, - max_delay::Int, - correlation_id::String -)::Vector{UInt8} - # Returns: Vector{UInt8} (downloaded bytes) -end -``` - -#### Step 6: Build Payload List - -Each processed payload is added to the result list: - -```julia -payloads_list = Tuple{String, Any, String}[] - -# After processing each payload -push!(payloads_list, (dataname, data, data_type)) -``` - -**Result Format**: -```julia -[ - ("msg", "Hello World", "text"), - ("img", binary_data, "image") -] -``` - -#### Step 7: Return Envelope - -The envelope is updated with the processed payloads and returned: - -```julia -env_json_obj["payloads"] = payloads_list -return env_json_obj -``` +**Rationale**: +- **Mixed response**: Text explanation + AI-generated image +- **reply_to**: Ensures response goes to correct topic +- **reply_to_msg_id**: Links response to original message for tracing --- -## File Server Integration +## User Scenario 2: Large File Transfer -### Plik One-Shot Upload +### Scenario Description -NATSBridge uses **Plik**
as the default HTTP file server for link transport: +A JavaScript webapp wants to upload a large file (10MB) to a Julia backend for processing. -```julia -# Upload handler -function plik_oneshot_upload( - file_server_url::String, - dataname::String, - data::Vector{UInt8} -)::Dict{String, Any} +### Step-by-Step Flow + +#### Step 1: JavaScript Webapp Sends Large File + +```javascript +const [env, msgJson] = await NATSBridge.smartsend( + "/agent/wine/api/v1/process", + [ + ["file", largeFileData, "binary"] + ], + { + broker_url: "ws://localhost:4222", + receiver_name: "agent-backend" + } +); ``` -**Upload Flow**: -1. **Create One-Shot Session**: POST `/upload` with `{"OneShot": true}` -2. **Get Upload ID**: Server returns `uploadid` and `uploadtoken` -3. **Upload File**: POST `/file/{uploadid}` with multipart form data -4. **Get File ID**: Server returns `fileid` -5. **Return URL**: Construct download URL +#### Step 2: Transport Selection (Link) -```julia -# Step 1: Create one-shot session -POST /upload -Headers: Content-Type: application/json -Body: {"OneShot": true} +| Payload | Size | Transport | Reason | +|---------|------|-----------|--------| +| `"file"` | 10MB | link | ≥ 0.5MB threshold | -Response: +**Rationale**: +- Link transport used for large payloads +- File server handles large file upload +- NATS only sends URL (small message) + +#### Step 3: File Server Upload + +```javascript +// NATSBridge internally calls: +const response = await plikOneshotUpload( + "http://localhost:8080", + "file", + largeFileData +); + +// Response: +// { +// status: 200, +// uploadid: "UPLOAD_ID", +// fileid: "FILE_ID", +// url: "http://localhost:8080/file/UPLOAD_ID/FILE_ID/file" +// } +``` + +**Rationale**: +- Plik handles multipart upload +- One-shot mode simplifies API +- Returns URL for download + +#### Step 4: Envelope with Link Transport + +```json { - "id": "UPLOAD_ID", - "uploadToken": "UPLOAD_TOKEN", - "status": 200 + "correlation_id": "a1b2c3d4...", + "payloads": [ + 
{ + "id": "payload-uuid...", + "dataname": "file", + "payload_type": "binary", + "transport": "link", + "encoding": "none", + "size": 10000000, + "data": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/file", + "metadata": {} + } + ] } - -# Step 2: Upload file -POST /file/UPLOAD_ID -Headers: X-UploadToken: UPLOAD_TOKEN -Body: multipart/form-data (file) - -Response: -{ - "id": "FILE_ID", - "status": 200 -} - -# Final URL: http://localhost:8080/file/UPLOAD_ID/FILE_ID/filename.ext ``` -### Exponential Backoff for Downloads +**Rationale**: +- `data` field contains URL instead of Base64 +- `transport: "link"` signals URL-based download +- `encoding: "none"` indicates no additional encoding -File downloads use exponential backoff for resilience: +#### Step 5: Julia Backend Receives and Downloads ```julia -function _fetch_with_backoff( - url::String, - max_retries::Int, - base_delay::Int, - max_delay::Int, - correlation_id::String -)::Vector{UInt8} +# Julia backend +msg = NATS.subscription.next() +env = smartreceive(msg) + +# NATSBridge automatically: +# 1. Extracts URL from payload +# 2. Downloads with exponential backoff +# 3. 
Deserializes to binary data ``` -**Retry Policy**: -- Initial delay: `base_delay` milliseconds (default: 100ms) -- Multiplier: 2x per retry -- Maximum delay: `max_delay` milliseconds (default: 5000ms) -- Maximum retries: `max_retries` (default: 5) - -**Delay Calculation**: -```julia -delay = base_delay # Start with 100ms - -for attempt in 1:max_retries - try - # Try to fetch - response = HTTP.request("GET", url) - if response.status == 200 - return response.body - end - catch e - if attempt < max_retries - sleep(delay / 1000.0) # Sleep before retry - delay = min(delay * 2, max_delay) # Double delay, cap at max - end - end -end - -error("Failed after $max_retries attempts") -``` - -**Example Delays**: -| Attempt | Delay | -|---------|-------| -| 1 | 100ms | -| 2 | 200ms | -| 3 | 400ms | -| 4 | 800ms | -| 5 | 1600ms (capped at 5000ms) | +**Rationale**: +- Exponential backoff handles transient failures +- Automatic download simplifies receiver code +- Binary data returned directly --- -## Cross-Platform Compatibility +## User Scenario 3: Tabular Data Exchange -### Platform-Specific Implementations +### Scenario Description -| Platform | File | Key Features | -|----------|------|--------------| -| Julia | `src/NATSBridge.jl` | Multiple dispatch, Arrow.jl support | -| Python | `src/natsbridge.py` | Async/await, pyarrow support | -| Node.js | `src/natsbridge_ssr.js` | Buffer, nats.js | -| Browser | `src/natsbridge_csr.js` | Uint8Array, nats.ws, Web Crypto | -| MicroPython | `src/natsbridge_mpy.py` | Synchronous, limited payload types | +A Python application sends tabular data (pandas DataFrame) to a Julia backend for analysis, and receives processed results back. 
-### API Parity +### Step-by-Step Flow -All platforms implement the same core API: +#### Step 1: Python Sends Tabular Data -| Function | Julia | Python | JavaScript | MicroPython | -|----------|-------|--------|------------|-------------| -| `smartsend()` | ✅ | ✅ | ✅ | ✅ | -| `smartreceive()` | ✅ | ✅ | ✅ | ✅ | -| `plik_oneshot_upload()` | ✅ | ✅ | ✅ | ⚠️ (placeholder) | -| `fetch_with_backoff()` | ✅ | ✅ | ✅ | ⚠️ (placeholder) | +```python +# Python +import pandas as pd +from natsbridge import smartsend -### Payload Type Support by Platform +df = pd.DataFrame({ + "id": [1, 2, 3], + "name": ["Alice", "Bob", "Charlie"], + "score": [95, 88, 92] +}) -| Type | Julia | Python | Node.js | Browser | MicroPython | -|------|-------|--------|---------|---------|-------------| -| `text` | ✅ | ✅ | ✅ | ✅ | ✅ | -| `dictionary` | ✅ | ✅ | ✅ | ✅ | ✅ | -| `arrowtable` | ✅ | ✅ | ✅ | ✅ | ❌ | -| `jsontable` | ✅ | ✅ | ✅ | ✅ | ⚠️ | -| `image` | ✅ | ✅ | ✅ | ✅ | ✅ | -| `audio` | ✅ | ✅ | ✅ | ✅ | ✅ | -| `video` | ✅ | ✅ | ✅ | ✅ | ✅ | -| `binary` | ✅ | ✅ | ✅ | ✅ | ✅ | +env, msg_json = await smartsend( + "/agent/wine/api/v1/analyze", + [("data", df, "arrowtable")], + broker_url="nats://localhost:4222", + receiver_name="agent-backend" +) +``` + +**Rationale**: +- `arrowtable` type for efficient tabular data transfer +- Arrow IPC format preserves data types +- Much faster than JSON serialization + +#### Step 2: Serialization to Arrow IPC + +```python +# NATSBridge internally: +import io +import pyarrow as pa + +table = pa.Table.from_pandas(df) +buf = io.BytesIO() +with pa.ipc.new_file(buf, table.schema) as sink: +    sink.write_table(table) +arrow_bytes = buf.getvalue() +``` + +**Rationale**: +- Arrow IPC preserves column types +- Binary format is compact +- No schema information loss + +#### Step 3: Julia Receives and Deserializes + +```julia +# Julia backend +msg = NATS.subscription.next() +env = smartreceive(msg) + +# env["payloads"][1] is now: +# ("data", DataFrame with id, name, score
columns, "arrowtable") +``` + +**Rationale**: +- Arrow.jl reads IPC format directly +- DataFrame returned with correct types +- No manual parsing needed + +#### Step 4: Julia Sends Results + +```julia +# Julia backend +results = analyze_data(env["payloads"][1][2]) + +# Send results back +env, msg_json = smartsend( + "/agent/wine/api/v1/results", + [("results", results, "arrowtable")], + reply_to = "/python/worker/v1/results" +) +``` + +**Rationale**: +- Arrow IPC format for efficient round-trip +- Results preserve DataFrame structure +- Python can deserialize to pandas DataFrame + +--- + +## User Scenario 4: MicroPython Device + +### Scenario Description + +A MicroPython sensor device sends sensor readings to a Python backend. + +### Step-by-Step Flow + +#### Step 1: MicroPython Sends Sensor Data + +```python +# MicroPython +from natsbridge import smartsend + +sensor_data = { + "temperature": 25.5, + "humidity": 60.0, + "pressure": 1013.25 +} + +env, msg_json = smartsend( + "/sensor/device/v1/readings", + [("data", sensor_data, "dictionary")], + broker_url="nats://localhost:4222", + size_threshold=100000 # 100KB for MicroPython +) +``` + +**Rationale**: +- `dictionary` type for JSON-serializable sensor data +- Smaller threshold (100KB) for memory constraints +- Direct transport only (no file server support) + +#### Step 2: Serialization + +```python +# NATSBridge internally: +json_str = json.dumps(sensor_data) +json_bytes = json_str.encode('utf-8') +payload_b64 = base64.b64encode(json_bytes).decode('ascii') +``` + +**Rationale**: +- JSON format for human-readable data +- Base64 for NATS compatibility +- UTF-8 for text encoding + +#### Step 3: Python Backend Receives + +```python +# Python backend +msg = await nats_consumer.next() +env = await smartreceive(msg) + +# env["payloads"][0] is now: +# ("data", {"temperature": 25.5, "humidity": 60.0, ...}, "dictionary") +``` + +**Rationale**: +- JSON deserialization +- Dictionary returned directly +- No Arrow support 
(memory constraints) + +--- + +## User Scenario 5: Cross-Platform Chat with Mixed Payloads + +### Scenario Description + +Multiple platforms (JavaScript, Python, Julia) communicate in a chat application with mixed payload types. + +### Step-by-Step Flow + +#### Step 1: JavaScript Sends Chat Message + +```javascript +// JavaScript (Frontend) +const [env, msgJson] = await NATSBridge.smartsend( + "/chat/user/v1/message", + [ + ["text", "Check this out!", "text"], + ["image", imageData, "image"] + ], + { + broker_url: "ws://localhost:4222", + receiver_name: "", + msg_purpose: "chat" + } +); +``` + +**Rationale**: +- Empty `receiver_name` = broadcast to all subscribers +- Chat messages often include text + images +- NATS wildcard subscriptions route to correct recipients + +#### Step 2: Python Backend Receives + +```python +# Python (Backend) +msg = await nats_consumer.next() +env = await smartreceive(msg) + +# env["payloads"] is now: +# [ +# ("text", "Check this out!", "text"), +# ("image", binary_data, "image") +# ] +``` + +**Rationale**: +- Consistent API across platforms +- Same payload structure regardless of sender +- Type information preserved + +#### Step 3: Julia Backend Receives + +```julia +# Julia (Backend) +msg = NATS.subscription.next() +env = smartreceive(msg) + +# env["payloads"] is now: +# [ +# ("text", "Check this out!", "text"), +# ("image", binary_data, "image") +# ] +``` + +**Rationale**: +- Cross-platform API parity +- Same function signature across platforms +- Type information enables proper deserialization + +#### Step 4: All Platforms Reply + +Each platform can reply using the same API: + +```python +# Python reply +await smartsend( + "/chat/user/v1/reply", + [("response", "Nice!", "text")], + reply_to="/chat/user/v1/message" +) +``` + +```julia +# Julia reply +smartsend( + "/chat/user/v1/reply", + [("response", "Nice!", "text")], + reply_to="/chat/user/v1/message" +) +``` + +```javascript +// JavaScript reply +await NATSBridge.smartsend( + 
"/chat/user/v1/reply", + [["response", "Nice!", "text"]], + { reply_to: "/chat/user/v1/message" } +); +``` + +**Rationale**: +- Same API across platforms +- Consistent behavior +- Easy to maintain parity --- @@ -680,120 +602,72 @@ All platforms implement the same core API: ### Common Error Scenarios -| Scenario | Error Code | Recovery | -|----------|------------|----------| -| **Unknown payload_type** | `INVALID_PAYLOAD_TYPE` | Use supported payload_type | -| **Failed to upload** | `UPLOAD_FAILED` | Retry or use direct transport | -| **Failed to fetch** | `DOWNLOAD_FAILED` | Retry with exponential backoff | -| **Unknown transport** | `INVALID_TRANSPORT` | Check payload transport field | -| **NATS connection failed** | `NATS_CONNECTION_FAILED` | Check NATS server availability | -| **Deserialization error** | `DESERIALIZATION_ERROR` | Validate payload_type matches data | +| Scenario | Error | Recovery | +|----------|-------|----------| +| File server unavailable | `UPLOAD_FAILED` | Fall back to direct transport or smaller payloads | +| File server download fails | `DOWNLOAD_FAILED` | Retry with exponential backoff | +| Payload type mismatch | `DESERIALIZATION_ERROR` | Validate payload_type matches data | +| NATS connection lost | `NATS_CONNECTION_FAILED` | NATS client auto-reconnects | ### Error Response Format ```json { "correlation_id": "abc123...", - "msg_id": "def456...", - "timestamp": "2026-03-13T07:02:50.443Z", - "send_to": "/chat/user/v1/message", "error": { "code": "DOWNLOAD_FAILED", "message": "Failed to fetch data after 5 attempts", "details": { - "url": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/filename.ext", + "url": "http://localhost:8080/file/...", "correlation_id": "abc123..." 
} } } ``` -### Exception Handling Examples - -```julia -# File server unavailable -try - env, msg_json = smartsend("/subject", data) -catch e - # Retry with direct transport or use smaller payloads -end - -# Deserialization error -try - env = smartreceive(msg) -catch e - # Log correlation_id and inspect payload structure - @error "Deserialization failed" exception=(e, env.correlation_id) -end -``` - --- ## Debugging and Tracing ### Correlation ID Tracking -Every message includes a `correlation_id` for distributed tracing: +Every message includes a `correlation_id`: ```julia -# Generate correlation ID at start of request +# At start of request correlation_id = string(uuid4()) -# Use throughout the request flow -log_trace(correlation_id, "Starting smartsend for subject: $subject") -log_trace(correlation_id, "Serialized payload '$dataname' size: $payload_size bytes") -log_trace(correlation_id, "Using direct transport for $payload_size bytes") +# Use throughout the flow +log_trace(correlation_id, "Starting smartsend") +log_trace(correlation_id, "Serialized payload size: 100 bytes") +log_trace(correlation_id, "Published to NATS") ``` **Log Format**: ``` -[2026-03-13T07:02:50.443Z] [Correlation: abc123...] Starting smartsend for subject: /chat/user/v1/message -[2026-03-13T07:02:50.445Z] [Correlation: abc123...] Serialized payload 'msg' (type: text) size: 11 bytes -[2026-03-13T07:02:50.446Z] [Correlation: abc123...] Using direct transport for 11 bytes +[2026-03-13T16:30:00.000Z] [Correlation: abc123...] Starting smartsend +[2026-03-13T16:30:00.001Z] [Correlation: abc123...] Serialized payload size: 100 bytes +[2026-03-13T16:30:00.002Z] [Correlation: abc123...] 
Published to NATS ``` -### Logging in All Implementations - -| Platform | Logging Method | -|----------|----------------| -| Julia | `@info` macro | -| Python | `print()` with timestamp | -| JavaScript | `console.log()` | -| MicroPython | `print()` | - --- -## Testing the Flow +## Performance Considerations -### Example: End-to-End Test +### Optimization Strategies -```julia -# Sender side -data = [ - ("msg", "Hello", "text"), - ("img", image_data, "image") -] -env, msg_json = smartsend("/chat/user/v1/message", data) +| Strategy | Description | When to Use | +|----------|-------------|-------------| +| Pre-create NATS connection | Reuse connection for multiple sends | High-throughput scenarios | +| Adjust size threshold | Increase threshold if file server slow | File server bottleneck | +| Use direct transport | Avoid file server for small payloads | Low latency requirements | -# Receiver side -msg = nats_subscription.next() -env = smartreceive(msg) +### Size Threshold by Platform -# Verify payloads -for (dataname, data, type_) in env["payloads"] - println("$dataname: $data (type: $type_)") -end -``` - -### Test Scenarios - -| Scenario | Payloads | Transport | Expected Result | -|----------|----------|-----------|-----------------| -| Single text (small) | `text` | direct | Round-trip successful | -| Single dictionary (small) | `dictionary` | direct | Round-trip successful | -| Single arrow table (small) | `arrowtable` | direct | Arrow IPC round-trip | -| Single image (large) | `image` | link | File server upload/download | -| Mixed payloads | `text` + `image` | direct + link | All payloads preserved | +| Platform | Threshold | Notes | +|----------|-----------|-------| +| Desktop (Julia/JS/Python) | 500,000 bytes (0.5MB) | Default threshold | +| MicroPython | 100,000 bytes (100KB) | Lower threshold for memory constraints | --- @@ -816,128 +690,6 @@ end | `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | | `SIZE_THRESHOLD` | `1000000` | Size 
threshold in bytes | -### Container Deployment - -```yaml -# docker-compose.yml -version: '3' -services: - nats: - image: nats:latest - ports: - - "4222:4222" - - plik: - image: rootfs/plik:latest - ports: - - "8080:8080" - volumes: - - plik-data:/data - - app: - image: my-app:latest - depends_on: - - nats - - plik -``` - ---- - -## Common Pitfalls - -### Pitfall 1: Payload Size Threshold - -**Issue**: Payloads just above threshold may cause unnecessary file server uploads - -**Solution**: Monitor payload sizes and adjust threshold based on: -- Network latency to file server -- Memory constraints -- File server performance - -```julia -# Adjust threshold based on use case -env, msg_json = smartsend("/subject", data; size_threshold = 1_000_000) # 1MB -``` - -### Pitfall 2: File Server Availability - -**Issue**: File server down during upload/download - -**Solution**: Implement fallback strategies: -- Fall back to direct transport for uploads -- Use smaller payloads to avoid link transport -- Implement application-level retries - -```julia -# Fallback to direct transport if file upload fails -try - response = fileserver_upload_handler(fileserver_url, dataname, payload_bytes) -catch e - # Fall back to direct transport - payload_b64 = Base64.base64encode(payload_bytes) - # Build payload with direct transport -end -``` - -### Pitfall 3: Payload Type Mismatch - -**Issue**: Receiver deserializes with wrong payload_type - -**Solution**: Always validate payload_type matches data: -- Sender and receiver must agree on payload types -- Use consistent payload_type strings across platforms - -```julia -# Sender -smartsend("/subject", [("data", data, "arrowtable")]) - -# Receiver (must use same payload_type) -env = smartreceive(msg) -# env["payloads"][1][3] == "arrowtable" -``` - ---- - -## Performance Considerations - -### Optimization Strategies - -| Strategy | Description | When to Use | -|----------|-------------|-------------| -| **Pre-create NATS connection** | Reuse 
connection for multiple sends | High-throughput scenarios | -| **Batch small payloads** | Combine multiple small payloads | Reduce NATS overhead | -| **Adjust size threshold** | Increase threshold if file server slow | File server bottleneck | -| **Use direct transport** | Avoid file server for small payloads | Low latency requirements | - -### Benchmarking - -```julia -# Benchmark direct vs link transport -using BenchmarkTools - -# Direct transport -@btime smartsend("/subject", [("data", rand(1000), "arrowtable")]) - -# Link transport (with file server) -@btime smartsend("/subject", [("data", rand(1_000_000), "arrowtable")]) -``` - ---- - -## Versioning - -### Current Version - -- **Major**: 1 (Breaking changes require major version bump) -- **Minor**: 0 (Feature additions) -- **Patch**: 0 (Bug fixes) - -### Version Compatibility - -| Version | Supported Platforms | -|---------|---------------------| -| v1.0.x | Julia 1.7+, Node.js 16+, Python 3.8+, MicroPython 1.19+ | - --- ## Change Log @@ -965,4 +717,4 @@ using BenchmarkTools [x] Analyze existing documentation (requirements.md, spec.md, architecture.md) [x] Read all source files in src/ folder -[x] Write docs/walkthrough.md according to SDD framework \ No newline at end of file +[x] Write docs/walkthrough.md according to SDD framework with user scenarios \ No newline at end of file From a1971b737a090ed234db6907aec7e84e1553890d Mon Sep 17 00:00:00 2001 From: narawat Date: Sat, 14 Mar 2026 07:43:22 +0700 Subject: [PATCH 27/29] big picture mermaid --- docs/walkthrough.md | 89 +++++++++++++++++++++++++++------------------ 1 file changed, 53 insertions(+), 36 deletions(-) diff --git a/docs/walkthrough.md b/docs/walkthrough.md index c243035..0e9e022 100644 --- a/docs/walkthrough.md +++ b/docs/walkthrough.md @@ -22,43 +22,60 @@ This walkthrough serves as the primary onboarding guide for new developers and e NATSBridge implements the **Claim-Check pattern** for efficient handling of large payloads (>0.5MB): +```mermaid 
+flowchart TB + subgraph NATSBridge["NATSBridge Module"] + direction TB + + subgraph Sender["Sender (smartsend)"] + direction LR + S1["Data Tuples
[(dataname, data, type)]"] + S2["Serialize Data"] + S3["Size Check"] + S4["Transport Selection"] + S5["Build Envelope"] + S6["Publish to NATS"] + + S1 --> S2 + S2 --> S3 + S3 --> S4 + S4 --> S5 + S5 --> S6 + end + + subgraph Receiver["Receiver (smartreceive)"] + direction LR + R1["Subscribe to NATS"] + R2["Parse Envelope"] + R3["Check Transport"] + R4["Deserialize Data"] + R5["Return Payloads"] + + R1 --> R2 + R2 --> R3 + R3 --> R4 + R4 --> R5 + end + + S6 -.->|Message| R1 + end + + subgraph FileServer["HTTP File Server (Plik)"] + direction TB + FS1["Upload URL"] + FS2["Download URL"] + + S4 -.->|Large Payload| FS1 + R3 -.->|Fetch URL| FS2 + end + + style NATSBridge fill:#e1f5fe,stroke:#0288d1,stroke-width:2px + style Sender fill:#b3e5fc,stroke:#0288d1 + style Receiver fill:#b3e5fc,stroke:#0288d1 + style FileServer fill:#ffe0b2,stroke:#f57c00 ``` -┌─────────────────────────────────────────────────────────────────────┐ -│ NATSBridge Architecture │ -├─────────────────────────────────────────────────────────────────────┤ -│ │ -│ ┌──────────────┐ ┌──────────────┐ │ -│ │ Sender │ │ Receiver │ │ -│ │ │ │ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │smartsend │◀─────────┤ │smartreceive│ │ │ -│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ -│ │ │ │ │ │ │ │ -│ │ ▼ │ │ ▼ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │Serialize │◀─────────┤ │Deserialize│ │ │ -│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ -│ │ │ │ │ │ │ │ -│ │ ▼ │ │ ▼ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │Transport │◀─────────┤ │Transport │ │ │ -│ │ │Selection │ │ │ │Selection │ │ │ -│ │ └────┬─────┘ │ │ └────┬─────┘ │ │ -│ │ │ │ │ │ │ │ -│ │ ▼ │ │ ▼ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │ NATS │◀─────────┤ │ NATS │ │ │ -│ │ │Publish │ │ │ │Subscribe │ │ │ -│ │ └──────────┘ │ │ └──────────┘ │ │ -│ │ │ │ │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │File Server│◀─────────┤ │File Server│ │ │ -│ │ │Upload │ │ │ │Download │ │ │ -│ │ └──────────┘ │ │ └──────────┘ │ │ -│ └──────────────┘ └──────────────┘ 
│ -│ │ -└─────────────────────────────────────────────────────────────────────┘ -``` + +### Key Design Principles ### Key Design Principles From bc670a2af409a74337b8e39ed40fa97c8ab0115f Mon Sep 17 00:00:00 2001 From: narawat Date: Sat, 14 Mar 2026 07:50:00 +0700 Subject: [PATCH 28/29] new mermaid update --- docs/walkthrough.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/walkthrough.md b/docs/walkthrough.md index 0e9e022..92fc129 100644 --- a/docs/walkthrough.md +++ b/docs/walkthrough.md @@ -66,6 +66,7 @@ flowchart TB FS2["Download URL"] S4 -.->|Large Payload| FS1 + FS1 -.->|URL| S5 R3 -.->|Fetch URL| FS2 end From d32f64dbc0641f092f5f6680d7c91519d0606766 Mon Sep 17 00:00:00 2001 From: narawat Date: Sat, 14 Mar 2026 07:52:15 +0700 Subject: [PATCH 29/29] update version --- Project.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Project.toml b/Project.toml index d2d61e8..85f16c3 100644 --- a/Project.toml +++ b/Project.toml @@ -1,6 +1,6 @@ name = "NATSBridge" uuid = "f2724d33-f338-4a57-b9f8-1be882570d10" -version = "0.5.4" +version = "0.5.5" authors = ["narawat "] [deps]