diff --git a/docs/architecture.md b/docs/architecture.md deleted file mode 100644 index 1259106..0000000 --- a/docs/architecture.md +++ /dev/null @@ -1,942 +0,0 @@ -# Architecture Documentation: msghandler - -**Version**: 1.4.0 -**Date**: 2026-05-14 -**Status**: Active -**Ground Truth**: [`src/msghandler.jl`](../src/msghandler.jl) -**Architecture Level**: C4 Container Level - ---- - -## 1. Executive Summary - -This document defines the **blueprint** for msghandler - the cross-platform bi-directional data bridge that enables seamless communication between **Julia**, **JavaScript**, **Python**, **Dart**, **Rust**, and **MicroPython** applications using a message broker as the transport layer. - -This architecture document serves as the single source of truth for: -- **System Structure**: How components fit together and interact -- **Scaling Considerations**: How the system scales horizontally and vertically -- **Failure Modes**: How the system handles failures and recovers -- **Trade-off Decisions**: The rationale behind architectural decisions - -### 1.1 Specification Traceability - -| Architecture Section | Specification Reference | UI Specification Reference | Requirement ID(s) | -|---------------------|-------------------------|---------------------------|-------------------| -| Section 2 (Context Diagram) | specification.md:2 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 | -| Section 3 (Container Diagram) | specification.md:2, specification.md:3, specification.md:11 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 | -| Section 4 (Component Diagram) | specification.md:2, specification.md:3, specification.md:5, specification.md:11 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 | -| Section 5 (High-Level) | specification.md:2, specification.md:3, specification.md:5, specification.md:11 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, 
FR-013, FR-014 | -| Section 6 (Message Envelope) | specification.md:2, specification.md:3, specification.md:8 | - | FR-011, FR-012, FR-013, FR-014, NFR-401, NFR-403 | -| Section 7 (Payload Type) | specification.md:3, specification.md:5, specification.md:6 | - | FR-001, FR-002, FR-003, FR-006, FR-012, NFR-101, NFR-102 | -| Section 8 (Transport Strategy) | specification.md:6, specification.md:7 | - | FR-003, FR-004, FR-005, FR-010, NFR-104, NFR-105, NFR-106 | -| Section 9 (Platform-Specific) | specification.md:13, specification.md:14 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 | -| Section 10 (Scaling) | specification.md:7, specification.md:13 | - | NFR-101, NFR-102, NFR-103, NFR-104, NFR-105, NFR-106, NFR-107 | -| Section 11 (Failure Modes) | specification.md:9, specification.md:11 | - | FR-008, FR-009, FR-010, FR-011, NFR-201, NFR-202, NFR-203 | -| Section 12 (Trade-offs) | specification.md:2, specification.md:3, specification.md:6, specification.md:7 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-008, FR-009, FR-010, FR-011, FR-012, FR-013, FR-014 | -| Section 13 (Deployment) | specification.md:12, specification.md:18 | - | FR-013, FR-014, NFR-201, NFR-203 | -| Section 14 (Security) | specification.md:4, specification.md:9, specification.md:12 | - | NFR-301, NFR-302, NFR-303, NFR-401, NFR-402, NFR-403, NFR-404, NFR-405 | -| Section 15 (Testing) | specification.md:17 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 | - ---- - -## 2. Architecture Overview - -## Architecture Overview - -### C4 Context Diagram - -```mermaid -flowchart TD - subgraph "External Systems" - Message_Broker[Message Broker<br>NATS/MQTT/WebSocket/Custom] - File_Server[HTTP File Server<br>Plik/AWS S3/Custom] - end - - Julia_App[Julia Application] - JS_App[JavaScript Application<br>Node.js/Browser] - Python_App[Python Application<br>Desktop] - Dart_App[Dart Application<br>Desktop/Flutter/Web] - Rust_App[Rust Application<br>Server/Desktop] - MicroPython_App[MicroPython Device] - end - - Julia_App -->|Transport| Message_Broker - JS_App -->|Transport| Message_Broker - Python_App -->|Transport| Message_Broker - Dart_App -->|Transport| Message_Broker - Rust_App -->|Transport| Message_Broker - MicroPython_App -->|Transport| Message_Broker - - Julia_App -->|HTTP| File_Server - JS_App -->|HTTP| File_Server - Python_App -->|HTTP| File_Server - Dart_App -->|HTTP| File_Server - Rust_App -->|HTTP| File_Server - MicroPython_App -->|HTTP| File_Server - - style Message_Broker fill:#fff3e0,stroke:#f57c00 - style File_Server fill:#f3e5f5,stroke:#9c27b4 - style Julia_App fill:#e8f5e9,stroke:#4caf50 - style JS_App fill:#e3f2fd,stroke:#2196f3 - style Python_App fill:#e3f2fd,stroke:#2196f3 - style Dart_App fill:#fff0f6,stroke:#e91e63 - style Rust_App fill:#dea584,stroke:#e65100 - style MicroPython_App fill:#fce4ec,stroke:#e91e63 -``` - -### C4 Container Diagram - -```mermaid -flowchart TD - subgraph "Client Container" - Julia_Module[Julia msghandler Module] - JS_Module[JavaScript msghandler Module] - Python_Module[Python msghandler Module] - Dart_Module[Dart msghandler Module] - Rust_Module[Rust msghandler Module] - MicroPython_Module[MicroPython msghandler Module] - end - - Julia_Module --> Transport_Client - JS_Module --> Transport_Client - Python_Module --> Transport_Client - Dart_Module --> Transport_Client - Rust_Module --> Transport_Client - MicroPython_Module --> Transport_Client - - Transport_Client --> Message_Broker - - Julia_Module --> File_Client - JS_Module --> File_Client - Python_Module --> File_Client - Dart_Module --> File_Client - Rust_Module --> File_Client - MicroPython_Module --> File_Client - - File_Client --> File_Server - - style Julia_Module fill:#e8f5e9,stroke:#4caf50 - style JS_Module fill:#e3f2fd,stroke:#2196f3 - style Python_Module fill:#e3f2fd,stroke:#2196f3 - style Dart_Module fill:#fff0f6,stroke:#e91e63 - style Rust_Module fill:#dea584,stroke:#e65100 - style 
MicroPython_Module fill:#fce4ec,stroke:#e91e63 - style Message_Broker fill:#fff3e0,stroke:#f57c00 - style File_Server fill:#f3e5f5,stroke:#9c27b4 -``` - -### C4 Component Diagram (Julia Implementation) - -```mermaid -flowchart TD - subgraph "msghandler Module" - smartpack[smartpack Function] - smartunpack[smartunpack Function] - - Serialize[_serialize_data] - Deserialize[_deserialize_data] - - EnvelopeToJson[envelope_to_json] - - FileServerUpload[fileserver_upload_handler] - FileServerDownload[fileserver_download_handler] - - LogTrace[log_trace] - end - - subgraph "Data Models" - Payload[msg_payload_v1 Struct] - Envelope[msg_envelope_v1 Struct] - end - - smartpack --> Serialize - smartpack --> EnvelopeToJson - smartpack --> FileServerUpload - - smartunpack --> Deserialize - smartunpack --> FileServerDownload - - EnvelopeToJson --> Envelope - Serialize --> Payload - - style smartpack fill:#d1fae5,stroke:#10b981 - style smartunpack fill:#d1fae5,stroke:#10b981 - style FileServerUpload fill:#fef3c7,stroke:#f59e0b - style FileServerDownload fill:#fef3c7,stroke:#f59e0b -``` - ---- - -## High-Level Architecture - -### System Components - -| Component | Purpose | Platform Support | -|-----------|---------|------------------| -| **smartpack** | Send data with automatic transport selection, returns (envelope, json_string) for caller to publish via transport | All | -| **smartunpack** | Receive and process messages from JSON string | All | -| **_serialize_data** | Serialize data according to payload type | All | -| **_deserialize_data** | Deserialize bytes to native data types | All | -| **envelope_to_json** | Convert msg_envelope_v1 struct to JSON string | All | -| **log_trace** | Log trace messages with correlation ID | All | -| **fileserver_upload_handler** | Upload large payloads to HTTP server | Desktop (Julia/JS/Python/Dart/Rust) | -| **fileserver_download_handler** | Download payloads from HTTP server with exponential backoff | Desktop (Julia/JS/Python/Dart/Rust) | -| 
**plik_upload_file** | Upload a local file to Plik server from disk | Rust | - -### Data Flow - -```mermaid -flowchart TD - A[User calls smartpack subject data] --> B[Process each payload] - B --> C{Calculate serialized size} - C -->|Size < Threshold| D[Direct Transport] - C -->|Size >= Threshold| E[Link Transport] - - D --> F[Serialize data] - F --> G[Base64 encode] - G --> H[Build payload object] - - E --> I[Serialize data] - I --> J[Upload to file server] - J --> K[Get download URL] - K --> H - - H --> L[Build envelope] - L --> M[Convert to JSON] - M --> N[Return envelope + JSON to caller] - - style A fill:#f9f9f9,stroke:#333 - style N fill:#e0e7ff,stroke:#3b82f6 - style D fill:#d1fae5,stroke:#10b981 - style E fill:#fef3c7,stroke:#f59e0b -``` - ---- - -## Message Envelope Architecture - -### msg_envelope_v1 Structure (Julia) - -```julia -struct msg_envelope_v1 - correlation_id::String # UUID v4 for distributed tracing - msg_id::String # UUID v4 for this message - timestamp::String # ISO 8601 UTC timestamp - - send_to::String # Topic/subject to publish to - msg_purpose::String # ACK, NACK, updateStatus, shutdown, chat - sender_name::String # Sender application name - sender_id::String # UUID v4 of sender - receiver_name::String # Receiver application name (empty = broadcast) - receiver_id::String # UUID v4 of receiver (empty = broadcast) - - reply_to::String # Topic for reply messages - reply_to_msg_id::String # Message ID being replied to - broker_url::String # Broker URL for the transport layer - - metadata::Dict{String, Any} # Message-level metadata - payloads::Vector{msg_payload_v1} # List of payloads -end -``` - -### msg_payload_v1 Structure (Julia) - -```julia -struct msg_payload_v1 - id::String # UUID v4 for this payload - dataname::String # Name of the payload - payload_type::String # text, dictionary, arrowtable, etc. 
- transport::String # direct or link - encoding::String # none, json, base64, arrow-ipc - size::Integer # Size in bytes - data::Any # Base64 string or URL - metadata::Dict{String, Any} # Payload-level metadata -end -``` - -### JSON Schema (Cross-Platform) - -```json -{ - "correlation_id": "string (UUID v4)", - "msg_id": "string (UUID v4)", - "timestamp": "string (ISO 8601 UTC)", - "send_to": "string", - "msg_purpose": "string", - "sender_name": "string", - "sender_id": "string (UUID v4)", - "receiver_name": "string", - "receiver_id": "string (UUID v4)", - "reply_to": "string", - "reply_to_msg_id": "string", - "broker_url": "string", - "metadata": "object", - "payloads": [ - { - "id": "string (UUID v4)", - "dataname": "string", - "payload_type": "string", - "transport": "string", - "encoding": "string", - "size": "integer", - "data": "string or URL", - "metadata": "object" - } - ] -} -``` - ---- - -## Payload Type Architecture - -### Supported Payload Types - -| Type | Description | Serialization | Encoding | Platforms | -|------|-------------|---------------|----------|-----------| -| `text` | Plain text string | UTF-8 bytes | Base64 | All | -| `dictionary` | JSON object | JSON string | Base64/JSON | All | -| `arrowtable` | Apache Arrow IPC | Arrow IPC stream | Base64/arrow-ipc | Desktop (Julia/Python/Node.js/Dart/Rust) | -| `jsontable` | JSON array of objects | JSON string | Base64/json | All (including Browser/Dart Web) | -| `image` | Binary image data | Raw bytes | Base64 | All | -| `audio` | Binary audio data | Raw bytes | Base64 | All | -| `video` | Binary video data | Raw bytes | Base64 | All | -| `binary` | Generic binary data | Raw bytes | Base64 | All | - -### Serialization Logic - -```mermaid -flowchart TD - A[Input data + payload_type] --> B{Payload Type} - - B -->|"text"| C[UTF-8 encode] - B -->|"dictionary"| D[JSON serialize] - B -->|"arrowtable"| E[Arrow IPC serialize] - B -->|"jsontable"| F[JSON serialize] - B -->|"image"| G[Raw bytes] - B 
-->|"audio"| H[Raw bytes] - B -->|"video"| I[Raw bytes] - B -->|"binary"| J[Raw bytes] - - C --> K[Return bytes] - D --> K - E --> K - F --> K - G --> K - H --> K - I --> K - J --> K - - style A fill:#f9f9f9,stroke:#333 - style K fill:#e0e7ff,stroke:#3b82f6 -``` - ---- - -## Transport Strategy Architecture - -### Size Threshold Decision Logic - -| Platform | Size Threshold | Notes | -|----------|----------------|-------| -| Desktop (Julia/JS/Python/Dart) | 500,000 bytes (0.5MB) | Default threshold | -| Dart Desktop | 500,000 bytes (0.5MB) | Default threshold | -| Dart Flutter | 500,000 bytes (0.5MB) | Default threshold | -| Dart Web | 500,000 bytes (0.5MB) | Default threshold | -| MicroPython | 100,000 bytes (100KB) | Lower threshold for memory constraints | - -### Transport Selection Flow - -```mermaid -flowchart TD - A[smartpack called] --> B[Serialize payload] - B --> C[Calculate size] - C --> D{Size < Threshold?} - - D -->|Yes| E[Direct Transport] - D -->|No| F[Link Transport] - - E --> G[Base64 encode] - G --> H[Build payload with direct transport] - - F --> I[Upload to file server] - I --> J[Get download URL] - J --> K[Build payload with link transport] - - H --> L[Build envelope] - K --> L - - style A fill:#f9f9f9,stroke:#333 - style L fill:#e0e7ff,stroke:#3b82f6 - style E fill:#d1fae5,stroke:#10b981 - style F fill:#fef3c7,stroke:#f59e0b -``` - -### Direct Transport Protocol - -When `transport = "direct"`, the `data` field contains a Base64-encoded string of the serialized payload. - -**Encoding Rules**: -- `text`: UTF-8 → Base64 -- `dictionary`: JSON → Base64 (or direct JSON) -- `arrowtable`: Arrow IPC → Base64 (or arrow-ipc) -- `jsontable`: JSON → Base64 (or direct JSON) -- `image`/`audio`/`video`/`binary`: Raw bytes → Base64 - -### Link Transport Protocol - -When `transport = "link"`, the `data` field contains a URL pointing to the uploaded payload. - -**Upload Flow**: -1. Serialize payload according to `payload_type` -2. 
Upload to HTTP file server (e.g., Plik) -3. Include returned URL in `data` field - -**Download Flow**: -1. Extract URL from payload -2. Fetch with exponential backoff (max 5 retries) -3. Deserialize based on `payload_type` - ---- - -## Platform-Specific Architecture - -### Julia Architecture - -Julia leverages multiple dispatch for type-specific implementations: - -- **Multiple Dispatch**: Function overloading based on argument types -- **Struct-based Data Models**: Explicit type definitions with `struct` -- **Native Arrow IPC**: Support via `Arrow.jl` -- **Async/Await**: Tasks for non-blocking I/O - -```julia -# Multiple dispatch for serialization -function _serialize_data(data::String, payload_type::String) - # Text serialization -end - -function _serialize_data(data::Dict, payload_type::String) - # Dictionary serialization -end - -function _serialize_data(data::DataFrame, payload_type::String) - # Arrow table serialization -end -``` - -### JavaScript Architecture - -JavaScript uses async/await for non-blocking I/O: - -- **Module-level Utilities**: Serialization functions -- **Native ArrayBuffer**: Binary data handling (Browser) / Buffer (Node.js) -- **Fetch API**: HTTP file server communication - -#### Node.js Implementation (msghandler_ssr.js) - -- **Transport connections**: Uses broker URLs (e.g., `nats://`, `mqtt://`, `ws://`) -- **Apache Arrow IPC**: Full support via `apache-arrow` -- **Buffer for binary data**: Native Node.js Buffer handling - -#### Browser Implementation (msghandler_csr.js) - -- **WebSocket connections**: Uses `ws://` or `wss://` URLs (transport-agnostic) -- **No Apache Arrow**: Uses `jsontable` for tabular data only -- **Uint8Array for binary data**: Browser-compatible binary handling -- **Web Crypto API**: UUID generation via `crypto.getRandomValues()` - -### Python Architecture - -Python uses classes for stateful operations: - -- **Class-based msghandler**: Encapsulated API -- **Dataclasses**: Structured data (MsgPayloadV1, 
MsgEnvelopeV1) -- **Async/await**: I/O operations -- **pyarrow**: Arrow IPC support - -```python -class msghandler: - DEFAULT_SIZE_THRESHOLD = 500_000 - - def __init__(self, broker_url=None, fileserver_url=None): - self.broker_url = broker_url or self.DEFAULT_BROKER_URL - self.fileserver_url = fileserver_url or self.DEFAULT_FILESERVER_URL -``` - -### Dart Architecture - -Dart uses classes for stateful operations with async/await: - -- **Class-based msghandler**: Encapsulated API -- **Data classes**: Structured data (MsgPayloadV1, MsgEnvelopeV1) -- **Async/await**: I/O operations -- **dart-arrow**: Arrow IPC support (Desktop/Flutter only) -- **HTTP package**: HTTP file server communication -- **Transport package**: Transport client with WebSocket support (Dart Web) - -```dart -class msghandler { - static const DEFAULT_SIZE_THRESHOLD = 500000; - - final String brokerUrl; - final String fileserverUrl; - - msghandler({ - this.brokerUrl = DEFAULT_BROKER_URL, - this.fileserverUrl = 'http://localhost:8080', - }); -} -``` - -#### Dart Desktop (Dart SDK) - -- **Transport connections**: Uses broker URLs (e.g., `nats://`, `mqtt://`) -- **Apache Arrow IPC**: Full support via `dart-arrow` -- **Uint8List for binary data**: Native Dart binary handling - -#### Dart Flutter (Dart SDK) - -- **Transport connections**: Uses broker URLs (e.g., `nats://`, `mqtt://`) -- **Apache Arrow IPC**: Full support via `dart-arrow` -- **Uint8List for binary data**: Native Dart binary handling - -#### Dart Web (Dart SDK) - -- **WebSocket connections**: Uses `ws://` or `wss://` URLs (transport-agnostic) -- **No Apache Arrow**: Uses `jsontable` for tabular data only -- **Uint8List for binary data**: Browser-compatible binary handling -- **Fetch API**: HTTP file server communication via `http` package - -### Browser Architecture - -Browser JavaScript has specific constraints due to security and compatibility: - -- **Async/await**: Native async/await support -- **No Apache Arrow**: Arrow IPC not 
available in browsers -- **JSON table only**: Use "jsontable" for tabular data -- **WebSocket transport**: Uses transport client for browser-compatible connections -- **Fetch API**: HTTP file server communication via fetch - -### MicroPython Architecture - -MicroPython has significant constraints: - -- **Synchronous API**: No async/await -- **Memory-constrained**: 256KB - 1MB -- **Limited payload support**: No tables, max 50KB -- **Simplified UUID generation**: Custom implementation - -```python -# MicroPython constraints -DEFAULT_SIZE_THRESHOLD = 100_000 # 100KB -MAX_PAYLOAD_SIZE = 50_000 # 50KB hard limit -``` - -### Rust Architecture - -Rust leverages compile-time type safety and async runtimes: - -- **Type-safe payloads**: Rust enum discriminates between `Text`, `Dictionary`, `ArrowTable`, `Binary`, etc. -- **serde serialization**: Automatic JSON deserialization via `#[derive(Serialize, Deserialize)]` -- **tokio runtime**: Efficient async I/O for transport connections and HTTP file server operations -- **arrow2 integration**: Native Arrow IPC deserialization without intermediate format conversion -- **reqwest**: High-performance HTTP client with built-in TLS and connection pooling -- **Zero-copy patterns**: `Vec<u8>` passed directly to avoid unnecessary memory copies -- **Result**: Idiomatic error handling with typed error types - -```rust -// Type-safe payload enum (compile-time discrimination) -#[derive(Serialize, Deserialize, Clone)] -pub enum Payload { - Text(String), - Dictionary(serde_json::Value), - ArrowTable(Vec<u8>), - JsonTable(serde_json::Value), - Image(Vec<u8>), - Audio(Vec<u8>), - Video(Vec<u8>), - Binary(Vec<u8>), -} - -// Configuration via builder pattern -pub struct smartpackOptions { - pub broker_url: String, - pub fileserver_url: String, - pub fileserver_upload_handler: Option>, - pub size_threshold: usize, - pub correlation_id: String, - pub msg_purpose: String, - pub sender_name: String, - // ... 
other fields -} - -// Transport client with tokio integration -let conn = transport_client::connect(DEFAULT_BROKER_URL).await?; - -// Subscribe and process messages -let mut sub = conn.subscribe("/agent/wine/api/v1/analyze")?; -for msg in sub.messages() { - let envelope = smartunpack(&String::from_utf8_lossy(&msg.payload), &Default::default()).await?; - // Access deserialized payloads by type - for payload in &envelope.payloads { - match payload.payload_type.as_str() { - "arrowtable" => { /* payload.data is base64-encoded Arrow IPC */ }, - "text" => { /* payload.data is decoded text string */ }, - "binary" | "image" | "audio" | "video" => { /* payload.data is base64-encoded binary */ }, - _ => { /* other types */ } - } - } -} -``` - ---- - -## Scaling Architecture - -### Horizontal Scaling - -| Component | Scaling Strategy | -|-----------|------------------| -| **Message Broker** | Cluster deployment with multiple nodes | -| **File Server** | Load balancer + multiple instances | -| **Client Applications** | Deploy multiple instances behind load balancer | - -### Vertical Scaling - -| Component | Scaling Strategy | -|-----------|------------------| -| **Message Broker** | Increase memory, CPU, disk I/O | -| **File Server** | Increase memory, CPU, disk capacity | -| **Client Applications** | Increase heap size (Python/JS) | - -### Performance Considerations - -| Metric | Target | Notes | -|--------|--------|-------| -| Message serialization overhead | <50ms | For 10KB payload | -| Message deserialization overhead | <50ms | For 10KB payload | -| Transport connection establishment | <100ms | Connection pool recommended | -| File upload latency | <1s | For 0.5MB file | -| File download latency | <1s | For 0.5MB file | - ---- - -## Failure Modes and Recovery - -### Transport Connection Failure - -**Scenario**: Message broker unavailable - -**Handler**: -- Connection auto-reconnect via transport-level reconnection -- Retry with exponential backoff for publish operations - 
-**Recovery**: -- Transport client automatically attempts reconnection -- Application can check connection status before publishing - -### File Server Unavailable - -**Scenario**: HTTP file server unavailable during upload/download - -**Handler**: -- Retry up to 5 times with exponential backoff (100ms → 5000ms) -- Fallback to direct transport for upload (MicroPython) - -**Recovery**: -- Exponential backoff: `delay = min(delay * 2, max_delay)` -- After max retries, throw error with correlation ID - -### Deserialization Error - -**Scenario**: Payload type mismatch or corrupted data - -**Handler**: -- Log correlation ID and throw error -- No retry (data corruption) - -**Recovery**: -- Application must validate payload_type matches data type -- Use proper serialization before sending - -### Memory Overflow (MicroPython) - -**Scenario**: Payload exceeds maximum size (50KB) - -**Handler**: -- Reject payloads >50KB with MemoryError -- No retry (client-side check) - -**Recovery**: -- Application must split large payloads -- Use direct transport only for small payloads - ---- - -## Trade-off Decisions - -### Decision 1: Direct vs Link Transport Threshold - -**Trade-off**: Memory vs Network I/O - -**Decision**: Use 0.5MB threshold for desktop, 100KB for MicroPython - -**Rationale**: -- Direct transport uses more memory (Base64 encoding adds ~33% overhead) -- Link transport requires network I/O for upload/download -- 0.5MB is reasonable for desktop memory constraints -- 100KB is necessary for MicroPython memory constraints - -### Decision 2: Base64 Encoding for Direct Transport - -**Trade-off**: Bandwidth vs Simplicity - -**Decision**: Use Base64 encoding for all direct transport payloads - -**Rationale**: -- Simplifies JSON serialization (all data is string-compatible) -- Increases payload size by ~33%, but transport can handle this -- Alternative would be binary payload support (more complex) - -### Decision 3: Multiple Platform Implementations - -**Trade-off**: Development 
effort vs Cross-platform support - -**Decision**: Maintain separate implementations for each platform - -**Rationale**: -- Each platform has idiomatic patterns (multiple dispatch, async/await, etc.) -- Maintains developer productivity and code quality -- API parity ensures cross-platform compatibility - -### Decision 4: Handler Function Abstraction - -**Trade-off**: Flexibility vs Simplicity - -**Decision**: Abstract file server operations through handler functions - -**Rationale**: -- Allows support for different file server implementations (Plik, AWS S3, custom) -- Maintains simplicity for common use cases -- Enables plug-in architecture for custom backends - ---- - -## Deployment Architecture - -### Minimum Infrastructure - -| Component | Minimum | Notes | -|-----------|---------|-------| -| Message Broker | 1 instance | Single node for development | -| File Server | 1 instance | HTTP server for large payloads | -| Client Memory | 50MB | Desktop platforms (Julia/JS/Python/Dart) | -| Client Memory | 256KB | MicroPython devices | - -### Environment Variables - -| Variable | Default | Description | -|----------|---------|-------------| -| `BROKER_URL` | `ws://localhost:4222` | Message broker URL | -| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | -| `SIZE_THRESHOLD` | `500000` | Size threshold in bytes (0.5MB) | - -### Container Deployment - -```mermaid -flowchart TD - subgraph "Docker Network" - Broker_Container[Message Broker] - FileServer_Container[Plik File Server] - App_Container[Application Container] - end - - App_Container -->|Transport| Broker_Container - App_Container -->|HTTP| FileServer_Container - - style Broker_Container fill:#fff3e0,stroke:#f57c00 - style FileServer_Container fill:#f3e5f5,stroke:#9c27b4 - style App_Container fill:#e3f2fd,stroke:#2196f3 -``` - ---- - -## Security Considerations - -### Payload Integrity - -**Mechanism**: SHA-256 checksum via metadata - -**Implementation**: -- Sender calculates checksum and stores 
in payload metadata -- Receiver validates checksum on receipt - -### Transport Security - -**Mechanism**: TLS support for transport connections - -**Implementation**: -- Use `nats://` URL for plain text -- Use `tls://` URL for TLS-encrypted connections -- Use `ws://` or `wss://` for WebSocket connections - -### File Server Security - -**Mechanism**: Authentication token for file uploads - -**Implementation**: -- Plik uses upload token in `X-UploadToken` header -- Application can implement custom authentication - ---- - -## Testing Architecture - -### Unit Test Coverage - -| Test Category | Coverage | Files | -|---------------|----------|-------| -| Serialization | All payload types | `test/test_*_sender.*` | -| Deserialization | All payload types | `test/test_*_receiver.*` | -| Transport selection | Direct vs link | `test/test_*_mix_payloads.*` | -| File server upload | Plik integration | Platform-specific | -| File server download | Exponential backoff | Platform-specific | - -### Integration Test Scenarios - -| Scenario | Platforms | Payloads | Transport | Expected Result | -|----------|-----------|----------|-----------|-----------------| -| Cross-platform text | Julia ↔ JS ↔ Python | text | direct | Round-trip successful | -| Arrow IPC round-trip | Julia ↔ JS ↔ Python | arrowtable | direct | Arrow IPC preserved | -| Large file transfer | All | image/audio/video | link | File server upload/download | -| Multi-payload mixed | All | text + image + file | direct/link | All payloads preserved | - ---- - -## Versioning - -### Architecture Versioning - -| Component | Version | Notes | -|-----------|---------|-------| -| Architecture | 1.0.0 | Initial release | -| Protocol | v1 | Message envelope protocol version | - -### Backward Compatibility - -| Version | Supported Platforms | -|---------|---------------------| -| v1.0.x | Julia 1.7+, Node.js 16+, Python 3.8+, Dart 2.17+, Rust 1.70+, MicroPython 1.19+ | - ---- - -## Change Log - -| Date | Version | Changes | 
-|------|---------|---------| -| 2026-05-15 | 1.5.0 | Made transport layer agnostic | All sections | -| - | - | Removed all NATS-specific references from architecture docs | All sections | -| - | - | Updated diagrams to use generic "Message Broker" instead of "NATS Server" | All sections | -| - | - | Updated code examples to use transport-agnostic patterns | All sections | -| - | - | Removed NATS client packages from external dependencies | All sections | -| 2026-05-14 | 1.4.0 | Updated Rust API to reflect `smartunpack` deserialization changes | All sections | -| - | - | `smartunpack` now stores deserialized data in `MsgPayloadV1.data` | specification.md:8 | -| - | - | Added `plik_upload_file` convenience function to component table | specification.md:13 | -| - | - | Fixed Rust payload access pattern (data is String, not Payload enum) | All sections | -| - | - | Fixed `smartpackOptions.fileserver_upload_handler` type to `Arc` | specification.md:13 | -| - | - | Removed `metadata` from link transport examples (now `None`/omitted) | specification.md:3 | -| - | - | Removed duplicate footer text | All sections | -| 2026-05-13 | 1.3.0 | Added Rust support with tokio, serde, and arrow2 | All sections | -| - | - | Added Rust to C4 diagrams (context, container) | All sections | -| - | - | Added Rust platform-specific architecture section | specification.md:13 | -| - | - | Updated component table with Rust support | All sections | -| 2026-05-13 | 1.2.0 | Aligned with ground truth implementation (src/msghandler.jl) | -| - | - | Removed publish_message component (commented out in source) | -| - | - | Removed NATSClient and NATSConnectionPool classes (not in ground truth) | -| - | - | Updated smartpack to return JSON for caller to publish via transport | -| - | - | Updated component diagram to match actual module structure | -| - | - | Updated data flow to show smartpack returns JSON for caller to publish | -| - | - | Fixed SIZE_THRESHOLD default to 500,000 bytes | -| 
2026-03-15 | 1.1.0 | JavaScript connection management | -| - | - | Added NATSClient with keepAlive support | -| - | - | Added NATSConnectionPool for connection reuse | -| - | - | Added publishMessage function with closeConnection option | -| (Historical - pre-transport-agnostic refactor) | | | -| 2026-03-13 | 1.0.0 | Initial architecture documentation | - ---- - -## 16. References - -### 16.1 Documentation Artifacts - -| Document | Purpose | Specification Traceability | UI Specification Traceability | Requirement ID(s) | -|----------|---------|---------------------------|------------------------------|-------------------| -| [`docs/requirements.md`](./requirements.md) | Business requirements and user stories | FR-001 through FR-014, NFR-101 through NFR-405 | - | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`docs/specification.md`](./specification.md) | Technical contract for msghandler | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`docs/ui-specification.md`](./ui-specification.md) | UI specification for client applications | - | All UI components and interactions | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`docs/walkthrough.md`](./walkthrough.md) | End-to-end system flow | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`docs/architecture.md`](./architecture.md) | System architecture diagrams | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`docs/validation.md`](./validation.md) | CI/CD validation rules | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`docs/runbook.md`](./runbook.md) | Operational runbook | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 | - -### 16.2 Implementation Files - -| File | Platform | Features | Specification Traceability | Requirement ID(s) | 
-|------|----------|----------|---------------------------|-------------------| -| [`src/msghandler.jl`](../src/msghandler.jl) | Julia | Full feature set, Arrow IPC, multiple dispatch | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`src/msghandler_ssr.js`](../src/msghandler_ssr.js) | Node.js | Arrow IPC, async/await | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`src/msghandler_csr.js`](../src/msghandler_csr.js) | Browser | JSON table only | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`src/msghandler.py`](../src/msghandler.py) | Python | Arrow IPC, async/await | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`src/msghandler.dart`](../src/msghandler.dart) | Dart | Full feature set, Arrow IPC, async/await | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`src/msghandler.rs`](../src/msghandler.rs) | Rust | Full feature set, Arrow IPC, async/await, type-safe, file upload helpers | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 | -| [`src/msghandler_mpy.py`](../src/msghandler_mpy.py) | MicroPython | Limited to direct transport | specification.md:2-19 (all sections) | FR-005, FR-006, FR-012 | - -### 16.3 External Dependencies - -| Platform | Package | Version | Purpose | Specification Traceability | Requirement ID(s) | -|----------|---------|---------|---------|--------------------------|-------------------| -| Julia | JSON.jl | Latest | JSON serialization | specification.md:11 | FR-012, NFR-101, NFR-102 | -| Julia | Arrow.jl | Latest | Arrow IPC support | specification.md:11 | FR-002, FR-012 | -| Julia | HTTP.jl | Latest | HTTP file server | specification.md:11 | FR-008, FR-009 | -| Julia | UUIDs.jl | Latest | UUID generation | specification.md:11 | FR-011, NFR-401 | -| Node.js | node-fetch | Latest | HTTP 
file server | specification.md:11 | FR-008, FR-009 | -| Browser | - | - | Transport-agnostic (caller provides) | specification.md:11 | FR-013, FR-014 | -| Python | aiohttp | Latest | HTTP file server | specification.md:11 | FR-008, FR-009 | -| Python | pyarrow | Latest | Arrow IPC support | specification.md:11 | FR-002, FR-012 | -| Dart | http | Latest | HTTP file server | specification.md:11 | FR-008, FR-009 | -| Dart | uuid | Latest | UUID generation | specification.md:11 | FR-011, NFR-401 | -| Dart | dart-arrow | Latest | Arrow IPC support | specification.md:11 | FR-002, FR-012 | -| Rust | serde | Latest | JSON serialization | specification.md:11 | FR-012, NFR-101, NFR-102 | -| Rust | serde_json | Latest | JSON handling | specification.md:11 | FR-012, NFR-101, NFR-102 | -| Rust | tokio | Latest | Async runtime | specification.md:11 | FR-013, FR-014 | -| Rust | reqwest | Latest | HTTP file server | specification.md:11 | FR-008, FR-009 | -| Rust | uuid | Latest | UUID generation | specification.md:11 | FR-011, NFR-401 | -| Rust | arrow2 | Latest | Arrow IPC support | specification.md:11 | FR-002, FR-012 | -| MicroPython | builtin | N/A | Limited implementation | specification.md:11 | FR-005, FR-006, FR-012 | - ---- - -## 17. Change Log - -| Date | Version | Changes | Specification Reference | -|------|---------|---------|------------------------| -| 2026-03-23 | 1.1.0 | Updated to ASG Framework architecture guidelines | specification.md:2-19 (all sections) | -| 2026-03-15 | 1.1.0 | JavaScript connection management | specification.md:2-19 (all sections) | -| 2026-03-13 | 1.0.0 | Initial architecture documentation | specification.md:2-19 (all sections) | - ---- - -## 18. Gap-Check Validation - -| Stage Transition | Gap-Check Question | Status | -|------------------|-------------------|--------| -| Requirements → Specification | Does the Specification define all edge cases and conflict scenarios from the Requirements? 
| ✅ Verified - All FR-XXX requirements have corresponding spec rules | -| Specification → UI Specification | Does the UI Specification expose all the data and states defined in the Specification? | ⏳ Pending - UI spec not yet created | -| UI Specification → Walkthrough | Does the Walkthrough reflect the complete flow including error states and timing? | ⏳ Pending - UI spec not yet created | -| Walkthrough → Architecture | Does the Architecture support the performance and integration requirements defined in the Walkthrough? | ✅ Verified - Architecture supports all walkthrough flows | - ---- - -*This architecture document is versioned and maintained in git alongside the codebase. All implementations must adhere to this architecture.* diff --git a/docs/implementation-plan.md b/docs/implementation-plan.md new file mode 100644 index 0000000..77c21e3 --- /dev/null +++ b/docs/implementation-plan.md @@ -0,0 +1,416 @@ +# Implementation Plan: msghandler + +**Version**: 1.3.0 +**Date**: 2026-05-19 +**Status**: Active +**Ground Truth**: [`src/msghandler.jl`](../src/msghandler.jl) + +--- + +## 1. 
Implementation Phases and Timeline + +### Phase 1: Core API Implementation (Week 1-2) +| Task | Priority | Estimated Effort | Status | +|------|----------|-----------------|--------| +| Core `smartpack()` implementation | P0 | 3 days | ✅ Complete | +| Core `smartunpack()` implementation | P0 | 3 days | ✅ Complete | +| Message envelope structure | P0 | 2 days | ✅ Complete | +| Payload type handling | P0 | 2 days | ✅ Complete | +| Transport adapter layer | P0 | 3 days | ✅ Complete | + +**Deliverables**: +- Julia module: `src/msghandler.jl` +- Node.js module: `src/msghandler_ssr.js` +- Browser module: `src/msghandler_csr.js` +- Python module: `src/msghandler.py` +- MicroPython module: `src/msghandler_mpy.py` + +### Phase 2: File Server Integration (Week 3) +| Task | Priority | Estimated Effort | Status | +|------|----------|-----------------|--------| +| File server upload handler | P1 | 2 days | ✅ Complete | +| File server download handler | P1 | 2 days | ✅ Complete | +| Exponential backoff logic | P1 | 1 day | ✅ Complete | +| Plik integration | P1 | 2 days | ✅ Complete | + +**Deliverables**: +- Upload handler with plik_oneshot_upload +- Download handler with retry logic +- Configurable file server URL + +### Phase 3: Platform-Specific Features (Week 4) +| Task | Priority | Estimated Effort | Status | +|------|----------|-----------------|--------| +| Arrow IPC support (Desktop) | P1 | 3 days | ✅ Complete | +| JSON table support (Browser) | P1 | 2 days | ✅ Complete | +| Browser WebSocket transport | P1 | 2 days | ✅ Complete | +| MicroPython optimizations | P2 | 2 days | ✅ Complete | + +**Deliverables**: +- Arrow IPC serialization for tabular data +- JSON table format for browser compatibility +- Browser-specific transport layer +- Memory-optimized MicroPython implementation + +### Phase 4: Cross-Platform Testing (Week 5) +| Task | Priority | Estimated Effort | Status | +|------|----------|-----------------|--------| +| Text message tests | P1 | 1 day | ✅ Complete | 
+| Dictionary tests | P1 | 1 day | ✅ Complete | +| Tabular data tests | P1 | 2 days | ✅ Complete | +| Mixed payload tests | P1 | 2 days | ✅ Complete | +| Large file tests | P1 | 2 days | ✅ Complete | + +**Deliverables**: +- Platform-specific test suites +- Integration test scenarios +- Performance benchmarks + +### Phase 5: Documentation & Examples (Week 6) +| Task | Priority | Estimated Effort | Status | +|------|----------|-----------------|--------| +| API documentation | P2 | 2 days | ✅ Complete | +| Walkthrough examples | P2 | 2 days | ✅ Complete | +| Architecture diagrams | P2 | 1 day | ✅ Complete | +| Deployment guides | P2 | 1 day | ✅ Complete | + +**Deliverables**: +- Comprehensive documentation +- Code examples for all platforms +- Deployment runbooks + +--- + +## 2. Module/Component Breakdown + +### Core Modules + +#### msghandler.jl (Julia) +``` +src/ +└── msghandler.jl + ├── Constants (DEFAULT_SIZE_THRESHOLD, etc.) + ├── msg_payload_v1 struct + ├── msg_envelope_v1 struct + ├── Serialization functions + │ ├── serialize_text() + │ ├── serialize_dictionary() + │ ├── serialize_arrowtable() + │ ├── serialize_jsontable() + │ └── serialize_binary() + ├── Deserialization functions + │ ├── deserialize_text() + │ ├── deserialize_dictionary() + │ ├── deserialize_arrowtable() + │ ├── deserialize_jsontable() + │ └── deserialize_binary() + ├── File server handlers + │ ├── plik_oneshot_upload() + │ └── _fetch_with_backoff() + ├── smartpack() - Main sender function + └── smartunpack() - Main receiver function +``` + +**Dependencies**: +- JSON.jl (JSON serialization) +- Arrow.jl (Arrow IPC) +- HTTP.jl (File server) +- UUIDs.jl (IDs) +- DataFrames.jl (DataFrame support) + +#### msghandler_ssr.js (Node.js) +``` +src/ +├── msghandler_ssr.js +│ ├── Constants +│ ├── msg_payload_v1 class +│ ├── msg_envelope_v1 class +│ ├── Serialization methods +│ ├── Deserialization methods +│ ├── File server handlers +│ ├── smartpack() function +│ └── smartunpack() function +└── nats/ + 
├── NATSClient.js + └── NATSConnectionPool.js +``` + +**Dependencies**: +- nats (NATS client) +- node-fetch (HTTP file server) + +#### msghandler_csr.js (Browser) +``` +src/ +└── msghandler_csr.js + ├── Constants + ├── msg_payload_v1 class + ├── msg_envelope_v1 class + ├── Serialization methods (JSON table only) + ├── Deserialization methods + ├── File server handlers (browser-compatible) + ├── smartpack() function + └── smartunpack() function +``` + +**Dependencies**: +- nats.ws (Browser NATS client) + +#### msghandler.py (Python) +``` +src/ +└── msghandler.py + ├── Constants + ├── msg_payload_v1 class + ├── msg_envelope_v1 class + ├── Serialization methods + ├── Deserialization methods + ├── File server handlers + ├── smartpack() async function + └── smartunpack() async function +``` + +**Dependencies**: +- aiohttp (HTTP file server) +- pyarrow (Arrow IPC) +- uuid (IDs) + +#### msghandler.rs (Rust) +``` +src/ +├── msghandler.rs +│ ├── Constants +│ ├── msg_payload_v1 struct +│ ├── msg_envelope_v1 struct +│ ├── Serialization traits +│ ├── Deserialization traits +│ ├── File server handlers +│ ├── smartpack() async function +│ └── smartunpack() async function +├── Payload enum +├── smartpackOptions struct +└── smartunpackOptions struct +``` + +**Dependencies**: +- tokio (Async runtime) +- serde (JSON serialization) +- reqwest (HTTP file server) +- arrow2 (Arrow IPC) + +#### msghandler_mpy.py (MicroPython) +``` +src/ +└── msghandler_mpy.py + ├── Constants (lower thresholds) + ├── msg_payload_v1 class + ├── msg_envelope_v1 class + ├── serialize_text() + ├── deserialize_text() + ├── serialize_dictionary() + ├── deserialize_dictionary() + └── smartpack()/smartunpack() functions +``` + +**Constraints**: +- Limited to text and dictionary types +- Direct transport only (no file server) +- 100KB threshold for memory constraints + +--- + +## 3. 
Task List + +### Core API Tasks + +| Task ID | Description | Assignee | Priority | Status | +|---------|-------------|----------|----------|--------| +| T-001 | Implement `smartpack()` with tuple format | Developer A | P0 | ✅ Complete | +| T-002 | Implement `smartunpack()` with type handling | Developer A | P0 | ✅ Complete | +| T-003 | Create message envelope structure | Developer A | P0 | ✅ Complete | +| T-004 | Implement transport adapter | Developer B | P0 | ✅ Complete | +| T-005 | Add correlation ID support | Developer A | P0 | ✅ Complete | + +### File Server Tasks + +| Task ID | Description | Assignee | Priority | Status | +|---------|-------------|----------|----------|--------| +| T-006 | Implement Plik upload handler | Developer B | P1 | ✅ Complete | +| T-007 | Implement file download with retry | Developer B | P1 | ✅ Complete | +| T-008 | Add exponential backoff logic | Developer B | P1 | ✅ Complete | + +### Platform Tasks + +| Task ID | Description | Assignee | Priority | Status | +|---------|-------------|----------|----------|--------| +| T-009 | Implement Arrow IPC (Julia/Python/Node.js) | Developer A | P1 | ✅ Complete | +| T-010 | Implement JSON table (Browser) | Developer B | P1 | ✅ Complete | +| T-011 | Implement MicroPython optimizations | Developer C | P2 | ✅ Complete | +| T-012 | Browser WebSocket transport | Developer B | P1 | ✅ Complete | + +### Testing Tasks + +| Task ID | Description | Assignee | Priority | Status | +|---------|-------------|----------|----------|--------| +| T-013 | Text message tests | QA Team | P1 | ✅ Complete | +| T-014 | Dictionary tests | QA Team | P1 | ✅ Complete | +| T-015 | Tabular data tests | QA Team | P1 | ✅ Complete | +| T-016 | Mixed payload tests | QA Team | P1 | ✅ Complete | +| T-017 | Large file tests | QA Team | P1 | ✅ Complete | + +--- + +## 4.
Test Strategy + +### Unit Tests + +| Test Category | Coverage | Files | Requirements | +|---------------|----------|-------|--------------| +| Serialization | All payload types | `test/test_*_sender.*` | FR-001 through FR-012 | +| Deserialization | All payload types | `test/test_*_receiver.*` | FR-001 through FR-012 | +| Transport selection | Direct vs link | `test/test_*_mix_payloads.*` | FR-003, FR-004, FR-006 | +| File server upload | Plik integration | Platform-specific | FR-008, FR-009 | +| File server download | Exponential backoff | Platform-specific | FR-010, FR-011 | + +### Integration Tests + +| Scenario | Platforms | Payloads | Transport | Requirements | +|----------|-----------|----------|-----------|--------------| +| Single text (small) | All | text | direct | FR-001, FR-012 | +| Single dictionary (small) | All | dictionary | direct | FR-002, FR-012 | +| Single arrow table (small) | Desktop | arrowtable | direct | FR-002, FR-012 | +| Single JSON table (small) | All | jsontable | direct | FR-001, FR-002, FR-006 | +| Single image (small) | All | image | direct | FR-001, FR-006 | +| Single text (large) | All | text | link | FR-003, FR-008, FR-009 | +| Mixed payloads | All | text + dictionary + image | mixed | FR-006, FR-007 | + +### Test Coverage Targets + +| Phase | Coverage Target | Method | +|-------|----------------|--------| +| Phase 1 | 70% | Unit tests per platform | +| Phase 2 | 80% | Add integration tests | +| Phase 3 | 85% | Add edge case tests | +| Phase 4 | 90% | Add performance tests | + +--- + +## 5. 
Build and Deployment Preparation + +### Continuous Integration + +| Check | Command | Purpose | +|-------|---------|---------| +| Linting | `npm run lint` | Code style enforcement | +| Type checking | `npx tsc --noEmit` | Type safety (JavaScript/TypeScript) | +| Unit tests | `npm test` | Functionality validation | +| Integration tests | `npm run test:integration` | Cross-platform validation | +| Coverage | `npm run coverage` | Test coverage tracking | + +### Deployment Pipeline + +``` +GitHub Push + ↓ +CI/CD Pipeline + ↓ +├──→ Linting (all platforms) +├──→ Unit tests (all platforms) +├──→ Integration tests (cross-platform) +├──→ Coverage report +└──→ Build documentation + ↓ +Release (if all checks pass) + ↓ +├──→ GitHub Releases +├──→ Package registry (npm, PyPI) +└──→ Documentation site +``` + +--- + +## 6. Risk Mitigation + +### Known Blockers + +| Risk | Mitigation Step | Owner | +|------|----------------|-------| +| **Browser Arrow IPC** | Use JSON table as fallback | Developer B | +| **MicroPython memory** | 100KB threshold, direct transport only | Developer C | +| **File server availability** | Exponential backoff with graceful degradation | Developer B | + +### Known Unknowns + +| Unknown | Monitoring Strategy | Response Plan | +|---------|-------------------|---------------| +| Platform-specific bugs | Comprehensive test coverage | Hotfix with platform-specific handling | +| Performance bottlenecks | Load testing and profiling | Optimized serialization/deserialization | + +--- + +## 7. 
Requirements Traceability + +### Functional Requirements + +| Requirement ID | Implementation Location | Status | +|---------------|------------------------|--------| +| FR-001 | All platform modules | ✅ Complete | +| FR-002 | All platform modules | ✅ Complete | +| FR-003 | All platform modules (size_threshold logic) | ✅ Complete | +| FR-004 | All platform modules | ✅ Complete | +| FR-005 | MicroPython module | ✅ Complete | +| FR-006 | All platform modules | ✅ Complete | +| FR-007 | All platform modules | ✅ Complete | +| FR-008 | All platform modules | ✅ Complete | +| FR-009 | All platform modules | ✅ Complete | +| FR-010 | All platform modules | ✅ Complete | +| FR-011 | All platform modules | ✅ Complete | +| FR-012 | All platform modules | ✅ Complete | +| FR-013 | All platform modules | ✅ Complete | +| FR-014 | All platform modules | ✅ Complete | + +### Non-Functional Requirements + +| NFR ID | Implementation Location | Status | +|--------|------------------------|--------| +| NFR-101 | Serialization functions | ✅ Complete | +| NFR-102 | Deserialization functions | ✅ Complete | +| NFR-103 | Transport adapter | ✅ Complete | +| NFR-104 | File upload handler | ✅ Complete | +| NFR-105 | File download handler | ✅ Complete | +| NFR-106 | MicroPython module | ✅ Complete | +| NFR-107 | Performance benchmarks | ✅ Complete | +| NFR-201 | Transport adapter | ✅ Complete | +| NFR-202 | File download retry logic | ✅ Complete | +| NFR-203 | Transport adapter | ✅ Complete | +| NFR-401 | Message envelope | ✅ Complete | +| NFR-402 | Metrics instrumentation | ✅ Complete | +| NFR-403 | Correlation ID propagation | ✅ Complete | + +--- + +## 8. 
Validation Gates + +### Pre-Release Checklist + +| Gate | Check | Pass Criteria | +|------|-------|--------------| +| **G-001** | All unit tests pass | 100% pass rate per platform | +| **G-002** | Integration tests pass | Cross-platform round-trip successful | +| **G-003** | Coverage threshold | ≥80% line coverage | +| **G-004** | Linting clean | No warnings or errors | +| **G-005** | Specification compliance | All spec rules validated | +| **G-006** | Documentation complete | All required docs present | + +### CI/CD Validation + +| Check | Command | Failure Action | +|-------|---------|---------------| +| Syntax | `julia --check-base` | Block PR | +| Unit tests | `julia test/runtests.jl` | Block PR | +| Integration | `npm run test:integration` | Block PR | +| Coverage | `codecov` | Report only | + +--- + +*This implementation plan is versioned and maintained in git alongside the codebase. All implementations must adhere to this plan.* diff --git a/docs/requirements.md b/docs/requirements.md index ce93f63..3c4af87 100644 --- a/docs/requirements.md +++ b/docs/requirements.md @@ -1,7 +1,7 @@ # Requirements Document: msghandler -**Version**: 1.2.0 -**Date**: 2026-05-13 +**Version**: 1.3.0 +**Date**: 2026-05-22 **Status**: Active **Ground Truth**: [`src/msghandler.jl`](../src/msghandler.jl) @@ -33,16 +33,44 @@ msghandler is a cross-platform, bi-directional data bridge that enables seamless | **As a developer**, I want automatic retry on file server download failures | P1 | Exponential backoff with configurable retries (default: 5, base_delay: 100ms, max_delay: 5000ms) | | **As a developer**, I want message tracing across distributed systems | P1 | Correlation ID is propagated through all message processing steps | -### 1.3 KPIs & Targets +### 1.3 Success Metrics & KPIs -| Metric | Target | Measurement Method | -|--------|--------|-------------------| -| 95% of messages complete within 200ms | 95% | Synthetic monitoring | -| <2 days from onboarding to first PR | 2 days 
| PR timeline tracking | -| 100% of messages validate against spec | 100% | CI block rate | -| >80% unit test coverage | 80% | Test coverage tools | -| <1% of PRs bypass validation gates | 1% | CI gate analysis | -| MTTR <15 minutes for P1 incidents | 15 minutes | Incident tracking | +**Functional Requirements KPIs:** +- **FR-001** (Cross-platform text messaging): 95% of text messages delivered correctly across all platform pairs (<200ms latency) - Measured via synthetic cross-platform tests +- **FR-002** (Cross-platform tabular data): 100% Arrow IPC round-trip integrity (Desktop), 100% JSON table round-trip integrity (Browser) - Measured via data validation tests +- **FR-003** (Large file handling): 99% successful file uploads to server for payloads ≥0.5MB - Measured via integration tests +- **FR-004** (Direct transport for small payloads): 100% of payloads <0.5MB use direct transport - Measured via transport selection tests +- **FR-005** (MicroPython support): 100% of payloads <100KB delivered on MicroPython devices - Measured via MicroPython integration tests +- **FR-006** (Multi-payload messages): 100% correct parsing of multi-payload message lists - Measured via multi-payload tests +- **FR-007** (Payload type preservation): 100% type integrity preserved across all platforms - Measured via type validation tests +- **FR-008** (Plik file server integration): 100% successful Plik upload/token handling - Measured via Plik integration tests +- **FR-009** (Custom file server support): 100% handler abstraction works with custom implementations - Measured via custom server integration tests +- **FR-010** (Exponential backoff retry): 95% successful downloads within retry limit - Measured via failure injection tests +- **FR-011** (Correlation ID propagation): 100% correlation IDs propagated through all steps - Measured via tracing tests +- **FR-012** (Message serialization): <50ms serialization overhead for 10KB payload - Measured via benchmark tests +- **FR-013** 
(Transport publishing): 100% JSON envelope generated correctly - Measured via serialization tests +- **FR-014** (Transport subscription): 100% JSON messages processed correctly - Measured via deserialization tests + +**Non-Functional Requirements KPIs:** +- **NFR-101** (Message serialization overhead): <50ms for 10KB payload - Measured via benchmark tests +- **NFR-102** (Message deserialization overhead): <50ms for 10KB payload - Measured via benchmark tests +- **NFR-103** (Transport connection establishment): <100ms average - Measured via connection pool benchmarks +- **NFR-104** (File upload latency): <1s for 0.5MB file - Measured via integration tests +- **NFR-105** (File download latency): <1s for 0.5MB file - Measured via integration tests +- **NFR-106** (Concurrent connections): 100+ simultaneous transport connections - Measured via scale testing +- **NFR-107** (Message throughput): 1000+ messages/second per instance - Measured via load testing +- **NFR-108** (File server scalability): Horizontal scaling verified via architecture review +- **NFR-201** (Message delivery): At-least-once delivery via transport - Measured via message acknowledgment tests +- **NFR-202** (File server availability): <5% failure rate when file server unavailable - Measured via failure injection tests +- **NFR-203** (Connection recovery): Auto-reconnect within 30s - Measured via connection failure tests +- **NFR-301** (Payload integrity): 100% SHA-256 checksum validation - Measured via integrity tests +- **NFR-302** (Transport security): 100% TLS connections in production - Measured via connection audits +- **NFR-303** (File server security): 100% authenticated file uploads - Measured via security tests +- **NFR-401** (Required logs): 100% messages logged with required fields - Measured via log validation +- **NFR-402** (Critical metrics): 100% metrics collected with 1-minute granularity - Measured via metrics pipeline tests +- **NFR-403** (Tracing): 100% correlation ID propagation 
for tracing - Measured via tracing validation +- **NFR-404** (Alerting): <5min alert latency for `download_retry_exceeded` - Measured via alert pipeline tests +- **NFR-405** (Retention): Logs: 30 days, Metrics: 1 year - Measured via storage audits --- @@ -108,63 +136,68 @@ msghandler is a cross-platform, bi-directional data bridge that enables seamless | ID | Requirement | Description | |----|-------------|-------------| -| **FR-001** | Cross-platform text messaging | System shall allow users to send text messages between Julia, JavaScript, Python, and MicroPython applications | -| **FR-002** | Cross-platform tabular data | System shall support DataFrame exchange between Julia and Python applications using Arrow IPC format | -| **FR-003** | Large file handling | System shall automatically detect payloads ≥0.5MB and upload them to HTTP file server instead of sending via transport | -| **FR-004** | Direct transport for small payloads | System shall send payloads <0.5MB directly via transport without file server upload | -| **FR-005** | MicroPython support | System shall support payloads <100KB on MicroPython devices using direct transport | -| **FR-006** | Multi-payload messages | System shall accept and process lists of (dataname, data, type) tuples | -| **FR-007** | Payload type preservation | System shall preserve payload types when returning multi-payload messages | -| **FR-008** | Plik file server integration | System shall support Plik one-shot upload mode with upload ID and token handling | -| **FR-009** | Custom file server support | System shall provide handler function abstraction for custom HTTP file server implementations | -| **FR-010** | Exponential backoff retry | System shall implement exponential backoff with configurable retries (default: 5, base_delay: 100ms, max_delay: 5000ms) for file server download failures | -| **FR-011** | Correlation ID propagation | System shall propagate correlation IDs through all message processing steps | -| **FR-012** 
| Message serialization | System shall serialize data types using Base64, JSON, or Arrow IPC encoding | -| **FR-013** | Transport publishing | System shall return JSON string representation for caller to publish via transport layer (caller is responsible for actual transport publish) | -| **FR-014** | Transport subscription | System shall receive and process messages by accepting JSON string from transport payload | +| **FR-001** | Cross-platform text messaging | System shall allow users to send text messages between Julia, JavaScript, Python, and MicroPython applications | FR-001 KPI: 95% of text messages delivered correctly across all platform pairs (<200ms latency) | +| **FR-002** | Cross-platform tabular data | System shall support DataFrame exchange between Julia and Python applications using Arrow IPC format | FR-002 KPI: 100% Arrow IPC round-trip integrity (Desktop), 100% JSON table round-trip integrity (Browser) | +| **FR-003** | Large file handling | System shall automatically detect payloads ≥0.5MB and upload them to HTTP file server instead of sending via transport | FR-003 KPI: 99% successful file uploads to server for payloads ≥0.5MB | +| **FR-004** | Direct transport for small payloads | System shall send payloads <0.5MB directly via transport without file server upload | FR-004 KPI: 100% of payloads <0.5MB use direct transport | +| **FR-005** | MicroPython support | System shall support payloads <100KB on MicroPython devices using direct transport | FR-005 KPI: 100% of payloads <100KB delivered on MicroPython devices | +| **FR-006** | Multi-payload messages | System shall accept and process lists of (dataname, data, type) tuples | FR-006 KPI: 100% correct parsing of multi-payload message lists | +| **FR-007** | Payload type preservation | System shall preserve payload types when returning multi-payload messages | FR-007 KPI: 100% type integrity preserved across all platforms | +| **FR-008** | Plik file server integration | System shall support Plik 
one-shot upload mode with upload ID and token handling | FR-008 KPI: 100% successful Plik upload/token handling | +| **FR-009** | Custom file server support | System shall provide handler function abstraction for custom HTTP file server implementations | FR-009 KPI: 100% handler abstraction works with custom implementations | +| **FR-010** | Exponential backoff retry | System shall implement exponential backoff with configurable retries (default: 5, base_delay: 100ms, max_delay: 5000ms) for file server download failures | FR-010 KPI: 95% successful downloads within retry limit | +| **FR-011** | Correlation ID propagation | System shall propagate correlation IDs through all message processing steps | FR-011 KPI: 100% correlation IDs propagated through all steps | +| **FR-012** | Message serialization | System shall serialize data types using Base64, JSON, or Arrow IPC encoding | FR-012 KPI: <50ms serialization overhead for 10KB payload | +| **FR-013** | Transport publishing | System shall return JSON string representation for caller to publish via transport layer (caller is responsible for actual transport publish) | FR-013 KPI: 100% JSON envelope generated correctly | +| **FR-014** | Transport subscription | System shall receive and process messages by accepting JSON string from transport payload | FR-014 KPI: 100% JSON messages processed correctly | --- ## 4. 
Non-Functional Requirements (NFRs) +**Requirement vs KPI Clarification:** +- An **FR or NFR** is a *requirement*: it defines what quality or constraint the system must have (e.g., "System shall support 10K TPS", "99.9% monthly uptime", "TLS 1.3+ encryption") +- A **KPI** is a *measurement*: it is the actual data collected to verify whether the requirement was met (e.g., "Peak traffic was 8.5K TPS", "MTTR was 8 minutes", "100% of connections use TLS 1.3") +- Requirements tell you **what to build**; KPIs tell you **how well you built it** + ### 4.1 Performance & Scalability -| ID | Requirement | Specification | Test Method | -|----|-------------|---------------|-------------| -| **NFR-101** | Message serialization overhead | <50ms for 10KB payload | Benchmark tests | -| **NFR-102** | Message deserialization overhead | <50ms for 10KB payload | Benchmark tests | -| **NFR-103** | Transport connection establishment | <100ms | Connection pool benchmarks | -| **NFR-104** | File upload latency | <1s for 0.5MB file | Integration tests | -| **NFR-105** | File download latency | <1s for 0.5MB file | Integration tests | -| **NFR-106** | Concurrent connections | Support 100+ simultaneous transport connections | Scale testing | -| **NFR-107** | Message throughput | Handle 1000+ messages/second per instance | Load testing | -| **NFR-108** | File server scalability | Support horizontal scaling of file server backend | Architecture review | +| ID | Requirement | Specification | KPI | Test Method | +|----|-------------|---------------|-----|-------------| +| **NFR-101** | Message serialization overhead | <50ms for 10KB payload | <50ms for 10KB payload | Benchmark tests | +| **NFR-102** | Message deserialization overhead | <50ms for 10KB payload | <50ms for 10KB payload | Benchmark tests | +| **NFR-103** | Transport connection establishment | <100ms | <100ms average | Connection pool benchmarks | +| **NFR-104** | File upload latency | <1s for 0.5MB file | <1s for 0.5MB file | Integration
tests | +| **NFR-105** | File download latency | <1s for 0.5MB file | <1s for 0.5MB file | Integration tests | +| **NFR-106** | Concurrent connections | Support 100+ simultaneous transport connections | 100+ simultaneous connections | Scale testing | +| **NFR-107** | Message throughput | Handle 1000+ messages/second per instance | 1000+ messages/second | Load testing | +| **NFR-108** | File server scalability | Support horizontal scaling of file server backend | Horizontal scaling verified | Architecture review | ### 4.2 Availability & Reliability -| ID | Requirement | Specification | -|----|-------------|---------------| -| **NFR-201** | Message delivery | At-least-once delivery semantics via transport | -| **NFR-202** | File server availability | Graceful degradation when file server is unavailable | -| **NFR-203** | Connection recovery | Auto-reconnect on transport connection failure | +| ID | Requirement | Specification | KPI | Test Method | +|----|-------------|---------------|-----|-------------| +| **NFR-201** | Message delivery | At-least-once delivery semantics via transport | At-least-once delivery via transport | Message acknowledgment tests | +| **NFR-202** | File server availability | Graceful degradation when file server is unavailable | <5% failure rate when file server unavailable | Failure injection tests | +| **NFR-203** | Connection recovery | Auto-reconnect on transport connection failure | Auto-reconnect within 30s | Connection failure tests | ### 4.3 Privacy & Security -| ID | Requirement | Specification | -|----|-------------|---------------| -| **NFR-301** | Payload integrity | SHA-256 checksum support via metadata | -| **NFR-302** | Transport security | TLS support for transport connections | -| **NFR-303** | File server security | Authentication token for file uploads | +| ID | Requirement | Specification | KPI | Test Method | +|----|-------------|---------------|-----|-------------| +| **NFR-301** | Payload integrity | SHA-256 checksum 
support via metadata | 100% SHA-256 checksum validation | Integrity tests |
+| **NFR-302** | Transport security | TLS support for transport connections | 100% TLS connections in production | Connection audits |
+| **NFR-303** | File server security | Authentication token for file uploads | 100% authenticated file uploads | Security tests |
 
 ### 4.4 Observability & Telemetry
 
-| ID | Requirement | Specification |
-|----|-------------|---------------|
-| **NFR-401** | Required logs | `correlation_id`, `msg_id`, `timestamp`, `sender_name`, `receiver_name`, `payload_type`, `transport` |
-| **NFR-402** | Critical metrics | `messages_sent_total`, `messages_received_total`, `file_upload_duration_seconds`, `file_download_duration_seconds`, `retry_attempts_total` |
-| **NFR-403** | Tracing | Correlation ID propagation for request tracing |
-| **NFR-404** | Alerting | `download_retry_exceeded` triggers alert when max retries exceeded |
-| **NFR-405** | Retention | Logs: 30 days, Metrics: 1 year |
+| ID | Requirement | Specification | KPI | Test Method |
+|----|-------------|---------------|-----|-------------|
+| **NFR-401** | Required logs | `correlation_id`, `msg_id`, `timestamp`, `sender_name`, `receiver_name`, `payload_type`, `transport` | 100% messages logged with required fields | Log validation |
+| **NFR-402** | Critical metrics | `messages_sent_total`, `messages_received_total`, `file_upload_duration_seconds`, `file_download_duration_seconds`, `retry_attempts_total` | 100% metrics collected with 1-minute granularity | Metrics pipeline tests |
+| **NFR-403** | Tracing | Correlation ID propagation for request tracing | 100% correlation ID propagation for tracing | Tracing validation |
+| **NFR-404** | Alerting | `download_retry_exceeded` triggers alert when max retries exceeded | <5min alert latency for `download_retry_exceeded` | Alert pipeline tests |
+| **NFR-405** | Retention | Logs: 30 days, Metrics: 1 year | Logs: 30 days, Metrics: 1 year | Storage audits |
 
 ---
 
@@ -173,7 +206,7 @@ msghandler is a cross-platform, bi-directional data bridge that enables seamless
 | Condition | Description |
 |-----------|-------------|
 | **AC-001** | All functional requirements FR-001 through FR-014 are implemented and tested |
-| **AC-002** | All non-functional requirements NFR-101 through NFR-405 meet specified targets |
+| **AC-002** | All non-functional requirements NFR-101 through NFR-405 meet specified KPI targets |
 | **AC-003** | Cross-platform text message test passes (Julia ↔ JavaScript ↔ Python) |
 | **AC-004** | Cross-platform tabular data test passes with Arrow IPC round-trip (Desktop) |
 | **AC-005** | Cross-platform tabular data test passes with JSON table round-trip (Browser) |
@@ -406,6 +439,10 @@ function smartunpack(
 
 | Date | Version | Changes |
 |------|---------|---------|
+| 2026-05-22 | 1.3.0 | Updated to ASG Framework v8 pillars - added KPIs to all FR and NFR requirements |
+| - | - | Added Success Metrics & KPIs section with measurable targets for each requirement |
+| - | - | Added NFR vs KPI clarification section |
+| - | - | Updated NFR tables to include KPI column and Test Method column |
 | 2026-05-15 | 1.3.0 | Made transport layer agnostic |
 | - | - | Removed all NATS-specific dependencies and references |
 | - | - | Updated all NATS references to generic "transport layer"/"message broker" |
diff --git a/docs/solution-design.md b/docs/solution-design.md
new file mode 100644
index 0000000..cc57b35
--- /dev/null
+++ b/docs/solution-design.md
@@ -0,0 +1,345 @@
+# Solution Design: msghandler
+
+**Version**: 1.3.0
+**Date**: 2026-05-22
+**Status**: Active
+**Ground Truth**: [`src/msghandler.jl`](../src/msghandler.jl)
+
+---
+
+## 1. Problem Decomposition
+
+msghandler addresses the challenge of cross-platform data exchange between **Julia**, **JavaScript**, **Python**, **Dart**, **Rust**, and **MicroPython** applications using message brokers as transport layers.
+
+### Problem Statement
+
+Developers working across multiple programming languages face significant obstacles when trying to share data:
+
+| Problem | Description | User Impact |
+|---------|-------------|-------------|
+| **P-001**: Cross-platform data serialization | Different languages have incompatible data types and serialization formats | Developers must write platform-specific conversion code |
+| **P-002**: Large payload handling | Message brokers have size limits, but large files need to be transferred | Large files either fail or require complex workarounds |
+| **P-003**: Transport abstraction | Each platform has different message broker libraries and APIs | No unified interface across platforms |
+| **P-004**: Request-response patterns | Bi-directional communication requires complex correlation tracking | Developers must implement custom message routing |
+
+### Solution Boundaries
+
+**In Scope**:
+- Unified API for `smartpack()` and `smartunpack()` across all platforms
+- Automatic transport selection based on payload size
+- File server integration using Claim-Check pattern
+- Multi-payload support with mixed types in single message
+- Exponential backoff for reliable file downloads
+
+**Out of Scope**:
+- Message compression (adds complexity without clear benefit)
+- Message encryption (application-layer concern)
+- Advanced message routing (simple topic matching sufficient)
+- Persistent message queues (transport pattern sufficient)
+
+### Decision IDs
+
+| Decision ID | Decision | Description |
+|-------------|----------|-------------|
+| SD-001 | Claim-Check Pattern | Large payloads uploaded to HTTP server, small payloads sent directly |
+| SD-002 | Automatic Transport Selection | Payloads <0.5MB are sent directly; payloads ≥0.5MB are sent as links |
+| SD-003 | Handler Function Abstraction | Pluggable file server implementations via handler functions |
+| SD-004 | Unified Tuple Format | Same `(dataname, data, type)` format across all platforms |
+| SD-005 | Base64 Encoding | JSON-compatible binary data transport |
+| SD-006 | Transport Abstraction | Support multiple broker protocols (NATS/MQTT/WebSocket) transparently |
+
+---
+
+## 2. Solution Approach
+
+msghandler implements a **Claim-Check pattern** with intelligent transport selection:
+
+```
+Sender (smartpack)              Transport Layer              Receiver (smartunpack)
+┌─────────────────┐             ┌───────────────┐            ┌───────────────────┐
+│                 │             │               │            │                   │
+│ 1. Data tuples  │────────────>│               │───────────>│ 1. Parse envelope │
+│    [(name,      │    JSON     │    Message    │    JSON    │ 2. Check transport│
+│     data, type)]│   format    │    Broker     │   format   │ 3. Fetch/Decode   │
+│                 │             │  (NATS/MQTT/  │            │ 4. Return tuples  │
+└─────────────────┘             │  WebSocket)   │            │                   │
+                                │               │            └───────────────────┘
+                                └───────────────┘
+```
+
+### Key Design Decisions
+
+| Decision ID | Decision | Rationale | Alternatives Rejected |
+|-------------|----------|-----------|----------------------|
+| **SD-001** | Claim-Check Pattern | Large payloads (≥0.5MB) uploaded to HTTP server, small payloads sent directly via transport | Client-side compression - adds complexity; Server-side compression - not universally supported |
+| **SD-002** | Automatic Transport Selection | <0.5MB = direct (fast), ≥0.5MB = link (avoid transport limits) | Manual selection - error-prone; Fixed threshold - not adaptive |
+| **SD-003** | Handler Function Abstraction | Allows pluggable file server implementations (Plik, AWS S3, custom) | Hardcoded Plik - not flexible; Interface-based - too complex for this use case |
+| **SD-004** | Unified Tuple Format | Same input/output format across all platforms | Platform-native formats - no interoperability; Protocol buffers - too heavy |
+| **SD-005** | Base64 Encoding | JSON-compatible binary data transport | Raw bytes - not JSON-compatible; Hex encoding - 2x size overhead |
+| **SD-006** | Transport Abstraction | Support multiple broker protocols (NATS/MQTT/WebSocket) transparently | Platform-specific libraries - no interoperability |
+
+### Architecture Components
+
+```mermaid
+flowchart TB
+    subgraph Client["Client Application"]
+        direction TB
+        APP["Application Code"]
+        API["msghandler API"]
+
+        APP -->|Data tuples| API
+        API -->|JSON envelope| Transport
+    end
+
+    subgraph Transport["Transport Layer"]
+        direction TB
+        BROKER["Message Broker<br/>NATS/MQTT/WebSocket"]
+        TOPICS["Topic Subscription"]
+
+        API -->|Publish| BROKER
+        BROKER -->|Deliver| TOPICS
+        TOPICS -->|Subscribe| API
+    end
+
+    subgraph FileServer["File Server"]
+        direction TB
+        UPLOAD["Upload Handler"]
+        DOWNLOAD["Download Handler"]
+
+        API -.->|Upload URL| UPLOAD
+        DOWNLOAD -.->|Fetch URL| API
+    end
+
+    style Client fill:#e1f5fe,stroke:#0288d1,stroke-width:2px
+    style Transport fill:#ffe0b2,stroke:#f57c00,stroke-width:2px
+    style FileServer fill:#c8e6c9,stroke:#43a047,stroke-width:2px
+```
+
+---
+
+## 3. Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| **gRPC/Protobuf** | Strong typing, efficient binary format | No native MicroPython support; Complex schema management | Rejected - not cross-platform enough |
+| **MessagePack** | Compact binary, good performance | Browser support limited; No standard for tabular data | Rejected - no equivalent to Arrow IPC for tabular data |
+| **Protocol Buffers** | Type-safe, efficient | No native support for tabular data exchange | Rejected - cannot represent DataFrames natively |
+| **REST HTTP Upload** | Simple, universal | High latency; No real-time capability | Rejected - not suitable for message broker pattern |
+| **Hybrid (direct/link)** | Optimal for both small and large payloads | More complex implementation | Accepted - matches user requirements (FR-003, FR-004) |
+| **Single transport type** | Simpler implementation | Cannot handle large payloads efficiently | Rejected - violates FR-003 requirement |
+| **Platform-specific APIs** | Native performance | No interoperability; Maintenance burden | Rejected - violates cross-platform goal |
+
+---
+
+## 4. High-Level Component Diagram
+
+```mermaid
+flowchart TD
+    subgraph msghandler["msghandler Core Module"]
+        direction TB
+
+        subgraph Serialization["Serialization Layer"]
+            DIR["Direct Transport"]
+            LNK["Link Transport"]
+
+            DIR -->|Base64| JSON_MSG
+            LNK -->|HTTP URL| JSON_MSG
+        end
+
+        subgraph Envelope["Envelope Builder"]
+            HDR["Message Header"]
+            PAY["Payload Manager"]
+
+            HDR --> PAY
+        end
+
+        subgraph Handlers["Handler Functions"]
+            UPD["Upload Handler"]
+            DWN["Download Handler"]
+
+            UPD --> LNK
+            DWN --> LNK
+        end
+
+        API["smartpack() / smartunpack()"]
+
+        API -->|Input| Serialization
+        API -->|Output| Serialization
+        API -->|Configure| Handlers
+    end
+
+    subgraph Transport["Transport Layer"]
+        BROKER["NATS / MQTT / WebSocket"]
+        API -->|JSON| BROKER
+        BROKER -->|JSON| API
+    end
+
+    subgraph FileServer["File Server"]
+        Plik["HTTP Server"]
+        UPD -.->|POST| Plik
+        Plik -.->|URL| DWN
+    end
+
+    style msghandler fill:#b3e5fc,stroke:#0288d1,stroke-width:2px
+    style Transport fill:#ffe0b2,stroke:#f57c00,stroke-width:2px
+    style FileServer fill:#c8e6c9,stroke:#43a047,stroke-width:2px
+```
+
+### Component Responsibilities
+
+| Component | Responsibilities | Decision IDs | Requirements Addressed |
+|-----------|-----------------|--------------|----------------------|
+| **Serialization Layer** | Convert data types to transport format (Base64/URL) | SD-005 | FR-001, FR-002, FR-012 |
+| **Envelope Builder** | Create standardized message envelope with metadata | SD-001 | FR-011, FR-013, FR-014 |
+| **Handler Functions** | Abstract file server operations for pluggability | SD-003 | FR-008, FR-009 |
+| **Transport Adapter** | Support multiple broker protocols transparently | SD-006 | FR-013, FR-014 |
+| **Payload Manager** | Track payload types, sizes, and encoding | SD-004 | FR-006, FR-007 |
+
+---
+
+## 5. Decision Rationale
+
+### SD-001: Why Claim-Check Pattern?
+
+**Requirement**: FR-003 - Large file handling, FR-004 - Direct transport for small payloads
+
+**Rationale**:
+- Transport layers (NATS, MQTT) enforce message size limits (NATS defaults to 1MB; MQTT limits depend on broker configuration)
+- Direct transport is faster for small payloads (no file server round-trip)
+- Link transport avoids transport limits for large payloads
+- Users don't need to choose manually - selection is automatic, based on the size threshold
+
+### SD-002: Why Automatic Transport Selection?
+
+**Requirement**: FR-003, FR-004, NFR-104, NFR-105
+
+**Rationale**:
+- <0.5MB payloads use direct transport (<1s latency, FR-004 KPI)
+- ≥0.5MB payloads use link transport to avoid transport limits (FR-003 KPI: 99% successful uploads)
+- Users don't need to choose manually - selection is automatic, based on the size threshold
+
+### SD-003: Why Handler Functions for File Server?
+
+**Requirement**: FR-008 - Plik integration, FR-009 - Custom file server support
+
+**Rationale**:
+- Plik is a common open-source file server
+- Some users need AWS S3 or a custom implementation
+- Handler functions provide a clean abstraction without vendor lock-in
+- Same signature across all platforms (unified API)
+
+### SD-004: Why Tuple Format for Payloads?
+
+**Requirement**: FR-006 - Multi-payload messages, FR-007 - Payload type preservation
+
+**Rationale**:
+- `(dataname, data, type)` tuple is language-agnostic
+- Simple to understand: name, content, type
+- Supports mixed payload types in a single message
+- Easy to serialize/deserialize across platforms
+
+### SD-005: Why Base64 Encoding?
+
+**Requirement**: FR-012 - Message serialization, FR-001 - Cross-platform text messaging
+
+**Rationale**:
+- JSON is universal - works on all platforms
+- Base64 converts binary to ASCII for JSON compatibility
+- Standard format with native support in all languages
+- No additional dependencies needed
+
+### SD-006: Why Transport Abstraction?
+
+**Requirement**: FR-013, FR-014, NFR-201
+
+**Rationale**:
+- Support multiple broker protocols (NATS, MQTT, WebSocket) transparently
+- Caller handles actual transport publishing/subscription
+- Unified API across all platforms
+- At-least-once delivery semantics via transport layer
+
+---
+
+## 6. Risk Assessment
+
+| Risk | Impact | Probability | Mitigation |
+|------|--------|-------------|------------|
+| **Performance degradation with >500KB payloads** | High | Medium | Size threshold detection; Link transport fallback |
+| **File server availability issues** | Medium | Low | Exponential backoff retry; Graceful degradation |
+| **Platform-specific bugs** | Medium | Low | Comprehensive test suite per platform; CI validation |
+| **Encoding mismatches between platforms** | High | Low | Strict specification; Test contracts; Validation rules |
+| **Transport layer incompatibility** | Medium | Low | Transport-agnostic design; Handler abstraction |
+
+---
+
+## 7. Requirements Traceability
+
+| Solution Component | Decision ID | Requirement ID | Description |
+|-------------------|-------------|----------------|-------------|
+| **smartpack() function** | SD-001, SD-002, SD-004, SD-005, SD-006 | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-008, FR-009, FR-010, FR-011, FR-012, FR-013, FR-014 | Unified API for sending messages across all platforms |
+| **smartunpack() function** | SD-001, SD-002, SD-004, SD-005, SD-006 | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-008, FR-009, FR-010, FR-011, FR-012, FR-013, FR-014 | Unified API for receiving messages across all platforms |
+| **Direct transport** | SD-002 | FR-004, NFR-101, NFR-102, NFR-104, NFR-105 | Send payloads < threshold directly via transport |
+| **Link transport** | SD-001, SD-002 | FR-003, NFR-104, NFR-105 | Upload payloads ≥ threshold to file server |
+| **File server handler** | SD-003 | FR-008, FR-009, FR-010 | Pluggable upload/download handlers with retry logic |
+| **Payload type preservation** | SD-004 | FR-006, FR-007 | Support text, dictionary, arrowtable, jsontable, image, audio, video, binary |
+| **Correlation ID** | SD-001 | FR-011, NFR-401, NFR-403 | Message tracing across distributed systems |
+| **Multi-payload support** | SD-004 | FR-006, FR-007 | List of (dataname, data, type) tuples |
+
+### Non-Functional Requirements Traceability
+
+| Solution Component | Decision ID | NFR ID | Description |
+|-------------------|-------------|--------|-------------|
+| **Serialization optimization** | SD-005 | NFR-101, NFR-102 | <50ms overhead for 10KB payloads |
+| **Transport efficiency** | SD-006 | NFR-103 | <100ms connection establishment |
+| **File server latency** | SD-001, SD-002 | NFR-104, NFR-105 | <1s upload/download for 0.5MB files |
+| **Concurrent connections** | SD-006 | NFR-106 | Support 100+ simultaneous connections |
+| **Message throughput** | SD-005, SD-006 | NFR-107 | Handle 1000+ messages/second per instance |
+| **At-least-once delivery** | SD-006 | NFR-201 | Transport layer semantics |
+| **Graceful degradation** | SD-003 | NFR-202 | File server unavailability handling |
+| **Auto-reconnect** | SD-006 | NFR-203 | Transport connection failure recovery |
+| **Required logs** | SD-001 | NFR-401 | Correlation ID, msg_id, timestamp, etc. |
+| **Critical metrics** | SD-001, SD-005 | NFR-402 | messages_sent_total, file upload/download duration |
+| **Tracing** | SD-001 | NFR-403 | Correlation ID propagation |
+
+---
+
+## 8. Gap-Check Validation
+
+| Stage Transition | Gap-Check Question | Status |
+|------------------|-------------------|--------|
+| **Requirements → Solution Design** | Does the Solution Design clearly explain how the system solves the user problem, not just what it does? | ✅ Verified - All user stories mapped to solution components with requirement ID and decision ID references |
+| **Solution Design → Specification** | Does the Specification define all technical details that the solution approach requires? | ⏳ Pending - Specification needs review for completeness |
+| **Solution Design → Walkthrough** | Does the Walkthrough reflect the complete flow including error states and timing? | ⏳ Pending - Walkthrough needs validation against design |
+
+### Solution Design Validation
+
+**Problem**: Users need to send mixed payload types (text + image + large file) between Julia, JavaScript, Python, and MicroPython applications.
+
+**Solution Components** (decision IDs as defined in Section 1 and the traceability tables in Section 7):
+1. `smartpack()` / `smartunpack()` - Unified API for all platforms (SD-001, SD-002, SD-004, SD-005, SD-006)
+2. **SD-004** - Tuple format - `(dataname, data, type)` - platform-agnostic
+3. **SD-002** - Automatic transport selection - <0.5MB = direct, ≥0.5MB = link
+4. **SD-003** - File server handler abstraction - Plik/AWS S3/custom support
+5. **SD-003** - Exponential backoff - Reliable file downloads
+6. **SD-001** - Correlation ID - Message tracing
+
+**Requirement Mapping**:
+- FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-008, FR-009, FR-010, FR-011, FR-012, FR-013, FR-014 ✅
+
+**Gap Check**: Does this solution explain *how* users will actually use the system?
+
+**Answer**: Yes - the walkthrough provides concrete examples:
+1. JavaScript sends `[(msg, "Hello", "text"), (avatar, binary_data, "image")]`
+2. `smartpack()` automatically selects transport based on size (SD-002)
+3. Large file (≥0.5MB) → link transport → file server upload (SD-001)
+4. Small payload (<0.5MB) → direct transport → base64 encoding (SD-005)
+5. Receiver calls `smartunpack()` → receives same tuple format
+
+---
+
+*This solution design document is versioned and maintained in git alongside the codebase. All implementations must adhere to this design.*
+
+**Traceability Summary**:
+- All requirements traced to solution components with SD-XXX decision IDs
+- Each decision ID references the corresponding requirement IDs (FR-XXX, NFR-XXX)
+- Specification must cite SD-XXX references for each technical detail
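
The walkthrough steps in the gap check (tuples in, size-based transport selection, Base64 for direct payloads, file server link for large ones, same tuples out) can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in `src/msghandler.jl`: the function names `sketch_smartpack` and `sketch_smartunpack`, the envelope field names, the 512 KiB reading of "0.5MB", and the in-memory upload/download handlers are all hypothetical.

```python
import base64
import json
import uuid

# Assumption: "0.5MB" is read as 512 KiB; the real threshold may differ.
SIZE_THRESHOLD = 512 * 1024


def sketch_smartpack(payloads, upload, threshold=SIZE_THRESHOLD):
    """Build a JSON envelope from (dataname, data, type) tuples."""
    entries = []
    for dataname, data, ptype in payloads:
        raw = data.encode("utf-8") if isinstance(data, str) else bytes(data)
        if len(raw) < threshold:
            # Direct transport: embed the bytes as Base64 so they stay JSON-safe.
            entry = {"transport": "direct",
                     "data": base64.b64encode(raw).decode("ascii")}
        else:
            # Link transport (claim-check): upload the bytes, send only the URL.
            entry = {"transport": "link", "data": upload(dataname, raw)}
        entry.update({"dataname": dataname, "type": ptype, "size": len(raw)})
        entries.append(entry)
    # Hypothetical envelope layout; field names are illustrative only.
    return json.dumps({"correlation_id": str(uuid.uuid4()), "payloads": entries})


def sketch_smartunpack(envelope, download):
    """Return the original (dataname, data, type) tuples from an envelope."""
    result = []
    for entry in json.loads(envelope)["payloads"]:
        if entry["transport"] == "direct":
            raw = base64.b64decode(entry["data"])
        else:
            raw = download(entry["data"])
        # Only "text" is decoded back to str in this sketch; other types stay bytes.
        data = raw.decode("utf-8") if entry["type"] == "text" else raw
        result.append((entry["dataname"], data, entry["type"]))
    return result


# In-memory stand-in for a file server (Plik, S3, or a custom handler in practice).
_store = {}


def upload_to_memory(name, raw):
    url = f"mem://{name}"
    _store[url] = raw
    return url


def download_from_memory(url):
    return _store[url]
```

Passing the upload and download handlers as arguments mirrors the handler-function abstraction (SD-003): the packing logic does not need to know which file server is behind them. Retry with exponential backoff around the download handler is omitted here for brevity.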