v1.0.0 #1

Merged
ton merged 14 commits from v1.0.0 into main 2026-05-22 22:16:17 +00:00
4 changed files with 851 additions and 995 deletions
Showing only changes of commit 312d14b28f


@@ -1,942 +0,0 @@
# Architecture Documentation: msghandler
**Version**: 1.5.0
**Date**: 2026-05-15
**Status**: Active
**Ground Truth**: [`src/msghandler.jl`](../src/msghandler.jl)
**Architecture Level**: C4 Container Level
---
## 1. Executive Summary
This document defines the **blueprint** for msghandler, a cross-platform, bidirectional data bridge that lets **Julia**, **JavaScript**, **Python**, **Dart**, **Rust**, and **MicroPython** applications exchange data using a message broker as the transport layer.

This architecture document serves as the single source of truth for:
- **System Structure**: How components fit together and interact
- **Scaling Considerations**: How the system scales horizontally and vertically
- **Failure Modes**: How the system handles failures and recovers
- **Trade-off Decisions**: The rationale behind architectural decisions
### 1.1 Specification Traceability
| Architecture Section | Specification Reference | UI Specification Reference | Requirement ID(s) |
|---------------------|-------------------------|---------------------------|-------------------|
| Section 2 (Context Diagram) | specification.md:2 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 |
| Section 3 (Container Diagram) | specification.md:2, specification.md:3, specification.md:11 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 |
| Section 4 (Component Diagram) | specification.md:2, specification.md:3, specification.md:5, specification.md:11 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 |
| Section 5 (High-Level) | specification.md:2, specification.md:3, specification.md:5, specification.md:11 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 |
| Section 6 (Message Envelope) | specification.md:2, specification.md:3, specification.md:8 | - | FR-011, FR-012, FR-013, FR-014, NFR-401, NFR-403 |
| Section 7 (Payload Type) | specification.md:3, specification.md:5, specification.md:6 | - | FR-001, FR-002, FR-003, FR-006, FR-012, NFR-101, NFR-102 |
| Section 8 (Transport Strategy) | specification.md:6, specification.md:7 | - | FR-003, FR-004, FR-005, FR-010, NFR-104, NFR-105, NFR-106 |
| Section 9 (Platform-Specific) | specification.md:13, specification.md:14 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 |
| Section 10 (Scaling) | specification.md:7, specification.md:13 | - | NFR-101, NFR-102, NFR-103, NFR-104, NFR-105, NFR-106, NFR-107 |
| Section 11 (Failure Modes) | specification.md:9, specification.md:11 | - | FR-008, FR-009, FR-010, FR-011, NFR-201, NFR-202, NFR-203 |
| Section 12 (Trade-offs) | specification.md:2, specification.md:3, specification.md:6, specification.md:7 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-008, FR-009, FR-010, FR-011, FR-012, FR-013, FR-014 |
| Section 13 (Deployment) | specification.md:12, specification.md:18 | - | FR-013, FR-014, NFR-201, NFR-203 |
| Section 14 (Security) | specification.md:4, specification.md:9, specification.md:12 | - | NFR-301, NFR-302, NFR-303, NFR-401, NFR-402, NFR-403, NFR-404, NFR-405 |
| Section 15 (Testing) | specification.md:17 | - | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-012, FR-013, FR-014 |
---
## 2. Architecture Overview
### C4 Context Diagram
```mermaid
flowchart TD
subgraph "External Systems"
Message_Broker[Message Broker<br/>NATS/MQTT/WebSocket/Custom]
File_Server[HTTP File Server<br/>Plik/AWS S3/Custom]
end
subgraph "Client Applications"
Julia_App[Julia Application]
JS_App[JavaScript Application<br/>Node.js/Browser]
Python_App[Python Application<br/>Desktop]
Dart_App[Dart Application<br/>Desktop/Flutter/Web]
Rust_App[Rust Application<br/>Server/Desktop]
MicroPython_App[MicroPython Device]
end
Julia_App -->|Transport| Message_Broker
JS_App -->|Transport| Message_Broker
Python_App -->|Transport| Message_Broker
Dart_App -->|Transport| Message_Broker
Rust_App -->|Transport| Message_Broker
MicroPython_App -->|Transport| Message_Broker
Julia_App -->|HTTP| File_Server
JS_App -->|HTTP| File_Server
Python_App -->|HTTP| File_Server
Dart_App -->|HTTP| File_Server
Rust_App -->|HTTP| File_Server
MicroPython_App -->|HTTP| File_Server
style Message_Broker fill:#fff3e0,stroke:#f57c00
style File_Server fill:#f3e5f5,stroke:#9c27b4
style Julia_App fill:#e8f5e9,stroke:#4caf50
style JS_App fill:#e3f2fd,stroke:#2196f3
style Python_App fill:#e3f2fd,stroke:#2196f3
style Dart_App fill:#fff0f6,stroke:#e91e63
style Rust_App fill:#dea584,stroke:#e65100
style MicroPython_App fill:#fce4ec,stroke:#e91e63
```
### C4 Container Diagram
```mermaid
flowchart TD
subgraph "Client Container"
Julia_Module[Julia msghandler Module]
JS_Module[JavaScript msghandler Module]
Python_Module[Python msghandler Module]
Dart_Module[Dart msghandler Module]
Rust_Module[Rust msghandler Module]
MicroPython_Module[MicroPython msghandler Module]
end
Julia_Module --> Transport_Client
JS_Module --> Transport_Client
Python_Module --> Transport_Client
Dart_Module --> Transport_Client
Rust_Module --> Transport_Client
MicroPython_Module --> Transport_Client
Transport_Client --> Message_Broker
Julia_Module --> File_Client
JS_Module --> File_Client
Python_Module --> File_Client
Dart_Module --> File_Client
Rust_Module --> File_Client
MicroPython_Module --> File_Client
File_Client --> File_Server
style Julia_Module fill:#e8f5e9,stroke:#4caf50
style JS_Module fill:#e3f2fd,stroke:#2196f3
style Python_Module fill:#e3f2fd,stroke:#2196f3
style Dart_Module fill:#fff0f6,stroke:#e91e63
style Rust_Module fill:#dea584,stroke:#e65100
style MicroPython_Module fill:#fce4ec,stroke:#e91e63
style Message_Broker fill:#fff3e0,stroke:#f57c00
style File_Server fill:#f3e5f5,stroke:#9c27b4
```
### C4 Component Diagram (Julia Implementation)
```mermaid
flowchart TD
subgraph "msghandler Module"
smartpack[smartpack Function]
smartunpack[smartunpack Function]
Serialize[_serialize_data]
Deserialize[_deserialize_data]
EnvelopeToJson[envelope_to_json]
FileServerUpload[fileserver_upload_handler]
FileServerDownload[fileserver_download_handler]
LogTrace[log_trace]
end
subgraph "Data Models"
Payload[msg_payload_v1 Struct]
Envelope[msg_envelope_v1 Struct]
end
smartpack --> Serialize
smartpack --> EnvelopeToJson
smartpack --> FileServerUpload
smartunpack --> Deserialize
smartunpack --> FileServerDownload
EnvelopeToJson --> Envelope
Serialize --> Payload
style smartpack fill:#d1fae5,stroke:#10b981
style smartunpack fill:#d1fae5,stroke:#10b981
style FileServerUpload fill:#fef3c7,stroke:#f59e0b
style FileServerDownload fill:#fef3c7,stroke:#f59e0b
```
---
## 5. High-Level Architecture
### System Components
| Component | Purpose | Platform Support |
|-----------|---------|------------------|
| **smartpack** | Send data with automatic transport selection, returns (envelope, json_string) for caller to publish via transport | All |
| **smartunpack** | Receive and process messages from JSON string | All |
| **_serialize_data** | Serialize data according to payload type | All |
| **_deserialize_data** | Deserialize bytes to native data types | All |
| **envelope_to_json** | Convert msg_envelope_v1 struct to JSON string | All |
| **log_trace** | Log trace messages with correlation ID | All |
| **fileserver_upload_handler** | Upload large payloads to HTTP server | Desktop (Julia/JS/Python/Dart/Rust) |
| **fileserver_download_handler** | Download payloads from HTTP server with exponential backoff | Desktop (Julia/JS/Python/Dart/Rust) |
| **plik_upload_file** | Upload a local file to Plik server from disk | Rust |
### Data Flow
```mermaid
flowchart TD
A[User calls smartpack subject data] --> B[Process each payload]
B --> C{Calculate serialized size}
C -->|Size < Threshold| D[Direct Transport]
C -->|Size >= Threshold| E[Link Transport]
D --> F[Serialize data]
F --> G[Base64 encode]
G --> H[Build payload object]
E --> I[Serialize data]
I --> J[Upload to file server]
J --> K[Get download URL]
K --> H
H --> L[Build envelope]
L --> M[Convert to JSON]
M --> N[Return envelope + JSON to caller]
style A fill:#f9f9f9,stroke:#333
style N fill:#e0e7ff,stroke:#3b82f6
style D fill:#d1fae5,stroke:#10b981
style E fill:#fef3c7,stroke:#f59e0b
```
---
## 6. Message Envelope Architecture
### msg_envelope_v1 Structure (Julia)
```julia
struct msg_envelope_v1
    correlation_id::String            # UUID v4 for distributed tracing
    msg_id::String                    # UUID v4 for this message
    timestamp::String                 # ISO 8601 UTC timestamp
    send_to::String                   # Topic/subject to publish to
    msg_purpose::String               # ACK, NACK, updateStatus, shutdown, chat
    sender_name::String               # Sender application name
    sender_id::String                 # UUID v4 of sender
    receiver_name::String             # Receiver application name (empty = broadcast)
    receiver_id::String               # UUID v4 of receiver (empty = broadcast)
    reply_to::String                  # Topic for reply messages
    reply_to_msg_id::String           # Message ID being replied to
    broker_url::String                # Broker URL for the transport layer
    metadata::Dict{String, Any}       # Message-level metadata
    payloads::Vector{msg_payload_v1}  # List of payloads
end
```
### msg_payload_v1 Structure (Julia)
```julia
struct msg_payload_v1
    id::String                   # UUID v4 for this payload
    dataname::String             # Name of the payload
    payload_type::String         # text, dictionary, arrowtable, etc.
    transport::String            # direct or link
    encoding::String             # none, json, base64, arrow-ipc
    size::Integer                # Size in bytes
    data::Any                    # Base64 string or URL
    metadata::Dict{String, Any}  # Payload-level metadata
end
```
### JSON Schema (Cross-Platform)
```json
{
  "correlation_id": "string (UUID v4)",
  "msg_id": "string (UUID v4)",
  "timestamp": "string (ISO 8601 UTC)",
  "send_to": "string",
  "msg_purpose": "string",
  "sender_name": "string",
  "sender_id": "string (UUID v4)",
  "receiver_name": "string",
  "receiver_id": "string (UUID v4)",
  "reply_to": "string",
  "reply_to_msg_id": "string",
  "broker_url": "string",
  "metadata": "object",
  "payloads": [
    {
      "id": "string (UUID v4)",
      "dataname": "string",
      "payload_type": "string",
      "transport": "string",
      "encoding": "string",
      "size": "integer",
      "data": "string or URL",
      "metadata": "object"
    }
  ]
}
```
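A minimal Python sketch of how a sender might assemble a document with the field names of the schema above. The helper name `build_envelope`, the example subject, and the sender name are illustrative assumptions and not part of the msghandler API; the broker URL is the default listed under Environment Variables.

```python
import json
import uuid
from datetime import datetime, timezone


def build_envelope(send_to, payloads, msg_purpose="chat", sender_name="example-sender"):
    """Assemble a dict whose keys follow the JSON schema above (hypothetical helper)."""
    return {
        "correlation_id": str(uuid.uuid4()),
        "msg_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
        "send_to": send_to,
        "msg_purpose": msg_purpose,
        "sender_name": sender_name,
        "sender_id": str(uuid.uuid4()),
        "receiver_name": "",  # empty = broadcast
        "receiver_id": "",    # empty = broadcast
        "reply_to": "",
        "reply_to_msg_id": "",
        "broker_url": "ws://localhost:4222",
        "metadata": {},
        "payloads": payloads,
    }


envelope = build_envelope("/example/subject", [])
json_string = json.dumps(envelope)  # the string a caller would publish
```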
---
## 7. Payload Type Architecture
### Supported Payload Types
| Type | Description | Serialization | Encoding | Platforms |
|------|-------------|---------------|----------|-----------|
| `text` | Plain text string | UTF-8 bytes | Base64 | All |
| `dictionary` | JSON object | JSON string | Base64/json | All |
| `arrowtable` | Apache Arrow IPC | Arrow IPC stream | Base64/arrow-ipc | Desktop (Julia/Python/Node.js/Dart/Rust) |
| `jsontable` | JSON array of objects | JSON string | Base64/json | All (including Browser/Dart Web) |
| `image` | Binary image data | Raw bytes | Base64 | All |
| `audio` | Binary audio data | Raw bytes | Base64 | All |
| `video` | Binary video data | Raw bytes | Base64 | All |
| `binary` | Generic binary data | Raw bytes | Base64 | All |
### Serialization Logic
```mermaid
flowchart TD
A[Input data + payload_type] --> B{Payload Type}
B -->|"text"| C[UTF-8 encode]
B -->|"dictionary"| D[JSON serialize]
B -->|"arrowtable"| E[Arrow IPC serialize]
B -->|"jsontable"| F[JSON serialize]
B -->|"image"| G[Raw bytes]
B -->|"audio"| H[Raw bytes]
B -->|"video"| I[Raw bytes]
B -->|"binary"| J[Raw bytes]
C --> K[Return bytes]
D --> K
E --> K
F --> K
G --> K
H --> K
I --> K
J --> K
style A fill:#f9f9f9,stroke:#333
style K fill:#e0e7ff,stroke:#3b82f6
```
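The branching in the diagram above could be sketched in Python roughly as follows. The function name `serialize_data` mirrors `_serialize_data` but is an illustrative stand-in, not the actual implementation; the `arrowtable` branch is left out because it needs an Arrow library.

```python
import json


def serialize_data(data, payload_type):
    """Return the serialized bytes for a payload, before any Base64 step."""
    if payload_type == "text":
        return data.encode("utf-8")
    if payload_type in ("dictionary", "jsontable"):
        return json.dumps(data).encode("utf-8")
    if payload_type in ("image", "audio", "video", "binary"):
        return bytes(data)  # raw bytes pass through unchanged
    # "arrowtable" would need an Arrow library and is omitted from this sketch.
    raise ValueError(f"unsupported payload_type: {payload_type}")
```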
---
## 8. Transport Strategy Architecture
### Size Threshold Decision Logic
| Platform | Size Threshold | Notes |
|----------|----------------|-------|
| Desktop (Julia/JS/Python/Dart) | 500,000 bytes (0.5MB) | Default threshold |
| Dart Desktop | 500,000 bytes (0.5MB) | Default threshold |
| Dart Flutter | 500,000 bytes (0.5MB) | Default threshold |
| Dart Web | 500,000 bytes (0.5MB) | Default threshold |
| MicroPython | 100,000 bytes (100KB) | Lower threshold for memory constraints |
### Transport Selection Flow
```mermaid
flowchart TD
A[smartpack called] --> B[Serialize payload]
B --> C[Calculate size]
C --> D{Size < Threshold?}
D -->|Yes| E[Direct Transport]
D -->|No| F[Link Transport]
E --> G[Base64 encode]
G --> H[Build payload with direct transport]
F --> I[Upload to file server]
I --> J[Get download URL]
J --> K[Build payload with link transport]
H --> L[Build envelope]
K --> L
style A fill:#f9f9f9,stroke:#333
style L fill:#e0e7ff,stroke:#3b82f6
style E fill:#d1fae5,stroke:#10b981
style F fill:#fef3c7,stroke:#f59e0b
```
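The selection flow above can be sketched in Python as follows. `build_payload` and the `upload` callable are hypothetical names used only for illustration; the threshold is the desktop default from the table above.

```python
import base64

SIZE_THRESHOLD = 500_000  # desktop default, in bytes


def build_payload(serialized, upload, threshold=SIZE_THRESHOLD):
    """Choose direct or link transport from the serialized size."""
    size = len(serialized)
    if size < threshold:
        # Direct transport: the data field carries the Base64 text itself.
        return {
            "transport": "direct",
            "size": size,
            "data": base64.b64encode(serialized).decode("ascii"),
        }
    # Link transport: `upload` is a caller-supplied callable that stores
    # the bytes on a file server and returns the download URL.
    return {"transport": "link", "size": size, "data": upload(serialized)}
```

A payload of exactly `threshold` bytes takes the link path, matching the `Size >= Threshold` branch in the data flow diagram.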
### Direct Transport Protocol
When `transport = "direct"`, the `data` field contains a Base64-encoded string of the serialized payload.
**Encoding Rules**:
- `text`: UTF-8 → Base64
- `dictionary`: JSON → Base64 (or direct JSON)
- `arrowtable`: Arrow IPC → Base64 (or arrow-ipc)
- `jsontable`: JSON → Base64 (or direct JSON)
- `image`/`audio`/`video`/`binary`: Raw bytes → Base64
### Link Transport Protocol
When `transport = "link"`, the `data` field contains a URL pointing to the uploaded payload.
**Upload Flow**:
1. Serialize payload according to `payload_type`
2. Upload to HTTP file server (e.g., Plik)
3. Include returned URL in `data` field
**Download Flow**:
1. Extract URL from payload
2. Fetch with exponential backoff (max 5 retries)
3. Deserialize based on `payload_type`
---
## 9. Platform-Specific Architecture
### Julia Architecture
Julia leverages multiple dispatch for type-specific implementations:
- **Multiple Dispatch**: Function overloading based on argument types
- **Struct-based Data Models**: Explicit type definitions with `struct`
- **Native Arrow IPC**: Support via `Arrow.jl`
- **Async/Await**: Tasks for non-blocking I/O
```julia
using DataFrames  # provides the DataFrame type used in the third method

# Multiple dispatch for serialization
function _serialize_data(data::String, payload_type::String)
    # Text serialization
end

function _serialize_data(data::Dict, payload_type::String)
    # Dictionary serialization
end

function _serialize_data(data::DataFrame, payload_type::String)
    # Arrow table serialization
end
```
### JavaScript Architecture
JavaScript uses async/await for non-blocking I/O:
- **Module-level Utilities**: Serialization functions
- **Native ArrayBuffer**: Binary data handling (Browser) / Buffer (Node.js)
- **Fetch API**: HTTP file server communication
#### Node.js Implementation (msghandler_ssr.js)
- **Transport connections**: Uses broker URLs (e.g., `nats://`, `mqtt://`, `ws://`)
- **Apache Arrow IPC**: Full support via `apache-arrow`
- **Buffer for binary data**: Native Node.js Buffer handling
#### Browser Implementation (msghandler_csr.js)
- **WebSocket connections**: Uses `ws://` or `wss://` URLs (transport-agnostic)
- **No Apache Arrow**: Uses `jsontable` for tabular data only
- **Uint8Array for binary data**: Browser-compatible binary handling
- **Web Crypto API**: UUID generation via `crypto.getRandomValues()`
### Python Architecture
Python uses classes for stateful operations:
- **Class-based msghandler**: Encapsulated API
- **Dataclasses**: Structured data (MsgPayloadV1, MsgEnvelopeV1)
- **Async/await**: I/O operations
- **pyarrow**: Arrow IPC support
```python
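class msghandler:
    DEFAULT_SIZE_THRESHOLD = 500_000
    # Defaults taken from the Environment Variables table in this document
    DEFAULT_BROKER_URL = "ws://localhost:4222"
    DEFAULT_FILESERVER_URL = "http://localhost:8080"

    def __init__(self, broker_url=None, fileserver_url=None):
        self.broker_url = broker_url or self.DEFAULT_BROKER_URL
        self.fileserver_url = fileserver_url or self.DEFAULT_FILESERVER_URL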
```
### Dart Architecture
Dart uses classes for stateful operations with async/await:
- **Class-based msghandler**: Encapsulated API
- **Data classes**: Structured data (MsgPayloadV1, MsgEnvelopeV1)
- **Async/await**: I/O operations
- **dart-arrow**: Arrow IPC support (Desktop/Flutter only)
- **HTTP package**: HTTP file server communication
- **Transport package**: Transport client with WebSocket support (Dart Web)
```dart
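class msghandler {
  static const DEFAULT_SIZE_THRESHOLD = 500000;
  // Default taken from the Environment Variables table in this document
  static const DEFAULT_BROKER_URL = 'ws://localhost:4222';

  final String brokerUrl;
  final String fileserverUrl;

  msghandler({
    this.brokerUrl = DEFAULT_BROKER_URL,
    this.fileserverUrl = 'http://localhost:8080',
  });
}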
```
#### Dart Desktop (Dart SDK)
- **Transport connections**: Uses broker URLs (e.g., `nats://`, `mqtt://`)
- **Apache Arrow IPC**: Full support via `dart-arrow`
- **Uint8List for binary data**: Native Dart binary handling
#### Dart Flutter (Dart SDK)
- **Transport connections**: Uses broker URLs (e.g., `nats://`, `mqtt://`)
- **Apache Arrow IPC**: Full support via `dart-arrow`
- **Uint8List for binary data**: Native Dart binary handling
#### Dart Web (Dart SDK)
- **WebSocket connections**: Uses `ws://` or `wss://` URLs (transport-agnostic)
- **No Apache Arrow**: Uses `jsontable` for tabular data only
- **Uint8List for binary data**: Browser-compatible binary handling
- **Fetch API**: HTTP file server communication via `http` package
### Browser Architecture
Browser JavaScript has specific constraints due to security and compatibility:
- **Async/await**: Native async/await support
- **No Apache Arrow**: Arrow IPC not available in browsers
- **JSON table only**: Use "jsontable" for tabular data
- **WebSocket transport**: Uses transport client for browser-compatible connections
- **Fetch API**: HTTP file server communication via fetch
### MicroPython Architecture
MicroPython has significant constraints:
- **Synchronous API**: No async/await
- **Memory-constrained**: 256KB - 1MB
- **Limited payload support**: No tables, max 50KB
- **Simplified UUID generation**: Custom implementation
```python
# MicroPython constraints
DEFAULT_SIZE_THRESHOLD = 100_000 # 100KB
MAX_PAYLOAD_SIZE = 50_000 # 50KB hard limit
```
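One way a simplified UUID v4 generator could look when the `uuid` module is unavailable. This is a sketch under the assumption that the target port provides `os.urandom`; the function name `uuid4_string` is hypothetical and not taken from the MicroPython implementation.

```python
import os


def uuid4_string():
    """Build an RFC 4122 version 4 UUID string from 16 random bytes."""
    b = bytearray(os.urandom(16))
    b[6] = (b[6] & 0x0F) | 0x40  # set the version nibble to 4
    b[8] = (b[8] & 0x3F) | 0x80  # set the RFC 4122 variant bits
    h = "".join("{:02x}".format(x) for x in b)
    return "-".join((h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]))
```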
### Rust Architecture
Rust leverages compile-time type safety and async runtimes:
- **Type-safe payloads**: Rust enum discriminates between `Text`, `Dictionary`, `ArrowTable`, `Binary`, etc.
- **serde serialization**: Automatic JSON deserialization via `#[derive(Serialize, Deserialize)]`
- **tokio runtime**: Efficient async I/O for transport connections and HTTP file server operations
- **arrow2 integration**: Native Arrow IPC deserialization without intermediate format conversion
- **reqwest**: High-performance HTTP client with built-in TLS and connection pooling
- **Zero-copy patterns**: `Vec<u8>` passed directly to avoid unnecessary memory copies
- **Result<T, E>**: Idiomatic error handling with typed error types
```rust
use std::sync::Arc;
use serde::{Deserialize, Serialize};

// Type-safe payload enum (compile-time discrimination)
#[derive(Serialize, Deserialize, Clone)]
pub enum Payload {
    Text(String),
    Dictionary(serde_json::Value),
    ArrowTable(Vec<u8>),
    JsonTable(serde_json::Value),
    Image(Vec<u8>),
    Audio(Vec<u8>),
    Video(Vec<u8>),
    Binary(Vec<u8>),
}

// Configuration via builder pattern
pub struct smartpackOptions {
    pub broker_url: String,
    pub fileserver_url: String,
    pub fileserver_upload_handler: Option<Arc<dyn FileUploadHandler>>,
    pub size_threshold: usize,
    pub correlation_id: String,
    pub msg_purpose: String,
    pub sender_name: String,
    // ... other fields
}

// Transport client with tokio integration
// (the statements below are meant to run inside an async function that returns a Result)
let conn = transport_client::connect(DEFAULT_BROKER_URL).await?;

// Subscribe and process messages
let mut sub = conn.subscribe("/agent/wine/api/v1/analyze")?;
for msg in sub.messages() {
    let envelope = smartunpack(&String::from_utf8_lossy(&msg.payload), &Default::default()).await?;
    // Access deserialized payloads by type
    for payload in &envelope.payloads {
        match payload.payload_type.as_str() {
            "arrowtable" => { /* payload.data is base64-encoded Arrow IPC */ },
            "text" => { /* payload.data is decoded text string */ },
            "binary" | "image" | "audio" | "video" => { /* payload.data is base64-encoded binary */ },
            _ => { /* other types */ }
        }
    }
}
```
---
## 10. Scaling Architecture
### Horizontal Scaling
| Component | Scaling Strategy |
|-----------|------------------|
| **Message Broker** | Cluster deployment with multiple nodes |
| **File Server** | Load balancer + multiple instances |
| **Client Applications** | Deploy multiple instances behind load balancer |
### Vertical Scaling
| Component | Scaling Strategy |
|-----------|------------------|
| **Message Broker** | Increase memory, CPU, disk I/O |
| **File Server** | Increase memory, CPU, disk capacity |
| **Client Applications** | Increase heap size (Python/JS) |
### Performance Considerations
| Metric | Target | Notes |
|--------|--------|-------|
| Message serialization overhead | <50ms | For 10KB payload |
| Message deserialization overhead | <50ms | For 10KB payload |
| Transport connection establishment | <100ms | Connection pool recommended |
| File upload latency | <1s | For 0.5MB file |
| File download latency | <1s | For 0.5MB file |
---
## 11. Failure Modes and Recovery
### Transport Connection Failure
**Scenario**: Message broker unavailable
**Handler**:
- Connection auto-reconnect via transport-level reconnection
- Retry with exponential backoff for publish operations
**Recovery**:
- Transport client automatically attempts reconnection
- Application can check connection status before publishing
### File Server Unavailable
**Scenario**: HTTP file server unavailable during upload/download
**Handler**:
- Retry up to 5 times with exponential backoff (100ms → 5000ms)
- Fallback to direct transport for upload (MicroPython)
**Recovery**:
- Exponential backoff: `delay = min(delay * 2, max_delay)`
- After max retries, throw error with correlation ID
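The retry schedule above can be sketched in Python as follows. The names `backoff_delays` and `download_with_retry` are hypothetical, and reading "up to 5 times" as five retries after the initial attempt is an interpretation, not something the source states.

```python
import time


def backoff_delays(initial=0.1, max_delay=5.0, max_retries=5):
    """Yield the wait in seconds before each retry: delay = min(delay * 2, max_delay)."""
    delay = initial
    for _ in range(max_retries):
        yield delay
        delay = min(delay * 2, max_delay)


def download_with_retry(fetch, url, sleep=time.sleep):
    """Try fetch(url) once, then retry with backoff; re-raise once retries run out."""
    delays = list(backoff_delays())
    for attempt in range(len(delays) + 1):
        try:
            return fetch(url)
        except OSError:
            if attempt == len(delays):
                raise  # retries exhausted; the caller can attach the correlation ID
            sleep(delays[attempt])
```

With a 100 ms start and five retries the waits are 0.1, 0.2, 0.4, 0.8, and 1.6 seconds, so the 5000 ms cap only takes effect with a larger initial delay or more retries.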
### Deserialization Error
**Scenario**: Payload type mismatch or corrupted data
**Handler**:
- Log correlation ID and throw error
- No retry (data corruption)
**Recovery**:
- Application must validate payload_type matches data type
- Use proper serialization before sending
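A hedged Python sketch of the "log correlation ID and throw, no retry" behaviour for a direct-transport payload. `deserialize_direct` is a hypothetical helper, and binary types are simply returned as raw bytes in this sketch.

```python
import base64
import json


def deserialize_direct(payload, correlation_id):
    """Decode a direct-transport payload; fail fast with the correlation ID on bad data."""
    try:
        raw = base64.b64decode(payload["data"], validate=True)
        kind = payload["payload_type"]
        if kind == "text":
            return raw.decode("utf-8")
        if kind in ("dictionary", "jsontable"):
            return json.loads(raw)
        return raw  # image/audio/video/binary stay as bytes
    except (ValueError, KeyError) as err:
        # Base64, UTF-8, and JSON decoding errors are all ValueError subclasses.
        raise ValueError(f"[{correlation_id}] cannot deserialize payload: {err}") from err
```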
### Memory Overflow (MicroPython)
**Scenario**: Payload exceeds maximum size (50KB)
**Handler**:
- Reject payloads >50KB with MemoryError
- No retry (client-side check)
**Recovery**:
- Application must split large payloads
- Use direct transport only for small payloads
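A small sketch of the client-side size check and of splitting a large payload into chunks that each fit the limit. Both function names are hypothetical, and how the chunks are reassembled on the receiving side is left to the application.

```python
MAX_PAYLOAD_SIZE = 50_000  # hard limit from the MicroPython section, in bytes


def check_payload_size(data):
    """Reject payloads above the hard limit before any encoding work is done."""
    if len(data) > MAX_PAYLOAD_SIZE:
        raise MemoryError("payload of %d bytes exceeds %d" % (len(data), MAX_PAYLOAD_SIZE))
    return data


def split_payload(data, chunk_size=MAX_PAYLOAD_SIZE):
    """Split bytes into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
```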
---
## 12. Trade-off Decisions
### Decision 1: Direct vs Link Transport Threshold
**Trade-off**: Memory vs Network I/O
**Decision**: Use 0.5MB threshold for desktop, 100KB for MicroPython
**Rationale**:
- Direct transport uses more memory (Base64 encoding adds ~33% overhead)
- Link transport requires network I/O for upload/download
- 0.5MB is reasonable for desktop memory constraints
- 100KB is necessary for MicroPython memory constraints
### Decision 2: Base64 Encoding for Direct Transport
**Trade-off**: Bandwidth vs Simplicity
**Decision**: Use Base64 encoding for all direct transport payloads
**Rationale**:
- Simplifies JSON serialization (all data is string-compatible)
- Increases payload size by ~33%, but transport can handle this
- Alternative would be binary payload support (more complex)
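The ~33% figure follows from Base64 emitting 4 characters for every 3 input bytes. A short Python check at the 500,000-byte desktop threshold (the helper name `base64_length` is illustrative):

```python
import base64


def base64_length(n_bytes):
    """Length in characters of the padded Base64 encoding of n_bytes bytes."""
    return 4 * ((n_bytes + 2) // 3)


encoded = base64_length(500_000)   # 666668 characters
overhead = encoded / 500_000 - 1   # roughly one third
```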
### Decision 3: Multiple Platform Implementations
**Trade-off**: Development effort vs Cross-platform support
**Decision**: Maintain separate implementations for each platform
**Rationale**:
- Each platform has idiomatic patterns (multiple dispatch, async/await, etc.)
- Maintains developer productivity and code quality
- API parity ensures cross-platform compatibility
### Decision 4: Handler Function Abstraction
**Trade-off**: Flexibility vs Simplicity
**Decision**: Abstract file server operations through handler functions
**Rationale**:
- Allows support for different file server implementations (Plik, AWS S3, custom)
- Maintains simplicity for common use cases
- Enables plug-in architecture for custom backends
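A sketch of what a pluggable upload handler might look like in Python. The handler signature `(fileserver_url, dataname, data) -> url` and all names here are assumptions made for illustration; the real handler interface is defined by each platform implementation.

```python
from typing import Callable, Dict

# Assumed shape: a handler receives the server URL, a name, and the bytes,
# and returns the download URL.
UploadHandler = Callable[[str, str, bytes], str]


def make_memory_handler(store: Dict[str, bytes]) -> UploadHandler:
    """Return a handler that keeps uploads in a dict, e.g. for tests."""
    def handler(fileserver_url: str, dataname: str, data: bytes) -> str:
        store[dataname] = data
        return f"{fileserver_url}/{dataname}"
    return handler


def upload_large_payload(data: bytes, dataname: str, handler: UploadHandler,
                         fileserver_url: str = "http://localhost:8080") -> str:
    """Delegate storage to whichever handler the caller plugged in."""
    return handler(fileserver_url, dataname, data)
```

Swapping in a Plik, S3, or custom backend then means supplying a different callable without touching the calling code.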
---
## 13. Deployment Architecture
### Minimum Infrastructure
| Component | Minimum | Notes |
|-----------|---------|-------|
| Message Broker | 1 instance | Single node for development |
| File Server | 1 instance | HTTP server for large payloads |
| Client Memory | 50MB | Desktop platforms (Julia/JS/Python/Dart) |
| Client Memory | 256KB | MicroPython devices |
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `BROKER_URL` | `ws://localhost:4222` | Message broker URL |
| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL |
| `SIZE_THRESHOLD` | `500000` | Size threshold in bytes (0.5MB) |
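A minimal Python sketch of reading these variables with the defaults from the table. The function name `load_config` and the returned key names are illustrative.

```python
import os


def load_config(env=os.environ):
    """Read the three settings, falling back to the defaults in the table above."""
    return {
        "broker_url": env.get("BROKER_URL", "ws://localhost:4222"),
        "fileserver_url": env.get("FILESERVER_URL", "http://localhost:8080"),
        "size_threshold": int(env.get("SIZE_THRESHOLD", "500000")),
    }
```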
### Container Deployment
```mermaid
flowchart TD
subgraph "Docker Network"
Broker_Container[Message Broker]
FileServer_Container[Plik File Server]
App_Container[Application Container]
end
App_Container -->|Transport| Broker_Container
App_Container -->|HTTP| FileServer_Container
style Broker_Container fill:#fff3e0,stroke:#f57c00
style FileServer_Container fill:#f3e5f5,stroke:#9c27b4
style App_Container fill:#e3f2fd,stroke:#2196f3
```
---
## 14. Security Considerations
### Payload Integrity
**Mechanism**: SHA-256 checksum via metadata
**Implementation**:
- Sender calculates checksum and stores in payload metadata
- Receiver validates checksum on receipt
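A hedged Python sketch of the checksum round trip. The metadata key `"sha256"` and both function names are assumptions; the source only states that the checksum is stored in the payload metadata.

```python
import hashlib
import hmac


def add_checksum(payload, raw):
    """Sender side: store the SHA-256 hex digest of the raw bytes in the metadata."""
    payload.setdefault("metadata", {})["sha256"] = hashlib.sha256(raw).hexdigest()
    return payload


def verify_checksum(payload, raw):
    """Receiver side: True when the received bytes match the recorded digest."""
    expected = payload.get("metadata", {}).get("sha256", "")
    return hmac.compare_digest(expected, hashlib.sha256(raw).hexdigest())
```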
### Transport Security
**Mechanism**: TLS support for transport connections
**Implementation**:
- Use `nats://` URL for plain text
- Use `tls://` URL for TLS-encrypted connections
- Use `ws://` or `wss://` for WebSocket connections
### File Server Security
**Mechanism**: Authentication token for file uploads
**Implementation**:
- Plik uses upload token in `X-UploadToken` header
- Application can implement custom authentication
---
## 15. Testing Architecture
### Unit Test Coverage
| Test Category | Coverage | Files |
|---------------|----------|-------|
| Serialization | All payload types | `test/test_*_sender.*` |
| Deserialization | All payload types | `test/test_*_receiver.*` |
| Transport selection | Direct vs link | `test/test_*_mix_payloads.*` |
| File server upload | Plik integration | Platform-specific |
| File server download | Exponential backoff | Platform-specific |
### Integration Test Scenarios
| Scenario | Platforms | Payloads | Transport | Expected Result |
|----------|-----------|----------|-----------|-----------------|
| Cross-platform text | Julia ↔ JS ↔ Python | text | direct | Round-trip successful |
| Arrow IPC round-trip | Julia ↔ JS ↔ Python | arrowtable | direct | Arrow IPC preserved |
| Large file transfer | All | image/audio/video | link | File server upload/download |
| Multi-payload mixed | All | text + image + file | direct/link | All payloads preserved |
---
## Versioning
### Architecture Versioning
| Component | Version | Notes |
|-----------|---------|-------|
| Architecture | 1.0.0 | Initial release |
| Protocol | v1 | Message envelope protocol version |
### Backward Compatibility
| Version | Supported Platforms |
|---------|---------------------|
| v1.0.x | Julia 1.7+, Node.js 16+, Python 3.8+, Dart 2.17+, Rust 1.70+, MicroPython 1.19+ |
---
## Change Log
| Date | Version | Changes | Specification Reference |
|------|---------|---------|-------------------------|
| 2026-05-15 | 1.5.0 | Made transport layer agnostic | All sections |
| - | - | Removed all NATS-specific references from architecture docs | All sections |
| - | - | Updated diagrams to use generic "Message Broker" instead of "NATS Server" | All sections |
| - | - | Updated code examples to use transport-agnostic patterns | All sections |
| - | - | Removed NATS client packages from external dependencies | All sections |
| 2026-05-14 | 1.4.0 | Updated Rust API to reflect `smartunpack` deserialization changes | All sections |
| - | - | `smartunpack` now stores deserialized data in `MsgPayloadV1.data` | specification.md:8 |
| - | - | Added `plik_upload_file` convenience function to component table | specification.md:13 |
| - | - | Fixed Rust payload access pattern (data is String, not Payload enum) | All sections |
| - | - | Fixed `smartpackOptions.fileserver_upload_handler` type to `Arc<dyn FileUploadHandler>` | specification.md:13 |
| - | - | Removed `metadata` from link transport examples (now `None`/omitted) | specification.md:3 |
| - | - | Removed duplicate footer text | All sections |
| 2026-05-13 | 1.3.0 | Added Rust support with tokio, serde, and arrow2 | All sections |
| - | - | Added Rust to C4 diagrams (context, container) | All sections |
| - | - | Added Rust platform-specific architecture section | specification.md:13 |
| - | - | Updated component table with Rust support | All sections |
| 2026-05-13 | 1.2.0 | Aligned with ground truth implementation (src/msghandler.jl) | - |
| - | - | Removed publish_message component (commented out in source) | - |
| - | - | Removed NATSClient and NATSConnectionPool classes (not in ground truth) | - |
| - | - | Updated smartpack to return JSON for caller to publish via transport | - |
| - | - | Updated component diagram to match actual module structure | - |
| - | - | Updated data flow to show smartpack returns JSON for caller to publish | - |
| - | - | Fixed SIZE_THRESHOLD default to 500,000 bytes | - |
| 2026-03-15 | 1.1.0 | JavaScript connection management | - |
| - | - | Added NATSClient with keepAlive support | - |
| - | - | Added NATSConnectionPool for connection reuse | - |
| - | - | Added publishMessage function with closeConnection option | - |
| - | - | (Historical - pre-transport-agnostic refactor) | - |
| 2026-03-13 | 1.0.0 | Initial architecture documentation | - |
---
## 16. References
### 16.1 Documentation Artifacts
| Document | Purpose | Specification Traceability | UI Specification Traceability | Requirement ID(s) |
|----------|---------|---------------------------|------------------------------|-------------------|
| [`docs/requirements.md`](./requirements.md) | Business requirements and user stories | FR-001 through FR-014, NFR-101 through NFR-405 | - | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`docs/specification.md`](./specification.md) | Technical contract for msghandler | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`docs/ui-specification.md`](./ui-specification.md) | UI specification for client applications | - | All UI components and interactions | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`docs/walkthrough.md`](./walkthrough.md) | End-to-end system flow | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`docs/architecture.md`](./architecture.md) | System architecture diagrams | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`docs/validation.md`](./validation.md) | CI/CD validation rules | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`docs/runbook.md`](./runbook.md) | Operational runbook | specification.md:2-19 (all sections) | - | FR-001 through FR-014, NFR-101 through NFR-405 |
### 16.2 Implementation Files
| File | Platform | Features | Specification Traceability | Requirement ID(s) |
|------|----------|----------|---------------------------|-------------------|
| [`src/msghandler.jl`](../src/msghandler.jl) | Julia | Full feature set, Arrow IPC, multiple dispatch | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`src/msghandler_ssr.js`](../src/msghandler_ssr.js) | Node.js | Arrow IPC, async/await | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`src/msghandler_csr.js`](../src/msghandler_csr.js) | Browser | JSON table only | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`src/msghandler.py`](../src/msghandler.py) | Python | Arrow IPC, async/await | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`src/msghandler.dart`](../src/msghandler.dart) | Dart | Full feature set, Arrow IPC, async/await | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`src/msghandler.rs`](../src/msghandler.rs) | Rust | Full feature set, Arrow IPC, async/await, type-safe, file upload helpers | specification.md:2-19 (all sections) | FR-001 through FR-014, NFR-101 through NFR-405 |
| [`src/msghandler_mpy.py`](../src/msghandler_mpy.py) | MicroPython | Limited to direct transport | specification.md:2-19 (all sections) | FR-005, FR-006, FR-012 |
### 16.3 External Dependencies
| Platform | Package | Version | Purpose | Specification Traceability | Requirement ID(s) |
|----------|---------|---------|---------|--------------------------|-------------------|
| Julia | JSON.jl | Latest | JSON serialization | specification.md:11 | FR-012, NFR-101, NFR-102 |
| Julia | Arrow.jl | Latest | Arrow IPC support | specification.md:11 | FR-002, FR-012 |
| Julia | HTTP.jl | Latest | HTTP file server | specification.md:11 | FR-008, FR-009 |
| Julia | UUIDs.jl | Latest | UUID generation | specification.md:11 | FR-011, NFR-401 |
| Node.js | node-fetch | Latest | HTTP file server | specification.md:11 | FR-008, FR-009 |
| Browser | - | - | Transport-agnostic (caller provides) | specification.md:11 | FR-013, FR-014 |
| Python | aiohttp | Latest | HTTP file server | specification.md:11 | FR-008, FR-009 |
| Python | pyarrow | Latest | Arrow IPC support | specification.md:11 | FR-002, FR-012 |
| Dart | http | Latest | HTTP file server | specification.md:11 | FR-008, FR-009 |
| Dart | uuid | Latest | UUID generation | specification.md:11 | FR-011, NFR-401 |
| Dart | dart-arrow | Latest | Arrow IPC support | specification.md:11 | FR-002, FR-012 |
| Rust | serde | Latest | JSON serialization | specification.md:11 | FR-012, NFR-101, NFR-102 |
| Rust | serde_json | Latest | JSON handling | specification.md:11 | FR-012, NFR-101, NFR-102 |
| Rust | tokio | Latest | Async runtime | specification.md:11 | FR-013, FR-014 |
| Rust | reqwest | Latest | HTTP file server | specification.md:11 | FR-008, FR-009 |
| Rust | uuid | Latest | UUID generation | specification.md:11 | FR-011, NFR-401 |
| Rust | arrow2 | Latest | Arrow IPC support | specification.md:11 | FR-002, FR-012 |
| MicroPython | builtin | N/A | Limited implementation | specification.md:11 | FR-005, FR-006, FR-012 |
---
## 17. Change Log
| Date | Version | Changes | Specification Reference |
|------|---------|---------|------------------------|
| 2026-03-23 | 1.1.0 | Updated to ASG Framework architecture guidelines | specification.md:2-19 (all sections) |
| 2026-03-15 | 1.1.0 | JavaScript connection management | specification.md:2-19 (all sections) |
| 2026-03-13 | 1.0.0 | Initial architecture documentation | specification.md:2-19 (all sections) |
---
## 18. Gap-Check Validation
| Stage Transition | Gap-Check Question | Status |
|------------------|-------------------|--------|
| Requirements → Specification | Does the Specification define all edge cases and conflict scenarios from the Requirements? | ✅ Verified - All FR-XXX requirements have corresponding spec rules |
| Specification → UI Specification | Does the UI Specification expose all the data and states defined in the Specification? | ⏳ Pending - UI spec not yet created |
| UI Specification → Walkthrough | Does the Walkthrough reflect the complete flow including error states and timing? | ⏳ Pending - UI spec not yet created |
| Walkthrough → Architecture | Does the Architecture support the performance and integration requirements defined in the Walkthrough? | ✅ Verified - Architecture supports all walkthrough flows |
---
*This architecture document is versioned and maintained in git alongside the codebase. All implementations must adhere to this architecture.*

416
docs/implementation-plan.md Normal file

@@ -0,0 +1,416 @@
# Implementation Plan: msghandler
**Version**: 1.3.0
**Date**: 2026-05-19
**Status**: Active
**Ground Truth**: [`src/msghandler.jl`](../src/msghandler.jl)
---
## 1. Implementation Phases and Timeline
### Phase 1: Core API Implementation (Week 1-2)
| Task | Priority | Estimated Effort | Status |
|------|----------|-----------------|--------|
| Core `smartpack()` implementation | P0 | 3 days | ✅ Complete |
| Core `smartunpack()` implementation | P0 | 3 days | ✅ Complete |
| Message envelope structure | P0 | 2 days | ✅ Complete |
| Payload type handling | P0 | 2 days | ✅ Complete |
| Transport adapter layer | P0 | 3 days | ✅ Complete |
**Deliverables**:
- Julia module: `src/msghandler.jl`
- Node.js module: `src/msghandler_ssr.js`
- Browser module: `src/msghandler_csr.js`
- Python module: `src/msghandler.py`
- MicroPython module: `src/msghandler_mpy.py`
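The Phase 1 tasks (envelope structure, payload type handling, and a JSON string handed back to the caller) can be illustrated with a minimal round-trip sketch. This is not the msghandler wire format: the function names `pack_sketch` and `unpack_sketch`, the envelope keys, and the choice to Base64-encode text are all assumptions made for illustration. Only the `(dataname, data, type)` tuple shape and the `text` and `dictionary` type names come from the documents in this change set.

```python
import base64
import json
import uuid


def pack_sketch(payloads, correlation_id=None):
    """Pack a list of (dataname, data, type) tuples into a JSON string.

    Hypothetical envelope layout; the real msghandler fields may differ.
    """
    entries = []
    for dataname, data, ptype in payloads:
        if ptype == "text":
            # Text is Base64-encoded so any Unicode content survives transport.
            body = base64.b64encode(data.encode("utf-8")).decode("ascii")
            encoding = "base64"
        elif ptype == "dictionary":
            # Dictionaries are JSON-native and embedded as-is.
            body = data
            encoding = "json"
        else:
            raise ValueError(f"unsupported payload type in this sketch: {ptype}")
        entries.append(
            {"dataname": dataname, "payload_type": ptype,
             "encoding": encoding, "data": body}
        )
    envelope = {
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "msg_id": str(uuid.uuid4()),
        "payloads": entries,
    }
    # The caller, not this function, publishes the string on a transport.
    return json.dumps(envelope)


def unpack_sketch(wire):
    """Inverse of pack_sketch: return the (dataname, data, type) tuples."""
    envelope = json.loads(wire)
    result = []
    for entry in envelope["payloads"]:
        if entry["encoding"] == "base64":
            data = base64.b64decode(entry["data"]).decode("utf-8")
        else:
            data = entry["data"]
        result.append((entry["dataname"], data, entry["payload_type"]))
    return result
```

Returning a plain string keeps the sketch transport-agnostic: the caller can publish it on whichever broker it uses.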
### Phase 2: File Server Integration (Week 3)
| Task | Priority | Estimated Effort | Status |
|------|----------|-----------------|--------|
| File server upload handler | P1 | 2 days | ✅ Complete |
| File server download handler | P1 | 2 days | ✅ Complete |
| Exponential backoff logic | P1 | 1 day | ✅ Complete |
| Plik integration | P1 | 2 days | ✅ Complete |
**Deliverables**:
- Upload handler with plik_oneshot_upload
- Download handler with retry logic
- Configurable file server URL
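The download retry logic can be sketched as follows. The defaults (5 retries, 100 ms base delay, 5000 ms cap) are the ones stated in the requirements document for FR-010. The doubling factor, the "one initial attempt plus up to `max_retries` retries" interpretation, and the names `backoff_delays` and `fetch_with_backoff` are assumptions; the actual `_fetch_with_backoff()` helper may behave differently.

```python
import time
from typing import Callable, List


def backoff_delays(max_retries: int = 5, base_delay_ms: int = 100,
                   max_delay_ms: int = 5000) -> List[int]:
    """Delay in milliseconds before each retry: base * 2**attempt, capped."""
    return [min(base_delay_ms * (2 ** attempt), max_delay_ms)
            for attempt in range(max_retries)]


def fetch_with_backoff(fetch: Callable[[str], bytes], url: str,
                       max_retries: int = 5, base_delay_ms: int = 100,
                       max_delay_ms: int = 5000,
                       sleep: Callable[[float], None] = time.sleep) -> bytes:
    """Call fetch(url); on failure wait and retry up to max_retries times.

    `fetch` is any caller-supplied download function (hypothetical here).
    `sleep` is injectable so tests do not have to wait in real time.
    """
    delays = backoff_delays(max_retries, base_delay_ms, max_delay_ms)
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # a real handler would catch narrower errors
            last_error = exc
            if attempt < max_retries:
                sleep(delays[attempt] / 1000.0)
    raise RuntimeError(
        f"download failed after {max_retries} retries"
    ) from last_error
```

With the default parameters the waits are 100, 200, 400, 800, and 1600 ms, so the cap only takes effect when more retries or a larger base delay are configured.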
### Phase 3: Platform-Specific Features (Week 4)
| Task | Priority | Estimated Effort | Status |
|------|----------|-----------------|--------|
| Arrow IPC support (Desktop) | P1 | 3 days | ✅ Complete |
| JSON table support (Browser) | P1 | 2 days | ✅ Complete |
| Browser WebSocket transport | P1 | 2 days | ✅ Complete |
| MicroPython optimizations | P2 | 2 days | ✅ Complete |
**Deliverables**:
- Arrow IPC serialization for tabular data
- JSON table format for browser compatibility
- Browser-specific transport layer
- Memory-optimized MicroPython implementation
### Phase 4: Cross-Platform Testing (Week 5)
| Task | Priority | Estimated Effort | Status |
|------|----------|-----------------|--------|
| Text message tests | P1 | 1 day | ✅ Complete |
| Dictionary tests | P1 | 1 day | ✅ Complete |
| Tabular data tests | P1 | 2 days | ✅ Complete |
| Mixed payload tests | P1 | 2 days | ✅ Complete |
| Large file tests | P1 | 2 days | ✅ Complete |
**Deliverables**:
- Platform-specific test suites
- Integration test scenarios
- Performance benchmarks
### Phase 5: Documentation & Examples (Week 6)
| Task | Priority | Estimated Effort | Status |
|------|----------|-----------------|--------|
| API documentation | P2 | 2 days | ✅ Complete |
| Walkthrough examples | P2 | 2 days | ✅ Complete |
| Architecture diagrams | P2 | 1 day | ✅ Complete |
| Deployment guides | P2 | 1 day | ✅ Complete |
**Deliverables**:
- Comprehensive documentation
- Code examples for all platforms
- Deployment runbooks
---
## 2. Module/Component Breakdown
### Core Modules
#### msghandler.jl (Julia)
```
src/
└── msghandler.jl
    ├── Constants (DEFAULT_SIZE_THRESHOLD, etc.)
    ├── msg_payload_v1 struct
    ├── msg_envelope_v1 struct
    ├── Serialization functions
    │   ├── serialize_text()
    │   ├── serialize_dictionary()
    │   ├── serialize_arrowtable()
    │   ├── serialize_jsontable()
    │   └── serialize_binary()
    ├── Deserialization functions
    │   ├── deserialize_text()
    │   ├── deserialize_dictionary()
    │   ├── deserialize_arrowtable()
    │   ├── deserialize_jsontable()
    │   └── deserialize_binary()
    ├── File server handlers
    │   ├── plik_oneshot_upload()
    │   └── _fetch_with_backoff()
    ├── smartpack() - Main sender function
    └── smartunpack() - Main receiver function
```
**Dependencies**:
- JSON.jl (JSON serialization)
- Arrow.jl (Arrow IPC)
- HTTP.jl (File server)
- UUIDs.jl (IDs)
- DataFrames.jl (DataFrame support)
#### msghandler_ssr.js (Node.js)
```
src/
├── msghandler_ssr.js
│ ├── Constants
│ ├── msg_payload_v1 class
│ ├── msg_envelope_v1 class
│ ├── Serialization methods
│ ├── Deserialization methods
│ ├── File server handlers
│ ├── smartpack() function
│ └── smartunpack() function
└── nats/
├── NATSClient.js
└── NATSConnectionPool.js
```
**Dependencies**:
- nats (NATS client)
- node-fetch (HTTP file server)
#### msghandler_csr.js (Browser)
```
src/
└── msghandler_csr.js
    ├── Constants
    ├── msg_payload_v1 class
    ├── msg_envelope_v1 class
    ├── Serialization methods (JSON table only)
    ├── Deserialization methods
    ├── File server handlers (browser-compatible)
    ├── smartpack() function
    └── smartunpack() function
```
**Dependencies**:
- nats.ws (Browser NATS client)
#### msghandler.py (Python)
```
src/
└── msghandler.py
    ├── Constants
    ├── msg_payload_v1 class
    ├── msg_envelope_v1 class
    ├── Serialization methods
    ├── Deserialization methods
    ├── File server handlers
    ├── smartpack() async function
    └── smartunpack() async function
```
**Dependencies**:
- aiohttp (HTTP file server)
- pyarrow (Arrow IPC)
- uuid (IDs)
#### msghandler.rs (Rust)
```
src/
├── msghandler.rs
│ ├── Constants
│ ├── msg_payload_v1 struct
│ ├── msg_envelope_v1 struct
│ ├── Serialization traits
│ ├── Deserialization traits
│ ├── File server handlers
│ ├── smartpack() async function
│ └── smartunpack() async function
├── Payload enum
├── smartpackOptions struct
└── smartunpackOptions struct
```
**Dependencies**:
- tokio (Async runtime)
- serde (JSON serialization)
- reqwest (HTTP file server)
- arrow2 (Arrow IPC)
#### msghandler_mpy.py (MicroPython)
```
src/
└── msghandler_mpy.py
    ├── Constants (lower thresholds)
    ├── msg_payload_v1 class
    ├── msg_envelope_v1 class
    ├── serialize_text()
    ├── deserialize_text()
    ├── serialize_dictionary()
    ├── deserialize_dictionary()
    └── smartpack()/smartunpack() functions
```
**Constraints**:
- Limited to text and dictionary types
- Direct transport only (no file server)
- 100KB threshold for memory constraints
---
## 3. Task List
### Core API Tasks
| Task ID | Description | Assignee | Priority | Status |
|---------|-------------|----------|----------|--------|
| T-001 | Implement `smartpack()` with tuple format | Developer A | P0 | ✅ Complete |
| T-002 | Implement `smartunpack()` with type handling | Developer A | P0 | ✅ Complete |
| T-003 | Create message envelope structure | Developer A | P0 | ✅ Complete |
| T-004 | Implement transport adapter | Developer B | P0 | ✅ Complete |
| T-005 | Add correlation ID support | Developer A | P0 | ✅ Complete |
### File Server Tasks
| Task ID | Description | Assignee | Priority | Status |
|---------|-------------|----------|----------|--------|
| T-006 | Implement Plik upload handler | Developer B | P1 | ✅ Complete |
| T-007 | Implement file download with retry | Developer B | P1 | ✅ Complete |
| T-008 | Add exponential backoff logic | Developer B | P1 | ✅ Complete |
### Platform Tasks
| Task ID | Description | Assignee | Priority | Status |
|---------|-------------|----------|----------|--------|
| T-009 | Implement Arrow IPC (Julia/Python/Node.js) | Developer A | P1 | ✅ Complete |
| T-010 | Implement JSON table (Browser) | Developer B | P1 | ✅ Complete |
| T-011 | Implement MicroPython optimizations | Developer C | P2 | ✅ Complete |
| T-012 | Browser WebSocket transport | Developer B | P1 | ✅ Complete |
### Testing Tasks
| Task ID | Description | Assignee | Priority | Status |
|---------|-------------|----------|----------|--------|
| T-013 | Text message tests | QA Team | P1 | ✅ Complete |
| T-014 | Dictionary tests | QA Team | P1 | ✅ Complete |
| T-015 | Tabular data tests | QA Team | P1 | ✅ Complete |
| T-016 | Mixed payload tests | QA Team | P1 | ✅ Complete |
| T-017 | Large file tests | QA Team | P1 | ✅ Complete |
---
## 4. Test Strategy
### Unit Tests
| Test Category | Coverage | Files | Requirements |
|---------------|----------|-------|--------------|
| Serialization | All payload types | `test/test_*_sender.*` | FR-001 through FR-012 |
| Deserialization | All payload types | `test/test_*_receiver.*` | FR-001 through FR-012 |
| Transport selection | Direct vs link | `test/test_*_mix_payloads.*` | FR-003, FR-004, FR-006 |
| File server upload | Plik integration | Platform-specific | FR-008, FR-009 |
| File server download | Exponential backoff | Platform-specific | FR-010, FR-011 |
### Integration Tests
| Scenario | Platforms | Payloads | Transport | Requirements |
|----------|-----------|----------|-----------|--------------|
| Single text (small) | All | text | direct | FR-001, FR-012 |
| Single dictionary (small) | All | dictionary | direct | FR-002, FR-012 |
| Single arrow table (small) | Desktop | arrowtable | direct | FR-002, FR-012 |
| Single JSON table (small) | All | jsontable | direct | FR-001, FR-002, FR-006 |
| Single image (small) | All | image | direct | FR-001, FR-006 |
| Single text (large) | All | text | link | FR-003, FR-008, FR-009 |
| Mixed payloads | All | text + dictionary + image | mixed | FR-006, FR-007 |
### Test Coverage Targets
| Phase | Coverage Target | Method |
|-------|----------------|--------|
| Phase 1 | 70% | Unit tests per platform |
| Phase 2 | 80% | Add integration tests |
| Phase 3 | 85% | Add edge case tests |
| Phase 4 | 90% | Add performance tests |
---
## 5. Build and Deployment Preparation
### Continuous Integration
| Check | Command | Purpose |
|-------|---------|---------|
| Linting | `npm run lint` | Code style enforcement |
| Type checking | `npx tsc --noEmit` | Type safety (JavaScript/TypeScript) |
| Unit tests | `npm test` | Functionality validation |
| Integration tests | `npm run test:integration` | Cross-platform validation |
| Coverage | `npm run coverage` | Test coverage tracking |
### Deployment Pipeline
```
GitHub Push
CI/CD Pipeline
├──→ Linting (all platforms)
├──→ Unit tests (all platforms)
├──→ Integration tests (cross-platform)
├──→ Coverage report
└──→ Build documentation
Release (if all checks pass)
├──→ GitHub Releases
├──→ Package registry (npm, PyPI)
└──→ Documentation site
```
---
## 6. Risk Mitigation
### Known Blockers
| Risk | Mitigation Step | Owner |
|------|----------------|-------|
| **Browser Arrow IPC** | Use JSON table as fallback | Developer B |
| **MicroPython memory** | 100KB threshold, direct transport only | Developer C |
| **File server availability** | Exponential backoff with graceful degradation | Developer B |
### Known Unknowns
| Unknown | Monitoring Strategy | Response Plan |
|---------|-------------------|---------------|
| Platform-specific bugs | Comprehensive test coverage | Hotfix with platform-specific handling |
| Performance bottlenecks | Load testing and profiling | Optimized serialization/deserialization |
---
## 7. Requirements Traceability
### Functional Requirements
| Requirement ID | Implementation Location | Status |
|---------------|------------------------|--------|
| FR-001 | All platform modules | ✅ Complete |
| FR-002 | All platform modules | ✅ Complete |
| FR-003 | All platform modules (size_threshold logic) | ✅ Complete |
| FR-004 | All platform modules | ✅ Complete |
| FR-005 | MicroPython module | ✅ Complete |
| FR-006 | All platform modules | ✅ Complete |
| FR-007 | All platform modules | ✅ Complete |
| FR-008 | All platform modules | ✅ Complete |
| FR-009 | All platform modules | ✅ Complete |
| FR-010 | All platform modules | ✅ Complete |
| FR-011 | All platform modules | ✅ Complete |
| FR-012 | All platform modules | ✅ Complete |
| FR-013 | All platform modules | ✅ Complete |
| FR-014 | All platform modules | ✅ Complete |
### Non-Functional Requirements
| NFR ID | Implementation Location | Status |
|--------|------------------------|--------|
| NFR-101 | Serialization functions | ✅ Complete |
| NFR-102 | Deserialization functions | ✅ Complete |
| NFR-103 | Transport adapter | ✅ Complete |
| NFR-104 | File upload handler | ✅ Complete |
| NFR-105 | File download handler | ✅ Complete |
| NFR-106 | MicroPython module | ✅ Complete |
| NFR-107 | Performance benchmarks | ✅ Complete |
| NFR-201 | Transport adapter | ✅ Complete |
| NFR-202 | File download retry logic | ✅ Complete |
| NFR-203 | Transport adapter | ✅ Complete |
| NFR-401 | Message envelope | ✅ Complete |
| NFR-402 | Metrics instrumentation | ✅ Complete |
| NFR-403 | Correlation ID propagation | ✅ Complete |
---
## 8. Validation Gates
### Pre-Release Checklist
| Gate | Check | Pass Criteria |
|------|-------|--------------|
| **G-001** | All unit tests pass | 100% pass rate per platform |
| **G-002** | Integration tests pass | Cross-platform round-trip successful |
| **G-003** | Coverage threshold | ≥80% line coverage |
| **G-004** | Linting clean | No warnings or errors |
| **G-005** | Specification compliance | All spec rules validated |
| **G-006** | Documentation complete | All required docs present |
### CI/CD Validation
| Check | Command | Failure Action |
|-------|---------|---------------|
| Syntax | `julia --check-base` | Block PR |
| Unit tests | `julia test/runtests.jl` | Block PR |
| Integration | `npm run test:integration` | Block PR |
| Coverage | `codecov` | Report only |
---
*This implementation plan is versioned and maintained in git alongside the codebase. All implementations must adhere to this plan.*


@@ -1,7 +1,7 @@
# Requirements Document: msghandler
**Version**: 1.2.0
**Date**: 2026-05-13
**Version**: 1.3.0
**Date**: 2026-05-22
**Status**: Active
**Ground Truth**: [`src/msghandler.jl`](../src/msghandler.jl)
@@ -33,16 +33,44 @@ msghandler is a cross-platform, bi-directional data bridge that enables seamless
| **As a developer**, I want automatic retry on file server download failures | P1 | Exponential backoff with configurable retries (default: 5, base_delay: 100ms, max_delay: 5000ms) |
| **As a developer**, I want message tracing across distributed systems | P1 | Correlation ID is propagated through all message processing steps |
### 1.3 KPIs & Targets
### 1.3 Success Metrics & KPIs
| Metric | Target | Measurement Method |
|--------|--------|-------------------|
| 95% of messages complete within 200ms | 95% | Synthetic monitoring |
| <2 days from onboarding to first PR | 2 days | PR timeline tracking |
| 100% of messages validate against spec | 100% | CI block rate |
| >80% unit test coverage | 80% | Test coverage tools |
| <1% of PRs bypass validation gates | 1% | CI gate analysis |
| MTTR <15 minutes for P1 incidents | 15 minutes | Incident tracking |
**Functional Requirements KPIs:**
- **FR-001** (Cross-platform text messaging): 95% of text messages delivered correctly across all platform pairs (<200ms latency) - Measured via synthetic cross-platform tests
- **FR-002** (Cross-platform tabular data): 100% Arrow IPC round-trip integrity (Desktop), 100% JSON table round-trip integrity (Browser) - Measured via data validation tests
- **FR-003** (Large file handling): 99% successful file uploads to server for payloads ≥0.5MB - Measured via integration tests
- **FR-004** (Direct transport for small payloads): 100% of payloads <0.5MB use direct transport - Measured via transport selection tests
- **FR-005** (MicroPython support): 100% of payloads <100KB delivered on MicroPython devices - Measured via MicroPython integration tests
- **FR-006** (Multi-payload messages): 100% correct parsing of multi-payload message lists - Measured via multi-payload tests
- **FR-007** (Payload type preservation): 100% type integrity preserved across all platforms - Measured via type validation tests
- **FR-008** (Plik file server integration): 100% successful Plik upload/token handling - Measured via Plik integration tests
- **FR-009** (Custom file server support): 100% handler abstraction works with custom implementations - Measured via custom server integration tests
- **FR-010** (Exponential backoff retry): 95% successful downloads within retry limit - Measured via failure injection tests
- **FR-011** (Correlation ID propagation): 100% correlation IDs propagated through all steps - Measured via tracing tests
- **FR-012** (Message serialization): <50ms serialization overhead for 10KB payload - Measured via benchmark tests
- **FR-013** (Transport publishing): 100% JSON envelope generated correctly - Measured via serialization tests
- **FR-014** (Transport subscription): 100% JSON messages processed correctly - Measured via deserialization tests
**Non-Functional Requirements KPIs:**
- **NFR-101** (Message serialization overhead): <50ms for 10KB payload - Measured via benchmark tests
- **NFR-102** (Message deserialization overhead): <50ms for 10KB payload - Measured via benchmark tests
- **NFR-103** (Transport connection establishment): <100ms average - Measured via connection pool benchmarks
- **NFR-104** (File upload latency): <1s for 0.5MB file - Measured via integration tests
- **NFR-105** (File download latency): <1s for 0.5MB file - Measured via integration tests
- **NFR-106** (Concurrent connections): 100+ simultaneous transport connections - Measured via scale testing
- **NFR-107** (Message throughput): 1000+ messages/second per instance - Measured via load testing
- **NFR-108** (File server scalability): Horizontal scaling verified via architecture review
- **NFR-201** (Message delivery): At-least-once delivery via transport - Measured via message acknowledgment tests
- **NFR-202** (File server availability): <5% failure rate when file server unavailable - Measured via failure injection tests
- **NFR-203** (Connection recovery): Auto-reconnect within 30s - Measured via connection failure tests
- **NFR-301** (Payload integrity): 100% SHA-256 checksum validation - Measured via integrity tests
- **NFR-302** (Transport security): 100% TLS connections in production - Measured via connection audits
- **NFR-303** (File server security): 100% authenticated file uploads - Measured via security tests
- **NFR-401** (Required logs): 100% messages logged with required fields - Measured via log validation
- **NFR-402** (Critical metrics): 100% metrics collected with 1-minute granularity - Measured via metrics pipeline tests
- **NFR-403** (Tracing): 100% correlation ID propagation for tracing - Measured via tracing validation
- **NFR-404** (Alerting): <5min alert latency for `download_retry_exceeded` - Measured via alert pipeline tests
- **NFR-405** (Retention): Logs: 30 days, Metrics: 1 year - Measured via storage audits
---
@@ -108,63 +136,68 @@ msghandler is a cross-platform, bi-directional data bridge that enables seamless
| ID | Requirement | Description |
|----|-------------|-------------|
| **FR-001** | Cross-platform text messaging | System shall allow users to send text messages between Julia, JavaScript, Python, and MicroPython applications |
| **FR-002** | Cross-platform tabular data | System shall support DataFrame exchange between Julia and Python applications using Arrow IPC format |
| **FR-003** | Large file handling | System shall automatically detect payloads ≥0.5MB and upload them to HTTP file server instead of sending via transport |
| **FR-004** | Direct transport for small payloads | System shall send payloads <0.5MB directly via transport without file server upload |
| **FR-005** | MicroPython support | System shall support payloads <100KB on MicroPython devices using direct transport |
| **FR-006** | Multi-payload messages | System shall accept and process lists of (dataname, data, type) tuples |
| **FR-007** | Payload type preservation | System shall preserve payload types when returning multi-payload messages |
| **FR-008** | Plik file server integration | System shall support Plik one-shot upload mode with upload ID and token handling |
| **FR-009** | Custom file server support | System shall provide handler function abstraction for custom HTTP file server implementations |
| **FR-010** | Exponential backoff retry | System shall implement exponential backoff with configurable retries (default: 5, base_delay: 100ms, max_delay: 5000ms) for file server download failures |
| **FR-011** | Correlation ID propagation | System shall propagate correlation IDs through all message processing steps |
| **FR-012** | Message serialization | System shall serialize data types using Base64, JSON, or Arrow IPC encoding |
| **FR-013** | Transport publishing | System shall return JSON string representation for caller to publish via transport layer (caller is responsible for actual transport publish) |
| **FR-014** | Transport subscription | System shall receive and process messages by accepting JSON string from transport payload |
| **FR-001** | Cross-platform text messaging | System shall allow users to send text messages between Julia, JavaScript, Python, and MicroPython applications | FR-001 KPI: 95% of text messages delivered correctly across all platform pairs (<200ms latency) |
| **FR-002** | Cross-platform tabular data | System shall support DataFrame exchange between Julia and Python applications using Arrow IPC format | FR-002 KPI: 100% Arrow IPC round-trip integrity (Desktop), 100% JSON table round-trip integrity (Browser) |
| **FR-003** | Large file handling | System shall automatically detect payloads ≥0.5MB and upload them to HTTP file server instead of sending via transport | FR-003 KPI: 99% successful file uploads to server for payloads ≥0.5MB |
| **FR-004** | Direct transport for small payloads | System shall send payloads <0.5MB directly via transport without file server upload | FR-004 KPI: 100% of payloads <0.5MB use direct transport |
| **FR-005** | MicroPython support | System shall support payloads <100KB on MicroPython devices using direct transport | FR-005 KPI: 100% of payloads <100KB delivered on MicroPython devices |
| **FR-006** | Multi-payload messages | System shall accept and process lists of (dataname, data, type) tuples | FR-006 KPI: 100% correct parsing of multi-payload message lists |
| **FR-007** | Payload type preservation | System shall preserve payload types when returning multi-payload messages | FR-007 KPI: 100% type integrity preserved across all platforms |
| **FR-008** | Plik file server integration | System shall support Plik one-shot upload mode with upload ID and token handling | FR-008 KPI: 100% successful Plik upload/token handling |
| **FR-009** | Custom file server support | System shall provide handler function abstraction for custom HTTP file server implementations | FR-009 KPI: 100% handler abstraction works with custom implementations |
| **FR-010** | Exponential backoff retry | System shall implement exponential backoff with configurable retries (default: 5, base_delay: 100ms, max_delay: 5000ms) for file server download failures | FR-010 KPI: 95% successful downloads within retry limit |
| **FR-011** | Correlation ID propagation | System shall propagate correlation IDs through all message processing steps | FR-011 KPI: 100% correlation IDs propagated through all steps |
| **FR-012** | Message serialization | System shall serialize data types using Base64, JSON, or Arrow IPC encoding | FR-012 KPI: <50ms serialization overhead for 10KB payload |
| **FR-013** | Transport publishing | System shall return JSON string representation for caller to publish via transport layer (caller is responsible for actual transport publish) | FR-013 KPI: 100% JSON envelope generated correctly |
| **FR-014** | Transport subscription | System shall receive and process messages by accepting JSON string from transport payload | FR-014 KPI: 100% JSON messages processed correctly |
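FR-003 and FR-004 together define a size-based routing rule, sketched below. The transport names `direct` and `link` follow the usage in the implementation plan's test tables. The constant value is an assumption: the documents say "0.5MB" without stating whether that means 500,000 or 524,288 bytes, so the decimal reading is used here and should be checked against `DEFAULT_SIZE_THRESHOLD` in the source. The function names are hypothetical.

```python
from typing import List, Tuple

# Assumption: 0.5MB read as decimal bytes; verify against the real constant.
SIZE_THRESHOLD_BYTES = 500_000


def select_transport(payload: bytes,
                     size_threshold: int = SIZE_THRESHOLD_BYTES) -> str:
    """Return "link" for payloads at or above the threshold, else "direct"."""
    return "link" if len(payload) >= size_threshold else "direct"


def plan_transports(payloads: List[Tuple[str, bytes, str]],
                    size_threshold: int = SIZE_THRESHOLD_BYTES
                    ) -> List[Tuple[str, str]]:
    """Decide per payload, so one message can mix direct and link entries.

    Each payload is a (dataname, serialized_bytes, type) tuple.
    """
    return [(dataname, select_transport(data, size_threshold))
            for dataname, data, _ptype in payloads]
```

The comparison uses `>=` because FR-003 covers payloads of at least 0.5MB while FR-004 covers payloads strictly below it, which leaves no size unassigned.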
---
## 4. Non-Functional Requirements (NFRs)
**Requirement vs KPI Clarification:**
- An **FR or NFR** is a *requirement*: it defines what quality or constraint the system must have (e.g., "System shall support 10K TPS", "99.9% monthly uptime", "TLS 1.3+ encryption")
- A **KPI** is a *measurement*: it is the actual data collected to verify whether the requirement was met (e.g., "Peak traffic was 8.5K TPS", "MTTR was 8 minutes", "100% of connections use TLS 1.3")
- Requirements tell you **what to build**; KPIs tell you **how well you built it**
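To make the distinction concrete, the sketch below treats the requirement as fixed numbers (a 200 ms latency bound and a 95% target, as in the FR-001 KPI) and the KPI as something computed from collected samples. The function name and the idea of feeding it a plain list of latencies are assumptions for illustration; the project's actual synthetic monitoring may work differently.

```python
from typing import Sequence


def latency_kpi_met(latencies_ms: Sequence[float],
                    threshold_ms: float = 200.0,
                    target_percent: int = 95) -> bool:
    """Return True if at least target_percent of samples beat threshold_ms."""
    if not latencies_ms:
        raise ValueError("no samples collected; the KPI cannot be evaluated")
    within = sum(1 for value in latencies_ms if value < threshold_ms)
    # Integer comparison avoids floating-point rounding at the boundary.
    return within * 100 >= target_percent * len(latencies_ms)
```

The requirement never changes between runs; only the measured samples do, which is why the same check can pass one week and fail the next.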
### 4.1 Performance & Scalability
| ID | Requirement | Specification | Test Method |
|----|-------------|---------------|-------------|
| **NFR-101** | Message serialization overhead | <50ms for 10KB payload | Benchmark tests |
| **NFR-102** | Message deserialization overhead | <50ms for 10KB payload | Benchmark tests |
| **NFR-103** | Transport connection establishment | <100ms | Connection pool benchmarks |
| **NFR-104** | File upload latency | <1s for 0.5MB file | Integration tests |
| **NFR-105** | File download latency | <1s for 0.5MB file | Integration tests |
| **NFR-106** | Concurrent connections | Support 100+ simultaneous transport connections | Scale testing |
| **NFR-107** | Message throughput | Handle 1000+ messages/second per instance | Load testing |
| **NFR-108** | File server scalability | Support horizontal scaling of file server backend | Architecture review |
| ID | Requirement | Specification | KPI | Test Method |
|----|-------------|---------------|-----|-------------|
| **NFR-101** | Message serialization overhead | <50ms for 10KB payload | <50ms for 10KB payload | Benchmark tests |
| **NFR-102** | Message deserialization overhead | <50ms for 10KB payload | <50ms for 10KB payload | Benchmark tests |
| **NFR-103** | Transport connection establishment | <100ms | <100ms average | Connection pool benchmarks |
| **NFR-104** | File upload latency | <1s for 0.5MB file | <1s for 0.5MB file | Integration tests |
| **NFR-105** | File download latency | <1s for 0.5MB file | <1s for 0.5MB file | Integration tests |
| **NFR-106** | Concurrent connections | Support 100+ simultaneous transport connections | 100+ simultaneous connections | Scale testing |
| **NFR-107** | Message throughput | Handle 1000+ messages/second per instance | 1000+ messages/second | Load testing |
| **NFR-108** | File server scalability | Support horizontal scaling of file server backend | Horizontal scaling verified | Architecture review |
### 4.2 Availability & Reliability
| ID | Requirement | Specification |
|----|-------------|---------------|
| **NFR-201** | Message delivery | At-least-once delivery semantics via transport |
| **NFR-202** | File server availability | Graceful degradation when file server is unavailable |
| **NFR-203** | Connection recovery | Auto-reconnect on transport connection failure |
| ID | Requirement | Specification | KPI | Test Method |
|----|-------------|---------------|-----|-------------|
| **NFR-201** | Message delivery | At-least-once delivery semantics via transport | At-least-once delivery via transport | Message acknowledgment tests |
| **NFR-202** | File server availability | Graceful degradation when file server is unavailable | <5% failure rate when file server unavailable | Failure injection tests |
| **NFR-203** | Connection recovery | Auto-reconnect on transport connection failure | Auto-reconnect within 30s | Connection failure tests |
### 4.3 Privacy & Security
| ID | Requirement | Specification |
|----|-------------|---------------|
| **NFR-301** | Payload integrity | SHA-256 checksum support via metadata |
| **NFR-302** | Transport security | TLS support for transport connections |
| **NFR-303** | File server security | Authentication token for file uploads |
| ID | Requirement | Specification | KPI | Test Method |
|----|-------------|---------------|-----|-------------|
| **NFR-301** | Payload integrity | SHA-256 checksum support via metadata | 100% SHA-256 checksum validation | Integrity tests |
| **NFR-302** | Transport security | TLS support for transport connections | 100% TLS connections in production | Connection audits |
| **NFR-303** | File server security | Authentication token for file uploads | 100% authenticated file uploads | Security tests |
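NFR-301 asks for SHA-256 checksum support carried in payload metadata. A minimal sketch of that idea, using only the standard library, is shown below. The metadata key `sha256` and both function names are assumptions; the documents do not specify how msghandler names or stores the digest.

```python
import hashlib
import hmac
from typing import Optional


def attach_checksum(data: bytes, metadata: Optional[dict] = None) -> dict:
    """Return a copy of metadata with a hex SHA-256 digest of data added."""
    result = dict(metadata or {})
    result["sha256"] = hashlib.sha256(data).hexdigest()  # hypothetical key
    return result


def verify_checksum(data: bytes, metadata: dict) -> bool:
    """Recompute the digest on the receiver and compare it to the metadata."""
    expected = metadata.get("sha256")
    if expected is None:
        return False  # no checksum present, so integrity cannot be confirmed
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(expected, actual)
```

The sender would call `attach_checksum` after serialization and the receiver would call `verify_checksum` before deserialization, so the digest covers exactly the bytes that travelled.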
### 4.4 Observability & Telemetry
| ID | Requirement | Specification |
|----|-------------|---------------|
| **NFR-401** | Required logs | `correlation_id`, `msg_id`, `timestamp`, `sender_name`, `receiver_name`, `payload_type`, `transport` |
| **NFR-402** | Critical metrics | `messages_sent_total`, `messages_received_total`, `file_upload_duration_seconds`, `file_download_duration_seconds`, `retry_attempts_total` |
| **NFR-403** | Tracing | Correlation ID propagation for request tracing |
| **NFR-404** | Alerting | `download_retry_exceeded` triggers alert when max retries exceeded |
| **NFR-405** | Retention | Logs: 30 days, Metrics: 1 year |
| ID | Requirement | Specification | KPI | Test Method |
|----|-------------|---------------|-----|-------------|
| **NFR-401** | Required logs | `correlation_id`, `msg_id`, `timestamp`, `sender_name`, `receiver_name`, `payload_type`, `transport` | 100% messages logged with required fields | Log validation |
| **NFR-402** | Critical metrics | `messages_sent_total`, `messages_received_total`, `file_upload_duration_seconds`, `file_download_duration_seconds`, `retry_attempts_total` | 100% metrics collected with 1-minute granularity | Metrics pipeline tests |
| **NFR-403** | Tracing | Correlation ID propagation for request tracing | 100% correlation ID propagation for tracing | Tracing validation |
| **NFR-404** | Alerting | `download_retry_exceeded` triggers alert when max retries exceeded | <5min alert latency for `download_retry_exceeded` | Alert pipeline tests |
| **NFR-405** | Retention | Logs: 30 days, Metrics: 1 year | Logs: 30 days, Metrics: 1 year | Storage audits |
---
@@ -173,7 +206,7 @@ msghandler is a cross-platform, bi-directional data bridge that enables seamless
| Condition | Description |
|-----------|-------------|
| **AC-001** | All functional requirements FR-001 through FR-014 are implemented and tested |
| **AC-002** | All non-functional requirements NFR-101 through NFR-405 meet specified targets |
| **AC-002** | All non-functional requirements NFR-101 through NFR-405 meet specified KPI targets |
| **AC-003** | Cross-platform text message test passes (Julia ↔ JavaScript ↔ Python) |
| **AC-004** | Cross-platform tabular data test passes with Arrow IPC round-trip (Desktop) |
| **AC-005** | Cross-platform tabular data test passes with JSON table round-trip (Browser) |
@@ -406,6 +439,10 @@ function smartunpack(
| Date | Version | Changes |
|------|---------|---------|
| 2026-05-22 | 1.3.0 | Updated to ASG Framework v8 pillars - added KPIs to all FR and NFR requirements |
| - | - | Added Success Metrics & KPIs section with measurable targets for each requirement |
| - | - | Added NFR vs KPI clarification section |
| - | - | Updated NFR tables to include KPI column and Test Method column |
| 2026-05-15 | 1.3.0 | Made transport layer agnostic |
| - | - | Removed all NATS-specific dependencies and references |
| - | - | Updated all NATS references to generic "transport layer"/"message broker" |

345
docs/solution-design.md Normal file

@@ -0,0 +1,345 @@
# Solution Design: msghandler
**Version**: 1.3.0
**Date**: 2026-05-22
**Status**: Active
**Ground Truth**: [`src/msghandler.jl`](../src/msghandler.jl)
---
## 1. Problem Decomposition
msghandler addresses the challenge of cross-platform data exchange between **Julia**, **JavaScript**, **Python**, **Dart**, **Rust**, and **MicroPython** applications using message brokers as transport layers.
### Problem Statement
Developers working across multiple programming languages face significant obstacles when trying to share data:
| Problem | Description | User Impact |
|---------|-------------|-------------|
| **P-001**: Cross-platform data serialization | Different languages have incompatible data types and serialization formats | Developers must write platform-specific conversion code |
| **P-002**: Large payload handling | Message brokers have size limits, but large files need to be transferred | Large files either fail or require complex workarounds |
| **P-003**: Transport abstraction | Each platform has different message broker libraries and APIs | No unified interface across platforms |
| **P-004**: Request-response patterns | Bi-directional communication requires complex correlation tracking | Developers must implement custom message routing |
### Solution Boundaries
**In Scope**:
- Unified API for `smartpack()` and `smartunpack()` across all platforms
- Automatic transport selection based on payload size
- File server integration using Claim-Check pattern
- Multi-payload support with mixed types in single message
- Exponential backoff for reliable file downloads
**Out of Scope**:
- Message compression (adds complexity without clear benefit)
- Message encryption (application-layer concern)
- Advanced message routing (simple topic matching sufficient)
- Persistent message queues (transport pattern sufficient)
### Decision IDs
| Decision ID | Decision | Description |
|-------------|----------|-------------|
| SD-001 | Claim-Check Pattern | Large payloads uploaded to HTTP server, small payloads sent directly |
| SD-002 | Automatic Transport Selection | <0.5MB = direct, ≥0.5MB = link based on size threshold |
| SD-003 | Handler Function Abstraction | Pluggable file server implementations via handler functions |
| SD-004 | Unified Tuple Format | Same `(dataname, data, type)` format across all platforms |
| SD-005 | Base64 Encoding | JSON-compatible binary data transport |
| SD-006 | Transport Abstraction | Support multiple broker protocols (NATS/MQTT/WebSocket) transparently |
---
## 2. Solution Approach
msghandler implements a **Claim-Check pattern** with intelligent transport selection:
```
Sender (smartpack) Transport Layer Receiver (smartunpack)
┌─────────────────┐ ┌───────────────┐ ┌───────────────────┐
│ │ │ │ │ │
│ 1. Data tuples │────────────>│ │───────────>│ 1. Parse envelope │
│ [(name, │ JSON │ Message │ JSON │ 2. Check transport│
│ data, type)]│ format │ Broker │ format │ 3. Fetch/Decode │
│ │ │ (NATS/MQTT/ │ │ 4. Return tuples │
└─────────────────┘ │ WebSocket) │ │ │
│ │ └───────────────────┘
└───────────────┘
```
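The four receiver-side steps in the diagram can be sketched as below. This is a minimal Python illustration, not the library's implementation: the envelope field names (`payloads`, `dataname`, `type`, `transport`, `data`) and the `download` callback are assumptions made for this example, and the real schema is defined by the specification.

```python
import base64
import json
from typing import Callable, List, Tuple

Payload = Tuple[str, bytes, str]  # (dataname, data, type)


def unpack_envelope(raw: str, download: Callable[[str], bytes]) -> List[Payload]:
    """Receiver flow: parse, check transport, fetch or decode, return tuples.

    All field names here are hypothetical placeholders.
    """
    envelope = json.loads(raw)                      # 1. Parse envelope
    result: List[Payload] = []
    for entry in envelope["payloads"]:
        if entry["transport"] == "direct":          # 2. Check transport
            data = base64.b64decode(entry["data"])  # 3a. Decode inline Base64
        elif entry["transport"] == "link":
            data = download(entry["data"])          # 3b. Fetch from the file server URL
        else:
            raise ValueError(f"unknown transport: {entry['transport']!r}")
        result.append((entry["dataname"], data, entry["type"]))
    return result                                   # 4. Return tuples


# Usage with a stubbed download handler, so no network access is needed.
raw = json.dumps({"payloads": [
    {"dataname": "msg", "type": "text", "transport": "direct",
     "data": base64.b64encode(b"Hello").decode("ascii")},
    {"dataname": "report", "type": "binary", "transport": "link",
     "data": "http://files.example/report.bin"},
]})
tuples = unpack_envelope(raw, download=lambda url: b"<bytes from " + url.encode() + b">")
```

In this sketch every payload comes back as raw bytes; converting bytes back into a platform type (for example a string for `text`) would be a separate step keyed on the type tag.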
### Key Design Decisions
| Decision ID | Decision | Rationale | Alternatives Rejected |
|-------------|----------|-----------|----------------------|
| **SD-001** | Claim-Check Pattern | Large payloads (≥0.5MB) uploaded to HTTP server, small payloads sent directly via transport | Client-side compression - adds complexity; Server-side compression - not universally supported |
| **SD-002** | Automatic Transport Selection | <0.5MB = direct (fast), ≥0.5MB = link (avoid transport limits) | Manual selection - error-prone; Fixed threshold - not adaptive |
| **SD-003** | Handler Function Abstraction | Allows pluggable file server implementations (Plik, AWS S3, custom) | Hardcoded Plik - not flexible; Interface-based - too complex for this use case |
| **SD-004** | Unified Tuple Format | Same input/output format across all platforms | Platform-native formats - no interoperability; Protocol buffers - too heavy |
| **SD-005** | Base64 Encoding | JSON-compatible binary data transport | Raw bytes - not JSON-compatible; Hex encoding - 2x size overhead |
| **SD-006** | Transport Abstraction | Support multiple broker protocols (NATS/MQTT/WebSocket) transparently | Platform-specific libraries - no interoperability |
### Architecture Components
```mermaid
flowchart TB
subgraph Client["Client Application"]
direction TB
APP["Application Code"]
API["msghandler API"]
APP -->|Data tuples| API
end
subgraph Transport["Transport Layer"]
direction TB
BROKER["Message Broker<br/>NATS/MQTT/WebSocket"]
TOPICS["Topic Subscription"]
API -->|Publish JSON envelope| BROKER
BROKER -->|Deliver| TOPICS
TOPICS -->|Subscribe| API
end
subgraph FileServer["File Server"]
direction TB
UPLOAD["Upload Handler"]
DOWNLOAD["Download Handler"]
API -.->|Upload URL| UPLOAD
DOWNLOAD -.->|Fetch URL| API
end
style Client fill:#e1f5fe,stroke:#0288d1,stroke-width:2px
style Transport fill:#ffe0b2,stroke:#f57c00,stroke-width:2px
style FileServer fill:#c8e6c9,stroke:#43a047,stroke-width:2px
```
---
## 3. Alternatives Considered
| Alternative | Pros | Cons | Decision |
|-------------|------|------|----------|
| **gRPC/Protobuf** | Strong typing, efficient binary format | No native MicroPython support; Complex schema management | Rejected - not cross-platform enough |
| **MessagePack** | Compact binary, good performance | Browser support limited; No standard for tabular data | Rejected - no tabular format comparable to Arrow IPC |
| **Protocol Buffers** | Type-safe, efficient | No native support for tabular data exchange | Rejected - cannot represent DataFrames natively |
| **REST HTTP Upload** | Simple, universal | High latency; No real-time capability | Rejected - not suitable for message broker pattern |
| **Hybrid (direct/link)** | Optimal for both small and large payloads | More complex implementation | Accepted - matches user requirements (FR-003, FR-004) |
| **Single transport type** | Simpler implementation | Cannot handle large payloads efficiently | Rejected - violates FR-003 requirement |
| **Platform-specific APIs** | Native performance | No interoperability; Maintenance burden | Rejected - violates cross-platform goal |
---
## 4. High-Level Component Diagram
```mermaid
flowchart TD
subgraph msghandler["msghandler Core Module"]
direction TB
subgraph Serialization["Serialization Layer"]
DIR["Direct Transport"]
LNK["Link Transport"]
DIR -->|Base64| JSON_MSG
LNK -->|HTTP URL| JSON_MSG
end
subgraph Envelope["Envelope Builder"]
HDR["Message Header"]
PAY["Payload Manager"]
HDR --> PAY
end
subgraph Handlers["Handler Functions"]
UPD["Upload Handler"]
DWN["Download Handler"]
UPD --> LNK
DWN --> LNK
end
API["smartpack() / smartunpack()"]
API -->|Input| Serialization
API -->|Output| Serialization
API -->|Configure| Handlers
end
subgraph Transport["Transport Layer"]
BROKER["NATS / MQTT / WebSocket"]
API -->|JSON| BROKER
BROKER -->|JSON| API
end
subgraph FileServer["File Server"]
Plik["HTTP Server"]
UPD -.->|POST| Plik
Plik -.->|URL| DWN
end
style msghandler fill:#b3e5fc,stroke:#0288d1,stroke-width:2px
style Transport fill:#ffe0b2,stroke:#f57c00,stroke-width:2px
style FileServer fill:#c8e6c9,stroke:#43a047,stroke-width:2px
```
### Component Responsibilities
| Component | Responsibilities | Decision IDs | Requirements Addressed |
|-----------|-----------------|--------------|----------------------|
| **Serialization Layer** | Convert data types to transport format (Base64/URL) | SD-005 | FR-001, FR-002, FR-012 |
| **Envelope Builder** | Create standardized message envelope with metadata | SD-001 | FR-011, FR-013, FR-014 |
| **Handler Functions** | Abstract file server operations for pluggability | SD-003 | FR-008, FR-009 |
| **Transport Adapter** | Support multiple broker protocols transparently | SD-006 | FR-013, FR-014 |
| **Payload Manager** | Track payload types, sizes, and encoding | SD-004 | FR-006, FR-007 |
---
## 5. Decision Rationale
### SD-001: Why Claim-Check Pattern?
**Requirement**: FR-003 - Large file handling, FR-004 - Direct transport for small payloads
**Rationale**:
- Message brokers enforce message size limits (NATS, for example, defaults to a 1MB maximum payload)
- Direct transport is faster for small payloads (no file server round-trip)
- Link transport avoids transport limits for large payloads
- User doesn't need to manually choose - automatic selection based on threshold
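
The sender side of the Claim-Check pattern might look like the sketch below. It is illustrative only: the entry field names, the `upload` callback signature, the `files.example` URL, and reading "0.5MB" as 512 * 1024 bytes are all assumptions, not details confirmed by the source.

```python
import base64
from typing import Callable, Dict

THRESHOLD_BYTES = 512 * 1024  # assumption: "0.5MB" read as 0.5 MiB


def pack_payload(dataname: str, data: bytes, ptype: str,
                 upload: Callable[[str, bytes], str]) -> Dict[str, str]:
    """Build one envelope entry, applying Claim-Check to large data."""
    if len(data) >= THRESHOLD_BYTES:
        # Claim-Check: store the bytes out of band and send only a reference.
        return {"dataname": dataname, "type": ptype,
                "transport": "link", "data": upload(dataname, data)}
    # Small payload: embed it in the message as Base64 text.
    return {"dataname": dataname, "type": ptype, "transport": "direct",
            "data": base64.b64encode(data).decode("ascii")}


# Usage with an in-memory stand-in for the file server.
store: Dict[str, bytes] = {}


def fake_upload(name: str, blob: bytes) -> str:
    url = f"http://files.example/{name}"
    store[url] = blob
    return url


small = pack_payload("msg", b"Hello", "text", fake_upload)
large = pack_payload("video", b"\x00" * THRESHOLD_BYTES, "video", fake_upload)
```

The broker only ever carries the small `large["data"]` URL string, while the half-megabyte body stays on the file server until the receiver claims it.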
### SD-003: Why Handler Functions for File Server?
**Requirement**: FR-008 - Plik integration, FR-009 - Custom file server support
**Rationale**:
- Plik is a common open-source file server solution
- Some users need AWS S3 or custom implementation
- Handler functions provide clean abstraction without vendor lock-in
- Same signature across all platforms (unified API)
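
To illustrate how a handler function keeps the file server pluggable, here is a small Python sketch with two interchangeable backends. The `(filename, data) -> URL` signature, the factory names, and the example URLs are hypothetical; the real handler signature is defined in the specification.

```python
import hashlib
from typing import Callable, Dict

# Assumed handler signature: (filename, data) -> download URL.
UploadHandler = Callable[[str, bytes], str]


def make_memory_handler(store: Dict[str, bytes], base_url: str) -> UploadHandler:
    """Return a handler that keeps files in a dict (a stand-in for a real server)."""
    def handler(filename: str, data: bytes) -> str:
        url = f"{base_url}/{filename}"
        store[url] = data
        return url
    return handler


def make_content_addressed_handler(store: Dict[str, bytes], base_url: str) -> UploadHandler:
    """Alternative backend: name each file by its SHA-256 digest instead."""
    def handler(filename: str, data: bytes) -> str:
        url = f"{base_url}/{hashlib.sha256(data).hexdigest()}"
        store[url] = data
        return url
    return handler


def send_large(data: bytes, upload: UploadHandler) -> str:
    """Calling code sees only the handler signature, never the backend."""
    return upload("payload.bin", data)


store_a: Dict[str, bytes] = {}
store_b: Dict[str, bytes] = {}
url_a = send_large(b"abc", make_memory_handler(store_a, "http://plik.example"))
url_b = send_large(b"abc", make_content_addressed_handler(store_b, "http://s3.example"))
```

Swapping backends changes only the argument passed to `send_large`, which is the property the handler abstraction is meant to provide.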
### SD-004: Why Tuple Format for Payloads?
**Requirement**: FR-006 - Multi-payload messages, FR-007 - Payload type preservation
**Rationale**:
- `(dataname, data, type)` tuple is language-agnostic
- Simple to understand: name, content, type
- Supports mixed payload types in single message
- Easy to serialize/deserialize across platforms
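
A short Python sketch of the `(dataname, data, type)` format with mixed payloads follows. The type names are taken from the payload types listed in the requirements traceability table; the `validate_payloads` helper is hypothetical and exists only to make the shape of the tuples explicit.

```python
from typing import Any, List, Tuple

# Payload types named in this design's requirements traceability table.
PAYLOAD_TYPES = {"text", "dictionary", "arrowtable", "jsontable",
                 "image", "audio", "video", "binary"}

Payload = Tuple[str, Any, str]  # (dataname, data, type)


def validate_payloads(payloads: List[Payload]) -> List[Payload]:
    """Check each tuple's shape and type tag; return the list unchanged if valid."""
    for item in payloads:
        if len(item) != 3:
            raise ValueError(f"expected (dataname, data, type), got {item!r}")
        dataname, _data, ptype = item
        if not isinstance(dataname, str) or not dataname:
            raise ValueError("dataname must be a non-empty string")
        if ptype not in PAYLOAD_TYPES:
            raise ValueError(f"unsupported payload type: {ptype!r}")
    return payloads


# One message carrying three different payload types.
message = validate_payloads([
    ("msg", "Hello", "text"),
    ("settings", {"theme": "dark"}, "dictionary"),
    ("avatar", b"\x89PNG", "image"),
])
```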
### SD-005: Why Base64 Encoding?
**Requirement**: FR-012 - Message serialization, FR-001 - Cross-platform text messaging
**Rationale**:
- JSON is universal - works on all platforms
- Base64 converts binary to ASCII for JSON compatibility
- Standard format with native support in all languages
- No additional dependencies needed
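
The round trip through Base64 and JSON can be checked with the Python standard library alone. The field names in the JSON object below are placeholders chosen for this example.

```python
import base64
import json

binary = bytes(range(256)) * 3  # 768 arbitrary bytes, including non-UTF-8 values

# Raw bytes cannot be placed in JSON directly, so encode them as ASCII text first.
encoded = base64.b64encode(binary).decode("ascii")
wire = json.dumps({"dataname": "blob", "type": "binary", "data": encoded})

# The receiver reverses the two steps and recovers the exact bytes.
decoded = base64.b64decode(json.loads(wire)["data"])

# Base64 emits 4 characters per 3 input bytes, so size grows by about one third.
overhead = len(encoded) / len(binary)
```

The cost is a size increase of roughly 33%, compared with the 2x overhead of hex encoding.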
### SD-002: Why Automatic Transport Selection?
**Requirement**: FR-003, FR-004, NFR-104, NFR-105
**Rationale**:
- <0.5MB payloads use direct transport (<1s latency, FR-004 KPI)
- ≥0.5MB payloads use link transport to avoid transport limits (FR-003 KPI: 99% successful uploads)
- User doesn't need to manually choose - automatic selection based on threshold
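
The threshold rule reduces to a single comparison, sketched here in Python. The function name and the interpretation of 0.5MB as 512 * 1024 bytes are assumptions; the point of the example is the boundary behavior, where a payload exactly at the threshold goes to link transport.

```python
DEFAULT_THRESHOLD_BYTES = 512 * 1024  # assumption: 0.5MB interpreted as 0.5 MiB


def select_transport(size_bytes: int, threshold: int = DEFAULT_THRESHOLD_BYTES) -> str:
    """Return "direct" below the threshold and "link" at or above it."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "direct" if size_bytes < threshold else "link"


just_under = select_transport(DEFAULT_THRESHOLD_BYTES - 1)
exactly_at = select_transport(DEFAULT_THRESHOLD_BYTES)
```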
### SD-006: Why Transport Abstraction?
**Requirement**: FR-013, FR-014, NFR-201
**Rationale**:
- Support multiple broker protocols (NATS, MQTT, WebSocket) transparently
- Caller handles actual transport publishing/subscription
- Unified API across all platforms
- At-least-once delivery semantics via transport layer
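
Because the caller owns publishing, the library side only needs a function it can hand a serialized envelope to. The Python sketch below assumes a hypothetical `(topic, message) -> None` publish callable; the two list and dict "brokers" are stand-ins for real NATS, MQTT, or WebSocket clients.

```python
import json
from typing import Callable, Dict, List, Tuple

# Assumed shape of a caller-supplied publish function: (topic, message) -> None.
Publish = Callable[[str, str], None]


def send(topic: str, envelope: Dict[str, object], publish: Publish) -> None:
    """Serialize the envelope and hand it to whatever broker client the caller uses."""
    publish(topic, json.dumps(envelope))


# Two stand-in "brokers" with different topic conventions.
nats_like: List[Tuple[str, str]] = []
mqtt_like: Dict[str, List[str]] = {}

send("chat.room1", {"msg_id": "1"}, lambda t, m: nats_like.append((t, m)))
send("chat/room1", {"msg_id": "2"}, lambda t, m: mqtt_like.setdefault(t, []).append(m))
```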
---
## 6. Risk Assessment
| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| **Performance degradation with >500KB payloads** | High | Medium | Size threshold detection; Link transport fallback |
| **File server availability issues** | Medium | Low | Exponential backoff retry; Graceful degradation |
| **Platform-specific bugs** | Medium | Low | Comprehensive test suite per platform; CI validation |
| **Encoding mismatches between platforms** | High | Low | Strict specification; Test contracts; Validation rules |
| **Transport layer incompatibility** | Medium | Low | Transport-agnostic design; Handler abstraction |
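
The exponential backoff mitigation for file server outages could take the following form. This is a sketch under stated assumptions: the function name, the default retry count and delays, and the choice to retry only on `OSError` are illustrative, and the error message simply reuses the `download_retry_exceeded` alert name from the requirements.

```python
import time
from typing import Callable, Dict, List, Optional


def download_with_backoff(fetch: Callable[[str], bytes], url: str,
                          max_retries: int = 5, base_delay: float = 0.1,
                          max_delay: float = 5.0,
                          sleep: Callable[[float], None] = time.sleep) -> bytes:
    """Call fetch(url), doubling the wait after each failure up to max_delay."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except OSError as exc:  # network-style errors only; others propagate
            last_error = exc
            if attempt == max_retries:
                break
            sleep(min(base_delay * (2 ** attempt), max_delay))
    raise RuntimeError(f"download_retry_exceeded: {url}") from last_error


# Usage: a fetch that fails twice, with sleep replaced by a recorder.
delays: List[float] = []
calls: Dict[str, int] = {"n": 0}


def flaky_fetch(url: str) -> bytes:
    calls["n"] += 1
    if calls["n"] <= 2:
        raise ConnectionError("file server unavailable")
    return b"payload"


data = download_with_backoff(flaky_fetch, "http://files.example/x", sleep=delays.append)
```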
---
## 7. Requirements Traceability
| Solution Component | Decision ID | Requirement ID | Description |
|-------------------|-------------|----------------|-------------|
| **smartpack() function** | SD-001, SD-002, SD-004, SD-005, SD-006 | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-008, FR-009, FR-010, FR-011, FR-012, FR-013, FR-014 | Unified API for sending messages across all platforms |
| **smartunpack() function** | SD-001, SD-002, SD-004, SD-005, SD-006 | FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-008, FR-009, FR-010, FR-011, FR-012, FR-013, FR-014 | Unified API for receiving messages across all platforms |
| **Direct transport** | SD-002 | FR-004, NFR-101, NFR-102, NFR-104, NFR-105 | Send payloads < threshold directly via transport |
| **Link transport** | SD-001, SD-002 | FR-003, NFR-104, NFR-105 | Upload payloads ≥ threshold to file server |
| **File server handler** | SD-003 | FR-008, FR-009, FR-010 | Pluggable upload/download handlers with retry logic |
| **Payload type preservation** | SD-004 | FR-006, FR-007 | Support text, dictionary, arrowtable, jsontable, image, audio, video, binary |
| **Correlation ID** | SD-001 | FR-011, NFR-401, NFR-403 | Message tracing across distributed systems |
| **Multi-payload support** | SD-004 | FR-006, FR-007 | List of (dataname, data, type) tuples |
### Non-Functional Requirements Traceability
| Solution Component | Decision ID | NFR ID | Description |
|-------------------|-------------|--------|-------------|
| **Serialization optimization** | SD-005 | NFR-101, NFR-102 | <50ms overhead for 10KB payloads |
| **Transport efficiency** | SD-006 | NFR-103 | <100ms connection establishment |
| **File server latency** | SD-001, SD-002 | NFR-104, NFR-105 | <1s upload/download for 0.5MB files |
| **Concurrent connections** | SD-006 | NFR-106 | Support 100+ simultaneous connections |
| **Message throughput** | SD-005, SD-006 | NFR-107 | Handle 1000+ messages/second per instance |
| **At-least-once delivery** | SD-006 | NFR-201 | Transport layer semantics |
| **Graceful degradation** | SD-003 | NFR-202 | File server unavailability handling |
| **Auto-reconnect** | SD-006 | NFR-203 | Transport connection failure recovery |
| **Required logs** | SD-001 | NFR-401 | Correlation ID, msg_id, timestamp, etc. |
| **Critical metrics** | SD-001, SD-005 | NFR-402 | messages_sent_total, file upload/download duration |
| **Tracing** | SD-001 | NFR-403 | Correlation ID propagation |
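
As an illustration of the logging and tracing rows above, the Python sketch below builds a log record with the seven fields required by NFR-401 and shows a reply reusing the request's correlation ID. The helper name, the sample application names, and the use of a fresh UUID for `msg_id` are assumptions.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Dict

REQUIRED_LOG_FIELDS = ("correlation_id", "msg_id", "timestamp", "sender_name",
                       "receiver_name", "payload_type", "transport")


def make_log_record(correlation_id: str, sender_name: str, receiver_name: str,
                    payload_type: str, transport: str) -> Dict[str, str]:
    """Build one log record containing every field required by NFR-401."""
    return {
        "correlation_id": correlation_id,  # propagated unchanged for tracing (NFR-403)
        "msg_id": str(uuid.uuid4()),       # assumption: a fresh UUID per message
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sender_name": sender_name,
        "receiver_name": receiver_name,
        "payload_type": payload_type,
        "transport": transport,
    }


request = make_log_record("corr-123", "julia-app", "web-client", "text", "direct")
# A reply reuses the request's correlation_id but gets its own msg_id.
reply = make_log_record(request["correlation_id"], "web-client", "julia-app", "text", "direct")
line = json.dumps(request)
```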
---
## 8. Gap-Check Validation
| Stage Transition | Gap-Check Question | Status |
|------------------|-------------------|--------|
| **Requirements → Solution Design** | Does the Solution Design clearly explain how the system solves the user problem, not just what it does? | ✅ Verified - All user stories mapped to solution components with requirement ID and decision ID references |
| **Solution Design → Specification** | Does the Specification define all technical details that the solution approach requires? | ⏳ Pending - Specification needs review for completeness |
| **Solution Design → Walkthrough** | Does the Walkthrough reflect the complete flow including error states and timing? | ⏳ Pending - Walkthrough needs validation against design |
### Solution Design Validation
**Problem**: Users need to send mixed payload types (text + image + large file) between Julia, JavaScript, Python, and MicroPython applications.
**Solution Components**:
1. **SD-001, SD-002, SD-004, SD-005, SD-006** - `smartpack()` - Unified API for all platforms
2. **SD-004** - Tuple format - `(dataname, data, type)` - platform-agnostic
3. **SD-002** - Automatic transport selection - <0.5MB = direct, ≥0.5MB = link
4. **SD-003** - File server handler abstraction - Plik/AWS S3/custom support
5. **SD-003** - Exponential backoff - Reliable file downloads
6. **SD-001** - Correlation ID - Message tracing
**Requirement Mapping**:
- FR-001, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-008, FR-009, FR-010, FR-011, FR-012, FR-013, FR-014 ✅
**Gap Check**: Does this solution explain *how* users will actually use the system?
**Answer**: Yes - the walkthrough provides concrete examples:
1. JavaScript sends `[("msg", "Hello", "text"), ("avatar", binary_data, "image")]`
2. `smartpack()` automatically selects transport based on size (SD-002)
3. Large file (≥0.5MB) → link transport → file server upload (SD-001)
4. Small payload (<0.5MB) → direct transport → base64 encoding (SD-005)
5. Receiver calls `smartunpack()` → receives same tuple format
---
*This solution design document is versioned and maintained in git alongside the codebase. All implementations must adhere to this design.*
**Traceability Summary**:
- All requirements traced to solution components with SD-XXX decision IDs
- Each decision ID references the corresponding requirement IDs (FR-XXX, NFR-XXX)
- Specification must cite SD-XXX references for each technical detail