From 7bc3e4992a0694936b120362b8c3a4307996630e Mon Sep 17 00:00:00 2001 From: narawat Date: Fri, 13 Mar 2026 18:41:18 +0700 Subject: [PATCH] update architecture.md --- docs/architecture.md | 990 ++++++++++++++++++++++------------- docs/earlier_architecture.md | 475 +++++++++++++++++ 2 files changed, 1091 insertions(+), 374 deletions(-) create mode 100644 docs/earlier_architecture.md diff --git a/docs/architecture.md b/docs/architecture.md index b1f7929..340679b 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -1,379 +1,605 @@ -# Cross-Platform Architecture Documentation: Bi-Directional Data Bridge +# Architecture Documentation: NATSBridge -## Overview - -This document describes the architecture for a high-performance, bi-directional data bridge using **NATS (Core & JetStream)**, implementing the Claim-Check pattern for large payloads. The system is implemented across three platforms with **high-level API parity** while maintaining **idiomatic implementations** for each language. - -**Supported Platforms:** -- **Julia** - Ground truth implementation with full feature set -- **JavaScript** - Node.js and browser-compatible implementation -- **Python/MicroPython** - Desktop and embedded-compatible implementation - -### Cross-Platform Design Principles - -1. **High-Level API Parity**: All three platforms expose the same `smartsend()` and `smartreceive()` functions with identical signatures and behavior -2. **Idiomatic Implementations**: Each platform uses its native patterns (multiple dispatch in Julia, async/prototype in JS, class-based in Python) -3. **Message Format Consistency**: The `msg_envelope_v1` and `msg_payload_v1` JSON schemas are identical across all platforms -4. 
**Handler Function Abstraction**: File server operations are abstracted through handler functions for backend flexibility +**Version**: 1.0.0 +**Date**: 2026-03-13 +**Status**: Active +**Ground Truth**: [`src/NATSBridge.jl`](../src/NATSBridge.jl) +**Architecture Level**: C4 Container Level --- -## High-Level API Standard (Cross-Platform) +## Executive Summary -### Unified API Signature +This document defines the **blueprint** for NATSBridge - the cross-platform bi-directional data bridge that enables seamless communication between **Julia**, **JavaScript**, **Python**, and **MicroPython** applications using NATS as the message bus. -All three platforms expose the same high-level API: - -**Input Format (smartsend):** -``` -[(dataname1, data1, type1), (dataname2, data2, type2), ...] -``` - -**Output Format (smartreceive):** -``` -{ - "correlation_id": "...", - "msg_id": "...", - "timestamp": "...", - "send_to": "...", - "msg_purpose": "...", - "sender_name": "...", - "sender_id": "...", - "receiver_name": "...", - "receiver_id": "...", - "reply_to": "...", - "reply_to_msg_id": "...", - "broker_url": "...", - "metadata": {...}, - "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...] 
-} -``` - -### Supported Payload Types - -| Type | Julia | JavaScript | Python/MicroPython | -|------|-------|------------|-------------------| -| `text` | `String` | `string` | `str` | -| `dictionary` | `Dict`, `NamedTuple` | `Object`, `Array` | `dict`, `list` | -| `arrowtable` | `DataFrame`, `Arrow.Table` | `Array` (input) → `Buffer` (Arrow IPC) | `pandas.DataFrame`, `bytes` (Arrow IPC) | -| `jsontable` | `Vector{NamedTuple}`, `Vector{Dict}` | `Array` | `list[dict]`, `list` | -| `table` | ❌ | ❌ | `pandas.DataFrame`, `bytes` (Arrow IPC) | -| `image` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | -| `audio` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | -| `video` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | -| `binary` | `Vector{UInt8}`, `IOBuffer` | `Uint8Array`, `Buffer` | `bytes`, `bytearray`, `io.BytesIO` | - -**Note on MicroPython:** MicroPython does not support table types (`arrowtable` or `jsontable`) due to memory constraints. Use `dictionary` or `binary` instead. 
- -### Cross-Platform API Examples - -**Julia:** -```julia -using NATSBridge - -# Send -env, env_json_str = smartsend( - "/chat", - [("message", "Hello!", "text"), ("image", image_bytes, "image")], - broker_url="nats://localhost:4222" -) - -# Receive - returns JSON.Object{String, Any} -env = smartreceive(msg; fileserver_download_handler=_fetch_with_backoff) -# env is a JSON.Object{String, Any} with "payloads" field containing Vector{Tuple{String, Any, String}} -# Access payloads: for (dataname, data, type) in env["payloads] -``` - -**JavaScript:** -```javascript -const NATSBridge = require('natsbridge'); - -// Send -const [env, env_json_str] = await NATSBridge.smartsend( - "/chat", - [ - ["message", "Hello!", "text"], - ["image", imageBuffer, "image"] - ], - { broker_url: "nats://localhost:4222" } -); - -// Receive - returns Promise -const env = await NATSBridge.smartreceive(msg, { - fileserver_download_handler: fetchWithBackoff -}); -// env is an object with "payloads" field containing Array of arrays -// Access payloads: for (const [dataname, data, type] of env.payloads) -``` - -**Python:** -```python -from natsbridge import NATSBridge - -# Send -env, env_json_str = NATSBridge.smartsend( - "/chat", - [("message", "Hello!", "text"), ("image", image_bytes, "image")], - broker_url="nats://localhost:4222" -) - -# Receive - returns Tuple[Dict, str] -env = NATSBridge.smartreceive( - msg, - fileserver_download_handler=fetch_with_backoff -) -# env is a Dict with "payloads" key containing List[Tuple[str, Any, str]] -# Access payloads: for dataname, data, type_ in env["payloads"] -``` - -**MicroPython:** -```python -from natsbridge import NATSBridge - -# Send (limited to direct transport due to memory constraints) -env, env_json_str = NATSBridge.smartsend( - "/chat", - [("message", "Hello!", "text")], - broker_url="nats://localhost:4222" -) -``` +This architecture document serves as the single source of truth for: +- **System Structure**: How components fit together and 
interact +- **Scaling Considerations**: How the system scales horizontally and vertically +- **Failure Modes**: How the system handles failures and recovers +- **Trade-off Decisions**: The rationale behind architectural decisions --- -## Architecture Diagram (Cross-Platform) +## Architecture Overview + +### C4 Context Diagram ```mermaid flowchart TD - subgraph Client - App[Julia/JS/Python/MicroPython Application] + subgraph "External Systems" + NATS_Server[NATS Server] + File_Server[HTTP File Server
Plik/AWS S3/Custom] end - subgraph Server - Julia/JS/Python/MicroPython[Julia/JS/Python/MicroPython Service] - NATS[NATS Server] - FileServer[HTTP File Server] + subgraph "Client Applications" + Julia_App[Julia Application] + JS_App[JavaScript Application
Node.js/Browser] + Python_App[Python Application
Desktop] + MicroPython_App[MicroPython Device] end - App -->|NATS| NATS - NATS -->|NATS| Julia/JS/Python/MicroPython - Julia/JS/Python/MicroPython -->|NATS| NATS - Julia/JS/Python/MicroPython -->|HTTP POST| FileServer + Julia_App -->|NATS| NATS_Server + JS_App -->|NATS| NATS_Server + Python_App -->|NATS| NATS_Server + MicroPython_App -->|NATS| NATS_Server - style App fill:#e8f5e9 - style Julia/JS/Python/MicroPython fill:#e8f5e9 - style NATS fill:#fff3e0 - style FileServer fill:#f3e5f5 + Julia_App -->|HTTP| File_Server + JS_App -->|HTTP| File_Server + Python_App -->|HTTP| File_Server + MicroPython_App -->|HTTP| File_Server + + style NATS_Server fill:#fff3e0,stroke:#f57c00 + style File_Server fill:#f3e5f5,stroke:#9c27b4 + style Julia_App fill:#e8f5e9,stroke:#4caf50 + style JS_App fill:#e3f2fd,stroke:#2196f3 + style Python_App fill:#e3f2fd,stroke:#2196f3 + style MicroPython_App fill:#fce4ec,stroke:#e91e63 +``` + +### C4 Container Diagram + +```mermaid +flowchart TD + subgraph "Client Container" + Julia_Module[Julia NATSBridge Module] + JS_Module[JavaScript NATSBridge Module] + Python_Module[Python NATSBridge Module] + MicroPython_Module[MicroPython NATSBridge Module] + end + + subgraph "NATS Container" + NATS_Client[NATS Client] + NATS_Broker[NATS Broker] + end + + subgraph "File Server Container" + File_Client[HTTP Client] + File_Server[File Server] + end + + Julia_Module --> NATS_Client + JS_Module --> NATS_Client + Python_Module --> NATS_Client + MicroPython_Module --> NATS_Client + + NATS_Client --> NATS_Broker + + Julia_Module --> File_Client + JS_Module --> File_Client + Python_Module --> File_Client + MicroPython_Module --> File_Client + + File_Client --> File_Server + + style Julia_Module fill:#e8f5e9,stroke:#4caf50 + style JS_Module fill:#e3f2fd,stroke:#2196f3 + style Python_Module fill:#e3f2fd,stroke:#2196f3 + style MicroPython_Module fill:#fce4ec,stroke:#e91e63 + style NATS_Broker fill:#fff3e0,stroke:#f57c00 + style File_Server fill:#f3e5f5,stroke:#9c27b4 
+``` + +### C4 Component Diagram (Julia Implementation) + +```mermaid +flowchart TD + subgraph "NATSBridge Module" + SmartSend[smartsend Function] + SmartReceive[smartreceive Function] + + Serialize[_serialize_data] + Deserialize[_deserialize_data] + + BuildEnvelope[build_envelope] + BuildPayload[build_payload] + + PublishMessage[publish_message] + + FileServerUpload[fileserver_upload_handler] + FileServerDownload[fileserver_download_handler] + end + + subgraph "Data Models" + Payload[MsgPayloadV1 Struct] + Envelope[MsgEnvelopeV1 Struct] + end + + SmartSend --> Serialize + SmartSend --> BuildEnvelope + SmartSend --> BuildPayload + SmartSend --> PublishMessage + SmartSend --> FileServerUpload + + SmartReceive --> Deserialize + SmartReceive --> FileServerDownload + + Serialize --> Payload + BuildEnvelope --> Envelope + BuildPayload --> Payload + + style SmartSend fill:#d1fae5,stroke:#10b981 + style SmartReceive fill:#d1fae5,stroke:#10b981 + style PublishMessage fill:#fef3c7,stroke:#f59e0b + style FileServerUpload fill:#fef3c7,stroke:#f59e0b + style FileServerDownload fill:#fef3c7,stroke:#f59e0b ``` --- -## System Components +## High-Level Architecture -### 1. 
msg_envelope_v1 - Message Envelope +### System Components + +| Component | Purpose | Platform Support | +|-----------|---------|------------------| +| **smartsend** | Send data via NATS with automatic transport selection | All | +| **smartreceive** | Receive and process NATS messages | All | +| **_serialize_data** | Serialize data according to payload type | All | +| **_deserialize_data** | Deserialize bytes to native data types | All | +| **_build_envelope** | Build message envelope from payloads | All | +| **_build_payload** | Build payload object from serialized data | All | +| **publish_message** | Publish message to NATS subject | All | +| **fileserver_upload_handler** | Upload large payloads to HTTP server | Desktop | +| **fileserver_download_handler** | Download payloads from HTTP server | Desktop | + +### Data Flow + +```mermaid +flowchart TD + A[User calls smartsend subject data] --> B[Process each payload] + B --> C{Calculate serialized size} + C -->|Size < Threshold| D[Direct Transport] + C -->|Size >= Threshold| E[Link Transport] + + D --> F[Serialize data] + F --> G[Base64 encode] + G --> H[Build payload object] + + E --> I[Serialize data] + I --> J[Upload to file server] + J --> K[Get download URL] + K --> H + + H --> L[Build envelope] + L --> M[Convert to JSON] + M --> N[Publish to NATS] + + style A fill:#f9f9f9,stroke:#333 + style N fill:#e0e7ff,stroke:#3b82f6 + style D fill:#d1fae5,stroke:#10b981 + style E fill:#fef3c7,stroke:#f59e0b +``` + +--- + +## Message Envelope Architecture + +### msg_envelope_v1 Structure (Julia) + +```julia +struct msg_envelope_v1 + correlation_id::String # UUID v4 for distributed tracing + msg_id::String # UUID v4 for this message + timestamp::String # ISO 8601 UTC timestamp + + send_to::String # NATS subject to publish to + msg_purpose::String # ACK, NACK, updateStatus, shutdown, chat + sender_name::String # Sender application name + sender_id::String # UUID v4 of sender + receiver_name::String # Receiver application 
name (empty = broadcast) + receiver_id::String # UUID v4 of receiver (empty = broadcast) + + reply_to::String # Topic for reply messages + reply_to_msg_id::String # Message ID being replied to + broker_url::String # NATS broker URL + + metadata::Dict{String, Any} # Message-level metadata + payloads::Vector{msg_payload_v1} # List of payloads +end +``` + +### msg_payload_v1 Structure (Julia) + +```julia +struct msg_payload_v1 + id::String # UUID v4 for this payload + dataname::String # Name of the payload + payload_type::String # text, dictionary, arrowtable, etc. + transport::String # direct or link + encoding::String # none, json, base64, arrow-ipc + size::Integer # Size in bytes + data::Any # Base64 string or URL + metadata::Dict{String, Any} # Payload-level metadata +end +``` + +### JSON Schema (Cross-Platform) -**JSON Schema (Identical Across All Platforms):** ```json { - "correlation_id": "uuid-v4-string", - "msg_id": "uuid-v4-string", - "timestamp": "2024-01-15T10:30:00Z", - - "send_to": "topic/subject", - "msg_purpose": "ACK | NACK | updateStatus | shutdown | chat", - "sender_name": "agent-wine-web-frontend", - "sender_id": "uuid4", - "receiver_name": "agent-backend", - "receiver_id": "uuid4", - "reply_to": "topic", - "reply_to_msg_id": "uuid4", - "broker_url": "nats://localhost:4222", - - "metadata": { - "content_type": "application/octet-stream", - "content_length": 123456 - }, - + "correlation_id": "string (UUID v4)", + "msg_id": "string (UUID v4)", + "timestamp": "string (ISO 8601 UTC)", + "send_to": "string", + "msg_purpose": "string", + "sender_name": "string", + "sender_id": "string (UUID v4)", + "receiver_name": "string", + "receiver_id": "string (UUID v4)", + "reply_to": "string", + "reply_to_msg_id": "string", + "broker_url": "string", + "metadata": "object", "payloads": [ { - "id": "uuid4", - "dataname": "login_image", - "payload_type": "image", - "transport": "direct", - "encoding": "base64", - "size": 15433, - "data": "base64-encoded-string", - 
"metadata": { - "checksum": "sha256_hash" - } - }, - { - "id": "uuid4", - "dataname": "large_arrow_table", - "payload_type": "arrowtable", - "transport": "link", - "encoding": "arrow-ipc", - "size": 524288, - "data": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/data.arrow", - "metadata": {} + "id": "string (UUID v4)", + "dataname": "string", + "payload_type": "string", + "transport": "string", + "encoding": "string", + "size": "integer", + "data": "string or URL", + "metadata": "object" } ] } ``` -### 2. msg_payload_v1 - Payload Structure +--- -**JSON Schema (Identical Across All Platforms):** -```json -{ - "id": "uuid4", - "dataname": "login_image", - "payload_type": "image | dictionary | arrowtable | jsontable | table | text | audio | video | binary", - "transport": "direct | link", - "encoding": "none | json | base64 | arrow-ipc", - "size": 15433, - "data": "base64-encoded-string | http-url | json-string", - "metadata": { - "checksum": "sha256_hash" - } -} -``` +## Payload Type Architecture -### 3. Transport Strategy Decision Logic (Cross-Platform) +### Supported Payload Types -``` -┌─────────────────────────────────────────────────────────────┐ -│ smartsend Function (All Platforms) │ -│ Accepts: [(dataname1, data1, type1), ...] │ -│ (Type is per payload, not standalone) │ -└─────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────┐ -│ For each payload: │ -│ 1. Extract type from tuple/array │ -│ 2. Serialize based on type │ -│ 3. 
Check payload size │ -└─────────────────────────────────────────────────────────────┘ - │ - ┌───────────┴────────────┐ - ▼ ▼ - ┌──────────────┐ ┌──────────────┐ - │ Direct Path │ │ Link Path │ - │ (< 1MB) │ │ (>= 1MB) │ - │ │ │ │ - │ • Serialize │ │ • Serialize │ - │ to buffer │ │ to buffer │ - │ • Base64/JSON│ │ • Upload to │ - │ encode │ │ HTTP Server│ - │ • Publish to │ │ • Publish to │ - │ NATS │ │ NATS with │ - │ (in msg) │ │ URL │ - └──────────────┘ └──────────────┘ +| Type | Description | Serialization | Encoding | Platforms | +|------|-------------|---------------|----------|-----------| +| `text` | Plain text string | UTF-8 bytes | Base64 | All | +| `dictionary` | JSON object | JSON string | Base64/JSON | All | +| `arrowtable` | Apache Arrow IPC | Arrow IPC stream | Base64/arrow-ipc | Desktop | +| `jsontable` | JSON array of objects | JSON string | Base64/json | All | +| `image` | Binary image data | Raw bytes | Base64 | All | +| `audio` | Binary audio data | Raw bytes | Base64 | All | +| `video` | Binary video data | Raw bytes | Base64 | All | +| `binary` | Generic binary data | Raw bytes | Base64 | All | + +### Serialization Logic + +```mermaid +flowchart TD + A[Input data + payload_type] --> B{Payload Type} + + B -->|"text"| C[UTF-8 encode] + B -->|"dictionary"| D[JSON serialize] + B -->|"arrowtable"| E[Arrow IPC serialize] + B -->|"jsontable"| F[JSON serialize] + B -->|"image"| G[Raw bytes] + B -->|"audio"| H[Raw bytes] + B -->|"video"| I[Raw bytes] + B -->|"binary"| J[Raw bytes] + + C --> K[Return bytes] + D --> K + E --> K + F --> K + G --> K + H --> K + I --> K + J --> K + + style A fill:#f9f9f9,stroke:#333 + style K fill:#e0e7ff,stroke:#3b82f6 ``` --- -## Platform Comparison Matrix +## Transport Strategy Architecture -| Feature | Julia | JavaScript | Python | MicroPython | -|---------|-------|------------|--------|-------------| -| **Multiple Dispatch** | ✅ Native | ❌ (Prototypes) | ❌ (Overload via `@overload`) | ❌ | -| **Async/Await** | ❌ (Tasks) 
| ✅ Native | ✅ Native | ⚠️ (uasyncio) | -| **Type Safety** | ✅ Strong | ⚠️ (TypeScript) | ✅ (Type hints) | ❌ | -| **Memory Management** | ✅ GC | ✅ GC | ✅ GC | ⚠️ (Manual) | -| **Arrow IPC** | ✅ Native | ✅ (arrow package) | ✅ (pyarrow) | ❌ | -| **JSON Serialization** | ✅ (JSON.jl) | ✅ (native) | ✅ (json) | ✅ (json) | -| **arrowtable Support** | ✅ | ✅ | ✅ | ❌ | -| **jsontable Support** | ✅ | ✅ | ✅ | ❌ | -| **Direct Transport** | ✅ | ✅ | ✅ | ✅ | -| **Link Transport** | ✅ | ✅ | ✅ | ⚠️ (Limited) | -| **Handler Functions** | ✅ | ✅ | ✅ | ✅ | -| **Cross-Platform API** | ✅ | ✅ | ✅ | ✅ | +### Size Threshold Decision Logic + +| Platform | Size Threshold | Notes | +|----------|----------------|-------| +| Desktop (Julia/JS/Python) | 500,000 bytes (0.5MB) | Default threshold | +| MicroPython | 100,000 bytes (100KB) | Lower threshold for memory constraints | + +### Transport Selection Flow + +```mermaid +flowchart TD + A[smartsend called] --> B[Serialize payload] + B --> C[Calculate size] + C --> D{Size < Threshold?} + + D -->|Yes| E[Direct Transport] + D -->|No| F[Link Transport] + + E --> G[Base64 encode] + G --> H[Build payload with direct transport] + + F --> I[Upload to file server] + I --> J[Get download URL] + J --> K[Build payload with link transport] + + H --> L[Build envelope] + K --> L + + style A fill:#f9f9f9,stroke:#333 + style L fill:#e0e7ff,stroke:#3b82f6 + style E fill:#d1fae5,stroke:#10b981 + style F fill:#fef3c7,stroke:#f59e0b +``` + +### Direct Transport Protocol + +When `transport = "direct"`, the `data` field contains a Base64-encoded string of the serialized payload. + +**Encoding Rules**: +- `text`: UTF-8 → Base64 +- `dictionary`: JSON → Base64 (or direct JSON) +- `arrowtable`: Arrow IPC → Base64 (or arrow-ipc) +- `jsontable`: JSON → Base64 (or direct JSON) +- `image`/`audio`/`video`/`binary`: Raw bytes → Base64 + +### Link Transport Protocol + +When `transport = "link"`, the `data` field contains a URL pointing to the uploaded payload. 
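The threshold rule and the two transport protocols described above can be sketched in a few lines. This is an illustrative Python sketch, not the NATSBridge implementation: `serialize`, `build_payload`, and the `upload` callback are hypothetical names, only a subset of payload types is covered, and the `encoding` value chosen for link transport is an assumption, since the text above only fixes the encoding for direct transport.

```python
import base64
import json
import uuid

DEFAULT_SIZE_THRESHOLD = 500_000  # desktop default from the threshold table


def serialize(data, payload_type):
    """Serialize a value to bytes (Arrow types omitted in this sketch)."""
    if payload_type == "text":
        return data.encode("utf-8")
    if payload_type in ("dictionary", "jsontable"):
        return json.dumps(data).encode("utf-8")
    if payload_type in ("image", "audio", "video", "binary"):
        return bytes(data)
    raise ValueError(f"unsupported payload_type: {payload_type}")


def build_payload(dataname, data, payload_type, upload=None,
                  size_threshold=DEFAULT_SIZE_THRESHOLD):
    """Build a msg_payload_v1-shaped dict, choosing direct or link transport.

    `upload` is a hypothetical handler: it takes (dataname, raw_bytes)
    and returns the download URL issued by the file server.
    """
    raw = serialize(data, payload_type)
    payload = {
        "id": str(uuid.uuid4()),
        "dataname": dataname,
        "payload_type": payload_type,
        "size": len(raw),
        "metadata": {},
    }
    if len(raw) < size_threshold:
        # Direct transport: the data field carries the Base64 text.
        payload["transport"] = "direct"
        payload["encoding"] = "base64"
        payload["data"] = base64.b64encode(raw).decode("ascii")
    else:
        # Link transport: upload the bytes, then carry only the URL.
        if upload is None:
            raise ValueError("link transport needs an upload handler")
        payload["transport"] = "link"
        # Assumption: JSON types keep "json"; everything else uses "none".
        is_json = payload_type in ("dictionary", "jsontable")
        payload["encoding"] = "json" if is_json else "none"
        payload["data"] = upload(dataname, raw)
    return payload
```

Passing the upload handler as a parameter mirrors the handler-function abstraction used elsewhere in this document, and it lets the size decision be exercised without a running file server.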
+ +**Upload Flow**: +1. Serialize payload according to `payload_type` +2. Upload to HTTP file server (e.g., Plik) +3. Include returned URL in `data` field + +**Download Flow**: +1. Extract URL from payload +2. Fetch with exponential backoff (max 5 retries) +3. Deserialize based on `payload_type` --- -## Platform-Specific Architecture Patterns +## Platform-Specific Architecture -### Julia: Multiple Dispatch Pattern +### Julia Architecture Julia leverages multiple dispatch for type-specific implementations: -- **Function overloading** based on argument types -- **Struct-based data models** with explicit types -- **Native Arrow IPC** support via Arrow.jl +- **Multiple Dispatch**: Function overloading based on argument types +- **Struct-based Data Models**: Explicit type definitions with `struct` +- **Native Arrow IPC**: Support via `Arrow.jl` +- **Async/Await**: Tasks for non-blocking I/O -### JavaScript: Prototype + Async Pattern +```julia +# Multiple dispatch for serialization +function _serialize_data(data::String, payload_type::String) + # Text serialization +end + +function _serialize_data(data::Dict, payload_type::String) + # Dictionary serialization +end + +function _serialize_data(data::DataFrame, payload_type::String) + # Arrow table serialization +end +``` + +### JavaScript Architecture JavaScript uses async/await for non-blocking I/O: -- **Class-based NATS client** for connection management -- **Module-level utility functions** for serialization -- **Native ArrayBuffer** for binary data handling +- **Class-based NATS Client**: Connection management +- **Module-level Utilities**: Serialization functions +- **Native ArrayBuffer**: Binary data handling +- **Fetch API**: HTTP file server communication -### Python: Class-Based Pattern +```javascript +// Class-based NATS client +class NATSClient { + constructor(url) { + this.url = url; + this.connection = null; + } + + async connect() { + this.connection = await nats.connect({ servers: this.url }); + } +} +``` + 
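Returning to the link-transport download flow (fetch with exponential backoff, at most 5 retries), the retry loop can be sketched as below. This is a minimal sketch under stated assumptions: `fetch_with_backoff` and the injected `fetch` callable are hypothetical names, "max 5 retries" is read as five attempts in total, and the 100 ms starting delay and 5000 ms cap are taken from the Failure Modes section of this document.

```python
import time


def fetch_with_backoff(fetch, url, max_retries=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call fetch(url) until it succeeds or max_retries attempts are used.

    `fetch` is any callable that returns the downloaded bytes or raises
    on failure; it stands in for the real HTTP client in this sketch.
    """
    delay = base_delay
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # sketch only: retry on any failure
            last_error = exc
            if attempt < max_retries:
                sleep(delay)
                # Double the wait each time, capped at max_delay.
                delay = min(delay * 2, max_delay)
    raise RuntimeError(
        f"download failed after {max_retries} attempts: {url}"
    ) from last_error
```

Injecting `fetch` and `sleep` keeps the loop independent of any particular HTTP library and makes the delay schedule observable in tests. A real handler would likely also include the correlation ID in the final error, as the Failure Modes section requires.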
+### Python Architecture Python uses classes for stateful operations: -- **Class-based NATSBridge** with type hints -- **Dataclasses** for structured data (MsgPayloadV1, MsgEnvelopeV1) -- **Async/await** for I/O operations +- **Class-based NATSBridge**: Encapsulated API +- **Dataclasses**: Structured data (MsgPayloadV1, MsgEnvelopeV1) +- **Async/await**: I/O operations +- **pyarrow**: Arrow IPC support -### MicroPython: Synchronous Pattern +```python +class NATSBridge: + DEFAULT_SIZE_THRESHOLD = 500_000 + + def __init__(self, broker_url=None, fileserver_url=None): + self.broker_url = broker_url or self.DEFAULT_BROKER_URL + self.fileserver_url = fileserver_url or self.DEFAULT_FILESERVER_URL +``` + +### MicroPython Architecture MicroPython has significant constraints: -- **Synchronous API** (no async/await) -- **Memory-constrained** (256KB - 1MB) -- **Limited payload support** (no tables, max 50KB) +- **Synchronous API**: No async/await +- **Memory-constrained**: 256KB - 1MB +- **Limited payload support**: No tables, max 50KB +- **Simplified UUID generation**: Custom implementation + +```python +# MicroPython constraints +DEFAULT_SIZE_THRESHOLD = 100_000 # 100KB +MAX_PAYLOAD_SIZE = 50_000 # 50KB hard limit +``` --- -## Cross-Platform Compatibility Notes +## Scaling Architecture -### 1. 
Payload Type Consistency +### Horizontal Scaling -All platforms use the same payload type values for tabular data: +| Component | Scaling Strategy | +|-----------|------------------| +| **NATS Server** | Cluster deployment with multiple nodes | +| **File Server** | Load balancer + multiple instances | +| **Client Applications** | Deploy multiple instances behind load balancer | -| Platform | Table Types | -|----------|-------------| -| Julia | `"arrowtable"`, `"jsontable"` | -| JavaScript | `"arrowtable"`, `"jsontable"` | -| Python | `"arrowtable"`, `"jsontable"` | -| MicroPython | Not supported | +### Vertical Scaling +| Component | Scaling Strategy | +|-----------|------------------| +| **NATS Server** | Increase memory, CPU, disk I/O | +| **File Server** | Increase memory, CPU, disk capacity | +| **Client Applications** | Increase heap size (Python/JS) | -### 2. Direct Transport Encoding Field +### Performance Considerations -The encoding field in direct transport payloads differs between platforms: - -| Platform | Encoding for Direct Transport | -|----------|-------------------------------| -| Julia | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | -| JavaScript | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | -| Python | Always `"base64"` for all direct transport payloads | -| MicroPython | Always `"base64"` for all direct transport payloads | - -**Impact:** The encoding field may not accurately reflect the original serialization format when using Python or MicroPython. - -### 3. 
MicroPython Limitations - -MicroPython has significant constraints that affect feature support: - -| Feature | Desktop Platforms | MicroPython | -|---------|-------------------|-------------| -| `arrowtable` | ✅ | ❌ (not supported - memory constraints) | -| `jsontable` | ✅ | ❌ (not supported - memory constraints) | -| `table` | ✅ | ❌ (not supported - memory constraints) | -| Async/await | ✅ | ❌ (synchronous only) | -| File upload/download | ✅ | ⚠️ (placeholder implementations) | -| MAX_PAYLOAD_SIZE | 1MB+ | 50KB (hard limit) | -| DEFAULT_SIZE_THRESHOLD | 1MB | 100KB | - -**Impact:** MicroPython should only be used for small payloads with direct transport. File server operations are not fully implemented. +| Metric | Target | Notes | +|--------|--------|-------| +| Message serialization overhead | <50ms | For 10KB payload | +| Message deserialization overhead | <50ms | For 10KB payload | +| NATS connection establishment | <100ms | Connection pool recommended | +| File upload latency | <1s | For 0.5MB file | +| File download latency | <1s | For 0.5MB file | --- -## Configuration +## Failure Modes and Recovery + +### NATS Connection Failure + +**Scenario**: NATS server unavailable + +**Handler**: +- Connection auto-reconnect via TCP-level reconnection +- Retry with exponential backoff for publish operations + +**Recovery**: +- NATS client automatically attempts reconnection +- Application can check connection status before publishing + +### File Server Unavailable + +**Scenario**: HTTP file server unavailable during upload/download + +**Handler**: +- Retry up to 5 times with exponential backoff (100ms → 5000ms) +- Fallback to direct transport for upload (MicroPython) + +**Recovery**: +- Exponential backoff: `delay = min(delay * 2, max_delay)` +- After max retries, throw error with correlation ID + +### Deserialization Error + +**Scenario**: Payload type mismatch or corrupted data + +**Handler**: +- Log correlation ID and throw error +- No retry (data corruption) + 
+**Recovery**: +- Application must validate payload_type matches data type +- Use proper serialization before sending + +### Memory Overflow (MicroPython) + +**Scenario**: Payload exceeds maximum size (50KB) + +**Handler**: +- Reject payloads >50KB with MemoryError +- No retry (client-side check) + +**Recovery**: +- Application must split large payloads +- Use direct transport only for small payloads + +--- + +## Trade-off Decisions + +### Decision 1: Direct vs Link Transport Threshold + +**Trade-off**: Memory vs Network I/O + +**Decision**: Use 0.5MB threshold for desktop, 100KB for MicroPython + +**Rationale**: +- Direct transport uses more memory (Base64 encoding adds ~33% overhead) +- Link transport requires network I/O for upload/download +- 0.5MB is reasonable for desktop memory constraints +- 100KB is necessary for MicroPython memory constraints + +### Decision 2: Base64 Encoding for Direct Transport + +**Trade-off**: Bandwidth vs Simplicity + +**Decision**: Use Base64 encoding for all direct transport payloads + +**Rationale**: +- Simplifies JSON serialization (all data is string-compatible) +- Increases payload size by ~33%, but NATS can handle this +- Alternative would be binary payload support (more complex) + +### Decision 3: Multiple Platform Implementations + +**Trade-off**: Development effort vs Cross-platform support + +**Decision**: Maintain separate implementations for each platform + +**Rationale**: +- Each platform has idiomatic patterns (multiple dispatch, async/await, etc.) 
+- Maintains developer productivity and code quality +- API parity ensures cross-platform compatibility + +### Decision 4: Handler Function Abstraction + +**Trade-off**: Flexibility vs Simplicity + +**Decision**: Abstract file server operations through handler functions + +**Rationale**: +- Allows support for different file server implementations (Plik, AWS S3, custom) +- Maintains simplicity for common use cases +- Enables plug-in architecture for custom backends + +--- + +## Deployment Architecture + +### Minimum Infrastructure + +| Component | Minimum | Notes | +|-----------|---------|-------| +| NATS Server | 1 instance | Single node for development | +| File Server | 1 instance | HTTP server for large payloads | +| Client Memory | 50MB | Desktop platforms | +| Client Memory | 256KB | MicroPython devices | ### Environment Variables @@ -381,95 +607,111 @@ MicroPython has significant constraints that affect feature support: |----------|---------|-------------| | `NATS_URL` | `nats://localhost:4222` | NATS server URL | | `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | -| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes (1MB) | +| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes | -### MicroPython-Specific Configuration +### Container Deployment -```python -# micropython.conf -NATS_URL = "nats://broker.local:4222" -FILESERVER_URL = "http://fileserver.local:8080" -SIZE_THRESHOLD = 100000 # Lower threshold for memory-constrained devices -MAX_PAYLOAD_SIZE = 50000 # Hard limit for MicroPython +```mermaid +flowchart TD + subgraph "Docker Network" + NATS_Container[NATS Server] + FileServer_Container[Plik File Server] + App_Container[Application Container] + end + + App_Container -->|NATS| NATS_Container + App_Container -->|HTTP| FileServer_Container + + style NATS_Container fill:#fff3e0,stroke:#f57c00 + style FileServer_Container fill:#f3e5f5,stroke:#9c27b4 + style App_Container fill:#e3f2fd,stroke:#2196f3 ``` --- -## Performance 
Considerations +## Security Considerations -### Zero-Copy Reading +### Payload Integrity -| Platform | Strategy | -|----------|----------| -| **Julia** | `Arrow.read()` with memory-mapped files | -| **JavaScript** | `ArrayBuffer` with `DataView` | -| **Python** | `pyarrow` memory mapping | -| **MicroPython** | Not available (streaming only) | +**Mechanism**: SHA-256 checksum via metadata -### Exponential Backoff +**Implementation**: +- Sender calculates checksum and stores in payload metadata +- Receiver validates checksum on receipt -All platforms implement exponential backoff for HTTP downloads: +### Transport Security -``` -delay = base_delay -for attempt in 1:max_retries: - try: - response = fetch(url) - if success: return response - except: - if attempt < max_retries: - sleep(delay) - delay = min(delay * 2, max_delay) -``` +**Mechanism**: TLS support for NATS connections -### Correlation ID Logging +**Implementation**: +- Use `nats://` URL for plain text +- Use `tls://` URL for TLS-encrypted connections -All platforms use correlation IDs for distributed tracing: +### File Server Security -``` -[timestamp] [Correlation: abc123] Message published to subject -``` +**Mechanism**: Authentication token for file uploads -### Serialization Performance Comparison - -| Format | Use Case | Pros | Cons | -|--------|----------|------|------| -| `arrowtable` | Large tabular data | Fast, zero-copy, schema-preserving | Binary format, requires Arrow library | -| `jsontable` | Small/medium tabular data | Human-readable, universal support | Slower, larger size, no schema | -| `table` (Python) | Large tabular data | Fast, zero-copy, schema-preserving | Python-specific, requires pyarrow | +**Implementation**: +- Plik uses upload token in `X-UploadToken` header +- Application can implement custom authentication --- -## Summary +## Testing Architecture -This cross-platform NATS bridge provides: +### Unit Test Coverage -1. 
**High-Level API Parity**: Identical `smartsend()` and `smartreceive()` signatures across Julia, JavaScript, and Python/MicroPython -2. **Idiomatic Implementations**: - - Julia: Multiple dispatch and struct-based design - - JavaScript: Async/await and prototype-based utilities - - Python: Class-based design with type hints - - MicroPython: Synchronous API with memory constraints -3. **Message Format Consistency**: Identical `msg_envelope_v1` and `msg_payload_v1` JSON schemas -4. **Handler Abstraction**: File server operations abstracted through configurable handlers -5. **Platform-Specific Optimizations**: - - **Arrow IPC** (`arrowtable`): Efficient binary format for large tabular data - - **JSON** (`jsontable`): Universal human-readable format for smaller tables - - **Python table**: Unified table type for Python-specific implementations - - Streaming support in MicroPython +| Test Category | Coverage | Files | +|---------------|----------|-------| +| Serialization | All payload types | `test/test_*_sender.*` | +| Deserialization | All payload types | `test/test_*_receiver.*` | +| Transport selection | Direct vs link | `test/test_*_mix_payloads.*` | +| File server upload | Plik integration | Platform-specific | +| File server download | Exponential backoff | Platform-specific | -The Julia implementation serves as the **ground truth** for API design and behavior, while JavaScript and Python implementations maintain interface parity while leveraging their respective language idioms. 
+### Integration Test Scenarios -### Datatype Summary +| Scenario | Platforms | Payloads | Transport | Expected Result | +|----------|-----------|----------|-----------|-----------------| +| Cross-platform text | Julia ↔ JS ↔ Python | text | direct | Round-trip successful | +| Arrow IPC round-trip | Julia ↔ JS ↔ Python | arrowtable | direct | Arrow IPC preserved | +| Large file transfer | All | image/audio/video | link | File server upload/download | +| Multi-payload mixed | All | text + image + file | direct/link | All payloads preserved | -| Datatype | Serialization | Use Case | Encoding | Supported Platforms | -|----------|---------------|----------|----------|---------------------| -| `text` | UTF-8 bytes | Text messages, chat content | `utf-8` → `base64` | All | -| `dictionary` | JSON | Structured key-value data, config | `json` → `base64` | All | -| `arrowtable` | Apache Arrow IPC | Large tabular data, schema-preserving | `arrow-ipc` → `base64` | Julia, JavaScript, Python | -| `jsontable` | JSON | Small/medium tabular data, human-readable | `json` → `base64` | Julia, JavaScript, Python | -| `table` | Apache Arrow IPC | Python's unified table type | `arrow-ipc` → `base64` | Python | -| `image` | Binary | Image files (JPEG, PNG, etc.) | `binary` → `base64` | All | -| `audio` | Binary | Audio files (WAV, MP3, etc.) | `binary` → `base64` | All | -| `video` | Binary | Video files (MP4, AVI, etc.) 
| `binary` → `base64` | All | -| `binary` | Binary | Generic binary data, files | `binary` → `base64` | All | +--- + +## Versioning + +### Architecture Versioning + +| Component | Version | Notes | +|-----------|---------|-------| +| Architecture | 1.0.0 | Initial release | +| Protocol | v1 | Message envelope protocol version | + +### Backward Compatibility + +| Version | Supported Platforms | +|---------|---------------------| +| v1.0.x | Julia 1.7+, Node.js 16+, Python 3.8+, MicroPython 1.19+ | + +--- + +## Change Log + +| Date | Version | Changes | +|------|---------|---------| +| 2026-03-13 | 1.0.0 | Initial architecture documentation | + +--- + +## References + +- [`docs/requirements.md`](./requirements.md) - Business requirements and user stories +- [`docs/spec.md`](./spec.md) - Technical specification and contracts +- [`src/NATSBridge.jl`](../src/NATSBridge.jl) - Ground truth implementation +- [`README.md`](../README.md) - Project overview + +--- + +*This architecture document is versioned and maintained in git alongside the codebase. All implementations must adhere to this architecture.* diff --git a/docs/earlier_architecture.md b/docs/earlier_architecture.md new file mode 100644 index 0000000..b1f7929 --- /dev/null +++ b/docs/earlier_architecture.md @@ -0,0 +1,475 @@ +# Cross-Platform Architecture Documentation: Bi-Directional Data Bridge + +## Overview + +This document describes the architecture for a high-performance, bi-directional data bridge using **NATS (Core & JetStream)**, implementing the Claim-Check pattern for large payloads. The system is implemented across three platforms with **high-level API parity** while maintaining **idiomatic implementations** for each language. + +**Supported Platforms:** +- **Julia** - Ground truth implementation with full feature set +- **JavaScript** - Node.js and browser-compatible implementation +- **Python/MicroPython** - Desktop and embedded-compatible implementation + +### Cross-Platform Design Principles + +1. 
**High-Level API Parity**: All three platforms expose the same `smartsend()` and `smartreceive()` functions with identical signatures and behavior +2. **Idiomatic Implementations**: Each platform uses its native patterns (multiple dispatch in Julia, async/prototype in JS, class-based in Python) +3. **Message Format Consistency**: The `msg_envelope_v1` and `msg_payload_v1` JSON schemas are identical across all platforms +4. **Handler Function Abstraction**: File server operations are abstracted through handler functions for backend flexibility + +--- + +## High-Level API Standard (Cross-Platform) + +### Unified API Signature + +All three platforms expose the same high-level API: + +**Input Format (smartsend):** +``` +[(dataname1, data1, type1), (dataname2, data2, type2), ...] +``` + +**Output Format (smartreceive):** +``` +{ + "correlation_id": "...", + "msg_id": "...", + "timestamp": "...", + "send_to": "...", + "msg_purpose": "...", + "sender_name": "...", + "sender_id": "...", + "receiver_name": "...", + "receiver_id": "...", + "reply_to": "...", + "reply_to_msg_id": "...", + "broker_url": "...", + "metadata": {...}, + "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...] 
+} +``` + +### Supported Payload Types + +| Type | Julia | JavaScript | Python/MicroPython | +|------|-------|------------|-------------------| +| `text` | `String` | `string` | `str` | +| `dictionary` | `Dict`, `NamedTuple` | `Object`, `Array` | `dict`, `list` | +| `arrowtable` | `DataFrame`, `Arrow.Table` | `Array` (input) → `Buffer` (Arrow IPC) | `pandas.DataFrame`, `bytes` (Arrow IPC) | +| `jsontable` | `Vector{NamedTuple}`, `Vector{Dict}` | `Array` | `list[dict]`, `list` | +| `table` | ❌ | ❌ | `pandas.DataFrame`, `bytes` (Arrow IPC) | +| `image` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | +| `audio` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | +| `video` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` | +| `binary` | `Vector{UInt8}`, `IOBuffer` | `Uint8Array`, `Buffer` | `bytes`, `bytearray`, `io.BytesIO` | + +**Note on MicroPython:** MicroPython does not support table types (`arrowtable` or `jsontable`) due to memory constraints. Use `dictionary` or `binary` instead. 
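The type table and the MicroPython note above can be turned into a small pre-send check. The following is a minimal sketch only: `check_payloads` and its `micropython` flag are hypothetical names introduced here for illustration and are not part of the NATSBridge API. The type names are the ones listed in the table.

```python
# Payload type names taken from the "Supported Payload Types" table.
ALL_TYPES = {
    "text", "dictionary", "arrowtable", "jsontable", "table",
    "image", "audio", "video", "binary",
}
# Table types that the document marks as unsupported on MicroPython.
TABLE_TYPES = {"arrowtable", "jsontable", "table"}


def check_payloads(payloads, micropython=False):
    """Hypothetical helper: validate a list of (dataname, data, type) items."""
    for item in payloads:
        if len(item) != 3:
            raise ValueError("each payload must be (dataname, data, type)")
        dataname, _data, type_ = item
        if type_ not in ALL_TYPES:
            raise ValueError(f"{dataname}: unknown payload type {type_!r}")
        if micropython and type_ in TABLE_TYPES:
            raise ValueError(
                f"{dataname}: {type_!r} is not supported on MicroPython"
            )
    return True
```

On a MicroPython target, a caller would follow the note above and pass tabular data as `dictionary` or `binary` so that the check succeeds.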
+ +### Cross-Platform API Examples + +**Julia:** +```julia +using NATSBridge + +# Send +env, env_json_str = smartsend( + "/chat", + [("message", "Hello!", "text"), ("image", image_bytes, "image")], + broker_url="nats://localhost:4222" +) + +# Receive - returns JSON.Object{String, Any} +env = smartreceive(msg; fileserver_download_handler=_fetch_with_backoff) +# env is a JSON.Object{String, Any} with "payloads" field containing Vector{Tuple{String, Any, String}} +# Access payloads: for (dataname, data, type) in env["payloads"] +``` + +**JavaScript:** +```javascript +const NATSBridge = require('natsbridge'); + +// Send +const [env, env_json_str] = await NATSBridge.smartsend( + "/chat", + [ + ["message", "Hello!", "text"], + ["image", imageBuffer, "image"] + ], + { broker_url: "nats://localhost:4222" } +); + +// Receive - returns Promise +const env = await NATSBridge.smartreceive(msg, { + fileserver_download_handler: fetchWithBackoff +}); +// env is an object with "payloads" field containing Array of arrays +// Access payloads: for (const [dataname, data, type] of env.payloads) +``` + +**Python:** +```python +from natsbridge import NATSBridge + +# Send +env, env_json_str = NATSBridge.smartsend( + "/chat", + [("message", "Hello!", "text"), ("image", image_bytes, "image")], + broker_url="nats://localhost:4222" +) + +# Receive - returns Dict +env = NATSBridge.smartreceive( + msg, + fileserver_download_handler=fetch_with_backoff +) +# env is a Dict with "payloads" key containing List[Tuple[str, Any, str]] +# Access payloads: for dataname, data, type_ in env["payloads"] +``` + +**MicroPython:** +```python +from natsbridge import NATSBridge + +# Send (limited to direct transport due to memory constraints) +env, env_json_str = NATSBridge.smartsend( + "/chat", + [("message", "Hello!", "text")], + broker_url="nats://localhost:4222" +) +``` + +--- + +## Architecture Diagram (Cross-Platform) + +```mermaid +flowchart TD + subgraph Client + 
App[Julia/JS/Python/MicroPython Application] + end + + subgraph Server + Julia/JS/Python/MicroPython[Julia/JS/Python/MicroPython Service] + NATS[NATS Server] + FileServer[HTTP File Server] + end + + App -->|NATS| NATS + NATS -->|NATS| Julia/JS/Python/MicroPython + Julia/JS/Python/MicroPython -->|NATS| NATS + Julia/JS/Python/MicroPython -->|HTTP POST| FileServer + + style App fill:#e8f5e9 + style Julia/JS/Python/MicroPython fill:#e8f5e9 + style NATS fill:#fff3e0 + style FileServer fill:#f3e5f5 +``` + +--- + +## System Components + +### 1. msg_envelope_v1 - Message Envelope + +**JSON Schema (Identical Across All Platforms):** +```json +{ + "correlation_id": "uuid-v4-string", + "msg_id": "uuid-v4-string", + "timestamp": "2024-01-15T10:30:00Z", + + "send_to": "topic/subject", + "msg_purpose": "ACK | NACK | updateStatus | shutdown | chat", + "sender_name": "agent-wine-web-frontend", + "sender_id": "uuid4", + "receiver_name": "agent-backend", + "receiver_id": "uuid4", + "reply_to": "topic", + "reply_to_msg_id": "uuid4", + "broker_url": "nats://localhost:4222", + + "metadata": { + "content_type": "application/octet-stream", + "content_length": 123456 + }, + + "payloads": [ + { + "id": "uuid4", + "dataname": "login_image", + "payload_type": "image", + "transport": "direct", + "encoding": "base64", + "size": 15433, + "data": "base64-encoded-string", + "metadata": { + "checksum": "sha256_hash" + } + }, + { + "id": "uuid4", + "dataname": "large_arrow_table", + "payload_type": "arrowtable", + "transport": "link", + "encoding": "arrow-ipc", + "size": 524288, + "data": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/data.arrow", + "metadata": {} + } + ] +} +``` + +### 2. 
msg_payload_v1 - Payload Structure + +**JSON Schema (Identical Across All Platforms):** +```json +{ + "id": "uuid4", + "dataname": "login_image", + "payload_type": "image | dictionary | arrowtable | jsontable | table | text | audio | video | binary", + "transport": "direct | link", + "encoding": "none | json | base64 | arrow-ipc", + "size": 15433, + "data": "base64-encoded-string | http-url | json-string", + "metadata": { + "checksum": "sha256_hash" + } +} +``` + +### 3. Transport Strategy Decision Logic (Cross-Platform) + +``` +┌─────────────────────────────────────────────────────────────┐ +│ smartsend Function (All Platforms) │ +│ Accepts: [(dataname1, data1, type1), ...] │ +│ (Type is per payload, not standalone) │ +└─────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ For each payload: │ +│ 1. Extract type from tuple/array │ +│ 2. Serialize based on type │ +│ 3. Check payload size │ +└─────────────────────────────────────────────────────────────┘ + │ + ┌───────────┴────────────┐ + ▼ ▼ + ┌──────────────┐ ┌──────────────┐ + │ Direct Path │ │ Link Path │ + │ (< 1MB) │ │ (>= 1MB) │ + │ │ │ │ + │ • Serialize │ │ • Serialize │ + │ to buffer │ │ to buffer │ + │ • Base64/JSON│ │ • Upload to │ + │ encode │ │ HTTP Server│ + │ • Publish to │ │ • Publish to │ + │ NATS │ │ NATS with │ + │ (in msg) │ │ URL │ + └──────────────┘ └──────────────┘ +``` + +--- + +## Platform Comparison Matrix + +| Feature | Julia | JavaScript | Python | MicroPython | +|---------|-------|------------|--------|-------------| +| **Multiple Dispatch** | ✅ Native | ❌ (Prototypes) | ❌ (Overload via `@overload`) | ❌ | +| **Async/Await** | ❌ (Tasks) | ✅ Native | ✅ Native | ⚠️ (uasyncio) | +| **Type Safety** | ✅ Strong | ⚠️ (TypeScript) | ✅ (Type hints) | ❌ | +| **Memory Management** | ✅ GC | ✅ GC | ✅ GC | ⚠️ (Manual) | +| **Arrow IPC** | ✅ Native | ✅ (arrow package) | ✅ (pyarrow) | ❌ | +| **JSON Serialization** | ✅ 
(JSON.jl) | ✅ (native) | ✅ (json) | ✅ (json) | +| **arrowtable Support** | ✅ | ✅ | ✅ | ❌ | +| **jsontable Support** | ✅ | ✅ | ✅ | ❌ | +| **Direct Transport** | ✅ | ✅ | ✅ | ✅ | +| **Link Transport** | ✅ | ✅ | ✅ | ⚠️ (Limited) | +| **Handler Functions** | ✅ | ✅ | ✅ | ✅ | +| **Cross-Platform API** | ✅ | ✅ | ✅ | ✅ | + +--- + +## Platform-Specific Architecture Patterns + +### Julia: Multiple Dispatch Pattern + +Julia leverages multiple dispatch for type-specific implementations: + +- **Function overloading** based on argument types +- **Struct-based data models** with explicit types +- **Native Arrow IPC** support via Arrow.jl + +### JavaScript: Prototype + Async Pattern + +JavaScript uses async/await for non-blocking I/O: + +- **Class-based NATS client** for connection management +- **Module-level utility functions** for serialization +- **Native ArrayBuffer** for binary data handling + +### Python: Class-Based Pattern + +Python uses classes for stateful operations: + +- **Class-based NATSBridge** with type hints +- **Dataclasses** for structured data (MsgPayloadV1, MsgEnvelopeV1) +- **Async/await** for I/O operations + +### MicroPython: Synchronous Pattern + +MicroPython has significant constraints: + +- **Synchronous API** (no async/await) +- **Memory-constrained** (256KB - 1MB) +- **Limited payload support** (no tables, max 50KB) + +--- + +## Cross-Platform Compatibility Notes + +### 1. Payload Type Consistency + +All platforms use the same payload type values for tabular data: + +| Platform | Table Types | +|----------|-------------| +| Julia | `"arrowtable"`, `"jsontable"` | +| JavaScript | `"arrowtable"`, `"jsontable"` | +| Python | `"arrowtable"`, `"jsontable"` | +| MicroPython | Not supported | + + +### 2. 
Direct Transport Encoding Field + +The encoding field in direct transport payloads differs between platforms: + +| Platform | Encoding for Direct Transport | +|----------|-------------------------------| +| Julia | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | +| JavaScript | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` | +| Python | Always `"base64"` for all direct transport payloads | +| MicroPython | Always `"base64"` for all direct transport payloads | + +**Impact:** The encoding field may not accurately reflect the original serialization format when using Python or MicroPython. + +### 3. MicroPython Limitations + +MicroPython has significant constraints that affect feature support: + +| Feature | Desktop Platforms | MicroPython | +|---------|-------------------|-------------| +| `arrowtable` | ✅ | ❌ (not supported - memory constraints) | +| `jsontable` | ✅ | ❌ (not supported - memory constraints) | +| `table` | ✅ | ❌ (not supported - memory constraints) | +| Async/await | ✅ | ❌ (synchronous only) | +| File upload/download | ✅ | ⚠️ (placeholder implementations) | +| MAX_PAYLOAD_SIZE | 1MB+ | 50KB (hard limit) | +| DEFAULT_SIZE_THRESHOLD | 1MB | 100KB | + +**Impact:** MicroPython should only be used for small payloads with direct transport. File server operations are not fully implemented. 
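The size limits in the table above combine with the direct/link decision (direct below the threshold, link at or above it) roughly as follows. This is a sketch under stated assumptions, not the library's implementation: `choose_transport` and the constant names are hypothetical, and rejecting a payload that exceeds the hard limit is an assumed behavior that the document does not spell out.

```python
DESKTOP_SIZE_THRESHOLD = 1_000_000      # SIZE_THRESHOLD default (1MB)
MICROPYTHON_SIZE_THRESHOLD = 100_000    # DEFAULT_SIZE_THRESHOLD on MicroPython
MICROPYTHON_MAX_PAYLOAD = 50_000        # MAX_PAYLOAD_SIZE hard limit on MicroPython


def choose_transport(size, threshold=DESKTOP_SIZE_THRESHOLD, max_payload=None):
    """Hypothetical helper: pick "direct" or "link" for a serialized payload.

    size        -- serialized payload size in bytes
    threshold   -- sizes below this go direct, sizes at or above it go by link
    max_payload -- optional hard limit; exceeding it raises (assumed behavior)
    """
    if max_payload is not None and size > max_payload:
        raise ValueError(f"payload of {size} bytes exceeds limit of {max_payload}")
    return "direct" if size < threshold else "link"
```

With the MicroPython values from the table, the hard limit (50KB) sits below the threshold (100KB), so under this sketch every accepted MicroPython payload takes the direct path, which matches the guidance above to use MicroPython only for small payloads with direct transport.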
+ +--- + +## Configuration + +### Environment Variables + +| Variable | Default | Description | +|----------|---------|-------------| +| `NATS_URL` | `nats://localhost:4222` | NATS server URL | +| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL | +| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes (1MB) | + +### MicroPython-Specific Configuration + +```python +# micropython.conf +NATS_URL = "nats://broker.local:4222" +FILESERVER_URL = "http://fileserver.local:8080" +SIZE_THRESHOLD = 100000 # Lower threshold for memory-constrained devices +MAX_PAYLOAD_SIZE = 50000 # Hard limit for MicroPython +``` + +--- + +## Performance Considerations + +### Zero-Copy Reading + +| Platform | Strategy | +|----------|----------| +| **Julia** | `Arrow.read()` with memory-mapped files | +| **JavaScript** | `ArrayBuffer` with `DataView` | +| **Python** | `pyarrow` memory mapping | +| **MicroPython** | Not available (streaming only) | + +### Exponential Backoff + +All platforms implement exponential backoff for HTTP downloads: + +``` +delay = base_delay +for attempt in 1:max_retries: + try: + response = fetch(url) + if success: return response + except: + if attempt < max_retries: + sleep(delay) + delay = min(delay * 2, max_delay) +``` + +### Correlation ID Logging + +All platforms use correlation IDs for distributed tracing: + +``` +[timestamp] [Correlation: abc123] Message published to subject +``` + +### Serialization Performance Comparison + +| Format | Use Case | Pros | Cons | +|--------|----------|------|------| +| `arrowtable` | Large tabular data | Fast, zero-copy, schema-preserving | Binary format, requires Arrow library | +| `jsontable` | Small/medium tabular data | Human-readable, universal support | Slower, larger size, no schema | +| `table` (Python) | Large tabular data | Fast, zero-copy, schema-preserving | Python-specific, requires pyarrow | + +--- + +## Summary + +This cross-platform NATS bridge provides: + +1. 
**High-Level API Parity**: Identical `smartsend()` and `smartreceive()` signatures across Julia, JavaScript, and Python/MicroPython +2. **Idiomatic Implementations**: + - Julia: Multiple dispatch and struct-based design + - JavaScript: Async/await and prototype-based utilities + - Python: Class-based design with type hints + - MicroPython: Synchronous API with memory constraints +3. **Message Format Consistency**: Identical `msg_envelope_v1` and `msg_payload_v1` JSON schemas +4. **Handler Abstraction**: File server operations abstracted through configurable handlers +5. **Platform-Specific Optimizations**: + - **Arrow IPC** (`arrowtable`): Efficient binary format for large tabular data + - **JSON** (`jsontable`): Universal human-readable format for smaller tables + - **Python table**: Unified table type for Python-specific implementations + - Streaming support in MicroPython + +The Julia implementation serves as the **ground truth** for API design and behavior, while JavaScript and Python implementations maintain interface parity while leveraging their respective language idioms. + +### Datatype Summary + +| Datatype | Serialization | Use Case | Encoding | Supported Platforms | +|----------|---------------|----------|----------|---------------------| +| `text` | UTF-8 bytes | Text messages, chat content | `utf-8` → `base64` | All | +| `dictionary` | JSON | Structured key-value data, config | `json` → `base64` | All | +| `arrowtable` | Apache Arrow IPC | Large tabular data, schema-preserving | `arrow-ipc` → `base64` | Julia, JavaScript, Python | +| `jsontable` | JSON | Small/medium tabular data, human-readable | `json` → `base64` | Julia, JavaScript, Python | +| `table` | Apache Arrow IPC | Python's unified table type | `arrow-ipc` → `base64` | Python | +| `image` | Binary | Image files (JPEG, PNG, etc.) | `binary` → `base64` | All | +| `audio` | Binary | Audio files (WAV, MP3, etc.) | `binary` → `base64` | All | +| `video` | Binary | Video files (MP4, AVI, etc.) 
| `binary` → `base64` | All | +| `binary` | Binary | Generic binary data, files | `binary` → `base64` | All |
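As an illustration of the `text` row in the datatype summary (`utf-8` → `base64`), the sketch below builds and reads back a direct-transport payload shaped like `msg_payload_v1`. The helper names are hypothetical, and two details are assumptions rather than documented behavior: `size` is taken to be the byte length before base64 encoding, and `checksum` is taken to be the hex SHA-256 digest of those same bytes.

```python
import base64
import hashlib
import uuid


def build_direct_text_payload(dataname, text):
    """Hypothetical helper: wrap a string as a direct msg_payload_v1 dict."""
    raw = text.encode("utf-8")
    return {
        "id": str(uuid.uuid4()),
        "dataname": dataname,
        "payload_type": "text",
        "transport": "direct",
        "encoding": "base64",
        "size": len(raw),  # assumption: size of the bytes before base64
        "data": base64.b64encode(raw).decode("ascii"),
        # assumption: hex SHA-256 over the bytes before base64
        "metadata": {"checksum": hashlib.sha256(raw).hexdigest()},
    }


def read_direct_text_payload(payload):
    """Hypothetical helper: decode the payload and verify its checksum if present."""
    raw = base64.b64decode(payload["data"])
    expected = payload.get("metadata", {}).get("checksum")
    if expected is not None and hashlib.sha256(raw).hexdigest() != expected:
        raise ValueError("checksum mismatch")
    return raw.decode("utf-8")
```

A receiver that finds no `checksum` key simply skips verification in this sketch, which mirrors the second payload in the envelope example, where `metadata` is empty.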