# Cross-Platform Architecture Documentation: Bi-Directional Data Bridge
## Overview
This document describes the architecture for a high-performance, bi-directional data bridge using **NATS (Core & JetStream)**, implementing the Claim-Check pattern for large payloads. The system is implemented across three platforms with **high-level API parity** while maintaining **idiomatic implementations** for each language.
**Supported Platforms:**
- **Julia** - Ground truth implementation with full feature set
- **JavaScript** - Node.js and browser-compatible implementation
- **Python/MicroPython** - Desktop and embedded-compatible implementation
### Cross-Platform Design Principles
1. **High-Level API Parity**: All three platforms expose the same `smartsend()` and `smartreceive()` functions with identical signatures and behavior
2. **Idiomatic Implementations**: Each platform uses its native patterns (multiple dispatch in Julia, async/prototype in JS, class-based in Python)
3. **Message Format Consistency**: The `msg_envelope_v1` and `msg_payload_v1` JSON schemas are identical across all platforms
4. **Handler Function Abstraction**: File server operations are abstracted through handler functions for backend flexibility
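
Principle 4 can be pictured as passing a callable into the receive path. The sketch below is a minimal illustration, not the library's code: `resolve_payload_bytes` and the `DownloadHandler` alias are hypothetical names, and it assumes that direct payloads carry base64 text in `data` while link payloads carry a URL, as in the `msg_payload_v1` schema.

```python
import base64
from typing import Any, Callable, Dict

# Hypothetical handler type: receives a URL and returns the bytes stored there.
DownloadHandler = Callable[[str], bytes]


def resolve_payload_bytes(payload: Dict[str, Any], download_handler: DownloadHandler) -> bytes:
    """Return the raw bytes of one payload, whatever its transport."""
    if payload["transport"] == "link":
        # The bridge logic never talks to the file server itself; the handler does.
        return download_handler(payload["data"])
    # Assumption: direct payloads are base64-encoded text.
    return base64.b64decode(payload["data"])


# An in-memory stand-in for a real HTTP backend, useful in tests.
fake_store = {"http://localhost:8080/file/a/b/data.bin": b"\x00\x01\x02"}
link_payload = {"transport": "link", "data": "http://localhost:8080/file/a/b/data.bin"}
direct_payload = {"transport": "direct", "encoding": "base64", "data": "aGk="}

link_bytes = resolve_payload_bytes(link_payload, fake_store.__getitem__)
direct_bytes = resolve_payload_bytes(direct_payload, fake_store.__getitem__)
```
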
---
## High-Level API Standard (Cross-Platform)
### Unified API Signature
All three platforms expose the same high-level API:
**Input Format (smartsend):**
```
[(dataname1, data1, type1), (dataname2, data2, type2), ...]
```
**Output Format (smartreceive):**
```
{
    "correlation_id": "...",
    "msg_id": "...",
    "timestamp": "...",
    "send_to": "...",
    "msg_purpose": "...",
    "sender_name": "...",
    "sender_id": "...",
    "receiver_name": "...",
    "receiver_id": "...",
    "reply_to": "...",
    "reply_to_msg_id": "...",
    "broker_url": "...",
    "metadata": {...},
    "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...]
}
```
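
On the receiving side, the `payloads` list of triples is often easier to use once indexed by name. The helper below is a hypothetical convenience, not part of the documented API; rejecting duplicate datanames is an assumption made for the sketch.

```python
from typing import Any, Dict, Tuple


def payloads_by_name(env: Dict[str, Any]) -> Dict[str, Tuple[Any, str]]:
    """Index the (dataname, data, type) triples of a received envelope by dataname."""
    indexed: Dict[str, Tuple[Any, str]] = {}
    for dataname, data, payload_type in env["payloads"]:
        if dataname in indexed:
            # Assumption: datanames are unique within one envelope.
            raise ValueError(f"duplicate dataname: {dataname}")
        indexed[dataname] = (data, payload_type)
    return indexed


# A hand-written envelope in the output format shown above (most fields omitted).
env = {
    "correlation_id": "abc123",
    "payloads": [("message", "Hello!", "text"), ("image", b"\x89PNG", "image")],
}
message, message_type = payloads_by_name(env)["message"]
```
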
### Supported Payload Types
| Type | Julia | JavaScript | Python/MicroPython |
|------|-------|------------|-------------------|
| `text` | `String` | `string` | `str` |
| `dictionary` | `Dict`, `NamedTuple` | `Object`, `Array` | `dict`, `list` |
| `arrowtable` | `DataFrame`, `Arrow.Table` | `Array<Object>` (input) → `Buffer` (Arrow IPC) | `pandas.DataFrame`, `bytes` (Arrow IPC) |
| `jsontable` | `Vector{NamedTuple}`, `Vector{Dict}` | `Array<Object>` | `list[dict]`, `list` |
| `table` | ❌ | ❌ | `pandas.DataFrame`, `bytes` (Arrow IPC) |
| `image` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` |
| `audio` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` |
| `video` | `Vector{UInt8}` | `Uint8Array`, `Buffer` | `bytes`, `bytearray` |
| `binary` | `Vector{UInt8}`, `IOBuffer` | `Uint8Array`, `Buffer` | `bytes`, `bytearray`, `io.BytesIO` |
**Note on MicroPython:** MicroPython does not support table types (`arrowtable`, `jsontable`, or `table`) due to memory constraints. Use `dictionary` or `binary` instead.
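
The Python column of the table can be turned into a pre-send check. This is a sketch under assumptions: `ACCEPTED_TYPES` and `check_payload` are hypothetical names, and `pandas.DataFrame` is left out so that the example runs with the standard library only.

```python
import io

# Accepted Python types per payload type, taken from the table above.
# pandas.DataFrame is omitted for arrowtable/table to avoid a third-party import.
ACCEPTED_TYPES = {
    "text": (str,),
    "dictionary": (dict, list),
    "arrowtable": (bytes,),
    "jsontable": (list,),
    "table": (bytes,),
    "image": (bytes, bytearray),
    "audio": (bytes, bytearray),
    "video": (bytes, bytearray),
    "binary": (bytes, bytearray, io.BytesIO),
}


def check_payload(dataname, data, payload_type):
    """Raise if a (dataname, data, type) triple does not match the table."""
    if payload_type not in ACCEPTED_TYPES:
        raise ValueError(f"{dataname}: unknown payload type {payload_type!r}")
    if not isinstance(data, ACCEPTED_TYPES[payload_type]):
        raise TypeError(
            f"{dataname}: {type(data).__name__} is not valid for {payload_type!r}"
        )


check_payload("message", "Hello!", "text")
check_payload("blob", io.BytesIO(b"\x00"), "binary")
```
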
### Cross-Platform API Examples
**Julia:**
```julia
using NATSBridge
# Send
env, env_json_str = smartsend(
    "/chat",
    [("message", "Hello!", "text"), ("image", image_bytes, "image")],
    broker_url="nats://localhost:4222"
)
# Receive - returns JSON.Object{String, Any}
env = smartreceive(msg; fileserver_download_handler=_fetch_with_backoff)
# env is a JSON.Object{String, Any} with "payloads" field containing Vector{Tuple{String, Any, String}}
# Access payloads: for (dataname, data, type) in env["payloads"]
```
**JavaScript:**
```javascript
const NATSBridge = require('natsbridge');
// Send
const [env, env_json_str] = await NATSBridge.smartsend(
    "/chat",
    [
        ["message", "Hello!", "text"],
        ["image", imageBuffer, "image"]
    ],
    { broker_url: "nats://localhost:4222" }
);
// Receive - returns Promise<object>
const env = await NATSBridge.smartreceive(msg, {
    fileserver_download_handler: fetchWithBackoff
});
// env is an object with "payloads" field containing Array of arrays
// Access payloads: for (const [dataname, data, type] of env.payloads)
```
**Python:**
```python
from natsbridge import NATSBridge
# Send - returns Tuple[Dict, str]
env, env_json_str = NATSBridge.smartsend(
    "/chat",
    [("message", "Hello!", "text"), ("image", image_bytes, "image")],
    broker_url="nats://localhost:4222"
)
# Receive - returns Dict
env = NATSBridge.smartreceive(
    msg,
    fileserver_download_handler=fetch_with_backoff
)
# env is a Dict with "payloads" key containing List[Tuple[str, Any, str]]
# Access payloads: for dataname, data, type_ in env["payloads"]
```
**MicroPython:**
```python
from natsbridge import NATSBridge
# Send (limited to direct transport due to memory constraints)
env, env_json_str = NATSBridge.smartsend(
    "/chat",
    [("message", "Hello!", "text")],
    broker_url="nats://localhost:4222"
)
```
---
## Architecture Diagram (Cross-Platform)
```mermaid
flowchart TD
    subgraph Client
        App["Julia/JS/Python/MicroPython Application"]
    end
    subgraph Server
        Service["Julia/JS/Python/MicroPython Service"]
        NATS[NATS Server]
        FileServer[HTTP File Server]
    end
    App -->|NATS| NATS
    NATS -->|NATS| Service
    Service -->|NATS| NATS
    Service -->|HTTP POST| FileServer
    style App fill:#e8f5e9
    style Service fill:#e8f5e9
    style NATS fill:#fff3e0
    style FileServer fill:#f3e5f5
```
---
## System Components
### 1. msg_envelope_v1 - Message Envelope
**JSON Schema (Identical Across All Platforms):**
```json
{
    "correlation_id": "uuid-v4-string",
    "msg_id": "uuid-v4-string",
    "timestamp": "2024-01-15T10:30:00Z",
    "send_to": "topic/subject",
    "msg_purpose": "ACK | NACK | updateStatus | shutdown | chat",
    "sender_name": "agent-wine-web-frontend",
    "sender_id": "uuid4",
    "receiver_name": "agent-backend",
    "receiver_id": "uuid4",
    "reply_to": "topic",
    "reply_to_msg_id": "uuid4",
    "broker_url": "nats://localhost:4222",
    "metadata": {
        "content_type": "application/octet-stream",
        "content_length": 123456
    },
    "payloads": [
        {
            "id": "uuid4",
            "dataname": "login_image",
            "payload_type": "image",
            "transport": "direct",
            "encoding": "base64",
            "size": 15433,
            "data": "base64-encoded-string",
            "metadata": {
                "checksum": "sha256_hash"
            }
        },
        {
            "id": "uuid4",
            "dataname": "large_arrow_table",
            "payload_type": "arrowtable",
            "transport": "link",
            "encoding": "arrow-ipc",
            "size": 524288,
            "data": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/data.arrow",
            "metadata": {}
        }
    ]
}
```
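
As a rough illustration of how the envelope fields might be populated, the sketch below builds a dictionary with the same keys. `new_envelope` is a hypothetical helper, and filling the unset receiver and reply fields with empty strings is an assumption of this sketch; the schema above does not say how absent values are represented.

```python
import json
import uuid
from datetime import datetime, timezone


def new_envelope(send_to, msg_purpose, sender_name, payloads,
                 broker_url="nats://localhost:4222"):
    """Build a dict with the msg_envelope_v1 keys (illustrative only)."""
    return {
        "correlation_id": str(uuid.uuid4()),
        "msg_id": str(uuid.uuid4()),
        # UTC timestamp in the same shape as the schema example.
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "send_to": send_to,
        "msg_purpose": msg_purpose,
        "sender_name": sender_name,
        "sender_id": str(uuid.uuid4()),
        "receiver_name": "",    # assumption: empty string when unset
        "receiver_id": "",
        "reply_to": "",
        "reply_to_msg_id": "",
        "broker_url": broker_url,
        "metadata": {},
        "payloads": payloads,
    }


envelope = new_envelope("/chat", "chat", "agent-wine-web-frontend", [])
envelope_json = json.dumps(envelope)
```
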
### 2. msg_payload_v1 - Payload Structure
**JSON Schema (Identical Across All Platforms):**
```json
{
    "id": "uuid4",
    "dataname": "login_image",
    "payload_type": "image | dictionary | arrowtable | jsontable | table | text | audio | video | binary",
    "transport": "direct | link",
    "encoding": "none | json | base64 | arrow-ipc",
    "size": 15433,
    "data": "base64-encoded-string | http-url | json-string",
    "metadata": {
        "checksum": "sha256_hash"
    }
}
```
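
A direct-transport payload entry could be assembled along these lines. This is a sketch, not the library's serializer: `make_direct_payload` is a hypothetical name, and it assumes that `size` counts the raw bytes before base64 encoding and that `checksum` is the hex SHA-256 digest of those raw bytes.

```python
import base64
import hashlib
import uuid


def make_direct_payload(dataname: str, raw: bytes, payload_type: str) -> dict:
    """Wrap raw bytes as a msg_payload_v1-shaped dict using direct transport."""
    return {
        "id": str(uuid.uuid4()),
        "dataname": dataname,
        "payload_type": payload_type,
        "transport": "direct",
        "encoding": "base64",
        "size": len(raw),  # assumption: size of the raw bytes, not the base64 text
        "data": base64.b64encode(raw).decode("ascii"),
        "metadata": {"checksum": hashlib.sha256(raw).hexdigest()},
    }


payload = make_direct_payload("login_image", b"abc", "image")
```

A receiver can recompute the digest over the decoded bytes and compare it with `metadata.checksum` to detect corruption.
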
### 3. Transport Strategy Decision Logic (Cross-Platform)
```
┌─────────────────────────────────────────────────────────────┐
│ smartsend Function (All Platforms)                          │
│ Accepts: [(dataname1, data1, type1), ...]                   │
│ (Type is per payload, not standalone)                       │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│ For each payload:                                           │
│   1. Extract type from tuple/array                          │
│   2. Serialize based on type                                │
│   3. Check payload size                                     │
└─────────────────────────────────────────────────────────────┘
                               │
                   ┌───────────┴────────────┐
                   ▼                        ▼
            ┌──────────────┐         ┌──────────────┐
            │ Direct Path  │         │ Link Path    │
            │ (< 1MB)      │         │ (>= 1MB)     │
            │              │         │              │
            │ • Serialize  │         │ • Serialize  │
            │   to buffer  │         │   to buffer  │
            │ • Base64/JSON│         │ • Upload to  │
            │   encode     │         │   HTTP Server│
            │ • Publish to │         │ • Publish to │
            │   NATS       │         │   NATS with  │
            │   (in msg)   │         │   URL        │
            └──────────────┘         └──────────────┘
```
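
The three per-payload steps in the diagram can be sketched as follows. `serialize` and `plan_transport` are hypothetical names; only a few payload types are handled, and the 1,000,000-byte default mirrors the documented `SIZE_THRESHOLD`.

```python
import json

DEFAULT_SIZE_THRESHOLD = 1_000_000  # bytes, the documented 1MB default


def serialize(data, payload_type):
    """Step 2: turn the data into bytes according to its declared type."""
    if payload_type == "text":
        return data.encode("utf-8")
    if payload_type in ("dictionary", "jsontable"):
        return json.dumps(data).encode("utf-8")
    if payload_type in ("image", "audio", "video", "binary"):
        return bytes(data)
    raise ValueError(f"type not covered by this sketch: {payload_type}")


def plan_transport(payloads, threshold=DEFAULT_SIZE_THRESHOLD):
    """Return (dataname, transport, size) for each (dataname, data, type) triple."""
    plan = []
    for dataname, data, payload_type in payloads:   # step 1: type comes from the triple
        raw = serialize(data, payload_type)         # step 2: serialize
        transport = "direct" if len(raw) < threshold else "link"  # step 3: size check
        plan.append((dataname, transport, len(raw)))
    return plan


plan = plan_transport([
    ("message", "Hello!", "text"),
    ("blob", bytes(1_000_000), "binary"),
])
```
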
---
## Platform Comparison Matrix
| Feature | Julia | JavaScript | Python | MicroPython |
|---------|-------|------------|--------|-------------|
| **Multiple Dispatch** | ✅ Native | ❌ (Prototypes) | ❌ (Overload via `@overload`) | ❌ |
| **Async/Await** | ❌ (Tasks) | ✅ Native | ✅ Native | ⚠️ (uasyncio) |
| **Type Safety** | ✅ Strong | ⚠️ (TypeScript) | ✅ (Type hints) | ❌ |
| **Memory Management** | ✅ GC | ✅ GC | ✅ GC | ⚠️ (Manual) |
| **Arrow IPC** | ✅ Native | ✅ (arrow package) | ✅ (pyarrow) | ❌ |
| **JSON Serialization** | ✅ (JSON.jl) | ✅ (native) | ✅ (json) | ✅ (json) |
| **arrowtable Support** | ✅ | ✅ | ✅ | ❌ |
| **jsontable Support** | ✅ | ✅ | ✅ | ❌ |
| **Direct Transport** | ✅ | ✅ | ✅ | ✅ |
| **Link Transport** | ✅ | ✅ | ✅ | ⚠️ (Limited) |
| **Handler Functions** | ✅ | ✅ | ✅ | ✅ |
| **Cross-Platform API** | ✅ | ✅ | ✅ | ✅ |
---
## Platform-Specific Architecture Patterns
### Julia: Multiple Dispatch Pattern
Julia leverages multiple dispatch for type-specific implementations:
- **Function overloading** based on argument types
- **Struct-based data models** with explicit types
- **Native Arrow IPC** support via Arrow.jl
### JavaScript: Prototype + Async Pattern
JavaScript uses async/await for non-blocking I/O:
- **Class-based NATS client** for connection management
- **Module-level utility functions** for serialization
- **Native ArrayBuffer** for binary data handling
### Python: Class-Based Pattern
Python uses classes for stateful operations:
- **Class-based NATSBridge** with type hints
- **Dataclasses** for structured data (MsgPayloadV1, MsgEnvelopeV1)
- **Async/await** for I/O operations
### MicroPython: Synchronous Pattern
MicroPython has significant constraints:
- **Synchronous API** (no async/await)
- **Memory-constrained** (256KB - 1MB)
- **Limited payload support** (no tables, max 50KB)
---
## Cross-Platform Compatibility Notes
### 1. Payload Type Consistency
All platforms that support tabular data use the same payload type values (Python additionally accepts `table`):
| Platform | Table Types |
|----------|-------------|
| Julia | `"arrowtable"`, `"jsontable"` |
| JavaScript | `"arrowtable"`, `"jsontable"` |
| Python | `"arrowtable"`, `"jsontable"` |
| MicroPython | Not supported |
### 2. Direct Transport Encoding Field
The encoding field in direct transport payloads differs between platforms:
| Platform | Encoding for Direct Transport |
|----------|-------------------------------|
| Julia | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` |
| JavaScript | Preserves original type: `"base64"`, `"json"`, or `"arrow-ipc"` |
| Python | Always `"base64"` for all direct transport payloads |
| MicroPython | Always `"base64"` for all direct transport payloads |
**Impact:** The encoding field may not accurately reflect the original serialization format when using Python or MicroPython.
### 3. MicroPython Limitations
MicroPython has significant constraints that affect feature support:
| Feature | Desktop Platforms | MicroPython |
|---------|-------------------|-------------|
| `arrowtable` | ✅ | ❌ (not supported - memory constraints) |
| `jsontable` | ✅ | ❌ (not supported - memory constraints) |
| `table` | ✅ | ❌ (not supported - memory constraints) |
| Async/await | ✅ | ❌ (synchronous only) |
| File upload/download | ✅ | ⚠️ (placeholder implementations) |
| MAX_PAYLOAD_SIZE | 1MB+ | 50KB (hard limit) |
| DEFAULT_SIZE_THRESHOLD | 1MB | 100KB |
**Impact:** MicroPython should only be used for small payloads with direct transport. File server operations are not fully implemented.
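
A sender targeting MicroPython could guard against these limits before publishing. The function below is a hypothetical sketch written to avoid features MicroPython may lack (no f-strings, no type hints); treating exactly 50,000 bytes as still allowed is an assumption.

```python
MAX_PAYLOAD_SIZE = 50000  # bytes, the documented hard limit for MicroPython
UNSUPPORTED_TYPES = ("arrowtable", "jsontable", "table")


def check_micropython_payload(raw, payload_type):
    """Raise ValueError if a serialized payload cannot be sent from MicroPython."""
    if payload_type in UNSUPPORTED_TYPES:
        raise ValueError("payload type not supported on MicroPython: " + payload_type)
    if len(raw) > MAX_PAYLOAD_SIZE:
        raise ValueError(
            "payload too large: %d > %d bytes" % (len(raw), MAX_PAYLOAD_SIZE)
        )
    return True


ok = check_micropython_payload(b"Hello!", "text")
```
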
---
## Configuration
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `NATS_URL` | `nats://localhost:4222` | NATS server URL |
| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL |
| `SIZE_THRESHOLD` | `1000000` | Size threshold in bytes (1MB) |
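
One way to read these variables with their documented defaults is shown below. `BridgeConfig` and `load_config` are hypothetical names used only for this sketch; the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeConfig:
    nats_url: str
    fileserver_url: str
    size_threshold: int


def load_config(environ=None) -> BridgeConfig:
    """Read the bridge settings from the environment, falling back to defaults."""
    env = os.environ if environ is None else environ
    return BridgeConfig(
        nats_url=env.get("NATS_URL", "nats://localhost:4222"),
        fileserver_url=env.get("FILESERVER_URL", "http://localhost:8080"),
        size_threshold=int(env.get("SIZE_THRESHOLD", "1000000")),
    )


defaults = load_config({})
custom = load_config({"SIZE_THRESHOLD": "100000"})
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test.
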
### MicroPython-Specific Configuration
```python
# micropython.conf
NATS_URL = "nats://broker.local:4222"
FILESERVER_URL = "http://fileserver.local:8080"
SIZE_THRESHOLD = 100000 # Lower threshold for memory-constrained devices
MAX_PAYLOAD_SIZE = 50000 # Hard limit for MicroPython
```
---
## Performance Considerations
### Zero-Copy Reading
| Platform | Strategy |
|----------|----------|
| **Julia** | `Arrow.read()` with memory-mapped files |
| **JavaScript** | `ArrayBuffer` with `DataView` |
| **Python** | `pyarrow` memory mapping |
| **MicroPython** | Not available (streaming only) |
### Exponential Backoff
All platforms implement exponential backoff for HTTP downloads:
```
delay = base_delay
for attempt in 1:max_retries:
    try:
        response = fetch(url)
        if success: return response
    except:
        if attempt < max_retries:
            sleep(delay)
            delay = min(delay * 2, max_delay)
```
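
A runnable Python rendering of this pseudocode might look like the following. It is a sketch, not the handler shipped with the library: `download_with_backoff` is a hypothetical name, the default delays are arbitrary, and raising `RuntimeError` once retries are exhausted is an assumption, since the pseudocode does not say what happens after the last attempt.

```python
import time


def download_with_backoff(fetch, url, max_retries=5, base_delay=0.1,
                          max_delay=5.0, sleep=time.sleep):
    """Call fetch(url), doubling the wait after each failure up to max_delay."""
    delay = base_delay
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # broad on purpose: the fetcher is pluggable
            last_error = exc
            if attempt < max_retries:
                sleep(delay)
                delay = min(delay * 2, max_delay)
    raise RuntimeError(
        f"download failed after {max_retries} attempts: {url}"
    ) from last_error


# Demo with a fetcher that fails twice before succeeding; sleeps are recorded
# instead of performed so the example finishes instantly.
attempts = {"count": 0}


def flaky_fetch(url):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise IOError("temporary failure")
    return b"ok"


recorded_sleeps = []
result = download_with_backoff(flaky_fetch, "http://localhost:8080/file",
                               sleep=recorded_sleeps.append)
```

Injecting `sleep` makes the retry schedule observable in tests without waiting.
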
### Correlation ID Logging
All platforms use correlation IDs for distributed tracing:
```
[timestamp] [Correlation: abc123] Message published to subject
```
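
A log line in this shape could be produced with a small formatter. `format_log` is a hypothetical helper, and the ISO-8601 UTC timestamp is an assumption; the original only shows a `timestamp` placeholder.

```python
from datetime import datetime, timezone


def format_log(correlation_id, message, now=None):
    """Return '[timestamp] [Correlation: id] message' for one log event."""
    moment = now if now is not None else datetime.now(timezone.utc)
    stamp = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"[{stamp}] [Correlation: {correlation_id}] {message}"


fixed_time = datetime(2024, 1, 15, 10, 30, 0, tzinfo=timezone.utc)
line = format_log("abc123", "Message published to subject", now=fixed_time)
```

Because the same `correlation_id` travels in the envelope, searching logs for it ties the publish, upload, and receive events of one exchange together.
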
### Serialization Performance Comparison
| Format | Use Case | Pros | Cons |
|--------|----------|------|------|
| `arrowtable` | Large tabular data | Fast, zero-copy, schema-preserving | Binary format, requires Arrow library |
| `jsontable` | Small/medium tabular data | Human-readable, universal support | Slower, larger size, no schema |
| `table` (Python) | Large tabular data | Fast, zero-copy, schema-preserving | Python-specific, requires pyarrow |
---
## Summary
This cross-platform NATS bridge provides:
1. **High-Level API Parity**: Identical `smartsend()` and `smartreceive()` signatures across Julia, JavaScript, and Python/MicroPython
2. **Idiomatic Implementations**:
- Julia: Multiple dispatch and struct-based design
- JavaScript: Async/await and prototype-based utilities
- Python: Class-based design with type hints
- MicroPython: Synchronous API with memory constraints
3. **Message Format Consistency**: Identical `msg_envelope_v1` and `msg_payload_v1` JSON schemas
4. **Handler Abstraction**: File server operations abstracted through configurable handlers
5. **Platform-Specific Optimizations**:
- **Arrow IPC** (`arrowtable`): Efficient binary format for large tabular data
- **JSON** (`jsontable`): Universal human-readable format for smaller tables
- **Python table**: Unified table type for Python-specific implementations
- Streaming support in MicroPython
The Julia implementation serves as the **ground truth** for API design and behavior; the JavaScript and Python implementations maintain interface parity while using their own language idioms.
### Datatype Summary
| Datatype | Serialization | Use Case | Encoding | Supported Platforms |
|----------|---------------|----------|----------|---------------------|
| `text` | UTF-8 bytes | Text messages, chat content | `utf-8` → `base64` | All |
| `dictionary` | JSON | Structured key-value data, config | `json` → `base64` | All |
| `arrowtable` | Apache Arrow IPC | Large tabular data, schema-preserving | `arrow-ipc` → `base64` | Julia, JavaScript, Python |
| `jsontable` | JSON | Small/medium tabular data, human-readable | `json` → `base64` | Julia, JavaScript, Python |
| `table` | Apache Arrow IPC | Python's unified table type | `arrow-ipc` → `base64` | Python |
| `image` | Binary | Image files (JPEG, PNG, etc.) | `binary` → `base64` | All |
| `audio` | Binary | Audio files (WAV, MP3, etc.) | `binary` → `base64` | All |
| `video` | Binary | Video files (MP4, AVI, etc.) | `binary` → `base64` | All |
| `binary` | Binary | Generic binary data, files | `binary` → `base64` | All |