This commit is contained in:
2026-02-25 20:27:51 +07:00
parent bee9f783d9
commit f8d93991f5
24 changed files with 579 additions and 4981 deletions

View File

@@ -2,12 +2,10 @@
## Overview
This document describes the architecture for a high-performance, bi-directional data bridge between **Julia**, **JavaScript**, and **Python/Micropython** applications using NATS (Core & JetStream), implementing the Claim-Check pattern for large payloads.
This document describes the architecture for a high-performance, bi-directional data bridge for **Julia** applications using NATS (Core & JetStream), implementing the Claim-Check pattern for large payloads.
The system enables seamless communication across all three platforms:
- **Julia ↔ JavaScript** bi-directional messaging
- **JavaScript ↔ Python/Micropython** bi-directional messaging
- **Julia ↔ Python/Micropython** bi-directional messaging (via JSON serialization)
The system enables seamless communication for Julia applications:
- **Julia** messaging with NATS
### File Server Handler Architecture
@@ -113,8 +111,7 @@ env = smartreceive(msg; fileserver_download_handler=_fetch_with_backoff, max_ret
```mermaid
flowchart TD
subgraph Client
JS[JavaScript Client]
JSApp[Application Logic]
App[Julia Application]
end
subgraph Server
@@ -123,14 +120,12 @@ flowchart TD
FileServer[HTTP File Server]
end
JS -->|Control/Small Data| JSApp
JSApp -->|NATS| NATS
App -->|NATS| NATS
NATS -->|NATS| Julia
Julia -->|NATS| NATS
Julia -->|HTTP POST| FileServer
JS -->|HTTP GET| FileServer
style JS fill:#e1f5fe
style App fill:#e8f5e9
style Julia fill:#e8f5e9
style NATS fill:#fff3e0
style FileServer fill:#f3e5f5
@@ -140,7 +135,7 @@ flowchart TD
### 1. msg_envelope_v1 - Message Envelope
The `msg_envelope_v1` structure provides a comprehensive message format for bidirectional communication between Julia, JavaScript, and Python/Micropython applications.
The `msg_envelope_v1` structure provides a comprehensive message format for bidirectional communication in Julia applications.
**Julia Structure:**
```julia
@@ -216,7 +211,7 @@ end
### 2. msg_payload_v1 - Payload Structure
The `msg_payload_v1` structure provides flexible payload handling for various data types across all supported platforms.
The `msg_payload_v1` structure provides flexible payload handling for various data types.
**Julia Structure:**
```julia
@@ -271,65 +266,7 @@ end
└─────────────────┘ └─────────────────┘
```
### 4. Cross-Platform Architecture
```mermaid
flowchart TD
subgraph PythonMicropython
Py[Python/Micropython]
PySmartSend[smartsend]
PySmartReceive[smartreceive]
end
subgraph JavaScript
JS[JavaScript]
JSSmartSend[smartsend]
JSSmartReceive[smartreceive]
end
subgraph Julia
Julia[Julia]
JuliaSmartSend[smartsend]
JuliaSmartReceive[smartreceive]
end
subgraph NATS
NATSServer[NATS Server]
end
PySmartSend --> NATSServer
JSSmartSend --> NATSServer
JuliaSmartSend --> NATSServer
NATSServer --> PySmartReceive
NATSServer --> JSSmartReceive
NATSServer --> JuliaSmartReceive
style PythonMicropython fill:#e1f5fe
style JavaScript fill:#f3e5f5
style Julia fill:#e8f5e9
```
### 5. Python/Micropython Module Architecture
```mermaid
graph TD
subgraph PyModule
PySmartSend[smartsend]
SizeCheck[Size Check]
DirectPath[Direct Path]
LinkPath[Link Path]
HTTPClient[HTTP Client]
end
PySmartSend --> SizeCheck
SizeCheck -->|< 1MB| DirectPath
SizeCheck -->|>= 1MB| LinkPath
LinkPath --> HTTPClient
style PyModule fill:#b3e5fc
```
### 6. Julia Module Architecture
### 4. Julia Module Architecture
```mermaid
graph TD
@@ -349,51 +286,8 @@ graph TD
style JuliaModule fill:#c5e1a5
```
### 7. JavaScript Module Architecture
```mermaid
graph TD
subgraph JSModule
JSSmartSend[smartsend]
JSSmartReceive[smartreceive]
JetStreamConsumer[JetStream Pull Consumer]
ApacheArrow[Apache Arrow]
end
JSSmartSend --> NATS
JSSmartReceive --> JetStreamConsumer
JetStreamConsumer --> ApacheArrow
style JSModule fill:#f3e5f5
```
## Implementation Details
### API Consistency Across Languages
**High-Level API (Consistent Across All Languages):**
- `smartsend(subject, data, ...)` - Main publishing function
- `smartreceive(msg, ...)` - Main receiving function
- Message envelope structure (`msg_envelope_v1` / `MessageEnvelope`)
- Payload structure (`msg_payload_v1` / `MessagePayload`)
- Transport strategy (direct vs link based on size threshold)
- Supported payload types: text, dictionary, table, image, audio, video, binary
**Low-Level Native Functions (Language-Specific Conventions):**
- Julia: `NATS.connect()`, `publish_message()`, function overloading
- JavaScript: `nats.js` client, native async/await patterns
- Python: `nats-python` client, native async/await patterns
**Connection Reuse Pattern:**
- **Julia:** Uses `NATS_connection` keyword parameter with function overloading
- **JavaScript/Python:** Achieved by creating NATS client outside the function and reusing it in custom handlers
**Note on `is_publish`:**
- `is_publish` is simply a switch to control automatic publishing
- When `true` (default): Message is published to NATS automatically
- When `false`: Returns `(env, env_json_str)` without publishing, allowing manual publishing
- Connection reuse is achieved separately by creating NATS client outside the function
### Julia Implementation
#### Dependencies
@@ -546,200 +440,6 @@ env, env_json_str = smartsend(
# Uses: publish_message(broker_url, subject, env_json_str, cid)
```
**API Consistency Note:**
- **High-level API (smartsend, smartreceive):** Uses consistent naming across all three languages (Julia, JavaScript, Python/Micropython)
- **Low-level native functions (NATS.connect(), publish_message()):** Follow the conventions of the specific language ecosystem and do not require cross-language consistency
### JavaScript Implementation
#### Dependencies
- `nats.js` - Core NATS functionality
- `apache-arrow` - Arrow IPC serialization
- `uuid` - Correlation ID and message ID generation
- `base64-arraybuffer` - Base64 encoding/decoding
- `node-fetch` or `fetch` - HTTP client for file server
#### smartsend Function
```javascript
async function smartsend(
subject,
data, // List of (dataname, data, type) tuples: [(dataname1, data1, type1), ...]
options = {}
)
```
**Options:**
- `broker_url` (String) - NATS server URL (default: `"nats://localhost:4222"`)
- `fileserver_url` (String) - Base URL of the file server (default: `"http://localhost:8080"`)
- `size_threshold` (Number) - Threshold in bytes for transport selection (default: `1048576` = 1MB)
- `correlation_id` (String) - Optional correlation ID for tracing
- `msg_purpose` (String) - Purpose of the message (default: `"chat"`)
- `sender_name` (String) - Sender name (default: `"NATSBridge"`)
- `receiver_name` (String) - Message receiver name (default: `""`)
- `receiver_id` (String) - Message receiver ID (default: `""`)
- `reply_to` (String) - Topic to reply to (default: `""`)
- `reply_to_msg_id` (String) - Message ID this message is replying to (default: `""`)
- `is_publish` (Boolean) - Whether to automatically publish the message to NATS (default: `true`)
- `fileserver_upload_handler` (Function) - Custom upload handler function
**Note:** JavaScript uses `is_publish` option (instead of `NATS_connection` keyword) to control automatic publishing behavior. Connection reuse can be achieved by creating a NATS client outside the function and reusing it in a custom `fileserver_upload_handler` or custom publish implementation.
**Return Value:**
- Returns a Promise that resolves to an object containing:
- `env` - The envelope object containing all metadata and payloads
- `env_json_str` - JSON string representation of the envelope for publishing
**Input Format:**
- `data` - **Must be a list of (dataname, data, type) tuples**: `[(dataname1, data1, "type1"), (dataname2, data2, "type2"), ...]`
- Even for single payloads: `[(dataname1, data1, "type1")]`
- Each payload can have a different type, enabling mixed-content messages
- Supported types: `"text"`, `"dictionary"`, `"table"`, `"image"`, `"audio"`, `"video"`, `"binary"`
**Flow:**
1. Generate correlation ID and message ID if not provided
2. Iterate through the list of `(dataname, data, type)` tuples
3. For each payload:
- Serialize based on payload type
- Check payload size
- If < threshold: Base64 encode and include in envelope
- If >= threshold: Upload to HTTP server, store URL in envelope
4. Publish the JSON envelope to NATS
5. Return envelope object and JSON string
#### smartreceive Handler
```javascript
async function smartreceive(msg, options = {})
```
**Options:**
- `fileserver_download_handler` (Function) - Custom download handler function
- `max_retries` (Number) - Maximum retry attempts for fetching URL (default: `5`)
- `base_delay` (Number) - Initial delay for exponential backoff in ms (default: `100`)
- `max_delay` (Number) - Maximum delay for exponential backoff in ms (default: `5000`)
- `correlation_id` (String) - Optional correlation ID for tracing
**Output Format:**
- Returns a Promise that resolves to an object containing all envelope fields:
- `correlation_id`, `msg_id`, `timestamp`, `send_to`, `msg_purpose`, `sender_name`, `sender_id`, `receiver_name`, `receiver_id`, `reply_to`, `reply_to_msg_id`, `broker_url`
- `metadata` - Message-level metadata dictionary
- `payloads` - List of tuples, each containing `(dataname, data, type)` with deserialized payload data
**Process Flow:**
1. Parse the JSON envelope to extract all fields
2. Iterate through each payload in `payloads` array
3. For each payload:
- Determine transport type (`direct` or `link`)
- If `direct`: Base64 decode the data from the message
- If `link`: Fetch data from URL using exponential backoff (via `fileserver_download_handler`)
- Deserialize based on payload type (`dictionary`, `table`, `binary`, etc.)
4. Return envelope object with `payloads` field containing list of `(dataname, data, type)` tuples
**Note:** The `fileserver_download_handler` receives `(url, max_retries, base_delay, max_delay, correlation_id)` and returns `ArrayBuffer` or `Uint8Array`.
### Python/Micropython Implementation
#### Dependencies
- `nats-python` - Core NATS functionality
- `pyarrow` - Arrow IPC serialization
- `uuid` - Correlation ID and message ID generation
- `base64` - Base64 encoding/decoding
- `requests` or `aiohttp` - HTTP client for file server
#### smartsend Function
```python
def smartsend(
subject: str,
data: List[Tuple[str, Any, str]], # List of (dataname, data, type) tuples
broker_url: str = DEFAULT_BROKER_URL,
fileserver_url: str = DEFAULT_FILESERVER_URL,
fileserver_upload_handler: Callable = plik_oneshot_upload,
size_threshold: int = DEFAULT_SIZE_THRESHOLD,
correlation_id: Union[str, None] = None,
msg_purpose: str = "chat",
sender_name: str = "NATSBridge",
receiver_name: str = "",
receiver_id: str = "",
reply_to: str = "",
reply_to_msg_id: str = "",
is_publish: bool = True
) -> Tuple[MessageEnvelope, str]
```
**Options:**
- `broker_url` (str) - NATS server URL (default: `"nats://localhost:4222"`)
- `fileserver_url` (str) - Base URL of the file server (default: `"http://localhost:8080"`)
- `size_threshold` (int) - Threshold in bytes for transport selection (default: `1048576` = 1MB)
- `correlation_id` (str) - Optional correlation ID for tracing (auto-generated if None)
- `msg_purpose` (str) - Purpose of the message (default: `"chat"`)
- `sender_name` (str) - Sender name (default: `"NATSBridge"`)
- `receiver_name` (str) - Message receiver name (default: `""`)
- `receiver_id` (str) - Message receiver ID (default: `""`)
- `reply_to` (str) - Topic to reply to (default: `""`)
- `reply_to_msg_id` (str) - Message ID this message is replying to (default: `""`)
- `is_publish` (bool) - Whether to automatically publish the message to NATS (default: `True`)
- `fileserver_upload_handler` (Callable) - Custom upload handler function
**Note:** Python uses `is_publish` parameter (instead of `NATS_connection` keyword) to control automatic publishing behavior. Connection reuse can be achieved by creating a NATS client outside the function and reusing it in a custom `fileserver_upload_handler` or custom publish implementation.
**Return Value:**
- Returns a tuple `(env, env_json_str)` where:
- `env` - The envelope dictionary containing all metadata and payloads
- `env_json_str` - JSON string representation of the envelope for publishing
**Input Format:**
- `data` - **Must be a list of (dataname, data, type) tuples**: `[(dataname1, data1, "type1"), (dataname2, data2, "type2"), ...]`
- Even for single payloads: `[(dataname1, data1, "type1")]`
- Each payload can have a different type, enabling mixed-content messages
- Supported types: `"text"`, `"dictionary"`, `"table"`, `"image"`, `"audio"`, `"video"`, `"binary"`
**Flow:**
1. Generate correlation ID and message ID if not provided
2. Iterate through the list of `(dataname, data, type)` tuples
3. For each payload:
- Serialize based on payload type
- Check payload size
- If < threshold: Base64 encode and include in envelope
- If >= threshold: Upload to HTTP server, store URL in envelope
4. Publish the JSON envelope to NATS
5. Return envelope dictionary and JSON string
#### smartreceive Handler
```python
async def smartreceive(
msg: NATS.Message,
options: Dict = {}
)
```
**Options:**
- `fileserver_download_handler` (Callable) - Custom download handler function
- `max_retries` (int) - Maximum retry attempts for fetching URL (default: `5`)
- `base_delay` (int) - Initial delay for exponential backoff in ms (default: `100`)
- `max_delay` (int) - Maximum delay for exponential backoff in ms (default: `5000`)
- `correlation_id` (str) - Optional correlation ID for tracing
**Output Format:**
- Returns a JSON object (dictionary) containing all envelope fields:
- `correlation_id`, `msg_id`, `timestamp`, `send_to`, `msg_purpose`, `sender_name`, `sender_id`, `receiver_name`, `receiver_id`, `reply_to`, `reply_to_msg_id`, `broker_url`
- `metadata` - Message-level metadata dictionary
- `payloads` - List of tuples, each containing `(dataname, data, payload_type)` with deserialized payload data
**Process Flow:**
1. Parse the JSON envelope to extract all fields
2. Iterate through each payload in `payloads` list
3. For each payload:
- Determine transport type (`direct` or `link`)
- If `direct`: Base64 decode the data from the message
- If `link`: Fetch data from URL using exponential backoff (via `fileserver_download_handler`)
- Deserialize based on payload type (`dictionary`, `table`, `binary`, etc.)
4. Return envelope dictionary with `payloads` field containing list of `(dataname, data, type)` tuples
**Note:** The `fileserver_download_handler` receives `(url: str, max_retries: int, base_delay: int, max_delay: int, correlation_id: str)` and returns `bytes`.
## Scenario Implementations
### Scenario 1: Command & Control (Small Dictionary)
@@ -752,18 +452,6 @@ async def smartreceive(
# Send acknowledgment
```
**JavaScript (Sender/Receiver):**
```javascript
// Create small dictionary config
// Send via smartsend with type="dictionary"
```
**Python/Micropython (Sender/Receiver):**
```python
# Create small dictionary config
# Send via smartsend with type="dictionary"
```
### Scenario 2: Deep Dive Analysis (Large Arrow Table)
**Julia (Sender/Receiver):**
@@ -775,32 +463,8 @@ async def smartreceive(
# Publish NATS with URL
```
**JavaScript (Sender/Receiver):**
```javascript
// Receive NATS message with URL
// Fetch data from HTTP server
// Parse Arrow IPC with zero-copy
// Load into Perspective.js or D3
```
**Python/Micropython (Sender/Receiver):**
```python
# Create large DataFrame
# Convert to Arrow IPC stream
# Check size (> 1MB)
# Upload to HTTP server
# Publish NATS with URL
```
### Scenario 3: Live Audio Processing
**JavaScript (Sender/Receiver):**
```javascript
// Capture audio chunk
// Send as binary with metadata headers
// Use smartsend with type="audio"
```
**Julia (Sender/Receiver):**
```julia
# Receive audio data
@@ -808,13 +472,6 @@ async def smartreceive(
# Send results back (JSON + Arrow table)
```
**Python/Micropython (Sender/Receiver):**
```python
# Capture audio chunk
# Send as binary with metadata headers
# Use smartsend with type="audio"
```
### Scenario 4: Catch-Up (JetStream)
**Julia (Producer/Consumer):**
@@ -823,22 +480,9 @@ async def smartreceive(
# Include metadata for temporal tracking
```
**JavaScript (Producer/Consumer):**
```javascript
// Connect to JetStream
// Request replay from last 10 minutes
// Process historical and real-time messages
```
**Python/Micropython (Producer/Consumer):**
```python
# Publish to JetStream
# Include metadata for temporal tracking
```
### Scenario 5: Selection (Low Bandwidth)
**Focus:** Small Arrow tables, cross-platform communication. The Action: Any platform wants to send a small DataFrame to show on any receiving application for the user to choose.
**Focus:** Small Arrow tables. The Action: Julia wants to send a small DataFrame to show on a receiving application for the user to choose.
**Julia (Sender/Receiver):**
```julia
@@ -849,30 +493,9 @@ async def smartreceive(
# Include metadata for dashboard selection context
```
**JavaScript (Sender/Receiver):**
```javascript
// Receive NATS message with direct transport
// Decode Base64 payload
// Parse Arrow IPC with zero-copy
// Load into selection UI component (e.g., dropdown, table)
// User makes selection
// Send selection back to Julia
```
**Python/Micropython (Sender/Receiver):**
```python
# Create small DataFrame (e.g., 50KB - 500KB)
# Convert to Arrow IPC stream
# Check payload size (< 1MB threshold)
# Publish directly to NATS with Base64-encoded payload
# Include metadata for dashboard selection context
```
**Use Case:** Any server generates a list of available options (e.g., file selections, configuration presets) as a small DataFrame and sends to any receiving application for user selection. The selection is then sent back to the sender for processing.
### Scenario 6: Chat System
**Focus:** Every conversational message is composed of any number and any combination of components, spanning the full spectrum from small to large. This includes text, images, audio, video, tables, and files—specifically accommodating everything from brief snippets to high-resolution images, large audio files, extensive tables, and massive documents. Support for claim-check delivery and full bi-directional messaging across all platforms.
**Focus:** Every conversational message is composed of any number and any combination of components, spanning the full spectrum from small to large. This includes text, images, audio, video, tables, and files—specifically accommodating everything from brief snippets to high-resolution images, large audio files, extensive tables, and massive documents. Support for claim-check delivery and full bi-directional messaging.
**Multi-Payload Support:** The system supports mixed-payload messages where a single message can contain multiple payloads with different transport strategies. The `smartreceive` function iterates through all payloads in the envelope and processes each according to its transport type.
@@ -894,42 +517,7 @@ async def smartreceive(
# Support bidirectional messaging with replyTo fields
```
**JavaScript (Sender/Receiver):**
```javascript
// Build chat message with mixed content:
// - User input text: direct transport
// - Selected image: check size, use appropriate transport
// - Audio recording: link transport for large files
// - File attachment: link transport
//
// Parse received message:
// - Direct payloads: decode Base64
// - Link payloads: fetch from HTTP with exponential backoff
// - Deserialize all payloads appropriately
//
// Render mixed content in chat interface
// Support bidirectional reply with claim-check delivery confirmation
```
**Python/Micropython (Sender/Receiver):**
```python
# Build chat message with mixed payloads:
# - Text: direct transport (Base64)
# - Small images: direct transport (Base64)
# - Large images: link transport (HTTP URL)
# - Audio/video: link transport (HTTP URL)
# - Tables: direct or link depending on size
# - Files: link transport (HTTP URL)
#
# Each payload uses appropriate transport strategy:
# - Size < 1MB → direct (NATS + Base64)
# - Size >= 1MB → link (HTTP upload + NATS URL)
#
# Include claim-check metadata for delivery tracking
# Support bidirectional messaging with replyTo fields
```
**Use Case:** Full-featured chat system supporting rich media. User can send text, small images directly, or upload large files that get uploaded to HTTP server and referenced via URLs. Claim-check pattern ensures reliable delivery tracking for all message components across all platforms.
**Use Case:** Full-featured chat system supporting rich media. User can send text, small images directly, or upload large files that get uploaded to HTTP server and referenced via URLs. Claim-check pattern ensures reliable delivery tracking for all message components.
**Implementation Note:** The `smartreceive` function iterates through all payloads in the envelope and processes each according to its transport type. See the standard API format in Section 1: `msg_envelope_v1` supports `Vector{msg_payload_v1}` for multiple payloads.