# Implementation Guide: Bi-Directional Data Bridge

## Overview

This document describes the implementation of the high-performance, bi-directional data bridge between Julia and JavaScript services using NATS (Core & JetStream), implementing the Claim-Check pattern for large payloads.

### Multi-Payload Support

The implementation uses a **standardized list-of-tuples format** for all payload operations. **Even when sending a single payload, the user must wrap it in a list.**

**API Standard:**
```julia
# Input format for smartsend (always a list of tuples with type info)
[(dataname1, data1, type1), (dataname2, data2, type2), ...]

# Output format for smartreceive
# Returns: Dict with envelope metadata and a payloads field containing a list of tuples
# {
#   "correlationId": "...",
#   "msgId": "...",
#   "timestamp": "...",
#   "sendTo": "...",
#   "msgPurpose": "...",
#   "senderName": "...",
#   "senderId": "...",
#   "receiverName": "...",
#   "receiverId": "...",
#   "replyTo": "...",
#   "replyToMsgId": "...",
#   "brokerURL": "...",
#   "metadata": {...},
#   "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...]
# }
```

Here `type` is one of: `"text"`, `"dictionary"`, `"table"`, `"image"`, `"audio"`, `"video"`, `"binary"`

**Examples:**
```julia
# Single payload - still wrapped in a list (type is required as third element)
smartsend("/test", [(dataname1, data1, "text")], ...)

# Multiple payloads in one message (each payload has its own type)
smartsend("/test", [(dataname1, data1, "dictionary"), (dataname2, data2, "table")], ...)

# Receive returns a dictionary envelope with all metadata and deserialized payloads
envelope = smartreceive(msg, ...)
# envelope["payloads"] = [(dataname1, data1, "text"), (dataname2, data2, "table"), ...]
# envelope["correlationId"], envelope["msgId"], etc.
```

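
The payload rules above (always a list, one of seven allowed types per payload) can be checked before sending. The following is a minimal sketch, not part of the bridge itself: `validatePayloads` and `ALLOWED_TYPES` are hypothetical names, and the object shape (`dataname`, `data`, `type`) is the one used by the JavaScript examples in this guide.

```javascript
// Hypothetical helper: checks the list-of-payloads shape before sending.
const ALLOWED_TYPES = new Set([
  "text", "dictionary", "table", "image", "audio", "video", "binary",
]);

function validatePayloads(payloads) {
  // A single payload must still arrive wrapped in a list.
  if (!Array.isArray(payloads)) {
    throw new TypeError("payloads must be a list, even for a single payload");
  }
  payloads.forEach((p, i) => {
    if (p === null || typeof p !== "object") {
      throw new TypeError(`payload ${i}: expected an object`);
    }
    if (typeof p.dataname !== "string" || p.dataname.length === 0) {
      throw new TypeError(`payload ${i}: dataname must be a non-empty string`);
    }
    if (!ALLOWED_TYPES.has(p.type)) {
      throw new TypeError(`payload ${i}: unsupported type "${p.type}"`);
    }
    if (p.data === undefined) {
      throw new TypeError(`payload ${i}: data is required`);
    }
  });
  return payloads;
}
```

Calling `validatePayloads(config)` just before `smartsend` would surface a missing list wrapper or a misspelled type on the sender side rather than at the receiver.
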
## Architecture

The implementation follows the Claim-Check pattern:

```
┌─────────────────────────────────────────────────────────────────────────┐
│                           SmartSend Function                            │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         Is payload size < 1MB?                          │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                   ┌─────────────────┴─────────────────┐
                   ▼                                   ▼
          ┌─────────────────┐                 ┌─────────────────┐
          │ Direct Path     │                 │ Link Path       │
          │ (< 1MB)         │                 │ (>= 1MB)        │
          │                 │                 │                 │
          │ • Serialize to  │                 │ • Serialize to  │
          │   IOBuffer      │                 │   IOBuffer      │
          │ • Base64 encode │                 │ • Upload to     │
          │ • Publish to    │                 │   HTTP Server   │
          │   NATS          │                 │ • Publish to    │
          │                 │                 │   NATS with URL │
          └─────────────────┘                 └─────────────────┘
```

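
The branch in the diagram comes down to a single size comparison. As a minimal sketch (the function name `chooseTransport` is an assumption; the 1,000,000-byte default is the `SIZE_THRESHOLD` value listed under Configuration):

```javascript
// Default threshold in bytes, taken from the SIZE_THRESHOLD setting.
const SIZE_THRESHOLD = 1_000_000;

// Hypothetical helper: picks the transport for one serialized payload.
// Payloads strictly below the threshold travel inline over NATS ("direct");
// everything else is uploaded and referenced by URL ("link").
function chooseTransport(sizeBytes, threshold = SIZE_THRESHOLD) {
  return sizeBytes < threshold ? "direct" : "link";
}

// Example: measure a JSON-serialized payload and pick its transport.
const sizeBytes = Buffer.byteLength(JSON.stringify({ step_size: 0.01 }), "utf8");
console.log(chooseTransport(sizeBytes)); // small JSON stays on the direct path
```

The comparison runs per payload, so one message can mix direct and link payloads.
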
## Files

### Julia Module: [`src/julia_bridge.jl`](../src/julia_bridge.jl)

The Julia implementation provides:

- **[`MessageEnvelope`](../src/julia_bridge.jl)**: Struct for the unified JSON envelope
- **[`smartsend()`](../src/julia_bridge.jl)**: Handles transport selection based on payload size
- **[`smartreceive()`](../src/julia_bridge.jl)**: Handles both direct and link transport

### JavaScript Module: [`src/NATSBridge.js`](../src/NATSBridge.js)

The JavaScript implementation provides:

- **`MessageEnvelope` class**: For the unified JSON envelope
- **`MessagePayload` class**: For individual payload representation
- **[`smartsend()`](../src/NATSBridge.js)**: Handles transport selection based on payload size
- **[`smartreceive()`](../src/NATSBridge.js)**: Handles both direct and link transport

## Installation

### Julia Dependencies

```julia
using Pkg
Pkg.add("NATS")
Pkg.add("Arrow")
Pkg.add("JSON3")
Pkg.add("HTTP")
# UUIDs and Dates are Julia standard libraries; adding them records the dependency
Pkg.add("UUIDs")
Pkg.add("Dates")
```

### JavaScript Dependencies

```bash
# The NATS client for Node.js is published on npm as "nats"
npm install nats apache-arrow uuid base64-url
```

## Usage Tutorial

### Step 1: Start NATS Server

```bash
docker run -p 4222:4222 nats:latest
```

### Step 2: Start HTTP File Server (optional)

```bash
# Create a directory for file uploads
mkdir -p /tmp/fileserver

# The link path needs an HTTP server that accepts POST uploads.
# Python's built-in server only handles GET and HEAD, so it can serve files
# already placed in the directory but cannot receive uploads itself.
python3 -m http.server 8080 --directory /tmp/fileserver
```

### Step 3: Run Test Scenarios

```bash
# Scenario 1: Command & Control (JavaScript sender)
node test/scenario1_command_control.js

# Scenario 2: Large Arrow Table (JavaScript sender)
node test/scenario2_large_table.js

# Scenario 3: Julia-to-Julia communication
# Run both Julia and JavaScript versions
julia test/scenario3_julia_to_julia.jl
node test/scenario3_julia_to_julia.js
```

## Usage

### Scenario 0: Basic Multi-Payload Example

#### Julia (Sender)
```julia
using NATSBridge

# data1, data2, and mydata are placeholders for your own values.
# Send multiple payloads in one message (type is required per payload)
smartsend(
    "/test",
    [("dataname1", data1, "dictionary"), ("dataname2", data2, "table")],
    nats_url="nats://localhost:4222",
    fileserver_url="http://localhost:8080",
    metadata=Dict("custom_key" => "custom_value")
)

# Even a single payload must be wrapped in a list with its type
smartsend("/test", [("single_data", mydata, "dictionary")])
```

#### Julia (Receiver)
```julia
using NATSBridge

# Receive returns a dictionary envelope with all metadata and deserialized payloads
envelope = smartreceive(msg, "http://localhost:8080")
# envelope["payloads"] = [(dataname1, data1, "dictionary"), (dataname2, data2, "table"), ...]
# envelope["correlationId"], envelope["msgId"], etc.
```

### Scenario 1: Command & Control (Small JSON)

#### JavaScript (Sender)
```javascript
const { smartsend } = require('./src/NATSBridge');

// CommonJS has no top-level await, so the calls run inside an async function
async function main() {
    // Single payload wrapped in a list
    const config = [{
        dataname: "config",
        data: { step_size: 0.01, iterations: 1000 },
        type: "dictionary"
    }];

    await smartsend("control", config, {
        correlationId: "unique-id"
    });

    // Multiple payloads
    const configs = [
        {
            dataname: "config1",
            data: { step_size: 0.01 },
            type: "dictionary"
        },
        {
            dataname: "config2",
            data: { iterations: 1000 },
            type: "dictionary"
        }
    ];

    await smartsend("control", configs);
}

main().catch(console.error);
```

#### Julia (Receiver)
```julia
using NATS
using JSON3
using NATSBridge

# `nats` is an existing NATS connection.
# Subscribe to control subject
subscribe(nats, "control") do msg
    envelope = smartreceive(msg, "http://localhost:8080")

    for (dataname, data, type) in envelope["payloads"]
        # Execute simulation with parameters
        step_size = data["step_size"]
        iterations = data["iterations"]
    end

    # Send acknowledgment (JSON3.write serializes the Dict to a JSON string)
    response = Dict("status" => "Running", "correlation_id" => envelope["correlationId"])
    publish(nats, "control_response", JSON3.write(response))
end
```

#### JavaScript (Receiver)
```javascript
const { connect } = require('nats');
const { smartreceive } = require('./src/NATSBridge');

async function main() {
    // Subscribe to messages
    const nc = await connect({ servers: ['nats://localhost:4222'] });
    const sub = nc.subscribe("control");

    for await (const msg of sub) {
        const envelope = await smartreceive(msg);

        // Process the payloads from the envelope
        for (const payload of envelope.payloads) {
            const { dataname, data, type } = payload;
            console.log(`Received ${dataname} of type ${type}`);
            console.log(`Data: ${JSON.stringify(data)}`);
        }

        // Also access envelope metadata
        console.log(`Correlation ID: ${envelope.correlationId}`);
        console.log(`Message ID: ${envelope.msgId}`);
    }
}

main().catch(console.error);
```

### Scenario 2: Deep Dive Analysis (Large Arrow Table)

#### Julia (Sender)
```julia
using NATSBridge
using Arrow
using DataFrames

# Create large DataFrame
df = DataFrame(
    id = 1:10_000_000,
    value = rand(10_000_000),
    category = rand(["A", "B", "C"], 10_000_000)
)

# Send via smartsend - wrapped in a list (type is part of each tuple)
smartsend("analysis_results", [("table_data", df, "table")])
```

#### JavaScript (Receiver)
```javascript
const { smartreceive } = require('./src/NATSBridge');

// `msg` is the NATS message delivered by a subscription
async function handleAnalysisResults(msg) {
    const envelope = await smartreceive(msg);

    // Use table data from the payloads field
    // Note: Tables are sent as arrays of objects in JavaScript
    const tablePayload = envelope.payloads.find((p) => p.type === "table");
    return tablePayload.data;
}
```

### Scenario 3: Live Binary Processing

#### JavaScript (Sender)
```javascript
const { smartsend } = require('./src/NATSBridge');

// `binaryBuffer` is an ArrayBuffer or Uint8Array supplied by the caller
async function sendAudioChunk(binaryBuffer) {
    // Binary data wrapped in a list
    const binaryData = [{
        dataname: "audio_chunk",
        data: binaryBuffer,
        type: "binary"
    }];

    await smartsend("binary_input", binaryData, {
        metadata: {
            sample_rate: 44100,
            channels: 1
        }
    });
}
```

#### Julia (Receiver)
```julia
using NATSBridge
using WAV
using DSP

# Receive binary data
function process_binary(data)
    # Perform FFT or AI transcription
    spectrum = fft(data)

    # Send results back, wrapped in a list with an explicit type
    results = Dict("transcription" => "sample text", "spectrum" => spectrum)
    smartsend("binary_output", [("results", results, "dictionary")])
end
```

#### JavaScript (Receiver)
```javascript
const { smartreceive } = require('./src/NATSBridge');

// Receive binary data (async because smartreceive returns a promise)
async function process_binary(msg) {
    const envelope = await smartreceive(msg);

    // Process the binary data from envelope.payloads
    for (const payload of envelope.payloads) {
        if (payload.type === "binary") {
            // data is an ArrayBuffer or Uint8Array; both expose byteLength
            console.log(`Received binary data: ${payload.dataname}, size: ${payload.data.byteLength}`);
            // Perform FFT or AI transcription here
        }
    }
}
```

### Scenario 4: Catch-Up (JetStream)

#### Julia (Producer)
```julia
using NATSBridge

function publish_health_status(nats_url)
    while true
        # Send status wrapped in a list (type is part of each tuple)
        status = Dict("cpu" => rand(), "memory" => rand())
        smartsend("health", [("status", status, "dictionary")], nats_url=nats_url)
        sleep(5)  # Every 5 seconds
    end
end
```

#### JavaScript (Consumer)
```javascript
const { connect } = require('nats');
const { smartreceive } = require('./src/NATSBridge');

async function main() {
    const nc = await connect({ servers: ['nats://localhost:4222'] });
    const js = nc.jetstream();

    // Request replay from last 10 minutes
    // Note: the option names below are illustrative; check the JetStream
    // consumer options supported by your nats.js version.
    const consumer = await js.pullSubscribe("health", {
        durable_name: "catchup",
        max_batch: 100,
        max_ack_wait: 30000
    });

    // Process historical and real-time messages
    for await (const msg of consumer) {
        const envelope = await smartreceive(msg);
        // envelope.payloads contains the list of payloads
        // Each payload has: dataname, data, type
        msg.ack();
    }
}

main().catch(console.error);
```

### Scenario 5: Selection (Low Bandwidth)

**Focus:** Small Arrow tables, Julia to JavaScript. Julia sends a small DataFrame to a JavaScript dashboard, where the user picks one of the listed options.

**Julia (Sender):**
```julia
using NATSBridge
using DataFrames

# Create small DataFrame (e.g., 50KB - 500KB)
options_df = DataFrame(
    id = 1:10,
    name = ["Option A", "Option B", "Option C", "Option D", "Option E",
            "Option F", "Option G", "Option H", "Option I", "Option J"],
    description = ["Description A", "Description B", "Description C", "Description D", "Description E",
                   "Description F", "Description G", "Description H", "Description I", "Description J"]
)

# smartsend takes care of the following steps:
# - Convert to Arrow IPC stream
# - Check payload size (< 1MB threshold)
# - Publish directly to NATS with Base64-encoded payload
# - Include metadata for dashboard selection context
smartsend(
    "dashboard.selection",
    [("options_table", options_df, "table")],
    nats_url="nats://localhost:4222",
    metadata=Dict("context" => "user_selection")
)
```

**JavaScript (Receiver):**
```javascript
const { smartreceive, smartsend } = require('./src/NATSBridge');

// `msg` is the NATS message; `uiComponent` stands in for the dashboard widget
async function handleSelection(msg, uiComponent) {
    // Receive NATS message with direct transport
    // (smartreceive decodes the Base64 payload)
    const envelope = await smartreceive(msg);

    // For tables, the rows are in the data field of the table payload
    const tablePayload = envelope.payloads.find((p) => p.type === "table");
    const table = tablePayload.data; // Array of objects

    // User makes selection
    const selection = uiComponent.getSelectedOption();

    // Send selection back to Julia
    await smartsend("dashboard.response", [
        { dataname: "selected_option", data: selection, type: "dictionary" }
    ]);
}
```

**Use Case:** The Julia server generates a list of available options (e.g., file selections, configuration presets) as a small DataFrame and sends it to the JavaScript dashboard for user selection. The selection is then sent back to Julia for processing.

### Scenario 6: Chat System

**Focus:** Every conversational message is composed of any number and any combination of components, from small to large: text, images, audio, video, tables, and files. Sizes range from brief snippets to high-resolution images, large audio files, extensive tables, and massive documents. The scenario also covers claim-check delivery and full bi-directional messaging.

**Multi-Payload Support:** The system supports mixed-payload messages where a single message can contain multiple payloads with different transport strategies. The `smartreceive` function iterates through all payloads in the envelope and processes each according to its transport type.

**Julia (Sender/Receiver):**
```julia
using NATSBridge
using DataFrames

# Build chat message with mixed payloads:
# - Text: direct transport (Base64)
# - Small images: direct transport (Base64)
# - Large images: link transport (HTTP URL)
# - Audio/video: link transport (HTTP URL)
# - Tables: direct or link depending on size
# - Files: link transport (HTTP URL)
#
# Each payload uses appropriate transport strategy:
# - Size < 1MB → direct (NATS + Base64)
# - Size >= 1MB → link (HTTP upload + NATS URL)
#
# Include claim-check metadata for delivery tracking
# Support bidirectional messaging with replyTo fields

# Example: Chat with text, small image, and large file
# (image_bytes and large_file_bytes are placeholders for your own byte vectors)
chat_message = [
    ("message_text", "Hello, this is a test message!", "text"),
    ("user_avatar", image_bytes, "image"),          # Small image, direct transport
    ("large_document", large_file_bytes, "binary")  # Large file, link transport
]

smartsend(
    "chat.room123",
    chat_message,
    nats_url="nats://localhost:4222",
    msg_purpose="chat",
    reply_to="chat.room123.responses"
)
```

**JavaScript (Sender/Receiver):**
```javascript
const { smartsend, smartreceive } = require('./src/NATSBridge');

// Build chat message with mixed content:
// - User input text: direct transport
// - Selected image: check size, use appropriate transport
// - Audio recording: link transport for large files
// - File attachment: link transport
//
// Parse received message:
// - Direct payloads: decode Base64
// - Link payloads: fetch from HTTP with exponential backoff
// - Deserialize all payloads appropriately
//
// Render mixed content in chat interface
// Support bidirectional reply with claim-check delivery confirmation

// Example: Send chat with mixed content
// (selectedImageBuffer and audioUrl are supplied by the caller)
async function sendChatMessage(selectedImageBuffer, audioUrl) {
    const message = [
        {
            dataname: "text",
            data: "Hello from JavaScript!",
            type: "text"
        },
        {
            dataname: "image",
            data: selectedImageBuffer, // Small image (ArrayBuffer or Uint8Array)
            type: "image"
        },
        {
            dataname: "audio",
            data: audioUrl, // Large audio, link transport
            type: "audio"
        }
    ];

    await smartsend("chat.room123", message);
}
```

**Use Case:** Full-featured chat system supporting rich media. Users can send text and small images directly, while large files are uploaded to the HTTP server and referenced via URLs. The Claim-Check pattern supports reliable delivery tracking for all message components.

**Implementation Note:** See the standard API format in Section 1: `msgEnvelope_v1` supports `AbstractArray{msgPayload_v1}` for multiple payloads.

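
The per-payload dispatch described for `smartreceive` can be sketched as follows. This is an illustration under assumptions, not the bridge's actual code: `resolvePayloads` and `fetchBytes` are hypothetical names, the `transport` field comes from the envelope schema, and the sketch assumes a link payload carries its download URL in `data`.

```javascript
// Hypothetical sketch of receive-side dispatch: one branch per transport.
// `fetchBytes(url)` is injected so the example runs without a file server.
async function resolvePayloads(envelope, fetchBytes) {
  const resolved = [];
  for (const p of envelope.payloads) {
    let bytes;
    if (p.transport === "direct") {
      // Direct payloads carry their bytes inline as Base64.
      bytes = Buffer.from(p.data, "base64");
    } else if (p.transport === "link") {
      // Assumption: for link payloads, `data` holds the file server URL.
      bytes = await fetchBytes(p.data);
    } else {
      throw new Error(`unknown transport "${p.transport}" for ${p.dataname}`);
    }
    resolved.push({ dataname: p.dataname, type: p.type, bytes });
  }
  return resolved;
}
```

Type-specific deserialization (JSON parsing, Arrow decoding) would follow as a second step on `bytes`, keyed on `type`.
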
## Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `NATS_URL` | `nats://localhost:4222` | NATS server URL |
| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL (base URL without `/upload` suffix) |
| `SIZE_THRESHOLD` | `1_000_000` | Size threshold in bytes (1MB) |

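
One way to read these variables with their defaults is sketched below. `loadConfig` is a hypothetical helper, and accepting underscores in `SIZE_THRESHOLD` (as in `1_000_000`) is an assumption made so the table's notation can be used verbatim.

```javascript
// Hypothetical helper: reads bridge settings from the environment,
// falling back to the defaults listed in the table above.
function loadConfig(env = process.env) {
  // Allow Julia-style digit separators such as "1_000_000".
  const rawThreshold = String(env.SIZE_THRESHOLD ?? "1000000").replace(/_/g, "");
  const sizeThreshold = Number(rawThreshold);
  if (!Number.isInteger(sizeThreshold) || sizeThreshold <= 0) {
    throw new RangeError(`SIZE_THRESHOLD must be a positive integer, got "${env.SIZE_THRESHOLD}"`);
  }
  return {
    natsUrl: env.NATS_URL ?? "nats://localhost:4222",
    // Base URL only; the "/upload" suffix is not part of this setting.
    fileserverUrl: env.FILESERVER_URL ?? "http://localhost:8080",
    sizeThreshold,
  };
}
```
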
### Message Envelope Schema

```json
{
  "correlationId": "uuid-v4-string",
  "msgId": "uuid-v4-string",
  "timestamp": "2024-01-15T10:30:00Z",

  "sendTo": "topic/subject",
  "msgPurpose": "ACK | NACK | updateStatus | shutdown | chat",
  "senderName": "agent-wine-web-frontend",
  "senderId": "uuid4",
  "receiverName": "agent-backend",
  "receiverId": "uuid4",
  "replyTo": "topic",
  "replyToMsgId": "uuid4",
  "brokerURL": "nats://localhost:4222",

  "metadata": {
    "content_type": "application/octet-stream",
    "content_length": 123456
  },

  "payloads": [
    {
      "id": "uuid4",
      "dataname": "login_image",
      "type": "image",
      "transport": "direct",
      "encoding": "base64",
      "size": 15433,
      "data": "base64-encoded-string",
      "metadata": {
        "checksum": "sha256_hash"
      }
    }
  ]
}
```

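
To make the schema concrete, here is a minimal sketch that builds an envelope with one direct payload using only Node.js built-ins. `buildDirectPayload` and `buildEnvelope` are hypothetical names, and encoding the checksum as a hex string is an assumption; the schema only says `sha256_hash`.

```javascript
const nodeCrypto = require("crypto");

// Hypothetical helper: wraps raw bytes as a direct (inline Base64) payload.
function buildDirectPayload(dataname, type, bytes) {
  const buf = Buffer.from(bytes);
  return {
    id: nodeCrypto.randomUUID(),
    dataname,
    type,
    transport: "direct",
    encoding: "base64",
    size: buf.length, // size of the raw bytes, before Base64 encoding
    data: buf.toString("base64"),
    metadata: {
      // Assumption: checksum is the hex-encoded SHA-256 of the raw bytes.
      checksum: nodeCrypto.createHash("sha256").update(buf).digest("hex"),
    },
  };
}

// Hypothetical helper: fills in generated IDs and a timestamp, then lets
// the caller supply or override the remaining envelope fields.
function buildEnvelope(sendTo, payloads, fields = {}) {
  return {
    correlationId: nodeCrypto.randomUUID(),
    msgId: nodeCrypto.randomUUID(),
    timestamp: new Date().toISOString(),
    sendTo,
    metadata: {},
    ...fields,
    payloads,
  };
}
```

For example, `buildEnvelope("chat.room123", [buildDirectPayload("greeting", "text", "hello")], { msgPurpose: "chat" })` produces an object whose JSON form follows the schema above.
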
## Performance Considerations

### Zero-Copy Reading
- Use Arrow's memory-mapped file reading
- Avoid unnecessary data copying during deserialization
- Use Apache Arrow's native IPC reader

### Exponential Backoff
- Maximum retry count: 5
- Base delay: 100ms, max delay: 5000ms
- Implemented in both Julia and JavaScript implementations

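
The retry parameters above can be sketched as follows. The doubling factor is an assumption (the list gives only the base delay, the cap, and the retry count), and `backoffDelay` and `withRetry` are hypothetical names.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelay(attempt, baseMs = 100, maxMs = 5000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `operation` once, then retries up to `maxRetries` times on failure.
// `sleep` is injectable so tests do not have to wait for real timers.
async function withRetry(operation, options = {}) {
  const { maxRetries = 5, baseMs = 100, maxMs = 5000, sleep = defaultSleep } = options;
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break; // out of retries
      await sleep(backoffDelay(attempt, baseMs, maxMs));
    }
  }
  throw lastError;
}
```

With doubling and five retries, the waits are 100, 200, 400, 800, and 1600 ms, so the 5000 ms cap only applies if the retry count is raised.
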
### Correlation ID Logging
- Log the `correlationId` at every stage
- Include: send, receive, serialize, deserialize
- Use structured logging format

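
A structured log line per stage might look like the sketch below. `logStage` and its field names (`ts`, `stage`) are assumptions for illustration; only the `correlationId` field and the four stage names come from this guide.

```javascript
// Hypothetical helper: emits one JSON log line per pipeline stage.
// `write` is injectable so the output can be redirected or captured.
function logStage(stage, correlationId, details = {}, write = console.log) {
  const entry = {
    ...details,
    ts: new Date().toISOString(),
    stage, // one of: send, receive, serialize, deserialize
    correlationId,
  };
  const line = JSON.stringify(entry);
  write(line);
  return line;
}

logStage("serialize", "unique-id", { dataname: "config", sizeBytes: 42 });
```

Because every line carries the same `correlationId`, the log entries of one message can be joined across the Julia and JavaScript sides.
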
## Testing

Run the test scripts:

```bash
# Scenario 1: Command & Control (JavaScript sender)
node test/scenario1_command_control.js

# Scenario 2: Large Arrow Table (JavaScript sender)
node test/scenario2_large_table.js
```

## Troubleshooting

### Common Issues

1. **NATS Connection Failed**
   - Ensure NATS server is running
   - Check `NATS_URL` configuration

2. **HTTP Upload Failed**
   - Ensure file server is running
   - Check `FILESERVER_URL` configuration
   - Verify upload permissions

3. **Arrow IPC Deserialization Error**
   - Ensure data is properly serialized to Arrow format
   - Check Arrow version compatibility

## License

MIT