418 lines
11 KiB
Markdown
418 lines
11 KiB
Markdown
# Implementation Guide: Bi-Directional Data Bridge
|
|
|
|
## Overview
|
|
|
|
This document describes the implementation of the high-performance, bi-directional data bridge between Julia and JavaScript services using NATS (Core & JetStream), implementing the Claim-Check pattern for large payloads.
|
|
|
|
### Multi-Payload Support
|
|
|
|
The implementation uses a **standardized list-of-tuples format** for all payload operations. **Even when sending a single payload, the user must wrap it in a list.**
|
|
|
|
**API Standard:**
|
|
```julia
|
|
# Input format for smartsend (always a list of tuples)
|
|
[(dataname1, data1), (dataname2, data2), ...]
|
|
|
|
# Output format for smartreceive (always returns a list of tuples)
|
|
[(dataname1, data1), (dataname2, data2), ...]
|
|
```
|
|
|
|
**Examples:**
|
|
```julia
|
|
# Single payload - still wrapped in a list
|
|
smartsend("/test", [(dataname1, data1)], ...)
|
|
|
|
# Multiple payloads in one message
|
|
smartsend("/test", [(dataname1, data1), (dataname2, data2)], ...)
|
|
|
|
# Receive always returns a list
|
|
payloads = smartreceive(msg, ...)
|
|
# payloads = [(dataname1, data1), (dataname2, data2), ...]
|
|
```
|
|
|
|
## Architecture
|
|
|
|
The implementation follows the Claim-Check pattern:
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────────────────┐
|
|
│ SmartSend Function │
|
|
└─────────────────────────────────────────────────────────────────────────┘
|
|
│
|
|
▼
|
|
┌─────────────────────────────────────────────────────────────────────────┐
|
|
│ Is payload size < 1MB? │
|
|
└─────────────────────────────────────────────────────────────────────────┘
|
|
│
|
|
┌─────────────────┴─────────────────┐
|
|
▼ ▼
|
|
┌─────────────────┐ ┌─────────────────┐
|
|
│ Direct Path │ │ Link Path │
|
|
│ (< 1MB) │ │ (> 1MB) │
|
|
│ │ │ │
|
|
│ • Serialize to │ │ • Serialize to │
|
|
│ IOBuffer │ │ IOBuffer │
|
|
│ • Base64 encode │ │ • Upload to │
|
|
│ • Publish to │ │ HTTP Server │
|
|
│ NATS │ │ • Publish to │
|
|
│ │ │ NATS with URL │
|
|
└─────────────────┘ └─────────────────┘
|
|
```
|
|
|
|
## Files
|
|
|
|
### Julia Module: [`src/julia_bridge.jl`](../src/julia_bridge.jl)
|
|
|
|
The Julia implementation provides:
|
|
|
|
- **[`MessageEnvelope`](../src/julia_bridge.jl)**: Struct for the unified JSON envelope
|
|
- **[`SmartSend()`](../src/julia_bridge.jl)**: Handles transport selection based on payload size
|
|
- **[`SmartReceive()`](../src/julia_bridge.jl)**: Handles both direct and link transport
|
|
|
|
### JavaScript Module: [`src/js_bridge.js`](../src/js_bridge.js)
|
|
|
|
The JavaScript implementation provides:
|
|
|
|
- **`MessageEnvelope` class**: For the unified JSON envelope
|
|
- **[`SmartSend()`](../src/js_bridge.js)**: Handles transport selection based on payload size
|
|
- **[`SmartReceive()`](../src/js_bridge.js)**: Handles both direct and link transport
|
|
|
|
## Installation
|
|
|
|
### Julia Dependencies
|
|
|
|
```julia
|
|
using Pkg
|
|
Pkg.add("NATS")
|
|
Pkg.add("Arrow")
|
|
Pkg.add("JSON3")
|
|
Pkg.add("HTTP")
|
|
Pkg.add("UUIDs")
|
|
Pkg.add("Dates")
|
|
```
|
|
|
|
### JavaScript Dependencies
|
|
|
|
```bash
|
|
npm install nats.js apache-arrow uuid base64-url
|
|
```
|
|
|
|
## Usage Tutorial
|
|
|
|
### Step 1: Start NATS Server
|
|
|
|
```bash
|
|
docker run -p 4222:4222 nats:latest
|
|
```
|
|
|
|
### Step 2: Start HTTP File Server (optional)
|
|
|
|
```bash
|
|
# Create a directory for file uploads
|
|
mkdir -p /tmp/fileserver
|
|
|
|
# Use any HTTP server that supports POST for file uploads
|
|
# Example: Python's built-in server
|
|
python3 -m http.server 8080 --directory /tmp/fileserver
|
|
```
|
|
|
|
### Step 3: Run Test Scenarios
|
|
|
|
```bash
|
|
# Scenario 1: Command & Control (JavaScript sender)
|
|
node test/scenario1_command_control.js
|
|
|
|
# Scenario 2: Large Arrow Table (JavaScript sender)
|
|
node test/scenario2_large_table.js
|
|
|
|
# Scenario 3: Julia-to-Julia communication
|
|
# Run both Julia and JavaScript versions
|
|
julia test/scenario3_julia_to_julia.jl
|
|
node test/scenario3_julia_to_julia.js
|
|
```
|
|
|
|
## Usage
|
|
|
|
### Scenario 0: Basic Multi-Payload Example
|
|
|
|
#### Julia (Sender)
|
|
```julia
|
|
using NATSBridge
|
|
|
|
# Send multiple payloads in one message
|
|
smartsend(
|
|
"/test",
|
|
[("dataname1", data1), ("dataname2", data2)],
|
|
nats_url="nats://localhost:4222",
|
|
fileserver_url="http://localhost:8080/upload",
|
|
metadata=Dict("custom_key" => "custom_value")
|
|
)
|
|
|
|
# Even single payload must be wrapped in a list
|
|
smartsend("/test", [("single_data", mydata)])
|
|
```
|
|
|
|
#### Julia (Receiver)
|
|
```julia
|
|
using NATSBridge
|
|
|
|
# Receive returns a list of payloads
|
|
payloads = smartreceive(msg, "http://localhost:8080/upload")
|
|
# payloads = [(dataname1, data1), (dataname2, data2), ...]
|
|
```
|
|
|
|
### Scenario 1: Command & Control (Small JSON)
|
|
|
|
#### JavaScript (Sender)
|
|
```javascript
|
|
const { SmartSend } = require('./js_bridge');
|
|
|
|
// Single payload wrapped in a list
|
|
const config = [{
|
|
dataname: "config",
|
|
data: { step_size: 0.01, iterations: 1000 },
|
|
type: "json"
|
|
}];
|
|
|
|
await SmartSend("control", config, "json", {
|
|
correlationId: "unique-id"
|
|
});
|
|
|
|
// Multiple payloads
|
|
const configs = [
|
|
{
|
|
dataname: "config1",
|
|
data: { step_size: 0.01 },
|
|
type: "json"
|
|
},
|
|
{
|
|
dataname: "config2",
|
|
data: { iterations: 1000 },
|
|
type: "json"
|
|
}
|
|
];
|
|
|
|
await SmartSend("control", configs, "json");
|
|
```
|
|
|
|
#### Julia (Receiver)
|
|
```julia
|
|
using NATS
|
|
using JSON3
|
|
|
|
# Subscribe to control subject
|
|
subscribe(nats, "control") do msg
|
|
env = MessageEnvelope(String(msg.data))
|
|
config = JSON3.read(env.payload)
|
|
|
|
# Execute simulation with parameters
|
|
step_size = config.step_size
|
|
iterations = config.iterations
|
|
|
|
# Send acknowledgment
|
|
response = Dict("status" => "Running", "correlation_id" => env.correlation_id)
|
|
publish(nats, "control_response", JSON3.stringify(response))
|
|
end
|
|
```
|
|
|
|
### Scenario 2: Deep Dive Analysis (Large Arrow Table)
|
|
|
|
#### Julia (Sender)
|
|
```julia
|
|
using Arrow
|
|
using DataFrames
|
|
|
|
# Create large DataFrame
|
|
df = DataFrame(
|
|
id = 1:10_000_000,
|
|
value = rand(10_000_000),
|
|
category = rand(["A", "B", "C"], 10_000_000)
|
|
)
|
|
|
|
# Send via SmartSend - wrapped in a list
|
|
await SmartSend("analysis_results", [("table_data", df)], "table");
|
|
```
|
|
|
|
#### JavaScript (Receiver)
|
|
```javascript
|
|
const { SmartReceive } = require('./js_bridge');
|
|
|
|
const result = await SmartReceive(msg);
|
|
|
|
// Use table data for visualization with Perspective.js or D3
|
|
const table = result.data;
|
|
```
|
|
|
|
### Scenario 3: Live Binary Processing
|
|
|
|
#### JavaScript (Sender)
|
|
```javascript
|
|
const { SmartSend } = require('./js_bridge');
|
|
|
|
// Binary data wrapped in a list
|
|
const binaryData = [{
|
|
dataname: "audio_chunk",
|
|
data: binaryBuffer,
|
|
type: "binary"
|
|
}];
|
|
|
|
await SmartSend("binary_input", binaryData, "binary", {
|
|
metadata: {
|
|
sample_rate: 44100,
|
|
channels: 1
|
|
}
|
|
});
|
|
```
|
|
|
|
#### Julia (Receiver)
|
|
```julia
|
|
using WAV
|
|
using DSP
|
|
|
|
# Receive binary data
|
|
function process_binary(data)
|
|
# Perform FFT or AI transcription
|
|
spectrum = fft(data)
|
|
|
|
# Send results back (JSON + Arrow table)
|
|
results = Dict("transcription" => "sample text", "spectrum" => spectrum)
|
|
await SmartSend("binary_output", results, "json")
|
|
end
|
|
```
|
|
|
|
### Scenario 4: Catch-Up (JetStream)
|
|
|
|
#### Julia (Producer)
|
|
```julia
|
|
using NATSBridge
|
|
|
|
function publish_health_status(nats_url)
|
|
# Send status wrapped in a list
|
|
status = Dict("cpu" => rand(), "memory" => rand())
|
|
smartsend("health", [("status", status)], "json", nats_url=nats_url)
|
|
sleep(5) # Every 5 seconds
|
|
end
|
|
```
|
|
|
|
#### JavaScript (Consumer)
|
|
```javascript
|
|
const { connect } = require('nats');
|
|
|
|
const nc = await connect({ servers: ['nats://localhost:4222'] });
|
|
const js = nc.jetstream();
|
|
|
|
// Request replay from last 10 minutes
|
|
const consumer = await js.pullSubscribe("health", {
|
|
durable_name: "catchup",
|
|
max_batch: 100,
|
|
max_ack_wait: 30000
|
|
});
|
|
|
|
// Process historical and real-time messages
|
|
for await (const msg of consumer) {
|
|
const result = await SmartReceive(msg);
|
|
// result.data contains the list of payloads
|
|
// result.envelope contains the message envelope
|
|
msg.ack();
|
|
}
|
|
```
|
|
|
|
## Configuration
|
|
|
|
### Environment Variables
|
|
|
|
| Variable | Default | Description |
|
|
|----------|---------|-------------|
|
|
| `NATS_URL` | `nats://localhost:4222` | NATS server URL |
|
|
| `FILESERVER_URL` | `http://localhost:8080/upload` | HTTP file server URL |
|
|
| `SIZE_THRESHOLD` | `1_000_000` | Size threshold in bytes (1MB) |
|
|
|
|
### Message Envelope Schema
|
|
|
|
```json
|
|
{
|
|
"correlationId": "uuid-v4-string",
|
|
"msgId": "uuid-v4-string",
|
|
"timestamp": "2024-01-15T10:30:00Z",
|
|
|
|
"sendTo": "topic/subject",
|
|
"msgPurpose": "ACK | NACK | updateStatus | shutdown | chat",
|
|
"senderName": "agent-wine-web-frontend",
|
|
"senderId": "uuid4",
|
|
"receiverName": "agent-backend",
|
|
"receiverId": "uuid4",
|
|
"replyTo": "topic",
|
|
"replyToMsgId": "uuid4",
|
|
"BrokerURL": "nats://localhost:4222",
|
|
|
|
"metadata": {
|
|
"content_type": "application/octet-stream",
|
|
"content_length": 123456
|
|
},
|
|
|
|
"payloads": [
|
|
{
|
|
"id": "uuid4",
|
|
"dataname": "login_image",
|
|
"type": "image",
|
|
"transport": "direct",
|
|
"encoding": "base64",
|
|
"size": 15433,
|
|
"data": "base64-encoded-string",
|
|
"metadata": {
|
|
"checksum": "sha256_hash"
|
|
}
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
## Performance Considerations
|
|
|
|
### Zero-Copy Reading
|
|
- Use Arrow's memory-mapped file reading
|
|
- Avoid unnecessary data copying during deserialization
|
|
- Use Apache Arrow's native IPC reader
|
|
|
|
### Exponential Backoff
|
|
- Maximum retry count: 5
|
|
- Base delay: 100ms, max delay: 5000ms
|
|
- Implemented in both Julia and JavaScript implementations
|
|
|
|
### Correlation ID Logging
|
|
- Log correlation_id at every stage
|
|
- Include: send, receive, serialize, deserialize
|
|
- Use structured logging format
|
|
|
|
## Testing
|
|
|
|
Run the test scripts:
|
|
|
|
```bash
|
|
# Scenario 1: Command & Control (JavaScript sender)
|
|
node test/scenario1_command_control.js
|
|
|
|
# Scenario 2: Large Arrow Table (JavaScript sender)
|
|
node test/scenario2_large_table.js
|
|
```
|
|
|
|
## Troubleshooting
|
|
|
|
### Common Issues
|
|
|
|
1. **NATS Connection Failed**
|
|
- Ensure NATS server is running
|
|
- Check NATS_URL configuration
|
|
|
|
2. **HTTP Upload Failed**
|
|
- Ensure file server is running
|
|
- Check FILESERVER_URL configuration
|
|
- Verify upload permissions
|
|
|
|
3. **Arrow IPC Deserialization Error**
|
|
- Ensure data is properly serialized to Arrow format
|
|
- Check Arrow version compatibility
|
|
|
|
## License
|
|
|
|
MIT |