Architecture Documentation: Bi-Directional Data Bridge
Overview
This document describes the architecture for a high-performance, bi-directional data bridge between Julia, JavaScript, and Python/Micropython applications using NATS (Core & JetStream), implementing the Claim-Check pattern for large payloads.
The system enables seamless communication across all three platforms:
- Julia ↔ JavaScript bi-directional messaging
- JavaScript ↔ Python/Micropython bi-directional messaging
- Julia ↔ Python/Micropython bi-directional messaging (via JSON serialization)
File Server Handler Architecture
The system uses handler functions to abstract file server operations, allowing support for different file server implementations (e.g., Plik, AWS S3, custom HTTP server).
Handler Function Signatures:
# Upload handler - uploads data to file server and returns URL
# The handler is passed to smartsend as fileserver_upload_handler parameter
# It receives: (file_server_url::String, dataname::String, data::Vector{UInt8})
# Returns: Dict{String, Any} with keys: "status", "uploadid", "fileid", "url"
fileserver_upload_handler(file_server_url::String, dataname::String, data::Vector{UInt8})::Dict{String, Any}
# Download handler - fetches data from file server URL with exponential backoff
# The handler is passed to smartreceive as fileserver_download_handler parameter
# It receives: (url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String)
# Returns: Vector{UInt8} (the downloaded data)
fileserver_download_handler(url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String)::Vector{UInt8}
This design allows the system to support multiple file server backends without changing the core messaging logic.
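As an illustration of the handler contract, here is a minimal Python sketch of an upload/download pair that keeps data in a process-local dictionary instead of talking to a real file server. The function names, the _STORE dictionary, the mem:// URL used in the usage note, and the integer status value are all assumptions made for this example; only the argument order and the returned keys come from the signatures above.

```python
import uuid

# Hypothetical in-memory stand-in for a file server: maps URL -> bytes.
_STORE = {}

def memory_upload_handler(file_server_url, dataname, data):
    """Store data and return a dict with the keys the upload contract lists."""
    upload_id = str(uuid.uuid4())
    file_id = str(uuid.uuid4())
    url = f"{file_server_url}/file/{upload_id}/{file_id}/{dataname}"
    _STORE[url] = bytes(data)
    # The type of "status" is not specified in this document; 200 is assumed.
    return {"status": 200, "uploadid": upload_id, "fileid": file_id, "url": url}

def memory_download_handler(url, max_retries, base_delay, max_delay, correlation_id):
    """Return the stored bytes; the retry parameters are accepted but unused here."""
    if url not in _STORE:
        raise KeyError(f"[{correlation_id}] no upload found at {url}")
    return _STORE[url]
```

A pair like this can be handy in unit tests: calling memory_upload_handler("mem://files", "report.bin", data) and passing the returned url to memory_download_handler should give back the same bytes.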
Multi-Payload Support (Standard API)
The system uses a standardized list-of-tuples format for all payload operations. Even when sending a single payload, the user must wrap it in a list.
API Standard:
# Input format for smartsend (always a list of tuples with type info)
[(dataname1, data1, type1), (dataname2, data2, type2), ...]
# Output format for smartreceive (returns a dictionary with payloads field containing list of tuples)
# Returns: Dict with envelope metadata and payloads field containing Vector{Tuple{String, Any, String}}
# {
# "correlation_id": "...",
# "msg_id": "...",
# "timestamp": "...",
# "send_to": "...",
# "msg_purpose": "...",
# "sender_name": "...",
# "sender_id": "...",
# "receiver_name": "...",
# "receiver_id": "...",
# "reply_to": "...",
# "reply_to_msg_id": "...",
# "broker_url": "...",
# "metadata": {...},
# "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...]
# }
Supported Types:
- "text" - Plain text
- "dictionary" - JSON-serializable dictionaries (Dict, NamedTuple)
- "table" - Tabular data (DataFrame, array of structs)
- "image" - Image data (Bitmap, PNG/JPG bytes)
- "audio" - Audio data (WAV, MP3 bytes)
- "video" - Video data (MP4, AVI bytes)
- "binary" - Generic binary data (Vector{UInt8})
This design allows per-payload type specification, enabling mixed-content messages where different payloads can use different serialization formats in a single message.
Examples:
# Single payload - still wrapped in a list
smartsend(
"/test",
[("dataname1", data1, "dictionary")], # List with one (dataname, data, type) tuple
broker_url="nats://localhost:4222",
fileserver_upload_handler=plik_oneshot_upload
)
# Multiple payloads in one message with different types
smartsend(
"/test",
[("dataname1", data1, "dictionary"), ("dataname2", data2, "table")],
broker_url="nats://localhost:4222",
fileserver_upload_handler=plik_oneshot_upload
)
# Mixed content (e.g., chat with text, image, audio)
smartsend(
"/chat",
[
("message_text", "Hello!", "text"),
("user_image", image_data, "image"),
("audio_clip", audio_data, "audio")
],
broker_url="nats://localhost:4222"
)
# Receive returns a dictionary envelope with all metadata and deserialized payloads
env = smartreceive(msg; fileserver_download_handler=_fetch_with_backoff, max_retries=5, base_delay=100, max_delay=5000)
# env["payloads"] = [("dataname1", data1, type1), ("dataname2", data2, type2), ...]
# env["correlation_id"], env["msg_id"], etc.
# env is a dictionary containing envelope metadata and payloads field
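The list-of-tuples rule can be checked before anything is serialized. The following Python sketch shows one way to do that; validate_payloads and SUPPORTED_TYPES are hypothetical names for this example and are not claimed to exist in the bridge.

```python
# Type names taken from the "Supported Types" list.
SUPPORTED_TYPES = {"text", "dictionary", "table", "image", "audio", "video", "binary"}

def validate_payloads(payloads):
    """Check that payloads is a list of (dataname, data, type) tuples."""
    if not isinstance(payloads, list):
        raise TypeError("payloads must be a list, even for a single payload")
    for index, item in enumerate(payloads):
        if not (isinstance(item, tuple) and len(item) == 3):
            raise ValueError(f"payload {index} must be a (dataname, data, type) tuple")
        dataname, _data, payload_type = item
        if not isinstance(dataname, str):
            raise ValueError(f"payload {index}: dataname must be a string")
        if payload_type not in SUPPORTED_TYPES:
            raise ValueError(f"payload {index}: unsupported type {payload_type!r}")
    return payloads
```

Rejecting a bare tuple early gives a clearer error than a failure deep inside serialization.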
Architecture Diagram
flowchart TD
subgraph Client
JS[JavaScript Client]
JSApp[Application Logic]
end
subgraph Server
Julia[Julia Service]
NATS[NATS Server]
FileServer[HTTP File Server]
end
JS -->|Control/Small Data| JSApp
JSApp -->|NATS| NATS
NATS -->|NATS| Julia
Julia -->|NATS| NATS
Julia -->|HTTP POST| FileServer
JS -->|HTTP GET| FileServer
style JS fill:#e1f5fe
style Julia fill:#e8f5e9
style NATS fill:#fff3e0
style FileServer fill:#f3e5f5
System Components
1. msg_envelope_v1 - Message Envelope
The msg_envelope_v1 structure provides a comprehensive message format for bidirectional communication between Julia, JavaScript, and Python/Micropython applications.
Julia Structure:
struct msg_envelope_v1
correlation_id::String # Unique identifier to track messages across systems
msg_id::String # This message id
timestamp::String # Message published timestamp
send_to::String # Topic/subject the sender sends to
msg_purpose::String # Purpose of this message (ACK | NACK | updateStatus | shutdown | ...)
sender_name::String # Sender name (e.g., "agent-wine-web-frontend")
sender_id::String # Sender id (uuid4)
receiver_name::String # Message receiver name (e.g., "agent-backend")
receiver_id::String # Message receiver id (uuid4, or empty string for broadcast)
reply_to::String # Topic to reply to
reply_to_msg_id::String # Message id this message is replying to
broker_url::String # NATS server address
metadata::Dict{String, Any}
payloads::Vector{msg_payload_v1} # Multiple payloads stored here
end
JSON Schema:
{
"correlation_id": "uuid-v4-string",
"msg_id": "uuid-v4-string",
"timestamp": "2024-01-15T10:30:00Z",
"send_to": "topic/subject",
"msg_purpose": "ACK | NACK | updateStatus | shutdown | chat",
"sender_name": "agent-wine-web-frontend",
"sender_id": "uuid4",
"receiver_name": "agent-backend",
"receiver_id": "uuid4",
"reply_to": "topic",
"reply_to_msg_id": "uuid4",
"broker_url": "nats://localhost:4222",
"metadata": {
},
"payloads": [
{
"id": "uuid4",
"dataname": "login_image",
"payload_type": "image",
"transport": "direct",
"encoding": "base64",
"size": 15433,
"data": "base64-encoded-string",
"metadata": {
}
},
{
"id": "uuid4",
"dataname": "large_data",
"payload_type": "table",
"transport": "link",
"encoding": "none",
"size": 524288,
"data": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/data.arrow",
"metadata": {
}
}
]
}
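To make the schema concrete, the Python sketch below assembles an envelope with the same field names and serializes it to JSON. build_envelope and its defaults are assumptions for illustration; in particular, generating a fresh sender_id on every call is a simplification.

```python
import base64
import json
import uuid
from datetime import datetime, timezone

def build_envelope(send_to, payloads, msg_purpose="chat", sender_name="NATSBridge",
                   broker_url="nats://localhost:4222", correlation_id=None):
    """Assemble an envelope dict using the field names from the JSON schema."""
    return {
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "msg_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "send_to": send_to,
        "msg_purpose": msg_purpose,
        "sender_name": sender_name,
        "sender_id": str(uuid.uuid4()),  # simplification: a real sender keeps a stable id
        "receiver_name": "",
        "receiver_id": "",
        "reply_to": "",
        "reply_to_msg_id": "",
        "broker_url": broker_url,
        "metadata": {},
        "payloads": payloads,
    }

raw = "Hello!".encode("utf-8")
direct_payload = {
    "id": str(uuid.uuid4()),
    "dataname": "greeting",
    "payload_type": "text",
    "transport": "direct",
    "encoding": "base64",
    "size": len(raw),
    "data": base64.b64encode(raw).decode("ascii"),
    "metadata": {},
}
envelope_json = json.dumps(build_envelope("/chat", [direct_payload]))
```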
2. msg_payload_v1 - Payload Structure
The msg_payload_v1 structure provides flexible payload handling for various data types across all supported platforms.
Julia Structure:
struct msg_payload_v1
id::String # Id of this payload (e.g., "uuid4")
dataname::String # Name of this payload (e.g., "login_image")
payload_type::String # "text | dictionary | table | image | audio | video | binary"
transport::String # "direct | link"
encoding::String # "none | json | base64 | arrow-ipc"
size::Integer # Data size in bytes
data::Any # Payload data in case of direct transport or a URL in case of link
metadata::Dict{String, Any} # Dict("checksum" => "sha256_hash", ...)
end
Key Features:
- Supports multiple data types: text, dictionary, table, image, audio, video, binary
- Flexible transport: "direct" (NATS) or "link" (HTTP fileserver)
- Multiple payloads per message (essential for chat with mixed content)
- Per-payload and per-envelope metadata support
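The payload metadata comment mentions a "checksum" entry holding a SHA-256 hash. A minimal Python sketch of attaching and verifying such a checksum follows. It assumes the hash is computed over the serialized bytes before any Base64 encoding, which this document does not specify; with_checksum and verify_checksum are hypothetical helper names.

```python
import hashlib

def with_checksum(payload, raw):
    """Return a copy of payload whose metadata carries the SHA-256 of raw."""
    metadata = dict(payload.get("metadata", {}))
    metadata["checksum"] = hashlib.sha256(raw).hexdigest()
    return {**payload, "metadata": metadata}

def verify_checksum(payload, raw):
    """True if the payload carries no checksum or the checksum matches raw."""
    expected = payload.get("metadata", {}).get("checksum")
    return expected is None or expected == hashlib.sha256(raw).hexdigest()
```

A receiver could run verify_checksum after decoding a direct payload or downloading a link payload, and treat a mismatch as a failed transfer.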
3. Transport Strategy Decision Logic
┌─────────────────────────────────────────────────────────────┐
│ smartsend Function │
│ Accepts: [(dataname1, data1, type1), ...] │
│ (Type is per payload, not standalone) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ For each payload: │
│ 1. Extract type from tuple │
│ 2. Serialize based on type │
│ 3. Check payload size │
└─────────────────────────────────────────────────────────────┘
│
┌────────────────┴──────────────────┐
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Direct Path │ │ Link Path │
│ (< 1MB) │ │ (>= 1MB) │
│ │ │ │
│ • Serialize to │ │ • Serialize to │
│ IOBuffer │ │ IOBuffer │
│ • Base64 encode │ │ • Upload to │
│ • Publish to │ │ HTTP Server │
│ NATS │ │ • Publish to │
│ (with payload │ │ NATS with URL │
│ in envelope) │ │ (in envelope) │
└─────────────────┘ └─────────────────┘
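The decision in the diagram can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the bridge's implementation: build_payload is a hypothetical name, the upload argument stands for any function with the upload handler signature, and DEFAULT_SIZE_THRESHOLD is an assumed value because this document gives the default only as "1MB".

```python
import base64
import uuid

DEFAULT_SIZE_THRESHOLD = 1_000_000  # assumed; the exact default is not stated here

def build_payload(dataname, payload_type, raw, upload, file_server_url,
                  size_threshold=DEFAULT_SIZE_THRESHOLD):
    """Pick direct or link transport for already-serialized bytes."""
    payload = {
        "id": str(uuid.uuid4()),
        "dataname": dataname,
        "payload_type": payload_type,
        "size": len(raw),
        "metadata": {},
    }
    if len(raw) < size_threshold:
        # Direct path: embed the Base64-encoded bytes in the envelope.
        payload.update(transport="direct", encoding="base64",
                       data=base64.b64encode(raw).decode("ascii"))
    else:
        # Link path: upload the bytes and carry only the URL in the envelope.
        result = upload(file_server_url, dataname, raw)
        payload.update(transport="link", encoding="none", data=result["url"])
    return payload
```

Base64 grows the data by roughly a third, so a threshold chosen for the direct path may need some headroom below the broker's message size limit.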
4. Cross-Platform Architecture
flowchart TD
subgraph PythonMicropython
Py[Python/Micropython]
PySmartSend[smartsend]
PySmartReceive[smartreceive]
end
subgraph JavaScript
JS[JavaScript]
JSSmartSend[smartsend]
JSSmartReceive[smartreceive]
end
subgraph Julia
Julia[Julia]
JuliaSmartSend[smartsend]
JuliaSmartReceive[smartreceive]
end
subgraph NATS
NATSServer[NATS Server]
end
PySmartSend --> NATSServer
JSSmartSend --> NATSServer
JuliaSmartSend --> NATSServer
NATSServer --> PySmartReceive
NATSServer --> JSSmartReceive
NATSServer --> JuliaSmartReceive
style PythonMicropython fill:#e1f5fe
style JavaScript fill:#f3e5f5
style Julia fill:#e8f5e9
5. Python/Micropython Module Architecture
graph TD
subgraph PyModule
PySmartSend[smartsend]
SizeCheck[Size Check]
DirectPath[Direct Path]
LinkPath[Link Path]
HTTPClient[HTTP Client]
end
PySmartSend --> SizeCheck
SizeCheck -->|< 1MB| DirectPath
SizeCheck -->|>= 1MB| LinkPath
LinkPath --> HTTPClient
style PyModule fill:#b3e5fc
6. Julia Module Architecture
graph TD
subgraph JuliaModule
JuliaSmartSend[smartsend]
SizeCheck[Size Check]
DirectPath[Direct Path]
LinkPath[Link Path]
HTTPClient[HTTP Client]
end
JuliaSmartSend --> SizeCheck
SizeCheck -->|< 1MB| DirectPath
SizeCheck -->|>= 1MB| LinkPath
LinkPath --> HTTPClient
style JuliaModule fill:#c5e1a5
7. JavaScript Module Architecture
graph TD
subgraph JSModule
JSSmartSend[smartsend]
JSSmartReceive[smartreceive]
JetStreamConsumer[JetStream Pull Consumer]
ApacheArrow[Apache Arrow]
end
JSSmartSend --> NATS
JSSmartReceive --> JetStreamConsumer
JetStreamConsumer --> ApacheArrow
style JSModule fill:#f3e5f5
Implementation Details
Julia Implementation
Dependencies
- NATS.jl - Core NATS functionality
- Arrow.jl - Arrow IPC serialization
- JSON3.jl - JSON parsing
- HTTP.jl - HTTP client for file server
- Dates.jl - Timestamps for logging
smartsend Function
function smartsend(
subject::String,
data::AbstractArray{Tuple{String, Any, String}, 1}; # List of (dataname, data, type) tuples
broker_url::String = DEFAULT_BROKER_URL, # NATS server URL
fileserver_url = DEFAULT_FILESERVER_URL,
fileserver_upload_handler::Function = plik_oneshot_upload,
size_threshold::Int = DEFAULT_SIZE_THRESHOLD,
correlation_id::Union{String, Nothing} = nothing,
msg_purpose::String = "chat",
sender_name::String = "NATSBridge",
receiver_name::String = "",
receiver_id::String = "",
reply_to::String = "",
reply_to_msg_id::String = "",
is_publish::Bool = true # Whether to automatically publish to NATS
)
Return Value:
- Returns a tuple (env, env_json_str) where:
  - env::msg_envelope_v1 - The envelope object containing all metadata and payloads
  - env_json_str::String - JSON string representation of the envelope for publishing
Options:
- is_publish::Bool = true - When true (default), the message is automatically published to NATS. When false, the function returns the envelope and JSON string without publishing, allowing manual publishing via NATS request-reply pattern.
The envelope object can be accessed directly for programmatic use, while the JSON string can be published directly to NATS using the request-reply pattern.
Input Format:
- data::AbstractArray{Tuple{String, Any, String}} - Must be a list of (dataname, data, type) tuples: [("dataname1", data1, "type1"), ("dataname2", data2, "type2"), ...]
- Even for single payloads: [("dataname1", data1, "type1")]
- Each payload can have a different type, enabling mixed-content messages
Flow:
- Iterate through the list of (dataname, data, type) tuples
- For each payload: extract the type from the tuple and serialize accordingly
- Check payload size
- If < threshold: publish directly to NATS with Base64-encoded payload
- If >= threshold: upload to the HTTP file server, then publish the URL to NATS
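The "serialize accordingly" step in the flow can be pictured as a dispatch on the type string. The Python sketch below covers only the types that need nothing beyond the standard library. The "table" type (Arrow IPC) is deliberately left out, and serialize is a hypothetical helper name rather than a documented function.

```python
import json

def serialize(data, payload_type):
    """Convert a payload to bytes according to its declared type."""
    if payload_type == "text":
        return data.encode("utf-8")
    if payload_type == "dictionary":
        return json.dumps(data).encode("utf-8")
    if payload_type in ("image", "audio", "video", "binary"):
        # Media and binary payloads are assumed to arrive as bytes-like objects.
        return bytes(data)
    # "table" would need an Arrow IPC writer, which this sketch omits.
    raise ValueError(f"no serializer for type {payload_type!r} in this sketch")
```

The length of the returned bytes is what the size check in the next step compares against the threshold.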
smartreceive Handler
function smartreceive(
msg::NATS.Msg;
fileserver_download_handler::Function = _fetch_with_backoff,
max_retries::Int = 5,
base_delay::Int = 100,
max_delay::Int = 5000
)
# Parse envelope
# Iterate through all payloads
# For each payload: check transport type
# If direct: decode Base64 payload
# If link: fetch from URL with exponential backoff using fileserver_download_handler
# Deserialize payload based on type
# Return envelope dictionary with all metadata and deserialized payloads
end
Output Format:
- Returns a dictionary (key-value map) containing all envelope fields:
  - correlation_id, msg_id, timestamp, send_to, msg_purpose, sender_name, sender_id, receiver_name, receiver_id, reply_to, reply_to_msg_id, broker_url
  - metadata - Message-level metadata dictionary
  - payloads - List of (dataname, data, type) tuples, each containing deserialized payload data
Process Flow:
- Parse the JSON envelope to extract all fields
- Iterate through each payload in payloads
- For each payload:
  - Determine transport type (direct or link)
  - If direct: decode Base64 data from the message
  - If link: fetch data from URL using exponential backoff (via fileserver_download_handler)
  - Deserialize based on payload type (dictionary, table, binary, etc.)
- Return envelope dictionary with payloads field containing list of (dataname, data, type) tuples
Note: The fileserver_download_handler receives (url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String) and returns Vector{UInt8}.
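The receive flow can be sketched in Python as follows. receive_envelope is a hypothetical name and takes the envelope as a JSON string rather than a NATS message object, so that the example runs without a broker. Only "text" and "dictionary" are deserialized; every other type is returned as raw bytes in this sketch.

```python
import base64
import json

def receive_envelope(envelope_json, download_handler,
                     max_retries=5, base_delay=100, max_delay=5000):
    """Parse an envelope and return it with payloads as (dataname, data, type) tuples."""
    env = json.loads(envelope_json)
    decoded = []
    for payload in env["payloads"]:
        if payload["transport"] == "direct":
            raw = base64.b64decode(payload["data"])
        elif payload["transport"] == "link":
            # For link transport the data field holds the URL to fetch.
            raw = download_handler(payload["data"], max_retries, base_delay,
                                   max_delay, env["correlation_id"])
        else:
            raise ValueError(f"unknown transport {payload['transport']!r}")
        payload_type = payload["payload_type"]
        if payload_type == "text":
            data = raw.decode("utf-8")
        elif payload_type == "dictionary":
            data = json.loads(raw)
        else:
            data = raw  # tables and media are left as bytes in this sketch
        decoded.append((payload["dataname"], data, payload_type))
    env["payloads"] = decoded
    return env
```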
JavaScript Implementation
Dependencies
- nats.js - Core NATS functionality
- apache-arrow - Arrow IPC serialization
- uuid - Correlation ID generation
smartsend Function
async function smartsend(subject, data, options = {})
// data format: [(dataname, data, type), ...]
// options object should include:
// - natsUrl: NATS server URL
// - fileserverUrl: base URL of the file server
// - sizeThreshold: threshold in bytes for transport selection
// - correlationId: optional correlation ID for tracing
Input Format:
- data - Must be a list of (dataname, data, type) tuples: [(dataname1, data1, "type1"), (dataname2, data2, "type2"), ...]
- Even for single payloads: [(dataname1, data1, "type1")]
- Each payload can have a different type, enabling mixed-content messages
Flow:
- Iterate through the list of (dataname, data, type) tuples
- For each payload: extract the type from the tuple and serialize accordingly
- Check payload size
- If < threshold: publish directly to NATS
- If >= threshold: upload to the HTTP file server, then publish the URL to NATS
smartreceive Handler
async function smartreceive(msg, options = {})
// options object should include:
// - fileserverDownloadHandler: function to fetch data from file server URL
// - max_retries: maximum retry attempts for fetching URL
// - base_delay: initial delay for exponential backoff in ms
// - max_delay: maximum delay for exponential backoff in ms
// - correlationId: optional correlation ID for tracing
Output Format:
- Returns a dictionary (key-value map) containing all envelope fields:
  - correlationId, msgId, timestamp, sendTo, msgPurpose, senderName, senderId, receiverName, receiverId, replyTo, replyToMsgId, brokerURL
  - metadata - Message-level metadata dictionary
  - payloads - List of (dataname, data, type) tuples, each containing deserialized payload data
Process Flow:
- Parse the JSON envelope to extract all fields
- Iterate through each payload in payloads
- For each payload:
  - Determine transport type (direct or link)
  - If direct: decode Base64 data from the message
  - If link: fetch data from URL using exponential backoff
  - Deserialize based on payload type (dictionary, table, binary, etc.)
- Return envelope dictionary with payloads field containing list of (dataname, data, type) tuples
Scenario Implementations
Scenario 1: Command & Control (Small Dictionary)
Julia (Sender/Receiver):
# Subscribe to control subject
# Parse JSON envelope
# Execute simulation with parameters
# Send acknowledgment
JavaScript (Sender/Receiver):
// Create small dictionary config
// Send via smartsend with type="dictionary"
Python/Micropython (Sender/Receiver):
# Create small dictionary config
# Send via smartsend with type="dictionary"
Scenario 2: Deep Dive Analysis (Large Arrow Table)
Julia (Sender/Receiver):
# Create large DataFrame
# Convert to Arrow IPC stream
# Check size (> 1MB)
# Upload to HTTP server
# Publish NATS with URL
JavaScript (Sender/Receiver):
// Receive NATS message with URL
// Fetch data from HTTP server
// Parse Arrow IPC with zero-copy
// Load into Perspective.js or D3
Python/Micropython (Sender/Receiver):
# Create large DataFrame
# Convert to Arrow IPC stream
# Check size (> 1MB)
# Upload to HTTP server
# Publish NATS with URL
Scenario 3: Live Audio Processing
JavaScript (Sender/Receiver):
// Capture audio chunk
// Send as binary with metadata headers
// Use smartsend with type="audio"
Julia (Sender/Receiver):
# Receive audio data
# Perform FFT or AI transcription
# Send results back (JSON + Arrow table)
Python/Micropython (Sender/Receiver):
# Capture audio chunk
# Send as binary with metadata headers
# Use smartsend with type="audio"
Scenario 4: Catch-Up (JetStream)
Julia (Producer/Consumer):
# Publish to JetStream
# Include metadata for temporal tracking
JavaScript (Producer/Consumer):
// Connect to JetStream
// Request replay from last 10 minutes
// Process historical and real-time messages
Python/Micropython (Producer/Consumer):
# Publish to JetStream
# Include metadata for temporal tracking
Scenario 5: Selection (Low Bandwidth)
Focus: Small Arrow tables, cross-platform communication. Action: Any platform sends a small DataFrame to a receiving application, which displays it so the user can choose from it.
Julia (Sender/Receiver):
# Create small DataFrame (e.g., 50KB - 500KB)
# Convert to Arrow IPC stream
# Check payload size (< 1MB threshold)
# Publish directly to NATS with Base64-encoded payload
# Include metadata for dashboard selection context
JavaScript (Sender/Receiver):
// Receive NATS message with direct transport
// Decode Base64 payload
// Parse Arrow IPC with zero-copy
// Load into selection UI component (e.g., dropdown, table)
// User makes selection
// Send selection back to Julia
Python/Micropython (Sender/Receiver):
# Create small DataFrame (e.g., 50KB - 500KB)
# Convert to Arrow IPC stream
# Check payload size (< 1MB threshold)
# Publish directly to NATS with Base64-encoded payload
# Include metadata for dashboard selection context
Use Case: Any server generates a list of available options (e.g., file selections, configuration presets) as a small DataFrame and sends it to any receiving application for user selection. The selection is then sent back to the sender for processing.
Scenario 6: Chat System
Focus: Every conversational message can combine any number of components of any size: text, images, audio, video, tables, and files, ranging from brief snippets to high-resolution images, large audio files, extensive tables, and massive documents. The scenario requires claim-check delivery and full bi-directional messaging across all platforms.
Multi-Payload Support: The system supports mixed-payload messages where a single message can contain multiple payloads with different transport strategies. The smartreceive function iterates through all payloads in the envelope and processes each according to its transport type.
Julia (Sender/Receiver):
# Build chat message with mixed payloads:
# - Text: direct transport (Base64)
# - Small images: direct transport (Base64)
# - Large images: link transport (HTTP URL)
# - Audio/video: link transport (HTTP URL)
# - Tables: direct or link depending on size
# - Files: link transport (HTTP URL)
#
# Each payload uses appropriate transport strategy:
# - Size < 1MB → direct (NATS + Base64)
# - Size >= 1MB → link (HTTP upload + NATS URL)
#
# Include claim-check metadata for delivery tracking
# Support bidirectional messaging with reply_to fields
JavaScript (Sender/Receiver):
// Build chat message with mixed content:
// - User input text: direct transport
// - Selected image: check size, use appropriate transport
// - Audio recording: link transport for large files
// - File attachment: link transport
//
// Parse received message:
// - Direct payloads: decode Base64
// - Link payloads: fetch from HTTP with exponential backoff
// - Deserialize all payloads appropriately
//
// Render mixed content in chat interface
// Support bidirectional reply with claim-check delivery confirmation
Python/Micropython (Sender/Receiver):
# Build chat message with mixed payloads:
# - Text: direct transport (Base64)
# - Small images: direct transport (Base64)
# - Large images: link transport (HTTP URL)
# - Audio/video: link transport (HTTP URL)
# - Tables: direct or link depending on size
# - Files: link transport (HTTP URL)
#
# Each payload uses appropriate transport strategy:
# - Size < 1MB → direct (NATS + Base64)
# - Size >= 1MB → link (HTTP upload + NATS URL)
#
# Include claim-check metadata for delivery tracking
# Support bidirectional messaging with reply_to fields
Use Case: Full-featured chat system supporting rich media. Users can send text and small images directly, while large files are uploaded to the HTTP server and referenced by URL. The Claim-Check pattern ensures reliable delivery tracking for all message components across all platforms.
Implementation Note: The smartreceive function iterates through all payloads in the envelope and processes each according to its transport type. See the standard API format in Section 1: msg_envelope_v1 stores multiple payloads in its payloads::Vector{msg_payload_v1} field.
Performance Considerations
Zero-Copy Reading
- Use Arrow's memory-mapped file reading
- Avoid unnecessary data copying during deserialization
- Use Apache Arrow's native IPC reader
Exponential Backoff
- Implement exponential backoff for HTTP link fetching
- Maximum retry count: 5
- Base delay: 100ms, max delay: 5000ms
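A minimal Python sketch of such a retry loop is shown below. It assumes that the retry count means total attempts and that the delay doubles after each failure until it reaches the cap; neither detail is stated in this document. The fetch and sleep parameters are hypothetical injection points so the loop can be exercised without a network or real waiting.

```python
import time

def fetch_with_backoff(fetch, url, max_retries=5, base_delay=100, max_delay=5000,
                       sleep=time.sleep):
    """Call fetch(url), retrying on failure with exponentially growing delays (ms)."""
    delay = base_delay
    for attempt in range(1, max_retries + 1):
        try:
            return fetch(url)
        except Exception:
            if attempt == max_retries:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay / 1000.0)  # delays are in milliseconds, sleep takes seconds
            delay = min(delay * 2, max_delay)
```

With the values listed above, a fetch that keeps failing waits 100, 200, 400, and 800 ms between its five attempts, so the 5000 ms cap only comes into play with a higher retry count.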
Correlation ID Logging
- Log correlation_id at every stage
- Include: send, receive, serialize, deserialize
- Use structured logging format
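One possible shape for these log lines, sketched in Python with the standard logging module: each stage emits a single JSON object that always carries the correlation_id. log_stage and the choice of JSON lines as the structured format are assumptions for illustration.

```python
import json
import logging

def log_stage(logger, stage, correlation_id, **fields):
    """Emit one JSON log line tagged with the stage and correlation_id."""
    record = {"stage": stage, "correlation_id": correlation_id}
    record.update(fields)
    # sort_keys keeps the output stable, which simplifies searching and diffing logs
    logger.info(json.dumps(record, sort_keys=True))
```

For example, log_stage(logger, "serialize", correlation_id, dataname="login_image", size=15433) lets a single search on the correlation_id reconstruct a message's path through send, serialize, receive, and deserialize.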
Testing Strategy
Unit Tests
- Test smartsend with various payload sizes
- Test smartreceive with direct and link transport
- Test Arrow IPC serialization/deserialization
Integration Tests
- Test full flow with NATS server
- Test large data transfer (> 100MB)
- Test audio processing pipeline
Performance Tests
- Measure throughput for small payloads
- Measure throughput for large payloads