Files
NATSBridge/docs/implementation.md
2026-02-25 20:27:51 +07:00

648 lines
22 KiB
Markdown

# Implementation Guide: Bi-Directional Data Bridge
## Overview
This document describes the implementation of the high-performance, bi-directional data bridge for **Julia** applications using NATS (Core & JetStream), implementing the Claim-Check pattern for large payloads.
The system enables seamless communication for Julia applications.
### Implementation Files
NATSBridge is implemented in Julia:
| Language | Implementation File | Description |
|----------|---------------------|-------------|
| **Julia** | [`src/NATSBridge.jl`](../src/NATSBridge.jl) | Full Julia implementation with Arrow IPC support |
### File Server Handler Architecture
The system uses **handler functions** to abstract file server operations, allowing support for different file server implementations (e.g., Plik, AWS S3, custom HTTP server).
**Handler Function Signatures:**
```julia
# Upload handler - uploads data to file server and returns URL
# The handler is passed to smartsend as fileserver_upload_handler parameter
# It receives: (fileserver_url::String, dataname::String, data::Vector{UInt8})
# Returns: Dict{String, Any} with keys: "status", "uploadid", "fileid", "url"
fileserver_upload_handler(fileserver_url::String, dataname::String, data::Vector{UInt8})::Dict{String, Any}
# Download handler - fetches data from file server URL with exponential backoff
# The handler is passed to smartreceive as fileserver_download_handler parameter
# It receives: (url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String)
# Returns: Vector{UInt8} (the downloaded data)
fileserver_download_handler(url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String)::Vector{UInt8}
```
This design allows the system to support multiple file server backends without changing the core messaging logic.
### Multi-Payload Support (Standard API)
The system uses a **standardized list-of-tuples format** for all payload operations. **Even when sending a single payload, the user must wrap it in a list.**
**API Standard:**
```julia
# Input format for smartsend (always a list of tuples with type info)
[(dataname1, data1, type1), (dataname2, data2, type2), ...]
# Output format for smartreceive (returns a dictionary with payloads field containing list of tuples)
# Returns: Dict with envelope metadata and payloads field containing Vector{Tuple{String, Any, String}}
# {
# "correlation_id": "...",
# "msg_id": "...",
# "timestamp": "...",
# "send_to": "...",
# "msg_purpose": "...",
# "sender_name": "...",
# "sender_id": "...",
# "receiver_name": "...",
# "receiver_id": "...",
# "reply_to": "...",
# "reply_to_msg_id": "...",
# "broker_url": "...",
# "metadata": {...},
# "payloads": [(dataname1, data1, type1), (dataname2, data2, type2), ...]
# }
```
**Supported Types:**
- `"text"` - Plain text
- `"dictionary"` - JSON-serializable dictionaries (Dict, NamedTuple)
- `"table"` - Tabular data (DataFrame, array of structs)
- `"image"` - Image data (Bitmap, PNG/JPG bytes)
- `"audio"` - Audio data (WAV, MP3 bytes)
- `"video"` - Video data (MP4, AVI bytes)
- `"binary"` - Generic binary data (Vector{UInt8})
This design allows per-payload type specification, enabling **mixed-content messages** where different payloads can use different serialization formats in a single message.
**Examples:**
```julia
# Single payload - still wrapped in a list
smartsend(
"/test",
[("dataname1", data1, "dictionary")], # List with one tuple (data, type)
broker_url="nats://localhost:4222",
fileserver_upload_handler=plik_oneshot_upload
)
# Multiple payloads in one message with different types
smartsend(
"/test",
[("dataname1", data1, "dictionary"), ("dataname2", data2, "table")],
broker_url="nats://localhost:4222",
fileserver_upload_handler=plik_oneshot_upload
)
# Mixed content (e.g., chat with text, image, audio)
smartsend(
"/chat",
[
("message_text", "Hello!", "text"),
("user_image", image_data, "image"),
("audio_clip", audio_data, "audio")
],
broker_url="nats://localhost:4222"
)
# Receive returns a dictionary envelope with all metadata and deserialized payloads
env = smartreceive(msg; fileserver_download_handler=_fetch_with_backoff, max_retries=5, base_delay=100, max_delay=5000)
# env["payloads"] = [("dataname1", data1, type1), ("dataname2", data2, type2), ...]
# env["correlation_id"], env["msg_id"], etc.
# env is a dictionary containing envelope metadata and payloads field
```
## Architecture
The Julia implementation follows the Claim-Check pattern:
```
┌─────────────────────────────────────────────────────────────────────────┐
│ SmartSend Function │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ Is payload size < 1MB? │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────┴─────────────────┐
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Direct Path │ │ Link Path │
│ (< 1MB) │ │ (> 1MB) │
│ │ │ │
│ • Serialize to │ │ • Serialize to │
│ Buffer │ │ Buffer │
│ • Base64 encode │ │ • Upload to │
│ • Publish to │ │ HTTP Server │
│ NATS │ │ • Publish to │
│ │ │ NATS with URL │
└─────────────────┘ └─────────────────┘
```
## smartsend Return Value
The `smartsend` function now returns a tuple containing both the envelope object and the JSON string representation:
```julia
env, env_json_str = smartsend(...)
# env::msg_envelope_v1 - The envelope object with all metadata and payloads
# env_json_str::String - JSON string for publishing to NATS
```
**Options:**
- `is_publish::Bool = true` - When `true` (default), the message is automatically published to NATS. When `false`, the function returns the envelope and JSON string without publishing, allowing manual publishing via NATS request-reply pattern.
This enables two use cases:
1. **Programmatic envelope access**: Access envelope fields directly via the `env` object
2. **Direct JSON publishing**: Publish the JSON string directly using NATS request-reply pattern
### Julia Module: [`src/NATSBridge.jl`](../src/NATSBridge.jl)
The Julia implementation provides:
- **[`msg_envelope_v1`](src/NATSBridge.jl)**: Struct for the unified JSON envelope
- **[`msg_payload_v1`](src/NATSBridge.jl)**: Struct for individual payload representation
- **[`smartsend()`](src/NATSBridge.jl)**: Handles transport selection based on payload size
- **[`smartreceive()`](src/NATSBridge.jl)**: Handles both direct and link transport
## Installation
### Julia Dependencies
```julia
using Pkg
Pkg.add("NATS")
Pkg.add("Arrow")
Pkg.add("JSON3")
Pkg.add("HTTP")
Pkg.add("UUIDs")
Pkg.add("Dates")
```
## Usage Tutorial
### Step 1: Start NATS Server
```bash
docker run -p 4222:4222 nats:latest
```
### Step 2: Start HTTP File Server (optional)
```bash
# Create a directory for file uploads
mkdir -p /tmp/fileserver
# Use any HTTP server that supports POST for file uploads
# Example: Python's built-in server
python3 -m http.server 8080 --directory /tmp/fileserver
```
### Step 3: Run Test Scenarios
```bash
# Scenario 1: Command & Control
julia test/scenario1_command_control.jl
# Scenario 2: Large Arrow Table
julia test/scenario2_large_table.jl
# Scenario 3: Julia-to-Julia communication
julia test/scenario3_julia_to_julia.jl
```
## Usage
### Scenario 1: Command & Control (Small Dictionary)
**Focus:** Sending small dictionary configurations. This is the simplest use case for command and control scenarios.
**Julia (Sender/Receiver):**
```julia
using NATSBridge
# Send small dictionary config (wrapped in list with type)
config = Dict("step_size" => 0.01, "iterations" => 1000, "threshold" => 0.5)
env, env_json_str = smartsend(
"control",
[("config", config, "dictionary")],
broker_url="nats://localhost:4222"
)
# env: msg_envelope_v1 with all metadata and payloads
# env_json_str: JSON string for publishing
```
**Julia (Sender/Receiver) with NATS_connection for connection reuse:**
```julia
using NATSBridge
# Create connection once for high-frequency publishing
conn = NATS.connect("nats://localhost:4222")
# Send multiple messages using the same connection (saves connection overhead)
for i in 1:100
config = Dict("iteration" => i, "data" => rand())
smartsend(
"control",
[("config", config, "dictionary")],
NATS_connection=conn, # Reuse connection
is_publish=true
)
end
# Close connection when done
NATS.close(conn)
```
**Use Case:** High-frequency publishing scenarios where connection reuse provides performance benefits by avoiding the overhead of establishing a new NATS connection for each message.
### Basic Multi-Payload Example
#### Julia (Sender)
```julia
using NATSBridge
# Send multiple payloads in one message (type is required per payload)
smartsend(
"/test",
[("dataname1", data1, "dictionary"), ("dataname2", data2, "table")],
broker_url="nats://localhost:4222",
fileserver_url="http://localhost:8080"
)
# Even single payload must be wrapped in a list with type
smartsend("/test", [("single_data", mydata, "dictionary")], broker_url="nats://localhost:4222")
```
#### Julia (Receiver)
```julia
using NATSBridge
# Receive returns a dictionary with envelope metadata and payloads field
env = smartreceive(msg)
# env["payloads"] = [(dataname1, data1, "dictionary"), (dataname2, data2, "table"), ...]
```
### Scenario 2: Deep Dive Analysis (Large Arrow Table)
#### Julia (Sender)
```julia
using Arrow
using DataFrames
# Create large DataFrame
df = DataFrame(
id = 1:10_000_000,
value = rand(10_000_000),
category = rand(["A", "B", "C"], 10_000_000)
)
# Send via smartsend - wrapped in list with type
# Large payload will use link transport (HTTP fileserver)
env, env_json_str = smartsend(
"analysis_results",
[("table_data", df, "table")],
broker_url="nats://localhost:4222",
fileserver_url="http://localhost:8080"
)
# env: msg_envelope_v1 with all metadata and payloads
# env_json_str: JSON string for publishing
```
#### smartsend Function Signature (Julia)
```julia
function smartsend(
subject::String,
data::AbstractArray{Tuple{String, Any, String}, 1}; # List of (dataname, data, type) tuples
broker_url::String = DEFAULT_BROKER_URL, # NATS server URL
fileserver_url = DEFAULT_FILESERVER_URL,
fileserver_upload_handler::Function = plik_oneshot_upload,
size_threshold::Int = DEFAULT_SIZE_THRESHOLD,
correlation_id::Union{String, Nothing} = nothing,
msg_purpose::String = "chat",
sender_name::String = "NATSBridge",
receiver_name::String = "",
receiver_id::String = "",
reply_to::String = "",
reply_to_msg_id::String = "",
is_publish::Bool = true,
NATS_connection::Union{NATS.Connection, Nothing} = nothing # Pre-existing NATS connection (optional)
)
```
**New Keyword Parameter:**
- `NATS_connection::Union{NATS.Connection, Nothing} = nothing` - Pre-existing NATS connection. When provided, `smartsend` uses this connection instead of creating a new one, avoiding the overhead of connection establishment. This is useful for high-frequency publishing scenarios.
**Connection Handling Logic:**
```julia
if is_publish == false
# skip publish
elseif is_publish == true && NATS_connection === nothing
publish_message(broker_url, subject, env_json_str, cid) # Creates new connection
elseif is_publish == true && NATS_connection !== nothing
publish_message(NATS_connection, subject, env_json_str, cid) # Uses provided connection
end
```
**Example with pre-existing connection:**
```julia
using NATSBridge
# Create connection once
conn = NATS.connect("nats://localhost:4222")
# Send multiple messages using the same connection
for i in 1:100
data = rand(1000)
smartsend(
"analysis_results",
[("table_data", data, "table")],
NATS_connection=conn, # Reuse connection
is_publish=true
)
end
# Close connection when done
NATS.close(conn)
```
#### publish_message Function
The `publish_message` function provides two overloads for publishing messages to NATS:
**Overload 1 - URL-based publishing (creates new connection):**
```julia
function publish_message(broker_url::String, subject::String, message::String, correlation_id::String)
conn = NATS.connect(broker_url) # Create NATS connection
publish_message(conn, subject, message, correlation_id)
end
```
**Overload 2 - Connection-based publishing (uses pre-existing connection):**
```julia
function publish_message(conn::NATS.Connection, subject::String, message::String, correlation_id::String)
try
NATS.publish(conn, subject, message) # Publish message to NATS
log_trace(correlation_id, "Message published to $subject")
finally
NATS.drain(conn) # Ensure connection is closed properly
end
end
```
**Use Case:** Use the connection-based overload when you already have an established NATS connection and want to publish multiple messages without the overhead of creating a new connection for each publish.
**Integration with smartsend:**
```julia
# When NATS_connection is provided to smartsend, it uses the connection-based publish_message
env, env_json_str = smartsend(
"my.subject",
[("data", payload_data, "type")],
NATS_connection=my_connection, # Pre-existing connection
is_publish=true
)
# Uses: publish_message(NATS_connection, subject, env_json_str, cid)
# When NATS_connection is not provided, it uses the URL-based publish_message
env, env_json_str = smartsend(
"my.subject",
[("data", payload_data, "type")],
broker_url="nats://localhost:4222",
is_publish=true
)
# Uses: publish_message(broker_url, subject, env_json_str, cid)
```
**API Consistency Note:**
- **Julia:** Uses `NATS_connection` keyword parameter with function overloading for automatic connection management
### Scenario 3: Live Binary Processing
**Julia (Sender/Receiver):**
```julia
using NATSBridge
# Binary data wrapped in list with type
smartsend(
"binary_input",
[("audio_chunk", binary_buffer, "binary")],
broker_url="nats://localhost:4222",
metadata=["sample_rate" => 44100, "channels" => 1]
)
```
### Scenario 4: Catch-Up (JetStream)
**Julia (Producer/Consumer):**
```julia
using NATSBridge
function publish_health_status(broker_url)
# Send status wrapped in list with type
status = Dict("cpu" => rand(), "memory" => rand())
env, env_json_str = smartsend(
"health",
[("status", status, "dictionary")],
broker_url=broker_url
)
sleep(5) # Every 5 seconds
end
```
### Scenario 5: Selection (Low Bandwidth)
**Focus:** Small Arrow tables. The Action: Julia wants to send a small DataFrame to show on a receiving application for the user to choose.
**Julia (Sender/Receiver):**
```julia
using NATSBridge
using DataFrames
# Create small DataFrame (e.g., 50KB - 500KB)
options_df = DataFrame(
id = 1:10,
name = ["Option A", "Option B", "Option C", "Option D", "Option E",
"Option F", "Option G", "Option H", "Option I", "Option J"],
description = ["Description A", "Description B", "Description C", "Description D", "Description E",
"Description F", "Description G", "Description H", "Description I", "Description J"]
)
# Convert to Arrow IPC stream
# Check payload size (< 1MB threshold)
# Publish directly to NATS with Base64-encoded payload
# Include metadata for dashboard selection context
env, env_json_str = smartsend(
"dashboard.selection",
[("options_table", options_df, "table")],
broker_url="nats://localhost:4222",
metadata=Dict("context" => "user_selection")
)
# env: msg_envelope_v1 with all metadata and payloads
# env_json_str: JSON string for publishing
```
**Use Case:** Julia server generates a list of available options (e.g., file selections, configuration presets) as a small DataFrame and sends to a receiving application for user selection. The selection is then sent back to Julia for processing.
### Scenario 6: Chat System
**Focus:** Every conversational message is composed of any number and any combination of components, spanning the full spectrum from small to large. This includes text, images, audio, video, tables, and files—specifically accommodating everything from brief snippets to high-resolution images, large audio files, extensive tables, and massive documents. Support for claim-check delivery and full bi-directional messaging.
**Multi-Payload Support:** The system supports mixed-payload messages where a single message can contain multiple payloads with different transport strategies. The `smartreceive` function iterates through all payloads in the envelope and processes each according to its transport type.
**Julia (Sender/Receiver):**
```julia
using NATSBridge
# Build chat message with mixed payloads:
# - Text: direct transport (Base64)
# - Small images: direct transport (Base64)
# - Large images: link transport (HTTP URL)
# - Audio/video: link transport (HTTP URL)
# - Tables: direct or link depending on size
# - Files: link transport (HTTP URL)
#
# Each payload uses appropriate transport strategy:
# - Size < 1MB → direct (NATS + Base64)
# - Size >= 1MB → link (HTTP upload + NATS URL)
#
# Include claim-check metadata for delivery tracking
# Support bidirectional messaging with replyTo fields
# Example: Chat with text, small image, and large file
chat_message = [
("message_text", "Hello, this is a test message!", "text"),
("user_avatar", image_bytes, "image"), # Small image, direct transport
("large_document", large_file_bytes, "binary") # Large file, link transport
]
env, env_json_str = smartsend(
"chat.room123",
chat_message,
broker_url="nats://localhost:4222",
msg_purpose="chat",
reply_to="chat.room123.responses"
)
# env: msg_envelope_v1 with all metadata and payloads
# env_json_str: JSON string for publishing
```
**Use Case:** Full-featured chat system supporting rich media. User can send text, small images directly, or upload large files that get uploaded to HTTP server and referenced via URLs. Claim-check pattern ensures reliable delivery tracking for all message components.
**Implementation Note:** The `smartreceive` function iterates through all payloads in the envelope and processes each according to its transport type. See the standard API format in Section 1: `msg_envelope_v1` supports `Vector{msg_payload_v1}` for multiple payloads.
## Configuration
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `NATS_URL` | `nats://localhost:4222` | NATS server URL |
| `FILESERVER_URL` | `http://localhost:8080` | HTTP file server URL (base URL without `/upload` suffix) |
| `SIZE_THRESHOLD` | `1_000_000` | Size threshold in bytes (1MB) |
### Message Envelope Schema
```json
{
"correlation_id": "uuid-v4-string",
"msg_id": "uuid-v4-string",
"timestamp": "2024-01-15T10:30:00Z",
"send_to": "topic/subject",
"msg_purpose": "ACK | NACK | updateStatus | shutdown | chat",
"sender_name": "agent-wine-web-frontend",
"sender_id": "uuid4",
"receiver_name": "agent-backend",
"receiver_id": "uuid4",
"reply_to": "topic",
"reply_to_msg_id": "uuid4",
"broker_url": "nats://localhost:4222",
"metadata": {
"content_type": "application/octet-stream",
"content_length": 123456
},
"payloads": [
{
"id": "uuid4",
"dataname": "login_image",
"payload_type": "image",
"transport": "direct",
"encoding": "base64",
"size": 15433,
"data": "base64-encoded-string",
"metadata": {
"checksum": "sha256_hash"
}
}
]
}
```
## Performance Considerations
### Zero-Copy Reading
- Use Arrow's memory-mapped file reading
- Avoid unnecessary data copying during deserialization
- Use Apache Arrow's native IPC reader
### Exponential Backoff
- Maximum retry count: 5
- Base delay: 100ms, max delay: 5000ms
### Correlation ID Logging
- Log correlation_id at every stage
- Include: send, receive, serialize, deserialize
- Use structured logging format
## Testing
Run the test scripts for Julia:
### Julia Tests
```bash
# Text message exchange
julia test/test_julia_to_julia_text_sender.jl
julia test/test_julia_to_julia_text_receiver.jl
# Dictionary exchange
julia test/test_julia_to_julia_dict_sender.jl
julia test/test_julia_to_julia_dict_receiver.jl
# File transfer
julia test/test_julia_to_julia_file_sender.jl
julia test/test_julia_to_julia_file_receiver.jl
# Mixed payload types
julia test/test_julia_to_julia_mix_payloads_sender.jl
julia test/test_julia_to_julia_mix_payloads_receiver.jl
# Table exchange
julia test/test_julia_to_julia_table_sender.jl
julia test/test_julia_to_julia_table_receiver.jl
```
## Troubleshooting
### Common Issues
1. **NATS Connection Failed**
- Ensure NATS server is running
2. **HTTP Upload Failed**
- Ensure file server is running
- Check `fileserver_url` configuration
- Verify upload permissions
3. **Arrow IPC Deserialization Error**
- Ensure data is properly serialized to Arrow format
- Check Arrow version compatibility
## License
MIT