narawat 7433c147c9 update

2026-02-18 20:55:18 +07:00

Architecture Documentation: Bi-Directional Data Bridge (Julia ↔ JavaScript)

Overview

This document describes the architecture for a high-performance, bi-directional data bridge between a Julia service and a JavaScript (Node.js) service using NATS (Core & JetStream), implementing the Claim-Check pattern for large payloads.

File Server Handler Architecture

The system uses handler functions to abstract file server operations, allowing support for different file server implementations (e.g., Plik, AWS S3, custom HTTP server).

Handler Function Signatures:

# Upload handler - uploads data to file server and returns URL
# The handler is passed to smartsend as fileserverUploadHandler parameter
# It receives: (fileserver_url::String, dataname::String, data::Vector{UInt8})
# Returns: Dict{String, Any} with keys: "status", "uploadid", "fileid", "url"
fileserverUploadHandler(fileserver_url::String, dataname::String, data::Vector{UInt8})::Dict{String, Any}

# Download handler - fetches data from file server URL with exponential backoff
# The handler is passed to smartreceive as fileserverDownloadHandler parameter
# It receives: (url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String)
# Returns: Vector{UInt8} (the downloaded data)
fileserverDownloadHandler(url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String)::Vector{UInt8}

This design allows the system to support multiple file server backends without changing the core messaging logic.
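To illustrate the handler contract, here is a minimal JavaScript sketch of an upload/download handler pair backed by an in-memory Map instead of a real file server. The name `makeInMemoryFileServer`, the numeric `status` value, and the choice to ignore the retry arguments are assumptions made for illustration; only the returned keys (`status`, `uploadid`, `fileid`, `url`) and the URL layout come from this document.

```javascript
const { randomUUID } = require("node:crypto");

// Hypothetical in-memory stand-in for a file server (local runs and tests only).
function makeInMemoryFileServer() {
  const store = new Map(); // url -> Uint8Array

  // Mirrors fileserverUploadHandler(fileserver_url, dataname, data).
  async function upload(fileserverUrl, dataname, data) {
    const uploadid = randomUUID();
    const fileid = randomUUID();
    const url = `${fileserverUrl}/file/${uploadid}/${fileid}/${encodeURIComponent(dataname)}`;
    store.set(url, Uint8Array.from(data)); // store a copy of the bytes
    return { status: 200, uploadid, fileid, url }; // status value is an assumption
  }

  // Mirrors fileserverDownloadHandler(url, ...); nothing can fail transiently
  // in memory, so the retry and delay arguments are not used in this sketch.
  async function download(url) {
    if (!store.has(url)) throw new Error(`not found: ${url}`);
    return store.get(url);
  }

  return { upload, download };
}
```

Because the core logic only sees these two functions, a Plik or S3 backend could be swapped in by providing functions with the same shape.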

Multi-Payload Support (Standard API)

The system uses a standardized list-of-tuples format for all payload operations. Even when sending a single payload, the user must wrap it in a list.

API Standard:

# Input format for smartsend (always a list of tuples with type info)
[(dataname1, data1, type1), (dataname2, data2, type2), ...]

# Output format for smartreceive (always returns a list of tuples)
[(dataname1, data1, type1), (dataname2, data2, type2), ...]

Supported Types:

"text" - Plain text
"dictionary" - JSON-serializable dictionaries (Dict, NamedTuple)
"table" - Tabular data (DataFrame, array of structs)
"image" - Image data (Bitmap, PNG/JPG bytes)
"audio" - Audio data (WAV, MP3 bytes)
"video" - Video data (MP4, AVI bytes)
"binary" - Generic binary data (Vector{UInt8})

This design allows per-payload type specification, enabling mixed-content messages where different payloads can use different serialization formats in a single message.
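On the JavaScript side, the list format and the type names above could be checked before sending. The sketch below is only an illustration: `validatePayloadList` is a hypothetical helper, not part of the documented API, and it assumes each tuple is represented as a 3-element array.

```javascript
// Type names copied from the "Supported Types" list above.
const SUPPORTED_TYPES = new Set([
  "text", "dictionary", "table", "image", "audio", "video", "binary",
]);

// Hypothetical helper: rejects anything that is not a list of
// [dataname, data, type] triples with a supported type name.
function validatePayloadList(payloads) {
  if (!Array.isArray(payloads)) {
    throw new TypeError("payloads must be a list, even for a single payload");
  }
  payloads.forEach((entry, i) => {
    if (!Array.isArray(entry) || entry.length !== 3) {
      throw new TypeError(`payload ${i}: expected [dataname, data, type]`);
    }
    const [dataname, , type] = entry;
    if (typeof dataname !== "string" || dataname.length === 0) {
      throw new TypeError(`payload ${i}: dataname must be a non-empty string`);
    }
    if (!SUPPORTED_TYPES.has(type)) {
      throw new TypeError(`payload ${i}: unsupported type "${type}"`);
    }
  });
  return payloads;
}

// A single payload is still wrapped in a list.
validatePayloadList([["message_text", "Hello!", "text"]]);
```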

Examples:

# Single payload - still wrapped in a list
smartsend(
    "/test",
    [("dataname1", data1, "dictionary")],  # List with one (dataname, data, type) tuple
    nats_url="nats://localhost:4222",
    fileserverUploadHandler=plik_oneshot_upload,
    metadata=user_provided_envelope_level_metadata
)

# Multiple payloads in one message with different types
smartsend(
    "/test",
    [("dataname1", data1, "dictionary"), ("dataname2", data2, "table")],
    nats_url="nats://localhost:4222",
    fileserverUploadHandler=plik_oneshot_upload
)

# Mixed content (e.g., chat with text, image, audio)
smartsend(
    "/chat",
    [
        ("message_text", "Hello!", "text"),
        ("user_image", image_data, "image"),
        ("audio_clip", audio_data, "audio")
    ],
    nats_url="nats://localhost:4222"
)

# Receive always returns a list
payloads = smartreceive(msg, fileserverDownloadHandler; max_retries=5, base_delay=100, max_delay=5000)
# payloads = [("dataname1", data1, type1), ("dataname2", data2, type2), ...]

Architecture Diagram

flowchart TD
    subgraph Client
        JS[JavaScript Client]
        JSApp[Application Logic]
    end

    subgraph Server
        Julia[Julia Service]
        NATS[NATS Server]
        FileServer[HTTP File Server]
    end

    JS -->|Control/Small Data| JSApp
    JSApp -->|NATS| NATS
    NATS -->|NATS| Julia
    Julia -->|NATS| NATS
    Julia -->|HTTP POST| FileServer
    JS -->|HTTP GET| FileServer

    style JS fill:#e1f5fe
    style Julia fill:#e8f5e9
    style NATS fill:#fff3e0
    style FileServer fill:#f3e5f5

System Components

1. msgEnvelope_v1 - Message Envelope

The msgEnvelope_v1 structure provides a comprehensive message format for bidirectional communication between Julia and JavaScript services.

Julia Structure:

struct msgEnvelope_v1
  correlationId::String       # Unique identifier to track messages across systems
  msgId::String               # This message id
  timestamp::String           # Message published timestamp
  
  sendTo::String              # Topic/subject the sender sends to
  msgPurpose::String          # Purpose of this message (ACK | NACK | updateStatus | shutdown | ...)
  senderName::String          # Sender name (e.g., "agent-wine-web-frontend")
  senderId::String            # Sender id (uuid4)
  receiverName::String        # Message receiver name (e.g., "agent-backend")
  receiverId::Union{String, Nothing}  # Message receiver id (uuid4, or nothing for broadcast)
  replyTo::String             # Topic to reply to
  replyToMsgId::String        # Message id this message is replying to
  brokerURL::String           # NATS server address
  
  metadata::Dict{String, Any}
  payloads::AbstractArray{msgPayload_v1}  # Multiple payloads stored here
end

JSON Schema:

{
  "correlationId": "uuid-v4-string",
  "msgId": "uuid-v4-string",
  "timestamp": "2024-01-15T10:30:00Z",
  
  "sendTo": "topic/subject",
  "msgPurpose": "ACK | NACK | updateStatus | shutdown | chat",
  "senderName": "agent-wine-web-frontend",
  "senderId": "uuid4",
  "receiverName": "agent-backend",
  "receiverId": "uuid4",
  "replyTo": "topic",
  "replyToMsgId": "uuid4",
  "brokerURL": "nats://localhost:4222",
  
  "metadata": {

  },
  
  "payloads": [
    {
      "id": "uuid4",
      "dataname": "login_image",
      "type": "image",
      "transport": "direct",
      "encoding": "base64",
      "size": 15433,
      "data": "base64-encoded-string",
      "metadata": {

      }
    },
    {
      "id": "uuid4",
      "dataname": "large_data",
      "type": "table",
      "transport": "link",
      "encoding": "none",
      "size": 524288,
      "data": "http://localhost:8080/file/UPLOAD_ID/FILE_ID/data.arrow",
      "metadata": {

      }
    }
  ]
}
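As a rough illustration of how a JavaScript sender might fill these fields, the sketch below builds an object with the same keys as the schema above. `buildEnvelope` is a hypothetical helper, and the defaults for optional fields (empty strings, `null` for a broadcast `receiverId`, the local broker URL) are assumptions rather than documented behavior.

```javascript
const { randomUUID } = require("node:crypto");

// Hypothetical helper: returns an object with the msgEnvelope_v1 keys
// shown in the JSON schema above. Defaults are assumptions.
function buildEnvelope(fields) {
  return {
    correlationId: fields.correlationId ?? randomUUID(),
    msgId: randomUUID(),
    timestamp: new Date().toISOString(), // ISO 8601, UTC

    sendTo: fields.sendTo,
    msgPurpose: fields.msgPurpose,
    senderName: fields.senderName,
    senderId: fields.senderId,
    receiverName: fields.receiverName ?? "",
    receiverId: fields.receiverId ?? null, // null mirrors "nothing for broadcast"
    replyTo: fields.replyTo ?? "",
    replyToMsgId: fields.replyToMsgId ?? "",
    brokerURL: fields.brokerURL ?? "nats://localhost:4222",

    metadata: fields.metadata ?? {},
    payloads: fields.payloads ?? [],
  };
}

const exampleEnvelope = buildEnvelope({
  sendTo: "/chat",
  msgPurpose: "chat",
  senderName: "agent-wine-web-frontend",
  senderId: randomUUID(),
});
const wireMessage = JSON.stringify(exampleEnvelope); // string published on NATS
```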

2. msgPayload_v1 - Payload Structure

The msgPayload_v1 structure provides flexible payload handling for various data types.

Julia Structure:

struct msgPayload_v1
  id::String                    # Id of this payload (e.g., "uuid4")
  dataname::String              # Name of this payload (e.g., "login_image")
  type::String                  # "text | dictionary | table | image | audio | video | binary"
  transport::String             # "direct | link"
  encoding::String              # "none | json | base64 | arrow-ipc"
  size::Integer                 # Data size in bytes
  data::Any                     # Payload data in case of direct transport or a URL in case of link
  metadata::Dict{String, Any}   # Dict("checksum" => "sha256_hash", ...)
end

Key Features:

Supports multiple data types: text, dictionary, table, image, audio, video, binary
Flexible transport: "direct" (NATS) or "link" (HTTP fileserver)
Multiple payloads per message (essential for chat with mixed content)
Per-payload and per-envelope metadata support
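The per-payload metadata example mentions a `"checksum"` entry holding a SHA-256 hash. One way a receiver could use it is sketched below. The helper names are hypothetical, and the sketch assumes the checksum is a hex digest computed over the serialized bytes before Base64 encoding, which this document does not specify.

```javascript
const { createHash } = require("node:crypto");

// Hex-encoded SHA-256 digest of a byte buffer.
function sha256Hex(bytes) {
  return createHash("sha256").update(bytes).digest("hex");
}

// Hypothetical check: compares the received bytes against metadata.checksum.
// A payload without a checksum is accepted (assumption).
function verifyChecksum(payload, bytes) {
  const expected = payload.metadata?.checksum;
  if (expected === undefined) return true;
  return sha256Hex(bytes) === expected;
}
```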

3. Transport Strategy Decision Logic

┌─────────────────────────────────────────────────────────────┐
│                     smartsend Function                      │
│  Accepts: [(dataname1, data1, type1), ...]                  │
│  (No standalone type parameter - type per payload)          │
└─────────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────┐
│  For each payload:                                          │
│  1. Extract type from tuple                                 │
│  2. Serialize based on type                                 │
│  3. Check payload size                                      │
└─────────────────────────────────────────────────────────────┘
                             │
            ┌────────────────┴──────────────────┐
            ▼                                   ▼
      ┌─────────────────┐                 ┌─────────────────┐
      │  Direct Path    │                 │  Link Path      │
      │  (< 1MB)        │                 │  (>= 1MB)       │
      │                 │                 │                 │
      │ • Serialize to  │                 │ • Serialize to  │
      │   IOBuffer      │                 │   IOBuffer      │
      │ • Base64 encode │                 │ • Upload to     │
      │ • Publish to    │                 │   HTTP Server   │
      │   NATS          │                 │ • Publish to    │
      │   (with payload │                 │   NATS with URL │
      │    in envelope) │                 │   (in envelope) │
      └─────────────────┘                 └─────────────────┘
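The size check in the diagram comes down to a single comparison. The sketch below shows it in JavaScript, together with the Base64 size arithmetic for the direct path. `chooseTransport` and `base64Length` are illustrative names, not part of the documented API.

```javascript
const SIZE_THRESHOLD = 1_000_000; // bytes, matching the default size_threshold

// Length of the Base64 text for `byteLength` raw bytes (with padding).
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// "direct" below the threshold, "link" at or above it.
function chooseTransport(byteLength, threshold = SIZE_THRESHOLD) {
  return byteLength < threshold ? "direct" : "link";
}
```

Base64 expands data by about one third, so a payload just under the threshold occupies roughly 1.33 MB inside the envelope. Whether that fits depends on the NATS server's maximum payload setting (1 MB by default), so a deployment may need a lower threshold or a higher server limit.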

4. Julia Module Architecture

graph TD
    subgraph JuliaModule
        smartsendJulia[smartsend Julia]
        SizeCheck[Size Check]
        DirectPath[Direct Path]
        LinkPath[Link Path]
        HTTPClient[HTTP Client]
    end

    smartsendJulia --> SizeCheck
    SizeCheck -->|< 1MB| DirectPath
    SizeCheck -->|>= 1MB| LinkPath
    LinkPath --> HTTPClient

    style JuliaModule fill:#c5e1a5

5. JavaScript Module Architecture

graph TD
    subgraph JSModule
        smartsendJS[smartsend JS]
        smartreceiveJS[smartreceive JS]
        JetStreamConsumer[JetStream Pull Consumer]
        ApacheArrow[Apache Arrow]
    end

    smartsendJS --> NATS
    smartreceiveJS --> JetStreamConsumer
    JetStreamConsumer --> ApacheArrow

    style JSModule fill:#f3e5f5

Implementation Details

Julia Implementation

Dependencies

NATS.jl - Core NATS functionality
Arrow.jl - Arrow IPC serialization
JSON3.jl - JSON parsing
HTTP.jl - HTTP client for file server
Dates (Julia standard library) - Timestamps for logging

smartsend Function

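function smartsend(
  subject::String,
  data::AbstractArray{<:Tuple{String, Any, String}};  # No standalone type parameter
  nats_url::String = "nats://localhost:4222",
  fileserverUploadHandler::Function = plik_oneshot_upload,
  size_threshold::Int = 1_000_000  # 1MB
)
  # Body omitted; the steps are listed under "Flow" below.
end

Input Format:

data::AbstractArray{<:Tuple{String, Any, String}} - Must be a list of (dataname, data, type) tuples: [("dataname1", data1, "type1"), ("dataname2", data2, "type2"), ...]. The <: is needed because Julia arrays are invariant: a literal such as [("dataname1", Dict("a" => 1), "dictionary")] has element type Tuple{String, Dict{String, Int64}, String}, which an AbstractArray{Tuple{String, Any, String}} annotation would reject.
Even for single payloads: [("dataname1", data1, "type1")]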
Each payload can have a different type, enabling mixed-content messages

Flow:

Iterate through the list of (dataname, data, type) tuples
For each payload: extract the type from the tuple and serialize accordingly
Check payload size
If < threshold: publish directly to NATS with Base64-encoded payload
If >= threshold: upload to HTTP server, publish NATS with URL

smartreceive Handler

function smartreceive(
    msg::NATS.Message,
    fileserverDownloadHandler::Function;
    max_retries::Int = 5,
    base_delay::Int = 100,   # initial backoff delay in ms
    max_delay::Int = 5000    # upper bound on the backoff delay in ms
)
    # Parse envelope
    # Iterate through all payloads
    # For each payload: check transport type
    #   If direct: decode Base64 payload
    #   If link: fetch from URL with exponential backoff using fileserverDownloadHandler
    # Deserialize payload based on type
    # Return list of (dataname, data, type) tuples
end

Output Format:

Always returns a list of tuples: [(dataname1, data1, type1), (dataname2, data2, type2), ...]
Even for single payloads: [(dataname1, data1, type1)]

Process Flow:

Parse the JSON envelope to extract the payloads array
Iterate through each payload in payloads
For each payload:
- Determine transport type (direct or link)
- If direct: decode Base64 data from the message
- If link: fetch data from URL using exponential backoff (via fileserverDownloadHandler)
- Deserialize based on payload type (dictionary, table, binary, etc.)
Return list of (dataname, data, type) tuples

Note: The fileserverDownloadHandler receives (url::String, max_retries::Int, base_delay::Int, max_delay::Int, correlation_id::String) and returns Vector{UInt8}.

JavaScript Implementation

Dependencies

nats.js - Core NATS functionality
apache-arrow - Arrow IPC serialization
uuid - Correlation ID generation

smartsend Function

async function smartsend(subject, data, options = {}) {
    // data format: [[dataname, data, type], ...]
    // (JavaScript has no tuple type, so each entry is a 3-element array)
    // options object should include:
    // - natsUrl: NATS server URL
    // - fileserverUrl: base URL of the file server
    // - sizeThreshold: threshold in bytes for transport selection
    // - correlationId: optional correlation ID for tracing
    // Body omitted; the steps are listed under "Flow" below.
}

Input Format:

data - Must be a list of [dataname, data, type] arrays, the JavaScript counterpart of the Julia tuples: [[dataname1, data1, "type1"], [dataname2, data2, "type2"], ...]
Even for single payloads: [[dataname1, data1, "type1"]]
Each payload can have a different type, enabling mixed-content messages

Flow:

Iterate through the list of (dataname, data, type) tuples
For each payload: extract the type from the tuple and serialize accordingly
Check payload size
If < threshold: publish directly to NATS with Base64-encoded payload
If >= threshold: upload to HTTP server, publish NATS with URL
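The flow above can be sketched for a single payload as follows. This is a simplified illustration, not the module's implementation: it covers only the "text", "dictionary", and "binary" types, and `serialize`, `buildPayload`, and the `upload` callback (an async function returning a URL) are hypothetical names.

```javascript
const { randomUUID } = require("node:crypto");

// Serialize a value to bytes; only three of the seven types are covered here.
function serialize(data, type) {
  switch (type) {
    case "text": return Buffer.from(String(data), "utf8");
    case "dictionary": return Buffer.from(JSON.stringify(data), "utf8");
    case "binary": return Buffer.from(data);
    default: throw new Error(`type "${type}" is not covered by this sketch`);
  }
}

// Build one msgPayload_v1 entry. `upload` is a hypothetical async
// (dataname, bytes) => url function wrapping the file server.
async function buildPayload(dataname, data, type, { sizeThreshold = 1_000_000, upload } = {}) {
  const bytes = serialize(data, type);
  const base = { id: randomUUID(), dataname, type, size: bytes.length, metadata: {} };
  if (bytes.length < sizeThreshold) {
    // Direct path: the Base64 text travels inside the envelope.
    return { ...base, transport: "direct", encoding: "base64", data: bytes.toString("base64") };
  }
  // Link path: upload the bytes and carry only the URL.
  const url = await upload(dataname, bytes);
  return { ...base, transport: "link", encoding: "none", data: url };
}
```

A full sender would map this over every entry of the payload list, place the results in the envelope's `payloads` array, and publish the JSON envelope on the subject.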

smartreceive Handler

async function smartreceive(msg, options = {}) {
    // options object should include:
    // - fileserverDownloadHandler: function to fetch data from file server URL
    // - max_retries: maximum retry attempts for fetching URL
    // - base_delay: initial delay for exponential backoff in ms
    // - max_delay: maximum delay for exponential backoff in ms
    // - correlationId: optional correlation ID for tracing
    // Body omitted; the steps are listed under "Process Flow" below.
}

Process Flow:

Parse the JSON envelope to extract the payloads array
Iterate through each payload in payloads
For each payload:
- Determine transport type (direct or link)
- If direct: decode Base64 data from the message
- If link: fetch data from URL using exponential backoff
- Deserialize based on payload type (dictionary, table, binary, etc.)
Return a list of [dataname, data, type] arrays
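The process flow above might look roughly like the following. This is a sketch under stated assumptions: `receivePayloads` and `deserialize` are hypothetical names, `download` stands in for the download handler (its retry behavior is left out), and "table" payloads, which would go through an Arrow IPC reader, are returned as raw bytes.

```javascript
// Inverse of serialization for the types this sketch understands.
function deserialize(bytes, type) {
  switch (type) {
    case "text": return Buffer.from(bytes).toString("utf8");
    case "dictionary": return JSON.parse(Buffer.from(bytes).toString("utf8"));
    default: return bytes; // binary, image, audio, video, table: raw bytes here
  }
}

// envelopeJson: the JSON string carried by the NATS message.
// download: hypothetical async (url) => bytes function.
async function receivePayloads(envelopeJson, download) {
  const envelope = JSON.parse(envelopeJson);
  const results = [];
  for (const p of envelope.payloads) {
    let bytes;
    if (p.transport === "direct") {
      bytes = Buffer.from(p.data, "base64");
    } else if (p.transport === "link") {
      bytes = await download(p.data); // p.data holds the URL
    } else {
      throw new Error(`unknown transport "${p.transport}"`);
    }
    results.push([p.dataname, deserialize(bytes, p.type), p.type]);
  }
  return results;
}
```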

Scenario Implementations

Scenario 1: Command & Control (Small Dictionary)

Julia (Receiver):

# Subscribe to control subject
# Parse JSON envelope
# Execute simulation with parameters
# Send acknowledgment

JavaScript (Sender):

// Create small dictionary config
// Send via smartsend with type="dictionary"

Scenario 2: Deep Dive Analysis (Large Arrow Table)

Julia (Sender):

# Create large DataFrame
# Convert to Arrow IPC stream
# Check size (> 1MB)
# Upload to HTTP server
# Publish NATS with URL

JavaScript (Receiver):

// Receive NATS message with URL
// Fetch data from HTTP server
// Parse Arrow IPC with zero-copy
// Load into Perspective.js or D3

Scenario 3: Live Audio Processing

JavaScript (Sender):

// Capture audio chunk
// Send as binary with metadata headers
// Use smartsend with type="audio"

Julia (Receiver):

# Receive audio data
# Perform FFT or AI transcription
# Send results back (JSON + Arrow table)

Scenario 4: Catch-Up (JetStream)

Julia (Producer):

# Publish to JetStream
# Include metadata for temporal tracking

JavaScript (Consumer):

// Connect to JetStream
// Request replay from last 10 minutes
// Process historical and real-time messages

Scenario 5: Selection (Low Bandwidth)

Focus: Small Arrow tables, Julia to JavaScript. Action: Julia sends a small DataFrame to a JavaScript dashboard, where the user selects from the options shown.

Julia (Sender):

# Create small DataFrame (e.g., 50KB - 500KB)
# Convert to Arrow IPC stream
# Check payload size (< 1MB threshold)
# Publish directly to NATS with Base64-encoded payload
# Include metadata for dashboard selection context

JavaScript (Receiver):

// Receive NATS message with direct transport
// Decode Base64 payload
// Parse Arrow IPC with zero-copy
// Load into selection UI component (e.g., dropdown, table)
// User makes selection
// Send selection back to Julia

Use Case: The Julia server generates a list of available options (e.g., file selections, configuration presets) as a small DataFrame and sends it to the JavaScript dashboard for user selection. The selection is then sent back to Julia for processing.

Scenario 6: Chat System

Focus: Every chat message is composed of any number and combination of components: text, images, audio, video, tables, and files. Components range from brief text snippets to high-resolution images, large audio files, extensive tables, and massive documents. The scenario requires claim-check delivery and full bi-directional messaging.

Multi-Payload Support: The system supports mixed-payload messages where a single message can contain multiple payloads with different transport strategies. The smartreceive function iterates through all payloads in the envelope and processes each according to its transport type.

Julia (Sender/Receiver):

# Build chat message with mixed payloads:
# - Text: direct transport (Base64)
# - Small images: direct transport (Base64)
# - Large images: link transport (HTTP URL)
# - Audio/video: link transport (HTTP URL)
# - Tables: direct or link depending on size
# - Files: link transport (HTTP URL)
# 
# Each payload uses appropriate transport strategy:
# - Size < 1MB → direct (NATS + Base64)
# - Size >= 1MB → link (HTTP upload + NATS URL)
# 
# Include claim-check metadata for delivery tracking
# Support bidirectional messaging with replyTo fields

JavaScript (Sender/Receiver):

// Build chat message with mixed content:
// - User input text: direct transport
// - Selected image: check size, use appropriate transport
// - Audio recording: link transport for large files
// - File attachment: link transport
// 
// Parse received message:
// - Direct payloads: decode Base64
// - Link payloads: fetch from HTTP with exponential backoff
// - Deserialize all payloads appropriately
// 
// Render mixed content in chat interface
// Support bidirectional reply with claim-check delivery confirmation

Use Case: Full-featured chat system supporting rich media. Users can send text and small images directly, while large files are uploaded to the HTTP server and referenced by URL. The Claim-Check pattern keeps large components off NATS while the envelope still references every component, so delivery can be tracked per message.

Implementation Note: Mixed-payload messages are possible because msgEnvelope_v1 (System Components, item 1) stores its payloads as AbstractArray{msgPayload_v1}. The list-of-tuples format is described under Multi-Payload Support (Standard API).

Performance Considerations

Zero-Copy Reading

Use Arrow's memory-mapped file reading
Avoid unnecessary data copying during deserialization
Use Apache Arrow's native IPC reader

Exponential Backoff

Implement exponential backoff for HTTP link fetching
Maximum retry count: 5
Base delay: 100ms, max delay: 5000ms
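The retry parameters above can be turned into a delay schedule as sketched below. The doubling factor, the absence of jitter, and the reading of "maximum retry count" as retries after the first attempt are assumptions; `backoffDelay` and `fetchWithBackoff` are illustrative names, and `fetchOnce` stands for whatever performs a single HTTP GET.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxDelay.
function backoffDelay(attempt, baseDelay = 100, maxDelay = 5000) {
  return Math.min(maxDelay, baseDelay * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls fetchOnce until it succeeds or maxRetries retries are used up.
// sleepFn is injectable so tests do not have to wait in real time.
async function fetchWithBackoff(fetchOnce, options = {}) {
  const { maxRetries = 5, baseDelay = 100, maxDelay = 5000, sleepFn = sleep } = options;
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fetchOnce();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        await sleepFn(backoffDelay(attempt, baseDelay, maxDelay));
      }
    }
  }
  throw lastError;
}
```

With the defaults, the waits between attempts are 100, 200, 400, 800, and 1600 ms, so the 5000 ms cap only applies when more retries or a larger base delay are configured.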

Correlation ID Logging

Log correlation_id at every stage
Include: send, receive, serialize, deserialize
Use structured logging format
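A structured log line could be as simple as one JSON object per event. The field names in this sketch (`timestamp`, `correlation_id`, `stage`) and the helper `logEvent` are assumptions, since this document does not fix a log schema.

```javascript
// Emits one JSON object per line and returns the line for inspection.
// `write` defaults to console.log and can be replaced by any sink.
function logEvent(stage, correlationId, fields = {}, write = console.log) {
  const line = JSON.stringify({
    timestamp: new Date().toISOString(),
    correlation_id: correlationId,
    stage, // e.g. "send", "receive", "serialize", "deserialize"
    ...fields,
  });
  write(line);
  return line;
}
```

Calling `logEvent("serialize", correlationId, { dataname: "large_data", size: 524288 })` at each stage makes it possible to filter a whole message journey by its correlation ID.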

Testing Strategy

Unit Tests

Test smartsend with various payload sizes
Test smartreceive with direct and link transport
Test Arrow IPC serialization/deserialization

Integration Tests

Test full flow with NATS server
Test large data transfer (> 100MB)
Test audio processing pipeline

Performance Tests

Measure throughput for small payloads
Measure throughput for large payloads
