# Architecture Documentation: Bi-Directional Data Bridge (Julia ↔ JavaScript)

## Overview

This document describes the architecture of a high-performance, bi-directional data bridge between a Julia service and a JavaScript (Node.js) service. The bridge uses NATS (Core and JetStream) for messaging and applies the Claim-Check pattern to large payloads: the payload is uploaded to an HTTP file server and only its URL travels over NATS.

## Architecture Diagram

```mermaid
flowchart TD
    subgraph Client
        JS[JavaScript Client]
        JSApp[Application Logic]
    end

    subgraph Server
        Julia[Julia Service]
        NATS[NATS Server]
        FileServer[HTTP File Server]
    end

    JS -->|Control/Small Data| JSApp
    JSApp -->|NATS| NATS
    NATS -->|NATS| Julia
    Julia -->|NATS| NATS
    Julia -->|HTTP POST| FileServer
    JS -->|HTTP GET| FileServer

    style JS fill:#e1f5fe
    style Julia fill:#e8f5e9
    style NATS fill:#fff3e0
    style FileServer fill:#f3e5f5
```

## System Components

### 1. Unified JSON Envelope Schema

All messages use a standardized envelope format:

```json
{
  "correlation_id": "uuid-v4-string",
  "type": "json|table|binary",
  "transport": "direct|link",
  "payload": "base64-encoded-string",
  "url": "http://fileserver/path/to/data",
  "metadata": {
    "content_type": "application/octet-stream",
    "content_length": 123456,
    "format": "arrow_ipc_stream"
  }
}
```

`type` is one of `json`, `table`, or `binary`, and `transport` is one of `direct` or `link`. The `payload` field is present only when `transport` is `direct`; the `url` field is present only when `transport` is `link`.

### 2. Transport Strategy Decision Logic

```
┌─────────────────────────────────────────────────────────────┐
│                     SmartSend Function                      │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                   Is payload size < 1MB?                    │
└─────────────────────────────────────────────────────────────┘
                               │
             ┌─────────────────┴─────────────────┐
             ▼                                   ▼
    ┌─────────────────┐                 ┌─────────────────┐
    │   Direct Path   │                 │    Link Path    │
    │     (< 1MB)     │                 │    (>= 1MB)     │
    │                 │                 │                 │
    │ • Serialize to  │                 │ • Serialize to  │
    │   IOBuffer      │                 │   IOBuffer      │
    │ • Base64 encode │                 │ • Upload to     │
    │ • Publish to    │                 │   HTTP Server   │
    │   NATS          │                 │ • Publish to    │
    │                 │                 │   NATS with URL │
    └─────────────────┘                 └─────────────────┘
```

### 3. Julia Module Architecture

```mermaid
graph TD
    subgraph JuliaModule
        SmartSendJulia[SmartSend Julia]
        SizeCheck[Size Check]
        DirectPath[Direct Path]
        LinkPath[Link Path]
        HTTPClient[HTTP Client]
    end

    SmartSendJulia --> SizeCheck
    SizeCheck -->|< 1MB| DirectPath
    SizeCheck -->|>= 1MB| LinkPath
    LinkPath --> HTTPClient

    style JuliaModule fill:#c5e1a5
```

### 4. JavaScript Module Architecture

```mermaid
graph TD
    subgraph JSModule
        SmartSendJS[SmartSend JS]
        SmartReceiveJS[SmartReceive JS]
        JetStreamConsumer[JetStream Pull Consumer]
        ApacheArrow[Apache Arrow]
    end

    SmartSendJS --> NATS
    SmartReceiveJS --> JetStreamConsumer
    JetStreamConsumer --> ApacheArrow

    style JSModule fill:#f3e5f5
```

## Implementation Details

### Julia Implementation

#### Dependencies

- `NATS.jl` - Core NATS functionality
- `Arrow.jl` - Arrow IPC serialization
- `JSON3.jl` - JSON parsing
- `HTTP.jl` - HTTP client for file server
- `Dates` (standard library) - Timestamps for logging

#### SmartSend Function

```julia
function SmartSend(
    subject::String,
    data::Any,
    type::String = "json";
    nats_url::String = "nats://localhost:4222",
    fileserver_url::String = "http://localhost:8080/upload",
    size_threshold::Int = 1_000_000  # 1MB
)
    # Implementation follows the flow below
end
```

**Flow:**

1. Serialize data to an Arrow IPC stream (if `type` is `table`)
2. Check the serialized payload size
3. If < threshold: publish directly to NATS with a Base64-encoded payload
4. If >= threshold: upload to the HTTP server, then publish a NATS message containing the URL

#### SmartReceive Handler

```julia
function SmartReceive(msg::NATS.Message)
    # Parse envelope
    # Check transport type
    # If direct: decode Base64 payload
    # If link: fetch from URL with exponential backoff
    # Deserialize Arrow IPC to DataFrame
end
```

### JavaScript Implementation

#### Dependencies

- `nats.js` - Core NATS functionality
- `apache-arrow` - Arrow IPC serialization
- `uuid` - Correlation ID generation

#### SmartSend Function

```javascript
async function SmartSend(subject, data, type = 'json', options = {})
```

**Flow:**

1. Serialize data to an Arrow IPC buffer (if `type` is `table`)
2. Check the serialized payload size
3. If < threshold: publish directly to NATS with a Base64-encoded payload
4. If >= threshold: upload to the HTTP server, then publish a NATS message containing the URL

#### SmartReceive Handler

```javascript
async function SmartReceive(msg, options = {})
```

**Flow:**

1. Parse envelope
2. Check transport type
3. If direct: decode Base64 payload
4. If link: fetch from the URL with exponential backoff
5. Deserialize Arrow IPC with zero-copy

## Scenario Implementations

### Scenario 1: Command & Control (Small JSON)

**Julia (Receiver):**

```julia
# Subscribe to control subject
# Parse JSON envelope
# Execute simulation with parameters
# Send acknowledgment
```

**JavaScript (Sender):**

```javascript
// Create small JSON config
// Send via SmartSend with type="json"
```

### Scenario 2: Deep Dive Analysis (Large Arrow Table)

**Julia (Sender):**

```julia
# Create large DataFrame
# Convert to Arrow IPC stream
# Check size (>= 1MB)
# Upload to HTTP server
# Publish NATS with URL
```

**JavaScript (Receiver):**

```javascript
// Receive NATS message with URL
// Fetch data from HTTP server
// Parse Arrow IPC with zero-copy
// Load into Perspective.js or D3
```

### Scenario 3: Live Audio Processing

**JavaScript (Sender):**

```javascript
// Capture audio chunk
// Send as binary with metadata headers
// Use SmartSend with type="binary"
```

**Julia (Receiver):**

```julia
# Receive audio data
# Perform FFT or AI transcription
# Send results back (JSON + Arrow table)
```

### Scenario 4: Catch-Up (JetStream)

**Julia (Producer):**

```julia
# Publish to JetStream
# Include metadata for temporal tracking
```

**JavaScript (Consumer):**

```javascript
// Connect to JetStream
// Request replay from last 10 minutes
// Process historical and real-time messages
```

## Performance Considerations

### Zero-Copy Reading

- Use Arrow's memory-mapped file reading
- Avoid unnecessary data copying during deserialization
- Use Apache Arrow's native IPC reader

### Exponential Backoff

- Implement exponential backoff for HTTP link fetching
- Maximum retry count: 5
- Base delay: 100ms, max delay: 5000ms

### Correlation ID Logging

- Log `correlation_id` at every stage
- Include: send, receive, serialize, deserialize
- Use structured logging format

## Testing Strategy

### Unit Tests

- Test SmartSend with various payload sizes
- Test SmartReceive with direct and link transport
- Test Arrow IPC serialization/deserialization

### Integration Tests

- Test full flow with NATS server
- Test large data transfer (> 100MB)
- Test audio processing pipeline

### Performance Tests

- Measure throughput for small payloads
- Measure throughput for large payloads
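
To support the unit tests above without a running NATS server or file server, the size-threshold decision and the retry schedule can be isolated as pure functions. The following is a minimal sketch, not the bridge's actual implementation: `chooseTransport`, `buildEnvelope`, and `backoffDelays` are hypothetical helper names, Node's built-in `crypto.randomUUID` stands in for the `uuid` package, and the caller is assumed to have already uploaded the payload and obtained its URL before building a `link` envelope.

```javascript
// Illustrative sketch only: the helper names below are hypothetical and are
// not part of the bridge's documented API.
const { randomUUID } = require("node:crypto");

// 1MB default, matching size_threshold in the Julia SmartSend signature.
const SIZE_THRESHOLD = 1_000_000;

// Pick the transport for a serialized payload of `byteLength` bytes.
function chooseTransport(byteLength, threshold = SIZE_THRESHOLD) {
  return byteLength < threshold ? "direct" : "link";
}

// Build an envelope that follows the schema in this document.
// For the link path, `url` is assumed to come from a prior upload step.
function buildEnvelope(bytes, type, options = {}) {
  const { url, format, threshold = SIZE_THRESHOLD } = options;
  const transport = chooseTransport(bytes.length, threshold);
  if (transport === "link" && !url) {
    throw new Error("link transport requires the URL of the uploaded payload");
  }

  const envelope = {
    correlation_id: randomUUID(),
    type,
    transport,
    metadata: {
      content_type: "application/octet-stream",
      content_length: bytes.length,
    },
  };
  if (format) envelope.metadata.format = format;

  if (transport === "direct") {
    // Small payloads travel inline, Base64-encoded.
    envelope.payload = Buffer.from(bytes).toString("base64");
  } else {
    // Large payloads travel as a claim check: only the URL is published.
    envelope.url = url;
  }
  return envelope;
}

// Delay before each retry: baseMs * 2^attempt, capped at maxMs.
function backoffDelays({ retries = 5, baseMs = 100, maxMs = 5000 } = {}) {
  return Array.from({ length: retries }, (_, attempt) =>
    Math.min(baseMs * 2 ** attempt, maxMs)
  );
}
```

With the parameters listed under Exponential Backoff, `backoffDelays()` yields delays of 100, 200, 400, 800, and 1600 ms, so the 5000 ms cap only takes effect if the retry count is raised to seven or more.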