295 lines
8.1 KiB
Markdown
295 lines
8.1 KiB
Markdown
# Architecture Documentation: Bi-Directional Data Bridge (Julia ↔ JavaScript)
|
|
|
|
## Overview
|
|
|
|
This document describes the architecture for a high-performance, bi-directional data bridge between a Julia service and a JavaScript (Node.js) service using NATS (Core & JetStream), implementing the Claim-Check pattern for large payloads.
|
|
|
|
## Architecture Diagram
|
|
|
|
```mermaid
|
|
flowchart TD
|
|
subgraph Client
|
|
JS[JavaScript Client]
|
|
JSApp[Application Logic]
|
|
end
|
|
|
|
subgraph Server
|
|
Julia[Julia Service]
|
|
NATS[NATS Server]
|
|
FileServer[HTTP File Server]
|
|
end
|
|
|
|
JS -->|Control/Small Data| JSApp
|
|
JSApp -->|NATS| NATS
|
|
NATS -->|NATS| Julia
|
|
Julia -->|NATS| NATS
|
|
Julia -->|HTTP POST| FileServer
|
|
JS -->|HTTP GET| FileServer
|
|
|
|
style JS fill:#e1f5fe
|
|
style Julia fill:#e8f5e9
|
|
style NATS fill:#fff3e0
|
|
style FileServer fill:#f3e5f5
|
|
```
|
|
|
|
## System Components
|
|
|
|
### 1. Unified JSON Envelope Schema
|
|
|
|
All messages use a standardized envelope format:
|
|
|
|
```json
|
|
{
|
|
"correlation_id": "uuid-v4-string",
|
|
"type": "json|table|binary",
|
|
"transport": "direct|link",
|
|
"payload": "base64-encoded-string", // Only if transport=direct
|
|
"url": "http://fileserver/path/to/data", // Only if transport=link
|
|
"metadata": {
|
|
"content_type": "application/octet-stream",
|
|
"content_length": 123456,
|
|
"format": "arrow_ipc_stream"
|
|
}
|
|
}
|
|
```
|
|
|
|
### 2. Transport Strategy Decision Logic
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────┐
|
|
│ SmartSend Function │
|
|
└─────────────────────────────────────────────────────────────┘
|
|
│
|
|
▼
|
|
┌─────────────────────────────────────────────────────────────┐
|
|
│ Is payload size < 1MB? │
|
|
└─────────────────────────────────────────────────────────────┘
|
|
│
|
|
┌─────────────────┴─────────────────┐
|
|
▼ ▼
|
|
┌─────────────────┐ ┌─────────────────┐
|
|
│ Direct Path │ │ Link Path │
|
|
│ (< 1MB) │ │ (> 1MB) │
|
|
│ │ │ │
|
|
│ • Serialize to │ │ • Serialize to │
|
|
│ IOBuffer │ │ IOBuffer │
|
|
│ • Base64 encode │ │ • Upload to │
|
|
│ • Publish to │ │ HTTP Server │
|
|
│ NATS │ │ • Publish to │
|
|
│ │ │ NATS with URL │
|
|
└─────────────────┘ └─────────────────┘
|
|
```
|
|
|
|
### 3. Julia Module Architecture
|
|
|
|
```mermaid
|
|
graph TD
|
|
subgraph JuliaModule
|
|
SmartSendJulia[SmartSend Julia]
|
|
SizeCheck[Size Check]
|
|
DirectPath[Direct Path]
|
|
LinkPath[Link Path]
|
|
HTTPClient[HTTP Client]
|
|
end
|
|
|
|
SmartSendJulia --> SizeCheck
|
|
SizeCheck -->|< 1MB| DirectPath
|
|
SizeCheck -->|>= 1MB| LinkPath
|
|
LinkPath --> HTTPClient
|
|
|
|
style JuliaModule fill:#c5e1a5
|
|
```
|
|
|
|
### 4. JavaScript Module Architecture
|
|
|
|
```mermaid
|
|
graph TD
|
|
subgraph JSModule
|
|
SmartSendJS[SmartSend JS]
|
|
SmartReceiveJS[SmartReceive JS]
|
|
JetStreamConsumer[JetStream Pull Consumer]
|
|
ApacheArrow[Apache Arrow]
|
|
end
|
|
|
|
SmartSendJS --> NATS
|
|
SmartReceiveJS --> JetStreamConsumer
|
|
JetStreamConsumer --> ApacheArrow
|
|
|
|
style JSModule fill:#f3e5f5
|
|
```
|
|
|
|
## Implementation Details
|
|
|
|
### Julia Implementation
|
|
|
|
#### Dependencies
|
|
- `NATS.jl` - Core NATS functionality
|
|
- `Arrow.jl` - Arrow IPC serialization
|
|
- `JSON3.jl` - JSON parsing
|
|
- `HTTP.jl` - HTTP client for file server
|
|
- `Dates.jl` - Timestamps for logging
|
|
|
|
#### SmartSend Function
|
|
|
|
```julia
|
|
function SmartSend(
|
|
subject::String,
|
|
data::Any,
|
|
type::String = "json";
|
|
nats_url::String = "nats://localhost:4222",
|
|
fileserver_url::String = "http://localhost:8080/upload",
|
|
size_threshold::Int = 1_000_000 # 1MB
|
|
)
|
|
```
|
|
|
|
**Flow:**
|
|
1. Serialize data to Arrow IPC stream (if table)
|
|
2. Check payload size
|
|
3. If < threshold: publish directly to NATS with Base64-encoded payload
|
|
4. If >= threshold: upload to HTTP server, publish NATS with URL
|
|
|
|
#### SmartReceive Handler
|
|
|
|
```julia
|
|
function SmartReceive(msg::NATS.Message)
|
|
# Parse envelope
|
|
# Check transport type
|
|
# If direct: decode Base64 payload
|
|
# If link: fetch from URL with exponential backoff
|
|
# Deserialize Arrow IPC to DataFrame
|
|
end
|
|
```
|
|
|
|
### JavaScript Implementation
|
|
|
|
#### Dependencies
|
|
- `nats.js` - Core NATS functionality
|
|
- `apache-arrow` - Arrow IPC serialization
|
|
- `uuid` - Correlation ID generation
|
|
|
|
#### SmartSend Function
|
|
|
|
```javascript
|
|
async function SmartSend(subject, data, type = 'json', options = {})
|
|
```
|
|
|
|
**Flow:**
|
|
1. Serialize data to Arrow IPC buffer (if table)
|
|
2. Check payload size
|
|
3. If < threshold: publish directly to NATS
|
|
4. If >= threshold: upload to HTTP server, publish NATS with URL
|
|
|
|
#### SmartReceive Handler
|
|
|
|
```javascript
|
|
async function SmartReceive(msg, options = {})
|
|
```
|
|
|
|
**Flow:**
|
|
1. Parse envelope
|
|
2. Check transport type
|
|
3. If direct: decode Base64 payload
|
|
4. If link: fetch with exponential backoff
|
|
5. Deserialize Arrow IPC with zero-copy
|
|
|
|
## Scenario Implementations
|
|
|
|
### Scenario 1: Command & Control (Small JSON)
|
|
|
|
**Julia (Receiver):**
|
|
```julia
|
|
# Subscribe to control subject
|
|
# Parse JSON envelope
|
|
# Execute simulation with parameters
|
|
# Send acknowledgment
|
|
```
|
|
|
|
**JavaScript (Sender):**
|
|
```javascript
|
|
// Create small JSON config
|
|
// Send via SmartSend with type="json"
|
|
```
|
|
|
|
### Scenario 2: Deep Dive Analysis (Large Arrow Table)
|
|
|
|
**Julia (Sender):**
|
|
```julia
|
|
# Create large DataFrame
|
|
# Convert to Arrow IPC stream
|
|
# Check size (> 1MB)
|
|
# Upload to HTTP server
|
|
# Publish NATS with URL
|
|
```
|
|
|
|
**JavaScript (Receiver):**
|
|
```javascript
|
|
// Receive NATS message with URL
|
|
// Fetch data from HTTP server
|
|
// Parse Arrow IPC with zero-copy
|
|
// Load into Perspective.js or D3
|
|
```
|
|
|
|
### Scenario 3: Live Audio Processing
|
|
|
|
**JavaScript (Sender):**
|
|
```javascript
|
|
// Capture audio chunk
|
|
// Send as binary with metadata headers
|
|
// Use SmartSend with type="audio"
|
|
```
|
|
|
|
**Julia (Receiver):**
|
|
```julia
|
|
// Receive audio data
|
|
// Perform FFT or AI transcription
|
|
// Send results back (JSON + Arrow table)
|
|
```
|
|
|
|
### Scenario 4: Catch-Up (JetStream)
|
|
|
|
**Julia (Producer):**
|
|
```julia
|
|
# Publish to JetStream
|
|
# Include metadata for temporal tracking
|
|
```
|
|
|
|
**JavaScript (Consumer):**
|
|
```javascript
|
|
// Connect to JetStream
|
|
// Request replay from last 10 minutes
|
|
// Process historical and real-time messages
|
|
```
|
|
|
|
## Performance Considerations
|
|
|
|
### Zero-Copy Reading
|
|
- Use Arrow's memory-mapped file reading
|
|
- Avoid unnecessary data copying during deserialization
|
|
- Use Apache Arrow's native IPC reader
|
|
|
|
### Exponential Backoff
|
|
- Implement exponential backoff for HTTP link fetching
|
|
- Maximum retry count: 5
|
|
- Base delay: 100ms, max delay: 5000ms
|
|
|
|
### Correlation ID Logging
|
|
- Log correlation_id at every stage
|
|
- Include: send, receive, serialize, deserialize
|
|
- Use structured logging format
|
|
|
|
## Testing Strategy
|
|
|
|
### Unit Tests
|
|
- Test SmartSend with various payload sizes
|
|
- Test SmartReceive with direct and link transport
|
|
- Test Arrow IPC serialization/deserialization
|
|
|
|
### Integration Tests
|
|
- Test full flow with NATS server
|
|
- Test large data transfer (> 100MB)
|
|
- Test audio processing pipeline
|
|
|
|
### Performance Tests
|
|
- Measure throughput for small payloads
|
|
- Measure throughput for large payloads
|