Files
NATSBridge/docs/architecture.md
2026-02-13 06:33:38 +07:00

295 lines
8.1 KiB
Markdown

# Architecture Documentation: Bi-Directional Data Bridge (Julia ↔ JavaScript)
## Overview
This document describes the architecture for a high-performance, bi-directional data bridge between a Julia service and a JavaScript (Node.js) service using NATS (Core & JetStream), implementing the Claim-Check pattern for large payloads.
## Architecture Diagram
```mermaid
flowchart TD
subgraph Client
JS[JavaScript Client]
JSApp[Application Logic]
end
subgraph Server
Julia[Julia Service]
NATS[NATS Server]
FileServer[HTTP File Server]
end
JS -->|Control/Small Data| JSApp
JSApp -->|NATS| NATS
NATS -->|NATS| Julia
Julia -->|NATS| NATS
Julia -->|HTTP POST| FileServer
JS -->|HTTP GET| FileServer
style JS fill:#e1f5fe
style Julia fill:#e8f5e9
style NATS fill:#fff3e0
style FileServer fill:#f3e5f5
```
## System Components
### 1. Unified JSON Envelope Schema
All messages use a standardized envelope format:
```json
{
"correlation_id": "uuid-v4-string",
"type": "json|table|binary",
"transport": "direct|link",
"payload": "base64-encoded-string", // Only if transport=direct
"url": "http://fileserver/path/to/data", // Only if transport=link
"metadata": {
"content_type": "application/octet-stream",
"content_length": 123456,
"format": "arrow_ipc_stream"
}
}
```
### 2. Transport Strategy Decision Logic
```
┌─────────────────────────────────────────────────────────────┐
│ SmartSend Function │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Is payload size < 1MB? │
└─────────────────────────────────────────────────────────────┘
┌─────────────────┴─────────────────┐
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Direct Path │ │ Link Path │
│ (< 1MB) │ │ (> 1MB) │
│ │ │ │
│ • Serialize to │ │ • Serialize to │
│ IOBuffer │ │ IOBuffer │
│ • Base64 encode │ │ • Upload to │
│ • Publish to │ │ HTTP Server │
│ NATS │ │ • Publish to │
│ │ │ NATS with URL │
└─────────────────┘ └─────────────────┘
```
### 3. Julia Module Architecture
```mermaid
graph TD
subgraph JuliaModule
SmartSendJulia[SmartSend Julia]
SizeCheck[Size Check]
DirectPath[Direct Path]
LinkPath[Link Path]
HTTPClient[HTTP Client]
end
SmartSendJulia --> SizeCheck
SizeCheck -->|< 1MB| DirectPath
SizeCheck -->|>= 1MB| LinkPath
LinkPath --> HTTPClient
style JuliaModule fill:#c5e1a5
```
### 4. JavaScript Module Architecture
```mermaid
graph TD
subgraph JSModule
SmartSendJS[SmartSend JS]
SmartReceiveJS[SmartReceive JS]
JetStreamConsumer[JetStream Pull Consumer]
ApacheArrow[Apache Arrow]
end
SmartSendJS --> NATS
SmartReceiveJS --> JetStreamConsumer
JetStreamConsumer --> ApacheArrow
style JSModule fill:#f3e5f5
```
## Implementation Details
### Julia Implementation
#### Dependencies
- `NATS.jl` - Core NATS functionality
- `Arrow.jl` - Arrow IPC serialization
- `JSON3.jl` - JSON parsing
- `HTTP.jl` - HTTP client for file server
- `Dates.jl` - Timestamps for logging
#### SmartSend Function
```julia
function SmartSend(
subject::String,
data::Any,
type::String = "json";
nats_url::String = "nats://localhost:4222",
fileserver_url::String = "http://localhost:8080/upload",
size_threshold::Int = 1_000_000 # 1MB
)
```
**Flow:**
1. Serialize data to Arrow IPC stream (if table)
2. Check payload size
3. If < threshold: publish directly to NATS with Base64-encoded payload
4. If >= threshold: upload to HTTP server, publish NATS with URL
#### SmartReceive Handler
```julia
function SmartReceive(msg::NATS.Message)
# Parse envelope
# Check transport type
# If direct: decode Base64 payload
# If link: fetch from URL with exponential backoff
# Deserialize Arrow IPC to DataFrame
end
```
### JavaScript Implementation
#### Dependencies
- `nats.js` - Core NATS functionality
- `apache-arrow` - Arrow IPC serialization
- `uuid` - Correlation ID generation
#### SmartSend Function
```javascript
async function SmartSend(subject, data, type = 'json', options = {})
```
**Flow:**
1. Serialize data to Arrow IPC buffer (if table)
2. Check payload size
3. If < threshold: publish directly to NATS
4. If >= threshold: upload to HTTP server, publish NATS with URL
#### SmartReceive Handler
```javascript
async function SmartReceive(msg, options = {})
```
**Flow:**
1. Parse envelope
2. Check transport type
3. If direct: decode Base64 payload
4. If link: fetch with exponential backoff
5. Deserialize Arrow IPC with zero-copy
## Scenario Implementations
### Scenario 1: Command & Control (Small JSON)
**Julia (Receiver):**
```julia
# Subscribe to control subject
# Parse JSON envelope
# Execute simulation with parameters
# Send acknowledgment
```
**JavaScript (Sender):**
```javascript
// Create small JSON config
// Send via SmartSend with type="json"
```
### Scenario 2: Deep Dive Analysis (Large Arrow Table)
**Julia (Sender):**
```julia
# Create large DataFrame
# Convert to Arrow IPC stream
# Check size (> 1MB)
# Upload to HTTP server
# Publish NATS with URL
```
**JavaScript (Receiver):**
```javascript
// Receive NATS message with URL
// Fetch data from HTTP server
// Parse Arrow IPC with zero-copy
// Load into Perspective.js or D3
```
### Scenario 3: Live Audio Processing
**JavaScript (Sender):**
```javascript
// Capture audio chunk
// Send as binary with metadata headers
// Use SmartSend with type="audio"
```
**Julia (Receiver):**
```julia
// Receive audio data
// Perform FFT or AI transcription
// Send results back (JSON + Arrow table)
```
### Scenario 4: Catch-Up (JetStream)
**Julia (Producer):**
```julia
# Publish to JetStream
# Include metadata for temporal tracking
```
**JavaScript (Consumer):**
```javascript
// Connect to JetStream
// Request replay from last 10 minutes
// Process historical and real-time messages
```
## Performance Considerations
### Zero-Copy Reading
- Use Arrow's memory-mapped file reading
- Avoid unnecessary data copying during deserialization
- Use Apache Arrow's native IPC reader
### Exponential Backoff
- Implement exponential backoff for HTTP link fetching
- Maximum retry count: 5
- Base delay: 100ms, max delay: 5000ms
### Correlation ID Logging
- Log correlation_id at every stage
- Include: send, receive, serialize, deserialize
- Use structured logging format
## Testing Strategy
### Unit Tests
- Test SmartSend with various payload sizes
- Test SmartReceive with direct and link transport
- Test Arrow IPC serialization/deserialization
### Integration Tests
- Test full flow with NATS server
- Test large data transfer (> 100MB)
- Test audio processing pipeline
### Performance Tests
- Measure throughput for small payloads
- Measure throughput for large payloads