1st commit
architecture.md
# Architecture Documentation: Bi-Directional Data Bridge (Julia ↔ JavaScript)

## Overview

This document describes the architecture of a high-performance, bi-directional data bridge between a Julia service and a JavaScript (Node.js) service. The bridge uses NATS (Core and JetStream) for messaging and applies the Claim-Check pattern to large payloads: the payload is uploaded to an HTTP file server and only its URL travels over NATS.

## Architecture Diagram

```mermaid
flowchart TD
    subgraph Client
        JS[JavaScript Client]
        JSApp[Application Logic]
    end

    subgraph Server
        Julia[Julia Service]
        NATS[NATS Server]
        FileServer[HTTP File Server]
    end

    JS -->|Control/Small Data| JSApp
    JSApp -->|NATS| NATS
    NATS -->|NATS| Julia
    Julia -->|NATS| NATS
    Julia -->|HTTP POST| FileServer
    JS -->|HTTP GET| FileServer

    style JS fill:#e1f5fe
    style Julia fill:#e8f5e9
    style NATS fill:#fff3e0
    style FileServer fill:#f3e5f5
```

## System Components

### 1. Unified JSON Envelope Schema

All messages use a standardized envelope format. `type` is one of `json`, `table`, or `binary`, and `transport` is either `direct` or `link`. `payload` is present only when `transport` is `direct`; `url` is present only when `transport` is `link`:

```json
{
  "correlation_id": "uuid-v4-string",
  "type": "json|table|binary",
  "transport": "direct|link",
  "payload": "base64-encoded-string",
  "url": "http://fileserver/path/to/data",
  "metadata": {
    "content_type": "application/octet-stream",
    "content_length": 123456,
    "format": "arrow_ipc_stream"
  }
}
```
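
As a minimal sketch of the envelope rules above, the following helper builds an envelope and rejects combinations the schema does not allow. `buildEnvelope` is a hypothetical name, not part of the original design, and the sketch assumes Node.js, using the built-in `crypto.randomUUID()` rather than the `uuid` package.

```javascript
const { randomUUID } = require('node:crypto');

const TYPES = ['json', 'table', 'binary'];

// Hypothetical helper: builds an envelope and enforces that `payload`
// accompanies transport=direct and `url` accompanies transport=link.
function buildEnvelope({ type, transport, payload, url, metadata = {}, correlationId }) {
  if (!TYPES.includes(type)) {
    throw new Error(`unknown type: ${type}`);
  }
  const envelope = {
    correlation_id: correlationId ?? randomUUID(),
    type,
    transport,
    metadata,
  };
  if (transport === 'direct') {
    if (typeof payload !== 'string') {
      throw new Error('direct transport requires a Base64 payload');
    }
    envelope.payload = payload;
  } else if (transport === 'link') {
    if (typeof url !== 'string') {
      throw new Error('link transport requires a url');
    }
    envelope.url = url;
  } else {
    throw new Error(`unknown transport: ${transport}`);
  }
  return envelope;
}

// Example: a small JSON command sent on the direct path.
const env = buildEnvelope({
  type: 'json',
  transport: 'direct',
  payload: Buffer.from(JSON.stringify({ steps: 100 })).toString('base64'),
  metadata: { content_type: 'application/json' },
});
console.log(JSON.stringify(env, null, 2));
```

Omitting the unused field entirely, rather than sending it as `null`, keeps the receiver's transport check unambiguous.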

### 2. Transport Strategy Decision Logic

```
┌─────────────────────────────────────────────────────────────┐
│                     SmartSend Function                      │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                   Is payload size < 1MB?                    │
└─────────────────────────────────────────────────────────────┘
                               │
             ┌─────────────────┴─────────────────┐
             ▼                                   ▼
    ┌─────────────────┐                 ┌─────────────────┐
    │ Direct Path     │                 │ Link Path       │
    │ (< 1MB)         │                 │ (>= 1MB)        │
    │                 │                 │                 │
    │ • Serialize to  │                 │ • Serialize to  │
    │   IOBuffer      │                 │   IOBuffer      │
    │ • Base64 encode │                 │ • Upload to     │
    │ • Publish to    │                 │   HTTP Server   │
    │   NATS          │                 │ • Publish to    │
    │                 │                 │   NATS with URL │
    └─────────────────┘                 └─────────────────┘
```
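
The size check can be sketched as a small pure function. The names `chooseTransport` and `base64Length` are illustrative assumptions; the threshold mirrors the 1MB cut-off in the diagram, with exactly 1MB going to the link path.

```javascript
const DEFAULT_THRESHOLD = 1_000_000; // bytes; mirrors the 1MB cut-off above

// Returns 'direct' for payloads under the threshold, 'link' otherwise.
function chooseTransport(byteLength, threshold = DEFAULT_THRESHOLD) {
  return byteLength < threshold ? 'direct' : 'link';
}

// Base64 emits 4 output characters for every 3 input bytes (with padding).
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

console.log(chooseTransport(999_999));   // direct
console.log(chooseTransport(1_000_000)); // link
console.log(base64Length(999_999));      // 1333332
```

One design point worth checking: Base64 inflates the payload by about a third, so a direct payload just under the threshold is roughly 1.33MB on the wire. NATS servers cap message size (1MB by default), so it may be necessary to lower the threshold or raise the server's `max_payload` setting.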

### 3. Julia Module Architecture

```mermaid
graph TD
    subgraph JuliaModule
        SmartSendJulia[SmartSend Julia]
        SizeCheck[Size Check]
        DirectPath[Direct Path]
        LinkPath[Link Path]
        HTTPClient[HTTP Client]
    end

    SmartSendJulia --> SizeCheck
    SizeCheck -->|< 1MB| DirectPath
    SizeCheck -->|>= 1MB| LinkPath
    LinkPath --> HTTPClient

    style JuliaModule fill:#c5e1a5
```

### 4. JavaScript Module Architecture

```mermaid
graph TD
    subgraph JSModule
        SmartSendJS[SmartSend JS]
        SmartReceiveJS[SmartReceive JS]
        JetStreamConsumer[JetStream Pull Consumer]
        ApacheArrow[Apache Arrow]
    end

    SmartSendJS --> NATS
    SmartReceiveJS --> JetStreamConsumer
    JetStreamConsumer --> ApacheArrow

    style JSModule fill:#f3e5f5
```

## Implementation Details

### Julia Implementation

#### Dependencies
- `NATS.jl` - Core NATS functionality
- `Arrow.jl` - Arrow IPC serialization
- `JSON3.jl` - JSON parsing
- `HTTP.jl` - HTTP client for file server
- `Dates` (standard library) - Timestamps for logging

#### SmartSend Function

```julia
function SmartSend(
    subject::String,
    data::Any,
    type::String = "json";
    nats_url::String = "nats://localhost:4222",
    fileserver_url::String = "http://localhost:8080/upload",
    size_threshold::Int = 1_000_000  # bytes (1MB)
)
    # Body follows the flow described below.
end
```

**Flow:**
1. Serialize the data (to an Arrow IPC stream if `type` is `table`)
2. Check the serialized payload size
3. If < threshold: publish directly to NATS with the Base64-encoded payload
4. If >= threshold: upload to the HTTP server, then publish a NATS message containing the URL

#### SmartReceive Handler

```julia
function SmartReceive(msg::NATS.Message)
    # Parse envelope
    # Check transport type
    # If direct: decode Base64 payload
    # If link: fetch from URL with exponential backoff
    # Deserialize Arrow IPC to DataFrame
end
```

### JavaScript Implementation

#### Dependencies
- `nats.js` (npm package `nats`) - Core NATS functionality
- `apache-arrow` - Arrow IPC serialization
- `uuid` - Correlation ID generation

#### SmartSend Function

```javascript
async function SmartSend(subject, data, type = 'json', options = {}) {
  // Body follows the flow described below.
}
```

**Flow:**
1. Serialize the data (to an Arrow IPC buffer if `type` is `table`)
2. Check the serialized payload size
3. If < threshold: publish directly to NATS with the Base64-encoded payload
4. If >= threshold: upload to the HTTP server, then publish a NATS message containing the URL
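
The flow above can be sketched as follows. This is an illustration under stated assumptions, not the module's actual implementation: `smartSendSketch` is a hypothetical name, and `options.publish` and `options.upload` are hypothetical callbacks standing in for the NATS client and the HTTP file server so that the logic can run without either. Table (Arrow IPC) serialization is left out.

```javascript
const { randomUUID } = require('node:crypto');

// Sketch only: `publish(subject, bytes)` and `upload(bytes)` are hypothetical
// callbacks standing in for the NATS client and the HTTP file server.
async function smartSendSketch(subject, data, type = 'json', options = {}) {
  const { publish, upload, sizeThreshold = 1_000_000 } = options;

  // 1. Serialize. Table (Arrow IPC) serialization is not covered here.
  let bytes;
  if (type === 'json') {
    bytes = Buffer.from(JSON.stringify(data), 'utf8');
  } else if (type === 'binary') {
    bytes = Buffer.from(data);
  } else {
    throw new Error(`type not covered by this sketch: ${type}`);
  }

  const envelope = {
    correlation_id: randomUUID(),
    type,
    metadata: { content_length: bytes.length },
  };

  // 2-4. Size check, then the direct or link path.
  if (bytes.length < sizeThreshold) {
    envelope.transport = 'direct';
    envelope.payload = bytes.toString('base64');
  } else {
    envelope.transport = 'link';
    // `upload` is expected to resolve to the URL the receiver should fetch.
    envelope.url = await upload(bytes);
  }

  await publish(subject, Buffer.from(JSON.stringify(envelope), 'utf8'));
  return envelope;
}
```

With nats.js, `publish` could be a thin wrapper around the connection's `publish(subject, bytes)` method. Injecting both callbacks also makes the size-threshold behavior easy to unit test.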

#### SmartReceive Handler

```javascript
async function SmartReceive(msg, options = {}) {
  // Body follows the flow described below.
}
```

**Flow:**
1. Parse the envelope
2. Check the transport type
3. If direct: decode the Base64 payload
4. If link: fetch from the URL with exponential backoff
5. Deserialize Arrow IPC with zero-copy (if `type` is `table`)
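
A minimal sketch of the receive flow, under the same caveats: `smartReceiveSketch` and the `options.fetchBytes` callback are hypothetical names introduced for illustration, and Arrow decoding is left to the caller.

```javascript
// Sketch only: `fetchBytes(url)` is a hypothetical callback that returns the
// linked payload as bytes (for example a retrying wrapper around fetch).
async function smartReceiveSketch(rawMessage, options = {}) {
  const { fetchBytes } = options;

  // 1. Parse the envelope (rawMessage holds the UTF-8 JSON message bytes).
  const envelope = JSON.parse(Buffer.from(rawMessage).toString('utf8'));

  // 2-4. Resolve the payload bytes according to the transport.
  let bytes;
  if (envelope.transport === 'direct') {
    bytes = Buffer.from(envelope.payload, 'base64');
  } else if (envelope.transport === 'link') {
    bytes = Buffer.from(await fetchBytes(envelope.url));
  } else {
    throw new Error(`unknown transport: ${envelope.transport}`);
  }

  // 5. Decode by type. JSON is parsed here; table and binary payloads are
  //    returned as bytes for the caller to decode.
  const data = envelope.type === 'json' ? JSON.parse(bytes.toString('utf8')) : bytes;
  return { correlationId: envelope.correlation_id, type: envelope.type, data };
}
```

For `table` messages, the returned bytes could then be handed to an Arrow IPC reader such as `tableFromIPC` from `apache-arrow`.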

## Scenario Implementations

### Scenario 1: Command & Control (Small JSON)

**Julia (Receiver):**
```julia
# Subscribe to control subject
# Parse JSON envelope
# Execute simulation with parameters
# Send acknowledgment
```

**JavaScript (Sender):**
```javascript
// Create small JSON config
// Send via SmartSend with type="json"
```

### Scenario 2: Deep Dive Analysis (Large Arrow Table)

**Julia (Sender):**
```julia
# Create large DataFrame
# Convert to Arrow IPC stream
# Check size (> 1MB)
# Upload to HTTP server
# Publish NATS with URL
```

**JavaScript (Receiver):**
```javascript
// Receive NATS message with URL
// Fetch data from HTTP server
// Parse Arrow IPC with zero-copy
// Load into Perspective.js or D3
```

### Scenario 3: Live Audio Processing

**JavaScript (Sender):**
```javascript
// Capture audio chunk
// Send as binary with metadata headers
// Use SmartSend with type="binary" (audio details go in metadata)
```
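
For the JavaScript sender, an audio chunk could be wrapped roughly as shown below. This sketch uses `type: 'binary'` from the envelope schema. The helper name `audioChunkEnvelope` and the metadata fields `sample_rate`, `channels`, and `sample_format` are assumptions for illustration; the original schema does not define them.

```javascript
// Hypothetical helper: wraps one chunk of mono Float32 samples in a direct
// binary envelope. The audio metadata fields are illustrative assumptions.
function audioChunkEnvelope(samples, sampleRate, correlationId) {
  // View the Float32Array's underlying bytes without copying the samples.
  const bytes = Buffer.from(samples.buffer, samples.byteOffset, samples.byteLength);
  return {
    correlation_id: correlationId,
    type: 'binary',
    transport: 'direct',
    payload: bytes.toString('base64'),
    metadata: {
      content_type: 'application/octet-stream',
      content_length: bytes.length,
      sample_rate: sampleRate,
      channels: 1,
      // Typed arrays use platform byte order, which is little-endian on
      // common hardware; this label assumes such a platform.
      sample_format: 'f32le',
    },
  };
}

const chunk = new Float32Array([0, 0.5, -1]);
console.log(audioChunkEnvelope(chunk, 16000, 'example-correlation-id').metadata);
```

Stating the sample format and rate explicitly lets the Julia receiver reinterpret the bytes without guessing.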

**Julia (Receiver):**
```julia
# Receive audio data
# Perform FFT or AI transcription
# Send results back (JSON + Arrow table)
```

### Scenario 4: Catch-Up (JetStream)

**Julia (Producer):**
```julia
# Publish to JetStream
# Include metadata for temporal tracking
```

**JavaScript (Consumer):**
```javascript
// Connect to JetStream
// Request replay from last 10 minutes
// Process historical and real-time messages
```
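
The "replay from last 10 minutes" step amounts to creating a consumer whose delivery starts at a timestamp. The sketch below only builds that configuration object; `replayConsumerConfig` is a hypothetical helper, and the field names and values follow the JetStream consumer configuration wire format as an assumption that should be checked against the client version in use.

```javascript
// Builds a JetStream consumer configuration that starts delivery `minutes`
// before `now`. Field names and values are assumed to match the JetStream
// consumer config wire format; verify them against your client version.
function replayConsumerConfig(durableName, minutes = 10, now = new Date()) {
  const start = new Date(now.getTime() - minutes * 60 * 1000);
  return {
    durable_name: durableName,
    ack_policy: 'explicit',
    deliver_policy: 'by_start_time',
    opt_start_time: start.toISOString(),
  };
}

console.log(replayConsumerConfig('js-catchup'));
```

With nats.js, an object like this could be passed to the JetStream manager when adding the consumer to a stream. A consumer created this way delivers the stored messages from the start time onward and then continues with new messages, which covers both the historical and the real-time part of the scenario.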

## Performance Considerations

### Zero-Copy Reading
- Use Arrow's memory-mapped file reading when the payload is available as a local file
- For payloads already in memory (fetched over HTTP or decoded from Base64), read the Arrow IPC buffer in place and avoid copying data during deserialization
- Use Apache Arrow's native IPC reader

### Exponential Backoff
- Implement exponential backoff for HTTP link fetching
- Maximum retry count: 5
- Base delay: 100ms, max delay: 5000ms
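
A minimal sketch of a retry helper with these parameters. The document does not specify the growth factor, so doubling per attempt is an assumption, as are the names `backoffDelay`, `withBackoff`, and `fetchBytes`. Jitter is omitted for brevity.

```javascript
// Delay before retry number `attempt` (0-based): 100ms, 200ms, 400ms, ...
// capped at maxDelayMs. Doubling is an assumption, not stated in the design.
function backoffDelay(attempt, baseDelayMs = 100, maxDelayMs = 5000) {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

// Runs `operation` once and retries up to `maxRetries` more times on failure.
// `sleep` is injectable so the schedule can be tested without waiting.
async function withBackoff(operation, options = {}) {
  const {
    maxRetries = 5,
    baseDelayMs = 100,
    maxDelayMs = 5000,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      await sleep(backoffDelay(attempt, baseDelayMs, maxDelayMs));
    }
  }
}

// Example use: fetch a linked payload (global fetch requires Node.js 18+).
function fetchBytes(url) {
  return withBackoff(async () => {
    const res = await fetch(url);
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return Buffer.from(await res.arrayBuffer());
  });
}
```

With a 100ms base and 5 retries, the doubling schedule waits 100, 200, 400, 800, and 1600ms, so the 5000ms cap only matters if the retry count or base delay is raised.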

### Correlation ID Logging
- Log correlation_id at every stage
- Include: send, receive, serialize, deserialize
- Use structured logging format
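
One possible shape for such log lines is one JSON object per line. The helper names `formatLogLine` and `logStage` and the field names `timestamp` and `stage` are assumptions for illustration; only `correlation_id` comes from the envelope schema.

```javascript
// Formats one JSON log line. `stage` is expected to be one of
// send, receive, serialize, deserialize.
function formatLogLine(correlationId, stage, details = {}, now = new Date()) {
  return JSON.stringify({
    timestamp: now.toISOString(),
    correlation_id: correlationId,
    stage,
    ...details,
  });
}

function logStage(correlationId, stage, details) {
  console.log(formatLogLine(correlationId, stage, details));
}

logStage('3f2b6c1e-0000-4000-8000-000000000000', 'serialize', { bytes: 2048 });
```

Because every line carries the same `correlation_id` key, logs from the Julia and JavaScript sides can be joined on it when tracing a single message.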

## Testing Strategy

### Unit Tests
- Test SmartSend with various payload sizes
- Test SmartReceive with direct and link transport
- Test Arrow IPC serialization/deserialization

### Integration Tests
- Test full flow with NATS server
- Test large data transfer (> 100MB)
- Test audio processing pipeline

### Performance Tests
- Measure throughput for small payloads
- Measure throughput for large payloads