Architecture

Pulse is built on a modular, serverless-first architecture designed for safe, reproducible ML pipelines.

Design Principles

Protocol-First

Every model starts with an explicit contract. No implicit behaviors, no surprises in production.

Serverless-Native

Scale to zero when idle and scale out automatically under load. Pay only for what you use.

Immutable by Default

Training data is snapshotted before use. No silent data mutations can affect reproducibility.

Observable

Complete lineage tracing from data source to deployed model, so any prediction can be traced back and debugged.

Core Modules

Pulse consists of six core modules that work together to provide a complete ML runtime:

Protocol Layer

YAML-based contracts defining model schemas, input/output types, and validation rules. Ensures type safety across the entire pipeline.
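As a rough illustration of what checking a record against such a contract could look like, here is a minimal sketch. The contract is shown as the dictionary a YAML parser would produce. The field names (`inputs`, `type`, `min`) and the `validate` helper are assumptions made for this example, not Pulse's actual contract schema or API.

```python
from typing import Any

# Hypothetical contract, written as the dict a YAML parser would return.
# The keys "inputs", "type" and "min" are illustrative only.
CONTRACT = {
    "inputs": {
        "amount": {"type": "float", "min": 0.0},
        "currency": {"type": "str"},
    }
}

TYPES = {"float": float, "int": int, "str": str, "bool": bool}


def validate(record: dict[str, Any], contract: dict[str, Any]) -> list[str]:
    """Return a list of violations; an empty list means the record conforms."""
    errors = []
    fields = contract["inputs"]
    for name, rule in fields.items():
        if name not in record:
            errors.append(f"missing field: {name}")
            continue
        value = record[name]
        expected = TYPES[rule["type"]]
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        is_stray_bool = expected is not bool and isinstance(value, bool)
        if is_stray_bool or not isinstance(value, expected):
            errors.append(f"{name}: expected {rule['type']}")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: below minimum {rule['min']}")
    # Explicit contracts reject anything they do not declare.
    for name in record:
        if name not in fields:
            errors.append(f"unexpected field: {name}")
    return errors
```

Returning every violation at once, rather than raising on the first, makes it easier to report all problems with a record in a single pass.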

Snapshot Engine

Immutable point-in-time captures of training data. Guarantees reproducibility and provides automatic rollback capabilities.
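The idea behind an immutable, hash-identified snapshot can be sketched in a few lines. This is an assumption-laden illustration, not the Snapshot Engine's implementation: the `Snapshot` class and `take_snapshot` function are hypothetical names, and hashing a canonical JSON form of each row is just one reasonable way to derive a content hash.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot: frozen rows plus a content hash."""
    rows: tuple[str, ...]  # one canonical JSON string per row
    digest: str

    def restore(self) -> list[dict]:
        """Rebuild the rows exactly as they were at capture time."""
        return [json.loads(r) for r in self.rows]


def take_snapshot(rows: list[dict]) -> Snapshot:
    # Canonical form: sorted keys and fixed separators, so equal data
    # always serializes (and therefore hashes) identically.
    canonical = tuple(
        json.dumps(r, sort_keys=True, separators=(",", ":")) for r in rows
    )
    h = hashlib.sha256()
    for line in canonical:
        h.update(line.encode("utf-8"))
        h.update(b"\n")
    return Snapshot(rows=canonical, digest="sha256:" + h.hexdigest())
```

Because the snapshot stores serialized copies, later changes to the source rows do not reach it, and two snapshots of identical data share the same digest.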

Training Orchestrator

Serverless training execution with automatic resource scaling. Supports distributed training across multiple workers.

Inference Runtime

Low-latency model serving with automatic batching, caching, and circuit breaker patterns for reliability.
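For readers unfamiliar with the circuit breaker pattern mentioned here, the following is a generic sketch of it. It is not the Inference Runtime's code: the class name, the `max_failures` and `reset_after` parameters, and the injectable `clock` are all assumptions chosen for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cooldown in seconds
        self.clock = clock                # injectable for testing
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cooldown elapsed: half-open, so let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping a model call as `breaker.call(predict, features)` (with `predict` standing in for any serving function) fails fast while the backend is unhealthy instead of piling up slow requests.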

Lineage Tracker

Complete audit trail from data ingestion through model deployment. Enables compliance and debugging.

Drift Detector

Continuous monitoring of model performance and data distribution. Triggers retraining when drift exceeds thresholds.
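One common way to quantify data drift is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The sketch below assumes PSI as the metric and 0.2 as the threshold (a widely used rule of thumb). The source does not say which metric or thresholds the Drift Detector uses, and `psi` and `needs_retraining` are hypothetical names.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between two samples of one feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the baseline range into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def needs_retraining(expected, actual, threshold=0.2):
    """True when drift exceeds the (assumed) threshold."""
    return psi(expected, actual) > threshold
```

Identical samples give a PSI of zero, and the value grows as mass moves between bins, which makes it convenient to compare against a single threshold.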

Data Flow

Data flows through Pulse in a predictable, auditable manner:

1. Data Ingestion: Datasource connectors pull data from the configured sources.
2. Snapshot Creation: An immutable snapshot is captured before any training begins.
3. Schema Validation: The data is validated against the protocol contract.
4. Training Execution: The model is trained on the validated, snapshotted data.
5. Artifact Storage: Model artifacts are stored with full lineage metadata.
6. Deployment: The model is deployed to the inference runtime with monitoring.
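The ordering of the six steps above can be made concrete with a short sketch. Everything here is hypothetical: `run_pipeline`, its parameters, and the shape of the returned artifact are stand-ins for illustration, and each step is reduced to a line or two. The point is only that validation and training read from the frozen snapshot, never from the live source.

```python
import hashlib
import json


def run_pipeline(source_rows, required_fields, train):
    # 1. Ingestion: copy rows out of the source.
    rows = [dict(r) for r in source_rows]

    # 2. Snapshot: freeze a canonical serialization and hash it.
    frozen = json.dumps(rows, sort_keys=True)
    snapshot_hash = "sha256:" + hashlib.sha256(frozen.encode("utf-8")).hexdigest()

    # 3. Validation: check the snapshot, not the live source.
    snap_rows = json.loads(frozen)
    for i, row in enumerate(snap_rows):
        missing = required_fields - set(row)
        if missing:
            raise ValueError(f"row {i} missing {sorted(missing)}")

    # 4. Training: the caller-supplied function sees only snapshotted rows.
    model = train(snap_rows)

    # 5. Artifact storage: keep lineage metadata next to the model.
    artifact = {
        "model": model,
        "lineage": {"snapshot": snapshot_hash, "rows": len(snap_rows)},
    }

    # 6. Deployment: a real system would hand this to the serving layer.
    return artifact
```

A call such as `run_pipeline(rows, {"amount", "label"}, train=my_train_fn)` (with `my_train_fn` supplied by the caller) returns the model together with the hash of the exact data it was trained on.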

Lineage Example

Every inference can be traced back to its training data:

pulse lineage inference_abc123

Inference: inference_abc123
├── Model: fraud-detector@1.0.0-def456
│   ├── Training Run: run_xyz789
│   │   ├── Started: 2024-01-15T02:00:00Z
│   │   ├── Duration: 4m 32s
│   │   └── Metrics:
│   │       ├── accuracy: 0.9847
│   │       └── f1_score: 0.9621
│   └── Snapshot: snap_abc123
│       ├── Datasource: transactions-db
│       ├── Created: 2024-01-15T01:59:45Z
│       ├── Rows: 1,247,832
│       └── Hash: sha256:e3b0c442...
├── Input Hash: sha256:a9f2d4...
└── Output Hash: sha256:7c8b1a...
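Conceptually, a trace like the one above amounts to following ID references from record to record. The sketch below shows that idea with in-memory dictionaries. The store layout and the `trace` function are assumptions for illustration; only the IDs are taken from the example output.

```python
# Hypothetical in-memory stores keyed by ID; the IDs mirror the output above.
INFERENCES = {"inference_abc123": {"model": "fraud-detector@1.0.0-def456"}}
MODELS = {
    "fraud-detector@1.0.0-def456": {"run": "run_xyz789", "snapshot": "snap_abc123"}
}
SNAPSHOTS = {"snap_abc123": {"datasource": "transactions-db"}}


def trace(inference_id):
    """Follow ID references from an inference back to its datasource."""
    model_id = INFERENCES[inference_id]["model"]
    model = MODELS[model_id]
    snapshot = SNAPSHOTS[model["snapshot"]]
    return [
        inference_id,
        model_id,
        model["run"],
        model["snapshot"],
        snapshot["datasource"],
    ]
```

Each hop is a plain key lookup, so a missing link surfaces immediately as a `KeyError` rather than as a silently incomplete trace.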