From training loop to dashboard

Track metrics.
Compare runs.
Ship faster.

Runlog is a lightweight, developer-first training monitor. Send metrics from any script with a single logger, watch progress live, and keep a clean history of every experiment.

Streaming: Live updates
Setup: 3 lines of code
Metrics: Auto-detected
Storage: Unlimited history

[Dashboard preview: run fineweb-run-042 (FineWeb · 69M params · L40S), status running. Stats: train loss, val loss, grad norm, step, tok/s, gpu mem. Charts: Train loss, Val loss, Learning rate, GPU util. Progress: step 0 / 20,000.]

Three lines to connect any script.

Works with PyTorch, Hugging Face Trainer, Keras, XGBoost, or anything that runs in Python. Log any metric; charts appear automatically.

Auto-detected metric charts
Blazingly fast live streaming
Offline-first — zero data loss
Terminal logs captured automatically
Pause / stop from dashboard
Team workspaces with RBAC
Cross-user run comparison
from runlogger import RunLogger

logger = RunLogger(
    base_url="https://runlog.in",
    project_name="FineWeb",
    api_token="rl-gb-...",
    run_name="run-1",
)

# In your training loop. total_steps, train_one_step, scheduler,
# eval_every, and evaluate come from your own script.
for step in range(total_steps):
    loss = train_one_step()
    logger.log(
        step=step, total_steps=total_steps,
        train_loss=loss.item(),
        lr=scheduler.get_last_lr()[0],
    )
    if step % eval_every == 0:
        logger.log_eval(
            step=step,
            val_loss=evaluate(),
            checkpoint_saved=True,
        )
logger.finish()

Everything a training run demands of you.

From the first forward pass to the final checkpoint — Runlog keeps you informed, in control, and never in the dark about what your model is doing.

Observability
Realtime metric streaming
Every step, every value, streamed as it is logged. Charts update live as your model trains, with no polling.
Streaming — Live updates
📡 Reliability
Offline-first SDK
Go offline mid-run, lose your connection, or kill the process — not a single metric or terminal log is lost. Data is buffered locally and synced automatically when you reconnect. Manual sync available via CLI.
Auto-sync · Manual CLI · Zero loss
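
As a rough illustration of the buffer-then-sync idea (not Runlog's actual SDK internals), the sketch below appends each record to a local JSON Lines file and drops it only once a send succeeds. `OfflineBuffer`, `sync`, the `send` callback, and the file name are hypothetical names chosen for this example.

```python
import json
import os
import tempfile


class OfflineBuffer:
    """Hypothetical local buffer: records stay on disk until a send succeeds."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Write and fsync each record so it is on disk before the next step runs.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def pending(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def sync(self, send):
        """Try to send every pending record; keep the ones that fail."""
        remaining = []
        for record in self.pending():
            try:
                send(record)
            except ConnectionError:
                remaining.append(record)
        # Rewrite the file with only the unsent records.
        with open(self.path, "w", encoding="utf-8") as f:
            for record in remaining:
                f.write(json.dumps(record) + "\n")
        return len(remaining)


buffer_path = os.path.join(tempfile.mkdtemp(), "pending.jsonl")
buffer = OfflineBuffer(buffer_path)
buffer.append({"step": 1, "train_loss": 2.31})
buffer.append({"step": 2, "train_loss": 2.27})


def offline_send(record):
    # Simulates a dropped connection.
    raise ConnectionError("no network")


sent = []
left_while_offline = buffer.sync(offline_send)   # everything stays buffered
left_after_reconnect = buffer.sync(sent.append)  # everything is delivered
```

A production buffer would also need an atomic rewrite and deduplication on the server side; both are left out here to keep the sketch short.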
🖥 Logging
Live terminal capture
Every print statement, tqdm bar, and framework log is captured automatically and streamed to your dashboard alongside metrics. Nothing extra to configure — it just works.
stdout · stderr · auto-captured
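
One common way to capture terminal output in Python is to swap `sys.stdout` for a stream that forwards each write to several targets. The sketch below shows that general technique using the standard library; `Tee` and the two `StringIO` stand-ins are assumptions for illustration, not a description of how Runlog does it.

```python
import contextlib
import io


class Tee:
    """Hypothetical stream that forwards every write to several targets."""

    def __init__(self, *targets):
        self.targets = targets

    def write(self, text):
        for target in self.targets:
            target.write(text)
        return len(text)

    def flush(self):
        for target in self.targets:
            target.flush()


terminal = io.StringIO()  # stands in for the real terminal in this demo
captured = io.StringIO()  # stands in for whatever ships lines to a dashboard

# Everything printed inside this block goes to both targets.
with contextlib.redirect_stdout(Tee(terminal, captured)):
    print("epoch 1 | loss 2.31")
    print("epoch 2 | loss 2.27")

captured_lines = captured.getvalue().splitlines()
```

tqdm writes to stderr by default, so the same pattern with `contextlib.redirect_stderr` would be needed to pick up progress bars.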
💬 Collaboration
Workspace group chat
Every workspace has a built-in group chat. Discuss runs, share observations, and coordinate experiments without leaving the dashboard.
Real-time · workspace-scoped
⚖️ Collaboration
Cross-user run comparison
Load a teammate's run directly into your compare view. Overlay your results against anyone who has shared their project with you — instantly, no export needed.
Cross-user · shared projects
🔔 Alerts
Conditional email alerts
Set threshold rules on any logged metric. Get notified exactly when it matters — not before, not after.
Value goes above or below threshold
Metric stalls for N consecutive steps
Loss spikes above rolling average
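
To make the three rule types above concrete, here is a minimal sketch of how each check could be expressed. The function names, the `tolerance` and `factor` parameters, and the window size are assumptions for this example; the page does not specify how Runlog defines a stall or a spike.

```python
from collections import deque


def crosses_threshold(value, threshold, direction="above"):
    """Rule 1: value goes above or below a fixed threshold."""
    return value > threshold if direction == "above" else value < threshold


def has_stalled(values, n, tolerance=1e-6):
    """Rule 2: the metric moved by at most `tolerance` over the last n steps."""
    if len(values) < n + 1:
        return False
    recent = values[-(n + 1):]  # n steps span n + 1 consecutive values
    return max(recent) - min(recent) <= tolerance


def is_spike(value, history, factor=2.0):
    """Rule 3: value exceeds `factor` times the rolling average of `history`."""
    if not history:
        return False
    return value > factor * (sum(history) / len(history))


losses = [2.0, 1.5, 1.2, 1.2, 1.2, 1.2]
window = deque(losses, maxlen=5)  # rolling window of the last 5 values

below_target = crosses_threshold(0.4, 0.5, direction="below")
stalled_3 = has_stalled(losses, n=3)   # last 4 values are identical
stalled_4 = has_stalled(losses, n=4)   # window still includes 1.5
spiked = is_spike(4.0, window)         # well above twice the rolling mean
normal = is_spike(1.3, window)
```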
💥 Reliability
Crash detection
Runlog detects training crashes mid-run — Python exceptions, OOM errors, NaN loss — and sends an immediate alert with the last known state.
Instant notification
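
A simplified view of crash detection with last known state: wrap the loop, remember the most recent good step, and build a report when a NaN or an exception shows up. `run_with_crash_report` and the report fields are hypothetical; framework-specific out-of-memory errors would fall into the generic exception branch of this sketch.

```python
import math


def run_with_crash_report(train_steps):
    """Iterate (step, loss) pairs; return a crash report, or None if clean."""
    last_state = None
    try:
        for step, loss in train_steps:
            if math.isnan(loss):
                return {"reason": "nan_loss", "step": step, "last_state": last_state}
            last_state = {"step": step, "loss": loss}
    except Exception as exc:
        # Any exception raised while producing a step ends the run.
        return {"reason": "exception", "error": repr(exc), "last_state": last_state}
    return None


def steps_with_nan():
    yield 0, 2.3
    yield 1, 2.1
    yield 2, float("nan")


def steps_with_error():
    yield 0, 2.3
    raise RuntimeError("simulated failure")


nan_report = run_with_crash_report(steps_with_nan())
error_report = run_with_crash_report(steps_with_error())
clean_report = run_with_crash_report(iter([(0, 2.3), (1, 2.0)]))
```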
🧟 Reliability
Dead run detection
Catch silent hangs. If no steps are logged within a configurable timeout window, the run is flagged as dead and you're alerted immediately.
Configurable timeout
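
The timeout idea can be sketched as a small watchdog that records when the last step arrived and compares the gap against a limit. `RunWatchdog` and its injectable `clock` are assumptions made so the example can run without waiting; they are not part of any documented Runlog API.

```python
import time


class RunWatchdog:
    """Hypothetical watchdog: flags a run as dead if no step arrives in time."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_seen = clock()

    def record_step(self):
        # Call this whenever a step is logged.
        self.last_seen = self.clock()

    def is_dead(self):
        return self.clock() - self.last_seen > self.timeout_seconds


# A fake clock keeps the demo instant and deterministic.
now = [0.0]
watchdog = RunWatchdog(timeout_seconds=300, clock=lambda: now[0])

now[0] = 120.0
watchdog.record_step()

now[0] = 400.0
still_alive = not watchdog.is_dead()  # 280 s since the last step

now[0] = 421.0
flagged_dead = watchdog.is_dead()     # 301 s since the last step
```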
Control
Interruptible training
Hit pause from the dashboard. Your training script receives a clean signal to checkpoint and halt.
Zero state loss
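
On the script side, a clean pause usually means checking a stop flag between steps and checkpointing before exiting. The sketch below uses a `threading.Event` as that flag; how the real signal reaches the script is not described on this page, so `train`, `run_step`, and `save_checkpoint` are stand-ins.

```python
import threading


def train(total_steps, stop_requested, run_step, save_checkpoint):
    """Run until done or until a stop is requested; checkpoint before halting."""
    for step in range(total_steps):
        if stop_requested.is_set():
            save_checkpoint(step)  # `step` steps have completed at this point
            return step
        run_step(step)
    return total_steps


stop_requested = threading.Event()
checkpoints = []


def fake_step(step):
    # Simulate the pause button being pressed while step 3 is running.
    if step == 3:
        stop_requested.set()


stopped_at = train(10, stop_requested, fake_step, checkpoints.append)
```

Checking the flag only at step boundaries means the current step always finishes before the checkpoint is written, which is what keeps optimizer and model state consistent.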
📊 Logging
Dynamic metric logging
Log any key-value pair at any step. Charts are created automatically — no schema, no config. Add new metrics mid-run without restarting.
Auto-detected charts
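
Schema-free logging can be pictured as one time series per metric name, created the first time that name appears. `MetricStore` below is a hypothetical in-memory model of that behavior, not the Runlog backend.

```python
from collections import defaultdict


class MetricStore:
    """Hypothetical schema-free store: one series per metric name."""

    def __init__(self):
        self.series = defaultdict(list)

    def log(self, step, **metrics):
        for name, value in metrics.items():
            # A new name simply starts a new series; nothing is declared upfront.
            self.series[name].append((step, value))

    def chart_names(self):
        return sorted(self.series)


store = MetricStore()
store.log(0, train_loss=2.3)
store.log(1, train_loss=2.1, grad_norm=0.8)  # grad_norm first appears mid-run

charts = store.chart_names()
```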
📝 Annotation
Run notes & observations
Write markdown notes directly on a run. Pin observations at a specific step. Annotate what changed between experiments before you forget.
Markdown · step-anchored
👥 Collaboration
Team spaces
Invite teammates into a shared workspace. Assign roles with granular action-level control — not just who can view, but exactly what each member can create, edit, delete, or manage.
Action-level RBAC · shared projects
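
Action-level control means permissions attach to individual actions rather than to a single view/edit switch. A minimal sketch, assuming made-up role names and `resource:action` strings (the actual roles and actions in Runlog are not listed here):

```python
# Hypothetical roles and actions, for illustration only.
ROLE_ACTIONS = {
    "viewer": {"run:view"},
    "member": {"run:view", "run:create", "run:edit"},
    "admin": {"run:view", "run:create", "run:edit", "run:delete", "workspace:manage"},
}


def can(role, action):
    """Return True if the role is explicitly granted the action."""
    return action in ROLE_ACTIONS.get(role, set())


viewer_can_view = can("viewer", "run:view")
member_can_delete = can("member", "run:delete")
admin_can_manage = can("admin", "workspace:manage")
unknown_role = can("guest", "run:view")  # unknown roles get nothing
```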
⚖️ Analysis
Side-by-side run comparison
Overlay multiple runs on the same chart axes. Evaluate training methods, hyperparameter sweeps, and architecture changes at a glance.
Overlay · diff view
🔗 Sharing
Publicly shareable links
Generate a public read-only URL for any run. Share results with collaborators, reviewers, or your audience — no account required to view.
Public · read-only · revocable
💾 Checkpoints
Checkpoint ledger
Tag checkpoint paths directly in the dashboard with their exact metric snapshot — val loss, perplexity, step. Find your best checkpoint without digging through logs.
Path · metrics · step
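
A ledger of this kind boils down to records of path, step, and metrics that can be sorted. The sketch below picks the entry with the lowest validation loss and derives perplexity as exp(val_loss), which assumes val_loss is a mean cross-entropy in nats. `CheckpointEntry`, `best_checkpoint`, and the paths are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class CheckpointEntry:
    path: str
    step: int
    val_loss: float

    @property
    def perplexity(self):
        # Assumes val_loss is mean cross-entropy measured in nats.
        return math.exp(self.val_loss)


def best_checkpoint(entries):
    """Return the entry with the lowest validation loss."""
    return min(entries, key=lambda entry: entry.val_loss)


ledger = [
    CheckpointEntry("ckpt/step_1000.pt", step=1000, val_loss=3.2),
    CheckpointEntry("ckpt/step_2000.pt", step=2000, val_loss=2.9),
    CheckpointEntry("ckpt/step_3000.pt", step=3000, val_loss=3.0),
]

best = best_checkpoint(ledger)
```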
Coming soon
More features arriving
Training configuration updates and Slack alert integrations are currently under active development.

In development
[Dashboard preview. Metric cards: Train loss (↓ decaying), Val loss (by step), Perplexity (exp(val_loss)), Tokens / s (throughput), GPU util (auto-logged), Learning rate (cosine decay), Grad norm (clipped at 1.0), GPU mem (/ 48 GB).]