Everything a training run
demands of you.
From the first forward pass to the final checkpoint — Runlog keeps you informed, in control, and never in the dark about what your model is doing.
⚡Observability
Realtime metric streaming
Every step, every value — streamed. Charts update live as your model trains. Zero polling, zero lag.
Streaming · live updates
📡Reliability
Offline-first SDK
Go offline mid-run, lose your connection, or kill the process — not a single metric or terminal log is lost. Data is buffered locally and synced automatically when you reconnect. Manual sync available via CLI.
Auto-sync · Manual CLI · Zero loss
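One way to picture the offline-first behaviour is an append-only local buffer that is drained once the connection returns. The sketch below is not Runlog's SDK: `LocalBuffer`, `record`, `sync`, and the `send` callback are hypothetical names chosen for illustration, and the JSON-lines file format is an assumption.

```python
import json
import os
import tempfile


class LocalBuffer:
    """Append-only JSON-lines buffer on disk (hypothetical, for illustration)."""

    def __init__(self, path):
        self.path = path

    def record(self, step, metrics):
        # Append and fsync before returning, so a killed process loses
        # at most the record that was being written.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"step": step, "metrics": metrics}) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def sync(self, send):
        """Upload buffered records in order; keep whatever could not be sent."""
        if not os.path.exists(self.path):
            return 0
        with open(self.path, encoding="utf-8") as f:
            records = [json.loads(line) for line in f if line.strip()]
        remaining = []
        sent = 0
        for i, rec in enumerate(records):
            try:
                send(rec)  # `send` stands in for a network call
                sent += 1
            except ConnectionError:
                # Still offline: keep this record and everything after it.
                remaining = records[i:]
                break
        with open(self.path, "w", encoding="utf-8") as f:
            for rec in remaining:
                f.write(json.dumps(rec) + "\n")
        return sent


if __name__ == "__main__":
    buf = LocalBuffer(os.path.join(tempfile.mkdtemp(), "buffer.jsonl"))
    buf.record(1, {"loss": 0.9})
    print(buf.sync(print))  # `print` stands in for the upload call
```

A production version would also rewrite the buffer file atomically, which this sketch skips for brevity.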
🖥Logging
Live terminal capture
Every print statement, tqdm bar, and framework log is captured automatically and streamed to your dashboard alongside metrics. Nothing extra to configure — it just works.
stdout · stderr · auto-captured
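Automatic capture of this kind is commonly built by wrapping `sys.stdout` and `sys.stderr` in a write-through stream. The following is a minimal sketch of that general technique, not a description of how Runlog implements it; `Tee`, `capture_terminal`, and the `sink` callback are hypothetical names.

```python
import contextlib
import io
import sys


class Tee(io.TextIOBase):
    """Forwards text to the original stream and passes a copy to `sink`."""

    def __init__(self, original, sink):
        self.original = original
        self.sink = sink  # any callable taking a str; hypothetical upload hook

    def write(self, text):
        self.original.write(text)
        self.sink(text)
        return len(text)

    def flush(self):
        self.original.flush()


@contextlib.contextmanager
def capture_terminal(sink):
    """Temporarily route stdout and stderr through Tee objects."""
    old_out, old_err = sys.stdout, sys.stderr
    sys.stdout, sys.stderr = Tee(old_out, sink), Tee(old_err, sink)
    try:
        yield
    finally:
        # Always restore the real streams, even if the body raised.
        sys.stdout, sys.stderr = old_out, old_err
```

Note that this approach only sees writes made through Python's `sys.stdout` and `sys.stderr`; output written directly to the underlying file descriptors by native code would need a different mechanism.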
💬Collaboration
Workspace group chat
Every workspace has a built-in group chat. Discuss runs, share observations, and coordinate experiments without leaving the dashboard.
Real-time · workspace-scoped
⚖️Collaboration
Cross-user run comparison
Load a teammate's run directly into your compare view. Overlay your results against anyone who has shared their project with you — instantly, no export needed.
Cross-user · shared projects
🔔Alerts
Conditional email alerts
Set threshold rules on any logged metric. Get notified exactly when it matters — not before, not after.
Value goes above or below threshold
Metric stalls for N consecutive steps
Loss spikes above rolling average
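The three rule types above can be sketched as small predicates over a metric's history. This is an illustrative reading, not Runlog's rule engine: the function and class names are hypothetical, a "stall" is assumed to mean the value changes by no more than a tolerance for N consecutive steps, and a "spike" is assumed to mean a value above some multiple of the rolling mean.

```python
from collections import deque


def crosses_threshold(value, threshold, direction="above"):
    """Threshold rule: value strictly above (or below) the threshold."""
    if direction == "above":
        return value > threshold
    return value < threshold


def has_stalled(history, n, tolerance=0.0):
    """Stall rule: the last n step-to-step changes are all within tolerance."""
    if len(history) < n + 1:
        return False
    recent = history[-(n + 1):]
    return all(abs(b - a) <= tolerance for a, b in zip(recent, recent[1:]))


class SpikeDetector:
    """Spike rule: value exceeds `factor` times the mean of the previous `window` values."""

    def __init__(self, window=5, factor=2.0):
        self.values = deque(maxlen=window)
        self.factor = factor

    def update(self, value):
        spiked = False
        # Only judge once a full window of earlier values is available.
        if len(self.values) == self.values.maxlen:
            mean = sum(self.values) / len(self.values)
            spiked = value > self.factor * mean
        self.values.append(value)
        return spiked
```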
💥Reliability
Crash detection
Runlog detects training crashes mid-run — Python exceptions, OOM errors, NaN loss — and sends an immediate alert with the last known state.
Instant notification
🧟Reliability
Dead run detection
Catch silent hangs. If no steps are logged within a configurable timeout window, the run is flagged as dead and you're alerted immediately.
Configurable timeout
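Conceptually, a dead-run check is a heartbeat with a timeout: each logged step resets a timer, and the run is flagged once the timer exceeds the configured window. The sketch below illustrates that idea under assumed names (`DeadRunWatchdog`, `heartbeat`, `is_dead`); it is not Runlog's implementation.

```python
import time


class DeadRunWatchdog:
    """Flags a run as dead when no step arrives within `timeout_seconds`."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last_step_at = clock()

    def heartbeat(self):
        """Call whenever a step is logged."""
        self.last_step_at = self.clock()

    def is_dead(self):
        return self.clock() - self.last_step_at > self.timeout_seconds
```

Using a monotonic clock keeps the check unaffected by wall-clock adjustments on the machine running it.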
⏸Control
Interruptible training
Hit pause from the dashboard. Your training script receives a clean signal to checkpoint and halt.
Zero state loss
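On the training-script side, a clean stop usually means setting a flag when the signal arrives and acting on it at the next step boundary. The sketch below assumes the pause request reaches the process as an OS signal such as `SIGTERM`; how Runlog actually delivers it is not stated here, and `StopRequest`, `train`, `run_step`, and `save_checkpoint` are hypothetical names.

```python
import signal


class StopRequest:
    """Records that a stop was requested; the loop decides when to act on it."""

    def __init__(self):
        self.requested = False

    def install(self, signum=signal.SIGTERM):
        # Must be called from the main thread, as required by signal.signal.
        signal.signal(signum, self._handle)

    def _handle(self, signum, frame):
        # Only set a flag here; checkpointing happens at a step boundary.
        self.requested = True


def train(num_steps, stop, run_step, save_checkpoint):
    """Run steps until done or until a stop is requested; return the last step run."""
    for step in range(num_steps):
        run_step(step)
        if stop.requested:
            save_checkpoint(step)  # state is saved only after a complete step
            return step
    return num_steps - 1
```

Checking the flag between steps, rather than checkpointing inside the handler, avoids saving a half-updated model.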
📊Logging
Dynamic metric logging
Log any key-value pair at any step. Charts are created automatically — no schema, no config. Add new metrics mid-run without restarting.
Auto-detected charts
📝Annotation
Run notes & observations
Write markdown notes directly on a run. Pin observations at a specific step. Annotate what changed between experiments before you forget.
Markdown · step-anchored
👥Collaboration
Team spaces
Invite teammates into a shared workspace. Assign roles with granular action-level control — not just who can view, but exactly what each member can create, edit, delete, or manage.
Action-level RBAC · shared projects
⚖️Analysis
Side-by-side run comparison
Overlay multiple runs on the same chart axes. Evaluate training methods, hyperparameter sweeps, and architecture changes at a glance.
Overlay · diff view
🔗Sharing
Publicly shareable links
Generate a public read-only URL for any run. Share results with collaborators, reviewers, or your audience — no account required to view.
Public · read-only · revocable
💾Checkpoints
Checkpoint ledger
Tag checkpoint paths directly in the dashboard with their exact metric snapshot — val loss, perplexity, step. Find your best checkpoint without digging through logs.
Path · metrics · step
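As a data structure, a ledger like this is a list of (path, step, metrics) records that can be queried for the best entry. The sketch below is a hypothetical in-memory version for illustration; `CheckpointEntry`, `CheckpointLedger`, `tag`, and `best` are assumed names, not Runlog's API.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointEntry:
    path: str
    step: int
    metrics: dict = field(default_factory=dict)


class CheckpointLedger:
    """Keeps checkpoint paths together with the metrics recorded at that step."""

    def __init__(self):
        self.entries = []

    def tag(self, path, step, **metrics):
        entry = CheckpointEntry(path, step, dict(metrics))
        self.entries.append(entry)
        return entry

    def best(self, metric, mode="min"):
        """Return the entry with the lowest (or highest) value of `metric`."""
        candidates = [e for e in self.entries if metric in e.metrics]
        if not candidates:
            return None
        pick = min if mode == "min" else max
        return pick(candidates, key=lambda e: e.metrics[metric])
```

For a loss-like metric the default `mode="min"` applies; for accuracy-like metrics, pass `mode="max"`.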
✦Coming soon
More features arriving
Training configuration updates and Slack alert integrations are currently under active development.
In development