From training loop to dashboard

Track metrics.
Compare runs.
Ship faster.

Runlog is a lightweight, developer-first training monitor. Send metrics from any script with a single logger, watch progress live, and keep a clean history of every experiment.

Streaming: Live updates

Setup: 3 lines of code

Metrics: Auto-detected

Storage: Unlimited history

fineweb-run-042

FineWeb · 69M params · L40S · running

Live tiles: train loss, val loss, grad norm, step, tok/s, gpu mem

Charts: Train loss, Val loss, Learning rate, GPU util

Progress: step 0 / 20,000

Quick start

Three lines
to connect any script.

Works with PyTorch, Hugging Face Trainer, Keras, XGBoost, or anything else that runs in Python. Log any metric; charts appear automatically.

✓ Auto-detected metric charts

✓ Blazingly fast live streaming

✓ Offline-first — zero data loss

✓ Terminal logs captured automatically

✓ Pause / stop from dashboard

✓ Team workspaces with RBAC

✓ Cross-user run comparison

from runlogger import RunLogger

logger = RunLogger(
    base_url="https://runlog.in",
    project_name="FineWeb",
    api_token="rl-gb-...",
    run_name="run-1",
)

# in your training loop
# (train_one_step, scheduler, evaluate, total_steps and eval_every are your own code)
for step in range(total_steps):
    loss = train_one_step()
    logger.log(
        step=step, total_steps=total_steps,
        train_loss=loss.item(),
        lr=scheduler.get_last_lr()[0],
    )
    if step % eval_every == 0:
        logger.log_eval(
            step=step,
            val_loss=evaluate(),
            checkpoint_saved=True,
        )

# mark the run as finished once the loop exits
logger.finish()

Platform features

Everything a training run
demands of you.

From the first forward pass to the final checkpoint — Runlog keeps you informed, in control, and never in the dark about what your model is doing.

⚡ Observability

Realtime metric streaming

Every step, every value — streamed. Charts update live as your model trains. Zero polling, zero lag.

Streaming — Live updates

📡 Reliability

Offline-first SDK

Go offline mid-run, lose your connection, or kill the process — not a single metric or terminal log is lost. Data is buffered locally and synced automatically when you reconnect. Manual sync available via CLI.

Auto-sync · Manual CLI · Zero loss
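The buffer-then-sync idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Runlog SDK's implementation: `BufferedSender`, `fake_send`, and the JSONL file layout are hypothetical names chosen for the example.

```python
import json
import os
import tempfile


class BufferedSender:
    """Hypothetical helper: write every record to disk first, send later."""

    def __init__(self, path, send):
        self.path = path
        self.send = send  # callable(record); raises ConnectionError when offline

    def log(self, record):
        # Append to a local JSONL file so an outage or crash cannot drop it.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def sync(self):
        """Send buffered records in order; keep whatever could not be sent."""
        if not os.path.exists(self.path):
            return 0
        with open(self.path, encoding="utf-8") as f:
            pending = [json.loads(line) for line in f if line.strip()]
        sent = 0
        try:
            for record in pending:
                self.send(record)
                sent += 1
        except ConnectionError:
            pass  # still offline; the remainder stays buffered
        with open(self.path, "w", encoding="utf-8") as f:
            for record in pending[sent:]:
                f.write(json.dumps(record) + "\n")
        return sent


# Demo with a fake transport that can be switched online.
state = {"online": False}
received = []


def fake_send(record):
    if not state["online"]:
        raise ConnectionError("offline")
    received.append(record)


path = os.path.join(tempfile.mkdtemp(), "buffer.jsonl")
sender = BufferedSender(path, fake_send)
sender.log({"step": 0, "train_loss": 2.5})
sender.log({"step": 1, "train_loss": 2.4})
offline_sent = sender.sync()  # nothing leaves the machine while offline
state["online"] = True
online_sent = sender.sync()   # both buffered records are delivered in order
```

Writing to disk before any network call is the design choice that matters here: the record survives even if the process dies between `log` and `sync`.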

🖥 Logging

Live terminal capture

Every print statement, tqdm bar, and framework log is captured automatically and streamed to your dashboard alongside metrics. Nothing extra to configure — it just works.

stdout · stderr · auto-captured
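One common way to capture terminal output without changing the training script is to "tee" standard output: every write goes to the real terminal and to a second sink. The sketch below uses only the standard library; the `Tee` class is a hypothetical name, and how Runlog actually captures output is not described in this page.

```python
import io
import sys
from contextlib import redirect_stdout


class Tee:
    """Hypothetical file-like object that duplicates writes to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


captured = io.StringIO()  # stands in for the upload buffer
with redirect_stdout(Tee(sys.stdout, captured)):
    print("step 10 | loss 2.31")  # still shows in the terminal
```

The same pattern applies to `sys.stderr` via `contextlib.redirect_stderr`.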

💬 Collaboration

Workspace group chat

Every workspace has a built-in group chat. Discuss runs, share observations, and coordinate experiments without leaving the dashboard.

Real-time · workspace-scoped

⚖️ Collaboration

Cross-user run comparison

Load a teammate's run directly into your compare view. Overlay your results against anyone who has shared their project with you — instantly, no export needed.

Cross-user · shared projects

🔔 Alerts

Conditional email alerts

Set threshold rules on any logged metric. Get notified exactly when it matters — not before, not after.

Value goes above or below threshold

Metric stalls for N consecutive steps

Loss spikes above rolling average
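To make the three rule types concrete, here is a minimal sketch of how each could be evaluated over a list of logged values. The function names, the `tol` tolerance, and the `factor` multiplier are assumptions made for illustration; the page does not specify how Runlog defines a stall or a spike.

```python
def above_threshold(value, threshold):
    """Rule 1: the latest value crosses a fixed threshold."""
    return value > threshold


def stalled(values, n, tol=1e-4):
    """Rule 2: the last n values moved by no more than tol in total."""
    if len(values) < n:
        return False
    recent = values[-n:]
    return max(recent) - min(recent) <= tol


def spiked(values, window, factor=1.5):
    """Rule 3: the latest value exceeds factor times the mean of the
    previous `window` values (the rolling average excludes the latest)."""
    if len(values) <= window:
        return False
    previous = values[-window - 1:-1]
    return values[-1] > factor * (sum(previous) / window)


losses = [2.0, 1.8, 1.7, 1.6, 3.2]
threshold_hit = above_threshold(losses[-1], 3.0)
spike_hit = spiked(losses, window=4)
stall_hit = stalled([1.5, 1.5, 1.5], n=3)
```

In this example the final loss of 3.2 trips both the threshold rule and the spike rule, while the flat series trips the stall rule.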

💥 Reliability

Crash detection

Runlog detects training crashes mid-run — Python exceptions, OOM errors, NaN loss — and sends an immediate alert with the last known state.

Instant notification
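A rough sketch of the idea, assuming crashes surface as Python exceptions and that a non-finite loss should be treated as one: wrap the loop, remember the last good state, and report it on failure. `run_guarded` and `notify` are hypothetical names, not part of the Runlog SDK.

```python
import math


def run_guarded(train_step, total_steps, notify):
    """Run the loop; on any failure, report the error and last known state."""
    last_known = None
    try:
        for step in range(total_steps):
            loss = train_step(step)
            if not math.isfinite(loss):
                # Treat NaN or inf loss as a crash instead of training on.
                raise FloatingPointError(f"non-finite loss at step {step}")
            last_known = {"step": step, "loss": loss}
    except Exception as exc:
        notify({"error": repr(exc), "last_known": last_known})
        return False
    return True


alerts = []


def flaky_step(step):
    # Simulated training step that produces NaN at step 3.
    return float("nan") if step == 3 else 1.0


completed = run_guarded(flaky_step, total_steps=10, notify=alerts.append)
```

Here the run stops at step 3 and the alert carries the state from step 2, the last step with a finite loss.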

🧟 Reliability

Dead run detection

Catch silent hangs. If no steps are logged within a configurable timeout window, the run is flagged as dead and you're alerted immediately.

Configurable timeout
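The timeout check itself is a small heartbeat comparison. The sketch below is an assumption about the general technique, with a hypothetical `Heartbeat` class and an injectable clock so the behavior can be shown without waiting.

```python
import time


class Heartbeat:
    """Hypothetical watchdog: a run is dead if no beat arrives within timeout_s."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen = clock()

    def beat(self):
        # Call this whenever a step is logged.
        self.last_seen = self.clock()

    def is_dead(self):
        return self.clock() - self.last_seen > self.timeout_s


# Demo with a fake clock instead of real waiting.
now = [0.0]
hb = Heartbeat(timeout_s=60, clock=lambda: now[0])
now[0] = 30.0
alive_at_30 = not hb.is_dead()
now[0] = 61.0
dead_at_61 = hb.is_dead()
hb.beat()
alive_after_beat = not hb.is_dead()
```

Using a monotonic clock by default avoids false alarms when the system wall clock is adjusted mid-run.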

⏸ Control

Interruptible training

Hit pause from the dashboard. Your training script receives a clean signal to checkpoint and halt.

Zero state loss
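On the script side, a clean pause usually means checking a flag at step boundaries and checkpointing before returning. This is a minimal sketch with hypothetical names (`train`, `do_step`, `save_checkpoint`); a `threading.Event` stands in for whatever signal the dashboard delivers.

```python
import threading


def train(total_steps, do_step, save_checkpoint, stop_event):
    """Run steps until done or until stop_event is set; checkpoint before halting."""
    for step in range(total_steps):
        do_step(step)
        # Only check between steps, so a step is never interrupted halfway.
        if stop_event.is_set():
            save_checkpoint(step)
            return step
    return None  # ran to completion without a pause request


stop = threading.Event()
saved = []


def do_step(step):
    # Simulate the pause button being pressed during step 3.
    if step == 3:
        stop.set()


halted_at = train(10, do_step, saved.append, stop)
```

The loop finishes step 3, saves a checkpoint for it, and halts, so no partially applied step is lost.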

📊 Logging

Dynamic metric logging

Log any key-value pair at any step. Charts are created automatically — no schema, no config. Add new metrics mid-run without restarting.

Auto-detected charts
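Schema-free logging can be pictured as one time series per key, created the first time that key appears. The `MetricStore` class below is a hypothetical illustration of that idea, not the SDK's internals.

```python
from collections import defaultdict


class MetricStore:
    """Hypothetical store: one (step, value) series per metric name."""

    def __init__(self):
        self.series = defaultdict(list)

    def log(self, step, **metrics):
        # Any keyword becomes a metric; unseen names start a new series.
        for name, value in metrics.items():
            self.series[name].append((step, float(value)))


store = MetricStore()
store.log(0, train_loss=2.5)
store.log(1, train_loss=2.4, grad_norm=0.9)  # new metric added mid-run
chart_names = sorted(store.series)
```

After the second call there are two series, `grad_norm` and `train_loss`, with no schema declared up front.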

📝 Annotation

Run notes & observations

Write markdown notes directly on a run. Pin observations at a specific step. Annotate what changed between experiments before you forget.

Markdown · step-anchored

👥 Collaboration

Team spaces

Invite teammates into a shared workspace. Assign roles with granular action-level control — not just who can view, but exactly what each member can create, edit, delete, or manage.

Action-level RBAC · shared projects

⚖️ Analysis

Side-by-side run comparison

Overlay multiple runs on the same chart axes. Evaluate training methods, hyperparameter sweeps, and architecture changes at a glance.

Overlay · diff view

🔗 Sharing

Publicly shareable links

Generate a public read-only URL for any run. Share results with collaborators, reviewers, or your audience — no account required to view.

Public · read-only · revocable

💾 Checkpoints

Checkpoint ledger

Tag checkpoint paths directly in the dashboard with their exact metric snapshot — val loss, perplexity, step. Find your best checkpoint without digging through logs.

Path · metrics · step

✦ Coming soon

More features arriving

Training configuration updates and Slack alert integrations are currently under active development.

Have an idea? Suggest a feature →

In development

What you can track live

Train loss · decaying

Val loss

Perplexity · exp(val_loss)

Tokens / s · throughput

GPU util · auto-logged

Learning rate · cosine decay

Grad norm · clipped at 1.0

GPU mem · out of 48 GB
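The perplexity tile above is labeled exp(val_loss). As a worked example, assuming `val_loss` is the mean per-token cross-entropy measured in nats, the conversion is a single call (the `perplexity` helper is a name chosen for this sketch):

```python
import math


def perplexity(val_loss):
    """Perplexity from mean per-token cross-entropy in nats."""
    return math.exp(val_loss)


ppl = perplexity(3.0)  # a validation loss of 3.0 gives roughly 20.09
```

If the loss is reported in bits rather than nats, the base changes to 2 (`2 ** val_loss`).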
