Documentation
RunLogger Docs
Installation
pip install runlog-sdk

Or install from source:

pip install git+https://github.com/runlog-in/runlog-sdk.git
Quick Start
from runlogger import RunLogger

logger = RunLogger(
    base_url      = "https://runlog.in",
    project_name  = "my-project",              # created automatically if missing
    api_token     = "rl-gb-...",               # Dashboard → API Tokens
    run_name      = "run-1",                   # optional — auto-generated if omitted
    config        = {"model": "gpt2", "params": "125M"},
    tags          = ["baseline", "v1"],
    offline_mode  = True,                      # preserve data if connection drops
)

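best = float("inf")                            # lowest validation loss seen so far

for step in range(1000):
    loss = train_one_step()

    # get_last_lr() returns one value per parameter group; log the first as a float
    logger.log(step=step, total_steps=1000, loss=loss, lr=scheduler.get_last_lr()[0])

    if step % 100 == 0:
        val_loss = evaluate()
        logger.log_eval(step=step, val_loss=val_loss, is_best=val_loss < best)
        best = min(best, val_loss)

    if logger.should_pause():
        save_checkpoint(step)
        logger.finish("paused")
        break
else:
    logger.finish()                            # runs only when the loop ends without break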
API Reference
RunLogger()

Creates a new run and connects to the dashboard.

Parameter         Type  Description
base_url          str   Dashboard URL, e.g. https://runlog.in
api_token         str   Your API token from Dashboard → API Tokens
project_name      str   Project name; auto-created if missing
run_name          str   Name for this run; auto-generated if not provided (e.g. cosmic-nebula-42)
config            dict  Hyperparameters and metadata. Visible on the run page.
start_step        int   Step to start from. Use when resuming from a checkpoint. Default: 0.
tags              list  Run tags, e.g. ["baseline", "fp16", "v2"]
notes             str   Free-text description of the run.
log_system_stats  bool  Auto-attach GPU/CPU/RAM stats to every log() call. Default: True.
offline_mode      bool  Preserve data locally if connection is unavailable. Syncs automatically on reconnect. Default: True. Requires a supported plan.
capture_terminal  bool  Capture stdout/stderr and stream terminal output to the dashboard alongside metrics. Default: True.
verbose           bool  Print internal debug info: packet counts, sync intervals, orphan recovery detail. Default: False.
metrics           list  Optional list of metric names you plan to log. Metrics are tracked automatically, so this is rarely needed. Default: [].
logger = RunLogger(
    base_url         = "https://runlog.in",
    api_token        = "rl-gb-...",
    project_name     = "llm-pretraining",
    run_name         = "gpt2-run-3",
    config           = {"params": "125M", "batch_size": 32, "max_steps": 50000},
    start_step       = 5000,
    tags             = ["fp16", "warmup-cosine"],
    notes            = "Resume from best checkpoint, new LR schedule",
    log_system_stats = True,
    offline_mode     = True,
)
logger.log(step, **kwargs)

Log training metrics at the current step. Pass any keyword arguments — each becomes a chart on the dashboard. total_steps enables the progress bar. Buffering and rate limiting are handled automatically.

logger.log(
    step           = step,
    total_steps    = total_steps,
    train_loss     = loss.item(),
    lr             = scheduler.get_last_lr()[0],
    tokens_per_sec = tokens_per_sec,
    total_tokens   = step * batch_size * seq_len,
    eta_seconds    = (total_steps - step) * step_time,
)
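The example above uses step_time and tokens_per_sec without showing where they come from. One possible way to derive them is sketched below. StepTimer is a hypothetical helper, not part of the SDK; it smooths the wall-clock time per step with an exponential moving average so the ETA does not jump around.

```python
import time


class StepTimer:
    """Hypothetical helper (not part of the SDK): smoothed seconds per step."""

    def __init__(self, alpha=0.1, clock=time.perf_counter):
        self.alpha = alpha          # weight given to the newest measurement
        self.clock = clock          # injectable so the timer can be tested
        self.step_time = None       # smoothed seconds per step
        self._last = None           # timestamp of the previous tick

    def tick(self):
        """Call once at the end of every training step."""
        now = self.clock()
        if self._last is not None:
            dt = now - self._last
            if self.step_time is None:
                self.step_time = dt
            else:
                self.step_time = self.alpha * dt + (1 - self.alpha) * self.step_time
        self._last = now
        return self.step_time

    def eta_seconds(self, step, total_steps):
        """Remaining steps times the smoothed step time (None before two ticks)."""
        if self.step_time is None:
            return None
        return (total_steps - step) * self.step_time
```

Once timer.step_time is available, tokens_per_sec can be computed as batch_size * seq_len / timer.step_time. Skip both metrics while the value is still None, since metric values must be numeric.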
logger.log_eval(step, **kwargs)

Log evaluation metrics. Tracked separately from training metrics on the dashboard. Pass is_best=True to flag the current best checkpoint.

logger.log_eval(
    step             = step,
    val_loss         = val_loss,
    ppl              = math.exp(val_loss),
    accuracy         = accuracy,
    is_best          = is_best,
    checkpoint_saved = is_best,
    checkpoint_path  = "checkpoints/best.pt" if is_best else None,
)
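The is_best flag has to be computed by your own code. A minimal sketch of that bookkeeping follows; BestTracker and eval_metrics are illustrative names, not SDK functions, and the sketch assumes that a lower val_loss is better.

```python
import math


class BestTracker:
    """Hypothetical helper: remembers the lowest validation loss seen so far."""

    def __init__(self):
        self.best = math.inf

    def update(self, val_loss):
        """Return True when val_loss beats the previous best, and record it."""
        is_best = val_loss < self.best
        if is_best:
            self.best = val_loss
        return is_best


def eval_metrics(val_loss, tracker):
    """Build keyword arguments suitable for a log_eval() call."""
    return {
        "val_loss": val_loss,
        "ppl": math.exp(val_loss),          # perplexity from a mean cross-entropy loss
        "is_best": tracker.update(val_loss),
    }
```

Usage would look like logger.log_eval(step=step, **eval_metrics(val_loss, tracker)).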
logger.log_artifact(path, name, type, metadata=None)

Upload a file artifact attached to the run. Artifacts appear in the run's Artifacts panel. Supported types: model | dataset | image | file.

Parameter  Type  Description
path       str   Local path to the file.
name       str   Display name on the dashboard, e.g. "best-model".
type       str   model | dataset | image | file
metadata   dict  Optional key-value info, e.g. {"val_loss": 0.42, "step": 5000}.
logger.log_artifact("checkpoints/best.pt",
                    name="best-model", type="model",
                    metadata={"val_loss": 0.42, "step": 5000})

logger.log_artifact("data/train.csv",
                    name="training-data", type="dataset",
                    metadata={"rows": 50000})

logger.log_artifact("outputs/confusion_matrix.png",
                    name="confusion-matrix", type="image")
logger.should_pause()

Returns True if a pause was triggered from the dashboard. Call once per step — the flag clears automatically after being read.

if logger.should_pause():
    save_checkpoint(step)
    logger.finish("paused")
    sys.exit(0)
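Because the flag clears when it is read, a second should_pause() call in the same step would see False. The sketch below illustrates this with StubLogger, a stand-in written for this example only (it is not the SDK class, and its default finish status is an assumption): read the flag once per step and act on the stored result.

```python
class StubLogger:
    """Stand-in for RunLogger, used only to illustrate the read-once flag."""

    def __init__(self):
        self._pause_requested = False
        self.status = "running"

    def request_pause(self):
        # In the real service this would be triggered from the dashboard.
        self._pause_requested = True

    def should_pause(self):
        # Read-once semantics as described above: reading clears the flag.
        flag, self._pause_requested = self._pause_requested, False
        return flag

    def finish(self, status="completed"):
        self.status = status


def train(logger, total_steps, pause_at=None):
    """Run a dummy loop and return the last step that was executed."""
    for step in range(total_steps):
        if step == pause_at:
            logger.request_pause()      # simulate a pause click on the dashboard

        pause = logger.should_pause()   # read exactly once, keep the result
        if pause:
            logger.finish("paused")
            return step
    logger.finish("completed")
    return total_steps - 1
```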
logger.finish(status)

Mark the run as done. Always call this at the end of your script. Status options: completed | crashed | paused. Waits up to 10 seconds for any pending data before closing.

Context Manager

Automatically calls finish("completed") on normal exit and finish("crashed") if an exception is raised.

with RunLogger(...) as logger:
    for step in range(steps):
        logger.log(step=step, loss=loss)
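To make the exit behaviour concrete, here is a small stand-in class that follows the same contract. RunSketch is hypothetical and says nothing about how the SDK is implemented; it only shows a context manager that records "completed" or "crashed" and lets the exception propagate.

```python
class RunSketch:
    """Illustrative stand-in, not the SDK class: mirrors the documented exit behaviour."""

    def __init__(self):
        self.status = "running"

    def finish(self, status):
        self.status = status

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.finish("crashed" if exc_type is not None else "completed")
        return False   # a falsy return value lets the exception propagate
```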
Offline Mode

RunLogger's offline mode is designed for real-world training conditions where connections are unreliable. Enable it once — everything else is automatic.

logger = RunLogger(
    ...,
    offline_mode = True,   # default
)
What it does

When offline_mode=True:

Mid-run disconnect: Training continues uninterrupted. All data is preserved locally and synced automatically when the connection is restored, in order and with no gaps.
Start offline: You can start training with no connection at all. Everything is buffered locally and uploaded on the next successful connection.
Crashed or killed runs: If your process is killed mid-training, all data logged before the crash is preserved. The next time you start a run from the same directory, it is recovered and synced automatically, with no manual steps.
Plan limit mid-run: If your daily log limit is reached, data that could not be uploaded is held locally. It is automatically uploaded the next day when your limit resets.
Terminal logs: All terminal output is captured and streamed to the dashboard in real time. If offline, chunks are stored locally and flushed on reconnect.
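The behaviours above follow a general store-and-forward pattern: write every record to local storage first, then upload in order and mark what was sent. The sketch below shows that pattern with SQLite. It is an illustration under assumptions, not the SDK's storage format; LocalBuffer, its table layout, and the send callback are all hypothetical.

```python
import json
import sqlite3


class LocalBuffer:
    """Store-and-forward queue sketch. Not the SDK's actual storage format."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS packets ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "payload TEXT NOT NULL, "
            "synced INTEGER NOT NULL DEFAULT 0)"
        )
        self.db.commit()

    def append(self, record):
        """Persist one record before any upload is attempted."""
        self.db.execute("INSERT INTO packets (payload) VALUES (?)", (json.dumps(record),))
        self.db.commit()

    def flush(self, send):
        """Upload unsynced records in insertion order; stop at the first failure."""
        rows = self.db.execute(
            "SELECT id, payload FROM packets WHERE synced = 0 ORDER BY id"
        ).fetchall()
        sent = 0
        for row_id, payload in rows:
            if not send(json.loads(payload)):
                break                       # still offline: keep the rest for later
            self.db.execute("UPDATE packets SET synced = 1 WHERE id = ?", (row_id,))
            self.db.commit()
            sent += 1
        return sent
```

Stopping at the first failed send keeps the upload order intact, and marking rows as synced makes a repeated flush harmless.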
Plan requirement

Offline mode requires a supported plan. If your plan does not include it, it is disabled automatically at startup with a warning. If your plan is upgraded mid-run, offline mode activates immediately — no restart needed.

When to use
Scenario                          Recommended setting
Long training runs                offline_mode=True
Unstable or intermittent network  offline_mode=True
Short scripts, stable connection  Either
No local disk writes allowed      offline_mode=False
PyTorch
logger = RunLogger(
    base_url     = "https://runlog.in",
    api_token    = "rl-gb-...",
    project_name = "my-project",
    run_name     = "pytorch-run",
    config       = {"arch": "gpt2", "batch_size": batch_size},
    offline_mode = True,
)

best_loss = float("inf")                       # lowest validation loss seen so far

try:
    for step in range(total_steps):
        optimizer.zero_grad()                  # clear gradients from the previous step
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()

        logger.log(
            step           = step,
            total_steps    = total_steps,
            train_loss     = loss.item(),
            lr             = scheduler.get_last_lr()[0],
            tokens_per_sec = batch_size * seq_len / step_time,
        )

        if step % eval_every == 0:
            val_loss = evaluate(model, val_loader)
            is_best  = val_loss < best_loss
            if is_best:
                best_loss = val_loss
                torch.save(model.state_dict(), "best.pt")
            logger.log_eval(step=step, val_loss=val_loss, is_best=is_best,
                            checkpoint_path="best.pt" if is_best else None)

        if logger.should_pause():
            torch.save(model.state_dict(), f"pause_{step}.pt")
            logger.finish("paused")
            break
    else:
        logger.finish("completed")             # reached only when the loop ends without break
except Exception:
    logger.finish("crashed")
    raise

For multi-GPU / DDP training, log only from rank 0:

if rank == 0:
    logger.log(step=step, loss=loss)
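The snippet above assumes rank is already defined. When the job is launched with torchrun, each worker receives its global rank in the RANK environment variable, so one way to obtain it is sketched below. The helper name is_main_process is an illustration, not an SDK function.

```python
import os


def is_main_process(environ=None):
    """True on global rank 0, or when RANK is unset (single-process run)."""
    env = os.environ if environ is None else environ
    return int(env.get("RANK", "0")) == 0
```

Since each RunLogger instance creates its own run, it is probably also worth constructing the logger only when is_main_process() returns True.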
HuggingFace Trainer
from runlogger import RunLogger
from transformers import Trainer, TrainerCallback

class RunLoggerCallback(TrainerCallback):
    def __init__(self, logger):
        self.logger = logger

    def on_log(self, args, state, control, logs=None, **kwargs):
        if logs:
            self.logger.log(step=state.global_step,
                            total_steps=state.max_steps, **logs)

    def on_evaluate(self, args, state, control, metrics=None, **kwargs):
        if metrics:
            self.logger.log_eval(step=state.global_step, **metrics)

    def on_train_end(self, args, state, control, **kwargs):
        self.logger.finish()

# usage:
logger  = RunLogger(..., offline_mode=True)
trainer = Trainer(..., callbacks=[RunLoggerCallback(logger)])
Keras / TensorFlow
import tensorflow as tf
from runlogger import RunLogger

class RunLoggerCallback(tf.keras.callbacks.Callback):
    def __init__(self, logger, total_epochs):
        self.logger       = logger
        self.total_epochs = total_epochs

    def on_epoch_end(self, epoch, logs=None):
        self.logger.log(step=epoch, total_steps=self.total_epochs, **(logs or {}))

    def on_train_end(self, logs=None):
        self.logger.finish()

# usage:
logger = RunLogger(..., offline_mode=True)
model.fit(X, y, epochs=50, callbacks=[RunLoggerCallback(logger, total_epochs=50)])
XGBoost
import xgboost as xgb
from runlogger import RunLogger

class RunLoggerXGBCallback(xgb.callback.TrainingCallback):
    def __init__(self, logger, total_rounds):
        self.logger       = logger
        self.total_rounds = total_rounds

    def after_iteration(self, model, epoch, evals_log):
        metrics = {}
        for data, metric_dict in evals_log.items():
            for name, vals in metric_dict.items():
                metrics[f"{data}_{name}"] = vals[-1]
        self.logger.log(step=epoch, total_steps=self.total_rounds, **metrics)
        return False

# usage:
logger = RunLogger(..., offline_mode=True)
bst    = xgb.train(params, dtrain, num_boost_round=100,
                   evals=[(dval, "val")],
                   callbacks=[RunLoggerXGBCallback(logger, 100)])
Artifacts

Log any file as an artifact — models, datasets, plots, configs. Artifacts appear in the run's Artifacts panel and stay associated with the run permanently.

Type     Use for
model    Model weights or checkpoints (.pt, .pkl, .onnx, …)
dataset  Training or evaluation data files (.csv, .jsonl, …)
image    Plots, confusion matrices, sample outputs
file     Configs, logs, or any other file
# model checkpoint
logger.log_artifact("checkpoints/best.pt",
                    name="best-model", type="model",
                    metadata={"val_loss": 0.42, "step": 5000})

# dataset
logger.log_artifact("data/train.csv",
                    name="training-data", type="dataset",
                    metadata={"rows": 50000, "source": "FineWeb"})

# evaluation plot
logger.log_artifact("outputs/confusion_matrix.png",
                    name="confusion-matrix", type="image")
Automatic System Stats

When optional packages are installed, RunLogger automatically appends hardware metrics to every log() call. These appear as charts alongside your training metrics — no extra code needed.

Metric         Package  Description
gpu_util       pynvml   GPU utilization (%)
gpu_mem_used   pynvml   GPU memory used (MB)
gpu_mem_total  pynvml   Total GPU memory (MB)
cpu_util       psutil   CPU utilization (%)
ram_used       psutil   RAM used (MB)
ram_total      psutil   Total system RAM (MB)
# install optional dependencies
pip install pynvml psutil

# disable if not needed
logger = RunLogger(..., log_system_stats=False)

Stats are collected from GPU 0. If no GPU is present only CPU/RAM metrics are logged. If neither package is installed, system stats are silently skipped.
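For reference, the same six values can be read directly with psutil and pynvml. The function below is a rough sketch of such a collector, not the SDK's code; it assumes 1 MB = 1024 * 1024 bytes, and like the behaviour described above it skips whatever is unavailable.

```python
def collect_system_stats():
    """Best-effort hardware snapshot; anything unavailable is skipped silently."""
    stats = {}
    mb = 1024 * 1024   # assumption: MB means mebibytes here

    try:
        import psutil
        mem = psutil.virtual_memory()
        stats["cpu_util"] = psutil.cpu_percent(interval=None)
        stats["ram_used"] = mem.used / mb
        stats["ram_total"] = mem.total / mb
    except Exception:
        pass   # psutil missing or not usable on this platform

    try:
        import pynvml
        pynvml.nvmlInit()
        try:
            handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # GPU 0 only
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            gpu_mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            stats["gpu_util"] = util.gpu
            stats["gpu_mem_used"] = gpu_mem.used / mb
            stats["gpu_mem_total"] = gpu_mem.total / mb
        finally:
            pynvml.nvmlShutdown()
    except Exception:
        pass   # pynvml missing, or no NVIDIA driver/GPU on this machine

    return stats
```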

Collaboration

Pro and Elite plans support team workspaces. Open Workspace in the sidebar to create a workspace, invite teammates by email, and share projects across your organization.

Each member has one of three roles:

Role    Permissions
admin   Manage members and all projects
member  Create and edit projects, view all projects
viewer  Read only
Plans

Plans and limits are managed from the dashboard's Plans page. Upgrade or downgrade at any time — changes take effect immediately, even mid-run.

Daily log limit: RunLogger warns you when reached and resumes automatically the next day.
Max metrics tracked: Metric keys beyond your plan's limit are ignored.
Log rate: Data is accepted at the rate your plan allows. The most recent value always gets through.
Offline mode: Available on supported plans. Activates and deactivates automatically with plan changes.
Team workspaces: Available on Pro and Elite plans.
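One way to read the log-rate rule is as a "latest wins" throttle: values arriving faster than the allowed rate are dropped, except that the newest one is held until it can be sent. The sketch below shows that interpretation only. The actual limits are enforced by the service, and LatestWinsThrottle is a hypothetical name.

```python
import time


class LatestWinsThrottle:
    """Sketch of a rate limit that drops intermediate values but keeps the newest."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval   # seconds between accepted sends
        self.clock = clock                 # injectable for testing
        self._last_sent = None
        self.pending = None                # newest value not yet sent

    def offer(self, value):
        """Return the value to send now, or None if it has to wait."""
        now = self.clock()
        if self._last_sent is None or now - self._last_sent >= self.min_interval:
            self._last_sent = now
            self.pending = None
            return value
        self.pending = value               # overwrite: older pending values are dropped
        return None

    def drain(self):
        """Return the held-back value (if any), for example at the end of the run."""
        value, self.pending = self.pending, None
        return value
```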
Terminal Capture

When capture_terminal=True (the default), RunLogger intercepts all stdout and stderr output from your training script and streams it to the dashboard in real time alongside your metrics. No extra code needed — print() statements, tqdm progress bars, and framework logs all appear automatically.

logger = RunLogger(
    ...,
    capture_terminal = True,   # default; streams all print() output to dashboard
)
Offline behaviour

If the connection drops mid-run, terminal chunks are stored locally and flushed to the dashboard on reconnect — in order, with no gaps. This requires offline_mode=True.

Disable if needed
logger = RunLogger(
    ...,
    capture_terminal = False,  # raw stdout only, nothing sent to dashboard
)

Disable if your script produces extremely high-frequency output that you don't need on the dashboard, or if you're running in an environment where stdout redirection is not allowed.
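If stdout redirection is unfamiliar, the general technique is a "tee": a stream object that forwards every write to the real terminal and also hands a copy to something else. The sketch below demonstrates the idea with the standard library. Tee and run_captured are illustrative names and do not describe the SDK's internals.

```python
import contextlib
import io


class Tee(io.TextIOBase):
    """Write-through stream: forwards text to the original stream and keeps a copy."""

    def __init__(self, original, sink):
        self.original = original    # where output normally goes
        self.sink = sink            # any callable that accepts a str chunk

    def write(self, text):
        self.original.write(text)
        self.sink(text)
        return len(text)

    def flush(self):
        self.original.flush()


def run_captured(fn, original, sink):
    """Run fn() with print() output mirrored to both original and sink."""
    with contextlib.redirect_stdout(Tee(original, sink)):
        fn()
```

Environments that forbid replacing sys.stdout will reject this kind of wrapper, which is the case where capture_terminal=False is the safer setting.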

Manual Sync CLI

If a run was interrupted and you want to sync its locally buffered data without starting a new run, use the runlogger-sync command:

# scan the default dumps/ directory
runlogger-sync

# scan a specific directory
runlogger-sync --dir /path/to/runs

# sync one specific file
runlogger-sync --file dumps/.runlog_abc123.db

# show full debug output
runlogger-sync --verbose
runlogger-sync -v
Options
Option         Description
--dir          Directory to scan for offline DB files. Default: dumps/
--file         Sync a single specific DB file directly.
--base-url     Server URL, used as a fallback if not stored in the DB. Can also be set via RUNLOGGER_URL.
--token        API token, used as a fallback if not stored in the DB. Can also be set via RUNLOGGER_TOKEN.
--verbose, -v  Show full debug detail: run IDs, per-batch info, log uploads.
Environment variables
export RUNLOGGER_URL=https://runlog.in
export RUNLOGGER_TOKEN=rl-...
runlogger-sync
Notes

The token and server URL are stored inside each DB file, so you usually don't need to pass them manually. Safe to run multiple times — already-synced packets are skipped automatically. Unrecoverable DB files (missing token or payload) are discarded silently.

Auto Run Names

If you don't provide a run_name, one is generated automatically in the format adjective-noun-number:

cosmic-nebula-42
silver-ridge-317
eager-summit-5

Names are readable, memorable, and unique at any practical project scale. You'll see them on the dashboard and in logs. To use a fixed name instead:

logger = RunLogger(
    ...,
    run_name = "gpt2-baseline-run3",
)
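If you want names in the same adjective-noun-number shape for your own tooling (for example, to name local checkpoint folders before a run exists), a generator can be sketched in a few lines. The word lists and the 1 to 999 number range below are placeholders chosen for this example; the vocabulary and range used by the SDK are not documented here.

```python
import random

# Placeholder word lists; the SDK's actual vocabulary is not documented here.
ADJECTIVES = ["cosmic", "silver", "eager"]
NOUNS = ["nebula", "ridge", "summit"]


def make_run_name(rng=random):
    """Return a name shaped like adjective-noun-number, e.g. cosmic-nebula-42."""
    return f"{rng.choice(ADJECTIVES)}-{rng.choice(NOUNS)}-{rng.randint(1, 999)}"
```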
Tags & Notes
Tags

Tags appear on the dashboard and can be used to filter and group runs across a project. Pass any list of strings.

logger = RunLogger(
    ...,
    tags = ["baseline", "bf16", "fineweb", "v2"],
)
Notes

Free-text notes visible on the run detail page. Useful for recording what you're testing in this run.

logger = RunLogger(
    ...,
    notes = "Testing SwiGLU vs GELU: same LR schedule, different FFN.",
)
Error Handling
Errors raised at startup

These are raised immediately as RuntimeError before training begins:

RuntimeError: Invalid API token: rl-...
RuntimeError: [Runlog] account is banned.
Everything else degrades gracefully
Connection lost mid-run: Data is preserved locally if offline_mode=True, retried automatically on reconnect.
Upload failure: Logged to console. Training continues unaffected.
Plan limit reached: RunLogger warns you and stops logging for the rest of the day. Resets at midnight.
Recommended pattern
try:
    with RunLogger(...) as logger:
        for step in range(max_steps):
            loss = train()
            logger.log(step=step, loss=loss)
except RuntimeError as e:
    print(f"RunLogger error: {e}")
    # continue training without logging, or exit
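For the "continue training without logging" branch, one option is to swap in a no-op object with the same method names so the training loop needs no extra conditionals. This is a sketch: NullLogger and make_logger are hypothetical helpers, and only the method names are taken from the reference above.

```python
class NullLogger:
    """No-op stand-in exposing the same method names as the documented logger."""

    def log(self, step, **metrics):
        pass

    def log_eval(self, step, **metrics):
        pass

    def should_pause(self):
        return False

    def finish(self, status="completed"):
        pass


def make_logger(factory, **kwargs):
    """Return factory(**kwargs), or a NullLogger if startup raises RuntimeError."""
    try:
        return factory(**kwargs)
    except RuntimeError as err:
        print(f"RunLogger unavailable, continuing without logging: {err}")
        return NullLogger()
```

In a training script the call would be make_logger(RunLogger, base_url=..., api_token=..., project_name=...). Wrapping only the constructor also keeps a RuntimeError raised by the training code itself from being mistaken for a logging error.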
Verbose mode

Pass verbose=True to see internal detail — packet counts, sync intervals, orphan run recovery. Useful for diagnosing connection or sync issues.

logger = RunLogger(..., verbose=True)
FAQ
Do I need to call finish() if I use the context manager?

No — it is called automatically. Normal exit calls finish("completed"). An exception calls finish("crashed"). The exception is not suppressed.

What if I forget finish()?

The run stays marked as running on the dashboard indefinitely. Always call finish() or use the context manager.

Can I use RunLogger with multi-GPU / DDP training?

Yes. Log only from rank 0 to avoid duplicate data:

if rank == 0:
    logger.log(step=step, loss=loss)
Can I log string values as metrics?

No — metric values must be int, float, or bool. Pass strings in config, tags, or notes instead.
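When metrics come from a framework dictionary, it can help to filter them before calling log(). The helper below is a hypothetical sketch, not an SDK function: it converts scalar objects that expose .item() (such as 0-d tensors or NumPy scalars) to Python numbers and sets aside everything that is not int, float, or bool.

```python
def split_metrics(raw):
    """Separate loggable numeric values from everything else."""
    metrics, rejected = {}, {}
    for key, value in raw.items():
        # Scalar tensors and NumPy scalars expose .item(); convert to a Python number.
        if not isinstance(value, (bool, int, float)) and hasattr(value, "item"):
            try:
                value = value.item()
            except (TypeError, ValueError, RuntimeError):
                pass   # not a single-element value; leave it to be rejected below
        if isinstance(value, (bool, int, float)):
            metrics[key] = value
        else:
            rejected[key] = value
    return metrics, rejected
```

The numeric part can then be passed as logger.log(step=step, **metrics), while rejected string values are candidates for config, tags, or notes.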

Can I have multiple loggers in one script?

Yes. Each RunLogger instance is independent and creates its own run.

Does RunLogger affect training performance?

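Not noticeably. Logging calls are non-blocking, so the training loop does not wait on network uploads. The one call that blocks is finish(), which waits up to 10 seconds for pending data before closing.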

What if my machine is killed mid-training?

If offline_mode=True, all data logged before the crash is preserved and recovered automatically the next time you start a run from the same directory. No manual steps required.

Where are offline DB files stored?

In a dumps/ directory relative to where your training script runs. Files are named .runlog_<run_id>.db and cleaned up automatically after a successful sync.
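To check whether any unsynced files are left over, you can list them yourself. The sketch below relies only on the directory and naming scheme described above; find_offline_dbs is a hypothetical helper and does not open or modify the files.

```python
from pathlib import Path


def find_offline_dbs(directory="dumps"):
    """Return leftover .runlog_<run_id>.db files, keyed by run id."""
    root = Path(directory)
    if not root.is_dir():
        return {}
    found = {}
    for path in sorted(root.glob(".runlog_*.db")):
        run_id = path.name[len(".runlog_"):-len(".db")]
        found[run_id] = path
    return found
```

This only lists files; uploading them is still the job of runlogger-sync or the next run started from the same directory.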

How do I debug connection or sync issues?

Pass verbose=True to RunLogger(...) to see full internal detail, or use runlogger-sync --verbose for manual sync debugging.

Can I use RunLogger with a self-hosted Runlog instance?

RunLogger is designed exclusively for use with runlog.in. Self-hosted deployments are not supported. Set base_url to https://runlog.in.

Runlog
The training monitor — lightweight, self-hosted, beautiful.
Real-time Streaming
Live metric updates. Watch your loss curve move as training happens.
Multi-run Compare
Overlay train and val loss across runs on a single chart. Spot the best experiment instantly.
Team Workspaces
Invite teammates, assign roles, and share projects across your organization.
API Token Auth
Per-project tokens let you log from any machine — Colab, cloud, local — securely.
Checkpoint Tracking
Automatically flags best checkpoints and logs artifact paths alongside your metrics.
Metric Alerts
Set alerts for loss plateaus, threshold crossings, and more. Never miss a crashed run.
Dynamic Charts
Auto-detected from whatever you log. Drag to reorder. Smooth with a slider.
Runs Table
Filter and sort all runs by status, tag, or loss. Built for large experiment histories.
System Stats
Automatic CPU utilization and RAM tracking logged alongside your metrics every step.
Offline Mode
Log runs without a connection. Metrics are queued locally and synced when you're back online.
Terminal Capture
Mirrors stdout and stderr into your run log. Every print and warning saved automatically.
© 2026 Runlog (runlog.in) All rights reserved.