About Weights & Biases (W&B)
Weights & Biases is an ML platform for experiment tracking, model and dataset versioning, evaluation, and observability across the model lifecycle. It includes tools for logging experiments, comparing model versions, and tracing model inputs/outputs, plus Weave, purpose-built evaluation tooling for LLMs and agentic systems.
Key Features
- Experiment tracking & logging — track metrics, hyperparameters, artifacts and runs
- Model & dataset registry/versioning — store and compare model versions and datasets
- Evaluation tooling (Weave) and detailed tracing — scorers, judges and full input/output traces
- Rich visualization & collaboration — tables, reports, and integrations with common ML stacks
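To make the experiment tracking feature concrete, here is a minimal sketch of what logging a run might look like with the `wandb` Python client. The toy training loop, the synthetic loss, the helper names `simulate_training` and `log_to_wandb`, and the project name `demo-project` are all assumptions made for illustration; none of them come from W&B itself. Only `wandb.init`, `run.log`, and `run.finish` are calls from the client library.

```python
import math


def simulate_training(log, epochs=5, lr=0.1):
    """Toy training loop (hypothetical): reports one metrics dict per epoch via `log`."""
    history = []
    for epoch in range(1, epochs + 1):
        # Synthetic loss that shrinks every epoch; stands in for a real model.
        loss = math.exp(-lr * epoch)
        metrics = {"epoch": epoch, "loss": loss}
        log(metrics)
        history.append(metrics)
    return history


def log_to_wandb():
    """Send the toy run to W&B. Assumes `pip install wandb` has been done."""
    import wandb

    config = {"epochs": 5, "lr": 0.1}  # hyperparameters recorded with the run
    # mode="offline" keeps everything local, so no account or network is needed.
    run = wandb.init(project="demo-project", config=config, mode="offline")
    simulate_training(run.log, epochs=config["epochs"], lr=config["lr"])
    run.finish()
```

Calling `log_to_wandb()` would record five steps of `epoch` and `loss` against the run's config. Passing the logger in as a plain callable keeps the loop independent of W&B, so the same function can be exercised with `list.append` in a unit test.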
Use Cases & Best For
About Model Evaluation
Test and evaluate AI models