add new docs for jobs and add index
lhoestq committed Dec 10, 2025
commit d1809cd4076ce3d1a9e038e9a05bdb36c50f0d21
41 changes: 41 additions & 0 deletions docs/jobs/_toctree.yml
@@ -0,0 +1,41 @@
- local: index
  title: Hugging Face Jobs

- title: Overview
  sections:
  - local: index
    title: Hugging Face Jobs
  - local: quickstart
    title: Quickstart
  - local: docker
    title: Docker
  - local: schedule
    title: Schedule Jobs
  - local: webhooks
    title: Webhook Automation
  - local: pricing
    title: Pricing and Billing

- title: Tutorials
  sections:
  - title: Training
    sections:
    - local: training1
      title: Training Tuto 1
  - title: Inference
    sections:
    - local: inference1
      title: Inference Tuto 1
  - title: Data
    sections:
    - local: data1
      title: Data Tuto 1

- title: Guides
  sections:
  - local: manage
    title: Manage Jobs
  - local: configuration
    title: Configuration
  - local: frameworks
    title: Frameworks Setups
Empty file added docs/jobs/configuration.md
Empty file.
Empty file added docs/jobs/data1.md
Empty file.
Empty file added docs/jobs/docker.md
Empty file.
Empty file added docs/jobs/frameworks.md
Empty file.
11 changes: 11 additions & 0 deletions docs/jobs/index.md
@@ -0,0 +1,11 @@
# Hugging Face Jobs

Run compute jobs on Hugging Face infrastructure with a familiar UV & Docker-like interface!

<div class="-mt-3 grid grid-cols-2 rounded-xl border lg:grid-cols-4"><div class="border-r p-4 max-lg:border-b"><h3 class="flex items-center gap-1.5 font-semibold"><svg class="text-green-500 flex-none" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 13 13"><path d="M5.22433 7.95134L3.91933 6.64634C3.80933 6.53634 3.67433 6.48134 3.51433 6.48134C3.35433 6.48134 3.21433 6.54134 3.09433 6.66134C2.98433 6.77134 2.92933 6.91134 2.92933 7.08134C2.92933 7.25134 2.98433 7.39134 3.09433 7.50134L4.80433 9.21134C4.91433 9.32134 5.05433 9.37634 5.22433 9.37634C5.39433 9.37634 5.53433 9.32134 5.64433 9.21134L9.04933 5.80634C9.15933 5.69634 9.21433 5.56134 9.21433 5.40134C9.21433 5.24134 9.15433 5.10134 9.03433 4.98134C8.92433 4.87134 8.78433 4.81634 8.61433 4.81634C8.44433 4.81634 8.30433 4.87134 8.19433 4.98134L5.22433 7.95134ZM6.06433 12.8713C5.23433 12.8713 4.45433 12.7137 3.72433 12.3985C2.99433 12.0837 2.35933 11.6563 1.81933 11.1163C1.27933 10.5763 0.851931 9.94134 0.537131 9.21134C0.221931 8.48134 0.0643311 7.70134 0.0643311 6.87134C0.0643311 6.04134 0.221931 5.26134 0.537131 4.53134C0.851931 3.80134 1.27933 3.16634 1.81933 2.62634C2.35933 2.08634 2.99433 1.65874 3.72433 1.34354C4.45433 1.02874 5.23433 0.871338 6.06433 0.871338C6.89433 0.871338 7.67433 1.02874 8.40433 1.34354C9.13433 1.65874 9.76933 2.08634 10.3093 2.62634C10.8493 3.16634 11.2767 3.80134 11.5915 4.53134C11.9067 5.26134 12.0643 6.04134 12.0643 6.87134C12.0643 7.70134 11.9067 8.48134 11.5915 9.21134C11.2767 9.94134 10.8493 10.5763 10.3093 11.1163C9.76933 11.6563 9.13433 12.0837 8.40433 12.3985C7.67433 12.7137 6.89433 12.8713 6.06433 12.8713Z" fill="currentColor"></path></svg>UV & Docker-like CLI</h3> <p class="font-mono text-xs text-gray-600">uv,run,ps,logs,inspect</p></div> <div class="p-4 dark:border-gray-900 max-lg:border-b lg:border-r"><h3 class="flex 
items-center gap-1.5 font-semibold"><svg class="text-green-500 flex-none" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 13 13"><path d="M5.22433 7.95134L3.91933 6.64634C3.80933 6.53634 3.67433 6.48134 3.51433 6.48134C3.35433 6.48134 3.21433 6.54134 3.09433 6.66134C2.98433 6.77134 2.92933 6.91134 2.92933 7.08134C2.92933 7.25134 2.98433 7.39134 3.09433 7.50134L4.80433 9.21134C4.91433 9.32134 5.05433 9.37634 5.22433 9.37634C5.39433 9.37634 5.53433 9.32134 5.64433 9.21134L9.04933 5.80634C9.15933 5.69634 9.21433 5.56134 9.21433 5.40134C9.21433 5.24134 9.15433 5.10134 9.03433 4.98134C8.92433 4.87134 8.78433 4.81634 8.61433 4.81634C8.44433 4.81634 8.30433 4.87134 8.19433 4.98134L5.22433 7.95134ZM6.06433 12.8713C5.23433 12.8713 4.45433 12.7137 3.72433 12.3985C2.99433 12.0837 2.35933 11.6563 1.81933 11.1163C1.27933 10.5763 0.851931 9.94134 0.537131 9.21134C0.221931 8.48134 0.0643311 7.70134 0.0643311 6.87134C0.0643311 6.04134 0.221931 5.26134 0.537131 4.53134C0.851931 3.80134 1.27933 3.16634 1.81933 2.62634C2.35933 2.08634 2.99433 1.65874 3.72433 1.34354C4.45433 1.02874 5.23433 0.871338 6.06433 0.871338C6.89433 0.871338 7.67433 1.02874 8.40433 1.34354C9.13433 1.65874 9.76933 2.08634 10.3093 2.62634C10.8493 3.16634 11.2767 3.80134 11.5915 4.53134C11.9067 5.26134 12.0643 6.04134 12.0643 6.87134C12.0643 7.70134 11.9067 8.48134 11.5915 9.21134C11.2767 9.94134 10.8493 10.5763 10.3093 11.1163C9.76933 11.6563 9.13433 12.0837 8.40433 12.3985C7.67433 12.7137 6.89433 12.8713 6.06433 12.8713Z" fill="currentColor"></path></svg>Any Hardware</h3> <p class="text-sm text-gray-600">CPUs to A100s &amp; TPUs</p></div> <div class="border-r p-4"><h3 class="flex items-center gap-1.5 font-semibold"><svg class="text-green-500 flex-none" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" 
focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 13 13"><path d="M5.22433 7.95134L3.91933 6.64634C3.80933 6.53634 3.67433 6.48134 3.51433 6.48134C3.35433 6.48134 3.21433 6.54134 3.09433 6.66134C2.98433 6.77134 2.92933 6.91134 2.92933 7.08134C2.92933 7.25134 2.98433 7.39134 3.09433 7.50134L4.80433 9.21134C4.91433 9.32134 5.05433 9.37634 5.22433 9.37634C5.39433 9.37634 5.53433 9.32134 5.64433 9.21134L9.04933 5.80634C9.15933 5.69634 9.21433 5.56134 9.21433 5.40134C9.21433 5.24134 9.15433 5.10134 9.03433 4.98134C8.92433 4.87134 8.78433 4.81634 8.61433 4.81634C8.44433 4.81634 8.30433 4.87134 8.19433 4.98134L5.22433 7.95134ZM6.06433 12.8713C5.23433 12.8713 4.45433 12.7137 3.72433 12.3985C2.99433 12.0837 2.35933 11.6563 1.81933 11.1163C1.27933 10.5763 0.851931 9.94134 0.537131 9.21134C0.221931 8.48134 0.0643311 7.70134 0.0643311 6.87134C0.0643311 6.04134 0.221931 5.26134 0.537131 4.53134C0.851931 3.80134 1.27933 3.16634 1.81933 2.62634C2.35933 2.08634 2.99433 1.65874 3.72433 1.34354C4.45433 1.02874 5.23433 0.871338 6.06433 0.871338C6.89433 0.871338 7.67433 1.02874 8.40433 1.34354C9.13433 1.65874 9.76933 2.08634 10.3093 2.62634C10.8493 3.16634 11.2767 3.80134 11.5915 4.53134C11.9067 5.26134 12.0643 6.04134 12.0643 6.87134C12.0643 7.70134 11.9067 8.48134 11.5915 9.21134C11.2767 9.94134 10.8493 10.5763 10.3093 11.1163C9.76933 11.6563 9.13433 12.0837 8.40433 12.3985C7.67433 12.7137 6.89433 12.8713 6.06433 12.8713Z" fill="currentColor"></path></svg>Run Anything</h3> <p class="text-sm text-gray-600">UV, Docker, HF Spaces &amp; more</p></div> <div class="p-4"><h3 class="flex items-center gap-1.5 font-semibold"><svg class="text-green-500 flex-none" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 13 13"><path d="M5.22433 7.95134L3.91933 6.64634C3.80933 6.53634 3.67433 6.48134 
3.51433 6.48134C3.35433 6.48134 3.21433 6.54134 3.09433 6.66134C2.98433 6.77134 2.92933 6.91134 2.92933 7.08134C2.92933 7.25134 2.98433 7.39134 3.09433 7.50134L4.80433 9.21134C4.91433 9.32134 5.05433 9.37634 5.22433 9.37634C5.39433 9.37634 5.53433 9.32134 5.64433 9.21134L9.04933 5.80634C9.15933 5.69634 9.21433 5.56134 9.21433 5.40134C9.21433 5.24134 9.15433 5.10134 9.03433 4.98134C8.92433 4.87134 8.78433 4.81634 8.61433 4.81634C8.44433 4.81634 8.30433 4.87134 8.19433 4.98134L5.22433 7.95134ZM6.06433 12.8713C5.23433 12.8713 4.45433 12.7137 3.72433 12.3985C2.99433 12.0837 2.35933 11.6563 1.81933 11.1163C1.27933 10.5763 0.851931 9.94134 0.537131 9.21134C0.221931 8.48134 0.0643311 7.70134 0.0643311 6.87134C0.0643311 6.04134 0.221931 5.26134 0.537131 4.53134C0.851931 3.80134 1.27933 3.16634 1.81933 2.62634C2.35933 2.08634 2.99433 1.65874 3.72433 1.34354C4.45433 1.02874 5.23433 0.871338 6.06433 0.871338C6.89433 0.871338 7.67433 1.02874 8.40433 1.34354C9.13433 1.65874 9.76933 2.08634 10.3093 2.62634C10.8493 3.16634 11.2767 3.80134 11.5915 4.53134C11.9067 5.26134 12.0643 6.04134 12.0643 6.87134C12.0643 7.70134 11.9067 8.48134 11.5915 9.21134C11.2767 9.94134 10.8493 10.5763 10.3093 11.1163C9.76933 11.6563 9.13433 12.0837 8.40433 12.3985C7.67433 12.7137 6.89433 12.8713 6.06433 12.8713Z" fill="currentColor"></path></svg>Pay-as-you-go</h3> <p class="text-sm text-gray-600">Pay only for seconds used</p></div></div>

The Hugging Face Hub provides compute for AI and data workflows via Jobs.

Jobs run on Hugging Face infrastructure and aim to give AI builders, data engineers, developers, and AI agents easy access to cloud infrastructure for their workloads. They are ideal for fine-tuning AI models and running inference on GPUs, and they also suit data ingestion and processing.

A job is defined by a command to run (e.g. a UV or Python command), a hardware flavor (CPU, GPU, TPU), and optionally a Docker image from Hugging Face Spaces or Docker Hub. Many jobs can run in parallel, which is useful, for example, for parameter tuning, parallel inference, and data processing.
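
As a rough illustration of that definition, the sketch below models a job as a command, a hardware flavor, and an optional Docker image, then expands a small parameter grid into one job per combination, as you might for a parallel tuning run. `JobSpec`, `sweep`, and the `--learning-rate` / `--epochs` script arguments are hypothetical names invented for this example; they are not part of the Jobs API. The `a100-large` flavor is the one used in the quickstart.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical container for the three parts of a job definition."""
    command: tuple[str, ...]     # command to run, e.g. a UV or Python command
    flavor: str                  # hardware flavor
    image: Optional[str] = None  # optional Docker image


def sweep(script: str, flavor: str, grid: dict[str, list]) -> list[JobSpec]:
    """Build one JobSpec per combination of the values in `grid`."""
    names = list(grid)
    specs = []
    for values in product(*(grid[name] for name in names)):
        args = []
        for name, value in zip(names, values):
            # Hypothetical script arguments: the script must accept them.
            args += [f"--{name}", str(value)]
        specs.append(JobSpec(command=("uv", "run", script, *args), flavor=flavor))
    return specs


specs = sweep("train.py", "a100-large", {"learning-rate": [1e-5, 2e-5], "epochs": [1, 3]})
for spec in specs:
    print(spec.flavor, " ".join(spec.command))
```

Each resulting spec is independent of the others, which is what makes this kind of workload a natural fit for running many jobs in parallel.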
Empty file added docs/jobs/inference1.md
Empty file.
Empty file added docs/jobs/manage.md
Empty file.
Empty file added docs/jobs/pricing.md
Empty file.
144 changes: 144 additions & 0 deletions docs/jobs/quickstart.md
@@ -0,0 +1,144 @@
# Quickstart

In this guide you will run a Job to fine-tune an open-source model on Hugging Face infrastructure in only a few minutes. Make sure you are logged in to Hugging Face and have access to your [Jobs page](https://huggingface.co/settings/jobs).

<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/empty-jobs-page.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/empty-jobs-page-dark.png"/>
</div>

## Getting started

Follow these steps to install the Hugging Face CLI and launch your first job:

1. Install the CLI

```bash
curl -LsSf https://hf.co/cli/install.sh | bash
```

Install the CLI (using Homebrew)

```bash
brew install huggingface-cli
```

Install the CLI (using uv)

```bash
uv tool install hf
```

2. Log in to your Hugging Face account:

Login

```bash
hf auth login
```

3. Create your first job using the `hf jobs` command:

Run a UV command or script

```bash
hf jobs uv run python -c 'print("Hello from the cloud!")'
```

```bash
hf jobs uv run path/to/script.py
```

Run a Docker command

```bash
hf jobs run python:3.12 python -c 'print("Hello from the cloud!")'
```

4. Monitor your job

The job logs appear in your terminal, and the job is also listed on your Jobs page. Open the job's page to see its information, status, and logs:

<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/jobs-page-with-first-job.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/jobs-page-with-first-job-dark.png"/>
</div>

<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/first-job-page.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/firts-job-page-dark.png"/>
</div>


## The training script

Here is a simple training script that fine-tunes a base model into a conversational model using Supervised Fine-Tuning (SFT). It uses the [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) model, the [trl-lib/Capybara](https://huggingface.co/datasets/trl-lib/Capybara) dataset, and the [TRL](https://huggingface.co/docs/trl/en/index) library, and it saves the resulting model to your Hugging Face account under the name `"Qwen2.5-0.5B-SFT"`:

```python
from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",
    train_dataset=dataset,
)
trainer.train()
trainer.push_to_hub("Qwen2.5-0.5B-SFT")
```

Save this script as `train.py`.

## Run the training job

`hf jobs` takes several arguments: select the hardware with `--flavor`, and pass environment variables with `--env` and `--secrets`. Here we use the A100 Large GPU flavor with `--flavor a100-large` and pass your Hugging Face token as a secret with `--secrets HF_TOKEN` so that the job can push the resulting model to your account.

In addition, UV accepts the `--with` argument to define Python dependencies, so we use `--with trl` to make the `trl` library available.

You can now run the final command which looks like this:

```bash
hf jobs uv run \
    --flavor a100-large \
    --with trl \
    --secrets HF_TOKEN \
    train.py
```
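
If you launch jobs from a script rather than by hand, the same command line can be assembled programmatically. The following is a minimal sketch that uses only the flags shown in this guide (`--flavor`, `--with`, `--secrets`). `build_uv_job_command` is a hypothetical helper, and repeating `--with` and `--secrets` once per value is an assumption about how the CLI handles multiple entries.

```python
import shlex


def build_uv_job_command(script, flavor=None, dependencies=(), secrets=()):
    """Return the argument list for `hf jobs uv run` (hypothetical helper)."""
    cmd = ["hf", "jobs", "uv", "run"]
    if flavor:
        cmd += ["--flavor", flavor]
    for dep in dependencies:
        # Assumption: one --with flag per dependency.
        cmd += ["--with", dep]
    for secret in secrets:
        # Assumption: one --secrets flag per secret name.
        cmd += ["--secrets", secret]
    cmd.append(script)
    return cmd


cmd = build_uv_job_command(
    "train.py",
    flavor="a100-large",
    dependencies=["trl"],
    secrets=["HF_TOKEN"],
)
print(shlex.join(cmd))
# To actually launch the job, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list rather than a string avoids shell quoting problems when a script path or dependency specifier contains spaces or special characters.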

The logs appear in your terminal. You can safely press Ctrl+C to stop streaming them; the job will keep running.

```
...
Downloaded nvidia-cudnn-cu12
Downloaded torch
Installed 66 packages in 233ms
Generating train split: 100%|██████████| 15806/15806 [00:00<00:00, 76686.50 examples/s]
Generating test split: 100%|██████████| 200/200 [00:00<00:00, 43880.36 examples/s]
Tokenizing train dataset: 100%|██████████| 15806/15806 [00:41<00:00, 384.97 examples/s]
Truncating train dataset: 100%|██████████| 15806/15806 [00:00<00:00, 212272.92 examples/s]
The model is already on multiple devices. Skipping the move to device specified in `args`.
The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None, 'pad_token_id': 151643}.
{'loss': 1.7357, 'grad_norm': 4.8733229637146, 'learning_rate': 1.9969635627530365e-05, 'entropy': 1.7238958358764649, 'num_tokens': 59528.0, 'mean_token_accuracy': 0.6124177813529968, 'epoch': 0.01}
{'loss': 1.6239, 'grad_norm': 6.200186729431152, 'learning_rate': 1.9935897435897437e-05, 'entropy': 1.644005584716797, 'num_tokens': 115219.0, 'mean_token_accuracy': 0.6259662985801697, 'epoch': 0.01}
{'loss': 1.4449, 'grad_norm': 6.167325496673584, 'learning_rate': 1.990215924426451e-05, 'entropy': 1.5156117916107177, 'num_tokens': 171787.0, 'mean_token_accuracy': 0.6586395859718323, 'epoch': 0.02}
{'loss': 1.6023, 'grad_norm': 5.133708953857422, 'learning_rate': 1.986842105263158e-05, 'entropy': 1.6885507702827454, 'num_tokens': 226067.0, 'mean_token_accuracy': 0.6271904468536377, 'epoch': 0.02}
```
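
The metric lines in this output have the shape of Python dict literals, so if you save the logs you can pull the loss values out with the standard library. The sketch below assumes the lines look like the ones above; the sample lines are abbreviated copies of them, and `parse_metrics` is a hypothetical helper, not part of TRL or the `hf` CLI.

```python
import ast

# Abbreviated copies of log lines from the output above.
log_lines = [
    "Installed 66 packages in 233ms",
    "{'loss': 1.7357, 'grad_norm': 4.8733229637146, 'learning_rate': 1.9969635627530365e-05, 'epoch': 0.01}",
    "{'loss': 1.6239, 'grad_norm': 6.200186729431152, 'learning_rate': 1.9935897435897437e-05, 'epoch': 0.01}",
]


def parse_metrics(lines):
    """Return the dicts found on lines that look like metric records."""
    metrics = []
    for line in lines:
        line = line.strip()
        if not (line.startswith("{") and line.endswith("}")):
            continue  # skip download, install, and progress-bar lines
        try:
            record = ast.literal_eval(line)  # safe parsing of literals only
        except (ValueError, SyntaxError):
            continue
        if isinstance(record, dict) and "loss" in record:
            metrics.append(record)
    return metrics


losses = [m["loss"] for m in parse_metrics(log_lines)]
print(losses)
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for text that comes from a log stream.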

Follow the job's progress on its page on Hugging Face:


<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/trl-sft-job-page.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/trl-sft-job-page-dark.png"/>
</div>

Once the job is done, find your model on your account:

<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/trl-sft-model-job-page.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/jobs/trl-sft-model-job-page-dark.png"/>
</div>

Congrats! You just ran your first Job to fine-tune an open-source model 🔥

Feel free to try out your model locally and evaluate it using e.g. [transformers](https://huggingface.co/docs/transformers) by clicking on "Use this model", or deploy it to [Inference Endpoints](https://huggingface.co/docs/inference-endpoints) in one click using the "Deploy" button.
Empty file added docs/jobs/schedule.md
Empty file.
Empty file added docs/jobs/training1.md
Empty file.
Empty file added docs/jobs/webhooks.md
Empty file.