mrobledo07/yolov8s_reCAPTCHAv2_solver

Automated reCAPTCHA v2 Solver Using YOLOv8s

This project demonstrates how an object-detection architecture such as YOLOv8s can be leveraged to solve the reCAPTCHA v2 reverse Turing test. The model is trained and evaluated on a ~62k-image augmented reCAPTCHA v2 dataset exported from Roboflow. The best-performing model is exported to ONNX format for in-browser inference with ONNX Runtime Web.
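As a sketch of that pipeline, fine-tuning and ONNX export can be driven with a few lines of the Ultralytics Python API. The dataset path and hyperparameters below are illustrative placeholders, not the exact values used in this repository:

```python
def train_and_export(data_yaml: str = "data.yaml", epochs: int = 50) -> str:
    """Fine-tune YOLOv8s on a custom dataset and export the result to ONNX.

    data_yaml and the hyperparameters are placeholders; adjust them to
    match your local dataset layout and hardware.
    """
    from ultralytics import YOLO  # pip install ultralytics

    model = YOLO("yolov8s.pt")  # start from COCO-pretrained weights
    model.train(data=data_yaml, epochs=epochs, imgsz=640)
    return model.export(format="onnx")  # path of the exported .onnx file
```

Calling `train_and_export()` from the project root (with the dataset in place) yields an ONNX file that can be served to the web demo.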

WebGPU Implementation & Cross-Browser Configuration

To achieve hardware-accelerated inference for YOLOv8 in the browser, WebGPU was utilized as the primary execution provider. Unlike its predecessor (WebGL), WebGPU provides direct access to modern GPU features, significantly reducing CPU overhead and enabling performance close to native speeds. However, as of early 2026, WebGPU support on Linux remains behind experimental flags in several browsers due to driver stability requirements.

  • Mozilla Firefox (Linux): Access to the navigator.gpu API is disabled by default. It must be manually enabled via about:config by setting dom.webgpu.enabled and gfx.webgpu.ignore-blocklist to true. This bypasses the hardware vendor blocklist and exposes the Vulkan-backed WebGPU interface to the browser's JavaScript engine.

  • Google Chrome / Chromium (Linux): While support is more mature, WebGPU often requires explicit activation via chrome://flags (enabling "Unsafe WebGPU Support") or by launching the browser with the --enable-unsafe-webgpu and --enable-features=Vulkan command-line arguments.

In production environments, a Tiered Execution Strategy is recommended: the application should first attempt to initialize a WebGPU session for maximum performance (FP16), with an automatic fallback to WebAssembly (WASM) with SIMD for compatibility on older browsers or systems with unsupported drivers.
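The fallback decision itself is simple. The sketch below illustrates the tiered-selection logic in plain Python; in the actual web demo this choice is made in JavaScript when creating the ONNX Runtime Web session (e.g. by listing execution providers in preference order), and the function and parameter names here are purely illustrative:

```python
def pick_execution_provider(webgpu_available: bool, wasm_simd_available: bool) -> str:
    """Return the best available backend, preferring WebGPU over WASM."""
    if webgpu_available:
        return "webgpu"  # hardware-accelerated, FP16-capable path
    if wasm_simd_available:
        return "wasm"    # portable CPU fallback with SIMD
    raise RuntimeError("No supported execution provider found")

print(pick_execution_provider(False, True))  # → wasm
```

The same ordering applies regardless of how availability is probed: try the fastest backend first, and only fall back when session creation fails.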

Running This Notebook on Your Own Machine

This notebook requires Python ≥ 3.8, PyTorch, and the Ultralytics library. A CUDA-capable NVIDIA GPU is strongly recommended for training (50 epochs on ~62k images). Below are several ways to set up your environment.

If you want to run the fine-tuning yourself, perhaps with different parameters, download the dataset from https://drive.google.com/file/d/1FI9kqsWEkjiaj5A136WhY6oAG9P1r44v/view?usp=drive_link and move it to the root of the project.

You can also use your own dataset.

Option 1 — Docker with the Ultralytics image (easiest)

Ultralytics publishes ready-to-use Docker images with PyTorch, CUDA drivers, and all dependencies pre-installed.

# Pull the GPU image (the default "latest" tag includes CUDA + cuDNN)
docker pull ultralytics/ultralytics:latest

# Run the container, mounting your project folder
docker run --ipc=host --gpus all -it \
  -v "$(pwd)":/workspace \
  -w /workspace \
  ultralytics/ultralytics:latest

# Inside the container, launch Jupyter
jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root

CPU-only variant: replace latest with latest-cpu if you don't have an NVIDIA GPU.

Option 2 — NVIDIA NGC PyTorch container (larger but more powerful)

The NVIDIA NGC PyTorch image ships with optimized CUDA libraries, NCCL, cuDNN, and TensorRT. It is bigger (~15 GB) but offers best-in-class GPU performance.

# Pull an NGC PyTorch image (tags follow the YY.MM-py3 release scheme;
# check the NGC catalog for the newest release)
docker pull nvcr.io/nvidia/pytorch:24.08-py3

# Launch the container
docker run --ipc=host --gpus all -it \
  -v "$(pwd)":/workspace \
  -w /workspace \
  nvcr.io/nvidia/pytorch:24.08-py3

# Inside the container install Ultralytics on top
pip install ultralytics onnx onnxruntime

# Then start Jupyter
jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root

Option 3 — Conda / Miniconda environment

If you prefer a native (non-Docker) setup, Conda is a great way to manage a reproducible Python environment with GPU-enabled PyTorch.

# 1. Install Miniconda (skip if you already have it)
#    https://docs.anaconda.com/miniconda/install/

# 2. Create and activate a new environment
conda create -n yolov8-recaptcha python=3.10 -y
conda activate yolov8-recaptcha

# 3. Install PyTorch with CUDA support (adjust cuda version as needed)
#    See https://pytorch.org/get-started/locally/ for the right command
conda install pytorch torchvision pytorch-cuda=12.4 -c pytorch -c nvidia -y

# 4. Install Ultralytics and notebook dependencies
pip install ultralytics onnx onnxruntime jupyter

# 5. Register the environment as a Jupyter kernel
python -m ipykernel install --user --name yolov8-recaptcha

# 6. Launch Jupyter
jupyter notebook

Option 4 — Python virtual environment (venv)

A lightweight alternative using only the Python standard library. You must have the correct NVIDIA drivers and CUDA toolkit already installed on your system for GPU support.

# 1. Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate        # Linux / macOS
# .venv\Scripts\activate         # Windows

# 2. Upgrade pip
pip install --upgrade pip

# 3. Install PyTorch with CUDA (adjust the index-url for your CUDA version)
#    See https://pytorch.org/get-started/locally/
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124

# 4. Install Ultralytics and notebook dependencies
pip install ultralytics onnx onnxruntime jupyter

# 5. Register the environment as a Jupyter kernel
python -m ipykernel install --user --name yolov8-recaptcha

# 6. Launch Jupyter
jupyter notebook

💡 Tip: Whichever method you choose, verify your setup by running python -c "import torch; print(torch.cuda.is_available())"; it should print True if GPU acceleration is available. Instead of serving Jupyter over HTTP, you can also attach to the container with Dev Containers in VS Code and run the notebook there (recommended).

Executing the Web Application

To set up and run the browser-based solver demonstration locally, use Python's built-in HTTP server.

First, download the already fine-tuned model from https://drive.google.com/file/d/1h8zYhtXaU1kOJrTiTBfi4uwpT5nAIyQt/view?usp=sharing. Move the resulting best.onnx file to web-demo/model/best.onnx so the web application can find and load it in the browser.

  1. Open your terminal and navigate to the web-demo folder containing the static web files:
    cd web-demo
  2. Start the local development server:
    python3 -m http.server 8000
  3. Open a supported web browser and navigate to:
    http://localhost:8000
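Before starting the server, it can help to confirm the model file sits where the page expects it. A minimal check, assuming the web-demo/model/best.onnx layout described above:

```python
from pathlib import Path


def model_is_in_place(root: str = ".") -> bool:
    """Return True if best.onnx sits where the web demo expects it."""
    return (Path(root) / "web-demo" / "model" / "best.onnx").is_file()
```

If this returns False, the page will load but inference will fail when ONNX Runtime Web tries to fetch the missing model file.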
    

Depending on your browser, you might need to enable specific WebGPU or WebAssembly flags as outlined in the WebGPU Implementation & Cross-Browser Configuration section.
