YOLO26 Electric Meter Detection Example

This example trains a YOLO26 object detection model on the Roboflow Universe electric meter dataset using the existing qc-cli SageMaker training flow.

The workflow is intentionally command driven. Run each step yourself so you can inspect the dataset, update config.yaml, and decide when to submit the SageMaker job.

Dataset:

https://universe.roboflow.com/kemals-workspace-kbc8l/electric-meter-detection-o4tfi/dataset/1

Prerequisites

  • AWS credentials configured for the profile in config.yaml
  • Infrastructure already deployed with uv run qc-cli infra setup
  • A Roboflow API key exported as ROBOFLOW_API_KEY
  • curl and unzip available locally

Install or sync the project dependencies:

uv sync

Set the Roboflow API key for the current shell:

export ROBOFLOW_API_KEY=your-roboflow-api-key

1. Download The Dataset

Download version 1 of the dataset in YOLO format. The script uses the Roboflow REST API directly and does not require Python:

bash examples/meter-detection/download_dataset.sh

Confirm the extracted dataset has a YOLO data file and image splits:

find examples/meter-detection/data/electric-meter-detection -maxdepth 2 -type d | sort
find examples/meter-detection/data/electric-meter-detection -name data.yaml -print

The expected layout is similar to:

examples/meter-detection/data/electric-meter-detection/
  data.yaml
  train/
  valid/
  test/

The test/ split may be absent depending on the exported dataset version.
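The same layout check can be sketched in Python if you prefer a single pass/fail result over reading the find output. This is an illustrative helper, not part of the example: the function name summarize_yolo_dataset is hypothetical, and it treats train/ and valid/ as required and test/ as optional, following the note above.

```python
from pathlib import Path


def summarize_yolo_dataset(root):
    """Report whether data.yaml and the split directories exist under root."""
    root = Path(root)
    data_yaml = root / "data.yaml"
    splits = {name: (root / name).is_dir() for name in ("train", "valid", "test")}
    return {
        "data_yaml": data_yaml if data_yaml.is_file() else None,
        "splits": splits,
        # test/ is optional, so only train/ and valid/ are required here.
        "ok": data_yaml.is_file() and splits["train"] and splits["valid"],
    }
```

Call it with examples/meter-detection/data/electric-meter-detection and check the "ok" field before uploading.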

2. Configure SageMaker Training

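Update config.yaml so the training section points at this example's source directory. The image_uri shown carries a -cpu- tag; to train on the GPU of ml.g4dn.xlarge, replace it with the matching GPU tag from the AWS Deep Learning Containers image list: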

sagemaker:
  training:
    image_uri: 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.6-cpu-py312-ubuntu22.04-sagemaker-v1
    instance_type: ml.g4dn.xlarge
    instance_count: 1
    source_dir: examples/meter-detection/source
    entry_point: train.py
    hyperparameters:
      model: yolo26n.pt
      epochs: 25
      imgsz: 640
      batch: 16
      workers: 2

Use yolo26n.pt for a lightweight first YOLO26 run. If those weights are unavailable in the installed Ultralytics package, use yolo11n.pt as the established fallback:

      model: yolo11n.pt

The SageMaker PyTorch container installs the packages listed in source/requirements.txt before it runs train.py.

For a CPU smoke test, use a CPU instance and reduce the workload:

sagemaker:
  training:
    image_uri: 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.6-cpu-py312-ubuntu22.04-sagemaker-v1
    instance_type: ml.m4.xlarge
    instance_count: 1
    source_dir: examples/meter-detection/source
    entry_point: train.py
    hyperparameters:
      model: yolo26n.pt
      epochs: 1
      imgsz: 320
      batch: 4
      workers: 2
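
Both configurations above use the same -cpu- image tag, while only the second one uses a CPU instance. A small pre-flight check can catch that kind of mismatch before a job is submitted. The sketch below is an assumption-laden illustration: the function names are hypothetical, the GPU family list is incomplete, and it relies on the image tag containing "-cpu-" or "-gpu-".

```python
# Instance families that ship with GPUs. Illustrative and incomplete (assumption).
GPU_FAMILIES = ("g4dn", "g5", "g6", "p3", "p4d", "p5")


def instance_has_gpu(instance_type):
    """Return True when the family part of e.g. 'ml.g4dn.xlarge' is a GPU family."""
    parts = instance_type.split(".")
    return len(parts) == 3 and parts[1] in GPU_FAMILIES


def image_is_gpu(image_uri):
    """Look for '-gpu-' in the image tag, the part after the last ':'."""
    tag = image_uri.rsplit(":", 1)[-1]
    return "-gpu-" in tag


def check_training_config(image_uri, instance_type):
    """Return a warning string on a CPU/GPU mismatch, otherwise None."""
    gpu_instance = instance_has_gpu(instance_type)
    gpu_image = image_is_gpu(image_uri)
    if gpu_instance and not gpu_image:
        return f"{instance_type} has a GPU but the image tag is not a GPU image"
    if gpu_image and not gpu_instance:
        return f"GPU image selected for {instance_type}, which has no GPU"
    return None
```

With the values shown above, the ml.g4dn.xlarge configuration produces a warning and the ml.m4.xlarge configuration does not.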

3. Check Infrastructure

Confirm the CLI can see the configured SageMaker role and S3 bucket:

uv run qc-cli infra status --config config.yaml

4. Upload The Dataset

Upload the downloaded Roboflow dataset to the s3.data_prefix configured in config.yaml:

uv run qc-cli upload examples/meter-detection/data/electric-meter-detection --config config.yaml

Directory uploads preserve paths relative to the uploaded directory, so SageMaker receives the dataset root with data.yaml plus the split directories.
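
To make the path-preserving behaviour concrete, the sketch below computes the S3 keys a directory upload would produce under a given prefix. It only models the mapping described above and does not call AWS; the function name s3_keys_for_directory and the prefix value in the usage note are hypothetical, and the actual qc-cli implementation may differ.

```python
from pathlib import Path, PurePosixPath


def s3_keys_for_directory(local_dir, data_prefix):
    """Map each local file to '<data_prefix>/<path relative to local_dir>'."""
    local_dir = Path(local_dir)
    keys = {}
    for path in sorted(local_dir.rglob("*")):
        if path.is_file():
            # as_posix() keeps forward slashes in the key on every platform.
            relative = path.relative_to(local_dir).as_posix()
            keys[str(path)] = str(PurePosixPath(data_prefix) / relative)
    return keys
```

For a prefix such as datasets/meter, data.yaml maps to datasets/meter/data.yaml and train/images/a.jpg maps to datasets/meter/train/images/a.jpg, so the dataset root stays intact in the training channel.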

5. Start Training

Submit the SageMaker training job:

uv run qc-cli train start --config config.yaml

The command prints the submitted SageMaker job name. Check progress with:

uv run qc-cli train status --config config.yaml

Or pass the job name explicitly:

uv run qc-cli train status qc-cli-YYYYMMDD-HHMMSS --config config.yaml
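
If you keep job names in notes or scripts, the qc-cli-YYYYMMDD-HHMMSS pattern can be validated and turned back into a timestamp. This is a minimal sketch based only on the name format shown above; the function name parse_job_timestamp is hypothetical, and the time zone of the embedded timestamp is not stated in this example, so the result is a naive datetime.

```python
import re
from datetime import datetime

# Matches names shaped like qc-cli-20250102-030405.
JOB_NAME_PATTERN = re.compile(r"^qc-cli-(\d{8}-\d{6})$")


def parse_job_timestamp(job_name):
    """Return the datetime embedded in a job name, or None if it does not match."""
    match = JOB_NAME_PATTERN.match(job_name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d-%H%M%S")
    except ValueError:
        # Digits were present but do not form a valid date or time.
        return None
```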

Outputs

When the job completes, SageMaker packages the files written under /opt/ml/model into model.tar.gz.

This example writes:

best.pt
model.onnx
metrics.json

The archive is stored under the configured s3.model_prefix.
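
After downloading model.tar.gz from the bucket, a quick check can confirm that the three files listed above are present. The sketch below uses only the standard library; the function name missing_model_files is hypothetical, and it compares base names so that entries stored as ./best.pt or in a subdirectory still count.

```python
import tarfile

EXPECTED_FILES = {"best.pt", "model.onnx", "metrics.json"}


def missing_model_files(archive_path):
    """Return the expected file names that are not found in the archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        # Compare base names only, ignoring any leading directories.
        names = {name.split("/")[-1] for name in archive.getnames()}
    return sorted(EXPECTED_FILES - names)
```

An empty list means all three outputs were packaged.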

Training Hyperparameters

Values under sagemaker.training.hyperparameters are passed to source/train.py as command-line arguments.

Name       Type    Default     Description
model      string  yolo26n.pt  Ultralytics model weights or model YAML.
epochs     int     25          Number of training epochs.
imgsz      int     640         Square training image size.
batch      int     16          Images per training batch.
workers    int     2           DataLoader worker count.
patience   int     20          Early stopping patience.
device     string  auto        Optional Ultralytics device value such as 0 or cpu.
data-yaml  string  auto        Optional path to data.yaml; normally discovered from SM_CHANNEL_TRAIN.

Do not set train-dir or model-dir in normal SageMaker runs. SageMaker sets those automatically through SM_CHANNEL_TRAIN and SM_MODEL_DIR.
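
For orientation, here is one way a training script could accept these hyperparameters as command-line arguments. It is a sketch under stated assumptions, not the contents of source/train.py: the function names build_parser and resolve_data_yaml are hypothetical, the defaults are copied from the table above, and the fallback paths are the standard SageMaker container locations used when the environment variables are unset.

```python
import argparse
import os
from pathlib import Path


def build_parser():
    """Build a parser whose options mirror the hyperparameter table."""
    parser = argparse.ArgumentParser(description="Sketch of train.py argument parsing")
    parser.add_argument("--model", type=str, default="yolo26n.pt")
    parser.add_argument("--epochs", type=int, default=25)
    parser.add_argument("--imgsz", type=int, default=640)
    parser.add_argument("--batch", type=int, default=16)
    parser.add_argument("--workers", type=int, default=2)
    parser.add_argument("--patience", type=int, default=20)
    parser.add_argument("--device", type=str, default="auto")
    parser.add_argument("--data-yaml", type=str, default="auto")
    # SageMaker sets these environment variables inside the training container.
    parser.add_argument("--train-dir", type=str,
                        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"))
    parser.add_argument("--model-dir", type=str,
                        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    return parser


def resolve_data_yaml(args):
    """Use an explicit --data-yaml, otherwise search the training channel."""
    if args.data_yaml != "auto":
        return Path(args.data_yaml)
    matches = sorted(Path(args.train_dir).rglob("data.yaml"))
    return matches[0] if matches else None
```

For example, build_parser().parse_args(["--epochs", "1", "--imgsz", "320"]) overrides two values and leaves the other defaults in place, which matches how the CPU smoke test configuration differs from the default run.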