qc-cli
A CLI for Qualcomm's MLOps pipeline — browse and download models from Qualcomm AI Hub, fine-tune them on custom datasets using SageMaker, validate inference, and prepare artifacts for Qualcomm hardware deployment.
Requirements
- Python 3.13+
- uv
- AWS account with credentials configured (aws configure) when using qc-cli infra
- AWS CDK CLI (npm install -g aws-cdk) when using qc-cli infra setup or qc-cli infra destroy
Installation
git clone <repo>
cd qc-cli
uv sync
Run commands with uv run qc-cli <command> or activate the venv first:
source .venv/bin/activate
qc-cli --help
Quick start
# 1. Create config.yaml in the current directory
qc-cli init
# 2. Edit config.yaml — at minimum set sagemaker.training.image_uri
# 3. Provision AWS infrastructure (S3 bucket + SageMaker IAM role).
# This is the step that requires the AWS CDK CLI.
qc-cli infra setup
# 4. Upload training data, then submit a SageMaker training job.
qc-cli upload ./my-dataset
qc-cli train start
qc-cli train status
Configuration
qc-cli init writes a config.yaml in the current directory. The fields you must fill in before using the tool:
infra:
stack_name: qc-cli-mlops-1a2b3c4d5e6f
aws:
region: us-east-1
profile: default # AWS CLI profile name
s3:
bucket: qc-cli-mlops-1a2b3c4d5e6f-data
sagemaker:
training:
image_uri: "" # ECR URI for your training container
instance_type: ml.m5.xlarge
instance_count: 1
entry_point: null # Optional: script inside source_dir
source_dir: null # Optional: local dir packaged and uploaded automatically
hyperparameters: {}
aihub:
device:
name: Samsung Galaxy S25 (Family)
target_runtime: tflite
input_specs: {} # Required before running qc-cli ai-hub commands
job_name: null # Optional prefix for AI Hub Workbench jobs
model_name: null # Optional name for uploaded local ONNX models
compile_options: null
profile_options: null
quantize_options: null
output_dir: build/qai-hub
qc-cli init generates one namespace, shared by infra.stack_name and s3.bucket, and writes it to config.yaml once. Keep these values stable for a deployment; changing them points the CLI at different infrastructure.
The CLI isolates both application resources and CDK bootstrap resources. The application CloudFormation stack uses infra.stack_name, the S3 bucket uses the same generated namespace because bucket names are globally unique, and the SageMaker IAM role uses a CloudFormation-generated physical name. CDK bootstrap resources are derived internally from infra.stack_name, including a bootstrap stack named <stack_name>-bootstrap and a matching non-default CDK asset bucket qualifier. qc-cli infra destroy removes the application stack but leaves the CDK bootstrap stack in place; the command prints the retained bootstrap stack name.
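The naming relationships described above can be sketched as follows. This is only an illustration: the 12-character hex suffix and the -data bucket suffix are inferred from the example config, the function names are hypothetical, and the real generator in qc-cli may differ. The CDK qualifier is left out because its derivation is not documented here.

```python
import secrets


def new_namespace(prefix: str = "qc-cli-mlops") -> str:
    """Return a namespace such as qc-cli-mlops-1a2b3c4d5e6f.

    Assumption: the suffix is 12 random hex characters, as in the example
    config; the actual qc-cli generator may use a different scheme.
    """
    return f"{prefix}-{secrets.token_hex(6)}"


def derived_names(stack_name: str) -> dict[str, str]:
    """Map a stack name to the resource names that share its namespace."""
    return {
        "stack_name": stack_name,
        # Assumption: "-data" suffix, taken from the example config.
        "bucket": f"{stack_name}-data",
        # Stated in the text: bootstrap stack is <stack_name>-bootstrap.
        "bootstrap_stack": f"{stack_name}-bootstrap",
    }


names = derived_names("qc-cli-mlops-1a2b3c4d5e6f")
print(names["bucket"])
print(names["bootstrap_stack"])
```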
hyperparameters is a flat map of values passed to the training container. Valid keys depend on the selected training image and entry point.
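SageMaker's CreateTrainingJob API accepts hyperparameters as a string-to-string map, which is why the config map has to stay flat. The helper below is a hypothetical sketch of the kind of check that could be applied before submission; whether qc-cli converts values this way is an assumption.

```python
def to_sagemaker_hyperparameters(values: dict) -> dict[str, str]:
    """Convert a flat config map into the string-to-string form SageMaker expects.

    Nested containers are rejected rather than silently serialized, because
    the config describes hyperparameters as a flat map.
    """
    flat: dict[str, str] = {}
    for key, value in values.items():
        if isinstance(value, (dict, list, tuple, set)):
            raise ValueError(
                f"hyperparameter {key!r} must be a scalar, got {type(value).__name__}"
            )
        flat[str(key)] = str(value)
    return flat


print(to_sagemaker_hyperparameters({"epochs": 10, "lr": 0.001, "augment": True}))
```

The keys epochs, lr, and augment are placeholders; valid keys still depend on the training image and entry point.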
To provision an MLflow tracking server, set:
mlflow:
mode: create
experiment_name: qc-cli-training
registered_model_name: qc-cli-model
register_trained_models: true
In create mode, the CLI manages the tracking server name from infra.stack_name; you do not need to set tracking_server_name.
To use an existing MLflow tracking server, set:
mlflow:
mode: existing
tracking_server_name: your-tracking-server-name
When MLflow is enabled, train start creates an MLflow run for the SageMaker job. train status finalizes that run once the job reaches a terminal state and registers completed model artifacts as experiment model versions using the experiment-latest MLflow alias. An experiment version is an immutable trained-source artifact; it records that training produced a model, not that the model is better than earlier versions or ready for release.
To open the managed SageMaker MLflow UI, request a fresh presigned URL:
qc-cli mlflow open --config config.yaml
The command opens the URL in a browser. It works for mode: create, and for mode: existing when the existing server is managed by Amazon SageMaker. In create mode, the command uses the CLI-managed tracking server name; in existing mode, it uses mlflow.tracking_server_name. If the existing MLflow server is external to SageMaker, open it with that server's own URL instead.
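For a SageMaker-managed server, a presigned URL can be requested through the SageMaker CreatePresignedMlflowTrackingServerUrl API. The sketch below shows that call in isolation; it assumes a boto3-style SageMaker client is passed in, and it is not taken from the qc-cli source. The stub class and the server name are placeholders so the example runs without AWS credentials.

```python
def presigned_mlflow_url(sagemaker_client, tracking_server_name: str) -> str:
    """Request a fresh presigned URL for a SageMaker-managed MLflow server."""
    response = sagemaker_client.create_presigned_mlflow_tracking_server_url(
        TrackingServerName=tracking_server_name
    )
    return response["AuthorizedUrl"]


class StubSageMakerClient:
    """Hypothetical stand-in for boto3.client("sagemaker"), used only for the demo."""

    def create_presigned_mlflow_tracking_server_url(self, **kwargs):
        name = kwargs["TrackingServerName"]
        return {"AuthorizedUrl": f"https://example.invalid/mlflow/{name}"}


url = presigned_mlflow_url(StubSageMakerClient(), "your-tracking-server-name")
print(url)
# With real credentials, pass boto3.client("sagemaker") instead of the stub
# and hand the result to webbrowser.open(url).
```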
Commands
init
qc-cli init Write config.yaml
qc-cli init --output <path> Write config to a custom path
qc-cli init --force Overwrite an existing config file
mlflow
qc-cli mlflow open Open a presigned MLflow UI URL in a browser
infra
qc-cli infra setup Deploy the CDK stack
qc-cli infra setup --no-bootstrap Deploy without running CDK bootstrap
qc-cli infra setup --cloudformation-execution-policy <arn> Set CDK bootstrap execution policy ARN
qc-cli infra status Show CDK stack/resource status
qc-cli infra destroy Destroy stack, retaining S3 data
qc-cli infra destroy --yes Destroy stack without confirmation
qc-cli infra destroy --delete-bucket-data Destroy stack and delete S3 data
--cloudformation-execution-policy is a one-time CDK bootstrap option, not a config.yaml setting. Pass it on infra setup when you need the CDK bootstrap CloudFormation execution role to use a policy other than the default AdministratorAccess:
qc-cli infra setup --cloudformation-execution-policy arn:aws:iam::aws:policy/PowerUserAccess
upload
qc-cli upload <file> Upload a single file to S3
qc-cli upload <dir> Upload all files in a directory tree to S3
qc-cli upload <file> --s3-key <key> Upload a file to a custom S3 key
Uploads use s3.bucket and s3.data_prefix from config.yaml. File uploads default to s3://<bucket>/<data_prefix>/<filename>. Directory uploads are recursive, preserve paths relative to the uploaded directory, and place files under s3://<bucket>/<data_prefix>/.
train
qc-cli train start Submit a SageMaker training job
qc-cli train status [job-name] Show job status; defaults to the last submitted job
qc-cli train list List recent training jobs
qc-cli train list --limit 3 Show a custom number of recent jobs
train start uses s3://<bucket>/<data_prefix>/ as the training channel and writes outputs under s3://<bucket>/<model_prefix>/. If sagemaker.training.source_dir is set, the CLI packages that directory, uploads it beside the job output prefix, and passes sagemaker_program/sagemaker_submit_directory to the SageMaker container.
The expected output artifact is SageMaker’s model.tar.gz, normally containing the trained model file your container writes to /opt/ml/model.
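As a rough sketch of how the train start locations fit together: the two hyperparameter names come from the text above, while the prefix values, the job name, and the exact key used for the packaged source are assumptions made for illustration.

```python
def training_locations(bucket: str, data_prefix: str, model_prefix: str) -> dict[str, str]:
    """Build the input channel and output path used by a training job."""
    return {
        "train_channel": f"s3://{bucket}/{data_prefix.strip('/')}/",
        "output_path": f"s3://{bucket}/{model_prefix.strip('/')}/",
    }


def source_hyperparameters(entry_point: str, submit_s3_uri: str) -> dict[str, str]:
    """Hyperparameters that tell the container where the packaged source lives."""
    return {
        "sagemaker_program": entry_point,
        "sagemaker_submit_directory": submit_s3_uri,
    }


# Placeholder values; "data" and "models" are hypothetical prefixes.
locations = training_locations("qc-cli-mlops-1a2b3c4d5e6f-data", "data", "models")
# Assumption: the source archive sits next to the job output under a
# per-job folder. The real key layout used by qc-cli is not documented here.
submit_uri = f"{locations['output_path']}example-job/source/sourcedir.tar.gz"
print(locations["train_channel"])
print(source_hyperparameters("train.py", submit_uri))
```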
ai-hub
qc-cli ai-hub upload <calibration.npz|calibration-dir> <inputs.npz|inputs.npy>
qc-cli ai-hub upload <calibration> <inputs> --from-step validate
qc-cli ai-hub optimize [--onnx-path PATH] [--model-s3-uri URI] [--from-job NAME]
qc-cli ai-hub quantize <calibration.npz|calibration-dir> [--model-id ID] [--onnx-path PATH] [--model-s3-uri URI] [--from-job NAME]
qc-cli ai-hub compile [--model-id ID] [--onnx-path PATH] [--model-s3-uri URI] [--from-job NAME]
qc-cli ai-hub validate <inputs.npz|inputs.npy> [--model-id ID] [--input-name NAME]
qc-cli ai-hub profile [--model-id ID]
qc-cli ai-hub download [--model-id ID] [--output PATH]
ai-hub upload optimizes to ONNX, quantizes, validates, and profiles. When aihub.target_runtime is not onnx, it
also compiles the quantized model to that deployment runtime. The initial ONNX optimization gives external models
Workbench provenance and applies compiler optimization passes before quantization.
Resume behavior:
--from-step optimize Run optimize, quantize, optional final compile, validate, and profile.
--from-step quantize Quantize the last optimized ONNX, then optionally compile, validate, and profile.
--from-step compile Skip optimize and quantize; finalize the last quantized model for the target runtime.
--from-step validate Skip optimize, quantize, and compile; validate the last compiled model.
--from-step profile Skip optimize, quantize, compile, and validate; profile the last compiled model.
When a step runs in the current command, upload passes its returned model ID directly to the next step. When a step is skipped, the next step resolves the needed model ID from .qc-cli.json. This avoids re-running earlier AI Hub jobs when you only need to continue from a later step.
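The resume rules can be summarized as "run everything from the chosen step onward, and drop the final compile when the target runtime is already ONNX". The function below is a minimal sketch of that rule, not the CLI's implementation; steps_to_run is a hypothetical name, and dropping compile for --from-step compile with an onnx target is an assumption.

```python
STEPS = ["optimize", "quantize", "compile", "validate", "profile"]


def steps_to_run(from_step: str = "optimize", target_runtime: str = "tflite") -> list[str]:
    """Return the ordered pipeline steps that run for a given --from-step."""
    if from_step not in STEPS:
        raise ValueError(f"unknown step {from_step!r}; expected one of {STEPS}")
    selected = STEPS[STEPS.index(from_step):]
    if target_runtime == "onnx":
        # The quantized ONNX model is already the final artifact,
        # so the final compile step is skipped.
        selected = [step for step in selected if step != "compile"]
    return selected


print(steps_to_run("validate"))
print(steps_to_run("quantize", target_runtime="onnx"))
```

In this sketch, the first step in the returned list is the one that would look up its input model ID from .qc-cli.json whenever earlier steps were skipped; later steps receive the ID returned by the step before them.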
ai-hub optimize compiles an external model with --target_runtime onnx. ai-hub quantize uses an explicit
--model-id, the last optimized ONNX model, or an explicit/local model source in that order. ai-hub compile resolves
model sources in this order: --model-id, explicit source options, last quantized model, then the last training job.
For target_runtime: onnx, upload treats the quantized ONNX as the final model and skips a redundant second compile.
ai-hub download remains separate because downloading is outside the Workbench processing loop.
AI Hub authentication currently uses the local qai-hub SDK configuration. A planned follow-up is to support AWS Systems Manager Parameter Store SecureString for team-managed tokens, where config.yaml stores only a parameter name such as /qc-cli/aihub/token, AWS KMS encrypts the token at rest, and the CLI retrieves it at runtime with ssm:GetParameter plus kms:Decrypt permissions.
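Since the Parameter Store integration is only planned, the following is a speculative sketch of what the runtime lookup could look like, not existing qc-cli behavior. It assumes a boto3-style SSM client; get_parameter with WithDecryption=True is the standard call for reading a SecureString. The stub client and token value are placeholders so the example runs offline.

```python
def read_aihub_token(ssm_client, parameter_name: str) -> str:
    """Fetch and decrypt a SecureString parameter holding the AI Hub token."""
    response = ssm_client.get_parameter(Name=parameter_name, WithDecryption=True)
    return response["Parameter"]["Value"]


class StubSsmClient:
    """Hypothetical stand-in for boto3.client("ssm"), used only for the demo."""

    def get_parameter(self, Name: str, WithDecryption: bool = False):
        if not WithDecryption:
            raise RuntimeError("SecureString values must be requested with decryption")
        return {"Parameter": {"Name": Name, "Value": "placeholder-token"}}


token = read_aihub_token(StubSsmClient(), "/qc-cli/aihub/token")
print(len(token) > 0)
```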
Model lifecycle
The CLI uses neutral experiment naming for trained artifacts and reserves release terminology for an explicit promotion step.
Current behavior:
- qc-cli train start submits a SageMaker training job.
- qc-cli train status finalizes the MLflow run after the job reaches a terminal state.
- If the job completed and mlflow.register_trained_models is enabled, the SageMaker model.tar.gz is registered as a new MLflow model version with:
  - qc_cli.stage=experiment
  - qc_cli.artifact_kind=trained_source
  - qc_cli.source=sagemaker
- The MLflow alias experiment-latest points at the most recently registered experiment version.
- AI Hub upload commands create deployable derived artifacts from a trained-source experiment or local ONNX model.
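The registration metadata listed above could be applied with MLflow's client API roughly as follows. This sketch assumes the qc_cli.* values are stored as model version tags, which the text does not confirm, and it takes any object exposing the MlflowClient methods set_model_version_tag and set_registered_model_alias. The recording stub stands in for mlflow.MlflowClient so the example runs without a tracking server.

```python
EXPERIMENT_TAGS = {
    "qc_cli.stage": "experiment",
    "qc_cli.artifact_kind": "trained_source",
    "qc_cli.source": "sagemaker",
}


def mark_experiment_version(client, model_name: str, version: str) -> None:
    """Tag a registered model version and move experiment-latest to it."""
    for key, value in EXPERIMENT_TAGS.items():
        client.set_model_version_tag(name=model_name, version=version, key=key, value=value)
    client.set_registered_model_alias(
        name=model_name, alias="experiment-latest", version=version
    )


class RecordingClient:
    """Hypothetical stand-in for mlflow.MlflowClient that records calls."""

    def __init__(self):
        self.tags = {}
        self.aliases = {}

    def set_model_version_tag(self, name, version, key, value):
        self.tags[(name, version, key)] = value

    def set_registered_model_alias(self, name, alias, version):
        self.aliases[(name, alias)] = version


client = RecordingClient()
mark_experiment_version(client, "qc-cli-model", "12")
print(client.aliases)
```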
Future release aliases such as v1 or production can point at a selected deployable artifact.
Example future metadata:
qc-cli-model version 12
qc_cli.stage=experiment
qc_cli.artifact_kind=trained_source
qc_cli.source=sagemaker
qc-cli-model-aihub version 3
qc_cli.stage=ai_hub_compiled
qc_cli.artifact_kind=deployable
qc_cli.parent_registered_model_name=qc-cli-model
qc_cli.parent_model_version=12
qc_cli.runtime=tflite
qc_cli.quantization=int8
qc_cli.target_device=Samsung Galaxy S25
In that flow, experiment-latest remains a training convenience alias. Release selection is a separate promotion decision based on the derived artifact, not on the experiment name.
AWS permissions required
The IAM user or role running the CLI needs:
| Action | Service |
|---|---|
| CreateBucket, DeleteBucket, PutObject, GetObject, ListBucket, DeleteObject | S3 |
| CreateRole, GetRole, DeleteRole, AttachRolePolicy, DetachRolePolicy | IAM |
| CreateStack, UpdateStack, DeleteStack, DescribeStacks, DescribeStackEvents | CloudFormation |
| GetCallerIdentity | STS |
| CreateTrainingJob, DescribeTrainingJob, ListTrainingJobs | SageMaker AI |
| CreateMlflowTrackingServer, DescribeMlflowTrackingServer, DeleteMlflowTrackingServer | SageMaker AI, when mlflow.mode is create or existing |
AdministratorAccess covers all of the above.