Update ai-hub to first optimize the model for Workbench
Remove old examples
22
README.md
@@ -177,27 +177,35 @@ The expected output artifact is SageMaker’s `model.tar.gz`, normally containin
 ```
 qc-cli ai-hub upload <calibration.npz|calibration-dir> <inputs.npz|inputs.npy>
 qc-cli ai-hub upload <calibration> <inputs> --from-step validate
-qc-cli ai-hub quantize <calibration.npz|calibration-dir> [--onnx-path PATH] [--model-s3-uri URI] [--from-job NAME]
+qc-cli ai-hub optimize [--onnx-path PATH] [--model-s3-uri URI] [--from-job NAME]
+qc-cli ai-hub quantize <calibration.npz|calibration-dir> [--model-id ID] [--onnx-path PATH] [--model-s3-uri URI] [--from-job NAME]
 qc-cli ai-hub compile [--model-id ID] [--onnx-path PATH] [--model-s3-uri URI] [--from-job NAME]
 qc-cli ai-hub validate <inputs.npz|inputs.npy> [--model-id ID] [--input-name NAME]
 qc-cli ai-hub profile [--model-id ID]
 qc-cli ai-hub download [--model-id ID] [--output PATH]
 ```

-`ai-hub upload` runs the four Workbench upload steps in order: quantize, compile, validate, and profile. Use `--from-step compile`, `--from-step validate`, or `--from-step profile` to resume from saved local state after a completed earlier step.
+`ai-hub upload` optimizes to ONNX, quantizes, validates, and profiles. When `aihub.target_runtime` is not `onnx`, it
+also compiles the quantized model to that deployment runtime. The initial ONNX optimization gives external models
+Workbench provenance and applies compiler optimization passes before quantization.

 Resume behavior:

 ```text
---from-step quantize  Run quantize, compile, validate, and profile.
---from-step compile   Skip quantize; compile the last quantized model unless an explicit source is passed.
---from-step validate  Skip quantize and compile; validate the last compiled model.
---from-step profile   Skip quantize, compile, and validate; profile the last compiled model.
+--from-step optimize  Run optimize, quantize, optional final compile, validate, and profile.
+--from-step quantize  Quantize the last optimized ONNX, then optionally compile, validate, and profile.
+--from-step compile   Skip optimize and quantize; finalize the last quantized model for the target runtime.
+--from-step validate  Skip optimize, quantize, and compile; validate the last compiled model.
+--from-step profile   Skip optimize, quantize, compile, and validate; profile the last compiled model.
 ```

 When a step runs in the current command, `upload` passes its returned model ID directly to the next step. When a step is skipped, the next step resolves the needed model ID from `.qc-cli.json`. This avoids re-running earlier AI Hub jobs when you only need to continue from a later step.

-`ai-hub compile` resolves model sources in this order: `--model-id`, explicit source options (`--onnx-path`, `--model-s3-uri`, `--from-job`), last quantized model from state, then the last training job from local state. `ai-hub download` is separate because downloading the optimized artifact is outside the four-step Workbench upload loop.
+`ai-hub optimize` compiles an external model with `--target_runtime onnx`. `ai-hub quantize` uses an explicit
+`--model-id`, the last optimized ONNX model, or an explicit/local model source in that order. `ai-hub compile` resolves
+model sources in this order: `--model-id`, explicit source options, last quantized model, then the last training job.
+For `target_runtime: onnx`, upload treats the quantized ONNX as the final model and skips a redundant second compile.
+`ai-hub download` remains separate because downloading is outside the Workbench processing loop.

 AI Hub authentication currently uses the local `qai-hub` SDK configuration. A planned follow-up is to support AWS Systems Manager Parameter Store `SecureString` for team-managed tokens, where `config.yaml` stores only a parameter name such as `/qc-cli/aihub/token`, AWS KMS encrypts the token at rest, and the CLI retrieves it at runtime with `ssm:GetParameter` plus `kms:Decrypt` permissions.
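The new resume behavior in the README change above can be read as a simple step-planning rule. The sketch below is not taken from the qc-cli source: `plan_upload_steps` is a hypothetical helper, and it assumes that the compile step is dropped whenever the target runtime is `onnx`, including when resuming with `--from-step compile`. The runtime name `tflite` in the usage example is only an illustrative non-ONNX value.

```python
from typing import List

# Step order as described in the updated README text.
STEPS = ["optimize", "quantize", "compile", "validate", "profile"]


def plan_upload_steps(from_step: str = "optimize", target_runtime: str = "onnx") -> List[str]:
    """Return the steps an upload run would execute, in order (hypothetical helper)."""
    if from_step not in STEPS:
        raise ValueError(f"unknown step {from_step!r}; expected one of {STEPS}")
    # Resume: keep the requested step and everything after it.
    steps = STEPS[STEPS.index(from_step):]
    # Assumption: for an ONNX target the quantized ONNX is already the final
    # model, so the final compile is skipped.
    if target_runtime == "onnx":
        steps = [s for s in steps if s != "compile"]
    return steps


if __name__ == "__main__":
    print(plan_upload_steps("optimize", "onnx"))
    print(plan_upload_steps("quantize", "tflite"))
```

Keeping the plan as a plain list makes it easy to check which AI Hub jobs a resumed run would submit before anything is uploaded.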
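The source resolution order that the README describes for `ai-hub compile` (`--model-id`, explicit source options, last quantized model, last training job) amounts to a first-match lookup. The following is a minimal sketch under stated assumptions: the function names, the state keys `last_quantized_model_id` and `last_training_job`, and the relative order among the three explicit source options are all hypothetical, since the commit does not show the layout of `.qc-cli.json` or the CLI internals.

```python
import json
from pathlib import Path
from typing import Optional, Tuple


def load_state(path: str = ".qc-cli.json") -> dict:
    """Read the local state file, or return an empty dict if it does not exist."""
    state_file = Path(path)
    if not state_file.exists():
        return {}
    return json.loads(state_file.read_text())


def resolve_compile_source(
    model_id: Optional[str] = None,
    onnx_path: Optional[str] = None,
    model_s3_uri: Optional[str] = None,
    from_job: Optional[str] = None,
    state: Optional[dict] = None,
) -> Tuple[str, str]:
    """Return (kind, value) for the first available model source."""
    state = state or {}
    candidates = [
        ("model_id", model_id),
        # Explicit source options; their relative order is an assumption.
        ("onnx_path", onnx_path),
        ("model_s3_uri", model_s3_uri),
        ("from_job", from_job),
        # Fallbacks from saved local state (hypothetical key names).
        ("last_quantized_model", state.get("last_quantized_model_id")),
        ("last_training_job", state.get("last_training_job")),
    ]
    for kind, value in candidates:
        if value:
            return kind, value
    raise LookupError("no model source given and none found in local state")
```

A caller would typically pass `state=load_state()` so that a skipped step can fall back to the model ID saved by an earlier run.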
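The planned Parameter Store follow-up mentioned in the README could look roughly like the sketch below. It is not part of this commit. `fetch_aihub_token` is a hypothetical helper; the only real API it relies on is boto3's SSM `get_parameter` call with `WithDecryption=True`, which returns the decrypted `SecureString` value when the caller has `ssm:GetParameter` and `kms:Decrypt` permissions. The parameter name is the example given in the README.

```python
def fetch_aihub_token(parameter_name: str, ssm_client=None) -> str:
    """Fetch a SecureString token from AWS Systems Manager Parameter Store.

    `ssm_client` can be injected for testing; when omitted, a boto3 SSM
    client is created from the ambient AWS credentials and region.
    """
    if ssm_client is None:
        import boto3  # imported lazily so the helper can be exercised with a stub

        ssm_client = boto3.client("ssm")
    # WithDecryption=True asks SSM to decrypt the SecureString via KMS.
    response = ssm_client.get_parameter(Name=parameter_name, WithDecryption=True)
    return response["Parameter"]["Value"]


if __name__ == "__main__":
    # Requires AWS credentials and an existing parameter with this name.
    token = fetch_aihub_token("/qc-cli/aihub/token")
    print("token retrieved:", bool(token))
```

Accepting the client as an argument keeps the token lookup testable without network access and leaves credential and region selection to the caller.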