make sure resources are set up in isolated namespaces (#1)

Reviewed-on: #1
This commit was merged in pull request #1.
2026-05-27 12:51:26 +00:00
parent 0e728cc193
commit 6ac9702dc5
11 changed files with 184 additions and 36 deletions


@@ -30,7 +30,7 @@ qc-cli --help
# 1. Create config.yaml in the current directory
qc-cli init
-# 2. Edit config.yaml — at minimum set s3.bucket and sagemaker.training.image_uri
+# 2. Edit config.yaml — at minimum set sagemaker.training.image_uri
# 3. Provision AWS infrastructure (S3 bucket + SageMaker IAM role).
# This is the step that requires the AWS CDK CLI.
@@ -47,15 +47,17 @@ qc-cli train status
`qc-cli init` writes a `config.yaml` in the current directory. The fields you must fill in before using the tool:
```yaml
+infra:
+  stack_name: qc-cli-mlops-1a2b3c4d5e6f
 aws:
   region: us-east-1
   profile: default  # AWS CLI profile name
 s3:
-  bucket: your-unique-bucket-name
+  bucket: qc-cli-mlops-1a2b3c4d5e6f-data
 sagemaker:
   role_name: qc-cli-sagemaker-role
   training:
     image_uri: ""  # ECR URI for your training container
     instance_type: ml.m5.xlarge
@@ -65,6 +67,10 @@ sagemaker:
hyperparameters: {}
```
`qc-cli init` generates a namespace once, derives `infra.stack_name` and `s3.bucket` from it, and writes both values to `config.yaml`. Keep these values stable for a deployment; changing them points the CLI at different infrastructure.
The CLI isolates both application resources and CDK bootstrap resources. The application CloudFormation stack uses `infra.stack_name`, the S3 bucket uses the same generated namespace because bucket names are globally unique, and the SageMaker IAM role uses a CloudFormation-generated physical name. CDK bootstrap resources are derived internally from `infra.stack_name`, including a bootstrap stack named `<stack_name>-bootstrap` and a matching non-default CDK asset bucket qualifier. `qc-cli infra destroy` removes the application stack but leaves the CDK bootstrap stack in place; the command prints the retained bootstrap stack name.
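The naming scheme described above can be sketched in a few lines of Python. This is an illustration only, not the CLI's actual code: `new_namespace`, `derive_names`, and `ResourceNames` are hypothetical names, the `qc-cli-mlops` prefix and the `-data` bucket suffix are inferred from the sample `config.yaml`, and the 12-character hex suffix is an assumption based on the sample value `1a2b3c4d5e6f`. The CDK qualifier is left out because the text does not say how it is derived.

```python
import secrets
from dataclasses import dataclass

# Prefix seen in the sample config.yaml; assumed, not confirmed, to be fixed.
PREFIX = "qc-cli-mlops"


@dataclass(frozen=True)
class ResourceNames:
    stack_name: str       # application CloudFormation stack (infra.stack_name)
    bucket: str           # S3 bucket (s3.bucket)
    bootstrap_stack: str  # CDK bootstrap stack, "<stack_name>-bootstrap"


def new_namespace(prefix: str = PREFIX) -> str:
    """Generate a namespace once, e.g. at `init` time (hypothetical helper)."""
    # 6 random bytes render as 12 lowercase hex characters.
    return f"{prefix}-{secrets.token_hex(6)}"


def derive_names(namespace: str) -> ResourceNames:
    """Derive every resource name from the single stored namespace."""
    return ResourceNames(
        stack_name=namespace,
        bucket=f"{namespace}-data",  # suffix inferred from the sample config
        bootstrap_stack=f"{namespace}-bootstrap",
    )


if __name__ == "__main__":
    print(derive_names("qc-cli-mlops-1a2b3c4d5e6f"))
```

Because every name is a pure function of the stored namespace, re-running the derivation against the same `config.yaml` always targets the same infrastructure, which is why the values should stay stable for a deployment.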
`hyperparameters` is a flat map of values passed to the training container. Valid keys depend on the selected training image and entry point.
To provision an MLflow tracking server, set:
@@ -105,6 +111,12 @@ qc-cli infra destroy --yes Destroy stack without confirmation
qc-cli infra destroy --delete-bucket-data Destroy stack and delete S3 data
```
`--cloudformation-execution-policy` is a one-time CDK bootstrap option, not a `config.yaml` setting. Pass it on `infra setup` when you need the CDK bootstrap CloudFormation execution role to use a policy other than the default `AdministratorAccess`:
```bash
qc-cli infra setup --cloudformation-execution-policy arn:aws:iam::aws:policy/PowerUserAccess
```
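For orientation, the option above plausibly maps onto the AWS CDK CLI's own `cdk bootstrap` flags. The sketch below assumes that mapping; the source does not show how `qc-cli` actually invokes CDK. `bootstrap_argv` is a hypothetical helper, and the qualifier is taken as a parameter because its derivation is not documented here. The flags themselves (`--toolkit-stack-name`, `--qualifier`, `--cloudformation-execution-policies`) are real `cdk bootstrap` options.

```python
from typing import List, Optional

# The default policy named in the text above.
DEFAULT_EXECUTION_POLICY = "arn:aws:iam::aws:policy/AdministratorAccess"


def bootstrap_argv(
    stack_name: str,
    qualifier: str,
    execution_policy: Optional[str] = None,
) -> List[str]:
    """Build a `cdk bootstrap` command line for an isolated bootstrap stack.

    Hypothetical helper: it only assembles arguments and does not run CDK.
    """
    policy = execution_policy or DEFAULT_EXECUTION_POLICY
    return [
        "cdk",
        "bootstrap",
        # Bootstrap stack name follows the "<stack_name>-bootstrap" convention.
        "--toolkit-stack-name", f"{stack_name}-bootstrap",
        # A non-default qualifier keeps bootstrap resources separate.
        "--qualifier", qualifier,
        "--cloudformation-execution-policies", policy,
    ]


if __name__ == "__main__":
    argv = bootstrap_argv(
        "qc-cli-mlops-1a2b3c4d5e6f",
        "qc1a2b3c4d",  # placeholder qualifier, not a real derived value
        "arn:aws:iam::aws:policy/PowerUserAccess",
    )
    print(" ".join(argv))
```

Taking the policy as an argument rather than a config value mirrors the point made above: it matters only when the bootstrap stack is created, so it has no place in `config.yaml`.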
### `upload`
```