S3 Provider for Static Asset Placement
Context
Issue #280 introduces static asset placement for sandboxed assignments with a provider abstraction and an MVP local filesystem provider.
This issue defines the next step: an S3-backed provider for source resolution.
Goal
Add an S3 asset provider so root-level setup assets can be resolved from S3 and injected into sandbox target paths before execution.
Proposed Contract
Keep the assignment-facing assets contract stable and provider-agnostic:
{
  "assets": [
    {
      "source": "s3://autograder-assets/datasets/tp2/RESTAURANTES.CSV",
      "target": "/tmp/RESTAURANTES.CSV",
      "read_only": true
    }
  ]
}
Parsing rules
- source supports:
  - s3://<bucket>/<key>
  - optional future alias scheme like asset://... (not required in the first S3 iteration)
- Keep existing local source support unchanged.
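The parsing rule for s3://<bucket>/<key> could look roughly like the sketch below. The names S3Location and parse_s3_source are hypothetical and not part of the existing codebase. The sketch splits the string by hand instead of using a general URL parser, on the assumption that object keys may contain characters such as ? or # that a URL parser would treat as query or fragment delimiters.

```python
from dataclasses import dataclass

S3_PREFIX = "s3://"


@dataclass(frozen=True)
class S3Location:
    """Hypothetical value object holding the parsed bucket and key."""
    bucket: str
    key: str


def parse_s3_source(source: str) -> S3Location:
    """Parse 's3://<bucket>/<key>' and reject anything else."""
    if not source.startswith(S3_PREFIX):
        raise ValueError(f"unsupported source scheme: {source!r}")
    # Everything up to the first '/' is the bucket; the rest is the key.
    bucket, _, key = source[len(S3_PREFIX):].partition("/")
    if not bucket or not key:
        raise ValueError(f"malformed S3 URI, expected s3://<bucket>/<key>: {source!r}")
    return S3Location(bucket=bucket, key=key)


if __name__ == "__main__":
    print(parse_s3_source("s3://autograder-assets/datasets/tp2/RESTAURANTES.CSV"))
```

Raising ValueError for both failure modes keeps the pre-flight error path simple. A real implementation might prefer a dedicated exception type.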
Implementation Plan
1) Provider abstraction extension
- Extend AssetSourceResolver with an S3 provider implementation (for example S3AssetProvider).
- Choose provider by URI scheme in source.
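One possible shape for scheme-based provider selection is sketched below. The actual interface of AssetSourceResolver from issue #280 is not shown here, so SketchResolver, scheme_of, and the callable-based provider signature are all assumptions made for illustration. Sources without a scheme are assumed to be local paths.

```python
from typing import Callable, Dict

# Assumption: a provider is any callable that turns a source string into bytes.
Provider = Callable[[str], bytes]


def scheme_of(source: str) -> str:
    """Return the lowercased URI scheme, or 'local' for plain filesystem paths."""
    head, sep, _ = source.partition("://")
    return head.lower() if sep else "local"


class SketchResolver:
    """Hypothetical stand-in for AssetSourceResolver: a scheme-to-provider registry."""

    def __init__(self) -> None:
        self._providers: Dict[str, Provider] = {}

    def register(self, scheme: str, provider: Provider) -> None:
        self._providers[scheme.lower()] = provider

    def resolve(self, source: str) -> bytes:
        scheme = scheme_of(source)
        provider = self._providers.get(scheme)
        if provider is None:
            # Unknown schemes fail here, before any network or disk access.
            raise ValueError(f"unsupported source scheme {scheme!r} in {source!r}")
        return provider(source)


if __name__ == "__main__":
    resolver = SketchResolver()
    resolver.register("local", lambda s: f"local bytes for {s}".encode())
    resolver.register("s3", lambda s: f"s3 bytes for {s}".encode())
    print(resolver.resolve("s3://autograder-assets/datasets/tp2/RESTAURANTES.CSV"))
```

With a registry like this, the existing local provider keeps handling scheme-less sources, and adding S3 is a single extra registration.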
2) S3 client integration
- Use AWS SDK (boto3) with default credential chain (IAM role, env vars, shared creds).
- Resolve object by bucket/key and stream bytes for injection.
- Capture metadata needed for observability (size, etag/version_id when available).
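The fetch step might be sketched as follows. AssetMetadata and fetch_to_file are hypothetical names. The function accepts any object exposing boto3's get_object(Bucket=..., Key=...) call, so in production it would receive boto3.client("s3") and in tests a fake. It relies only on the response's Body, ETag, and VersionId fields, the last of which is present only for versioned buckets.

```python
import io
from dataclasses import dataclass
from typing import Any, BinaryIO, Optional


@dataclass(frozen=True)
class AssetMetadata:
    """Hypothetical record of what was fetched, for logs and metrics."""
    size: int
    etag: Optional[str]
    version_id: Optional[str]


def fetch_to_file(client: Any, bucket: str, key: str, dest: BinaryIO,
                  chunk_size: int = 1024 * 1024) -> AssetMetadata:
    """Stream one S3 object into dest in chunks and return its metadata."""
    response = client.get_object(Bucket=bucket, Key=key)
    body = response["Body"]  # file-like streaming body
    written = 0
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        dest.write(chunk)
        written += len(chunk)
    return AssetMetadata(
        size=written,
        etag=response.get("ETag"),
        version_id=response.get("VersionId"),
    )


class FakeS3Client:
    """Test double that mimics the subset of get_object used above."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload

    def get_object(self, Bucket: str, Key: str) -> dict:
        return {"Body": io.BytesIO(self._payload), "ETag": '"abc123"'}


if __name__ == "__main__":
    # In production: client = boto3.client("s3"), which uses the default credential chain.
    sink = io.BytesIO()
    print(fetch_to_file(FakeS3Client(b"id,name\n1,x\n"), "bucket", "key", sink))
```

Streaming in fixed-size chunks keeps host memory bounded regardless of asset size.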
3) Security controls
- Optional bucket allowlist (ASSET_S3_ALLOWED_BUCKETS).
- Optional key prefix allowlist per environment (ASSET_S3_ALLOWED_PREFIXES).
- Reject unsupported schemes and malformed URIs.
- Enforce max size limits before full download when possible (using head_object).
- No credentials passed into sandbox containers.
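The allowlist and size checks above could be sketched like this. check_allowed and check_size are hypothetical helpers. An allowlist of None is assumed to mean "not configured, allow everything", matching the "optional" wording. The size check relies on head_object returning ContentLength, which is standard boto3 behavior.

```python
from typing import Any, Collection, Optional


def check_allowed(bucket: str, key: str,
                  allowed_buckets: Optional[Collection[str]],
                  allowed_prefixes: Optional[Collection[str]]) -> None:
    """Raise PermissionError if the bucket or key falls outside the allowlists."""
    if allowed_buckets is not None and bucket not in allowed_buckets:
        raise PermissionError(f"bucket not in allowlist: {bucket!r}")
    if allowed_prefixes is not None and not any(
        key.startswith(prefix) for prefix in allowed_prefixes
    ):
        raise PermissionError(f"key outside allowed prefixes: {key!r}")


def check_size(client: Any, bucket: str, key: str, max_bytes: int) -> int:
    """Use head_object to reject oversized assets before downloading them."""
    head = client.head_object(Bucket=bucket, Key=key)
    size = head["ContentLength"]
    if size > max_bytes:
        raise ValueError(
            f"asset s3://{bucket}/{key} is {size} bytes, limit is {max_bytes}"
        )
    return size


class FakeHeadClient:
    """Test double that reports a fixed object size."""

    def __init__(self, size: int) -> None:
        self._size = size

    def head_object(self, Bucket: str, Key: str) -> dict:
        return {"ContentLength": self._size}


if __name__ == "__main__":
    check_allowed("autograder-assets", "datasets/tp2/RESTAURANTES.CSV",
                  {"autograder-assets"}, ["datasets/"])
    print(check_size(FakeHeadClient(2048), "autograder-assets", "datasets/x", 4096))
```

Because an object can change between head_object and the download, a byte counter during streaming would still be a sensible second line of defense.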
4) Reliability and performance
- Add bounded retries with backoff for transient S3 errors.
- Distinguish retryable vs non-retryable failures.
- Keep deterministic pre-flight failure messages for missing objects, access denied, timeouts.
- Optional (future) host-side cache by bucket/key/version to reduce repeated downloads.
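A minimal sketch of bounded retries with exponential backoff follows, assuming errors carry a botocore-style response["Error"]["Code"] as ClientError does. The helper names, the attempt count, the delays, and the exact set of retryable codes are illustrative assumptions, not decisions made in this issue.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Assumption: S3 error codes treated as transient. Codes such as NoSuchKey
# or AccessDenied are absent, so they fail immediately.
RETRYABLE_CODES = frozenset(
    {"SlowDown", "InternalError", "ServiceUnavailable", "RequestTimeout"}
)


def error_code(exc: BaseException) -> Optional[str]:
    """Extract the service error code from a botocore-style exception, if any."""
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        return response.get("Error", {}).get("Code")
    return None


def with_retries(operation: Callable[[], T], max_attempts: int = 3,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying transient failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return operation()
        except Exception as exc:
            retryable = error_code(exc) in RETRYABLE_CODES
            if not retryable or attempt >= max_attempts:
                raise  # non-retryable, or retry budget exhausted
            sleep(base_delay * 2 ** (attempt - 1))
            attempt += 1


class FakeClientError(Exception):
    """Test double shaped like botocore's ClientError."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.response = {"Error": {"Code": code}}
```

botocore also ships its own retry handling, configurable through botocore.config.Config(retries=...), so an explicit wrapper like this may only be needed where the pipeline wants control over which failures are surfaced as pre-flight errors.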
5) Pipeline integration
6) Config and ops
- Document required IAM permissions:
  - s3:GetObject on allowed asset prefixes
  - s3:ListBucket only if needed
  - s3:GetObjectVersion if versioned objects are used
- Add environment variables and defaults for:
  - request timeout
  - max asset size
  - allowed buckets/prefixes
- Provide deployment notes for staging/prod credential setup.
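Loading these settings from the environment could look like the sketch below. ASSET_S3_ALLOWED_BUCKETS and ASSET_S3_ALLOWED_PREFIXES come from the security section of this issue. The other two variable names, the comma-separated list format, and the default values are placeholders that the issue does not specify.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class S3AssetConfig:
    """Hypothetical settings object for the S3 asset provider."""
    request_timeout_seconds: float
    max_asset_bytes: int
    allowed_buckets: Optional[Tuple[str, ...]]   # None means no restriction
    allowed_prefixes: Optional[Tuple[str, ...]]  # None means no restriction


def _split_list(raw: Optional[str]) -> Optional[Tuple[str, ...]]:
    """Parse a comma-separated list; unset or blank means 'not configured'."""
    if raw is None or not raw.strip():
        return None
    return tuple(item.strip() for item in raw.split(",") if item.strip())


def load_s3_asset_config(env: Mapping[str, str] = os.environ) -> S3AssetConfig:
    return S3AssetConfig(
        # Placeholder variable names and defaults, not defined by this issue.
        request_timeout_seconds=float(env.get("ASSET_S3_REQUEST_TIMEOUT_SECONDS", "10")),
        max_asset_bytes=int(env.get("ASSET_S3_MAX_ASSET_BYTES", str(100 * 1024 * 1024))),
        allowed_buckets=_split_list(env.get("ASSET_S3_ALLOWED_BUCKETS")),
        allowed_prefixes=_split_list(env.get("ASSET_S3_ALLOWED_PREFIXES")),
    )


if __name__ == "__main__":
    print(load_s3_asset_config({"ASSET_S3_ALLOWED_BUCKETS": "autograder-assets"}))
```

Taking the environment as a mapping argument keeps the loader easy to unit test without touching the real process environment.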
7) Tests
- Unit tests for URI parsing and provider selection.
- Unit tests for S3 provider behavior (mocked boto3):
  - success path
  - object not found
  - access denied
  - timeout/retry handling
  - size limit rejection
- Integration tests (if infra allows):
  - path using test bucket/object
  - end-to-end pipeline with S3-sourced asset consumed by student code.
8) Documentation
- Update setup-config docs with S3 examples.
- Add operational runbook for IAM and bucket layout.
- Add troubleshooting section for common S3 failures.
Acceptance Criteria
- source: s3://bucket/key is supported for global root-level assets.
- Assets are fetched from S3 and injected before setup commands/grading.
- Failures are explicit and fail pre-flight safely.
- Security constraints (allowlists/limits) are enforced.
- Tests and docs are updated.
Dependency