Releases: STiFLeR7/imgshape
v5.0.0
imgshape v4.2.0
🖼️ imgshape
The Data-Centric AI Toolkit for Vision Engineers
"Automatically analyze any image dataset and get model-ready preprocessing recommendations in one command."
🚀 Live Demo (Web) • 📖 Documentation • 💬 Report Bug / Discuss
✨ What's New in v4.2 "Bento Intelligence"
- 🍱 Bento Grid UI: A complete UX overhaul using a modular 12-column grid for high-density dataset insights.
- 🌊 Semantic Drift 2.0: detect dataset shifts using DINOv2 vision transformer embeddings.
- 🚀 Atlas Bento Engine: 40% faster fingerprinting via vectorized IO and multi-stage caching.
- 🧩 Domain Profiles: One-click configurations for Medical, Satellite, and OCR datasets.
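The drift idea in the list above can be illustrated without the model itself. This is a minimal sketch, assuming each dataset version has already been reduced to a list of embedding vectors (the two-dimensional vectors below are made-up placeholders, not DINOv2 outputs) and that drift is scored as the cosine distance between the two mean embeddings. The function names and the scoring rule are assumptions for illustration, not imgshape's implementation.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(a, b):
    """1 - cosine similarity; close to 0.0 when both vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def drift_score(old_embeddings, new_embeddings):
    """Hypothetical drift score: distance between the mean embeddings."""
    return cosine_distance(mean_vector(old_embeddings), mean_vector(new_embeddings))


# Placeholder embeddings for three dataset versions.
baseline = [[1.0, 0.0], [0.9, 0.1]]
same = [[0.95, 0.05], [0.95, 0.05]]
shifted = [[0.0, 1.0], [0.1, 0.9]]

print(drift_score(baseline, same))     # near zero: no drift
print(drift_score(baseline, shifted))  # much larger: the distribution moved
```

Comparing means is the simplest possible choice; it hides changes in spread, so a real drift check would likely look at more than one statistic.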
⚡ 30-Second Start
Don't guess your dataset's health. Audit it immediately with the Atlas engine.
pip install imgshape
from imgshape import Atlas
# 1. Initialize the Atlas Orchestrator
atlas = Atlas()
# 2. Extract deterministic fingerprint
result = atlas.extract_fingerprint("./my_dataset")
# 3. View the verdict
print(result.summary())
System Output:
{
"fingerprint_id": "fp_8a7d9f2",
"total_images": 4502,
"corrupt_files": 12,
"metrics": {
"avg_resolution": "1024x768",
"diversity_score": 0.89,
"channel_consistency": "FAIL"
},
"issues": ["Found 14 grayscale images in RGB dataset"]
}
🔍 The Visual Dashboard (Atlas UI)
Experience imgshape's capabilities visually. The dashboard provides a real-time interface for dataset fingerprinting, augmentation previews, and pipeline configuration using the new Bento Grid layout.
Dashboard v4.2.0 showing Bento Grid layout and semantic drift detection.
🚀 Why imgshape?
Most vision models fail because of garbage data—corrupt files, mixed channels (RGBA vs RGB), or weird aspect ratios. imgshape catches these before you train using a deterministic rule engine.
| Module | Technical Function |
|---|---|
| 🔍 Instant Audit | Multi-threaded + GPU-accelerated scan for entropy, blur, and variance using PyTorch. |
| 🧠 Decision Engine | Heuristic-based suggestion engine with Provenance IDs and Reproducibility Hashes. |
| 📊 Semantic Drift | NEW: DINOv2-powered drift analysis between dataset versions. |
| 🍱 Bento Grid UI | NEW: High-density Modular Dashboard for interactive exploration. |
| 🛠️ Pipeline Export | Generates serialization-safe code for PyTorch, TensorFlow, and Albumentations. |
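As an illustration of the mixed-channel problem described above, here is a minimal sketch of a deterministic channel check. It assumes the image modes have already been collected as strings such as `"RGB"`, `"RGBA"`, or `"L"` (the values Pillow exposes as `Image.mode`). The function name `audit_channel_modes` and the report shape are hypothetical, loosely modeled on the sample output shown earlier.

```python
from collections import Counter


def audit_channel_modes(modes):
    """Return a PASS/FAIL verdict plus readable issues for a list of image modes."""
    counts = Counter(modes)
    if not counts:
        return {"channel_consistency": "PASS", "issues": []}
    # The most frequent mode is treated as the dataset's intended mode.
    majority, _ = counts.most_common(1)[0]
    issues = [
        f"Found {n} {mode} images in {majority} dataset"
        for mode, n in sorted(counts.items())
        if mode != majority
    ]
    return {"channel_consistency": "FAIL" if issues else "PASS", "issues": issues}


report = audit_channel_modes(["RGB"] * 10 + ["L"] * 2 + ["RGBA"])
print(report)
```

Because the rule only counts and compares, the same input always yields the same verdict, which is the property a CI gate needs.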
📦 Installation Matrix
Choose your deployment flavor.
| Command | Use Case | Size |
|---|---|---|
| `pip install imgshape` | Core / CI/CD | ~12MB |
| `pip install "imgshape[full]"` | Research / Power User | ~45MB |
| `pip install "imgshape[ui]"` | Interactive / Dashboard | ~30MB |
💡 Practical Use Cases
1. The "Sanity Check" (CI/CD Integration)
Block bad data from entering your training bucket. Ideal for GitHub Actions or Jenkins.
# Returns exit code 1 if corrupt files or schema violations are found
imgshape --check ./new_batch_v2 --strict-schema
2. The "Pipeline Builder"
Don't guess augmentation parameters. Let the entropy statistics decide.
# analyze -> recommend -> export PyTorch snippet
imgshape --path ./train_data --analyze --recommend --out transforms.py
3. The "Visual Explorer"
Verify RandomCrop or ColorJitter intensity manually before training.
# Launches local studio with auto-reload
imgshape --web --reload
🏗️ Architecture & Internal Mechanics
imgshape (Aurora Engine) operates on a Fingerprint-Analyze-Decide loop, acting as a middleware between raw storage and compute.
graph TD
subgraph "Data Layer"
A[Raw Images]
end
subgraph "imgshape Core (Atlas Bento)"
B[Fingerprint Extractor] -->|Hash & Meta| C{Decision Engine}
C -->|Rules v4.2| D[Recommendation]
end
subgraph "Integration Layer"
D --> E[PyTorch/TF Code]
D --> F[JSON Artifacts]
D --> G[HTML/PDF Reports]
end
A --> B
Core Components
- Atlas Bento Orchestrator: The central intent-driven API that manages the lifecycle of an analysis session.
- Fingerprint Extractor: A stateless module that computes immutable signatures for datasets (distributions, channel counts, hashes).
- Decision Engine: A rule-based system that maps dataset signatures + User Intent (e.g., "Speed" vs "Accuracy") to concrete preprocessing steps.
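The Decision Engine described above can be pictured as a short list of ordered rules. The sketch below is an assumption-heavy illustration: the signature keys (`channel_counts`, `mean_entropy`), the thresholds, and the step names are all hypothetical and are not taken from imgshape's rule set.

```python
def decide(signature, intent):
    """Map a dataset signature and a user intent to ordered preprocessing steps.

    Each step is a (name, value, reason) tuple so the choice stays traceable.
    """
    steps = []
    # Rule 1: mixed channel counts -> force a single colour mode.
    if len(signature["channel_counts"]) > 1:
        steps.append(("convert_mode", "RGB", "dataset mixes channel counts"))
    # Rule 2: the intent picks the target resolution (values are illustrative).
    side = 224 if intent == "speed" else 384
    steps.append(("resize", (side, side), f"intent={intent}"))
    # Rule 3: low-entropy data gets lighter augmentation (threshold is made up).
    if signature["mean_entropy"] < 5.0:
        steps.append(("augment", "light", "mean entropy below 5.0"))
    return steps


signature = {"channel_counts": {1, 3}, "mean_entropy": 4.2}
for step in decide(signature, "speed"):
    print(step)
```

Keeping the reason next to each step is one simple way to make a rule-based decision explainable after the fact.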
🤝 Community & Support
- Issues: Found a bug? Open an issue.
- Discussions: Feature requests? Join the discussion.
Built by Stifler for the AI Engineering community.
Star on GitHub ⭐ — it helps more people find clean data.
imgshape v4.1.0
🖼️ imgshape
The Data-Centric AI Toolkit for Vision Engineers
"Automatically analyze any image dataset and get model-ready preprocessing recommendations in one command."
🚀 Live Demo (Web) • 📖 Documentation • 💬 Report Bug / Discuss
⚡ 30-Second Start
Don't guess your dataset's health. Audit it immediately with the Atlas engine.
pip install imgshape
from imgshape import Atlas
# 1. Initialize the Atlas Orchestrator
atlas = Atlas()
# 2. Extract deterministic fingerprint
result = atlas.extract_fingerprint("./my_dataset")
# 3. View the verdict
print(result.summary())
System Output:
{
"fingerprint_id": "fp_8a7d9f2",
"total_images": 4502,
"corrupt_files": 12,
"metrics": {
"avg_resolution": "1024x768",
"diversity_score": 0.89,
"channel_consistency": "FAIL"
},
"issues": ["Found 14 grayscale images in RGB dataset"]
}
🔍 The Visual Dashboard (Atlas UI)
Experience imgshape's capabilities visually. The dashboard provides a real-time interface for dataset fingerprinting, augmentation previews, and pipeline configuration.
Dashboard v4.1.0 showing GPU acceleration status and drift detection.
🚀 Why imgshape?
Most vision models fail because of garbage data—corrupt files, mixed channels (RGBA vs RGB), or weird aspect ratios. imgshape catches these before you train using a deterministic rule engine.
| Module | Technical Function |
|---|---|
| 🔍 Instant Audit | Multi-threaded + GPU-accelerated scan for entropy, blur, and variance using PyTorch. |
| 🧠 Decision Engine | Heuristic-based suggestion engine with Provenance IDs and Reproducibility Hashes. |
| 📊 Comparison Layer | NEW: Drift Analysis and Similarity Indexing between dataset versions. |
| 🛠️ Pipeline Export | Generates serialization-safe code for PyTorch, TensorFlow, and Albumentations. |
| 🎨 Visual Studio | Local Web Dashboard for interactive augmentation testing and hypothesis verification. |
📦 Installation Matrix
Choose your deployment flavor.
| Command | Use Case | Size |
|---|---|---|
| `pip install imgshape` | Core / CI/CD | ~12MB |
| `pip install "imgshape[full]"` | Research / Power User | ~45MB |
| `pip install "imgshape[ui]"` | Interactive / Dashboard | ~30MB |
💡 Practical Use Cases
1. The "Sanity Check" (CI/CD Integration)
Block bad data from entering your training bucket. Ideal for GitHub Actions or Jenkins.
# Returns exit code 1 if corrupt files or schema violations are found
imgshape --check ./new_batch_v2 --strict-schema
2. The "Pipeline Builder"
Don't guess augmentation parameters. Let the entropy statistics decide.
# analyze -> recommend -> export PyTorch snippet
imgshape --path ./train_data --analyze --recommend --out transforms.py
3. The "Visual Explorer"
Verify RandomCrop or ColorJitter intensity manually before training.
# Launches local studio with auto-reload
imgshape --web --reload
🏗️ Architecture & Internal Mechanics
imgshape (Aurora Engine) operates on a Fingerprint-Analyze-Decide loop, acting as a middleware between raw storage and compute.
graph TD
subgraph "Data Layer"
A[Raw Images]
end
subgraph "imgshape Core (Atlas)"
B[Fingerprint Extractor] -->|Hash & Meta| C{Decision Engine}
C -->|Rules v4.0| D[Recommendation]
end
subgraph "Integration Layer"
D --> E[PyTorch/TF Code]
D --> F[JSON Artifacts]
D --> G[HTML/PDF Reports]
end
A --> B
Core Components
- Atlas Orchestrator: The central intent-driven API that manages the lifecycle of an analysis session.
- Fingerprint Extractor: A stateless module that computes immutable signatures for datasets (distributions, channel counts, hashes).
- Decision Engine: A rule-based system that maps dataset signatures + User Intent (e.g., "Speed" vs "Accuracy") to concrete preprocessing steps.
🤝 Community & Support
- Issues: Found a bug? Open an issue.
- Discussions: Feature requests? Join the discussion.
Built by Stifler for the AI Engineering community.
Star on GitHub ⭐ — it helps more people find clean data.
v4.0.0
📊 imgshape
Dataset Intelligence Layer for Computer Vision
v4.0.0 Atlas Edition
Deterministic Dataset Fingerprinting & Intelligent Decision Making
Fingerprinting • Rule-Based Decisions • Explainable AI • Deployable Artifacts • Production Ready
🌐 Live Demo • Documentation • v4 Guide • Report Bug • Request Feature
🚀 imgshape v4.0.0 (Atlas)
Atlas is a complete architectural redesign of imgshape, shifting from heuristic recommendations to deterministic dataset intelligence.
Core Capabilities
| Feature | Description |
|---|---|
| 🔬 Deterministic Fingerprinting | Stable, canonical dataset identities across runs and deployments |
| 🎯 Rule-Based Decisions | Explainable, traceable decisions with full reasoning |
| 📐 Five-Profile System | Spatial, Signal, Distribution, Quality, Semantic analysis |
| 📦 Deployable Artifacts | CI-safe, version-controlled outputs for production |
| 🔓 No Hidden Logic | Every decision includes complete rationale and confidence |
| ⚙️ Framework Agnostic | Works with PyTorch, TensorFlow, JAX, or plain NumPy |
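To make "stable, canonical dataset identities" concrete, here is a minimal sketch of a content-addressed dataset ID. It assumes the identity depends only on relative file paths and file bytes, and it reuses the `sha256:` prefix seen in the example output further down. The function `dataset_id` is hypothetical; how Atlas actually derives its fingerprint is not stated in these notes.

```python
import hashlib
import tempfile
from pathlib import Path


def dataset_id(root):
    """Return a sha256-based ID that depends only on relative paths and file bytes."""
    root = Path(root)
    digest = hashlib.sha256()
    # Sorting makes the result independent of directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separator between path and content hash
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return "sha256:" + digest.hexdigest()


# Two directories with the same files written in a different order.
with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    for name in ("x.png", "y.png"):
        Path(a, name).write_bytes(name.encode())
    for name in ("y.png", "x.png"):
        Path(b, name).write_bytes(name.encode())
    id_a, id_b = dataset_id(a), dataset_id(b)
    # Changing one file must change the ID.
    Path(b, "y.png").write_bytes(b"changed")
    id_b_changed = dataset_id(b)

print(id_a == id_b, id_a == id_b_changed)
```

An ID built this way ignores timestamps and absolute locations, so the same data hashes identically on a laptop and on a CI runner.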
Why Atlas?
Before (v3): "This dataset looks good for ResNet50."
Now (v4 Atlas): "This dataset's fingerprint is imgshape://vision/photographic/high-entropy. For task=classification with priority=speed, we recommend MobileNetV3 because: [8 explicit reasons with metrics]."
⚡ Quick Start
Installation
# Core package
pip install imgshape
# With web UI and full features
pip install "imgshape[full]"
Python API (v4)
from imgshape import Atlas
# Initialize the analyzer
atlas = Atlas()
# Analyze a dataset
result = atlas.analyze(
    dataset_path="path/to/images",
    task="classification",
    deployment="edge",
    priority="speed"
)
# Inspect results
print(f"Fingerprint: {result.fingerprint.dataset_uri}")
# Fingerprint: imgshape://vision/photographic/high-entropy
print(f"Recommended Model: {result.decisions['model_family'].selected}")
# Recommended Model: MobileNetV3
print(f"Reasoning: {result.decisions['model_family'].why}")
# Reasoning: [8 evidence points with metrics]
# Export for CI/CD
artifact = result.to_artifact()
artifact.save("dataset_analysis.json")
Command Line (v4)
# Generate fingerprint
imgshape --fingerprint path/to/images --format json
# Run full analysis
imgshape --atlas path/to/images --task classification --output analysis.json
# View decisions
imgshape --decisions path/to/images --priority speed --deployment edge
# Interactive web UI
imgshape --web
# Opens http://localhost:8080 with modern React interface
Web Interface
The imgshape web UI provides an interactive, modern interface for dataset analysis:
Live Demo: 🌐 imgshape.vercel.app
imgshape --web
Features:
- 📊 Real-time fingerprint generation and visualization
- 🎯 Interactive decision explorer with full reasoning
- 📈 Dataset statistics dashboard
- 💾 Export analysis results (JSON, YAML, PDF)
- 🚀 Deploy artifacts directly from the UI
🏗️ Architecture
Core Components
┌─────────────────────────────────────────────────┐
│ Atlas Orchestrator │
│ (Main coordination & result aggregation) │
└────────────┬────────────────────────────────────┘
│
┌────────┼────────┐
│ │ │
▼ ▼ ▼
┌────────┐ ┌──────┐ ┌─────────┐
│Finger- │ │Rules │ │Artifact │
│print │ │Based │ │Generator│
│Engine │ │Decis-│ │ │
│ │ │ion │ │ │
└────────┘ └──────┘ └─────────┘
│ │ │
└────────┼─────────┘
│
▼
┌────────────────┐
│Result Bundle │
│ - Fingerprint │
│ - Decisions │
│ - Artifacts │
│ - Confidence │
└────────────────┘
Fingerprint Profiles
Every dataset receives a 5-dimensional fingerprint:
- Spatial Profile - Image dimensions, aspect ratios, scale distribution
- Signal Profile - Channel count, bit depth, dynamic range
- Distribution Profile - Entropy, skewness, color uniformity
- Quality Profile - Corruption rate, blur detection, noise estimation
- Semantic Profile - Inferred content type (faces, objects, aerial, medical, etc.)
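The Distribution Profile lists entropy as one of its metrics. As a worked example, Shannon entropy over discrete pixel values can be computed as below; for 8-bit data the result ranges from 0 to 8 bits, which is consistent with the 7.84 shown in the example output. This is a generic sketch of the standard formula, not a claim about how imgshape computes its entropy field.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Entropy in bits of a sequence of discrete pixel values."""
    counts = Counter(pixels)
    total = len(pixels)
    # H = -sum(p * log2(p)) over the observed value frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


flat = [128] * 64           # a single grey level: no information
uniform = list(range(256))  # every 8-bit value exactly once: maximum entropy

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```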
🎯 Decision Domains
Atlas makes deterministic decisions across 8 domains:
| Domain | Examples |
|---|---|
| Model Family | ResNet, MobileNet, ViT, EfficientNet, etc. |
| Input Dimensions | 224x224, 512x512, or custom based on content |
| Preprocessing | Normalization parameters, augmentation strategy |
| Batch Size | Based on memory constraints and convergence |
| Optimizer | Adam, SGD, AdamW based on dataset characteristics |
| Augmentation | RandAugment, MixUp, Cutmix, intensity levels |
| Deployment Target | CPU, GPU, Edge (TensorRT, ONNX), Mobile |
| Training Duration | Early stopping patience, epoch count, callbacks |
📊 Example Analysis Output
{
"fingerprint": {
"dataset_uri": "imgshape://vision/photographic/high-entropy",
"dataset_id": "sha256:abc123...",
"sample_count": 50000,
"spatial": {
"resolution_class": "high",
"aspect_ratio_variance": 0.23,
"mean_dimensions": [1920, 1080]
},
"signal": {
"channel_count": 3,
"bit_depth": 8
},
"distribution": {
"entropy": 7.84,
"color_uniformity": 0.42
},
"quality": {
"corruption_rate": 0.0,
"blur_percentage": 3.2,
"noise_estimate": "gaussian"
},
"semantic": {
"inferred_type": "photographic",
"confidence": 0.92
}
},
"decisions": {
"model_family": {
"selected": "MobileNetV3",
"confidence": 0.87,
"why": [
"Dataset has 50k images (suitable for efficient models)",
"Spatial resolution is high (1920x1080 average)",
"Photographic content with 0.23 aspect ratio variance",
"Edge deployment prioritizes inference speed over accuracy",
"MobileNetV3 offers 2.8x faster inference than ResNet50",
"Maintains 91% of ResNet50 accuracy on ImageNet",
"Works on CPU and mobile devices",
"Recent architecture (2019) with good operator support"
],
"alternatives": ["EfficientNetB1", "ResNet34"]
},
"input_dimensions": {
"selected": [224, 224],
"confidence": 0.95,
"why": ["MobileNetV3 default", "High entropy favors standard sizes"]
}
},
"artifacts": {
"fingerprint_stable": true,
"fingerprint_format": "v4",
"export_formats": ["json", "yaml", "protobuf"]
}
}
💻 Usage Patterns
1. CI/CD Integration
#!/bin/bash
# ci_check.sh - Ensure dataset integrity in your pipeline
imgshape --fingerprint data/train \
--output fingerprint.json \
--format json
# Compare with expected fingerprint
CURRENT=$(jq -r .dataset_id fingerprint.json)
EXPECTED=$(cat .fingerprint_lock)
if [ "$CURRENT" != "$EXPECTED" ]; then
echo "❌ Dataset changed! Update .fingerprint_lock"
exit 1
fi
echo "✅ Dataset verified"
2. Training Script Integration
from imgshape import Atlas
# In your training pipeline
atlas = Atlas()
analysis = atlas.analyze("data/train", task="classification")
# Use recommendations
model = create_model(
    architecture=analysis.decisions['model_family'].selected,
    input_size=analysis.decisions['input_dimensions'].selected
)
augmentation = get_augmentation_pipeline(
    analysis.decisions['augmentation'].selected
)
print(f"Fingerprint: {analysis.fingerprint.dataset_uri}")
print(f"Model: {model.__class__.__name__}")
3. Manual Inspection
# Generate comprehensive report
imgshape --atlas data/train \
--task detection \
--deployment gpu \
--priority accuracy \
--report analysis_report.md
# View decisions
imgshape --decisions data/train \
--output decisions.json \
--verbose
🔌 Plugin System
Extend imgshape with custom fingerprint extractors and decision rules.
# plugins/medical_profiler.py
from imgshape.plugins import FingerprintPlugin
class MedicalProfiler(FingerprintPlugin):
    """Extract DICOM-specific attributes"""
    NAME = "medical_profiler_v1"

    def extract(self, dataset_path):
        # Custom logic for medical imaging
        return {
            "modality": "CT",
            "bit_depth": 16,
            "is_3d": True
        }
Register and us...
imgshape v3.0.0
🖼️ imgshape — Smart Dataset Intelligence Toolkit (v3.0.0 • Aurora)
imgshape is a modular Python toolkit for image analysis, dataset inspection, augmentation & preprocessing recommendations, visualization, and pipeline export — now evolved into a Streamlit-powered dataset assistant for modern ML/DL workflows.
✨ What's New in v3.0.0 — Aurora Major Release
A complete redesign: from a static CLI toolkit → to an intelligent dataset analysis framework.
🧭 Highlights
- Full Streamlit App (`app.py`) with 6 powerful tabs:
  - 📐 Shape → instant image shape detection
  - 🔍 Analyze → entropy, color channels, dataset insights
  - 🧠 Recommend → preprocessing & augmentation planning
  - 🎨 Augment Visualizer → real-time augmentation previews
  - 📄 Reports → export Markdown / HTML dataset reports
  - 🔗 Pipeline Export → generate ready-to-run code snippets
🧩 Modular Architecture
- New `RecommendationPipeline` system for building, saving, and exporting end-to-end pipelines.
- Plugin framework (`/src/imgshape/plugins`) with support for:
  - `AnalyzerPlugin`
  - `RecommenderPlugin`
  - `ExporterPlugin`
- Unified lazy import system for ultra-fast startup.
💡 Smart Recommendations
`RecommendEngine` provides preprocessing & augmentation strategies based on:
- Entropy, resolution, and dataset diversity
- User preferences (e.g. `preserve_aspect`, `low_res`)
- Optional YAML profiles (`/profiles/`)
📊 Dataset Analyzer Improvements
- Counts only unique readable images (no overcount)
- Aggregates shapes, channels, entropy, and unreadable stats
- Sample summaries for representative examples
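One way to avoid overcounting, as described in the list above, is to count images by content hash instead of by filename. The sketch below assumes that "unique" means byte-identical content and takes the readability test as a caller-supplied function. The PNG signature check used here is only a stand-in for real decoding, and `count_unique_readable` is a hypothetical helper, not part of imgshape.

```python
import hashlib

# Every PNG file starts with this fixed 8-byte signature.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def count_unique_readable(files, is_readable):
    """files: mapping of path -> bytes; is_readable: callable deciding decodability."""
    seen = set()
    unreadable = 0
    for _path, data in sorted(files.items()):
        if not is_readable(data):
            unreadable += 1
            continue
        # Byte-identical copies collapse onto the same digest.
        seen.add(hashlib.sha256(data).hexdigest())
    return {"unique_readable": len(seen), "unreadable": unreadable}


files = {
    "a.png": PNG_MAGIC + b"1",
    "copy_of_a.png": PNG_MAGIC + b"1",  # duplicate content, different name
    "b.png": PNG_MAGIC + b"2",
    "broken.png": b"oops",              # fails the signature check
}
stats = count_unique_readable(files, lambda data: data.startswith(PNG_MAGIC))
print(stats)
```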
📁 Reports
- Markdown, HTML, and PDF (optional via `weasyprint` + `reportlab`)
- Embedded metadata, augmentations, and preprocessing recommendations
🧰 CLI Modernization
- `imgshape --web` → directly launches Streamlit UI
- Extended with new actions: `--pipeline-export`, `--pipeline-apply`, `--snapshot-save`, `--snapshot-diff`
- Plugin controls: `--plugin-list`, `--plugin-add`, `--plugin-remove`
⚙️ Installation
pip install imgshape
Requires Python 3.8+
Core dependencies: `Pillow`, `numpy`, `matplotlib`, `scikit-image`, `streamlit`
Optional extras:
| Extra | Description |
|---|---|
| `imgshape[torch]` | PyTorch / torchvision support |
| `imgshape[pdf]` | PDF report generation via WeasyPrint |
| `imgshape[viz]` | Advanced plots with Seaborn & Plotly |
| `imgshape[ui]` | Streamlit UI + profile parsing |
| `imgshape[full]` | Full suite with all optional features |
💻 CLI Usage
# Shape detection
imgshape --path ./sample.jpg --shape
# Single image analysis
imgshape --path ./sample.jpg --analyze
# Preprocessing + augmentations
imgshape --path ./sample.jpg --recommend --augment
# Dataset compatibility check
imgshape --dir ./images --check mobilenet_v2
# Dataset visualization
imgshape --viz ./images
# Dataset report (md + html)
imgshape --path ./images --report --augment --report-format md,html --out report
# Torch integration (transform/DataLoader)
imgshape --path ./images --torchloader --augment --out transform_snippet.py
# Launch the Streamlit web UI
imgshape --web
🖥️ Streamlit Interface (v3)
Run the visual interface directly:
streamlit run app.py
Tabs Overview
| Tab | Function |
|---|---|
| 📐 Shape | Detects image dimensions & color channels |
| 🔍 Analyze | Dataset entropy, shapes, and channel distributions |
| 🧠 Recommend | Suggests preprocessing & augmentations |
| 🎨 Augment Visualizer | Interactive augmentation intensity slider |
| 📄 Reports | Generates Markdown & HTML dataset summaries |
| 🔗 Pipeline Export | Exports pipelines as code (PyTorch/YAML/JSON) |
🧠 Python API Example
from imgshape.shape import get_shape
from imgshape.analyze import analyze_type
from imgshape.recommender import recommend_preprocessing
from imgshape.pipeline import RecommendationPipeline
print(get_shape("sample.jpg"))
print(analyze_type("sample.jpg"))
print(recommend_preprocessing("sample.jpg"))
# Build a pipeline from a recommendation
rec = recommend_preprocessing("sample.jpg")
pipeline = RecommendationPipeline.from_recommender_output(rec)
print(pipeline.as_dict())
🧩 Plugins
Extend imgshape with your own plugins:
# src/imgshape/plugins/custom_brightness.py
from imgshape.plugins import RecommenderPlugin
class CustomBrightnessPlugin(RecommenderPlugin):
    NAME = "CustomBrightness"

    def recommend(self, analysis):
        return [{"name": "adjust_brightness", "spec": {"factor": 1.2}}]
Then register it via CLI:
imgshape --plugin-add ./src/imgshape/plugins/custom_brightness.py
📝 Reports (Markdown, HTML, PDF)
# Markdown & HTML reports
imgshape --report --path ./datasets/cats --report-format md,html
# Generate PDF (requires extras)
pip install "imgshape[pdf]"
imgshape --report --path ./datasets/dogs --report-format pdf
🧪 Testing
Run all tests locally:
pytest -q
Or install dev tools:
pip install "imgshape[dev]"
black --check src tests
flake8 src tests
🧱 Developer & Build Guide
# Clean build artifacts
rm -rf dist build *.egg-info
# Build
python -m build
# Check metadata
twine check dist/*
# Upload (TestPyPI)
twine upload --repository testpypi dist/*
# Install locally
pip install dist/imgshape-3.0.0-py3-none-any.whl
🔗 Resources
- Documentation: https://stifler7.github.io/imgshape
- GitHub Repository: https://github.com/STiFLeR7/imgshape
- Issues: https://github.com/STiFLeR7/imgshape/issues
- License: MIT
💫 Credits
Developed with ❤️ by Stifler
Researcher / Developer
Empowering AI at the Edge.
🧭 Roadmap (v3.1.x)
- ONNX / TensorRT export for edge inference
- Auto-EDA visualization (class imbalance, histograms)
- Enhanced Streamlit dashboard with live metrics
- HuggingFace Spaces demo & CI/CD workflow
---
### 🧩 Summary of Key Updates
- Updated version → `v3.0.0 (Aurora)`
- Removed Gradio references (Streamlit is now primary)
- Added new **Pipeline**, **Plugins**, and **Recommender Engine** details
- Expanded CLI + Streamlit examples
- Ready for **PyPI rendering** and **GitHub preview**
imgshape v2.2.0
🖼️ imgshape — Smart Image Analysis & Preprocessing Toolkit (v2.2.0)
imgshape is a Python toolkit for image shape detection, dataset inspection, preprocessing & augmentation recommendations, visualization, report generation, and PyTorch DataLoader helpers — making it a smarter dataset assistant for ML/DL workflows.
⚡️ Why use imgshape?
- 📐 Detect image shapes (H × W × C) for single files or whole datasets.
- 🔍 Compute entropy, edge density, dominant color, and guess image type.
- 🧠 Get preprocessing recommendations (resize, normalization, suitable model family).
- 🔄 Augmentation recommender: suggest flips, crops, color jitter, etc., based on dataset stats.
- 📊 Visualizations: size histograms, dimension scatter plots, channel distribution.
- ✅ Model compatibility checks: verify dataset readiness for models like `mobilenet_v2`, `resnet18`, etc.
- 📝 Dataset reports: export Markdown/HTML/PDF with stats, plots, preprocessing, and augmentation plans.
- 🔗 Torch integration: generate ready-to-use `torchvision.transforms` or even a `DataLoader`.
- 🌐 Interactive GUI modes:
  - Streamlit app (`app.py`) → modern multi-tab UI
  - Gradio app (`--web`) → quick prototyping
🚀 Installation
pip install imgshape
Requires Python 3.8+
Core deps: `Pillow`, `numpy`, `matplotlib`, `scikit-image`, `streamlit`
Optional extras:
- `imgshape[torch]` → PyTorch / torchvision support
- `imgshape[pdf]` → PDF report generation (weasyprint)
- `imgshape[viz]` → prettier plots (seaborn)
💻 CLI Usage
# Shape detection
imgshape --path ./sample.jpg --shape
# Single image analysis
imgshape --path ./sample.jpg --analyze
# Preprocessing + augmentations
imgshape --path ./sample.jpg --recommend --augment
# Dataset compatibility check
imgshape --dir ./images --check mobilenet_v2
# Dataset visualization
imgshape --viz ./images
# Dataset report (md + html)
imgshape --path ./images --report --augment --report-format md,html --out report
# Torch integration (transform/DataLoader)
imgshape --path ./images --torchloader --augment --out transform_snippet.py
# Launch Streamlit app
streamlit run app.py
# Launch Gradio GUI
imgshape --web
📦 Python API
from imgshape.shape import get_shape
from imgshape.analyze import analyze_type
from imgshape.recommender import recommend_preprocessing
from imgshape.augmentations import AugmentationRecommender
print(get_shape("sample.jpg"))
print(analyze_type("sample.jpg"))
print(recommend_preprocessing("sample.jpg"))
# Augmentation plan
ar = AugmentationRecommender(seed=42)
plan = ar.recommend_for_dataset({"entropy_mean": 6.2, "image_count": 100})
print(plan.recommended_order)
📝 New in v2.2.0
- 🌐 Streamlit App (`app.py`) with 5 interactive tabs:
  - Shape → instant image shape detection
  - Analyze → entropy, channels, and dataset visualization
  - Recommend → preprocessing + heuristic augmentation plan
  - Report → export dataset reports in Markdown/HTML
  - TorchLoader → export `torchvision.transforms` pipelines or snippets
- 🔗 TorchLoader:
  - Safe wrapper for Compose/snippet/no-op callable depending on availability.
  - Backward compatibility with old `(plan, preprocessing)` test calls.
- 🧠 AugmentationRecommender:
  - Deterministic heuristic plans with `.as_dict()` export.
  - Handles entropy, resolution, and imbalance.
- ✅ Compatibility Fixes:
  - `check_compatibility()` outputs structured results.
  - Deprecated alias `check_model_compatibility()` preserved.
- 📝 Report Generators:
  - Markdown + HTML outputs improved.
- ⚡️ Test Suite:
  - Fixed pytest failures in `compatibility`, `report`, and `torchloader`.
- 🎨 UI Polishing:
  - Defensive wrappers for `analyze_type`, `recommend_preprocessing`, TorchLoader.
  - Footer links to Instagram, GitHub, HuggingFace, Kaggle, Medium.
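To show what a "deterministic heuristic plan" with an `.as_dict()` export might look like, here is a small standalone sketch. The names `SketchPlan` and `sketch_recommend`, the thresholds, and the augmentation names are all hypothetical; only the `entropy_mean` and `image_count` keys and the `recommended_order` attribute mirror the API example above.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SketchPlan:
    """Hypothetical plan object holding an ordered list of augmentation names."""
    recommended_order: list = field(default_factory=list)

    def as_dict(self):
        return asdict(self)


def sketch_recommend(stats):
    """Deterministic heuristics: the same stats always produce the same plan."""
    order = ["horizontal_flip"]
    if stats.get("entropy_mean", 0.0) >= 6.0:
        order.append("color_jitter")  # assumed: rich textures tolerate colour changes
    if stats.get("image_count", 0) < 1000:
        order.append("random_crop")   # assumed: small datasets need more variety
    return SketchPlan(recommended_order=order)


plan = sketch_recommend({"entropy_mean": 6.2, "image_count": 100})
print(plan.recommended_order)
print(plan.as_dict())
```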
📎 Resources
- Source Code
- Issues
- License: MIT
imgshape v2.0.1
🖼️ imgshape — Smart Image Analysis & Preprocessing Toolkit (v2.0.1)
imgshape is a lightweight Python toolkit designed for image shape detection, dataset inspection, preprocessing recommendation, and AI model compatibility checks — all optimized for ML/DL workflows, both in research and production.
⚡️ Why use imgshape?
- 🔍 Automatically detect shape, dominant color, entropy, and type of an image.
- 🧠 Recommend preprocessing steps like resize dims, normalization, and suitable model types.
- 🖬 Analyze entire datasets to get size/shape distribution and dimension scatter plots.
- ✅ Check model compatibility (e.g. with `mobilenet_v2`, `resnet18`, etc.).
- 🌐 Supports CLI, Python API, and even a Gradio-based GUI for visual workflows.
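A model compatibility check like the one listed above boils down to comparing each image's shape against what the model expects. The sketch below assumes shapes are `(height, width, channels)` tuples. The `MODEL_SPECS` table, its values, and the `check_shapes` function are hypothetical placeholders, not imgshape's actual rules.

```python
# Hypothetical requirements table; real models and libraries may differ.
MODEL_SPECS = {"mobilenet_v2": {"min_side": 32, "channels": 3}}


def check_shapes(shapes, model):
    """shapes: iterable of (height, width, channels). Returns a structured result."""
    spec = MODEL_SPECS[model]
    problems = []
    for index, (h, w, c) in enumerate(shapes):
        if c != spec["channels"]:
            problems.append((index, f"expected {spec['channels']} channels, got {c}"))
        elif min(h, w) < spec["min_side"]:
            problems.append((index, f"shorter side {min(h, w)} < {spec['min_side']}"))
    return {"model": model, "compatible": not problems, "problems": problems}


result = check_shapes([(224, 224, 3), (64, 64, 1), (16, 300, 3)], "mobilenet_v2")
print(result)
```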
🚀 Installation
pip install imgshape
Requires Python 3.8+ and packages: Pillow, matplotlib, seaborn, numpy, scikit-image, gradio
💻 CLI Usage
imgshape --path ./sample.jpg # Get image shape
imgshape --path ./sample.jpg --analyze # Analyze image type and entropy
imgshape --path ./sample.jpg --recommend # Recommend preprocessing steps
imgshape --dir ./images --check mobilenet_v2 # Check dataset compatibility with a model
imgshape --batch --path ./folder # Batch mode shape detection
imgshape --viz ./images # Visualize size/shape distribution
imgshape --web # Launch Gradio GUI




