MExECON is a multi-view pipeline for high-fidelity 3D reconstruction of clothed human avatars from sparse RGB images. By extending the "divide and conquer" framework of ECON, MExECON leverages multiple viewpoints to resolve depth ambiguities and capture fine-grained details (e.g., hoodies, hairstyles, backpacks) that are typically lost in single-view approaches.
Key Advantage: All multi-view gains are achieved without requiring network re-training.
• Joint Multi-view Body Optimization (JMBO): JMBO estimates a single SMPL-X body model from multiple RGB views and their camera parameters, ensuring consistency across viewpoints and avoiding artifacts from occluded areas. It starts by predicting per-view SMPL-X fits with PIXIE, averaging shape and pose parameters to initialize the body. This model is then jointly optimized across all views using a multi-view loss that combines silhouette, normal, landmark, and head alignment terms, weighted to balance their contributions.
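The initialization step above can be sketched as follows. `init_smplx_from_views` and its inputs are hypothetical names; the naive element-wise mean over axis-angle pose parameters is an assumption for illustration (the paper's exact averaging scheme may differ):

```python
import numpy as np

def init_smplx_from_views(per_view_betas, per_view_pose):
    """Initialize a single SMPL-X body by averaging per-view PIXIE fits.

    per_view_betas: (V, 10) shape coefficients, one row per view.
    per_view_pose:  (V, P) pose parameters (axis-angle), one row per view.
    Returns the averaged (betas, pose) used to seed the joint optimization.
    """
    betas = per_view_betas.mean(axis=0)  # shape is view-independent, so a plain mean is safe
    pose = per_view_pose.mean(axis=0)    # naive mean; assumes per-view pose estimates are close
    return betas, pose

# Toy example: 3 views with slightly perturbed estimates of the same body
rng = np.random.default_rng(0)
betas = rng.normal(0.0, 0.01, (3, 10))
pose = rng.normal(0.0, 0.01, (3, 63))
b, p = init_smplx_from_views(betas, pose)
print(b.shape, p.shape)  # (10,) (63,)
```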
• Head Pitch Regularization: A stabilizing loss term that prevents "broken neck" artifacts during multi-view landmark fitting.
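The paper's exact regularizer is not reproduced here; a minimal sketch of the idea, assuming the head joint is parameterized in axis-angle and that the x-component approximates pitch for small rotations (the tolerance value is illustrative):

```python
import numpy as np

def head_pitch_loss(head_pose_aa, max_pitch_rad=0.3):
    """Hypothetical head-pitch regularizer (sketch, not the paper's term).

    head_pose_aa: (3,) axis-angle rotation of the head joint.
    Penalizes pitch (rotation about the x-axis) beyond a tolerance,
    discouraging the 'broken neck' artifact during landmark fitting.
    """
    pitch = head_pose_aa[0]  # x component ~ pitch for small rotations
    excess = np.maximum(np.abs(pitch) - max_pitch_rad, 0.0)
    return float(excess ** 2)

print(head_pitch_loss(np.array([0.1, 0.0, 0.0])))  # within tolerance -> 0.0
print(head_pitch_loss(np.array([0.8, 0.0, 0.0])))  # excess pitch is penalized
```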
• True Back-View Integration: Unlike the baseline which infers the back from the front, MExECON uses actual back-view image data to reconstruct high-frequency clothing folds and accessories.
• SOTA Performance: Achieves a 42% reduction in Chamfer distance compared to baseline method ECON and remains competitive with large-scale models like VGGT.
The pipeline consists of two primary phases:

1. Prior Body Estimation (JMBO):
   ◦ Initializes by averaging PIXIE parameters from all views.
   ◦ Optimizes shape and pose using a joint cost function:

   $L = L_{Silhouette} + \lambda_n L_{Normal} + \lambda_l L_{Landmark} + \lambda_h L_{Head}$

2. Surface Reconstruction:
   ◦ Predicts detailed 2D normal maps for front and back views using the optimized SMPL-X prior.
   ◦ Uses d-BiNI (depth-aware bilateral normal integration) to lift the 2D maps into 3D partial surfaces.
   ◦ Stitches the partial surfaces using IF-Nets+ for shape completion and Screened Poisson Reconstruction to obtain a watertight mesh.
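The joint cost in phase 1 can be sketched as a weighted sum over views. The individual term implementations below are simplified placeholders (plain L1/L2 distances), and the λ weights are illustrative defaults, not the paper's values:

```python
import numpy as np

def joint_multiview_loss(views, lam_n=1.0, lam_l=0.5, lam_h=0.1):
    """Sketch of the JMBO cost L = L_sil + λn·L_normal + λl·L_landmark + λh·L_head.

    `views` is a list of dicts holding rendered vs. observed silhouettes,
    normal maps, and 2D landmarks for one camera each. All term
    definitions here are simplified stand-ins for the paper's losses.
    """
    total = 0.0
    for v in views:
        l_sil = np.abs(v["sil_render"] - v["sil_obs"]).mean()        # silhouette overlap (L1)
        l_nrm = np.abs(v["normal_render"] - v["normal_obs"]).mean()  # normal-map difference
        l_lmk = np.linalg.norm(v["lmk_render"] - v["lmk_obs"], axis=-1).mean()  # 2D landmark distance
        l_head = v.get("head_pitch_penalty", 0.0)                    # head alignment / pitch term
        total += l_sil + lam_n * l_nrm + lam_l * l_lmk + lam_h * l_head
    return total / len(views)
```

Averaging over views (rather than summing) keeps the loss scale independent of the number of cameras, which simplifies weight tuning when moving between 2-view and 8-view setups.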
MExECON was validated on 20 avatars from the THuman 2.1 dataset using an 8-view setup.
| Method | Chamfer (mm) ↓ | P2S Acc. (mm) ↓ | Normal (deg.) ↓ |
|---|---|---|---|
| ECON (Baseline) | 40.56 | 39.52 | 45.79 |
| MExECON (2-view) | 32.76 | 29.72 | 44.03 |
| MExECON (8-view) | 23.12 | 21.26 | 39.54 |
| VGGT (8-view) | 23.87 | 22.21 | 41.38 |
Data source: Table 2 of the MExECON paper.
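The Chamfer metric in the table is the symmetric mean nearest-neighbour distance between reconstructed and ground-truth surface samples; a brute-force sketch (fine for small point clouds, not how the benchmark is run at scale):

```python
import numpy as np

def chamfer_distance(pts_a, pts_b):
    """Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3).

    For each point, find its nearest neighbour in the other set, then
    average the two directed means. Brute-force O(N*M) pairwise distances.
    """
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)  # (N, M)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(chamfer_distance(a, b))  # 0.0 for identical point sets
```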
• Requires calibrated cameras.
• Assumes a static subject across all input views.
• Partial normal integration currently covers only the front and back views; side-view details are handled via surface completion.
- See the installation doc for Windows to install all required packages and set up the models on Windows.
- See the installation doc for Ubuntu to install all required packages and set up the models on Ubuntu.
```bash
# For single-person image-based reconstruction (~2 min)
python -m apps.infer -cfg ./configs/econ.yaml -in_dir ./example/0021 -out_dir ./results/0021/ -novis -front_view 6 -back_view 2
```

If you find this work useful for your research, please cite our paper:
Uğur, F. E.; Redondo, R.; Barreiro, A.; Hristov, S. and Marí, R. (2026). MExECON: Multi-View Extended Explicit Clothed Humans Optimized via Normal Integration. In Proceedings of the 21st International Conference on Computer Vision Theory and Applications - Volume 3, ISBN 978-989-758-804-4, ISSN 2184-4321, pages 520-528.
Fulden Ece Uğur - ugur.fuldenece@gmail.com

