WordRobe: Text-Guided Generation of Textured 3D Garments

Anonymous Authors

Method Overview

We propose 'WordRobe', a method to generate different types of 3D garments with openings (armholes, necklines, etc.) and diverse textures via user-friendly text prompts. To achieve this, we incorporate three novel components in WordRobe: a 3D garment latent space (Ω) that encodes unposed 3D garments as latent codes; a Mapping Network (MLPmap) that predicts a garment latent code from an input text prompt; and text-guided texture synthesis that generates high-quality, diverse texture maps for the 3D garments. The figure below provides an overview of the proposed method. At inference time, given an input text prompt, we first obtain its CLIP embedding ψ, which is passed to MLPmap to obtain the latent code ϕ ∈ Ω. We then perform two-step latent decoding of ϕ to generate the 3D garment as a UDF and extract a UV-parametrized mesh representation of it. Finally, we perform text-guided texture synthesis in a single feed-forward step by leveraging ControlNet to obtain the textured 3D garment mesh.
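The inference pipeline above can be sketched in code. This is a minimal illustration only: the module names, layer sizes, and the toy stand-ins for the two decoding stages are assumptions, not the authors' actual architecture.

```python
# Hypothetical sketch of WordRobe's inference flow:
# text prompt -> CLIP embedding psi -> MLPmap -> latent code phi -> UDF.
import torch
import torch.nn as nn

CLIP_DIM, LATENT_DIM = 512, 128  # assumed embedding/latent sizes


class MappingNetwork(nn.Module):
    """MLPmap: maps a CLIP text embedding psi to a garment latent code phi."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(CLIP_DIM, 256), nn.ReLU(),
            nn.Linear(256, LATENT_DIM),
        )

    def forward(self, psi):
        return self.net(psi)


def generate_garment(psi, mlp_map, decode_coarse, decode_fine):
    phi = mlp_map(psi)           # latent code phi in the garment space Omega
    coarse = decode_coarse(phi)  # first stage of the two-step decoding
    udf = decode_fine(coarse)    # refined unsigned distance field values
    return udf                   # mesh extraction + texturing would follow


# toy stand-ins for the two decoder stages, for illustration only
decode_coarse = nn.Linear(LATENT_DIM, 64)
decode_fine = nn.Linear(64, 32)
psi = torch.randn(1, CLIP_DIM)   # stand-in for a CLIP text embedding
udf = generate_garment(psi, MappingNetwork(), decode_coarse, decode_fine)
```

In the actual method, the decoder outputs UDF values at queried 3D points and the mesh is then extracted from the zero-level set; here the decoders are plain linear layers just to make the data flow concrete.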


Text-Driven 3D Garment Generation & Editing

Composition of 3D Garments

Text-Driven Latent Editing

Sketch Guided Generation

Sketch + Text

3D Garment Extraction from Images

3D Garment Latent Space

WordRobe generates high-quality unposed 3D garment meshes with photorealistic textures from user-friendly text prompts. We achieve this by first learning a latent space of 3D garments using a novel two-stage encoder-decoder framework in a coarse-to-fine manner, representing the 3D garments as unsigned distance fields (UDFs). We also introduce an additional loss function to further disentangle the latent space, promoting better interpolation.
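Because the latent space is encouraged to be disentangled, garments can be blended by simple linear interpolation between latent codes, as in the interpolation demo below. A minimal sketch (the code dimensions and endpoint codes are placeholders):

```python
# Linear interpolation between two garment latent codes in the
# learned space Omega; a disentangled space makes the intermediate
# codes decode to plausible garments.
import numpy as np

LATENT_DIM = 128  # assumed latent size


def interpolate_codes(phi_a, phi_b, steps=5):
    """Return `steps` codes evenly spaced between phi_a and phi_b."""
    ts = np.linspace(0.0, 1.0, steps)
    return [(1.0 - t) * phi_a + t * phi_b for t in ts]


# placeholder endpoints standing in for encoded garments
phi_skirt = np.zeros(LATENT_DIM)     # e.g. "skirt with wavy hemline"
phi_top = np.ones(LATENT_DIM)        # e.g. "crop-top with droopy sleeves"
path = interpolate_codes(phi_skirt, phi_top, steps=5)
```

Each intermediate code would be passed through the two-stage decoder to obtain an in-between 3D garment.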


3D Garment Interpolation

Interpolation start: "skirt with wavy hemline"

Interpolation end: "crop-top with droopy sleeves"


Mapping CLIP to Garment Latent Space

Once the garment latent space is learned, we train a mapping network to predict garment latent codes from CLIP embeddings. This allows CLIP-guided exploration of the latent space, enabling text-driven 3D garment generation and editing. To train this mapping network, we develop a novel weakly-supervised training scheme that eliminates the need for manual text annotations.
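One training step for such a mapping network can be sketched as follows. This sketch assumes the network is supervised by regressing the garment encoder's latent codes from CLIP embeddings; using CLIP image embeddings of garment renders in place of text embeddings (via CLIP's shared image-text space) is one way to avoid manual captions, and is an assumption here rather than the paper's stated scheme.

```python
# Hypothetical weakly-supervised training step for the mapping network:
# regress the garment encoder's latent code from a CLIP embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

CLIP_DIM, LATENT_DIM = 512, 128  # assumed sizes

mlp_map = nn.Sequential(
    nn.Linear(CLIP_DIM, 256), nn.ReLU(),
    nn.Linear(256, LATENT_DIM),
)
opt = torch.optim.Adam(mlp_map.parameters(), lr=1e-4)


def train_step(clip_emb, target_latent):
    """One optimization step: predict the latent code and minimize MSE
    against the code produced by the pre-trained garment encoder."""
    opt.zero_grad()
    pred = mlp_map(clip_emb)
    loss = F.mse_loss(pred, target_latent)
    loss.backward()
    opt.step()
    return loss.item()


clip_emb = torch.randn(8, CLIP_DIM)     # e.g. embeddings of garment renders
target = torch.randn(8, LATENT_DIM)     # latents from the garment encoder
loss = train_step(clip_emb, target)
```

At inference, text prompts replace the render embeddings, which is what makes the CLIP joint space useful for weak supervision.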


Results

Comparison with Text2Tex

Comparison with Text-to-3D Methods

Simulations

Hassle-free simulation of generated 3D textured garments in Blender.