We propose 'WordRobe', a method to generate different types of 3D garments with
openings (armholes, necklines, etc.) and diverse textures via user-friendly text
prompts. To achieve this, we incorporate three novel components in WordRobe:
a 3D garment latent space (Ω), which encodes unposed 3D garments as latent
codes; a Mapping Network (MLPmap), which predicts the garment latent
code from an input text prompt; and text-guided texture synthesis,
which generates high-quality, diverse texture maps for the 3D garments.
We provide an overview of the proposed method in the figure below. At inference time,
given an input text prompt, we first obtain its CLIP embedding ψ, which is
subsequently passed to MLPmap to obtain the latent code ϕ ∈ Ω. We further
perform two-step latent decoding of ϕ to generate the 3D garment as an unsigned
distance field (UDF), and extract its UV-parametrized mesh representation. Finally, we
perform text-guided texture synthesis in a single feed-forward step by leveraging
ControlNet to obtain the textured 3D garment mesh.
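
The data flow of the inference stages described above (text prompt, embedding ψ, latent code ϕ, UDF) can be illustrated with a minimal sketch. This is not the WordRobe implementation. The embedding function is a deterministic stand-in for CLIP, the MLP weights are random rather than trained, the decoder is a toy that returns the UDF of a sphere, and the dimensions `CLIP_DIM`, `HIDDEN_DIM`, and `LATENT_DIM` are assumed values that the text does not state. All function names are hypothetical. Mesh extraction and ControlNet-based texturing are omitted.

```python
import math
import random

CLIP_DIM = 512    # assumed embedding size; not stated in the text
HIDDEN_DIM = 64   # assumed hidden width of the mapping MLP
LATENT_DIM = 32   # assumed size of a latent code in Omega


def embed_text_stub(prompt, dim=CLIP_DIM):
    """Stand-in for a CLIP text encoder: a deterministic unit-norm vector."""
    rng = random.Random(prompt)  # string seeds are hashed deterministically
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def init_weights(rng, out_dim, in_dim):
    """Random weight matrix with one row per output unit (untrained)."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def linear(x, weights):
    """Bias-free linear layer: one dot product per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mlp_map(psi, w1, w2):
    """Two-layer MLP with ReLU, mapping embedding psi to latent code phi."""
    hidden = [max(0.0, h) for h in linear(psi, w1)]
    return linear(hidden, w2)


def decode_to_udf_stub(phi):
    """Toy decoder: returns the UDF of a sphere whose radius depends on phi[0].

    The two-step decoding in the text is collapsed into this single stub.
    """
    radius = 0.3 + 0.2 / (1.0 + math.exp(-phi[0]))  # always in (0.3, 0.5)

    def udf(point):
        # Unsigned distance from the point to the sphere surface.
        return abs(math.sqrt(sum(c * c for c in point)) - radius)

    return udf


def run_pipeline(prompt, seed=0):
    """Prompt -> psi -> phi -> UDF; meshing and texturing are not modelled."""
    rng = random.Random(seed)
    w1 = init_weights(rng, HIDDEN_DIM, CLIP_DIM)
    w2 = init_weights(rng, LATENT_DIM, HIDDEN_DIM)
    psi = embed_text_stub(prompt)
    phi = mlp_map(psi, w1, w2)
    return psi, phi, decode_to_udf_stub(phi)


if __name__ == "__main__":
    psi, phi, udf = run_pipeline("a sleeveless floral dress")
    print(len(psi), len(phi), udf((0.0, 0.0, 0.0)))
```

Because a UDF is unsigned, it can represent open surfaces such as garments with armholes and necklines, which a signed distance field cannot do directly. In this sketch the surface is the zero level set of `udf`, and a mesh extraction step would operate on that level set.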