Virtual Try-On Using Images: An Ideal Application of Generative AI and Pattern Recognition


Arun Subramanian

Machine Learning Engineer

Feb 9, 2024

Virtual Try-On


Image-based virtual try-on aims to warp an in-store garment image onto an image of a fashion model. The model may have individual characteristics, including physique, stature, weight, and pose, while the garment image is typically a frontal photograph of the cloth. This matters to the end consumer, who can virtually try on a garment before buying it online, make an initial evaluation online before visiting the physical store, or even try it on virtually in the store without needing a changing room. The term “image-based” is significant because traditional methods required more cumbersome 3D meshes to achieve the same result, whereas current machine learning methods do not, which makes this a strong application of Generative AI and pattern recognition.


Goals of a Virtual Try-on solution:

The synthesized image, after virtual try-on, is expected to be perceptually convincing and to meet the following desiderata, as postulated in the research paper VITON: An Image-Based Virtual Try-On Network:

  • (1) The body parts and pose of the person are the same as in the original image.
  • (2) The clothing item in the product image deforms naturally, conditioned on the pose and body shape of the person.
  • (3) Detailed visual patterns of the desired product are clearly visible, which include not only low-level features like color and texture but also complicated graphics like embroidery, logo, etc.

Challenges of a Virtual Try-on solution

  • The first challenge is how to warp the cloth so that its texture and any text on it are preserved.
  • The second challenge is how to generate human body parts such as arms or legs when they are occluded in the target image (for instance, the person in the target image is wearing a full-sleeve t-shirt) and the source garment is shorter.

Traditional Virtual Try-on Approaches

How 2D virtual try-on methods typically solve these challenges:

  • 2D virtual try-on methods formulate the problem as generating a warped version of the cloth, using information from the target image such as pose and body segmentation. The warp can be computed with a Thin-Plate Spline (TPS) transformation or with flow vectors that indicate where each pixel of the cloth moves. The flow is sometimes learned implicitly by a GAN, but explicit flow vectors and TPS are usually preferred. However, this approach also falls short, because it determines only a global flow field, whereas different parts of the body usually require different garment deformations. These deformation flows are usually learned from data, that is, with machine learning.
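To make the idea of per-pixel flow vectors concrete, here is a minimal NumPy sketch of applying a dense flow field to a cloth image. It assumes backward warping (each output pixel looks up where to sample from in the source) with bilinear interpolation, which is a common implementation choice rather than something prescribed by any particular try-on method. The function name `warp_with_flow` and the toy arrays are hypothetical, and in a real system the flow would be predicted by a trained network instead of being written by hand.

```python
import numpy as np

def warp_with_flow(image, flow):
    """Backward-warp `image` (H, W, C) with a dense flow field (H, W, 2).

    flow[y, x] = (dx, dy) means output pixel (x, y) is sampled from the
    input location (x + dx, y + dy). Sampling is bilinear, and source
    coordinates are clamped to the image border.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each (possibly fractional) source location.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets, with a trailing axis so they broadcast over channels.
    wx = (src_x - x0)[..., None]
    wy = (src_y - y0)[..., None]

    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

# Toy example: a 4x5 single-channel "cloth" shifted one pixel to the left.
cloth = np.arange(20, dtype=np.float64).reshape(4, 5, 1)
flow = np.zeros((4, 5, 2))
flow[..., 0] = 1.0  # every output pixel samples from its right-hand neighbour
warped = warp_with_flow(cloth, flow)
```

Because the flow is stored per pixel, nothing forces it to be the same everywhere: sleeves and torso can receive different displacements, which is the kind of local deformation a single global transformation cannot express.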

Machine learning approaches: problem statement, dataset, and key inputs

  • The standard datasets used to train this mapping typically consist of images of garments paired with images of individuals wearing them in specific poses or body shapes. The pose keypoints (from OpenPose, DensePose, etc.) and the body segmentation are inferred from the image of the person, while the cloth segmentation and mask are inferred from the image of the cloth. Together they serve as the priors, or inputs, from which the required warp is learned.

An example of inputs "inferred" from cloth and person image is demonstrated above. The pose map is inferred from the person's image.
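As a rough illustration of how such inferred inputs might be packed for a model, the sketch below renders pose keypoints as Gaussian heatmaps and stacks them with a body segmentation and a cloth mask. This is only one plausible encoding. The function names, the keypoint coordinates, the `sigma` value, and the rectangular masks are all hypothetical; in practice the keypoints would come from a pose estimator and the masks from segmentation models.

```python
import numpy as np

def keypoints_to_pose_map(keypoints, height, width, sigma=3.0):
    """Render (x, y) keypoints as a (K, H, W) stack of Gaussian heatmaps."""
    ys, xs = np.mgrid[0:height, 0:width]
    maps = np.zeros((len(keypoints), height, width), dtype=np.float32)
    for k, (x, y) in enumerate(keypoints):
        # Each channel peaks at 1.0 on its keypoint and decays with distance.
        maps[k] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return maps

def build_prior(pose_map, body_seg, cloth_mask):
    """Stack the priors channel-wise into one (K + 2, H, W) array."""
    stacked = np.concatenate([pose_map, body_seg[None], cloth_mask[None]], axis=0)
    return stacked.astype(np.float32)

# Hypothetical inputs for a 32x24 image: three keypoints and two binary masks.
H, W = 32, 24
keypoints = [(12, 10), (6, 14), (18, 14)]  # (x, y) pixel coordinates
pose_map = keypoints_to_pose_map(keypoints, H, W)

body_seg = np.zeros((H, W), dtype=np.float32)
body_seg[8:30, 4:20] = 1.0    # placeholder body region
cloth_mask = np.zeros((H, W), dtype=np.float32)
cloth_mask[10:24, 6:18] = 1.0  # placeholder garment region

prior = build_prior(pose_map, body_seg, cloth_mask)
```

The resulting `prior` array has one channel per keypoint plus one channel per mask, which is a convenient shape to feed to a convolutional network.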

The machine learning models used here typically first extract deep learned features from the input cloth image and the input target fashion-model image, either through convolutional networks or pyramid cascades, and then learn the warping flow field using a U-Net image-to-image translation framework. Once the flow field is learned, it is applied to the cloth image, and the warped cloth is overlaid on the target fashion model to complete the synthesis of the virtual try-on.
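The final overlay step can be sketched as a simple mask-based blend. This is a deliberate simplification: the helper `composite_try_on` and the toy arrays are hypothetical, and real systems often pass the warped cloth through a further generator network rather than pasting it directly. The sketch only shows how a warped cloth mask decides, pixel by pixel, whether the output comes from the garment or from the original person image.

```python
import numpy as np

def composite_try_on(person, warped_cloth, warped_mask):
    """Blend a warped cloth onto a person image.

    person, warped_cloth: float arrays of shape (H, W, 3)
    warped_mask: float array of shape (H, W), 1 where the cloth covers the person
    """
    mask = np.clip(warped_mask, 0.0, 1.0)[..., None]  # broadcast over RGB
    return mask * warped_cloth + (1.0 - mask) * person

# Toy 2x2 example: black "person", white "cloth", and a partial mask.
person = np.zeros((2, 2, 3))
warped_cloth = np.ones((2, 2, 3))
warped_mask = np.array([[1.0, 0.0],
                        [0.5, 0.0]])
result = composite_try_on(person, warped_cloth, warped_mask)
```

Soft mask values between 0 and 1, like the 0.5 entry here, give a blended edge instead of a hard cut-out around the garment.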


Image-based virtual try-on is an ideal use case for exploiting advances in pattern recognition, machine learning, and Generative AI. As the inductive biases of deep learning models improve and novel architectures and approaches emerge, the ability to learn from a huge corpus of images of garments and of fashion models wearing them promises novel, real-time virtual try-on outputs. For the end user, the application can take the form of a browser application, a mobile application, or even a virtual fitting room in the store, providing a seamless shopping experience for the customer and better sales conversion for the merchant.
