UniFolding: Towards Sample-efficient, Scalable, and Generalizable Robotic Garment Folding
Han Xue*, Yutong Li*, Wenqiang Xu, Huanyu Li, Dongzhe Zheng, Cewu Lu
Shanghai Jiao Tong University
CoRL 2023
Abstract
This paper explores the development of UniFolding, a sample-efficient, scalable, and generalizable robotic system for unfolding and folding various garments. UniFolding employs the proposed UFONet neural network to integrate unfolding and folding decisions into a single policy model that is adaptable to different garment types and states. The design of UniFolding is based on a garment’s partial point cloud, which aids in generalization and reduces sensitivity to variations in texture and shape. The training pipeline prioritizes low-cost, sample-efficient data collection. Training data is collected via a human-centric process with offline and online stages. The offline stage involves human unfolding and folding actions via Virtual Reality, while the online stage utilizes human-in-the-loop learning to fine-tune the model in a real-world setting. The system is tested on two garment types: long-sleeve and short-sleeve shirts. Performance is evaluated on 20 shirts with significant variations in textures, shapes, and materials.
Challenges
Method & Advantages
Experiment Setup
Action Primitives
The pipeline of the UniFolding system. It contains two stages to fully fold a garment from a random initial state: Unfolding and Folding.
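As a rough, hedged illustration of this two-stage pipeline, the sketch below runs an unfolding loop until the garment is flattened and then switches to folding. The names (`fold_garment`, `observe_point_cloud`, `execute_primitive`, `garment_is_flattened`) are hypothetical placeholders, not the released implementation.

```python
# A minimal sketch of the two-stage UniFolding decision loop.
# All object interfaces here are illustrative assumptions.
def fold_garment(policy, robot, camera, max_steps=10):
    """Run Unfolding until the garment is flattened, then Folding."""
    stage = "unfolding"
    for _ in range(max_steps):
        # Capture the masked partial point cloud of the garment.
        points = camera.observe_point_cloud()                 # (N, 3) array
        action_type, pick, place = policy.predict(points, stage)
        robot.execute_primitive(action_type, pick, place)
        if stage == "unfolding" and robot.garment_is_flattened():
            stage = "folding"                                  # switch stage
        elif stage == "folding" and action_type == "done":
            break
```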
Method Details
Left: UFONet takes a masked point cloud of the observed garment state as the input, predicts the primitive action to be performed, and regresses the picking (placing) points.
Right: The training pipeline of our method. It consists of offline data collection in simulation with human demonstration and online data collection in the real world. In the offline data collection phase, we collect human demonstration data for unfolding and folding tasks through a Virtual Reality interface in a fast and low-cost manner. By leveraging human priors from the demonstrations, we can simplify the dense action space into a ranking problem with a sparse set of keypoint candidates. This substantially reduces exploration time in both simulation and the real world. After obtaining an initial policy from offline supervised learning, we perform self-supervised learning in simulation for unfolding tasks. In the online data collection phase, we adopt a human-in-the-loop learning approach to fine-tune the policy in the real world.
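To make the interface described above concrete, the sketch below models UFONet as a point-cloud encoder followed by an action-type classification head and pick/place point regression heads. The PointNet-style encoder, layer sizes, and number of primitives are assumptions for illustration, not the paper's exact architecture.

```python
# A hedged sketch of the UFONet interface; architecture details are
# illustrative assumptions, not the released model.
import torch
import torch.nn as nn

class UFONetSketch(nn.Module):
    def __init__(self, num_primitives=4):
        super().__init__()
        # Shared per-point encoder over the masked partial point cloud.
        self.encoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
        )
        # Classify which action primitive to execute (e.g. fling, fold).
        self.action_head = nn.Linear(256, num_primitives)
        # Regress picking and placing points (two 3D points each).
        self.pick_head = nn.Linear(256, 6)
        self.place_head = nn.Linear(256, 6)

    def forward(self, points):
        # points: (B, N, 3) masked garment point cloud.
        feat = self.encoder(points).max(dim=1).values   # (B, 256) global feature
        action_logits = self.action_head(feat)          # (B, num_primitives)
        pick = self.pick_head(feat).view(-1, 2, 3)      # two pick points
        place = self.place_head(feat).view(-1, 2, 3)    # two place points
        return action_logits, pick, place

# Usage (random data, for shape checking only):
# logits, pick, place = UFONetSketch()(torch.rand(1, 2048, 3))
```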
Demo
Garment Instances
We tested on two garment types, long-sleeve shirts and short-sleeve T-shirts, using 60 real-world garment instances with varying shapes, sizes, textures, and materials. Sizes ranged from 38cm×60cm to 80cm×167cm, aspect ratios varied between 0.2695:1 and 1.1167:1, and materials included cotton, polyester, spandex, nylon, viscose, and wool, among others. The garments were divided into train/test sets at a 2:1 ratio. All garments in the testing set remain unseen in all experiments.
Unseen Testing Set (Long-sleeve Shirts)
Unseen Testing Set (Short-sleeve T-Shirts)
Training Set (Long-sleeve Shirts)
Training Set (Short-sleeve T-Shirts)
Experiments for Each Unseen Garment Instance
Long-sleeve Shirts
Short-sleeve T-Shirts
Comparison with Baseline
Each set of GIFs below compares the ClothFunnels baseline (left) with our method (right), with experiments conducted on the same garments. The baseline works relatively well on garments with solid, light colors, but it struggles with complex textures, unusual shapes, and materials. Our method exhibits strong generalization across a wide variety of garments.
Annotation Examples for Fine-tuning
Short-sleeve T-shirts (training set)
Annotation of Best Action Type and Pick Points
Fling
Human Preference on Model Prediction
P1 (left) > P2 (right)
Human Preference on Model Prediction
P1 (left) > P2 (right)
Is human annotation (left) better than all pairs of keypoint candidate predictions (right)?
No
Annotation of Best Action Type and Pick Points
Fling
Human Preference on Model Prediction
P1 (left) < P2 (right)
Human Preference on Model Prediction
P1 (left) > P2 (right)
Is human annotation (left) better than all pairs of keypoint candidate predictions (right)?
Yes
Long-sleeve Shirts (training set)
Annotation of Best Action Type and Pick Points
Fling
Human Preference on Model Prediction
P1 (left) < P2 (right)
Human Preference on Model Prediction
P1 (left) < P2 (right)
Is human annotation (left) better than all pairs of keypoint candidate predictions (right)?
No
Annotation of Best Action Type and Pick Points
Fling
Human Preference on Model Prediction
P1 (left) < P2 (right)
Human Preference on Model Prediction
P1 (left) < P2 (right)
Is human annotation (left) better than all pairs of keypoint candidate predictions (right)?
Yes
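The annotation examples above yield pairwise preference labels (e.g. P1 preferred over P2) on the model's keypoint candidate predictions. One common way to turn such labels into a fine-tuning signal is a pairwise ranking loss over candidate scores; the sketch below uses PyTorch's margin ranking loss. The scoring model and margin value are illustrative assumptions, not necessarily the paper's exact objective.

```python
# A hedged sketch: learning from human pairwise preferences
# (pick_pair_1 judged better than pick_pair_2) via margin ranking.
import torch
import torch.nn as nn

ranking_loss = nn.MarginRankingLoss(margin=0.1)

def preference_loss(score_model, obs, pick_pair_1, pick_pair_2):
    """obs: garment point cloud; pick_pair_*: candidate pick-point pairs."""
    s1 = score_model(obs, pick_pair_1)   # scalar score for the preferred pair
    s2 = score_model(obs, pick_pair_2)   # scalar score for the rejected pair
    # target = +1 means s1 should rank above s2 by at least the margin.
    target = torch.ones_like(s1)
    return ranking_loss(s1, s2, target)
```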