VQ-BeT: Behavior Generation with Latent Actions

Seungjae Lee¹, Yibin Wang², Haritheja Etukuru², H. Jin Kim¹, Nur Muhammad Mahi Shafiullah²,*, Lerrel Pinto²,*
¹Seoul National University, ²New York University
* equal advising

VQ-BeT on Real-World Kitchen Tasks

Toaster Closing (X5)

Fridge Closing (X5)

Toaster and Fridge Closing (X5)

Pick and Place Can Toaster Closing (X5)

Toaster Opening (X5)

Pick and Place Fridge Closing (X5)

Pick and Place Can Fridge (X5)

Pick and Place Can Toaster (X5)

VQ-BeT on Long-Horizon Real-World Kitchen Tasks

[Pick up Bread → Place in the Bag → Pick up Bag → Place on the Table] (X8)
[Open Drawer → Pick and Place Box → Close Drawer] (X8)
[Can to Fridge → Fridge Closing → Toaster Opening] (X5)
[Can to Toaster → Toaster Closing → Fridge Closing] (X5)