Description: CAMS: CAnonicalized Manipulation Spaces for Category-Level Functional Hand-Object Manipulation Synthesis
In this work, we focus on a novel task: category-level functional hand-object manipulation synthesis covering both rigid and articulated object categories. Given an object geometry, an initial human hand pose, and a sparse control sequence of object poses, our goal is to generate a physically reasonable hand-object manipulation sequence that resembles how a human would perform the task. To address such a challenge, we first design CAnonicalized Manipulation Spaces (CAMS), a two-level space hierarchy that canonicalize
Our framework mainly consists of a CVAE-based planner module and an optimization-based synthesizer module. Given the generation condition as input, the planner first generates a per-stage CAMS representation containing contact reference frames and sequences of finger embeddings. The synthesizer then optimizes the whole manipulation animation based on the CAMS embedding.
CAnonicalized Manipulation Spaces apply a two-level canonicalization to the manipulation representation. At the root level, the canonicalized contact targets (top right) describe the discrete contact information. At the leaf level, the canonicalized finger embedding (bottom right) transforms finger motion from global space into local reference frames defined on the contact targets.
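The leaf-level idea, expressing finger motion in a local reference frame rather than in global space, can be illustrated with a plain rigid change of coordinates. The sketch below is only an illustration under stated assumptions: it assumes each contact reference frame is given by an origin and a 3x3 rotation matrix whose columns are the frame axes in global coordinates, and it treats finger motion as a list of 3D points. The function names (`to_local`, `to_global`, `canonicalize_trajectory`) are hypothetical, and the actual CAMS finger embedding may contain more than this transform.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_local(point: Sequence[float], origin: Sequence[float],
             rotation: Sequence[Sequence[float]]) -> Vec3:
    """Express a global point in a local frame: p_local = R^T (p - o).

    Assumption: `rotation` is a 3x3 rotation matrix (list of rows) whose
    columns are the local frame axes written in global coordinates.
    """
    d = [point[i] - origin[i] for i in range(3)]
    # Component j of R^T d is the dot product of column j of R with d.
    return tuple(sum(rotation[i][j] * d[i] for i in range(3)) for j in range(3))


def to_global(local: Sequence[float], origin: Sequence[float],
              rotation: Sequence[Sequence[float]]) -> Vec3:
    """Inverse mapping: p = o + R p_local."""
    return tuple(
        origin[i] + sum(rotation[i][j] * local[j] for j in range(3))
        for i in range(3)
    )


def canonicalize_trajectory(trajectory: Sequence[Sequence[float]],
                            origin: Sequence[float],
                            rotation: Sequence[Sequence[float]]) -> List[Vec3]:
    """Map every point of a fingertip trajectory into the local frame."""
    return [to_local(p, origin, rotation) for p in trajectory]


# Hypothetical contact frame: origin at (1, 0, 0), rotated 90 degrees about z,
# so the local x-axis points along the global y-axis.
frame_origin = (1.0, 0.0, 0.0)
frame_rotation = [
    [0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]

fingertip_path = [(1.0, 2.0, 0.0), (1.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
local_path = canonicalize_trajectory(fingertip_path, frame_origin, frame_rotation)
```

In this toy example the fingertip approaches the frame origin along the local x-axis, so the local path has the same coordinates wherever the contact frame sits on the object. That invariance is the usual motivation for describing motion relative to contact frames instead of in global space.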