To establish a systematic paradigm for dynamic tactile perception, we introduce a tactile
dynamic pyramid that organizes tactile data into five tiers according to the complexity
of the perception capabilities they support:
- Tier 5 (Press-Only): Collected by simply pressing the sensor against
objects, either by hand or with a robot arm. It mainly supports the recognition of
object-level attributes. (Touch and Go, ObjectFolder, VisGel, TVL, etc.)
- Tier 4 (Random Slide): Collected by pressing the sensor against
objects, followed by random sliding and rotation. It enables perception of surface-related dynamics
but lacks task relevance. (YCB-Slide, TacQuad, etc.)
- Tier 3 (Specific Action): Collected by controlling
the sensor to press and slide along the object surface following predefined actions.
It facilitates action-level tactile understanding.
- Tier 2 (Manipulation Data): Collected during real object manipulation tasks
using a robot arm or a UMI device. It is essential for learning real-world manipulation skills.
- Tier 1 (Force Data): Collected by a robot arm equipped with a force
sensor. It enables reasoning about force–deformation relationships and supports fine-grained,
force-sensitive manipulation tasks. (e.g., FeelAnyForce)
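The pyramid above can be viewed as a simple ordered taxonomy. The following sketch (the class and function names are illustrative, not part of any published API) encodes the five tiers and their supported capabilities as a lookup table:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    level: int        # 5 = lowest capability, 1 = highest
    name: str         # short label for the collection procedure
    capability: str   # perception capability the tier supports

# The five-tier tactile dynamic pyramid, ordered from lowest to highest capability.
PYRAMID = [
    Tier(5, "Press-Only", "object-level attribute recognition"),
    Tier(4, "Random Slide", "surface-related dynamics, no task relevance"),
    Tier(3, "Specific Action", "action-level tactile understanding"),
    Tier(2, "Manipulation Data", "real-world manipulation skills"),
    Tier(1, "Force Data", "force-deformation reasoning"),
]

def tiers_at_or_above(level: int) -> list[Tier]:
    """Return tiers whose capability is at least that of `level`
    (a lower tier number means a higher capability)."""
    return [t for t in PYRAMID if t.level <= level]
```

For example, `tiers_at_or_above(3)` selects Tiers 3, 2, and 1, the three tiers that ToucHD targets.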
Most existing tactile datasets reside in Tiers 4 and 5, offering insufficient support for advanced dynamic perception tasks such as dexterous manipulation, while higher-tier data remain scarce. To address this gap, we present
ToucHD, a large-scale tactile dataset with 2,426,174 contact samples,
designed as a Tactile Hierarchical Dynamic resource that enriches higher-tier dynamic tactile data.
Compared with existing tactile datasets,
ToucHD offers advantages in scale, sensor diversity, label diversity, and dynamic diversity.
The dataset comprises three subsets corresponding to the three highest tiers of the pyramid:
- Simulated Atomic Action Data (Sim). We collect 1,118,896 multi-sensor contact frames from five optical tactile sensors
performing six atomic actions (sliding left/right/up/down and rotating clockwise/counterclockwise) on 1,043 3D objects.
- Real-World Manipulation Data (Mani). We modify FastUMI by equipping its two grippers with different tactile sensors and
collect 584,842 contact frames from 46 carefully designed manipulation tasks, while simultaneously recording the interaction videos.
- Touch-Force Paired Data (Force). We collect 722,436 touch–force pairs using five carefully selected tactile sensors and 71 distinct indenters.
Under programmatic control, each indenter performs sliding motions in four directions (forward, backward, left, and right) across the sensor
surface, while a wrist-mounted force sensor records 3D contact force sequences.
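The three subset sizes reported above account for the full dataset; a quick sanity check (the variable names below are illustrative, not part of a released loader) confirms that they sum to the stated total of 2,426,174 contact samples:

```python
# Reported sizes of the three ToucHD subsets.
SUBSETS = {
    "Sim":   1_118_896,  # simulated atomic-action frames
    "Mani":    584_842,  # real-world manipulation frames
    "Force":   722_436,  # touch-force pairs
}
TOTAL = 2_426_174  # total contact samples stated for ToucHD

assert sum(SUBSETS.values()) == TOTAL
print(f"{sum(SUBSETS.values()):,} samples across {len(SUBSETS)} subsets")
# prints "2,426,174 samples across 3 subsets"
```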