# RH20T Dataset API and Visualizer Implementation

## Dataset API

### Getting Started

The RH20T Python API module is implemented in the `rh20t_api` folder. To install the dependencies, run the following command:

```bash
pip install -r ./requirements_api.txt
```

The information for each robot configuration is stored in the `configs/configs.json` file, which is usually required for using the API.

### Basic Usage

This section presents the basic usage of the dataset API, including a scene data loader for loading preprocessed scene data from the dataset, as well as an online preprocessor for real robotic manipulation inference. You can also refer to our visualizer implementation for a better understanding of how to use the scene data loader.

#### Scene Data Loader

The scene data loader should be initialized with a specific scene data folder and the robot configurations.

```python
from rh20t_api.configurations import load_conf
from rh20t_api.scene import RH20TScene

robot_configs = load_conf("configs/configs.json")
scene = RH20TScene(scene_path, robot_configs)  # scene_path: path to a scene data folder
```

Some methods/properties of the scene data loader that may be of use are listed below; a brief usage sketch follows the table:

|Method/Property|Comment|
|---|---|
|`RH20TScene.extrinsics_base_aligned`|The preprocessed 4x4 extrinsic matrix for each camera, relative to the robot arm base|
|`RH20TScene.folder`|The current scene folder (can be modified)|
|`RH20TScene.is_high_freq`|Toggles reading high-frequency data; defaults to `False`|
|`RH20TScene.intrinsics`|`Dict[str, np.ndarray]` mapping each camera serial to its calibrated 3x4 intrinsic matrix|
|`RH20TScene.in_hand_serials`|The list of in-hand camera serials|
|`RH20TScene.serials`|The list of all camera serials|
|`RH20TScene.low_freq_timestamps`|The sorted low-frequency timestamps for each camera serial|
|`RH20TScene.high_freq_timestamps`|The sorted high-frequency timestamps (all cameras share the same high-frequency timestamp list)|
|`RH20TScene.start_timestamp`|The starting timestamp of the current scene|
|`RH20TScene.end_timestamp`|The ending timestamp of the current scene|
|`RH20TScene.get_audio_path()`|The path to the audio file|
|`RH20TScene.get_image_path_pairs(timestamp:int, image_types:List[str]=["color", "depth"])`|Query the interpolated `Dict[str, List[str]]` of color-depth image path pairs for each camera at a given timestamp|
|`RH20TScene.get_image_path_pairs_period(time_interval:int, start_timestamp:int=None, end_timestamp:int=None)`|Query a list of interpolated `Dict[str, List[str]]` color-depth image path pairs for each camera over a period of time in milliseconds (the starting and ending timestamps default to the scene's if not specified)|
|`RH20TScene.get_joints_angles(timestamp:int)`|Query the interpolated joint angle vector at a given timestamp|
|`RH20TScene.get_ft_aligned(timestamp:int, serial:str="base", zeroed:bool=True)`|Query the interpolated, preprocessed concatenated 6D force-torque vector at a given timestamp, for a given camera serial (or `"base"`)|
|`RH20TScene.get_tcp_aligned(timestamp:int, serial:str="base")`|Query the interpolated, preprocessed 7D quaternion tcp pose vector at a given timestamp, for a given camera serial (or `"base"`)|
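
A brief usage sketch, assuming (as the table suggests) that `low_freq_timestamps` is keyed by camera serial and that each `get_image_path_pairs` entry is a `[color, depth]` path list:

```python
serial = scene.serials[0]  # pick one of the available cameras

# walk this camera's low-frequency timeline and query aligned signals
for t in scene.low_freq_timestamps[serial]:
    color_path, depth_path = scene.get_image_path_pairs(t)[serial]
    tcp = scene.get_tcp_aligned(t, serial)  # 7D quaternion pose in this camera's frame
    ft = scene.get_ft_aligned(t, serial)    # 6D force-torque in this camera's frame
```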

#### Online Preprocessor

The concatenated force-torque vectors and the tcp values are collected in the robot arm base coordinate frame. The implemented online preprocessor can be used to project these values into a given camera's coordinate frame online, for inference purposes. It should be initialized with the calibration result path, the robot configuration, and the specific camera to project to.

```python
import numpy as np

from rh20t_api.configurations import load_conf
from rh20t_api.scene import RH20TScene
from rh20t_api.online import RH20TOnline

serial = "[The serial number of the camera to project to]"

robot_configs = load_conf("configs/configs.json")
scene = RH20TScene(scene_path, robot_configs)

# before using the processor, it is recommended to collect the first several
# frames of data while the robot arm is still, and to update the sensor
# offsets to account for temperature drift
sampled_raw_fts = np.zeros((100, 6))   # placeholder, shaped (n, 6): raw force-torque vectors from the first frames
sampled_raw_tcps = np.zeros((100, 7))  # placeholder, shaped (n, 6) or (n, 7): raw tcp values from the first frames
scene.configuration.update_offset(sampled_raw_fts, sampled_raw_tcps)

# initialize the preprocessor
processor = RH20TOnline(scene.calib_folder, scene.configuration, serial)

# ... (obtain ft_raw and tcp_raw from your sensor stream)

# online preprocessing of ft_raw and tcp_raw: the processed values are
# aligned, corrected with the sensor offsets (for force and torque), and
# projected to the specified camera
processed_ft = processor.project_raw_from_external_sensor(ft_raw, tcp_raw)
processed_tcp, processed_ft_tcp = processor.project_raw_from_robot_sensor(tcp_raw)
```
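
Continuing the block above, here is a minimal sketch of what the elided loop might look like; `read_ft_sensor()` and `read_robot_tcp()` are hypothetical helpers standing in for your own hardware interface:

```python
import numpy as np

def read_ft_sensor() -> np.ndarray:
    """Hypothetical helper: return the latest raw 6D force-torque reading."""
    return np.zeros(6)  # placeholder

def read_robot_tcp() -> np.ndarray:
    """Hypothetical helper: return the latest raw 7D tcp pose (position + quaternion)."""
    return np.zeros(7)  # placeholder

while True:  # control loop
    ft_raw = read_ft_sensor()
    tcp_raw = read_robot_tcp()
    processed_ft = processor.project_raw_from_external_sensor(ft_raw, tcp_raw)
    processed_tcp, processed_ft_tcp = processor.project_raw_from_robot_sensor(tcp_raw)
    # feed the processed, camera-frame values to the inference model here
```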

## Dataset Visualizer

### Getting Started

The visualizer can be configured in `configs/default.yaml`. The dependencies are installed via:

```bash
# recommended to run on Ubuntu
pip install -r ./requirements.txt
```

The minimal set of files for visualizing a scene is a scene folder placed alongside a calibration folder, like:

```text
- calib
  - [some timestamp]
  - ...
- [scene folder]
```

Before visualizing a scene, first preprocess the images into point clouds and cache them to save time, for example:

```bash
python visualize.py --scene_folder [SCENE_FOLDER] --cache_folder [CACHE_FOLDER] --preprocess
```

You can modify configurations, including the sampling time interval, the screenshot saving path, and which visualizations to enable, etc., in `configs/default.yaml`. If you would only like to view the first several frames of point clouds, you can run the above command for a while and then stop it with `Ctrl + C`. It is recommended to cache at least the first frame's point cloud, since it is used for adjusting the initial viewing direction.
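
As an illustration, the config can also be read programmatically; a minimal sketch assuming the file parses with PyYAML (`chosen_cam_idx` is referenced below; key names in your copy of the file may differ):

```python
import yaml

# load the visualizer configuration
with open("configs/default.yaml") as f:
    cfg = yaml.safe_load(f)

# `chosen_cam_idx` selects which camera feeds the OpenCV viewport
print(cfg["chosen_cam_idx"])
```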

Running the following command will visualize the dynamic scene, with an Open3D viewport showing the dynamic 3D scene and an OpenCV viewport showing the real-world video captured by your chosen camera (determined by `chosen_cam_idx` in `configs/default.yaml`). Note that the scene folder and cache folder should match the ones used for preprocessing.

```bash
python visualize.py --scene_folder [SCENE_FOLDER] --cache_folder [CACHE_FOLDER]
```

During the visualization, you can:

1. drag to view the scene with your mouse;
2. press `Alt` to pause, then press `←` and `→` to view the previous and next frames respectively, drag to view them, press `C` to save a screenshot of the current frame, and press `Alt` again to continue playing;
3. press `Esc` to stop.