# RAI Perception
RAI Perception brings powerful computer vision capabilities to your ROS2 applications. It integrates [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO) and [Grounded-SAM-2](https://github.com/RobotecAI/Grounded-SAM-2) to detect objects, create segmentation masks, and calculate gripping points.

The package includes two ready-to-use ROS2 service nodes (`GroundedSamAgent` and `GroundingDinoAgent`) that you can easily add to your applications. It also provides tools that work seamlessly with [RAI LLM agents](../tutorials/walkthrough.md) to build conversational robot scenarios.

## Prerequisites

Before installing `rai-perception`, ensure you have:

1. **ROS2 installed** (Jazzy recommended, or Humble). If you don't have ROS2 yet, follow the official ROS2 installation guide for [Jazzy](https://docs.ros.org/en/jazzy/Installation.html) or [Humble](https://docs.ros.org/en/humble/Installation.html).
2. **Python 3.8+** and `pip` installed (usually pre-installed on Ubuntu).
3. **NVIDIA GPU** with CUDA support (required for optimal performance).
4. **wget** installed (required for downloading model weights):

   ```bash
   sudo apt install wget
   ```

## Installation

**Step 1:** Source ROS2 in your terminal:

```bash
# For ROS2 Jazzy (recommended)
source /opt/ros/jazzy/setup.bash

# For ROS2 Humble
source /opt/ros/humble/setup.bash
```

**Step 2:** Install ROS2 dependencies. `rai-perception` depends on ROS2 interface packages that need to be installed separately:

```bash
# Update package lists first
sudo apt update

# Install rai_interfaces as a Debian package
sudo apt install ros-jazzy-rai-interfaces # or ros-humble-rai-interfaces for Humble
```

**Step 3:** Install `rai-perception` via pip:

```bash
pip install rai-perception
```

> [!TIP]
> It's recommended to install `rai-perception` in a virtual environment to avoid conflicts with other Python packages.

> [!TIP]
> To avoid sourcing ROS2 in every new terminal, add the source command to your `~/.bashrc` file:
>
> ```bash
> echo "source /opt/ros/jazzy/setup.bash" >> ~/.bashrc # or humble
> ```

<!--- --8<-- [end:sec1] -->

<!--- --8<-- [start:sec4] -->
## Getting Started

This section provides a step-by-step guide to get you up and running with RAI Perception.

### Quick Start

After installing `rai-perception`, launch the perception agents (`GroundingDinoAgent` and `GroundedSamAgent`).

> [!NOTE]
> The weights will be downloaded to the `~/.cache/rai` directory on first use.

The agents create two ROS 2 nodes, `grounding_dino` and `grounded_sam`, using [ROS2Connector](../API_documentation/connectors/ROS_2_Connectors.md).
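To make "gripping point calculation" concrete, here is a minimal, self-contained sketch (plain Python, not this package's actual implementation, which works on real GroundedSAM masks) that takes a mask's centroid as the gripping point:

```python
def gripping_point(mask):
    """Centroid (x, y) of the True pixels in a boolean segmentation mask.

    Illustrative heuristic only: rai_perception computes gripping points
    from its segmentation masks internally.
    """
    xs = ys = n = 0
    for y, row in enumerate(mask):
        for x, inside in enumerate(row):
            if inside:
                xs += x
                ys += y
                n += 1
    if n == 0:
        raise ValueError("empty mask: nothing to grip")
    return xs / n, ys / n

# A 5x5 mask with a 3x3 object in the middle -> centroid at (2.0, 2.0)
mask = [[1 <= x <= 3 and 1 <= y <= 3 for x in range(5)] for y in range(5)]
print(gripping_point(mask))  # (2.0, 2.0)
```

A real pipeline would additionally look up the depth image at this pixel to obtain a 3D target for the gripper.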

### Testing with Example Client

The `rai_perception/talker.py` example demonstrates how to use the perception services for object detection and segmentation. It shows the complete pipeline: GroundingDINO for object detection followed by GroundedSAM for instance segmentation, with visualization output.

**Step 1:** Open a terminal and source ROS2:

```bash
source /opt/ros/jazzy/setup.bash # or humble
```
**Step 2:** Launch the perception agents. One way (assuming the `rai` repository is built in your ROS2 workspace) is the premade launch file mentioned in the tip below:

```bash
ros2 launch rai_bringup openset.launch.py
```

**Step 3:** In another terminal, run the example client (`rai_perception/talker.py`).

You can use any image containing objects like dragons, lizards, or dinosaurs. For example, use the `sample.jpg` from the package's `images` folder. The client will detect these objects and save a visualization with bounding boxes and masks to `masks.png` in the current directory.
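The client's two-stage flow — GroundingDINO turns text prompts into boxes, GroundedSAM turns boxes into masks, and the masks are drawn into `masks.png` — can be sketched with stand-in functions. Note that `detect`, `segment`, and `overlay` below are hypothetical placeholders, not this package's API; the real client calls the `grounding_dino` and `grounded_sam` ROS2 services instead:

```python
def detect(image, prompts):
    """Stand-in for the GroundingDINO service: text prompts -> boxes."""
    # Pretend every prompt matched one box: (x_min, y_min, x_max, y_max, label)
    return [(10, 10, 40, 40, p) for p in prompts]

def segment(image, boxes):
    """Stand-in for the GroundedSAM service: boxes -> boolean masks."""
    h, w = len(image), len(image[0])
    masks = []
    for (x0, y0, x1, y1, label) in boxes:
        mask = [[x0 <= x < x1 and y0 <= y < y1 for x in range(w)] for y in range(h)]
        masks.append((label, mask))
    return masks

def overlay(image, masks):
    """Mark masked pixels, mimicking the masks.png visualization."""
    out = [row[:] for row in image]
    for _, mask in masks:
        for y, row in enumerate(mask):
            for x, inside in enumerate(row):
                if inside:
                    out[y][x] = 1  # highlight masked pixel
    return out

image = [[0] * 64 for _ in range(64)]
boxes = detect(image, ["dragon", "lizard"])
masks = segment(image, boxes)
vis = overlay(image, masks)
print(len(masks), sum(map(sum, vis)))  # 2 masks, 900 highlighted pixels
```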
> [!TIP]
>
> If you wish to integrate open-set vision into your ros2 launch file, a premade launch
> file can be found in `rai/src/rai_bringup/launch/openset.launch.py`
### ROS2 Service Interface
<!--- --8<-- [end:sec4] -->

<!--- --8<-- [start:sec5] -->
## Dive Deeper: Tools and Integration

This section provides information for developers looking to integrate RAI Perception tools into their applications.

### RAI Tools

The `rai_perception` package contains tools that can be used by [RAI LLM agents](../tutorials/walkthrough.md)
to enhance their perception capabilities. For more information on RAI Tools see
[Tool use and development](../tutorials/tools.md) tutorial.
<!--- --8<-- [start:sec2] -->

This tool calls the GroundingDINO service to detect objects from a comma-separated…
> [!TIP]
>
> You can try the example below with the [rosbotxl demo](../demos/rosbot_xl.md) binary.
> The binary exposes `/camera/camera/color/image_raw` and `/camera/camera/depth/image_rect_raw` topics.
<!--- --8<-- [start:sec3] -->

```python
with ROS2Context():
    # …
```

```
I have detected the following items in the picture desk: 2.43m away
```
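The distance in the sample answer above combines the segmentation mask with the depth image. As a rough, package-independent sketch (the tool's actual method may differ), one robust choice is the median depth of the pixels under the mask:

```python
from statistics import median

def object_distance_m(depth_m, mask):
    """Median depth (in meters) of the pixels under a boolean mask.

    Illustrative only: the median ignores depth outliers at object
    edges and invalid (zero) readings are skipped.
    """
    values = [
        depth_m[y][x]
        for y, row in enumerate(mask)
        for x, inside in enumerate(row)
        if inside and depth_m[y][x] > 0.0  # skip invalid depth readings
    ]
    if not values:
        raise ValueError("no valid depth under mask")
    return median(values)

depth = [[2.40, 2.43, 9.99],   # 9.99: background pixel outside the mask
         [2.45, 2.43, 0.00]]   # 0.00: invalid reading
mask  = [[True, True, False],
         [True, True, True]]
print(f"{object_distance_m(depth, mask):.2f}m away")  # 2.43m away
```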