Commit f988755

Add a description of each program to the README
1 parent c3b0371 commit f988755

1 file changed (+19, -0 lines changed)

README.md

Lines changed: 19 additions & 0 deletions
@@ -63,3 +63,22 @@ used by default.
 cd ..
 ./build/sobel_cpu
 ```
+
+## Program Breakdown
+
+The idea of the project is to use the CPU implementation as a baseline
+and then apply each optimization step incrementally.
+
+- **sobel_cpu**: CPU baseline implementation
+- **sobel_gpu_1_naive**: Literal port to GPU
+- **sobel_gpu_2_single_alloc**: Allocate the GPU memory only once
+and recycle it through all the iterations.
+- **sobel_gpu_3_pinned_mem**: Allocate host memory as
+non-pageable/pinned so that the transfer is highly optimized.
+- **sobel_gpu_4_shared_mem**: Allocate shared memory (if possible) for
+the GPU/CPU to eliminate the memory transfer.
+- **sobel_gpu_5_shared_mem_streams**: Use CUDA streams to process
+certain parts of the pipeline in parallel.
+- **sobel_gpu_5_pinned_mem_streams**: Use CUDA streams to process
+certain parts of the pipeline in parallel (alternative implementation
+for pinned memory instead of shared memory).
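
For readers unfamiliar with the baseline that the list above starts from, a CPU Sobel filter might look roughly like the sketch below. This is an illustration only, not the code in this repository: the function name `sobel_cpu_sketch`, the 8-bit grayscale row-major layout, the zeroed border, and the clamping to 255 are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper (not taken from the repository): applies the 3x3
// Sobel operator to an 8-bit grayscale image stored in row-major order.
// Border pixels are left at 0 and the gradient magnitude is clamped to 255.
std::vector<std::uint8_t> sobel_cpu_sketch(const std::vector<std::uint8_t>& src,
                                           int width, int height) {
    std::vector<std::uint8_t> dst(src.size(), 0);
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            // Read the neighbour at offset (dx, dy) as a signed int.
            auto p = [&](int dx, int dy) {
                return static_cast<int>(src[(y + dy) * width + (x + dx)]);
            };
            // Horizontal and vertical Sobel kernels.
            int gx = -p(-1, -1) + p(1, -1)
                     - 2 * p(-1, 0) + 2 * p(1, 0)
                     - p(-1, 1) + p(1, 1);
            int gy = -p(-1, -1) - 2 * p(0, -1) - p(1, -1)
                     + p(-1, 1) + 2 * p(0, 1) + p(1, 1);
            double mag = std::sqrt(static_cast<double>(gx * gx + gy * gy));
            dst[y * width + x] = static_cast<std::uint8_t>(std::min(mag, 255.0));
        }
    }
    return dst;
}
```

Each output pixel depends only on its 3x3 neighbourhood, which is why a literal port to the GPU (one thread per pixel) is a natural first step before the memory-related optimizations listed above.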
