Skip to content

Add fast grayscale morphology using sparse table data structure #3929

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: 4.x
Choose a base branch
from

Conversation

kyaco
Copy link

@kyaco kyaco commented Apr 29, 2025

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Abstract

This is a pull request for a program that calculates grayscale morphological operations quickly by using a data structure called a sparse table.

About sparse table

1D sparse table

For the data structure, see this: https://www.geeksforgeeks.org/sparse-table/

It is possible to quickly find the minimum (maximum) value of an array within a range whose length is a power of two.

Let's call the nth row of the sparse table the "row of depth n".
The nth row contains the minimum (maximum) values ​​in consecutive subsequences of length $2^n$ of the original array.

sparse tabl(min)
depth0(original array) 1 4 9 2 5 0 5 7 3
depth1 1 4 2 2 0 0 5 3 -
depth2 1 2 0 0 0 0 - - -
depth3 0 0 - - - - - - -

Let us denote by $F$ the operation of computing the next row from a row in a sparse table.

$st[0] = original~array$
$st[depth + 1] = F(st[depth])$

2D sparse table

For the data structure, see this: https://www.geeksforgeeks.org/2d-range-minimum-query-in-o1/

Change the order of the subscripts to:

$st[dr][dc][r][c]$

where $dr$ is the depth in the row direction, $dc$ is the depth in the column direction, $r$ is the index in the row direction, and $c$ is the index in the column direction.
$st[dr][dc]$ is a two-dimensional table that lists the minimum (maximum) values ​​within a rectangle R of height $2^{dr}$ and width $2^{dc}$ that is slid across the original array A. In other words, it is the result of a morphological operation when A is a grayscale image and R is the structuring element.

In terms of the depth (dr, dc) of a sparse table, the operation of increasing the depth by +1 in the row and col directions will be denoted as Fr and Fc, respectively.

$st[0][0] = original~image$
$st[dr+1][dc] = F_r(st[dr][dc])$
$st[dr][dc+1] = F_c(st[dr][dc])$

Sparse table morphology

Morphological operations are performed in the following steps:

  1. Prepare a set of rectangles whose width and height are powers of 2 to cover the structuring element
  2. Group the rectangles by height and width.
  3. Prepare a MAT DST to accumulate the results
  4. For each group, do the following
    1. Create st[dr][dc] that corresponds to the size of this group
    2. For each rectangle in the group, do the following
      • Shift st[dr][dc] by the position of each rectangle, and calculate min/max against DST.

At step 1, I implemented a method to find corners in a sparse table of structuring elements.
I don't know if it's the best method, but it's giving good results.

At step 4-1, st[dr][dc] is created by reusing st[dr'][dc'] of smaller size.
There are several possible reuse strategies, but I implemented a method that minimizes the total number of calculations of Fr and Fc.
This is reduced to the Rectilinear Steiner Arborescence Problem. (https://link.springer.com/article/10.1007/BF01758762).

Examples

With a structuring element of MORPH_ELLIPSE, size(5, 5)

The structuring element is decomposed into six rectangles.
The result of grouping in step 2 is as follows:

  • Size(4, 2): Point(0, 1), Point(1, 1), Point(0, 2), Point(1, 2)
  • Size(1, 4): Point(2, 0), Point(2, 1)

image

For size(4, 2), we need to calculate st[1][2], and for size(1, 4), we need to calculate st[2][0].

  1. st[0][0] =: original image
  2. st[1][0] =: Fr(st[0][0])
  3. st[1][1] =: Fc(st[1][0])
  4. st[1][2] =: Fc(st[1][1])
    • shift & min/max process for Point(0, 1), Point(1, 1), Point(0, 2), Point(1, 2)
  5. st[2][0] =: Fr(st[1][0])
    • shift & min/max process for Point(2, 0), Point(2, 1)

image

Performance measurement

  • Execution environment
    • Intel i7-8000 CPU @ 3.20GHz
    • 32GB RAM
  • image
    • lena.png(512x512), CV_8UC3
  • conditions
    • cv_RECT: MORPH_RECT, cv_erode
    • cv_CROSS: MORPH_CROSS, cv_erode
    • cv_ELLIPSE: MORPH_ELLIPSE, cv_erode
    • st_RECT: MORPH_RECT, sparse_table_erode
    • st_CROSS: MORPH_CROSS, sparse_table_erode
    • st_ELLIPSE: MORPH_ELLIPSE, sparse_table_erode

When kernel is decomposited in advance:

image

From the graph, time complexity against diameter of kernel are as follows:

  • cv_RECT and cv_CROSS: O(k)
  • cv_ELLIPSE: O(k^2)
  • st_RECT and st_CROSS: O(k^0.5)
  • st_ELLIPSE: O(k^0.76)

Including decomposition time:

image

When the kernel size approaches the same size as the target image, the kernel decomposition time becomes comparable to that of CV_RECT.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants