Description
The SVD tutorial interprets the shape of the image in the following way:
img.shape
(768, 1024, 3)
The output is a tuple with three elements, which means that this is a three-dimensional array. In fact, since this is a color image, and we have used the imread function to read it, the data is organized in three 2D arrays, representing color channels (in this case, red, green and blue - RGB). You can see this by looking at the shape above: it indicates that we have an array of 3 matrices, each having shape 768x1024.
I think it is misleading to explain the organisational structure of the array as consisting of three 2D arrays. The data is not organized as an array of 3 matrices.
In NumPy, the leftmost axis is the outermost. In the shape (768, 1024, 3), the colour channel axis is the last one, which makes it the innermost axis.
So rather than "three matrices of shape 768x1024," I think it builds better intuition to talk about a 768x1024 grid of pixels, where each pixel has 3 values for RGB.
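To illustrate the point, here is a minimal sketch. It uses a synthetic array with the same shape as the tutorial's image (the zero-filled data and the variable names are my own assumptions, not the tutorial's code) and shows what indexing along each axis returns:

```python
import numpy as np

# Synthetic stand-in for the tutorial's image (assumption): same shape, dummy data.
img = np.zeros((768, 1024, 3), dtype=np.uint8)
img[0, 0] = [255, 128, 0]  # give the top-left pixel distinct R, G, B values

# Indexing the outermost (leftmost) axis yields a row of pixels, not a colour channel.
first_row = img[0]          # shape (1024, 3)

# Indexing the first two axes yields one pixel with its 3 colour values.
pixel = img[0, 0]           # shape (3,)

# A single colour channel has to be sliced out along the last axis.
red_channel = img[:, :, 0]  # shape (768, 1024)

# An actual "array of 3 matrices" needs the channel axis moved to the front.
channels_first = np.moveaxis(img, -1, 0)  # shape (3, 768, 1024)

print(first_row.shape, pixel.shape, red_channel.shape, channels_first.shape)
```

Only after moving the channel axis to the front does `channels_first[0]` give a 768x1024 matrix; in the original layout, `img[0]` is a row of 1024 pixels.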
Do others agree with this? If so, I could make a PR that suggests a better wording.