add a page to discuss performance #64

jdries · 2022-05-12T07:12:42Z

No description provided.

soxofaan

just some minor notes

documentation/1.0/developers/backends/performance.md

LukeWeidenwalker · 2022-05-12T12:32:56Z

documentation/1.0/developers/backends/performance.md

+In general, process graphs are first analyzed as a whole before the actual processing starts. The analysis phase serves
+to reveal the optimal processing strategy and parameters. 
+
+These are a few examples of things that can be derived from a process graph and subsequent optimizations:


Please correct me if I'm wrong, but these examples don't seem all that backend-specific - if that is the case, wouldn't we want to come up with a selection of backend-independent pre-optimisations (editing the process graph directly), publish these as an openeo library and link to that here?

Are there any other relevant repos to link here? (I hesitate to link to openeo-odc, because it isn't very readable)

Actually, part of this is already covered by openeo-python-driver, a package that does not have many dependencies on backend-specific code.
I believe the original authors of the odc backend simply chose to write and maintain their own basic process graph handling code, but there is indeed more opportunity for shared libraries, especially for Python-related code.
Of course, most time is spent writing the actual processing engine; you can reuse that as well, but then you don't run on dask anymore :-).
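
To make the idea of a backend-independent pre-optimisation concrete, here is a minimal sketch of one possible rewrite: folding a `filter_bbox` node into the `load_collection` node it reads from. This is only an illustration and is not taken from openeo-python-driver or openeo-odc. The function names `push_down_bbox` and `_references` and the collection id `SENTINEL2_L2A` are hypothetical, and the sketch assumes a flat process graph in which node references appear only as top-level arguments.

```python
import copy


def _references(node, target_id):
    """True if any top-level argument of `node` points at `target_id`."""
    return any(
        isinstance(value, dict) and value.get("from_node") == target_id
        for value in node["arguments"].values()
    )


def push_down_bbox(process_graph):
    """Return a copy of the graph in which a filter_bbox that directly
    follows a load_collection is folded into its spatial_extent.

    Hypothetical helper: nested (child) process graphs are not handled.
    """
    graph = copy.deepcopy(process_graph)
    for node_id, node in list(graph.items()):
        if node["process_id"] != "filter_bbox":
            continue
        data = node["arguments"].get("data")
        source_id = data.get("from_node") if isinstance(data, dict) else None
        source = graph.get(source_id)
        if source is None or source["process_id"] != "load_collection":
            continue
        # Only rewrite when filter_bbox is the sole consumer of the load node,
        # otherwise other branches would silently receive the smaller extent.
        consumers = [n for n in graph.values() if _references(n, source_id)]
        if len(consumers) != 1:
            continue
        # Keep the sketch simple: do not try to intersect two extents.
        if source["arguments"].get("spatial_extent") is not None:
            continue
        source["arguments"]["spatial_extent"] = node["arguments"]["extent"]
        # Re-point everything that consumed filter_bbox to the load node.
        for other in graph.values():
            for key, value in other["arguments"].items():
                if isinstance(value, dict) and value.get("from_node") == node_id:
                    other["arguments"][key] = {"from_node": source_id}
        if node.get("result"):
            source["result"] = True
        del graph[node_id]
    return graph


graph = {
    "load": {
        "process_id": "load_collection",
        "arguments": {"id": "SENTINEL2_L2A", "spatial_extent": None, "temporal_extent": None},
    },
    "bbox": {
        "process_id": "filter_bbox",
        "arguments": {
            "data": {"from_node": "load"},
            "extent": {"west": 5.0, "south": 51.0, "east": 5.1, "north": 51.1},
        },
    },
    "save": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "bbox"}, "format": "GTiff"},
        "result": True,
    },
}

optimized = push_down_bbox(graph)
```

Because the function works on the JSON structure alone, a rewrite like this could in principle live in a shared library and run before any backend-specific planning.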

LukeWeidenwalker · 2022-05-12T12:37:24Z

documentation/1.0/developers/backends/performance.md

+For scalability, the openEO processes clearly define along which set of dimension labels of the datacube they operate. When
+a user writes a process graph, it should never instruct the backend to apply a black box algorithm or function on the 
+entire datacube. For most algorithms, this is not necessary, and loading the complete datacube of a Copernicus mission at once
+is simply not possible. Hence, users run 'callbacks' over a 1-dimensional array, or even multidimensional arrays or 'chunks'


Again, as someone fairly new to openEO jargon, it is unclear to me what "callback" means here - is the idea to "chunk" an array of processing tasks into individual callables? If so, I think that wants to be explicitly described!

"callback" is meant to refer to the reducer argument in reduce_dimension or process argument in apply_dimension. In openEO these are called "processes", but I think that is too generic to be used as replacement for "callback" in this context. Maybe citing an example (e.g. reducer in reduce_dimension) could help in the text?

I added a link to python client docs as I don't know of a better alternative.
@m-mohr I would expect something here: https://openeo.org/documentation/1.0/developers/api/reference.html#section/Processes/Process-Graphs
But now it seems to be explained as 'user defined process', which I actually think of as something else.

Yeah, people complained about "callback" because it sounds like "javascript", so it is now called "user-defined process" in openEO terminology, which is also what the process descriptions use. Still not ideal, though. More recently we have also used "child process" or "sub process", I think. So maybe use "user-defined (child) process" and link to the API docs.
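
For readers new to the jargon, the "callback" (child process) discussed above can be pictured with a toy sketch in plain Python. It is not the openEO client or any backend API: the `reduce_dimension` function below is a hypothetical stand-in for the openEO process of the same name, and the cube is modelled, as an assumption, as a mapping from pixel coordinates to the values along the time dimension.

```python
def reduce_dimension(cube, reducer):
    """Toy stand-in for reduce_dimension over the time dimension.

    `cube` maps an (x, y) pixel to its list of values along time.
    `reducer` is the callback: it only ever sees one 1-dimensional
    series, never the whole datacube.
    """
    return {pixel: reducer(series) for pixel, series in cube.items()}


# Two pixels with three observations each (made-up values).
cube = {
    (0, 0): [0.2, 0.4, 0.6],
    (0, 1): [0.5, 0.3, 0.1],
}

# The user supplies the callback; here the built-in max acts as reducer.
result = reduce_dimension(cube, reducer=max)
```

Since the reducer is applied to each series independently, a backend is free to split the pixels into chunks and process them in parallel, which is the scalability point the quoted paragraph is making.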

jdries · 2022-05-12T13:19:06Z

@m-mohr should I just merge this one, so that we have a page for tomorrow's meeting?

documentation/1.0/developers/backends/performance.md

m-mohr · 2022-05-12T15:11:05Z

@jdries I haven't read it, but it has some reviews so I added it to the menu and merged it. Thanks.

add a page to discuss performance

4afde33

jdries requested review from aljacob, m-mohr, soxofaan and LukeWeidenwalker May 12, 2022 07:12

soxofaan approved these changes May 12, 2022

LukeWeidenwalker reviewed May 12, 2022

LukeWeidenwalker reviewed May 12, 2022

integrate feedback

bbc526e

Add link to menu

182a799

m-mohr reviewed May 12, 2022

Update documentation/1.0/developers/backends/performance.md

1d18995

m-mohr merged commit ee869b3 into master May 12, 2022

m-mohr deleted the performance-guidelines branch May 12, 2022 15:11
