Setup guide: better recommendations for Docker usage #98

bertsky · 2020-01-10T11:07:50Z

The current formulation of the setup guide recommends running the docker image individually for the individual processor CLIs (translating native commands to docker calls). This is one possibility, but I would not recommend it first/exclusively.

The current (fat) Docker configuration is tailored specifically for usage of more than 1 processor within the same container, which is not only more efficient than spinning up multiple containers for multiple tools, but can also be much more convenient to set up a workflow. You can "login" to the image interactively, and (interactive or not) to run an entire workflow with multiple tools on your dataset, you currently have 3 options:

use ocrd process
use a complex shell command
use workflow-configuration (ocrd-make)

The latter already provides a number of sensible preconfigured workflows.
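As a rough illustration of the first option, the sketch below runs a two-step workflow with `ocrd process` inside a single container instead of starting one container per processor. It assumes the `ocrd/all:maximum` image, a workspace with a `mets.xml` in the current directory, and task strings written without the `ocrd-` prefix. The function name, the `DOCKER` and `OCRD_IMAGE` variables, and the file group names are illustrative only.

```shell
# Hypothetical helper: run several processors in ONE container via `ocrd process`.
# DOCKER and OCRD_IMAGE are illustrative overrides, not OCR-D settings.
ocrd_workflow_in_docker() {
    # Mount the current workspace as /data and run as the calling user,
    # so output files are not owned by root.
    "${DOCKER:-docker}" run --rm \
        -u "$(id -u)" -v "$PWD:/data" -w /data \
        "${OCRD_IMAGE:-ocrd/all:maximum}" \
        ocrd process \
            "cis-ocropy-binarize -I OCR-D-IMG -O OCR-D-BIN" \
            "tesserocr-segment-region -I OCR-D-BIN -O OCR-D-SEG"
}
```

Calling `ocrd_workflow_in_docker` from the workspace directory would start the container once, run both steps, and remove the container afterwards.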

This is what I would document in the guide. (And the section Running a small workflow also does not cover this, or the Docker option, yet.)

EEngl52 · 2020-05-13T10:40:32Z

thx for reporting @bertsky ! In the meantime this was already solved by adding the Docker calls in the respective sections of the user guide, wasn't it? https://ocr-d.de/en/user_guide#calling-several-processors

bertsky · 2020-05-13T11:04:54Z

> In the meantime this was already solved by adding the Docker calls in the respective sections of the user guide, wasn't it? https://ocr-d.de/en/user_guide#calling-several-processors

No, that's not the same thing!

(But notice that https://ocr-d.de/en/setup#translating-native-commands-to-docker-calls appears as a broken link in the setup guide – it should point to https://ocr-d.de/en/user_guide#translating-native-commands-to-docker-calls.)

IMO it does not make much sense (practically) to recommend/explain the Docker option by showing how to repeat each individual command/step with a docker run call. You could do that once in https://ocr-d.de/en/user_guide#calling-a-single-processor and once in https://ocr-d.de/en/user_guide#calling-several-processors (and/or in the setup guide).

But then you should also explain how to "log in" to a Docker container – spinning it up only once – and there do all the wonderful things explained everywhere in the "native" installation context, including ocrd workspace commands, single processor calls, workflow tools etc.
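A minimal sketch of what "logging in" could look like, assuming the `ocrd/all:maximum` image and a workspace in the current directory. The function name and the `DOCKER` and `OCRD_IMAGE` variables are hypothetical conveniences, not part of OCR-D.

```shell
# Hypothetical helper: open one interactive shell inside the OCR-D image.
# DOCKER and OCRD_IMAGE are illustrative overrides, not OCR-D settings.
ocrd_shell() {
    # -it      keep stdin open and allocate a terminal for the shell
    # --rm     remove the container again when the shell exits
    # -u       run as the calling user so output files are not owned by root
    # -v / -w  mount the current directory as /data and start there
    "${DOCKER:-docker}" run --rm -it \
        -u "$(id -u)" \
        -v "$PWD:/data" -w /data \
        "${OCRD_IMAGE:-ocrd/all:maximum}" bash
}
```

Inside that shell, `ocrd workspace` commands, single processor calls and workflow tools can be used as in a native installation; `exit` leaves the container.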

bertsky · 2020-07-07T15:01:09Z

So to sum up:

drop all the "via Docker" command examples in the user guide (as they are impractical and redundant)
explain once (at the top of the user guide, and at the bottom of the setup guide) how to spin up Docker containers
and at that point: explain not only how to spin up one processor at a time (which is wasteful and impractical) but how to "log in" to a container and then do whatever you would do in native installation

bertsky · 2020-08-14T21:40:39Z

Am I the only one unsatisfied with documentation of the Docker option?

dariok · 2020-09-10T10:12:14Z

No, I second your recommendation of reworking the documentation in this regard. Users who are not familiar with docker will be surprised to see how quickly their drives will be cluttered after executing docker run 20 times. While many of us know enough about docker to solve this, it is safe to assume that a fair amount of users are using docker for the first time because of ocr-d, hence, the info should be changed.

kba · 2020-09-10T10:19:54Z

I hear you and we will improve docker documentation.

> Users who are not familiar with docker will be surprised to see how quickly their drives will be cluttered after executing docker run 20 times.

You can mitigate that problem by using the docker run --rm option which will remove the container after executing and occasionally running docker system prune.
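The cleanup side of this might be sketched as follows. The function name and the `DOCKER` variable are hypothetical; the subcommands themselves are standard Docker CLI.

```shell
# Hypothetical helper: inspect and clean up what earlier `docker run` calls left behind.
# DOCKER is an illustrative override for the docker binary.
docker_cleanup() {
    # List stopped containers (these pile up when --rm was not used).
    "${DOCKER:-docker}" ps -a --filter status=exited
    # Show how much disk space images, containers and volumes take.
    "${DOCKER:-docker}" system df
    # Remove stopped containers, unused networks, dangling images and
    # build cache; asks for confirmation before deleting anything.
    "${DOCKER:-docker}" system prune
}
```

`docker container prune` and `docker image prune` are narrower alternatives that clean up only stopped containers or only dangling images.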

bertsky · 2020-09-10T10:24:38Z

> You can mitigate that problem by using the docker run --rm option which will remove the container after executing

the user guide already shows this each time, but not the setup guide.

> and occasionally running docker system prune.

Nice, didn't know this (used to help myself with docker container prune and docker image prune). Anyway, please also put this somewhere in the setup guide.

dariok · 2020-09-16T10:28:08Z

@kba: Thanks for replying!
As @bertsky said, docker run --rm is not mentioned in the setup guide. I have not checked the other pages but think that many people not used to working with docker may not spot the difference.

I myself prefer not to fire up a new container every time, but instead to name a container and re-attach to it later (docker attach).

Probably it’s easier not to include all docker hints on the individual pages but instead have a special page about the possible usage scenarios with docker. This could include mentioning docker run --rm and giving an intro about how to name a container and then later re-attach to it. That way, users can choose what is best in their situation.
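The named-container scenario could be sketched like this, assuming the `ocrd/all:maximum` image. The container name `ocrd-work`, the function names, and the `DOCKER`, `OCRD_IMAGE` and `OCRD_CONTAINER` variables are all illustrative.

```shell
# Hypothetical helpers for one long-lived, named container.
# OCRD_CONTAINER, OCRD_IMAGE and DOCKER are illustrative, not OCR-D settings.
OCRD_CONTAINER="${OCRD_CONTAINER:-ocrd-work}"

# Create the container once; without --rm it survives after the shell exits.
ocrd_container_create() {
    "${DOCKER:-docker}" run -it --name "$OCRD_CONTAINER" \
        -u "$(id -u)" -v "$PWD:/data" -w /data \
        "${OCRD_IMAGE:-ocrd/all:maximum}" bash
}

# Restart the stopped container and attach to its shell again.
ocrd_container_resume() {
    "${DOCKER:-docker}" start -ai "$OCRD_CONTAINER"
}

# Delete the container when it is no longer needed.
ocrd_container_remove() {
    "${DOCKER:-docker}" rm "$OCRD_CONTAINER"
}
```

`docker attach` applies when the container is still running, for example after detaching with Ctrl-P Ctrl-Q. Once the shell has exited, the container is stopped and `docker start -ai` is the way back in.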

kba transferred this issue from OCR-D/docs May 11, 2020

EEngl52 assigned VolkerHartmann May 13, 2020

kba assigned kba and EEngl52 and unassigned VolkerHartmann and EEngl52 Jul 7, 2020

bertsky mentioned this issue Sep 14, 2021

idea: use docker interactive shell instead of wrapping everything in docker calls #247

Closed
