Setup guide: better recommendations for Docker usage #98

Open
bertsky opened this issue Jan 10, 2020 · 8 comments

bertsky (Collaborator) commented Jan 10, 2020

The current formulation of the setup guide recommends running the Docker image separately for each processor CLI (translating native commands to docker calls). This is one possibility, but I would not recommend it first or exclusively.

The current (fat) Docker configuration is tailored specifically to running more than one processor within the same container. That is not only more efficient than spinning up a separate container for each tool, it also makes setting up a workflow much more convenient. You can "log in" to the container interactively, and, interactively or not, you currently have 3 options for running an entire workflow with multiple tools on your dataset:

  • use ocrd process
  • use a complex shell command
  • use workflow-configuration (ocrd-make)

The last of these already provides a number of sensible preconfigured workflows.
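To illustrate the first option, here is a minimal sketch of running several steps in one container with ocrd process. The image name ocrd/all:maximum, the file group names, the chosen processors and the helper name ocrd_workflow are assumptions for illustration and do not come from this thread; adapt them to your installation.

```shell
# Sketch only: run a multi-step workflow inside ONE container.
# Assumed (hypothetical) defaults: image name and helper name.
OCRD_IMAGE="${OCRD_IMAGE:-ocrd/all:maximum}"
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to preview the command

ocrd_workflow() {
    # $1: host directory containing the workspace (mets.xml)
    # remaining arguments: one quoted task per workflow step
    local workspace="$1"
    shift
    "$DOCKER" run --rm \
        -u "$(id -u)" \
        -v "$workspace:/data" \
        -w /data \
        "$OCRD_IMAGE" \
        ocrd process "$@"
}

# Example call (file groups and processors are placeholders):
# ocrd_workflow "$PWD" \
#     "cis-ocropy-binarize -I OCR-D-IMG -O OCR-D-BIN" \
#     "tesserocr-segment-region -I OCR-D-BIN -O OCR-D-SEG"
```

All steps share one container start, and --rm removes the container once the workflow has finished.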

This is what I would document in the guide. (And the section Running a small workflow also does not cover this, or the Docker option, yet.)

kba transferred this issue from OCR-D/docs May 11, 2020

EEngl52 (Collaborator) commented May 13, 2020

Thanks for reporting, @bertsky! In the meantime this was already solved by adding the Docker calls in the respective sections of the user guide, wasn't it? https://ocr-d.de/en/user_guide#calling-several-processors

bertsky (Collaborator, Author) commented May 13, 2020

> In the meantime this was already solved by adding the Docker calls in the respective sections of the user guide, wasn't it? https://ocr-d.de/en/user_guide#calling-several-processors

No, that's not the same thing!

(But notice that https://ocr-d.de/en/setup#translating-native-commands-to-docker-calls appears as a broken link in the setup guide – it should point to https://ocr-d.de/en/user_guide#translating-native-commands-to-docker-calls.)

IMO it does not make much sense (practically) to recommend/explain the Docker option by showing how to repeat each individual command/step with a docker run call. You could do that once in https://ocr-d.de/en/user_guide#calling-a-single-processor and once in https://ocr-d.de/en/user_guide#calling-several-processors (and/or in the setup guide).

But then you should also explain how to "log in" to a Docker container – spinning it up only once – and there do all the wonderful things explained everywhere in the "native" installation context, including ocrd workspace commands, single processor calls, workflow tools etc.
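A minimal sketch of what "logging in" could look like: start one interactive container with the data directory mounted, then work in it as in a native installation. The image name ocrd/all:maximum, the helper name ocrd_shell, the presence of bash in the image, and the file group names in the comments are assumptions, not something confirmed in this thread.

```shell
# Sketch only: open ONE interactive shell in the container.
# Assumed (hypothetical) defaults: image name and helper name.
OCRD_IMAGE="${OCRD_IMAGE:-ocrd/all:maximum}"
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to preview the command

ocrd_shell() {
    # $1: host directory to mount as /data (default: current directory)
    "$DOCKER" run --rm -it \
        -u "$(id -u)" \
        -v "${1:-$PWD}:/data" \
        -w /data \
        "$OCRD_IMAGE" \
        bash
}

# Inside the container, commands are typed as in a native installation, e.g.:
#   ocrd workspace validate
#   ocrd-dummy -I OCR-D-IMG -O OCR-D-DUMMY
```

Because the host directory is mounted at /data, results written there remain on the host after the shell exits and --rm removes the container.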

bertsky (Collaborator, Author) commented Jul 7, 2020

So to sum up:

  • drop all the "via Docker" command examples in the user guide (as they are impractical and redundant)
  • explain once (at the top of the user guide, and at the bottom of the setup guide) how to spin up Docker containers
  • and at that point: explain not only how to spin up one processor at a time (which is wasteful and impractical) but how to "log in" to a container and then do whatever you would do in native installation

kba assigned kba and EEngl52 and unassigned VolkerHartmann and EEngl52 Jul 7, 2020

bertsky (Collaborator, Author) commented Aug 14, 2020

Am I the only one unsatisfied with documentation of the Docker option?

dariok commented Sep 10, 2020

No, I second your recommendation of reworking the documentation in this regard. Users who are not familiar with docker will be surprised to see how quickly their drives will be cluttered after executing docker run 20 times. While many of us know enough about docker to solve this, it is safe to assume that a fair number of users are using docker for the first time because of OCR-D, so the documentation should be changed.

kba (Member) commented Sep 10, 2020

I hear you and we will improve docker documentation.

> Users who are not familiar with docker will be surprised to see how quickly their drives will be cluttered after executing docker run 20 times.

You can mitigate that problem by using the docker run --rm option which will remove the container after executing and occasionally running docker system prune.
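As a rough sketch of both measures: the first helper runs a single command in a throwaway container using --rm, the second lists stopped containers that earlier runs without --rm left behind, and the third wraps docker system prune. The helper names and the image name ocrd/all:maximum are assumptions for illustration.

```shell
# Sketch only: avoid and clean up leftover containers.
# Assumed (hypothetical) defaults: image name and helper names.
OCRD_IMAGE="${OCRD_IMAGE:-ocrd/all:maximum}"
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to preview the commands

ocrd_run_once() {
    # Run one command in a container that is removed on exit (--rm)
    "$DOCKER" run --rm -u "$(id -u)" -v "$PWD:/data" -w /data "$OCRD_IMAGE" "$@"
}

docker_leftovers() {
    # List stopped containers left behind by `docker run` without --rm
    "$DOCKER" ps --all --filter status=exited
}

docker_cleanup() {
    # Remove stopped containers, unused networks, dangling images and
    # build cache; asks for confirmation unless -f is passed through
    "$DOCKER" system prune "$@"
}
```

Note that docker system prune affects everything managed by the local Docker daemon, not only containers created for OCR-D.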

bertsky (Collaborator, Author) commented Sep 10, 2020

> You can mitigate that problem by using the docker run --rm option which will remove the container after executing

The user guide already shows this each time, but the setup guide does not.

> and occasionally running docker system prune.

Nice, I didn't know this (I used to make do with docker container prune and docker image prune). Anyway, please also put this somewhere in the setup guide.

dariok commented Sep 16, 2020

@kba: Thanks for replying!
As @bertsky said, docker run --rm is not mentioned in the setup guide. I have not checked the other pages but think that many people not used to working with docker may not spot the difference.

I myself prefer not to fire up a new container every time, but instead to name a container and docker attach to it.

Probably it’s easier not to include all docker hints on the individual pages but instead have a special page about the possible usage scenarios with docker. This could include mentioning docker run --rm and giving an intro about how to name a container and then later re-attach to it. That way, users can choose what is best in their situation.
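The naming scenario might look roughly like the sketch below: create a named container once (without --rm), resume it later, and remove it when it is no longer needed. The container name ocrd-session, the image name ocrd/all:maximum and the helper names are assumptions chosen for illustration.

```shell
# Sketch only: one named, reusable container instead of many throwaway ones.
# Assumed (hypothetical) defaults: image name, container name, helper names.
OCRD_IMAGE="${OCRD_IMAGE:-ocrd/all:maximum}"
OCRD_SESSION="${OCRD_SESSION:-ocrd-session}"
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to preview the commands

ocrd_session_create() {
    # $1: host directory to mount as /data (default: current directory)
    # No --rm here, so the container survives after the shell exits
    "$DOCKER" run -it --name "$OCRD_SESSION" \
        -u "$(id -u)" -v "${1:-$PWD}:/data" -w /data \
        "$OCRD_IMAGE" bash
}

ocrd_session_resume() {
    # Restart the stopped container and reconnect the terminal to it
    "$DOCKER" start --attach --interactive "$OCRD_SESSION"
}

ocrd_session_remove() {
    # Delete the container once it is no longer needed
    "$DOCKER" rm "$OCRD_SESSION"
}
```

If the container is still running (for example after detaching from it rather than exiting the shell), docker attach with the same name reconnects to it instead of docker start.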
