Skip to content

Conversation

jdblischak
Copy link
Collaborator

This PR updates the parallel vignette to better compare the results from sequential and parallel processing. We thank @michaelmayer2 for alerting us to the fact that the default multithreading of {data.table} made the sequential processing timings confusing.

I made the following updates:

  • Updated the variable names to reflect whether they were created by sequential or parallel processing
  • Two important code chunks with the results had set eval=FALSE. I replaced the results table with a histogram to demonstrate the difference in durations for the two different enrollment strategies. I used all.equal() to demonstrate that set.seed() worked as expected
  • I set data.table::setDTthreads(threads = 1) so that the sequential processing time was completely sequential. I didn't need to do any extra for the parallel processing because {data.table} automatically sets threads = 1 when it runs inside a forked process. Thus this will behave like we want when run in a non-interactive R session on a Linux machine (namely in GitHub Actions to produce the pkgdown page and on CRAN to produce the bundled vignette). The multithreading behavior gets more complex when run in Windows or from RStudio, but I don't think we need to worry about this

@jdblischak jdblischak requested review from nanxstats and cmansch April 2, 2025 20:22
@jdblischak jdblischak self-assigned this Apr 2, 2025
@jdblischak jdblischak changed the title Update parallel vignette Update parallel vignette to address {data.table} multithreading Apr 2, 2025
Copy link
Collaborator

@nanxstats nanxstats left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the suggestions and efforts to improve the correctness and readability of this parallel vignette - it's often a challenge because it's more convoluted than it should be. Now this is the definition of ✨clarity✨

@LittleBeannie LittleBeannie merged commit 27ad462 into Merck:main Apr 3, 2025
9 checks passed
@jdblischak jdblischak deleted the update-parallel-vignette branch April 4, 2025 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants