This repository was archived by the owner on Oct 12, 2023. It is now read-only.

doAzureParallel Guide

This guide explains how Azure works, how to make the best use of it, and best practices for using the doAzureParallel package.

  1. Azure Introduction (link)

    Using Azure Batch

  2. Getting Started (link)

    Using the Getting Started guide to create credentials

    i. Generate Credentials Script (link)

    • Pre-built bash script for getting Azure credentials without Azure Portal

    ii. National Cloud Support (link)

    • How to run workloads in Azure national clouds
  3. Customize Cluster (link)

    Setting up your cluster to suit your specific needs

    i. Virtual Machine Sizes (link)

    • How do you choose the best VM type/size for your workload?

    ii. Autoscale (link)

    • Automatically scale up/down your cluster to save time and/or money.

    iii. Building Containers (link)

    • Creating your own Docker containers for reproducibility
  4. Managing Cluster (link)

    Managing your cluster's lifespan

  5. Customize Job

    Setting up your job to suit your specific needs

    i. Asynchronous Jobs (link)

    • Best practices for managing long-running jobs

    ii. Foreach Azure Options (link)

    • Use the package-defined Azure foreach options to improve performance and user experience

    iii. Error Handling (link)

    • How Azure handles errors in your foreach loop
  6. Package Management (link)

    Best practices for managing your R packages in code. This includes installation at the cluster or job level as well as how to use different package providers.

  7. Storage Management

    i. Distributing your Data (link)

    • Best practices and limitations for working with distributed data.

    ii. Persistent Storage (link)

    • Taking advantage of persistent storage for long-running jobs

    iii. Accessing Azure Storage through R (link)

    • Manage your Azure Storage files via R
  8. Performance Tuning (link)

    Best practices for optimizing your foreach loop

  9. Debugging and Troubleshooting (link)

    Best practices for diagnosing common issues

  10. Azure Limitations (link)

    Learn about the limitations around the size of your cluster and the number of foreach jobs you can run in Azure.

Additional Documentation

Read our FAQ for known issues and common questions.