Example of Spark executable along with calling application. #1633

Closed
wants to merge 1 commit into from

Conversation

nmalfroy
Contributor

@nmalfroy nmalfroy commented Apr 1, 2024

Here's an example of:

  • A class that can be packaged as a jar and run from Spark; it just reads a Parquet directory, counts all of the rows, and logs the result
  • A class that invokes a Spark job.

I've tried to document it as well as I could, but I'm happy to help walk through it!

Note: to use, you need to be sure that you grant:

  • "Storage Blob Data Contributor" to the application running the Spark submit job (integration in this example) on the tdrsnpsintsaeastus storage account (which stores the executable jar file)
  • "Storage Blob Data Contributor" to the Synapse managed identity on the tdrsnpsintsaeastus storage account (which stores the executable jar file)
  • "Synapse Apache Spark Administrator" to the application running the Spark submit job (integration in this example) on the Synapse workspace (note: this last one has to be done from inside Synapse Studio)

Just run the "RunTheThingApp" with the environment variables from ./render-configs.sh -a integration -i and you should be good to go!
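
Since the run depends on environment variables rendered by render-configs.sh, a small pre-flight check can make a missing variable obvious before the job is submitted. The following is only a minimal sketch and is not code from this PR: the class name EnvPreflightSketch and the variable names starting with EXAMPLE_ are hypothetical placeholders, to be swapped for whatever names the rendered config actually exports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class EnvPreflightSketch {

    // Returns the names from `required` that are absent or blank in `env`,
    // preserving the order in which they were requested.
    static List<String> missingVariables(Map<String, String> env, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical variable names for illustration only; not taken from this PR.
        List<String> required =
            List.of("EXAMPLE_CLIENT_ID", "EXAMPLE_CLIENT_SECRET", "EXAMPLE_TENANT_ID");

        List<String> missing = missingVariables(System.getenv(), required);
        if (!missing.isEmpty()) {
            System.err.println("Missing environment variables: " + String.join(", ", missing));
            System.exit(1);
        }
        System.out.println("All required environment variables are set.");
    }
}
```

Taking the environment as a Map rather than calling System.getenv() inside the helper keeps the check easy to exercise with a hand-built map.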

You can then monitor the run from Synapse Studio in the Monitor > Apache Spark Applications blade:
image

And see outputs of the logs in the driver logs:
image

@snf2ye snf2ye closed this Oct 21, 2024
@snf2ye snf2ye deleted the nm-spark-test branch October 21, 2024 17:38