
Risers Fatigue Analysis workflow implemented in Spark.


hpcdb/RFA-Spark


Risers Fatigue Analysis Synthetic - Spark

This repository contains a synthetic implementation, on the Apache Spark framework, of the Riser Fatigue Analysis (RFA) scientific workflow, based on a real case study in the Oil and Gas domain. The implementation uses the Process library natively available in Spark to call external black-box applications.
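The core pattern here is a Spark task shelling out to an external black-box application. As an illustration only (the actual repository code may differ), the sketch below shows that pattern in plain Java with `ProcessBuilder`, using `echo` as a stand-in for the real binary; in the workflow, a call like this would sit inside a Spark map or filter function.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Hypothetical sketch: invoking an external black-box application from
// within a task. `echo` stands in for the real RFA binary.
public class BlackBoxCall {
    static String runExternal(String... command) throws Exception {
        Process p = new ProcessBuilder(command)
                .redirectErrorStream(true)  // merge stderr into stdout
                .start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) out.append(line).append('\n');
        }
        int exit = p.waitFor();
        if (exit != 0) throw new RuntimeException("external app failed: " + exit);
        return out.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runExternal("echo", "task-output"));
    }
}
```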

Content

Risers Fatigue Analysis synthetic workflow

The Risers Fatigue Analysis (RFA) workflow is a real case study from the Oil and Gas domain. It is composed of seven activities that receive input tuples, perform complex calculations on them, and transform them into resulting output tuples.

(Figure: the activities of the RFA synthetic workflow.)
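The shape of the workflow can be sketched as a pipeline of standard transformations. The snippet below is an illustration using plain Java streams, not the actual Spark code: an uncompress step fans each input tuple out, two map-style analyses transform tuples, two filter steps discard a fraction of them, and a final reduce-style step compresses the results (the per-step operations here are placeholders).

```java
import java.util.List;
import java.util.stream.IntStream;

// Illustrative sketch of the RFA tuple flow (placeholder computations).
public class RfaShape {
    public static long run(List<Integer> input, int splitFactor) {
        return input.stream()
                .flatMap(t -> IntStream.range(0, splitFactor).boxed()) // Uncompress
                .map(t -> t + 1)                                       // Pre-Processing
                .map(t -> t * 2)                                       // Analyze Risers
                .filter(t -> t % 2 == 0)                               // Calculate Wear and Tear
                .filter(t -> t >= 0)                                   // Analyze Position
                .count();                                              // Compress Results
    }

    public static void main(String[] args) {
        // 2 input tuples, each fanned out by a split factor of 4
        System.out.println(run(List.of(1, 2), 4));
    }
}
```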

How to Run

Dependencies:

  • Apache Spark (the run script expects SPARK_HOME to point at an installation)

Setup and configuration:

Clone repository:

$ git clone https://github.com/hpcdb/RFA-Spark.git
$ cd RFA-Spark

Edit the input file:

$ vi input.dataset
  • Example:
ID;SPLITMAP;SPLITFACTOR;MAP1;MAP2;FILTER1;F1;FILTER2;F2;REDUCE;REDUCEFACTOR
1;5;8;5;5;5;50;5;50;5;4
  • Fields:
    • ID: Entry identifier
    • SPLITMAP: Average task cost of the Uncompress activity (seconds)
    • SPLITFACTOR: Number of entries in the input dataset after uncompression
    • MAP1: Average task cost of the Pre-Processing activity (seconds)
    • MAP2: Average task cost of the Analyze Risers activity (seconds)
    • FILTER1: Average task cost of the Calculate Wear and Tear activity (seconds)
    • F1: Percentage of entries the Calculate Wear and Tear activity lets through (i.e., the percentage that continues in the flow)
    • FILTER2: Average task cost of the Analyze Position activity (seconds)
    • F2: Percentage of entries the Analyze Position activity lets through (i.e., the percentage that continues in the flow)
    • REDUCE: Average task cost of the Compress Results activity (seconds)
    • REDUCEFACTOR: Number of compressed output entries
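As an illustration of the field layout above, here is a hypothetical parser (not taken from the repository) that splits one semicolon-separated entry of input.dataset into its eleven integer fields; only a few accessors are shown.

```java
import java.util.Arrays;

// Hypothetical sketch: parsing one input.dataset line, whose fields are
// ID;SPLITMAP;SPLITFACTOR;MAP1;MAP2;FILTER1;F1;FILTER2;F2;REDUCE;REDUCEFACTOR
public class InputEntry {
    final int[] fields;

    InputEntry(String line) {
        fields = Arrays.stream(line.split(";"))
                       .mapToInt(Integer::parseInt)
                       .toArray();
    }

    int id()          { return fields[0]; }
    int splitFactor() { return fields[2]; }  // entries produced by Uncompress
    int f1Percent()   { return fields[6]; }  // % passing Calculate Wear and Tear
    int f2Percent()   { return fields[8]; }  // % passing Analyze Position

    public static void main(String[] args) {
        InputEntry e = new InputEntry("1;5;8;5;5;5;50;5;50;5;4");
        System.out.println(e.id() + " " + e.splitFactor() + " " + e.f1Percent());
    }
}
```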

Run

  • Set the SPARK_HOME environment variable:
$ export SPARK_HOME=/path/to/spark
  • Change directory to RFA-Spark home:
$ cd RFA-Spark
  • Run:
$ ./run.sh <spark-master-url> <num-executors> <total-executor-cores>

Where:

  • spark-master-url: The master URL for the cluster.

  • num-executors: Number of Apache Spark executors requested on the cluster.

  • total-executor-cores: Total number of cores requested on the cluster.

  • Example:

$ ./run.sh spark://hostname:7077 1 2

Source Code

The source code lives in the rfa-spark-project directory of this repository.

How to Build

Build Dependencies:

  • Apache Maven

Build

  • Change directory to rfa-spark-project:
$ cd RFA-Spark/rfa-spark-project
  • Build with Maven:
$ mvn package
