322 Notebook Executables #329

Open
wants to merge 24 commits into
base: master
Changes from 19 commits
Commits
24 commits
12ed2d7
Test
RavenMarQ Sep 12, 2024
334bac3
A untested prototype of understand_parse_dependencies
RavenMarQ Sep 26, 2024
33a293b
Completed Functions
RavenMarQ Oct 2, 2024
b5bf007
Update src.R
RavenMarQ Oct 2, 2024
b17d286
Working On Creating Notebook
RavenMarQ Oct 3, 2024
57150b4
Addressing the Code Review
RavenMarQ Oct 3, 2024
fbb6320
About-Completed Milestone
RavenMarQ Oct 4, 2024
654cac7
Final Touches
RavenMarQ Oct 9, 2024
3f3e425
i #308 Refactored understand_showcase.Rmd
beydlern Oct 10, 2024
f1d545b
Merge branch 'master' into 308-scitool-understand-parse
RavenMarQ Oct 11, 2024
857d3c4
After Merge with master Branch
RavenMarQ Oct 11, 2024
814b5a4
Documentation for the three functions
RavenMarQ Oct 11, 2024
ac36de2
Fixing mismatched name in Documentation
RavenMarQ Oct 11, 2024
9f999c1
i #308 Fixed Relative Paths in a Notebook
beydlern Oct 18, 2024
cc631fe
Merge branch 'master' into 308-scitool-understand-parse
carlosparadis Nov 12, 2024
085ea7e
Create Understand Notebook Exec
RavenMarQ Nov 12, 2024
8da6d22
Merge branch '308-scitool-understand-parse' into 322-creating-exec-fo…
RavenMarQ Nov 19, 2024
51c2b4c
Update understand.R
RavenMarQ Nov 19, 2024
06b76e0
Uploaded Completed executables
RavenMarQ Dec 6, 2024
4dacffe
Final Push
RavenMarQ Dec 10, 2024
b61974a
Merge branch 'master' into 322-creating-exec-for-notebooks
RavenMarQ Dec 11, 2024
114535e
Addressing Code Reviews
RavenMarQ Dec 11, 2024
bce7382
Delete .idea directory
RavenMarQ Dec 11, 2024
49e72aa
Merge branch '322-creating-exec-for-notebooks' of https://github.com/…
RavenMarQ Dec 11, 2024
3 changes: 3 additions & 0 deletions .idea/.gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 9 additions & 0 deletions .idea/kaiaulu.iml

6 changes: 6 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

6 changes: 6 additions & 0 deletions .idea/vcs.xml

3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -2,7 +2,7 @@ Package: kaiaulu
Type: Package
Title: Kaiaulu
Version: 0.0.0.9700
Description: Kaiaulu is an R package and common interface that helps with understanding evolving software development communities, and the artifacts (gitlog, mailing list, files, etc.) which developers collaborate and communicate about. See Paradis et al., (2012) <doi:10.1007/978-3-031-15116-3_6>.
Description: Kaiaulu is an R package and common interface that helps with understanding evolving software development communities, and the artifacts (gitlog, mailing list, files, etc.) which developers collaborate and communicate about. See Paradis et al., (2012) <doi:10.1007/978-3-031-15116-3_6>.
Authors@R: c(
person('Carlos', 'Paradis', role = c('aut', 'cre'),
email = '[email protected]',
@@ -21,6 +21,7 @@ Authors@R: c(
person('Anthony', 'Lau', role = c('ctb')),
person('Sean', 'Sunoo', role = c('ctb')),
person('Ian Jaymes', 'Iwata', role= c('ctb')),
person('Raven', 'Quiddaoen', role= c('ctb')),
person('Nicholas', 'Beydler', role = c('ctb')),
person('Mark', 'Burgess', role = c('ctb'))
)
3 changes: 3 additions & 0 deletions NAMESPACE
@@ -3,6 +3,7 @@
export(annotate_src_text)
export(assign_exact_identity)
export(bipartite_graph_projection)
export(build_understand_project)
export(commit_message_id_coverage)
export(community_oslom)
export(convert_pipermail_to_mbox)
@@ -189,6 +190,7 @@ export(parse_r_dependencies)
export(parse_r_function_definition)
export(parse_r_function_dependencies)
export(parse_rfile_ast)
export(parse_understand_dependencies)
export(query_src_text)
export(query_src_text_class_names)
export(query_src_text_namespace)
@@ -214,6 +216,7 @@ export(transform_gitlog_to_temporal_network)
export(transform_r_dependencies_to_network)
export(transform_reply_to_bipartite_network)
export(transform_temporal_gitlog_to_adsmj)
export(transform_understand_dependencies_to_network)
export(weight_scheme_count_deleted_nodes)
export(weight_scheme_cum_temporal)
export(weight_scheme_pairwise_cum_temporal)
1 change: 1 addition & 0 deletions NEWS.md
@@ -3,6 +3,7 @@ __kaiaulu 0.0.0.9700 (in development)__

### NEW FEATURES

* `build_understand_project(project_path, language, output_dir)`, `parse_understand_dependencies(understand_dir, parse_type)`, and `transform_understand_dependencies_to_network(parsed, weight_types)` have been added. These functions build an Understand project and create node and edge tables from the XML data generated by SciTools Understand. [#308](https://github.com/sailuh/kaiaulu/issues/308)
* The GitHub API has been expanded to use refresh, along with other functions. `github_api_project_issue_search` has been added that makes the search/issues endpoint API calls. `github_api_project_issue_or_pr_comments_by_date` and `github_api_project_issue_by_date` have been added to download issue data and comments by date ranges. `github_parse_search_issues_refresh` has been added that parses the issue data downloaded from the search endpoint in the refresh_issues folder. `github_api_project_issue_refresh` and `github_api_project_issue_or_pr_comment_refresh` were added to download issue data or comments respectively that have not already been downloaded. `format_created_at_from_file` was added to retrieve the greatest date from a JSON file. See the Reference Docs on GitHub section for more details. [#282](https://github.com/sailuh/kaiaulu/issues/282)
* `config.R` now contains a set of getter functions used to centralize the gathering of configuration data and these getter functions are used to refactor configuration file information gathering. For example, loading configuration file information with variable assignment is as follows `git_repo_path <- config_file[["version_control"]][["log"]]` but refactoring with a config.R getter function becomes `git_repo_path <- get_git_repo_path(config_file)`. [#230](https://github.com/sailuh/kaiaulu/issues/230)
* `refresh_jira_issues()` has been added. It is a wrapper function for the previous downloader and downloads only issues greater than the greatest key already downloaded. [#275](https://github.com/sailuh/kaiaulu/issues/275)
160 changes: 160 additions & 0 deletions R/src.R
@@ -4,8 +4,141 @@
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at https://mozilla.org/MPL/2.0/.

############## Understand Project Builder ##############
#' Build Understand Analysis Folder
#'
#' @description This function builds the Understand project (database) for the source files in project_path that are written in the given language, and stores it in output_dir
#'
#' @param project_path path to the project folder to analyze
#' @param language the primary language of the project (language must be supported by Understand)
#' @param output_dir path to output directory (formatted output_path/)
#' @export
#' @family parsers
build_understand_project <- function(project_path, language, output_dir){
  # Create variables for command line
  command <- "und"
  # system2() does not quote its arguments, so quote both paths in case they contain spaces
  project_path <- shQuote(project_path)
  db_dir <- shQuote(file.path(output_dir, "Understand.und"))

  # Build the Understand project using Understand's und command
  # Derived from pg. 352 in https://documentation.scitools.com/pdf/understand.pdf Sept. 2024 Edition
  # 1. Create the Understand database for the given language
  args <- c("create", "-db", db_dir, "-languages", language)
  system2(command, args)
  # 2. Add the project source files to the database
  args <- c("-db", db_dir, "add", project_path)
  system2(command, args)
  # 3. Analyze the added files
  args <- c("analyze", db_dir)
  system2(command, args)
}

############## Parsers ##############
#' Parse Built Folder to Network
#'
#' @description This function parses the data in the Understand build folder to export the parse_type dependencies into a network
#'
#' @param understand_dir path to the built Understand project folder used in \code{\link{build_understand_project}}
#' @param parse_type Type of dependencies to generate into xml (either "file" or "class")
#' @export
#' @family parsers
parse_understand_dependencies <- function(understand_dir, parse_type = c("file", "class")) {
  # Before running, check that parse_type is valid
  parse_type <- match.arg(parse_type)

  # Create the variables used in command lines
  db_dir <- file.path(understand_dir, "Understand.und")
  file_name <- paste0(parse_type, "Dependencies.xml")
  xml_dir <- file.path(db_dir, file_name) # Path of the exported XML file

  # Generate the XML file
  # Derived from pg. 352 in https://documentation.scitools.com/pdf/understand.pdf Sept. 2024 Edition
  # system2() does not quote its arguments, so quote the paths in case they contain spaces
  args <- c("export", "-dependencies", parse_type, "cytoscape", shQuote(xml_dir), shQuote(db_dir))
  system2("und", args)

  # Generated XML file is assumed to be in this approximate format (regardless of parse_type) using Understand Build 1202
  # <graph ...>
  #   ... [Irrelevant graph attributes and rdf grandchildren]
  #   <node id="67" label="ObjectMapper id:67">
  #     <att type="string" name="node.shape" value="rect"/>
  #     <att type="string" name="node.fontSize" value="5"/>
  #     <att type="string" name="node.label" value="ObjectMapper"/>
  #     <att type="string" name="longName" value="com.fasterxml.jackson.databind.ObjectMapper"/>
  #     <att type="string" name="kind" value="Unknown Class"/>
  #     <graphics type="RECTANGLE" h="35" w="35" x="0" y="0" fill="#ffffff" width="1" outline="#000000" cy:nodeTransparency="1.0" cy:nodeLabelFont="Default-0-8" cy:borderLineType="solid"/>
  #   </node>
  #   ... [Other nodes sharing the format]
  #   <edge source="2" target="9" label="App(Depends On)CalculatorUI">
  #     <att type="string" name="edge.targetArrowShape" value="ARROW"/>
  #     <att type="string" name="edge.color" value="#0000FF"/>
  #     <att type="string" name="canonicalName" value="App(Depends On)CalculatorUI"/>
  #     <att type="string" name="interaction" value="Depends On"/>
  #     <att type="string" name="dependency kind" value="Call, Create"/>
  #   </edge>
  #   ... [Other edges sharing the format]


  # Parse the XML file
  xml_data <- xmlParse(xml_dir) # Creates pointer to file
  xml_nodes <- xmlRoot(xml_data) # Finds the head: graph
  xml_nodes <- xmlChildren(xml_nodes)
  # xml_nodes now contains the nodes and edges (which were children of graph) and also graph's atts

  # From the child nodes, filter for those with name "node"
  # Create a list by iterating through all the children in xml_nodes
  node_elements <- lapply(xml_nodes, function(child) {
    if (xmlName(child) == "node") { # We're searching for nodes, not att or edges
      id <- xmlGetAttr(child, "id") # Extract the id from the node line
      att_nodes <- xmlChildren(child) # To access the atts of the node
      node_label <- xmlGetAttr(att_nodes[[3]], "value") # Relevant att is the 3rd child (node.label)
      long_name <- xmlGetAttr(att_nodes[[4]], "value") # Relevant att is the 4th child (longName)
      return(data.table(node_label = node_label, id = id, long_name = long_name)) # Returns the table containing the filtered node data
    } else {
      return(NULL) # Return NULL for the entry to be filtered out later
    }
  })

  # Remove NULLs and combine the results from the node_elements list
  node_list <- rbindlist(node_elements[!sapply(node_elements, is.null)], use.names = TRUE, fill = TRUE)

  # From the child nodes, filter for those with name "edge"
  # Create a list by iterating through all the children in xml_nodes
  edge_elements <- lapply(xml_nodes, function(child) {
    if (xmlName(child) == "edge") { # We're searching for edges, not att or nodes
      # Extract the id_from and id_to from the edge line
      id_from <- xmlGetAttr(child, "source")
      id_to <- xmlGetAttr(child, "target")
      att_nodes <- xmlChildren(child) # To access the atts of the edge
      dependency_kind <- xmlGetAttr(att_nodes[[5]], "value") # Relevant att is the 5th child (dependency kind)
      # Skip edges whose dependency kind is NULL or empty: these occur even in well-formed
      # exports, and splitting them would raise an error
      if (!is.null(dependency_kind) && dependency_kind != "") {
        dependency_kind <- unlist(stri_split(dependency_kind, regex = ",\\s*")) # Separates the string into a vector
        return(data.table(id_from = id_from, id_to = id_to, dependency_kind = dependency_kind)) # Returns the table containing the filtered edge data
      } else {
        return(NULL) # Return NULL for the entry to be filtered out later
      }
    } else {
      return(NULL) # Return NULL for the entry to be filtered out later
    }
  })

  # Remove NULLs and combine the results from the edge_elements list
  edge_list <- rbindlist(edge_elements[!sapply(edge_elements, is.null)], use.names = TRUE, fill = TRUE)

  # Merge edges with nodes to get label_from
  edge_list <- merge(edge_list, node_list[, .(id, node_label)], by.x = "id_from", by.y = "id", all.x = TRUE)
  setnames(edge_list, "node_label", "label_from")

  # Merge again to get label_to
  edge_list <- merge(edge_list, node_list[, .(id, node_label)], by.x = "id_to", by.y = "id", all.x = TRUE)
  setnames(edge_list, "node_label", "label_to")

  # Reorder columns to have label_from and label_to on the left
  edge_list <- edge_list[, .(label_from, label_to, id_from, id_to, dependency_kind)]

  # Create a list of the network to return
  graph <- list(node_list = node_list, edge_list = edge_list)
  return(graph)
}

#' Parse dependencies from Depends
#'
@@ -214,6 +347,33 @@ parse_r_dependencies <- function(folder_path){
}

############## Network Transform ##############
#' Transform Understand Dependencies
#'
#' @description This function subsets the edges of a parsed network from \code{\link{parse_understand_dependencies}} to the requested dependency kinds
#'
#' @param parsed Parsed list (node_list and edge_list) from \code{\link{parse_understand_dependencies}}
#' @param weight_types The dependency kinds to keep, as reported by Understand (e.g. "Call", "Create"). Accepts a single string or a vector
#' @export
#' @family edgelists
transform_understand_dependencies_to_network <- function(parsed, weight_types) {

  nodes <- parsed[["node_list"]]
  edges <- parsed[["edge_list"]]

  # Keep only edges whose dependency kind is in weight_types, if any were provided
  if (length(weight_types) > 0) {
    edges <- edges[dependency_kind %in% weight_types]
  }

  # Stop if the filter removed all edges
  if (nrow(edges) == 0) {
    stop("No edges found under weight_types.")
  }

  # Create a list to return
  graph <- list(node_list = nodes, edge_list = edges)
  return(graph)
}

#' Transform parsed dependencies into a network
#'
42 changes: 42 additions & 0 deletions exec/rdependencies.R
@@ -0,0 +1,42 @@
#!/usr/local/bin/Rscript

require(kaiaulu, quietly = TRUE)
require(cli, quietly = TRUE)
require(XML, quietly = TRUE)
require(stringi, quietly = TRUE)
require(data.table, quietly = TRUE)

doc <- "
USAGE:
  rdependencies.R (-h | --help)
  rdependencies.R --version
  rdependencies.R parse help
  rdependencies.R parse <folder_path> <save_path>

DESCRIPTION:
  Analyzes a folder with R project files to return dependencies
"

arguments <- docopt::docopt(doc, version = 'Kaiaulu 0.0.0.9700')

# Read the positional arguments unless only help text was requested
if (!arguments[["help"]]) {
  folder_path <- arguments[["<folder_path>"]]
  save_path <- arguments[["<save_path>"]]
}

# Determine which function to run and save output
if (arguments[["parse"]] && arguments[["help"]]) {
  cli::cli_alert_info("Analyzes dependencies using parse_r_dependencies() and saves them to the csv file specified in <save_path>")
} else if (arguments[["parse"]]) {
  result <- parse_r_dependencies(folder_path)
  data.table::fwrite(result, save_path)
  cli::cli_alert_success(paste0("Dependencies table was saved at: ", save_path))
} else if (arguments[["-h"]] || arguments[["--help"]]) {
  cli::cli_alert_info(doc)
} else if (arguments[["--version"]]) {
  cli::cli_alert_info('Kaiaulu 0.0.0.9700')
} else {
  stop("No/invalid option(s) provided.")
}
68 changes: 68 additions & 0 deletions exec/understand.R
@@ -0,0 +1,68 @@
#!/usr/local/bin/Rscript

require(kaiaulu, quietly = TRUE)
require(cli, quietly = TRUE)
require(XML, quietly = TRUE)
require(stringi, quietly = TRUE)
require(data.table, quietly = TRUE)

doc <- "
USAGE:
  understand.R (-h | --help)
  understand.R --version
  understand.R build help
  understand.R build <config_filepath>
  understand.R parse help
  understand.R parse <config_filepath> <save_path> [--class | --file]

DESCRIPTION:
  Builds then analyzes a project using SciTools Understand for dependencies between either classes or files.

OPTIONS:
  --class  parses class-level dependencies
  --file   parses file-level dependencies
"

arguments <- docopt::docopt(doc, version = 'Kaiaulu 0.0.0.9700')

# The config file is only needed when build or parse is run without help
run_command <- (arguments[["build"]] || arguments[["parse"]]) && !arguments[["help"]]

if (run_command) {
  conf <- parse_config(arguments[["<config_filepath>"]])
  project_path <- get_understand_project_path(conf)
  understand_folder <- get_understand_output_path(conf)
  code_language <- get_understand_code_language(conf)

  # Ensure the Understand output directory exists
  if (!dir.exists(understand_folder)) {
    dir.create(understand_folder, recursive = TRUE)
  }
}

# Determine which function to run and save output
if (arguments[["build"]] && arguments[["help"]]) {
  cli::cli_alert_info("Builds an analysis of the project in the designated project_path set in the <config_filepath> using build_understand_project()")
} else if (arguments[["parse"]] && arguments[["help"]]) {
  cli::cli_alert_info("From the built analysis, parses the dependency types of either files or classes (depending on the supplied flag) using parse_understand_dependencies() and saves two tables at the specified folder: <save_path>")
} else if (arguments[["build"]]) {
  build_understand_project(project_path = project_path, language = code_language, output_dir = understand_folder)
  cli::cli_alert_success("Project successfully built.")
} else if (arguments[["parse"]]) {
  # Default to file-level dependencies when neither --class nor --file is supplied
  parse_type <- if (arguments[["--class"]]) "class" else "file"
  result <- parse_understand_dependencies(understand_dir = understand_folder, parse_type = parse_type)
  save_path <- arguments[["<save_path>"]]
  node_file <- file.path(save_path, "node.csv")
  edge_file <- file.path(save_path, "edge.csv")
  data.table::fwrite(result$node_list, node_file)
  data.table::fwrite(result$edge_list, edge_file)
  cli::cli_alert_success(paste0("Node and edge tables were saved at: ", save_path))
} else if (arguments[["-h"]] || arguments[["--help"]]) {
  cli::cli_alert_info(doc)
} else if (arguments[["--version"]]) {
  cli::cli_alert_info('Kaiaulu 0.0.0.9700')
} else {
  stop("No/invalid option(s) provided.")
}
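
Reviewer note: the edge handling in `parse_understand_dependencies()` and `transform_understand_dependencies_to_network()` can be tried without Understand installed. The sketch below is only an illustration in base R, not the code in this PR: the `raw_edges` table is hypothetical sample data modelled on the XML comment in `R/src.R`, and `strsplit()` stands in for `stri_split()`. It expands each comma-separated `dependency kind` string into one row per kind, then keeps only the requested kinds.

```r
# Hypothetical edges as they appear in the exported XML: one row per edge,
# with the "dependency kind" attribute still a comma-separated string.
raw_edges <- data.frame(
  label_from = c("App", "CalculatorUI"),
  label_to = c("CalculatorUI", "Calculator"),
  dependency_kind = c("Call, Create", "Use"),
  stringsAsFactors = FALSE
)

# Split each comma-separated string into its kinds (base R stand-in for stri_split)
kinds <- strsplit(raw_edges$dependency_kind, ",\\s*")

# Repeat each edge once per kind so every row carries exactly one dependency kind
row_index <- rep(seq_len(nrow(raw_edges)), lengths(kinds))
edge_list <- raw_edges[row_index, c("label_from", "label_to")]
edge_list$dependency_kind <- unlist(kinds)
rownames(edge_list) <- NULL

# Keep only the requested kinds, mirroring the weight_types filter
weight_types <- c("Call", "Create")
filtered <- edge_list[edge_list$dependency_kind %in% weight_types, , drop = FALSE]

print(filtered)
```

Keeping one dependency kind per row is what lets the `weight_types` filter be a plain `%in%` test on a single column.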