Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add documentation to dtm #29

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 37 additions & 3 deletions R/dtm.R
Original file line number Diff line number Diff line change
@@ -1,6 +1,28 @@
#' @param X an n by d matrix of coordinates of points used to construct the uniform
#' empirical measure for the distance to measure, where n is the number of points
#' and d is the dimension.
#'
#' @param Grid an m by d matrix of coordinates of points where the distance to measure
#' is computed, where m is the number of points in Grid and d is the dimension.
#'
#' @param m0 a numeric variable for the smoothing parameter of the distance to measure.
#' Roughly, m0 is the the percentage of points of X that are considered when the distance
#' to measure is computed for each point of Grid. The value of m0 should be in (0,1).
#'
#' @param r a numeric variable for the tuning parameter of the distance to measure.
#' The value of r should be in [1,∞), and the default value is 2.
#'
#' @param weight either a number, or a vector of length n. If it is a number, then same
#' weight is applied to each points of X. If it is a vector, weight represents weights of
#' each points of X. The default value is 1.
#'
#' @return a vector of length m (the number of points stored in Grid)
#' containing the value of the distance to measure function evaluated at each point of Grid.

dtm <-
function(X, Grid, m0, r = 2, weight = 1) {


# check that parameters X and Grid are both matrices of matching dimension
if (!is.numeric(X) && !is.data.frame(X)) {
stop("X should be a matrix of coordinates")
}
Expand All @@ -10,12 +32,18 @@ function(X, Grid, m0, r = 2, weight = 1) {
if (NCOL(X) != NCOL(Grid)) {
stop("dimensions of X and Grid do not match")
}

# ensure that smoothing parameter m0 is a value between 0 and 1
if (!is.numeric(m0) || length(m0) != 1 || m0 < 0 || m0 > 1) {
stop("m0 should be a number between 0 and 1")
}

# ensure tuning parameter is a number in [1,∞)
if (!is.numeric(r) || length(r) != 1 || r < 1) {
stop("r should be a number greater than or equal to 1")
}

# verify that weight is either constant or that it provides a correspondence with every point
if (!is.numeric(weight) ||
(length(weight) != 1 && length(weight) != NROW(X))) {
stop("weight should be either a number or a vector of length equals the number of sample")
Expand All @@ -24,29 +52,35 @@ function(X, Grid, m0, r = 2, weight = 1) {
# without weight
if (length(weight) == 1) {
X <- as.matrix(X)
weightBound <- m0 * NROW(X)
weightBound <- m0 * NROW(X)
# use fast nearest neighbor search algorithm to find distances to k nearest neighbors
knnDistance <- FNN::knnx.dist(
data = X, query = as.matrix(Grid), k = ceiling(weightBound),
algorithm = c("kd_tree"))
# utilize embedded Dtm function to find distance to measure
return (Dtm(knnDistance = knnDistance, weightBound = weightBound, r = r))

# with weight
} else {
# establish the weightbound and weight parameters to be used in final DtmWeight function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These and the next set are great comments. The code below is somewhat non-obvious and the comments make it very easy to read what is happening.

X0 <- as.matrix(X[weight != 0, , drop = FALSE])
weight0 <- weight[weight != 0]
weight0sort <- sort(weight0)
weightBound <- m0 * sum(weight0)
weightSumTemp <- 0
# add sorted weight values to a sum until that sum reaches weight bound
for (k0 in seq(along = weight0)) {
weightSumTemp <- weightSumTemp + weight0sort[k0]
if (weightSumTemp >= weightBound) {
break
}
}
# create a matrix of nearest neighbor indeces using the kd tree algorithm
knnDistanceIndex <- FNN::get.knnx(
data = X0, query = as.matrix(Grid), k = k0, algorithm = c("kd_tree"))
# use embedded DtmWeight function
return (DtmWeight(
knnDistance = knnDistanceIndex[["nn.dist"]], weightBound = weightBound,
r = r, knnIndex = knnDistanceIndex[["nn.index"]], weight = weight0))
}
}
}