Selects and returns the best-tuned model under CV.Vote.

cvVote(Y, X = NULL, trainIds, testIds, method = c("spacemap", "space"),
  tuneGrid, resPath = tempdir(), refit = TRUE, thresh = 0.5,
  iscale = TRUE, aszero = 1e-06, ...)

Arguments

Y

Numeric matrix \((N \times Q)\) containing N iid samples of the response vector \(\textbf{y}\).

X

Numeric matrix \((N \times P)\) containing N iid samples of the predictor vector \(\textbf{x}\).

trainIds

List of integer vectors, where each integer vector contains a split of training sample indices pertaining to Y, X.

testIds

List of integer vectors, where each integer vector contains a split of test sample indices pertaining to Y, X. Required to be of the same length as trainIds.
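The shape expected for trainIds and testIds can be sketched in base R, without any helper package. The sample size N, the number of splits K, and the name foldId are illustrative assumptions, not values taken from the package.

```r
# Toy sample size and number of splits (both hypothetical values).
N <- 20L
K <- 5L
set.seed(1L)

# Randomly assign each sample index to one of K folds of equal size.
foldId <- sample(rep(seq_len(K), length.out = N))

# testIds: one integer vector of held-out sample indices per split.
testIds <- lapply(seq_len(K), function(k) which(foldId == k))

# trainIds: the complement of each test split within 1..N.
trainIds <- lapply(testIds, function(s) setdiff(seq_len(N), s))

# Both lists must have the same length.
stopifnot(length(trainIds) == length(testIds))
```

Each element of trainIds pairs with the element of testIds at the same position, so the two lists should be built in the same order.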

method

Character string indicating which function performs network inference: spacemap when method = "spacemap", or space when method = "space". If X is non-null and method = "space", then space will infer (x--x, x--y, y--y) edges but only report (x--y, y--y) edges.

tuneGrid

Data.frame with named columns lam1, lam2, lam3 when method = "spacemap". Each row of the data.frame corresponds to a tuning parameter set that is input into spacemap. When method = "space", supply a data.frame with a single column named lam1.
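As a minimal sketch of the two grid layouts described for tuneGrid, the following builds one data.frame per method. The penalty values are placeholders chosen for illustration; suitable ranges depend on the data.

```r
# Grid for method = "spacemap": every combination of lam1, lam2 and lam3.
# Each row is one tuning parameter set.
tmap <- expand.grid(lam1 = c(65, 75),
                    lam2 = c(21, 35),
                    lam3 = c(10, 40))

# Grid for method = "space": a data.frame with a single column named lam1.
tsp <- data.frame(lam1 = c(65, 70, 75))

# Inspect the layout: rows are parameter sets, columns are penalties.
dim(tmap)
names(tsp)
```

Because expand.grid crosses all supplied values, the number of rows (and therefore of model fits per split) grows multiplicatively with the number of values per penalty.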

resPath

Character vector specifying the directory where each model fit is written to file through serialization by saveRDS. Defaults to a temporary directory that will be deleted at the end of the R session. It is recommended to specify a directory where results can be stored permanently.

refit

Logical indicating whether to refit the model after convergence to reduce the bias induced by the penalty terms. Defaults to TRUE. The refit step defaults to a ridge regression with a small penalty of 0.01 to encourage numerical stability. The user can change the ridge penalty by supplying the additional parameter refitRidge.

thresh

Numeric threshold between 0 and 1 (defaults to 0.5, i.e. majority vote) indicating the minimum proportion of trained models in which a given edge must appear to be reported in the final CV.Vote model. For example, if 0.5 is specified and there are 10 training splits, then an edge in the final model must be reported in at least 6 of the 10 training models.
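The voting rule behind thresh can be illustrated with toy adjacency matrices. This is a sketch only: the list fits is a hypothetical stand-in for the per-split models, and the strict comparison (more than thresh) is an assumption chosen to match the "6 of 10" example, not a statement of the package's internal code.

```r
set.seed(2L)
K <- 10L        # number of training splits
Q <- 4L         # number of response nodes (toy value)
thresh <- 0.5   # voting threshold

# Hypothetical y--y adjacency matrices, one per trained model.
fits <- lapply(seq_len(K), function(k) {
  a <- matrix(0L, Q, Q)
  a[upper.tri(a)] <- rbinom(Q * (Q - 1L) / 2L, size = 1L, prob = 0.6)
  a + t(a)  # symmetrize: edges are undirected
})

# Count how many of the K models report each edge.
votes <- Reduce(`+`, fits)

# Keep an edge when its vote proportion exceeds thresh
# (assumed strict inequality: with K = 10 this means 6 or more votes).
yyVote <- (votes / K > thresh) * 1L
```

Raising thresh toward 1 keeps only edges that recur in nearly every split, which gives a sparser final network.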

iscale

Logical indicating whether to standardize the whole input data. Defaults to TRUE. See base::scale(x, center = TRUE, scale = TRUE) for details of standardization.
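The effect of the standardization referenced above can be checked directly with base::scale on a toy matrix. The data here are simulated for illustration and are not part of the package.

```r
set.seed(3L)

# Toy response matrix: 10 samples, 5 variables, deliberately off-center.
Y <- matrix(rnorm(50, mean = 5, sd = 2), nrow = 10, ncol = 5)

# Center each column to mean 0 and scale it to unit standard deviation.
Ys <- scale(Y, center = TRUE, scale = TRUE)

# Column means are numerically zero and column standard deviations are one.
round(colMeans(Ys), 10)
apply(Ys, 2, sd)
```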

aszero

Positive numeric value (defaults to 1e-6) indicating at what point to consider extremely small parameter estimates of \(\Gamma\) and \(\rho\) as zero.
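To illustrate what an aszero cutoff does, the sketch below zeroes out tiny estimates in a toy vector. The vector rho is made up, and the non-strict comparison (at or below aszero) is an assumption about the boundary case rather than documented behaviour.

```r
aszero <- 1e-6

# Hypothetical parameter estimates, some of which are numerical noise.
rho <- c(0.31, -4e-8, 2e-7, -0.12, 0)

# Treat estimates whose magnitude is at or below aszero as exactly zero.
rhoClean <- ifelse(abs(rho) <= aszero, 0, rho)

# An edge is present only where the cleaned estimate is non-zero.
edge <- rhoClean != 0
edge
```

Without such a cutoff, estimates that differ from zero only by floating-point error would be counted as edges.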

...

Additional arguments for spacemap or space to change their default settings (e.g. setting tol = 1e-4).

Value

A list containing

  • A list called cvVote with two elements:

    • xy, an adjacency matrix whose \(xy(p,q)\) element is 1 for an edge between \(x_p\) and \(y_q\) and 0 otherwise; and

    • yy, an adjacency matrix whose \(yy(q,l)\) element is 1 for an edge between \(y_q\) and \(y_l\) and 0 otherwise.

  • minTune, a list containing the optimal tuning penalty set.

  • minIndex, an integer specifying the index of minTune in tuneGrid.

  • metricScores, a data.frame for input to tuneVis for inspecting the CV score curve and model size as a function of the tuning penalties.

Examples

library(spacemap)
data(sim1)
##########################
#DEFINE TRAINING/TEST SETS
library(caret)
#sample size
N <- nrow(sim1$X)
#number of folds
K <- 5L
set.seed(265616L)
#no special population structure, but create randomized dummy structure of A and B
testSets <- createFolds(y = sample(x = c("A", "B"), size = N, replace = TRUE), k = K)
trainSets <- lapply(testSets, function(s) setdiff(seq_len(N), s))
nsplits <- sapply(testSets, length)
##########################
#SPACE (Y input)
tsp <- expand.grid(lam1 = seq(65, 75, length = 3))
cvspace <- cvVote(Y = sim1$Y,
                  trainIds = trainSets, testIds = testSets,
                  method = "space", tuneGrid = tsp)
#> Computing CV scores over the grid...
##########################
# SPACEMAP (Y and X input)
tmap <- expand.grid(lam1 = seq(65, 75, length = 2),
                    lam2 = seq(21, 35, length = 2),
                    lam3 = seq(10, 40, length = 2))
cvsmap <- cvVote(Y = sim1$Y, X = sim1$X,
                 trainIds = trainSets, testIds = testSets,
                 method = "spacemap", tuneGrid = tmap)
#> Computing CV scores over the grid...