Title: | A Fast Algorithm for Sparse Precision Matrix Estimation |
---|---|
Description: | It implements an improved and computationally faster version of the original Stepwise Gaussian Graphical Algorithm for estimating the Omega precision matrix from high-dimensional data. Zamar, R., Ruiz, M., Lafit, G. and Nogales, J. (2021) <doi:10.52933/jdssv.v1i2.11>. |
Authors: | Juan G. Colonna [cre, aut] , Marcelo Ruiz [aut] |
Maintainer: | Juan G. Colonna |
License: | MIT + file LICENSE |
Version: | 0.1.2 |
Built: | 2024-11-23 04:03:10 UTC |
Source: | https://github.com/juancolonna/faststepgraph |
cv.FastStepGraph
Implements cross-validation for the Fast Step Graph algorithm.
cv.FastStepGraph( x, alpha_f_min, alpha_f_max, n_folds = 10, b_coef = 0.5, n_alpha = 20, nei.max = 5, data_scale = FALSE, data_shuffle = TRUE, max.iterations = NULL, return_model = FALSE, parallel = FALSE, n_cores = NULL )
x |
Data matrix (of size n x p). |
alpha_f_min |
Minimum alpha_f value for the cross-validation procedure (for example, 0.1). |
alpha_f_max |
Maximum alpha_f value for the cross-validation procedure (for example, 0.9). |
n_folds |
Number of folds for the cross-validation procedure (default value 10). This parameter also accepts the string 'LOOCV' to perform Leave-One-Out cross-validation. |
b_coef |
Coefficient of the empirical rule alpha_b = b_coef*alpha_f. During the initial search for the optimal alpha_f, alpha_b is tied to alpha_f through this rule; once the optimal alpha_f has been found, alpha_b is varied to find its own optimal value. The default value of b_coef is 0.5. |
n_alpha |
Number of elements in the grid for the cross-validation (default value 20). |
nei.max |
Maximum number of variables in every neighborhood (default value 5). |
data_scale |
Boolean parameter (TRUE or FALSE) indicating whether to scale the data to zero mean and unit variance (default FALSE). |
data_shuffle |
Boolean parameter (default TRUE) indicating whether the samples (rows of x) are randomly shuffled. |
max.iterations |
Maximum number of iterations (integer). The default value is p*(p-1). |
return_model |
Default FALSE. If set to TRUE, FastStepGraph is called at the end of cross-validation with the optimal parameters alpha_f and alpha_b, and the fitted model is returned together with the cross-validation results. |
parallel |
Boolean parameter (TRUE or FALSE) indicating whether to run the cross-validation in parallel on a multicore architecture (default FALSE). |
n_cores |
An integer specifying the number of cores to use when parallel=TRUE. If n_cores is not specified, it is set automatically to the number of cores on the machine minus one. |
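The arguments above describe a grid search evaluated over cross-validation folds. The following base-R sketch shows how such a grid and fold assignment could be built. It is not the package's internal code: the evenly spaced grid and the names `alpha_grid`, `stage1`, and `fold_id` are assumptions made for illustration.

```r
# Illustrative sketch only; not the package's internal code.
set.seed(1)
n <- 30                      # number of samples (rows of x)
alpha_f_min <- 0.1
alpha_f_max <- 0.9
n_alpha <- 20
b_coef <- 0.5
n_folds <- 10                # may also be the string "LOOCV"

# Assumption: the grid is evenly spaced between the two bounds.
alpha_grid <- seq(alpha_f_min, alpha_f_max, length.out = n_alpha)

# First stage: each candidate alpha_f is paired with alpha_b = b_coef * alpha_f.
# (The second stage, which varies alpha_b for the best alpha_f, is not shown.)
stage1 <- data.frame(alpha_f = alpha_grid, alpha_b = b_coef * alpha_grid)

# Leave-one-out is the special case with one fold per sample.
k <- if (identical(n_folds, "LOOCV")) n else n_folds

# Shuffle the rows (as with data_shuffle = TRUE), then deal them into k folds.
row_order <- sample(n)
fold_id <- integer(n)
fold_id[row_order] <- rep_len(seq_len(k), n)

head(stage1, 3)
table(fold_id)
```

With 30 samples and 10 folds, every fold holds three rows; with `n_folds <- "LOOCV"` the same code yields 30 folds of one row each.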
A list with the values:
alpha_f_opt |
the optimal alpha_f value. |
alpha_b_opt |
the optimal alpha_b value. |
CV.loss |
minimum loss. |
If return_model=TRUE, then also returns:
vareps |
Response variables. |
beta |
Regression coefficients. |
Edges |
Estimated set of edges. |
Omega |
Estimated precision matrix. |
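In a Gaussian graphical model, two variables are joined by an edge exactly when the corresponding off-diagonal entry of the precision matrix is nonzero. The sketch below recovers an edge list from a small hand-written matrix. `Omega_toy` and `edge_list` are hypothetical names, and the layout of the `Edges` element returned by the package is not described in this manual, so it may differ from the two-column form used here.

```r
# Toy 4 x 4 precision matrix (hypothetical values, chain graph 1-2-3-4).
Omega_toy <- matrix(c( 1.2, -0.4,  0.0,  0.0,
                      -0.4,  1.4, -0.4,  0.0,
                       0.0, -0.4,  1.4, -0.4,
                       0.0,  0.0, -0.4,  1.2), nrow = 4, byrow = TRUE)

# Nonzero entries above the diagonal correspond to edges.
is_edge <- upper.tri(Omega_toy) & (Omega_toy != 0)
edge_list <- which(is_edge, arr.ind = TRUE)   # columns "row" and "col"
edge_list <- edge_list[order(edge_list[, "row"]), , drop = FALSE]
edge_list
```

For an estimated matrix, an exact comparison with zero is only appropriate if the estimator returns exact zeros; otherwise a small tolerance would be needed.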
Prof. Juan G. Colonna, PhD.
Prof. Marcelo Ruiz, PhD.
data <- FastStepGraph::SigmaAR(30, 50, 0.4) # Simulate Gaussian data
res <- FastStepGraph::cv.FastStepGraph(data$X, alpha_f_min = 0.1, alpha_f_max = 0.9, data_scale = TRUE)
FastStepGraph
Improved and faster implementation of the Stepwise Gaussian Graphical Algorithm.
FastStepGraph( x, alpha_f, alpha_b = NULL, nei.max = 5, data_scale = FALSE, max.iterations = NULL )
x |
Data matrix (of size n_samples x p_variables). |
alpha_f |
Forward threshold (no default value). |
alpha_b |
Backward threshold. If alpha_b=NULL, then the rule alpha_b <- 0.5*alpha_f is applied. |
nei.max |
Maximum number of variables in every neighborhood (default value 5). |
data_scale |
Boolean parameter (TRUE or FALSE) indicating whether to scale the data to zero mean and unit variance (default FALSE). |
max.iterations |
Maximum number of iterations (integer). The default value is p*(p-1). |
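The defaults described above can be summarized in a few lines of base R. `resolve_defaults` is a hypothetical helper written for this illustration, not a function of the package; it only mirrors the documented rules (alpha_b = 0.5*alpha_f, max.iterations = p*(p-1), optional standardization).

```r
# Hypothetical helper mirroring the documented defaults; not package code.
resolve_defaults <- function(x, alpha_f, alpha_b = NULL,
                             data_scale = FALSE, max.iterations = NULL) {
  p <- ncol(x)
  if (is.null(alpha_b)) alpha_b <- 0.5 * alpha_f              # documented rule
  if (is.null(max.iterations)) max.iterations <- p * (p - 1)  # documented default
  if (data_scale) x <- scale(x)                               # zero mean, unit variance
  list(x = x, alpha_f = alpha_f, alpha_b = alpha_b,
       max.iterations = max.iterations)
}

set.seed(1)
x <- matrix(rnorm(30 * 5, mean = 3, sd = 2), nrow = 30)
cfg <- resolve_defaults(x, alpha_f = 0.22, data_scale = TRUE)
cfg$alpha_b          # 0.11
cfg$max.iterations   # 20
```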
A list with the values:
vareps |
Response variables. |
beta |
Regression coefficients. |
Edges |
Estimated set of edges. |
Omega |
Estimated precision matrix. |
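An estimated precision matrix is often easier to interpret after conversion to partial correlations, using the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj). A minimal sketch with a hypothetical 3 x 3 matrix follows; `Omega_toy` and `partial_cor` are illustrative names, not objects returned by the package.

```r
# Hypothetical positive definite precision matrix for illustration.
Omega_toy <- matrix(c( 2.0, -1.0,  0.0,
                      -1.0,  2.0, -0.5,
                       0.0, -0.5,  1.0), nrow = 3, byrow = TRUE)

# rho_ij = -omega_ij / sqrt(omega_ii * omega_jj), with ones on the diagonal.
d <- sqrt(diag(Omega_toy))
partial_cor <- -Omega_toy / outer(d, d)
diag(partial_cor) <- 1
round(partial_cor, 3)
```

Zero entries of the precision matrix map to zero partial correlations, so the edge structure is unchanged by this conversion.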
data <- FastStepGraph::SigmaAR(30, 50, 0.4) # Simulate Gaussian data
G <- FastStepGraph::FastStepGraph(data$X, alpha_f = 0.22, alpha_b = 0.14, data_scale = TRUE)
SigmaAR
Helper function to simulate Gaussian data with an autoregressive (AR) model.
SigmaAR(n_rows, p_columns, phi)
n_rows |
Number of samples (rows of X). |
p_columns |
Number of variables (columns of X). |
phi |
Auto-regression coefficient. |
A list with the values:
Sigma |
A covariance matrix. |
Omega |
A precision matrix. |
X |
A normalized data matrix with Gaussian distribution. |
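For intuition, data of this kind can be generated in base R. The sketch below assumes the common AR(1) covariance Sigma[i, j] = phi^|i - j|; whether `SigmaAR` uses exactly this construction, and how it normalizes X, is not stated in this manual. `simulate_ar1` is a hypothetical name.

```r
# Hypothetical stand-in for illustration; SigmaAR's internals may differ.
simulate_ar1 <- function(n_rows, p_columns, phi) {
  idx <- seq_len(p_columns)
  Sigma <- phi^abs(outer(idx, idx, "-"))    # assumed AR(1) covariance
  Omega <- solve(Sigma)                     # precision matrix
  Z <- matrix(rnorm(n_rows * p_columns), nrow = n_rows)
  X <- Z %*% chol(Sigma)                    # rows are draws from N(0, Sigma)
  list(Sigma = Sigma, Omega = Omega, X = X)
}

set.seed(1)
sim <- simulate_ar1(30, 50, 0.4)
dim(sim$X)

# Under this assumption the precision matrix is tridiagonal up to rounding error.
off_band <- abs(row(sim$Omega) - col(sim$Omega)) > 1
max(abs(sim$Omega[off_band]))
```

The tridiagonal precision matrix is what makes this model a convenient test case: the true graph is a chain, so estimated edges can be compared against a known answer.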
data <- FastStepGraph::SigmaAR(30, 50, 0.4) # Simulate Gaussian Data