This function returns the principal component directions used by the plotting
and estimation functions in this package. By default, it computes the usual
non-private directions from the sample covariance matrix. If g_dppca = TRUE,
it computes private directions using the spherical-transformation version of
g-DPPCA proposed by Kim and Jung (2025): it adds Gaussian noise to the
spherical Kendall matrix and then extracts its leading eigenvectors.
Usage
dp_pc_dir(
  X,
  k,
  center = TRUE,
  standardize = FALSE,
  g_dppca = FALSE,
  eps = NULL,
  delta = NULL,
  cpp.option = FALSE
)
Arguments
- X
A numeric matrix or data frame. Rows correspond to observations and columns correspond to variables.
- k
Number of principal component directions to return. Must be an integer between 1 and the number of columns in X.
- center
A logical value indicating whether to center the columns of X before computing principal component directions. The default is TRUE.
- standardize
A logical value indicating whether to scale the columns of X by their sample standard deviations after optional centering. The default is FALSE.
- g_dppca
Whether to compute private principal component directions using the spherical Kendall mechanism based on the g-DPPCA method. If FALSE, the usual non-private directions are computed from the sample covariance matrix. The default is FALSE.
- eps
Positive number defining the epsilon privacy parameter for private principal component directions. Required when g_dppca = TRUE.
- delta
Number in (0, 1) defining the delta privacy parameter for private principal component directions. Required when g_dppca = TRUE.
- cpp.option
A logical value reserved for a future C++ implementation of the spherical Kendall matrix. Currently only FALSE is supported.
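The preprocessing implied by center and standardize can be sketched in base R. This is only an illustration of the documented meaning of the two arguments: preprocess_sketch is a hypothetical helper, not a function exported by the package, and the package's internal code may differ. Note that it divides by the sample standard deviation (denominator n - 1), as the standardize description states.

```r
# Hypothetical helper (not part of dppca): mimics the documented meaning
# of the `center` and `standardize` arguments.
preprocess_sketch <- function(X, center = TRUE, standardize = FALSE) {
  X <- as.matrix(X)
  if (center) {
    # Subtract each column mean.
    X <- sweep(X, 2, colMeans(X), "-")
  }
  if (standardize) {
    # Divide each column by its sample standard deviation (denominator n - 1).
    X <- sweep(X, 2, apply(X, 2, stats::sd), "/")
  }
  X
}

set.seed(1)
X <- matrix(rnorm(50 * 3, mean = 5, sd = 2), nrow = 50)
Z <- preprocess_sketch(X, center = TRUE, standardize = TRUE)

# Column means are numerically zero and column standard deviations are one.
max(abs(colMeans(Z)))
max(abs(apply(Z, 2, stats::sd) - 1))
```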
Value
A numeric matrix with ncol(X) rows and k columns. The columns are
orthonormal principal component directions.
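The stated shape and orthonormality can be verified with a quick check. The sketch below does not call dp_pc_dir; as a stand-in for the non-private path it takes the leading eigenvectors of the sample covariance matrix with base R, on simulated data chosen purely for illustration.

```r
set.seed(1)
X <- matrix(rnorm(100 * 4), nrow = 100)  # simulated data, 4 variables
k <- 2

# Stand-in for the non-private path: leading eigenvectors of the sample
# covariance matrix (stats::cov centers the columns internally).
V <- eigen(stats::cov(X), symmetric = TRUE)$vectors[, seq_len(k), drop = FALSE]

# ncol(X) rows and k columns.
dim(V)

# Orthonormal columns: t(V) %*% V is numerically the k x k identity.
max(abs(crossprod(V) - diag(k)))
```

The same two checks apply unchanged to a matrix returned by dp_pc_dir.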
Details
The non-private option computes the leading eigenvectors of the sample covariance matrix of the preprocessed data. The private option is based on the spherical Kendall mechanism of Kim and Jung (2025): it first forms the spherical Kendall matrix from pairwise normalized differences, then adds symmetric Gaussian noise, and finally computes the leading eigenvectors of the noisy matrix. The resulting eigenvector matrix is re-orthonormalized by QR decomposition. For the detailed procedure and mathematical formulation, see https://yejinjo0220.github.io/dppca/articles/pc_direction.
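The private pipeline described above can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the package's implementation: all three function names are hypothetical, the averaging over all pairs is one common normalization of the spherical Kendall matrix, and sigma is an arbitrary placeholder. A real mechanism would have to calibrate the noise scale from eps and delta, so this sketch carries no privacy guarantee.

```r
# Hypothetical helper: average of outer products of pairwise normalized
# differences (x_i - x_j) / ||x_i - x_j|| over all pairs i < j.
spherical_kendall_sketch <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  K <- matrix(0, p, p)
  for (i in seq_len(n - 1)) {
    # Differences between observation i and every later observation.
    D <- sweep(X[(i + 1):n, , drop = FALSE], 2, X[i, ], "-")
    norms <- sqrt(rowSums(D^2))
    keep <- norms > 0  # identical observations contribute nothing
    U <- D[keep, , drop = FALSE] / norms[keep]  # unit-length rows
    K <- K + crossprod(U)
  }
  K / choose(n, 2)
}

# Hypothetical helper: symmetric matrix with Gaussian entries on and above
# the diagonal, mirrored below the diagonal.
symmetric_gaussian_noise <- function(p, sigma) {
  E <- matrix(0, p, p)
  idx <- upper.tri(E, diag = TRUE)
  E[idx] <- rnorm(sum(idx), sd = sigma)
  E[lower.tri(E)] <- t(E)[lower.tri(E)]
  E
}

# ASSUMPTION: `sigma` is a placeholder, not calibrated to (eps, delta).
private_dir_sketch <- function(X, k, sigma) {
  K <- spherical_kendall_sketch(X)
  K_noisy <- K + symmetric_gaussian_noise(ncol(K), sigma)
  V <- eigen(K_noisy, symmetric = TRUE)$vectors[, seq_len(k), drop = FALSE]
  qr.Q(qr(V))  # re-orthonormalize by QR decomposition
}

set.seed(123)
X <- matrix(rnorm(60 * 5), nrow = 60)  # simulated data for illustration
V_sketch <- private_dir_sketch(X, k = 2, sigma = 0.01)
max(abs(crossprod(V_sketch) - diag(2)))  # numerically zero
```

Because every normalized difference has unit length, the noise-free matrix has trace one when no two observations coincide, which is a convenient sanity check on the first helper.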
References
Kim M, Jung S (2025). “Robust and Differentially Private Principal Component Analysis.” Statistical Analysis and Data Mining: An ASA Data Science Journal, 18(6), e70053. doi:10.1002/sam.70053.
Examples
data(gau, package = "dppca")
# Use a small subset to keep the example fast.
X <- gau[1:200, ]
# Non-private principal component directions
V <- dp_pc_dir(X, k = 2)
head(V)
#>              [,1]        [,2]
#> [1,] -0.276390432 -0.05512405
#> [2,] -0.008155916 -0.03392589
#> [3,] -0.433823614 -0.26807870
#> [4,] -0.007115806  0.43519507
#> [5,]  0.406349436  0.12107397
#> [6,]  0.108264920  0.07128666
# Private principal component directions
set.seed(123)
V_private <- dp_pc_dir(
  X,
  k = 2,
  g_dppca = TRUE,
  eps = 2,
  delta = 1e-3
)
head(V_private)
#>              [,1]       [,2]
#> [1,] -0.223077596  0.3476042
#> [2,] -0.050923111 -0.2384854
#> [3,] -0.018308885  0.1395241
#> [4,]  0.091084919  0.1563828
#> [5,]  0.009964607 -0.2978056
#> [6,]  0.161973207  0.1422014