Title: | Robust Analysis of High Dimensional Data |
---|---|
Description: | A collection of methods for the robust analysis of univariate and multivariate functional data, possibly in high-dimensional cases, and hence with attention to computational efficiency and simplicity of use. See the R Journal publication of Ieva et al. (2019) <doi:10.32614/RJ-2019-032> for an in-depth presentation of the 'roahd' package. See Aleman-Gomez et al. (2021) <arXiv:2103.08874> for details about the concept of depthgram. |
Authors: | Nicholas Tarabelloni [aut], Ana Arribas-Gil [aut], Francesca Ieva [aut], Anna Maria Paganoni [aut], Juan Romo [aut], Francesco Palma [ctb], Aymeric Stamm [ctb, cre], Antonio Elias-Fernandez [ctb] |
Maintainer: | Aymeric Stamm <[email protected]> |
License: | GPL-3 |
Version: | 1.4.3.9000 |
Built: | 2024-11-06 05:13:06 UTC |
Source: | https://github.com/astamm/roahd |
This is a convenience function that simplifies the task of appending univariate functional observations of two datasets to a unique univariate functional dataset.
append_fData(fD1, fD2)
fD1 |
is the first functional dataset, stored into an fData object. |
fD2 |
is the second functional dataset, stored into an fData object. |
The two original datasets must be compatible, i.e. must be defined on the same grid.
If we denote with $X_1, \ldots, X_n$ the first dataset, defined over the grid $I = t_1, \ldots, t_P$, and with $Y_1, \ldots, Y_m$ the second functional dataset, defined on the same grid, the method returns the union dataset obtained by taking all the $n + m$ observations together.
The function returns an fData object containing the union of fD1 and fD2.
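The union described above can be sketched in base R on plain value matrices (observations as rows, grid points as columns). This is only an illustration of the idea, not the package's implementation; `append_values` is a hypothetical helper introduced here.

```r
# Hypothetical helper: stacks two sets of curves observed on the same grid.
append_values <- function(grid1, values1, grid2, values2) {
  # The two datasets are compatible only if they share the same grid
  stopifnot(length(grid1) == length(grid2), isTRUE(all.equal(grid1, grid2)))
  rbind(values1, values2)
}

grid <- seq(0, 2 * pi, length.out = 100)
values1 <- matrix(c(sin(grid), sin(2 * grid)), nrow = 2, byrow = TRUE)
values2 <- matrix(c(cos(grid), cos(2 * grid)), nrow = 2, byrow = TRUE)

union_values <- append_values(grid, values1, grid, values2)
dim(union_values)  # n + m rows, P columns
```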
# Creating two simple univariate datasets
grid = seq(0, 2 * pi, length.out = 100)
values1 = matrix( c(sin(grid), sin(2 * grid)), nrow = 2, ncol = length(grid), byrow=TRUE)
values2 = matrix( c(cos(grid), cos(2 * grid)), nrow = 2, ncol = length(grid), byrow=TRUE)
fD1 = fData( grid, values1 )
fD2 = fData( grid, values2 )
# Appending them to a unique dataset
append_fData(fD1, fD2)
This is a convenience function that simplifies the task of appending multivariate functional observations of two datasets to a unique multivariate functional dataset.
append_mfData(mfD1, mfD2)
mfD1 |
is the first multivariate functional dataset, stored into an mfData object. |
mfD2 |
is the second multivariate functional dataset, stored into an mfData object. |
The two original datasets must be compatible, i.e. must have the same number of components (dimensions) and must be defined on the same grid. If we denote with $X_1^{(i)}, \ldots, X_n^{(i)}$, $i = 1, \ldots, L$, the first dataset, defined over the grid $I = t_1, \ldots, t_P$, and with $Y_1^{(i)}, \ldots, Y_m^{(i)}$, $i = 1, \ldots, L$, the second functional dataset, the method returns the union dataset obtained by taking all the $n + m$ observations together.
The function returns an mfData object containing the union of mfD1 and mfD2.
# Creating two simple bivariate datasets
grid = seq(0, 2 * pi, length.out = 100)
values11 = matrix( c(sin(grid), sin(2 * grid)), nrow = 2, ncol = length(grid), byrow=TRUE)
values12 = matrix( c(sin(3 * grid), sin(4 * grid)), nrow = 2, ncol = length(grid), byrow=TRUE)
values21 = matrix( c(cos(grid), cos(2 * grid)), nrow = 2, ncol = length(grid), byrow=TRUE)
values22 = matrix( c(cos(3 * grid), cos(4 * grid)), nrow = 2, ncol = length(grid), byrow=TRUE)
mfD1 = mfData( grid, list(values11, values12) )
mfD2 = mfData( grid, list(values21, values22) )
# Appending them to a unique dataset
append_mfData(mfD1, mfD2)
This function implements an order relation between univariate functional data based on the area-under-curve relation, that is to say a pre-order relation obtained by comparing the area-under-curve of two different functional data.
area_ordered(fData, gData)
fData |
the first univariate functional dataset containing elements to be compared, in form of fData object. |
gData |
the second univariate functional dataset containing elements to be compared, in form of fData object. |
Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$, and another functional dataset $Y_1(t), Y_2(t), \ldots, Y_M(t)$ defined over the same compact interval $I = [a, b]$, the function computes the area under the curve (namely, the integral) of the elements of both datasets, and checks whether the areas of the first dataset are lower than or equal to those of the second.
By default the function tries to compare each $X_i(t)$ with the corresponding $Y_i(t)$, thus assuming $N = M$, but when either $N = 1$ or $M = 1$, the comparison is carried out cycling over the dataset with fewer elements. In all the other cases ($N \neq M$ with neither $N = 1$ nor $M = 1$) the function stops.
The function returns a logical vector of length $\max(N, M)$ containing the value of the predicate for all the corresponding elements.
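The comparison rule can be sketched in base R on plain matrices of curves. The trapezoidal quadrature and the helper names `curve_area` and `area_leq` are assumptions made for this illustration; they are not taken from the package.

```r
# Assumed quadrature: trapezoidal rule on an evenly spaced grid with step h
curve_area <- function(values, h) {
  P <- ncol(values)
  rowSums((values[, -1, drop = FALSE] + values[, -P, drop = FALSE]) / 2) * h
}

# Hypothetical helper mimicking the rule: element-wise comparison when the
# two datasets have the same size, recycling when one of them has one curve
area_leq <- function(x_values, y_values, h) {
  N <- nrow(x_values)
  M <- nrow(y_values)
  if (N != M && N != 1 && M != 1) stop("incompatible numbers of observations")
  curve_area(x_values, h) <= curve_area(y_values, h)
}

P <- 1e3
grid <- seq(0, 1, length.out = P)
h <- grid[2] - grid[1]
x_values <- matrix(c(1 * grid, 2 * grid), nrow = 2, ncol = P, byrow = TRUE)
y_values <- matrix(3 * (0.5 - abs(grid - 0.5)), nrow = 1, byrow = TRUE)

# Areas are about 0.5 and 1 against about 0.75
result <- area_leq(x_values, y_values, h)
```

Here the single curve in `y_values` is compared against both curves in `x_values`, so the result has length 2.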
Valencia, D., Romo, J. and Lillo, R. (2015). A Kendall correlation coefficient for functional dependence, Universidad Carlos III de Madrid technical report, http://EconPapers.repec.org/RePEc:cte:wsrepe:ws133228.
P = 1e3
grid = seq( 0, 1, length.out = P )
Data_1 = matrix( c( 1 * grid, 2 * grid ), nrow = 2, ncol = P, byrow = TRUE )
Data_2 = matrix( 3 * ( 0.5 - abs( grid - 0.5 ) ), nrow = 1, byrow = TRUE )
Data_3 = rbind( Data_1, Data_1 )
fD_1 = fData( grid, Data_1 )
fD_2 = fData( grid, Data_2 )
fD_3 = fData( grid, Data_3 )
area_ordered( fD_1, fD_2 )
area_ordered( fD_2, fD_3 )
This method computes the (signed) area under the curve of elements of a univariate functional dataset, namely, their integral.
area_under_curve(fData)
fData |
the functional dataset containing elements whose areas under the curve have to be computed, in form of fData object. |
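Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$, defined over a compact interval $I = [a, b]$ and observed on an evenly spaced 1D grid $[a = t_0, t_1, \ldots, t_{P-1} = b] \subset I$, the function computes a numerical approximation of the integral $\int_a^b X_i(t) \, dt$ of each element, based on the constant grid step $h = t_1 - t_0$.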
The function returns a numeric vector containing the values of areas under the curve for all the elements of the functional dataset fData.
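As an illustration of a signed area computed on an evenly spaced grid, the following base R sketch applies the trapezoidal rule to the same three curves used in the example. The choice of the trapezoidal rule and the helper name `trapezoid_area` are assumptions of this sketch and may differ from the quadrature used by the package.

```r
# Assumed quadrature: trapezoidal rule, one area per row of `values`
trapezoid_area <- function(values, h) {
  P <- ncol(values)
  rowSums((values[, -1, drop = FALSE] + values[, -P, drop = FALSE]) / 2) * h
}

P <- 1e3
grid <- seq(0, 1, length.out = P)
h <- grid[2] - grid[1]
values <- matrix(c(sin(2 * pi * grid),
                   cos(2 * pi * grid),
                   4 * grid * (1 - grid)),
                 nrow = 3, ncol = P, byrow = TRUE)

# Exact integrals over [0, 1] are 0, 0 and 2/3
areas <- trapezoid_area(values, h)
```

The first two curves integrate to zero because their positive and negative lobes cancel, which is why the area is described as signed.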
P = 1e3
grid = seq( 0, 1, length.out = P )
fD = fData( grid, matrix( c( sin( 2 * pi * grid ), cos( 2 * pi * grid ), 4 * grid * ( 1 - grid ) ), nrow = 3, ncol = P, byrow = TRUE ) )
plot( fD )
area_under_curve( fD )
This S3 method provides a way to convert some objects to the class mfData, thus obtaining a multivariate functional dataset.
as.mfData(x, ...)

## S3 method for class 'list'
as.mfData(x, ...)
x |
a list of univariate functional datasets, provided in form of fData objects. |
... |
additional parameters. |
The function returns an mfData object, obtained starting from argument x.
grid = seq( 0, 1, length.out = 100 )
fD_1 = fData( grid, sin( 2 * pi * grid ) )
fD_2 = fData( grid, cos( 2 * pi * grid ) )
plot( as.mfData( list( fD_1, fD_2 ) ) )
This function computes the bootstrap confidence interval of coverage probability $1 - \alpha$ for the Spearman correlation coefficient between two univariate functional samples.
BCIntervalSpearman( fD1, fD2, ordering = "MEI", bootstrap_iterations = 1000, alpha = 0.05, verbose = FALSE )
fD1 |
is the first univariate functional sample in form of an fData object. |
fD2 |
is the second univariate functional sample in form of an fData object. |
ordering |
is either "MEI" (default) or "MHI", and indicates the ordering relation to be used in the computation of the Spearman coefficient. |
bootstrap_iterations |
is the number of bootstrap iterations to use in order to estimate the confidence interval (default is 1000). |
alpha |
controls the coverage probability (1 - alpha). |
verbose |
whether to log information on the progression of bootstrap iterations. |
The function takes two samples of compatible functional data (i.e., they must be defined over the same grid and have same number of observations) and computes a bootstrap confidence interval for their Spearman correlation coefficient.
The function returns a list of two elements, lower and upper, representing the lower and upper end of the bootstrap confidence interval.
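To show how a bootstrap interval with coverage 1 - alpha can be read off resampled correlations, here is a base R sketch on two vectors of scalar scores (for instance, one rank summary per curve). It uses a plain percentile interval, which is an assumption of this sketch and may differ from the interval construction used by the package; `percentile_ci` is a hypothetical helper.

```r
# Hypothetical helper: percentile bootstrap interval for a Spearman correlation
percentile_ci <- function(u, v, iterations = 1000, alpha = 0.05) {
  n <- length(u)
  boot <- replicate(iterations, {
    idx <- sample.int(n, n, replace = TRUE)  # resample the pairs jointly
    cor(u[idx], v[idx], method = "spearman")
  })
  q <- quantile(boot, c(alpha / 2, 1 - alpha / 2), names = FALSE)
  list(lower = q[1], upper = q[2])
}

set.seed(1)
u <- rnorm(200)
v <- rnorm(200)  # generated independently of u
ci <- percentile_ci(u, v)
```

Resampling the indices once per iteration keeps each pair `(u[i], v[i])` together, which is what preserves the dependence structure being estimated.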
cor_spearman, cor_spearman_accuracy, fData, mfData, BCIntervalSpearmanMultivariate
set.seed(1)
N <- 200
P <- 100
grid <- seq(0, 1, length.out = P)
# Creating an exponential covariance function to simulate Gaussian data
Cov <- exp_cov_function(grid, alpha = 0.3, beta = 0.4)
# Simulating (independent) Gaussian functional data with given center and covariance function
Data_1 <- generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = Cov )
Data_2 <- generate_gauss_fdata( N = N, centerline = sin(2 * pi * grid), Cov = Cov )
# Using the simulated data as (independent) components of a bivariate functional dataset
mfD <- mfData(grid, list(Data_1, Data_2))
BCIntervalSpearman(mfD$fDList[[1]], mfD$fDList[[2]], ordering = "MEI")
BCIntervalSpearman(mfD$fDList[[1]], mfD$fDList[[2]], ordering = "MHI")
# BC intervals contain zero since the functional samples are uncorrelated.
This function computes the bootstrap confidence intervals of coverage probability $1 - \alpha$ for the Spearman correlation coefficients within a multivariate functional dataset.
BCIntervalSpearmanMultivariate( mfD, ordering = "MEI", bootstrap_iterations = 1000, alpha = 0.05, verbose = FALSE )
mfD |
is the multivariate functional sample in form of mfData object. |
ordering |
is either "MEI" (default) or "MHI", and indicates the ordering relation to be used in the computation of the Spearman coefficient. |
bootstrap_iterations |
is the number of bootstrap iterations to use in order to estimate the confidence intervals (default is 1000). |
alpha |
controls the coverage probability (1 - alpha). |
verbose |
whether to log information on the progression of bootstrap iterations. |
The function takes a multivariate functional dataset and computes a matrix of bootstrap confidence intervals for its Spearman correlation coefficients.
The function returns a list of two elements, lower and upper, representing the matrices of lower and upper ends of the bootstrap confidence intervals for each pair of components. The elements on the main diagonal are set to 1.
cor_spearman, cor_spearman_accuracy, fData, mfData, BCIntervalSpearman
set.seed(1)
N <- 200
P <- 100
grid <- seq(0, 1, length.out = P)
# Creating an exponential covariance function to simulate Gaussian data
Cov <- exp_cov_function(grid, alpha = 0.3, beta = 0.4)
# Simulating (independent) Gaussian functional data with given center and covariance function
Data_1 <- generate_gauss_fdata( N = N, centerline = sin(2 * pi * grid), Cov = Cov )
Data_2 <- generate_gauss_fdata( N = N, centerline = sin(4 * pi * grid), Cov = Cov )
Data_3 <- generate_gauss_fdata( N = N, centerline = sin(6 * pi * grid), Cov = Cov )
# Using the simulated data as (independent) components of a multivariate functional dataset
mfD <- mfData(grid, list(Data_1, Data_2, Data_3))
BCIntervalSpearmanMultivariate(mfD, ordering = "MEI")
# BC intervals contain zero since the functional samples are uncorrelated.
This function computes the Band Depth (BD) of elements of a functional dataset.
BD(Data)

## S3 method for class 'fData'
BD(Data)

## Default S3 method:
BD(Data)
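Data |
either an object of class fData or a matrix-like dataset of functional data, with observations as rows and measurements over grid points as columns. |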
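Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$, this function computes the sample BD of each element with respect to the other elements of the dataset, i.e.:
$$BD(X(t)) = {N \choose 2}^{-1} \sum_{1 \leq i_1 < i_2 \leq N} I\left( G(X) \subset B(X_{i_1}, X_{i_2}) \right),$$
where $G(X)$ is the graphic of $X(t)$, $B(X_{i_1}, X_{i_2})$ is the envelope of $X_{i_1}(t)$ and $X_{i_2}(t)$, and $X \in \{X_1, \ldots, X_N\}$.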
See the References section for more details.
The function returns a vector containing the values of BD for the given dataset.
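The definition can be checked on a small dataset with a brute-force base R sketch that enumerates every pair of curves and tests whether each curve lies entirely inside the band they delimit. This is a direct transcription of the definition for illustration, with cost growing quadratically in the number of curves; it is not the package's implementation, and `band_depth` is a hypothetical helper.

```r
# Hypothetical helper: brute-force band depth, rows of `values` are curves
band_depth <- function(values) {
  N <- nrow(values)
  pairs <- combn(N, 2)
  depth <- numeric(N)
  for (k in seq_len(ncol(pairs))) {
    # Pointwise envelope of the two curves defining the band
    lower <- pmin(values[pairs[1, k], ], values[pairs[2, k], ])
    upper <- pmax(values[pairs[1, k], ], values[pairs[2, k], ])
    # A curve counts only if its whole graph lies inside the band
    inside <- apply(values, 1, function(x) all(x >= lower & x <= upper))
    depth <- depth + inside
  }
  depth / ncol(pairs)
}

grid <- seq(0, 1, length.out = 100)
s <- sin(2 * pi * grid)
values <- matrix(c(0 + s, 1 + s, -1 + s), nrow = 3, byrow = TRUE)

# The middle curve lies in all 3 bands, the outer ones in 2 of 3
bd <- band_depth(values)
```

Note that each curve always belongs to the bands it helps define, so in this formulation the depth of any curve is at least 2/N.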
Lopez-Pintado, S. and Romo, J. (2009). On the Concept of Depth for Functional Data, Journal of the American Statistical Association, 104, 718-734.
Lopez-Pintado, S. and Romo, J. (2007). Depth-based inference for functional data, Computational Statistics & Data Analysis, 51, 4957-4968.
MBD, BD_relative, MBD_relative, fData
grid = seq( 0, 1, length.out = 1e2 )
D = matrix( c( 1 + sin( 2 * pi * grid ),
               0 + sin( 4 * pi * grid ),
               1 - sin( pi * ( grid - 0.2 ) ),
               0.1 + cos( 2 * pi * grid ),
               0.5 + sin( 3 * pi + grid ),
               -2 + sin( pi * grid ) ),
            nrow = 6, ncol = length( grid ), byrow = TRUE )
fD = fData( grid, D )
BD( fD )
BD( D )
This function computes Band Depth (BD) of elements of a univariate functional dataset with respect to another univariate functional dataset.
BD_relative(Data_target, Data_reference)

## S3 method for class 'fData'
BD_relative(Data_target, Data_reference)

## Default S3 method:
BD_relative(Data_target, Data_reference)
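Data_target |
is the univariate functional dataset, provided either as an fData object or in matrix form (observations as rows, measurements as columns), whose BD have to be computed with respect to the reference dataset. |
Data_reference |
is the dataset, provided either as an fData object or in matrix form (observations as rows, measurements as columns), containing the reference univariate functional data used to compute the band depths of elements in Data_target. |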
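Given a univariate functional dataset of elements $X_1(t), X_2(t), \ldots, X_N(t)$, and another univariate functional dataset of elements $Y_1(t), Y_2(t), \ldots, Y_M(t)$, this function computes the BD of elements of the former with respect to elements of the latter, i.e.:
$$BD(X_i(t)) = {M \choose 2}^{-1} \sum_{1 \leq i_1 < i_2 \leq M} I\left( G(X_i) \subset B(Y_{i_1}, Y_{i_2}) \right), \quad \forall i = 1, \ldots, N,$$
where $G(X_i)$ is the graphic of $X_i(t)$ and $B(Y_{i_1}, Y_{i_2})$ is the envelope of $Y_{i_1}(t)$ and $Y_{i_2}(t)$.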
The function returns a vector containing the BD of elements in Data_target with respect to elements in Data_reference.
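A brute-force base R sketch of the relative version follows: bands are built only from pairs of reference curves, and each target curve is scored by the fraction of bands that fully contain it. The helper name `band_depth_relative` is hypothetical and the code illustrates the definition rather than the package's implementation.

```r
# Hypothetical helper: band depth of target curves w.r.t. reference curves
band_depth_relative <- function(target, reference) {
  pairs <- combn(nrow(reference), 2)
  depth <- numeric(nrow(target))
  for (k in seq_len(ncol(pairs))) {
    # Pointwise envelope of the two reference curves defining the band
    lower <- pmin(reference[pairs[1, k], ], reference[pairs[2, k], ])
    upper <- pmax(reference[pairs[1, k], ], reference[pairs[2, k], ])
    inside <- apply(target, 1, function(x) all(x >= lower & x <= upper))
    depth <- depth + inside
  }
  depth / ncol(pairs)
}

grid <- seq(0, 1, length.out = 100)
s <- sin(2 * pi * grid)
reference <- matrix(c(0 + s, 1 + s, -1 + s), nrow = 3, byrow = TRUE)
target <- matrix(c(0.5 + s, -0.5 + s, 1.1 + s), nrow = 3, byrow = TRUE)

# The third target curve lies above every reference band, so its depth is 0
bd_rel <- band_depth_relative(target, reference)
```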
BD, MBD, MBD_relative, fData
grid = seq( 0, 1, length.out = 1e2 )
Data_ref = matrix( c( 0 + sin( 2 * pi * grid ),
                      1 + sin( 2 * pi * grid ),
                      -1 + sin( 2 * pi * grid ) ),
                   nrow = 3, ncol = length( grid ), byrow = TRUE )
Data_test_1 = matrix( c( 0.6 + sin( 2 * pi * grid ) ), nrow = 1, ncol = length( grid ), byrow = TRUE )
Data_test_2 = matrix( c( 0.6 + sin( 2 * pi * grid ) ), nrow = length( grid ), ncol = 1, byrow = TRUE )
Data_test_3 = 0.6 + sin( 2 * pi * grid )
Data_test_4 = array( 0.6 + sin( 2 * pi * grid ), dim = length( grid ) )
Data_test_5 = array( 0.6 + sin( 2 * pi * grid ), dim = c( 1, length( grid ) ) )
Data_test_6 = array( 0.6 + sin( 2 * pi * grid ), dim = c( length( grid ), 1 ) )
Data_test_7 = matrix( c( 0.5 + sin( 2 * pi * grid ),
                         -0.5 + sin( 2 * pi * grid ),
                         1.1 + sin( 2 * pi * grid ) ),
                      nrow = 3, ncol = length( grid ), byrow = TRUE )
fD_ref = fData( grid, Data_ref )
fD_test_1 = fData( grid, Data_test_1 )
fD_test_2 = fData( grid, Data_test_2 )
fD_test_3 = fData( grid, Data_test_3 )
fD_test_4 = fData( grid, Data_test_4 )
fD_test_5 = fData( grid, Data_test_5 )
fD_test_6 = fData( grid, Data_test_6 )
fD_test_7 = fData( grid, Data_test_7 )
BD_relative( fD_test_1, fD_ref )
BD_relative( Data_test_1, Data_ref )
BD_relative( fD_test_2, fD_ref )
BD_relative( Data_test_2, Data_ref )
BD_relative( fD_test_3, fD_ref )
BD_relative( Data_test_3, Data_ref )
BD_relative( fD_test_4, fD_ref )
BD_relative( Data_test_4, Data_ref )
BD_relative( fD_test_5, fD_ref )
BD_relative( Data_test_5, Data_ref )
BD_relative( fD_test_6, fD_ref )
BD_relative( Data_test_6, Data_ref )
BD_relative( fD_test_7, fD_ref )
BD_relative( Data_test_7, Data_ref )
This function performs a bootstrap test that checks whether the Spearman correlation structures (e.g. matrices) of two populations of compatible multivariate functional data are equal or not.
BTestSpearman( mfD1, mfD2, bootstrap_iterations = 1000, ordering = "MEI", normtype = "f", verbose = FALSE )
mfD1 |
is the first functional dataset, specified in form of mfData object. |
mfD2 |
is the second functional dataset, specified in form of mfData object. |
bootstrap_iterations |
is the number of bootstrap iterations to be performed. |
ordering |
is the kind of ordering to be used in the computation of Spearman's correlation coefficient (default is "MEI"). |
normtype |
is the norm to be used when comparing the Spearman correlation matrices of the two functional datasets (default is Frobenius, allowed values are the same as for parameter type of the base R function norm). |
verbose |
a boolean flag specifying whether to print the progress of bootstrap iterations or not (default is FALSE). |
Given a first multivariate functional population, $X_1^{(i)}, \ldots, X_n^{(i)}$ with $i = 1, \ldots, L$, defined on the grid $I = t_1, \ldots, t_P$, and a second multivariate functional population, $Y_1^{(i)}, \ldots, Y_m^{(i)}$ with $i = 1, \ldots, L$, defined on the same grid $I$, the function performs a bootstrap test to check the hypothesis:
$$H_0: R_X = R_Y \quad \text{vs.} \quad H_1: R_X \neq R_Y,$$
where $R_X$ and $R_Y$ denote the $L \times L$ matrices of Spearman correlation coefficients of the two populations.
The two functional samples must have the same number of components and must be defined over the same discrete interval $t_1, \ldots, t_P$.
The test is performed through a bootstrap argument, so a number of bootstrap iterations must be specified as well. A high value for this parameter may slow the test down (consider setting verbose to TRUE to monitor its progress).
The function returns the estimates of the test's p-value and statistics.
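One natural way to compare two correlation matrices with the norm selected by normtype is to take the norm of their difference. The base R sketch below does this for two hand-written 2 x 2 matrices with the Frobenius norm; treating the norm of the difference as the quantity of interest is an assumption of this sketch, and the matrices are made-up values rather than output of the package.

```r
# Made-up Spearman correlation matrices of two bivariate populations
R_X <- matrix(c(1, 0.9,
                0.9, 1), nrow = 2, byrow = TRUE)
R_Y <- diag(2)

# Frobenius norm of the difference, via base::norm with type = "F"
stat <- norm(R_X - R_Y, type = "F")
```

Under the null hypothesis of equal correlation structures this distance should be small, so large values count as evidence against it.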
BCIntervalSpearman, BCIntervalSpearmanMultivariate, mfData
set.seed(1)
N <- 200
P <- 100
L <- 2
grid <- seq(0, 1, length.out = P)
# Creating an exponential covariance function to simulate Gaussian data
Cov <- exp_cov_function(grid, alpha = 0.3, beta = 0.4)
# Simulating two populations of bivariate functional data
#
# The first population has very high correlation between first and second component
centerline_1 <- matrix( data = rep(sin(2 * pi * grid)), nrow = L, ncol = P, byrow = TRUE )
values1 <- generate_gauss_mfdata( N = N, L = L, correlations = 0.9, centerline = centerline_1, listCov = list(Cov, Cov) )
mfD1 <- mfData(grid, values1)
# Pointwise estimate
cor_spearman(mfD1)
# The second population has zero correlation between first and second component
centerline_2 <- matrix( data = rep(cos(2 * pi * grid)), nrow = L, ncol = P, byrow = TRUE )
values2 <- generate_gauss_mfdata( N = N, L = L, correlations = 0, centerline = centerline_2, listCov = list(Cov, Cov) )
mfD2 <- mfData(grid, values2)
# Pointwise estimate
cor_spearman(mfD2)
# Applying the test
BTestSpearman(mfD1, mfD2)
This function computes the Kendall's tau correlation coefficient for a bivariate functional dataset, with either a max or area-under-curve order relation between univariate functional elements (components).
cor_kendall(mfD, ordering = "max")
mfD |
a bivariate functional dataset whose Kendall's tau coefficient must be computed, in form of bivariate mfData object. |
ordering |
the ordering relation to use on functional observations, either "max" (default) or "area". |
Given a bivariate functional dataset, with first components $X_1(t), X_2(t), \ldots, X_N(t)$ and second components $Y_1(t), Y_2(t), \ldots, Y_N(t)$, the function exploits either the order relation based on the maxima or the area-under-curve relation to compare data and produce concordances and discordances, which are then used to compute the tau coefficient.
See the references for more details.
The function returns the Kendall's tau correlation coefficient for the bivariate dataset provided with mfD.
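With the max ordering, comparing two curves reduces to comparing their maxima, so concordances and discordances can be counted on scalars. The base R sketch below does this by enumerating all pairs of observations; it is a simplified illustration (ties contribute zero) rather than the package's implementation, and `kendall_max` is a hypothetical helper.

```r
# Hypothetical helper: Kendall's tau from concordances of curve maxima
kendall_max <- function(x_values, y_values) {
  mx <- apply(x_values, 1, max)
  my <- apply(y_values, 1, max)
  pairs <- combn(length(mx), 2)
  # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
  s <- sign(mx[pairs[1, ]] - mx[pairs[2, ]]) *
       sign(my[pairs[1, ]] - my[pairs[2, ]])
  sum(s) / ncol(pairs)
}

grid <- seq(0, 1, length.out = 50)
amplitudes <- c(3, 1, 4, 2, 5)
x_values <- outer(amplitudes, sin(pi * grid))  # one curve per row
y_values <- exp(x_values)                      # monotone transform of x

tau <- kendall_max(x_values, y_values)
```

Because `exp` is increasing, the ordering of the maxima is identical in both components and every pair is concordant.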
Valencia, D., Romo, J. and Lillo, R. (2015). A Kendall correlation coefficient for functional dependence, Universidad Carlos III de Madrid technical report, http://EconPapers.repec.org/RePEc:cte:wsrepe:ws133228.
mfData, area_ordered, max_ordered
#### TOTALLY INDEPENDENT COMPONENTS
N = 2e2
P = 1e3
grid = seq( 0, 1, length.out = P )
# Creating an exponential covariance function to simulate Gaussian data
Cov = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )
# Simulating (independent) Gaussian functional data with given center and
# covariance function
Data_1 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = Cov )
Data_2 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = Cov )
# Using the simulated data as (independent) components of a bivariate functional
# dataset
mfD = mfData( grid, list( Data_1, Data_2 ) )
# Correlation approx. zero (components were created independently)
cor_kendall( mfD, ordering = 'max' )
# Correlation approx. zero (components were created independently)
cor_kendall( mfD, ordering = 'area' )

#### TOTALLY DEPENDENT COMPONENTS
# Nonlinear transform of first component
Data_3 = t( apply( Data_1, 1, exp ) )
# Creating bivariate dataset starting from nonlinearly-dependent components
mfD = mfData( grid, list( Data_1, Data_3 ) )
# Correlation very high (components are nonlinearly dependent)
cor_kendall( mfD, ordering = 'max' )
# Correlation very high (components are nonlinearly dependent)
cor_kendall( mfD, ordering = 'area' )
This function computes the Spearman's correlation coefficient for a multivariate functional dataset, with either a Modified Epigraph Index (MEI) or Modified Hypograph Index (MHI) ranking of univariate elements of data components.
cor_spearman(mfD, ordering = "MEI")
mfD |
a multivariate functional dataset whose Spearman's correlation coefficient must be computed, in form of multivariate mfData object. |
ordering |
the ordering relation to use on functional observations, either "MEI" (default) or "MHI". |
Given a multivariate functional dataset, with first components $X^1_1(t), X^1_2(t), \ldots, X^1_N(t)$, second components $X^2_1(t), X^2_2(t), \ldots, X^2_N(t)$, etc., the function exploits either the MEI or MHI to compute the matrix of Spearman's correlation coefficients. Such matrix is symmetrical and has ones on the diagonal. The entry (i, j) represents the Spearman correlation coefficient between curves of component i and j.
See the references for more details.
If the original dataset is bivariate, the function returns only the scalar value of the correlation coefficient for the two components. When the number of components is L > 2, it returns the whole matrix of Spearman's correlation coefficients for all the components.
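The ranking step can be illustrated in base R: each curve is summarized by a scalar index, and the Spearman coefficient is the rank correlation of those indices across the two components. The sketch below uses a simple epigraph-type index, namely the average over the grid of the proportion of curves lying on or above a given curve. That formula and the helper name `mei_sketch` are assumptions of this illustration and are not guaranteed to match the package's MEI.

```r
# Hypothetical helper: average proportion of curves lying on or above each curve
mei_sketch <- function(values) {
  N <- nrow(values)
  # For every grid point, count the curves with a value >= the current one
  counts <- apply(values, 2, function(v) rank(-v, ties.method = "max"))
  rowMeans(counts) / N
}

grid <- seq(0, 1, length.out = 100)
offsets <- c(0.3, -1, 2, 0.7, -0.2)
x_values <- outer(offsets, rep(1, length(grid))) +
  matrix(sin(2 * pi * grid), nrow = 5, ncol = length(grid), byrow = TRUE)
y_values <- exp(x_values)  # monotone transform of the first component

index_x <- mei_sketch(x_values)
index_y <- mei_sketch(y_values)
rho <- cor(index_x, index_y, method = "spearman")
```

Since the transform is increasing, both components rank the curves identically and the coefficient equals one.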
Valencia, D., Romo, J. and Lillo, R. (2015). Spearman coefficient for functions, Universidad Carlos III de Madrid technical report, http://EconPapers.repec.org/RePEc:cte:wsrepe:ws133329.
#### TOTALLY INDEPENDENT COMPONENTS
N = 2e2
P = 1e3
grid = seq( 0, 1, length.out = P )
# Creating an exponential covariance function to simulate Gaussian data
Cov = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )
# Simulating (independent) Gaussian functional data with given center and
# covariance function
Data_1 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = Cov )
Data_2 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = Cov )
# Using the simulated data as (independent) components of a bivariate functional
# dataset
mfD = mfData( grid, list( Data_1, Data_2 ) )
# Correlation approx. zero (components were created independently)
cor_spearman( mfD, ordering = 'MEI' )
# Correlation approx. zero (components were created independently)
cor_spearman( mfD, ordering = 'MHI' )

#### TOTALLY DEPENDENT COMPONENTS
# Nonlinear transform of first component
Data_3 = t( apply( Data_1, 1, exp ) )
# Creating bivariate dataset starting from nonlinearly-dependent components
mfD = mfData( grid, list( Data_1, Data_3 ) )
# Correlation very high (components are nonlinearly dependent)
cor_spearman( mfD, ordering = 'MEI' )
# Correlation very high (components are nonlinearly dependent)
cor_spearman( mfD, ordering = 'MHI' )
This function computes the bootstrap estimates of standard error and bias of the Spearman's correlation coefficient for a multivariate functional dataset.
cor_spearman_accuracy( mfD, ordering = "MEI", bootstrap_iterations = 1000, verbose = FALSE )
mfD |
a multivariate functional dataset whose Spearman's correlation coefficient must be computed, in form of multivariate mfData object. |
ordering |
the ordering relation to use on functional observations, either "MEI" (default) or "MHI". |
bootstrap_iterations |
the number of bootstrap iterations to be used for estimation of bias and standard error. |
verbose |
a logical flag specifying whether to log information on the estimation progress. |
Given a multivariate functional dataset $X_1^{(i)}, \ldots, X_n^{(i)}$, defined over the grid $I = t_1, \ldots, t_P$ and having components $i = 1, \ldots, L$, and a chosen ordering strategy (MEI or MHI), the function computes the matrix of Spearman's correlation indices of the dataset components, as well as their bias and standard deviation estimates through a specified number of bootstrap iterations (bias and standard error are updated with on-line formulas).
A list of three elements: `mean`, the mean of the matrix of
correlation coefficients; `bias`, a matrix containing the estimated
bias (mean - point estimate of correlation coefficients); `sd`, a
matrix containing the estimated standard deviation of the coefficients'
matrix. In case the multivariate functional dataset has only two
components, the return type is scalar and not matrix.
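The relationship between the three returned quantities can be illustrated with a minimal base-R sketch. It is not the package's implementation: it assumes each component has already been reduced to one scalar score per observation (as an MEI or MHI ordering would do), it uses plain batch formulas instead of on-line updates, and the helper `boot_spearman` is a hypothetical name introduced only for this example.

```r
# Hypothetical helper: bootstrap mean, bias and sd of Spearman's correlation
# between two score vectors (stand-ins for per-observation MEI/MHI scores).
boot_spearman <- function(x, y, bootstrap_iterations = 1000) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  # Point estimate on the original sample
  point <- cor(x, y, method = "spearman")
  reps <- replicate(bootstrap_iterations, {
    idx <- sample.int(n, n, replace = TRUE)  # resample observations jointly
    cor(x[idx], y[idx], method = "spearman")
  })
  m <- mean(reps)
  # bias = bootstrap mean - point estimate, as described above
  list(mean = m, bias = m - point, sd = sd(reps))
}

set.seed(1)
x <- rnorm(200)
y <- x + rnorm(200, sd = 0.5)
res <- boot_spearman(x, y, bootstrap_iterations = 200)
str(res)
```

Resampling the indices jointly keeps each pair of scores together, which is what preserves the dependence between components in every bootstrap replicate.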
N <- 200
P <- 100
grid <- seq(0, 1, length.out = P)

# Creating an exponential covariance function to simulate Gaussian data
Cov <- exp_cov_function(grid, alpha = 0.3, beta = 0.4)

# Simulating (independent) Gaussian functional data with given center and covariance function
Data_1 <- generate_gauss_fdata(
  N = N,
  centerline = sin(2 * pi * grid),
  Cov = Cov
)
Data_2 <- generate_gauss_fdata(
  N = N,
  centerline = sin(2 * pi * grid),
  Cov = Cov
)

# Using the simulated data as (independent) components of a bivariate functional dataset
mfD <- mfData(grid, list(Data_1, Data_2))

# Computes bootstrap estimate of Spearman correlation
cor_spearman_accuracy(mfD, ordering = "MEI")
cor_spearman_accuracy(mfD, ordering = "MHI")
`S3` method to compute the sample covariance and cross-covariance
functions for a set of functional data.
cov_fun(X, Y = NULL)

## S3 method for class 'fData'
cov_fun(X, Y = NULL)

## S3 method for class 'mfData'
cov_fun(X, Y = NULL)
X |
is the (eventually first) functional dataset, i.e. either an object
of class |
Y |
is the (optional) second functional dataset to be used to compute the
cross-covariance function, either |
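Given a univariate random function X, defined over the grid $I = [a, b]$, the covariance
function is defined as:
$$C(s, t) = \mathrm{Cov}( X(s), X(t) ), \qquad s, t \in I.$$
Given another random function, Y, defined over the same grid as X, the cross-covariance function of X and Y is:
$$C^{X,Y}(s, t) = \mathrm{Cov}( X(s), Y(t) ), \qquad s, t \in I.$$
For a generic L-dimensional random function X, i.e. an L-dimensional multivariate functional datum, the covariance function is defined as the set of blocks:
$$C_{i,j}(s, t) = \mathrm{Cov}( X_i(s), X_j(t) ), \qquad i, j = 1, \ldots, L, \quad s, t \in I,$$
while the cross-covariance function is defined by the blocks:
$$C^{X,Y}_{i,j}(s, t) = \mathrm{Cov}( X_i(s), Y_j(t) ), \qquad i, j = 1, \ldots, L, \quad s, t \in I.$$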
The method `cov_fun` provides the sample estimator of the covariance or
cross-covariance functions for univariate or multivariate functional datasets.
The class of `X` (`fData` or `mfData`) is used to dispatch the
correct implementation of the method.
The following cases are given:
- if `X` is of class `fData` and `Y` is `NULL`, then the covariance function of `X` is returned;
- if `X` is of class `fData` and `Y` is of class `fData`, the cross-covariance function of the two datasets is returned;
- if `X` is of class `mfData` and `Y` is `NULL`, the upper-triangular blocks of the covariance function of `X` are returned (in form of list and by row, i.e. in the sequence 1_1, 1_2, ..., 1_L, 2_2, ... - have a look at the labels of the list with `str`);
- if `X` is of class `mfData` and `Y` is of class `fData`, the cross-covariances of `X`'s components and `Y` are returned (in form of list);
- if `X` is of class `mfData` and `Y` is of class `mfData`, the upper-triangular blocks of the cross-covariance of `X`'s and `Y`'s components are returned (in form of list and by row, i.e. in the sequence 1_1, 1_2, ..., 1_L, 2_2, ... - have a look at the labels of the list with `str`).
In any case, the return type is either an instance of the `S3`
class `Cov` or a list of instances of such class (for the case of multivariate
functional data).
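As a rough illustration of what these estimators compute on discretized data, the base-R sketch below uses `stats::cov` on plain matrices (rows are observations, columns are grid points). It is only a sketch under stated assumptions: it returns bare matrices rather than `Cov` objects, the `N - 1` normalization is the one used by `stats::cov` and is not confirmed to match the package, and the `i_j` labelling of blocks simply mimics the sequence described above.

```r
set.seed(1)
N <- 50
P <- 20
D1 <- matrix(rnorm(N * P), nrow = N, ncol = P)  # rows = observations
D2 <- matrix(rnorm(N * P), nrow = N, ncol = P)

# Sample covariance function of D1 on the grid:
# entry (j, k) estimates Cov(X(t_j), X(t_k)).
C_hat <- cov(D1)

# Sample cross-covariance of D1 and D2:
# entry (j, k) estimates Cov(X(t_j), Y(t_k)).
CC_hat <- cov(D1, D2)

# Upper-triangular blocks for a multivariate dataset, stored by row
# in the sequence 1_1, 1_2, ..., 2_2, ...
comps <- list(D1, D2)
blocks <- list()
for (i in seq_along(comps)) {
  for (j in i:length(comps)) {
    blocks[[paste(i, j, sep = "_")]] <- cov(comps[[i]], comps[[j]])
  }
}
names(blocks)
```

Only the upper-triangular blocks are stored because block `j_i` is the transpose of block `i_j`.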
# Generating a univariate functional dataset
N = 1e2

P = 1e2
t0 = 0
t1 = 1

time_grid = seq( t0, t1, length.out = P )

Cov = exp_cov_function( time_grid, alpha = 0.3, beta = 0.4 )

D1 = generate_gauss_fdata( N, centerline = sin( 2 * pi * time_grid ), Cov = Cov )
D2 = generate_gauss_fdata( N, centerline = sin( 2 * pi * time_grid ), Cov = Cov )

fD1 = fData( time_grid, D1 )
fD2 = fData( time_grid, D2 )

# Computing the covariance function of fD1
C = cov_fun( fD1 )
str( C )

# Computing the cross-covariance function of fD1 and fD2
CC = cov_fun( fD1, fD2 )
str( CC )

# Generating a multivariate functional dataset
L = 3

C1 = exp_cov_function( time_grid, alpha = 0.1, beta = 0.2 )
C2 = exp_cov_function( time_grid, alpha = 0.2, beta = 0.5 )
C3 = exp_cov_function( time_grid, alpha = 0.3, beta = 1 )

centerline = matrix( c( sin( 2 * pi * time_grid ),
                        sqrt( time_grid ),
                        10 * ( time_grid - 0.5 ) * time_grid ),
                     nrow = 3, byrow = TRUE )

D3 = generate_gauss_mfdata( N, L, centerline,
                            correlations = c( 0.5, 0.5, 0.5 ),
                            listCov = list( C1, C2, C3 ) )

# adding names for better readability of BC3's labels
names( D3 ) = c( 'comp1', 'comp2', 'comp3' )
mfD3 = mfData( time_grid, D3 )

D1 = generate_gauss_fdata( N, centerline = sin( 2 * pi * time_grid ), Cov = Cov )
fD1 = fData( time_grid, D1 )

# Computing the block covariance function of mfD3
BC3 = cov_fun( mfD3 )
str( BC3 )

# computing cross-covariance between mfData and fData objects
CC = cov_fun( mfD3, fD1 )
str( CC )
This function computes the three 'DepthGram' representations from a p-variate functional data set.
depthgram(
  Data,
  marginal_outliers = FALSE,
  boxplot_factor = 1.5,
  outliergram_factor = 1.5,
  ids = NULL
)

## Default S3 method:
depthgram(
  Data,
  marginal_outliers = FALSE,
  boxplot_factor = 1.5,
  outliergram_factor = 1.5,
  ids = NULL
)

## S3 method for class 'fData'
depthgram(
  Data,
  marginal_outliers = FALSE,
  boxplot_factor = 1.5,
  outliergram_factor = 1.5,
  ids = NULL
)

## S3 method for class 'mfData'
depthgram(
  Data,
  marginal_outliers = FALSE,
  boxplot_factor = 1.5,
  outliergram_factor = 1.5,
  ids = NULL
)
Data |
A |
marginal_outliers |
A boolean specifying whether the function should
return shape and amplitude outliers over each dimension. Defaults to
|
boxplot_factor |
A numeric value specifying the inflation factor for
marginal functional boxplots. This is ignored if |
outliergram_factor |
A numeric value specifying the inflation factor for
marginal outliergrams. This is ignored if |
ids |
A character vector specifying labels for individual observations.
Defaults to |
An object of class `depthgram` which is a list with the following
items:
- `mbd.mei.d`: vector MBD of the MEI dimension-wise.
- `mei.mbd.d`: vector MEI of the MBD dimension-wise.
- `mbd.mei.t`: vector MBD of the MEI time-wise.
- `mei.mbd.t`: vector MEI of the MBD time-wise.
- `mbd.mei.t2`: vector MBD of the MEI time/correlation-wise.
- `mei.mbd.t2`: vector MEI of the MBD time/correlation-wise.
- `shp.out.det`: detected shape outliers by dimension.
- `mag.out.det`: detected magnitude outliers by dimension.
- `mbd.d`: matrix `n x p` of MBD dimension-wise.
- `mei.d`: matrix `n x p` of MEI dimension-wise.
- `mbd.t`: matrix `n x p` of MBD time-wise.
- `mei.t`: matrix `n x p` of MEI time-wise.
- `mbd.t2`: matrix `n x p` of MBD time/correlation-wise.
- `mei.t2`: matrix `n x p` of MEI time/correlation-wise.
Aleman-Gomez, Y., Arribas-Gil, A., Desco, M., Elias-Fernandez, A., and Romo, J. (2021). "Depthgram: Visualizing Outliers in High Dimensional Functional Data with application to Task fMRI data exploration".
N <- 2e2
P <- 1e3
grid <- seq(0, 1, length.out = P)
Cov <- exp_cov_function(grid, alpha = 0.3, beta = 0.4)

Data <- list()
Data[[1]] <- generate_gauss_fdata(
  N,
  centerline = sin(2 * pi * grid),
  Cov = Cov
)
Data[[2]] <- generate_gauss_fdata(
  N,
  centerline = sin(2 * pi * grid),
  Cov = Cov
)
names <- paste0("id_", 1:nrow(Data[[1]]))
DG1 <- depthgram(Data, marginal_outliers = TRUE, ids = names)

fD <- fData(grid, Data[[1]])
DG2 <- depthgram(fD, marginal_outliers = TRUE, ids = names)

mfD <- mfData(grid, Data)
DG3 <- depthgram(mfD, marginal_outliers = TRUE, ids = names)
This function computes the Epigraphic Index (EI) of elements of a univariate functional dataset.
EI(Data)

## S3 method for class 'fData'
EI(Data)

## Default S3 method:
EI(Data)
Data |
either an |
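Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$,
defined over a compact interval $I = [a, b]$, this function computes the
EI, i.e.:
$$EI( X(t) ) = \frac{1}{N} \sum_{i=1}^{N} I\left( G( X_i(t) ) \subset epi( X(t) ) \right) = \frac{1}{N} \sum_{i=1}^{N} I\left( X_i(t) \geq X(t), \ \forall t \in I \right),$$
where $G( X_i(t) )$ indicates the graph of $X_i(t)$,
and $epi( X(t) )$ indicates the epigraph of $X(t)$.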
The function returns a vector containing the values of EI for each
element of the functional dataset provided in `Data`.
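A literal reading of the definition can be sketched in base R as follows. This is only an illustration, assuming that EI is the proportion of curves in the dataset lying on or above the given curve at every grid point; the helper `ei_sketch` is hypothetical, works on a plain matrix (rows are curves) and is not guaranteed to reproduce the package's output in every case, e.g. when curves cross.

```r
# Hypothetical helper: EI of each row of a matrix of discretized curves
# (rows = observations, columns = grid points).
ei_sketch <- function(values) {
  n <- nrow(values)
  vapply(seq_len(n), function(k) {
    # TRUE for each curve lying on or above curve k at every grid point
    above <- apply(values, 1, function(x) all(x >= values[k, ]))
    sum(above) / n
  }, numeric(1))
}

grid <- seq(0, 1, length.out = 5)
# Three vertically shifted copies of the same curve
values <- rbind(grid, grid + 1, grid + 2)
ei_sketch(values)
```

With vertically shifted curves the lowest one has every curve (itself included) in its epigraph, so its index is the largest.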
Lopez-Pintado, S. and Romo, J. (2012). A half-region depth for functional data, Computational Statistics and Data Analysis, 55, 1679-1695.
Arribas-Gil, A., and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15(4), 603-619.
N = 20
P = 1e2

grid = seq( 0, 1, length.out = P )

C = exp_cov_function( grid, alpha = 0.2, beta = 0.3 )

Data = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), C )
fD = fData( grid, Data )

EI( fD )

EI( Data )
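This function computes the discretization of an exponential covariance function of the form:
$$C( s, t ) = \alpha e^{ - \beta | s - t | }$$
over a 1D grid $[ t_0, t_1, \ldots, t_{P-1} ]$, thus obtaining the $P \times P$
matrix of values:
$$C_{i,j} = C( t_i, t_j ) = \alpha e^{ - \beta | t_i - t_j | }.$$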
exp_cov_function(grid, alpha, beta)
grid |
a vector of time points. |
alpha |
the alpha parameter in the exponential covariance formula. |
beta |
the beta parameter in the exponential covariance formula. |
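For intuition, the discretization can be reproduced in a couple of lines of base R. The sketch below assumes the covariance has the form `alpha * exp(-beta * |s - t|)`; the helper name `exp_cov_sketch` is hypothetical and is not part of the package.

```r
# Hypothetical helper: P x P matrix with entries
# alpha * exp(-beta * |t_i - t_j|) for a grid of P points.
exp_cov_sketch <- function(grid, alpha, beta) {
  alpha * exp(-beta * abs(outer(grid, grid, "-")))
}

grid <- seq(0, 1, length.out = 5)
C <- exp_cov_sketch(grid, alpha = 0.2, beta = 0.3)
dim(C)
```

Under this form, `alpha` is the variance at every grid point (the diagonal of the matrix) and `beta` controls how quickly the covariance decays with the distance between two points.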
`generate_gauss_fdata`, `generate_gauss_mfdata`
grid = seq( 0, 1, length.out = 5e2 )

alpha = 0.2
beta = 0.3

dev.new()
image( exp_cov_function( grid, alpha, beta ),
       main = 'Exponential covariance function',
       xlab = 'grid', ylab = 'grid')
This function can be used to perform the functional boxplot of univariate or multivariate functional data.
fbplot(
  Data,
  Depths = "MBD",
  Fvalue = 1.5,
  adjust = FALSE,
  display = TRUE,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  ...
)

## S3 method for class 'fData'
fbplot(
  Data,
  Depths = "MBD",
  Fvalue = 1.5,
  adjust = FALSE,
  display = TRUE,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  ...
)

## S3 method for class 'mfData'
fbplot(
  Data,
  Depths = list(def = "MBD", weights = "uniform"),
  Fvalue = 1.5,
  adjust = FALSE,
  display = TRUE,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  ...
)
Data |
the univariate or multivariate functional dataset whose
functional boxplot must be determined, in form of |
Depths |
either a vector containing the depths for each element of the dataset, or:
In both cases the name of the functions to compute depths must be available in the caller's environment. |
Fvalue |
the value of the inflation factor |
adjust |
either
|
display |
either a logical value indicating whether you want the functional boxplot to be displayed, or the number of the graphical device where you want the functional boxplot to be displayed. |
xlab |
the label to use on the x axis when displaying the functional boxplot. |
ylab |
the label (or list of labels for the multivariate functional case) to use on the y axis when displaying the functional boxplot. |
main |
the main title (or list of titles for the multivariate functional case) to be used when displaying the functional boxplot. |
... |
additional graphical parameters to be used in plotting functions. |
Even when used graphically to plot the functional boxplot, the function returns a list of three elements:
- `Depths`: contains the depths of each element of the functional dataset.
- `Fvalue`: is the value of F used to obtain the outliers.
- `ID_out`: contains the vector of indices of dataset elements flagged as outliers (if any).
In the univariate functional case, when the adjustment option is
selected, the value of $F$ is optimized for the univariate functional
dataset provided with `Data`.
In practice, a number `adjust$N_trials` of times a synthetic population
(of size `adjust$trial_size`) with the same covariance (robustly
estimated from data) and centerline as `fData` is simulated without
outliers, and each time an optimized value $F_i$ is computed so that a
given proportion (`adjust$TPR`) of observations is flagged as outliers.
The final value of `F` for the functional boxplot is determined as an
average of $F_1, F_2, \ldots, F_{N_{trials}}$. At each time step the
optimization problem is solved using `stats::uniroot` (Brent's method).
Sun, Y., & Genton, M. G. (2012). Functional boxplots. Journal of Computational and Graphical Statistics.
Sun, Y., & Genton, M. G. (2012). Adjusted functional boxplots for spatio-temporal data visualization and outlier detection. Environmetrics, 23(1), 54-64.
`fData`, `MBD`, `BD`, `mfData`, `multiMBD`, `multiBD`
# UNIVARIATE FUNCTIONAL BOXPLOT - NO ADJUSTMENT

set.seed(1)

N = 2 * 100 + 1
P = 2e2

grid = seq( 0, 1, length.out = P )

D = 10 * matrix( sin( 2 * pi * grid ), nrow = N, ncol = P, byrow = TRUE )
D = D + rexp(N, rate = 0.05)

# c( 0, 1 : (( N - 1 )/2), -( ( ( N - 1 ) / 2 ) : 1 ) )^4

fD = fData( grid, D )

dev.new()
oldpar <- par(mfrow = c(1, 1))
par(mfrow = c(1, 3))

plot( fD, lwd = 2, main = 'Functional dataset',
      xlab = 'time', ylab = 'values' )

fbplot( fD, main = 'Functional boxplot', xlab = 'time', ylab = 'values', Fvalue = 1.5 )

boxplot(fD$values[,1], ylim = range(fD$values), main = 'Boxplot of functional dataset at t_0 ' )

par(oldpar)

# UNIVARIATE FUNCTIONAL BOXPLOT - WITH ADJUSTMENT

set.seed( 161803 )

P = 2e2
grid = seq( 0, 1, length.out = P )

N = 1e2

# Generating a univariate synthetic gaussian dataset
Data = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ),
                             Cov = exp_cov_function( grid,
                                                     alpha = 0.3,
                                                     beta = 0.4 ) )
fD = fData( grid, Data )

dev.new()
fbplot( fD, adjust = list( N_trials = 10,
                           trial_size = 5 * N,
                           VERBOSE = TRUE ),
        xlab = 'time', ylab = 'Values',
        main = 'My adjusted functional boxplot' )

# MULTIVARIATE FUNCTIONAL BOXPLOT - NO ADJUSTMENT

set.seed( 1618033 )

P = 1e2
N = 1e2
L = 2

grid = seq( 0, 1, length.out = 1e2 )

C1 = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )
C2 = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )

# Generating a bivariate functional dataset of gaussian data with partially
# correlated components
Data = generate_gauss_mfdata( N, L,
                              centerline = matrix( sin( 2 * pi * grid ),
                                                   nrow = 2, ncol = P,
                                                   byrow = TRUE ),
                              correlations = rep( 0.5, 1 ),
                              listCov = list( C1, C2 ) )

mfD = mfData( grid, Data )

dev.new()
fbplot( mfD, Fvalue = 2.5, xlab = 'time', ylab = list( 'Values 1',
                                                       'Values 2' ),
        main = list( 'First component', 'Second component' ) )
`S3` class for univariate functional datasets. This function implements a constructor for elements of `S3`
class `fData`, aimed at implementing a representation of a functional
dataset.
fData(grid, values)
grid |
the evenly spaced grid over which the functional observations are
measured. It must be a numeric vector of length |
values |
the values of the observations in the functional dataset,
provided in form of a 2D data structure (e.g. matrix or array) having as
rows the observations and as columns their measurements over the 1D grid of
length |
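The functional dataset is represented as a collection of measurements of the
observations on an evenly spaced, 1D grid of discrete points (representing,
e.g. time), namely, for functional data defined over a grid $[t_0, t_1, \ldots, t_{P-1}]$ with constant step $h$:
$$f_{i,j} = f_i( t_0 + j h ), \quad i = 1, \ldots, N, \quad j = 0, \ldots, P - 1.$$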
The function returns an `S3` object of class `fData`, containing
the following elements:
- `N`: the number of elements in the dataset;
- `P`: the number of points in the 1D grid over which elements are measured;
- `t0`: the starting point of the 1D grid;
- `tP`: the ending point of the 1D grid;
- `values`: the matrix of measurements of the functional observations on the 1D grid provided with `grid`.
`generate_gauss_fdata`, `sub-.fData`
# Defining parameters
N = 20
P = 1e2

# One dimensional grid
grid = seq( 0, 1, length.out = P )

# Generating an exponential covariance function (see related help for more
# information )
C = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )

# Generating a synthetic dataset with a gaussian distribution and
# required mean and covariance function:
values = generate_gauss_fdata( N,
                               centerline = sin( 2 * pi * grid ),
                               Cov = C )

fD = fData( grid, values )
This function can be used to generate a palette of colors useful to plot
functional datasets with the `plot` methods.
fDColorPalette(N, hue_range = c(0, 360), alpha = 0.8, ...)
N |
number of different colors (ideally, functional observations). |
hue_range |
the range of hues in the HCL scheme. |
alpha |
the alpha channel parameter(s) of the colors (transparency). |
... |
additional parameters to be passed to |
The function, built around `scales::hue_pal`, lets you set the
HCL parameters of the desired set of colors, as well as the
alpha channel value.
N = 1e2
angular_grid = seq( 0, 359, length.out = N )

dev.new()
plot( angular_grid, angular_grid,
      col = fDColorPalette( N, hue_range = c( 0, 359 ), alpha = 1 ),
      pch = 16, cex = 3 )
`generate_gauss_fdata` generates a dataset of univariate functional data
with a desired mean and covariance function.
generate_gauss_fdata(N, centerline, Cov = NULL, CholCov = NULL)
N |
the number of distinct functional observations to generate. |
centerline |
the centerline of the distribution, represented as a one-dimensional data structure of length |
Cov |
the covariance operator (provided in form of a |
CholCov |
the Cholesky factor of the covariance operator (provided in
form of a |
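In particular, the following model is considered for the generation of data:
$$X(t) = m(t) + \epsilon(t), \quad \textrm{for all } t \in I = [a, b],$$
where $m(t)$ is the center and $\epsilon(t)$ is a centered gaussian
process with covariance function $C$.
That is to say:
$$\mathrm{Cov}( \epsilon(s), \epsilon(t) ) = C( s, t ), \quad \textrm{with } s, t \in I.$$
All the functions are supposed to be observed on an evenly-spaced, one-dimensional
grid of P points: $[a = t_0, t_1, \ldots, t_{P-1} = b] \subset I$.
The function returns a matrix containing the discretized
values of the generated observations (in form of an $N \times P$ matrix).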
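One standard way to simulate from such a model is through the Cholesky factor of the covariance matrix. The base-R sketch below is an illustration under that assumption, not a copy of the package's code; `gauss_fdata_sketch` is a hypothetical helper, and the covariance matrix is built by hand with an exponential form chosen only for the example.

```r
# Hypothetical helper: N Gaussian curves with mean `centerline` and
# covariance matrix `Cov`, observed on a grid of P points.
gauss_fdata_sketch <- function(N, centerline, Cov) {
  P <- length(centerline)
  stopifnot(all(dim(Cov) == c(P, P)))
  U <- chol(Cov)                                 # upper factor: t(U) %*% U == Cov
  Z <- matrix(rnorm(N * P), nrow = N, ncol = P)  # independent standard normals
  eps <- Z %*% U                                 # each row now has covariance Cov
  sweep(eps, 2, centerline, "+")                 # add the center to every row
}

set.seed(1)
grid <- seq(0, 1, length.out = 50)
# Illustrative exponential covariance matrix (assumed form)
Cov <- 0.1 * exp(-0.2 * abs(outer(grid, grid, "-")))
X <- gauss_fdata_sketch(30, sin(2 * pi * grid), Cov)
dim(X)
```

Accepting a precomputed Cholesky factor, as the `CholCov` argument does, avoids repeating the factorization when many datasets share the same covariance.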
`exp_cov_function`, `fData`, `generate_gauss_mfdata`
N = 30
P = 1e2

t0 = 0
tP = 1

time_grid = seq( t0, tP, length.out = P )

C = exp_cov_function( time_grid, alpha = 0.1, beta = 0.2 )

CholC = chol( C )

centerline = sin( 2 * pi * time_grid )

invisible(generate_gauss_fdata( N, centerline, Cov = C ))

invisible(generate_gauss_fdata( N, centerline, CholCov = CholC ))
`generate_gauss_mfdata` generates a dataset of multivariate functional
data with a desired mean and covariance function in each dimension and a
desired correlation structure among components.
generate_gauss_mfdata(
  N,
  L,
  centerline,
  correlations,
  listCov = NULL,
  listCholCov = NULL
)
N |
the number of distinct functional observations to generate. |
L |
the number of components of the multivariate functional data. |
centerline |
the centerline of the distribution, represented as a 2-dimensional data structure with L rows (one for each dimension) having the measurements along the grid as columns. |
correlations |
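is the vector containing the $\frac{1}{2} L (L - 1)$ correlation coefficients $\rho_{i,j}$ of the model generating data, in the order $(\rho_{1,2}, \ldots, \rho_{1,L}, \rho_{2,3}, \ldots, \rho_{2,L}, \ldots, \rho_{L-1,L})$,
that is to say, the row-wise, upper triangular part of the correlation matrix without the diagonal. |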
listCov |
a list containing the |
listCholCov |
the Cholesky factor of the |
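In particular, the following model is considered for the generation of data:
$$X(t) = ( m_1(t) + \epsilon_1(t), \ldots, m_L(t) + \epsilon_L(t) ), \quad \textrm{for all } t \in I = [a, b],$$
where $L$ is the number of components of the multivariate functional
random variable, $m_i(t)$ is the $i$-th component of the center and
$\epsilon_i(t)$ is a centered gaussian process with covariance function
$C_i$. That is to say:
$$\mathrm{Cov}( \epsilon_i(s), \epsilon_i(t) ) = C_i( s, t ), \quad \textrm{with } i = 1, \ldots, L, \quad \textrm{and with } s, t \in I.$$
A correlation structure among $\epsilon_1(t), \ldots, \epsilon_L(t)$ is
allowed in the following way:
$$\mathrm{Cor}( \epsilon_i(t), \epsilon_j(t) ) = \rho_{i,j}, \quad \textrm{with } i \neq j, \quad \textrm{and with } t \in I.$$
All the functions are supposed to be observed on an evenly-spaced, one-dimensional
grid of P points: $[a = t_0, t_1, \ldots, t_{P-1} = b] \subset I$.
The function returns a list of L matrices, one for each component of
the multivariate functional random variable, containing the discretized
values of the generated observations (in form of $N \times P$ matrices).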
`exp_cov_function`, `mfData`, `generate_gauss_fdata`
N = 30
P = 1e2
L = 3

time_grid = seq( 0, 1, length.out = P )

C1 = exp_cov_function( time_grid, alpha = 0.1, beta = 0.2 )
C2 = exp_cov_function( time_grid, alpha = 0.2, beta = 0.5 )
C3 = exp_cov_function( time_grid, alpha = 0.3, beta = 1 )

centerline = matrix( c( sin( 2 * pi * time_grid ),
                        sqrt( time_grid ),
                        10 * ( time_grid - 0.5 ) * time_grid ),
                     nrow = 3, byrow = TRUE )

generate_gauss_mfdata( N, L, centerline,
                       correlations = c( 0.5, 0.5, 0.5 ),
                       listCov = list( C1, C2, C3 ) )

CholC1 = chol( C1 )
CholC2 = chol( C2 )
CholC3 = chol( C3 )

generate_gauss_mfdata( N, L, centerline,
                       correlations = c( 0.5, 0.5, 0.5 ),
                       listCholCov = list( CholC1, CholC2, CholC3 ) )
This function computes the Hypograph Index (HI) of elements of a univariate functional dataset.
HI(Data)

## S3 method for class 'fData'
HI(Data)

## Default S3 method:
HI(Data)
Data |
either an |
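Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$,
defined over a compact interval $I = [a, b]$, this function computes the
HI, i.e.:
$$HI( X(t) ) = \frac{1}{N} \sum_{i=1}^{N} I\left( G( X_i(t) ) \subset hypo( X(t) ) \right) = \frac{1}{N} \sum_{i=1}^{N} I\left( X_i(t) \leq X(t), \ \forall t \in I \right),$$
where $G( X_i(t) )$ indicates the graph of $X_i(t)$,
and $hypo( X(t) )$ indicates the hypograph of $X(t)$.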
The function returns a vector containing the values of HI for each
element of the functional dataset provided in `Data`.
Lopez-Pintado, S. and Romo, J. (2012). A half-region depth for functional data, Computational Statistics and Data Analysis, 55, 1679-1695.
Arribas-Gil, A., and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15(4), 603-619.
N = 20
P = 1e2

grid = seq( 0, 1, length.out = P )

C = exp_cov_function( grid, alpha = 0.2, beta = 0.3 )

Data = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), C )
fD = fData( grid, Data )

HI( fD )

HI( Data )
This function computes the Half-Region Depth (HRD) of elements of a univariate functional dataset.
HRD(Data)

## S3 method for class 'fData'
HRD(Data)

## Default S3 method:
HRD(Data)
Data |
either an |
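Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$,
defined over a compact interval $I = [a, b]$, this function computes the HRD
of its elements, i.e.:
$$HRD( X(t) ) = \min( EI( X(t) ), HI( X(t) ) ),$$
where $EI( X(t) )$ indicates the Epigraph Index (EI) of $X(t)$ with
respect to the dataset, and $HI( X(t) )$ indicates the Hypograph Index of
$X(t)$ with respect to the dataset.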
The function returns a vector containing the values of HRD for each
element of the functional dataset provided in `Data`.
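The way HRD combines the two indices can be sketched in base R. The code below is an illustration only: it assumes EI is the proportion of curves lying on or above a given curve everywhere, HI the proportion lying on or below it everywhere, and HRD the minimum of the two. The helper `hrd_sketch` is hypothetical and works on a plain matrix whose rows are curves.

```r
# Hypothetical helper: literal EI, HI and HRD for discretized curves
# (rows = observations, columns = grid points).
hrd_sketch <- function(values) {
  idx <- seq_len(nrow(values))
  # Proportion of curves entirely on or above curve k
  ei <- vapply(idx, function(k) {
    mean(apply(values, 1, function(x) all(x >= values[k, ])))
  }, numeric(1))
  # Proportion of curves entirely on or below curve k
  hi <- vapply(idx, function(k) {
    mean(apply(values, 1, function(x) all(x <= values[k, ])))
  }, numeric(1))
  pmin(ei, hi)
}

grid <- seq(0, 1, length.out = 5)
# Three vertically shifted copies of the same curve
values <- rbind(grid, grid + 1, grid + 2)
hrd_sketch(values)
```

Taking the minimum means a curve is deep only if it has enough curves both above and below it, so the middle curve of the shifted set gets the largest value.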
Lopez-Pintado, S. and Romo, J. (2012). A half-region depth for functional data, Computational Statistics and Data Analysis, 55, 1679-1695.
Arribas-Gil, A., and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15(4), 603-619.
N = 20
P = 1e2

grid = seq( 0, 1, length.out = P )

C = exp_cov_function( grid, alpha = 0.2, beta = 0.3 )

Data = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), C )
fD = fData( grid, Data )

HRD( fD )

HRD( Data )
This function implements an order relation between univariate functional data based on the maximum relation, that is to say a pre-order relation obtained by comparing the maxima of two different functional data.
max_ordered(fData, gData)
fData |
the first univariate functional dataset containing elements to
be compared, in form of |
gData |
the second univariate functional dataset containing elements to
be compared, in form of |
Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$,
and another functional dataset $Y_1(t), Y_2(t), \ldots, Y_M(t)$,
defined over the same compact interval $I = [a, b]$, the function computes
the maxima in both the datasets, and checks whether the first ones are less than
or equal to the second ones.
By default the function tries to compare each $X_i(t)$ with the
corresponding $Y_i(t)$, thus assuming $N = M$, but when either $N = 1$
or $M = 1$, the comparison is carried out cycling over the
dataset with fewer elements. In all the other cases ($N \neq M$
with both $N \neq 1$ and $M \neq 1$) the function stops.
The function returns a logical vector of length $\max(N, M)$
containing the value of the predicate for all the corresponding elements.
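The comparison and the recycling rule can be sketched in base R on plain matrices (rows are curves on a common grid). This is a hedged illustration rather than the package's code; `max_ordered_sketch` is a hypothetical helper and its error message is invented for the example.

```r
# Hypothetical helper: compares row-wise maxima of two sets of curves.
max_ordered_sketch <- function(f_values, g_values) {
  n <- nrow(f_values)
  m <- nrow(g_values)
  # Sizes must match unless one dataset contains a single curve
  if (n != m && n != 1 && m != 1) {
    stop("datasets must have equal size, or one must contain a single curve")
  }
  f_max <- apply(f_values, 1, max)
  g_max <- apply(g_values, 1, max)
  # R recycles the length-one vector when n or m equals 1
  unname(f_max <= g_max)
}

grid <- seq(0, 1, length.out = 5)
f_values <- rbind(1 * grid, 2 * grid)                      # maxima 1 and 2
g_values <- matrix(3 * (0.5 - abs(grid - 0.5)), nrow = 1)  # maximum 1.5
max_ordered_sketch(f_values, g_values)
```

Because only the maxima are compared, two curves can be ordered this way even when they cross.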
Valencia, D., Romo, J. and Lillo, R. (2015). A Kendall correlation
coefficient for functional dependence, Universidad Carlos III de Madrid
technical report, http://EconPapers.repec.org/RePEc:cte:wsrepe:ws133228.
`maxima`, `minima`, `fData`, `area_ordered`
P = 1e2

grid = seq( 0, 1, length.out = P )

Data_1 = matrix( c( 1 * grid,
                    2 * grid ),
                 nrow = 2, ncol = P, byrow = TRUE )

Data_2 = matrix( 3 * ( 0.5 - abs( grid - 0.5 ) ),
                 nrow = 1, byrow = TRUE )

Data_3 = rbind( Data_1, Data_1 )

fD_1 = fData( grid, Data_1 )
fD_2 = fData( grid, Data_2 )
fD_3 = fData( grid, Data_3 )

max_ordered( fD_1, fD_2 )

max_ordered( fD_2, fD_3 )
P = 1e2 grid = seq( 0, 1, length.out = P ) Data_1 = matrix( c( 1 * grid, 2 * grid ), nrow = 2, ncol = P, byrow = TRUE ) Data_2 = matrix( 3 * ( 0.5 - abs( grid - 0.5 ) ), nrow = 1, byrow = TRUE ) Data_3 = rbind( Data_1, Data_1 ) fD_1 = fData( grid, Data_1 ) fD_2 = fData( grid, Data_2 ) fD_3 = fData( grid, Data_3 ) max_ordered( fD_1, fD_2 ) max_ordered( fD_2, fD_3 )
This function computes the maximum value of each element of a univariate functional dataset, optionally also returning the grid points where the maxima are attained.
maxima(fData, ..., which = FALSE)
fData |
the functional dataset containing elements whose maxima have to be computed, in form of fData object. |
... |
additional parameters. |
which |
logical flag specifying whether the grid values where maxima are attained have to be returned too. |
If which = FALSE, the function returns a vector containing the maxima for each element of the functional dataset; if which = TRUE, the function returns a data.frame whose field value contains the values of maxima, and grid contains the grid points where maxima are reached.
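The two return shapes can be mimicked on a plain matrix with base R. This is a sketch under the assumption that the first grid point attaining the maximum is reported; `maxima_sketch` is a hypothetical name and not the package's code.

```r
# Hypothetical helper (not part of roahd): row-wise maxima of a matrix of
# curves, optionally paired with the grid point where each is attained.
maxima_sketch <- function(grid, values, which = FALSE) {
  vals <- apply(values, 1, max)
  if (!which) {
    return(vals)
  }
  idx <- apply(values, 1, which.max)  # index of the first maximum per row
  data.frame(grid = grid[idx], value = vals)
}

grid <- seq(0, 1, length.out = 101)
values <- matrix(c(1 * grid, 3 * (0.5 - abs(grid - 0.5))),
                 nrow = 2, byrow = TRUE)
res_vec <- maxima_sketch(grid, values)
res_df <- maxima_sketch(grid, values, which = TRUE)
```

The first curve peaks at the right end of the grid, the second one at its midpoint.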
P = 1e3
grid = seq( 0, 1, length.out = P )
Data = matrix( c( 1 * grid,
                  2 * grid,
                  3 * ( 0.5 - abs( grid - 0.5 ) ) ),
               nrow = 3, ncol = P, byrow = TRUE )
fD = fData( grid, Data )
maxima( fD, which = TRUE )
This function computes the Modified Band Depth (MBD) of elements of a functional dataset.
MBD(Data, manage_ties = FALSE)

## S3 method for class 'fData'
MBD(Data, manage_ties = FALSE)

## Default S3 method:
MBD(Data, manage_ties = FALSE)
Data |
either a |
manage_ties |
a logical flag specifying whether a check for ties and
relative treatment must be carried out or not (default is |
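Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$, defined over a compact interval $I = [a, b]$, this function computes the sample MBD of each element with respect to the other elements of the dataset, i.e.:
$$MBD( X(t) ) = {N \choose 2}^{-1} \sum_{1 \leq i_1 < i_2 \leq N} \tilde{\lambda}\big( \{ t : \min( X_{i_1}(t), X_{i_2}(t) ) \leq X(t) \leq \max( X_{i_1}(t), X_{i_2}(t) ) \} \big),$$
where $\tilde{\lambda}(\cdot)$ is the normalized Lebesgue measure over $I = [a, b]$, that is $\tilde{\lambda}(A) = \lambda(A) / (b - a)$.
See the References section for more details.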
The function returns a vector containing the values of MBD for the given dataset.
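As an illustration of the definition, the following base-R sketch loops over all pairs of curves and records, for every curve, the proportion of grid points at which it lies inside the band spanned by the pair. It assumes an evenly spaced grid, so that this proportion approximates the normalized Lebesgue measure; `mbd_sketch` is a hypothetical helper, deliberately simple rather than efficient, and is not the package's implementation.

```r
# Hypothetical helper (not part of roahd): MBD computed directly from its
# definition on a matrix with one curve per row.
mbd_sketch <- function(D) {
  pairs <- combn(nrow(D), 2)       # all pairs i1 < i2
  depth <- numeric(nrow(D))
  for (k in seq_len(ncol(pairs))) {
    lower <- pmin(D[pairs[1, k], ], D[pairs[2, k], ])
    upper <- pmax(D[pairs[1, k], ], D[pairs[2, k], ])
    # TRUE where a curve lies inside the band at a given grid point
    inside <- sweep(D, 2, lower, ">=") & sweep(D, 2, upper, "<=")
    depth <- depth + rowMeans(inside)
  }
  depth / ncol(pairs)
}

# Three constant curves at levels 0, 1 and 2, observed on 5 grid points
D <- matrix(rep(c(0, 1, 2), times = 5), nrow = 3)
depths <- mbd_sketch(D)
```

The middle curve lies inside every band and gets depth 1, while each outer curve lies inside only the two bands it helps to define and gets depth 2/3.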
Lopez-Pintado, S. and Romo, J. (2009). On the Concept of Depth for Functional Data, Journal of the American Statistical Association, 104, 718-734.
Lopez-Pintado, S. and Romo, J. (2007). Depth-based inference for functional data, Computational Statistics & Data Analysis 51, 4957-4968.
BD, MBD_relative, BD_relative, fData
grid = seq( 0, 1, length.out = 1e2 )
D = matrix( c( 1 + sin( 2 * pi * grid ),
               0 + sin( 4 * pi * grid ),
               1 - sin( pi * ( grid - 0.2 ) ),
               0.1 + cos( 2 * pi * grid ),
               0.5 + sin( 3 * pi + grid ),
               -2 + sin( pi * grid ) ),
            nrow = 6, ncol = length( grid ), byrow = TRUE )
fD = fData( grid, D )
MBD( fD )
MBD( D )
This function computes the Modified Band Depth (MBD) of elements of a univariate functional dataset with respect to another univariate functional dataset.
MBD_relative(Data_target, Data_reference)

## S3 method for class 'fData'
MBD_relative(Data_target, Data_reference)

## Default S3 method:
MBD_relative(Data_target, Data_reference)
Data_target |
is the univariate functional dataset, provided either as
an |
Data_reference |
is the dataset, provided either as an |
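Given a univariate functional dataset of elements $X_1(t), X_2(t), \ldots, X_N(t)$, and another univariate functional dataset of elements $Y_1(t), Y_2(t), \ldots, Y_M(t)$, defined over the same compact interval $I = [a, b]$, this function computes the MBD of elements of the former with respect to elements of the latter, i.e.:
$$MBD( X_i; Y_1, \ldots, Y_M ) = {M \choose 2}^{-1} \sum_{1 \leq i_1 < i_2 \leq M} \tilde{\lambda}\big( \{ t : \min( Y_{i_1}(t), Y_{i_2}(t) ) \leq X_i(t) \leq \max( Y_{i_1}(t), Y_{i_2}(t) ) \} \big), \quad \forall i = 1, \ldots, N,$$
where $\tilde{\lambda}(\cdot)$ is the normalized Lebesgue measure over $I = [a, b]$, that is $\tilde{\lambda}(A) = \lambda(A) / (b - a)$.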
The function returns a vector containing the MBD of elements in Data_target with respect to elements in Data_reference.
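The relative version differs from the plain MBD only in that the bands are built from the reference curves while membership is checked for the target curves. A minimal base-R sketch, assuming an evenly spaced grid and matrices with one curve per row (`mbd_relative_sketch` is a hypothetical helper, not the package's implementation):

```r
# Hypothetical helper (not part of roahd): MBD of target curves with respect
# to the bands spanned by pairs of reference curves.
mbd_relative_sketch <- function(target, reference) {
  pairs <- combn(nrow(reference), 2)
  depth <- numeric(nrow(target))
  for (k in seq_len(ncol(pairs))) {
    lower <- pmin(reference[pairs[1, k], ], reference[pairs[2, k], ])
    upper <- pmax(reference[pairs[1, k], ], reference[pairs[2, k], ])
    inside <- sweep(target, 2, lower, ">=") & sweep(target, 2, upper, "<=")
    depth <- depth + rowMeans(inside)
  }
  depth / ncol(pairs)
}

grid <- seq(0, 1, length.out = 100)
s <- sin(2 * pi * grid)
reference <- matrix(c(0 + s, 1 + s, -1 + s), nrow = 3, byrow = TRUE)
target <- matrix(c(0.5 + s, -0.5 + s, 1.1 + s), nrow = 3, byrow = TRUE)
rel_depths <- mbd_relative_sketch(target, reference)
```

The first two target curves each fall inside two of the three reference bands, whereas the third one lies above all reference curves and gets depth 0.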
MBD, BD, BD_relative, fData
grid = seq( 0, 1, length.out = 1e2 )

Data_ref = matrix( c( 0 + sin( 2 * pi * grid ),
                      1 + sin( 2 * pi * grid ),
                      -1 + sin( 2 * pi * grid ) ),
                   nrow = 3, ncol = length( grid ), byrow = TRUE )

Data_test_1 = matrix( c( 0.6 + sin( 2 * pi * grid ) ),
                      nrow = 1, ncol = length( grid ), byrow = TRUE )
Data_test_2 = matrix( c( 0.6 + sin( 2 * pi * grid ) ),
                      nrow = length( grid ), ncol = 1, byrow = TRUE )
Data_test_3 = 0.6 + sin( 2 * pi * grid )
Data_test_4 = array( 0.6 + sin( 2 * pi * grid ), dim = length( grid ) )
Data_test_5 = array( 0.6 + sin( 2 * pi * grid ), dim = c( 1, length( grid ) ) )
Data_test_6 = array( 0.6 + sin( 2 * pi * grid ), dim = c( length( grid ), 1 ) )
Data_test_7 = matrix( c( 0.5 + sin( 2 * pi * grid ),
                         -0.5 + sin( 2 * pi * grid ),
                         1.1 + sin( 2 * pi * grid ) ),
                      nrow = 3, ncol = length( grid ), byrow = TRUE )

fD_ref = fData( grid, Data_ref )
fD_test_1 = fData( grid, Data_test_1 )
fD_test_2 = fData( grid, Data_test_2 )
fD_test_3 = fData( grid, Data_test_3 )
fD_test_4 = fData( grid, Data_test_4 )
fD_test_5 = fData( grid, Data_test_5 )
fD_test_6 = fData( grid, Data_test_6 )
fD_test_7 = fData( grid, Data_test_7 )

MBD_relative( fD_test_1, fD_ref )
MBD_relative( Data_test_1, Data_ref )
MBD_relative( fD_test_2, fD_ref )
MBD_relative( Data_test_2, Data_ref )
MBD_relative( fD_test_3, fD_ref )
MBD_relative( Data_test_3, Data_ref )
MBD_relative( fD_test_4, fD_ref )
MBD_relative( Data_test_4, Data_ref )
MBD_relative( fD_test_5, fD_ref )
MBD_relative( Data_test_5, Data_ref )
MBD_relative( fD_test_6, fD_ref )
MBD_relative( Data_test_6, Data_ref )
MBD_relative( fD_test_7, fD_ref )
MBD_relative( Data_test_7, Data_ref )
This S3 method implements the cross-sectional mean of a univariate functional dataset stored in a fData object, i.e. the mean computed point-by-point along the grid over which the dataset is defined.
## S3 method for class 'fData'
mean(x, ...)
x |
the univariate functional dataset whose cross-sectional mean must be computed, in form of fData object. |
... |
possible additional parameters. This argument is kept for
compatibility with the |
The function returns a fData object with one observation, defined on the same grid as the argument x, representing the desired cross-sectional mean.
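On the underlying matrix of values (one curve per row, one column per grid point), a point-by-point mean amounts to a column-wise average. A minimal base-R sketch on raw values, with hypothetical variable names chosen for this illustration:

```r
grid <- seq(0, 1, length.out = 5)

# Three curves sampled on the same grid, stored one per row
values <- matrix(c(1 * grid, 2 * grid, 3 * grid), nrow = 3, byrow = TRUE)

# Cross-sectional mean: one average per grid point
cross_mean <- colMeans(values)
```

Here the resulting curve coincides with `2 * grid`, the middle curve of the sample.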
N = 1e2
P = 1e2
grid = seq( 0, 1, length.out = P )

# Generating a gaussian functional sample with desired mean
target_mean = sin( 2 * pi * grid )
C = exp_cov_function( grid, alpha = 0.2, beta = 0.2 )
fD = fData( grid, generate_gauss_fdata( N, centerline = target_mean, Cov = C ) )

# Graphical representation of the mean
plot( fD )
plot( mean( fD ), col = 'black', lwd = 2, lty = 2, add = TRUE )
This S3 method implements the cross-sectional mean of a multivariate functional dataset stored in a mfData object, i.e. the mean computed point-by-point along the grid over which the dataset is defined.
## S3 method for class 'mfData'
mean(x, ...)
x |
the multivariate functional dataset whose cross-sectional mean must be computed, in form of mfData object. |
... |
possible additional parameters. This argument is kept for
compatibility with the |
The function returns a mfData object with one observation, defined on the same grid as the argument x, representing the desired cross-sectional mean.
N = 1e2
L = 3
P = 1e2
grid = seq( 0, 1, length.out = P )

# Generating a gaussian functional sample with desired mean
target_mean = sin( 2 * pi * grid )
C = exp_cov_function( grid, alpha = 0.2, beta = 0.2 )

# Independent components
correlations = c( 0, 0, 0 )
mfD = mfData( grid,
              generate_gauss_mfdata( N, L,
                                     correlations = correlations,
                                     centerline = matrix( target_mean, nrow = 3, ncol = P, byrow = TRUE ),
                                     listCov = list( C, C, C ) ) )

# Graphical representation of the mean
oldpar <- par(mfrow = c(1, 1))
par(mfrow = c(1, L))
for(iL in 1:L)
{
  plot(mfD$fDList[[iL]])
  plot( mean(mfD)$fDList[[iL]], col = 'black', lwd = 2, lty = 2, add = TRUE )
}
par(oldpar)
This method computes the sample median of a univariate functional dataset based on a definition of depth for univariate functional data.
median_fData(fData, type = "MBD", ...)
fData |
the univariate functional dataset whose median is required, in form of fData object. |
type |
a string specifying the name of the function defining the depth
for univariate data to be used. It must be a valid name of a function defined
in the current environment, default is |
... |
additional parameters to be used in the function specified by
argument |
Given a definition of functional depth for univariate data, the deepest element of the sample is returned as the median. This does not coincide with the cross-sectional median of the point-by-point measurements on the grid: the sample median returned here is always a member of the dataset provided.
The function returns a fData object containing the desired sample median.
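The difference between the depth-based median and the cross-sectional median can be seen on a small matrix of curves. The sketch below uses a hypothetical helper, `mbd_ranks`, that computes the MBD from pointwise ranks under the assumption that there are no ties; it is an illustration in base R, not the package's code.

```r
# Hypothetical helper (not part of roahd): MBD from pointwise ranks,
# valid when no two curves share a value at any grid point.
mbd_ranks <- function(D) {
  N <- nrow(D)
  rk <- apply(D, 2, rank)  # ranks of the curves at each grid point
  # (curves above) * (curves below) bands, plus the N - 1 bands that
  # involve the curve itself
  (rowMeans((N - rk) * (rk - 1)) + N - 1) / choose(N, 2)
}

# Five curves on four grid points, one curve per row
D <- rbind(c(0, 0.0, 0, 0),
           c(1, 2.5, 3, 3),
           c(2, 2.0, 2, 2),
           c(3, 3.0, 1, 1),
           c(4, 4.0, 4, 4))

median_idx <- which.max(mbd_ranks(D))  # index of the deepest curve
depth_median <- D[median_idx, ]        # a member of the sample
cross_median <- apply(D, 2, median)    # point-by-point median
```

The deepest curve is the third one, which is constant at 2, while the point-by-point median takes the value 2.5 at the second grid point and is therefore not one of the observed curves.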
fData, mean.fData, median_mfData
N = 1e2
P = 1e2
grid = seq( 0, 1, length.out = P )

# Generating a gaussian functional sample with desired mean
# Since the distribution is symmetric, its mean and median coincide
target_median = sin( 2 * pi * grid )
C = exp_cov_function( grid, alpha = 0.2, beta = 0.2 )
fD = fData( grid, generate_gauss_fdata( N, centerline = target_median, Cov = C ) )

# Graphical representation of the median
plot( fD )
plot( median_fData( fD ), col = 'black', lwd = 2, lty = 2, add = TRUE )
This method computes the sample median of a multivariate functional dataset based on a definition of depth for multivariate functional data.
median_mfData(mfData, type = "multiMBD", ...)
mfData |
the multivariate functional dataset whose median is required, in form of mfData object. |
type |
a string specifying the name of the function defining the depth
for multivariate data to be used. It must be a valid name of a function
defined in the current environment, default is |
... |
additional parameters to be used in the function specified by
argument |
Given a definition of functional depth for multivariate data, the deepest element of the sample is returned as the median. This does not coincide with the cross-sectional median of the point-by-point measurements on the grid: the sample median returned here is always a member of the dataset provided.
The function returns a mfData object containing the desired sample median.
mfData, mean.mfData, median_fData
N = 1e2
L = 3
P = 1e2
grid = seq( 0, 1, length.out = P )

# Generating a gaussian functional sample with desired mean
# Since the distribution is symmetric, its mean and median coincide
target_median = sin( 2 * pi * grid )
C = exp_cov_function( grid, alpha = 0.2, beta = 0.2 )

# Strongly dependent components
correlations = c( 0.9, 0.9, 0.9 )
mfD = mfData( grid,
              generate_gauss_mfdata( N, L,
                                     correlations = correlations,
                                     centerline = matrix( target_median, nrow = 3, ncol = P, byrow = TRUE ),
                                     listCov = list( C, C, C ) ) )

med_mfD = median_mfData( mfD, type = 'multiMBD', weights = 'uniform' )

# Graphical representation of the median
oldpar <- par(mfrow = c(1, 1))
par(mfrow = c(1, L))
for(iL in 1:L)
{
  plot(mfD$fDList[[iL]])
  plot( med_mfD$fDList[[iL]], col = 'black', lwd = 2, lty = 2, add = TRUE )
}
par(oldpar)
This function computes the Modified Epigraph Index (MEI) of elements of a univariate functional dataset.
MEI(Data)

## S3 method for class 'fData'
MEI(Data)

## Default S3 method:
MEI(Data)
Data |
either an |
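Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$, defined over a compact interval $I = [a, b]$, this function computes the MEI, i.e.:
$$MEI( X(t) ) = \frac{1}{N} \sum_{i=1}^{N} \tilde{\lambda}\big( \{ t : X(t) \leq X_i(t) \} \big),$$
where $\tilde{\lambda}(\cdot)$ is the normalized Lebesgue measure over $I = [a, b]$, that is $\tilde{\lambda}(A) = \lambda(A) / (b - a)$.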
The function returns a vector containing the values of MEI for each element of the functional dataset provided in Data.
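Read on a grid, the MEI of a curve is the average proportion of grid points at which the sample curves lie on or above it. A minimal base-R sketch, assuming an evenly spaced grid and a matrix with one curve per row (`mei_sketch` is a hypothetical helper, not the package's implementation):

```r
# Hypothetical helper (not part of roahd): MEI from its definition.
mei_sketch <- function(D) {
  sapply(seq_len(nrow(D)), function(i) {
    # proportion of (curve, grid point) pairs with X_j(t) >= X_i(t)
    mean(sweep(D, 2, D[i, ], ">="))
  })
}

# Three constant curves at levels 0, 1 and 2, observed on 4 grid points
D <- matrix(rep(c(0, 1, 2), times = 4), nrow = 3)
mei <- mei_sketch(D)
```

The lowest curve has all three curves (itself included) on or above it, so its MEI is 1; the highest curve has only itself, so its MEI is 1/3.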
Lopez-Pintado, S. and Romo, J. (2012). A half-region depth for functional data, Computational Statistics and Data Analysis, 55, 1679-1695.
Arribas-Gil, A., and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15(4), 603-619.
N = 20
P = 1e2
grid = seq( 0, 1, length.out = P )
C = exp_cov_function( grid, alpha = 0.2, beta = 0.3 )
Data = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), C )
fD = fData( grid, Data )
MEI( fD )
MEI( Data )
A dataset containing the 8-Lead ECG traces of 50 healthy subjects. They can be used to compare the signals of pathological subjects stored in mfD_LBBB and mfD_RBBB objects.
mfD_healthy
A mfData object.
The 8 leads are, in order, V1, V2, V3, V4, V5, D1 and D2. The signals have been registered and smoothed over an evenly spaced grid of 1024 time points at 1kHz.
A dataset containing the 8-Lead ECG traces of 50 subjects suffering from Left-Bundle-Branch-Block (LBBB), a cardiac pathology affecting the conduction process and resulting in some peculiar distortions of the ECG.
mfD_LBBB
A mfData object.
The 8 leads are, in order, V1, V2, V3, V4, V5, D1 and D2. The signals have been registered and smoothed over an evenly spaced grid of 1024 time points at 1kHz.
S3 class for multivariate functional datasets
This function implements a constructor for elements of S3 class mfData, aimed at implementing a representation of a multivariate functional dataset.
mfData(grid, Data_list)
grid |
the (evenly spaced) grid over which the functional dataset is defined. |
Data_list |
a |
The functional dataset is represented as a collection of L components, each one an object of class fData. Each component must contain elements defined on the same grid as the others, and must contain the same number of elements (N).
The function returns a S3 object of class mfData, containing the following elements:
"N": the number of elements in the dataset;
"L": the number of components of the functional dataset;
"P": the number of points in the 1D grid over which elements are measured;
"t0": the starting point of the 1D grid;
"tP": the ending point of the 1D grid;
"fDList": the list of fData objects representing the L components as corresponding univariate functional datasets.
fData, generate_gauss_fdata, generate_gauss_mfdata
# Defining parameters
N = 1e2
P = 1e3
t0 = 0
t1 = 1

# Defining the measurement grid
grid = seq( t0, t1, length.out = P )

# Generating an exponential covariance matrix to be used in the simulation of
# the functional datasets (see the related help for details)
C = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )

# Simulating the measurements of two univariate functional datasets with
# required center and covariance function
Data_1 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = C )
Data_2 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = C )

# Building the mfData object
mfData( grid, list( Data_1, Data_2 ) )
This function computes the Modified Hypograph Index (MHI) of elements of a univariate functional dataset.
MHI(Data)

## S3 method for class 'fData'
MHI(Data)

## Default S3 method:
MHI(Data)
Data |
either an |
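Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$, defined over a compact interval $I = [a, b]$, this function computes the MHI, i.e.:
$$MHI( X(t) ) = \frac{1}{N} \sum_{i=1}^{N} \tilde{\lambda}\big( \{ t : X(t) \geq X_i(t) \} \big),$$
where $\tilde{\lambda}(\cdot)$ is the normalized Lebesgue measure over $I = [a, b]$, that is $\tilde{\lambda}(A) = \lambda(A) / (b - a)$.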
The function returns a vector containing the values of MHI for each element of the functional dataset provided in Data.
Lopez-Pintado, S. and Romo, J. (2012). A half-region depth for functional data, Computational Statistics and Data Analysis, 55, 1679-1695.
Arribas-Gil, A., and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15(4), 603-619.
N = 20
P = 1e2
grid = seq( 0, 1, length.out = P )
C = exp_cov_function( grid, alpha = 0.2, beta = 0.3 )
Data = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), C )
fD = fData( grid, Data )
MHI( fD )
MHI( Data )
This function computes the Modified Half-Region Depth (MHRD) of elements of a univariate functional dataset.
MHRD(Data)

## S3 method for class 'fData'
MHRD(Data)

## Default S3 method:
MHRD(Data)
Data |
either an |
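Given a univariate functional dataset, $X_1(t), X_2(t), \ldots, X_N(t)$, defined over a compact interval $I = [a, b]$, this function computes the MHRD of its elements, i.e.:
$$MHRD( X(t) ) = \min\big( MEI( X(t) ), MHI( X(t) ) \big),$$
where $MEI( X(t) )$ indicates the Modified Epigraph Index (MEI) of $X(t)$ with respect to the dataset, and $MHI( X(t) )$ indicates the Modified Hypograph Index of $X(t)$ with respect to the dataset.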
The function returns a vector containing the values of MHRD for each element of the functional dataset provided in Data.
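The relation between the three quantities can be checked on a toy matrix of curves. The sketch below, in base R, assumes an evenly spaced grid and one curve per row; `prop_index` is a hypothetical helper introduced only for this illustration and is not part of the package.

```r
# Hypothetical helper (not part of roahd): average proportion of
# (curve, grid point) pairs for which FUN(X_j(t), X_i(t)) holds.
prop_index <- function(D, FUN) {
  sapply(seq_len(nrow(D)), function(i) mean(sweep(D, 2, D[i, ], FUN)))
}

# Three constant curves at levels 0, 1 and 2, observed on 4 grid points
D <- matrix(rep(c(0, 1, 2), times = 4), nrow = 3)

mei <- prop_index(D, ">=")  # sample curves on or above each curve
mhi <- prop_index(D, "<=")  # sample curves on or below each curve
mhrd <- pmin(mei, mhi)      # element-wise minimum of the two indices
```

The central curve attains the largest value, 2/3, while both extreme curves get 1/3, which matches the intuition that the MHRD ranks curves from the center outwards.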
Lopez-Pintado, S. and Romo, J. (2012). A half-region depth for functional data, Computational Statistics and Data Analysis, 55, 1679-1695.
Arribas-Gil, A., and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15(4), 603-619.
N = 20
P = 1e2
grid = seq( 0, 1, length.out = P )
C = exp_cov_function( grid, alpha = 0.2, beta = 0.3 )
Data = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), C )
fD = fData( grid, Data )
MHRD( fD )
MHRD( Data )
This function computes the minimum value of each element of a univariate functional dataset, optionally also returning the grid points where the minima are attained.
minima(fData, ..., which = FALSE)
fData |
the functional dataset containing elements whose minima have to be computed, in form of fData object. |
... |
additional parameters. |
which |
logical flag specifying whether the grid values where minima are attained have to be returned too. |
If which = FALSE, the function returns a vector containing the minima for each element of the functional dataset; if which = TRUE, the function returns a data.frame whose field value contains the values of minima, and grid contains the grid points where minima are reached.
P = 1e3
grid = seq( 0, 1, length.out = P )
Data = matrix( c( 1 * grid,
                  2 * grid,
                  3 * ( 0.5 - abs( grid - 0.5 ) ) ),
               nrow = 3, ncol = P, byrow = TRUE )
fD = fData( grid, Data )
minima( fD, which = TRUE )
These functions compute the Band Depth (BD) and Modified Band Depth (MBD) of elements of a multivariate functional dataset.
multiMBD(Data, weights = "uniform", manage_ties = FALSE)

## S3 method for class 'mfData'
multiMBD(Data, weights = "uniform", manage_ties = FALSE)

## Default S3 method:
multiMBD(Data, weights = "uniform", manage_ties = FALSE)

multiBD(Data, weights = "uniform")

## S3 method for class 'mfData'
multiBD(Data, weights = "uniform")

## Default S3 method:
multiBD(Data, weights = "uniform")
Data |
specifies the multivariate functional dataset.
It is either an object of class |
weights |
either a set of weights (of the same length of |
manage_ties |
a logical flag specifying whether the check for ties and
the relative treatment is to be carried out while computing the MBDs in each
dimension. It is directly passed to |
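Given a multivariate functional dataset composed of $N$ elements with $L$ components each, $\mathbf{X_j} = ( X^1_j(t), X^2_j(t), \ldots, X^L_j(t) )$, and a set of $L$ non-negative weights,
$$w_1, w_2, \ldots, w_L, \qquad \sum_{i=1}^{L} w_i = 1,$$
these functions compute the BD and MBD of each element of the functional dataset, namely:
$$BD( \mathbf{X_j} ) = \sum_{i=1}^{L} w_i BD( X^i_j ), \quad \forall j = 1, \ldots, N,$$
$$MBD( \mathbf{X_j} ) = \sum_{i=1}^{L} w_i MBD( X^i_j ), \quad \forall j = 1, \ldots, N.$$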
The function returns a vector containing the depths of each element of the multivariate functional dataset.
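The weighted combination of component-wise depths can be sketched in base R as follows. Both helpers are hypothetical and written only for this illustration: `mbd_ranks` computes a univariate MBD from pointwise ranks under the assumption of no ties, and `multi_mbd_sketch` combines the component depths; neither is the package's implementation.

```r
# Hypothetical helper (not part of roahd): univariate MBD from pointwise
# ranks, valid when there are no ties.
mbd_ranks <- function(D) {
  N <- nrow(D)
  rk <- apply(D, 2, rank)
  (rowMeans((N - rk) * (rk - 1)) + N - 1) / choose(N, 2)
}

# Hypothetical helper: weighted sum of the component-wise depths.
# `components` is a list of L matrices, each with one curve per row.
multi_mbd_sketch <- function(components, weights = "uniform") {
  L <- length(components)
  if (identical(weights, "uniform")) {
    weights <- rep(1 / L, L)
  }
  stopifnot(length(weights) == L, all(weights >= 0),
            isTRUE(all.equal(sum(weights), 1)))
  depths <- sapply(components, mbd_ranks)  # N x L matrix of depths
  as.vector(depths %*% weights)
}

# Two components, three constant curves each, observed on 4 grid points
comp_1 <- matrix(rep(c(0, 1, 2), times = 4), nrow = 3)
comp_2 <- matrix(rep(c(1, 0, 2), times = 4), nrow = 3)

depth_uniform <- multi_mbd_sketch(list(comp_1, comp_2))
depth_custom <- multi_mbd_sketch(list(comp_1, comp_2), weights = c(1/3, 2/3))
```

With custom weights, the first element, which is the deepest in the second component, gains depth relative to the uniform case.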
Ieva, F. and Paganoni, A. M. (2013). Depth measures for multivariate functional data, Communications in Statistics: Theory and Methods, 41, 1265-1276.
Tarabelloni, N., Ieva, F., Biasi, R. and Paganoni, A. M. (2015). Use of Depth Measure for Multivariate Functional Data in Disease Prediction: An Application to Electrocardiograph Signals, International Journal of Biostatistics, 11.2, 189-201.
MBD, BD, toListOfValues, mfData
N = 20
P = 1e3
grid = seq( 0, 10, length.out = P )

# Generating an exponential covariance function to be used to simulate gaussian
# functional data
Cov = exp_cov_function( grid, alpha = 0.2, beta = 0.8 )

# First component of the multivariate gaussian functional dataset
Data_1 = generate_gauss_fdata( N, centerline = rep( 0, P ), Cov = Cov )

# Second component of the multivariate gaussian functional dataset
Data_2 = generate_gauss_fdata( N, centerline = rep( 0, P ), Cov = Cov )

mfD = mfData( grid, list( Data_1, Data_2 ) )

multiBD( mfD, weights = 'uniform' )
multiMBD( mfD, weights = 'uniform', manage_ties = TRUE )

multiBD( mfD, weights = c( 1/3, 2/3 ))
multiMBD( mfD, weights = c( 1/3, 2/3 ), manage_ties = FALSE )

multiBD( list( Data_1, Data_2 ), weights = 'uniform')
multiMBD( list( Data_1, Data_2 ), weights = 'uniform', manage_ties = TRUE )

multiBD( list( Data_1, Data_2 ), weights = c( 1/3, 2/3 ))
multiMBD( list( Data_1, Data_2 ), weights = c( 1/3, 2/3 ), manage_ties = FALSE )
These functions compute the Modified Epigraph Index of elements of a multivariate functional dataset.
multiMEI(Data, weights = "uniform")

## S3 method for class 'mfData'
multiMEI(Data, weights = "uniform")

## Default S3 method:
multiMEI(Data, weights = "uniform")
Data |
specifies the multivariate functional dataset.
It is either an object of class |
weights |
either a set of weights (of the same length of |
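Given a multivariate functional dataset composed of $N$ elements with $L$ components each, $\mathbf{X_j} = ( X^1_j(t), X^2_j(t), \ldots, X^L_j(t) )$, and a set of $L$ non-negative weights,
$$w_1, w_2, \ldots, w_L, \qquad \sum_{i=1}^{L} w_i = 1,$$
these functions compute the MEI of each element of the functional dataset, namely:
$$MEI( \mathbf{X_j} ) = \sum_{i=1}^{L} w_i MEI( X^i_j ), \quad \forall j = 1, \ldots, N.$$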
The function returns a vector containing the values of MEI of each element of the multivariate functional dataset.
N = 20
P = 1e3
grid = seq( 0, 10, length.out = P )

# Generating an exponential covariance function to be used to simulate gaussian
# functional data
Cov = exp_cov_function( grid, alpha = 0.2, beta = 0.8 )

# First component of the multivariate gaussian functional dataset
Data_1 = generate_gauss_fdata( N, centerline = rep( 0, P ), Cov = Cov )

# Second component of the multivariate gaussian functional dataset
Data_2 = generate_gauss_fdata( N, centerline = rep( 0, P ), Cov = Cov )

mfD = mfData( grid, list( Data_1, Data_2 ) )

# Uniform weights
multiMEI( mfD, weights = 'uniform' )

# Non-uniform, custom weights
multiMEI( mfD, weights = c(2/3, 1/3) )
These functions compute the Modified Hypograph Index of elements of a multivariate functional dataset.
multiMHI(Data, weights = "uniform")

## S3 method for class 'mfData'
multiMHI(Data, weights = "uniform")

## Default S3 method:
multiMHI(Data, weights = "uniform")
Data |
specifies the multivariate functional dataset.
It is either an object of class |
weights |
either a set of weights (of the same length of |
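Given a multivariate functional dataset composed of $N$ elements with $L$ components each, $\mathbf{X_j} = ( X^1_j(t), X^2_j(t), \ldots, X^L_j(t) )$, and a set of $L$ non-negative weights,
$$w_1, w_2, \ldots, w_L, \qquad \sum_{i=1}^{L} w_i = 1,$$
these functions compute the MHI of each element of the functional dataset, namely:
$$MHI( \mathbf{X_j} ) = \sum_{i=1}^{L} w_i MHI( X^i_j ), \quad \forall j = 1, \ldots, N.$$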
The function returns a vector containing the values of MHI of each element of the multivariate functional dataset.
N = 20
P = 1e3
grid = seq( 0, 10, length.out = P )

# Generating an exponential covariance function to be used to simulate gaussian
# functional data
Cov = exp_cov_function( grid, alpha = 0.2, beta = 0.8 )

# First component of the multivariate gaussian functional dataset
Data_1 = generate_gauss_fdata( N, centerline = rep( 0, P ), Cov = Cov )

# Second component of the multivariate gaussian functional dataset
Data_2 = generate_gauss_fdata( N, centerline = rep( 0, P ), Cov = Cov )

mfD = mfData( grid, list( Data_1, Data_2 ) )

# Uniform weights
multiMHI( mfD, weights = 'uniform' )

# Non-uniform, custom weights
multiMHI( mfD, weights = c(2/3, 1/3) )
This function performs the outliergram of a multivariate functional dataset.
multivariate_outliergram(
  mfData,
  MBD_data = NULL,
  MEI_data = NULL,
  weights = "uniform",
  p_check = 0.05,
  Fvalue = 1.5,
  shift = TRUE,
  display = TRUE,
  xlab = NULL,
  ylab = NULL,
  main = NULL
)
mfData |
the multivariate functional dataset whose outliergram has to be determined; |
MBD_data |
a vector containing the MBD for each element of the dataset. If missing, MBDs are computed with the specified choice of weights; |
MEI_data |
a vector containing the MEI for each element of the dataset. If not provided, MEIs are computed; |
weights |
the weights choice to be used to compute multivariate MBDs and MEIs; |
p_check |
percentage of observations with either low or high MEI to be checked for outliers in the secondary step (shift towards the center of the dataset). |
Fvalue |
the |
shift |
whether to apply the shifting algorithm to properly manage observations having low or high MEI. Default is TRUE. |
display |
either a logical value indicating whether you want the outliergram to be displayed, or the number of the graphical device where you want the outliergram to be displayed; |
xlab |
the label to use on the x axis in the outliergram plot; |
ylab |
the label to use on the y axis in the outliergram plot; |
main |
the title to use in the outliergram; |
The method applies the extension of the univariate outliergram to the case of multivariate functional datasets. Unlike the function for the univariate case, only the outliergram plot is displayed.
Unlike the case of univariate functional data, this function does not automatically tune the F parameter, since the related procedure would become computationally too heavy for general datasets. To find a good value of F, it is recommended to run the outliergram several times with different values and select the best one manually.
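To see why trying several values of F is informative, the sketch below applies a boxplot-type rule to a made-up vector `d` of distances. It assumes, following the univariate outliergram of Arribas-Gil and Romo (2014), that an observation is flagged when its distance exceeds Q3 + F * IQR. Both `flag_by_F` and the toy data are hypothetical; in practice the distances would come from the outliergram itself.

```r
# Hypothetical helper: indices flagged by the rule d > Q3 + F * IQR.
flag_by_F <- function(d, Fvalue) {
  q <- stats::quantile(d, probs = c(0.25, 0.75), names = FALSE)
  which(d > q[2] + Fvalue * (q[2] - q[1]))
}

# Toy distances: nine ordinary observations and one far from the rest.
d <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 100)

# Number of flagged observations for a few candidate values of F.
candidates <- c(0.2, 1.5, 25)
n_flagged <- sapply(candidates, function(Fv) length(flag_by_F(d, Fv)))
```

A small F flags ordinary observations as well, while a very large F flags nothing, which is the trade-off to inspect when choosing F manually.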
Ieva, F. & Paganoni, A.M. Stat Papers (2017). https://doi.org/10.1007/s00362-017-0953-1.
outliergram, mfData, MBD, MEI
N = 2e2
P = 1e2

t0 = 0
t1 = 1

set.seed(1)

# Defining the measurement grid
grid = seq( t0, t1, length.out = P )

# Generating an exponential covariance matrix to be used in the simulation of
# the functional datasets (see the related help for details)
C = exp_cov_function( grid, alpha = 0.3, beta = 0.2)

# Simulating the measurements of two univariate functional datasets with
# required center and covariance function
f1 = function(x) x * ( 1 - x )
f2 = function(x) x^3

Data = generate_gauss_mfdata(
  N, L = 2,
  centerline = matrix(c(sin(2 * pi * grid), cos(2 * pi * grid)), nrow=2, byrow=TRUE),
  listCov = list(C, C),
  correlations = 0.1
)

# Building the mfData object
mfD = mfData( grid, Data )

dev.new()
out = multivariate_outliergram(mfD, Fvalue = 2., shift=TRUE)

col_non_outlying = scales::hue_pal( h = c( 180, 270 ), l = 60 )( N - length( out$ID_outliers ) )
col_non_outlying = set_alpha( col_non_outlying, 0.5 )
col_outlying = scales::hue_pal( h = c( - 90, 180 ), c = 150 )( length( out$ID_outliers ) )

colors = rep('black', N)
colors[out$ID_outliers] = col_outlying
colors[colors == 'black'] = col_non_outlying

lwd = rep(1, N)
lwd[out$ID_outliers] = 2

dev.new()
plot(mfD, col=colors, lwd=lwd)
This function performs the outliergram of a univariate functional data set, possibly with an adjustment of the true positive rate of outliers discovered under assumption of gaussianity.
outliergram(
  fData,
  MBD_data = NULL,
  MEI_data = NULL,
  p_check = 0.05,
  Fvalue = 1.5,
  adjust = FALSE,
  display = TRUE,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  ...
)
fData |
the univariate functional dataset whose outliergram has to be determined. |
MBD_data |
a vector containing the MBD for each element of the dataset. If missing, MBDs are computed. |
MEI_data |
a vector containing the MEI for each element of the dataset. If not provided, MEIs are computed. |
p_check |
percentage of observations with either low or high MEI to be checked for outliers in the secondary step (shift towards the center of the dataset). |
Fvalue |
the |
adjust |
either
|
display |
either a logical value indicating whether you want the outliergram to be displayed, or the number of the graphical device where you want the outliergram to be displayed. |
xlab |
a list of two labels to use on the x axis when displaying the functional dataset and the outliergram |
ylab |
a list of two labels to use on the y axis when displaying the functional dataset and the outliergram; |
main |
a list of two titles to be used on the plot of the functional dataset and the outliergram; |
... |
additional graphical parameters to be used only in the plot of the functional dataset |
Even when used graphically to plot the outliergram, the function returns a list containing:
Fvalue: the value of the parameter F used;
d: the vector of values of the parameter $d$ for each observation (distance to the parabolic border of the outliergram);
ID_outliers: the vector of observations id corresponding to outliers.
When the adjustment option is selected, the value of $F$ is optimized for the univariate functional dataset provided with fData. In practice, a number adjust$N_trials of times a synthetic population (of size adjust$trial_size) with the same covariance (robustly estimated from data) and centerline as fData is simulated without outliers, and each time an optimized value $F_i$ is computed so that a given proportion (adjust$TPR) of observations is flagged as outliers. The final value of F for the outliergram is determined as the average of $F_1, F_2, \ldots, F_{N_{trials}}$. At each time step the optimization problem is solved using stats::uniroot (Brent's method).
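The adjustment loop can be sketched in base R as follows. This is only an illustration under stated assumptions: the distances of each simulated population are replaced by exponential draws (a stand-in, since simulating functional data would require the package), observations are assumed to be flagged when d > Q3 + F * IQR, and the helpers `simulate_d` and `optimise_F` are hypothetical names. The values of `N_trials`, `trial_size` and `TPR` play the role of the corresponding entries of the adjust list.

```r
set.seed(1)

# Stand-in for the distances d of one simulated, outlier-free population.
# (Assumption: exponential draws are used purely for illustration.)
simulate_d <- function(trial_size) stats::rexp(trial_size)

# Hypothetical helper: value of F such that a proportion TPR of the
# distances exceeds Q3 + F * IQR, found with stats::uniroot.
optimise_F <- function(d, TPR, interval = c(0, 10)) {
  q <- stats::quantile(d, probs = c(0.25, 0.75), names = FALSE)
  # Flagged proportion minus target; a decreasing step function of Fv.
  excess <- function(Fv) mean(d > q[2] + Fv * (q[2] - q[1])) - TPR
  stats::uniroot(excess, interval = interval)$root
}

N_trials <- 10
trial_size <- 1000
TPR <- 0.05

# One optimised value per synthetic population, then their average.
F_trials <- replicate(N_trials, optimise_F(simulate_d(trial_size), TPR))
F_final <- mean(F_trials)
```

Because the flagged proportion is a step function of F, the root finder settles at the jump where the proportion crosses the target; averaging over trials smooths this out.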
Arribas-Gil, A., and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15(4), 603-619.
set.seed(1618)

N <- 200
P <- 200
N_extra <- 4

grid <- seq(0, 1, length.out = P)

Cov <- exp_cov_function(grid, alpha = 0.2, beta = 0.8)

Data <- generate_gauss_fdata(
  N = N,
  centerline = sin(4 * pi * grid),
  Cov = Cov
)

Data_extra <- array(0, dim = c(N_extra, P))
Data_extra[1, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid + pi / 2),
  Cov = Cov
)
Data_extra[2, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid - pi / 2),
  Cov = Cov
)
Data_extra[3, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid + pi / 3),
  Cov = Cov
)
Data_extra[4, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid - pi / 3),
  Cov = Cov
)
Data <- rbind(Data, Data_extra)

fD <- fData(grid, Data)

# Outliergram with default Fvalue = 1.5
outliergram(fD, display = TRUE)

# Outliergram with Fvalue enforced to 2.5
outliergram(fD, Fvalue = 2.5, display = TRUE)

# Outliergram with estimated Fvalue to ensure TPR of 1%
outliergram(
  fData = fD,
  adjust = list(
    N_trials = 10,
    trial_size = 5 * nrow(Data),
    TPR = 0.01,
    VERBOSE = FALSE
  ),
  display = TRUE
)
Cov objects
This function performs the plot of an object of class Cov, i.e. a covariance or cross-covariance function.
## S3 method for class 'Cov'
plot(x, ...)
x |
the covariance or cross-covariance function of class |
... |
additional graphical parameters to be used in plotting functions |
It builds on the function graphics::image, therefore any additional parameter suitable for graphics::image can also be passed through the ... argument of plot.Cov.
# Generating a univariate functional dataset
N = 1e2
P = 1e2

t0 = 0
t1 = 1

time_grid = seq( t0, t1, length.out = P )

Cov = exp_cov_function( time_grid, alpha = 0.3, beta = 0.4 )

D1 = generate_gauss_fdata( N, centerline = sin( 2 * pi * time_grid ), Cov = Cov )

fD1 = fData( time_grid, D1 )

# Computing and plotting the covariance function of fD1
plot( cov_fun( fD1 ), main = 'Covariance function', xlab = 'time', ylab = 'time' )
This function plots the three 'DepthGram' representations from the output of the depthgram function.
## S3 method for class 'depthgram'
plot(
  x,
  limits = FALSE,
  ids = NULL,
  print = FALSE,
  plot_title = "",
  shorten = TRUE,
  col = NULL,
  pch = 19,
  sp = 2,
  st = 4,
  sa = 10,
  text_labels = "",
  ...
)
x |
An object of class |
limits |
A boolean specifying whether the empirical limits for outlier
detection should be drawn. Defaults to |
ids |
A character vector specifying labels for individual observations.
Defaults to |
print |
A boolean specifying whether the graphical output should be
optimized for printed version. Defaults to |
plot_title |
A character string specifying the main title for the plot.
Defaults to |
shorten |
A boolean specifying whether labels must be shortened to 15
characters. Defaults to |
col |
Color palette used for the plot. Defaults to |
pch |
Point shape. See |
sp |
Point size. See |
st |
Label size. See |
sa |
Axis title sizes. See |
text_labels |
A character vector specifying the labels for the
individuals. It is overridden if |
... |
Other arguments to be passed to the base |
A list with the following items:
p: list with all the interactive (plotly) depthGram plots;
out: outliers detected;
colors: used colors for plotting.
Aleman-Gomez, Y., Arribas-Gil, A., Desco, M., Elias-Fernandez, A., and Romo, J. (2021). "Depthgram: Visualizing Outliers in High Dimensional Functional Data with application to Task fMRI data exploration".
N <- 50
P <- 50
grid <- seq(0, 1, length.out = P)
Cov <- exp_cov_function(grid, alpha = 0.3, beta = 0.4)

Data <- list()
Data[[1]] <- generate_gauss_fdata(
  N,
  centerline = sin(2 * pi * grid),
  Cov = Cov
)
Data[[2]] <- generate_gauss_fdata(
  N,
  centerline = sin(2 * pi * grid),
  Cov = Cov
)
names <- paste0("id_", 1:nrow(Data[[1]]))

DG <- depthgram(Data, marginal_outliers = TRUE, ids = names)
plot(DG)
fData objects
This function performs the plot of a functional univariate dataset stored in an object of class fData. It is able to accept all the usual customizable graphical parameters, otherwise it will use the default ones.
## S3 method for class 'fData'
plot(x, ...)
x |
the univariate functional dataset in form of |
... |
additional graphical parameters to be used in plotting functions |
N = 20
P = 1e2

# One dimensional grid
grid = seq( 0, 1, length.out = P )

# Generating an exponential covariance function (see related help for more
# information )
C = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )

# Generating a synthetic dataset with a gaussian distribution and
# required mean and covariance function:
values = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = C )

fD = fData( grid, values )

plot( fD )
mfData objects
This function performs the plot of a functional multivariate dataset stored in an object of class mfData. It is able to accept all the usual customizable graphical parameters, otherwise it will use the default ones.
## S3 method for class 'mfData'
plot(x, ...)
x |
the multivariate functional dataset in form of |
... |
additional graphical parameters to be used in plotting functions
(see |
The current active graphical device is split into a number of sub-figures, each one meant to contain the plot of the corresponding dimension of the mfData object. In particular, they are arranged in a rectangular lattice with a number of rows equal to $\lfloor \sqrt{L} \rfloor$ and a number of columns equal to $\lceil L / \lfloor \sqrt{L} \rfloor \rceil$.
A special use of the graphical parameters allows setting y-labels and titles for all the sub-figures in the graphical window. In particular, the parameters ylab and main can take as argument either a single string, which is repeated for all the sub-graphics, or a list of different strings (one for each of the L dimensions) to be used in the corresponding graphics.
N = 1e2
P = 1e3

t0 = 0
t1 = 1

# Defining the measurement grid
grid = seq( t0, t1, length.out = P )

# Generating an exponential covariance matrix to be used in the simulation of
# the functional datasets (see the related help for details)
C = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )

# Simulating the measurements of two univariate functional datasets with
# required center and covariance function
Data_1 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = C )
Data_2 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = C )

# Building the mfData object and plotting it
plot( mfData( grid, list( Data_1, Data_2 ) ),
      xlab = 'time',
      ylab = list( '1st dim.', '2nd dim.' ),
      main = list( 'An important plot here', 'And another one here' ) )
+ and - for fData objects
These methods provide operators + and - to perform sums or differences between an fData object and either another fData object or other compliant data structures, like matrices or vectors or arrays, representing the pointwise measurements of the second term of the sum.
## S3 method for class 'fData'
fD + A

## S3 method for class 'fData'
fD - A
fD |
the univariate functional dataset in form of |
A |
either an |
If the second term of the operation is an fData object, it must be defined over the same grid as the first.
The function returns an fData object, whose function values have undergone the sum/difference.
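What a pointwise sum means for a plain matrix of values can be sketched in base R. The snippet assumes, as in the examples of this help page, that observations are stored by row and that a vector second term holds one measurement per grid point. It mirrors the described behaviour only; it is not the package's implementation, and the variable names are illustrative.

```r
# Toy values matrix: 3 observations (rows) on a grid of 10 points (columns).
values <- matrix(seq(1, 10), nrow = 3, ncol = 10, byrow = TRUE)
second_term <- 1:10

# Pointwise sum applied to every observation, i.e. row by row.
summed <- sweep(values, 2, second_term, FUN = "+")

# Plain `values + second_term` would instead recycle the vector down the
# columns, which is not a pointwise sum along the grid.
recycled <- values + second_term
```

The difference between `summed` and `recycled` is the reason a dedicated operator is convenient for this kind of data.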
fD = fData( seq( 0, 1, length.out = 10 ),
            values = matrix( seq( 1, 10 ),
                             nrow = 21, ncol = 10, byrow = TRUE ) )
fD + 1 : 10

fD + array( 1, dim = c( 1, 10 ) )

fD + fD

fD = fData( seq( 0, 1, length.out = 10 ),
            values = matrix( seq( 1, 10 ),
                             nrow = 21, ncol = 10, byrow = TRUE ) )
fD - 2 : 11

fD - array( 1, dim = c( 1, 10 ) )

fD - fD
set_alpha manipulates a vector of color representations in order to set up the alpha value and get the desired transparency level.
set_alpha(col, alpha)
col |
a vector of colors |
alpha |
the value(s) of alpha for (each of) the colors. |
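For readers who want to inspect the effect of an alpha value numerically, base R offers a comparable facility in grDevices::adjustcolor. The sketch below uses it as an analogue only; it is an assumption that the result matches what set_alpha produces, and it says nothing about how set_alpha is implemented.

```r
original_col <- c("blue", "red")

# Base R analogue: scale the alpha channel of each colour by 0.5.
half_transparent <- grDevices::adjustcolor(original_col, alpha.f = 0.5)

# Inspect the resulting RGBA components (0-255 scale).
rgba <- grDevices::col2rgb(half_transparent, alpha = TRUE)
```

The alpha row of `rgba` holds the transparency level of each colour, while the red, green and blue rows are left unchanged.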
original_col = c( 'blue', 'red', 'green', 'yellow' )

alpha_col = set_alpha( original_col, 0.5 )

alpha_col = set_alpha( original_col, c(0.5, 0.5, 0.2, 0.1 ) )

dev.new()
oldpar <- par(mfrow = c(1, 1))
par(mfrow = c(1, 2))

plot( seq_along( original_col ),
      seq_along( original_col ),
      col = original_col,
      pch = 16,
      cex = 2,
      main = 'Original colors' )

plot( seq_along( alpha_col ),
      seq_along( alpha_col ),
      col = alpha_col,
      pch = 16,
      cex = 2,
      main = 'Alpha colors' )

par(oldpar)
sub-.fData to subset fData objects
This method provides an easy and natural way to subset a functional dataset stored in an fData object, without having to deal with the inner representation of the fData class.
## S3 method for class 'fData'
fD[i, j, as_fData = TRUE]
fD |
the univariate functional dataset in form of |
i |
a valid expression to subset rows ( observations ) of the univariate functional dataset. |
j |
a valid expression to subset columns ( measurements over the grid ) of the univariate functional dataset (must be contiguous). |
as_fData |
logical flag to specify whether the output should be returned
as an |
The method returns either an fData object ( if as_fData = TRUE ) or a matrix ( if as_fData = FALSE ) containing the required subset ( both in terms of observations and measurement points ) of the univariate functional dataset.
N = 20
P = 1e2

# One dimensional grid
grid = seq( 0, 1, length.out = P )

# Generating an exponential covariance function (see related help for more
# information )
C = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )

# Generating a synthetic dataset with a gaussian distribution and
# required mean and covariance function:
fD = fData( grid,
            generate_gauss_fdata( N,
                                  centerline = sin( 2 * pi * grid ),
                                  Cov = C ) )

dev.new()
oldpar <- par(mfrow = c(1, 1))
par(mfrow = c(2, 2))

# Original data
plot(fD)

# Subsetting observations
plot(fD[c(1, 2, 3), , as_fData = TRUE])

# Subsetting measurements
plot(fD[, 1:30])

# Subsetting both observations and measurements
plot(fD[1:10, 50:P])

par(oldpar)

# Subsetting both observations and measurements but returning a matrix
fD[1:10, 50:P, as_fData = FALSE]
sub-.mfData to subset mfData objects
This method provides an easy and natural way to subset a multivariate functional dataset stored in an mfData object, without having to deal with the inner representation of the mfData class.
## S3 method for class 'mfData'
mfD[i, j]
mfD |
the multivariate functional dataset in form of |
i |
a valid expression to subset rows ( observations ) of the multivariate functional dataset. |
j |
a valid expression to subset columns ( measurements over the grid ) of the multivariate functional dataset (must be contiguous). |
The method returns an mfData object containing the required subset ( both in terms of observations and measurement points ) of the multivariate functional dataset.
# Defining parameters
N = 1e2
P = 1e3

t0 = 0
t1 = 1

# Defining the measurement grid
grid = seq( t0, t1, length.out = P )

# Generating an exponential covariance matrix to be used in the simulation of
# the functional datasets (see the related help for details)
C = exp_cov_function( grid, alpha = 0.3, beta = 0.4 )

# Simulating the measurements of two univariate functional datasets with
# required center and covariance function
Data_1 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = C )
Data_2 = generate_gauss_fdata( N, centerline = sin( 2 * pi * grid ), Cov = C )

# Building the mfData object
mfD = mfData( grid, list( Data_1, Data_2 ) )

# Subsetting the first 10 elements and 10 time points
mfD[1:10, 1:10]

# Subsetting only observations
mfD[1:10,]

# Subsetting only time points (contiguously)
mfD[,1:10]
* and / for fData objects
These methods provide operators * and / to perform products or divisions between an fData object and either a number or a compliant 1D data structure, like numeric vector, array or matrix. The operation is computed by performing the element-wise product or division between fD's observations and the provided value(s).
## S3 method for class 'fData'
fD * a

## S3 method for class 'fData'
fD / a
fD |
the univariate functional dataset in form of |
a |
either a single number or a 1D data structure (such as numeric
raw vector, matrix or array) specifying the factor(s) to use in the
multiplication/division of |
If the second argument is a 1D data structure, it must have length N equal to the number of observations in fD.
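The role of the length-N requirement can be seen on a plain matrix of values. The sketch below assumes observations are stored by row, so that a vector with one factor per observation scales each row by its own factor. It illustrates the described behaviour with base R only; it is not the package's code, and the variable names are illustrative.

```r
N <- 3
values <- matrix(seq(1, 10), nrow = N, ncol = 10, byrow = TRUE)

# One factor per observation: row i is multiplied by factors[i].
factors <- c(1, 10, 100)
scaled <- values * factors   # recycling runs down the columns, i.e. per row

# A single number scales every observation alike.
doubled <- values * 2
```

Division works the same way, with `/` in place of `*`.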
The function returns an fData object, whose function values have undergone the product/division.
N = 11
fD = fData( seq( 0, 1, length.out = 10 ),
            values = matrix( seq( 1, 10 ), nrow = N, ncol = 10, byrow = TRUE ) )
fD * 2

fD * seq( 1, N )

N = 11
fD = fData( seq( 0, 1, length.out = 10 ),
            values = matrix( seq( 1, 10 ), nrow = N, ncol = 10, byrow = TRUE ) )
fD / 2

fD / rep( 10, N )
mfData list of values
This utility function manipulates an mfData object in order to extract from the list of its fData objects ( namely, mfData$fDList ) the measurement values of each component and stores them into a list.
toListOfValues(mfData)
mfData |
the multivariate functional dataset in form of |
Given an mfData of L components, the function is equivalent to list( mfData$fDList[[ 1 ]]$values, ..., mfData$fDList[[ L ]]$values ).
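The equivalence can be written generically with lapply. The sketch below uses a mock list that exposes only the two fields named above (fDList and values), so that it runs without the package. A real mfData object is built with mfData() and carries more information, and `mock_mfData` is a hypothetical stand-in.

```r
# Mock object exposing only the fields named above (fDList and values).
D_1 <- matrix(1:5, nrow = 2, ncol = 5, byrow = TRUE)
mock_mfData <- list(
  fDList = list(list(values = D_1), list(values = 2 * D_1))
)

# Extract the values of each component, in order.
list_of_values <- lapply(mock_mfData$fDList, function(fD) fD$values)
```

The result has one matrix per component, in the same order as the components.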
The function returns the list of values of each fData object representing the components of mfData.
grid = seq( 0, 1, length.out = 5 )

D_1 = matrix( 1 : 5, nrow = 10, ncol = 5, byrow = TRUE )
D_2 = 2 * D_1
D_3 = 3 * D_1

mfD = mfData( grid, list( D_1, D_2, D_3 ) )
mfD

toListOfValues( mfD )
This function manipulates a numeric data structure of vector/array/matrix type in order to obtain a matrix representation. For 1D data structures and column/row arrays and matrices, the output is converted to a matrix with just one row. If the input structure is rectangular, instead, it is only converted to matrix format.
toRowMatrixForm(D)
D |
a generic array, matrix or vector to be converted in row-matrix format. |
The function is not supposed to work with arbitrary N-dimensional arrays.
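The conversion rule described above can be sketched in base R for inputs with at most two dimensions. The helper `to_row_matrix_sketch` is a hypothetical name that reproduces the described behaviour; it is not the package's implementation and, like the original, it is not meant for arbitrary N-dimensional arrays.

```r
# Hypothetical helper reproducing the behaviour described above.
to_row_matrix_sketch <- function(D) {
  if (is.null(dim(D)) || any(dim(D) == 1)) {
    # Plain vectors and single-row/column structures: one row.
    matrix(as.vector(D), nrow = 1)
  } else {
    # Rectangular input: only ensure matrix format.
    as.matrix(D)
  }
}

from_vector <- to_row_matrix_sketch(1:10)
from_column <- to_row_matrix_sketch(matrix(1:10, ncol = 1, nrow = 10))
from_rect   <- to_row_matrix_sketch(matrix(1:12, ncol = 3, nrow = 4))
```

The first two results are 1 x 10 matrices, while the rectangular input keeps its 4 x 3 shape.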
toRowMatrixForm( 1 : 10 )

toRowMatrixForm( array( 1 : 10, dim = c(1,10 ) ) )

toRowMatrixForm( array( 1 : 10, dim = c( 10, 1 ) ) )

toRowMatrixForm( matrix( 1 : 10, ncol = 10, nrow = 1 ) )

toRowMatrixForm( matrix( 1 : 10, ncol = 1, nrow = 10 ) )

toRowMatrixForm( matrix( 1 : 12, ncol = 3, nrow = 4 ) )

toRowMatrixForm( matrix( 1 : 12, ncol = 4, nrow = 3 ) )
This function operates on a univariate functional dataset and transforms its observations unfolding their values and turning them into monotone functions.
unfold(fData)
fData |
the univariate functional dataset in form of |
Each non-monotone function of the fData object is transformed into a monotone function by “unfolding” it at any of its maxima. For more details about the definition of the transform, see the reference.
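One way to picture the unfolding on a discretised curve is sketched below. It assumes that unfolding reflects every decreasing stretch upwards, which amounts to accumulating the absolute values of the increments from the first value onwards. This reading is an assumption made for illustration; the exact definition is in the reference, and `unfold_sketch` is a hypothetical helper, not the package's function.

```r
# Hypothetical helper: unfold one curve by accumulating the absolute
# values of its increments, starting from its first value.
unfold_sketch <- function(x) {
  c(x[1], x[1] + cumsum(abs(diff(x))))
}

x <- c(0, 1, 0, -1)           # rises, then falls
unfolded <- unfold_sketch(x)  # the falling part is reflected upwards
```

The result is non-decreasing and has the same length as the input, so it can be stored on the same grid.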
The function returns an fData object whose observations are the unfolded version of the corresponding observations in the argument fData.
Arribas-Gil, A. and Romo, J. (2012). Robust depth-based estimation in the time warping model, Biostatistics, 13 (3), 398–414.
P = 1e3

time_grid = seq( 0, 1, length.out = P )

D = matrix( c( sin( 2 * pi * time_grid ),
               cos( 2 * pi * time_grid ),
               sin( 10 * pi * time_grid ) * time_grid + 2 ),
            ncol = P, nrow = 3, byrow = TRUE )

# Functional dataset
fD = fData( time_grid, D )

# Unfolded version
fD_unfold = unfold( fD )

dev.new()
oldpar <- par(mfrow = c(1, 1))
par(mfrow = c(1, 2))

plot(fD, main = 'Original data')
plot(fD_unfold, main = 'Unfolded data')

par(oldpar)
This function carries out the warping of elements of a univariate functional dataset by using a set of pre-computed warping functions.
warp(fData, warpings)
fData |
the functional dataset whose observations must be warped in
form of |
warpings |
the warping functions |
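Given a univariate functional dataset $X_1(t), \ldots, X_N(t)$ and a set of warping functions $h_1(s), \ldots, h_N(s)$, such that:
$$ h_i : s \longrightarrow t, \quad \forall i = 1, \ldots, N, $$
where $s$ spans the warped (or registered) grid and $t$ spans the original grid, the function computes the warping given by the following composition:
$$ X_1 \circ h_1(s), \ldots, X_N \circ h_N(s). $$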
The function returns the univariate functional dataset of warped functions, in form of fData object.
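On a discrete grid, composing an observation with its warping means evaluating the observation at the warped locations. The base R sketch below does this for a single curve with stats::approx. Linear interpolation is an assumption made here for illustration and may differ from what the package does internally; the curve and warping are toy choices.

```r
grid <- seq(0, 1, length.out = 11)   # plays the role of both grids here

x <- 2 * grid    # one observation, x(t) = 2 t, sampled on the original grid
h <- grid^2      # its warping, h(s) = s^2, sampled on the registered grid

# Composition x(h(s)): evaluate x at the warped locations by interpolation.
warped <- stats::approx(x = grid, y = x, xout = h)$y
```

Since the toy observation is linear, interpolation is exact here and `warped` coincides with 2 * s^2 on the registered grid.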
set.seed( 1618033 )

N = 30

t0 = 0
t1 = 1
P = 1e3 + 1

time_grid = seq( t0, t1, length.out = P )

means = round( runif( N,
                      t0 + (t1 - t0) / 8,
                      t1 - (t1 - t0) / 8 ), 3 )

Data = matrix( sapply( means,
                       function( m )( dnorm( time_grid, mean = m, sd = 0.05 ) ) ),
               ncol = P, nrow = N, byrow = TRUE )

fD = fData( time_grid, Data )

# Piecewise linear warpings
template_warping = function( m )( c( time_grid[ time_grid <= 0.5 ] * m / 0.5,
                                     ( time_grid[ time_grid > 0.5 ] - 0.5 ) *
                                       (1 - m ) / 0.5 + m ) )

warpings = matrix( sapply( means, template_warping ),
                   ncol = P,
                   nrow = N, byrow = TRUE )

wfD = fData( time_grid, warpings )

fD_warped = warp( fD, wfD )

dev.new()
oldpar <- par(mfrow = c(1, 1))
par(mfrow = c(1, 3))

plot( fD,
      main = 'Unregistered functions', xlab = 'actual grid', ylab = 'values' )
plot( wfD,
      main = 'Warping functions', xlab = 'registered grid',
      ylab = 'actual grid' )
plot( fD_warped,
      main = 'Warped functions', xlab = 'registered grid',
      ylab = 'values' )

par(oldpar)