Title: | Asymptotic Efficient Closed-Form Estimators for Multivariate Distributions |
---|---|
Description: | Asymptotically efficient closed-form estimators (MLEces) are provided in this package for three multivariate distributions (gamma, Weibull and Dirichlet) whose maximum likelihood estimators (MLEs) are not available in closed form. The closed-form estimators are strongly consistent and have an asymptotic normal distribution like that of the MLEs, but they are much faster to compute than the corresponding MLEs. Further details and explanations of MLEces can be found in Jang et al. (2023) <doi:10.1111/stan.12299> and Kim et al. (2023) <doi:10.1080/03610926.2023.2179880>. |
Authors: | Jun Zhao [aut, cre, com], Yu-Kwang Kim [aut], Yu-Hyeong Jang [aut], Jae Ho Chang [aut], Sang Kyu Lee [aut], Hyoung-Moon Kim [aut, ths] |
Maintainer: | Jun Zhao <[email protected]> |
License: | GPL-2 |
Version: | 2.1.0 |
Built: | 2025-02-18 06:27:51 UTC |
Source: | https://github.com/ijun2018/mlece |
Benchmarking closed-form estimators against other methods
benchMLEce(data, distname, methods)
data |
a numeric matrix. |
distname |
a character string indicating which distribution is to be fitted. |
methods |
a vector of methods: two characters among |
A matrix giving, for each method, the estimates and the computation time in seconds for the assigned distribution.
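The kind of comparison described above (estimates plus elapsed time per method) can be imitated in base R. The following is only a sketch under stated assumptions, not the package's code: it uses a univariate gamma sample, a closed-form method-of-moments estimator, and a numerical MLE obtained with `optim()`. The helper names `mme_gamma`, `mle_gamma` and `bench_one` are hypothetical.

```r
# Illustration only: univariate gamma, base R, hypothetical helper names.
set.seed(123)
x <- rgamma(500, shape = 2, scale = 3)

# Closed-form method-of-moments estimator (shape, scale).
mme_gamma <- function(x) {
  m <- mean(x)
  v <- var(x)
  c(shape = m^2 / v, scale = v / m)
}

# Numerical MLE: minimise the negative log-likelihood on the log scale
# so that both parameters stay positive; start from the moment estimates.
mle_gamma <- function(x) {
  nll <- function(p) {
    -sum(dgamma(x, shape = exp(p[1]), scale = exp(p[2]), log = TRUE))
  }
  fit <- optim(log(mme_gamma(x)), nll)
  est <- exp(fit$par)
  names(est) <- c("shape", "scale")
  est
}

# Run one estimator and record its elapsed time in seconds.
bench_one <- function(f, x) {
  elapsed <- system.time(est <- f(x))[["elapsed"]]
  c(est, time = elapsed)
}

# One row per method: estimates followed by the elapsed time.
bench <- rbind(MME = bench_one(mme_gamma, x), MLE = bench_one(mle_gamma, x))
print(bench)
```

The closed-form row involves no iteration, which is the reason such estimators are typically much cheaper than an optimiser-based MLE.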
library(MLEce)
# bivariate gamma distribution
data_BiGam <- rBiGam(100, c(1, 4, 5))
benchMLEce(data_BiGam, distname = "BiGam", methods = c("MLEce", "MME"))
# bivariate Weibull distribution
data_BiWei <- rBiWei(n = 50, c(4, 3, 3, 4, 0.6))
benchMLEce(data_BiWei, distname = "BiWei", methods = c("MLE", "CME"))
# multivariate Dirichlet distribution
data_Diri <- LaplacesDemon::rdirichlet(80, c(3, 4, 1, 3, 4))
benchMLEce(data_Diri, distname = "Dirichlet", methods = c("MLEce", "MLE"))
Getting estimated values of efficient closed-form estimators
## S3 method for class 'MLEce'
coef(object, digits = max(3, getOption("digits") - 3), ...)
object |
an object of class |
digits |
the number of significant digits to use. |
... |
not used; present only for compatibility with the generic function. |
A numeric vector or a list containing the assigned distribution and the estimated values.
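The default for `digits` follows a common base R convention. As a small illustration in plain base R (not the package's internals), the sketch below shows what the default evaluates to under R's factory setting `digits = 7` and how it acts through `signif()`. The list `fit` is a made-up stand-in for a fitted object, using the component names `distribution` and `estimation` documented for class "MLEce"; the numbers in it are invented.

```r
# Default number of significant digits implied by the signature above.
op <- options(digits = 7)                 # R's factory default, set explicitly
digits <- max(3, getOption("digits") - 3)
print(digits)

# Hypothetical stand-in for a fitted object (values are made up).
fit <- list(
  distribution = "BiGam",
  estimation = c(shape1 = 1.03456789, shape2 = 4.1234567, scale = 4.9876543)
)

# Round the estimates to the chosen number of significant digits.
rounded <- signif(fit$estimation, digits)
print(rounded)

options(op)                               # restore the previous setting
```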
library(MLEce)
# bivariate gamma distribution
data_BiGam <- rBiGam(100, c(1, 4, 5))
res_BiGam <- MLEce(data_BiGam, "BiGam")
coef(res_BiGam)
# bivariate Weibull distribution
data_BiWei <- rBiWei(n = 50, c(4, 3, 3, 4, 0.6))
est_BiWei <- MLEce(data_BiWei, "BiWei")
coef(est_BiWei)
# Dirichlet distribution
data_Diri <- LaplacesDemon::rdirichlet(n = 60, c(3, 1, 2, 4))
est_Diri <- MLEce(data_Diri, "Dirichlet")
coef(est_Diri)
Getting confidence intervals for efficient closed-form estimators
confCI(object, bootsize = 1000, level = 0.95)
object |
an object of class |
bootsize |
a numeric value giving the number of bootstrap replicates; the default is 1000. |
level |
a numeric value between 0 and 1 giving the confidence level of the interval; the default is 0.95. |
The confidence intervals for the estimated parameters of the assigned distribution are obtained by the bootstrap method.
A list containing the assigned distribution, the confidence intervals, and alpha, which equals one minus the confidence level.
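As a rough picture of what a bootstrap interval involves, the sketch below computes a nonparametric percentile interval in base R. This is an assumption made for illustration: the text above does not say whether a parametric or nonparametric bootstrap is used, nor which interval type. The estimator `est_fun` is a simple gamma method-of-moments stand-in with a hypothetical name, not an MLEce.

```r
# Illustration only: nonparametric percentile bootstrap in base R.
set.seed(2024)
x <- rgamma(200, shape = 3, scale = 2)

# Stand-in estimator (gamma method of moments); hypothetical helper name.
est_fun <- function(x) {
  m <- mean(x)
  v <- var(x)
  c(shape = m^2 / v, scale = v / m)
}

bootsize <- 1000
level <- 0.95
alpha <- 1 - level

# Re-estimate on data resampled with replacement;
# the result is a 2 x bootsize matrix (one row per parameter).
boot_est <- replicate(bootsize, est_fun(sample(x, replace = TRUE)))

# Percentile limits: rows are parameters, columns are lower and upper limits.
ci <- t(apply(boot_est, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2)))
print(ci)
```

A larger `bootsize` makes the percentile limits more stable at the cost of proportionally more re-estimation, which is where a fast closed-form estimator helps.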
library(MLEce)
# bivariate gamma distribution, real data
data(flood)
est_BiGam <- MLEce(flood, "BiGam")
confCI(est_BiGam)
# bivariate Weibull distribution
datt <- rBiWei(n = 50, c(4, 3, 3, 4, 0.6))
est_BiWei <- MLEce(datt, "BiWei")
confCI(est_BiWei)
# Dirichlet distribution
data_Diri <- LaplacesDemon::rdirichlet(n = 60, c(3, 1, 2, 4))
est_Diri <- MLEce(data_Diri, "Dirichlet")
confCI(est_Diri)
The data are a subset of the flood events data of the Madawaska basin, which is located in the province of Quebec, Canada. Daily streamflow data from 1919 to 1995 are available from HYDAT CD (1998) and Yue (2001).
data(flood, package = "MLEce")
A data frame with 2 variables and 77 observations as follows:
daily average flood volume
flood peak
Environment Canada (1998), HYDAT CD-ROM Version 98-1.05.8: Surface water and sediment data.
Yue, S. (2001) A bivariate gamma distribution for use in multivariate flood frequency analysis. Hydrological Processes, 15, 1033–1045.
The data are counts of the frequency of occurrence of different kinds of fossil pollen grains and are available in Mosimann (1962).
data(fossil_pollen, package = "MLEce")
A data frame with 73 observations and 4 variables: pinus, abies, quercus and alnus pollen counts.
Mosimann, J. E. (1962) On the compound multinomial distribution, the multivariate beta distribution, and correlations among proportions. Biometrika, 49, 65-82.
Goodness-of-fit test for the efficient closed-form estimators
gof(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'gof'
print(x, digits = max(3, getOption("digits") - 3), ...)
x |
an object of class "MLEce" made by the function |
digits |
the number of significant digits to use. |
... |
additional arguments affecting the goodness-of-fit test. |
The generalized Cramer-von Mises test (Chiu and Liu, 2009) is used as the goodness-of-fit test for multivariate distributions. For the bivariate gamma and Dirichlet distributions, the L2-symmetric discrepancy (SD2) statistic is applied, whereas the L2-centred discrepancy (CD2) statistic is applied for the bivariate Weibull distribution.
Chiu, S. N. and Liu, K. I. (2009) Generalized Cramer-von Mises goodness-of-fit tests for multivariate distributions. Computational Statistics and Data Analysis, 53, 3817-3834.
library(MLEce)
# bivariate gamma distribution
data_BiGam <- rBiGam(100, c(1, 4, 5))
res_BiGam <- MLEce(data_BiGam, "BiGam")
gof(res_BiGam)
# bivariate Weibull distribution
datt <- rBiWei(n = 50, c(4, 3, 3, 4, 0.6))
est_BiWei <- MLEce(datt, "BiWei")
gof(est_BiWei)
# Dirichlet distribution
data_Diri <- LaplacesDemon::rdirichlet(n = 60, c(3, 1, 2, 4))
est_Diri <- MLEce(data_Diri, "Dirichlet")
gof(est_Diri)
The closed-form estimators (MLEces) are calculated for three distributions: bivariate gamma, bivariate Weibull and multivariate Dirichlet.
MLEce(data, distname)
data |
a numeric matrix. |
distname |
a character string indicating which distribution is to be fitted. |
Starting from root-n-consistent estimators, the closed-form estimators (MLEces) are calculated for the parameters of the bivariate gamma, bivariate Weibull and multivariate Dirichlet distributions, whose maximum likelihood estimators (MLEs) are not available in closed form. The MLEces are strongly consistent and asymptotically normal like the corresponding MLEs, but they are much faster to compute. For the bivariate gamma and multivariate Dirichlet distributions, the root-n-consistent estimators are the corresponding method of moments estimators (MMEs). For the bivariate Weibull distribution, the correlation-based estimators (CMEs) are used as the root-n-consistent estimators.
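To make the idea of a closed-form starting estimator concrete, here is a sketch of one common method-of-moments estimator for the Dirichlet distribution in base R. It is an assumption for illustration, since the package may match different moments. It uses E[X_i] = a_i / a_0 and Var[X_1] = m_1 (1 - m_1) / (a_0 + 1), where a_0 is the sum of the parameters and m_1 is the mean of the first component. The helper names `rdirichlet_base` and `mme_dirichlet` are hypothetical.

```r
# Illustration only: a common Dirichlet method-of-moments estimator in base R.
# Not taken from the package; helper names are hypothetical.
set.seed(42)

# Draw Dirichlet samples by normalising independent gamma variates.
# Column j of the matrix holds gamma draws with shape alpha[j].
rdirichlet_base <- function(n, alpha) {
  g <- matrix(rgamma(n * length(alpha), shape = rep(alpha, each = n)), nrow = n)
  g / rowSums(g)
}

mme_dirichlet <- function(x) {
  m <- colMeans(x)                      # estimates of alpha_i / alpha_0
  v1 <- mean((x[, 1] - m[1])^2)         # variance of first component (divisor n)
  a0 <- m[1] * (1 - m[1]) / v1 - 1      # estimate of alpha_0 = sum(alpha)
  unname(m * a0)
}

x <- rdirichlet_base(20000, c(3, 1, 2, 4))
alpha_hat <- mme_dirichlet(x)
print(round(alpha_hat, 2))
```

Every step is a mean or a ratio of means, so no iteration is needed; an estimator of this kind is the sort of input the closed-form construction described above starts from.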
MLEce returns an object of class "MLEce". An object of class "MLEce" is a list containing the following components.
distribution |
a character string naming the distribution that the data are assumed to come from and to which the data were fitted. |
estimation |
the estimated values of the parameters of the assigned distribution. |
Kim, H.-M., Jang, Y.-H., Arnold, B. C. and Zhao, J. (2023) New efficient estimators for the Weibull distribution. Communications in Statistics - Theory and Methods, 1-26.
Jang, Y.-H., Zhao, J., Kim, H.-M., Yu, K., Kwon, S. and Kim, S. (2023) New closed-form efficient estimator for the multivariate gamma distribution. Statistica Neerlandica, 1–18.
Chang, J. H., Lee, S. K. and Kim, H.-M. (2023) An asymptotically efficient closed-form estimator for the multivariate Dirichlet distribution. Submitted.
library(MLEce)
# bivariate gamma distribution
data_BiGam <- rBiGam(100, c(1, 4, 5))
res_BiGam <- MLEce(data_BiGam, "BiGam")
print(res_BiGam)
data(flood)
est_BiGam <- MLEce(flood, "BiGam")
print(est_BiGam)
# bivariate Weibull distribution
data_BiWei <- rBiWei(n = 30, c(4, 3, 3, 4, 0.6))
res_BiWei <- MLEce(data_BiWei, "BiWei")
print(res_BiWei)
# real data example
data(airquality)
air_data <- airquality[, 3:4]
air_data[, 2] <- air_data[, 2] * 0.1
est_BiWei <- MLEce(air_data, "BiWei")
print(est_BiWei)
# Dirichlet distribution
data_Diri <- LaplacesDemon::rdirichlet(n = 60, c(1, 2, 3))
res_Diri <- MLEce(data_Diri, "Dirichlet")
print(res_Diri)
# real data example
data(fossil_pollen)
fossil_data <- fossil_pollen / rowSums(fossil_pollen)
eps <- 1e-10
fossil_data <- (fossil_data + eps) / (1 + 2 * eps)
est_fossil <- MLEce(fossil_data, "Dirichlet")
print(est_fossil)
plot method for class "MLEce".
## S3 method for class 'MLEce'
plot(x, which = c(1, 2, 3, 4), ask = prod(par("mfcol")) < length(which) && dev.interactive(), ...)
x |
an object of class "MLEce" made by the function |
which |
if a subset of the plots is required, specify a subset of 1:4. |
ask |
logical; if TRUE, the user is asked before each plot. |
... |
not used; present only for compatibility with the generic function. |
The boxplot of the given data is presented first with which=1. For which=2, contour lines of the probability density function evaluated at the efficient closed-form estimates are drawn. In the contour plot, the x-axis is the first column of the data and the y-axis is the second column. For which=3, a marginally fitted probability density plot is given for the first column of the input data, with a fitted line added for the efficient closed-form estimator. For which=4, the same kind of marginally fitted probability density plot is given for the second column of the input data. Note that, for the bivariate Weibull distribution, the marginally fitted probability density plots in which=3 and which=4 compare the efficient closed-form estimators (MLEces) with the correlation-based estimators (CMEs). Note also that this plot command is limited to bivariate distributions.
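A marginally fitted density plot of the kind described for which=3 and which=4 can be approximated in base R as a density-scaled histogram with a fitted curve on top. The sketch below is only an illustration under assumptions: the data are simulated, the margin is taken to be gamma, and the fit is a method-of-moments stand-in rather than the package's estimator.

```r
# Illustration only: histogram of one margin with a fitted density overlaid.
set.seed(7)
obs <- rgamma(300, shape = 4, scale = 1.5)   # stand-in for one data column

# Gamma method-of-moments fit (assumed margin; not the package's estimator).
shape_hat <- mean(obs)^2 / var(obs)
scale_hat <- var(obs) / mean(obs)

# Histogram on the density scale so the curve is directly comparable.
hist(obs, freq = FALSE, main = "Marginal fit (sketch)", xlab = "first column")

# Evaluate the fitted density on a grid and add it as a line.
grid_x <- seq(min(obs), max(obs), length.out = 200)
dens <- dgamma(grid_x, shape = shape_hat, scale = scale_hat)
lines(grid_x, dens, lwd = 2)
```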
library(MLEce)
# bivariate gamma distribution: marginal fit for the first column
data(flood)
est_BiGam <- MLEce(flood, "BiGam")
plot(est_BiGam, c(3))
# bivariate Weibull distribution: all four plots
air_data <- airquality[, 3:4]
air_data[, 2] <- air_data[, 2] * 0.1
est_BiWei <- MLEce(air_data, "BiWei")
plot(est_BiWei)
# Dirichlet distribution reduced to two components: contour plot
data(fossil_pollen)
fossil_data <- cbind(fossil_pollen[, 1] / 100, rowSums(fossil_pollen[, -1] / 100))
est_fossil <- MLEce(fossil_data, "Dirichlet")
plot(est_fossil, c(2))
Generating random data for the bivariate gamma distribution with parameters.
rBiGam(n, paras)
n |
number of observations. |
paras |
parameters of bivariate gamma distribution (shape1, shape2, scale). |
Random generation for the bivariate gamma distribution is provided. The specific generation formulas can be found in Jang et al. (2023).
rBiGam generates random deviates. The length of the generated data is determined by "n".
Jang, Y.-H., Zhao, J., Kim, H.-M., Yu, K., Kwon, S. and Kim, S. (2023) New closed-form efficient estimator for the multivariate gamma distribution. Statistica Neerlandica, 1–18.
datt <- rBiGam(n = 50, c(4, 3, 3))
Generating random data for the bivariate Weibull distribution.
rBiWei(n, paras)
n |
number of observations. |
paras |
parameters of bivariate Weibull distribution (alpha1, beta1, alpha2, beta2, delta). |
rBiWei generates random data from the bivariate Weibull distribution.
rBiWei generates random deviates. The length of the generated data is determined by "n".
datt <- rBiWei(n = 50, c(4, 3, 3, 4, 0.6))
summary method for class "MLEce".
## S3 method for class 'MLEce'
summary(object, ...)
## S3 method for class 'summary.MLEce'
print(x, digits = max(3, getOption("digits") - 3), ...)
object |
an object of class "MLEce" made by the function |
... |
not used; present only for compatibility with the generic function. |
x |
an object of class "summary.MLEce". |
digits |
the number of significant digits to use. |
summary presents information about the efficient closed-form estimators calculated by MLEce, containing the following components.
Distribution |
the distribution assigned to fit the data to. |
Quantile |
a numeric vector describing the data set with the minimum, 1st quartile, median, 3rd quartile, and maximum values. |
Correlation |
the correlation coefficient between the two columns of the data. |
Estimation |
the estimated parameter values, together with their standard errors and confidence intervals. |
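The descriptive components listed above (quantiles and correlation) correspond to standard base R computations. The following sketch is not the package's code and makes its own assumptions: it uses the built-in `airquality` columns from the examples, and it assumes a Pearson correlation, which the text does not specify.

```r
# Illustration only: base R summaries similar in spirit to the components above.
data(airquality)
air_data <- airquality[, 3:4]            # Wind and Temp, as in the examples
air_data[, 2] <- air_data[, 2] * 0.1     # same rescaling as in the examples

# Five-number summary per column: min, 1st quartile, median, 3rd quartile, max.
quant <- apply(air_data, 2, quantile)

# Pearson correlation between the two columns (assumed correlation type).
rho <- cor(air_data[, 1], air_data[, 2])

print(quant)
print(rho)
```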
library(MLEce)
# bivariate gamma distribution
data(flood)
est_res1 <- MLEce(flood, "BiGam")
summary(est_res1)
# bivariate Weibull distribution
datt <- rBiWei(n = 50, c(2, 3, 3, 4, 0.4))
est_res2 <- MLEce(datt, "BiWei")
summary(est_res2)
# Dirichlet distribution
data(fossil_pollen)
fossil_data <- fossil_pollen / rowSums(fossil_pollen)
eps <- 1e-10
fossil_data <- (fossil_data + eps) / (1 + 2 * eps)
est_res3 <- MLEce(fossil_data, "Dirichlet")
summary(est_res3)