Implementation of various statistical models for multivariate event
history data doi:10.1007/s10985-013-9244-x. Including multivariate
cumulative incidence models doi:10.1002/sim.6016, and bivariate random
effects probit models (Liability models)
doi:10.1016/j.csda.2015.01.014. Modern methods for survival analysis,
including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial
regression) with fast computation of influence functions. Restricted
mean survival time regression and years lost for competing risks.
Average treatment effects and G-computation. All functions can be used
with clusters and will work for large data.
Installation
install.packages("mets")
The development version may be installed directly from github (requires
Rtools on windows and
development tools (+Xcode) for Mac
OS X):
Thomas H. Scheike and Klaus K. Holst and Jacob B. Hjelmborg (2013).
Estimating heritability for cause specific mortality based on twin
studies. Lifetime Data Analysis.
http://dx.doi.org/10.1007/s10985-013-9244-x
Klaus K. Holst and Thomas H. Scheike Jacob B. Hjelmborg (2015). The
Liability Threshold Model for Censored Twin Data. Computational
Statistics and Data Analysis.
http://dx.doi.org/10.1016/j.csda.2015.01.014
BibTeX:
@Article{,
title = {A Practical Guide to Family Studies with Lifetime Data},
author = {Thomas H. Scheike and Klaus K. Holst},
year = {2014},
volume = {9},
pages = {47-69},
journal = {Annual Review of Statistics and Its Application},
doi = {10.1146/annurev-statistics-040120-024253},
}
@Article{,
title={Estimating heritability for cause specific mortality based on twin studies},
author={Scheike, Thomas H. and Holst, Klaus K. and Hjelmborg, Jacob B.},
year={2013},
issn={1380-7870},
journal={Lifetime Data Analysis},
doi={10.1007/s10985-013-9244-x},
url={http://dx.doi.org/10.1007/s10985-013-9244-x},
publisher={Springer US},
keywords={Cause specific hazards; Competing risks; Delayed entry;
Left truncation; Heritability; Survival analysis},
pages={1-24},
language={English}
}
@Article{,
title={The Liability Threshold Model for Censored Twin Data},
author={Holst, Klaus K. and Scheike, Thomas H. and Hjelmborg, Jacob B.},
year={2015},
doi={10.1016/j.csda.2015.01.014},
url={http://dx.doi.org/10.1016/j.csda.2015.01.014},
journal={Computational Statistics and Data Analysis}
}
Examples: Twins Polygenic modelling
First considering standard twin modelling (ACE, AE, ADE, and more
models)
# simulated data with pairs of observations in twins on long #data format
set.seed(1)
d <- twinsim(1000, b1=c(1,-1), b2=c(), acde=c(1,1,0,1))
# Polygenic model with Additive genetic effects, and shared and individual environmental effects (ACE)
ace <- twinlm(y ~ 1, data=d, DZ="DZ", zyg="zyg", id="id")
ace
#> Estimate Std. Error Z value Pr(>|z|)
#> y -0.019439 0.041817 -0.4649 0.642
#> sd(A) 0.902004 0.203739 4.4273 9.544e-06
#> sd(C) 1.137025 0.132852 8.5586 < 2.2e-16
#> sd(E) 1.728992 0.037408 46.2194 < 2.2e-16
#>
#> MZ-pairs DZ-pairs
#> 1000 1000
#>
#> Variance decomposition:
#> Estimate 2.5% 97.5%
#> A 0.15966 0.01867 0.30065
#> C 0.25370 0.13920 0.36820
#> E 0.58664 0.53677 0.63650
#>
#>
#> Estimate 2.5% 97.5%
#> Broad-sense heritability 0.15966 0.01867 0.30065
#>
#> Estimate 2.5% 97.5%
#> Correlation within MZ: 0.41336 0.36229 0.46196
#> Correlation within DZ: 0.33353 0.27933 0.38561
#>
#> 'log Lik.' -8779.953 (df=4)
#> AIC: 17567.91
#> BIC: 17590.31
# An AE-model could be fitted as
ae <- twinlm(y ~ 1, data=d, DZ="DZ", zyg="zyg", id="id", type="ae")
# AIC
AIC(ae)-AIC(ace)
#> [1] 15.20656
# To adjust for the covariates we simply alter the formula statement
ace2 <- twinlm(y ~ x1+x2, data=d, DZ="DZ", zyg="zyg", id="id", type="ace")
## Summary/GOF
summary(ace2)
#> Estimate Std. Error Z value Pr(>|z|)
#> y -0.026049 0.034844 -0.7476 0.4547
#> sd(A) 1.066060 0.072890 14.6256 <2e-16
#> sd(C) 0.980740 0.073569 13.3309 <2e-16
#> sd(E) 0.979980 0.021887 44.7736 <2e-16
#> y~x1 1.006963 0.021900 45.9807 <2e-16
#> y~x2 -0.993802 0.021962 -45.2512 <2e-16
#>
#> MZ-pairs DZ-pairs
#> 1000 1000
#>
#> Variance decomposition:
#> Estimate 2.5% 97.5%
#> A 0.37156 0.27300 0.47012
#> C 0.31446 0.22643 0.40250
#> E 0.31398 0.28381 0.34414
#>
#>
#> Estimate 2.5% 97.5%
#> Broad-sense heritability 0.37156 0.27300 0.47012
#>
#> Estimate 2.5% 97.5%
#> Correlation within MZ: 0.68602 0.65467 0.71502
#> Correlation within DZ: 0.50024 0.45538 0.54257
#>
#> 'log Lik.' -7449.697 (df=6)
#> AIC: 14911.39
#> BIC: 14945
Examples: Twins Polygenic modelling time-to-events Data
In the context of time-to-events data we consider the “Liability
Threshold model” with IPCW adjustment for censoring.
First we fit the bivariate probit model (same marginals in MZ and DZ
twins but different correlation parameter). Here we evaluate the risk of
getting cancer before the last double cancer event (95 years)
Examples: Twins Concordance for time-to-events Data
data(prt) ## Prostate data example (sim)
## Bivariate competing risk, concordance estimates
p33 <- bicomprisk(Event(time,status)~strata(zyg)+id(id),
data=prt, cause=c(2,2), return.data=1, prodlim=TRUE)
#> Strata 'DZ'
#> Strata 'MZ'
p33dz <- p33$model$"DZ"$comp.risk
p33mz <- p33$model$"MZ"$comp.risk
## Probability weights based on Aalen's additive model (same censoring within pair)
prtw <- ipw(Surv(time,status==0)~country+zyg, data=prt,
obs.only=TRUE, same.cens=TRUE,
cluster="id", weight.name="w")
## Marginal model (wrongly ignoring censorings)
bpmz <- biprobit(cancer~1 + cluster(id),
data=subset(prt,zyg=="MZ"), eqmarg=TRUE)
## Extended liability model
bpmzIPW <- biprobit(cancer~1 + cluster(id),
data=subset(prtw,zyg=="MZ"),
weights="w")
smz <- summary(bpmzIPW)
## Concordance
plot(p33mz,ylim=c(0,0.1),axes=FALSE, automar=FALSE,atrisk=FALSE,background=TRUE,background.fg="white")
axis(2); axis(1)
abline(h=smz$prob["Concordance",],lwd=c(2,1,1),col="darkblue")
## Wrong estimates:
abline(h=summary(bpmz)$prob["Concordance",],lwd=c(2,1,1),col="lightgray",lty=2)
Examples: Cox model, RMST
We can fit the Cox model and compute many useful summaries, such as
restricted mean survival and standardized treatment effects
(G-estimation). First estimating the standardized survival
Based on the phreg we can also compute the restricted mean survival time
and years lost (via Kaplan-Meier estimates). The function does it for
all times at once and can be plotted as restricted mean survival or
years lost at the different time horizons
For competing risks the years lost can be decomposed into different
causes and is based on the integrated Aalen-Johansen estimators for the
different strata
Computations are again done for all time horizons at once as illustrated
in the plot.
Examples: Cox model IPTW
We can fit the Cox model with inverse probability of treatment weights
based on logistic regression. The treatment weights can be
time-dependent and then multiplicative weights are applied (see details
and vignette).
and we can get standard errors for predictions based on the influence
functions of the baseline and the regression coefficients (these are
used in the predict function)
library(mets)
data(hfactioncpx12)
hf <- hfactioncpx12
hf$severity <- abs((5+rnorm(741)*2))[hf$id]
## marginal mean using formula
outNZ <- recurrent_marginal(Event(entry,time,status)~strata(treatment)+cluster(id)
+marks(severity),hf,cause=1,death.code=2)
plot(outNZ,se=TRUE)
summary(outNZ,times=3)
For comparison we also compute the IPCW estimates at time 3, using
the linear model, and note that they are identical. Standard errors are however based on different formula that are asymptotically
equivalent, and we note that they are very similar.
Examples: Regression for RMST/Restricted mean survival for survival and competing risks using IPCW
RMST can be computed using the Kaplan-Meier (via phreg) and the for
competing risks via the cumulative incidence functions, but we can also
get these estimates via IPCW adjustment and then we can do regression
mets)Multivariate Event Times (
mets)Implementation of various statistical models for multivariate event history data doi:10.1007/s10985-013-9244-x. Including multivariate cumulative incidence models doi:10.1002/sim.6016, and bivariate random effects probit models (Liability models) doi:10.1016/j.csda.2015.01.014. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions. Restricted mean survival time regression and years lost for competing risks. Average treatment effects and G-computation. All functions can be used with clusters and will work for large data.
Installation
The development version may be installed directly from github (requires Rtools on windows and development tools (+Xcode) for Mac OS X):
or to get development version
Citation
To cite the
metspackage please use one of the following referencesBibTeX:
Examples: Twins Polygenic modelling
First considering standard twin modelling (ACE, AE, ADE, and more models)
Examples: Twins Polygenic modelling time-to-events Data
In the context of time-to-events data we consider the “Liability Threshold model” with IPCW adjustment for censoring.
First we fit the bivariate probit model (same marginals in MZ and DZ twins but different correlation parameter). Here we evaluate the risk of getting cancer before the last double cancer event (95 years)
Liability threshold model with ACE random effects structure
Examples: Twins Concordance for time-to-events Data
Examples: Cox model, RMST
We can fit the Cox model and compute many useful summaries, such as restricted mean survival and standardized treatment effects (G-estimation). First estimating the standardized survival
Based on the phreg we can also compute the restricted mean survival time and years lost (via Kaplan-Meier estimates). The function does it for all times at once and can be plotted as restricted mean survival or years lost at the different time horizons
For competing risks the years lost can be decomposed into different causes and is based on the integrated Aalen-Johansen estimators for the different strata
Computations are again done for all time horizons at once as illustrated in the plot.
Examples: Cox model IPTW
We can fit the Cox model with inverse probability of treatment weights based on logistic regression. The treatment weights can be time-dependent and then multiplicative weights are applied (see details and vignette).
Examples: Competing risks regression, Binomial Regression
We can fit the logistic regression model at a specific time-point with IPCW adjustment
Examples: Competing risks regression, Fine-Gray/Logistic link
We can fit the Fine-Gray model and the logit-link competing risks model (using IPCW adjustment). Starting with the logit-link model
Similarly, the Fine-Gray model can be estimated using IPCW adjustment
and we can get standard errors for predictions based on the influence functions of the baseline and the regression coefficients (these are used in the predict function)
further G-estimation can be done
Examples: Marginal mean for recurrent events
We can estimate the expected number of events non-parametrically and get standard errors for this estimator
Examples: Ghosh-Lin for recurrent events
We can fit the Ghosh-Lin model for the expected number of events observed before dying (using IPCW adjustment and get predictions)
and we can get standard errors for predictions based on the influence functions of the baseline and the regression coefficients
The influence functions of the baseline and regression coefficients at a specific time-point can be obtained
and G-computation
Examples: Fixed time modelling for recurrent events
We can fit a log-link regression model at 2 years for the expected number of events observed before dying (using IPCW adjustment)
Examples: Cumulative Medical Cost
Estimate mean cumulative cost (see also vignette)
For comparison we also compute the IPCW estimates at time 3, using the linear model, and note that they are identical. Standard errors are however based on different formula that are asymptotically equivalent, and we note that they are very similar.
We also apply the semiparametric proportional cost model with IPCW adjustment:
Examples: Regression for RMST/Restricted mean survival for survival and competing risks using IPCW
RMST can be computed using the Kaplan-Meier (via phreg) and the for competing risks via the cumulative incidence functions, but we can also get these estimates via IPCW adjustment and then we can do regression
Examples: Average treatment effects (ATE) for survival or competing risks
We can compute ATE for survival or competing risks data for the probability of dying
or the the restricted mean survival or years-lost to cause 1
Here event is 0/1 thus leading to restricted mean and cause taking the values 0,1,2 produces regression for the years lost due to cause 1.
Examples: While Alive estimands for recurrent events
We consider an RCT and aim to describe the treatment effect via while alive estimands