Title: | Fitting an Interval Response Variable Using `gamlss.family' Distributions |
---|---|
Description: | This is an add-on package to GAMLSS. The purpose of this package is to allow users to fit interval response variables in GAMLSS models. The main function gen.cens() generates a censored version of an existing GAMLSS family distribution. |
Authors: | Mikis Stasinopoulos [aut, cre, cph] , Robert Rigby [aut] , Nicoleta Mortan [ctb], Alexander Seipp [ctb] |
Maintainer: | Mikis Stasinopoulos <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 5.0-8 |
Built: | 2025-01-04 02:38:28 UTC |
Source: | https://github.com/gamlss-dev/gamlss.cens |
This is an add-on package to GAMLSS. The purpose of this package is to allow users to fit interval response variables in GAMLSS models. The main function gen.cens() generates a censored version of an existing GAMLSS family distribution.
The DESCRIPTION file:
Package: | gamlss.cens |
Title: | Fitting an Interval Response Variable Using `gamlss.family' Distributions |
Version: | 5.0-8 |
Date: | 2023-11-02 |
Authors@R: | c(person("Mikis", "Stasinopoulos", role = c("aut", "cre", "cph"), email = "[email protected]", comment = c(ORCID = "0000-0003-2407-5704")), person("Robert", "Rigby", role = "aut", email = "[email protected]", comment = c(ORCID = "0000-0003-3853-1707")), person("Nicoleta", "Mortan", role = "ctb"), person("Alexander", "Seipp", role = "ctb") ) |
Description: | This is an add-on package to GAMLSS. The purpose of this package is to allow users to fit interval response variables in GAMLSS models. The main function gen.cens() generates a censored version of an existing GAMLSS family distribution. |
License: | GPL-2 | GPL-3 |
URL: | https://www.gamlss.com/ |
BugReports: | https://github.com/gamlss-dev/gamlss.cens/issues |
Depends: | R (>= 2.2.1), gamlss.dist, gamlss, survival, methods |
Repository: | https://gamlss-dev.r-universe.dev |
RemoteUrl: | https://github.com/gamlss-dev/gamlss.cens |
RemoteRef: | HEAD |
RemoteSha: | 4720a82609a79a71462bdd1a52ed92a13f8712ed |
Author: | Mikis Stasinopoulos [aut, cre, cph] (<https://orcid.org/0000-0003-2407-5704>), Robert Rigby [aut] (<https://orcid.org/0000-0003-3853-1707>), Nicoleta Mortan [ctb], Alexander Seipp [ctb] |
Maintainer: | Mikis Stasinopoulos <[email protected]> |
Index of help topics:
cens Function to Fit Censored Data Using a gamlss.family Distribution cens.d Censored Probability Density Function of a gamlss.family Distribution cens.p Censored Cumulative Probability Density Function of a gamlss.family Distribution cens.q Censored Inverse Cumulative Probability Density Function of a gamlss.family Distribution gamlss.cens-package Fitting an Interval Response Variable Using 'gamlss.family' Distributions gen.cens A Function to Generate Appropriate Functions to Be Used to Fit a Censored Response variable in GAMLSS lip Data for lip
Mikis Stasinopoulos [aut, cre, cph] (<https://orcid.org/0000-0003-2407-5704>), Robert Rigby [aut] (<https://orcid.org/0000-0003-3853-1707>), Nicoleta Mortan [ctb], Alexander Seipp [ctb]
Maintainer: Mikis Stasinopoulos <[email protected]>
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also https://www.gamlss.com/).
library(survival) library(gamlss) library(gamlss.dist) # comparing results with package survival # fitting the exponential distribution ms1<-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='exponential') mg1<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(EXP),c.crit=0.00001) if(abs(-2*ms1$loglik[2]-deviance(mg1))>0.001) stop(paste("descrepancies in exp")) if(sum(coef(ms1)-coef(mg1))>0.001) warning(paste("descrepancies in coef in exp")) summary(ms1) summary(mg1) # fitting the Weibull distribution ms2 <-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='weibull') mg2 <-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI, delta=c(0.001,0.001)), c.crit=0.00001) if(abs(-2*ms2$loglik[2]-deviance(mg2))>0.005) stop(paste("descrepancies in deviance in WEI")) summary(ms2);summary(mg2) # compare the scale parameter 1/exp(coef(mg2,"sigma")) # now fit the Weibull in different parameterrazions mg21<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI2), method=mixed(2,30)) mg21<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI3))
library(survival) library(gamlss) library(gamlss.dist) # comparing results with package survival # fitting the exponential distribution ms1<-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='exponential') mg1<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(EXP),c.crit=0.00001) if(abs(-2*ms1$loglik[2]-deviance(mg1))>0.001) stop(paste("descrepancies in exp")) if(sum(coef(ms1)-coef(mg1))>0.001) warning(paste("descrepancies in coef in exp")) summary(ms1) summary(mg1) # fitting the Weibull distribution ms2 <-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='weibull') mg2 <-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI, delta=c(0.001,0.001)), c.crit=0.00001) if(abs(-2*ms2$loglik[2]-deviance(mg2))>0.005) stop(paste("descrepancies in deviance in WEI")) summary(ms2);summary(mg2) # compare the scale parameter 1/exp(coef(mg2,"sigma")) # now fit the Weibull in different parameterrazions mg21<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI2), method=mixed(2,30)) mg21<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI3))
This function can be used to fit censored or interval response variables.
It takes as an argument an existing gamlss.family
distribution
and generates
a new gamlss.family
object which then can be used to fit
right, left or interval censored data.
cens(family = "NO", type = c("right", "left", "interval"), name = "cens", local = TRUE, delta = NULL, ...)
cens(family = "NO", type = c("right", "left", "interval"), name = "cens", local = TRUE, delta = NULL, ...)
family |
a |
name |
the characters you want to add to the name of new functions, by default is |
type |
what type of censoring is required, |
local |
if TRUE the function will try to find the environment of |
delta |
the delta increment used in the numerical derivatives |
... |
for extra arguments |
This function is created to help users to fit censored data using an existing
gamlss.family
distribution.
It does this by taking an existing gamlss.family
and changing
some of the components of the distribution to help the fitting process.
It particular it (i) creates a (d
) function (for calculating the censored
likelihood) and a (p
) function (for generating the quantile residuals)
within gamlss
,
(ii) changes the global deviance function G.dev.incr
,
the first derivative functions (see note below)
and other quantities from the original distribution.
It returns a gamlss.family
object which has all the components needed for fitting a distribution in gamlss
.
This function is experimental and could be changed in the future.
The function cens
changes the first derivatives of the original gamlss family
d
function to numerical derivatives for the new censored d
function.
The default increment delta
, for this numerical derivatives function,
is eps * pmax(abs(x), 1)
where eps<-sqrt(.Machine$double.eps)
.
The default delta
could be inappropriate for
specific applications and can be overwritten by using the argument delta
.
Note that in order to get the correct standard errors you have to generate the "d" function by using
gen.cens()
.
Mikis Stasinopoulos [email protected] and Bob Rigby [email protected]
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, 1-38.
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also https://www.gamlss.com/).
# comparing output with the survreg() of package survival library(gamlss.dist) library(survival) #-------------------------------------------------------------------- # right censoring example # example from survreg() # fitting the exponential distribution mexp<-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='exponential') gexp<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(EXP), c.crit=0.00001) if(abs(-2*mexp$loglik[2]-deviance(gexp))>0.001) stop(paste("descrepancies in exponential models")) if(sum(coef(mexp)-coef(gexp))>0.001) warning(paste("descrepancies in coef in exponential models")) summary(mexp) gen.cens(EXP) summary(gexp) # fitting different distributions # weibull mwei <-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='weibull') gwei<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI, delta=c(0.0001,0.0001)), c.crit=0.00001) if(abs(-2*mwei$loglik[2]-deviance(gwei))>0.005) stop(paste("descrepancies in deviance in WEI")) scoef <- sum(coef(mwei)-coef(gwei)) if(abs(scoef)>0.005) warning(cat("descrepancies in coef in WEI of ", scoef, "\n")) # WEI3 is weibull parametrised with mu as the mean gwei3 <- gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI3)) # log normal mlogno <-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='lognormal') glogno<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(LOGNO, delta=c(0.001,0.001)), c.cyc=0.00001) if(abs(-2*mlogno$loglik[2]-deviance(glogno))>0.005) stop(paste("descrepancies in deviance in LOGNO")) coef(mlogno);coef(glogno) #-------------------------------------------------------------------- # now interval response variable data(lip) with(lip, y) mg1<-survreg(y ~ poly(Tem,2)+poly(pH,2)+poly(aw,2), data=lip, dist="weibull") gg1<- gamlss(y ~ poly(Tem,2)+poly(pH,2)+poly(aw,2), data=lip, family=cens(WEI,type="interval"), c.crit=0.00001, n.cyc=200, trace=FALSE) summary(mg1) gen.cens(WEI,type="interval") summary(gg1) #-------------------------------------------------------------------- # now fitting discretised continuous distribution to count data # fitting discretised Gamma data(species) # first generate the distributions gen.cens(GA, type="interval") gen.cens(IG, type="interval") mGA<-gamlss(Surv(fish,fish+1,type= "interval2")~log(lake)+I(log(lake)^2), sigma.fo=~log(lake), data=species, family=GAic) # fitting discretised inverse Gaussian mIG<-gamlss(Surv(fish,fish+1,type= "interval2")~log(lake)+I(log(lake)^2), sigma.fo=~log(lake), data=species, family=IGic) AIC(mGA,mIG) plot(fish~log(lake), data=species) with(species, lines(log(lake)[order(lake)], fitted(mIG)[order(lake)])) #--------------------------------------------------------------------
# comparing output with the survreg() of package survival library(gamlss.dist) library(survival) #-------------------------------------------------------------------- # right censoring example # example from survreg() # fitting the exponential distribution mexp<-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='exponential') gexp<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(EXP), c.crit=0.00001) if(abs(-2*mexp$loglik[2]-deviance(gexp))>0.001) stop(paste("descrepancies in exponential models")) if(sum(coef(mexp)-coef(gexp))>0.001) warning(paste("descrepancies in coef in exponential models")) summary(mexp) gen.cens(EXP) summary(gexp) # fitting different distributions # weibull mwei <-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='weibull') gwei<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI, delta=c(0.0001,0.0001)), c.crit=0.00001) if(abs(-2*mwei$loglik[2]-deviance(gwei))>0.005) stop(paste("descrepancies in deviance in WEI")) scoef <- sum(coef(mwei)-coef(gwei)) if(abs(scoef)>0.005) warning(cat("descrepancies in coef in WEI of ", scoef, "\n")) # WEI3 is weibull parametrised with mu as the mean gwei3 <- gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(WEI3)) # log normal mlogno <-survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='lognormal') glogno<-gamlss(Surv(futime, fustat) ~ ecog.ps + rx, data=ovarian, family=cens(LOGNO, delta=c(0.001,0.001)), c.cyc=0.00001) if(abs(-2*mlogno$loglik[2]-deviance(glogno))>0.005) stop(paste("descrepancies in deviance in LOGNO")) coef(mlogno);coef(glogno) #-------------------------------------------------------------------- # now interval response variable data(lip) with(lip, y) mg1<-survreg(y ~ poly(Tem,2)+poly(pH,2)+poly(aw,2), data=lip, dist="weibull") gg1<- gamlss(y ~ poly(Tem,2)+poly(pH,2)+poly(aw,2), data=lip, family=cens(WEI,type="interval"), c.crit=0.00001, n.cyc=200, trace=FALSE) summary(mg1) gen.cens(WEI,type="interval") summary(gg1) #-------------------------------------------------------------------- # now fitting discretised continuous distribution to count data # fitting discretised Gamma data(species) # first generate the distributions gen.cens(GA, type="interval") gen.cens(IG, type="interval") mGA<-gamlss(Surv(fish,fish+1,type= "interval2")~log(lake)+I(log(lake)^2), sigma.fo=~log(lake), data=species, family=GAic) # fitting discretised inverse Gaussian mIG<-gamlss(Surv(fish,fish+1,type= "interval2")~log(lake)+I(log(lake)^2), sigma.fo=~log(lake), data=species, family=IGic) AIC(mGA,mIG) plot(fish~log(lake), data=species) with(species, lines(log(lake)[order(lake)], fitted(mIG)[order(lake)])) #--------------------------------------------------------------------
Creates a probability density function from a current gamlss.family
distribution to be used for fitting a censored or interval response variable.
cens.d(family = "NO", type = c("right", "left", "interval"), ...)
cens.d(family = "NO", type = c("right", "left", "interval"), ...)
family |
a |
type |
whether |
... |
for extra arguments |
This function is used to calculate the likelihood function for censored data.
This function is not supposed to be used on its own but it is used in function gen.cens
.
Returns a modified d family function.
The argument of the original function d
function are the same.
For an example see gen.cens()
Mikis Stasinopoulos [email protected] and Bob Rigby
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, 1-38.
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also https://www.gamlss.com/).
#see the help for function cens for an example
#see the help for function cens for an example
Creates a cumulative density function from a current gamlss.family
distribution suitable for censored or interval response variable data.
cens.p(family = "NO", type = c("right", "left", "interval"), ...)
cens.p(family = "NO", type = c("right", "left", "interval"), ...)
family |
a |
type |
whether |
... |
for extra arguments |
This function is used to calculate the quantile residuals for censored data distributions.
This function is not supposed to be used on its own but it is used in the function gen.cens
.
Returns a modified p family function. The argument of the original function d function are the same.
For an example see gen.cens()
Mikis Stasinopoulos [email protected] and Bob Rigby [email protected]
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, 1-38.
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also https://www.gamlss.com/).
#see the help for function cens for an example
#see the help for function cens for an example
Creates the inverse cumulative density function from a current gamlss.family
distribution suitable for censored or interval response variable data.
This is a dummy function identical to the uncensored one but it is needed for
consistency in centile estimation from censored data.
cens.q(family = "NO", ...)
cens.q(family = "NO", ...)
family |
a |
... |
for extra arguments |
This is dummy function, used only to calculate centiles from censored response
variable.
This function is not supposed to be used on its own but is used by the function
gen.cens
.
Returns a modified q family function. The argument of the original function q function are the same.
For an example see gen.cens()
Mikis Stasinopoulos [email protected] and Bob Rigby [email protected]
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, 1-38.
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also https://www.gamlss.com/).
#see the help for function cens for an example
#see the help for function cens for an example
The gen.cens()
function allows the user to generate a
d
, p
, (dummy) q
and fitting gamlss
functions for
censor and interval response variables. The function can take any gamlss.family
distribution.
gen.cens(family = "NO", type = c("right", "left", "interval"), name = "cens", ...)
gen.cens(family = "NO", type = c("right", "left", "interval"), name = "cens", ...)
family |
a |
name |
the characters you want to add to the name of new functions, by default is the first letter of |
type |
whether |
... |
for extra arguments |
Returns the d
, p
, (dummy) q
and the fitting used in the fitting gamlss algorithm (The one
used in the fitting gamlss algorithm) of a gamlss.family
distribution.
Mikis Stasinopoulos [email protected] and Bob Rigby
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, 1-38.
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also https://www.gamlss.com/).
library(gamlss.dist) data(lip) gen.cens(WEI,type="interval") WEIic gg1<- gamlss(y ~ poly(Tem,2)+poly(pH,2)+poly(aw,2), data=lip, family=WEIic, c.crit=0.00001, n.cyc=200, trace=FALSE)
library(gamlss.dist) data(lip) gen.cens(WEI,type="interval") WEIic gg1<- gamlss(y ~ poly(Tem,2)+poly(pH,2)+poly(aw,2), data=lip, family=WEIic, c.crit=0.00001, n.cyc=200, trace=FALSE)
The data set used in this package are collected by Dr Peggy Braun (University of Leipzig) and passed on to use by professor Jane Sutherland of London Metropolitan University.
It consists of experimental enzymology results from a research project which attempted to develop a generic food spoilage model.
The data set contains a column called NAMES, which shows the experiment name,
three columns with values of the environmental conditions: temperature (Tem
),
pH
and water activity (aw
), and the rest of the columns
contains the activity of the cocktails, observed at certain days.
The researchers recorded the activity of proteases and lipases in each cocktail and were interested in predicting the time when the activity started given the environmental conditions. The activity is a positive integer and enzymes are considered inactive when activity=0.
data(lip)
data(lip)
A data frame with 120 observations on the following 14 variables.
name
a factor with levels the different experiment
Tem
a numeric vector showing the temperature
pH
a numeric vector PH
aw
a numeric vector water activity
X0.d
a numeric vector if enzyme reacted at day 0
X1.d
a numeric vector if enzyme reacted at day 1
X2.d
a numeric vector if enzyme reacted at day 2
X4.d
a numeric vector if enzyme reacted at days 3 or 4
X11.d
a numeric vector if enzyme reacted at days 5 to 11
X18d.
a numeric vector if enzyme reacted at days 12 to q18
X25.d
a numeric vector if enzyme reacted at days 19 to 25
X32.d
a numeric vector if enzyme reacted at days 26 to 32
X39.d
a numeric vector if enzyme reacted at days 33 to 39
y
a matrix with 3 columns: this is a Surv()
object indicating the start the finish and censored indicator as defined in
function Surv()
of survival.
Prof. Jane Sutherland, London Metropolitan University
data(lip) with(lip, y)
data(lip) with(lip, y)