- NAME
LINMIX_ERR
- PURPOSE
Bayesian approach to linear regression with errors in both X and Y
- EXPLANATION
Perform linear regression of y on x when there are measurement
errors in both variables. the regression assumes :
ETA = ALPHA + BETA * XI + EPSILON
X = XI + XERR
Y = ETA + YERR
- re, (ALPHA, BETA) are the regression coefficients, EPSILON is the
- trinsic random scatter about the regression, XERR is the
- asurement error in X, and YERR is the measurement error in
EPSILON is assumed to be normally-distributed with mean zero and
- riance SIGSQR. XERR and YERR are assumed to be
- rmally-distributed with means equal to zero, variances XSIG^2 and
- IG^2, respectively, and covariance XYCOV. The distribution of XI
modelled as a mixture of normals, with group proportions PI,
- an MU, and variance TAUSQR. Bayesian inference is employed, and
- structure containing random draws from the posterior is
- turned. Convergence of the MCMC to the posterior is monitored
- ing the potential scale reduction factor (RHAT, Gelman et
- .2004). In general, when RHAT < 1.1 then approximate convergence
reached.
- mple non-detections on y may also be included.
- LLING SEQUENCE
LINMIX_ERR, X, Y, POST, XSIG=, YSIG=, XYCOV=, DELTA=, NGAUSS=, /SILENT,
/METRO, MINITER= , MAXITER=
- PUTS
- X - THE OBSERVED INDEPENDENT VARIABLE. THIS SHOULD BE AN
NX-ELEMENT VECTOR.
- Y - THE OBSERVED DEPENDENT VARIABLE. THIS SHOULD BE AN NX-ELEMENT
VECTOR.
- TIONAL INPUTS
- XSIG - THE 1-SIGMA MEASUREMENT ERRORS IN X, AN NX-ELEMENT VECTOR.
- YSIG - THE 1-SIGMA MEASUREMENT ERRORS IN Y, AN NX-ELEMENT VECTOR.
- XYCOV - THE COVARIANCE BETWEEN THE MEASUREMENT ERRORS IN X AND Y,
AND NX-ELEMENT VECTOR.
- DELTA - AN NX-ELEMENT VECTOR INDICATING WHETHER A DATA POINT IS
CENSORED OR NOT. IF DELTA[i] = 1, THEN THE SOURCE IS
DETECTED, ELSE IF DELTA[i] = 0 THE SOURCE IS NOT DETECTED
AND Y[i] SHOULD BE AN UPPER LIMIT ON Y[i]. NOTE THAT IF
THERE ARE CENSORED DATA POINTS, THEN THE
MAXIMUM-LIKELIHOOD ESTIMATE (THETA) IS NOT VALID. THE
DEFAULT IS TO ASSUME ALL DATA POINTS ARE DETECTED, IE,
DELTA = REPLICATE(1, NX).
- METRO - IF METRO = 1, THEN THE MARKOV CHAINS WILL BE CREATED USING
THE METROPOLIS-HASTINGS ALGORITHM INSTEAD OF THE GIBBS
SAMPLER. THIS CAN HELP THE CHAINS CONVERGE WHEN THE SAMPLE
SIZE IS SMALL OR IF THE MEASUREMENT ERRORS DOMINATE THE
SCATTER IN X AND Y.
- SILENT - SUPPRESS TEXT OUTPUT.
- MINITER - MINIMUM NUMBER OF ITERATIONS PERFORMED BY THE GIBBS
SAMPLER OR METROPOLIS-HASTINGS ALGORITHM. IN GENERAL,
MINITER = 5000 SHOULD BE SUFFICIENT FOR CONVERGENCE. THE
DEFAULT IS MINITER = 5000. THE MCMC IS STOPPED AFTER
RHAT < 1.1 FOR ALL PARAMETERS OF INTEREST, AND THE
NUMBER OF ITERATIONS PERFORMED IS GREATER THAN MINITER.
- MAXITER - THE MAXIMUM NUMBER OF ITERATIONS PERFORMED BY THE
MCMC. THE DEFAULT IS 1D5. THE MCMC IS STOPPED
AUTOMATICALLY AFTER MAXITER ITERATIONS.
- NGAUSS - THE NUMBER OF GAUSSIANS TO USE IN THE MIXTURE
MODELLING. THE DEFAULT IS 3. IF NGAUSS = 1, THEN THE
PRIOR ON (MU, TAUSQR) IS ASSUMED TO BE UNIFORM.
- TPUT
POST - A STRUCTURE CONTAINING THE RESULTS FROM THE MCMC. EACH
ELEMENT OF POST IS A DRAW FROM THE POSTERIOR DISTRIBUTION
FOR EACH OF THE PARAMETERS.
ALPHA - THE CONSTANT IN THE REGRESSION.
BETA - THE SLOPE OF THE REGRESSION.
SIGSQR - THE VARIANCE OF THE INTRINSIC SCATTER.
PI - THE GAUSSIAN WEIGHTS FOR THE MIXTURE MODEL.
MU - THE GAUSSIAN MEANS FOR THE MIXTURE MODEL.
TAUSQR - THE GAUSSIAN VARIANCES FOR THE MIXTURE MODEL.
MU0 - THE HYPERPARAMETER GIVING THE MEAN VALUE OF THE
GAUSSIAN PRIOR ON MU. ONLY INCLUDED IF NGAUSS >
1.
USQR - THE HYPERPARAMETER DESCRIBING FOR THE PRIOR
VARIANCE OF THE INDIVIDUAL GAUSSIAN CENTROIDS
ABOUT MU0. ONLY INCLUDED IF NGAUSS > 1.
WSQR - THE HYPERPARAMETER DESCRIBING THE `TYPICAL' SCALE
FOR THE PRIOR ON (TAUSQR,USQR). ONLY INCLUDED IF
NGAUSS > 1.
XIMEAN - THE MEAN OF THE DISTRIBUTION FOR THE
INDEPENDENT VARIABLE, XI.
XISIG - THE STANDARD DEVIATION OF THE DISTRIBUTION FOR
THE INDEPENDENT VARIABLE, XI.
CORR - THE LINEAR CORRELATION COEFFICIENT BETWEEN THE
DEPENDENT AND INDEPENDENT VARIABLES, XI AND ETA.
- LLED ROUTINES
RANDOMCHI, MRANDOMN, RANDOMGAM, RANDOMDIR, MULTINOM
- FERENCES
- Carroll, R.J., Roeder, K., & Wasserman, L., 1999, Flexible
Parametric Measurement Error Models, Biometrics, 55, 44
- Kelly, B.C., 2007, Some Aspects of Measurement Error in
Linear Regression of Astronomical Data, The Astrophysical
Journal, 665, 1489 (arXiv:0705.2774)
- Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B., 2004,
Bayesian Data Analysis, Chapman & Hall/CRC
- VISION HISTORY
AUTHOR : BRANDON C. KELLY, STEWARD OBS., JULY 2006
- - MODIFIED PRIOR ON MU0 TO BE UNIFORM OVER [MIN(X),MAX(X)] AND
PRIOR ON USQR TO BE UNIFORM OVER [0, 1.5 * VARIANCE(X)]. THIS
TENDS TO GIVE BETTER RESULTS WITH FEWER GAUSSIANS. (B.KELLY, MAY
2007)
- - FIXED BUG SO THE ITERATION COUNT RESET AFTER THE BURNIN STAGE
WHEN SILENT = 1 (B. KELLY, JUNE 2009)
- - FIXED BUG WHEN UPDATING MU VIA THE METROPOLIS-HASTING
UPDATE. PREVIOUS VERSIONS DID NO INDEX MUHAT, SO ONLY MUHAT[0]
WAS USED IN THE PROPOSAL DISTRIBUTION. THANKS TO AMY BENDER FOR
POINTING THIS OUT. (B. KELLY, DEC 2011)