idlastro / Math and Statistics: LINMIX_ERR

[Source code]

NAME
LINMIX_ERR
PURPOSE
Bayesian approach to linear regression with errors in both X and Y
EXPLANATION
Perform linear regression of y on x when there are measurement
errors in both variables. the regression assumes :
            ETA = ALPHA + BETA * XI + EPSILON
            X = XI + XERR
            Y = ETA + YERR
re, (ALPHA, BETA) are the regression coefficients, EPSILON is the
trinsic random scatter about the regression, XERR is the
asurement error in X, and YERR is the measurement error in
EPSILON is assumed to be normally-distributed with mean zero and
riance SIGSQR. XERR and YERR are assumed to be
rmally-distributed with means equal to zero, variances XSIG^2 and
IG^2, respectively, and covariance XYCOV. The distribution of XI
modelled as a mixture of normals, with group proportions PI,
an MU, and variance TAUSQR. Bayesian inference is employed, and
structure containing random draws from the posterior is
turned. Convergence of the MCMC to the posterior is monitored
ing the potential scale reduction factor (RHAT, Gelman et
.2004). In general, when RHAT < 1.1 then approximate convergence
reached.
mple non-detections on y may also be included.
LLING SEQUENCE
LINMIX_ERR, X, Y, POST, XSIG=, YSIG=, XYCOV=, DELTA=, NGAUSS=, /SILENT, 
           /METRO, MINITER= , MAXITER= 
PUTS
X - THE OBSERVED INDEPENDENT VARIABLE. THIS SHOULD BE AN
NX-ELEMENT VECTOR.
Y - THE OBSERVED DEPENDENT VARIABLE. THIS SHOULD BE AN NX-ELEMENT
VECTOR.
TIONAL INPUTS
XSIG - THE 1-SIGMA MEASUREMENT ERRORS IN X, AN NX-ELEMENT VECTOR.
YSIG - THE 1-SIGMA MEASUREMENT ERRORS IN Y, AN NX-ELEMENT VECTOR.
XYCOV - THE COVARIANCE BETWEEN THE MEASUREMENT ERRORS IN X AND Y,
AND NX-ELEMENT VECTOR.
DELTA - AN NX-ELEMENT VECTOR INDICATING WHETHER A DATA POINT IS
CENSORED OR NOT. IF DELTA[i] = 1, THEN THE SOURCE IS
DETECTED, ELSE IF DELTA[i] = 0 THE SOURCE IS NOT DETECTED
AND Y[i] SHOULD BE AN UPPER LIMIT ON Y[i]. NOTE THAT IF
THERE ARE CENSORED DATA POINTS, THEN THE
MAXIMUM-LIKELIHOOD ESTIMATE (THETA) IS NOT VALID. THE
DEFAULT IS TO ASSUME ALL DATA POINTS ARE DETECTED, IE,
DELTA = REPLICATE(1, NX).
METRO - IF METRO = 1, THEN THE MARKOV CHAINS WILL BE CREATED USING
THE METROPOLIS-HASTINGS ALGORITHM INSTEAD OF THE GIBBS
SAMPLER. THIS CAN HELP THE CHAINS CONVERGE WHEN THE SAMPLE
SIZE IS SMALL OR IF THE MEASUREMENT ERRORS DOMINATE THE
SCATTER IN X AND Y.
SILENT - SUPPRESS TEXT OUTPUT.
MINITER - MINIMUM NUMBER OF ITERATIONS PERFORMED BY THE GIBBS
SAMPLER OR METROPOLIS-HASTINGS ALGORITHM. IN GENERAL,
MINITER = 5000 SHOULD BE SUFFICIENT FOR CONVERGENCE. THE
DEFAULT IS MINITER = 5000. THE MCMC IS STOPPED AFTER 
RHAT < 1.1 FOR ALL PARAMETERS OF INTEREST, AND THE
NUMBER OF ITERATIONS PERFORMED IS GREATER THAN MINITER.
MAXITER - THE MAXIMUM NUMBER OF ITERATIONS PERFORMED BY THE
MCMC. THE DEFAULT IS 1D5. THE MCMC IS STOPPED
AUTOMATICALLY AFTER MAXITER ITERATIONS.
NGAUSS - THE NUMBER OF GAUSSIANS TO USE IN THE MIXTURE
MODELLING. THE DEFAULT IS 3. IF NGAUSS = 1, THEN THE
PRIOR ON (MU, TAUSQR) IS ASSUMED TO BE UNIFORM.
TPUT
POST - A STRUCTURE CONTAINING THE RESULTS FROM THE MCMC. EACH
       ELEMENT OF POST IS A DRAW FROM THE POSTERIOR DISTRIBUTION
       FOR EACH OF THE PARAMETERS.
         ALPHA - THE CONSTANT IN THE REGRESSION.
         BETA - THE SLOPE OF THE REGRESSION.
         SIGSQR - THE VARIANCE OF THE INTRINSIC SCATTER.
         PI - THE GAUSSIAN WEIGHTS FOR THE MIXTURE MODEL.
         MU - THE GAUSSIAN MEANS FOR THE MIXTURE MODEL.
         TAUSQR - THE GAUSSIAN VARIANCES FOR THE MIXTURE MODEL.
         MU0 - THE HYPERPARAMETER GIVING THE MEAN VALUE OF THE
               GAUSSIAN PRIOR ON MU. ONLY INCLUDED IF NGAUSS >
               1.
         USQR - THE HYPERPARAMETER DESCRIBING FOR THE PRIOR
                VARIANCE OF THE INDIVIDUAL GAUSSIAN CENTROIDS
                ABOUT MU0. ONLY INCLUDED IF NGAUSS > 1.
         WSQR - THE HYPERPARAMETER DESCRIBING THE `TYPICAL' SCALE
                FOR THE PRIOR ON (TAUSQR,USQR). ONLY INCLUDED IF
                NGAUSS > 1.
         XIMEAN - THE MEAN OF THE DISTRIBUTION FOR THE
                  INDEPENDENT VARIABLE, XI.
         XISIG - THE STANDARD DEVIATION OF THE DISTRIBUTION FOR
                 THE INDEPENDENT VARIABLE, XI.
         CORR - THE LINEAR CORRELATION COEFFICIENT BETWEEN THE
                DEPENDENT AND INDEPENDENT VARIABLES, XI AND ETA.
LLED ROUTINES
RANDOMCHI, MRANDOMN, RANDOMGAM, RANDOMDIR, MULTINOM
FERENCES
Carroll, R.J., Roeder, K., & Wasserman, L., 1999, Flexible
Parametric Measurement Error Models, Biometrics, 55, 44
Kelly, B.C., 2007, Some Aspects of Measurement Error in
Linear Regression of Astronomical Data, The Astrophysical
Journal, 665, 1489 (arXiv:0705.2774)
Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B., 2004,
Bayesian Data Analysis, Chapman & Hall/CRC
VISION HISTORY
AUTHOR : BRANDON C. KELLY, STEWARD OBS., JULY 2006
- MODIFIED PRIOR ON MU0 TO BE UNIFORM OVER [MIN(X),MAX(X)] AND
PRIOR ON USQR TO BE UNIFORM OVER [0, 1.5 * VARIANCE(X)]. THIS
TENDS TO GIVE BETTER RESULTS WITH FEWER GAUSSIANS. (B.KELLY, MAY
2007)
- FIXED BUG SO THE ITERATION COUNT RESET AFTER THE BURNIN STAGE
WHEN SILENT = 1 (B. KELLY, JUNE 2009)
- FIXED BUG WHEN UPDATING MU VIA THE METROPOLIS-HASTING
UPDATE. PREVIOUS VERSIONS DID NO INDEX MUHAT, SO ONLY MUHAT[0]
WAS USED IN THE PROPOSAL DISTRIBUTION. THANKS TO AMY BENDER FOR
POINTING THIS OUT. (B. KELLY, DEC 2011)