Comments on Econometrics By Simulation: Easily generate correlated variables from any distribution
Blog author: Francis
7 comments; feed last updated May 25, 2022

Vivek Choudhary, August 17, 2016 at 1:49 AM:

This comment has been removed by the author.

Anonymous, March 3, 2014 at 6:12 AM:

Hi Francis,

Thanks for bringing this to our attention.

The NORTA procedure (from normal, through uniform, to anything) works very well for us, because it preserves the Spearman rank correlations:

1. For the normal and the uniform variables, the Pearson correlations are related by rho_normal = 2 sin(rho_uniform × pi/6),

2. For the uniform variables, the Pearson and Spearman correlations are identical,

3. So for the final variables, the Spearman correlation can be specified without any error (unless there is probability mass at particular points).

By the way, I do not agree with you that NORTA is "without copulas". They are still there, in the intermediate step, but they can be ignored if you want to.

Wilbert

Francis, February 28, 2014 at 4:17 PM:

Great points, Bryce, especially regarding the copula question. Apparently, I was using some form of copula after all. As for your first point, I think you may be mistaken. Each marginal of the multivariate normal is defined with mean 0 and variance 1; I only modified the off-diagonal portion of the variance/covariance matrix. Therefore, using pnorm without specifying mean and variance correctly transforms the normal variables into uniform variables.

Observe:
> rawvars <- mvrnorm(n=100000, mu=mu, Sigma=Sigma)
> var(rawvars)
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0011314 0.7044411 0.7034198 0.7007685
[2,] 0.7044411 1.0092061 0.7068980 0.7068707
[3,] 0.7034198 0.7068980 1.0062027 0.7051700
[4,] 0.7007685 0.7068707 0.7051700 1.0039373

> summary(rawvars)
       V1                  V2                  V3                  V4
 Min.   :-4.115731   Min.   :-4.512754   Min.   :-4.219649   Min.   :-3.976831
 1st Qu.:-0.679732   1st Qu.:-0.681546   1st Qu.:-0.679985   1st Qu.:-0.680782
 Median :-0.005315   Median :-0.002374   Median :-0.002781   Median :-0.001728
 Mean   :-0.004668   Mean   :-0.001369   Mean   :-0.003259   Mean   :-0.004019
 3rd Qu.: 0.675674   3rd Qu.: 0.683425   3rd Qu.: 0.676256   3rd Qu.: 0.673854
 Max.   : 4.712489   Max.   : 4.336386   Max.   : 5.606908   Max.   : 4.630360

Does this answer your critique?
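
The point under discussion here is whether the defaults of pnorm (mean = 0, sd = 1) match the marginals being transformed. The following is a minimal sketch in R, not code from the original post: it assumes a hypothetical non-standard normal marginal (mean 2, sd 3, chosen only for illustration) and compares pnorm with and without the matching parameters.

```r
set.seed(1)

# Hypothetical marginal that is NOT standard normal (values are illustrative only)
z <- rnorm(100000, mean = 2, sd = 3)

# pnorm with its defaults assumes mean 0 and sd 1, so the result is not uniform here
u_default <- pnorm(z)

# Supplying the actual mean and sd gives the probability integral transform,
# so the result is approximately uniform on (0, 1)
u_matched <- pnorm(z, mean = 2, sd = 3)

summary(u_default)
summary(u_matched)
```

When the marginals really are standard normal, as in the var() and summary() output above, the two calls coincide, which is why the defaults are sufficient in that case.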

Anonymous, February 28, 2014 at 4:02 PM:

Hi Francis,

I cannot seem to get your blog to accept my comments.

I enjoyed your R demo on Thursday on generating correlated random variables. However, it seems to me that applying "pnorm" without specifying the means at which to center the multivariate normal variates won't yield exactly uniform random variates, and therefore the output of the q-functions won't come from the specified distributions. It may come from some more generalized non-central version, but that I won't pretend to understand.

I also do not understand why you write "without copulas". If I'm right about the bug and it's corrected, the routine implements 1) simulation of a Gaussian copula via the mvrnorm and pnorm steps (by the probability integral transform applied to the normal marginals of a multivariate normal) and 2) simulation of the desired marginals by applying their q-functions. Though I'm no copula specialist, I thought this was exactly how you use a Gaussian copula to generate correlated random variables with specified marginals. For example, this is discussed under the heading "Monte Carlo integration for copula models" on the Wikipedia page for copulas. Perhaps the term "copula" intimidates some readers, but I think it'd be better to "take ownership of it" (to use an idiom of political activists) rather than avoid it. The idea is to link the technical mathematical edifice to a more nuts-and-bolts understanding of how someone might use a copula.
That way we get the benefits of the theoretical edifice and bring them to a wider audience.

In case it helps, some recent statistics literature refers to combining estimates of Pearson correlations with arbitrary marginals as estimation of a nonparanormal distribution.
http://repository.cmu.edu/cgi/viewcontent.cgi?article=2024&context=compsci

Best wishes,
Bryce

Francis, February 28, 2014 at 4:06 AM:

Thanks for these citations; they are very helpful. Here are the MLA citations:

Cario, Marne C., and Barry L. Nelson. Modeling and generating random vectors with arbitrary marginal distributions and correlation matrix. Technical Report, Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois, 1997.

Yahav, Inbal, and Galit Shmueli. "On generating multivariate Poisson data in management science applications." Robert H. Smith School Research Paper No. RHS (2009): 06-085.

Anonymous, February 27, 2014 at 11:19 AM:

Cool post. I think the method has been around for a while under multiple names (which adds to the confusion). Here are some references.
http://onlinelibrary.wiley.com/doi/10.1002/asmb.901/pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.48.281&rep=rep1&type=pdf

I think the method is very much related to the probability integral transform.
http://en.wikipedia.org/wiki/Probability_integral_transform

It would be cool to know more details regarding the pros and cons of this method versus copulas.
Also, do any other general-purpose methods exist for generating correlated data?

Anonymous, February 27, 2014 at 9:49 AM:

Thank you for this! I have been trying to build this functionality in R through the "datasynthR" package.
https://github.com/jknowles/datasynthR
I think I have implemented this method in many cases, but not all. I will be referring to this post as I continue to expand the package.
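
The procedure discussed in this thread (multivariate normal, through uniform, to arbitrary marginals) can be sketched in R together with the correlation adjustment rho_normal = 2 sin(rho_uniform × pi/6) from the NORTA comment. This is only an illustrative sketch under stated assumptions, not the code from the original post: the target Spearman correlation of 0.7 and the exponential and beta marginals are hypothetical choices, picked to be continuous so that the "no probability mass at particular points" condition holds.

```r
library(MASS)  # provides mvrnorm()

set.seed(1)
n <- 100000

# Hypothetical target Spearman correlation for the final variables
target_spearman <- 0.7

# Adjust the normal correlation so the uniforms have the target correlation
rho_normal <- 2 * sin(target_spearman * pi / 6)

# Correlation matrix with unit variances, so the marginals are standard normal
Sigma <- matrix(rho_normal, nrow = 2, ncol = 2)
diag(Sigma) <- 1

# Step 1: correlated standard normal variables
z <- mvrnorm(n = n, mu = c(0, 0), Sigma = Sigma)

# Step 2: uniform variables via the normal CDF (the Gaussian copula step)
u <- pnorm(z)

# Step 3: arbitrary continuous marginals via quantile functions (illustrative choices)
x <- cbind(qexp(u[, 1], rate = 2),
           qbeta(u[, 2], shape1 = 2, shape2 = 5))

cor(u)[1, 2]                       # Pearson correlation of the uniforms
cor(x, method = "spearman")[1, 2]  # Spearman correlation of the final variables
```

Both printed values should land close to the 0.7 target, up to sampling noise. With discrete marginals such as qpois, ties in the ranks would be expected to move the Spearman correlation away from the target, which is the caveat noted in the NORTA comment.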