Monday, August 19, 2013

Export R Results Tables to Excel - Please don't kick me out of your club

This post is written as a result of finding the following exchange on one of the R mailing lists:

Is-there-a-way-to-export-regression-output-to-an-excel-spreadsheet

[Make sure to check out the many great comments on the bottom of the post.  That is where some better answers to this problem can be found.]

Question: Is there a way to export regression output to an excel spreadsheet?
Translation: I would like to be able to do a very simple thing that almost any statistical programming language can easily do, please suggest a basic command to do that.

Response1: ?lm ?coef ?write.csv ...
Translation: Read the manual and try this bit of incomplete code.

Questioner: I am very very new with R... Is there some simple code I could just paste?
Translation: Really? Isn't there anything you could suggest?

Response2: This is the help-you-learn-R mailing list, not the do-my-work-for-me mailing list...
Translation: Go F%$# yourself, freeloader. We only answer interesting questions.

Me:
I just wanted to say that it is just this type of response that gives R-users a bad reputation. I am an active R user and very happy to contribute to R in whatever way is possible but when I see posts like this, it makes me want to switch to a language in which the users are NICE people.  Okay, I know, I know.  A few bad apples should not spoil the basket, but sometimes things just taste rotten.

First thing, the reason I even stumbled across this post was because I had the same very similar question. Looking at the hits on the bottom of the page I can see that there are over 400 people who have viewed this discussion I am guessing most of them because they were looking for a specific solution rather than being interested in seeing how quickly experienced R users could could offend new users (for which there are numerous other examples).

In all likelihood, a lot of other new R users have come across this same post and been equally confounded but this rude and ridiculous response.

The original user who asked this question asked a very simple question for which any statistical language should have a very simple canned response.  Something along the lines:

lmOut(mylm, file="results", filetype="csv")

Yet the response that was instead produced was one which was overly complex, patronizing, and ultimately needlessly insulting.

I have written a little program to help format regression summary statistics into spreadsheet formats easily read by excel. Sorry if this is redundant. I am sure hundreds of people have programmed similar solutions. But I think it might be useful to many users who are not very familiar with how R constructs results.


Find the code on this gist:
https://gist.github.com/EconometricsBySimulation/6274532

Output looks something like this:


Looking over the original exchange it does not look like this code would even work for a logit for which it was originally needed.  However, I will post it on github and perhaps others will find the concept useful enough to make revisions (unlikely).  I might take another go at making it more general in the future though in all likelihood some user will send me an angry message saying "this has already been done by ...".

20 comments :

  1. Why not to use XLConnect package to write your results to worksheet? This code is the simpliest way to do such thing. Give it a try:)

    library(XLConnect)
    data(iris)
    wyn<-summary(lm(Sepal.Length~Sepal.Width,data=iris))$coeff
    writeWorksheetToFile(file='test.xlsx',data=wyn,sheet='test')

    ReplyDelete
  2. Good post and thanks for the function. I agree that people can often lose their humility when they attain greater knowledge and skill which obviously happens sometimes with users on R mailing lists. But then again, its also about finding a good balance. People usually gain more by solving things for themselves but sometimes they need that little bit of help to get them started. In any case, its always better to be respectful!

    ReplyDelete
  3. lmOut(myres, file="my-results.csv")
    Error in ncol + 1 : non-numeric argument to binary operator
    Enter a frame number, or 0 to exit

    1: source("~/.active-rstudio-document", echo = TRUE)
    2: withVisible(eval(ei, envir))
    3: eval(ei, envir)
    4: eval(expr, envir, enclos)
    5: .active-rstudio-document#43: lmOut(myres, file = "my-results.csv")
    6: .active-rstudio-document#12: matrix("", nrow = nvar + nstats, ncol = ncol + 1)

    ReplyDelete
  4. If you want to write to an excel format rather than a csv format, see packages xlsx and XLConnect. Here's an overview, although not specific to lm: https://tradeblotter.wordpress.com/2013/05/02/writing-from-r-to-excel-with-xlsx/.

    ReplyDelete
  5. Fourth answer, plus translation, is the reason why I don't care to read R-help any longer.

    Also, because thanks to cheap HDD space, my own stupidity would be visible for later generations, much to my embarrassement, I guess.

    I know prefer to bugger my colleagues, which seem to be all better in learning R and stats than me. I'm lucky to have such an environment. My supervisors and co-authors, on the other hand, are utterly useless in such a case.

    ReplyDelete
  6. Yea, there's a much simpler way to do this using the R2HTML package. The package has an HTML2clip function which will format the summary output as an HTML table on the clipboard, which can be pasted into Excel.

    So, code would be:
    library(R2HTML)
    HTML2clip(summary(lm.model))

    and then you just past into Excel.

    http://rss.acs.unt.edu/Rdoc/library/R2HTML/html/HTML2clip.html

    -Aaron

    ReplyDelete
  7. I had to face that problem when teaching R to social science students who do not use TeX and thus could not use the stargazer package to get their OLS results out of the console. It'd be great if the package could gain some form of CSV export (it already exports to ASCII, so even that should be parsable).

    ReplyDelete
  8. I've had good luck using package xtable to format tables in html. Save that to a file and import it into Excel or print it to the console, select it and paste it into Excel.

    ReplyDelete
  9. It's a shame the original questioner felt obliged to delete their question. A more useful answer would be, someone on stackoverflow might *explain* it to you, but R-help will try to make it as obscure and difficult as possible to deter 'stupid' questions.

    My favourite example is:
    http://r.789695.n4.nabble.com/help-page-becomes-unavailable-after-a-package-is-reinstalled-td3671176.html

    Anyone ever heard of the questioner, Yihui Xie, since then? He seems very gracious:

    http://blog.revolutionanalytics.com/2011/08/brian-ripley-on-the-r-development-process.html

    Some questions are answered clearly though, even novice ones I've found useful:
    http://r.789695.n4.nabble.com/Correlations-by-subgroups-tt3599548.html

    Maybe *Novice Question:...* would help some ignore it, rather than get exasperated and try to demonstrate that's what it is. ?help

    ReplyDelete
  10. Just noticed this posted to R-bloggers. I've recently finished writing up two informal papers (for our local online magazine: Benchmarks) which deal with similar issues -- getting stuff out of R and into Excel as well as getting stuff from Excel into R...See Adobe.pdf files below.

    ~Jon

    http://www.unt.edu/rss/class/Jon/Benchmarks/ExportExcel_L_JDS_Sep2013.pdf

    http://www.unt.edu/rss/class/Jon/Benchmarks/ImportMultipleExcel_L_JDS_Aug2013.pdf

    ReplyDelete
  11. A couple things. First, like it or not, there is an "intro" to the r-help mailing lists which people are supposed to read and respect. The mailing lists are quite specifically NOT for people who have no understanding of software, which is certainly the case for the vignette you posted. Novices should learn how to program and how to use R by reading the excellent R-intro document. Frankly, the responses weren't insulting. They were blunt and to the point. If the mailing list encouraged questioners of all levels it would collapse under the weight of the posts. People really, no REALLY, need to be taught to do their own research, in this case meaning scouring the help archives for similar questions.

    Second, I'm a bit concerned that your example of writing from R to Excel includes massive truncation of the data. Unless you have some a priori reason to believe everyone's data are precise to only 3 places or so, why are you doing that?

    ReplyDelete
    Replies
    1. As for your first point. I guess I am just baffled why such a fundamental question would not have an easy answer. That are the basic features of any statistical software? 1. Import data 2. Do analysis 3. Export analysis. How much scouring really should be necessary?

      As for your second point. It is an option in the function to choose the number of digits to truncate. I was hoping anybody who was looking at it could figure that out pretty easy but perhaps I should have put more comments.

      Delete
    2. Point One: People who ask that question need to step back and learn about the concept of software, not to mention the concept of "find the existing introductive tutorials." The original question is right up there with "Hey, how do I change the font size in Microsoft Word." Do you really believe questions like that should be asked? Point two: yes I saw your options. That's not how precision analysis works. If you are going to export data, the best thing to do by far is export as binary, i.e. precision to the limit of the source. If you must go by way of ASCII text, then always send as many digits as possible; leaving actual analysis for later.

      Delete
    3. I have no interest in pissing contests. I do not present myself as any kind of expert at anything let alone an expert at "precision analysis" ;)

      As for the primary point of me critiquing the hostile tone of the original post. It seems you are continuing expressing yourself in the same brutal tone.

      In addition, I don't see how it serves anybody having an R censorship board running around telling people what questions are worth asking.

      I personally have asked a lot of stupid questions:
      http://stackoverflow.com/questions/18205080/shiny-html-output-object-that-takes-html-code-for-easy-copy-and-paste/

      http://stackoverflow.com/questions/18193073/is-it-possible-to-create-a-sub-folder-directly-through-the-github-web-applicatio

      http://stackoverflow.com/questions/17934948/having-difficulty-installing-r-package-in-specific-directory/

      http://stackoverflow.com/questions/17826061/save-p-values-from-r-ivreg-aer-package-or-tsls-sem-package/

      http://stackoverflow.com/questions/16523574/variable-reference-elements-of-lists-in-r/

      Fortunately, people were much kinder to me than I am guessing you would have been.

      Delete
  12. I wrote a small package that attempts to write an 'arbitrary' R data structure (vector, df, matrix, list) to a CSV file in a reasonable format (list elements are written sequentially and indented according to their level in the list), using a single command, 'toCSV'. It was one of my first attempts at an R package and is probably a piece of rubbish - there is at least one bug in there that still needs fixing. Ergo, it has not been released.

    But, seeing this post, perhaps I should pluck up my courage, tidy up the package and put it on GitHub...

    A

    ReplyDelete
  13. Someone else's solution: http://strengejacke.wordpress.com/2013/08/20/print-glm-output-to-html-table-rstats/

    ReplyDelete
  14. Another solution I found usefull in the same usecase:

    http://strengejacke.wordpress.com/2013/08/20/print-glm-output-to-html-table-rstats/

    ReplyDelete
  15. @cellocgw Boo to that opinion (not to you personally, but definitely to that opinion). Let's not pretend that the internet is a place where people read introductory documents before they participate.

    The *correct* response to any overly-simple question on R-help is "I think you'll find your answer (and more!) over at stackoverflow.com. It's easy to ask questions and many answers are already there! Good luck!"

    That said, even SO still has some barriers to the really new. I would love to see a sandbox area of SO where dumb/repeat/unresearched questions are not only allowed, but encouraged.

    ReplyDelete
  16. Appreciated! Small typo, though: ncol twice where you refer to the created ncoll object.

    ReplyDelete