Wednesday, October 8, 2014

Julia style string literal interpolation in R

I feel like a sculptor who has been using the same metal tools for the last four years and who happened to look over at my comrades and find them sporting new, sleek electric tools. Suddenly all of the hard work put into maintaining and adapting my metal tools ends up looking like duct tape and bubble gum patches.

I hate to say it but I feel that I have become somewhat infatuated with Julia. And infatuation is the right word. I have not yet committed the time to fully immerse myself in the language, yet everything I know about it makes me want to learn more. The language is well known for its mind-blowing speed, accomplished through just-in-time compilation. It also has many features which enhance the efficiency and readability of its code (see previous post, note the documentation has greatly improved since posting).

However, though I very much want to, I cannot entirely switch my coding needs from R to Julia. This is primarily due to my ongoing usage of packages such as RStudio's "Shiny" and the University of Cambridge's server-side software for building adaptive tests, "Concerto". And so, with regret, I will probably relegate my Julia coding to a minor portion of my programming needs.

That does not mean, however, that I can't make some small changes to make R work more like Julia. To this end I have programmed a small function p which replaces placeholders of the form #(expression) inside a string, as in "Hello #(name), how are you?", with their evaluated values. If the expression contains nested parentheses then it is necessary to close the placeholder with ")#", for example "c=#(b^(1+a))#".

# Julia-like text concatenation function.
p <- function(..., sep="", esc="#") {
  # Change escape characters by specifying esc.
  # Break the input values into different strings cut at '#('
  x <- paste(..., sep=sep)
  x <- unlist(strsplit(x, paste0(esc,"("), fixed = TRUE))

  # The first element is never evaluated.
  out <- x[1]
  # Check if x has been split.
  if (length(x)>1) for (i in 2:length(x)) {
    # Prefer the explicit closing ')#' if it is present.
    y <- unlist(strsplit(x[i], paste0(")",esc), fixed = TRUE))
    # No ')#' found: cut at the first ')' instead.
    if (x[i]==y[1])
      y <- unlist(regmatches(x[i], regexpr(")", x[i], fixed = TRUE),
                             invert = TRUE))
    # Evaluate in the caller's environment so that p's own variables
    # (x, y, out, i) do not shadow the caller's.
    out <- paste0(out, eval(parse(text=y[1]), envir = parent.frame()), y[-1])
  }
  out
}

name="Bob"
height=72
weight=230

# Let's see it in action
p(sep=" ", "Hello #(name).",
"My record indicates you are #(height) inches tall and weigh #(weight) pounds.",
"Your body mass index is #(round(703*weight/height^2,1))#") 
# [1] "Hello Bob. My record indicates you are 72 inches tall and weigh 230 pounds.
# Your body mass index is 31.2"

# The other nice thing about the p function is that it can be used to concat
# strings as a shortcut for paste0.
p("QRS","TUV")
# [1] "QRSTUV"
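
To see why nested parentheses need the explicit ")#", here is a minimal sketch of the two ways of cutting a piece, using only base R. The values of a and b are made up for illustration, and the variable names are not part of the function above.

```r
# Hypothetical values, chosen only for this illustration.
a <- 1
b <- 3

x <- "c=#(b^(1+a))#"

# Step 1: cut the string at every '#('.
pieces <- unlist(strsplit(x, "#(", fixed = TRUE))
expr_text <- pieces[2]   # everything after the '#('

# Option A: cut at the first ')'. With nested parentheses this
# stops too early and leaves an unbalanced expression.
first_close <- unlist(regmatches(expr_text,
                                 regexpr(")", expr_text, fixed = TRUE),
                                 invert = TRUE))
parse_fails <- inherits(try(parse(text = first_close[1]), silent = TRUE),
                        "try-error")

# Option B: cut at the explicit ')#', which keeps the inner ')'.
explicit_close <- unlist(strsplit(expr_text, ")#", fixed = TRUE))
value <- eval(parse(text = explicit_close[1]))

result <- paste0(pieces[1], value)
print(result)
```

Option A is fine for simple placeholders such as #(name), which is why the closing ")#" is only required when the expression itself contains a ")".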

Thank you SO community for your help.

1. Thanks very much for this. I haven't used Julia but I'm used to string interpolation in Scala and have missed it in R. Nice touch to make the escape character a parameter.

2. Note that gsubfn in the gsubfn package can do string interpolation, e.g. gsubfn(x = "Your height is $height, weight is $weight, your BMI is `round(703*weight/height^2,1)`") produces "Your height is 72, weight is 230, your BMI is 31.2".

1. Thank you! This is great. I don't think my command above is working very well.

3. OR, instead of reinventing the wheel you can use the sprintf() function in base R.

>message(sprintf("Hello %s. My record indicates you are %s inches tall and weigh %s pounds.\nYour body mass index is %s.", name, height, weight, round(703*weight/height^2,1)))

The message() function is what prints the newline character (\n) as an actual line break; if you don't need new lines, sprintf() alone will get you all the way there.
Every time I think R doesn't have something, my coworker, who's a straight-up old-school Rsmith, takes me to school with base R (this happens a lot with the newfangled R packages I try to stay hip with).
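
For what it's worth, here is a small sketch of the sprintf() route with explicit format specifiers. The %d and %.1f specifiers and the use of cat() are my own choices for illustration, not part of the comment above; the values are the ones from the post.

```r
# Values copied from the post's example.
name <- "Bob"
height <- 72L    # stored as integers so that %d applies
weight <- 230L

bmi <- 703 * weight / height^2

# %s inserts a string, %d an integer, and %.1f a number
# rounded to one decimal place, so round() is not needed.
msg <- sprintf(
  "Hello %s. You are %d inches tall and weigh %d pounds.\nYour body mass index is %.1f.",
  name, height, weight, bmi
)

# The newline is already part of the string. print(msg) would show it
# escaped as \n, while cat() (stdout) and message() (stderr) render it
# as a line break.
cat(msg, "\n", sep = "")
```

Letting the format string do the rounding keeps the arithmetic and the presentation separate, which can be handy when the same value is reused elsewhere.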
