Programming necessarily presupposes a programming environment. Just as one can’t write text without choosing a font, one can’t write code outside of an environment.
The environment is to an R script what air is to us: you might forget that it exists, but it is always there. Then why write a whole chapter about this? Why write about air?
# Printing usually works like this
print(1:9)
[1] 1 2 3 4 5 6 7 8 9
# What happened here?
print(1:9)
[1] 1 2 3
[4] 4 5 6
[7] 7 8 9
Breathing foul air can make you dizzy, and working in a broken environment can make your script misbehave; this is why we need to learn how to manage it properly. There is no trick in the code above, by the way. I haven’t redefined the print() function; I just changed the width option in R’s environment.
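To make the mechanism concrete, here is a minimal sketch of how the width option changes what print() produces. The variable names are only illustrative, and the code restores the previous options at the end so the session is left as it was found.

```r
# Set a known width and remember the previous options for later
old_options <- options(width = 80)
default_output <- capture.output(print(1:9))

# Narrow the console width, as in the example above
options(width = 10)
narrow_output <- capture.output(print(1:9))

# Restore whatever was set before this snippet ran
options(old_options)

print(default_output)
print(narrow_output)
```

Saving the return value of options() and passing it back is a common way to avoid leaking a changed setting into the rest of a session.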
In Chapter 3, we’ll discuss how to organize an R project, but first we need some guidelines on how to organize files in general. Many people underestimate the usefulness of organizing and naming their files consistently, but the truth is that having a system saves you valuable time when searching through your past projects.
First of all, I suggest you create a folder on your computer for all your programming needs. Coding tends to produce a lot of files, and keeping them all in a single place really helps. Since scripts are able to write and delete files, this also lowers the odds that an R program ruins your valuable personal files by accident.
Now for the files. Danielle Navarro has an amazing presentation about project structure where she outlines three main principles to go by when naming files:
Machines usually get confused by spaces, special characters (like ^.*?+|$"), and accents. Some operating systems also treat uppercase and lowercase as the same. Our files should, therefore, always separate words with _ or - (consistently), only use lowercase letters and numbers, and never use accented characters.
# Good
draft01_jalapeno_essay.docx
# Bad
Draft "Jalapeño" Essay(1).docx
An important exception is the file extension: R scripts should end in .R, always uppercase. This is a longstanding tradition, so it takes precedence over the other rules.
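As a rough illustration, these rules can be turned into a small check. The helper below, is_machine_friendly(), is hypothetical and not part of any package; it assumes a name made of lowercase letters and digits, with words separated by a single _ or -, followed by one extension (which may contain uppercase so that .R passes).

```r
# Hypothetical helper: TRUE when a file name uses only lowercase letters and
# digits, with words separated by a single "_" or "-", plus one extension.
is_machine_friendly <- function(file_name) {
  pattern <- "^[a-z0-9]+([_-][a-z0-9]+)*\\.[A-Za-z0-9]+$"
  # perl = TRUE keeps [a-z] restricted to unaccented ASCII letters
  grepl(pattern, file_name, perl = TRUE)
}

good_name <- "draft01_jalapeno_essay.docx"
# "\u00f1" is the accented n, written as an escape to avoid encoding issues
bad_name <- "Draft \"Jalape\u00f1o\" Essay(1).docx"

print(is_machine_friendly(good_name))
print(is_machine_friendly(bad_name))
print(is_machine_friendly("analysis01_descriptive-statistics.R"))
```

A check like this only covers the machine-facing rules; whether a name gives enough context to a human reader still has to be judged by eye.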
Humans need context, so short and vague file names are to be avoided. A good practice is to begin the file name with its “type” (e.g. analysis, note, report, etc.) and then append a meaningful description.
# Good
analysis01_descriptive-statistics.R
notes02_tentative-write-up.docx
# Bad
01.R
notes2.docx
Note how, in the examples above, - is used to separate words that belong to the same “chunk”, and _ is used to separate different “chunks”. I don’t follow this suggestion personally, but many people like the idea.
If your file names need to include dates, always follow the YYYY-MM-DD format and place them before the description so that the files are sorted chronologically without any extra effort.
# Good
2012-01-01.R
2012-01-02.R
2012-04-01.R
# Bad
1-April-2012.R
1-Jan-2012.R
2-Jan-2012.R
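The reason this works is that, in YYYY-MM-DD form, alphabetical order and chronological order coincide. Below is a small sketch using the dates from the examples above; the variable names are illustrative, and method = "radix" is used only so that the sort does not depend on the locale.

```r
# The dates from the examples above, deliberately out of order
dates <- as.Date(c("2012-04-01", "2012-01-02", "2012-01-01"))

# format() with "%Y-%m-%d" produces the YYYY-MM-DD form
iso_names <- paste0(format(dates, "%Y-%m-%d"), ".R")

# Plain text sorting is enough to recover chronological order
sorted_iso <- sort(iso_names, method = "radix")

# The same dates written day-first: text order is no longer chronological
bad_names <- c("1-Jan-2012.R", "2-Jan-2012.R", "1-April-2012.R")
sorted_bad <- sort(bad_names, method = "radix")

print(sorted_iso)
print(sorted_bad)
```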
If you need to order files by something other than date, use numbers as a prefix, but remember to left-pad them with 0 so that all numbers have the same length. Also include keywords if you want to divide files by type.
# Good
reading01_shakespeare_romeo-and-juliet.docx
reading02_shakespeare_romeo-and-juliet.docx
...
reading11_shakespeare_romeo-and-juliet.docx
notes01_shakespeare_romeo-and-juliet.docx
...
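If the names are generated from R, sprintf() can take care of the padding. This is a minimal sketch with made-up chapter numbers; it also shows what happens to the order when the padding is left out.

```r
# Illustrative numbers: two single-digit entries and one two-digit entry
chapters <- c(1L, 2L, 11L)

# "%02d" pads with zeros on the left up to two digits
padded <- sprintf("reading%02d_shakespeare_romeo-and-juliet.docx", chapters)

# Without padding, text sorting places 11 before 2
unpadded <- paste0("reading", chapters, "_shakespeare_romeo-and-juliet.docx")

sorted_padded <- sort(padded, method = "radix")
sorted_unpadded <- sort(unpadded, method = "radix")

print(sorted_padded)
print(sorted_unpadded)
```

If you expect more than 99 files, widen the pattern accordingly (for instance "%03d").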
Most users don’t know about a little file called .Rprofile (the dot at the beginning makes the file hidden on Unix-like systems). Your .Rprofile contains R code that is run every time you start R, so this is a good place to set some options and configurations. It’s also an amazing place to make your analyses impossible to reproduce.
If you’ve never used this feature of R, I suggest you keep it that way until you are a seasoned veteran in the language. A good rule for using your .Rprofile correctly is to never put anything there that you would write in a script.
Here is a good example of an .Rprofile:
# Set repo
options(repos = c(CRAN = "https://cran.rstudio.org"))
# Change width (like in the beginning of the chapter)
if (interactive()) {
  options(width = 10)
}
If you really want to add something to your .Rprofile, you can install the usethis package and run usethis::edit_r_profile().
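Before editing anything, it can help to check whether a profile is already affecting your session. The sketch below reflects my understanding of R’s startup rules (the R_PROFILE_USER environment variable first, otherwise .Rprofile in the working directory and then in the home directory); treat the lookup order as an assumption and confirm it in ?Startup.

```r
# If R_PROFILE_USER is set, it names the user profile; otherwise R looks in
# the working directory and then in the home directory (see ?Startup)
user_profile <- Sys.getenv("R_PROFILE_USER", unset = "")

candidates <- if (nzchar(user_profile)) {
  path.expand(user_profile)
} else {
  c(file.path(getwd(), ".Rprofile"), path.expand("~/.Rprofile"))
}

# Keep only the candidates that actually exist on this machine
existing_profiles <- candidates[file.exists(candidates)]
print(existing_profiles)

# Options a profile commonly changes can be inspected directly
print(getOption("width"))
print(getOption("repos"))
```

An empty result means none of these locations holds a profile, which is the safest state for reproducibility.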
To me, this is the most important part of the chapter. You can fix all of your file names and remove everything from your .Rprofile; if you don’t update your system, it’s all for naught.
R is a relatively old language, so it has been updated many times since its launch. Every two to three months, the R Core Team releases a new version with many improvements and bug fixes. These updates can be very minor, but sometimes there are truly exciting new features. In R 4.1.0, for example, the native pipe (|>) was introduced, pretty much retiring magrittr.
This means that, if you don’t upgrade your R installation, you’ll be missing out on the evolution of the language and leaving your system vulnerable to known bugs.
The same is true for packages, the lifeblood of R. Packages change much more and much faster than R itself, since they’re not always worried about backwards compatibility. If you leave your packages to rot, expect your scripts to stop working on other people’s machines, to have a harder time googling bugs, and to get stuck with old code.
My advice is then the following:
Every few weeks, make sure to run update.packages() to update your packages.
Every few months, you should also update the language. If you’re on Windows you can use the installr package; otherwise you can go to the CRAN website and grab the newest binary.
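To see where you stand before updating, something along these lines can help. It is only a sketch: list_outdated() is a hypothetical helper, and the CRAN mirror URL is an assumption that you can swap for your preferred one. The helper is defined but not called here because it needs an internet connection.

```r
# Which version of R is running, and does it have the native pipe?
current_version <- getRversion()
has_native_pipe <- current_version >= "4.1.0"
print(R.version.string)
print(has_native_pipe)

# Hypothetical helper: list installed packages with a newer version available.
# The default mirror is an assumption; any CRAN mirror should work.
list_outdated <- function(repos = "https://cloud.r-project.org") {
  outdated <- utils::old.packages(repos = repos)
  if (is.null(outdated)) {
    # old.packages() returns NULL when everything is up to date
    return(data.frame(
      Package = character(),
      Installed = character(),
      ReposVer = character()
    ))
  }
  as.data.frame(outdated[, c("Package", "Installed", "ReposVer"), drop = FALSE])
}
```

Calling list_outdated() first and update.packages() afterwards lets you see what is about to change instead of updating blindly.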
It’s evidently a bit harder to keep your environment up to date than to just keep it unchanged forever. Sometimes you’ll have to rewrite something that stopped working from one version to the next, but this is much easier than letting your system grow more and more out of date until, all of a sudden, everything stops working at once. If you keep up with every release, the adjustments will be small and frequent; if you don’t, the adjustments will be major and you might not be able to make them in a reasonable amount of time.
As a last suggestion, also make sure to update your operating system. R doesn’t support every version of Windows and macOS forever, and I’m not even mentioning the security patches you’re missing when you choose to “update later”.