This page contains the notes for the second part of R Workshop Module 6: “To Infinity and Beyond”, which is part of the R Workshop series prepared by ICJIA Research Analyst Bobae Kang to enable and encourage ICJIA researchers to take advantage of R, a statistical programming language that is one of the most powerful modern research tools.
In this final part, we will explore options for getting help and for finding resource materials that answer our questions. R users are blessed with a wealth of information that is freely available online from a variety of sources. Once we know where to look for help, we are ready to tackle almost any problem in R.
One of the key reasons we must know how to leverage online resources is simply that we cannot know everything.
In fact, no one knows everything! Many advanced R programmers still rely on the Internet, partly because the ecosystem keeps evolving and partly because they, too, are only human, with plastic memories.
For us emerging R programmers, one of the greatest benefits of knowing where to look online comes from the fact that “Someone has already done it.” The questions we ask are often not so unique. Even when we are asking truly unique questions, there are many others who have worked on similar questions or on questions that address some part of ours. And, as the R community is very active online, we are likely to find some information on the Internet about what others have done!
But, wait. Before going online, you should first try to solve the problem on your own. If you cannot solve it after “15 minutes,” then go online looking for advice.
“15 min rule: when stuck, you HAVE to try on your own for 15 min; after 15 min, you HAVE to ask for help.” - Brain AMA
Rachel Thomas (@math_rachel), August 14, 2016
Even though you shouldn’t be reinventing the wheel all the time, “first trying it yourself” will help you become an independent thinker/programmer. In the long run, this will prove to be a critical skill for adapting to the changing R ecosystem.
When faced with some error, remember that we all make typos. Always. So we should check for typos!
Unfortunately, RStudio does not check for typos in our code automatically. However, the error message that a typo triggers usually helps us figure out what went wrong.
Also, check if a package is loaded before using its functions. A common error message we might see would be the following:
Error in some_function() : could not find function "some_function"
Make sure the intended package is loaded (for example, with `library()`) or the function is defined before using it.
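As a minimal sketch of this error and its two usual fixes, the snippet below uses `file_ext()` from the `tools` package. The choice of function is purely illustrative, and the sketch assumes that `tools`, which ships with R, is not already attached when the session starts.

```r
# Assumption: `tools` is installed with R but not attached at startup.
# Calling one of its functions by bare name then fails; tryCatch() keeps
# the error message instead of stopping the script.
msg <- tryCatch(
  file_ext("notes.Rmd"),
  error = function(e) conditionMessage(e)
)

# Fix 1: name the package explicitly with `::` (no library() call needed).
ext_with_namespace <- tools::file_ext("notes.Rmd")

# Fix 2: attach the package first, then call the function by name.
library(tools)
ext_after_library <- file_ext("notes.Rmd")
```

Under that assumption, `msg` holds the "could not find function" message, while both fixes return the file extension.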
# these are equivalent
?some_function
help(some_function)
Looking into the documentation is often the best way to understand what a function is and how to use it. We can bring up the documentation, if available, using `?` followed by the function name, or using `help()`.
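A few related lookups can be sketched as follows. The functions used here (`median`, `read.csv`) are only examples, and the snippet assumes a default R session in which the `stats` and `utils` packages are attached.

```r
# Look up a help page. Assigning the result keeps the page from opening
# right away; print(page), or simply running ?median, displays it.
page <- help("median")

# Search the installed documentation by keyword when the function name is
# unknown. Same as ??"linear model"; not run here because it can be slow.
# help.search("linear model")

# Find objects whose names match a pattern.
csv_readers <- apropos("^read\\.csv")

# Inspect a function's arguments without opening the full help page.
median_args <- names(formals(median))
```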
When an error is thrown, it comes with an error message. Error messages often have rich information about what went wrong and where it went wrong.
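To illustrate the two parts of an error message, here is a small sketch that triggers a common error on purpose and pulls the message apart. The failing expression is an arbitrary example, and the message text assumes an English-language R session.

```r
# Trigger an error on purpose and capture the condition object
# instead of stopping the script.
err <- tryCatch("a" + 1, error = function(e) e)

# The message says WHAT went wrong ...
what <- conditionMessage(err)

# ... and the call says WHERE it went wrong.
where <- deparse(conditionCall(err))

# Together they make up the line R prints to the console:
# Error in "a" + 1 : non-numeric argument to binary operator
```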
If we are working with custom functions we defined, RStudio’s debugging tools can help us to spot the source of an error in the script and debug it. See this article on debugging with RStudio. Also, see this video by RStudio on introduction to debugging.
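Outside RStudio's point-and-click tools, base R offers similar help. The sketch below is only an illustration: `to_rate()` and `summarize_rate()` are hypothetical functions made up for this example, and the snippet records the call stack at the moment of the error to show which function failed.

```r
# Hypothetical helper functions, defined only for this illustration.
to_rate <- function(count, population) {
  if (!is.numeric(count)) stop("count must be numeric")
  count / population * 100000
}
summarize_rate <- function(count, population) {
  round(to_rate(count, population), 1)
}

# Record the call stack at the moment the error is raised, then recover.
call_stack <- NULL
outcome <- tryCatch(
  withCallingHandlers(
    summarize_rate("12", 50000),
    error = function(e) call_stack <<- as.list(sys.calls())
  ),
  error = function(e) conditionMessage(e)
)

# Did the error pass through to_rate()? TRUE points us to that function.
failed_in_to_rate <- any(vapply(
  call_stack,
  function(cl) identical(cl[[1]], quote(to_rate)),
  logical(1)
))

# Interactive alternatives (not run here):
# traceback()          # show the call stack of the last error
# debugonce(to_rate)   # step through the next call to to_rate()
# browser()            # pause wherever this line is placed in a function
```

In an interactive session, `traceback()` and `debugonce()` are usually the quicker route; the recorded stack is mainly useful in scripts that run unattended.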
“Googling” is a great technique to find answers to our own questions. Here are some tips to effectively take advantage of Google.
“Official” resources are those provided by authoritative entities, such as CRAN, RStudio, and package authors/maintainers. Though we can get to such “official” resources via Google search, knowing how to find them directly can speed up our search for answers.
The Comprehensive R Archive Network (CRAN) has many resources for R and R packages, including the following:
CRAN offers the following “manuals”:
The Manuals page can be found under “Documentation” on the menu located on the left side of the CRAN website. Each manual can be viewed as an HTML page or downloaded as a PDF or EPUB file.
A Task View offers a brief introduction to a particular topic and an annotated list of relevant R packages.
CRAN has Task Views on a selection of topics, including:
The CRAN Task Views page can be found under “CRAN” on the menu located on the left side of the CRAN website.
Each contributed package that is listed on CRAN has a page on the CRAN website. Here we can find a reference manual and vignettes for the package.
To get to a package page directly, enter the following in your browser:
replacing [package-name] with the name of any existing package.
Alternatively, we can search for a particular package through the CRAN website's user interface. The Packages page can be found under “Software” on the menu located on the left side of the CRAN website.
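The direct route to a package page can also be scripted. The sketch below assumes CRAN's canonical address form, `https://cran.r-project.org/package=[package-name]`; that pattern and the helper name `cran_package_url()` are supplied here for illustration.

```r
# Hypothetical helper: build the web address of a package's CRAN page.
# Assumption: CRAN's canonical form https://cran.r-project.org/package=<name>.
cran_package_url <- function(package_name) {
  paste0("https://cran.r-project.org/package=", package_name)
}

dplyr_page <- cran_package_url("dplyr")

# Open the page in the default browser (interactive sessions only).
if (interactive()) browseURL(dplyr_page)
```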
Here is an example page for the `dplyr` package. It offers detailed information about the current version available on CRAN as well as links to its reference manual and vignettes (in the red box).
Each R package has a reference manual that documents all of its contents, i.e., functions and datasets. Basically, it is a collection of `help()` pages in PDF format.
A reference manual can also be found by googling. Just try “package-name pdf” as your Google search term.
Packages often have vignettes that introduce their contents. Some vignettes can be accessed via `vignette("package")` in the R console. Other vignettes are found on the package page on CRAN.
Unfortunately, not all packages have vignettes, so don’t be surprised when you cannot find vignettes for certain packages.
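To check from the console whether a package has vignettes, something like the following should work. The sketch uses the `grid` package only because it ships with R, and it assumes R's documentation was installed alongside it.

```r
# List the vignettes that come with a package. The result may be empty
# if a package has no vignettes or its documentation is not installed.
grid_vignettes <- vignette(package = "grid")
vignette_names <- grid_vignettes$results[, "Item"]

# Open the first vignette by name (interactive sessions only).
if (interactive() && length(vignette_names) > 0) {
  print(vignette(vignette_names[1], package = "grid"))
}

# The console counterpart of the reference manual: an index of every
# help page in the package (not run here).
# help(package = "grid")
```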
RStudio’s website offers many useful resources under the “Resources” menu, including the following:
Currently, 13 RStudio cheat sheets are available, including:
There are about 15 user-made cheat sheets as well.
Some RStudio cheat sheets can also be found in the RStudio IDE menu at “Help > Cheatsheets”.
The image below shows the Cheat Sheets page on the RStudio website under the “Resources” menu.
And the following is the example cheat sheet for using `dplyr` to manipulate tabular datasets in R.
RStudio’s webinars and videos offer materials covering a variety of subjects. Some materials are organized by topics, including:
Materials from RStudio’s annual conference, `rstudio::conf`, are also made available.
The tidyverse has its own website to introduce tidyverse packages, share updates and news on the tidyverse, and offer guides to training materials.
There are also child websites for many of the tidyverse packages, with a standardized URL: “[package-name].tidyverse.org”.
The following table lists tidyverse’s child websites for some of its packages:
Package | Description | URL |
---|---|---|
`ggplot2` | For data visualization | http://ggplot2.tidyverse.org/ |
`dplyr` | For data manipulation | http://dplyr.tidyverse.org/ |
`tidyr` | For tidying up data | http://tidyr.tidyverse.org/ |
`readr` | For data import/export | http://readr.tidyverse.org/ |
`purrr` | For better loops | http://purrr.tidyverse.org/ |
`tibble` | For extending data.frame | http://tibble.tidyverse.org/ |
`stringr` | For working with strings | http://stringr.tidyverse.org/ |
`forcats` | For working with factors | http://forcats.tidyverse.org/ |
`readxl` | For importing Excel files | http://readxl.tidyverse.org/ |
`haven` | For SPSS, SAS, and Stata data | http://haven.tidyverse.org/ |
`lubridate` | For working with datetimes | http://lubridate.tidyverse.org/ |
`magrittr` | For specialized pipe operators | http://magrittr.tidyverse.org/ |
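Because the child websites follow one pattern, their addresses can be generated from package names. This is a small sketch; `tidyverse_site()` is a hypothetical helper, and the pattern only holds for packages that actually have a child website.

```r
# Hypothetical helper: build child-website URLs from package names,
# following the [package-name].tidyverse.org pattern.
tidyverse_site <- function(package_name) {
  sprintf("http://%s.tidyverse.org/", package_name)
}

# The helper is vectorized, so several packages can be passed at once.
sites <- tidyverse_site(c("ggplot2", "dplyr", "tidyr"))

# Open one of the sites in the default browser (interactive sessions only).
if (interactive()) browseURL(sites[2])
```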
RStudio has a separate website focused on all things R Markdown.
The R Markdown website has useful resources such as its Articles page, which offers a number of tutorials on creating various sorts of R Markdown documents, and the Formats page, which provides links to reference materials on various R Markdown formats and templates.
RStudio also has a separate website on everything Shiny. Some of the useful resource materials can be found in the following pages:
First, its Video & written tutorial page has links to tutorial videos and articles on Shiny as well as recorded conference presentations and webinars.
Second, the Articles page offers a list of web articles on building Shiny applications.
Finally, the Reference page contains links to upgrade notes and function references for the latest as well as previous versions of the Shiny package.
The `htmlwidgets` for R website presents brief descriptions and examples of various packages for incorporating interactive widgets into the R ecosystem.
Currently, there are about 100 widgets registered as `htmlwidgets`. Visit its “Gallery” page to see what widgets are available.
Some popular `htmlwidgets` packages include:
- `plotly` and `highcharter` for interactive visualizations
- `leaflet` for interactive maps
- `DT` for interactive data tables
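As a quick taste of what such a widget looks like in code, the sketch below builds an interactive table with `DT`. It assumes the package may not be installed, so the call is guarded; `mtcars` is a dataset built into R and is used only as sample data.

```r
# Assumption: DT may or may not be installed, so the call is guarded.
# install.packages("DT") would add it.
widget <- NULL
if (requireNamespace("DT", quietly = TRUE)) {
  # Turn the first rows of a built-in dataset into an interactive table.
  widget <- DT::datatable(head(mtcars, 10))
}

# In RStudio, printing `widget` shows the table in the Viewer pane,
# where it can be sorted and searched.
```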
One of the greatest strengths of R is its community, which is highly active and diverse. Naturally, a lot of quality resource materials on the Internet come from members of the R community.
R-bloggers is a blog that collects and features articles and blog posts on R and programming in R from a variety of sources.
The blog offers an excellent way to stay up-to-date on new packages and developments in the R community. Its posts cover new updates in R and major R packages, tutorials, information on upcoming events and conferences, and much more.
There are many “books” written by R community members that are freely available online. Some excellent online books are as follows:
Also, visit the `bookdown` package website to find many more free online books on R!
I especially recommend R for Data Science by Hadley Wickham and Garrett Grolemund as your first R book. Much of this workshop is inspired by this book. It is written for beginners and covers key concepts and applications of R programming for data analysis.
There are many excellent websites providing tutorials and learning materials on R and data analysis with R. The following are some of my personal favorites:
And, of course, take advantage of this workshop’s website! :)
“GitHub is a development platform inspired by the way you work. From open source to business, you can host and review code, manage projects, and build software alongside millions of other developers.” - GitHub.com
Most R packages are available as GitHub repositories, which can be “cloned” and downloaded if desired. There we can view the source code that shows what the package functions are doing under the hood to get the results they promise. Not only can we better understand what the functions are doing, we can also use the source code as inspiration for writing our own functions or even packages.
Also, many R package authors offer brief explanations and even quick tutorials for their packages on the GitHub repositories.
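Source code can also be inspected from the R console, which complements browsing a repository on GitHub. The sketch below uses functions from the `stats` package as arbitrary examples; the commented-out installation line assumes the `remotes` package is installed.

```r
# Typing a function's name without parentheses prints its source.
# Here the source of stats::sd() is captured as text instead.
sd_source <- deparse(stats::sd)

# Functions that dispatch on class keep their code in methods;
# getAnywhere() can retrieve them, e.g. the default method of median().
median_default <- getAnywhere("median.default")

# Installing the development version of a package from its GitHub
# repository (not run; assumes the `remotes` package is installed):
# remotes::install_github("tidyverse/dplyr")
```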
Here is a screenshot of the `dplyr` GitHub repository:
If you are the kind of person who learns best by taking courses on the subject matter, you can take advantage of online courses on R.
Some popular online learning websites with courses on R are as follows:
DataCamp is one of the best websites out there for learning R. It appears that many programmers and package authors at RStudio have courses on DataCamp. This means that you can learn about certain packages directly from the authors!
DataCamp requires registration and log-in to take the courses. There are some free courses available, but most are paid courses with one free chapter. The cost is $25/month with the annual plan or $29/month otherwise. Once you are subscribed, all courses are made available.
DataCamp offers 70+ courses on R. In general, a course is short (~4 hours) and focused on a specific topic. DataCamp’s R courses cover materials that range from basic to intermediate level.
Coursera is a MOOC (massive open online course) site that works with universities and offers learning materials that feel like a college course on a variety of subjects. Coursera has courses, specializations, and online degrees. You can find out more about the differences between these options here.
Coursera requires registration and log-in. Once you are logged in, you can “audit” any course for free. For a small fee ($29 to $99), you can get a course certificate and online support.
Some notable contents on Coursera include:
edX is another MOOC site that offers university-level courses, which are generally free and self-paced. Like Coursera, edX courses are usually organized in a college-course-like format.
Taking courses on edX requires registration and log-in. For a small fee, edX offers verified certificates for individual courses and XSeries certificates for XSeries programs.
edX is perhaps better for learning basics on topics like: