R for Actuarial Science
This is an 8-week course designed to show actuarial students the basic concepts and methods needed to write elementary R scripts and understand those written by others. Applications will be chosen for their relevance to the actuarial profession. By the end of the course, students will have learned topics necessary to perform basic data analysis tasks and will be well-prepared for more advanced analytics coursework.
Much of this material can be learned “on the job” or “as you go,” but it can take years – ask me how I know! My goal with this course is to take the frustrations I experienced on my own journey and help you overcome them early. Together, in 8 weeks, you can reach a level that took me over 8 months to reach!
Instructor:
Robert C. Roper

Objectives
By taking this course you will learn the following skills:
- Fundamentals of the base R language
- Installation and use of third-party packages
- Elementary object-oriented programming
- How to load, manipulate, query, and prepare complex data sets for analysis
- Web-scraping techniques for collection of online data
- Cross-sectional and time-series modeling in R
- Specialized R packages for actuarial sciences
- Dashboard preparation for presenting final deliverables to stakeholders

Course Schedule
Week 1: Getting Started in R
This week, we’ll learn all about how to get set up with R and RStudio, and how to access the online cloud environment (for those who would prefer one). We’ll then move on to the basics of R programming, and will even cover some “advanced” R topics. Don’t be nervous – these will make your life so much easier!
- Installing R/RStudio
- Getting to know RStudio
- Using the R Console
- Installing packages
- R’s data types
- Loops and apply functions
- Conditional statements
- Random numbers
- Probability functions
- Writing your own functions
- The dots (...) argument
- Basic exception handling
- The S3 object system
- Functionals
- Function factories
- Function operators
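As a taste of the “advanced” topics above, here is a minimal sketch in base R of a function factory, a functional, and a function operator. The names `make_discounter`, `discount_5pct`, and `safely_or_na`, the 5% rate, and the cash flows are hypothetical and chosen only for illustration; they are not taken from the course materials.

```r
# Function factory: returns a function that discounts a cash flow at rate i.
# (make_discounter is a hypothetical name used only for this sketch.)
make_discounter <- function(i) {
  force(i)  # evaluate i now, so later changes to the caller's variable are ignored
  function(amount, t) amount / (1 + i)^t
}

discount_5pct <- make_discounter(0.05)

# Functional: vapply() applies a function to each element and checks the result type.
cash_flows <- c(100, 100, 100)
times <- 1:3
present_values <- vapply(
  seq_along(cash_flows),
  function(k) discount_5pct(cash_flows[k], times[k]),
  numeric(1)
)
total_pv <- sum(present_values)

# Function operator: takes a function and returns a modified version of it.
# Arguments are passed through unchanged via the dots (...), and any error
# is caught with tryCatch() and replaced by NA.
safely_or_na <- function(f) {
  function(...) tryCatch(f(...), error = function(e) NA_real_)
}
safe_log <- safely_or_na(log)

print(total_pv)
print(safe_log(10))
print(safe_log("ten"))  # log() of a string is an error, so this yields NA
```

The factory keeps the interest rate in its enclosing environment, so the same discounting logic can be reused at several rates without passing the rate on every call.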
Week 2: Data Wrangling with the tidyverse
In the second week, we’ll move on to the data and analysis component of the course, starting with an introduction to the tidyverse collection of packages.
- Introduction to the tidyverse
- magrittr and pipe operators
- readr for reading source data
- dplyr for data manipulation and summarization
- tidyr for data reshaping
- purrr for better loops
- Merging/joining data frames
Week 3: Data Collection and Exploratory Analysis
Now that we know how to work with basic data frames, it’s time to get our hands on some data and analyze it! We’ll go over the most basic ways to get data out of SQL databases, and will even cover how to take data straight off the web for analysis using a technique called “web scraping.” We’ll then learn how to summarize and plot that data to get an idea of what’s in it.
- SQLite: the bare minimum
- Database connectors in R
- dbplyr: SQL for the tidyverse
- Summary statistics with summary()
- Graphics with ggplot2
- Built-in hypothesis tests
- Outlier and distributional fit tests
Week 4: Statistical Modeling in R
You’ve got the data in R, and you can describe it. Now it’s time to start building models and making inferences. This week, we’ll go over the basic built-in modeling tools that come with R, and we’ll also cover how to choose variables for your model, compute model statistics, plot your results, and make inferences about the data.
- OLS Regression with lm()
- Interpretation of lm()
- Likelihood Estimation: stats4::mle()
- Model comparison with anova()
- The AIC() and BIC() specification criteria
- Confidence and prediction intervals
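The topics above can be previewed with a short sketch that uses only base R and the built-in `cars` data set. The choice of data set, the quadratic alternative model, and the new speed value of 15 are assumptions made for illustration; the nested-model F-test here uses `anova()`, which is one common way to compare models and not necessarily how the course will do it.

```r
# Built-in data set: stopping distance (dist) vs. speed for 50 cars.
fit_linear <- lm(dist ~ speed, data = cars)
fit_quadratic <- lm(dist ~ speed + I(speed^2), data = cars)

# Coefficients, standard errors, and R-squared for the linear model.
print(summary(fit_linear))

# Lower AIC/BIC suggests a better trade-off between fit and complexity.
print(AIC(fit_linear, fit_quadratic))
print(BIC(fit_linear, fit_quadratic))

# F-test comparing the two nested models.
print(anova(fit_linear, fit_quadratic))

# Interval estimates at a new speed value (15 is an arbitrary choice).
new_obs <- data.frame(speed = 15)
conf_int <- predict(fit_linear, newdata = new_obs, interval = "confidence")
pred_int <- predict(fit_linear, newdata = new_obs, interval = "prediction")
print(conf_int)
print(pred_int)
```

The prediction interval is wider than the confidence interval because it covers the variability of a single new observation, not just the uncertainty in the estimated mean.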
Week 5: Time Series Analysis and Forecasting
Ordinary Least Squares as described in Week 4 is a powerful tool, but it relies on some assumptions that real data doesn’t always conform to. Time series analysis has a few caveats that we’ll go over this week. We’ll then continue our course by learning how to work with time series and forecast future values, using packages designed specifically for these tasks.
- Time-series-specific modeling concerns
- The xts and tsibble packages
- Time series decomposition:
  - trend
  - seasonal
  - residual
- Time series model types:
  - AR(p)
  - ARDL(p, q)
  - ARIMA(p, d, q)
- Forecasting: forecast and fable
- Forecast evaluation
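For a rough idea of the workflow, the sketch below decomposes a series, fits an ARIMA model, and scores a hold-out forecast. It deliberately uses only base R's `stats` functions (`decompose()`, `arima()`, `predict()`) in place of the forecast and fable packages named above, so that it runs without installing anything. The `AirPassengers` data set, the ARIMA(1, 1, 1) order, and the 12-month hold-out are assumptions chosen for illustration.

```r
# AirPassengers: monthly airline passenger totals, 1949-1960 (ships with R).
y <- log(AirPassengers)  # log scale makes the seasonal swings roughly constant

# Classical decomposition into trend, seasonal, and residual ("random") parts.
parts <- decompose(y)
print(head(parts$seasonal, 12))

# Hold out the final 12 months to evaluate the forecast.
train <- window(y, end = c(1959, 12))
holdout <- window(y, start = c(1960, 1))

# ARIMA(1, 1, 1) is chosen for illustration only, not by a selection procedure.
fit <- arima(train, order = c(1, 1, 1))
fc <- predict(fit, n.ahead = 12)

# Root mean squared error on the hold-out year (log scale).
rmse <- sqrt(mean((holdout - fc$pred)^2))
print(rmse)
```

Scoring the forecast on data the model never saw is the key design choice: in-sample fit alone tends to overstate how well a model will forecast.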
Week 6: Specialized Actuarial Packages for R
By now, you’ve learned the basics of R, and even a little more. But you’re not just any old data analysts – you’re learning this for the actuarial field. This week, we’ll go over special packages developed by actuaries for actuaries, so that you can avoid re-inventing algorithms and instead use popular, well-tested solutions!
- survival: life tables and survival analysis
- ChainLadder: claims reserving
- fitdistrplus: fitting of loss distributions
- idmodelr: scenario analysis
- More packages will be introduced based on students’ goals and interests
Week 7: Report Generation and Automation
R won’t do you much good if you can’t clearly show executives and stakeholders what you’ve discovered. The RMarkdown language allows users to generate data-driven, attractively formatted documents (or static documents like this syllabus) for either the web or print. After covering the basics, we’ll go over how to set up automatic tasks that can generate reports from R as often as you like!
- Manual vs. automated workflows
- Report generation with RMarkdown and knitr
- Mathematical notation with LaTeX
- Task scheduling
Week 8: Advanced Visualization and Interactivity
Last week, we discussed how to make reports for the web and print that don’t change with new data and have no interactivity. This week, we’ll end the course with a discussion of how to use R to generate dashboards and reports that executives and stakeholders can comfortably interact with.
- Introduction to the shiny web framework
- The reactable package for interactive, sortable tables
- The plotly package for beautiful interactive graphics
- Just enough HTML/CSS/JavaScript to get you in trouble!
Final Project
At the end of the course, there will be an assigned project using the 2024 U.S. Life Tables from the CDC, in which students will perform a mortality analysis using R.
Prerequisites:
- Probability and Statistics (Exam P)
- Financial Mathematics (Exam FM)
- Linear Algebra (Just knowing what matrix multiplication and inverses are will do)
- Microsoft Excel
Required Software:
There are two programs and one web service that students must have access to:
- The R interpreter
- The RStudio integrated development environment (IDE)
- A GitHub account – for submission of homework assignments