STA 141A course webpage: Fundamentals of Statistical Data Science
Spring 2025
Instructor: Akira Horiguchi (ahoriguchi@ucdavis.edu)
Lectures: Mondays, Wednesdays, and Fridays, 1:10 PM - 2:00 PM (Young 198)
Labs: Run by TAs (Wellman 230)
- Section A01: Wednesday, 3:10 PM - 4:00 PM, Zhentao Li (ztlli@ucdavis.edu)
- Section A02: Wednesday, 4:10 PM - 5:00 PM, Zijie Tian (zijtian@ucdavis.edu)
- Section A03: Wednesday, 5:10 PM - 6:00 PM, Lingyou Pang (lyopang@ucdavis.edu)
Office hours:
Name | Day, time | Location |
---|---|---|
Akira Horiguchi | Monday, 9:30 AM - 10:30 AM | Physical and Data Sciences Building 0003 (in the basement; ignore the scary signs), Google Map |
Zijie Tian | Tuesday, 4:00 PM - 5:00 PM | MSB 1117 |
Zhentao Li | Thursday, 10:00 AM - 11:00 AM | MSB 1117 |
Lingyou Pang | Thursday, 5:00 PM - 6:00 PM | MSB 1117 |
Syllabus: here
Piazza: here
Textbooks: Two "main" textbooks and two "supplemental" textbooks will be used for the course. They are all freely available online.
"Main":
- [R4DS2] R for Data Science, 2nd edition. Hadley Wickham, Mine Çetinkaya-Rundel, Garrett Grolemund. 2023. https://r4ds.hadley.nz/
- [ISLR2] An Introduction to Statistical Learning with Applications in R, 2nd ed. G. James, D. Witten, T. Hastie, and R. Tibshirani. 2021. https://www.statlearning.com/
  - Note: the authors published a Python version of this book in 2023, but this course uses the R version.
"Supplemental":
- [IR] An Introduction to R. W. N. Venables, D. M. Smith, and the R Core Team. 2020. https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
- [IP] Introduction to probability, statistics, and random processes. H. Pishro-Nik. Kappa Research LLC, 2014. https://www.probabilitycourse.com
Class Schedule
The exam, homework, and project dates are set, but the lecture topics are subject to change.
Week | Lecture Day | Topics | Slides | Additional references | Homework | Lab |
---|---|---|---|---|---|---|
1 | Mar 31 (M) | 1 (Class overview, basic R) | part 1 part 2 | IR | HW 0 released pdf rmd | |
1 | Apr 2 (W) | 2 (Vectors, matrices, arrays, lists, data frames) | html | IR | HW 0 due 9pm (due to waitlist logistics, HW 0 will be the only homework that will be accepted late); HW 1 released pdf rmd | 1 |
1 | Apr 4 (F) | 2 (Functions, loops, apply, conditional execution) | (see above) | IR | | |
2 | Apr 7 (M) | 3 (Explore data: import, subset, inspect) | html | R4DS2, Ch3 | | |
2 | Apr 9 (W) | 3 (Explore data: reshape) | (see above) | R4DS2, Ch5 | HW 1 due 9pm; HW 2 released pdf rmd | 2 |
2 | Apr 11 (F) | 3 (Explore data: transformations) | (see above) | R4DS2, Ch12-13 | | |
3 | Apr 14 (M) | 3 (Explore data: transforming); Discuss project | | R4DS2, Ch12-13 | | |
3 | Apr 16 (W) | 3.5 (Explore data: visualizing) | html | R4DS2, Ch9-10 | HW 2 due 9pm | 3 |
3 | Apr 18 (F) | 3.5 (Explore data: visualizing), 3.6 (Explore data: joins) | html | R4DS2, Ch16 | Practice midterm exam 1 released pdf | |
4 | Apr 21 (M) | 4 (Probability) | | IP, Ch1,3-5 | | |
4 | Apr 23 (W) | 4 (Probability) | | IP, Ch1,3-5 | (Teams created); HW 3 released pdf rmd | 4 |
4 | Apr 25 (F) | Midterm exam 1 (1:20 PM - 2:00 PM) | | | | |
5 | Apr 28 (M) | 4 (Probability) | | IP, Ch1,3-5 | | |
5 | Apr 30 (W) | 5 (Overview of statistical learning) | | ISLR2, Ch1-2 | HW 3 due 9pm; HW 4 released pdf rmd | 5 |
5 | May 2 (F) | 5 (Overview) / 6 (Regression) | pdf / pdf | ISLR2, Ch3 | | |
6 | May 5 (M) | 6 (Regression) | | ISLR2, Ch3 | Project proposal due, 9pm (due date extended to May 7, 11:59pm) | |
6 | May 7 (W) | 7 (Classification) | | ISLR2, Ch4 | HW 4 due 9pm; HW 5 released pdf rmd | 6 |
6 | May 9 (F) | 7 (Classification) | | ISLR2, Ch4 | | |
7 | May 12 (M) | 7 (Classification), 8 (Resampling methods) | pdf pdf | ISLR2, Ch5 | | |
7 | May 14 (W) | 8 (Resampling methods), 9 (Unsupervised learning) | pdf pdf | ISLR2, Ch12 | HW 5 due 9pm; HW 6 released pdf rmd | 7 |
7 | May 16 (F) | 9 (Unsupervised learning) | | ISLR2, Ch12 | | |
8 | May 19 (M) | Work on project (no class) | | | | |
8 | May 21 (W) | Work on project (no class) | | | HW 6 due 9pm; Practice midterm exam 2 released pdf | 8 |
8 | May 23 (F) | Work on project (no class) | | | | |
9 | May 26 (M) | Memorial Day, no class | | | | |
9 | May 28 (W) | Midterm exam 2 (1:20 PM - 2:00 PM) | | | HW extra credit released pdf rmd | 9 |
9 | May 30 (F) | Tree-based methods | | ISLR2, Ch8 | | |
10 | Jun 2 (M) | Tree-based methods | | | | |
10 | Jun 4 (W) | Tree-based methods, parting thoughts | | | HW extra credit due 9pm | 10 |
11 | | No final exam | | | Final project due June 11, 9pm | |