Instructor: Jonathan Wells
Email: wellsjon@grinnell.edu
Classroom: Noyce 2401
Office: Noyce 2249
Office Hours: Link to Office Hours Calendar
This course is an introduction to the basic ideas of statistical reasoning, data analysis, and probability theory. It begins with a survey of data visualization, processing, and summary techniques using the R programming language, continues with an investigation of resampling and randomization methods for statistical inference and estimation, and concludes with a comparison of simulation-based techniques to classical probability tools.
MAT 124 or MAT 131; or instructor consent.
There are no statistics prerequisites for this class, nor are there any programming prerequisites. Additionally, although we will not directly use any calculus techniques, having had at least a semester of calculus indicates an appropriate level of quantitative preparation.
Throughout the term, we will make use of content from the following texts.
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse, 1st Edition by Ismay and Kim, available for free online at https://moderndive.com/ (Used primarily during the 1st half of the term)
Statistics: Unlocking the Power of Data, 3rd Edition by Lock et al., available through Day One Access on the course P-Web page https://pioneerweb.grinnell.edu/ (Used primarily during the 2nd half of the term)
The following web-based resources will be used for communicating class information:
PWeb https://pioneerweb.grinnell.edu (non-public documents, e-reserves, Day One Access Textbook).
Course Website https://grinnell-statistics.github.io/sta-209-s23-wells/ (documents, a daily schedule, assignments, lecture notes, course information).
Gradescope https://www.gradescope.com/ (homework and lab submissions, grades).
We will use computers almost every day of class. A limited number of desktop computers are available in the classroom, but you are encouraged to bring your own laptop to class each day, both to take notes and to perform statistical computations.
Modern statisticians and data scientists make extensive use of computing software to perform statistical analysis, and one of the primary tools used by these practitioners is the free and open-source R programming language. In this class, we will use R (along with the RStudio editor interface) to perform computations, to create data visualizations, and to write reproducible data reports.
All homework and lab assignments will be completed and submitted using the RStudio interface. R and RStudio are free to use and can be accessed online using the Grinnell RStudio Server: https://rstudio.grinnell.edu/
Alternatively, students with some prior programming experience who are interested in either offline access or greater control over computing resources can install R and RStudio on their personal computer:
Install R using the instructions here: https://www.r-project.org/
Install RStudio Desktop using the instructions here: https://posit.co/download/rstudio-desktop/
If you would like to contact me, I can most easily be reached via email (wellsjon@grinnell.edu) weekdays between 8am and 6pm. While I try to answer emails as soon as possible, in some cases, I may not be able to respond until the following school day. If you’d prefer to talk live, send me an email and we can schedule a time to chat via WebEx.
You are free and encouraged to attend any scheduled office hours without prior appointment. These are times I have specifically set aside for answering questions, discussing class material, and helping with other college business. If you have a matter you’d prefer to discuss one-on-one, or if none of the scheduled times fit your schedule, please email me and we can arrange another time to meet. On very rare occasions, I may need to reschedule office hours due to illness or other unavoidable conflict, and in these cases, I will notify the class via email.
By the end of the course, a student should be able to:
A typical week will involve the following:
Assigned Reading. For each day of class, one or more sections of the textbook will be assigned for reading, which you are expected to complete before the start of that day’s class. Statistical intuition takes time to develop, and by reviewing definitions, theory and examples before class, we can revisit and clarify them during our limited face-to-face class time.
Active In-person Lecture. The first 50 minutes of each class day will include an interactive lecture by the instructor, with some time devoted to discussion either class-wide or in small groups. Lecture slides will be posted on the course website in advance of class.
Lab. During the remaining 30 minutes of each class day, students will investigate statistical concepts using the R programming language, with guidance from the instructor and course mentor.
Homework. Each week, a homework set will be assigned containing both computational and theoretical problems. Homework assignments will be due Friday by 11:59pm.
A prepared student will attend class for 80 minutes per day, three days each week and spend about two to three hours per day of class on work outside the classroom (reading, doing homework, working on projects, discussing, studying, etc.). Together, this represents a 10 - 13 hour per week commitment.
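The weekly estimate above can be checked with a little arithmetic, assuming exactly three class meetings per week and two to three hours of outside work for each meeting:

\[
\underbrace{3 \times 80 \text{ min}}_{\text{in class}} = 240 \text{ min} = 4 \text{ hours}, \qquad
\underbrace{3 \times (2 \text{ to } 3) \text{ hours}}_{\text{outside class}} = 6 \text{ to } 9 \text{ hours},
\]
\[
4 + 6 = 10 \text{ hours} \quad \text{to} \quad 4 + 9 = 13 \text{ hours per week}.
\]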
Your grade in the class will be determined by your proficiency in each of the Course Outcomes, using the following weights:
Letter grades will be assigned based on the following course percentages (with upper and lower \(2\%\) of each division corresponding to \(+ / -\), respectively).
The ability to immediately interrogate your beliefs and understanding through dialogue sets a live class apart from more passive means of education. For this reason, you are expected to attend class regularly and to actively participate by asking questions, responding to questions, and engaging in class discussion.
If you are unable to attend class, you should notify the instructor before class (or promptly after, if that’s not possible). You are responsible for independently catching up on the material missed, which you can do by:
Typically, you may miss up to three classes without penalty. However, prolonged or recurring illness, as well as other emergencies, may require individual adjustment, in which case you should contact the instructor as soon as possible to make appropriate arrangements.
For most class days, a short lab assignment will be posted on the lab page of our course website. These assignments are intended to be completed during class, although you are welcome to take additional time after class to finish. Lab assignments should be submitted by the start of class (2:30pm) on the following class day.
Solutions to each problem must be typed in an .Rmd file, exported as a .pdf, and then uploaded as a .pdf to Gradescope.
Up to three times throughout the term, you may request a 2-day extension on a lab assignment.
In addition to the lab assignments, each week a homework assignment will be posted on Friday to the homework page of our course website, to be completed and submitted to Gradescope before 11:59pm the following Friday. These assignments will require you to synthesize skills developed during lab with material covered during lecture.
Solutions to each problem must be typed in an .Rmd file, exported as a .pdf, and then uploaded as a .pdf to Gradescope.
Up to twice throughout the term, you may request a weekend extension on your homework assignment, in which case the assignment will be due at 11:59pm the following Monday. You do not need to specify the reason for requesting the extension. However, except in extraordinary circumstances, requests must be made prior to 5pm on the assignment’s due date.
Two midterm exams will be given during the term. These exams will have both an in-class and a take-home component. Tentatively, the in-class exams are scheduled for:
Take-home components of the exams will be distributed the same day as the in-class component, and will be due the following Friday. No homework will be due on the day the take-home exam is due.
Except in the case of illness or emergency, requests to reschedule the in-class exam must be made a week before the exam.
Throughout the term, you will work in groups of 3-4 on a project that answers a significant research question using real-world data, by implementing the fundamental techniques developed in our class, as well as some more advanced methods from supplementary sources.
The project will culminate in a 3-5 page technical report, due at the end of the semester.
A lightly cumulative final exam will be given at the end of the term. The exam will have both an in-class and a take-home component.
The in-class exam is scheduled by the registrar’s office for 9am - noon, Tuesday May 16th. The take-home component will be distributed on Tuesday and due on Friday of Finals Week.
Except in the case of illness or emergency, the in-class exam cannot be rescheduled.
Grinnell College is committed to creating inclusive and accommodating learning environments. Please notify me as soon as possible if there are aspects of the instruction or design of this course that result in barriers to your participation. I also encourage you to have a conversation about and provide documentation of your disability to the Coordinator for Student Disability Resources, Jae Hirschman, located on the 1st floor of Steiner Hall (x3089). If you have already been approved for accommodations, please have Disability Resources provide a letter during the first week of classes, or as soon as possible after approval. I will then contact you to schedule a meeting during which we can discuss the particular implementation of your accommodations.
Grinnell College offers alternative options to complete academic work for students who observe religious holy days. Please contact me within the first three weeks of the semester if you would like to discuss how to meet the terms of your religious observance and also the requirements for this course.
Students are allowed and encouraged to collaborate on most in-class and homework assignments. However, any work that you turn in for grading must be your own. If you collaborate on homework, you should clearly indicate the names of your collaborators on the first page of your assignment.
You are welcome to use other paper or internet resources to supplement content we cover in this course, with the exception of existing solutions to homework or exam problems. Copying or paraphrasing solutions from the internet or other sources is an example of academic dishonesty. Exams will explicitly mention what resources may be consulted. All written work that references material outside of the textbook or lecture should be accompanied by an appropriate citation.
I expect all members of the class to make participation a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
I expect everyone to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community of learners. Examples of unacceptable behavior include: using sexualized language or imagery, making insulting or derogatory comments, harassing someone publicly or privately, monopolizing discussion or otherwise preventing others from meaningfully participating. Instead, you can contribute to a positive learning environment by demonstrating empathy and kindness, being respectful of differing viewpoints and experiences, giving and gracefully accepting constructive feedback, and making space for everyone to contribute.
You will receive timely feedback on your homework via Gradescope, usually within a week of the assignment’s due date. Each homework problem can earn up to five points, with point values corresponding loosely to letter grades (5 points \(\approx\) A, 4 points \(\approx\) B, etc.).
I recommend you review comments on your solutions and rework missed problems. You are welcome to talk to me about them during office hours or via email.
I strongly encourage you to attend my office hours each week. You are welcome to come either with specific questions, or just with general uncertainties about content we’ve discussed. If you are unable to attend scheduled office hours, please email me to schedule an alternative appointment (either in-person or virtual).
The Data Science and Social Inquiry Lab (DASIL) in HSSC S1310 is staffed by mentors who are experienced in R programming and may be able to troubleshoot coding problems you are having.
This is the schedule as of Day 1. A detailed and updated schedule is available on our course webpage.
Week | Dates | Topic | Important Dates |
---|---|---|---|
1 | 1/23 - 1/28 | The Structure of Data | - |
2 | 1/30 - 2/3 | Grammar of Graphics | Add/Drop Deadline 2/3 |
3 | 2/6 - 2/10 | Data Wrangling, Study Design | - |
4 | 2/13 - 2/17 | Linear Regression | - |
5 | 2/20 - 2/24 | Multiple Linear Regression | - |
6 | 2/27 - 3/3 | The Sampling Distribution | - |
7 | 3/6 - 3/10 | Confidence Intervals | Midterm Exam 1 3/6 |
8 | 3/13 - 3/17 | Hypothesis Testing | - |
- | 3/20 - 3/24 | Spring Break | - |
- | 3/27 - 3/31 | Spring Break | - |
9 | 4/3 - 4/7 | Probability | Withdraw Deadline 4/7 |
10 | 4/10 - 4/14 | Inference Using Mathematical Models | - |
11 | 4/17 - 4/21 | Inference for Proportions | Midterm Exam 2 4/17 |
12 | 4/24 - 4/28 | Inference for Means | No Class 4/26 |
13 | 5/1 - 5/5 | Inference for Regression | - |
14 | 5/8 - 5/12 | Review / Projects | Project Draft Due 5/12 |
15 | 5/15 - 5/19 | Finals Week | Final Exam 9am - noon 5/16 |