Econ 145 - Fall 2024
Data Wrangling for Economics
Syllabus
Richard Startz
Fall 2024
Real economic data is often messy. It can require significant filtering and re-arranging before
it can be analyzed. In this course we learn techniques to accomplish this. We focus on the
‘what to do’ questions, such as the importance of checking for missing data. Along the way,
we learn some answers to the ‘how to do it’ questions. We will learn some basic use of the R
computer environment, although once you understand the issues you will be able to apply
concepts in other environments as well. We will learn some very basic techniques of data
description, analysis, and presentation. But this course is not a substitute for a computer
science course nor for courses in econometrics/statistics. It’s all about the nitty-gritty of
dealing with real data in economics.
Data Wrangling for Economics is a learning-by-doing course. To be successful in this course
you should like to learn concepts and you should like to solve puzzles. Pretty much everything
is open book, so memorization is of very little importance. This course is open to individuals
of all skill sets. No prior R experience is required! In fact, this course is designed for
individuals with no prior R experience.
Student Learning Objectives and Purpose of Class
Students will learn how to:
a) Organize data for computer analysis.
b) Display and summarize data in order to answer substantive questions.
c) Communicate findings clearly and succinctly.
Copyright Richard Startz 2024
Course Organization
The course has a number of moving parts.
1. Class lectures are held every Tuesday and Thursday.
2. Computer lab hours: The computer lab in North Hall 1122 will be staffed with folks to
help you from 5:00pm-8:00pm Monday and Wednesday and 6:30pm-8:00pm Tuesday
and Thursday. During your assigned section time you get priority for access
to a computer. It’s a good place to meet up with classmates too.
3. Office hours: Tuesday 11:00-12:00 with Professor Startz to discuss anything you’d like
to discuss... except R and the specifics of the homework (those belong on Nectir).
Concepts in data wrangling, questions about the lecture material, advice on careers,
and the state of the world are all fair game.
4. Guided Exercises are on Canvas. Each one walks you through material relevant to
the course and to assignments. We strongly suggest that you code along with the
examples that we go through, although there will be times when copying and pasting
can be useful (e.g. loading in a data set). Spending time on the guided exercises is
strongly recommended. You do not turn in anything from the guided exercises.
5. Homeworks: Assignments and deadlines are posted on Canvas. For each homework
(unless otherwise specified) you will:
a. Upload R code to Gradescope. Gradescope will grade your code and give you a
little bit of feedback. You can turn code in to Gradescope as often as you like;
only the highest-scoring submission counts for a grade.
b. Create a written analysis. Turn in your written assignment on Gradescope.
6. Simulation week. During the week of November 18, you will receive multiple analysis
assignments. As is often true in organizations, these assignments will be short-fuse,
time-critical. Be sure to set aside time for the needed tasks. These assignments count
for more than one homework.
7. Online implicit bias test. You receive participation credit for this assignment. You are
not graded on your answers.
8. Final project. The final project is an extended homework. In addition to turning it in
for a grade, think of your final project as giving you a writing sample and some talking
points you can use as part of a job search.
9. Exams? Nope, no exams.
Necessary equipment
Basically, you need an internet connection and some kind of computer. You need something
with a keyboard, and a bigger screen is better. (A smartphone won’t do.) You need software
to create a pdf document, but almost any word processor will do that.
Collaborating and getting help
• You are strongly encouraged to collaborate on assignments with classmates. Having
discussions on Nectir is one way to do this. We recommend coding with RStudio open
in one window and Google open in another. And if you find ChatGPT or other AI
useful, go for it. (Our experience is that AI can be useful, but not useful enough to
earn a good grade when used uncritically.)
• All questions are fair game for you to post on Nectir. (Do keep it PG, which can
sometimes be hard while coding.) And drop in when you can to help out a classmate!
Course staff members will monitor Nectir as regularly as the budget allows. So we
hope that, between classmates and staff, responses will be relatively quick.
• For administrative questions about the course (not about help with homework or R),
send email to [email protected]. We will try to get you a response within 24
hours. Do not send questions through Canvas.
Academic integrity
• Copying a couple of lines of code is fine... copying more is not. When you copy code,
it is considered a professional courtesy to include a comment citing the source.
• If you plagiarize, you flunk the course and get turned in for violating the student
conduct code. So don’t do it.
• All course materials (class lectures and discussions, handouts, examinations, web
materials) and the intellectual content of the course itself are protected by United States
Federal Copyright Law and the California Civil Code. The UC Policy 102.23 expressly
prohibits students (and all other persons) from recording lectures or discussions and
from distributing or selling lecture notes and all other course materials without the
prior written permission of the instructor (See http://policy.ucop.edu/doc/2710530/PACAOS-100).
Students are permitted to make notes solely for their own private educational
use. Exceptions to accommodate students with disabilities may be granted
with appropriate documentation. To be clear, in this class students are forbidden from
completing study guides and selling them to any person or organization. This text has
been approved by UC General Counsel.
• We know perfectly well that the vast majority of students conduct themselves with
integrity without being lectured about it. Apologies to you. If it is any consolation,
the economics department is surprisingly good at catching cheaters.
Grade points
Grade points will be assigned something like this:
Table 1: Point Breakdown
Assignment                          Points
Homework - coding                   100 (x14)
Homework - write-up                 500 (x8)
Implicit Bias Test Participation    100
Simulation Week coding              100 (x2)
Simulation Week write-up            500 (x2)
Final project - coding              200
Final project - write-up            1200
See also insurance policies and extra credit below.
At the end of the term, we will sort grades more or less in line with the econ department’s
guidelines for elective, upper division courses. So the raw grades don’t correspond to any
particular letter grade.
Most homework assignments will consist of two parts: a coding portion and a written portion.
The coding portion may be submitted to Gradescope an unlimited number of times. The
submission with the highest grade will be counted. The written portion will be graded based
on addressing the prompt, clarity of writing, and organization.
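As a rough illustration of how the point values in Table 1 add up, here is a minimal Python sketch. The point values and counts are copied from the table; the names `POINT_BREAKDOWN`, `category_totals`, and `share_of_total` are made up for this example, and the tally ignores the insurance policies and extra credit described elsewhere in the syllabus.

```python
# Point values copied from Table 1: (points per item, number of items).
# This is a raw tally only; it assumes no dropped grades and no extra credit.
POINT_BREAKDOWN = {
    "Homework - coding": (100, 14),
    "Homework - write-up": (500, 8),
    "Implicit Bias Test Participation": (100, 1),
    "Simulation Week coding": (100, 2),
    "Simulation Week write-up": (500, 2),
    "Final project - coding": (200, 1),
    "Final project - write-up": (1200, 1),
}


def category_totals(breakdown):
    """Return the total points available in each category."""
    return {name: points * count for name, (points, count) in breakdown.items()}


def share_of_total(breakdown):
    """Return each category's fraction of the raw point total."""
    totals = category_totals(breakdown)
    grand_total = sum(totals.values())
    return {name: total / grand_total for name, total in totals.items()}


TOTALS = category_totals(POINT_BREAKDOWN)
GRAND_TOTAL = sum(TOTALS.values())

for name, total in TOTALS.items():
    print(f"{name:<34}{total:>6}")
print(f"{'Raw total':<34}{GRAND_TOTAL:>6}")
```

Under these assumptions the write-ups (homework, simulation week, and final project) account for most of the available points, which fits the course's emphasis on communicating findings.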
Regrades
If you think your written assignment was graded incorrectly, send a copy of your submission
to econ-econ145@ucsb.edu together with a written explanation of the mistake. Just saying
you disagree won’t work; you have to identify a mistake. Requests for regrading must be
made within two weeks of the original due date. (And the truth is, we almost never find a
good reason to change a grade.)
Insurance policy and Late Work
Do not wait until the last minute to do assignments. The internet goes down. Computers
crash. People get sick. (Given the way the last few years have been going, locusts may land
and carry off your laptop.)
The grades above come with an automatic no-request-needed insurance policy.
• We drop the lowest homework coding grade and we drop the lowest homework write-up
grade.
– Try not to use up your insurance early, in case you have a real emergency later
in the course. (Obvious, no?)
• Homework coding or write-ups 1-13 may be turned in up to 7 days late and receive
one-half credit. Homeworks 14 and 15 may be turned in up to 3 days late and receive
one-half credit. All other assignments are due at the time specified on Canvas.
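The late-work rule for homeworks can be sketched in Python as follows. This is only one reading of the rule: the function name `late_credit_multiplier` is hypothetical, the sketch assumes that work turned in after the late window earns no credit (the syllabus does not say so explicitly), and it treats any assignment outside homeworks 1-15 as having no late window.

```python
def late_credit_multiplier(homework_number, days_late):
    """Return the fraction of credit earned under the late-work rule.

    Assumptions (not stated in the syllabus): days_late <= 0 means on time,
    and work submitted after the late window earns zero credit.
    """
    if days_late <= 0:
        return 1.0  # on time: full credit

    if 1 <= homework_number <= 13:
        window = 7  # homeworks 1-13: up to 7 days late
    elif homework_number in (14, 15):
        window = 3  # homeworks 14 and 15: up to 3 days late
    else:
        window = 0  # other assignments: due at the time on Canvas

    return 0.5 if days_late <= window else 0.0


# Example: a 100-point coding homework turned in 2 days late.
print(100 * late_credit_multiplier(homework_number=4, days_late=2))
```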
iClicker optional insurance
We will use iClickers in class. Each question is worth 1 point. At the end of the term we
drop the lowest 15 points and then give a grade as a percentage of 100. The iClicker score
then replaces the second lowest coding grade. (If the iClicker grade is lower than the second
lowest coding grade we drop the iClicker grade.) If you don’t want to do the iClicker, you
can opt out of the optional insurance by October 14.
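One possible reading of the insurance arithmetic is sketched below in Python. Several details are assumptions rather than course policy: the function names are hypothetical, "drop the lowest 15 points" is interpreted as subtracting 15 from the number of points possible, the iClicker score is capped at 100, and coding grades are taken to be on a 0-100 scale as in Table 1.

```python
def iclicker_percentage(points_earned, points_possible, forgiven=15):
    """Convert iClicker points to a 0-100 score after forgiving some points.

    Assumption: forgiving 15 points means the denominator shrinks by 15,
    and the resulting score is capped at 100.
    """
    denominator = points_possible - forgiven
    if denominator <= 0:
        raise ValueError("points_possible must exceed the forgiven points")
    return min(100.0, 100.0 * points_earned / denominator)


def apply_insurance(coding_grades, iclicker_score=None):
    """Return the coding grades that count after both insurance policies.

    Step 1: drop the lowest coding grade (automatic insurance).
    Step 2: if an iClicker score is given and it beats the second lowest
            coding grade, it replaces that grade; otherwise it is ignored.
    """
    if len(coding_grades) < 2:
        raise ValueError("need at least two coding grades")
    kept = sorted(coding_grades)[1:]  # lowest grade dropped
    if iclicker_score is not None and iclicker_score > kept[0]:
        kept[0] = iclicker_score  # kept[0] is the second lowest original grade
    return kept


# Example with made-up numbers: 68 of 100 iClicker points earned.
score = iclicker_percentage(points_earned=68, points_possible=100)
print(apply_insurance([100, 0, 50, 90], iclicker_score=score))
```

Passing `iclicker_score=None` corresponds to opting out, in which case only the automatic drop of the lowest coding grade applies.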
Optional extra credit assignment
An optional extra credit assignment will be available on Canvas. The extra credit assignment
will be due December 9 (no extensions, no late submissions). The extra credit assignment
is worth 500 points, which are added to your point total (but can’t push your total above a
perfect score).
Textbooks
The following textbooks are highly recommended, but not required:
R for Data Science, 2nd edition. 2023. Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett
Grolemund, O’Reilly Media or https://r4ds.hadley.nz/.
R Graphics Cookbook: Practical Recipes for Visualizing Data, 2nd edition. 2019. Winston
Chang, O’Reilly Media or https://r-graphics.org/.
Hands-On Programming with R. 2014. Garrett Grolemund, O’Reilly Media or
https://rstudio-education.github.io/hopr/.
The following books are also useful:
An Introduction to R. 2019. W. N. Venables, D. M. Smith and the R Core Team,
https://cran.r-project.org/manuals.html.
ggplot2: Elegant Graphics for Data Analysis, 2nd edition. 2016. Hadley Wickham, Springer.
R Markdown: The Definitive Guide. 2019. Yihui Xie, J. J. Allaire, and Garrett Grolemund,
Taylor and Francis or https://bookdown.org/yihui/rmarkdown/.
Useful cheat sheets can be found at https://rstudio.com/resources/cheatsheets/