- I feel really old in the office.
- I take a lot of meetings with young people who want to be data scientists.
The meetings generally have this feel:
"Hi, I'm interested in becoming a data scientist and I want to get your perspective on what that might take. Let's meet for coffee."
The interns who set up these meetings come from a wide range of backgrounds and skill levels. Often they want to know which classes they should take to be competitive for data science roles after they graduate. I thought it would be helpful to put the advice I give them into a blog post so more people can read it.
I reviewed the current course offerings at a few schools, and settled on a list of 14 basic classes (and some electives at the end) that I think should be part of every data science curriculum. Here is my list:
- Calculus I, II, III
- Differential Equations
- Linear Algebra
- Intro to Stats
- Calculus Based Stats
- Generalized Research Methods Class
- Econometrics
Econometrics may seem like an outlier, but several predictive modeling concepts, such as time-series analysis and dealing with collinearity, endogeneity, and autocorrelation, are best taught in the context of econometrics.
- Programming I, II
- Data Structures
- Fundamentals of Computer Algorithms
- Introduction to Database Systems
Sometimes I question the value of formal education in coding; some of the best programmers I know have degrees in non-computer fields. That said, computer science is still a core skill set for data scientists, and it is required knowledge to be hired by someone like me. If you have the skills from another source, that's great; just figure out a way to demonstrate them in an application or an interview.
As for electives in the data science space, these should be tailored to what specifically you want to do.
- If you want to go into business, take classes in economics, business operations, and accounting.
- If you want to go into algorithm development, focus more time in advanced computer science classes.
- If you want to go into academic research, focus on whichever academic discipline you are most interested in.
This is intended to be a reasonable list of classes for young people interested in data science. It serves two purposes really:
- Provide a framework for undergrads looking to become data scientists.
- Prevent me from saying things I later regret when confronted by students who want to be data scientists without taking math.