Philosophy

This page outlines my teaching philosophy and some implementation details. You are unlikely to find it interesting! I include it here for three reasons. First, I believe in transparency. I am happy for students to “see behind the curtain,” to understand the reasoning behind my pedagogical choices. Second, I want to provide teaching staff with links which they may find interesting. Third, I am always interested in discussing pedagogy with colleagues. Let me know if you have comments!

First, the syllabus for my Harvard class included a “Course Philosophy” section which applies just as much to every class I teach. Highlights:

No course does a better job of increasing students’ odds of getting the future they want.

This is the central promise I make to every student. Do what I tell you and you will increase the odds of getting the internship/job/career that you want, as well as your odds of being successful once you are there. Every organization needs more people, especially more junior people, who can do things with data.

The central metaphor for this class is Ulysses and the Sirens. You are Ulysses. Ithaca is the future you want. The Sirens are the many distractions of the modern world. I am the rope. No course at Harvard does more to increase students’ chances of getting the future they want.

I am creating a “pit of success,” a structure in which students can’t help but spend 1 or 2 hours a day doing data science.

No Lectures: The worst method for transmitting information from my head to yours is for me to lecture you. There are no lectures. We work on problems together during class. You learn soccer with the ball at your feet. You learn about data with your hands on the keyboard.

Professionalism: We use professional tools. Your workflow will be very similar to the workflow involved in paid employment. Your problem sets and final project will be public, the better to impress others with your abilities. High quality work will be shared with your classmates.

Second, “Kill The Math and Let the Introductory Course Be Born” is an article (pdf) which explains why most of my courses include little (meaningful) mathematics. Abstract:

Our introductory classes in statistics and data science use too much mathematics. The key causal effect which our students want our classes to have is to improve their future performance and opportunities. The more professional their computing skills (in the context of data analysis), the greater their likely success. Introductory courses should feature almost no mathematical/statistical formulas beyond simple algebra.

Any introductory course which, for example, does not teach Git/GitHub is doing its students a disservice.

Third, the heart of my courses is the tutorials which students complete on their own. This essay about how to write tutorials provides some background details. Highlights:

Imagine the shallowest possible learning curve. Almost every student should be able to answer almost every Exercise, albeit perhaps with the help of the Hint. There are no hard questions. In fact, there really aren’t questions at all. Instead, there are instructions: Do this. Do that. Next do this other thing.

Assume that you are giving the student a private lesson. You ask them a question. They give you an answer. What would you say next to them? What do you want to teach them, given that context?

There are 1,000,000 bits of R knowledge which we might provide to students: tips, tricks, cool packages, fun websites, et cetera. We don’t have time to mention all of them. The art of teaching is to, first, decide which 10,000 of the bits are most important to mention and, second, figure out the best time to mention them. Tutorials are a key location for doing that mentioning. Which bits do we mention and where do we mention them?

Generally, students don’t do the assigned reading, at least in a large class. But they will complete required work. They will do the tutorials. Our promise: If you complete the tutorials, you will become a data scientist. There is simply no way not to.

Fourth, I will (try to) ensure that students are always engaged in class, always doing data science. Part of that is the “no lectures” philosophy. The rhythm of class centers around solving a specific problem. I say a few words and send the students off to work on a problem. Because students work so much in class, I end up saying fewer words in class than any other teacher.

In the physical classroom, students work in pairs, either in parallel, discussing the problem as they both type in some code, or as pair programmers, both looking at the same screen but only one student writing code.

On Zoom, students work in breakout rooms, each of which includes about four students. One student shares her screen, a different student each time we go to breakout rooms. Another student is the “guide,” the person who tells the screen-sharer what to type. The other two students should be coding along as well. Course staff moves from room to room, ensuring that the sharing and guiding students are different each time. Course staff will also often, when first entering a room, ask a different student to share his screen.
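For course staff who like to see the rotation written down, here is a minimal sketch in Python of one way to keep the screen-sharer and the guide different each time a room reconvenes. The function name `assign_roles`, the round counter, and the student names are all hypothetical; nothing here describes a tool I actually use, and in practice staff simply ask a different student to share.

```python
def assign_roles(room, round_number):
    """Pick a screen-sharer and a guide for one breakout round.

    Assumes `room` is a list of unique student names and that
    `round_number` counts how many times this room has met (0, 1, 2, ...).
    Both names are illustrative, not part of any real tool.
    """
    n = len(room)
    if n < 2:
        raise ValueError("a room needs at least two students")
    # Step one seat forward each round so both roles keep changing.
    sharer = room[round_number % n]
    guide = room[(round_number + 1) % n]
    # Everyone else codes along on their own machine.
    others = [s for s in room if s not in (sharer, guide)]
    return sharer, guide, others


if __name__ == "__main__":
    room = ["Ana", "Ben", "Chloe", "Dev"]  # hypothetical students
    for round_number in range(4):
        sharer, guide, others = assign_roles(room, round_number)
        print(f"Round {round_number}: {sharer} shares, {guide} guides, "
              f"{', '.join(others)} code along")
```

With four students, every student shares once and guides once over four rounds, and the guide in one round becomes the sharer in the next.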

After a few minutes of working as a small group, I will bring the whole class back together. I then ask a random student to tell me what their group came up with. From my syllabus:

Cold Calling: I call on students during class. This keeps every student involved, makes for a more lively discussion and helps to prepare students for the real world, in which you can’t hide in the back row. Want to be left alone? Don’t take this course.

I share my screen with the class, typing what the randomly selected student tells me to type, providing commentary on the approach and, perhaps, suggesting a different answer. My goal is to ensure that all the student groups are “caught up,” so that when I send them back to continue work on the problem, they can all start from, more or less, the same place.
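The random selection behind cold calling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ColdCaller` and the roster are hypothetical, and the rule that every student is called once before anyone is called twice is my assumption for the sketch, not something the syllabus specifies.

```python
import random


class ColdCaller:
    """Draw students at random, cycling through the whole roster.

    Hypothetical helper: each pass through the roster is a fresh
    shuffle, so nobody is skipped and nobody can predict the order.
    """

    def __init__(self, roster, seed=None):
        self._roster = list(roster)
        if not self._roster:
            raise ValueError("roster must not be empty")
        self._rng = random.Random(seed)  # seed only for reproducible demos
        self._queue = []

    def next_student(self):
        # Refill and reshuffle once everyone has been called.
        if not self._queue:
            self._queue = self._roster[:]
            self._rng.shuffle(self._queue)
        return self._queue.pop()


if __name__ == "__main__":
    caller = ColdCaller(["Ana", "Ben", "Chloe", "Dev"])  # hypothetical roster
    for _ in range(6):
        print(caller.next_student())
```

Drawing without replacement within each pass spreads the calls evenly across the class; plain independent sampling would also be random, but could leave some students uncalled for a long stretch.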