(Editor's note: transcripts don't do talks justice.
This transcript is useful for searching and reference, but we recommend watching the video rather than reading the transcript alone!
For a reader of typical speed, reading this will take 15% less time than watching the video, but you'll miss out on body language and the speaker's slides!)
Thanks. Can you hear me? I'm your visitor from academia. I think this will be fun. I hope you agree. This is different for me and maybe a little different from you. And what I thought I would do is kind of give you a take on where I think undergraduate programs in computer science are sitting today and why I think you should find that interesting.
So before I get to the message, just a little bit about the messenger. So it took me about 10 minutes to get here from my campus. I've been a faculty member since 2003. I plan to be there for a long, long time. When I'm not talking about curriculum, I do actually work in programming languages. I write things like this, I write things more complicated than this. I have a bunch of PhD students. I write papers. I do all those things. This also somewhat interestingly is the last line of code in the talk.
This is the diagram I'm mostly going to talk about. We'll get to it in several minutes. It's the best picture I have of the core of our undergraduate curriculum and how the pieces fit together. I led the charge to create that several years ago. And we have a saying, at least in my world, that no good deed goes unpunished. So after that I ended up on an international steering committee to perform the once-a-decade update of the international guidelines on what should be in a computer science undergraduate curriculum. That was a three year process that led to six meetings and a 300 page report. And it's actually I think pretty darn good, but of course I'm biased.
And as that was winding down, then MOOCs happened. So if we've crossed paths before, it might be because you've seen a logo like this. I have what's now a sequence of three somewhat shorter MOOCs on Coursera. And if you've taken them, that's awesome. Come say hi and that's why I look familiar. These days though, if you think about my job in sort of academic middle management, I stare at spreadsheets like this figuring out who's going to teach about 120 courses a year to a couple of thousand students.
So here's the model for this talk. I think it will resonate with you. This is where I work, it's an actual picture of the building. You can call it the ivory tower, though technically speaking it is neither ivory nor a tower. I'm basically assuming that you're over there.
And a big part of what we do is send you people.
And therefore, it might be interesting to you to think about how we think about what we teach them, and what we actually teach them, and what we think they've learned. You know, you might try to hire these folks. You might have your own pet peeves, things you rant about after a beer or two about what undergraduate CS education gets wrong. In which case a modern view of what's going on will let you update your rant.
It will at least let you see what may or may not have changed from when you participated in such a thing. Or if you never participated in such a thing, an updated view on what you missed. And I also feel like we have a somewhat unique role, at least here in Seattle, that we are kind of this neutral party. We send students out to everywhere, our alums come back from everywhere. And so even though I don't participate in the real world in the same way that you do, I have sort of a unique viewpoint because I am sort of neutral in that interesting way.
So that said, it's very interesting times in computer science education and I boiled it down to three challenges that I at least think about a lot. So here's the first one. Computer science is not the same thing as software engineering. And this is something that both we and you as proxies for the real world have acknowledged, and studied, and wrestled with for decades. And I'm actually very much in favor of this divide. So that's putting my sort of philosophy on the table. And let me give you some reasons.
So the first is we navigate that divide all the time. We send students out to summer internships, they come back, they mentor, they do resume workshops. And so the classroom experience and on campus live experience is nicely complemented by a series of exposures that help our students understand the difference between computer science and software engineering. Second, I do believe in sort of the classic academic argument that we're trying to future-proof our students. That you need to walk out of an undergraduate education with critical thinking skills, self learning skills, and a way to organize material because the technology is going to change after you leave campus.
Third, we do deeply believe that we're not vocational programs. There's nothing wrong with vocational programs. Those exist also and we shouldn't duplicate them, we should do something different. And in particular we should not target any subset of the technology industry. And if you're not going to target a subset, you can't target it at all because it's so broad, it's so different. If I asked each of you what's the three most important tools I should expose my students to, together just among the people in this room there would be a list of hundreds of things and it's just impractical.
And lastly, I get to quote my colleague and friend and mentor David Notkin who passed away a few years ago, who many times over his long career was challenged for not producing students that had a set of skills that are really important in the business world and understanding realities and communication skills and all the sort of things that go along with that. And he learned to say, you know it seems like you're looking for someone with the skills that are common among those with three to five years of industry experience. And the best way to find those people is to look for people with three to five years of industry experience.
And it's not that academia couldn't do better, this is not an excuse. It's that we all have a role to play and we all learn as we go through our lives. And you're going to learn so much in your four-ish years at the University of Washington and you're going to continue learning after. And whatever we teach there'll be something left that people learn afterwards.
So that was sort of challenge number one. I believe I teach computer science, which is different than professional software engineering. The second challenge that you may be less aware of is that we are currently going through an epic enrollment boom. So I'll first cite some national statistics. A fantastic report from the Computing Research Association came out recently that puts the data behind what we all knew. So years on the x-axis, number of majors on the y-axis averaged across over 100 research universities in North America. The orange line is the highest number we ever saw during the dot com boom. It must have been '99, 2000, or 2001, right? And we are well north of that after being well south of that. And it's happening very quickly. So we're drowning in students that are passionate about wanting to be one of us and then potentially one of you.
Since you're here in Seattle, I'll tell you the local news. So this is a graph from the University of Washington, with somewhat fewer years on the x-axis. When freshmen show up on campus they take a zero-stakes survey of what do you think your first choice major is. We're now ahead of business and ahead of biology and it's got to stop because we're up to over 900 freshmen and there are only 6,000 freshmen.
That's before they ever sign up for classes. When they actually sign up for classes, blue line is our first programming course, red line is our second programming course. And in the last decade or so, we've seen both of them increase by a factor of about 2.5x. Combined we're now seeing north of 5,000 student enrollments in freshmen programming courses a year. There is an entire additional talk about how you manage a course that looks like that.
At the other end of the pipeline, most of those students never become computer science majors. So this is what growth looks like in terms of bachelor's degrees awarded. I have very good projections because they're already on campus. So the red line, which is the number of students who actually get the diploma, goes out to 2019. And 2014 was the first time my department ever gave more than 200 bachelor's degrees, and five years later we'll be at 345. And we're turning away too many students, students we'd like to teach. We can only grow so fast.
Challenge number three will not surprise you. In fact, I believe it's basically the origin story of our field. You know, it's this picture. I don't have to narrate it for you. You recognize some or all of these pictures. This is the 1970 through 2020 picture. And what I would point out is this doesn't surprise us. This is what we live day in and day out. But to my colleagues in the chemistry department, or the math department, or the sociology department, this is insane.
And I certainly appreciate that it is hard for academics to move quickly. To take a curriculum that's spread across hundreds of students and respond to this kind of change. But I want to give ourselves a little bit of credit. In preparing this presentation, I went back and looked a little bit and I thought, well have we added new things in the last 10 years? And we've added about 9 or 10 different courses for undergraduates that I think are accurately reflective of trends going on in computing. So we're not speedy like a four person startup, but we can do this.
And there are some things we shouldn't change. There are things that were taught in computer science courses the year I was born that are still taught and will be taught the year I die. The difference of what happens when you run a logarithmic algorithm versus a quadratic algorithm, this matters. The idea of caching, this matters. The idea of managing resources matters. These things are not going to change and it's nice to see those common threads last through the decades and I think that's a good thing.
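(Editor's illustration, not from the talk: a minimal Python sketch of why logarithmic versus quadratic matters. The function names and the choice of binary search and an all-pairs comparison loop are our own assumptions, picked only as familiar examples of each growth rate.)

```python
def binary_search_steps(sorted_items, target):
    """Count the probes binary search makes (logarithmic in the input size)."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


def all_pairs_steps(items):
    """Count the comparisons a naive all-pairs loop makes (quadratic)."""
    steps = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            steps += 1
    return steps


if __name__ == "__main__":
    for n in (1_000, 2_000, 4_000):
        data = list(range(n))
        # Searching for a missing value forces binary search to run to the end.
        print(n, binary_search_steps(data, -1), all_pairs_steps(data))
```

Doubling the input adds about one probe to the binary search but roughly quadruples the all-pairs comparison count, which is the gap the speaker is pointing at.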
I should also point out that I used grep this morning. You might have to. And believe it or not, we still teach our students how to use grep because it's useful to them and that's OK too. All right, so now into the core of it. I do curriculum design. I'm probably the only one in this room who does curriculum design. I do it with a lot of colleagues. It is an exercise in an online system evolution. It is a legacy system. It turns out you don't get to flush your pipeline of students, redesign a new system, and then re-enroll freshmen. So there is a gradual process involved. I'm not going to show you that, but I will show you kind of where we are.
So here is your roadmap. I'm dramatically over simplifying of course. We have introductory programming. We have the core where I want to mostly focus, because I think that's the most interesting story to tell. You have a bunch of senior level courses. I'm mostly going to skip a really important thing, but it just doesn't fit in the message I'm trying to make today, which are senior capstone design courses. Those are project-based courses where they solve a real problem. It synthesizes ideas from multiple courses. Really important, but there hasn't been the kind of change I can boil down to a PowerPoint slide in those, so we'll kind of skip that part of the story.
So I'm just going to show you what these courses look like. If you did do an undergraduate degree particularly in the US, you probably had 13, 14, 15 week terms and you had two terms a year. We have three 10 week terms. So that kind of throws off your mental mapping so I wanted to make sure to point that out. Hundred-level introductory programming, these students have never programmed before. Or maybe they have, but we assume they haven't. They show up in what we call CSE142, 1,000 of them at a time or so. And it turns out that if you've never programmed before, and you're basically a college freshman, in about 10 weeks we can teach you variables, conditionals, loops, arrays, methods, a little bit of I/O, and a little bit of objects.
OK, you've got to start somewhere. It's a good place to start. 3,000 students a year, 7 out of 8 will not end up with a bachelor's degree in computer science. About half of them will go on to our next course. This is when you get into some juicier stuff, recursion, linked lists, binary search trees, real object oriented programming at least at a second course level, interface versus implementation, abstraction, things like that. And now about four out of five of them will not end up with a four year degree in computer science. So to me this is kind of the starting point for when students get into what I consider the focused study of computer science.
And that brings us to the core. To me the core is whatever connects what we just finished to being able to take an advanced, senior level course in a particular area. There is a gap to navigate, and you want to navigate it as efficiently as possible. Don't freak out, I can explain this entire picture in five minutes and I'm basically going to. Blue courses are required, purple are not but most students take them. The yellow is a hardware track that most students don't, and I won't have any more to say about it but I wanted to make sure you saw that it was there. And that we have 10% of our students or so do that, and they become fantastic computer engineers.
The top row is roughly software. And we're going to go through it bit by bit. But these are high level software courses. The middle row, the word foundations is for political reasons. We didn't want to call it theory. And it's not theory, we try to be very applied. But from your perspective it's theory.
And then that third row is your systems level. Things much more like the previous talk. C, assembly, interface to hardware, what's really going on underneath that Java you wrote in your introductory courses. Arrows are pre-reqs, don't worry about the two kinds of arrows. What exactly are we teaching across this set of 8 and 1/2 courses? That first theory course it turns out is Boolean logic, the idea of a set that has elements, proofs, induction, finite state machines, undecidability. If you don't know what that means, if you've heard of the halting problem, that's what I mean. That there are things that computer programs cannot accurately compute, which is very surprising the first time someone tells you that.
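(Editor's illustration, not from the talk: a rough Python sketch of the self-reference trick behind the halting problem. It is not a proof, and every name in it, including `make_contrarian` and the two toy "deciders", is hypothetical. The idea is that for any function claiming to predict whether a program halts, we can build a program that does the opposite of the prediction made about it.)

```python
def make_contrarian(halts):
    """Build a program that does the opposite of what `halts` predicts for it."""
    def contrarian():
        if halts(contrarian):
            while True:      # predicted to halt, so loop forever instead
                pass
        return "halted"      # predicted to loop, so halt immediately instead
    return contrarian


def says_never_halts(program):
    """A toy 'decider' that claims no program ever halts."""
    return False


def says_always_halts(program):
    """A toy 'decider' that claims every program halts."""
    return True


if __name__ == "__main__":
    program = make_contrarian(says_never_halts)
    # The call below finishes, so the "never halts" prediction was wrong.
    print(program())
    # make_contrarian(says_always_halts)() would loop forever, which makes
    # the "always halts" prediction wrong too, so we do not call it here.
```

Whatever candidate `halts` you plug in, the contrarian program built from it is a case the candidate gets wrong, which is the intuition for why no fully correct halting checker can exist.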
Sure this looks like theory to you. I would also point out that it has more arrows coming out of it than any other course in our curriculum. So we think it's important in things we teach later. And when I did that international committee for three years, this was the set of material that changed least in the update from 2001 to 2013. So this is pretty timeless stuff. All right, in case you're like ooh I hate this already. Let's go here.
Bits, binary numbers, assembly, C, pointers, aliasing, caching, malloc, free, and in my favorite part of the course, taking the last week to connect it up to Java. How is it that Java runs on top of this stuff to connect you from the low level software down at the assembly and the interrupts and the system calls, up to the introductory programming in Java? There is a real world out there and we want to give an initial, broad understanding of what it is.
So it turns out we wanted to get away with not teaching make, and bash, and grep, and git and all that. And figured there's no intellectual content, it's important but students will pick it up on their own. The students said no, you need to teach it to us. They were right, we were wrong. So we put it in a one credit, pass-fail course that everyone takes. They get a lot out of it and now they know that stuff.
Software design and implementation. This is an early, you might argue not early enough, introduction to actually designing your software, testing it, programming it in terms of abstractions, writing readable and usable and effective specifications, learning to debug, a little bit of design patterns so when you go to your summer internship, you know what people are talking about, and so on. You still got to understand the difference between linear and quadratic and exponential and logarithmic. You have to deeply understand how trees work, priority queues, hash tables, seven different ways to sort, graphs they're a big deal, nodes, edges, arrows, they're important.
Our innovation is now with 30% fewer data structures. This is where we now say, you know the old model of one thing happens at a time is not the real world. And this believe it or not, I have a whole separate talk about it, is a great place to introduce parallelism. Oh and by the way, there are things that we know computers can solve, we just strongly believe they can't solve in reasonable time. And so there's this whole P versus NP thing, arguably the most interesting open question in theoretical computer science.
So that other theory course, believe it or not, is probability and statistics in computing. So it turns out that there is a place in the world for learning the chances of drawing a blue marble out of a jar, but people don't tend to draw a lot of blue marbles out of jars. You need all that theory, you need to understand Bayesian reasoning. You need to understand statistical independence. You need to understand the difference between a Gaussian distribution and an exponential distribution. But you can learn that thinking about redundant arrays of inexpensive hardware or noisy communication channels or the randomness inherent to quicksort or the chance of a node failing on your network and so on.
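(Editor's illustration, not from the talk: a small Python sketch of the "chance of a node failing" style of question. It assumes each node fails independently with the same probability, which is our simplifying assumption, and the function names and numbers are made up for the example.)

```python
import random


def prob_any_failure(n_nodes, p_fail):
    """Exact chance that at least one of n independent nodes fails."""
    return 1.0 - (1.0 - p_fail) ** n_nodes


def simulate_any_failure(n_nodes, p_fail, trials=20_000, seed=0):
    """Estimate the same chance by simulating many independent trials."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A trial counts as a hit if any single node fails in it.
        if any(rng.random() < p_fail for _ in range(n_nodes)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hypothetical numbers: 100 nodes, each with a 1% chance of failing.
    print(round(prob_any_failure(100, 0.01), 3))
    print(simulate_any_failure(100, 0.01))
```

With these made-up numbers the exact answer is about 0.634: a failure that is rare for any one node becomes more likely than not across a hundred of them, and statistical independence is what justifies multiplying the per-node probabilities.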
And so we kind of brought classic statistics into the computing domain. And this was not my idea, and was really great foresight. A lot of foundational computing has become very statistical, think data mining. And you want this kind of right in the center of your picture. Systems programming, that previous course that led into it, they had never seen bits before, they had never seen C before. So now let's actually do something with it. Let's do some C, let's do some C++, let's do some asynchronous I/O, let's mess with some threads and locks. And while not required, almost all of our students take it.
So SQL is a thing, transactions are a thing, that produces a thing. It's kind of a different thing than anything in any of the other classes. It used to be to learn this stuff you had to take a senior databases course where you also learned how to implement relational database management systems. That's not what most of the students wanted to learn. This is what they wanted to learn. So we split it into two courses and it's been very successful.
And then this is my baby. This is the course that became the MOOC. This is the course I taught to 160 students a few hours ago. Functional programming, static versus dynamic typing, modularity, macros, type inference, this sort of stuff. Not all software looks like Java or C with curly braces and while loops. It's a big world out there, let's expose you to some of it.
So there it is, right? That takes someone who can kind of program with linked lists and recursion but it's hard, and has literally programmed for 20 weeks, and I argue they're now ready to be fairly effective learners to become effective engineers. And from an academic standpoint, this is my view of it. This is the stuff that if you have an undergraduate computer science degree, you have mostly internalized so well that you forgot it took you two years to learn it.
I remind students when I teach these classes that they spent years as a child learning single digit multiplication. Which if you think back on it is shocking that it took more than what? Like a week?
So believe it or not I'm almost done. Because when you get to the 400 level it's a very different roadmap. What you do at your senior level is you have independent courses taught by domain experts, world-leading researchers in these different fields. And loose coupling between the courses is a real strength because it allows innovation without having to redraw the whole architecture. So you know you have a bunch of courses. And they're all great, you have a networks course, you have an AI course, you have a databases course, you have a software engineering course, you have a computer vision course. And students typically take six to eight to ten of these and then they graduate. So nobody takes all of them, there's no time. Nobody takes none of them, we wouldn't let you graduate.
So kind of the only interesting question I have for this is, so which ones are popular? And I actually think that's an important question, especially if you own the teaching schedule. But since none of you do, I think that student interest is a good proxy for where the next generation of professionals think their passion and future interests lie. And I think they're really smart people who read a lot of your blogs and are right in where a lot of the excitement is. But I hasten to add that the data I'm about to show you is influenced by much, much more than that. What you're not seeing is some faculty are more popular than others, some class had to get canceled because a professor left the university, different courses are offered at different sizes and so on. There's a lot of reality I'm sweeping under the rug.
But nonetheless, I find this graph pretty interesting. I took four years of graduates from 2012 to 2015, threw them all into one big bucket, and then said what percentage of them took a particular course. Sorted them on the x-axis, y-axis is what percentage. So the most popular course was networks, it turns out the least popular is not complexity theory, I cut off the really heavy tail. But there are only a few more and it ranges from kind of 70-ish% to 12-ish%. And what I would argue is this is actually really broad and really fascinating. There is no course that hit 75%. There are only four courses that hit 50%. It's a broad field, it's getting bigger. I would be more than happy to add a fifth year to the degree, but that's not the way the world works.
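(Editor's illustration, not from the talk: a minimal Python sketch of the kind of tally described here, pooling graduates and computing what percentage took each course. The function name, the data layout, and the four toy records are all hypothetical and are not the speaker's data or tooling.)

```python
from collections import Counter


def course_popularity(transcripts):
    """Return (course, percent of graduates who took it), most popular first.

    `transcripts` maps a graduate id to the collection of courses they took.
    """
    total = len(transcripts)
    if total == 0:
        return []
    counts = Counter()
    for courses in transcripts.values():
        counts.update(set(courses))  # count each course once per graduate
    rows = [(course, 100.0 * n / total) for course, n in counts.items()]
    # Sort by percentage descending, then by name so ties are stable.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


if __name__ == "__main__":
    # Made-up records for four imaginary graduates.
    toy = {
        "grad1": {"networks", "ai"},
        "grad2": {"networks"},
        "grad3": {"networks", "ai"},
        "grad4": {"graphics"},
    }
    for course, pct in course_popularity(toy):
        print(f"{course}: {pct:.0f}%")
```

Pooling several graduating classes into one dictionary before calling the function corresponds to the "one big bucket" step in the talk.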
So we graduate people knowing a small but decent subset of this sort of stuff. I can also look at trends, I didn't do this quantitatively but I know what's going on. I look at a lot of spreadsheets and run a lot of database queries. The first trend I've already mentioned is choices, choices, choices. People are broadening, people have different passions, I want to make room for that. There is not a one size fits all computer science graduate any more and I don't want there to be. Machine learning and artificial intelligence are way up through the roof. In the previous slide, machine learning clocked in at 45%. If I redid it for this year it would be 65%. This is not surprising.
Security remains bottlenecked. It should be higher on that graph, but I can only offer it two out of every three quarters, and we need that third one. And when we add it, it will go up. Operating systems and networks are a little bit down. Don't worry, they're still very popular just the broadening out is happening. The biggest drops are in software engineering and graphics. I don't think that's because these are bad courses or bad subdisciplines. I think it's because it's a zero sum game and if something goes way up, a bunch of things have to go down.
So what to make of this? I don't know.
But this is our future. This is your future. So I think it's worth seeing where, at least our current students or very recent students, think things are going. So a few parting thoughts. So is it working? I don't know, define working. Run a study for me. I can't give you a proof. But I try to talk to alumni a lot and they sure seem like happy, successful folks that are changing the world. Arguably more than I ever did. So I'm very happy and we do have formal exit surveys and five year later surveys and we do really, really great on those. When I sit around at a happy hour with some alums, I will ask them what should we have done differently. And they mostly heap praise on things.
I will point out before you fill out your card, this is known as survivorship bias, people who come to alumni happy hours tend to be more successful than the average. This is confirmation bias, I am hearing what I want to hear. And this is social acceptability, it turns out when someone asks, so do you think I do a good job at my job, people usually say yes. But I'm trying to find out and that's what I got. All right, I think this is an important and really interesting point. Some of you never did a program like this, whether now or 20 years ago. And I think that's great. I would never suggest and I have not said that an undergraduate degree in computer science is necessary to be successful in software, because it's not. We have plenty of proof walking around that it's not.
Nonetheless, I think academia has a very important role to play in our society in bringing people into computer science. For those of you who are self-taught, and truly self-taught, I will point out that first of all, you I believe are the minority depending on how you measure and how you count it. And second of all, I think if you look back, you'll find a really interesting story about how you gained and then spent the social capital to figure out what it is you wanted to learn to be where you are today. And I think there are a lot of kids in our society who don't have that social capital, and maybe have just enough to figure out how to go to the University of Washington and sign up for the first computer science course. And suddenly they find their home and the curriculum can take them from there.