How to learn computer science

Imposter syndrome is a familiar feeling for many self-taught developers. You’re doing good work but you can’t shake off the doubt that you’re missing some basic background knowledge that your colleagues all seem to share. You’re not wrong! What you’re missing is all the computer science fundamentals they picked up at university.

Some will argue that you don’t need to know computer science to be a good developer. That’s true, to an extent. You can do good work and go quite far without knowing what the computer’s really doing. Sooner or later, though, you’ll hit a point where your lack of understanding lets you down. You do need to know computer science if you want to be a great developer who can take on the tough stuff. Do you want interesting work that’s intellectually and financially rewarding? You need to know computer science.

Some of us don’t have the luxury of spending four years studying full-time under the tutelage of global experts. How should we proceed? There’s an extraordinary abundance of free textbooks, lectures and courses available online. That abundance makes it hard to know where to begin, how to select the best resources and how to focus on the most relevant topics.

That’s where this post comes in. It’s a self-directed guide to the most relevant computer science for turbo-boosting your career. It’s based on the path I took when I, a self-taught developer, realised that I didn’t know enough computer science. Naturally, it’s not going to match the depth or rigour of a full-time degree. But we don’t need to go that far to get a lot of value. This is a guide to rid you of your imposter syndrome.

<self-promotion>

The Computer Science Book is a complete introduction to computer science in one book. I wrote it specifically for people like you! It’s the only book I know that gives a well-rounded overview of essential computer science in a single volume.

Many of the suggested resources in this post are big, dense textbooks aimed at computer science students. Reading them all cover-to-cover is not necessarily an efficient, enjoyable or practical method of study. The Computer Science Book features ten concise chapters, one for each of the core topics of a standard computer science degree. Each chapter gives you all the important, practical information you need to know and provides pointers for further, independent study.

I think it’s the perfect starting point. You can download the first chapter for free from the website.

</self-promotion>

Core knowledge (6 months)

These three textbooks are my absolute favourites. In my opinion they’re essential reading for any developer.

We begin with Code by Charles Petzold. If you’ve ever thought to yourself “but how did the computer actually know that?” then you need to read this book. Code explains from first principles how digital computers work in a gentle, didactic manner. The first few chapters might be a little slow for someone with some programming experience but it quickly romps through binary and hexadecimal number systems, digital logic circuits, assembly and on into memory and processors.

Code does the foundational work of building your mental model of computing. I still remember the revelation I had reading it during a long coach journey when I realised that a computer is nothing more than lots and lots of dumb components working together. It doesn’t “know” anything at all! It’s only doing what its structure compels it to do.

Next up is nand2tetris. It’s available as a free online course or as a textbook called The Elements of Computer Systems. In this book we take everything we learned in Code and write our own implementation. That’s right – we start off designing logic gates and gradually work our way up to writing a complete computer system including the processor, assembler, virtual machine, compiler and operating system. Though challenging, it’s hard to overstate the satisfaction you get from really understanding how code is executed right down to the level of individual logic gates.
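To give a flavour of that bottom-up approach, here is a minimal sketch in Python. The course itself uses its own hardware description language, so this is only an illustration, and every function name here is my own invention. It starts from a single NAND gate, derives the other gates from it and works up to adding two small numbers.

```python
def nand(a: int, b: int) -> int:
    """The only primitive gate: outputs 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


# Every other gate below is built purely out of nand().
def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    return and_(or_(a, b), nand(a, b))


def half_adder(a: int, b: int) -> tuple[int, int]:
    """Add two bits, returning (sum, carry)."""
    return xor(a, b), and_(a, b)


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three bits, returning (sum, carry)."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, or_(c1, c2)


def add_bits(x: int, y: int, width: int = 4) -> int:
    """Ripple-carry addition of two unsigned numbers, one bit at a time.

    Any carry out of the top bit is discarded, as in fixed-width hardware.
    """
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result


print(add_bits(5, 6))  # 11
```

Nothing in there “knows” how to add. The addition falls out of how the gates are wired together.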

The book itself is rather slim, which is why I recommend beginning with Code to make sure you have a good grasp of the concepts. The real value is in the projects so be sure to take the time to do them. It’s a popular course so you’ll have no trouble finding help online if you get stuck. I found that the difficulty ratcheted up quite a bit after project six (implementing the assembler). If you’re not yet super confident in your programming abilities you might want to come back to the virtual machine and compiler later.

Finally we have Structure and Interpretation of Computer Programs (SICP). It’s something of a programmer’s Bible. It uses a minimalist dialect of Lisp known as Scheme. Because Scheme is so minimalist it doesn’t really do that much but it’s very easy to extend. SICP teaches programming by using Scheme to add more and more complex functionality to Scheme itself. So, for example, you learn about state management by adding variables and assignment to Scheme. Ever wondered how variables actually work? Now you’ll know. By the end you’ll have written a Scheme interpreter, a processor emulator and a compiler targeting your emulator, all in Scheme.
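To show roughly what “adding variables and assignment” to a language means, here is a tiny evaluator sketch. It is written in Python rather than Scheme and is not SICP’s own code: programs are nested Python lists standing in for Scheme expressions, and the Env class and evaluate function are names I made up for illustration.

```python
import operator


class Env:
    """A frame of variable bindings with a link to the enclosing frame."""

    def __init__(self, parent=None):
        self.vars = {}
        self.parent = parent

    def find(self, name):
        """Return the innermost frame in which `name` is bound."""
        env = self
        while env is not None:
            if name in env.vars:
                return env
            env = env.parent
        raise NameError(name)


def evaluate(expr, env):
    if isinstance(expr, str):            # a variable reference
        return env.find(expr).vars[expr]
    if not isinstance(expr, list):       # a literal such as a number
        return expr
    head, *args = expr
    if head == "define":                 # create a binding in the current frame
        name, value_expr = args
        env.vars[name] = evaluate(value_expr, env)
        return None
    if head == "set!":                   # assign to an existing binding
        name, value_expr = args
        env.find(name).vars[name] = evaluate(value_expr, env)
        return None
    if head == "begin":                  # evaluate in order, return the last value
        result = None
        for sub_expr in args:
            result = evaluate(sub_expr, env)
        return result
    if head == "lambda":                 # a procedure remembers where it was defined
        params, body = args

        def procedure(*values):
            local = Env(parent=env)
            local.vars.update(zip(params, values))
            return evaluate(body, local)

        return procedure
    func = evaluate(head, env)           # anything else is a procedure call
    return func(*[evaluate(arg, env) for arg in args])


global_env = Env()
global_env.vars["+"] = operator.add

# A counter whose state lives in the frame created by calling make-counter.
evaluate(
    ["define", "make-counter",
     ["lambda", [],
      ["begin",
       ["define", "count", 0],
       ["lambda", [],
        ["begin", ["set!", "count", ["+", "count", 1]], "count"]]]]],
    global_env,
)
evaluate(["define", "counter", ["make-counter"]], global_env)
print(evaluate(["counter"], global_env))  # 1
print(evaluate(["counter"], global_env))  # 2
```

A variable turns out to be nothing more than an entry in a chain of lookup tables, and assignment just changes the entry in whichever table already holds the name.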

Working through the exercises is critical to your understanding. The first couple of chapters start from absolute basics. I found their exercises really useful to solidify my understanding of basic programming concepts at a higher abstraction than just “this is how I do things in JavaScript/Ruby”. The later chapters can be very mind-bending and the exercises very challenging! There are plenty of discussions and solutions for SICP exercises online but don’t feel bad if you don’t make it all the way to the end. I doubt many do on their first reading – I intend to go through the interpreter and compiler chapters again at some point.

With these three books you’ll have deepened your understanding of computer architecture, programming, interpreters/compilers and programming languages. You’ll be able to see the deeper patterns in the code you write and the computer itself will be less of a black box.

Broadening and deepening expertise (1 - 2 years)

The next step is to go deeper into the other important areas of computer science. The challenge here is balancing breadth with depth. It’s not possible to have a deep understanding of the whole field. You need to decide for yourself how much time you can dedicate to self-study and which areas are priorities for you but I recommend getting at least some exposure to every topic mentioned below.

The topics below are arranged in the same order as in The Computer Science Book. Each one builds on the concepts introduced in the preceding topics. Obviously the dependencies are more complex than a simple linear progression so please don’t feel constrained by the order presented here – let your curiosity guide you.

Some resources cover familiar ground (e.g. computer architecture). Whereas the books above gave descriptions of “toy” systems and programs designed to be easy to teach, these resources are much more in-depth and more reflective of how things really work.

Theory of computation explores the mathematical and logical underpinnings of computer science. Some dismiss it as pointlessly abstract but the concepts pop up more often than you’d think and it’s fascinating in its own right. I’m a fan of the Great Ideas in Theoretical Computer Science course at Carnegie Mellon (the lectures are available as a YouTube playlist) because it covers a wide range of topics and doesn’t linger too much on mathematical proofs. You’ll see why Alan Turing’s work is such a big deal and come to understand the capabilities and limitations of your computer. Look out for finite state machines and how they can elegantly model certain systems (e.g. a user sign-up flow). The Annotated Turing by our mate Petzold (of Code fame) is a really enjoyable companion to Turing’s landmark paper. It provides all the necessary intellectual background.
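As a rough illustration of the sign-up example, here is a finite state machine sketched in Python. The states and events are hypothetical ones I picked for the sketch. The point is that the whole flow lives in one transition table and anything not in the table is rejected.

```python
# Hypothetical sign-up flow: (current state, event) -> next state.
TRANSITIONS = {
    ("start", "submit_email"): "awaiting_verification",
    ("awaiting_verification", "click_link"): "verified",
    ("awaiting_verification", "link_expired"): "start",
    ("verified", "set_password"): "active",
}


def step(state: str, event: str) -> str:
    """Apply one event, refusing any transition the table does not list."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in state {state!r}") from None


def run(events, state: str = "start") -> str:
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state


print(run(["submit_email", "click_link", "set_password"]))  # active
```

Because every legal move is written down in one place, a user can’t set a password before verifying their email: that pair simply isn’t in the table.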

Algorithms and data structures are pretty much core computer science. Their importance is perhaps slightly exaggerated by certain companies' reliance on them for whiteboard interviews. I’ve waited until now to cover them because they’re arguably less relevant to day-to-day web development. As you take on increasingly challenging problems, however, it becomes more and more important to have a solid grasp of the tools available to you and their tradeoffs.

A commonly recommended textbook is The Algorithm Design Manual by Skiena. I have to admit I didn’t really get on with it that well. It’s described as being “reader-friendly” and is definitely better than other, more proof-heavy textbooks, but it’s still hard going. Algorithms by Erickson is more of a streamlined introduction. Open Data Structures is a good overview of basically every data structure you’re likely to come across. It’s available with code examples in a few programming languages too.

That said, in my opinion the best way to get a handle on algorithms and data structures is by implementing them yourself. By all means read the books I’ve suggested but all that reading will be worthless without practice. Harvard’s CS50 MOOC is a great starting point – lots of instruction material and you implement some common sorting algorithms and data structures in C.
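To show the kind of exercise I mean, here is a merge sort sketch. CS50 has you work in C, so this Python version is only an illustration of the algorithm and not the course’s code.

```python
def merge_sort(items):
    """Return a new sorted list: split in half, sort each half, merge them."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge: repeatedly take the smaller front element of the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= keeps equal elements in their original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two still has elements
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

Writing even this much by hand forces you to think about the base case, the merge step and why the running time is proportional to n log n.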

For practice you can grind through problems on Leetcode but they’re a bit unimaginative. My main suggestion is therefore a bit left-field. Once you’ve got a few solutions under your belt, try to get access to Google’s Foobar program. If you have a search history full of programming terms, searching for “python list comprehension” should show you an invitation. Treat each challenge as a learning opportunity: if you can’t solve it at first, find a working solution online, read up on the related terms you find (e.g. dynamic programming, Bellman-Ford algorithm) and reimplement the solution yourself. You’ll gain a much better intuition for the concepts this way. I’m also a fan of Mazes for Programmers. It teaches sophisticated graph algorithms through building beautiful mazes!
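As an example of the “reimplement it yourself” step, here is a sketch of the Bellman-Ford shortest-path algorithm. The graph representation (vertices numbered from zero, edges as (from, to, weight) tuples) is my own choice for the sketch and not taken from any particular challenge.

```python
def bellman_ford(num_vertices, edges, source):
    """Shortest distances from `source` to every vertex.

    `edges` is a list of (u, v, weight) tuples. Negative weights are allowed.
    Raises ValueError if a negative-weight cycle is reachable from the source.
    """
    inf = float("inf")
    dist = [inf] * num_vertices
    dist[source] = 0

    # A shortest path uses at most num_vertices - 1 edges, so that many
    # rounds of relaxing every edge is enough.
    for _ in range(num_vertices - 1):
        updated = False
        for u, v, weight in edges:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                updated = True
        if not updated:  # nothing changed, so the distances are final
            break

    # If any edge can still be relaxed, there is a negative cycle.
    for u, v, weight in edges:
        if dist[u] + weight < dist[v]:
            raise ValueError("negative-weight cycle reachable from the source")
    return dist


edges = [(0, 1, 4), (0, 2, 1), (2, 1, -2), (1, 3, 3)]
print(bellman_ford(4, edges, source=0))  # [0, -1, 1, 2]
```

Each round reuses the best distances found so far, which is the dynamic programming idea in miniature.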

The computer architecture we learned in nand2tetris is very simplistic. To learn how computers are actually implemented I highly recommend Computer Systems: A Programmer’s Perspective. It describes the architecture of computing systems on a much deeper level than nand2tetris and even includes useful chapters on concurrency and networking. As the name suggests, it’s focused on what programmers need to know about computer architecture – perfect for us! After reading this you’ll understand things like instruction pipelining and the memory hierarchy.

Another popular textbook is Computer Organization and Design. It’s definitely more focused on the hardware implementation details – it includes plenty of circuit designs – but I particularly appreciated its in-depth description of how a modern, superscalar, out-of-order processor works. I read this in the aftermath of the Meltdown vulnerability so that I could understand how Meltdown actually worked.

For operating systems I recommend Operating Systems: Three Easy Pieces (OSTEP), a wonderful, freely available resource. The “easy pieces” are the three core concepts of an operating system: virtualisation, concurrency and persistence. You’ll cover Linux processes, virtual memory and file systems. This is essential reading if your understanding of processes is limited to occasionally wielding kill -9. Since concurrency is one of the three pieces, between this and Computer Systems you’ll have that topic covered.
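If kill -9 has always been a bit of a magic incantation, this small sketch shows what it does from the parent process’s point of view. It assumes a POSIX system such as Linux or macOS (SIGKILL doesn’t exist on Windows), and the sleeping child is just a stand-in for a stuck program.

```python
import signal
import subprocess
import sys

# Start a child process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Signal number 9 is SIGKILL. The kernel ends the process immediately and
# the process gets no chance to catch the signal or clean up.
child.send_signal(signal.SIGKILL)

# On POSIX, subprocess reports "killed by signal N" as a return code of -N.
exit_code = child.wait(timeout=10)
print(exit_code)  # -9
```

OSTEP explains what is going on underneath: what a process actually is, how the kernel keeps track of it and how signals are delivered.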

There are two ways of looking at computer networking. Application developers need to understand how requests get from the client to the server and back again. Network administrators need to understand how to put together the hardware and software infrastructure that makes that communication possible. My favourite resource for developers is High Performance Browser Networking. In a short space it covers the most important Internet protocols (IP, TCP, HTTP) and includes lots of practical advice on how to write websites and servers to be performant. This is basically all required knowledge for web developers.
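To make the layering concrete, here is a sketch that speaks HTTP by hand over a plain TCP socket. So that it runs without Internet access it starts a throwaway server on the local machine using Python’s standard library. The Hello handler and the fetch_raw helper are names I made up for the example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Hello(BaseHTTPRequestHandler):
    """A throwaway handler that answers every GET with the text 'hello'."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example's output quiet


def fetch_raw(host: str, port: int) -> bytes:
    """Open a TCP connection, send an HTTP request as plain bytes, read the reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
        chunks = []
        while data := sock.recv(4096):  # an empty read means the server closed
            chunks.append(data)
    return b"".join(chunks)


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), Hello)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

raw = fetch_raw(host, port)
server.shutdown()

headers, _, body = raw.partition(b"\r\n\r\n")
status_line = headers.split(b"\r\n", 1)[0]
print(status_line.decode())
print(body.decode())  # hello
```

An HTTP request is just text sent over a TCP connection, and the response is a status line, some headers, a blank line and then the body.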

Computer Networking: A Top-Down Approach is the textbook you’ll need for a deeper understanding. “Top down” here means that it begins by looking at networking from the point of view of the application developer and works its way down the network stack. Read until you lose interest. If you make it all the way to the end you might be interested in networks from the perspective of a network administrator. Should you wish to go this far, the de facto standard is Cisco’s CCNA certification. There is a wealth of free study materials online that you can use without actually signing up to get the certificate.

A good working knowledge of databases is essential nowadays but good resources are a little tricky to find. All of the textbooks I’ve read are really long and dense. I don’t think they’re a good starting point. I enjoyed a MOOC offered by Stanford’s Jennifer Widom. The MOOC has now been replaced with a complete programme of introductory database courses. I recommend doing the “data models”, “querying relational databases” and “database design” modules. If you’re still interested you can do the advanced topics. Architecture of a Database System is a short and readable overview of – you guessed it – database system architectures. It’s important that you have at least some idea of how your database engine works under the hood so that you can use it more effectively and avoid common performance pitfalls. Use the Index, Luke teaches developers what they need to know about database indexes.
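Here is a small sketch of the kind of performance pitfall I mean, using the SQLite engine that ships with Python. The table and index names are hypothetical, and the exact wording of the query plan varies between SQLite versions, so I haven’t shown the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(sql: str) -> str:
    """Ask SQLite how it intends to execute a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column is the description


LOOKUP = "SELECT name FROM users WHERE email = 'user42@example.com'"

# Without an index the engine has to scan the whole table.
plan_before = query_plan(LOOKUP)

# With an index on the filtered column it can search the index instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(LOOKUP)

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should describe a scan of users and the second a search using idx_users_email. On a thousand rows you won’t notice the difference. On a hundred million you will.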

Moving beyond strictly databases, Designing Data-Intensive Applications focuses more on distributed data systems but includes sections on data querying (including SQL), storage and retrieval. Friends have raved about this book! Another recommended resource for distributed systems is Distributed systems for fun and profit. I’d argue that you don’t need to go deep on this but knowing at least the CAP theorem is useful.

Next steps

From here you can go for further depth or explore new topics. There are some important “optional” topics that I haven’t covered (e.g. security, cryptography, machine learning). Partly this is because I don’t think they’re quite so fundamental, especially for web developers, and partly because I simply don’t know enough about them to give solid recommendations.

After working through everything above you’ll have a much better understanding of computer science. If you have any lingering imposter syndrome, remember that the learning will never be over. There’ll always be another book to read, a new language to learn or a new problem to solve.

But look at what you’ve done! You’ll have written an interpreter, followed a packet through the network stack and across the Internet, seen how databases efficiently store and query data (including across multiple machines), designed a frickin' processor and much, much more.
