
Deep learning

Introduction

Deep learning is machine learning with neural networks whose many stacked layers form a “deep” pipeline. The shift from the “classical” machine learning seen in the previous chapter isn’t just that the models are larger. In classical ML, we choose the model family and select the features ourselves; the model learns from training data, but it’s still heavily constrained by our model and feature choices. In deep learning, we still choose an architecture that matches the shape of the data, but the model discovers many of the intermediate features on its own.
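To make that concrete, here is a minimal sketch of a two-layer feedforward network in NumPy. The layer sizes and random weights are made up for illustration: the point is that the first layer produces intermediate features that training would shape, rather than features we engineer by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # A common nonlinearity: zero out negative values.
    return np.maximum(0.0, x)

# 4 raw inputs -> 8 intermediate features -> 1 output.
# In practice these weights would be learned, not random.
W1 = rng.normal(size=(4, 8))
b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1))
b2 = np.zeros(1)

def forward(x):
    hidden = relu(x @ W1 + b1)  # learned representation, not hand-engineered
    return hidden @ W2 + b2     # prediction built from those features

x = rng.normal(size=(3, 4))     # a batch of 3 examples
print(forward(x).shape)         # (3, 1): one prediction per example
```

With more layers, each one builds features out of the previous layer’s features, which is what “deep” refers to.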

Learned features are most useful on data that don’t reduce neatly to regular tables. Images, audio, and language are high-dimensional, messy, and full of structure that’s hard to encode by hand. Deep models can uncover patterns that would be painful or impossible to engineer manually. Doing so unlocks amazing new functionality, but at the cost of more training data, more compute, more careful training, and less interpretability than the models we’ve seen already.

This chapter treats deep learning as the next step in the machine learning story. We start with neurons and feedforward networks, then move to representation learning, backpropagation, training, and the main architecture families. From there we look at generative models, reinforcement learning, and the practical question of when deep learning is worth the cost. The next chapter builds on this foundation to cover large language models (LLMs) and modern AI systems.
