8. Large Language Models and ChatGPT

This chapter gets into the details of how Large Language Models actually work. It builds on several previous chapters (NLP, Neural Networks, and Reinforcement Learning).

You may be surprised by how these models actually process text and learn!

We start at a high level and then go through a transformer model (the AI architecture that LLMs are based on) one component at a time.

Because we don't get into the math or the code, you won't be able to go off and build your own model after this chapter, but you will understand each of the components and have a sense of how these models work and what they are doing. As we cover in this chapter, understanding how they work is one thing; understanding why this approach works, and why it works best (for now), is far less intuitive.