5 - Part 2 : Representing Words and your first Machine Learning algorithm
In this video we start with the simplest way computers represent language: bag of words. It essentially just counts the number of times each word appears, which gives us a big list of numbers. Even with that, we can already do some clever things.
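To make the counting idea concrete, here is a minimal sketch in Python. The three example sentences, the `bag_of_words` helper and the choice to split on whitespace and lowercase everything are all assumptions made for illustration, not something taken from the video.

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Turn a text into a list of word counts, one number per vocabulary word."""
    counts = Counter(text.lower().split())
    # Counter returns 0 for words that never appear in the text.
    return [counts[word] for word in vocabulary]

# Hypothetical example texts.
docs = ["win money now", "meeting at noon", "win win prize"]

# The vocabulary is every distinct word seen, in a fixed (sorted) order,
# so that position i always refers to the same word.
vocabulary = sorted({word for doc in docs for word in doc.lower().split()})
vectors = [bag_of_words(doc, vocabulary) for doc in docs]

print(vocabulary)
for doc, vector in zip(docs, vectors):
    print(vector, "<-", doc)
```

Each text becomes a list of the same length, and word order is thrown away: only the counts survive, which is why it is called a "bag" of words.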
We introduce the idea of similar lists of numbers, and of the distance between lists. Using that, we can start to apply machine learning algorithms to practical tasks, beginning with the classic problem of deciding whether an email is spam or not. With just our bag of words approach and a simple machine learning algorithm called nearest neighbours, we can show an actual AI system that starts to detect spam.
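As a rough illustration of how distance and nearest neighbours fit together, the sketch below labels a new email with the label of the closest training email. The four training emails are made up, and the use of Euclidean distance and of a single nearest neighbour are assumptions; the video may use a different distance or more neighbours.

```python
import math
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word appears in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length lists of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical labelled emails, invented for this example.
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting moved to monday", "not spam"),
    ("lunch on monday", "not spam"),
]

vocabulary = sorted({w for text, _ in training for w in text.lower().split()})
train_vectors = [(bag_of_words(text, vocabulary), label) for text, label in training]

def classify(text):
    """Return the label of the training email closest to the given text."""
    query = bag_of_words(text, vocabulary)  # words outside the vocabulary are ignored
    _, label = min(train_vectors, key=lambda pair: euclidean_distance(query, pair[0]))
    return label

print(classify("claim your free money"))
print(classify("monday meeting agenda"))
```

There is no training step here beyond storing the examples: all of the work happens at prediction time, when the new email is compared against every stored one.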