What exactly is Machine Learning?
Machine learning is a really interesting concept! It’s all about teaching computers to find patterns in a bunch of information, and then use those patterns to make predictions or answer questions. The more information we give them, the better they become at making accurate predictions.
Let me give you an example to make it easier to understand. By the way, you’ll be seeing a lot of examples today, so brace yourself. Imagine you have a friend who has never seen different types of fruits before. They don’t know what an apple, banana, or orange looks like.
To help your friend learn, you decide to show them some pictures of fruits and tell them the names of each fruit. Your friend carefully looks at the pictures and tries to remember what each fruit looks like based on the information you provide.
After looking at many pictures and hearing the names of different fruits, your friend starts to notice certain things that are common to each fruit. They realize that apples are usually round and can be red or green, bananas are long and yellow, and oranges are round and orange in color.
Now, your friend can look at a new picture of a fruit and make an educated guess about what it is by comparing it to the patterns they’ve learned. This is similar to how machine learning models work.
In machine learning, we use a special computer program called a machine learning model. Instead of showing pictures of fruits, we give the model lots of examples, like pictures of fruits, and tell it what each example represents (for example, this is an apple, this is a banana).
The model carefully looks at all these examples and tries to find patterns or similarities between them. It learns to recognize features or characteristics that are common to a particular type of fruit. For instance, it might learn that round shape and red or green color are common features of apples.
Once the model has learned these patterns, we can give it a brand new picture of a fruit that it has never seen before. The model can then use the patterns it learned to make a guess about what the fruit is. It might say, “Hmm, this fruit is round and red, so it’s probably an apple!”
Just like your friend gets better at recognizing fruits with more practice, machine learning models can improve their accuracy with more data and training. They can learn to recognize not just fruits, but also other things like animals, objects, or even predict outcomes based on patterns they’ve learned.
So, machine learning models are like super smart friends that learn from examples to recognize patterns and make predictions or classifications based on what they’ve learned.
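To make the “learn from examples, then predict” idea concrete, here is a tiny Python sketch. This is not a real machine learning algorithm, just a toy that memorizes labeled fruit examples and looks up a new fruit by its features; the fruits and features are invented for illustration.

```python
# Toy sketch of "learning from examples": the "model" is just a table of
# feature patterns seen during training. All data here is made up.

training_examples = [
    ({"shape": "round", "color": "red"}, "apple"),
    ({"shape": "round", "color": "green"}, "apple"),
    ({"shape": "long", "color": "yellow"}, "banana"),
    ({"shape": "round", "color": "orange"}, "orange"),
]

def learn(examples):
    """'Training': remember which label goes with each feature pattern."""
    patterns = {}
    for features, label in examples:
        patterns[(features["shape"], features["color"])] = label
    return patterns

def predict(patterns, features):
    """'Prediction': look up the remembered pattern for these features."""
    return patterns.get((features["shape"], features["color"]), "unknown")

model = learn(training_examples)
print(predict(model, {"shape": "round", "color": "red"}))    # apple
print(predict(model, {"shape": "long", "color": "yellow"}))  # banana
```

Real models do something smarter than an exact lookup: they learn which features matter so they can handle examples they have never seen. We will get to that distinction when we talk about overfitting.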
Starting with the most basic model: The Decision Tree
The Decision Tree is one of the most basic models; there are fancier models that can give even more accurate predictions. However, decision trees are great because they are easy to understand, and they serve as the foundation for some of the best models in data science.
Let’s imagine you want to play a game with your friend to guess what kind of animal they’re thinking of. To figure it out, you can ask them a series of questions.
You start by asking, “Does it have four legs?” If your friend says yes, you know the animal could be a dog, cat, or horse. But if your friend says no, then it’s probably a bird, fish, or snake.
Next, you ask, “Does it have fur?” If your friend says yes, you can narrow it down to a dog or cat. But if your friend says no, then it’s probably a horse.
You continue asking questions, such as “Does it bark?” or “Does it have feathers?” Each time you ask a question, you eliminate some possibilities and get closer to the correct answer. Eventually, after asking enough questions, you can determine the exact animal your friend is thinking of.
A decision tree model works in a similar way. Instead of animals, we use data and make decisions based on that data. A decision tree is like a flowchart or a game of 20 questions, where each question is a “node,” and the possible answers are the “branches.” We start with a single node and as we ask more questions, we add more nodes and branches to the tree.
For example, let’s say we have data about people’s height and weight, and we want to predict whether they play basketball or not. The first question in our decision tree could be, “Is the person taller than 6 feet?” If the answer is yes, we go down one branch, and if the answer is no, we go down another branch.
Then, we might ask a second question, like “Is the person’s weight more than 180 pounds?” Depending on the answer, we go down different branches again.
We keep asking more questions, dividing the data into different groups based on the answers, until we reach the bottom of the tree. At the bottom, we have a final answer or prediction for each group.
So, a decision tree model helps us make decisions or predictions by organizing a series of questions and answers in a tree-like structure. It’s kind of like using “if” and “else” statements in programming, where we follow different paths based on the conditions.
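The basketball tree above really is just nested “if” and “else” statements. Here is that exact tree written out in Python; the thresholds (6 feet, 180 pounds) come from the example, but the predictions at each leaf are made up for illustration.

```python
# The basketball decision tree from the text, as plain if/else.
# The leaf predictions are invented purely for illustration.

def predicts_basketball(height_ft, weight_lb):
    if height_ft > 6:                  # root node: "taller than 6 feet?"
        if weight_lb > 180:            # second node on the "yes" branch
            return "plays basketball"  # leaf: prediction for this group
        return "does not play"         # leaf
    return "does not play"             # leaf on the "no" branch

print(predicts_basketball(6.5, 200))  # plays basketball
print(predicts_basketball(5.9, 200))  # does not play
```

The difference between this and a real decision tree model is only who writes the questions: here we hard-coded them, while a learning algorithm picks the questions and thresholds automatically by looking at the training data.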
How do we know if a Decision Tree is bad at prediction?
There’s a thing called Model Validation. Now you’ll be thinking, what is model validation? Lemme explain.
Imagine you are learning how to draw pictures. You have some practice sheets that show examples of different objects like apples, houses, and cars. You want to get better at drawing, so you decide to test your skills.
To check how well you can draw without looking at the practice sheets, you ask a friend to show you new pictures of objects and you try to draw them. Your friend tells you if your drawings are accurate or not.
This process of testing your drawing skills with new pictures and getting feedback from your friend is similar to model validation in machine learning.
In machine learning, we create models that can make predictions or classify things based on patterns they learn from training data. But we need to check if our models can perform well on new, unseen data, just like you wanted to see if your drawing skills extend beyond the practice sheets.
To do this, we set aside some data that the model hasn’t seen before, called the validation set. This validation set contains examples with known outcomes or labels. It’s like the new pictures your friend showed you.
We use the model to make predictions on the validation set and compare those predictions with the actual labels. If the model’s predictions match the actual labels well, it means the model is doing a good job on new data.
The validation process helps us understand how well the model is likely to perform in real-world situations. It gives us a measure of how accurate and reliable the model’s predictions are beyond just the training data.
Validation is important because it tells us if our model is really learning meaningful patterns or if it’s just memorizing the training examples. If the model performs well on both the training data and the validation set, it means it has learned useful patterns and is likely to make good predictions in the future.
So, model validation is like checking your drawing skills with new pictures and getting feedback. It helps us assess how well our machine learning models can perform on new, unseen data and tells us if they are reliable in real-world applications.
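Here is what that checking process looks like in code. This is a minimal sketch: the “model” is a single hand-written rule and the held-out examples are invented, but the mechanics are the real thing: make predictions on data the model never trained on, compare them to the known labels, and report the fraction it got right.

```python
# Sketch of model validation: score predictions against known labels
# on held-out data. The toy model and all numbers are invented.

def model(height_ft):
    # A toy "trained model": predict "tall" for anyone over 6 feet.
    return "tall" if height_ft > 6 else "short"

# Validation set: (input, known correct label) pairs the model
# never saw during training.
validation_set = [
    (6.5, "tall"), (5.9, "short"), (6.1, "tall"),
    (5.5, "short"), (6.2, "short"),  # the model will get this one wrong
]

correct = sum(1 for x, label in validation_set if model(x) == label)
accuracy = correct / len(validation_set)
print(f"Validation accuracy: {accuracy:.0%}")  # Validation accuracy: 80%
```

That single accuracy number is what tells us whether the model learned something general or just memorized its homework.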
Overfitting: Problems with Decision Trees
There’s a reason plain decision trees are rarely used on their own: they have some major challenges.
Imagine you have a friend who loves playing video games. They’re really good at playing a specific game that they’ve practiced a lot. They know all the levels, challenges, and strategies by heart.
Now, your friend wants to impress you by playing a new game that they’ve never seen before. Despite their gaming skills, they struggle to perform well in this new game. They keep losing and making mistakes they wouldn’t normally make.
The reason for their struggle is that they got so used to the rules and patterns of their favorite game that they can’t adapt to the different rules and challenges of the new game. They are overfit to the specific patterns of their favorite game and find it difficult to apply their skills to a different game.
This situation is similar to overfitting in machine learning. When we train a machine learning model using a specific set of data, it learns the patterns and characteristics of that data. However, if the model becomes too focused and specialized on the training data, it may struggle to perform well on new, unseen data.
Just like your friend’s gaming skills didn’t transfer well to the new game, an overfit model becomes too rigid and doesn’t generalize well beyond the specific examples it has seen during training. It becomes too specialized in the training data and may not be able to handle variations or unexpected patterns in new data.
In other words, overfitting occurs when a model fits the training data so closely that it starts to memorize it rather than learning the underlying patterns that would allow it to make accurate predictions on new data. This can result in poor performance when the model encounters new, unseen examples.
To avoid overfitting, it’s important to strike a balance. The model should learn the general patterns from the training data without getting too fixated on specific details. This way, it can make accurate predictions not only on the training data but also on new, unseen data that it hasn’t encountered before.
By ensuring that the model can adapt and generalize beyond the specific examples it has seen, we can improve its ability to handle different scenarios and make reliable predictions in real-world situations.
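You can see overfitting in miniature with two toy “models” in Python. One memorizes the training examples exactly (like your friend who mastered one game), the other learned a simple general rule. Everything here is invented for illustration.

```python
# Sketch of overfitting: a "memorizing" model is perfect on training
# data but helpless on anything new. All numbers are made up.

training_data = {5.8: "short", 6.3: "tall", 5.5: "short", 6.7: "tall"}

def memorizer(height_ft):
    # Overfit model: an exact lookup of the training examples only.
    return training_data.get(height_ft, "no idea")

def general_rule(height_ft):
    # A simpler model that learned the underlying pattern instead.
    return "tall" if height_ft > 6 else "short"

print(memorizer(6.3))     # tall     - seen during training, so it's right
print(memorizer(6.4))     # no idea  - new data, and the memorizer fails
print(general_rule(6.4))  # tall     - the general rule still works
```

A real overfit decision tree fails less obviously than this, with confident wrong answers rather than “no idea”, but the cause is the same: it fit the training examples instead of the pattern behind them.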
Underfitting: The opposite problem
Now imagine you have a friend who is learning to ride a bicycle. They start by using a tricycle, which is stable and easy to control. As they become more confident, they decide to switch to a two-wheeled bicycle.
However, when they start riding the bicycle, they find it challenging to maintain balance and control. They struggle to ride smoothly and may even fall off occasionally. It seems like the bicycle is too difficult for them to handle at this stage.
This situation is similar to underfitting in machine learning. Underfitting occurs when a machine learning model is too simple or lacks the necessary complexity to capture the patterns and relationships in the data.
Just like your friend with the bicycle, an underfit model fails to adequately capture the nuances and details of the data. It is too basic and doesn't have enough capacity to learn the underlying patterns effectively.
When a model underfits, it typically performs poorly not only on the training data but also on new, unseen data. It struggles to make accurate predictions or provide meaningful insights because it fails to capture the complexity and variability of the problem at hand.
To address underfitting, we often need to make the model more complex or use more sophisticated algorithms. By increasing the model's capacity, it becomes better equipped to capture the patterns and relationships in the data, leading to improved performance.
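For contrast with the overfitting sketch, here is underfitting in miniature: a model so simple it ignores the input entirely and always predicts the most common training label. The data is invented for illustration.

```python
# Sketch of underfitting: a model that ignores its input and always
# predicts the majority training label. All numbers are made up.

training_set = [
    (5.5, "short"), (5.8, "short"), (6.3, "tall"),
    (6.7, "tall"), (5.9, "short"),
]

# "Training" the underfit model: just find the majority label.
labels = [label for _, label in training_set]
majority = max(set(labels), key=labels.count)  # "short" (3 out of 5)

def underfit_model(height_ft):
    return majority  # ignores height completely

# It is wrong on every "tall" example, even ones it trained on.
correct = sum(1 for x, label in training_set if underfit_model(x) == label)
print(f"Training accuracy: {correct}/{len(training_set)}")  # 3/5
```

Notice the telltale sign: unlike the overfit memorizer, which aced its training data, the underfit model does badly even on the examples it was trained on, because it lacks the capacity to capture the pattern at all.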
Thank you for joining me as we explored the basics of machine learning today. I hope you found this information helpful and insightful. Get ready for the next part, where we will dive into more advanced models that build on the Decision Tree and explore their potential challenges. There’s much more to come, so stay tuned for the next blog!