The touch of Machine Learning

Machine Learning has been a topic of interest for a long time. Even today, many of us consider Artificial Intelligence an idea that is too good to be true. In reality, we have reached a time where Artificial Intelligence is real and can perform some tasks in ways humans would not have imagined. Computers have come a long way since they were first created.

Most of the tools we have created to date are passive, that is, they can only do what we tell them to do, and nothing more. They perform limited tasks in limited ways and cannot come up with alternatives on their own. A computing tool can be at one of three stages.

chart

Passive : Tools which cannot think for themselves.
Generative : Generative tools are quite intelligent. Most of the tools which use Machine Learning are generative in nature, that is, they can come up with ways to solve a given problem. For example, Airbus recently used Machine Learning to design a new type of partition wall for its planes. The new structure was reportedly stronger than what humans had designed, and close to 50% lighter. There is also a completely autonomous bridge construction project underway in Amsterdam, where generative tools let the computer design a bridge by itself and 3D print it in stainless steel.
Intuitive : This is the ultimate goal of Artificial Intelligence. This is where the computer uses its intuition to decide the further steps. Recently, Google DeepMind created AlphaGo, a neural-network-based program which defeated one of the world's best Go players. Go is considered one of the most strategically complex board games ever devised. During the match, at some points, even the engineers who designed AlphaGo couldn't understand why AlphaGo made a certain move!

What is Machine Learning?

In simple words, Machine learning is the idea that there are generic algorithms that can tell you something interesting about a set of data without you having to write any custom code specific to the problem. Instead of writing code, you feed data to the generic algorithm and it builds its own logic based on the data.

For example, one kind of algorithm is a classification algorithm. It can put data into different groups. The same classification algorithm used to recognize handwritten numbers could also be used to classify emails into spam and not-spam without changing a line of code. It’s the same algorithm but it’s fed different training data so it comes up with different classification logic.

Classification
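To make the "same algorithm, different data" idea concrete, here is a minimal sketch in plain Python. It assumes a very simple classifier (nearest centroid) as a stand-in for whatever algorithm a real system would use. The function names and all of the toy numbers are made up for illustration.

```python
# A toy nearest-centroid classifier: a stand-in for a "generic algorithm".
# The function names and all data below are hypothetical, for illustration only.

def train(samples, labels):
    """Compute the average point (centroid) of each label's samples."""
    grouped = {}
    for sample, label in zip(samples, labels):
        grouped.setdefault(label, []).append(sample)
    return {
        label: [sum(column) / len(points) for column in zip(*points)]
        for label, points in grouped.items()
    }


def classify(centroids, sample):
    """Return the label whose centroid is closest to the sample."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Dataset 1: emails described by (number of links, number of "free" mentions)
email_model = train(
    [(8, 5), (7, 6), (0, 0), (1, 0)],
    ["spam", "spam", "not-spam", "not-spam"],
)

# Dataset 2: animals described by (height in cm, weight in kg)
animal_model = train(
    [(25, 4), (23, 5), (60, 30), (65, 28)],
    ["cat", "cat", "dog", "dog"],
)

# The same two functions now answer two unrelated questions
print(classify(email_model, (6, 4)))
print(classify(animal_model, (24, 4)))
```

Nothing in train or classify mentions emails or animals. Only the training data differs, and that is what produces the different classification logic.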

Styles of Machine Learning

There are three styles of Machine Learning, of which supervised and unsupervised learning are the most common.

Styles

Supervised : We have a properly labelled dataset and we receive feedback on every cycle, so even if a prediction is wrong, we are told so. This style of learning is easier to implement because the labels tell us the right answers.
Unsupervised : Here, we have a dataset which is not labelled. In this style of learning, there is no feedback at all, that is, we are never told whether a prediction is correct or wrong. This is a bit more difficult to implement than supervised learning.
Reinforcement : Here, we can have a dataset of any type. The main difference in this style of learning is the feedback: it arrives as a reward, often only after a whole sequence of decisions. For example, if you set up a bot which plays chess, we would prefer reinforcement learning, that is, the bot is rewarded when it wins the game and learns to favour the moves that led there!

To sum it up,

differences
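The difference between the first two styles comes down to whether the data carries labels. The sketch below is a toy illustration in plain Python, not a real training procedure. The supervised function is told the right answer for every point, while the unsupervised one only sees raw numbers and has to find the two groups itself (here with a simple one-dimensional two-means clustering, which is my choice for the example). All names and numbers are invented.

```python
# Toy contrast between supervised and unsupervised learning on 1-D data.
# All names and values here are hypothetical.

def supervised_threshold(values, labels):
    """Labelled data: place the boundary midway between the two class means."""
    small = [v for v, label in zip(values, labels) if label == "small"]
    large = [v for v, label in zip(values, labels) if label == "large"]
    return (sum(small) / len(small) + sum(large) / len(large)) / 2


def unsupervised_two_means(values, iterations=10):
    """Unlabelled data: repeatedly split the values around two moving centres.

    Assumes the list contains at least two distinct values.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high


weights = [1.0, 1.2, 0.8, 9.0, 10.0, 11.0]

# Supervised: every value comes with the correct answer attached
labels = ["small", "small", "small", "large", "large", "large"]
print(supervised_threshold(weights, labels))

# Unsupervised: no labels, the algorithm discovers the two groups by itself
print(unsupervised_two_means(weights))
```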

Let's code it out!

So now we know the styles of machine learning, but we don't know how to implement them. Let's say that we want to predict the body weight of animals. So, how would you write a program which estimates the weight of an animal's body just by knowing the weight of its brain?

In the traditional programming approach, we would basically write a program with lots of if-elses which compare the weight of the brain with certain parameters and come up with a result. In this case, there would be a practically endless number of if-elses because the number of animal species is way too high, and an enormous amount of data would have to be manually fed into the system for it to work.

Our traditional approach would look something like this :

def calculate_weight_of_body(brain_weight, animal_name):
    # One hand-written rule per species; every number is a placeholder
    if animal_name == "animal_one":
        if brain_weight < 1.0:
            return 10.0
    elif animal_name == "animal_two":
        if brain_weight < 2.0:
            return 20.0
    return None  # no rule matched this animal

...and so on.

Instead, we follow the Machine Learning approach to solve this problem. Let's consider that we have a dataset which looks like this:

Weight of Brain	Weight of Body
3.385	44.50
0.480	15.50
1.04	5.50

...and so on. Also, keep in mind that since our data is labelled, we are following the supervised learning style.

(Actual dataset included in the code repository. Check the code section for links).

So now let's get into the code. This time, we will use the scikit-learn library to perform Linear Regression. Don't worry if you don't understand what it is right now, because I'll be explaining more about it in the upcoming posts! We have three main dependencies :

Pandas : We'll be using this library for quickly loading our data from the dataset file.
scikit-learn : For performing Linear Regression
Matplotlib : For visualizing our predictions

So let's get into it:

# Import all the dependencies
import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt

# Read the fixed-width data file into a DataFrame
data = pd.read_fwf('brain_body.txt')
# Double brackets keep each selection 2-D, the shape scikit-learn expects
x_values = data[['Brain']]
y_values = data[['Body']]

# Train model using Linear Regression
body_prediction = linear_model.LinearRegression()
body_prediction.fit(x_values, y_values)

# Visualize results: the data points and the fitted line
plt.scatter(x_values, y_values)
plt.plot(x_values, body_prediction.predict(x_values))
plt.show()

And there you are! You've just taken your first step into Machine Learning. But right now, you're probably curious about how it really works.

How does this work?

We use Linear Regression to trace a simple line which is the best fit for all our data points. Consider the x-axis to be the weight of the brain and the y-axis to be the weight of the body. Now, we plot all the points from our dataset on a Cartesian plane.

Linear Regression helps us find the relation between the points by tracing a line with the best fit for our data. The equation of our line is

y = mx + b

where b is the y-intercept and m is the slope of the line. Using this line, we have an established relation between the brain weight and the body weight!

The graph of our points and our traced line would look like this:

linear_regression
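For the curious, the line of best fit can also be computed by hand. The sketch below uses the standard closed-form least-squares formulas for a single input, m = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2) and b = mean_y - m * mean_x, on the three sample rows from the table above. It is plain Python meant to show the idea, not a description of how scikit-learn works internally, and the function names are made up for this example.

```python
# Least-squares fit of y = m*x + b in plain Python.
# fit_line and predict are hypothetical helper names for this sketch.

def fit_line(xs, ys):
    """Return slope m and intercept b minimising the squared vertical error."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    m = covariance / variance  # assumes the x values are not all identical
    b = mean_y - m * mean_x
    return m, b


def predict(m, b, x):
    """Estimate y for a new x using the fitted line."""
    return m * x + b


# The three sample rows from the table above (brain weight, body weight)
brain = [3.385, 0.480, 1.04]
body = [44.50, 15.50, 5.50]

m, b = fit_line(brain, body)
print("slope:", m, "intercept:", b)
print("predicted body weight for brain weight 2.0:", predict(m, b, 2.0))
```

With scikit-learn, the fitted slope and intercept are available on the trained model as body_prediction.coef_ and body_prediction.intercept_.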

Is this magic?

No, this is not magic! Once you start seeing how easily machine learning techniques can be applied to problems that seem really hard (like handwriting recognition), you start to get the feeling that you could use machine learning to solve any problem as long as you have enough data. Keep in mind, though, that it only works when the data actually contains the relation you are trying to learn.

Summing it up

To sum it up, we now know three basic things :

Machine Learning : Letting the computer figure out the steps on its own, based on the data we provide it with.
Three different styles : Supervised, Unsupervised and Reinforcement
Linear Regression : It allows us to model the relation between an independent and a dependent variable via a line of best fit.

Code

The code for this post has been uploaded to my GitHub profile!

Find the source code here : https://github.com/C-Aniruddh/linear_regression