Building a Supervised Learning Model by Hand

5 min read · Mar 17, 2020

Hi everyone!!! Today, my blog is about building a supervised learning model. I think you will get a better understanding of what supervised learning is 😊.

The process for building a supervised learning model:

01. Define Task

02. Acquire Clean Data

03. Understand the Data

04. Prepare Data

05. Build models

06. Evaluate models

07. Finalize & deploy

Look at this dataset, based on the latest World Happiness Report, which shows some features for 12 countries. We will use these 12 rows of data to walk through the process above.

This is an iterative process. Let’s discuss each step clearly and simply.

Define Task

- Write down the objective

For example:

Use life expectancy & long-term unemployment rate to predict the perceived happiness (Low or High) of inhabitants of a country.

Acquire Clean Data

The major challenge in this step is getting clean data.

Acquire clean data

- Load Data

Get the data into a form you can easily investigate & manipulate.
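As a rough sketch of the "Load Data" step, here is one way to read the data into a list of dictionaries using only the Python standard library. The rows below are placeholders invented for illustration (they are not the real World Happiness Report values), and `load_rows` is a hypothetical helper name, not something from the original dataset or any library.

```python
import csv
import io

# Placeholder rows invented for illustration; NOT real World Happiness Report values.
RAW_CSV = """country,lifeexp,unemployment,happiness
Country A,81.0,1.5,High
Country B,68.5,7.5,Low
Country C,79.0,2.5,High
Country D,66.0,9.0,Low
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting the numeric columns to float."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            "country": record["country"],
            "lifeexp": float(record["lifeexp"]),
            "unemployment": float(record["unemployment"]),
            "happiness": record["happiness"],
        })
    return rows


rows = load_rows(RAW_CSV)
print(len(rows), "rows loaded")
print(rows[0])
```

With a real file you would pass an open file object to `csv.DictReader` instead of the `io.StringIO` wrapper used here.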

Understand the Data

An important aspect of building machine learning models is getting a good understanding of the data.

If you understand the data, you can then present it to the algorithms in a way that produces more effective results.

Understand the data

- Inspect the data

- Visualize

Inspect the Data

- Define the features

Feature: a characteristic of the individuals in the sample.

Each individual has a set of features.

In our example there are 4 features (country, lifeexp, unemployment, happiness), each represented as a column.

Look at the shape of the data

- 12 rows -> sample points

- 4 columns -> features

- Country -> unique index

- Happiness -> categorical (High or Low)

- Lifeexp and unemployment columns -> numeric
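The checks in the list above can be sketched in a few lines of plain Python. This is only an illustration: the four rows are invented placeholders (not the real 12-country data), and `data_shape` and `column_summary` are hypothetical helper names.

```python
# Placeholder rows invented for illustration; NOT real World Happiness Report values.
rows = [
    {"country": "Country A", "lifeexp": 81.0, "unemployment": 1.5, "happiness": "High"},
    {"country": "Country B", "lifeexp": 68.5, "unemployment": 7.5, "happiness": "Low"},
    {"country": "Country C", "lifeexp": 79.0, "unemployment": 2.5, "happiness": "High"},
    {"country": "Country D", "lifeexp": 66.0, "unemployment": 9.0, "happiness": "Low"},
]


def data_shape(rows):
    """Return (number of rows, number of columns)."""
    n_columns = len(rows[0]) if rows else 0
    return len(rows), n_columns


def column_summary(rows):
    """Label each column numeric or categorical and note whether all its values are distinct."""
    summary = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        is_numeric = all(isinstance(v, (int, float)) for v in values)
        summary[name] = {
            "kind": "numeric" if is_numeric else "categorical",
            "unique": len(set(values)) == len(values),
        }
    return summary


print(data_shape(rows))
for name, info in column_summary(rows).items():
    print(name, info)
```

A non-numeric column whose values are all distinct, like country here, is a natural candidate for an index rather than a feature to learn from.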

Apply descriptive statistics
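A minimal sketch of descriptive statistics for one numeric column, using Python's built-in `statistics` module. The values are placeholders made up for illustration, and `summarize` is a hypothetical helper name.

```python
import statistics

# Placeholder life expectancy values invented for illustration; NOT real data.
lifeexp = [81.0, 68.5, 79.0, 66.0]


def summarize(values):
    """Return the usual descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


summary = summarize(lifeexp)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Running the same function on the unemployment column gives a quick feel for the range and spread of each numeric feature.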

Visualize the data

Plot a histogram

We can get useful information from it:

- Countries with high happiness tend to have a high life expectancy and low unemployment.
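To make the histogram idea concrete without a plotting library, here is a small text-based sketch that bins life expectancy separately for the High and Low groups. The sample values, the bin width of 5, and the starting point of 65 are all assumptions chosen for illustration, so the pattern it prints is built in by construction and proves nothing about the real data.

```python
from collections import Counter

# Placeholder (lifeexp, happiness) pairs invented for illustration; NOT real data.
samples = [
    (81.0, "High"), (79.0, "High"), (77.5, "High"),
    (68.5, "Low"), (66.0, "Low"), (71.0, "Low"),
]

BIN_START = 65.0  # assumed lower edge of the first bin
BIN_WIDTH = 5.0   # assumed bin width


def histogram(values, bin_width, start):
    """Count values per bin; bin k covers [start + k*width, start + (k+1)*width)."""
    counts = Counter(int((v - start) // bin_width) for v in values)
    return dict(sorted(counts.items()))


for label in ("High", "Low"):
    values = [value for value, happiness in samples if happiness == label]
    print(label)
    for k, n in histogram(values, BIN_WIDTH, BIN_START).items():
        low_edge = BIN_START + k * BIN_WIDTH
        print(f"  {low_edge:.0f}-{low_edge + BIN_WIDTH:.0f}: {'#' * n}")
```

Comparing the two groups side by side is what lets you spot relationships between a feature and the target.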

Prepare the data for Machine Learning

Prepare Data

- Select Features

- Split into input & target features

Select Features

The country feature is not helpful in the hard, statistics-based world of machine learning.

You can’t use preconceptions about particular countries to build a generalized model of the factors that influence happiness, so remove country.

Split into input and target features

Input features: x (used to predict)

Target feature: Y (the one we are trying to predict)
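Both preparation steps, dropping country and splitting into input and target features, can be sketched as follows. The rows are invented placeholders, `split_features` is a hypothetical helper, and the variable names `X` and `y` follow a common convention rather than anything required.

```python
# Placeholder rows invented for illustration; NOT real World Happiness Report values.
rows = [
    {"country": "Country A", "lifeexp": 81.0, "unemployment": 1.5, "happiness": "High"},
    {"country": "Country B", "lifeexp": 68.5, "unemployment": 7.5, "happiness": "Low"},
    {"country": "Country C", "lifeexp": 79.0, "unemployment": 2.5, "happiness": "High"},
    {"country": "Country D", "lifeexp": 66.0, "unemployment": 9.0, "happiness": "Low"},
]

# "country" is deliberately left out, which is how the feature gets removed.
INPUT_FEATURES = ["lifeexp", "unemployment"]
TARGET_FEATURE = "happiness"


def split_features(rows, input_names, target_name):
    """Return (X, y): one list of input values per row, and the matching target labels."""
    X = [[row[name] for name in input_names] for row in rows]
    y = [row[target_name] for row in rows]
    return X, y


X, y = split_features(rows, INPUT_FEATURES, TARGET_FEATURE)
print(X)
print(y)
```

Keeping `X` and `y` in the same row order matters, because row i of `X` must always line up with label i of `y`.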

Build the model

- Split into training & test

- Select the algorithm

- Fit model

- Check model

Split into training and test sets

Using the same data for training and testing is bad practice.

As an example: a teacher will not put the same questions in the exam paper that were given earlier in class. First the teacher trains the students on some questions. Then, in the exam, the teacher gives other questions that test the same concepts taught in class.

If the teacher gave the same questions in the exam, it would be bad practice, because it would only show whether the students memorized the answers.

Training set

A sample of input & target features used to train an untrained model.

We will pass both the input & target features to the model & ask it to tune itself to predict the target features.

This will produce a trained model.

Test Set

We take one third of the data for the test set.

A separate sample of input & target features used to test a trained machine learning model.

When testing the model, we will pass just the input features to the model & check how accurately it predicts the target features.
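One simple way to measure "how accurately" is the fraction of test rows where the prediction matches the true label. The sketch below assumes that definition of accuracy; the labels and predictions are hypothetical and do not come from any real model.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical true labels and model predictions for a 4-row test set.
actual = ["High", "Low", "High", "Low"]
predicted = ["High", "Low", "Low", "Low"]

print(accuracy(predicted, actual))  # 3 of 4 correct -> 0.75
```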

Training set = data × 2/3 = 12 × 2/3 = 8

Test set = data × 1/3 = 12 × 1/3 = 4
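Here is a minimal sketch of the 8/4 split done by hand. Shuffling the rows before splitting and fixing a random seed are my assumptions (the steps above only give the sizes), and the 12 inputs and labels are stand-ins rather than real data. `train_test_split` is a hypothetical helper written here, not the scikit-learn function of the same name.

```python
import random


def train_test_split(X, y, test_fraction=1 / 3, seed=0):
    """Shuffle row indices, then hold out `test_fraction` of the rows as the test set."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = round(len(X) * test_fraction)
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Stand-in data: 12 rows, one dummy input value each, with alternating labels.
X = [[i] for i in range(12)]
y = ["High" if i % 2 == 0 else "Low" for i in range(12)]

X_train, X_test, y_train, y_test = train_test_split(X, y)
print(len(X_train), "training rows,", len(X_test), "test rows")
```

Because every row index is used exactly once, no row can end up in both the training set and the test set.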