Building a Supervised Learning Model by Hand

Zina Youhan
5 min read · Mar 17, 2020

Hi everyone!!! Today, my blog is about building a supervised learning model. I think you will get a better understanding of what supervised learning is 😊.

The process for building a supervised learning model:

01. Define Task

02. Acquire Clean Data

03. Understand the Data

04. Prepare Data

05. Build models

06. Evaluate models

07. Finalize & deploy

Consider a dataset, based on the latest World Happiness Report, that shows some features for 12 countries. We will use these 12 rows of data to understand the process above.

This is an iterative process. Let’s discuss each step clearly and simply.

Define Task

Define Task

- Write down the objective

For example:

Use life expectancy & long-term unemployment rate to predict the perceived happiness (Low or High) of inhabitants of a country.

Acquire Clean Data

The major challenge in this step is getting clean data.

Acquire clean data

- Load Data

Get the data into a form you can easily investigate & manipulate.
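As a rough sketch of what "loading" could look like, the snippet below reads CSV text into a list of dictionaries using only the Python standard library. The rows are hypothetical placeholders (country codes C01 to C12 with invented numbers), not the real World Happiness Report values, and `load_rows` is a helper name made up for this example.

```python
import csv
import io

# Hypothetical placeholder rows: invented values, NOT the real report data.
CSV_TEXT = """country,lifeexp,unemployment,happiness
C01,82.1,1.2,High
C02,81.5,1.8,High
C03,80.9,2.1,High
C04,81.0,1.5,High
C05,79.8,2.4,High
C06,80.2,3.0,High
C07,76.3,7.9,Low
C08,75.1,9.4,Low
C09,77.0,6.2,Low
C10,74.2,11.5,Low
C11,78.9,5.8,Low
C12,76.8,8.7,Low
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns to float."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        record["lifeexp"] = float(record["lifeexp"])
        record["unemployment"] = float(record["unemployment"])
        rows.append(record)
    return rows


rows = load_rows(CSV_TEXT)
print(len(rows), "rows loaded")
print(rows[0])
```

With a real file, the same function could be fed the file's contents instead of `CSV_TEXT`.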

Understand the Data

A key aspect of building machine learning models is getting a good understanding of the data.

If you understand the data, you can then present it to the algorithms in a way that produces more effective results.

Understand the data

- Inspect the data

- Visualize

Inspect the Data

- Define the features

Feature: a characteristic of the individuals in the sample.

Each individual has a set of features.

In the example there are 4 features, represented as columns: country, lifeexp, unemployment, happiness.

Look at the shape of the data

- 12 rows -> sample points

- 4 columns -> features

- country -> unique index

- happiness -> categorical (High or Low)

- lifeexp and unemployment -> numeric

Apply descriptive statistics
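The inspection steps above can be sketched in a few lines of standard-library Python. The rows below are hypothetical placeholders with invented values (not the real report data), and `describe` is a helper name chosen for this example.

```python
import statistics
from collections import Counter

# Hypothetical placeholder rows (invented values):
# (country, lifeexp, unemployment, happiness)
rows = [
    ("C01", 82.1, 1.2, "High"), ("C02", 81.5, 1.8, "High"),
    ("C03", 80.9, 2.1, "High"), ("C04", 81.0, 1.5, "High"),
    ("C05", 79.8, 2.4, "High"), ("C06", 80.2, 3.0, "High"),
    ("C07", 76.3, 7.9, "Low"), ("C08", 75.1, 9.4, "Low"),
    ("C09", 77.0, 6.2, "Low"), ("C10", 74.2, 11.5, "Low"),
    ("C11", 78.9, 5.8, "Low"), ("C12", 76.8, 8.7, "Low"),
]
columns = ["country", "lifeexp", "unemployment", "happiness"]

# Shape: number of sample points and number of features
shape = (len(rows), len(columns))
print("shape:", shape)


def describe(values):
    """Basic descriptive statistics for one numeric column."""
    return {
        "min": min(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "max": max(values),
    }


summary = {
    "lifeexp": describe([r[1] for r in rows]),
    "unemployment": describe([r[2] for r in rows]),
}
# For the categorical column, count how often each class appears
happiness_counts = Counter(r[3] for r in rows)

for name, stats in summary.items():
    print(name, stats)
print("happiness:", dict(happiness_counts))
```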

Visualize the data

Plot the histogram

We can get useful information

- Countries with high happiness tend to have a high life expectancy and low unemployment.
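If no plotting library is at hand, one way to check the same pattern is to compare the average of each numeric feature per happiness class. This is only a sketch on hypothetical placeholder rows (invented values, so the pattern holds by construction); `class_mean` is a made-up helper name.

```python
import statistics

# Hypothetical placeholder rows (invented values):
# (country, lifeexp, unemployment, happiness)
rows = [
    ("C01", 82.1, 1.2, "High"), ("C02", 81.5, 1.8, "High"),
    ("C03", 80.9, 2.1, "High"), ("C04", 81.0, 1.5, "High"),
    ("C05", 79.8, 2.4, "High"), ("C06", 80.2, 3.0, "High"),
    ("C07", 76.3, 7.9, "Low"), ("C08", 75.1, 9.4, "Low"),
    ("C09", 77.0, 6.2, "Low"), ("C10", 74.2, 11.5, "Low"),
    ("C11", 78.9, 5.8, "Low"), ("C12", 76.8, 8.7, "Low"),
]


def class_mean(rows, column_index, label):
    """Mean of one numeric column over the rows with the given happiness label."""
    return statistics.mean(r[column_index] for r in rows if r[3] == label)


for label in ("High", "Low"):
    lifeexp_mean = class_mean(rows, 1, label)
    unemployment_mean = class_mean(rows, 2, label)
    print(f"{label}: lifeexp={lifeexp_mean:.1f}, unemployment={unemployment_mean:.1f}")
```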

Prepare the data for Machine Learning

Prepare Data

- Select Features

- Split into input & target features

Select Features

The country feature is not helpful in the hard, statistics-based world of machine learning.

You can’t use preconceptions about a country to build a generalized model of the factors that influence happiness, so remove country.

Split into input and target features

Input features: x (used to make the prediction)

Target feature: Y (the one we are trying to predict)
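A minimal sketch of this preparation step, again on hypothetical placeholder rows (invented values): the country column is dropped, the two numeric columns become the input features x, and the happiness column becomes the target Y.

```python
# Hypothetical placeholder rows (invented values):
# (country, lifeexp, unemployment, happiness)
rows = [
    ("C01", 82.1, 1.2, "High"), ("C02", 81.5, 1.8, "High"),
    ("C03", 80.9, 2.1, "High"), ("C04", 81.0, 1.5, "High"),
    ("C05", 79.8, 2.4, "High"), ("C06", 80.2, 3.0, "High"),
    ("C07", 76.3, 7.9, "Low"), ("C08", 75.1, 9.4, "Low"),
    ("C09", 77.0, 6.2, "Low"), ("C10", 74.2, 11.5, "Low"),
    ("C11", 78.9, 5.8, "Low"), ("C12", 76.8, 8.7, "Low"),
]

# Select features: ignore country, keep lifeexp and unemployment as inputs
x = [(lifeexp, unemployment) for _, lifeexp, unemployment, _ in rows]

# Target feature: the happiness label we want to predict
Y = [happiness for *_, happiness in rows]

print(x[0], "->", Y[0])
```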

Build the model

Build the model

- Split into training & test

- Select the algorithm

- Fit model

- Check model

Split into training and test sets

Using the same data for training and testing is bad practice.

As an analogy: a teacher does not put the same questions on the exam paper that were given earlier in class. First, the teacher trains the students on some questions. Then, in the exam, the teacher gives different questions that test the same concepts taught in class.

If the teacher gave the same questions in the exam, it would be bad practice, because the students could simply memorize the answers.

Training set

A sample of input & target features used to train an untrained model.

We will pass both the input & target features to the model & ask it to tune itself to predict the target features.

This will produce a trained model.

Test Set

We take 1/3 of the data for the test set.

A separate sample of input & target features used to test a trained machine learning model.

When testing the model, we will pass just the input features to the model & check how accurately it predicts the target features.

Training set = data × 2/3 = 12 × 2/3 = 8 rows

Test set = data × 1/3 = 12 × 1/3 = 4 rows
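The 8/4 split can be sketched as a shuffle followed by a slice. The rows are hypothetical placeholders with invented values, and `split_rows` with its `test_fraction` and `seed` parameters is a helper made up for this example (a fixed seed keeps the split repeatable).

```python
import random

# Hypothetical placeholder rows (invented values):
# (lifeexp, unemployment, happiness)
rows = [
    (82.1, 1.2, "High"), (81.5, 1.8, "High"), (80.9, 2.1, "High"),
    (81.0, 1.5, "High"), (79.8, 2.4, "High"), (80.2, 3.0, "High"),
    (76.3, 7.9, "Low"), (75.1, 9.4, "Low"), (77.0, 6.2, "Low"),
    (74.2, 11.5, "Low"), (78.9, 5.8, "Low"), (76.8, 8.7, "Low"),
]


def split_rows(rows, test_fraction=1 / 3, seed=0):
    """Shuffle a copy of the rows and return (training set, test set)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_rows(rows)
print(len(train), "training rows,", len(test), "test rows")
```

Shuffling before slicing matters here because the placeholder rows are sorted by class; without it, the test set would contain only one class.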

Select Algorithm

Use life expectancy & long-term unemployment rate to predict the perceived happiness (Low or High).

At this point, we can choose a machine learning algorithm & use it to build a model.

Algorithm -> a method to solve a general problem

Model -> the result of applying an algorithm to solve a specific problem

Fit the model to the data

[Figure: plot of the data, with legend entries Low Happiness and High Happiness]
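One very simple way to "fit" a model by hand is to pick a cut-off on a single feature. The sketch below is an assumption for illustration, not necessarily the rule used in the original figure: it places a life-expectancy threshold midway between the two class means of the training set and ignores unemployment. The training rows are hypothetical placeholders, and `fit_threshold` and `predict` are made-up names.

```python
import statistics

# Hypothetical training rows (invented values):
# (lifeexp, unemployment, happiness)
train = [
    (82.1, 1.2, "High"), (81.5, 1.8, "High"),
    (80.9, 2.1, "High"), (81.0, 1.5, "High"),
    (76.3, 7.9, "Low"), (75.1, 9.4, "Low"),
    (77.0, 6.2, "Low"), (74.2, 11.5, "Low"),
]


def fit_threshold(train):
    """Place a life-expectancy cut-off midway between the two class means."""
    high_mean = statistics.mean(r[0] for r in train if r[2] == "High")
    low_mean = statistics.mean(r[0] for r in train if r[2] == "Low")
    return (high_mean + low_mean) / 2


def predict(lifeexp, threshold):
    """Rule: at or above the cut-off means High happiness, otherwise Low."""
    return "High" if lifeexp >= threshold else "Low"


threshold = fit_threshold(train)
train_predictions = [predict(r[0], threshold) for r in train]
print(f"threshold = {threshold:.2f}")
print(train_predictions)
```

Checking the rule against the training rows first is a quick sanity test; the real measure of quality comes from rows the rule has not seen.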

Check the model

According to the model, we write the predictions in a table and check them against the actual values.

Evaluate the model

Evaluate models

- Compute Accuracy Score

Evaluate how well the model performs on data that was not used in training.

We will make some predictions using the test sample x and check the results against the test sample Y.

Compute the accuracy score.

Using the same rules you created above, run them against the test data and write your predictions in a table.
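The accuracy score is the number of correct predictions divided by the total number of predictions. Here is a small sketch under stated assumptions: the four test rows are hypothetical placeholders with invented values, and the rule is an assumed life-expectancy cut-off of 78.5 (`THRESHOLD`), standing in for whatever rules you drew by hand.

```python
# Assumed hand-made rule: a life-expectancy cut-off (hypothetical value)
THRESHOLD = 78.5

# Hypothetical test rows (invented values):
# (lifeexp, unemployment, happiness)
test = [
    (79.8, 2.4, "High"),
    (80.2, 3.0, "High"),
    (78.9, 5.8, "Low"),
    (76.8, 8.7, "Low"),
]


def predict(lifeexp, threshold=THRESHOLD):
    """Rule: at or above the cut-off means High happiness, otherwise Low."""
    return "High" if lifeexp >= threshold else "Low"


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


x_test = [r[0] for r in test]   # input feature passed to the model
Y_test = [r[2] for r in test]   # actual labels, used only for checking
predictions = [predict(value) for value in x_test]

# Print the prediction table, then the score
for value, actual, predicted in zip(x_test, Y_test, predictions):
    print(f"lifeexp={value}  actual={actual}  predicted={predicted}")
score = accuracy(Y_test, predictions)
print(f"accuracy = {score:.2f}")  # prints "accuracy = 0.75"
```

In this made-up example the third row is misclassified, so 3 of 4 predictions are correct.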

Let’s build the model with Python in the next blog post. Thank you for reading my blog 😊

Author:

Zeena Youhan

Undergraduate

B.Sc. (Hons.) in Software Engineering

University of Kelaniya.
