Building a Supervised Learning Model by Hand
Hi everyone! Today's post is about building a supervised learning model. I think it will give you a better understanding of what supervised learning is 😊.
The process for building a supervised learning model:
01. Define Task
02. Acquire Clean Data
03. Understand the Data
04. Prepare Data
05. Build models
06. Evaluate models
07. Finalize & deploy
Look at this dataset, based on the latest World Happiness Report, which shows some features for 12 countries. We will use these 12 rows to walk through the process above.
The process is iterative. Let's discuss each step clearly and simply.
Define Task
- Write down the objective
For example:
Use life expectancy & long-term unemployment rate to predict the perceived happiness (Low or High) of inhabitants of a country.
Acquire Clean Data
The major challenge is getting clean data.
Acquire clean data
- Load Data
Get the data into a form you can easily investigate & manipulate.
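As a minimal sketch of the load step, the snippet below parses CSV text into a list of dictionaries using only Python's standard library. The three rows are hypothetical placeholders, not real World Happiness Report values, and the helper name load_rows is my own; the column names follow the ones used later in this post.

```python
import csv
import io

# Hypothetical placeholder rows, NOT real World Happiness Report values.
RAW_CSV = """country,lifeexp,unemployment,happiness
A,82.0,1.5,High
B,71.0,9.0,Low
C,80.5,2.0,High
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns to float."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        record["lifeexp"] = float(record["lifeexp"])
        record["unemployment"] = float(record["unemployment"])
        rows.append(record)
    return rows


rows = load_rows(RAW_CSV)
print(len(rows), "rows loaded")
print(rows[0])
```

With a real file, the same function could be fed the text returned by open(...).read(); a list of dictionaries is just one easy form to inspect and manipulate.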
Understand the Data
An important aspect of building machine learning models is getting a good understanding of the data.
If you understand the data, you can then present it to the algorithms in a way that produces more effective results.
Understand the data
- Inspect the data
- Visualize
Inspect the Data
- Define the features
A feature is a characteristic of the individuals in the sample.
Each individual has a set of features.
For example, this dataset has 4 features, represented as columns: country, lifeexp, unemployment, happiness.
Look at the data shape
- 12 rows -> sample points
- 4 columns -> features
- country -> unique index
- happiness -> categorical (High or Low)
- lifeexp and unemployment -> numeric
Apply descriptive statistics
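Here is a small sketch of what descriptive statistics could look like for the two numeric columns, using Python's built-in statistics module. The numbers are hypothetical placeholders (not the real dataset) and describe is a helper name I made up for illustration.

```python
import statistics

# Hypothetical placeholder values, NOT the real dataset.
lifeexp = [82.0, 71.0, 80.5, 69.5]
unemployment = [1.5, 9.0, 2.0, 11.0]


def describe(values):
    """Return a few basic summary statistics for a numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


lifeexp_summary = describe(lifeexp)
unemployment_summary = describe(unemployment)
print("lifeexp:", lifeexp_summary)
print("unemployment:", unemployment_summary)
```

Even a summary this small shows the range and the typical value of each column before any plotting.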
Visualize the data
Plot the histogram
The histogram gives us useful information:
- Countries with high happiness tend to have a high life expectancy and low unemployment.
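If you do not have a plotting library at hand, one rough way to check the same pattern is to compare the average of each numeric feature per happiness class. This is only a numeric stand-in for the histogram, and the rows are hypothetical placeholders, not real values.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (lifeexp, unemployment, happiness) rows, NOT real data.
rows = [
    (82.0, 1.5, "High"),
    (80.5, 2.0, "High"),
    (71.0, 9.0, "Low"),
    (69.5, 11.0, "Low"),
]


def group_means(rows):
    """Average life expectancy and unemployment for each happiness class."""
    groups = defaultdict(list)
    for lifeexp, unemployment, happiness in rows:
        groups[happiness].append((lifeexp, unemployment))
    summary = {}
    for label, values in groups.items():
        lifeexp_values = [v[0] for v in values]
        unemployment_values = [v[1] for v in values]
        summary[label] = (mean(lifeexp_values), mean(unemployment_values))
    return summary


summary = group_means(rows)
for label, (avg_lifeexp, avg_unemployment) in summary.items():
    print(label, "-> mean lifeexp:", avg_lifeexp, "mean unemployment:", avg_unemployment)
```

In this made-up sample the High group has the higher mean life expectancy and the lower mean unemployment, which is the kind of pattern the histogram is meant to reveal.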
Prepare the data for Machine Learning
Prepare Data
- Select Features
- Split into input & target features
Select Features
The country feature is not helpful in the hard, statistics-based world of machine learning.
You can't use preconceptions about a country to build a generalized model of the factors that influence happiness, so remove country.
Split into input and target features
Input features: x (used to predict)
Target feature: Y (the one we are trying to predict)
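The two preparation steps above (drop country, then separate inputs from the target) can be sketched like this. The rows are hypothetical placeholders and split_features is a name I chose for illustration; only the column names come from this post.

```python
# Hypothetical placeholder rows, NOT real World Happiness Report values.
rows = [
    {"country": "A", "lifeexp": 82.0, "unemployment": 1.5, "happiness": "High"},
    {"country": "B", "lifeexp": 71.0, "unemployment": 9.0, "happiness": "Low"},
    {"country": "C", "lifeexp": 80.5, "unemployment": 2.0, "happiness": "High"},
]

# Selecting features: country is deliberately left out.
INPUT_FEATURES = ["lifeexp", "unemployment"]
TARGET_FEATURE = "happiness"


def split_features(rows):
    """Return (x, y): the input feature values and the target labels."""
    x = [[row[name] for name in INPUT_FEATURES] for row in rows]
    y = [row[TARGET_FEATURE] for row in rows]
    return x, y


x, y = split_features(rows)
print(x)
print(y)
```

Each entry of x lines up with the entry of y at the same position, so the pairing between a country's inputs and its label is preserved even though the country name is gone.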
Build the model
- Split into training & test
- Select the algorithm
- Fit model
- Check model
Split into training and test sets
Using the same data for training and testing is bad practice.
As an analogy: a teacher does not put the same questions in the exam paper that were already given in class. First, the teacher trains the students on some questions. Then the exam contains different questions that cover the same concepts taught in class.
If the teacher gave the same questions in the exam, students could pass just by memorizing the answers, so it would be bad practice.
Training set
A sample of input & target features used to train an untrained model.
We will pass both the input & target features to the model & ask it to tune itself to predict the target features.
This will produce a trained model.
Test Set
We take one third of the data for the test set.
A separate sample of input & target features used to test a trained machine learning model.
When testing the model, we will pass just the input features to the model & check how accurately it predicts the target features.
Training set = data × 2/3 = 12 × 2/3 = 8 rows
Test set = data × 1/3 = 12 × 1/3 = 4 rows
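The 8/4 split above could be sketched as follows. This is a simple shuffle-and-slice approach with an assumed fixed seed so the result is repeatable; the function name and parameters are my own, and the 12 integers just stand in for the 12 rows.

```python
import random


def train_test_split(samples, test_fraction=1 / 3, seed=0):
    """Shuffle a copy of the samples and hold out a fraction as the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    test = shuffled[:n_test]
    train = shuffled[n_test:]
    return train, test


# Stand-in for the 12 rows: just their row numbers.
samples = list(range(12))
train, test = train_test_split(samples)
print("training rows:", len(train))
print("test rows:", len(test))
```

Shuffling before slicing is a common precaution so that the test set is not just whichever rows happen to come last in the file.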
Select Algorithm
Use life expectancy & long-term unemployment rate to predict the perceived happiness (Low or High).
At this point, we can choose a machine learning algorithm & use it to build a model.
Algorithm -> a method for solving a general problem
Model -> the result of applying an algorithm to solve a specific problem
Fit the model to the data
Check the model
Based on the model, we draw a table as below and check the predictions.
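To make the difference between an algorithm and a model concrete, here is a deliberately tiny sketch. The "algorithm" is a rule I made up for illustration (put a cut-off halfway between the mean life expectancy of the two classes), and the "model" is the single threshold it produces. It is not necessarily the rule used in this post, it ignores unemployment to stay short, and the training values are hypothetical placeholders.

```python
from statistics import mean


def fit_threshold(train_x, train_y):
    """Algorithm: place a cut-off halfway between the class means of lifeexp."""
    high = [x for x, label in zip(train_x, train_y) if label == "High"]
    low = [x for x, label in zip(train_x, train_y) if label == "Low"]
    return (mean(high) + mean(low)) / 2


def predict(threshold, lifeexp):
    """Model: predict High when life expectancy is at or above the cut-off."""
    return "High" if lifeexp >= threshold else "Low"


# Hypothetical training values, NOT real data.
train_lifeexp = [82.0, 80.5, 71.0, 69.5]
train_happiness = ["High", "High", "Low", "Low"]

# Fit the model, then check it against the data it was trained on.
threshold = fit_threshold(train_lifeexp, train_happiness)
predictions = [predict(threshold, x) for x in train_lifeexp]
for x, actual, predicted in zip(train_lifeexp, train_happiness, predictions):
    print(x, "actual:", actual, "predicted:", predicted)
```

The printed lines play the role of the check table: one row per training sample with the actual and predicted label side by side.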
Evaluate the model
Evaluate models
Compute Accuracy Score
Evaluate how well the model performs on data that was not used in training.
We will make some predictions using the test sample x and check the results against the test sample Y.
Compute the accuracy score.
Using the same rules you created above, run them against the test data and write your predictions in a table.
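The accuracy score is simply the number of correct predictions divided by the number of test samples. A minimal sketch, with hypothetical labels for a 4-row test set (not real results):

```python
def accuracy_score(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
    return correct / len(y_true)


# Hypothetical test labels and predictions, NOT real results.
y_test = ["High", "Low", "High", "Low"]
y_pred = ["High", "Low", "Low", "Low"]

score = accuracy_score(y_test, y_pred)
print("accuracy:", score)
```

Here 3 of the 4 made-up predictions match, so the score is 3/4 = 0.75. With only 4 test rows, a single wrong prediction moves the score by 25 percentage points, so treat the number as a rough indication.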
Let's build the model with Python in the next blog post. Thank you for reading my blog 😊
Author:
Zeena Youhan
Undergraduate
B.Sc. (Hons.) in Software Engineering
University of Kelaniya.