Machine Learning


Example of a Sales prediction business case

For the last five years, we have recorded how much a company spent on marketing a product and the sales that resulted.

What is the predicted Sales amount if the Marketing budget is 2700€?


Vocabulary

  • The train data usually refers to old data
  • The test data usually refers to future data
  • A model is a Machine Learning method
  • We usually give the machine train data in order for it to learn
  • A variable is also called a feature

How does a machine predict future values? Is it magic?

Because the word “prediction” wrongly sounds miraculous, it is wiser to think of it as an estimation obtained through mathematical, statistical, and computer calculations.

Machine learning estimates one response variable. For this task, it requires one or more explanatory variables. Given the previously observed associations between explanatory and response variables, and provided with some new explanatory variables, the machine estimates the corresponding response variable.

Remark: predicting several response variables is also possible; it is derived from the prediction of one response variable.


Returning to the Sales example

In the data below, the Sales amount is the consequence of the marketing budget. Hence, the Sales amount is the response variable, and marketing is the only explanatory variable.

In the post linked below, we determine a model for predicting the Sales amount when the Marketing budget is 2700€:

https://vgir.fr/2019/06/04/linear-regression/

          Marketing budget (€)   Sales amount (€)
Year 1    2120                   19240
Year 2    2300                   28500
Year 3    2410                   32010
Year 4    2530                   37020
Year 5    2600                   40580
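
As a rough sketch of how such an estimate can be computed, the snippet below fits a straight line to the five data points with ordinary least squares. This assumes a simple linear regression with a single explanatory variable; the names `train_data`, `fit_line` and `estimated_sales` are chosen here for illustration only, and the result is not necessarily the one obtained in the linked post.

```python
# Train data from the table: (marketing budget in EUR, sales amount in EUR)
train_data = [
    (2120, 19240),
    (2300, 28500),
    (2410, 32010),
    (2530, 37020),
    (2600, 40580),
]


def fit_line(points):
    """Ordinary least squares with one explanatory variable.

    Returns (slope, intercept) of the line y = slope * x + intercept.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(train_data)

# New explanatory value: a marketing budget of 2700 EUR
estimated_sales = slope * 2700 + intercept
print(f"Estimated sales for a 2700 EUR budget: {estimated_sales:.0f} EUR")
```

Under this straight-line assumption the estimate comes out at roughly 44,800€. Note that 2700€ lies outside the observed budgets (2120€ to 2600€), so the estimate is an extrapolation and should be read with caution.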

What kind of response variable can Machine Learning predict?

There are 2 kinds of features:

  • Categorical: a variable which is a choice between a limited set of values

Does the picture show a cat, a dog, or a bird?

Will the customer buy our new product if we send them an ad mail: yes or no?

  • Continuous: a variable which is a numerical value

What will the weight of the parcel be in kg? 4.7? 5.3? 8.6?

How long before the tool breaks, in hours? 50? 165?

How much will next year’s sales be in €? 35000? 36000?
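
The kind of response variable decides which family of models applies: estimating a categorical variable is usually called classification, and estimating a continuous one is usually called regression. The sketch below sorts the examples above into the two kinds. The function `kind_of_variable` is a hypothetical helper using a deliberately naive rule (numbers are continuous, everything else is categorical); real data needs more care, for instance when categories are encoded as integers.

```python
def kind_of_variable(values):
    """Naive heuristic: numeric values are continuous, anything else is categorical."""
    # bool is a subclass of int in Python, so yes/no answers are excluded explicitly
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "continuous" if is_numeric else "categorical"


# Possible answers to the example questions
examples = {
    "animal in the picture": ["cat", "dog", "bird"],
    "customer buys the product": [True, False],
    "parcel weight (kg)": [4.7, 5.3, 8.6],
    "next year's sales (EUR)": [35000, 36000],
}

for name, values in examples.items():
    print(f"{name}: {kind_of_variable(values)}")
```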


Categorization of Machine Learning

Machine learning is subdivided into 3 categories:

  • Supervised
  • Unsupervised
  • Reinforcement Learning

Supervised

The correct answer (target variable) is provided to the machine along with the train data

Example:

The Machine must recognize pictures of cats and dogs. First, it receives a set of hundreds or thousands of pictures of animals, and for each picture, the information “cat” or “dog” is sent.

Test Data: we provide only pictures to the machine and ask it whether each one represents a cat or a dog

Some models for supervised machine learning:

  • Linear Regression: see next post for the example “Sales Report”
  • Decision Tree
  • Random Forest
  • Boosted Tree
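
To make the idea of learning from labelled train data concrete, here is a minimal sketch of the simplest possible decision tree: a single split on one feature, sometimes called a decision stump. Everything in it is an assumption made for illustration: the feature (animal weight in kg instead of pictures), the sample values, and the function names `fit_stump` and `predict`.

```python
from collections import Counter

# Hypothetical train data: (weight in kg, correct answer)
train_data = [
    (3.5, "cat"), (4.2, "cat"), (5.0, "cat"),
    (9.0, "dog"), (20.0, "dog"), (30.5, "dog"),
]


def fit_stump(samples):
    """Find the single threshold that misclassifies the fewest train samples.

    Assumes at least two distinct feature values.
    Returns (threshold, label_below, label_above).
    """
    values = sorted({x for x, _ in samples})
    best = None
    # Candidate thresholds: midpoints between consecutive feature values
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        below = [label for x, label in samples if x <= threshold]
        above = [label for x, label in samples if x > threshold]
        # Each side answers with its most frequent label
        label_below = Counter(below).most_common(1)[0][0]
        label_above = Counter(above).most_common(1)[0][0]
        errors = sum(l != label_below for l in below)
        errors += sum(l != label_above for l in above)
        if best is None or errors < best[0]:
            best = (errors, threshold, label_below, label_above)
    return best[1], best[2], best[3]


def predict(stump, weight):
    """Apply the learned rule to a new, unlabelled sample."""
    threshold, label_below, label_above = stump
    return label_below if weight <= threshold else label_above


stump = fit_stump(train_data)

# Test data: only the feature is provided, the machine supplies the answer
for weight in (4.0, 25.0):
    print(weight, "->", predict(stump, weight))
```

A full decision tree repeats this kind of split recursively, and Random Forest or Boosted Tree models combine many such trees.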

Unsupervised

No correct answer (target variable) is provided to the Machine with the train data.
We don’t know the correct answer, and the task of the Machine is to find it

Example:

We have a color picture. We instruct the Machine to convert this picture into only 3 colors in order to print it at reduced cost.

As we don’t know which colors to use, nor which pixel to replace by which color, we cannot give the Machine the optimal 3-color picture as a correct answer

Some models for unsupervised machine learning:

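  • k-means clustering: it is the method used for the 3-color image example above
  • k nearest neighbours (KNN): despite the similar name, it is usually classified as a supervised method, since it needs labelled train data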
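
The color reduction example can be sketched with a small k-means implementation. This is a simplified illustration, not a production implementation: the pixel values are hypothetical, the starting centers are simply the first k distinct colors (real implementations usually pick them more carefully), and the names `quantize_colors`, `nearest` and `mean_color` are chosen for this sketch.

```python
def nearest(pixel, centers):
    """Index of the center closest to the pixel (squared RGB distance)."""
    distances = [sum((p - c) ** 2 for p, c in zip(pixel, center)) for center in centers]
    return distances.index(min(distances))


def mean_color(pixels):
    """Average color of a non-empty group of pixels."""
    n = len(pixels)
    return tuple(sum(channel) / n for channel in zip(*pixels))


def quantize_colors(pixels, k=3, iterations=10):
    """Replace every pixel by one of k colors found with k-means."""
    # Deterministic start: the first k distinct colors of the picture
    centers = []
    for pixel in pixels:
        if pixel not in centers:
            centers.append(pixel)
        if len(centers) == k:
            break
    for _ in range(iterations):
        # Assignment step: group pixels by their nearest center
        clusters = [[] for _ in centers]
        for pixel in pixels:
            clusters[nearest(pixel, centers)].append(pixel)
        # Update step: move each center to the mean of its group
        centers = [
            mean_color(group) if group else centers[i]
            for i, group in enumerate(clusters)
        ]
    palette = [tuple(round(c) for c in center) for center in centers]
    return [palette[nearest(pixel, centers)] for pixel in pixels]


# Hypothetical picture: six RGB pixels (two reddish, two greenish, two bluish)
picture = [
    (250, 10, 10), (240, 20, 15),
    (10, 250, 10), (20, 240, 25),
    (10, 10, 250), (15, 25, 240),
]
reduced = quantize_colors(picture, k=3)
print(sorted(set(reduced)))
```

No correct answer is supplied anywhere: the three colors emerge only from the pixels themselves, which is what makes the task unsupervised.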


Reinforcement

The Machine learns through iterations: it generates its own experiments, gets a result after each experiment, and takes this result into account during the next iteration

Example:

A machine is learning to play chess. It plays, makes mistakes, and loses. Hopefully, with each new game it makes fewer or smaller mistakes, and its play improves

Some models used for reinforcement machine learning are Neural Networks (which are also widely used for supervised learning):

  • Convolutional neural network (CNN)
  • Recurrent neural network (RNN)

Neural Networks with many layers are also referred to as Deep Learning: hence, Deep Learning is a sub-area of Machine Learning


Reinforcement Learning

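Reinforcement Learning is a balance between exploration (trying new actions, comparable to Unsupervised learning) and exploitation (using what has already been learned, comparable to Supervised learning).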

By Megajuice – Own work, CC0, https://commons.wikimedia.org/w/index.php?curid=57895741

The typical framing of a Reinforcement Learning (RL) scenario: an agent takes actions in an environment, which is interpreted into a reward and a state representation, which are fed back to the agent.

https://en.wikipedia.org/wiki/Reinforcement_learning

For instance, it is applied to self-driving cars: the vehicle has to learn by itself, and it is also helped by humans, so that its progress is optimized.
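
The agent, environment, reward and state loop described above can be sketched with tabular Q-learning, one common reinforcement learning technique. The setting is a toy assumption, far simpler than chess or driving: a corridor of 5 positions where the agent starts at position 0 and is rewarded only when it reaches position 4. The parameter values and the names `step`, `choose_action` and `train` are likewise chosen for illustration.

```python
import random

N_STATES = 5                  # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = ["left", "right"]


def step(state, action):
    """Environment: apply the action, return (next_state, reward)."""
    next_state = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: sometimes explore at random, otherwise exploit the best known action."""
    left, right = q[state]["left"], q[state]["right"]
    if rng.random() < epsilon or left == right:
        return rng.choice(ACTIONS)      # exploration (also used to break ties)
    return "left" if left > right else "right"   # exploitation


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn a table of action values from its own experiments."""
    rng = random.Random(seed)
    q = [{action: 0.0 for action in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = choose_action(q, state, epsilon, rng)
            next_state, reward = step(state, action)
            # Q-learning update: move the estimate toward reward + discounted future value
            best_next = max(q[next_state].values())
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
    return q


q_table = train()
# Best known action in each non-goal position
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The `epsilon` parameter sets the balance mentioned above: a higher value means more exploration, a lower value means more exploitation of what the table already contains.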