
Introduction Tom L

The document provides an introduction to Machine Learning, covering its necessity, definitions, processes, and types. It emphasizes the importance of Machine Learning in analyzing vast data for insights, improving decision-making, and solving complex problems through various applications. Additionally, it outlines the Machine Learning process, including data gathering, preparation, model building, evaluation, and types of learning approaches such as supervised, unsupervised, semi-supervised, and reinforcement learning.

Uploaded by

simarjot2809
Copyright
© All Rights Reserved

The following topics are covered in this Introduction to Machine Learning:

1. Need For Machine Learning

2. What Is Machine Learning?

3. Machine Learning Definitions

4. Machine Learning Process

5. Types Of Machine Learning

6. Type Of Problems Solved Using Machine Learning

Need For Machine Learning

Ever since the technical revolution, we've been generating an immeasurable amount of data. As
per research, we generate around 2.5 quintillion bytes of data every single day! It was estimated
that by 2020, 1.7 MB of data would be created every second for every person on earth.

With the availability of so much data, it is finally possible to build predictive models that can
study and analyze complex data to find useful insights and deliver more accurate results.

Top Tier companies such as Netflix and Amazon build such Machine Learning models by using
tons of data in order to identify profitable opportunities and avoid unwanted risks.

Here's a list of reasons why Machine Learning is so important:

• Increase in Data Generation: Due to excessive production of data, we need a method
that can be used to structure, analyze, and draw useful insights from data. This is where
Machine Learning comes in. It uses data to solve problems and find solutions to the most
complex tasks faced by organizations.

• Improve Decision Making: By making use of various algorithms, Machine Learning can
be used to make better business decisions. For example, Machine Learning is used to
forecast sales, predict downturns in the stock market, identify risks and anomalies, etc.

Importance Of Machine Learning

• Uncover patterns & trends in data: Finding hidden patterns and extracting key insights
from data is the most essential part of Machine Learning. By building predictive models
and using statistical techniques, Machine Learning allows you to dig beneath the surface
and explore the data at a minute scale. Understanding data and extracting patterns
manually can take days, whereas Machine Learning algorithms can often perform such
computations in seconds.

• Solve complex problems: From detecting the genes linked to the deadly ALS disease to
building self-driving cars, Machine Learning can be used to solve the most complex
problems.

To give you a better understanding of how important Machine Learning is, let's list a
couple of Machine Learning Applications:

• Netflix's Recommendation Engine: The core of Netflix is its famous recommendation
engine. Over 75% of what you watch is recommended by Netflix and these
recommendations are made by implementing Machine Learning.

• Facebook's Auto-tagging feature: The logic behind Facebook's DeepFace face
verification system is Machine Learning and Neural Networks. DeepFace studies the
facial features in an image to tag your friends and family.

• Amazon's Alexa: The famous Alexa, which is based on Natural Language Processing
and Machine Learning, is an advanced Virtual Assistant that does more than just
play songs on your playlist. It can book you an Uber, connect with the other IoT devices
at home, track your health, etc.

• Google's Spam Filter: Gmail makes use of Machine Learning to filter out spam
messages. It uses Machine Learning algorithms and Natural Language Processing to
analyze emails in real-time and classify them as either spam or non-spam.

Now that you know why Machine Learning is so important, let's look at what exactly Machine
Learning is.

Introduction To Machine Learning

The term Machine Learning was first coined by Arthur Samuel in the year 1959. Looking back,
that year was probably the most significant in terms of technological advancements.

Machine learning is the field of study that gives computers the ability to learn without being
explicitly programmed. — Arthur L. Samuel, AI pioneer, 1959

If you search the net for 'what is Machine Learning', you'll get at least 100 different
definitions. However, a widely cited formal definition was given by Tom M. Mitchell:

“A computer program is said to learn from experience E with respect to some class of tasks T
and performance measure P if its performance at tasks in T, as measured by P, improves with
experience E.”

In simple terms, Machine Learning is a subset of Artificial Intelligence (AI) which provides
machines the ability to learn automatically & improve from experience without being explicitly
programmed to do so. In a sense, it is the practice of getting machines to solve problems by
gaining the ability to think.

But wait, can a machine think or make decisions? Well, if you feed a machine a good amount of
data, it will learn how to interpret, process, and analyze this data by using Machine Learning
Algorithms to solve real-world problems.

Before moving any further, let’s discuss some of the most commonly used terminologies in
Machine Learning.

Machine Learning Definitions

Algorithm: A Machine Learning algorithm is a set of rules and statistical techniques used to
learn patterns from data and draw significant information from it. It is the logic behind a
Machine Learning model. An example of a Machine Learning algorithm is the Linear Regression
algorithm.

Model: A model is the main component of Machine Learning. A model is trained by using a
Machine Learning Algorithm. An algorithm maps all the decisions that a model is supposed to
take based on the given input, in order to get the correct output.

Predictor Variable: A feature (or set of features) of the data that can be used to predict the output.

Response Variable: It is the feature or the output variable that needs to be predicted by using
the predictor variable(s).

Training Data: The Machine Learning model is built using the training data. The training data
helps the model to identify key trends and patterns essential to predict the output.

Testing Data: After the model is trained, it must be tested to evaluate how accurately it can
predict an outcome. This is done by using the testing data set.
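To make these terms concrete, here is a minimal Python sketch. The tiny weather table, the feature choice, and the 75/25 split are all made up for illustration; they are assumptions, not part of any real data set.

```python
import random

# Hypothetical records: (humidity %, temperature in °C, did it rain?)
records = [
    (85, 18, True), (40, 30, False), (90, 16, True), (35, 33, False),
    (78, 20, True), (50, 28, False), (88, 17, True), (45, 31, False),
]

# Predictor variables: the features used to predict the output.
X = [(humidity, temperature) for humidity, temperature, _ in records]
# Response variable: the output that needs to be predicted.
y = [rained for _, _, rained in records]

def train_test_split(X, y, test_fraction=0.25, seed=0):
    """Shuffle the rows, then hold out a fraction of them as testing data."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(X) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])

X_train, y_train, X_test, y_test = train_test_split(X, y)
print(len(X_train), "training rows,", len(X_test), "testing rows")
```

The model only ever sees X_train and y_train while it is being built; X_test and y_test are kept aside to check how well it predicts.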

What Is Machine Learning?

To sum it up, take a look at the above figure. A Machine Learning process begins by feeding the
machine lots of data. By using this data, the machine is trained to detect hidden insights and
trends. These insights are then used to build a Machine Learning Model by using an algorithm in
order to solve a problem.

The next topic in this Introduction to Machine Learning is the Machine Learning Process.

Machine Learning Process

The Machine Learning process involves building a Predictive model that can be used to find a
solution for a Problem Statement. To understand the Machine Learning process let's assume that
you have been given a problem that needs to be solved by using Machine Learning.

Machine Learning Process

The problem is to predict the occurrence of rain in your local area by using Machine Learning.

The below steps are followed in a Machine Learning process:

Step 1: Define the objective of the Problem Statement

At this step, we must understand what exactly needs to be predicted. In our case, the objective is
to predict the possibility of rain by studying weather conditions. At this stage, it is also essential
to take mental notes on what kind of data can be used to solve this problem or the type of
approach you must follow to get to the solution.

Step 2: Data Gathering

At this stage, you must be asking questions such as,

• What kind of data is needed to solve this problem?

• Is the data available?

• How can I get the data?

Once you know the types of data that are required, you must understand how you can obtain this
data. Data collection can be done manually or by web scraping. However, if you're a beginner
and you're just looking to learn Machine Learning, you don't have to worry about getting the
data. There are thousands of data resources on the web; you can just download a data set and get
going.

Coming back to the problem at hand, the data needed for weather forecasting includes measures
such as humidity level, temperature, pressure, locality, whether or not you live in a hill station,
etc. Such data must be collected and stored for analysis.

Step 3: Data Preparation

The data you collected is almost never in the right format. You will encounter a lot of
inconsistencies in the data set such as missing values, redundant variables, duplicate values, etc.
Removing such inconsistencies is essential because they might lead to incorrect
computations and predictions. Therefore, at this stage, you scan the data set for any
inconsistencies and fix them then and there.
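As a rough illustration of this step, the sketch below removes duplicate rows and fills in a missing value. The rows and column names are hypothetical, and replacing a missing value with the column mean is just one simple choice among many.

```python
# Hypothetical raw readings; None marks a missing value.
raw_rows = [
    {"humidity": 85, "temperature": 18.0},
    {"humidity": 85, "temperature": 18.0},   # exact duplicate of the first row
    {"humidity": 40, "temperature": None},   # missing temperature
    {"humidity": 90, "temperature": 16.0},
]

def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving the original order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def fill_missing_with_mean(rows, column):
    """Replace None in `column` with the mean of the known values."""
    known = [row[column] for row in rows if row[column] is not None]
    mean = sum(known) / len(known)
    return [{**row, column: mean if row[column] is None else row[column]}
            for row in rows]

clean_rows = fill_missing_with_mean(drop_duplicates(raw_rows), "temperature")
print(clean_rows)
```

After cleaning, three rows remain and the missing temperature is replaced by 17.0, the mean of the two known readings.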

Step 4: Exploratory Data Analysis

Grab your detective glasses because this stage is all about diving deep into data and finding all
the hidden data mysteries. EDA or Exploratory Data Analysis is the brainstorming stage of
Machine Learning. Data Exploration involves understanding the patterns and trends in the data.
At this stage, all the useful insights are drawn and correlations between the variables are
understood.

For example, in the case of predicting rainfall, we know that there is a strong possibility of rain if
the temperature has fallen low. Such correlations must be understood and mapped at this stage.
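One common way to check such a relationship is to compute a correlation coefficient. Here is a small sketch that uses the Pearson correlation on made-up temperature and rain observations; the numbers are assumptions chosen only to show the idea.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical observations: temperature (°C) and rain (1 = rain, 0 = no rain).
temperature = [18, 30, 16, 33, 20, 28, 17, 31]
rained = [1, 0, 1, 0, 1, 0, 1, 0]

r = pearson(temperature, rained)
print(f"correlation between temperature and rain: {r:.2f}")
```

A value close to -1 means that, in this toy data, lower temperatures go together with rain. A value close to 0 would mean there is no linear relationship worth mapping.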

Step 5: Building a Machine Learning Model

All the insights and patterns derived during Data Exploration are used to build the Machine
Learning Model. This stage typically begins by splitting the data set into two parts: training data
and testing data. The training data will be used to build and analyze the model. The logic of the
model is based on the Machine Learning Algorithm that is being implemented.

In the case of predicting rainfall, since the output will be in the form of True (if it will rain
tomorrow) or False (no rain tomorrow), we can use a Classification Algorithm such as Logistic
Regression.

Choosing the right algorithm depends on the type of problem you’re trying to solve, the data set
and the level of complexity of the problem. In the upcoming sections, we will discuss the
different types of problems that can be solved by using Machine Learning.
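To give a feel for what building such a model looks like, here is a minimal sketch of Logistic Regression trained with plain gradient descent. The training rows, the feature scaling, the learning rate, and the number of epochs are all assumptions for illustration; in practice you would normally use a library implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Estimated probability that the output is 1 (rain)."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)

def train_logistic_regression(X, y, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            for j, f in enumerate(features):
                grad_w[j] += error * f
            grad_b += error
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(X)
    return weights, bias

# Hypothetical training data, scaled to roughly [0, 1]:
# (humidity / 100, temperature / 40) -> 1 if it rained the next day, else 0.
X_train = [(0.85, 0.45), (0.40, 0.75), (0.90, 0.40), (0.35, 0.83),
           (0.78, 0.50), (0.50, 0.70), (0.88, 0.43), (0.45, 0.78)]
y_train = [1, 0, 1, 0, 1, 0, 1, 0]

weights, bias = train_logistic_regression(X_train, y_train)
print(predict_proba(weights, bias, (0.92, 0.42)))  # a humid, cool day
print(predict_proba(weights, bias, (0.30, 0.85)))  # a dry, hot day
```

The model outputs a probability between 0 and 1. Treating anything above 0.5 as True (rain) and anything below as False (no rain) turns it into the classifier described above.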

Step 6: Model Evaluation & Optimization

After building a model by using the training data set, it is finally time to put the model to a test.
The testing data set is used to check the efficiency of the model and how accurately it can predict
the outcome. Once the accuracy is calculated, any further improvements in the model can be
implemented at this stage. Methods like parameter tuning and cross-validation can be used to
improve the performance of the model.
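As a small sketch of these ideas, the code below computes accuracy on a testing set and generates the train/test index splits used in k-fold cross-validation. The labels and predictions are made up, and the helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

# Hypothetical true labels and model predictions for a testing set.
y_test = [True, False, True, True, False]
y_pred = [True, False, False, True, False]
print(accuracy(y_test, y_pred))  # 4 of 5 predictions are correct -> 0.8

# Each row is used for testing exactly once across the folds.
for train_idx, test_idx in k_fold_indices(n_samples=6, k=3):
    print(train_idx, test_idx)
```

In cross-validation the model is trained and scored once per fold and the scores are averaged, which gives a steadier estimate than a single split. The rows are usually shuffled first; that step is left out here to keep the sketch short.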

Step 7: Predictions

Once the model is evaluated and improved, it is finally used to make predictions. The final
output can be a Categorical variable (e.g. True or False) or it can be a Continuous Quantity (e.g.
the predicted value of a stock).

In our case, for predicting the occurrence of rainfall, the output will be a categorical variable.

So that was the entire Machine Learning process. Now it is time to learn about the different ways
in which Machines can learn.

Machine Learning Types

A machine can learn to solve a problem by following any one of the following four approaches.
These are the ways in which a machine can learn:

1. Supervised Learning

2. Unsupervised Learning

3. Semi-Supervised Learning

4. Reinforcement Learning

Supervised Learning

Supervised learning is a technique in which we teach or train the machine using data which is
well labeled. Labeled data sets have both input and output parameters. In Supervised Learning,
algorithms learn to map inputs to the correct outputs. Both the training and the
validation data sets are labeled.

To understand Supervised Learning let us consider an analogy. As kids we all needed guidance
to solve math problems. Our teachers helped us understand what addition is and how it is done.
Similarly, you can think of supervised learning as a type of Machine Learning that involves a
guide. The labeled data set is the teacher that will train you to understand patterns in the data.
The labeled data set is nothing but the training data set.

Supervised Learning

Consider the above figure. Here we're feeding the machine images of Tom and Jerry and the
goal is for the machine to identify and classify the images into two groups (Tom images and
Jerry images). The training data set that is fed to the model is labeled, as in, we're telling the
machine, 'this is how Tom looks and this is Jerry'. By doing so you're training the machine by
using labeled data. In Supervised Learning, there is a well-defined training phase done with the
help of labeled data.
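Here is a minimal sketch of that idea using a 1-nearest-neighbor classifier. Instead of real images it uses two made-up numeric features per image (body size and ear length); these features and their values are assumptions for illustration only.

```python
import math

# Hypothetical labeled training data: (body size in cm, ear length in cm) -> label.
training_data = [
    ((45.0, 6.0), "Tom"), ((48.0, 6.5), "Tom"), ((50.0, 7.0), "Tom"),
    ((10.0, 3.0), "Jerry"), ((9.0, 2.8), "Jerry"), ((11.0, 3.2), "Jerry"),
]

def nearest_neighbor_label(features, labeled_examples):
    """Return the label of the closest labeled example (1-nearest-neighbor)."""
    closest = min(labeled_examples, key=lambda example: math.dist(features, example[0]))
    return closest[1]

print(nearest_neighbor_label((47.0, 6.2), training_data))  # Tom
print(nearest_neighbor_label((10.5, 3.1), training_data))  # Jerry
```

The labels in training_data play the role of the teacher: a new image is given the label of the most similar example the machine has already been shown.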

Unsupervised Learning

Unsupervised learning involves training by using unlabeled data and allowing the model to act
on that information without guidance. Unlike supervised learning, unsupervised learning doesn’t
involve providing the algorithm with labeled target outputs. The primary goal of Unsupervised
learning is often to discover hidden patterns, similarities, or clusters within the data, which can
then be used for various purposes, such as data exploration, visualization, dimensionality
reduction, and more.

Think of unsupervised learning as a smart kid that learns without any guidance. In this type of
Machine Learning, the model is not fed with labeled data, as in the model has no clue that 'this
image is Tom and this is Jerry'. It figures out patterns and the differences between Tom and Jerry
on its own by taking in tons of data.

Unsupervised Learning

For example, it identifies prominent features of Tom such as pointy ears, bigger size, etc., to
understand that this image is of type 1. Similarly, it finds such features in Jerry and knows that
this image is of type 2. Therefore, it groups the images into two different clusters without
knowing who Tom or Jerry is.
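The grouping described above can be sketched with a very small version of K-Means clustering. As before, the two numeric features per image are made up, and starting the centroids from the first k points is a naive simplification.

```python
import math

def k_means(points, k, iterations=10):
    """Very small k-means: returns a cluster index for every point."""
    centroids = list(points[:k])  # naive initialisation: the first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                       for p in points]
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments

# Hypothetical unlabeled image features: (body size in cm, ear length in cm).
points = [(45.0, 6.0), (10.0, 3.0), (48.0, 6.5), (9.0, 2.8), (50.0, 7.0), (11.0, 3.2)]
print(k_means(points, k=2))  # [0, 1, 0, 1, 0, 1]
```

Notice that the output only says 'type 0' or 'type 1'. The algorithm never learns the names Tom and Jerry, because no labels were given.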

Semi-Supervised Learning

Semi-Supervised learning is a machine learning approach that sits between supervised and
unsupervised learning, so it uses both labeled and unlabeled data. It's particularly useful when
obtaining labeled data is costly, time-consuming, or resource-intensive, for example when
labeling requires special skills and relevant resources.

We use these techniques when we are dealing with data in which a small portion is labeled and
the remaining large portion is unlabeled. We can use unsupervised techniques to predict labels and
then feed these labels to supervised techniques. This technique is mostly applicable in the case of
image data sets, where usually not all images are labeled.
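A rough sketch of this workflow is shown below. Note that it uses self-training (also called pseudo-labeling), where a simple model trained on the few labeled examples predicts labels for the unlabeled ones; this is one common variant, not the only way to do it. The nearest-centroid model and the data are assumptions for illustration.

```python
import math

def centroid(points):
    """Mean point of a list of equal-length feature tuples."""
    return tuple(sum(dim) / len(points) for dim in zip(*points))

def fit_centroids(labeled):
    """'Train' a nearest-centroid classifier: one mean point per label."""
    by_label = {}
    for features, label in labeled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Hypothetical data: two labeled images and several unlabeled ones.
labeled = [((45.0, 6.0), "Tom"), ((10.0, 3.0), "Jerry")]
unlabeled = [(48.0, 6.5), (9.0, 2.8), (50.0, 7.0), (11.0, 3.2), (30.0, 5.5)]

# Step 1: train on the small labeled set and predict pseudo-labels.
model = fit_centroids(labeled)
pseudo_labeled = [(x, predict(model, x)) for x in unlabeled]

# Step 2: retrain on the labeled and pseudo-labeled data together.
model = fit_centroids(labeled + pseudo_labeled)
print(predict(model, (28.0, 5.0)))
```

The second model is built from seven examples instead of two, even though only two of them were labeled by hand. The risk is that wrong pseudo-labels get baked into the model, so in practice only confident predictions are usually kept.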

Semi-Supervised Learning

Let us understand it with the help of an example.

Example: Consider that we are building a language translation model. Having labeled
translations for every sentence pair can be resource-intensive. Semi-supervised learning allows
the model to learn from both labeled and unlabeled sentence pairs, making it more accurate.
This technique has led to significant improvements in the quality of machine translation services.

Reinforcement Learning

Reinforcement Learning is a part of Machine Learning where an agent is put in an environment
and learns to behave in this environment by performing certain actions and observing the
rewards which it gets from those actions. Trial-and-error search and delayed rewards are the
most relevant characteristics of reinforcement learning. In this technique, the model keeps on
improving its performance by using Reward Feedback to learn the behavior or pattern. These
algorithms are specific to a particular problem, e.g. Google's self-driving car, or AlphaGo, where
a bot competes with humans and even with itself to become a better and better player of the
game of Go. Each time we feed in data, they learn and add the data to their knowledge, which
acts as training data. So, the more it learns, the better trained and more experienced it becomes.

This type of Machine Learning is quite different from the others. Imagine that you were dropped
off on an isolated island! What would you do?

Panic? Yes, of course, initially we all would. But as time passes by, you will learn how to live on
the island. You will explore the environment, understand the climate conditions, the type of food
that grows there, the dangers of the island, etc. This is exactly how Reinforcement Learning
works. It involves an Agent (you, stuck on the island) that is put in an unknown environment
(the island), where it must learn by observing and performing actions that result in rewards.
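To see the agent, environment, action, and reward loop in code, here is a minimal sketch of tabular Q-learning, one well-known Reinforcement Learning algorithm. The environment (a corridor of five cells with a reward at the far end) and all parameter values are assumptions chosen to keep the example tiny.

```python
import random

# Hypothetical environment: a corridor of 5 cells; reaching cell 4 gives reward 1.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)  # step left, step right

def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values q[state][action index] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties), otherwise exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = q_learning()
policy = ["left" if left > right else "right" for left, right in q_table[:GOAL]]
print(policy)
```

At first the agent wanders randomly, just like you on the island. Because the reward only arrives at the last cell, its value has to spread backwards through the table over many episodes, which is the delayed reward mentioned above.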

Reinforcement Learning is mainly used in advanced Machine Learning areas such as self-driving
cars, AlphaGo, etc.

So that sums up the types of Machine Learning. Now, let’s look at the type of problems that are
solved by using Machine Learning.

Type Of Problems In Machine Learning

Type of Problems Solved Using Machine Learning

Consider the above figure; there are three main types of problems that can be solved in Machine
Learning:

1. Regression: In this type of problem the output is a continuous quantity. So, for example,
if you want to predict the speed of a car given the distance, it is a Regression problem.
Regression problems can be solved by using Supervised Learning algorithms like Linear
Regression.

2. Classification: In this type, the output is a categorical value. Classifying emails into two
classes, spam and non-spam, is a classification problem that can be solved by using
Supervised Learning classification algorithms such as Support Vector Machines, Naive
Bayes, Logistic Regression, K Nearest Neighbor, etc.

3. Clustering: This type of problem involves assigning the input into two or more clusters
based on feature similarity. For example, clustering viewers into similar groups based on
their interests, age, geography, etc., can be done by using Unsupervised Learning
algorithms like K-Means Clustering.
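As a quick illustration of the Regression case, here is a minimal sketch of Linear Regression with a single predictor, fitted by ordinary least squares. The distance and speed values are made up (they assume every trip lasted two hours), so treat the numbers as placeholders.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Hypothetical data: distance covered in a two-hour trip (km) and average speed (km/h).
distance = [60, 100, 140, 180]
speed = [30, 50, 70, 90]

slope, intercept = fit_line(distance, speed)
predicted_speed = slope * 120 + intercept
print(predicted_speed)  # 60.0
```

The output is a number on a continuous scale rather than a class or a cluster, which is what makes this a Regression problem.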

Here’s a table that sums up the difference between Regression, Classification, and Clustering.

Regression vs Classification vs Clustering
