
1 Introduction

The document provides an overview of machine learning, defining it as a subset of artificial intelligence that enables systems to learn from data and improve performance without explicit programming. It contrasts machine learning with traditional programming, outlines different types of machine learning problems (supervised, unsupervised, semi-supervised, and reinforcement learning), and discusses challenges such as data quality, overfitting, and model interpretability. Additionally, it highlights the relationship between machine learning, deep learning, and data mining.

Uploaded by

Gayatri Varanasi
Copyright
© All Rights Reserved

Introduction to Machine Learning

Learning vs. Machine Learning

Learning
– any process by which an organism or system improves performance from experience

Machine learning
– a branch of artificial intelligence (AI) in which a computer automatically improves performance from experience
– technique to develop programs that can make predictions or decisions without being explicitly programmed to do so
Machine Learning Is Old
“Machine Learning: The field of study that gives
computers the ability to learn without being explicitly
programmed.”

- Arthur Samuel (1959)

Arthur Samuel playing checkers with the IBM 701


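Machine Learning vs. Traditional Programming
[Diagram: Traditional Programming. Input and Program go into the Computer, which produces the Output.]
[Diagram: Machine Learning. Input and Output go into the Computer (the Learner), which produces the Program.]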
Machine Learning vs. Traditional Programming cont...
[Diagram: Traditional Programming. A Program on the Computer evaluates 1+1 = 2.]
[Diagram: Machine Learning. A Learner on the Computer turns Lots of Examples into a Model for Addition.]
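The addition slide can be made concrete with a small sketch. It assumes a hypothetical linear model y = w1*a + w2*b trained by stochastic gradient descent; the names `add_program`, `learn_addition` and `add_model`, and the learning rate and epoch count, are illustrative choices and not part of the slides.

```python
# Traditional programming: a human writes the rule explicitly.
def add_program(a, b):
    return a + b


# Machine learning: the rule is estimated from (input, output) examples.
# Assumed model: y = w1*a + w2*b, fitted with stochastic gradient descent.
def learn_addition(examples, lr=0.01, epochs=500):
    w1, w2 = 0.0, 0.0
    for _ in range(epochs):
        for a, b, target in examples:
            error = (w1 * a + w2 * b) - target
            # Gradient step on the squared error for this single example.
            w1 -= lr * error * a
            w2 -= lr * error * b
    return w1, w2


# "Lots of examples": every pair of small integers with its known sum.
examples = [(a, b, a + b) for a in range(5) for b in range(5)]
w1, w2 = learn_addition(examples)


def add_model(a, b):
    # The learned "program": both weights end up very close to 1.
    return w1 * a + w2 * b
```

Calling `add_model(10, 7)` returns a value very close to 17 even though that pair never appeared in the examples, which is the sense in which the learner has produced a model for addition.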
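Machine Learning vs. Traditional Programming cont...
[Diagram: Traditional Programming. A Program on the Computer is given an input (image not preserved) and the result is "= ?".]
[Diagram: Machine Learning. A Learner on the Computer produces a Model for “2” recognition, and the result is "= 2".]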
The traditional approach

Machine Learning approach


Machine Learning (ML) vs. Conventional Computing

• Conventional computer programs or algorithms perform tedious tasks faster and more accurately than humans (addition, subtraction, spell-checking)
• Machine learning algorithms perform tasks that are difficult or infeasible to do via conventional computer algorithms (grammar checking, interpreting speech, image recognition)
Artificial Intelligence (AI) vs. Machine Learning (ML)

Artificial Intelligence (AI): Broad field focused on creating systems that mimic human intelligence
Machine Learning (ML): A subset of AI that enables systems to learn from data and improve automatically

Aspect | Artificial Intelligence (AI) | Machine Learning (ML)
Scope | Broad concept | Subfield of AI
Learning | May use rules or learning | Learns from data
Adaptability | Can be static or adaptive | Always adaptive
Decision Making | Rule-based + data-driven | Data-driven

Relationship: AI ⟶ ML ⟶ Deep Learning
Subsets of Artificial Intelligence
• Artificial Intelligence
• Machine Learning: Machine Learning is a subset of Artificial Intelligence
• Deep Learning: Deep Learning is a subset of Machine Learning
• Deep Learning uses neural networks to simulate human-like decision making
Machine Learning vs. Data Mining
• Both fields require lots of data, both can be used
to predict
• ML focuses on reproducing or predicting from
known knowledge
• Data mining focuses on discovery of previously
unknown knowledge
Machine Learning vs. Deep Learning
• Deep learning is a subdiscipline of ML
• Deep learning uses artificial neural networks (ANNs), specifically multi-layered
“deep” neural networks (DNNs) as the main learning model – this allows DNNs to
learn more complex patterns, handle tougher problems and make smarter
predictions
• Other common ANN architectures include recurrent neural networks (RNNs),
convolutional neural networks (CNNs) and deep belief networks (DBNs)
Types of Machine Learning Problems
Machine learning problems can be categorized into:
1. Supervised Learning
• In supervised learning, the training data you feed to the
algorithm includes the desired solutions, called labels
• The model learns from labeled data.
• Examples: Regression (predicting continuous values),
Classification (categorizing inputs into classes).
• Algorithms: Linear Regression, Logistic Regression,
Decision Trees, Support Vector Machines, Neural Networks.
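As a minimal sketch of the two supervised tasks, the block below fits a one-variable linear regression by least squares and a very simple classifier. The data (study hours, scores, pass/fail labels) are made up for illustration, and the nearest-class-mean classifier is a stand-in chosen for brevity; it is not one of the algorithms listed above.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def fit_centroids(xs, labels):
    """Store the mean input value of each class (nearest-class-mean classifier)."""
    groups = {}
    for x, label in zip(xs, labels):
        groups.setdefault(label, []).append(x)
    return {label: sum(values) / len(values) for label, values in groups.items()}


def classify(x, centroids):
    """Predict the class whose mean is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


# Regression: labels are continuous values (hypothetical exam scores).
hours = [1, 2, 3, 4, 5]
scores = [3, 5, 7, 9, 11]
slope, intercept = fit_line(hours, scores)
predicted_score = slope * 6 + intercept

# Classification: labels are classes (hypothetical pass/fail outcomes).
study_hours = [1, 2, 3, 6, 7, 8]
outcomes = ["fail", "fail", "fail", "pass", "pass", "pass"]
centroids = fit_centroids(study_hours, outcomes)
predicted_outcome = classify(5, centroids)
```

In both cases the training data carries the desired answer for every input, which is what makes the problem supervised.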
2. Unsupervised Learning
• The model learns from unlabeled data to find hidden structures.
• The system tries to learn without a teacher.
• No predefined input–output pairs are available.
• Discover hidden structures, relationships, or data distributions.
• Useful for exploratory analysis.

Common Techniques
Clustering: K-means, Hierarchical, DBSCAN Clustering
Dimensionality Reduction: PCA, ICA, Autoencoders
Association Rule Mining: Apriori, FP-Growth

Anomaly detection
Visualization algorithms
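To illustrate clustering without labels, here is a minimal K-means sketch. The six 2-D points and the two starting centroids are made-up values, and passing the initial centroids explicitly is a simplifying assumption; practical implementations usually choose them randomly or with a seeding heuristic.

```python
def kmeans(points, initial_centroids, max_iter=100):
    """Plain K-means: alternate assignment and centroid-update steps."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
            )
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Unlabeled data: no target values are supplied.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centers = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

The algorithm is never told which group a point belongs to; the two groups emerge from the distances alone.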


3. Semi-Supervised Learning
Semi-supervised learning is a machine learning approach that trains models using a small amount of labeled
data and a large amount of unlabeled data. It lies between supervised and unsupervised learning.

• Key Idea
Labeled data → expensive & limited
Unlabeled data → abundant
SSL leverages the structure/patterns in unlabeled data to improve learning accuracy

• How It Works
Train an initial model using labeled data
Use the model to predict labels for unlabeled data
Select confident predictions (pseudo-labels)
Retrain the model using both labeled and pseudo-labeled data
• Advantages
Reduces labeling cost
Improves performance with limited labeled data
Utilizes real-world data effectively
• Applications
Image and speech recognition
Medical diagnosis
Text classification
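The four steps listed under How It Works can be sketched as a single round of pseudo-labeling. Everything specific here is an assumption made for illustration: the 1-D data, the nearest-class-mean model, the distance-based confidence score and the 0.8 threshold.

```python
def fit_centroids(samples):
    """Train a nearest-class-mean model from (value, label) pairs."""
    groups = {}
    for x, label in samples:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}


def predict_with_confidence(x, centroids):
    """Return the nearest class and a score in [0, 1]; higher means more certain."""
    distances = {label: abs(x - c) for label, c in centroids.items()}
    label = min(distances, key=distances.get)
    total = sum(distances.values())
    confidence = 1.0 if total == 0 else 1.0 - distances[label] / total
    return label, confidence


def pseudo_label(labeled, unlabeled, threshold=0.8):
    model = fit_centroids(labeled)                             # 1. train on labeled data
    pseudo = []
    for x in unlabeled:
        label, confidence = predict_with_confidence(x, model)  # 2. predict labels
        if confidence >= threshold:                            # 3. keep confident ones
            pseudo.append((x, label))
    return fit_centroids(labeled + pseudo), pseudo             # 4. retrain on both


labeled = [(1.0, 0), (9.0, 1)]            # small, expensive labeled set
unlabeled = [1.5, 2.0, 8.0, 8.5, 5.2]     # larger, cheap unlabeled set
model, pseudo = pseudo_label(labeled, unlabeled)
```

The point 5.2 sits almost midway between the two classes, so its prediction falls below the threshold and it is left out of the retraining set.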
4. Reinforcement Learning

Reinforcement Learning is a type of machine learning where an agent learns by interacting with an
environment and improves its behavior through rewards and penalties.

• Key Components
Agent – the learner/decision maker
Environment – where the agent operates
State (S) – current situation
Action (A) – decision taken by the agent
Reward (R) – feedback from the environment
Policy (π) – strategy to choose actions

• How It Works
• Agent observes the current state
• Takes an action
• Environment returns a reward and next state
• Agent updates its policy to maximize cumulative reward

• Applications: Robotics, Game AI, Self-driving cars.


• Algorithms: Q-Learning, Deep Q-Networks (DQN), Policy Gradient Methods
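The observe, act, reward, update loop can be sketched with tabular Q-learning. The environment is a hypothetical five-state corridor in which only reaching the rightmost state gives a reward; the corridor, the hyperparameters and the fixed random seed are all illustrative assumptions.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = [0, 1]      # 0 = move left, 1 = move right


def step(state, action):
    """Environment: return (next_state, reward, done) for the chosen action."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]    # Q-table: q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                     # cap on episode length
            # Policy: epsilon-greedy, with ties between equal Q-values broken randomly.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: (q[state][a], rng.random()))
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q(s, a) toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy read from the learned table for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

After training, the greedy policy chooses "right" in every non-terminal state, since that is the only way to collect the reward.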
Batch and Online Learning

Batch Learning (Offline Learning)
Batch learning trains a model using the entire dataset at once. Once trained, the model is not updated until it is retrained with new data.
• Key Characteristics
Uses full historical dataset
Training is done offline
Model remains fixed after training
• Advantages
Stable and accurate models
Suitable for large historical datasets
• Limitations
✘ High computational cost
✘ Not suitable for real-time updates
• Applications
Credit scoring systems
Image classification

Online Learning
Online learning updates the model incrementally, learning from data one sample or small batch at a time as it arrives.
• Key Characteristics
Continuous model updates
Learns from streaming data
Adapts to changes quickly
• Advantages
Low memory requirement
Real-time adaptation
Handles concept drift
• Limitations
✘ Sensitive to noise
✘ Requires careful learning rate control
• Applications
Stock market prediction
Spam detection
Real-time load and renewable energy forecasting
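The difference between the two modes shows up most clearly when the data changes. In this sketch a one-parameter model y = w*x is fitted once in batch on historical data, then an online copy keeps updating on a stream in which the true slope has moved from 2 to 3. The data, the learning rate and the function names are assumptions chosen for illustration.

```python
def batch_fit(xs, ys):
    """Batch learning: least squares for y = w * x using the whole dataset at once."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def online_update(w, x, y, lr=0.05):
    """Online learning: one incremental gradient step from a single arriving sample."""
    return w + lr * (y - w * x) * x


xs = [1.0, 2.0, 3.0, 4.0]
historical = [(x, 2.0 * x) for x in xs]                   # old relationship: y = 2x
stream = [(x, 3.0 * x) for _ in range(10) for x in xs]    # after drift: y = 3x

# The batch model is trained once and then stays fixed.
batch_w = batch_fit([x for x, _ in historical], [y for _, y in historical])

# The online model starts from the same weight and keeps adapting.
online_w = batch_w
for x, y in stream:
    online_w = online_update(online_w, x, y)
```

The batch weight stays at 2 until someone retrains it, while the online weight ends up very close to 3. With a learning rate that is too large the online updates would oscillate or diverge, which is the "careful learning rate control" limitation noted above.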
Main Challenges of Machine Learning

1. Data Quality and Quantity
Insufficient labeled data
Noisy, incomplete, or inconsistent datasets
Data imbalance

2. Overfitting and Underfitting
Overfitting: model learns noise instead of patterns
Underfitting: model too simple to capture relationships

3. Feature Selection and Engineering
Identifying relevant features is difficult
High-dimensional data increases complexity

4. Scalability and Computational Cost
Large datasets require high memory and processing power
Training deep models is time-consuming

5. Model Interpretability
Black-box models (e.g., deep learning) lack transparency
Difficult to explain decisions in critical applications

6. Bias and Fairness
Models may inherit bias from data
Ethical and social concerns

7. Concept Drift
Data distribution changes over time
Model performance degrades in real-world deployment

8. Hyperparameter Tuning
Selecting optimal parameters is non-trivial
Poor tuning leads to suboptimal performance
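Overfitting and underfitting can be seen on a toy dataset. The sketch below compares three deliberately simple models on hand-made 1-D data containing one noisy training label; the data and the choice of models are illustrative assumptions.

```python
# (value, label) pairs; the true rule is "label 1 for large values".
train_data = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]  # (3, 1) is noise
test_data = [(2.6, 0), (3.4, 0), (1.5, 0), (8.5, 1)]


def majority_model(data):
    """Underfits: ignores the input and always predicts the most common label."""
    labels = [y for _, y in data]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def centroid_model(data):
    """Reasonable fit: predicts the class whose mean value is nearest."""
    means = {}
    for label in {y for _, y in data}:
        xs = [x for x, y in data if y == label]
        means[label] = sum(xs) / len(xs)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def one_nn_model(data):
    """Overfits: memorizes every training point, including the noisy one."""
    return lambda x: min(data, key=lambda pair: abs(x - pair[0]))[1]


def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)


results = {
    name: (accuracy(model, train_data), accuracy(model, test_data))
    for name, model in [
        ("majority", majority_model(train_data)),
        ("centroid", centroid_model(train_data)),
        ("one_nn", one_nn_model(train_data)),
    ]
}
```

The memorizing model is perfect on the training set but is misled by the noisy point on new data, the constant model is poor on both sets, and the nearest-mean model gives up one training point and generalizes best.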
