Python (4-6)

The document provides an introduction to NumPy and its features, including fast calculations, multi-dimensional arrays, and mathematical functions. It covers array creation, types, properties, basic operations, and applications in data science and machine learning. Additionally, it introduces the Scikit-learn library for machine learning and the Pandas library for data manipulation, emphasizing their importance in data analysis.

Uploaded by

xsvhf0aoj3
Copyright
© All Rights Reserved

⁕ Introduction to NumPy (U-4) ⁕

• 1. NumPy Basics

1. Introduction

NumPy (Numerical Python) is a Python library used for working with numbers and arrays.
It is widely used in scientific computing and data analysis.

2. Features of NumPy

• Fast and efficient calculations
• Uses less memory than Python lists
• Supports multi-dimensional arrays
• Provides many mathematical functions

3. Installation and Import

Installation:

pip install numpy

Import:

import numpy as np

4. NumPy Array

A NumPy array is a collection of elements of the same type.

Example:

import numpy as np
arr = np.array([1, 2, 3, 4])

5. Types of Arrays

(a) 1D Array

arr = np.array([1, 2, 3])

(b) 2D Array

arr = np.array([[1, 2], [3, 4]])

(c) 3D Array

arr = np.array([[[1, 2, 3], [4, 5, 6]]])

6. Array Properties

• ndim → number of dimensions
• shape → size of the array along each dimension (rows, columns for a 2D array)
• size → total number of elements

Example:

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.ndim)   # 2
print(arr.shape)  # (2, 3)
print(arr.size)   # 6

8. Basic Operations

Operations are performed element-wise.

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(a + b)
print(a - b)
print(a * b)

11. Mathematical Functions


arr = np.array([1, 2, 3, 4])

print([Link](arr))
print([Link](arr))
print([Link](arr))
print([Link](arr))
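
A minimal sketch of a few mathematical functions that NumPy provides. The choice of np.sum, np.mean, np.max and np.min here is an assumption made for illustration; NumPy offers many more.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

total = np.sum(arr)      # 1 + 2 + 3 + 4 = 10
average = np.mean(arr)   # 10 / 4 = 2.5
largest = np.max(arr)    # 4
smallest = np.min(arr)   # 1

print(total, average, largest, smallest)
```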

12. Advantages of NumPy

• Faster than Python lists
• Easy to use
• Handles large data efficiently

13. Applications of NumPy

• Data Science
• Machine Learning
• Data Analysis
• Scientific calculations

• 1.1 Creating Nd arrays

1. Nd Array (N-dimensional array)

An Nd array is the main data structure in NumPy used to store data in multiple dimensions.

👉 “N” means number of dimensions (1D, 2D, 3D, etc.)


👉 All elements are of the same data type

2. Examples of Nd Arrays

• 1D Array → like a list
Example: [1, 2, 3]

• 2D Array → like a table (rows and columns)
Example:

[[1, 2],
 [3, 4]]

• 3D Array → collection of 2D arrays
Example:

[[[1, 2, 3],
  [4, 5, 6]]]

2. Creating Nd Arrays in Python

First import NumPy:

import numpy as np

(a) Using array()

arr = np.array([1, 2, 3])  # 1D
arr2 = np.array([[1, 2], [3, 4]])  # 2D

(b) Using zeros() and ones()

np.zeros((2, 3))  # array of 0s
np.ones((2, 2))  # array of 1s

(c) Using arange()

np.arange(1, 10)

(d) Using random values


np.random.rand(2, 2)  # 2x2 array of random floats in [0, 1)

3. Important Properties
arr = np.array([[1, 2, 3], [4, 5, 6]])

arr.ndim   # number of dimensions
arr.shape  # size of array
arr.size   # total elements

2.2 Data types for Nd arrays

📊 Data Types for Nd Arrays (NumPy)

1. Introduction

In NumPy, an Nd array stores elements of the same type.


The data type (dtype) tells what kind of data is stored in the array (like integer, float, etc.).

2. What is dtype?

👉 dtype means data type of array elements


👉 It defines how data is stored in memory

Example:

import numpy as np
arr = np.array([1, 2, 3], dtype='int')

3. Common Data Types in Nd Arrays

(a) Integer (int)

Used for whole numbers.

arr = np.array([1, 2, 3], dtype='int')

Example values: 1, 10, -5

(b) Float

Used for decimal numbers.

arr = np.array([1.5, 2.7, 3.0], dtype='float')

Example values: 1.2, 3.5, 10.0

(c) Boolean

Stores True or False values.

arr = np.array([True, False, True], dtype='bool')

(d) String

Stores text values.

arr = np.array(['a', 'b', 'c'], dtype='str')

4. Checking Data Type

We can check the data type using dtype.

arr = np.array([1, 2, 3])
print(arr.dtype)

5. Changing Data Type

We can convert one type to another.

arr = np.array([1, 2, 3], dtype='float')  # the integers are stored as floats
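
An existing array can also be converted with the astype() method. The sketch below is illustrative only; the variable names int_arr and float_arr are assumptions, not part of the notes.

```python
import numpy as np

int_arr = np.array([1, 2, 3])         # integer dtype inferred from the values
float_arr = int_arr.astype('float')   # returns a new array; int_arr is unchanged

print(int_arr.dtype)    # an integer dtype (exact width depends on the platform)
print(float_arr.dtype)  # float64
print(float_arr)        # [1. 2. 3.]
```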

2.2 arithmetic with NumPy arrays


➕ Arithmetic with NumPy Arrays

1. Introduction

In NumPy, arithmetic operations are used to perform mathematical calculations on arrays.

👉 These operations work element by element.


👉 This means each element in one array is operated with the corresponding element in another array.

2. Basic Arithmetic Operations

(a) Addition (+)


Adds elements of two arrays.

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(a + b)

Output:

[5 7 9]

(b) Subtraction (-)

Subtracts elements of two arrays.

print(a - b)

Output:

[-3 -3 -3]

(c) Multiplication (*)

Multiplies elements of two arrays.

print(a * b)

Output:

[ 4 10 18]

(d) Division (/)

Divides elements of two arrays.

print(a / b)

Output:

[0.25 0.4  0.5 ]

3. Scalar Operations

A single number can also operate on all elements of an array.


Example:

arr = np.array([1, 2, 3])

print(arr + 2)
print(arr * 2)

Output:

[3 4 5]
[2 4 6]

2.3 basic indexing and slicing NumPy arrays

📊 Basic Indexing and Slicing in NumPy

1. Introduction

Indexing and slicing are used to access elements from a NumPy array.

👉 Indexing = getting a single element


👉 Slicing = getting multiple elements

2. Indexing in NumPy

Indexing means accessing a specific element using its position.

👉 Index starts from 0

Example (1D array):

import numpy as np

arr = np.array([10, 20, 30, 40])

print(arr[0])
print(arr[2])

Output:

10
30

3. Indexing in 2D Array

In 2D arrays, we use row and column index.


Example:

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(arr[0, 1])
print(arr[1, 2])

Output:

2
6

👉 Format: arr[row, column]

4. Slicing in NumPy

Slicing means getting a part of the array.

👉 Syntax: array[start:end]
👉 End index is not included

Example (1D array):

arr = np.array([10, 20, 30, 40, 50])

print(arr[1:4])

Output:

[20 30 40]

5. Slicing with Step

We can also skip elements using step value.

print(arr[0:5:2])

Output:

[10 30 50]

6. Slicing in 2D Array

We can slice rows and columns.


arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

print(arr[0:2, 1:3])

Output:

[[2 3]
 [5 6]]
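
A short sketch of selecting whole rows and columns, assuming the same 3x3 array. A bare colon means "everything along this axis". The variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

first_row = arr[0, :]        # row 0, every column
second_col = arr[:, 1]       # every row, column 1
last_two_rows = arr[1:, :]   # rows 1 and 2, every column

print(first_row)       # [1 2 3]
print(second_col)      # [2 5 8]
print(last_two_rows)
```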

2.3 Boolean Indexing in NumPy

📊 Boolean Indexing in NumPy

1. Introduction

Boolean indexing is used to filter data from a NumPy array using conditions.

👉 It returns only those elements that satisfy a condition


👉 It uses True or False values

2. How Boolean Indexing Works

In Boolean indexing, we first apply a condition on an array.


NumPy then returns only the elements where the condition is True.

3. Example of Boolean Indexing


import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print(arr > 25)

Output:

[False False  True  True  True]

👉 True means condition is satisfied

4. Filtering Elements

We can directly use the condition to get required values.


print(arr[arr > 25])

Output:

[30 40 50]

5. Multiple Conditions

We can also use multiple conditions using & (and) and | (or).

Example:

print(arr[(arr > 20) & (arr < 50)])

Output:

[30 40]
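
The | (or) operator can be sketched the same way. Each condition needs its own parentheses because & and | bind more tightly than the comparison operators. The variable name either is illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Keep elements that are below 20 OR above 40
either = arr[(arr < 20) | (arr > 40)]

print(either)  # [10 50]
```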

6. Important Points

• Condition returns True/False values
• Only True values are selected
• Used for filtering data easily
• Very useful in data analysis

2.3 Transposing Arrays and Swapping Axes in NumPy

🔄 Transposing Arrays and Swapping Axes in NumPy

1. Introduction

In NumPy, transpose and swap axes are used to change the shape and direction of arrays.

👉 They help in rearranging rows and columns


👉 Useful in matrix operations and data processing

2. Transposing Arrays

What is Transpose?

Transposing means converting rows into columns and columns into rows.

👉 Rows become columns


👉 Columns become rows
Example:

import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(arr.T)

Output:

[[1 4]
 [2 5]
 [3 6]]

3. Using transpose() function

We can also use the function:

print(np.transpose(arr))

4. Swapping Axes

What is Swap Axes?

Swapping axes means exchanging two dimensions (axes) of an array.

👉 Used in multi-dimensional arrays
👉 You choose which two axes to exchange

Example:

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(np.swapaxes(arr, 0, 1))

Output:

[[1 4]
 [2 5]
 [3 6]]

5. Difference Between Transpose and Swapaxes

• Transpose → reverses the order of all axes by default (rows ↔ columns for a 2D array)
• Swapaxes → exchanges exactly two chosen axes, which is useful for arrays with three or more dimensions
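
The difference only becomes visible with more than two dimensions. A minimal sketch, assuming an arbitrary 3D array of shape (2, 3, 4):

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)   # shape (2, 3, 4)

reversed_axes = arr.T                  # all axes reversed
swapped = np.swapaxes(arr, 0, 1)       # only axes 0 and 1 exchanged
# np.transpose also accepts an explicit axis order
permuted = np.transpose(arr, (1, 0, 2))

print(reversed_axes.shape)  # (4, 3, 2)
print(swapped.shape)        # (3, 2, 4)
print(permuted.shape)       # (3, 2, 4)
```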

2.3 Universal Functions (Ufuncs) in NumPy

⚡ Universal Functions (Ufuncs) in NumPy

1. Introduction

Universal functions (Ufuncs) in NumPy are functions that work element by element on arrays.

👉 They perform operations on each element individually


👉 They are very fast compared to normal Python loops

2. Meaning of Ufunc

A Ufunc (Universal Function) is a function that applies the same operation to every element of an array.

Example: addition, subtraction, square, square root, etc.

3. Example of Ufunc
import numpy as np

arr = np.array([1, 2, 3, 4])

print(np.square(arr))

Output:

[ 1 4 9 16]

👉 Each element is squared individually

4. Common Universal Functions

(a) Addition

np.add(10, 20)

(b) Subtraction

np.subtract(20, 10)

(c) Multiplication

np.multiply(2, arr)

(d) Division

np.divide(10, 2)

(e) Square Root

np.sqrt(arr)

5. Key Feature

👉 Ufuncs work without explicit Python loops
👉 The looping happens inside NumPy's compiled code
👉 So they are fast and efficient
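
As a rough illustration, the sketch below computes the same squares with a Python loop and with a ufunc and checks that the results agree. The variable names are assumptions for the example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# Explicit Python loop over the elements
squared_loop = np.array([x * x for x in arr])

# Same result with a ufunc, no loop written by hand
squared_ufunc = np.square(arr)

print(np.array_equal(squared_loop, squared_ufunc))  # True
```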

2.3 Introduction to Scikit-learn Library for Data Science

📊 Introduction to Scikit-learn Library for Data Science

1. Introduction

Scikit-learn is a Python library used for Machine Learning and Data Science.

👉 It provides tools to build models from data


👉 It helps in prediction, classification, and analysis

2. What is Scikit-learn?

Scikit-learn is an open-source library that gives simple tools for:

• Data analysis
• Machine learning models
• Data preprocessing

3. Why Scikit-learn is used?


👉 It makes machine learning easy
👉 It works well with NumPy and Pandas
👉 It is fast and simple to use
👉 It provides ready-made algorithms

4. Main Uses of Scikit-learn

(a) Classification

Used to divide data into categories


Example: Spam or Not Spam emails

(b) Regression

Used to predict continuous values


Example: Predicting house prices

(c) Clustering

Used to group similar data


Example: Customer grouping

(d) Data Preprocessing

Used to clean and prepare data before training

5. Simple Example
from sklearn.linear_model import LinearRegression

model = LinearRegression()

👉 This creates a simple machine learning model
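
A model is only useful after it has been trained. The sketch below fits the model on a tiny made-up dataset that follows y = 2x + 1 and then predicts one new value. The data and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data that follows y = 2x + 1 exactly
X = np.array([[1], [2], [3], [4]])   # features must be 2D: (samples, features)
y = np.array([3, 5, 7, 9])

model = LinearRegression()
model.fit(X, y)                      # learn slope and intercept from the data

prediction = model.predict(np.array([[5]]))
print(prediction)                    # approximately [11.]
```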

6. Key Feature

👉 Scikit-learn provides ready-to-use algorithms so we do not need to build everything from scratch.

⁕ Data Manipulation with Pandas (U-5) ⁕

1. Introduction to the Pandas Library for Data Science


🐼 What is Pandas in Python

1. Introduction

Pandas is a Python library used for data handling and data analysis.

👉 It helps to work with large data easily


👉 It is mainly used in Data Science

2. What is Pandas?

Pandas is a tool that helps to store, organize, and analyze data in tabular form (like Excel sheets).

👉 It works with rows and columns


👉 It makes data easy to read and modify

3. Main Data Structures in Pandas

(a) Series

A Series is a one-dimensional data structure.

import pandas as pd

s = pd.Series([10, 20, 30])


print(s)

(b) DataFrame

A DataFrame is a two-dimensional table (rows and columns).

data = {
"Name": ["A", "B", "C"],
"Marks": [80, 90, 85]
}

df = pd.DataFrame(data)
print(df)

4. Why Pandas is used?

👉 Easy data handling


👉 Works like Excel tables
👉 Fast data analysis
👉 Helps in cleaning data

5. Simple Features
• Reads data from Excel, CSV files
• Filters and selects data easily
• Handles missing data
• Performs calculations
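
These features can be sketched in a few lines. To keep the example self-contained, the CSV content is an in-memory string (an assumption for illustration) instead of a file on disk; pd.read_csv accepts a file path in the same way.

```python
import io
import pandas as pd

# Hypothetical CSV content; B has no mark recorded
csv_text = "Name,Marks\nA,80\nB,\nC,85\n"
df = pd.read_csv(io.StringIO(csv_text))

missing_count = df["Marks"].isna().sum()   # count missing values
filled = df.fillna({"Marks": 0})           # replace missing marks with 0
high = filled[filled["Marks"] > 50]        # filter rows by a condition
average = df["Marks"].mean()               # missing values are skipped

print(missing_count)  # 1
print(average)        # 82.5
```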

2.1 Series in Pandas


📊 Series in Pandas

1. Introduction

A Series is one of the main data structures in Pandas.

👉 It is a one-dimensional labeled array


👉 It can store any type of data like numbers, text, etc.

2. What is Series?

A Series is like a single column of a table.

👉 It has values and an index


👉 Index helps to access data easily

3. Creating a Series
import pandas as pd

s = pd.Series([10, 20, 30, 40])


print(s)

4. Output Example
0    10
1    20
2    30
3    40
dtype: int64

👉 Left side = index


👉 Right side = values

5. Series with Custom Index

We can give our own index names.


s = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(s)

6. Accessing Elements in Series


print(s["b"])

👉 Output: 20

7. Key Features of Series

• One-dimensional data
• Has index and values
• Can store different data types
• Easy to access and modify data

2.2 DataFrame in Pandas

📊 DataFrame in Pandas

1. Introduction

A DataFrame is one of the most important data structures in Pandas.

👉 It is used to store data in table format (rows and columns)


👉 It is similar to an Excel sheet or database table

2. What is DataFrame?

A DataFrame is a 2-dimensional labeled data structure.

👉 It has rows and columns


👉 Each column can store different types of data

3. Creating a DataFrame
import pandas as pd

data = {
"Name": ["A", "B", "C"],
"Marks": [80, 90, 85]
}
df = pd.DataFrame(data)
print(df)

4. Output Example
  Name  Marks
0    A     80
1    B     90
2    C     85

👉 Rows = records
👉 Columns = attributes

5. Accessing Data in DataFrame

(a) Column access

print(df["Name"])

(b) Row access

print(df.loc[0])

6. Key Features of DataFrame

• Stores data in tabular form
• Has rows and columns
• Can handle large data
• Easy to filter and analyze data

3.1 Dropping Entries in Pandas

🗑️Dropping Entries in Pandas

1. Introduction

Dropping entries means removing unwanted data from a DataFrame.

👉 It helps to clean data


👉 We can remove rows or columns
2. What is Dropping?

In Pandas, we use drop() function to delete data.

👉 It removes:

• Rows
• Columns

3. Dropping a Row

A row can be removed using its index number.

import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Marks": [80, 90, 85]
})

df = df.drop(1)
print(df)

👉 Row with index 1 is removed

4. Dropping a Column

To remove a column, we use axis=1.

df = df.drop("Marks", axis=1)
print(df)

👉 Column “Marks” is removed

5. Important Points

• drop() is used to remove data
• axis=0 → row removal
• axis=1 → column removal
• The original DataFrame is not changed unless the result is reassigned
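
The last point can be checked directly. In this sketch the result of drop() is stored under new names (dropped and without_marks, chosen for illustration), so the original df stays intact.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Marks": [80, 90, 85]
})

dropped = df.drop(1)                      # new DataFrame without row 1
without_marks = df.drop(columns="Marks")  # same as df.drop("Marks", axis=1)

print(len(df))                      # 3, the original still has every row
print(len(dropped))                 # 2
print(list(without_marks.columns))  # ['Name']
```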

3.2 Indexing in Pandas


📌 Indexing in Pandas

1. Introduction

Indexing in Pandas means accessing data from rows and columns using labels or positions.

👉 It helps to get specific data from a DataFrame or Series easily.

2. What is Indexing?

Each row in Pandas has an index number or label.

👉 We use this index to access data


👉 It works like an address of data

3. Types of Indexing
(a) Column Indexing

Used to access a full column.

import pandas as pd

df = pd.DataFrame({
"Name": ["A", "B", "C"],
"Marks": [80, 90, 85]
})

print(df["Name"])

👉 Gives all values of Name column

(b) Row Indexing using loc

loc is used to access rows by label.

print(df.loc[0])

👉 Gives first row data

(c) Row Indexing using iloc

iloc is used to access rows by position.


print(df.iloc[0])

👉 Gives first row data based on position

4. Simple Meaning

• Indexing means finding and selecting data
• It helps to access rows and columns easily
• It is used for data analysis
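
With the default index (0, 1, 2, ...) loc and iloc return the same row, which hides the difference between them. The sketch below uses a custom index (the labels 10, 20, 30 are an assumption for illustration) to make it visible.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["A", "B", "C"], "Marks": [80, 90, 85]},
    index=[10, 20, 30],            # custom row labels
)

by_label = df.loc[10]      # row whose label is 10
by_position = df.iloc[0]   # row at position 0 (the first row)

print(by_label["Name"])     # A
print(by_position["Name"])  # A

# df.loc[0] would raise a KeyError here, because no row is labeled 0
```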

3.3 Selection Function in Pandas

📌 Selection Function in Pandas

1. Introduction

Selection in Pandas means choosing specific data from a DataFrame or Series.

👉 It is used to access required rows or columns


👉 It helps to work with only needed data

2. What is Selection?

Selection is a process of extracting data based on column name or row position.

3. Types of Selection

(a) Column Selection

Used to select a single column from a DataFrame.

import pandas as pd

df = pd.DataFrame({
"Name": ["A", "B", "C"],
"Marks": [80, 90, 85]
})
print(df["Name"])

👉 This selects only the Name column

(b) Row Selection using (loc)

loc is used to select rows using labels (index).

print(df.loc[0])

👉 This selects first row

(c) Row Selection using (iloc)

iloc is used to select rows using position.

print(df.iloc[0])

👉 This also selects first row based on position

4. Simple Meaning

• Selection means choosing required data
• It can select rows or columns
• It helps in data analysis

3.4 Filtering Function in Pandas

📊 Filtering Function in Pandas

1. Introduction

Filtering in Pandas means selecting only those rows which satisfy a condition.

👉 It helps to remove unwanted data


👉 It shows only useful data

2. What is Filtering?
Filtering is a process of checking a condition and selecting matching data.

👉 If condition is True → data is selected


👉 If False → data is not selected

3. Example of Filtering
import pandas as pd

df = pd.DataFrame({
"Name": ["A", "B", "C"],
"Marks": [80, 90, 85]
})

print(df[df["Marks"] > 80])

4. Output
  Name  Marks
1    B     90
2    C     85

👉 Only rows with Marks greater than 80 are shown

5. Multiple Conditions

We can use more than one condition.

print(df[(df["Marks"] > 80) & (df["Marks"] < 90)])

👉 Selects rows with marks strictly between 80 and 90 (here only C with 85)

6. Simple Meaning

• Filtering means selecting data based on condition
• Uses True/False logic
• Helps to find required data
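
Conditions can also be combined with | (or), and isin() checks membership in a list of values. A minimal sketch with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Marks": [80, 90, 85]
})

# Rows where Marks is at least 90 OR Name is "A"
either = df[(df["Marks"] >= 90) | (df["Name"] == "A")]

# Rows whose Name is one of the listed values
chosen = df[df["Name"].isin(["A", "C"])]

print(either["Name"].tolist())  # ['A', 'B']
print(chosen["Name"].tolist())  # ['A', 'C']
```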

3.5 Applications in Pandas

📊 Application (apply) in Pandas

1. Introduction

Application in Pandas means applying a function to data.


👉 It is done using the apply() function
👉 Used to modify or process data easily

2. What is apply()?

apply() is used to apply a function to each value, row, or column in a DataFrame or Series.

3. Example on Series
import pandas as pd

s = pd.Series([1, 2, 3])

s = s.apply(lambda x: x * 2)
print(s)

👉 Each value is multiplied by 2

4. Example on DataFrame
df = pd.DataFrame({
"Marks": [80, 90, 85]
})

df["Marks"] = df["Marks"].apply(lambda x: x + 5)
print(df)

👉 Adds 5 to each value
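
apply() can also work on whole rows or columns of a DataFrame. In the sketch below, axis=1 passes one row at a time to the function, while the default axis=0 passes one column at a time. The column names Maths and Science are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Maths": [80, 90, 85],
    "Science": [70, 95, 60]
})

# Row-wise: add the two marks of each student
totals = df.apply(lambda row: row["Maths"] + row["Science"], axis=1)

# Column-wise (default axis=0): highest mark in each subject
best = df.apply(lambda col: col.max())

print(totals.tolist())  # [150, 185, 145]
print(best["Maths"], best["Science"])  # 90 95
```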

3.6 Mapping in Pandas

📊 Mapping (map) in Pandas

1. Introduction

Mapping in Pandas means changing or transforming values in a Series.

👉 It is done using the map() function


👉 Series.map() works on one Series (one column) at a time

2. What is map()?

map() is used to apply a function or replace values using a dictionary.


👉 It changes each value one by one

3. Example using Function


import pandas as pd

s = pd.Series([1, 2, 3])

s = s.map(lambda x: x * 2)
print(s)

👉 Each value is multiplied by 2

4. Example using Dictionary


s = pd.Series(["A", "B", "C"])

s = s.map({"A": "Apple", "B": "Ball", "C": "Cat"})


print(s)

👉 Values are replaced using dictionary

5. Important Points

• Series.map() works on a single Series (one column)
• Applies function to each value
• Can replace values using dictionary
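
One detail worth knowing when mapping with a dictionary: a value that has no matching key becomes NaN (missing). A small sketch, where "D" is deliberately left out of the dictionary:

```python
import pandas as pd

s = pd.Series(["A", "B", "D"])

# "D" has no entry in the dictionary, so it maps to NaN
mapped = s.map({"A": "Apple", "B": "Ball"})

print(mapped.iloc[0])           # Apple
print(pd.isna(mapped.iloc[2]))  # True
```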

3.6 sorting in Pandas

📊 Sorting in Pandas
1. Introduction
Sorting in Pandas means arranging data in a specific order.

👉 Data can be arranged in:

• Ascending order (small to large)
• Descending order (large to small)

2. What is Sorting?
Sorting helps to organize data properly so it becomes easy to read and analyze.

3. Sorting by Values
We use sort_values() to sort data based on column values.

Example:
import pandas as pd

df = pd.DataFrame({
"Name": ["A", "B", "C"],
"Marks": [80, 90, 85]
})

df = df.sort_values("Marks")
print(df)

👉 Sorts data by Marks in ascending order

4. Descending Order
df = df.sort_values("Marks", ascending=False)
print(df)

👉 Sorts data from highest to lowest

5. Sorting by Index
We use sort_index() to sort data by index.

df = df.sort_index()

6. Simple Meaning
• Sorting = arranging data in order
• Helps in better understanding of data
• Can sort by values or index
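
sort_values() also accepts a list of columns, which helps when some values are equal. The sketch below sorts by Marks (highest first) and breaks ties by Name; the data is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Marks": [80, 90, 80, 85]
})

# Marks descending, then Name ascending for equal marks
by_marks = df.sort_values(["Marks", "Name"], ascending=[False, True])

# sort_index() brings the rows back to their original order
restored = by_marks.sort_index()

print(by_marks["Name"].tolist())  # ['B', 'D', 'A', 'C']
print(restored["Name"].tolist())  # ['A', 'B', 'C', 'D']
```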

3.6 Ranking in Pandas

📊 Ranking in Pandas
1. Introduction
Ranking in Pandas means assigning a position or rank to each value in a dataset.

👉 It tells which value is highest, second highest, and so on


👉 It is used for comparison of data

2. What is Ranking?
Ranking assigns a number (rank) to each value based on its size.

👉 Smallest value gets rank 1 (by default)
👉 Larger values get higher rank numbers

3. Example of Ranking
import pandas as pd

s = pd.Series([50, 80, 70, 90])

print(s.rank())

4. Output Example
0    1.0
1    3.0
2    2.0
3    4.0
dtype: float64

👉 Each value gets a rank based on its position in sorted order (smallest = 1.0)

5. Types of Ranking
(a) Ascending Rank (default)

Smallest value gets lowest rank.

(b) Descending Rank


print(s.rank(ascending=False))

👉 Highest value gets rank 1


6. Simple Meaning
• Ranking means giving position to values
• Used for comparison
• Helps to find top or lowest values
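
When two values are equal, rank() has to decide how to share the positions. By default it gives each tied value the average of the positions they occupy; method="min" gives them the lowest position instead. A short sketch with made-up data:

```python
import pandas as pd

s = pd.Series([50, 80, 80, 90])

default_rank = s.rank()                   # ties share the average rank
min_rank = s.rank(method="min")           # ties share the lowest rank
descending = s.rank(ascending=False)      # largest value gets rank 1

print(default_rank.tolist())  # [1.0, 2.5, 2.5, 4.0]
print(min_rank.tolist())      # [1.0, 2.0, 2.0, 4.0]
print(descending.tolist())    # [4.0, 2.5, 2.5, 1.0]
```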
