1. Creating a NumPy Array
• Basic ndarray
• Array of zeros
• Array of ones
• Random numbers in ndarray
• An array of your choice
• Identity matrix in NumPy
• Evenly spaced ndarray
• Basic ndarray
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
#output
[1 2 3 4 5]
• Array of zeros
zeros_arr = np.zeros((3, 3)) # 3x3 matrix of zeros
print(zeros_arr)
#output
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
• Array of ones
ones_arr = np.ones((2, 4)) # 2x4 matrix of ones
print(ones_arr)
#output
[[1. 1. 1. 1.]
[1. 1. 1. 1.]]
• Random numbers in ndarray
random_arr = np.random.rand(3, 3) # 3x3 matrix with random values in [0, 1)
print(random_arr)
#output (values differ on each run)
[[0.5488135 0.71518937 0.60276338]
[0.54488318 0.4236548 0.64589411]
[0.43758721 0.891773 0.96366276]]
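The random values change on every run. A minimal sketch, assuming the legacy global seeding function np.random.seed (not used in the original) and an arbitrary seed value of 0, shows one way to make the draw repeatable:

```python
import numpy as np

# Seeding the legacy global generator makes np.random.rand repeatable.
np.random.seed(0)             # the seed value 0 is an arbitrary choice
first = np.random.rand(3, 3)

np.random.seed(0)             # re-seed with the same value
second = np.random.rand(3, 3)

print(np.array_equal(first, second))  # True: same seed, same values
print(first.shape)                    # (3, 3)
```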
• An array of your choice
custom_arr = np.array([[10, 20, 30], [40, 50, 60]]) # Custom 2x3 array
print(custom_arr)
#output
[[10 20 30]
[40 50 60]]
• Identity matrix in NumPy
identity_matrix = np.eye(4) # 4x4 Identity matrix
print(identity_matrix)
#output
[[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 1. 0.]
[0. 0. 0. 1.]]
• Evenly spaced ndarray
evenly_spaced = np.linspace(0, 10, 5) # 5 evenly spaced values from 0 to 10
print(evenly_spaced)
#output
[ 0. 2.5 5. 7.5 10. ]
2. The Shape and Reshaping of NumPy Array
• Dimensions of NumPy array
• Shape of NumPy array
• Size of NumPy array
• Reshaping a NumPy array
• Flattening a NumPy array
• Transpose of a NumPy array
• Dimensions of NumPy array
The number of dimensions (axes) in an array is given by the ndim attribute:
import numpy as np
# Creating a 2-D array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(array_2d.ndim)
# Output: 2
• Shape of NumPy array
The shape of an array, representing the size of each dimension, is accessible via the shape attribute:
print(array_2d.shape)
# Output: (2, 3)
• Size of NumPy array
The total number of elements in the array is obtained using the size attribute:
print(array_2d.size)
# Output: 6
• Reshaping a NumPy array
To change the shape of an array without altering its data, use the reshape() method. The new shape
must be compatible with the original size:
array_1d = np.array([1, 2, 3, 4, 5, 6])
array_2d = array_1d.reshape((2, 3))
print(array_2d)
# Output:
# [[1 2 3]
# [4 5 6]]
Alternatively, you can use the np.reshape() function:
array_2d = np.reshape(array_1d, (2, 3))
print(array_2d)
# Output:
# [[1 2 3]
# [4 5 6]]
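The compatibility rule can be sketched as follows. The -1 placeholder and the deliberately wrong shape (4, 2) are illustrative choices, not part of the original exercise:

```python
import numpy as np

array_1d = np.array([1, 2, 3, 4, 5, 6])

# -1 asks NumPy to infer that dimension from the total size (6 / 3 = 2 rows).
inferred = array_1d.reshape((-1, 3))
print(inferred.shape)  # (2, 3)

# An incompatible shape (4 * 2 = 8 elements, but only 6 exist) raises ValueError.
try:
    array_1d.reshape((4, 2))
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("Cannot reshape:", err)
```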
• Flattening a NumPy array
To convert a multi-dimensional array into a 1-D array, use the flatten() method or ravel() function:
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
array_1d = array_2d.flatten()
print(array_1d)
# Output: [1 2 3 4 5 6]
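flatten() and ravel() give the same values but differ in memory behaviour. A short sketch, assuming a contiguous input array (the case in which ravel() can return a view):

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = array_2d.flatten()  # always a new, independent array
flat_view = array_2d.ravel()    # a view here, because array_2d is contiguous

flat_copy[0] = 100   # does not touch array_2d
flat_view[1] = 200   # writes through to array_2d

print(array_2d[0, 0], array_2d[0, 1])  # 1 200
```

Use flatten() when the result will be modified independently, and ravel() when avoiding a copy matters.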
• Transpose of a NumPy array
To transpose an array (swap rows and columns), use the T attribute:
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
array_transposed = array_2d.T
print(array_transposed)
# Output:
# [[1 4]
# [2 5]
# [3 6]]
3. Expanding and Squeezing a NumPy Array
• Expanding a NumPy array
• Squeezing a NumPy array
• Sorting in NumPy Array
• Expanding a NumPy array
Method: np.expand_dims()
Syntax:
np.expand_dims(a, axis)
• a: The input array.
• axis: The position in the result shape where the new axis will be added.
Example:
import numpy as np
# Original 1D array
arr = np.array([1, 2, 3, 4])
print("Original array:", arr)
# Expanding the array to 2D (axis=0)
expanded_arr = np.expand_dims(arr, axis=0)
print("Expanded array (axis=0):", expanded_arr)
# Expanding the array to 2D (axis=1)
expanded_arr_2 = np.expand_dims(arr, axis=1)
print("Expanded array (axis=1):", expanded_arr_2)
Output:
Original array: [1 2 3 4]
Expanded array (axis=0): [[1 2 3 4]]
Expanded array (axis=1): [[1]
[2]
[3]
[4]]
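The effect of the axis argument is easiest to see from the resulting shapes. A minimal sketch; the comparison with np.newaxis is an addition for illustration:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

row = np.expand_dims(arr, axis=0)     # new axis in front   -> shape (1, 4)
column = np.expand_dims(arr, axis=1)  # new axis at the end -> shape (4, 1)

print(arr.shape, row.shape, column.shape)  # (4,) (1, 4) (4, 1)

# Indexing with np.newaxis gives the same result as expand_dims.
print(np.array_equal(column, arr[:, np.newaxis]))  # True
```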
• Squeezing a NumPy array
Syntax:
np.squeeze(a, axis=None)
• a: The input array.
• axis: Optional. If specified, only the dimensions with size 1 at that axis will be removed. If
None (default), all singleton dimensions (size 1) will be removed.
Example:
Basic Squeezing
import numpy as np
# Creating a 3D array with shape (1, 4, 1)
arr = np.array([[[1], [2], [3], [4]]])
print("Original array shape:", arr.shape)
print("Original array:\n", arr)
# Squeezing the array (removes the singleton dimensions)
squeezed_arr = np.squeeze(arr)
print("Squeezed array shape:", squeezed_arr.shape)
print("Squeezed array:", squeezed_arr)
Output
Original array shape: (1, 4, 1)
Original array:
[[[1]
[2]
[3]
[4]]]
Squeezed array shape: (4,)
Squeezed array: [1 2 3 4]
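The optional axis argument described in the syntax can be sketched like this, reusing the same (1, 4, 1) array. Squeezing an axis whose size is not 1 raises ValueError:

```python
import numpy as np

arr = np.array([[[1], [2], [3], [4]]])  # shape (1, 4, 1)

only_first = np.squeeze(arr, axis=0)  # drops axis 0 -> shape (4, 1)
only_last = np.squeeze(arr, axis=2)   # drops axis 2 -> shape (1, 4)
print(only_first.shape, only_last.shape)  # (4, 1) (1, 4)

# Axis 1 has size 4, so it cannot be squeezed.
try:
    np.squeeze(arr, axis=1)
    squeeze_failed = False
except ValueError:
    squeeze_failed = True
print(squeeze_failed)  # True
```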
• Sorting in NumPy Arrays
Methods for Sorting in NumPy:
np.sort(): Returns a sorted copy of the array.
ndarray.sort(): Sorts the array in-place, modifying the original array.
1. np.sort()
import numpy as np
arr = np.array([3, 1, 4, 1, 5, 9, 2])
sorted_arr = np.sort(arr)
print("Original array:", arr)
print("Sorted array:", sorted_arr)
Output:
Original array: [3 1 4 1 5 9 2]
Sorted array: [1 1 2 3 4 5 9]
2. ndarray.sort() (In-place sorting)
arr = np.array([3, 1, 4, 1, 5, 9, 2])
arr.sort()
print("Array after in-place sorting:", arr)
Output:
Array after in-place sorting: [1 1 2 3 4 5 9]
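The difference between the two methods can be checked side by side. The descending-order step at the end is an extra illustration, done by reversing the ascending result:

```python
import numpy as np

arr = np.array([3, 1, 4, 1, 5, 9, 2])

sorted_copy = np.sort(arr)   # new array; arr keeps its original order
print(arr)                   # [3 1 4 1 5 9 2]

return_value = arr.sort()    # sorts arr itself and returns None
print(return_value)          # None
print(arr)                   # [1 1 2 3 4 5 9]

# Descending order: reverse the ascending result with a slice.
descending = np.sort(arr)[::-1]
print(descending)            # [9 5 4 3 2 1 1]
```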
4. Indexing and Slicing of NumPy Array
• Slicing 1-D NumPy arrays
• Slicing 2-D NumPy arrays
• Slicing 3-D NumPy arrays
• Negative slicing of NumPy arrays
• Slicing 1-D NumPy arrays
import numpy as np
# Create a 1-D array
arr = np.array([10, 20, 30, 40, 50, 60, 70])
# Basic Slicing: Elements from index 1 to 3 (excluding 4)
sliced_arr1 = arr[1:4]
print("Basic Slicing:", sliced_arr1)
# Slicing with Step: Elements from index 0 to 6 with a step of 2
sliced_arr2 = arr[0:7:2]
print("Slicing with Step:", sliced_arr2)
# Slicing from the End: Last three elements
sliced_arr3 = arr[-3:]
print("Slicing from the End:", sliced_arr3)
• Slicing 2-D NumPy arrays
import numpy as np
# Create a 2-D array
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
# Basic Slicing: Slice elements from rows 0 to 1 and columns 1 to 2
sliced_arr1 = arr[0:2, 1:3]
print("Basic Slicing:", sliced_arr1)
# Slicing with Step: Slice every other element from rows and columns
sliced_arr2 = arr[::2, ::2]
print("Slicing with Step:", sliced_arr2)
# Slicing Rows or Columns: Slice the second row
sliced_arr3 = arr[1, :]
print("Slicing Rows:", sliced_arr3)
• Slicing 3-D NumPy arrays
import numpy as np
# Create a 3-D array
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]], [[13, 14, 15], [16, 17, 18]]])
# Basic Slicing: indices 0 to 1 in the first dimension, index 0 in the second
# dimension, and all elements in the third dimension
sliced_arr1 = arr[0:2, 0:1, :]
print("Basic Slicing:", sliced_arr1)
# Slicing with Step: Slice elements with a step of 2 in the first dimension, and all elements in
# the other dimensions
sliced_arr2 = arr[::2, :, :]
print("Slicing with Step:", sliced_arr2)
# Slicing Specific Rows or Columns: Slice the second row from each 2-D array
sliced_arr3 = arr[:, 1, :]
print("Slicing Specific Rows:", sliced_arr3)
# Slicing Specific Rows or Columns: Slice the second column from each 2-D array
sliced_arr4 = arr[:, :, 1]
print("Slicing Specific Columns:", sliced_arr4)
• Negative slicing of NumPy arrays
import numpy as np
# Negative Slicing in 1-D Arrays
arr = np.array([10, 20, 30, 40, 50])
print(arr[-3:]) # Output: [30 40 50]
# Negative Slicing in 2-D Arrays
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr_2d[-2:, -2:]) # Output: [[5 6] [8 9]]
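One point worth knowing when slicing: a basic slice is a view of the original array, not a copy. A minimal sketch, with variable names chosen only for illustration:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

view = arr[1:4]                 # basic slice: shares memory with arr
independent = arr[1:4].copy()   # explicit copy: separate memory

view[0] = 99           # also changes arr[1]
independent[1] = -1    # arr is unaffected

print(arr)             # [10 99 30 40 50]
```

Call .copy() on a slice whenever the original array must stay unchanged.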
5. Stacking and Concatenating Numpy Arrays
• Stacking ndarrays
• Concatenating ndarrays
• Broadcasting in Numpy Arrays
• Stacking ndarrays
import numpy as np
# Using np.stack()
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
# Stack arrays along rows (axis=0)
result_rows = np.stack((array1, array2), axis=0)
print("Stacked array along rows (axis=0):\n", result_rows)
# Output:
# Stacked array along rows (axis=0):
# [[1 2 3]
# [4 5 6]]
• Concatenating ndarrays
# Using np.concatenate()
import numpy as np
# Create two 1D arrays
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
# Concatenate arrays
result = np.concatenate((array1, array2))
print("Concatenated array:\n", result)
# Output:
# Concatenated array:
# [1 2 3 4 5 6]
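Stacking adds a new axis, while concatenating joins along an existing one. The shapes make the difference visible; the axis=1 case is an extra illustration:

```python
import numpy as np

array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])

stacked_rows = np.stack((array1, array2), axis=0)  # new leading axis  -> shape (2, 3)
stacked_cols = np.stack((array1, array2), axis=1)  # new trailing axis -> shape (3, 2)
joined = np.concatenate((array1, array2))          # same axis, longer -> shape (6,)

print(stacked_rows.shape, stacked_cols.shape, joined.shape)  # (2, 3) (3, 2) (6,)
print(stacked_cols)
# [[1 4]
#  [2 5]
#  [3 6]]
```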
• Broadcasting in Numpy Arrays
1D Array and 2D Array
import numpy as np
# Create a 1D array
array1 = np.array([1, 2, 3])
# Create a 2D array
array2 = np.array([[4, 5, 6], [7, 8, 9]])
# Perform addition
result = array1 + array2
print(result)
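The addition works because the shapes (3,) and (2, 3) agree on the last axis, so the 1D array is reused for every row. A short sketch of the result, plus a mismatched shape (an illustrative counter-example) that fails:

```python
import numpy as np

array1 = np.array([1, 2, 3])               # shape (3,)
array2 = np.array([[4, 5, 6], [7, 8, 9]])  # shape (2, 3)

result = array1 + array2   # array1 is treated as shape (1, 3) and reused per row
print(result.shape)        # (2, 3)
print(result)
# [[ 5  7  9]
#  [ 8 10 12]]

# Shapes (2,) and (2, 3) do not line up on the last axis, so this fails.
try:
    np.array([1, 2]) + array2
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)    # True
```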
6. Perform the following operations using pandas
• Creating dataframe
• concat()
• Setting conditions
• Adding a new column
• Creating dataframe
import pandas as pd
# 1. Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
print("DataFrame:")
print(df)
• concat()
import pandas as pd
# 1. Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
print("DataFrame:")
print(df)
# 2. Using concat() to merge two DataFrames
data2 = {'Name': ['David', 'Eve'],
'Age': [28, 22],
'City': ['Houston', 'Miami']}
df2 = pd.DataFrame(data2)
merged_df = pd.concat([df, df2], ignore_index=True)
print("\nConcatenated DataFrame:")
print(merged_df)
• Setting conditions
import pandas as pd
# 1. Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
# 2. Setting a condition: keep only the rows where Age is greater than 25
filtered_df = df[df['Age'] > 25]
print("\nFiltered DataFrame (Age > 25):")
print(filtered_df)
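Several conditions can be combined in one filter. A minimal sketch on the same sample data; the particular conditions are chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['New York', 'Los Angeles', 'Chicago']})

# Wrap each condition in parentheses; use & for AND, | for OR.
older_in_chicago = df[(df['Age'] > 25) & (df['City'] == 'Chicago')]
young_or_la = df[(df['Age'] < 30) | (df['City'] == 'Los Angeles')]

print(older_in_chicago['Name'].tolist())  # ['Charlie']
print(young_or_la['Name'].tolist())       # ['Alice', 'Bob']
```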
• Adding a new column
import pandas as pd
# 1. Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
# 2. Adding a new column: the list must have one value per row
df['Salary'] = [50000, 60000, 70000]
print("\nDataFrame with New Column (Salary):")
print(df)
7. Perform the following operations using pandas
• Filling NaN with string
• Sorting based on column values
• groupby()
• Filling NaN with string
import pandas as pd
data_with_nan = {'Name': ['Alice', 'Bob', None],
'Age': [25, None, 35],
'City': ['New York', 'Los Angeles', None]}
df_nan = pd.DataFrame(data_with_nan)
df_filled = df_nan.fillna('Unknown')
print("\nDataFrame with NaN filled:")
print(df_filled)
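Filling every column with the same string also turns the numeric Age column into mixed text and numbers. One alternative, sketched here as an assumption rather than as part of the exercise, is to pass a dict so that each column gets its own fill value:

```python
import pandas as pd

df_nan = pd.DataFrame({'Name': ['Alice', 'Bob', None],
                       'Age': [25, None, 35],
                       'City': ['New York', 'Los Angeles', None]})

# A dict maps each column to its own replacement value.
df_filled = df_nan.fillna({'Name': 'Unknown',
                           'Age': df_nan['Age'].mean(),  # mean of 25 and 35
                           'City': 'Unknown'})

print(df_filled['Age'].tolist())   # [25.0, 30.0, 35.0]
print(df_filled['Name'].tolist())  # ['Alice', 'Bob', 'Unknown']
```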
• Sorting based on column values
import pandas as pd
# 1. Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
df_sorted = df.sort_values(by='Age', ascending=True)
print("\nSorted DataFrame by Age:")
print(df_sorted)
• groupby()
import pandas as pd
data_group = {'Category': ['A', 'B', 'A', 'B', 'A'],
'Values': [10, 20, 30, 40, 50]}
df_group = pd.DataFrame(data_group)
grouped_df = df_group.groupby('Category').sum()
print("\nGrouped DataFrame by Category:")
print(grouped_df)
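groupby() is not limited to sum(). A short sketch using agg() to compute several statistics per group on the same data; the choice of statistics is illustrative:

```python
import pandas as pd

df_group = pd.DataFrame({'Category': ['A', 'B', 'A', 'B', 'A'],
                         'Values': [10, 20, 30, 40, 50]})

# agg() applies several aggregations to each group at once.
summary = df_group.groupby('Category')['Values'].agg(['sum', 'mean', 'count'])
print(summary)
```

Category A contains 10, 30 and 50 (sum 90, mean 30, count 3); category B contains 20 and 40 (sum 60, mean 30, count 2).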
8. Read the following file formats using pandas
• Text files
1. [Link]:
Name Age City
Alice 25 New York
Bob 30 Los Angeles
2. Program:
import pandas as pd
# Read a text file (e.g., tab-separated)
df_text = pd.read_csv('[Link]', sep='\t') # Use sep=' ' for space-separated files
print("Text File Data:")
print(df_text)
• CSV files
[Link]:
Name,Age,City
Alice,25,New York
Bob,30,Los Angeles
Program:
# Read a CSV file
df_csv = pd.read_csv('[Link]')
print("\nCSV File Data:")
print(df_csv)
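To try read_csv() without creating files first, the same sample contents can be supplied as in-memory text through io.StringIO. This is a sketch for experimentation; the variable names are illustrative:

```python
import io
import pandas as pd

csv_text = "Name,Age,City\nAlice,25,New York\nBob,30,Los Angeles\n"
tsv_text = "Name\tAge\tCity\nAlice\t25\tNew York\nBob\t30\tLos Angeles\n"

df_csv = pd.read_csv(io.StringIO(csv_text))            # comma is the default separator
df_tsv = pd.read_csv(io.StringIO(tsv_text), sep='\t')  # tab-separated text

print(df_csv.equals(df_tsv))  # True: both parse to the same table
print(df_csv.dtypes)
```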
• Excel files
[Link]:
Program:
# Read an Excel file
df_excel = pd.read_excel('[Link]', sheet_name='Sheet1') # Replace 'Sheet1' with your sheet name
print("\nExcel File Data:")
print(df_excel)
• JSON files
[Link]:
[
{"Name": "Alice", "Age": 25, "City": "New York"},
{"Name": "Bob", "Age": 30, "City": "Los Angeles"}
]
Program:
# Read a JSON file
df_json = pd.read_json('[Link]')
print("\nJSON File Data:")
print(df_json)
9. Read the following file formats
[Link] (Pickle File)
A serialized DataFrame or Python object.
• Pickle files
import pandas as pd
# Read a Pickle file
df_pickle = pd.read_pickle('[Link]')
print("Pickle File Data:")
print(df_pickle)
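A pickle file has to be written before it can be read. A minimal round-trip sketch, assuming a temporary directory and a hypothetical file name example.pkl:

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.pkl')  # hypothetical file name
    df.to_pickle(path)                  # serialize the DataFrame
    df_restored = pd.read_pickle(path)  # load it back

print(df.equals(df_restored))  # True
```

Only load pickle files from trusted sources, because unpickling can execute arbitrary code.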
[Link] (Image File)
Any image file (e.g., JPEG, PNG).
pip install Pillow
• Image files using PIL
from PIL import Image
# Read an image file
image = Image.open('[Link]')
# Display image properties
print("\nImage File Properties:")
print(f"Format: {image.format}, Size: {image.size}, Mode: {image.mode}")
# Show the image (optional)
image.show()
• Multiple files using Glob
data_1.csv:
Name,Age,City
Alice,25,New York
Bob,30,Los Angeles
data_2.csv:
Name,Age,City
Charlie,35,Chicago
David,40,Houston
import pandas as pd
import glob
# Read multiple CSV files matching a pattern
file_paths = glob.glob('data_*.csv') # Matches files like data_1.csv, data_2.csv, etc.
df_list = [pd.read_csv(file) for file in file_paths]
df_combined = pd.concat(df_list, ignore_index=True)
print("\nCombined Data from Multiple Files:")
print(df_combined)
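A self-contained version of the same idea is sketched below. It writes the two sample files into a temporary directory first, and sorts the matched paths because glob.glob() does not promise any particular order:

```python
import glob
import os
import tempfile
import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write the two sample files into the temporary directory.
    with open(os.path.join(tmp_dir, 'data_1.csv'), 'w') as f:
        f.write("Name,Age,City\nAlice,25,New York\nBob,30,Los Angeles\n")
    with open(os.path.join(tmp_dir, 'data_2.csv'), 'w') as f:
        f.write("Name,Age,City\nCharlie,35,Chicago\nDavid,40,Houston\n")

    # Sort the matches for a stable row order in the combined table.
    file_paths = sorted(glob.glob(os.path.join(tmp_dir, 'data_*.csv')))
    df_combined = pd.concat([pd.read_csv(p) for p in file_paths],
                            ignore_index=True)

print(df_combined['Name'].tolist())  # ['Alice', 'Bob', 'Charlie', 'David']
```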
• Importing data from database
import pandas as pd
import sqlite3
# Connect to the SQLite database
conn = sqlite3.connect('[Link]')
# Query the database
query = "SELECT * FROM Dept"
df_db = pd.read_sql(query, conn)
# Close the connection
conn.close()
print("\nData from Database:")
print(df_db)
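The program above needs an existing database with a Dept table. To try it without one, an in-memory SQLite database can be built first. In this sketch the column names DeptId and DeptName and the two rows are hypothetical:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(':memory:')  # temporary in-memory database
conn.execute("CREATE TABLE Dept (DeptId INTEGER, DeptName TEXT)")  # hypothetical columns
conn.executemany("INSERT INTO Dept VALUES (?, ?)",
                 [(1, 'Sales'), (2, 'Research')])
conn.commit()

df_db = pd.read_sql("SELECT * FROM Dept", conn)
conn.close()

print(df_db.shape)             # (2, 2)
print(df_db.columns.tolist())  # ['DeptId', 'DeptName']
```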
10. Demonstrate web scraping using Python
pip install requests beautifulsoup4
Program:
#Reading a Web Page with Beautiful Soup
from bs4 import BeautifulSoup
import requests
# Fetch the webpage
url = '[Link]'
response = requests.get(url)
# Parse with BeautifulSoup
bs = BeautifulSoup(response.text, 'html.parser')
# Print the page title
print("Page Title:", bs.title.string)
# Extract and print all links
print("\nAll Links:")
for link in bs.find_all('a'):
    print(link.get('href')) # Print only the href attribute
# Extract and print all paragraphs
print("\nAll Paragraphs:")
for paragraph in bs.find_all('p'):
    print(paragraph.text) # Extract text from <p> tags
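The parsing steps can be tried offline by feeding BeautifulSoup an HTML string instead of a downloaded page. The page content below is hypothetical and stands in for response.text:

```python
from bs4 import BeautifulSoup

# Hypothetical page content, used in place of a downloaded response.
html = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
    <a href="/about">About</a>
    <a href="/contact">Contact</a>
  </body>
</html>
"""

bs = BeautifulSoup(html, 'html.parser')

page_title = bs.title.string
links = [link.get('href') for link in bs.find_all('a')]
paragraphs = [p.text for p in bs.find_all('p')]

print(page_title)   # Sample Page
print(links)        # ['/about', '/contact']
print(paragraphs)   # ['First paragraph.', 'Second paragraph.']
```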
11. Perform the following preprocessing techniques on a loan prediction dataset
pip install pandas scikit-learn
Sample loan prediction dataset:
import pandas as pd
# Sample loan prediction dataset
data = {
'LoanAmount': [5000, 6000, 7000, 8000, 9000],
'CreditScore': [650, 700, 750, 800, 850],
'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
'Married': ['Yes', 'No', 'Yes', 'No', 'Yes'],
'LoanStatus': ['Approved', 'Rejected', 'Approved', 'Rejected', 'Approved']
}
df = pd.DataFrame(data)
print("Original Dataset:")
print(df)
• Feature Scaling
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
# Sample loan prediction dataset
data = {
'LoanAmount': [5000, 6000, 7000, 8000, 9000],
'CreditScore': [650, 700, 750, 800, 850],
'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
'Married': ['Yes', 'No', 'Yes', 'No', 'Yes'],
'LoanStatus': ['Approved', 'Rejected', 'Approved', 'Rejected', 'Approved']
}
# Define the DataFrame before using it
df = pd.DataFrame(data)
# Feature Scaling
scaler = MinMaxScaler()
df['LoanAmount_Scaled'] = scaler.fit_transform(df[['LoanAmount']])
# Output
print("\nDataset after Feature Scaling:")
print(df[['LoanAmount', 'LoanAmount_Scaled']])
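With default settings, min-max scaling maps each value to (x - min) / (max - min). A hand-computed sketch on the LoanAmount values, using pandas only, shows what to expect from the scaler:

```python
import pandas as pd

loan = pd.Series([5000, 6000, 7000, 8000, 9000], name='LoanAmount')

# Min-max scaling: (x - min) / (max - min), which maps the column onto [0, 1].
scaled = (loan - loan.min()) / (loan.max() - loan.min())
print(scaled.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
```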
• Feature Standardization
import pandas as pd
from sklearn.preprocessing import StandardScaler
# Sample loan prediction dataset
data = {
'LoanAmount': [5000, 6000, 7000, 8000, 9000],
'CreditScore': [650, 700, 750, 800, 850],
'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
'Married': ['Yes', 'No', 'Yes', 'No', 'Yes'],
'LoanStatus': ['Approved', 'Rejected', 'Approved', 'Rejected', 'Approved']
}
# Define the DataFrame before using it
df = pd.DataFrame(data)
# Feature Standardization
standard_scaler = StandardScaler()
df['CreditScore_Standardized'] = standard_scaler.fit_transform(df[['CreditScore']])
# Output the result
print("\nDataset after Feature Standardization:")
print(df[['CreditScore', 'CreditScore_Standardized']])
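Standardization computes the z-score (x - mean) / std for each value. The sketch below does this by hand with NumPy, using the population standard deviation (ddof=0), which is the convention StandardScaler follows:

```python
import numpy as np

credit = np.array([650, 700, 750, 800, 850], dtype=float)

# z-score: (x - mean) / std, with the population standard deviation (ddof=0).
z = (credit - credit.mean()) / credit.std()

print(credit.mean())   # 750.0
print(np.round(z, 4))
```

The result has mean 0 and standard deviation 1; the smallest and largest scores land at about -1.4142 and 1.4142.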
• Label Encoding
from sklearn.preprocessing import LabelEncoder
import pandas as pd
# Sample loan prediction dataset
data = {
'LoanAmount': [5000, 6000, 7000, 8000, 9000],
'CreditScore': [650, 700, 750, 800, 850],
'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
'Married': ['Yes', 'No', 'Yes', 'No', 'Yes'],
'LoanStatus': ['Approved', 'Rejected', 'Approved', 'Rejected', 'Approved']
}
# Define the DataFrame before using it
df = pd.DataFrame(data)
# Label Encoding for LoanStatus
label_encoder = LabelEncoder()
df['LoanStatus_Encoded'] = label_encoder.fit_transform(df['LoanStatus'])
print("\nDataset after Label Encoding:")
print(df[['LoanStatus', 'LoanStatus_Encoded']])
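LabelEncoder assigns integer codes in sorted order of the class labels, and the mapping can be inspected or reversed. A short sketch on the LoanStatus values:

```python
from sklearn.preprocessing import LabelEncoder

status = ['Approved', 'Rejected', 'Approved', 'Rejected', 'Approved']

label_encoder = LabelEncoder()
encoded = label_encoder.fit_transform(status)

classes = label_encoder.classes_.tolist()  # sorted labels; the position is the code
decoded = label_encoder.inverse_transform([1, 0]).tolist()

print(classes)           # ['Approved', 'Rejected']
print(encoded.tolist())  # [0, 1, 0, 1, 0]
print(decoded)           # ['Rejected', 'Approved']
```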
• One Hot Encoding
import pandas as pd
# Sample loan prediction dataset
data = {
'LoanAmount': [5000, 6000, 7000, 8000, 9000],
'CreditScore': [650, 700, 750, 800, 850],
'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
'Married': ['Yes', 'No', 'Yes', 'No', 'Yes'],
'LoanStatus': ['Approved', 'Rejected', 'Approved', 'Rejected', 'Approved']
}
df = pd.DataFrame(data)
# One Hot Encoding for Gender and Married
df_one_hot = pd.get_dummies(df, columns=['Gender', 'Married'], drop_first=True)
print("\nDataset after One Hot Encoding:")
print(df_one_hot)
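The effect of drop_first=True is easiest to see by comparing the generated column names with and without it. A minimal sketch on the two categorical columns only:

```python
import pandas as pd

df = pd.DataFrame({'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
                   'Married': ['Yes', 'No', 'Yes', 'No', 'Yes']})

full = pd.get_dummies(df, columns=['Gender', 'Married'])
reduced = pd.get_dummies(df, columns=['Gender', 'Married'], drop_first=True)

print(full.columns.tolist())
# ['Gender_Female', 'Gender_Male', 'Married_No', 'Married_Yes']
print(reduced.columns.tolist())
# ['Gender_Male', 'Married_Yes']
```

With two categories per column, the dropped column carries no extra information: Gender_Male being false already implies Female.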
12. Perform the following visualizations using matplotlib
pip install matplotlib
Sample data:
import matplotlib.pyplot as plt
import numpy as np
# Sample data for different plots
categories = ['A', 'B', 'C', 'D']
values = [23, 45, 56, 78]
sizes = [15, 30, 45, 10] # For pie chart
x = np.linspace(0, 10, 100)
y = np.sin(x)
# Bar Graph
plt.bar(categories, values, color='skyblue')
plt.title('Bar Graph')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.show()
# Pie Chart
plt.pie(sizes, labels=categories, autopct='%1.1f%%', startangle=90,
        colors=['gold', 'lightcoral', 'lightgreen', 'lightskyblue'])
plt.title('Pie Chart')
plt.show()
# Box Plot
data = [np.random.normal(0, std, 100) for std in range(1, 4)]
plt.boxplot(data, vert=True, patch_artist=True, labels=['A', 'B', 'C'])
plt.title('Box Plot')
plt.show()
# Histogram
data = np.random.normal(170, 10, 250)
plt.hist(data, bins=30, color='lightgreen', edgecolor='black')
plt.title('Histogram')
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.show()
# Line Chart and Subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
# Subplot 1: Line Chart
ax1.plot(x, y, color='blue', label='sin(x)')
ax1.set_title('Line Chart')
ax1.set_xlabel('X')
ax1.set_ylabel('Y')
ax1.legend()
# Subplot 2: Another Line Chart
ax2.plot(x, np.cos(x), color='red', label='cos(x)')
ax2.set_title('Another Line Chart')
ax2.set_xlabel('X')
ax2.set_ylabel('Y')
ax2.legend()
plt.tight_layout()
plt.show()
# Scatter Plot
x = np.random.rand(50)
y = np.random.rand(50)
colors = np.random.rand(50)
sizes = 1000 * np.random.rand(50)
plt.scatter(x, y, c=colors, s=sizes, alpha=0.6, cmap='viridis')
plt.colorbar()
plt.title('Scatter Plot')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
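plt.show() needs a display. On a machine without one, a figure can be rendered to an image instead. The sketch below assumes the non-interactive Agg backend and writes a PNG into an in-memory buffer; a file path would work the same way:

```python
import io
import matplotlib
matplotlib.use('Agg')  # non-interactive backend: renders without opening a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, np.sin(x), color='blue', label='sin(x)')
ax.set_title('Line Chart')
ax.legend()

buffer = io.BytesIO()                       # in-memory target; a file path also works
fig.savefig(buffer, format='png', dpi=100)
plt.close(fig)                              # release the figure's memory

png_bytes = buffer.getvalue()
print(len(png_bytes) > 0)                   # True
```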