Data Science Laboratory Manual CS3361

The document is a laboratory manual for the Data Science Laboratory course (CS3361) at the Department of Computer Science and Engineering. It outlines the course objectives, experiments, and expected outcomes, focusing on Python libraries and statistical methods for data analysis. The manual includes detailed instructions for various experiments, equipment requirements, and additional resources for students to enhance their learning experience.

Uploaded by

Bella S

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

LABORATORY MANUAL

Course Code : CS3361
Course Name : DATA SCIENCE LABORATORY
Regulation : R2021

Prepared By, Approved By,

[Link]/[Link] [Link]
Christiyana AP/CSE Prof& Head /CSE

INSTITUTE VISION
To become a centre of excellence in preparing engineers with excellent technical, scientific
research and entrepreneurial abilities to contribute to the society.

INSTITUTE MISSION

1. Providing a comprehensive learning environment
2. Imparting state-of-the-art technology to fulfil the needs of the students and industry
3. Establishing Industry-Institute alliance for bilateral benefits
4. Promoting Research and Development activities
5. Offering student-led activities to inculcate ethics, social responsibilities, entrepreneurial, and leadership skills

DEPARTMENT VISION
To become a centre of excellence in technical education and scientific research in the field of
Computer Science and Engineering for the wellbeing of the society.

DEPARTMENT MISSION
1. Producing graduates with strong theoretical and practical knowledge in computer technology to meet industry expectations.
2. Offering a holistic learning ambience for faculty and students to investigate, apply and transfer knowledge.
3. Inculcating interpersonal traits among the students leading to employability and entrepreneurship.
4. Establishing effective linkage with the industries for mutual benefits.
5. Strengthening research activities to solve the problems related to industry and society.

SYLLABUS

COURSE CODE    COURSE NAME                L  T  P  C
CS3361         DATA SCIENCE LABORATORY    0  0  4  2

COURSE OBJECTIVES :
● To understand the python libraries for data science
● To understand the basic Statistical and Probability measures for data science.
● To learn descriptive analytics on the benchmark data sets.
● To apply correlation and regression analytics on standard data sets.
● To present and interpret data using visualization packages in Python.

EXPERIMENTS

1. Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels and
Pandas packages.
2. Working with Numpy arrays
3. Working with Pandas data frames
4. Reading data from text files, Excel and the web and exploring various commands for doing
descriptive analytics on the Iris data set.
5. Use the diabetes data set from UCI and Pima Indians Diabetes data set for performing the following:
a. Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard Deviation,
Skewness and Kurtosis.
b. Bivariate analysis: Linear and logistic regression modeling
c. Multiple Regression analysis
d. Also compare the results of the above analysis for the two data sets.
6. Apply and explore various plotting functions on UCI data sets.
a. Normal curves
b. Density and contour plots
c. Correlation and scatter plots
d. Histograms
e. Three dimensional plotting
7. Visualizing Geographic Data with Basemap
TOTAL: 60 Periods
CONTENT BEYOND SYLLABI: Hadoop, Apache spark

COURSE OUTCOMES:
On completion of the course, students will be able to:
CO1: Make use of the python libraries for data science
CO2: Make use of the basic Statistical and Probability measures for data science.
CO3: Perform descriptive analytics on the benchmark data sets.
CO4: Perform correlation and regression analytics on standard data sets
CO5: Present and interpret data using visualization packages in Python.

EQUIPMENT / SOFTWARE AND HARDWARE REQUIREMENT

● INTEL based desktop PC with min. 8GB RAM and 500 GB HDD, 17” or higher TFT Monitor,
Keyboard and mouse
● Windows 10 or higher operating system / Linux Ubuntu 20 or higher
● Python 3.9 or above with NumPy, SciPy, Matplotlib, Pandas and Seaborn; PyCharm

List of Experiments

Sl. No.  List of Experiments
1. Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels and Pandas packages.
2. Working with NumPy arrays
3. Working with Pandas data frames
4. Reading data from text files, Excel and the web and exploring various commands for doing descriptive analytics on the Iris data set.
5. Use the diabetes data set from UCI and Pima Indians Diabetes data set for performing the following:
a. Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and Kurtosis.
b. Bivariate analysis: Linear and logistic regression modeling
c. Multiple Regression analysis
d. Also compare the results of the above analysis for the two data sets.
6. Apply and explore various plotting functions on UCI data sets.
a. Normal curves
b. Density and contour plots
c. Correlation and scatter plots
d. Histograms
e. Three dimensional plotting
7. Visualizing Geographic Data with Basemap

Experiment 1: Download, install and explore the features of NumPy, SciPy, Jupyter,
Statsmodels and Pandas packages.

AIM
To download and install Python and its packages using pip.

PROCEDURE
Install Python Data Science Packages
Python is a high-level, general-purpose programming language with a rich set of data science and machine
learning packages. As a first step, install Python for Windows, macOS, or Linux.

Python Packages
The power of Python is in the packages that are available through the pip or conda package managers.
This section gives an overview of some widely used packages for machine learning and data science and
how to install them.
You may need to install the packages from the terminal, Anaconda prompt, command prompt, or from a
Jupyter Notebook. If you have multiple versions of Python or have specific dependencies, use an
environment manager such as pyenv. For most users, a single installation is typically sufficient. The
Python package manager pip provides all of the packages (such as gekko) that we need for this course. If
there is an administrative access error, install to the local profile with the --user flag.
pip install gekko

Gekko
Gekko provides an interface to gradient-based solvers for machine learning and
optimization of mixed-integer, differential algebraic equations, and time series models.
Gekko provides exact first and second derivatives through automatic differentiation and
discretization with simultaneous or sequential methods.
pip install gekko

Keras
Keras provides an interface for artificial neural networks. In the 2.x series, from version 2.4
onward, Keras uses TensorFlow as its only backend, which is installed separately with
pip install tensorflow. Keras 3 again supports multiple backends (TensorFlow, JAX and PyTorch).
pip install keras
Matplotlib
The package matplotlib generates plots in Python.

pip install matplotlib

Numpy
Numpy is a numerical computing package for mathematics, science, and engineering.
Many data science packages use Numpy as a dependency.
pip install numpy

OpenCV
OpenCV (Open Source Computer Vision Library) is a package for real-time computer vision,
developed with support from Intel Research.
pip install opencv-python

Pandas
Pandas visualizes and manipulates data tables. There are many functions that allow
efficient manipulation for the preliminary steps of data analysis problems.
pip install pandas

Plotly
Plotly renders interactive plots with HTML and JavaScript. Plotly Express is included with
Plotly.
pip install plotly

PyTorch
PyTorch enables deep learning, computer vision, and natural language processing.
Development is led by Facebook's AI Research lab (FAIR).
pip install torch

Scikit-Learn
Scikit-Learn (or sklearn) includes a wide variety of classification, regression and clustering
algorithms including neural network, support vector machine, random forest, gradient
boosting, k-means clustering, and other supervised or unsupervised learning methods.
pip install scikit-learn

SciPy
SciPy is a general-purpose package for mathematics, science, and engineering and extends
the base capabilities of NumPy.
pip install scipy

Seaborn
Seaborn is built on matplotlib, and produces detailed plots in few lines of code.

pip install seaborn

Statsmodels
Statsmodels is a package for exploring data, estimating statistical models, and performing
statistical tests. It includes descriptive statistics, statistical tests, plotting functions, and
result statistics.
pip install statsmodels

TensorFlow
TensorFlow is an open source machine learning platform with particular focus on
training and inference of deep neural networks. Development is led by the Google Brain
team.
pip install tensorflow
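After the packages are installed, it can help to confirm which ones are actually available in the current environment. The following is a minimal sketch using only the Python standard library; the helper name installed_versions and the PACKAGES list are illustrative choices, not part of the manual.

```python
from importlib import metadata

# Distribution names as used with "pip install" (illustrative selection)
PACKAGES = ["numpy", "scipy", "pandas", "statsmodels", "matplotlib", "seaborn"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(PACKAGES)
for name, version in report.items():
    status = version if version is not None else "not installed"
    print(f"{name:12s} {status}")
```

Any package reported as "not installed" can then be added with pip install followed by the package name.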

Augmented Questions :

1. How would you approach optimizing a Python program for performance? Discuss
techniques for profiling, identifying bottlenecks, and improving efficiency. Provide
examples of how you might apply these techniques to a data processing task.
2. In the context of data analysis, what are some best practices for ensuring data quality and
integrity? Explain how you would handle missing data, outliers, and data inconsistencies
in a dataset before performing any analysis.

Viva Questions:

1. How do you install NumPy, SciPy, Jupyter, Statsmodels, and Pandas in a Python environment?
2. Can you explain the primary functionalities of NumPy and how it is useful in scientific computing?
3. What are some common functions and features provided by SciPy, and how does it extend
NumPy’s capabilities?
4. What is Jupyter Notebook, and how does it facilitate interactive computing and data analysis?
5. Describe how Pandas and Statsmodels are used for data analysis and statistical modeling in Python.

RESULT:
Thus the downloading, installation and exploration of the features of NumPy, SciPy, Jupyter,
Statsmodels and Pandas packages were successfully completed.

Experiment 2: Working with NumPy arrays
AIM

To create a NumPy array with a specified number of dimensions using the ndmin argument
and verify the resulting number of dimensions.
ALGORITHM:
1. Import the NumPy library.
2. Create an array with a specified set of elements.
3. Use the ndmin argument to set the desired number of dimensions for the array.
4. Print the created array.
5. Print the number of dimensions of the array using the ndim attribute.
CREATE A NUMPY NDARRAY OBJECT
NumPy is used to work with arrays. The array object in NumPy is called ndarray.
Example
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))
To create an ndarray, we can pass a list, tuple or any array-like object into the array()
function, and it will be converted into an ndarray:
Example
Use a tuple to create a NumPy array:
import numpy as np
arr = np.array((1, 2, 3, 4, 5))
print(arr)

Dimensions in Arrays

A dimension in arrays is one level of array depth (nested arrays).

0-D Arrays
0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.
Example
Create a 0-D array with value 42
import numpy as np
arr = np.array(42)
print(arr)

1-D Arrays
An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.
These are the most common and basic arrays.
Example
Create a 1-D array containing the values 1,2,3,4,5:

import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array. These are often used to
represent matrix or 2nd order tensors.

Example
Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)
3-D arrays
An array that has 2-D arrays (matrices) as its elements is called 3-D array.
These are often used to represent a 3rd order tensor.
Example

Create a 3-D array with two 2-D arrays, both containing two arrays with the values
1,2,3 and 4,5,6:
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr)

Check Number of Dimensions

NumPy arrays provide the ndim attribute, which returns an integer that tells us how many
dimensions the array has.
Example

Check how many dimensions the arrays have:

import numpy as np
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

Higher Dimensional Arrays

An array can have any number of dimensions.
When the array is created, you can define the number of dimensions by using the ndmin
argument.
Example
Create an array with 5 dimensions and verify that it has 5 dimensions:
import numpy as np
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('number of dimensions :', arr.ndim)
In this array the innermost dimension (5th dim) has 4 elements, the 4th dim has 1 element
that is the vector, the 3rd dim has 1 element that is the matrix with the vector, the 2nd
dim has 1 element that is a 3D array and the 1st dim has 1 element that is a 4D array.
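The description of the five dimensions can be checked directly through the shape attribute, which lists the number of elements along each dimension. This is a small sketch that reuses the same four-element array; the variable name flat is an illustrative addition.

```python
import numpy as np

# ndmin=5 prepends length-1 dimensions until the array has 5 dimensions
arr = np.array([1, 2, 3, 4], ndmin=5)

print("ndim :", arr.ndim)
print("shape:", arr.shape)  # one entry per dimension, outermost first

# Collapsing the extra length-1 dimensions recovers the original 1-D array
flat = arr.reshape(-1)
print("flattened:", flat)
```

The shape shows a length of 1 for each of the four outer dimensions and a length of 4 for the innermost one, which matches the explanation above.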

OUTPUT:

Augmented Questions :

1. Write a Python program using NumPy to create a 3D array of shape (4, 3, 2) with
random integers between 0 and 10. Perform the following tasks:

● Compute the sum along the first axis (axis=0).

● Compute the mean along the second axis (axis=1).
● Flatten the 3D array into a 1D array and find the maximum value

2. Write a Python program using NumPy to perform the following tasks with two 1D arrays:

● Create two 1D arrays of length 10 with random integers between 1 and 20.
● Compute and print their dot product.
● Compute the element-wise product of the two arrays.
● Normalize the element-wise product by dividing it by the maximum value of the product

Viva Questions:

1. What is the difference between a NumPy array and a Python list, and why would you use a
NumPy array for numerical computations?
2. How can you create a NumPy array from a Python list or tuple? Provide an example.
3. Describe the various methods to access and manipulate elements in a NumPy array. How can
you perform slicing and indexing on a 2D array?
4. How do you perform basic arithmetic operations (such as addition, subtraction, multiplication,
and division) on NumPy arrays? What are some benefits of using NumPy's vectorized operations
over traditional loops?
5. Explain how broadcasting works in NumPy and give an example of how it can be used to
perform operations on arrays of different shapes.

RESULT
Thus the working of Numpy arrays was executed successfully.

Experiment 3: Working with Pandas data frames

AIM:

The aim is to illustrate the basic operations of creating, indexing, and loading data into a Pandas
DataFrame. This includes creating a simple DataFrame, locating specific rows using index labels,
adding named indexes, and loading data from an external CSV file into a DataFrame.
ALGORITHM:

1. Import the Pandas library.
2. Create a simple DataFrame using a Python dictionary.
3. Print the DataFrame to display its structure.
4. Use the loc attribute to locate and print specific rows based on their index.
5. Add named indexes to the DataFrame using the index argument.
6. Print the DataFrame with named indexes.
7. Use the loc attribute with named indexes to locate and print specific rows.
8. Import the Pandas library again for the file loading example.
9. Use the read_csv function to load data from a CSV file into a DataFrame.
10. Print the resulting DataFrame.
A Pandas DataFrame is a 2 dimensional data structure, like a 2 dimensional array, or a
table with rows and columns.
Example

Create a simple Pandas DataFrame:

import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
Locate Row
As you can see from the result above, the DataFrame is like a table with rows and
columns. Pandas uses the loc attribute to return one or more specified row(s).

Example
Return row 0:
#refer to the row index:
print(df.loc[0])

Example
Return row 0 and 1:
#use a list of indexes:
print(df.loc[[0, 1]])

Named Indexes
With the index argument, you can name your own indexes.
Example
Add a list of names to give each row a name:
import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(df)
Locate Named Indexes
Use the named index in the loc attribute to return the specified row(s).
Example
Return "day2":
#refer to the named index:
print(df.loc["day2"])

Load Files Into a DataFrame

If your data sets are stored in a file, Pandas can load them into a DataFrame.
Example
Load a comma separated file (CSV file) into a DataFrame:

import pandas as pd
file_path = r'C:\Users\SRM\Downloads\[Link]'
df = pd.read_csv(file_path)
print(df)
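Because the file path above is specific to one machine, the same loading step can be tried without any file on disk. The sketch below assumes a small in-memory CSV text (the values mirror the calories/duration example and are not a real data file) and reads it with read_csv through io.StringIO.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = """day,calories,duration
day1,420,50
day2,380,40
day3,390,45
"""

# index_col turns the "day" column into the named row index
df = pd.read_csv(io.StringIO(csv_text), index_col="day")
print(df)

# Label-based lookup with loc, position-based lookup with iloc
print(df.loc["day2"])
print(df.iloc[0])
```

Here loc selects rows by index label, while iloc selects rows by integer position regardless of the labels.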

OUTPUT:

AUGMENTED QUESTIONS:

1. Write a Python program using Pandas to analyze a dataset of customer transactions. The
dataset includes columns for 'CustomerID', 'TransactionDate', 'Amount', and 'Category'.
Perform the following tasks:

● Load the dataset from a CSV file.

● Convert the 'TransactionDate' column to a datetime format and set it as the index of the
DataFrame.
● Filter the DataFrame to include only transactions that occurred in the last 30 days and sort
them by 'Amount' in descending order.
● Export the filtered DataFrame to a new CSV file.

2. Create a Python program using Pandas to work with a dataset of employee records. The
dataset contains columns 'EmployeeID', 'Name', 'Department', 'JoiningDate', and 'Salary'.
Perform the following operations:

● Load the dataset and inspect its basic structure.

● Calculate the number of employees and the average salary for each department.
● Add a new column 'YearsWithCompany' to the DataFrame, representing the number of
years each employee has been with the company.
● Identify and display employees who have been with the company for more than 5 years and
have a salary above the median salary of their department.
● Save the results to an Excel file with separate sheets for each department.

VIVA QUESTIONS:

1. How can you create a Pandas DataFrame from a dictionary, and what are the key methods to
inspect the structure and content of the DataFrame? Provide an example.
2. Describe how you would handle missing data in a Pandas DataFrame. What methods are available
for detecting, removing, or imputing missing values?
3. How can you perform data filtering and selection in a Pandas DataFrame? Explain how to select
rows based on certain conditions and how to access specific columns.
4. What are some common operations for data aggregation and grouping in Pandas? How would you
use the groupby method to calculate summary statistics for different groups within a DataFrame?
5. Explain how to merge and join DataFrames in Pandas. What are the different types of joins
available, and how do you handle conflicts and overlapping column names during the merge process?

RESULT
Thus the working with pandas data frames was executed successfully.

Experiment 4: Reading data from text files, Excel and the web and exploring
various commands for doing descriptive analytics on the Iris data set.

AIM

Perform descriptive analytics on the Iris dataset using Pandas and Seaborn, including data
reading, exploration, manipulation, summary statistics, visualization, and handling missing
values.

ALGORITHM

● Read the Iris dataset from a CSV file using Pandas.

● Explore the dataset's structure and content, perform data slicing, and select
specific columns.
● Calculate summary statistics, handle missing values, and manipulate the data.
● Apply styling and visualization techniques using Seaborn for better insights.

PROGRAM:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load the Iris dataset
file_path = r'C:\Users\SRM\Downloads\[Link]'  # Adjust the path to your file location
df = pd.read_csv(file_path)
# Display the first few rows of the dataset
print("First few rows of the dataset:")
print(df.head())
# Check for missing values
print("\nMissing values in the dataset:")
print(df.isnull().sum())
# Handling missing values (if any)
df = df.dropna()  # Drop rows with missing values
# Summary statistics
print("\nSummary statistics:")
print(df.describe())
# Data exploration and visualization
sns.set(style="whitegrid")
# Pairplot to see the pairwise relationships
print("\nGenerating pairplot...")
sns.pairplot(df, hue='species')
plt.suptitle("Pairplot of the Iris Dataset", y=1.02)
plt.show()
# Boxplot to see the distribution of each feature
print("\nGenerating boxplot...")
plt.figure(figsize=(10, 6))
sns.boxplot(data=df, width=0.5, palette="colorblind")
plt.title("Boxplot of Iris Features")
plt.show()

# Correlation heatmap (numeric columns only; 'species' is text)
print("\nGenerating correlation heatmap...")
plt.figure(figsize=(8, 6))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')
plt.title("Correlation Heatmap")
plt.show()
# Distribution of each species
print("\nGenerating countplot for species distribution...")
sns.countplot(x='species', data=df, palette="Set2")
plt.title("Species Distribution")
plt.show()
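The program above reads a CSV file only, while the experiment also covers text files, Excel and the web. The following sketch, under stated assumptions, shows the text-file case with a tab-separated sample held in memory; the four rows are made-up Iris-like values, and the file name and URL in the comments are placeholders, not real resources.

```python
import io

import pandas as pd

# Hypothetical tab-separated text standing in for a .txt file
text_data = (
    "sepal_length\tsepal_width\tspecies\n"
    "5.1\t3.5\tsetosa\n"
    "4.9\t3.0\tsetosa\n"
    "7.0\t3.2\tversicolor\n"
    "6.4\t3.2\tversicolor\n"
)

# Text files: read_csv with an explicit separator
df = pd.read_csv(io.StringIO(text_data), sep="\t")

# Excel files: pd.read_excel("your_file.xlsx", sheet_name=0)   (placeholder name)
# Web data:    pd.read_csv("https://example.com/your_file.csv") (placeholder URL)

# Descriptive analytics per species
means = df.groupby("species")["sepal_length"].mean()
print(df.describe())
print(means)
```

read_excel needs an Excel engine such as openpyxl to be installed, so it is left as a comment here rather than executed.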
OUTPUT:

AUGMENTED QUESTIONS :

1. Write a Python program to read data from a CSV file named [Link] and perform the
following tasks:

● Print the first 5 rows of the DataFrame.

● Display the column names of the DataFrame

2. Write a Python program to read data from an Excel file named [Link] and perform
the following tasks:

● Print the summary statistics of the DataFrame.

● Filter and display rows where the value in the 'Age' column is greater than 30.

VIVA QUESTIONS:

1. How do you read a CSV file into a Pandas DataFrame?

2. What function would you use to read data from an Excel file in Pandas, and how can you specify
a particular sheet to load?
3. How can you load data directly from a URL into a Pandas DataFrame?
4. What Pandas functions can you use to get summary statistics (like mean, median, and standard
deviation) for the Iris dataset?
5. How can you create a scatter plot of two features from the Iris dataset using Matplotlib or Seaborn?

RESULT
Thus reading data from text files, Excel and the web and exploring various commands for doing
descriptive analytics on the Iris data set was executed successfully.

Experiment 5.1: Standard Deviation, Skewness and Kurtosis of Pima Indians Diabetes
Dataset.

AIM:

Perform basic data exploration and descriptive statistics on the diabetes dataset using Pandas and
the Statistics module. This includes examining data structure, calculating mean, mode, median,
variance, standard deviation, value counts, skewness, and kurtosis.

ALGORITHM
1. Read the diabetes dataset from a CSV file using Pandas.
2. Display the first few rows, shape, and data type of the dataset.
3. Calculate descriptive statistics using the Statistics module: mean, mode, median, variance, and
standard deviation.
4. Calculate value counts for the "Outcome" column.
5. Calculate skewness and kurtosis for the entire dataset.

PROGRAM
import pandas as pd
import statistics
# Load the dataset
file_path = r'C:\Users\SRM\Downloads\[Link]'  # Adjust the path to your file location
pima = pd.read_csv(file_path)
# Display the first few rows of the dataset
print("First few rows of the dataset:")
print(pima.head())
# Print the shape of the dataset
print("\nShape of the dataset:")
print(pima.shape)
# Print the type of the dataset
print("\nType of the dataset:")
print(type(pima))
# Print the index of the dataset
print("\nRow indices:")
pima_row_idx = pima.index
print(pima_row_idx)
# Print the columns of the dataset
print("\nColumn names:")
pima_col_idx = pima.columns
print(pima_col_idx)
# Print the data types of the columns
print("\nData types of each column:")
print(pima.dtypes)
# Calculate statistical measures
mean = statistics.mean(pima["Insulin"])
mode = statistics.mode(pima["Insulin"])
median = statistics.median(pima["Insulin"])
variance = statistics.variance(pima["Outcome"])  # sample variance (n - 1)
standard_deviation = statistics.stdev(pima["Outcome"])  # sample standard deviation
fre_count = pima["Outcome"].value_counts()
skew = pima.skew(axis=0, skipna=True)
kurt = pima.kurt(skipna=True)
# Print the calculated statistical measures
print("\nMean of Insulin:", mean)
print("Mode of Insulin:", mode)
print("Median of Insulin:", median)
print("Variance of Outcome:", variance)
print("Standard Deviation of Outcome:", standard_deviation)
print("\nFrequency count of Outcome:")
print(fre_count)
print("\nSkewness of each column:")
print(skew)
print("\nKurtosis of each column:")
print(kurt)
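To see what skewness and kurtosis measure, they can be computed by hand from the central moments. The sketch below uses the simple moment-based (population) definitions on made-up values; the function name moment_skew_kurtosis is illustrative. Note that pandas' skew() and kurt() apply a bias correction (normalised by N-1), so their results differ somewhat from these formulas on small samples.

```python
import statistics


def moment_skew_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skewness, excess_kurtosis


symmetric = [1, 2, 3, 4, 5]      # made-up symmetric sample
right_tailed = [1, 1, 1, 10]     # made-up sample with a long right tail

print(moment_skew_kurtosis(symmetric))
print(moment_skew_kurtosis(right_tailed))
```

A symmetric sample gives a skewness of zero, while a long right tail gives a positive skewness.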

OUTPUT:

RESULT:
Thus the Standard Deviation, Skewness and Kurtosis of Pima Indians Diabetes Dataset was executed
successfully.

Experiment 5.2: Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard
Deviation, Skewness and Kurtosis of UCI Diabetes Dataset.

AIM:

Conduct basic data exploration and compute descriptive statistics on the
"num_lab_procedures" column of the diabetic dataset using Pandas and the Statistics
module.

ALGORITHM:
1. Read the diabetic dataset from a CSV file using Pandas.
2. Display the first few rows, shape, and data type of the dataset.
3. Retrieve and print row and column indices.
4. Calculate descriptive statistics using the Statistics module.

PROGRAM
import pandas as pd
import statistics
# Load the dataset
file_path = r'C:\Users\SRM\Downloads\[Link]'  # Adjust the path to your file location
pima = pd.read_csv(file_path)
# Columns to analyze: numeric columns only, since mean, variance, skewness
# and kurtosis are not defined for text columns
columns = pima.select_dtypes(include="number").columns
# Univariate analysis
for column in columns:
    print(f"\nAnalysis for column: {column}")
    # Frequency
    frequency = pima[column].value_counts()
    print("Frequency:\n", frequency)
    # Mean
    mean = pima[column].mean()
    print("Mean:", mean)
    # Median
    median = pima[column].median()
    print("Median:", median)
    # Mode
    mode = pima[column].mode()[0] if not pima[column].mode().empty else "No mode"
    print("Mode:", mode)
    # Variance
    variance = pima[column].var()
    print("Variance:", variance)
    # Standard Deviation
    std_dev = pima[column].std()
    print("Standard Deviation:", std_dev)
    # Skewness
    skewness = pima[column].skew()
    print("Skewness:", skewness)
    # Kurtosis
    kurtosis = pima[column].kurt()
    print("Kurtosis:", kurtosis)
OUTPUT:

RESULT:
Thus the Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard
Deviation, Skewness and Kurtosis of UCI Diabetes Dataset was executed successfully.

Experiment 5.3: Bivariate Analysis: Program for linear regression

AIM:
Explore the relationship between Glucose and Blood Pressure in the diabetes dataset using a scatter plot
and create a linear regression model to predict Age based on BMI.
ALGORITHM:
1. Import necessary libraries: NumPy, Pandas, Seaborn, Statistics, Matplotlib, scikit-learn,
and statsmodels.
2. Read the diabetes dataset from a CSV file.
3. Select relevant columns for analysis (Pregnancies, Glucose,
BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction,
Age).
4. Create a scatter plot to visualize the relationship between Glucose and
Blood Pressure.
5. Extract features (X) and target variable (Y) for linear regression (e.g., Age vs BMI).
6. Use scikit-learn's LinearRegression to fit a linear model.
7. Display the scatter plot.
8. Fit a linear regression model to predict Age based on BMI using statsmodels.
PROGRAM
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Load the dataset
file_path = r'C:\Users\SRM\Downloads\[Link]'  # Adjust the path to your file location
df = pd.read_csv(file_path)

# Display the first few rows of the dataset
head = df.head()
print("First few rows of the dataset:")
print(head)

# Define feature columns and target column
cols = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI",
        "DiabetesPedigreeFunction", "Age"]
X = df[['BMI']]  # Features need to be a 2D array
Y = df['Age']  # Target variable

# Scatter plot of Glucose vs BloodPressure
plt.scatter(df['Glucose'], df['BloodPressure'], color='blue')
plt.title('Glucose vs BloodPressure', fontsize=14)
plt.xlabel('Glucose', fontsize=14)
plt.ylabel('BloodPressure', fontsize=14)
plt.grid(True)
plt.show()

# Fit a linear regression model
model = LinearRegression()
model.fit(X, Y)

# Print model coefficients
print(f"Intercept: {model.intercept_}")
print(f"Coefficient for BMI: {model.coef_[0]}")
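The intercept and coefficient printed above come from ordinary least squares. As a rough illustration of what the fit computes for a single feature, the sketch below applies the closed-form formulas (slope = covariance of x and y divided by variance of x) to made-up numbers; the function name fit_line and the sample values are assumptions for illustration, not values from the diabetes dataset.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares intercept and slope for one feature."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    y_centered = y - y.mean()
    slope = np.sum(x_centered * y_centered) / np.sum(x_centered ** 2)
    intercept = y.mean() - slope * x.mean()
    return intercept, slope


# Made-up points that lie exactly on y = 1 + 2x
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
intercept, slope = fit_line(x, y)
print("Intercept:", intercept)
print("Slope:", slope)
```

For a single feature, scikit-learn's LinearRegression should give the same intercept and coefficient up to floating-point rounding.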

OUTPUT:

RESULT:
Thus the Bivariate Analysis-Program for linear regression was executed successfully.

Experiment 5.4: Bivariate Analysis: Logistic regression

AIM:
Perform logistic regression analysis on the diabetes dataset to predict the likelihood of
diabetes based on various independent variables. Evaluate the model's performance using
classification report and confusion matrix.
ALGORITHM
1. Import necessary libraries: NumPy, Pandas, Seaborn, Matplotlib, statsmodels, and scikit-learn.
2. Read the diabetes dataset from a CSV file.
3. Define a logistic regression model using statsmodels with different sets of independent
variables (cols2, cols3, cols4).
4. Import LogisticRegression from scikit-learn for additional evaluation metrics.
5. Compute and print the confusion matrix to evaluate the model's accuracy.
PROGRAM
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns

# Load the dataset
file_path = r'C:\Users\SRM\Downloads\[Link]'  # Adjust the path to your file location
df = pd.read_csv(file_path)

# Display the first few rows of the dataset
print("First few rows of the dataset:")
print(df.head())

# Define the feature and target variables
X = df[['Glucose']]  # Feature: Glucose
Y = df['Outcome']  # Target: Diabetes outcome

# Split the data into training and testing sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=42)

# Initialize and fit the logistic regression model
model = LogisticRegression()
model.fit(X_train, Y_train)

# Predict on the test set
Y_pred = model.predict(X_test)

# Print classification report and confusion matrix
print("\nClassification Report:")
print(classification_report(Y_test, Y_pred))

print("\nConfusion Matrix:")
conf_matrix = confusion_matrix(Y_test, Y_pred)
print(conf_matrix)

# Visualization: test data with the fitted probability curve
plt.figure(figsize=(10, 6))

# Plot the data points
plt.scatter(X_test['Glucose'], Y_test, color='blue', label='Test Data')

# Plot the predicted probability of diabetes over the Glucose range
x_values = np.linspace(X_test['Glucose'].min(), X_test['Glucose'].max(), 100)
x_frame = pd.DataFrame({'Glucose': x_values})  # same column name as in training
y_values = model.predict_proba(x_frame)[:, 1]
plt.plot(x_values, y_values, color='red', label='Predicted Probability')

# Labels and title
plt.xlabel('Glucose')
plt.ylabel('Probability of Diabetes')
plt.title('Logistic Regression: Glucose vs Diabetes')
plt.legend()
plt.grid(True)
plt.show()
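The confusion matrix printed by the program can be read as counts of correct and incorrect predictions per class. The sketch below rebuilds a 2x2 matrix by hand in the same layout that scikit-learn uses for labels 0 and 1 (rows are true classes, columns are predicted classes) and derives accuracy, precision and recall from it. The label lists and function names are made up for illustration.

```python
def binary_confusion(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for 0/1 labels."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tp += 1
    return [[tn, fp], [fn, tp]]


def summary_metrics(matrix):
    """Accuracy, precision and recall for the positive class (label 1)."""
    (tn, fp), (fn, tp) = matrix
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up labels: 1 = diabetic, 0 = non-diabetic
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 1, 0, 1, 0]

matrix = binary_confusion(y_true, y_pred)
print(matrix)
print(summary_metrics(matrix))
```

The precision and recall values correspond to the row for class 1 in the classification report.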

OUTPUT:

RESULT:
Thus the Bivariate Analysis Logistic regression was executed successfully.

Experiment 5.5: MULTIPLE REGRESSION ANALYSIS

AIM:

The aim of the provided code is to analyze and visualize a dataset related to diabetes using Python and
various libraries such as pandas, seaborn, matplotlib, and statsmodels. The code explores the dataset,
calculates and visualizes the correlation matrix, generates a quantile-quantile (QQ) plot for the 'Age'
variable, and produces scatter matrices for different subsets of the data.

ALGORITHM:

1. Import necessary libraries: pandas, seaborn, matplotlib, statsmodels, and pylab.

2. Read the diabetes dataset from the '[Link]' file into a pandas DataFrame (df).
3. Display the first few rows of the dataset using `df.head()`.
4. Calculate the correlation matrix (`corr`) for the variables in the dataset.
5. Create a heatmap (`hm`) using seaborn to visualize the correlation matrix.
6. Generate a quantile-quantile (QQ) plot for the 'Age' variable using statsmodels.
7. Display the scatter matrix for all variables in the dataset using `pd.plotting.scatter_matrix`.
8. Extract a subset of the data (`data`) containing the columns 'Pregnancies', 'Glucose',
and 'BloodPressure'.
9. Display the scatter matrix for the subset of data.

PROGRAM:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import statsmodels.api as sm
import pylab

# Load the dataset

file_path = r'C:\Users\SRM\Downloads\[Link]' # Adjust the path to your file location
df = pd.read_csv(file_path)

# Display the first few rows of the dataset

print("First few rows of the dataset:")
print(df.head())

# Calculate and print the correlation matrix

corr = df.corr()
print("\nCorrelation matrix:")
print(corr)

# Heatmap of the correlation matrix

plt.figure(figsize=(10, 8))
hm = sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap='RdBu', annot=True)
plt.title('Correlation Heatmap')
plt.show()

# QQ plot for the 'Age' column

data = df['Age']
fig, ax = plt.subplots(figsize=(8, 6))
sm.qqplot(data, line='s', ax=ax)  # draw on the prepared axes instead of opening a second figure
ax.set_title('QQ Plot for Age')
plt.show()

# Scatter matrix for the entire DataFrame

# scatter_matrix creates its own figure, so no separate plt.figure call is needed
pd.plotting.scatter_matrix(df, alpha=0.8, figsize=(15, 15), diagonal='kde')
plt.suptitle('Scatter Matrix for Diabetes Dataset')
plt.show()

# Scatter matrix for selected features

data = df[["Pregnancies", "Glucose", "BloodPressure"]]
pd.plotting.scatter_matrix(data, alpha=0.8, figsize=(10, 7), diagonal='kde')
plt.suptitle('Scatter Matrix for Selected Features')
plt.show()

OUTPUT:

AUGMENTED QUESTIONS:

1. Write a Python program to perform the following tasks using the UCI Diabetes dataset:

● Calculate and print the mean, median, and standard deviation of the 'Glucose' column.
● Fit a linear regression model to predict 'Outcome' based on 'Glucose' and print the model's
R-squared value.

2. Write a Python program to perform the following tasks using the Pima Indians Diabetes dataset:

● Calculate and print the frequency of unique values in the 'Outcome' column.
● Fit a logistic regression model to predict 'Outcome' based on 'Glucose' and print the model's
accuracy.

VIVA QUESTIONS :

1. How do you calculate the mean, median, and standard deviation for a column in the Diabetes
dataset?
2. What is the purpose of performing linear regression, and how would you apply it to the Diabetes
dataset?
3. How can you fit a logistic regression model to the Diabetes dataset and evaluate its performance?
4. What is multiple regression analysis, and how would you use it to analyze the Diabetes dataset?
5. How would you compare the results of your analysis between the UCI Diabetes dataset and the
Pima Indians Diabetes dataset?

RESULT:
The code produces visualizations and analyses, including a correlation matrix heatmap, a QQ plot for
'Age,' and scatter matrices, offering insights into variable relationships and distributions within the
diabetes dataset.
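
The program in this experiment explores correlations and distributions but does not itself fit a regression. As a hedged illustration of what a multiple regression fit involves, the sketch below applies ordinary least squares to synthetic data using NumPy. The predictors `x1` and `x2`, the coefficients in `true_coefs`, and the noise level are all assumptions made for this example and are not taken from the diabetes dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic predictors (hypothetical stand-ins for two numeric columns)
x1 = rng.normal(120, 30, n)
x2 = rng.normal(32, 7, n)

# Synthetic response built from known coefficients plus noise
true_coefs = np.array([10.0, 0.5, -2.0])  # intercept, x1, x2
y = true_coefs[0] + true_coefs[1] * x1 + true_coefs[2] * x2 + rng.normal(0, 1.0, n)

# Design matrix with a leading column of ones for the intercept
A = np.column_stack([np.ones(n), x1, x2])

# Ordinary least squares solution
coefs, _, _, _ = np.linalg.lstsq(A, y, rcond=None)

# Coefficient of determination (R-squared)
residuals = y - A @ coefs
r_squared = 1.0 - np.sum(residuals**2) / np.sum((y - y.mean()) ** 2)

print("Estimated coefficients:", coefs)
print("R-squared:", r_squared)
```

Because the response is generated from known coefficients, the estimates can be compared with `true_coefs` to check that the fit behaves as expected.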

[Link] Apply and explore various plotting functions on UCI data sets.

a. Normal curves

AIM:
The aim of the provided Python code is to generate and visualize the probability
density function (PDF) of a normal distribution. The range for the x-axis values is set from
-20 to 20 with a step size of 0.01.
ALGORITHM:
1. Import necessary libraries: numpy for numerical operations, matplotlib.pyplot
for plotting, scipy.stats for the normal distribution, and statistics for
calculating mean and standard deviation (although the program uses the
numpy functions instead).
2. Create an array x_axis with values ranging from -20 to 20 with a step size of 0.01.
3. Calculate the mean and standard deviation:
● An earlier version of the code used statistics.mean and statistics.stdev on the
x_axis array. The program uses np.mean and np.std instead, which operate
directly on NumPy arrays.
4. Plot the normal distribution PDF:
● The code uses plt.plot to plot the normal distribution PDF using
norm.pdf from the scipy.stats module. The mean and standard deviation
are used as parameters for the normal distribution.
5. Display the plot:
● The code uses plt.show() to display the generated plot.
PROGRAM

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import statistics

# Generate x values from -20 to 20 with a step of 0.01

x_axis = np.arange(-20, 20, 0.01)

# Calculate mean and standard deviation of the x_axis values

mean = np.mean(x_axis)
sd = np.std(x_axis)

# Plot the normal distribution

plt.plot(x_axis, norm.pdf(x_axis, mean, sd))
plt.title('Normal Distribution')
plt.xlabel('x')
plt.ylabel('Probability Density')
plt.grid(True)
plt.show()

RESULT:
The corrected code will generate a plot displaying the probability density function of
a normal distribution with mean and standard deviation calculated from the range of values
specified on the x-axis (-20 to 20 with steps of 0.01). The resulting plot visually represents
the distribution of the random variable within that range.
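
As a cross-check on what the normal PDF evaluates, here is a minimal sketch that computes the density directly from its formula, f(x) = exp(-(x - mean)^2 / (2 sd^2)) / (sd sqrt(2 pi)), using only NumPy. It also estimates the area under the curve over the plotted range. The helper name `normal_pdf` and the variable `step` are introduced here for illustration only.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Normal probability density evaluated from the closed-form formula."""
    return np.exp(-((x - mean) ** 2) / (2.0 * sd**2)) / (sd * np.sqrt(2.0 * np.pi))

step = 0.01
x_axis = np.arange(-20, 20, step)
mean = np.mean(x_axis)
sd = np.std(x_axis)

density = normal_pdf(x_axis, mean, sd)

# Riemann-sum estimate of the area under the curve over the plotted range
area = np.sum(density) * step

print("mean:", mean)
print("standard deviation:", sd)
print("area under the plotted curve:", area)
```

The area comes out below 1 because the standard deviation of an evenly spaced grid is large relative to its range: the x-axis spans less than two standard deviations on either side of the mean, so the tails of the bell curve are cut off in the plot.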

b. Density and contour plots

AIM:
The aim of this program is to create a 2D contour plot with filled contours and an
overlaid image of a mathematical function. The function `f(x, y)` is defined, and the contours of this
function are plotted on a grid using Matplotlib. Additionally, an image of the function is displayed using
`plt.imshow`, and a color bar is added for reference.

ALGORITHM:
1. Import necessary libraries:
- `matplotlib.pyplot` for plotting.
- `numpy` for numerical operations.

2. Set the plotting style to seaborn-white using `plt.style.use` (the style is named
'seaborn-v0_8-white' in Matplotlib 3.6 and later, and 'seaborn-white' in earlier versions).

3. Define the function `f(x, y)` which represents a mathematical expression involving
sine, cosine, and exponentiation.

4. Generate evenly spaced values for `x` and `y` using `np.linspace`.

5. Create a mesh grid (`X`, `Y`) using `np.meshgrid` based on the generated `x` and
`y` values.

6. Evaluate the function `f(X, Y)` for each point on the grid and store the result in the variable
`Z`.

7. Plot black contour lines of the function using `plt.contour`.

8. Plot filled contours with a colormap ('RdGy') using `plt.contourf` to emphasize
different levels of the function.

9. Add labeled contour lines using `plt.clabel` to provide information about the
contour levels.

10. Display an image of the function using `plt.imshow` with transparency (alpha=0.5) and
a specified colormap ('RdGy').

11. Add a color bar to the plot for reference using `plt.colorbar()`.

PROGRAM:

import numpy as np
import matplotlib.pyplot as plt

# Set the style for the plot

plt.style.use('seaborn-v0_8-white')  # named 'seaborn-white' in Matplotlib versions before 3.6

# Define the function

def f(x, y):
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)

# Create grid values

x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

# Create contour plots

plt.contour(X, Y, Z, colors='black') # Basic contour plot
plt.contourf(X, Y, Z, 20, cmap='RdGy') # Filled contour plot with color map

# Add contour labels

contours = plt.contour(X, Y, Z, 3, colors='black')
plt.clabel(contours, inline=True, fontsize=8)

# Add an image plot of the data with color map

plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower', cmap='RdGy', alpha=0.5)

# Add color bar

plt.colorbar()

# Display the plot

plt.title('Contour and Image Plot')
plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.show()

RESULT:
The result of the code execution is a 2D contour plot with filled contours and an overlaid
image of the function defined by `f(x, y)`. The contours provide a visual representation of the
function's behavior, and the color bar helps to interpret the values associated with the
colormap. The overall plot combines different elements to present a comprehensive view of
the mathematical function in the specified range.
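
The grid construction in steps 4 to 6 of the algorithm is easier to follow with the array shapes spelled out. The following sketch repeats that part of the procedure without any plotting. It assumes `f(x, y)` is the sine and cosine expression described in the algorithm; nothing else is new beyond the printed shapes.

```python
import numpy as np

def f(x, y):
    """Function combining sine, cosine, and exponentiation."""
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)

x = np.linspace(0, 5, 50)   # 50 points along the x axis
y = np.linspace(0, 5, 40)   # 40 points along the y axis

# meshgrid returns 2D arrays with one row per y value and one column per x value
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

print("X shape:", X.shape)
print("Y shape:", Y.shape)
print("Z shape:", Z.shape)

# Each row of X repeats the x values; each column of Y repeats the y values
print("Rows of X identical:", np.array_equal(X[0], X[-1]))
print("Columns of Y identical:", np.array_equal(Y[:, 0], Y[:, -1]))
```

Since `Z[i, j]` holds the value of the function at `(x[j], y[i])`, the contour and image functions can map each array element to a position on the plot.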

c. Correlation and scatter plots

AIM:

The aim of the provided Python code is to analyze and visualize the relationship
between the 'BloodPressure' and 'BMI' columns in a diabetes dataset using the Pandas,
Seaborn, and SciPy libraries. It includes loading the dataset, displaying its headers, creating
scatter plots, fitting a regression line, and calculating the correlation coefficient and
correlation matrix.

ALGORITHM:

1. Import necessary libraries:

● pandas for data manipulation and analysis.
● seaborn for statistical data visualization.
● scipy.stats for statistical functions.
2. Load the diabetes dataset from a CSV file using pd.read_csv.
3. Display the first few rows of the dataset using print(df.head()).
4. Create a scatter plot using sns.scatterplot to visualize the relationship
between 'BloodPressure' and 'BMI'.
5. Create another scatter plot with a regression line using seaborn (for example,
sns.regplot) to show the linear relationship between 'BloodPressure' and 'BMI'.
6. Create a scatter plot with hue based on 'BloodPressure' using seaborn to
visualize the distribution of points across different 'BloodPressure' values.
7. Calculate and print the Pearson correlation coefficient between 'BloodPressure' and
'BMI' using scipy.stats.pearsonr.
8. Calculate and print the correlation matrix for all columns in the dataset using
df.corr().
9. Visualize the correlation matrix using a heatmap with sns.heatmap.

PROGRAM:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Load the dataset

file_path = r'C:\Users\SRM\Downloads\[Link]'
df = pd.read_csv(file_path)

# 1. Display the first few rows of the dataset

print("Diabetes DataFile headers Details:")
print(df.head())

# 2. Calculate and visualize the correlation matrix

cormat = df.corr()
print("\nCorrelation MATRIX:")
print(round(cormat, 2))

# Heatmap of the correlation matrix

plt.figure(figsize=(10, 8))
sns.heatmap(cormat, annot=True, cmap='RdBu', vmin=-1, vmax=1)
plt.title("Correlation Matrix Heatmap")
plt.show()

# 3. Generate a QQ plot for the 'Age' variable

data = df['Age']
fig, ax = plt.subplots(figsize=(8, 6))
sm.qqplot(data, line='s', ax=ax)  # draw on the prepared axes instead of opening a second figure
ax.set_title('QQ Plot for Age')
plt.show()

# 4. Produce scatter matrices

# Scatter matrix for the entire DataFrame
# scatter_matrix creates its own figure, so no separate plt.figure call is needed
pd.plotting.scatter_matrix(df, alpha=0.8, figsize=(15, 15), diagonal='kde')
plt.suptitle('Scatter Matrix for Diabetes Dataset')
plt.show()

# Scatter matrix for selected features

selected_features = df[["Pregnancies", "Glucose", "BloodPressure"]]
pd.plotting.scatter_matrix(selected_features, alpha=0.8, figsize=(10, 7), diagonal='kde')
plt.suptitle('Scatter Matrix for Selected Features')
plt.show()

Output:

RESULT:
The code generates visualizations and statistical measures for understanding the association
between 'BloodPressure' and 'BMI' in the diabetes dataset. This includes scatter plots,
regression lines, correlation coefficient, and a correlation matrix heatmap.
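
The Pearson correlation coefficient named in step 7 of the algorithm can also be computed by hand, which makes clear what the library call returns. The sketch below uses synthetic values and compares the formula with `np.corrcoef`. The arrays `blood_pressure` and `bmi` are made-up stand-ins generated for this example, not the dataset columns, and `pearson_r` is a helper name introduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for two numeric columns (not real dataset values)
blood_pressure = rng.normal(70, 12, 300)
bmi = 20 + 0.15 * blood_pressure + rng.normal(0, 5, 300)

def pearson_r(a, b):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    a_centered = a - a.mean()
    b_centered = b - b.mean()
    return np.sum(a_centered * b_centered) / np.sqrt(
        np.sum(a_centered**2) * np.sum(b_centered**2)
    )

r_manual = pearson_r(blood_pressure, bmi)
r_numpy = np.corrcoef(blood_pressure, bmi)[0, 1]

print("Pearson r (formula):", r_manual)
print("Pearson r (np.corrcoef):", r_numpy)
```

The coefficient always lies between -1 and 1; values near 0 indicate a weak linear relationship, and the sign gives the direction of the association.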

d. Histograms

AIM:

The aim of the provided Python code is to create a histogram of a given dataset (`x`)
and display its distribution.
ALGORITHM:

1. Import the necessary library: `matplotlib.pyplot` for plotting.

2. Define a list `x` containing numerical data.

3. Use `plt.hist(x, bins=10)` to create a histogram with 10 bins (intervals).

4. Display the histogram using `plt.show()`.

PROGRAM:

import matplotlib.pyplot as plt

# Data
x = [1,1,2,3,3,5,7,8,9,10,
10,11,11,13,13,15,16,17,18,18,
18,19,20,21,21,23,24,24,25,25,
25,25,26,26,26,27,27,27,27,27,
29,30,30,31,33,34,34,34,35,36,
36,37,37,38,38,39,40,41,41,42,
43,44,45,45,46,47,48,48,49,50,
51,52,53,54,55,55,56,57,58,60,
61,63,64,65,66,68,70,71,72,74,
75,77,81,83,84,87,89,90,90,91]

# Create the histogram

plt.hist(x, bins=10, edgecolor='black')

# Add title and labels

plt.title('Histogram of Data')
plt.xlabel('Value')
plt.ylabel('Frequency')

# Display the plot
plt.show()

RESULT:
The result of the code execution is a histogram that visualizes the distribution of the
data in the list `x`. The histogram is divided into 10 bins, providing insights into the
frequency of values within each interval. The visualization allows for a quick understanding
of the data's central tendency and spread.
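
To see the numbers behind the bars, the binning can be reproduced without plotting. This is a small sketch using `np.histogram`, which with `bins=10` splits the range from the minimum to the maximum value into ten equal-width intervals. The use of NumPy here is an addition for illustration; the program itself relies only on Matplotlib.

```python
import numpy as np

x = [1,1,2,3,3,5,7,8,9,10,
     10,11,11,13,13,15,16,17,18,18,
     18,19,20,21,21,23,24,24,25,25,
     25,25,26,26,26,27,27,27,27,27,
     29,30,30,31,33,34,34,34,35,36,
     36,37,37,38,38,39,40,41,41,42,
     43,44,45,45,46,47,48,48,49,50,
     51,52,53,54,55,55,56,57,58,60,
     61,63,64,65,66,68,70,71,72,74,
     75,77,81,83,84,87,89,90,90,91]

# counts[i] is the number of values falling between edges[i] and edges[i + 1]
counts, edges = np.histogram(x, bins=10)

for count, left, right in zip(counts, edges[:-1], edges[1:]):
    print(f"{left:5.1f} to {right:5.1f}: {count}")

print("Total values:", counts.sum())
```

Each printed row corresponds to one bar of the histogram, and the counts add up to the number of values in `x`.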

e. Three-dimensional plotting

AIM:
The aim of the provided Python code is to create a three-dimensional (3D) plot using Matplotlib.
The code plots a three-dimensional line and scattered points in 3D space.

ALGORITHM:
1. Import necessary libraries: `mpl_toolkits.mplot3d`, `numpy`, and
`matplotlib.pyplot`.
2. Create a figure and 3D axes using `plt.figure()` and `fig.add_subplot(111, projection='3d')`.
3. Generate data for a three-dimensional line: `zline`, `xline`, and `yline`.
4. Plot the three-dimensional line using `ax.plot3D`.
5. Generate data for three-dimensional scattered points: `zdata`, `xdata`, and
`ydata`.
6. Plot the scattered points using `ax.scatter3D`. The color of the points (`c`) is
determined by the `zdata` values, and a colormap ('Greens') is applied for visual
representation.

PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Create a new figure

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Data for a three-dimensional line

zline = np.linspace(0, 15, 1000)
xline = np.sin(zline)
yline = np.cos(zline)
ax.plot3D(xline, yline, zline, color='gray')

# Data for three-dimensional scatter points

zdata = 15 * np.random.random(100)
xdata = np.sin(zdata) + 0.1 * np.random.randn(100)
ydata = np.cos(zdata) + 0.1 * np.random.randn(100)
sc = ax.scatter3D(xdata, ydata, zdata, c=zdata, cmap='Greens')

# Add color bar and labels

plt.colorbar(sc, ax=ax, label='Z data')
ax.set_xlabel('X Axis')
ax.set_ylabel('Y Axis')
ax.set_zlabel('Z Axis')
ax.set_title('3D Line and Scatter Plot')

# Show plot

plt.show()

AUGMENTED QUESTIONS :

1. Write a Python program to perform the following tasks using the UCI dataset:

● Normal Curves: Plot a normal distribution curve over a histogram for a numerical
column, such as 'Glucose'. Calculate and display the mean and standard deviation used for
plotting the normal curve.
● Density and Contour Plots: Create a density plot for the 'Glucose' and 'BMI'
columns, and overlay a contour plot to visualize the density regions.

2. Write a Python program to create the following visualizations using the UCI dataset:

● Correlation Plot: Generate a heatmap showing the correlation matrix of all
numerical columns in the dataset.
● Scatter Plot Matrix: Create a scatter plot matrix for a subset of columns (e.g.,
'Glucose', 'BMI', 'Age') to explore pairwise relationships.
● Three-Dimensional Plot: Plot a 3D scatter plot using 'Glucose', 'BMI', and 'Age'
as the three dimensions.

VIVA QUESTIONS:

1. How would you plot a normal distribution curve for a numerical column in the UCI dataset?
Which Python libraries and functions can you use for this?
2. Explain how you can create a density plot and a contour plot for two numerical columns in
the UCI dataset. What insights can these plots provide?
3. Describe the process for generating a correlation plot and scatter plot between two features in
the UCI dataset. How do these plots help in understanding the relationship between features?
4. What steps would you take to create a histogram of a numerical feature in the UCI dataset?
What information can be derived from a histogram?
5. How can you create a three-dimensional plot using three numerical features from the UCI
dataset? Which Python functions are used for 3D plotting and what do these plots represent?

RESULT:
The code generates a 3D plot with a three-dimensional line and scattered points. The
line is defined by the functions `np.sin` and `np.cos`, and the scattered points follow the
same sine and cosine curve with random noise added. The color of the scattered points varies
based on the `zdata` values, creating a visually appealing representation of the data in 3D
space.
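
The line in this experiment is a helix: x = sin(z) and y = cos(z) trace the unit circle while z increases. The short sketch below, which does no plotting, confirms this geometry and shows how far the scatter points deviate from it. The seeded generator `rng` is an assumption added so the numbers are reproducible; the program itself uses unseeded random values.

```python
import numpy as np

# Helix: x and y trace the unit circle as z increases
zline = np.linspace(0, 15, 1000)
xline = np.sin(zline)
yline = np.cos(zline)
line_radius = np.sqrt(xline**2 + yline**2)

# Scatter points: the same helix with Gaussian noise added to x and y
rng = np.random.default_rng(42)
zdata = 15 * rng.random(100)
xdata = np.sin(zdata) + 0.1 * rng.standard_normal(100)
ydata = np.cos(zdata) + 0.1 * rng.standard_normal(100)
scatter_radius = np.sqrt(xdata**2 + ydata**2)

print("Line radius (min, max):", line_radius.min(), line_radius.max())
print("Scatter radius (min, max):", scatter_radius.min(), scatter_radius.max())
```

The line stays at distance 1 from the z axis, while the scatter points spread around that distance because of the added noise.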

[Link] VISUALIZING GEOGRAPHIC DATA WITH BASEMAP

AIM:
The goal of the provided code is to visualize geographic data using the Basemap toolkit in Matplotlib.
Specifically, it creates maps with different projections, includes topographic features, and marks the location of
Seattle.

ALGORITHM:
1. Set up a Matplotlib figure and create a Basemap with Lambert conformal conic
projection, specifying parameters like width, height, central latitude, and longitude.
2. Overlay topographic features using the `etopo` method and add a point on the map
corresponding to Seattle.
3. Define a function `draw_map` to draw shaded-relief images and latitude/longitude
lines with specified styles.
4. Generate three different maps with varying projections: cylindrical projection
covering the entire world, repeated cylindrical projection, and Lambert conformal conic
projection focused on a specific region.
5. Display the maps using Matplotlib.

PROGRAM:
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature

# Create a new figure

fig = plt.figure(figsize=(12, 8))

# Set up the Cartopy projection

ax = fig.add_subplot(111, projection=ccrs.PlateCarree())

# Add geographic features

ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.BORDERS, linestyle=':')
ax.add_feature(cfeature.LAND, edgecolor='black')
ax.add_feature(cfeature.OCEAN)

# Add gridlines
ax.gridlines(draw_labels=True)

# Set title
plt.title('World Map with Cartopy')

# Show the plot

plt.show()

OUTPUT:

AUGMENTED QUESTIONS :

1. Write a Python program using Basemap to create an interactive map that displays the locations of major
cities around the world. Include functionality to zoom in and out, and add labels to each city. How would
you integrate Basemap with other libraries to enhance interactivity?

2. Develop a Python script to visualize global climate data using Basemap. Create a map that displays
temperature anomalies with a color gradient. Integrate Basemap with data from a CSV file containing
latitude, longitude, and temperature anomaly values. How would you handle large datasets and ensure
efficient rendering of the map?

VIVA QUESTIONS:

1. What is the Basemap toolkit, and how is it used for visualizing geographic data in Python?
2. How would you plot a simple map of a specific region or country using Basemap? What are the
basic steps involved in creating such a map?
3. Explain how to overlay markers or data points on a Basemap. What functions or methods are used
to add these elements to the map?
4. How can you display geographic data such as country borders, rivers, or cities on a map using
Basemap? What are some common map features you can add?
5. Describe how to customize the appearance of a map created with Basemap, such as changing the
map's projection, adding gridlines, or adjusting the map's color scheme.

RESULT:
The code produces geographic visualizations, including a Lambert conformal conic
projection with topographic features, a cylindrical projection covering the entire world, and a
repeated cylindrical projection. The resulting maps showcase the versatility of the Basemap
toolkit for visualizing geographic data in Matplotlib.
