Python Installation for Machine Learning

The document provides a comprehensive guide on installing Python and its ecosystem for machine learning, detailing methods such as individual installation and using Anaconda. It highlights key libraries like NumPy, Pandas, and Scikit-learn, explaining their functionalities and installation processes. Additionally, it introduces Jupyter Notebook as an essential tool for data science applications, outlining its features and types of cells.

Uploaded by

Kalighat Okira


Machine Learning with Python

Python EcoSystem

Prof. Shibdas Dutta,

Associate Professor,
DCG DATA CORE SYSTEMS INDIA PVT LTD
Kolkata

Company Confidential: Data-Core Systems, Inc. | [Link]

Installing Python
To work in Python, we must first install it. You can install
Python in either of the following two ways:
• Installing Python individually
• Using a pre-packaged Python distribution: Anaconda
Let us discuss each of these in detail.
Installing Python Individually
If you want to install Python on your computer, you need to
download only the binary applicable to your platform. Python
distributions are available for Windows, Linux and Mac platforms.
On Windows platform

We can install Python on the Windows platform with the following steps:

First, go to [Link]
Next, click on the link for the Windows installer [Link] file.
Here XYZ is the version we wish to install.
Then run the downloaded file. It opens the Python install wizard,
which is easy to use. Accept the default settings and wait until
the installation finishes.


Using Pre-packaged Python Distribution: Anaconda
Anaconda is a packaged distribution of Python that includes the libraries most widely used in data science. We can
set up a Python environment using Anaconda with the following steps:

Step 1: First, download the required installation package from the Anaconda distribution. The link for the
same is [Link]. You can choose the Windows, Mac or Linux package as per your
requirement.

Step 2: Next, select the Python version you want to install on your machine. At the time these slides were written, the
latest Python version was 3.7. Both 64-bit and 32-bit graphical installers are offered.

Step 3: After you select the OS and Python version, the Anaconda installer is downloaded to your computer.
Double-click the file and the installer will install the Anaconda package.

Step 4: To check whether it is installed, open a command prompt and type python to start the interpreter.
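Once the interpreter starts, the installed version can also be confirmed from inside Python. The following is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original slides:

```python
import sys
import platform

# Full version string of the running interpreter.
print(sys.version)

# Major and minor version numbers as integers, handy for programmatic checks.
major, minor = sys.version_info[:2]
print(f"Python {major}.{minor} on {platform.system()}")
```

If the printed major version is 3, the interpreter matches the Python 3 line discussed in these slides.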


Why Python for Data Science?
Extensive set of packages
Python has an extensive and powerful set of packages which are ready to be used in
various domains. It also has packages like numpy, scipy, pandas, scikit-learn etc. which
are required for machine learning and data science.
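To see which of these packages are already available in a given environment, a short check along the following lines may help. It is only a sketch: the helper name `is_installed` is hypothetical, and note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names of the packages mentioned above
# (scikit-learn is imported as "sklearn").
packages = ["numpy", "scipy", "pandas", "sklearn"]


def is_installed(module_name):
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


status = {name: is_installed(name) for name in packages}
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

Any package reported as missing can be installed with pip or, in an Anaconda environment, is normally present already.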


Components of Python ML Ecosystem
In this section, let us discuss some core data science libraries that form the components of the
Python machine learning ecosystem. These components make Python an important
language for data science. Though there are many such components, let us discuss some of
the important ones here:

Jupyter Notebook
Jupyter notebooks provide an interactive computational environment for developing
Python-based data science applications. They were formerly known as IPython notebooks. The
following are some of the features that make Jupyter notebooks one of the best
components of the Python ML ecosystem:
Jupyter notebooks can illustrate the analysis process step by step by arranging
code, images, text, output and so on in sequence.

They help a data scientist document the thought process while developing the analysis.

The results can be captured as part of the notebook.

With Jupyter notebooks, we can also share our work with peers.


Installation and Execution
If you are using the Anaconda distribution, you need not install Jupyter Notebook separately,
as it is already included. You just need to open the Anaconda Prompt and type the
following command:
C:\>jupyter notebook

After you press Enter, it starts a notebook server at localhost:8888 on your computer. It is
shown in the following screenshot:


Now, after clicking the New tab, you will get a list of options. Select Python 3 and it will open
a new notebook in which you can start working. You will get a glimpse of it in the following
screenshots:


On the other hand, if you are using a standard Python distribution, Jupyter
Notebook can be installed with the popular Python package installer, pip:

pip install jupyter


Types of Cells in Jupyter Notebook
The following are the three types of cells in a Jupyter notebook:

Code cells: As the name suggests, we use these cells to write code. When a code cell is run,
its content is sent to the kernel associated with the notebook.

Markdown cells: We use these cells to annotate the computation process. They can
contain text, images, LaTeX equations, HTML tags and so on.

Raw cells: The text written in them is displayed as it is. These cells are used to add
text that we do not want the automatic conversion mechanism of
Jupyter Notebook to convert.
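A notebook file (.ipynb) is stored as JSON, and each cell records its type. The following simplified sketch builds such a structure by hand to show the three cell types; it follows the general layout of the version 4 notebook format but omits fields that real notebooks carry, so it is an illustration rather than a file guaranteed to validate:

```python
import json

# Simplified, hand-written notebook structure (illustrative only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        # Code cell: its source is sent to the kernel when run.
        {"cell_type": "code", "source": "print(1 + 1)",
         "metadata": {}, "outputs": [], "execution_count": None},
        # Markdown cell: rendered as formatted text.
        {"cell_type": "markdown", "source": "## Step 1: load the data",
         "metadata": {}},
        # Raw cell: left exactly as written.
        {"cell_type": "raw", "source": "left exactly as written",
         "metadata": {}},
    ],
}

cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)

# The on-disk form is simply this structure serialized as JSON.
serialized = json.dumps(notebook, indent=1)
```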


NumPy
It is another useful component that makes Python one of the favorite languages for data
science. The name stands for Numerical Python, and the library is built around multidimensional array
objects. By using NumPy, we can perform the following important operations:
Mathematical and logical operations on arrays.
Fourier transforms.
Operations associated with linear algebra.

NumPy can also be seen as an alternative to MATLAB, because it is mostly used together with SciPy (Scientific Python) and Matplotlib (a plotting library).
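The three kinds of operations listed above can be sketched as follows. The array values are made up for illustration and do not come from the slides:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([4.0, 3.0, 2.0, 1.0])

# Mathematical and logical operations work element by element.
total = a + b   # element-wise sum
mask = a > b    # element-wise comparison, gives a boolean array

# Fourier transform of a 1-D array.
spectrum = np.fft.fft(a)

# Linear algebra: invert a matrix and multiply it back.
m = np.array([[2.0, 0.0],
              [0.0, 4.0]])
m_inv = np.linalg.inv(m)
identity = m @ m_inv

print(total, mask, spectrum, identity, sep="\n")
```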

Installation and Execution

If you are using the Anaconda distribution, there is no need to install NumPy separately, as it is already included. You
just need to import the package into your Python script as follows:

import numpy as np

On the other hand, if you are using a standard Python distribution, NumPy can be
installed with the popular Python package installer, pip:

pip install numpy

After installing NumPy, you can import it into your Python script as shown above.


Pandas
It is another useful Python library that makes Python one of the favorite languages for data
science. Pandas is used for data manipulation, wrangling and analysis. It was
developed by Wes McKinney in 2008. With the help of Pandas, we can
accomplish the following five steps in data processing:
Load
Prepare
Manipulate
Model
Analyze
Data representation in Pandas
Data in Pandas is represented with the help of the following three data
structures:
Series: It is a one-dimensional ndarray with an axis label, which means it is like a
simple array with homogeneous data. For example, the following series is a collection of
integers 1, 5, 10, 15, 24, 25, …

1 5 10 15 24 25 28 36 40 89
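The series of integers shown above can be created as in the sketch below. The variable names are illustrative:

```python
import pandas as pd

# The integers from the example above.
values = [1, 5, 10, 15, 24, 25, 28, 36, 40, 89]
s = pd.Series(values)

print(s.dtype)    # an integer dtype: all elements share one type
print(s.index)    # the axis labels, by default 0 to 9
print(s.iloc[4])  # the element at position 4
```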


Data frame: It is the most useful data structure and is used for
almost all kinds of data representation and manipulation in
Pandas.
It is a two-dimensional data structure that can
contain heterogeneous data.
Generally, tabular data is represented by using data frames.

For example, the following table shows data about students:
their names, roll numbers, ages and genders.

Name      Rollnumber   Age   Gender
Aarav     1            15    Male
Harshit   2            14    Male
Kanika    3            16    Female
Mayank    4            15    Male
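The student table above could be built as a data frame along these lines. This is a sketch; the variable name `students` is an assumption made for the example:

```python
import pandas as pd

# One dictionary key per column; the columns hold different data types.
students = pd.DataFrame({
    "Name": ["Aarav", "Harshit", "Kanika", "Mayank"],
    "Rollnumber": [1, 2, 3, 4],
    "Age": [15, 14, 16, 15],
    "Gender": ["Male", "Male", "Female", "Male"],
})

print(students)
print(students.dtypes)  # text columns and integer columns side by side

# A simple calculation on one column.
mean_age = students["Age"].mean()
print(mean_age)
```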


Panel: It is a three-dimensional data structure containing heterogeneous data. It is very
difficult to represent a panel graphically, but it can be pictured as a
container of DataFrames. Note that Panel was deprecated and has since been removed from
recent Pandas releases, so current versions provide only Series and DataFrame.
The following table gives the dimension and description of the data
structures mentioned above:

Data Structure   Dimension   Description
Series           1-D         Size-immutable, 1-D homogeneous data
DataFrame        2-D         Size-mutable, heterogeneous data in tabular form
Panel            3-D         Size-mutable array, container of DataFrame

We can understand these data structures as follows: a higher-dimensional data structure
is a container of lower-dimensional data structures.
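The container relationship can be observed directly: selecting one column of a data frame yields a series, and selecting one element of a series yields a single value. A minimal sketch with made-up values:

```python
import pandas as pd

df = pd.DataFrame({"Age": [15, 14, 16, 15], "Rollnumber": [1, 2, 3, 4]})

# A 2-D DataFrame holds 1-D Series, one per column.
age = df["Age"]
print(type(age).__name__)

# A 1-D Series in turn holds individual values.
first_age = age.iloc[0]
print(first_age)
```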


Installation and Execution
If you are using the Anaconda distribution, there is no need to install Pandas separately, as it is
already included. You just need to import the package into your Python script
as follows:

import pandas as pd

On the other hand, if you are using a standard Python distribution, Pandas can be
installed with the popular Python package installer, pip:
pip install pandas

After installing Pandas, you can import it into your Python script as shown above.

Example

The following is an example of creating a series from an ndarray by using Pandas:


In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: data = np.array(['g','a','u','r','a','v'])
In [4]: s = pd.Series(data)
In [5]: print(s)
0    g
1    a
2    u
3    r
4    a
5    v
dtype: object


Scikit-learn
Another useful and very important Python library for data science and machine learning
is Scikit-learn. The following are some features of Scikit-learn that make it so useful:

It is built on NumPy, SciPy, and Matplotlib.

It is open source and can be reused under the Berkeley Software Distribution (BSD) license.

It is accessible to everybody and can be reused in various contexts.

A wide range of machine learning algorithms covering major areas of ML, such as classification,
clustering, regression, dimensionality reduction and model selection, can be implemented with
its help.

Installation and Execution

If you are using the Anaconda distribution, there is no need to install Scikit-learn separately, as it is
already included. You just need to use the package in your Python script. For
example, the following line imports the loader for the breast cancer dataset from
Scikit-learn:


from sklearn.datasets import load_breast_cancer

On the other hand, if you are using a standard Python distribution and already have
NumPy and SciPy, Scikit-learn can be installed with the popular Python package installer, pip:

pip install scikit-learn

After installing Scikit-learn, you can use it in your Python script as shown above.
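As an illustration of the classification support mentioned among the features, the sketch below loads the breast cancer dataset and fits a classifier to it. The choice of a decision tree, the 25% test split and the random seed are assumptions made for this example and are not taken from the slides:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the bundled dataset: X holds the features, y the class labels.
data = load_breast_cancer()
X, y = data.data, data.target
print(X.shape)

# Hold out a quarter of the samples for testing (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a decision tree classifier and measure accuracy on the held-out part.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Other estimators in Scikit-learn follow the same fit and score pattern, so the classifier can be swapped without changing the rest of the script.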


Thank You
