Python Pandas for Data Analysis Guide

Uploaded by

dhruvarora050209

Python Pandas

Pandas (Python Pandas) is Python’s library for data analysis. Its name is
derived from “panel data”, an econometrics term for multidimensional,
structured data sets. Pandas has become a popular choice for data analysis.
Data Analysis: the process of evaluating large data sets using analytical
and statistical tools to discover useful information and conclusions that
support business decision making.
Pandas provides a variety of tools for data analysis and makes the process
simpler than many other available tools. The main author of Pandas is
Wes McKinney.
Using Pandas:
Pandas is an open-source, BSD-licensed (Berkeley Software Distribution)
library built for the Python programming language. It offers
high-performance, easy-to-use data structures and data analysis tools.
To work with pandas in Python, import the pandas library into your Python
environment:
import pandas as pd
Why Pandas?
Pandas is very popular and is capable of performing the following tasks:
• It can read and write many different file formats (CSV, Excel, JSON etc.)
and handle many data types (integer, float, string etc.).
• It can calculate along either axis of the data (i.e. across rows
and down columns).
• It can easily select subsets of data from bulky data sets and combine
multiple data sets.
• It can find and fill missing data.
• It allows you to apply operations to independent groups within the data.
• It supports reshaping of data into different forms.
• It supports advanced time-series functionality. (Time-series forecasting
is the use of a model to predict future values based on previously
observed values.)
• It supports visualization by integrating with libraries such as
matplotlib and seaborn.
Pandas Data Structure
Data Structure: A data structure is a collection of data values and the
operations that can be applied to that data. It also refers to a specialized
way of storing data so that a specific type of functionality can be applied to it.
Basic Data Structures: Python Pandas has two basic data structures, Series
and DataFrame.

Series object:
Index  Data
0      A
1      B
2      C
3      D

DataFrame object:
   A      B   C
0  Amit   34  India
1  Sumit  49  India
2  Mark   33  USA
3  Sam    60  NZ
4  Paul   23  China
A Series is a one-dimensional data structure whose values can be of any data
type (int, float, list, string). A Series is called homogeneous because all of
its values share a single dtype; values of mixed types are stored under the
object dtype. The values in a Series can be changed, so it is value-mutable,
but the size of a Series object cannot be changed.
A DataFrame is a two-dimensional data structure that can hold heterogeneous
(different types of) data elements. Both the values and the size of a
DataFrame are mutable: we can add or drop elements from a DataFrame object.
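The difference between the two structures can be sketched in a few lines. This is only an illustration; the variable names and the sample marks are assumptions reused from the tables in these notes.

```python
import pandas as pd

# A Series holds one column of values that share a single dtype.
marks = pd.Series([79, 65, 89])
marks[0] = 80                      # values can be changed (value-mutable)

# A DataFrame holds several columns, each with its own dtype.
df = pd.DataFrame({"Name": ["Amit", "Sumit", "Arpit"],
                   "Marks": [79, 65, 89]})
df["Sport"] = ["Cricket", "Badminton", "Tennis"]   # add a column
df = df.drop(columns="Marks")                      # drop a column

print(marks.dtype)          # int64
print(list(df.columns))     # ['Name', 'Sport']
```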

Installing Pandas
To install Pandas from command line we need to type in:

pip install pandas

Important: Pandas can be installed only when Python is already installed on
that system.

Series Data Structure

Series is an important data structure of pandas. It represents a
one-dimensional array of indexed data. A Series object has two main
components:
1) An array of actual data
2) An associated array of indexes or data labels. The index is used to access
individual data values.
Examples of Series objects:

Index  Data      Index  Data      Index      Data
0      A         Amit   33        Monday     Meeting
1      B         Arpit  32        Tuesday    Shopping
2      C         Mark   44        Wednesday  Sports Activity
3      D         Woo    55        Thursday   Study
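As a rough sketch of the two components, the data and the index of a Series can be inspected separately through its values and index attributes. The names and numbers below are taken from the example figure; the variable name s is an assumption.

```python
import pandas as pd

# Data array plus an associated array of labels.
s = pd.Series([33, 32, 44, 55], index=["Amit", "Arpit", "Mark", "Woo"])

print(s.values)    # the array of actual data
print(s.index)     # the associated labels
print(s["Mark"])   # a label gives access to one value
```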
Creating Series Objects:
To create a Series object, we first import the pandas and numpy
(Numerical Python) modules with import statements:
import pandas as pd    # pd is an alias for pandas
import numpy as np     # np is an alias for numpy
i) Creating an empty Series:
To create an empty Series, call the pandas Series() constructor with no arguments.
Syntax:
<series object> = pd.Series()
Eg.
import pandas as pd
myseries = pd.Series()
print(myseries)
Output
Series([], dtype: object) (from pandas 2.0 onwards the default dtype of an
empty Series is object; older versions used float64)
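Because the default dtype of an empty Series depends on the pandas version, one option is to state the dtype explicitly. The following is a minimal sketch; the choice of float64 is only an assumption for illustration.

```python
import pandas as pd

# Passing dtype explicitly gives the same result on every pandas version.
empty = pd.Series(dtype="float64")

print(empty.empty)   # True: the Series holds no values
print(len(empty))    # 0
print(empty.dtype)   # float64
```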
ii) Creating non-empty series object:
To create a non-empty Series object, specify arguments for the data
and the index as per the following syntax:
<series object> = pd.Series(data, index=idx)
• Where data is the data part of the Series object and idx is a sequence
of index labels, one per data value. data can be any of:
o A Python sequence (list, tuple, string) or range()
o An ndarray (NumPy array)
o A Python dictionary
o A scalar value

o Creating Non-Empty Series using range()
Syntax: <Series Name> = pd.Series(range(<n>))
It will return an object of Series type.
Eg.1
import pandas
myseries = pandas.Series(range(5))
print(myseries)
Output:
0 0
1 1
2 2
3 3
4 4
dtype: int64

o Creating Non-Empty Series using Python Sequence (List)

Syntax: <Series Name> = pd.Series(<list>)
Eg.
a)
import pandas as pd
s = pd.Series([10, 20, 30, 40, 50])
print("Series Object:")
print(s)
Output:
Series Object:
0 10
1 20
2 30
3 40
4 50
dtype: int64
b)
import pandas as pd
s = pd.Series(['a', 'b', 'c', 'd', 'e'])
print("Series Object:")
print(s)
Output:
Series Object:
0 a
1 b
2 c
3 d
4 e
dtype: object
c)
import pandas as pd
s = pd.Series(['rat', 'bat', 'cat'])
print("Series Object:")
print(s)
Output:
Series Object:
0 rat
1 bat
2 cat
dtype: object
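The other kinds of data listed earlier (ndarray, dictionary, scalar) can be passed to Series() in the same way. A short sketch follows; the variable names and sample values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# From an ndarray, with our own index labels.
from_array = pd.Series(np.array([1.5, 2.5, 3.5]), index=["a", "b", "c"])

# From a dictionary: the keys become the index, the values become the data.
from_dict = pd.Series({"Monday": "Meeting", "Tuesday": "Shopping"})

# From a scalar: the value is repeated once for every index label.
from_scalar = pd.Series(10, index=["x", "y", "z"])

print(from_array)
print(from_dict)
print(from_scalar)
```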

DataFrame Data Structure

A DataFrame is a two-dimensional labelled data structure, like a table in Excel or
another spreadsheet. It contains rows and columns just like a two-dimensional array. Both
rows and columns have indexes. A DataFrame has the following characteristics:
1) It has two indexes: a row index (axis=0) and a column index (axis=1).
2) Each value of a DataFrame can be accessed with the combination of row
index and column index. The row index is known as the index and the column index
is known as the column name.
3) The indexes can be numbers, letters or strings.
4) There is no requirement that all data be of the same type across columns; its
columns can hold data of different types.
5) We can easily change its values. It is value-mutable.
6) You can add or delete rows/columns in a DataFrame. In other words, it is
size-mutable.
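The two axes can be seen by asking for totals along each of them. This is a small sketch with made-up marks; the column names Maths and Science are assumptions, not part of the notes.

```python
import pandas as pd

df = pd.DataFrame({"Maths": [79, 65, 89], "Science": [81, 70, 90]},
                  index=["Amit", "Sumit", "Arpit"])

# axis=0 works down the rows: one total per column.
col_totals = df.sum(axis=0)

# axis=1 works across the columns: one total per row.
row_totals = df.sum(axis=1)

print(col_totals)
print(row_totals)
```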

Creating and Displaying a DataFrame

A DataFrame object can be created by passing data in a two-dimensional format.

As before, make sure to import the pandas and numpy modules before doing
anything with pandas.

import pandas as pd
import numpy as np

To create a dataframe we can use the following syntax:

<DataFrame> = pd.DataFrame(<a 2D data structure>, [columns=<column sequence>], [index=<index sequence>])
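A minimal sketch of this syntax, assuming a list of lists as the 2D data structure. The row labels I, II, III and the variable names are chosen only for illustration.

```python
import pandas as pd

rows = [["Amit", 79, "Cricket"],
        ["Sumit", 65, "Badminton"],
        ["Arpit", 89, "Tennis"]]

# columns names each column; index names each row.
df = pd.DataFrame(rows,
                  columns=["Name", "Marks", "Sport"],
                  index=["I", "II", "III"])
print(df)
```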

1) Creating an empty DataFrame

Syntax: <DataFrame> = pd.DataFrame()
Example:
import pandas as pd
import numpy as np
df = pd.DataFrame()
print(df)
Output:
Empty DataFrame
Columns: []
Index: []

2) Creating a dataframe object using a 2D dictionary

A two-dimensional dictionary is a dictionary having items as (key: value) where the
value part is a data structure of any type: another dictionary, an ndarray, a Series
object, a list etc. The value parts of all the keys should have a similar
structure and equal lengths.

a) Creating a DataFrame from a 2D dictionary having values as lists/ndarrays:

We can create a DataFrame from a dictionary where each value of the dictionary
is either a list or an ndarray.

Example:
import pandas as pd
import numpy as np
dict1 = {'Name': ['Amit', 'Sumit', 'Arpit'],
         'Marks': [79, 65, 89],
         'sport': ['Cricket', 'Badminton', 'Tennis']}
df = pd.DataFrame(dict1)
print(df)
Output:
    Name  Marks      sport
0   Amit     79    Cricket
1  Sumit     65  Badminton
2  Arpit     89     Tennis

Attributes of DataFrame

Data Frame: df

RollNo Name Marks

A 110 Sandeep 97.5
B 111 Mukul 98.5
C 112 Rajkumar 99.5
D 113 Vipul 96.5

Index : it tells about index (Row labels) of the data frame.

print(df.index)
Index(['A', 'B', 'C', 'D'], dtype='object')

Columns : it tells about column labels of the data frame.

print(df.columns)
Index(['RollNo', 'Name', 'Marks'], dtype='object')

dtypes : returns the data types of data in the data frame.

print(df.dtypes)
RollNo int64
Name object
Marks float64
dtype: object

shape : it returns a tuple representing the dimension of the data frame.

print(df.shape)
(4, 3)

size : it returns the number of elements present in the data frame object.

print(df.size)
12
values : It returns Numpy representation of the data frame.

print(df.values)
[[110 'Sandeep' 97.5]
 [111 'Mukul' 98.5]
 [112 'Rajkumar' 99.5]
 [113 'Vipul' 96.5]]
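The attributes above can be checked together on the same data frame. This is a sketch that rebuilds df from the table shown earlier; to_numpy() is included as an assumption about what readers may prefer, since the pandas documentation recommends it over values.

```python
import pandas as pd

df = pd.DataFrame({"RollNo": [110, 111, 112, 113],
                   "Name": ["Sandeep", "Mukul", "Rajkumar", "Vipul"],
                   "Marks": [97.5, 98.5, 99.5, 96.5]},
                  index=["A", "B", "C", "D"])

print(df.index)      # row labels
print(df.columns)    # column labels
print(df.dtypes)     # one dtype per column
print(df.shape)      # (rows, columns)
print(df.size)       # rows * columns
print(df.to_numpy()) # NumPy representation, same data as df.values
```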

Selecting or Accessing Data

We can extract desired columns or rows from a dataframe.

Data Frame: df

   RollNo  Name      Marks
A  110     Sandeep   97.5
B  111     Mukul     98.5
C  112     Rajkumar  99.5
D  113     Vipul     96.5
1. Selecting / Accessing a Column
We can select / access a column as follows:

Syntax:
DataFrame[<Column Name>]
Or
DataFrame.<Column Name>

Example:
print(df['RollNo'])
or
print(df.RollNo)

2. Selecting Multiple columns

To select multiple columns, specify the list of columns inside square brackets.
print(df[['RollNo','Name']])

RollNo Name
A 110 Sandeep
B 111 Mukul
C 112 Rajkumar
D 113 Vipul

3. Selecting / Accessing a subset from a DataFrame using Row/Column Names:
Syntax:

<DataFrame>.loc[<StartRow>:<EndRow>, <StartColumn>:<EndColumn>]

Example:
import pandas as pd
st1={'RollNo':110,'Name':'Sandeep','Marks':97.5}
st2={'RollNo':111,'Name':'Mukul','Marks':98.5}
st3={'RollNo':112,'Name':'Rajkumar','Marks':99.5}
st4={'RollNo':113,'Name':'Vipul','Marks':96.5}
students=[st1,st2,st3,st4]
df=pd.DataFrame(students, index=["A","B","C","D"])
print(df.loc['A':'C','RollNo':'Name'])

print(df.loc['A':'C':2,'RollNo':'Name'])

To extract rows by position we can use the iloc (integer location) indexer. With
iloc we give the numeric positions of rows and columns. Unlike loc, whose slices
include the end label, iloc slices exclude the end position:

print(df.iloc[0:4:2, 0:2])
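The difference between label slices and position slices is easy to miss, so here is a small side-by-side sketch on the same students data. The variable names by_label, by_pos and stepped are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({"RollNo": [110, 111, 112, 113],
                   "Name": ["Sandeep", "Mukul", "Rajkumar", "Vipul"],
                   "Marks": [97.5, 98.5, 99.5, 96.5]},
                  index=["A", "B", "C", "D"])

# loc slices by label and includes the end label 'C'.
by_label = df.loc["A":"C", "RollNo":"Name"]

# iloc slices by position and excludes the end position 2.
by_pos = df.iloc[0:2, 0:2]

# A step of 2 keeps every second row: positions 0 and 2.
stepped = df.iloc[0:4:2, 0:2]

print(by_label)
print(by_pos)
print(stepped)
```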

To Access Individual Values

We can extract individual value of a dataframe as follows:

1) <DataFrame>.<ColumnName>[Row Index / Row numeric Index]

print(df.Name['A'])     Output: Sandeep

2) Using ‘at’ attributes with DF

<DataFrame>.at[<row label>, <col label>]

Example:

print(df.at['B','Name'])     Output: Mukul

3) Using ‘iat’ attributes with DF

<DataFrame>.iat [<row index No. >, <col Index No.>]

Example: print(df.iat[0,2])     Output: 97.5
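The three ways of reaching a single value can be tried on one data frame. A minimal sketch, rebuilding the same students df; the final assignment is an added illustration showing that at can also set a value.

```python
import pandas as pd

df = pd.DataFrame({"RollNo": [110, 111, 112, 113],
                   "Name": ["Sandeep", "Mukul", "Rajkumar", "Vipul"],
                   "Marks": [97.5, 98.5, 99.5, 96.5]},
                  index=["A", "B", "C", "D"])

print(df.Name["A"])         # column first, then row label
print(df.at["B", "Name"])   # row label, column label
print(df.iat[0, 2])         # row position, column position

# at can also be used on the left-hand side to change one value.
df.at["D", "Marks"] = 97.0
```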

Loading Data from CSV to DataFrames

The pandas library offers the read_csv() function to read data from CSV files
and the to_csv() method to write data to CSV files.

Reading from a CSV File

To read the data from CSV file we can use read_csv() function as per the
following syntax:

<df>=pandas.read_csv(<FilePath>)

CSV File= [Link]

Example:
import pandas as pd
df=pd.read_csv("d:\[Link]")
print(df.loc[0:1,'RollNo.':'Name'])

Here, the first row of the CSV file is used as the column names of
DataFrame df.
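To try read_csv() without a file on disk, the CSV text can be wrapped in io.StringIO, which read_csv() accepts in place of a path. This is a sketch with assumed sample rows; the variable csv_text is hypothetical.

```python
import io
import pandas as pd

# CSV content held in a string instead of a file (assumed sample data).
csv_text = "RollNo,Name,Marks\n110,Sandeep,97.5\n111,Mukul,98.5\n"

# The first line of the text becomes the column names.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```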

To Skip Rows while reading CSV File

<df>=pandas.read_csv(<Path>,names=[<Column Names>],skiprows=[<n>])

Examples

import pandas as pd
df=pd.read_csv("d:\[Link]",names=["RNo","Name","Marks"], skiprows=[0])
print(df)

Specifying Own Columns Names:

Sometimes a CSV file doesn’t have a column header row. In that case read_csv()
would still treat the first row of data as the column names of the DataFrame.

To avoid this, we can specify our own column names while reading
the data from the CSV file.

Example:

CSV File= [Link]

(Without Column Header)

import pandas as pd
df=pd.read_csv("d:\[Link]",names=["RNo","Name","Marks"])
print(df)

If you do not want to name the columns yourself, pass header=None; pandas then
numbers the columns 0, 1, 2 and so on.

import pandas as pd
df=pd.read_csv("d:\[Link]",header=None)
print(df)
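The three options (names, header=None and skiprows) can be compared on the same data. A sketch using in-memory text instead of a file; the sample rows and variable names are assumptions.

```python
import io
import pandas as pd

no_header = "110,Sandeep,97.5\n111,Mukul,98.5\n"
with_header = "RollNo,Name,Marks\n" + no_header

# Our own column names for a file that has no header row.
named = pd.read_csv(io.StringIO(no_header), names=["RNo", "Name", "Marks"])

# header=None: columns are numbered 0, 1, 2.
numbered = pd.read_csv(io.StringIO(no_header), header=None)

# skiprows=[0] drops the existing header line so names can replace it.
renamed = pd.read_csv(io.StringIO(with_header),
                      names=["RNo", "Name", "Marks"], skiprows=[0])

print(named)
print(numbered)
print(renamed)
```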

Storing DataFrame Data to CSV Files

We can use the to_csv() method to create a CSV file from a DataFrame. The syntax is
as follows:

Syntax:
<dataframe>.to_csv(<file path>)
Or
<dataframe>.to_csv(<file path>,sep=<Separator_Character>)

Example:

import pandas as pd
st1={'RollNo':110,'Name':'Sandeep','Marks':97.5}
st2={'RollNo':111,'Name':'Mukul','Marks':98.5}
st3={'RollNo':112,'Name':'Rajkumar','Marks':99.5}
st4={'RollNo':113,'Name':'Vipul','Marks':96.5}
students=[st1,st2,st3,st4]
df=pd.DataFrame(students, index=["A","B","C","D"])
print(df)
df.to_csv("d:\[Link]",sep=',')
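When to_csv() is called without a file path it returns the CSV text as a string, which is a convenient way to see what would be written. A sketch with two assumed rows; the semicolon separator and index=False are shown only as examples of options.

```python
import io
import pandas as pd

df = pd.DataFrame({"RollNo": [110, 111],
                   "Name": ["Sandeep", "Mukul"],
                   "Marks": [97.5, 98.5]},
                  index=["A", "B"])

# No path given: the CSV text is returned instead of written to disk.
text = df.to_csv()

# A different separator, and the row index left out.
no_index = df.to_csv(sep=";", index=False)

# Reading the text back, using the first column as the row index.
back = pd.read_csv(io.StringIO(text), index_col=0)

print(text)
print(no_index)
```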

Adding / Modifying Rows’/ Columns values in DataFrame

We can assign or modify the data in dataframe by specifying the row name or
column name along with the dataframe’s name.

Adding/Modifying a column

• We can modify a column if it already exists.

• We can add a column if it does not exist.

Syntax:

Modifying Value in a Column

import pandas as pd
df=pd.read_csv("d:\[Link]",names=["RNo","Name","Marks"])
print("Before Adding:\n ",df)
[Link]="100"
print("After Adding:\n ",df)
Modifying specific value in a Column
import pandas as pd
df=pd.read_csv("d:\[Link]",names=["RNo","Name","Marks"])
print("Before Adding:\n ",df)
[Link][0,'Name']="Ashu"
print("After Adding:\n ",df)

import pandas as pd
df=pd.read_csv("d:\[Link]",names=["RNo","Name","Marks"])
print("Before Adding:\n ",df)
df.loc[0:1,'Name']="Ashu"
print("After Adding:\n ",df)

import pandas as pd
df=pd.read_csv("d:\[Link]",names=["RNo","Name","Marks"])
print("Before Adding:\n ",df)
[Link][0,'Name']="Ashu"
print("After Adding:\n ",df)

Adding a Column
import pandas as pd
df=pd.read_csv("d:\[Link]",names=["RNo","Name","Marks"])
print("Before Adding:\n ",df)
df['Phone']="100"
print("After Adding:\n ",df)

Adding a Column with different values

import pandas as pd
df=pd.read_csv("d:\[Link]",names=["RNo","Name","Marks"])
print("Before Adding:\n ",df)
df['Phone']=[100,200]
print("After Adding:\n ",df)
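When a list is assigned to a column, its length has to match the number of rows. A small sketch of both the working and the failing case; the data, the City column and the raised flag are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({"RNo": [110, 111],
                   "Name": ["Sandeep", "Mukul"],
                   "Marks": [97.5, 98.5]})

df["Phone"] = "100"           # new column, one value repeated in every row
df["Marks"] = [90.0, 95.0]    # existing column, replaced value by value

raised = False
try:
    df["City"] = ["Delhi"]    # one value for two rows
except ValueError:
    raised = True             # lengths must match the number of rows

print(df)
print(raised)
```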
Adding / Modifying a Row

We can change or add rows in a DataFrame using the at or loc attributes as follows:

<df>.at[<row name>, : ] = <new value>

Or
<df>.loc[<row name>, : ] = <new value>

(at is designed for accessing a single value, so loc is the more dependable
choice for whole rows.)

Important: if there is no row with the given row label, pandas adds a new
row with this label and assigns the given values to all its columns:

Example:
import pandas as pd
dict1={'Name':['Amit', 'Sumit', 'Arpit'],
'Marks':[79, 65,89],
'sport':['Cricket', 'Badminton', 'Tennis']}
df=pd.DataFrame(dict1, index=['I','II', 'III'])
print(df)
Adding Rows: if row index is not available then it will add a new row.

(i)
import pandas as pd
dict1={'Name':['Amit', 'Sumit', 'Arpit'],
'Marks':[79, 65,89],
'sport':['Cricket', 'Badminton', 'Tennis']}
df=pd.DataFrame(dict1, index=['I','II', 'III'])
print("Before:\n",df)
df.loc['IV']= "ABC"
print("After:\n",df)
(ii)
import pandas as pd
dict1={'Name':['Amit', 'Sumit', 'Arpit'],
'Marks':[79, 65,89],
'sport':['Cricket', 'Badminton', 'Tennis']}
df=pd.DataFrame(dict1, index=['I','II', 'III'])
print("Before:\n",df)
df.loc['IV']= ["Kumar",99,"Ludo"]
print("After:\n",df)

If the number of values in the sequence (here ["Kumar",99,"Ludo"]) differs from
the number of columns, pandas raises a ValueError.
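Both cases can be sketched together: a row with one value per column is added, while a row with too few values is rejected. The row label V and the raised flag are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Amit", "Sumit", "Arpit"],
                   "Marks": [79, 65, 89],
                   "sport": ["Cricket", "Badminton", "Tennis"]},
                  index=["I", "II", "III"])

# Label 'IV' does not exist yet, so a new row is added.
df.loc["IV"] = ["Kumar", 99, "Ludo"]

raised = False
try:
    df.loc["V"] = ["Only", "two"]   # two values for three columns
except ValueError:
    raised = True

print(df)
print(raised)
```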

Modifying Existing Row:

(i)
import pandas as pd
dict1={'Name':['Amit', 'Sumit', 'Arpit'], 'Marks':[79, 65,89],
'sport':['Cricket', 'Badminton', 'Tennis']}
df=pd.DataFrame(dict1, index=['I','II', 'III'])

print("Before:\n",df)
df.loc['III', :]="Ludo"
print("After:\n",df)

Modifying Single Cell:

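We can use the following syntax to modify a particular cell in a DataFrame.
<DataFrame>.<ColumnName>[<row name/label>] = <new value>
This chained form may warn or leave the DataFrame unchanged in newer pandas
versions; <DataFrame>.at[<row label>, <column name>] = <new value> is the
more dependable equivalent.
import pandas as pd
dict1={'Name':['Amit', 'Sumit', 'Arpit'],
'Marks':[79, 65,89],
'sport':['Cricket', 'Badminton', 'Tennis']}
df=pd.DataFrame(dict1, index=['I','II', 'III'])
print("Before:\n",df)
df.Name['III']="Sandeep"
print("After:\n",df)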
Deleting / Renaming Columns/ Rows

Python Pandas provides two ways to delete rows and columns: the del statement
and the drop() function. Pandas also provides the rename() function to rename
rows and columns.

Deleting Column

To delete a column you use del statement as follows:

Syntax:
del <df Object> [<ColumnName>]
Example:

import pandas as pd
dict1={'Name':['Amit', 'Sumit', 'Arpit'],
'Marks':[79, 65,89],
'sport':['Cricket', 'Badminton', 'Tennis']}
df=pd.DataFrame(dict1, index=['I','II', 'III'])
print("Before:\n",df)
del df['Marks']
print("After:\n",df)
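The drop() and rename() functions mentioned at the start of this section can be sketched on the same data. Both return a new DataFrame and leave the original unchanged; the variable names and the new labels Sport and First are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Amit", "Sumit", "Arpit"],
                   "Marks": [79, 65, 89],
                   "sport": ["Cricket", "Badminton", "Tennis"]},
                  index=["I", "II", "III"])

# drop() can remove several columns at once, or rows by label.
fewer_cols = df.drop(columns=["Marks", "sport"])
fewer_rows = df.drop(index="II")

# rename() maps old labels to new ones for columns and/or rows.
renamed = df.rename(columns={"sport": "Sport"}, index={"I": "First"})

print(fewer_cols)
print(fewer_rows)
print(renamed)
```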

Deleting Multiple columns
