Python

The document provides an overview of Python programming, covering key concepts such as data types, variables, comments, and operators. It explains how to create and manipulate variables, use Python libraries, and implement basic syntax for coding. Additionally, it highlights the importance of comments and identifiers in programming for better code documentation and readability.

Uploaded by

Raghuvaran Ram Tulasi Rayithi

Data Science is a branch of computer science in which we study how to
store, use and analyze data to derive information from it.
❖ A Python library is a collection of related modules. It contains bundles of code that can be used repeatedly in different programs.

❖ Libraries make Python programming simpler and more convenient for the programmer, because we don't need to write the same code again and again for different programs.

❖ Python libraries are used to create applications and models in a variety of fields, for instance, machine
learning, data science, data visualization, image and data manipulation, and many more.
PYTHON BASIC SYNTAX
INTRODUCTION
COMMENTS…
Python Comments
• A Python comment is text that begins with the # (hash/pound sign) and runs to the end of the line.

• Comments are used to document code and to help other programmers understand it.

• You can use Python comments inline, on independent lines, or on multiple lines to
include larger documentation.

• Comments are not part of your program.

• For this reason, comments are skipped when your program is executed.

• Usually we use comments for making brief notes about a chunk of code.
Example
>>> #this is a Python comment.
>>> #print("I will not be executed")

>>> print("I will be executed")

Output:
I will be executed
Different Types of Python Comments

• In Python there are two types of comments :

• Single line comments
• Multiple lines comments.
Single line comments
• Single line commenting is commonly used for a brief
and quick comment (or to debug a program).

• Example :
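A minimal sketch of single-line comments follows. The variable names are made up for illustration; the point is only where a # comment can appear.

```python
# A full-line comment: compute the area of a rectangle.
width = 4
height = 3
area = width * height  # an inline comment after a statement

# print("debug: width =", width)   <- a line disabled while debugging
print(area)
```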
Multiple Lines Comments
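• Multi-line comments are used to note something down in more detail or to block out an entire chunk
of code.
• Multi-line comments are slightly different: Python has no dedicated syntax for them, so you either start every line with # or wrap the text in a triple-quoted string.
• For the second option, simply put 3 single quotes before and after the part
you want to be commented. Python reads it as a string that is never used, so it has no effect.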
• Example:
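The triple-quote technique described above can be sketched as follows. Strictly speaking, the quoted text is an unused string literal rather than a true comment; the names here are illustrative only.

```python
'''
This block is wrapped in three single quotes.
Python evaluates it as a string and then discards it,
so it behaves like a multi-line comment.
'''
total = 0
for n in (1, 2, 3):
    total += n

'''
total = -1   # this blocked-out statement never runs
'''
print(total)
```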
VARIABLES…
PYTHON VARIABLES
• A variable is a name that refers to a value stored in memory.
• This means that when you create a variable, Python
reserves some space in memory for its value.
• Based on the data type of the value, the interpreter
allocates memory and decides what can be stored in
the reserved memory.
• Therefore, by assigning values of different data types to
variables, you can store integers, decimals or
characters in these variables.
Valid and Invalid Identifiers
• Variables are an example of identifiers.
• An identifier is a name used to identify a variable, function, class, or other object in the program.

• The rules for naming an identifier are given below:

• The first character must be a letter or an underscore ( _ ).
• Every character after the first may be a lower-case letter (a-z), an upper-case letter (A-Z), an underscore, or a digit (0-9).
• An identifier must not contain any white space or special character (!,
@, #, %, ^, &, *).
• An identifier must not be the same as any keyword defined in the
language.
• Identifiers are case sensitive; for example, myname and MyName are
not the same.

• Examples of valid identifiers: a123, _n, n_9, etc.

• Examples of invalid identifiers: 1a, n%4, n 9, etc.
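The rules above can be checked from Python itself. The sketch below uses the built-in str.isidentifier() method and the standard keyword module; the helper name is_valid_identifier is an assumption made for this example.

```python
import keyword

def is_valid_identifier(name):
    """Return True if name obeys the identifier rules and is not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["a123", "_n", "n_9", "1a", "n%4", "n 9", "class"]
results = {name: is_valid_identifier(name) for name in candidates}

for name, ok in results.items():
    print(repr(name), "valid" if ok else "invalid")
```

Note that "class" passes the character rules but is rejected because it is a keyword.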
Assigning Values to Variables
• Python variables do not need explicit declaration to
reserve memory space.
• The declaration happens automatically when you
assign a value to a variable.
• The equal sign (=) is used to assign values to
variables.
• The operand to the left of the = operator is the name
of the variable and the operand to the right of the =
operator is the value stored in the variable.
Example
>>> count = 780 # An integer assignment
>>> miles = 5300.0 # A floating point
>>> name = "Simplilearn" # A string
>>> print(count)
>>> print(miles)
>>> print(name)
Output :
780
5300.0
Simplilearn
Examples
# No need of declaring a variable.
a = b = c = 15 #Multiple assignment

# Two integers and one string

a, b, c = 1, 2, "nielit"

#To delete a variable

Syntax : del var
Example : del var_a, var_b
Variables
Variables are containers for storing data values.

Creating Variables
Python has no command for declaring a variable.
A variable is created the moment you first assign a value to it.
Variable Names
A variable can have a short name (like x and y) or a more descriptive name (age, carname, total_volume).
Rules for Python variables:
•A variable name must start with a letter or the underscore character
•A variable name cannot start with a number
•A variable name can only contain alpha-numeric characters and underscores (A-z, 0-9, and _ )
•Variable names are case-sensitive (age, Age and AGE are three different variables)

Example
Legal variable names:
myvar = "John"
my_var = "John"
_my_var = "John"
myVar = "John"
MYVAR = "John"
myvar2 = "John"

Example
Illegal variable names:
2myvar = "John"
my-var = "John"
my var = "John"
Python Variables - Assign Multiple Values
Many Values to Multiple Variables

Example
x, y, z = "Orange", "Banana", "Cherry"
print(x)
print(y)
print(z)

One Value to Multiple Variables

x = y = z = "Orange"
print(x)
print(y)
print(z)

Unpack a Collection

fruits = ["apple", "banana", "cherry"]
x, y, z = fruits
print(x)
print(y)
print(z)
Python - Output Variables
The Python print() function is often used to output variables.

x = "Python is awesome"
print(x)

x = "Python"
y = "is"
z = "awesome"
print(x, y, z)

For numbers, the + character works as a mathematical operator:

x = 5
y = 10
print(x + y)

In the print() function, when you try to combine a string and a number with the + operator, Python will give you an error:

x = 5
y = "John"
print(x + y)
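The error mentioned above is a TypeError. The sketch below catches it and shows two common ways around it: passing the values to print() separately, or converting the number with str(). The variable names error_message and combined are introduced only for this example.

```python
x = 5
y = "John"

try:
    print(x + y)
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# Option 1: let print() join the values with a space.
print(x, y)

# Option 2: convert the number to a string before using +.
combined = str(x) + " " + y
print(combined)
```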
Global variables

Variables that are created outside of a function are known as global variables.

Global variables can be used by everyone, both inside of functions and outside.

Create a variable outside of a function, and use it inside the function:

x = "awesome"

def myfunc():
    print("Python is " + x)

myfunc()
Python Data Types
In programming, data type is an important concept.
Variables can store data of different types, and different types can do different things.

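Python has the following data types built-in by default, in these categories:

Text Type: str
Numeric Types: int, float, complex
Sequence Types: list, tuple, range
Mapping Type: dict
Set Types: set, frozenset
Boolean Type: bool
Binary Types: bytes, bytearray, memoryview
None Type: NoneType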

Print the data type of the variable x:

x = 5
print(type(x))
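As a quick illustration, type() can be applied to one literal from several of the built-in categories. The sample values below are arbitrary.

```python
samples = ["hello", 5, 2.8, 1j, [1, 2], (1, 2), range(3),
           {"a": 1}, {1, 2}, True, b"hi", None]

# Collect the class name of each sample value.
type_names = [type(value).__name__ for value in samples]

for value, name in zip(samples, type_names):
    print(repr(value), "->", name)
```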
Python Numbers
There are three numeric types in Python:
•int
•float
•complex

x = 1 # int
y = 2.8 # float
z = 1j # complex

print(type(x))
print(type(y))
print(type(z))
Int
Int, or integer, is a whole number, positive or negative, without decimals, of unlimited length.

x = 1
y = 35656222554887711
z = -3255522

print(type(x))
print(type(y))
print(type(z))
Float
Float, or "floating point number" is a number, positive or negative, containing one or more decimals.

x = 1.10
y = 1.0
z = -35.59

print(type(x))
print(type(y))
print(type(z))
Complex
Complex numbers are written with a "j" as the imaginary part:

x = 3+5j
y = 5j
z = -5j

print(type(x))
print(type(y))
print(type(z))

Import the random module, and display a random number between 1 and 9:

import random

print(random.randrange(1, 10))
Python Strings
Strings in Python are surrounded by either single quotation marks or double quotation marks.
'hello' is the same as "hello".
You can display a string literal with the print() function:

print("Hello")
print('Hello')

Assign String to a Variable

Assigning a string to a variable is done with the variable name followed by an equal sign and the string:

a = "Hello"
print(a)

Multiline Strings
You can assign a multiline string to a variable by using three quotes:

a = """360 TMG."""
print(a)

a = '''data science.'''
print(a)
String Length
To get the length of a string, use the len() function.

a = "Hello, World!"
print(len(a))
Check String
To check if a certain phrase or character is present in a string, we can use the keyword in.

txt = "The best things in life are free!"

print("free" in txt)
Check if NOT
To check if a certain phrase or character is NOT present in a string, we can use the keyword not in.

txt = "The best things in life are free!"

print("expensive" not in txt)

Use it in an if statement:
txt = "The best things in life are free!"
if "expensive" not in txt:
    print("No, 'expensive' is NOT present.")
Python - Slicing Strings
You can return a range of characters by using the slice syntax.
Specify the start index and the end index, separated by a colon, to return a part of the string.

Get the characters from position 2 to position 5 (not included):
b = "Hello, World!"
print(b[2:5])
Get the characters from the start to position 5 (not
included):
b = "Hello, World!"
print(b[:5])
Get the characters from position 2, and all the way
to the end:
b = "Hello, World!"
print(b[2:])
Negative Indexing
Use negative indexes to start the slice from the end of the string:

From: "o" in "World!" (position -5)

To, but not included: "d" in "World!" (position -2):
b = "Hello, World!"
print(b[-5:-2])
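The four slices above can be collected in one place to compare their results. Positions count from 0, and the end position is never included. The variable names are chosen for this sketch only.

```python
b = "Hello, World!"

middle = b[2:5]      # positions 2, 3 and 4
head = b[:5]         # from the start up to, not including, position 5
tail = b[2:]         # from position 2 to the end
from_end = b[-5:-2]  # counted from the end of the string

print(middle, head, tail, from_end, sep=" | ")
```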
Python - Modify Strings
Upper Case

a = "Hello, World!"
print(a.upper())

Lower Case

a = "Hello, World!"
print(a.lower())

Remove Whitespace

a = " Hello, World! "
print(a.strip())

Replace String

a = "Hello, World!"
print(a.replace("H", "J"))

Split String

a = "Hello, World!"
print(a.split(","))
Python - Format - Strings
As we learned in the Python Variables chapter, we cannot
combine strings and numbers like this:

age = 36
txt = "My name is John, I am " + age
print(txt)

Use the format() method to insert numbers into strings:

age = 36
txt = "My name is John, and I am {}"
print(txt.format(age))

quantity = 3
item = 567
price = 49.95
myorder = "I want {} pieces of item {} for {} dollars."
print(myorder.format(quantity, item, price))
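As a possible extension of the format() example, placeholders can carry an index and a format specification. The sketch also shows an f-string, a different technique available from Python 3.6, which gives the same result. The names order and order_f are assumptions for this example.

```python
quantity = 3
item = 567
price = 49.95

# Indexed placeholders; {2:.2f} prints the price with two decimals.
order = "I want {0} pieces of item {1} for {2:.2f} dollars.".format(quantity, item, price)
print(order)

# The same sentence written as an f-string.
order_f = f"I want {quantity} pieces of item {item} for {price:.2f} dollars."
print(order_f)
```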
Python Booleans
Boolean Values
Booleans represent one of two values: True or False.

In programming you often need to know if an expression is True or False.

You can evaluate any expression in Python, and get one of two answers, True or False.

When you compare two values, the expression is evaluated and Python returns the Boolean answer:
print(10 > 9)
print(10 == 9)
print(10 < 9)

a = 200
b = 33

if b > a:
    print("b is greater than a")
else:
    print("b is not greater than a")

Evaluate Values and Variables

The bool() function allows you to evaluate any value, and give you True or False in return.

print(bool("Hello"))
print(bool(15))

x = "Hello"
y = 15

print(bool(x))
print(bool(y))

bool("abc")
bool(123)
bool(["apple", "cherry", "banana"])

Some Values are False

In fact, there are not many values that evaluate to False, except empty values, such as (), [], {}, "", the number 0, and the
value None. And of course the value False evaluates to False.

bool(False)
bool(None)
bool(0)
bool("")
bool(())
bool([])
bool({})
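The calls above return values that are easy to miss in a script, because nothing prints them. The sketch below collects the results and shows the typical use of truthiness in an if statement. The names basket and status are illustrative.

```python
falsy_values = [False, None, 0, "", (), [], {}]
truthy_values = ["abc", 123, ["apple", "cherry", "banana"]]

falsy_results = [bool(v) for v in falsy_values]
truthy_results = [bool(v) for v in truthy_values]
print(falsy_results)
print(truthy_results)

# An empty list counts as False, so it can be tested directly.
basket = []
if not basket:
    status = "basket is empty"
else:
    status = "basket has items"
print(status)
```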
Python Operators
Operators are used to perform operations on variables and values.

Python divides the operators in the following groups:

•Arithmetic operators

•Assignment operators

•Comparison operators

•Logical operators

•Identity operators

•Membership operators

•Bitwise operators
Python Arithmetic Operators
Arithmetic operators are used with numeric values to perform common mathematical operations:
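A short sketch of the arithmetic operators, using two arbitrary sample values:

```python
a, b = 7, 2

results = {
    "a + b": a + b,    # addition
    "a - b": a - b,    # subtraction
    "a * b": a * b,    # multiplication
    "a / b": a / b,    # division (always returns a float)
    "a // b": a // b,  # floor division
    "a % b": a % b,    # modulus (remainder)
    "a ** b": a ** b,  # exponentiation
}

for expression, value in results.items():
    print(expression, "=", value)
```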
Python Comparison Operators
Comparison operators are used to compare two values:
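A short sketch of the comparison operators; each one returns True or False. The sample values are arbitrary.

```python
a, b = 5, 3

comparisons = {
    "a == b": a == b,  # equal
    "a != b": a != b,  # not equal
    "a > b": a > b,    # greater than
    "a < b": a < b,    # less than
    "a >= b": a >= b,  # greater than or equal to
    "a <= b": a <= b,  # less than or equal to
}

for expression, value in comparisons.items():
    print(expression, "is", value)
```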
Python Lists
Lists are used to store multiple items in a single variable.
Lists are one of 4 built-in data types in Python used to store collections of data, the other 3
are Tuple, Set, and Dictionary, all with different qualities and usage.
Lists are created using square brackets:

t = ["apple", "banana", "cherry"]

print(t)

thislist = ["apple", "banana", "cherry"]

print(thislist[1])

Negative Indexing
Negative indexing means start from the end
-1 refers to the last item, -2 refers to the second last item etc.

thislist = ["apple", "banana", "cherry"]

print(thislist[-1])

thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]

print(thislist[:4])

thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]

print(thislist[2:])
thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]
print(thislist[-4:-1])

thislist = ["apple", "banana", "cherry"]

if "apple" in thislist:
    print("Yes, 'apple' is in the fruits list")

Range of Indexes
You can specify a range of indexes by specifying where to start and where to end the range.
When specifying a range, the return value will be a new list with the specified items.

thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]

print(thislist[2:5])
Python - Change List Items
To change the value of a specific item, refer to the index number:

thislist = ["apple", "banana", "cherry"]

thislist[1] = "blackcurrant"
print(thislist)

thislist = ["apple", "banana", "cherry", "orange", "kiwi", "mango"]

thislist[1:3] = ["blackcurrant", "watermelon"]
print(thislist)
Python - Add List Items
To add an item to the end of the list, use the append() method:

thislist = ["apple", "banana", "cherry"]

thislist.append("orange")
print(thislist)

To insert a list item at a specified index, use the insert() method. The insert() method inserts an item at the
specified index:

Insert an item as the second position:

thislist = ["apple", "banana", "cherry"]

thislist.insert(1, "orange")
print(thislist)
Extend List
To append elements from another list to the current list, use the extend() method.

thislist = ["apple", "banana", "cherry"]

new = ["mango", "pineapple", "papaya"]
thislist.extend(new)
print(thislist)
Python - Remove List Items
Remove Specified Item
The remove() method removes the specified item.

thislist = ["apple", "banana", "cherry"]

thislist.remove("banana")
print(thislist)

Remove Specified Index

The pop() method removes the specified index.

thislist = ["apple", "banana", "cherry"]

thislist.pop(1)
print(thislist)

The del keyword also removes the specified index:

thislist = ["apple", "banana", "cherry"]

del thislist[0]
print(thislist)

Clear the List

The clear() method empties the list.

thislist = ["apple", "banana", "cherry"]
thislist.clear()
print(thislist)
Python - Loop Lists
You can loop through the list items by using a for loop.
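A minimal sketch of looping through a list. The second loop uses the built-in enumerate() function, which is not mentioned above, to get the index alongside each item; the visited list exists only to record what the loop saw.

```python
thislist = ["apple", "banana", "cherry"]

visited = []
for fruit in thislist:
    print(fruit)
    visited.append(fruit)

# enumerate() yields (index, item) pairs.
for index, fruit in enumerate(thislist):
    print(index, fruit)
```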
Matplotlib
What is Matplotlib?

Matplotlib is a low level graph plotting library in python that serves as a visualization utility.

Matplotlib was created by John D. Hunter.

Matplotlib is open source and we can use it freely.

Matplotlib is mostly written in Python; a few segments are written in C, Objective-C and JavaScript for platform compatibility.

Pyplot
Most of the Matplotlib utilities lie under the pyplot submodule, and are usually imported under the plt alias:

import matplotlib.pyplot as plt

Draw a line in a diagram from position (0,0) to position (6,250):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([0, 6])
ypoints = np.array([0, 250])

plt.plot(xpoints, ypoints)
plt.show()
Markers
You can use the keyword argument marker to emphasize each point with a specified marker:

Mark each point with a circle:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o')
plt.show()
Linestyle
You can use the keyword argument linestyle, or shorter ls, to change the style of the plotted line:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, linestyle = 'dotted')
plt.show()

plt.plot(ypoints, linestyle = 'dashed')
plt.show()

Shorter Syntax
The line style can be written in a shorter syntax:
linestyle can be written as ls.
dotted can be written as :.
dashed can be written as --.
Create Labels for a Plot
With Pyplot, you can use the xlabel() and ylabel() functions to set a label for the x- and y-axis.

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.plot(x, y)

plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.show()
Create a Title for a Plot
With Pyplot, you can use the title() function to set a title for the plot.

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.plot(x, y)

plt.title("Sports Watch Data")

plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.show()
Specify Which Grid Lines to Display
You can use the axis parameter in the grid() function to specify which grid lines to display.
Legal values are: 'x', 'y', and 'both'. Default value is 'both'.

Display only grid lines for the x-axis:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data")

plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)

plt.grid(axis = 'x')

plt.show()
Display only grid lines for the y-axis:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data")

plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)

plt.grid(axis = 'y')

plt.show()
Matplotlib Scatter
Creating Scatter Plots
With Pyplot, you can use the scatter() function to draw a scatter plot.
The scatter() function plots one dot for each observation. It needs two arrays of the same length, one for the values of the x-axis,
and one for values on the y-axis:

annot – an array of the same shape as data which is used to annotate the heatmap.
A simple scatter plot:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])

plt.scatter(x, y)
plt.show()
Compare Plots
In the example above, there seems to be a relationship between speed and age, but what if we plot
the observations from another day as well? Will the scatter plot tell us something else?

import matplotlib.pyplot as plt
import numpy as np

#day one, the age and speed of 13 cars:

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)

#day two, the age and speed of 15 cars:

x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y)
plt.show()
Colors
You can set your own color for each scatter plot with the color or the c argument:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y, color = 'hotpink')

x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y, color = '#88c999')

plt.show()
Color Each Dot
You can even set a specific color for each dot by using an array of colors as value for the c argument:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array(["red","green","blue","yellow","pink","black","orange","purple","beige","brown","gray","cyan","magenta"])

plt.scatter(x, y, c=colors)
plt.show()

This colormap is called 'viridis' and as you can see it ranges from 0, which is a purple
color, and up to 100, which is a yellow color.
Matplotlib Histograms
Histogram
A histogram is a graph showing frequency distributions.
It is a graph showing the number of observations within each given interval.
Example: Say you ask for the height of 250 people, you might end up with a histogram like this:

Create Histogram
-In Matplotlib, we use the hist() function to create histograms.
-The hist() function will use an array of numbers to create a histogram, the array is sent into the
function as an argument.
-For simplicity we use NumPy to randomly generate an array with 250 values, where the values will
concentrate around 170, and the standard deviation is 10.
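The steps above can be sketched as follows. This is an illustration under stated assumptions, not the original slide's code: the random generator is seeded so runs are repeatable, and the non-interactive "Agg" backend is selected so the sketch also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove to use your default backend
import matplotlib.pyplot as plt
import numpy as np

# 250 values concentrated around 170 with a standard deviation of 10.
rng = np.random.default_rng(seed=0)
heights = rng.normal(170, 10, 250)

# hist() draws the bars and also returns the counts and bin edges.
counts, bin_edges, _ = plt.hist(heights)
print(counts)

plt.close()  # with an interactive backend, call plt.show() here instead
```

By default hist() uses 10 bins, so counts holds 10 numbers that add up to 250.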

With Pyplot, you can use the bar() function to draw bar graphs:

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.bar(x,y)

x = ["APPLES", "BANANAS"]
y = [400, 350]
plt.bar(x, y)

Horizontal Bars
If you want the bars to be displayed horizontally instead of vertically, use the barh() function:

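import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.barh(x, y)
plt.show()

A simple pie chart, drawn with the pie() function:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])

plt.pie(y)
plt.show()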

Labels
Add labels to the pie chart with the labels parameter. The labels parameter must be an array with one label for each wedge:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])

mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels)

plt.show()
Explode
Maybe you want one of the wedges to stand out? The explode parameter allows you to do that.
The explode parameter, if specified, and not None, must be an array with one value for each wedge.
Each value represents how far from the center each wedge is displayed:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])

mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]

plt.pie(y, labels = mylabels, explode = myexplode)

plt.show()
Pandas
What is Pandas?
-Pandas is a Python library used for working with data sets.
-It has functions for analyzing, cleaning, exploring, and manipulating data.
-The name "Pandas" has a reference to both "Panel Data", and "Python Data Analysis" and was
created by Wes McKinney in 2008.
Why Use Pandas?
-Pandas allows us to analyze big data and make conclusions based on statistical theories.
-Pandas can clean messy data sets, and make them readable and relevant.
-Relevant data is very important in data science.

What Can Pandas Do?

Pandas gives you answers about the data. Like:
•Is there a correlation between two or more columns?
•What is average value?
•Max value? Min value?
Pandas is also able to delete rows that are not relevant or contain wrong values, like empty or
NULL values. This is called cleaning the data.
Pandas Series
What is a Series?
A Pandas Series is like a column in a table.
It is a one-dimensional array holding data of any type.

Create a simple Pandas Series from a list:

import pandas as pd

a = [1, 7, 2]

myvar = pd.Series(a)

print(myvar)

The type int64 tells us that Python is storing each value within this column as a 64 bit integer.
Labels
❖ If nothing else is specified, the values are labeled with their index number. First value has index 0, second value has index 1 etc.

import pandas as pd

data = {"calories": [420, 380, 390], "duration": [50, 40, 45]}

df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df)
Locate Named Indexes
Use the named index in the loc attribute to return the specified row(s).

print(df.loc["day2"])
Pandas Read CSV
-A simple way to store big data sets is to use CSV files (comma separated files).
-CSV files contain plain text and are a well-known format that can be read by everyone, including
Pandas.
-The examples use a CSV file called '[Link]'.

Load the CSV into a DataFrame:

import pandas as pd

df = pd.read_csv('[Link]')

print(df.to_string())

Use to_string() to print the entire DataFrame.

Print the DataFrame without the to_string() method:

import pandas as pd

df = pd.read_csv('[Link]')

print(df)

If you have a large DataFrame with many rows, Pandas will only return the first 5 rows, and the last 5 rows.
max_rows
The number of rows returned is defined in Pandas option settings.
You can check your system's maximum rows with the pd.options.display.max_rows statement.

Check the number of maximum returned rows:

import pandas as pd
print(pd.options.display.max_rows)

In my system the number is 60, which means that if the DataFrame contains more than 60 rows, the print(df) statement will return only the headers
and the first and last 5 rows.
You can change the maximum rows number with the same statement.

Increase the maximum number of rows to display the entire DataFrame:

import pandas as pd
pd.options.display.max_rows = 9999
df = pd.read_csv('[Link]')
print(df)
Pandas Read JSON
-Big data sets are often stored, or extracted as JSON.
-JSON is plain text, but has the format of an object, and is well known in the world of programming,
including Pandas.
-In our examples we will be using a JSON file called '[Link]'.

Data: [Link]

If your JSON code is not in a file, but in a Python Dictionary, you can load it into a DataFrame directly:
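A minimal sketch of loading a dictionary directly into a DataFrame. The data below is hypothetical and only mimics the shape of a JSON object; the outer keys become columns and the inner keys become row labels.

```python
import pandas as pd

# Hypothetical workout data in JSON-like form.
data = {
    "Duration": {"0": 60, "1": 60, "2": 45},
    "Pulse": {"0": 110, "1": 117, "2": 103},
}

df = pd.DataFrame(data)
print(df)
```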
Pandas - Analyzing DataFrames
Viewing the Data
One of the most used methods for getting a quick overview of the DataFrame is the head() method.
The head() method returns the headers and a specified number of rows, starting from the top.

There is also a tail() method for viewing the last rows of the DataFrame.
The tail() method returns the headers and a specified number of rows, starting from the bottom.

Info About the Data

The DataFrame object has a method called info(), which gives you more information about the data set.
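The three methods can be tried on a small made-up DataFrame. The column names and values below are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data set with 10 rows.
df = pd.DataFrame({
    "Duration": list(range(10, 110, 10)),
    "Pulse": list(range(100, 110)),
})

first_rows = df.head(3)   # headers plus the first 3 rows
last_rows = df.tail(2)    # headers plus the last 2 rows
print(first_rows)
print(last_rows)

df.info()                 # prints column names, non-null counts and dtypes
```

Called without an argument, head() and tail() return 5 rows.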
Pandas - Cleaning Data
Data cleaning means fixing bad data in your data set.

Bad data could be:

•Empty cells

•Data in wrong format

•Wrong data

•Duplicates
The data set contains some empty cells ("Date" in row 22, and "Calories" in row 18 and 28).

The data set contains wrong format ("Date" in row 26).

Replace empty values in the "Calories" column with the number 130:

df["Calories"] = df["Calories"].fillna(130)

Replace Using Mean, Median, or Mode
A common way to replace empty cells, is to calculate the mean, median or mode value of the column.
Pandas uses the mean() median() and mode() methods to calculate the respective values for a specified column:

Calculate the MEAN, and replace any empty values with it:
import pandas as pd

df = pd.read_csv('[Link]')

x = df["Calories"].mean()

df["Calories"] = df["Calories"].fillna(x)

Calculate the MEDIAN, and replace any empty values with it:
import pandas as pd

df = pd.read_csv('[Link]')

x = df["Calories"].median()

df["Calories"] = df["Calories"].fillna(x)

Calculate the MODE, and replace any empty values
with it:
import pandas as pd

df = pd.read_csv('[Link]')

x = df["Calories"].mode()[0]

df["Calories"] = df["Calories"].fillna(x)

2. Pandas - Cleaning Data of Wrong Format

Data of Wrong Format

Cells with data of wrong format can make it difficult, or even impossible, to analyze data.
To fix it, you have two options: remove the rows, or convert all cells in the columns into the same format.

Convert Into a Correct Format

In our DataFrame, we have two cells with the wrong format. Check out rows 22 and 26: the 'Date' column should be a string that
represents a date.
Let's try to convert all cells in the 'Date' column into dates.
Pandas has a to_datetime() method for this:

Convert to date:
import pandas as pd

df = pd.read_csv('[Link]')

df['Date'] = pd.to_datetime(df['Date'])

print(df.to_string())

The date in row 26 was fixed, but the empty date in row 22 got a NaT (Not a Time) value, in other words an empty value. One way
to deal with empty values is simply removing the entire row.
Removing Rows
The result from the converting in the example above gave us a NaT value, which can be handled as a NULL
value, and we can remove the row by using the dropna() method.

Remove rows with a NULL value in the "Date" column:

df.dropna(subset=['Date'], inplace = True)

3. Pandas - Fixing Wrong Data
Wrong Data
-"Wrong data" does not have to be "empty cells" or "wrong format", it can just be wrong, like if someone registered "199" instead of "1.99".

-Sometimes you can spot wrong data by looking at the data set, because you have an expectation of what it should be.

-If you take a look at our data set, you can see that in row 7, the duration is 450, but for all the other rows the duration is between 30 and 60.

-It doesn't have to be wrong, but taking into consideration that this is the data set of someone's workout sessions, we conclude that this person did not work out for 450 minutes.
How can we fix wrong values, like the one for "Duration" in row 7?

Replacing Values
-One way to fix wrong values is to replace them with something else.
-In our example, it is most likely a typo, and the value should be "45" instead of "450", and we could
just insert "45" in row 7:

Set "Duration" = 45 in row 7:

df.loc[7, 'Duration'] = 45

Loop through all values in the "Duration" column.

If the value is higher than 120, set it to 120:
for x in df.index:
    if df.loc[x, "Duration"] > 120:
        df.loc[x, "Duration"] = 120
Removing Rows
-Another way of handling wrong data is to remove the rows that contain wrong data.
-This way you do not have to find out what to replace them with, and there is a good chance you do
not need them for your analyses.

Delete rows where "Duration" is higher than 120:

for x in df.index:
    if df.loc[x, "Duration"] > 120:
        df.drop(x, inplace = True)
4. Pandas - Removing Duplicates

Discovering Duplicates
Duplicate rows are rows that have been registered more than one time.

By taking a look at our test data set, we can assume that row 11 and 12 are duplicates.

To discover duplicates, we can use the duplicated() method.

The duplicated() method returns a Boolean value for each row:

Returns True for every row that is a duplicate, otherwise False:

print(df.duplicated())
Removing Duplicates
To remove duplicates, use the drop_duplicates() method

Remove all duplicates:

df.drop_duplicates(inplace = True)
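The cleaning steps in this chapter can be tried end to end on a small made-up data set. Everything below is a sketch under stated assumptions: the DataFrame is hypothetical, and the clip() method and the Boolean filter are vectorised alternatives to the row-by-row loops shown earlier, not the loops themselves.

```python
import pandas as pd

# Hypothetical workout log: row 2 has an implausible duration,
# row 3 is missing its calories, and row 5 repeats row 4.
df = pd.DataFrame({
    "Duration": [60, 45, 450, 30, 60, 60],
    "Calories": [409.1, 282.4, 195.1, None, 300.0, 300.0],
})

print(df.duplicated())     # True only for the repeated row
df = df.drop_duplicates()  # keep the first copy of each row

# Cap every duration above 120 at 120 ...
capped = df["Duration"].clip(upper=120)

# ... or keep only the rows with a plausible duration.
kept = df[df["Duration"] <= 120]

print(capped.tolist())
print(kept)
```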
Important Methods Pandas Packages
NumPy
What is NumPy?
NumPy is a Python library used for working with arrays.

It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.

It is mainly used for multidimensional (2D, 3D) arrays and is often used as an alternative to MATLAB.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.

NumPy stands for Numerical Python.

Why Use NumPy?

In Python we have lists that serve the purpose of arrays, but they are slow to process.

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.

The array object in NumPy is called ndarray, it provides a lot of supporting functions that make working with ndarray very easy.

Arrays are very frequently used in data science, where speed and resources are very important.
Why is NumPy Faster Than Lists?
❖ NumPy arrays are stored at one continuous place in memory, unlike lists, so processes can access and manipulate them very efficiently.

❖ This behavior is called locality of reference in computer science.

❖ This is the main reason why NumPy is faster than lists. It is also optimized to work with the latest CPU architectures.
NumPy Creating Arrays
Create a NumPy ndarray Object
NumPy is used to work with arrays. The array object in NumPy is called ndarray.

We can create a NumPy ndarray object by using the array() function.

Example

import numpy as np

arr = np.array((1, 2, 3, 4, 5))

print(arr)
A Python array is a data structure that stores a collection of elements of the same data type.

In Python programming, arrays are handled by the "array" module.

If you create arrays using the array module, all elements of the array
must be of the same numeric type.

A Python list, by contrast, can hold different types of data, like integers, floats,
lists, tuples, strings, etc.
Dimensions in Arrays
A dimension in arrays is one level of array depth (nested arrays).

0-D Arrays
0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.

Example
Create a 0-D array with value 42
import numpy as np

arr = np.array(42)

print(arr)
1-D Arrays
An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.
These are the most common and basic arrays.
Example

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)
2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array.
These are often used to represent matrices or 2nd-order tensors.

Example
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)
3-D Arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.
These are often used to represent 3rd-order tensors.

import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(arr)
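To check how many dimensions an array has, the ndim attribute of an ndarray can be used, and shape gives the size along each dimension. The sketch below rebuilds the four arrays from the examples above under illustrative names.

```python
import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

# ndim is the nesting depth; shape is the size along each dimension
for arr in (a, b, c, d):
    print(arr.ndim, arr.shape)
# 0 ()
# 1 (5,)
# 2 (2, 3)
# 3 (2, 2, 3)
```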
Scikit-learn
What is scikit-learn used for?
Scikit-learn is probably the most useful library for machine learning in Python. The sklearn library contains a lot of
efficient tools for machine learning and statistical modeling including classification, regression, clustering and
dimensionality reduction.

Important features of scikit-learn:

• Simple and efficient tools for data mining and data analysis. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means, etc.

• Accessible to everybody and reusable in various contexts.

• Built on top of NumPy, SciPy, and matplotlib.
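As a rough sketch of what a classification task looks like in scikit-learn, the example below fits a decision tree on four made-up points. The data, the choice of DecisionTreeClassifier, and the variable names are assumptions for illustration and are not taken from the slides.

```python
from sklearn.tree import DecisionTreeClassifier

# Toy training data (made up): one feature, two classes
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

# Fit the model, then predict the class of two new points
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)
predictions = model.predict([[0.2], [2.8]])

print(predictions)  # [0 1]
```

Most scikit-learn estimators follow this same fit-then-predict pattern.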

DATA TYPES…
Python Data Types
• Data types are the classification or categorization of data items.
• A data type represents the kind of value and tells what operations can be performed on a particular piece of data.
• Since everything is an object in Python programming, data types are actually classes and variables are instances (objects) of these classes.
• A variable can hold different types of values.
• For example, a person's name must be stored as a string whereas their ID must be stored as an integer.
• Python provides various standard data types, each of which defines how its values are stored.
Standard Data Types
1. Python Numeric Data Type
• Python numeric data types are used to hold numeric values:
• int - holds signed integers of unlimited length.
• long - holds long integers (exists in Python 2.x; removed in Python 3.x, where int covers it).
• float - holds floating-point numbers and is accurate to about 15 significant decimal digits.
• complex - holds complex numbers.
• In Python, unlike C or C++, we need not declare a data type when creating a variable.
• We can simply assign a value to a variable.
• If we want to see what type of numeric value a variable is holding right now, we can use type().
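A short sketch of type() applied to the numeric types listed above, using Python 3. The variable names and values are chosen only for illustration.

```python
count = 10          # int
price = 15.20       # float
z = 3 + 4j          # complex
big = 2 ** 100      # still an int: Python 3 integers have no fixed size limit

print(type(count))  # <class 'int'>
print(type(price))  # <class 'float'>
print(type(z))      # <class 'complex'>
print(type(big))    # <class 'int'>
```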
Examples
int      long                     float        complex
10       51924361L                0.0          3.14j
100      -0x19323L                15.20        45.j
-786     0122L                    -21.9        9.322e-36j
080      0xDEFABCECBDAECBFBAEl    32.3+e18     .876j
-0490    535633629843L            -90.         -.6545+0J
-0x260   -052318172735L           -32.54e100   3e+26J
0x69     -4721885298529L          70.2-E12     4.53e-7j

Note: the trailing L in the long column is Python 2 syntax. In Python 3 these values are plain int.

Boolean
• The Boolean type provides two built-in values, True and False.
• These values are used to determine whether a given statement is true or false.
• It is denoted by the class bool.
• Any non-zero number evaluates as True, whereas 0 evaluates as False. A non-empty string such as 'T' or 'F' also evaluates as True.
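A brief sketch of how Python evaluates values as Boolean, using the built-in bool(). The variable names are illustrative.

```python
is_adult = 20 >= 18
print(is_adult)        # True
print(type(is_adult))  # <class 'bool'>

# bool() shows how a value is treated in a condition
zero_is_true = bool(0)        # False
seven_is_true = bool(7)       # True
empty_is_true = bool("")      # False: an empty string is falsy
letter_f_is_true = bool("F")  # True: any non-empty string is truthy

print(zero_is_true, seven_is_true, empty_is_true, letter_f_is_true)
```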

Set
• A set is an unordered collection of unique items.
• A set is defined by values separated by commas inside braces { }.
• It is iterable, mutable (can be modified after creation), and has unique elements.
• Because the order of the elements is undefined, a set may return its elements in a different sequence than they were added.
• It can contain values of various types.
Set – unique values
• We can perform set operations such as union and intersection on two sets.
• Sets have unique values.
• They eliminate duplicates.
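These properties can be sketched as follows. The set contents and variable names are made up for illustration.

```python
a = {1, 2, 2, 3}   # the duplicate 2 is dropped
b = {3, 4, 5}

union = a | b         # items in either set
intersection = a & b  # items in both sets

# Converting a list to a set removes its duplicates
unique_words = set(["red", "blue", "red", "green", "blue"])

print(len(a))             # 3
print(union)
print(intersection)       # {3}
print(len(unique_words))  # 3
```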
Set - indexing

• Since sets are unordered collections, indexing has no meaning.

• Hence, the indexing and slicing operator [] does not work.
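A minimal sketch of what happens when a set is indexed, with two common alternatives: a membership test, and sorting into a list first. The names are illustrative.

```python
fruits = {"apple", "banana", "cherry"}

# Indexing a set raises TypeError
try:
    fruits[0]
except TypeError as exc:
    message = str(exc)
    print("TypeError:", message)

# Membership tests work on sets
has_banana = "banana" in fruits

# To index, convert to an ordered sequence first
ordered = sorted(fruits)
first = ordered[0]
print(has_banana, first)  # True apple
```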
5. Sequence types
i. STRINGS

• A string can be defined as a sequence of characters enclosed in quotation marks.
• In Python, we can use single, double, or triple quotes to define a string.
• String handling in Python is straightforward, since Python provides built-in functions and operators to perform operations on strings.
• The operator + is used to concatenate two strings: the operation "hello" + " python" returns "hello python".
• The operator * is known as the repetition operator: the operation "Python" * 2 returns 'PythonPython'.
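The quoting styles and the two operators can be sketched as below. The variable names are illustrative, and the join() line is an added suggestion for getting a space between repetitions.

```python
single = 'hello'
double = "hello"
triple = """hello
python"""          # triple quotes allow a string to span lines

greeting = "hello" + " python"  # concatenation
repeated = "Python" * 2         # repetition adds no separator
spaced = " ".join(["Python"] * 2)

print(single == double)  # True
print(greeting)          # hello python
print(repeated)          # PythonPython
print(spaced)            # Python Python
```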
Some Methods
strip()
✓The strip() method removes any whitespace from the beginning or the end:
>>> a = " Hello"
>>> print(a.strip())
len()
✓The len() function returns the length of a string:
>>> a = "Hello, World!"
>>> print(len(a))

lower()
✓The lower() method returns the string in lower case:
>>> a = "Hello, World!"
>>> print(a.lower())

replace()
• The replace() method replaces a string with another string:
>>> print(a.replace("H", "J"))
split()
✓The split() method splits the string into substrings if it finds
instances of the separator:
>>> a = "Hello, World!"
>>> print(a.split(","))
# returns ['Hello', ' World!']
Machine Learning
Data everywhere!
1. Google: processes 24 petabytes of data per day.

2. Facebook: 10 million photos uploaded every hour.

3. YouTube: 1 hour of video uploaded every second.

4. Twitter: 400 million tweets per day.

5. Astronomy: Satellite data is in hundreds of PB.

6. ...

7. "By 2020 the digital universe will reach 44 zettabytes..."

That's 44 trillion gigabytes!

Data types
• Data comes in different sizes and also flavors (types):
• Texts
• Numbers
• Clickstreams
• Graphs
• Tables
• Images
• Transactions
• Videos
• Some or all of the above!
Smile, we are 'DATAFIED'!

• Wherever we go, we are "datafied".

• Smartphones are tracking our locations.

• We leave a data trail in our web browsing.

• Interaction in social networks.

• Privacy is an important issue in Data Science.
The Data Science process
Why Machine Learning
If we stored the data generated in a day on Blu-ray discs and stacked them up, the stack would be as tall as four Eiffel Towers. Machine learning helps analyze this data easily and quickly.

Machine learning
Purpose of Machine Learning
Machine learning is a great tool to analyze data, find hidden data patterns and relationships, and extract information to enable information-driven decisions and provide insights.

With data, machine learning helps to:
• Identify patterns and relationships
• Gain insights into unknown data
• Take information-driven decisions

Applications of ML
• Spam filtering

• Credit card fraud detection

• Digit recognition on checks, zip codes

• Detecting faces in images

• MRI image analysis

• Recommendation system

• Search engines

• Handwriting recognition

• Scene classification

• etc...
Raw Mango vs. Ripe Mango
SUPERVISED LEARNING
UNSUPERVISED LEARNING
TYPES OF SUPERVISED LEARNING
BINARY CLASSIFICATION
MULTICLASS CLASSIFICATION
REGRESSION
