Introduction to NumPy in Python

The document discusses NumPy, a Python library used for working with multidimensional arrays and matrices for scientific computing. It notes that NumPy provides powerful N-dimensional array objects and tools for working with these arrays. It then covers key aspects of NumPy including how to create arrays, perform operations on arrays, and use NumPy for tasks like linear algebra.

Uploaded by

Arasy Dafa Sulistya Kurniawan

2/23/2021 Pendahuluan Python.ipynb - Colaboratory

1. NumPy in Python

What is NumPy?

NumPy is a general-purpose array-processing package. It provides a high-performance
multidimensional array object, and tools for working with these arrays.

It is the fundamental package for scientific computing with Python. It contains various features
including these important ones:

A powerful N-dimensional array object

Sophisticated (broadcasting) functions
Tools for integrating C/C++ and Fortran code
Useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional
container of generic data. Arbitrary data types can be defined using NumPy, which allows NumPy to
seamlessly and speedily integrate with a wide variety of databases.

1. Arrays in NumPy: NumPy’s main object is the homogeneous multidimensional array.

It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative
integers. In NumPy, dimensions are called axes, and the number of axes is the rank. NumPy’s array
class is called ndarray. It is also known by the alias array.

# Python program to demonstrate
# basic array characteristics
import numpy as np

# Creating array object
arr = np.array([[1, 2, 3],
                [4, 2, 5]])

# Printing type of arr object
print("Array is of type: ", type(arr))

# Printing array dimensions (axes)
print("No. of dimensions: ", arr.ndim)

# Printing shape of array
print("Shape of array: ", arr.shape)

# Printing size (total number of elements) of array
print("Size of array: ", arr.size)

# Printing type of elements in array
print("Array stores elements of type: ", arr.dtype)

Array is of type: <class 'numpy.ndarray'>
No. of dimensions: 2
Shape of array: (2, 3)
Size of array: 6
Array stores elements of type: int64

2. Array creation: There are various ways to create arrays in NumPy.

1. For example, you can create an array from a regular Python list or tuple using the array
function. The type of the resulting array is deduced from the type of the elements in the
sequences.
2. Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy
offers several functions to create arrays with initial placeholder content. These minimize the
necessity of growing arrays, an expensive operation. For example: np.zeros, np.ones, np.full,
np.empty, etc.
3. To create sequences of numbers, NumPy provides a function analogous to range that returns
arrays instead of lists.
4. arange: returns evenly spaced values within a given interval. The step size is specified.
5. linspace: returns evenly spaced values within a given interval. The number of elements (num)
is specified.
6. Reshaping array: We can use the reshape method to reshape an array. Consider an array with
shape (a1, a2, a3, …, aN). We can reshape and convert it into another array with shape (b1, b2,
b3, …, bM). The only required condition is: a1 x a2 x a3 … x aN = b1 x b2 x b3 … x bM (i.e. the
original size of the array remains unchanged).
7. Flatten array: We can use the flatten method to get a copy of the array collapsed into one
dimension. It accepts an order argument. The default value is ‘C’ (for row-major order). Use ‘F’
for column-major order.

# Python program to demonstrate
# array creation techniques
import numpy as np

# Creating array from list with type float
a = np.array([[1, 2, 4], [5, 8, 7]], dtype='float')
print("Array created using passed list:\n", a)

# Creating array from tuple
b = np.array((1, 3, 2))
print("\nArray created using passed tuple:\n", b)

# Creating a 3X4 array with all zeros
c = np.zeros((3, 4))
print("\nAn array initialized with all zeros:\n", c)

# Create a constant value array of complex type
d = np.full((3, 3), 6, dtype='complex')
print("\nAn array initialized with all 6s. "
      "Array type is complex:\n", d)

# Create an array with random values
e = np.random.random((2, 2))
print("\nA random array:\n", e)

# Create a sequence of integers
# from 0 to 30 with steps of 5
f = np.arange(0, 30, 5)
print("\nA sequential array with steps of 5:\n", f)

# Create a sequence of 10 values in range 0 to 5
g = np.linspace(0, 5, 10)
print("\nA sequential array with 10 values between "
      "0 and 5:\n", g)

# Reshaping 3X4 array to 2X2X3 array
arr = np.array([[1, 2, 3, 4],
                [5, 2, 4, 2],
                [1, 2, 0, 1]])

newarr = arr.reshape(2, 2, 3)

print("\nOriginal array:\n", arr)
print("Reshaped array:\n", newarr)

# Flatten array
arr = np.array([[1, 2, 3], [4, 5, 6]])
flarr = arr.flatten()

print("\nOriginal array:\n", arr)
print("Flattened array:\n", flarr)

Array created using passed list:
[[1. 2. 4.]
[5. 8. 7.]]

Array created using passed tuple:
[1 3 2]

An array initialized with all zeros:
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]

An array initialized with all 6s. Array type is complex:
[[6.+0.j 6.+0.j 6.+0.j]
[6.+0.j 6.+0.j 6.+0.j]
[6.+0.j 6.+0.j 6.+0.j]]

A random array:
[[0.70864301 0.1445599 ]
[0.62385575 0.05495546]]

A sequential array with steps of 5:
[ 0 5 10 15 20 25]

A sequential array with 10 values between 0 and 5:
[0. 0.55555556 1.11111111 1.66666667 2.22222222 2.77777778
3.33333333 3.88888889 4.44444444 5. ]

Original array:
[[1 2 3 4]
[5 2 4 2]
[1 2 0 1]]
Reshaped array:
[[[1 2 3]
[4 5 2]]

[[4 2 1]
[2 0 1]]]

Original array:
[[1 2 3]
[4 5 6]]
Flattened array:
[1 2 3 4 5 6]
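
The size condition for reshape described above can be checked directly. The following is a minimal sketch, not part of the original notebook; the names `inferred`, `mismatch_raised` and `grid` are illustrative. It assumes only standard NumPy behaviour: passing -1 lets NumPy infer one dimension, and a shape whose product differs from the array size raises ValueError.

```python
import numpy as np

arr = np.arange(12)              # 12 elements: 0..11

# -1 asks NumPy to infer that dimension from the total size (12 / 3 = 4)
inferred = arr.reshape(3, -1)
print(inferred.shape)            # (3, 4)

# A shape whose product differs from arr.size is rejected
try:
    arr.reshape(5, 3)            # 5 * 3 = 15, but arr.size is 12
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)           # True

# flatten() with order='F' reads the array column by column
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.flatten(order='F'))   # [1 4 2 5 3 6]
```
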

Operations on single array: We can use overloaded arithmetic operators to do element-wise
operations on an array to create a new array. In case of +=, -=, *= operators, the existing array is
modified.

# Python program to demonstrate
# basic operations on single array
import numpy as np

a = np.array([1, 2, 5, 3])

# add 1 to every element
print("Adding 1 to every element:", a+1)

# subtract 3 from each element
print("Subtracting 3 from each element:", a-3)

# multiply each element by 10
print("Multiplying each element by 10:", a*10)

# square each element
print("Squaring each element:", a**2)

# modify existing array
a *= 2
print("Doubled each element of original array:", a)

# transpose of array
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]])

print("\nOriginal array:\n", a)
print("Transpose of array:\n", a.T)

Adding 1 to every element: [2 3 6 4]

Subtracting 3 from each element: [-2 -1 2 0]
Multiplying each element by 10: [10 20 50 30]
Squaring each element: [ 1 4 25 9]
Doubled each element of original array: [ 2 4 10 6]

Original array:
[[1 2 3]
[3 4 5]
[9 6 0]]
Transpose of array:
[[1 3 9]
[2 4 6]
[3 5 0]]
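
The difference between an operator that creates a new array and one that modifies the existing array can be made visible with a second name bound to the same array. This is a small sketch under that assumption; `alias` and `b` are illustrative names, not from the original notebook.

```python
import numpy as np

a = np.array([1, 2, 5, 3])
alias = a                  # second name for the same array object

b = a + 1                  # builds a new array; a is left untouched
print(b is a)              # False
print(a)                   # [1 2 5 3]

a *= 2                     # in-place: the existing array is modified
print(alias)               # [ 2  4 10  6]
print(alias is a)          # True
```
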

Binary operators: These operations apply to arrays elementwise and a new array is created. You can
use all basic arithmetic operators like +, -, /, *, etc. In case of +=, -=, *= operators, the existing
array is modified.

# Python program to demonstrate
# binary operators in Numpy
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[4, 3],
              [2, 1]])

# add arrays
print("Array sum:\n", a + b)

# multiply arrays (elementwise multiplication)
print("Array multiplication:\n", a*b)

# matrix multiplication
print("Matrix multiplication:\n", a.dot(b))

Array sum:
[[5 5]
[5 5]]
Array multiplication:
[[4 6]
[6 4]]
Matrix multiplication:
[[ 8 5]
[20 13]]
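
As a side note, the `@` operator gives the same matrix product as `dot` for 2-D arrays, while `*` stays elementwise. The sketch below reuses the two arrays from the example; `elementwise` and `matrix_product` are illustrative names.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[4, 3],
              [2, 1]])

elementwise = a * b         # multiplies matching entries
matrix_product = a @ b      # row-by-column matrix product

print(elementwise)
print(matrix_product)
print(np.array_equal(matrix_product, a.dot(b)))   # True
```
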

# Python program to demonstrate sorting in numpy
import numpy as np

a = np.array([[1, 4, 2],
              [3, 4, 6],
              [0, -1, 5]])

# sorted array
print("Array elements in sorted order:\n",
      np.sort(a, axis=None))

# sort array row-wise
print("Row-wise sorted array:\n",
      np.sort(a, axis=1))

# specify sort algorithm
print("Column wise sort by applying merge-sort:\n",
      np.sort(a, axis=0, kind='mergesort'))

# Example to show sorting of structured array
# set alias names for dtypes
dtypes = [('name', 'S10'), ('grad_year', int), ('cgpa', float)]

# Values to be put in array
values = [('Hrithik', 2009, 8.5), ('Ajay', 2008, 8.7),
          ('Pankaj', 2008, 7.9), ('Aakash', 2009, 9.0)]

# Creating array
arr = np.array(values, dtype=dtypes)
print("\nArray sorted by names:\n",
      np.sort(arr, order='name'))

print("Array sorted by graduation year and then cgpa:\n",
      np.sort(arr, order=['grad_year', 'cgpa']))

Array elements in sorted order:
[-1 0 1 2 3 4 4 5 6]
Row-wise sorted array:
[[ 1 2 4]
[ 3 4 6]
[-1 0 5]]
Column wise sort by applying merge-sort:
[[ 0 -1 2]
[ 1 4 5]
[ 3 4 6]]

Array sorted by names:
[(b'Aakash', 2009, 9. ) (b'Ajay', 2008, 8.7) (b'Hrithik', 2009, 8.5)
(b'Pankaj', 2008, 7.9)]
Array sorted by graduation year and then cgpa:
[(b'Pankaj', 2008, 7.9) (b'Ajay', 2008, 8.7) (b'Hrithik', 2009, 8.5)
(b'Aakash', 2009, 9. )]

2. Difference between Pandas VS NumPy

Pandas: It is an open-source, BSD-licensed library written in the Python language. Pandas provides
high-performance, easy-to-use data structures and data analysis tools for manipulating numeric
data and time series. Pandas is built on the NumPy library and written in languages like Python,
Cython, and C. In pandas, we can import data from various file formats like JSON, SQL, Microsoft
Excel, etc.

# Importing pandas library
import pandas as pd

# Creating and initializing a nested list
age = [['Aman', 95.5, "Male"], ['Sunny', 65.7, "Female"],
       ['Monty', 85.1, "Male"], ['toni', 75.4, "Male"]]

# Creating a pandas dataframe
df = pd.DataFrame(age, columns=['Name', 'Marks', 'Gender'])

# Printing dataframe
df

    Name  Marks  Gender
0   Aman   95.5    Male
1  Sunny   65.7  Female
2  Monty   85.1    Male
3   toni   75.4    Male

Numpy: It is the fundamental library for scientific computing in Python. It provides
high-performance multidimensional arrays and tools to deal with them. A NumPy array is a grid of
values (of the same type) indexed by a tuple of non-negative integers. NumPy arrays are fast,
easy to understand, and let users perform calculations across entire arrays.

# Importing Numpy package
import numpy as np

# Creating a 2-D (3x3) numpy array using np.array()
org_array = np.array([[23, 46, 85],
                      [43, 56, 99],
                      [11, 34, 55]])

# Printing the Numpy array
print(org_array)

[[23 46 85]
[43 56 99]
[11 34 55]]

Numpy ndarray.resize()

With the help of the ndarray.resize() method, we can change the size of an array in place. The array
can have any shape; to resize it we only need to pass the new shape, e.g. (2, 2) or (2, 3). If the new
shape needs more values than the array holds, NumPy fills the missing positions with zeros.

Example #1:
In this example we can see that with the help of the .resize() method, we have changed the shape of
an array from a 1-D array of 6 elements to 2×3.

# importing the python module numpy
import numpy as np

# Making an array
gfg = np.array([1, 2, 3, 4, 5, 6])

# Resize the array in place (the change is permanent)
gfg.resize(2, 3)

print(gfg)

[[1 2 3]
[4 5 6]]

Example #2:
In this example, we resize the array to a shape that needs more values than the array contains.
NumPy handles this situation by filling the positions that have no value with zeros.

# importing the python module numpy
import numpy as np

# Making an array
ga = np.array([1, 2, 3, 4, 5, 6])

# Required values 12, existing values 6
ga.resize(3, 4)

print(ga)

[[1 2 3 4]
[5 6 0 0]
[0 0 0 0]]
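
The zero padding shown above is specific to the in-place `resize` method of an array. The module-level function `np.resize` behaves differently: it returns a new array and repeats the existing values. The following sketch contrasts the two; the names `base`, `repeated` and `padded` are illustrative, and `refcheck=False` is passed here only as a precaution, on the assumption that the reference check can otherwise raise ValueError in some interactive sessions.

```python
import numpy as np

base = np.array([1, 2, 3, 4, 5, 6])

# np.resize (function): returns a NEW array, repeating the data to fill it
repeated = np.resize(base, (3, 4))
print(repeated)
# [[1 2 3 4]
#  [5 6 1 2]
#  [3 4 5 6]]

# ndarray.resize (method): changes the array in place, padding with zeros
padded = base.copy()
padded.resize((3, 4), refcheck=False)
print(padded)
# [[1 2 3 4]
#  [5 6 0 0]
#  [0 0 0 0]]

print(base)   # unchanged: [1 2 3 4 5 6]
```
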

Reshape() method in Numpy

Both the ndarray.reshape() and ndarray.resize() methods are used to change the shape of a NumPy
array. The difference between them is that reshape() does not change the original array but
only returns the reshaped array, whereas resize() returns nothing and directly changes
the original array.

Example 1: Using reshape()

# importing the module
import numpy as np

# creating an array
A = np.array([1, 2, 3, 4, 5, 6])
print("Original array:")
display(A)

# using reshape()
print("Changed array")
display(A.reshape(2, 3))

print("Original array:")
display(A)

Original array:
array([1, 2, 3, 4, 5, 6])
Changed array
array([[1, 2, 3],
       [4, 5, 6]])
Original array:
array([1, 2, 3, 4, 5, 6])

Example 2: Using resize()

# importing the module
import numpy as np

# creating an array
Aa = np.array([1, 2, 3, 4, 5, 6])
print("Original array:")
display(Aa)

# using resize()
print("Changed array")
# resize() returns None, so None is displayed here
display(Aa.resize(2, 3))

print("Original array:")
display(Aa)

Original array:
array([1, 2, 3, 4, 5, 6])
Changed array
None
Original array:
array([[1, 2, 3],
[4, 5, 6]])
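
One detail worth keeping in mind: although reshape() leaves the shape of the original array alone, the array it returns is usually a view on the same data, so writing through it also changes the original values. A minimal sketch, assuming a contiguous array as in the example above (`reshaped` and `independent` are illustrative names):

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5, 6])
reshaped = A.reshape(2, 3)

# For a contiguous array like this one, reshape returns a view
print(np.shares_memory(A, reshaped))   # True

reshaped[0, 0] = 99                    # write through the view
print(A)                               # [99  2  3  4  5  6]
print(A.shape)                         # (6,)  shape of the original is unchanged

# Take an explicit copy when the result must be independent
independent = A.reshape(2, 3).copy()
independent[0, 0] = -1
print(A[0])                            # 99
```
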

# import the important module in python
import numpy as np

# make a matrix with numpy
B = np.matrix('[1, 2; 4, 5; 7, 8]')

print(B)

# applying matrix.reshape() method
new = B.reshape((2, 3))

print(new)

[[1 2]
[4 5]
[7 8]]
[[1 2 4]
[5 7 8]]


ndarray.transpose()
With the help of the transpose() method of a NumPy array, we can transpose an array in one line.
It transposes 2-D arrays; on the other hand, it has no effect on 1-D arrays.

# importing python module named numpy
import numpy as np

# making a 3x3 array
Aa = np.array([[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]])

# before transpose
print(Aa, end='\n\n')

# after transpose
print(Aa.transpose())

[[1 2 3]
[4 5 6]
[7 8 9]]

[[1 4 7]
[2 5 8]
[3 6 9]]

# importing python module named numpy
import numpy as np

# making a 3x2 array
Ab = np.array([[1, 2],
               [4, 5],
               [7, 8]])

# before transpose
print(Ab, end='\n\n')

# after transpose (axes given explicitly: axis 1 first, then axis 0)
print(Ab.transpose(1, 0))

[[1 2]
[4 5]
[7 8]]

[[1 4 7]
[2 5 8]]
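
The remark that transposing has no effect on 1-D arrays can be verified, together with one common way to obtain a column vector instead. This is a sketch with illustrative names (`v`, `column`, `row`), not code from the original notebook.

```python
import numpy as np

v = np.array([1, 2, 3])

# Transposing a 1-D array returns the same shape
print(v.transpose().shape)      # (3,)

# To get a column, add a second axis first
column = v.reshape(-1, 1)       # shape (3, 1)
row = v[np.newaxis, :]          # shape (1, 3)

print(column.shape, row.shape)
print(column.T.shape)           # (1, 3): now the transpose does swap the axes
```
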


3. Convert NumPy Array to Pandas DataFrame

Step 1: Create a NumPy Array

import numpy as np

my_array = np.array([[11,22,33],[44,55,66]])

print(my_array)
print(type(my_array))

[[11 22 33]
[44 55 66]]
<class 'numpy.ndarray'>

Step 2: Convert the NumPy Array to Pandas DataFrame

import numpy as np
import pandas as pd

my_array = np.array([[11,22,33],[44,55,66]])

df = pd.DataFrame(my_array, columns = ['Column_A','Column_B','Column_C'])

print(df)
print(type(df))

   Column_A  Column_B  Column_C
0        11        22        33
1        44        55        66
<class 'pandas.core.frame.DataFrame'>

Step 3 (optional): Add an Index to the DataFrame

What if you’d like to add an index to the DataFrame?

For instance, let’s add the following index to the DataFrame

index = ['Item_1', 'Item_2']

So here is the complete code to convert the array to a DataFrame with an index:

import numpy as np
import pandas as pd

my_array = np.array([[11,22,33],[44,55,66]])

df = pd.DataFrame(my_array, columns = ['Column_A','Column_B','Column_C'], index = ['Item_1', 'Item_2'])

print(df)
print(type(df))

        Column_A  Column_B  Column_C
Item_1        11        22        33
Item_2        44        55        66
<class 'pandas.core.frame.DataFrame'>
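
Going the other way is also possible. The sketch below is an addition, not part of the original notebook: it converts the DataFrame back to a NumPy array with `to_numpy()` and checks that the values survive the round trip. The name `round_trip` is illustrative.

```python
import numpy as np
import pandas as pd

my_array = np.array([[11, 22, 33], [44, 55, 66]])

df = pd.DataFrame(my_array,
                  columns=['Column_A', 'Column_B', 'Column_C'],
                  index=['Item_1', 'Item_2'])

# Back to a plain NumPy array (index and column labels are dropped)
round_trip = df.to_numpy()

print(type(round_trip))
print(np.array_equal(round_trip, my_array))   # True
print(list(df.index))                         # ['Item_1', 'Item_2']
```
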

4. Array Contains a Mix of Strings and Numeric Data

import numpy as np

my_array = np.array([['Jon',25,1995,2016],['Maria',47,1973,2000],['Bill',38,1982,2005]], dtype=object)

print(my_array)
print(type(my_array))
print(my_array.dtype)

[['Jon' 25 1995 2016]
['Maria' 47 1973 2000]
['Bill' 38 1982 2005]]
<class 'numpy.ndarray'>
object

Use the following syntax to convert the NumPy array to a DataFrame:

import numpy as np
import pandas as pd

my_array = np.array([['Jon',25,1995,2016],['Maria',47,1973,2000],['Bill',38,1982,2005]], dtype=object)

df = pd.DataFrame(my_array, columns = ['Name','Age','Birth Year','Graduation Year'])

print(df)
print(type(df))

    Name Age Birth Year Graduation Year
0    Jon  25       1995            2016
1  Maria  47       1973            2000
2   Bill  38       1982            2005
<class 'pandas.core.frame.DataFrame'>


Let’s check the data types of all the columns in the new DataFrame by adding df.dtypes to the code:

import numpy as np
import pandas as pd

my_array = np.array([['Jon',25,1995,2016],['Maria',47,1973,2000],['Bill',38,1982,2005]], dtype=object)

df = pd.DataFrame(my_array, columns = ['Name','Age','Birth Year','Graduation Year'])

print(df)
print(type(df))
print(df.dtypes)

    Name Age Birth Year Graduation Year
0    Jon  25       1995            2016
1  Maria  47       1973            2000
2   Bill  38       1982            2005
<class 'pandas.core.frame.DataFrame'>
Name               object
Age                object
Birth Year         object
Graduation Year    object
dtype: object

Currently, all the columns in the DataFrame have the object dtype.

Suppose that you’d like to convert the last 3 columns in the DataFrame to integers.

To achieve this goal, you can use astype(int) as captured below:

import numpy as np
import pandas as pd

my_array = np.array([['Jon',25,1995,2016],['Maria',47,1973,2000],['Bill',38,1982,2005]])

df = pd.DataFrame(my_array, columns = ['Name','Age','Birth Year','Graduation Year'])

df['Age'] = df['Age'].astype(int)
df['Birth Year'] = df['Birth Year'].astype(int)
df['Graduation Year'] = df['Graduation Year'].astype(int)

print(df)
print(type(df))
print(df.dtypes)
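
The three separate astype(int) calls can also be written as a single call that takes a column-to-type mapping. This is an alternative sketch, not the approach used in the original notebook; it assumes the same input array as above, in which the mixed values are stored as strings.

```python
import numpy as np
import pandas as pd

# Mixed strings and numbers without dtype=object: NumPy stores everything as strings
my_array = np.array([['Jon', 25, 1995, 2016],
                     ['Maria', 47, 1973, 2000],
                     ['Bill', 38, 1982, 2005]])

df = pd.DataFrame(my_array, columns=['Name', 'Age', 'Birth Year', 'Graduation Year'])

# One astype call with a dict converts only the listed columns
df = df.astype({'Age': int, 'Birth Year': int, 'Graduation Year': int})

print(df.dtypes)
print(df['Age'].sum())   # 110, which only works once the column is numeric
```
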


How to Union Pandas DataFrames using Concat

You can union Pandas DataFrames using concat:

pd.concat([df1, df2])

Step 1: Create the first DataFrame

import pandas as pd

clients1 = {'clientFirstName': ['Jon','Maria','Bruce','Lili'],

'clientLastName': ['Smith','Lam','Jones','Chang'],
'country': ['US','Canada','Italy','China']
}

df1 = pd.DataFrame(clients1, columns= ['clientFirstName', 'clientLastName','country'])

print (df1)

clientFirstName clientLastName country

0 Jon Smith US
1 Maria Lam Canada
2 Bruce Jones Italy
3 Lili Chang China

Step 2: Create the second DataFrame

import pandas as pd

clients2 = {'clientFirstName': ['Bill','Jack','Elizabeth','Jenny'],

'clientLastName': ['Jackson','Green','Gross','Sing'],
'country': ['UK','Germany','Brazil','Japan']
}

df2 = pd.DataFrame(clients2, columns= ['clientFirstName', 'clientLastName','country'])

print (df2)

clientFirstName clientLastName country

0 Bill Jackson UK
1 Jack Green Germany
2 Elizabeth Gross Brazil
3 Jenny Sing Japan

Step 3: Union Pandas DataFrames using Concat

import pandas as pd

clients1 = {'clientFirstName': ['Jon','Maria','Bruce','Lili'],

'clientLastName': ['Smith','Lam','Jones','Chang'],
'country': ['US','Canada','Italy','China']
}

df1 = pd.DataFrame(clients1, columns= ['clientFirstName', 'clientLastName','country'])

clients2 = {'clientFirstName': ['Bill','Jack','Elizabeth','Jenny'],
            'clientLastName': ['Jackson','Green','Gross','Sing'],
            'country': ['UK','Germany','Brazil','Japan']
           }

df2 = pd.DataFrame(clients2, columns= ['clientFirstName', 'clientLastName','country'])

union = pd.concat([df1, df2])

print (union)

clientFirstName clientLastName country

0 Jon Smith US
1 Maria Lam Canada
2 Bruce Jones Italy
3 Lili Chang China
0 Bill Jackson UK
1 Jack Green Germany
2 Elizabeth Gross Brazil
3 Jenny Sing Japan

You may then choose to assign the index values in an incremental manner once you have concatenated
the two DataFrames.

To do so, simply set ignore_index=True within the pd.concat parentheses:

import pandas as pd

clients1 = {'clientFirstName': ['Jon','Maria','Bruce','Lili'],

'clientLastName': ['Smith','Lam','Jones','Chang'],
'country': ['US','Canada','Italy','China']
}

df1 = pd.DataFrame(clients1, columns= ['clientFirstName', 'clientLastName','country'])

clients2 = {'clientFirstName': ['Bill','Jack','Elizabeth','Jenny'],
            'clientLastName': ['Jackson','Green','Gross','Sing'],
            'country': ['UK','Germany','Brazil','Japan']
           }

df2 = pd.DataFrame(clients2, columns= ['clientFirstName', 'clientLastName','country'])

union = pd.concat([df1, df2], ignore_index=True)

print (union)

clientFirstName clientLastName country

0 Jon Smith US
1 Maria Lam Canada
2 Bruce Jones Italy
3 Lili Chang China
4 Bill Jackson UK
5 Jack Green Germany
6 Elizabeth Gross Brazil
7 Jenny Sing Japan
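
A third option, besides keeping the repeated index or renumbering it, is to label each source DataFrame with the `keys` argument of `pd.concat`. The sketch below uses shortened versions of the client tables; the labels 'first' and 'second' are hypothetical and chosen only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'clientFirstName': ['Jon', 'Maria'],
                    'country': ['US', 'Canada']})
df2 = pd.DataFrame({'clientFirstName': ['Bill', 'Jack'],
                    'country': ['UK', 'Germany']})

plain = pd.concat([df1, df2])                               # index repeats
renumbered = pd.concat([df1, df2], ignore_index=True)       # index restarts from 0
labelled = pd.concat([df1, df2], keys=['first', 'second'])  # hypothetical labels

print(list(plain.index))        # [0, 1, 0, 1]
print(list(renumbered.index))   # [0, 1, 2, 3]

# With keys, the rows of each source can be selected again by label
print(labelled.loc['second'])
```
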

DataFrame.loc

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
df

            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8

df.loc['viper']

max_speed 4
shield 5
Name: viper, dtype: int64

df.loc[['viper', 'sidewinder']]

            max_speed  shield
viper               4       5
sidewinder          7       8

df.loc['cobra', 'shield']

df.loc['cobra':'viper', 'max_speed']

cobra 1
viper 4
Name: max_speed, dtype: int64
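
Two properties of label-based selection are easy to miss: a label slice includes its end label, and .loc also accepts a boolean mask. The following sketch illustrates both with the same table; `fast` is an illustrative name and the comparison with `iloc` is added for contrast.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Single cell by row label and column label
print(df.loc['cobra', 'shield'])            # 2

# Label slices include BOTH endpoints ...
print(len(df.loc['cobra':'viper']))         # 2
# ... while positional slices with iloc exclude the end position
print(len(df.iloc[0:1]))                    # 1

# A boolean mask selects the rows where the condition holds
fast = df.loc[df['max_speed'] > 1]
print(list(fast.index))                     # ['viper', 'sidewinder']
```
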

IF condition in Pandas DataFrame

(1) IF condition – Set of numbers

import pandas as pd

numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10]}

df = pd.DataFrame(numbers,columns=['set_of_numbers'])

df.loc[df['set_of_numbers'] <= 4, 'equal_or_lower_than_4?'] = 'True'

df.loc[df['set_of_numbers'] > 4, 'equal_or_lower_than_4?'] = 'False'

print (df)

set_of_numbers equal_or_lower_than_4?
0 1 True
1 2 True
2 3 True
3 4 True
4 5 False
5 6 False
6 7 False
7 8 False

8 9 False
9 10 False

(2) IF condition – set of numbers and lambda

The same results as in case (1) can be obtained using a lambda, where the conditions are:

If the number is equal to or lower than 4, then assign the value of ‘True’. Otherwise, if the number
is greater than 4, then assign the value of ‘False’.

import pandas as pd

numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10]}

df = pd.DataFrame(numbers,columns=['set_of_numbers'])

df['equal_or_lower_than_4?'] = df['set_of_numbers'].apply(lambda x: 'True' if x <= 4 else 'False')

print (df)

set_of_numbers equal_or_lower_than_4?
0 1 True
1 2 True
2 3 True
3 4 True
4 5 False
5 6 False
6 7 False
7 8 False
8 9 False
9 10 False

(3) IF condition – strings

import pandas as pd

names = {'First_name': ['Jon','Bill','Maria','Emma']}

df = pd.DataFrame(names,columns=['First_name'])

df.loc[df['First_name'] == 'Bill', 'name_match'] = 'Match'

df.loc[df['First_name'] != 'Bill', 'name_match'] = 'Mismatch'

print (df)

First_name name_match
0 Jon Mismatch
1 Bill Match
2 Maria Mismatch
3 Emma Mismatch


(4) IF condition – strings and lambda

import pandas as pd

names = {'First_name': ['Jon','Bill','Maria','Emma']}

df = pd.DataFrame(names,columns=['First_name'])

df['name_match'] = df['First_name'].apply(lambda x: 'Match' if x == 'Bill' else 'Mismatch')

print (df)

First_name name_match
0 Jon Mismatch
1 Bill Match
2 Maria Mismatch
3 Emma Mismatch

(5) IF condition with OR

import pandas as pd

names = {'First_name': ['Jon','Bill','Maria','Emma']}

df = pd.DataFrame(names,columns=['First_name'])

df.loc[(df['First_name'] == 'Bill') | (df['First_name'] == 'Emma'), 'name_match'] = 'Match'

df.loc[(df['First_name'] != 'Bill') & (df['First_name'] != 'Emma'), 'name_match'] = 'Mismatch'

print (df)

First_name name_match
0 Jon Mismatch
1 Bill Match
2 Maria Mismatch
3 Emma Match
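
The two .loc assignments for the OR case can also be expressed in one step. The sketch below is an alternative, not the approach used in the original notebook: it builds the condition with `isin` and fills both outcomes with `np.where`. The name `is_match` is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'First_name': ['Jon', 'Bill', 'Maria', 'Emma']})

# True where the name is Bill OR Emma
is_match = df['First_name'].isin(['Bill', 'Emma'])

# np.where picks 'Match' where the mask is True and 'Mismatch' elsewhere
df['name_match'] = np.where(is_match, 'Match', 'Mismatch')

print(df)
```
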

Applying an IF condition under an existing DataFrame column

import pandas as pd

numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10,0,0]}

df = pd.DataFrame(numbers,columns=['set_of_numbers'])
print (df)

df.loc[df['set_of_numbers'] == 0, 'set_of_numbers'] = 999

df.loc[df['set_of_numbers'] == 5, 'set_of_numbers'] = 555

print (df)


set_of_numbers
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10 0
11 0
set_of_numbers
0 1
1 2
2 3
3 4
4 555
5 6
6 7
7 8
8 9
9 10
10 999
11 999

import pandas as pd
import numpy as np

numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10,np.nan,np.nan]}

df = pd.DataFrame(numbers,columns=['set_of_numbers'])
print (df)

df.loc[df['set_of_numbers'].isnull(), 'set_of_numbers'] = 0
print (df)

set_of_numbers
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
5 6.0
6 7.0
7 8.0
8 9.0
9 10.0
10 NaN
11 NaN
set_of_numbers
0 1.0
1 2.0
2 3.0

3 4.0
4 5.0
5 6.0
6 7.0
7 8.0
8 9.0
9 10.0
10 0.0
11 0.0
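
Replacing missing values is common enough that pandas has a dedicated method for it. As an alternative sketch to the .loc assignment above (shortened data, otherwise the same idea), `fillna` produces the same result:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'set_of_numbers': [1, 2, 3, np.nan, np.nan]})

# Replace every NaN in the column with 0
df['set_of_numbers'] = df['set_of_numbers'].fillna(0)

print(df)
print(df['set_of_numbers'].isnull().any())   # False: no missing values remain
```
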

Descriptive Statistics for Pandas DataFrame

Steps to Get the Descriptive Statistics for Pandas DataFrame

Step 1: Collect the Data

Step 2: Create the DataFrame

from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],

'Price': [22000,27000,25000,29000,35000],
'Year': [2014,2015,2016,2017,2018]
}

df = DataFrame(Cars, columns= ['Brand', 'Price','Year'])

print (df)

Brand Price Year

0 Honda Civic 22000 2014
1 Ford Focus 27000 2015
2 Toyota Corolla 25000 2016

3 Toyota Corolla 29000 2017

4 Audi A4 35000 2018

Step 3: Get the Descriptive Statistics for Pandas DataFrame

from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],

'Price': [22000,27000,25000,29000,35000],
'Year': [2014,2015,2016,2017,2018]
}

df = DataFrame(Cars, columns= ['Brand', 'Price','Year'])

stats_numeric = df['Price'].describe()
print (stats_numeric)

count 5.000000
mean 27600.000000
std 4878.524367
min 22000.000000
25% 25000.000000
50% 27000.000000
75% 29000.000000
max 35000.000000
Name: Price, dtype: float64

You’ll notice that the output contains 6 decimal places. You may then add astype(int)
to the code to get integer values.

This is what the code would look like:

from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],

'Price': [22000,27000,25000,29000,35000],
'Year': [2014,2015,2016,2017,2018]
}

df = DataFrame(Cars, columns= ['Brand', 'Price','Year'])

stats_numeric = df['Price'].describe().astype (int)

print (stats_numeric)

count 5
mean 27600
std 4878
min 22000
25% 25000
50% 27000
75% 29000

max 35000
Name: Price, dtype: int64

Descriptive Statistics for Categorical Data

from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],

'Price': [22000,27000,25000,29000,35000],
'Year': [2014,2015,2016,2017,2018]
}

df = DataFrame(Cars, columns= ['Brand', 'Price','Year'])

stats_categorical = df['Brand'].describe()
print (stats_categorical)

count 5
unique 4
top Toyota Corolla
freq 2
Name: Brand, dtype: object

Descriptive Statistics for the Entire Pandas DataFrame

from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],

'Price': [22000,27000,25000,29000,35000],
'Year': [2014,2015,2016,2017,2018]
}

df = DataFrame(Cars, columns= ['Brand', 'Price','Year'])

stats = df.describe(include='all')
print (stats)

Brand Price Year

count 5 5.000000 5.000000
unique 4 NaN NaN
top Toyota Corolla NaN NaN
freq 2 NaN NaN
mean NaN 27600.000000 2016.000000
std NaN 4878.524367 1.581139
min NaN 22000.000000 2014.000000
25% NaN 25000.000000 2015.000000
50% NaN 27000.000000 2016.000000

75% NaN 29000.000000 2017.000000
max NaN 35000.000000 2018.000000

Breaking down the descriptive statistics, which are:

1. Count
2. Mean
3. Standard deviation
4. Minimum
5. 0.25 Quantile
6. 0.50 Quantile (Median)
7. 0.75 Quantile
8. Maximum

from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],

'Price': [22000,27000,25000,29000,35000],
'Year': [2014,2015,2016,2017,2018]
}

df = DataFrame(Cars, columns= ['Brand', 'Price','Year'])

count1 = df['Price'].count()
print('count: ' + str(count1))

mean1 = df['Price'].mean()
print('mean: ' + str(mean1))

std1 = df['Price'].std()
print('std: ' + str(std1))

min1 = df['Price'].min()
print('min: ' + str(min1))

quantile1 = df['Price'].quantile(q=0.25)
print('25%: ' + str(quantile1))

quantile2 = df['Price'].quantile(q=0.50)
print('50%: ' + str(quantile2))

quantile3 = df['Price'].quantile(q=0.75)
print('75%: ' + str(quantile3))

max1 = df['Price'].max()
print('max: ' + str(max1))
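
When only a handful of these statistics are needed, they can also be requested together with `agg`. This is a compact sketch using the same prices as above; the variable name `summary` and the particular selection of statistics are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({'Price': [22000, 27000, 25000, 29000, 35000]})

# Ask for several statistics in one call; the result is a Series indexed by name
summary = df['Price'].agg(['count', 'mean', 'std', 'min', 'median', 'max'])

print(summary)
print(summary['mean'])     # 27600.0
```
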


How to Plot a DataFrame using Pandas

The following steps show how to plot a DataFrame using Pandas as a:

1. Scatter diagram
2. Line chart
3. Bar chart
4. Pie chart

1. Plot a Scatter Diagram using Pandas

import pandas as pd
import matplotlib.pyplot as plt

data = {'Unemployment_Rate': [6.1,5.8,5.7,5.7,5.8,5.6,5.5,5.3,5.2,5.2],

'Stock_Index_Price': [1500,1520,1525,1523,1515,1540,1545,1560,1555,1565]
}

df = pd.DataFrame(data,columns=['Unemployment_Rate','Stock_Index_Price'])
df.plot(x ='Unemployment_Rate', y='Stock_Index_Price', kind = 'scatter')
plt.show()

2. Plot a Line Chart using Pandas

import pandas as pd
import matplotlib.pyplot as plt

data = {'Year': [1920,1930,1940,1950,1960,1970,1980,1990,2000,2010],

'Unemployment_Rate': [9.8,12,8,7.2,6.9,7,6.5,6.2,5.5,6.3]
}

df = pd.DataFrame(data,columns=['Year','Unemployment_Rate'])
df.plot(x ='Year', y='Unemployment_Rate', kind = 'line')
plt.show()

3. Plot a Bar Chart using Pandas

import pandas as pd
import matplotlib.pyplot as plt

data = {'Country': ['USA','Canada','Germany','UK','France'],

'GDP_Per_Capita': [45000,42000,52000,49000,47000]
}

df = pd.DataFrame(data,columns=['Country','GDP_Per_Capita'])
df.plot(x ='Country', y='GDP_Per_Capita', kind = 'bar')
plt.show()


4. Plot a Pie Chart using Pandas

import pandas as pd
import matplotlib.pyplot as plt

data = {'Tasks': [300,500,700]}

df = pd.DataFrame(data,columns=['Tasks'],index = ['Tasks Pending','Tasks Ongoing','Tasks Comp

df.plot.pie(y='Tasks',figsize=(5, 5),autopct='%1.1f%%', startangle=90)

plt.show()

No ratings yet
Creating and Reshaping NumPy Arrays
42 pages
Numpy Basics: Arrays and Operations
No ratings yet
Numpy Basics: Arrays and Operations
76 pages
NumPy Basics Cheat Sheet
No ratings yet
NumPy Basics Cheat Sheet
1 page
NumPy Basics for Data Science
100% (2)
NumPy Basics for Data Science
17 pages
NumPy Array Operations Cheat Sheet
100% (17)
NumPy Array Operations Cheat Sheet
7 pages
NP Cheat Sheet
No ratings yet
NP Cheat Sheet
1 page
NumPy Python Cheat Sheet Guide
No ratings yet
NumPy Python Cheat Sheet Guide
1 page
NumPy Cheat Sheet
No ratings yet
NumPy Cheat Sheet
1 page
Cheat Sheet
No ratings yet
Cheat Sheet
6 pages
NumPy Basics for Data Science
No ratings yet
NumPy Basics for Data Science
6 pages
NumPy Arithmetic Operations Guide
No ratings yet
NumPy Arithmetic Operations Guide
6 pages
NumPy Basics Cheat Sheet for Python
100% (1)
NumPy Basics Cheat Sheet for Python
1 page
NumPy Basics Cheat Sheet for Data Science
100% (5)
NumPy Basics Cheat Sheet for Data Science
14 pages
NumPy Basics for Data Science
No ratings yet
NumPy Basics for Data Science
5 pages
NumPy Notes For Data Analysis
No ratings yet
NumPy Notes For Data Analysis
55 pages
NumPy Arrays and Matrices Guide
No ratings yet
NumPy Arrays and Matrices Guide
1 page
Exercise No1 DATASCIENCE LAB
No ratings yet
Exercise No1 DATASCIENCE LAB
8 pages
Numpy Array Basics for Class XI
No ratings yet
Numpy Array Basics for Class XI
23 pages
Understanding NumPy Arrays in Python
100% (1)
Understanding NumPy Arrays in Python
25 pages
Comprehensive NumPy Guide
No ratings yet
Comprehensive NumPy Guide
29 pages
NumPy Array Operations Explained
No ratings yet
NumPy Array Operations Explained
62 pages
June 2016 Grade 12 Maths Paper 1 Memo
No ratings yet
June 2016 Grade 12 Maths Paper 1 Memo
11 pages
Pascal Programming Assignment Guide
No ratings yet
Pascal Programming Assignment Guide
5 pages
H Angles Student
100% (2)
H Angles Student
36 pages
BharatBhasaNet: Indian Code-Mixed LID
No ratings yet
BharatBhasaNet: Indian Code-Mixed LID
12 pages
First Semester Maths Model Paper 2022
No ratings yet
First Semester Maths Model Paper 2022
70 pages
AC Circuits and Transformer Analysis
No ratings yet
AC Circuits and Transformer Analysis
12 pages
Mathematics Club Constitution Overview
No ratings yet
Mathematics Club Constitution Overview
10 pages
Technical Drafting Tools Overview
100% (2)
Technical Drafting Tools Overview
21 pages
SOC 56: Intro to Social Statistics
No ratings yet
SOC 56: Intro to Social Statistics
6 pages
Portfolio Theory and Diversification
100% (1)
Portfolio Theory and Diversification
12 pages
Class 11 Maths Assignment Worksheet
No ratings yet
Class 11 Maths Assignment Worksheet
6 pages
Hybrid Heuristic for Multi-Compartment Routing
No ratings yet
Hybrid Heuristic for Multi-Compartment Routing
12 pages
Statistical Charts Worksheet for Form 2
No ratings yet
Statistical Charts Worksheet for Form 2
28 pages
Tool Tolerances Impact on Gear Quality
No ratings yet
Tool Tolerances Impact on Gear Quality
7 pages
Prolog Exercises for Medical Informatics
No ratings yet
Prolog Exercises for Medical Informatics
13 pages
Oblique and Orthographic Drawing Guide
No ratings yet
Oblique and Orthographic Drawing Guide
8 pages
Regula Falsi Method Explained
No ratings yet
Regula Falsi Method Explained
9 pages
Constructing Cantor Sets in [0,1]
No ratings yet
Constructing Cantor Sets in [0,1]
5 pages
Perinetti.2010 - Dental Malocclusion and Body Posture in Young Subjects - A Multiple Regression Study
No ratings yet
Perinetti.2010 - Dental Malocclusion and Body Posture in Young Subjects - A Multiple Regression Study
8 pages
Gliding Arc Plasma for Chloromethane Treatment
No ratings yet
Gliding Arc Plasma for Chloromethane Treatment
12 pages
ICT Reading Grade 11
No ratings yet
ICT Reading Grade 11
200 pages
Scientific Notation and Exponents Guide
100% (2)
Scientific Notation and Exponents Guide
30 pages
Applications of Geometry in Daily Life
No ratings yet
Applications of Geometry in Daily Life
2 pages
Circuit Optimization in Logic Design
No ratings yet
Circuit Optimization in Logic Design
29 pages
1 s2.0 S0142112321003364 Main
No ratings yet
1 s2.0 S0142112321003364 Main
10 pages
Opposite Differences in Grid Windows
No ratings yet
Opposite Differences in Grid Windows
8 pages
Motion in Two and Three Dimensions
No ratings yet
Motion in Two and Three Dimensions
8 pages
Step Radius R Specifications and Pricing
No ratings yet
Step Radius R Specifications and Pricing
1 page
Perfect Solids: 20-Faced Shapes
100% (1)
Perfect Solids: 20-Faced Shapes
8 pages
Differential Equations Assignment 8
No ratings yet
Differential Equations Assignment 8
4 pages