NumPy Essentials for Data Science
By DURGA SIR
NumPy Course
Python does not contain any built-in library for performing complex mathematical operations
on large collections of numbers.
To perform these operations we require a separate library, which is NumPy.
Note:
Largely because of extra libraries like numpy, pandas, matplotlib, seaborn (scatterplot, heatmap),
scikit-learn and tensorflow, Python is a recommended language for Data Science, Machine Learning
and Deep Learning.
Because of its high speed, numpy is a better choice for ML algorithms than Python's
built-in data structures like list.
Q8. What is the advantage of numpy array over Python's in built List?
Performance is very high
7. Operational Research
etc
Numpy is the fundamental library required for Data Science, Machine
Learning, Deep Learning etc
2 ways
1st way:
---------
By using Anaconda Distribution
Anaconda is a Python distribution for Data Science, ML etc.
The Anaconda distribution already includes the numpy library, so we are not required to
install it separately.
2nd way:
-------
If Python is already installed in our system, then we can install numpy library as follows
Summary:
--------
1. NumPy stands for Numerical Python.
2. Numpy library defines several functions to solve complex mathematical problems in
Data Science, Machine Learning, Deep Learning etc
3. Numpy acts as Backbone for most of the libraries used in Data Science like Pandas ,
sklearn etc
sample code:
-------------
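import numpy as np
from datetime import datetime

a = np.array([10,20,30])
b = np.array([1,2,3])
print(type(a)) #<class 'numpy.ndarray'>

# NOTE: the definition of dot_product() is missing from these notes.
# A plain-Python loop version is assumed here for illustration only.
def dot_product(a,b):
    result = 0
    for x,y in zip(a,b):
        result += x*y
    return result

before = datetime.now()
for i in range(1000000):
    dot_product(a,b)
after = datetime.now()
print('The Time Taken:', after-before)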
D:\durgaclasses>py [Link]
<class 'numpy.ndarray'>
The Time Taken: 0:00:02.496784
The Time Taken: 0:00:02.054956
D:\durgaclasses>py [Link]
<class 'numpy.ndarray'>
The Time Taken: 0:00:02.482245
The Time Taken: 0:00:01.972443
D:\durgaclasses>py [Link]
<class 'numpy.ndarray'>
The Time Taken: 0:00:02.464734
The Time Taken: 0:00:01.969930
What is an Array?
An array is an indexed collection of homogeneous data elements.
It is one of the most commonly used concepts in programming languages like C/C++/Java etc.
By default, the array concept is not available in Python; instead we can use list.
(But note that a list and an array are not the same.)
But in Python, we can create arrays in the following 2 ways:
D:\durgaclasses>py [Link]
<class 'array.array'>
array('i', [10, 20, 30])
Elements one by one:
10
20
30
Note: The array module is not recommended because it has very little library support.
D:\durgaclasses>py [Link]
<class 'numpy.ndarray'>
[10 20 30]
Elements one by one:
10
20
30
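The scripts that produced the two outputs above are not reproduced in these notes. The following is only a minimal sketch of what they presumably look like, assuming the first way is the standard-library array module and the second way is NumPy; the variable names a1 and a2 are chosen here for illustration.

```python
import array          # standard-library array module (assumed 1st way)
import numpy as np    # NumPy library (assumed 2nd way)

# 1st way: array module; 'i' is the type code for signed integers
a1 = array.array('i', [10, 20, 30])
print(type(a1))
print(a1)

# 2nd way: NumPy ndarray
a2 = np.array([10, 20, 30])
print(type(a2))
print(a2)

print('Elements one by one:')
for x in a2:
    print(x)
```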
D:\durgaclasses>py [Link]
The Size of Numpy Array: 168
The Size of List: 184
1. array()
2. arange()
3. linspace()
4. zeros()
5. ones()
6. full()
7. eye()
8. identity()
9. empty()
10. random library functions
1. randint()
2. rand()
3. uniform()
4. randn()
5. normal()
6. shuffle()
1. array():
----------
Creates an array from the given list or tuple.
array(...)
array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)
Create an array.
>>> l = [10,20,30]
>>> a = np.array(l)
>>> a
array([10, 20, 30])
>>> type(a)
<class 'numpy.ndarray'>
>>> l = [[10,20,30],[40,50,60],[70,80,90]]
>>> a = np.array(l)
>>> a
array([[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]])
>>> type(a)
<class 'numpy.ndarray'>
>>> a.ndim
2
>>> a = np.array([10,20,30])
>>> a.dtype
dtype('int32')
>>> a = np.array([10,20.5,30])
>>> a
array([10. , 20.5, 30. ])
>>> type(a)
<class 'numpy.ndarray'>
>>> a.dtype
dtype('float64')
eg-1:
>>> a = np.array([10,20,30],dtype=float)
>>> a
array([10., 20., 30.])
>>> a.dtype
dtype('float64')
eg-2:
>>> a = np.array([10,20,30],dtype=complex)
>>> a
array([10.+0.j, 20.+0.j, 30.+0.j])
>>> a.dtype
dtype('complex128')
eg-4:
>>> a = np.array([10,20,30,0],dtype=bool)
>>> a
array([ True,  True,  True, False])
>>> a.dtype
dtype('bool')
eg-4a:
>>> a = np.array(['A','B',''],dtype=bool)
>>> a
array([ True,  True, False])
eg-5:
>>> a = np.array([10,20,30,0],dtype=str)
>>> a
array(['10', '20', '30', '0'], dtype='<U2')
>>> a.dtype
dtype('<U2')
2. arange():
-----------
Its functionality is similar to Python's built-in function range(), except that it returns an ndarray.
range(10) --->0,1,2,....9
range(1,10) --->1,2,3,....9
range(1,10,2) --->1,3,5,7,9
Syntax:
-------
arange([start,] stop[, step,], dtype=None, *, like=None)
Return evenly spaced values within a given interval.
eg-1:
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a.dtype
dtype('int32')
eg-2:
>>> a = np.arange(1,10)
>>> a
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
eg-3:
>>> a = np.arange(1,10,2)
>>> a
array([1, 3, 5, 7, 9])
eg-4:
>>> a = np.arange(2,11,2,dtype=float)
>>> a
array([ 2.,  4.,  6.,  8., 10.])
3. linspace():
-------------
Syntax:
linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
Return evenly spaced numbers over a specified interval.
eg-3:
>>> a = np.linspace(0,360,4)
>>> a
array([  0., 120., 240., 360.])
eg-4:
>>> np.linspace(0,10,5,dtype=float)
array([ 0. ,  2.5,  5. ,  7.5, 10. ])
>>> np.linspace(0,10,5,dtype=int)
array([ 0,  2,  5,  7, 10])
arange() vs linspace()
-----------------------
arange(): Elements are taken from the given range based on the step value.
linspace(): The specified number of evenly spaced elements is taken from the given range.
eg:
np.arange(0,10,2) # [0,2,4,6,8]
np.linspace(0,10,2) # [0.0, 10.0]
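The difference is easiest to see when the same three arguments are passed to both functions. The sketch below is illustrative only; the variable names are chosen here and are not part of the course material.

```python
import numpy as np

# arange: the third argument is the step between elements; stop (10) is excluded
by_step = np.arange(0, 10, 2)        # elements 0, 2, 4, 6, 8

# linspace: the third argument is the number of elements; stop (10) is included by default
by_count = np.linspace(0, 10, 2)     # elements 0.0 and 10.0
five_points = np.linspace(0, 10, 5)  # elements 0.0, 2.5, 5.0, 7.5, 10.0

print(by_step)
print(by_count)
print(five_points)
```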
4. zeros():
----------
Syntax:
-------
zeros(shape, dtype=float, order='C', *, like=None)
Return a new array of given shape and type, filled with zeros.
eg-1:
>>> np.zeros(4)
array([0., 0., 0., 0.])
eg-2:
>>> np.zeros((4,))
array([0., 0., 0., 0.])
eg-3:
>>> np.zeros((2,3))
array([[0., 0., 0.],
       [0., 0., 0.]])
eg-4:
>>> np.zeros((2,3),dtype=int)
array([[0, 0, 0],
       [0, 0, 0]])
eg-5:
>>> np.zeros((2,3,4),dtype=int)
array([[[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]])
eg-6:
>>> a = np.zeros((5,),dtype=int)
>>> a
array([0, 0, 0, 0, 0])
>>> a.shape
(5,)
>>> a.ndim
1
>>> a.size
5
eg-7:
>>> a = np.zeros((2,5),dtype=int)
>>> a
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
>>> a.shape
(2, 5)
>>> a.ndim
2
>>> a.size
10
Note:
shape: a tuple of integers indicating the number of elements in each dimension.
eg: (2,3,4)
It is a 3-dimensional array where the first dimension contains 2 elements, the second dimension
contains 3 elements and the third dimension contains 4 elements.
Total size of this array: 2*3*4 = 24
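As a quick check of the note above, the following sketch builds an array of shape (2,3,4) and confirms that its size is the product of the entries of its shape. The loop and the name total are added here only for illustration.

```python
import numpy as np

a = np.zeros((2, 3, 4), dtype=int)

print(a.shape)  # (2, 3, 4)
print(a.ndim)   # 3
print(a.size)   # 24

# size is the product of the number of elements in each dimension
total = 1
for n in a.shape:
    total *= n
print(total == a.size)  # True
```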
5. ones():
----------
Syntax:
ones(shape, dtype=None, order='C', *, like=None)
Return a new array of given shape and type, filled with ones.
eg-1:
>>> np.ones(3)
array([1., 1., 1.])
eg-2:
>>> np.ones((3,))
array([1., 1., 1.])
eg-3:
>>> a = np.ones((3,2),dtype=int)
>>> a
array([[1, 1],
       [1, 1],
       [1, 1]])
>>> a.shape
(3, 2)
>>> a.size
6
>>> a.ndim
2
6. full():
----------
full(shape, fill_value, dtype=None, order='C', *, like=None)
Return a new array of given shape and type, filled with `fill_value`.
It works the same way as the zeros() and ones() functions, except that we can fill the array with
any required value (fill_value).
>>> np.full((5,),7)
array([7, 7, 7, 7, 7])
>>> np.full((3,5),6,dtype=float)
7. eye()
---------
Syntax:
-------
eye(N, M=None, k=0, dtype=<class 'float'>, order='C', *, like=None)
Return a 2-D array with ones on the diagonal and zeros elsewhere.
Parameters
----------
N : int
Number of rows in the output.
M : int, optional
Number of columns in the output. If None, defaults to `N`.
k : int, optional
Index of the diagonal: 0 (the default) refers to the main diagonal,
a positive value refers to an upper diagonal, and a negative value
to a lower diagonal.
dtype : data-type, optional
Data-type of the returned array.
eg-1:
>>> np.eye(3)
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
eg-3:
>>> np.eye(4,5)
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.]])
eg-4:
>>> np.eye(4,5,-1)
array([[0., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.]])
eg-5:
>>> np.eye(4,5,-2)
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.]])
eg-6:
>>> np.eye(4,5,1)
array([[0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])
eg-7:
>>> np.eye(4,5,-10)
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
Ans: D
D. None of these.
Ans: D
8. identity():
--------------
Syntax:
-------
identity(n, dtype=None, *, like=None)
Return the identity array.
Parameters
----------
n : int
Number of rows (and columns) in `n` x `n` output.
dtype : data-type, optional
Data-type of the output. Defaults to ``float``.
eg-1:
>>> np.identity(3)
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
eg-2:
>>> np.identity(5,dtype=int)
array([[1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1]])
eg-3:
>>> np.identity(5,4,dtype=int)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: identity() got multiple values for argument 'dtype'
Note: eye() is the general function, whereas identity() is the more specific
function.
identity() is a special case of eye() where
1. The result is always a square matrix (2-D array with an equal number of rows and columns)
2. Only the main diagonal contains ones
Ans: A
Ans: B,C
Revision:
-----------
np.eye()
--->to generate a 2-D array with ones on a diagonal and zeros elsewhere
--->The number of rows and columns need not be the same
--->Instead of the main diagonal, we can specify which diagonal has to contain 1s.
np.eye(2,k=0)
np.eye(3,2,k=-1)
---------------------
np.identity()
--->to generate a 2-D identity matrix
--->The number of rows and columns must be the same
--->only the main diagonal contains 1s
np.identity(3)
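The relationship between the two functions can be checked directly. This is a small sketch, with variable names chosen here for illustration: identity(n) gives the same result as eye(n), while eye() can additionally produce non-square results and shifted diagonals.

```python
import numpy as np

i3 = np.identity(3)
e3 = np.eye(3)
same = np.array_equal(i3, e3)   # identity(3) and eye(3) hold the same values
print(same)

# eye() also accepts a column count and a diagonal offset k
rect = np.eye(3, 2, k=-1)       # 3 rows, 2 columns, ones on the first lower diagonal
print(rect)
```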
9. empty():
-----------
empty(shape, dtype=float, order='C', *, like=None)
- Return a new array of given shape and type, without initializing entries.
Returns
-------
out : ndarray
Array of uninitialized (arbitrary) data of the given shape, dtype, and
order. Object arrays will be initialized to None.
eg:
>>> [Link]((2,3))
array([[2.33419537e-312, 8.48798317e-313, 1.27319747e-312],
[1.69759663e-312, 2.12199580e-313, 6.36598737e-313]])
zeros() vs empty():
-------------------
If we require an array containing only zeros, then we should go for zeros().
If we do not care about the initial data and just need an array to fill in later, then
we should go for empty().
Creating an array with empty() usually takes much less time than with zeros(), because the
entries are not initialized.
i.e. performance-wise, empty() is recommended over zeros() when the initial contents do not
matter.
eg:
import numpy as np
from datetime import datetime
import sys
begin = datetime.now()
a = np.zeros((25000,300,400))
after = datetime.now()
print('Time taken by zeros:',after-begin)
a = None
begin = datetime.now()
a = np.empty((25000,300,400))
after = datetime.now()
print('Time taken by empty:',after-begin)
D:\durgaclasses>py [Link]
Time taken by zeros: 0:00:00.430188
Time taken by empty: 0:00:00.056541
1. randint():
-------------
To generate random int values.
Syntax:
randint(low, high=None, size=None, dtype=int)
- Return random integers from 'low' (inclusive) to 'high' (exclusive).
eg-1:
np.random.randint(10,20)
It will generate a random int number which is >=10 but <20. (i.e from 10 to 19)
eg-2:
np.random.randint(20)
It will generate a random int number from 0 to 19.
>>> np.random.randint(10,20)
10
>>> np.random.randint(10,20)
18
>>> np.random.randint(10,20)
12
>>> np.random.randint(10,20)
10
>>> np.random.randint(10,20)
12
>>> np.random.randint(10,20)
10
>>> np.random.randint(10,20)
15
>>> np.random.randint(10,20)
13
>>> np.random.randint(10,20)
12
>>> np.random.randint(10,20)
17
>>> np.random.randint(10,20)
16
>>> np.random.randint(10)
2
>>> np.random.randint(10)
4
>>> np.random.randint(10)
5
>>> np.random.randint(10)
1
>>> np.random.randint(10)
2
>>> np.random.randint(10)
9
>>> np.random.randint(10)
4
>>> np.random.randint(10)
8
>>> np.random.randint(10)
4
>>> np.random.randint(10)
6
>>> np.random.randint(10)
0
>>> np.random.randint(10)
eg-1: Create a 1-D nd array of size 10 with random values from 1 to 8.
>>> np.random.randint(1,9,size=10)
array([7, 7, 5, 7, 2, 2, 2, 1, 8, 5])
>>> np.random.randint(1,9,10)
array([4, 7, 4, 7, 6, 7, 4, 7, 3, 4])
>>> np.random.randint(100,size=(3,5))
array([[45, 15, 41, 26, 26],
       [10, 20, 40, 16, 19],
       [44, 11, 67,  8, 35]])
Note: Here dtype can be any int type like int32 or int64 etc, but we cannot use any other
type such as float.
>>> a = np.random.randint(1,11,7,dtype=int)
>>> a
array([ 9,  9,  1, 10,  8,  5,  5])
>>> a.dtype
dtype('int32')
>>> a = np.random.randint(1,11,7,dtype='int8')
>>> a.dtype
dtype('int8')
>>> a = np.random.randint(1,11,7,dtype='int16')
>>> a.dtype
dtype('int16')
>>> a = np.random.randint(1,11,7,dtype='int32')
>>> a
array([ 2,  1,  7, 10,  5,  6,  2])
>>> a.dtype
dtype('int32')
>>> a = np.random.randint(1,11,7,dtype='int64')
>>> a.dtype
dtype('int64')
>>> a = np.random.randint(1,11,7,dtype='float')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "[Link]", line 763, in [Link]
TypeError: Unsupported dtype dtype('float64') for randint
>>> a.dtype
dtype('int16')
>>> a
array([1, 1, 1, 3, 4, 5, 7], dtype=int16)
>>> b = a.astype('float')
>>> b
array([1., 1., 1., 3., 4., 5., 7.])
>>> b.dtype
dtype('float64')
Diagram
2. rand() function
-------------------
rand(d0, d1, ..., dn)
-Random values in a given shape.
-Create an array of the given shape and populate it with random samples from a uniform
distribution over [0, 1). 0 inclusive but 1 exclusive.
Parameters
----------
d0, d1, ..., dn : int, optional
The dimensions of the returned array, must be non-negative.
If no argument is given a single float value will be generated.
eg-1:
>>> np.random.rand()
0.054405190205850884
>>> np.random.rand()
0.040647897775599295
eg-2:
>>> np.random.rand(3)
array([0.83912901, 0.64730918, 0.65829491])
eg-3:
>>> np.random.rand(2,3)
array([[0.88186185, 0.19833909, 0.17236504],
       [0.01180205, 0.08002359, 0.50064927]])
eg-4:
>>> np.random.rand(2,3,5)
array([[[0.70663592, 0.23348426, 0.15253474, 0.10795603, 0.39937198],
        [0.17003214, 0.41018137, 0.02026827, 0.323256  , 0.01589707],
        [0.92207686, 0.14700846, 0.78516395, 0.02655265, 0.0697182 ]],
3. uniform():
----------
Syntax:
uniform(low=0.0, high=1.0, size=None)
Draw samples from a uniform distribution.
eg-1: If we are not passing any argument, it simply acts as rand() function.
>>> np.random.uniform()
0.9439394959018282
>>> np.random.uniform()
0.7970971752164758
>>> np.random.uniform()
0.5068985659487931
>>> np.random.uniform()
0.20453164527515644
>>> np.random.uniform()
0.5175842954748464
eg-2:
>>> np.random.uniform(10,20)
12.588199652498417
>>> np.random.uniform(10,20)
16.44176578692628
>>> np.random.uniform(10,20)
12.156096979661884
>>> np.random.uniform(10,20)
19.44994224690347
eg-3:
>>> np.random.uniform(10,20,size=10)
eg-4:
>>> np.random.uniform(10,20,size=(2,3))
array([[10.31451286, 11.07349691, 11.46088957],
       [19.04364601, 13.81225102, 12.61413272]])
eg-5:
s = np.random.uniform(20,30,size=1000000)
import matplotlib.pyplot as plt
count, bins, ignored = plt.hist(s, 15, density=True)
plt.plot(bins, np.ones_like(bins), linewidth=2, color='r')
plt.show()
randn()
-------
Syntax:
randn(d0, d1, ..., dn)
Return a sample (or samples) from the "standard normal" distribution of mean 0 and
variance 1.
Parameters
----------
d0, d1, ..., dn : int, optional
The dimensions of the returned array, must be non-negative.
If no argument is given a single Python float is returned.
Returns
-------
Z : ndarray or float
A ``(d0, d1, ..., dn)``-shaped array of floating-point samples from
the standard normal distribution, or a single such float if
no parameters were supplied.
>>> np.random.randn()
1.0156920146285195
>>> np.random.randn(3)
array([ 0.02522907,  0.38010197, -0.27503773])
>>> np.random.randn(3,2)
array([[ 0.9393751 ,  0.58826297],
       [ 0.00924243,  1.21003746],
       [ 1.0109597 , -1.6088758 ]])
>>> np.random.randn(2,3,4)
array([[[ 0.67624207, -1.46274234,  0.00732033, -0.33019154],
        [ 1.80526216,  0.71892504, -0.13447181,  2.0164551 ],
        [-1.40540834, -0.22872869,  0.60825913,  0.63015516]],
5. normal():
------------
normal(loc=0.0, scale=1.0, size=None)
Draw random samples from a normal (Gaussian) distribution.
Parameters
----------
loc : float or array_like of floats
Mean ("centre") of the distribution.
eg-1: If we are not passing any argument, then it acts as randn() function.
>>> np.random.normal()
0.3733151175386248
>>> np.random.normal()
-0.9073500065910385
>>> np.random.normal()
-0.965320002780851
eg-2:
>>> np.random.normal(10,4)
12.597959123744054
>>> np.random.normal(10,4)
12.4328685244566
>>> np.random.normal(10,4)
12.264023606692891
>>> np.random.normal(10,4)
10.736425424120322
eg-3:
s = np.random.normal(10,4,size=1000000)
import matplotlib.pyplot as plt
count, bins, ignored = plt.hist(s, 15, density=True)
plt.plot(bins, np.ones_like(bins), linewidth=2, color='r')
plt.show()
Note:
-----
np.random.rand()--->uniformly distributed float values from [0,1).
np.random.uniform()--->uniformly distributed float values in the given range.
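The ranges and distributions described in this section can be checked empirically. The sketch below is illustrative only: the seed call, the sample sizes and the variable names are additions made here, not part of the course material, and it also draws from randn() and normal() for comparison with the two uniform functions.

```python
import numpy as np

np.random.seed(0)  # seeding is an addition here, only to make the run repeatable

r = np.random.rand(1000)               # uniform over [0, 1)
u = np.random.uniform(10, 20, 1000)    # uniform over the given range
n = np.random.randn(100000)            # standard normal: mean 0, variance 1
g = np.random.normal(10, 4, 100000)    # normal with mean 10 and standard deviation 4

print(r.min() >= 0, r.max() < 1)
print(u.min() >= 10, u.max() <= 20)
print(round(n.mean(), 1), round(n.std(), 1))
print(round(g.mean(), 1), round(g.std(), 1))
```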
6. shuffle():
-------------
Syntax:
------
shuffle(x)
Modify a sequence in-place by shuffling its contents.
Parameters
----------
x : ndarray or MutableSequence
This function only shuffles the array along the first axis of a
multi-dimensional array. The order of sub-arrays is changed but
their contents remains the same.
Returns
-------
The modification happens in the existing array itself. Hence it returns None.
eg-1:
>>> a = np.arange(9)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8])
>>> np.random.shuffle(a)
>>> a
array([1, 7, 4, 2, 8, 3, 5, 0, 6])
eg-2:
In a 2-D array, only the rows will be shuffled but not their internal elements.
>>> a = np.random.randint(1,101,size=(6,5))
>>> a
array([[ 89,   8,  69,  29,  48],
       [ 70, 100,  41,  99,  56],
       [ 70,  41,  62,  95,  86],
       [ 49,   7,  27,  76,  87],
       [ 44,  27,  44,   2,  57],
Diagram
If we apply shuffle to a 3-D array, then the order of the 2-D arrays will be changed but not their
internal content.
>>> a = np.arange(48).reshape(4,3,4)
>>> a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]],
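The first-axis behaviour of shuffle() can be verified without looking at the random order itself. In this sketch (array size and variable names chosen here for illustration) the set of rows before and after shuffling is compared, and the return value is checked to be None.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)
rows_before = sorted(map(tuple, a.tolist()))

result = np.random.shuffle(a)   # shuffles in place along the first axis

rows_after = sorted(map(tuple, a.tolist()))
print(result)                     # None, because the array is modified in place
print(rows_before == rows_after)  # True: rows may change order, their contents do not
```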
s = np.random.uniform(-1,0,1000)
import matplotlib.pyplot as plt
count, bins, ignored = plt.hist(s, 15, density=True)
plt.plot(bins, np.ones_like(bins), linewidth=2, color='r')
plt.show()
-------------------------------
s = [Link](100000)
import matplotlib.pyplot as plt
count, bins, ignored = plt.hist(s, 15, density=True)
plt.plot(bins, np.ones_like(bins), linewidth=2, color='r')
plt.show()
------------------------------------------
s = [Link](5)
import matplotlib.pyplot as plt
Summary:
-------
1. array()
2. arange()
3. linspace()
4. zeros()
5. ones()
6. full()
7. eye()
8. identity()
9. empty()
10. random library functions
1. randint()
2. rand()
3. uniform()
4. randn()
5. normal()
6. shuffle()
Array Attributes:
-----------------
The following are various array attributes.
eg-1:
>>> a = np.array([10,20,30,40])
>>> a.ndim
1
>>> a.shape
(4,)
>>> a.size
4
>>> a.dtype
dtype('int32')
>>> a.itemsize
4
eg-2:
>>> a = np.array([[10,20,30],[40,50,60]],dtype=float)
>>> a
array([[10., 20., 30.],
       [40., 50., 60.]])
>>> a.ndim
2
>>> a.shape
(2, 3)
>>> a.size
6
>>> a.dtype
dtype('float64')
>>> a.itemsize
8
But, Numpy has some extra data types in addition to these types. We can also represent data
types by using a single character.
i --->integer (int8,int16,int32,int64)
b --->boolean
u --->unsigned integer (uint8,uint16,uint32,uint64)
f --->float (float16,float32,float64)
c --->complex (complex64,complex128)
S --->String (byte string)
U --->Unicode string
M --->datetime
etc
int8
----
The value will be represented by using 8 bits.
The range of values: -128 to 127
int16
-----
The value will be represented by using 16 bits.
The range of values: -32768 to 32767
int32
-----
The value will be represented by using 32 bits.
The range of values: -2147483648 to 2147483647
int64
-----
The value will be represented by using 64 bits.
The range of values: -9223372036854775808 to 9223372036854775807
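Note:
1. By default int means int32 on Windows with NumPy versions before 2.0 (as in these examples);
on most other platforms, and from NumPy 2.0 onwards, the default is int64.
2. By default float means float64
Note: An int8 array requires less memory than an int32 array.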
>>> a = np.array([10,20,30,40])
>>> import sys
>>> sys.getsizeof(a)
120
>>> a = np.array([10,20,30,40],dtype='int8')
>>> sys.getsizeof(a)
108
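The figures reported by sys.getsizeof() include the fixed overhead of the array object, so the saving is easier to see with the itemsize and nbytes attributes. The following sketch sets the dtype explicitly so that it does not depend on the platform default; the variable names are chosen here for illustration.

```python
import sys
import numpy as np

a32 = np.array([10, 20, 30, 40], dtype='int32')
a8 = np.array([10, 20, 30, 40], dtype='int8')

print(a32.itemsize, a32.nbytes)   # 4 bytes per element, 16 bytes of data
print(a8.itemsize, a8.nbytes)     # 1 byte per element, 4 bytes of data

# getsizeof() adds the object overhead on top of the data bytes
diff = sys.getsizeof(a32) - sys.getsizeof(a8)
print(diff)
```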
eg:
>>> a = np.array([10,20,30,40])
>>> a.dtype
dtype('int32')
>>> a = np.array([10.5,20.6,30])
>>> a.dtype
dtype('float64')
>>> a = np.array([10,20,30,40],dtype='i8')
>>> a.dtype
dtype('int64')
>>> a = np.array([10,20,30,40],dtype='int8')
>>> a.dtype
dtype('int8')
>>> a = np.array([10,20,30,40],dtype='i')
>>> a.dtype
dtype('int32')
Note: If an array element cannot be converted to the specified type, then we will get an
error.
>>> a = np.array(['a',10,10.5],dtype=int)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'a'
>>> a = np.array([10,20,30])
>>> a.dtype
dtype('int32')
>>> b = a.astype('float64')
>>> b
array([10., 20., 30.])
>>> b.dtype
dtype('float64')
Note: We can also convert the type of an array by calling the corresponding data type
function.
eg-1:
>>> a = np.array([10,20,30])
>>> a.dtype
dtype('int32')
>>> c = np.float64(a)
>>> c
array([10., 20., 30.])
>>> c.dtype
dtype('float64')
eg-2:
>>> a = np.array([10,20,30,0,40])
>>> a.dtype
dtype('int32')
>>> x = np.bool_(a)
>>> x
array([ True,  True,  True, False,  True])
>>> x.dtype
dtype('bool')
1. Indexing
2. Slicing
1. Indexing:
------------
1. By using index, we can access single element of the array.
2. Numpy follows zero based indexing. ie index of first element is always zero.
3. Numpy supports both positive indexing and negative indexing.
Syntax:
a[index] --->Returns element present at specified index.
>>> a = np.array([10,20,30,40,50])
>>> a
array([10, 20, 30, 40, 50])
Diagram
>>> a[0]
10
>>> a[2]
30
>>> a[-1]
50
>>> a[-2]
40
Note: If we are trying to access array element with out of range index, then we will get
IndexError.
eg:
>>> a[10]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: index 10 is out of bounds for axis 0 with size 5
>>> a[-7]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: index -7 is out of bounds for axis 0 with size 5
eg:
>>> a = np.array([[10,20,30],[40,50,60]])
>>> a
array([[10, 20, 30],
[40, 50, 60]])
Diagram
1. To access 50:
----------------
a[1][1]
a[-1][-2]
a[1][-2]
a[-1][1]
2. To access 30:
----------------
a[0][2]
a[-2][-1]
a[0][-1]
a[-2][2]
A. a[0][1] --->20
B. a[-2][0] ---->10
C. a[-1][-2] --->50
D. a[-3][-1] --->IndexError
E. a[0][4]---->IndexError
Syntax:
a[i][j][k]
i represents which 2-Dimensional array
j represents row number in that 2-D array
k represents column number in that 2-D array
eg:
a[0][1][2]
0 index 2-D array
In that 2-D array, 1 index row number
In that 2-D array, 2-index column number
eg-1:
>>> l = [[[1,2,3],[4,5,6],[7,8,9]],[[10,11,12],[13,14,15],[16,17,18]]]
>>> a = np.array(l)
>>> a
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
eg-2:
-----
l = [[[1,2,3,4],[5,6,7,8],[9,10,11,12]],[[13,14,15,16],[17,18,19,20],[21,22,23,24]]]
a = np.array(l)
2--->2-D arrays
3 --->3 Rows in every 2-D array
4 ---> 4 Columns in every 2-D Array
shape=(2,3,4)
It is 3D array which contains two 2D arrays. Each 2D array contains 3 rows and 4 columns.
Total 24 elements are there.
a[i][j][k]
i--->which 2D array
j--->In that 2D array, Row index
k --->In that 2D array, column index
eg:
>>> l = [[[1,2,3,4],[5,6,7,8],[9,10,11,12]],[[13,14,15,16],[17,18,19,20],[21,22,23,24]]]
>>> a = np.array(l)
>>> a
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]],
Diagram
To access 18 :
-------------
a[1][1][1]
a[-1][-2][-3]
a[1][-2][-3]
a[1][1][-3]
a[-1][1][1]
a[-1][-2][1]
To access 7:
------------
a[0][1][2]
a[-2][-2][-2]
a[i][j][k][l]
i-->which 3d array
j-->in that 3d array, which 2d array
k--->in that 2d array which row
l --->in that 2d array which column
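The 4-D syntax can be tried on a small array. This is a sketch only; the shape (2,3,4,5) and the use of arange() with reshape() are choices made here for illustration, so that each element's value can be worked out from its four indexes.

```python
import numpy as np

# 2 three-dimensional arrays, each holding 3 two-dimensional arrays of 4 rows and 5 columns
a = np.arange(120).reshape(2, 3, 4, 5)

x = a[1][2][3][4]      # last 3-D array, last 2-D array, last row, last column
y = a[-1][-1][-1][-1]  # the same element using negative indexes
z = a[0][1][2][3]      # value is 0*60 + 1*20 + 2*5 + 3

print(x, y, z)
```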
---------------------------------------------------
Slice means part of the object.
Diagram
Syntax-1:
l[begin:end] --->It returns elements from begin index to end-1 index.
>>> l[1:6] ---->It returns elements from 1st index to 5th index
[20, 30, 40, 50, 60]
>>> l[-6:-2] ---->It returns elements from -6th index to -3 index
[20, 30, 40, 50]
>>> l[:4] ---->It returns elements from 0 index to 3 index
[10, 20, 30, 40]
>>> l[2:]--->It returns elements from 2nd index to last element
[30, 40, 50, 60, 70]
>>> l[4:1]
[]
>>> l[4:10000]
[50, 60, 70]
>>> l[-10:]
[10, 20, 30, 40, 50, 60, 70]
>>>l[:]
[10, 20, 30, 40, 50, 60, 70]
Syntax-2:
l[begin:end:step]
---It returns elements from begin index to end-1 index based on step value
l[1:6:1]--->It returns elements from 1st index to 5th index and every element will be
considered.
l[1:6:2]--->It returns elements from 1st index to 5th index and every 2nd element will be
considered.
l[1:6:3]--->It returns elements from 1st index to 5th index and every 3rd element will be
considered.
>>> l[1:6:1]
[20, 30, 40, 50, 60]
>>> l[1:6:2]
[20, 40, 60]
>>> l[1:6:3]
[20, 50]
2. For begin, end and step, we can take both positive and negative values.
But for step we cannot take zero.
>>> l[1:6:0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: slice step cannot be zero
3. Based on step value, we can decide whether we have to consider elements in either
forward or backward direction.
If step value is positive, We have to consider elements from begin to end-1 in forward
direction.
If step value is negative, We have to consider elements from begin to end+1 in backward
direction.
eg:
l[5:1:-1] --->Return elements from 5th index to 2nd index in backward direction
l[5:1:-2] --->Return elements from 5th index to 2nd index in backward direction,every
second element we have to consider.
l[4:-5:1]--->from 4th index to -6 in forward direction
>>> l[5:1:-1]
[60, 50, 40, 30]
>>>
>>> l[5:1:-2]
[60, 40]
>>> l[4:-5:1]
[]
>>> l[-2:-6:-1]
[60, 50, 40, 30]
>>> l[5:1:-1]
[60, 50, 40, 30]
>>> l[-1::-2]
[70, 50, 30, 10]
>>> l[::]
[10, 20, 30, 40, 50, 60, 70]
>>> l[::-1]
[70, 60, 50, 40, 30, 20, 10]
l[-1:-1:-1]--->Empty
Syntax: a[begin:end:step]
Rules:
------
1. All 3 parameters of slice operator are optional.
2. For begin, end and step, we can take both positive and negative values.
But for step we cannot take zero.
>>> l[1:6:0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: slice step cannot be zero
3. Based on step value, we can decide whether we have to consider elements in either
forward or backward direction.
If step value is positive, We have to consider elements from begin to end-1 in forward
direction.
If step value is negative, We have to consider elements from begin to end+1 in backward
direction.
eg:
>>> a = np.array([10,20,30,40,50,60,70])
>>> a
array([10, 20, 30, 40, 50, 60, 70])
Diagram
>>> a[2:5]
array([30, 40, 50])
>>> a[:5]
Diagram
Syntax:
------
arrayname[row,column]
arrayname[begin:end:step,begin:end:step]
The first slice operator talks about rows and second slice operator talks about columns.
eg-1:
>>> a[:,0:1]
array([[10],
[30],
[50]])
>>> a[:,:1]
array([[10],
[30],
[50]])
Note: If we want a 2-D array as the result, we should use the slice operator for both row and
column. If we use an index directly, we get a 1-D array as the result, not a 2-D array.
>>> a[:,0]
array([10, 30, 50])
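The difference between a slice and a plain index for the column is visible in the shape of the result. The sketch below assumes that a is the 3x2 array implied by the outputs above; the names col_2d and col_1d are chosen here for illustration.

```python
import numpy as np

# assumed: the 3x2 array used in the examples above
a = np.array([[10, 20], [30, 40], [50, 60]])

col_2d = a[:, 0:1]   # slice for the column: result stays 2-D
col_1d = a[:, 0]     # index for the column: that dimension is dropped

print(col_2d.shape)  # (3, 1)
print(col_1d.shape)  # (3,)
```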
eg-2:
>>> a[0:2,:]
array([[10, 20],
[30, 40]])
>>> a[:2,:]
array([[10, 20],
[30, 40]])
eg-3:
>>> a[::2,:]
array([[10, 20],
[50, 60]])
case study:
----------
>>> l = [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]
>>> a = np.array(l)
>>> a
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
Diagram
eg-1:
>>> a[0:2,:]
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
>>> a[:2,:]
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
eg-2:
>>> a[0::3,:]
array([[ 1, 2, 3, 4],
[13, 14, 15, 16]])
>>> a[::3,:]
array([[ 1, 2, 3, 4],
[13, 14, 15, 16]])
eg-3:
>>> a[:3,::2]
array([[ 1, 3],
[ 5, 7],
[ 9, 11]])
eg-4:
>>> a[1:3,1:3]
array([[ 6, 7],
[10, 11]])
eg-5:
>>> a[::3,::3]
array([[ 1, 4],
[13, 16]])
>>> a.shape
(2, 3, 4)
Diagram
Syntax:
a[i,j,k]
i--->For which 2D arrays
j--->In those 2D arrays, which rows
k --->In those 2D arrays, which columns
a[begin:end:step,begin:end:step,begin:end:step]
eg-1:
>>> a[0:1,0:2,0:2]
array([[[1, 2],
[5, 6]]])
>>> a[:1,:2,:2]
array([[[1, 2],
[5, 6]]])
eg-2:
>>> a[:,:,0:1]
array([[[ 1],
[ 5],
[ 9]],
[[13],
[17],
[21]]])
>>> a[:,:,:1]
array([[[ 1],
[ 5],
[ 9]],
[[13],
[17],
[21]]])
eg-3:
>>> a[:,1:,1:3]
array([[[ 6, 7],
[10, 11]],
[[18, 19],
[22, 23]]])
>>> a[:,1:3,1:3]
array([[[ 6, 7],
[10, 11]],
[[18, 19],
[22, 23]]])
eg-4:
>>> a[:,::2,::3]
array([[[ 1, 4],
[ 9, 12]],
[[13, 16],
[21, 24]]])
eg-5:
>>> a[:,::2,::]
array([[[ 1, 2, 3, 4],
[ 9, 10, 11, 12]],
Advanced Indexing:
------------------
By using index we can access only one element at a time.
Syntax: a[i], a[i][j], a[i][j][k]
By using Slice operator, we can access multiple elements but should be in the order.
Syntax:
a[begin:end:step],
a[begin:end:step,begin:end:step],
a[begin:end:step,begin:end:step,begin:end:step ]
eg-1:
>>> a = np.array([10,20,30,40,50,60,70,80,90])
>>> a
array([10, 20, 30, 40, 50, 60, 70, 80, 90])
Diagram
To access 10,30,80:
-------------------
1st way:
------
create nd array with required indexes and then by passing that array as argument we can
access corresponding elements.
2nd way:
-------
create list with required indexes and then by passing that list as argument we can access
corresponding elements.
>>> l = [0,2,7]
>>> a[l]
array([10, 30, 80])
>>> a[[0,2,7]]
array([10, 30, 80])
>>> a[[2,6,-3]]
array([30, 70, 70])
Diagram
Syntax:
a[[row_indexes],[column_indexes]]
eg-1: a[[0,1,2,3],[0,1,2,3]]
The elements present at (0,0), (1,1), (2,2),(3,3) will be selected.
>>> a[[0,1,2,3],[0,1,2,3]]
array([ 1, 6, 11, 16])
eg-2: a[[1,3],[0,2]]
The elements present at (1,0),(3,2) will be selected.
>>> a[[1,3],[0,2]]
array([ 5, 15])
Note: Observation:
------------------
>>> a[[0,2],[0,1,3]]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes
(2,) (3,)
>>> a[[0,2,3],[0,1]]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes
(3,) (2,)
>>> a[[0,2,3],[1]]
array([ 2, 10, 14])
>>> a[[0],[0,1,3]]
array([1, 2, 4])
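The observations above follow from broadcasting: a one-element index list is stretched to match the other list, while lists of lengths 2 and 3 cannot be matched. The sketch below assumes a is the 4x4 array holding 1 to 16 from the case study; the use of np.broadcast_arrays() and all variable names are additions made here for illustration.

```python
import numpy as np

a = np.arange(1, 17).reshape(4, 4)   # assumed: same values as the 4x4 case-study array

rows = np.array([0, 2, 3])
cols = np.array([1])

# the single column index is broadcast to one column index per row index
b_rows, b_cols = np.broadcast_arrays(rows, cols)
picked = a[rows, cols]
same = a[b_rows, b_cols]
print(picked)

# index lists of lengths 2 and 3 cannot be broadcast together
try:
    a[[0, 2], [0, 1, 3]]
    mismatch_error = None
except IndexError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)
```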
Diagram
Syntax:
array[[indexes of 2d array],[row indexes],[column indexes]]
eg-1:
a[[0,1,1],[0,0,2],[0,1,3]]
The elements present at (0,0,0),(1,0,1),(1,2,3) will be selected.
>>> a[[0,1,1],[0,0,2],[0,1,3]]
array([ 1, 14, 24])
eg-2:
a[[0,1],[1,0],[1,3]]
The elements present at (0,1,1),(1,0,3) will be selected.
>>> a[[0,1],[1,0],[1,3]]
array([ 6, 16])
eg-3:
>>> a[[-1,-2],[-3,-1],[-1,-3]]
array([16, 10])
Summary:
-------
1. To access elements from 1-D Array
a[x] ---->x can be nd array or list which contains indexes
Syntax: array[boolean_array]
It selects all elements where boolean array contains True.
eg-1:
>>> a = np.array([10,-5,20,40,-3,-1,75])
>>> a
array([10, -5, 20, 40, -3, -1, 75])
To select all elements which are less than 0, First create boolean array.
>>> boolean_array = a<0
>>> boolean_array
array([False, True, False, False, True, True, False])
Note: We can use this approach for any dimensional array, But output is always 1-D array.
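A short sketch of condition-based selection on both a 1-D and a 2-D array is given below. The 2-D array m and the variable names are made up here for illustration; the point is that the selected elements always come back as a 1-D array.

```python
import numpy as np

a = np.array([10, -5, 20, 40, -3, -1, 75])
negatives = a[a < 0]          # pass the boolean array as the index
print(negatives)

# the same approach on a 2-D array still gives a 1-D result
m = np.array([[1, -2, 3], [-4, 5, -6]])
selected = m[m < 0]
print(selected, selected.ndim)
```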
eg-3:
l = [[[1,2,3,4],[5,6,7,8],[9,10,11,12]], [[13,14,15,16],[17,18,19,20],[21,22,23,24]] ]
a = np.array(l)
eg:
l1=[10,20,30,40]
l2=l1[::]
print(id(l1)) #2793470048384
print(id(l2)) #2793541280832
l2[0]=9999
print(l2) #[9999, 20, 30, 40]
print(l1) #[10, 20, 30, 40]
But in numpy slicing, a new copy won't be created; we just get a view of the
existing nd array. If we perform any changes in the original array, those changes will be
reflected in the view, and vice versa.
>>> a1 = np.array([10,20,30,40,50,60,70])
>>> a2 = a1[0:4]
>>> print(a1)
[10 20 30 40 50 60 70]
>>> print(a2)
[10 20 30 40]
>>> a1[0]=7777
>>> print(a1)
[7777 20 30 40 50 60 70]
>>> print(a2)
[7777 20 30 40]
>>> a2[1]=9999
>>> a2
array([7777, 9999, 30, 40])
>>> a1
array([7777, 9999, 30, 40, 50, 60, 70])
>>> a1 = np.array([10,20,30,40,50,60,70])
>>> a2=a1[[0,2,5]]
>>> print(a1)
[10 20 30 40 50 60 70]
>>> print(a2)
[10 30 60]
>>> a1[0]=8888
>>> print(a1)
[8888 20 30 40 50 60 70]
>>> print(a2)
[10 30 60]
3. Slicing: we won't get a new object, just a view of the original object. If we perform
any changes to the original, those changes will be reflected in the view.
   Advanced indexing: a new separate copy is created. If we perform any changes in one
copy, those changes won't be reflected in the other.
4. Slicing: from a memory and performance point of view, slicing is the best choice.
   Advanced indexing: from a memory and performance point of view, it is not the best choice.
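Whether a result is a view or a copy can also be checked directly instead of by modifying elements. The sketch below uses np.shares_memory() and the base attribute for this; these checks and the variable names are additions made here for illustration.

```python
import numpy as np

a1 = np.array([10, 20, 30, 40, 50, 60, 70])

view = a1[0:4]          # slicing: a view on a1's data
picked = a1[[0, 2, 5]]  # advanced indexing: a separate copy

print(np.shares_memory(a1, view))    # True
print(np.shares_memory(a1, picked))  # False
print(view.base is a1)               # True: the view refers back to a1

a1[0] = 7777
print(view[0], picked[0])            # the view sees the change, the copy does not
```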
D:\durgaclasses>py [Link]
10
20
30
40
50
60
import numpy as np
a = np.array([[10,20,30],[40,50,60]])
print('1-D arrays one by one:')
for x in a:
    print(x)
D:\durgaclasses>py [Link]
1-D arrays one by one:
[10 20 30]
[40 50 60]
Elements one by one:
10
20
30
40
50
60
print(a)
D:\durgaclasses>py [Link]
[[[10 20]
[30 40]]
[[40 50]
[60 70]]]
Elements one by one:
10
20
30
40
40
50
60
70
Note: To iterate over the elements of an n-D array by using plain Python loops, we require n nested loops, which is not
convenient. To overcome this problem we should go for the nditer() function.
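A minimal sketch of that difference, assuming a small 3-D array; the list names nested and flat are hypothetical and exist only to compare both approaches.

```python
import numpy as np

a = np.array([[[10, 20], [30, 40]], [[40, 50], [60, 70]]])

# Plain Python: one loop per dimension (3 loops for a 3-D array).
nested = []
for plane in a:
    for row in plane:
        for value in row:
            nested.append(int(value))

# np.nditer: a single loop, whatever the number of dimensions.
flat = [int(x) for x in np.nditer(a)]

print(nested == flat)  # True
```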
D:\durgaclasses>py [Link]
10
20
30
40
50
60
print(x)
D:\durgaclasses>py [Link]
10
20
30
40
40
50
60
70
D:\durgaclasses>py [Link]
[[10 20 30]
[40 50 60]
[70 80 90]]
10
20
40
50
70
80
import numpy as np
a = np.array([[[10,20],[30,40]],[[40,50],[60,70]]])
for x in np.nditer(a,op_dtypes=['float']):
    print(x)
TypeError: Iterator operand required copying or buffering, but neither copying nor
buffering was enabled
Numpy won't change the type of the elements in the existing array. To store the type-converted
elements, temporary storage is required, which is nothing but a buffer. We have to enable
that buffer.
import numpy as np
a = np.array([[[10,20],[30,40]],[[40,50],[60,70]]])
for x in np.nditer(a,flags=['buffered'],op_dtypes=['float']):
    print(x)
D:\durgaclasses>py [Link]
10.0
20.0
30.0
40.0
40.0
50.0
60.0
70.0
eg-2:
import numpy as np
a = np.array([[[10,20],[30,40]],[[40,50],[60,70]]])
for x in np.nditer(a):
    print(x.dtype) #int32
for x in np.nditer(a,flags=['buffered'],op_dtypes=['int64']):
    print(x.dtype) #int64
D:\durgaclasses>py [Link]
10 element present at index:(0, 0)
20 element present at index:(0, 1)
eg-2:
import numpy as np
a = np.array([[10,20,30],[40,50,60],[70,80,90]])
for idx, element in np.ndenumerate(a):
    print(f'{element+9999} element present at index:{idx}')
D:\durgaclasses>py [Link]
10009 element present at index:(0, 0)
10019 element present at index:(0, 1)
10029 element present at index:(0, 2)
10039 element present at index:(1, 0)
10049 element present at index:(1, 1)
10059 element present at index:(1, 2)
10069 element present at index:(2, 0)
10079 element present at index:(2, 1)
10089 element present at index:(2, 2)
Arithmetic Operators :
----------------------
The following are various arithmetic operators:
+ --->Addition
- --->Subtraction
* --->Multiplication
/ --->Division
// ---->Floor Division
% --->Modulo operation/Remainder Operation
** --->Exponential operation/power operation
Note:
The result of the division operator (/) is always float. The floor division operator (//) can return
either an int or a float value. If both arguments are of type int, then the floor division
operator returns an int value. If at least one argument is of float type, then it returns a float
value.
eg-1:
print(3/2) #1.5
print(3//2) #1
print(3.0//2) #1.0
eg-2:
print(4/2) #2.0
print(4//2) #2
print(4//2.0) #2.0
eg-3:
print(10+2) #12
print(10-2) #8
print(10*2) #20
print(10/2) #5.0
print(10//2) #5
print(10%2) #0
print(10**2) #100
All of these Python arithmetic operators are also applicable to Numpy arrays.
eg-1: 1D array:
-------------
>>> a = np.array([10,20,30,40,50])
>>> a+2
array([12, 22, 32, 42, 52])
>>> a*2
array([ 20, 40, 60, 80, 100])
>>> a-2
array([ 8, 18, 28, 38, 48])
>>> a/2
array([ 5., 10., 15., 20., 25.])
>>> a//2
array([ 5, 10, 15, 20, 25], dtype=int32)
>>> a%2
array([0, 0, 0, 0, 0], dtype=int32)
>>> a**2
array([ 100, 400, 900, 1600, 2500], dtype=int32)
***Note:
In Python, division by zero is not possible. If we try to perform it, we will get
ZeroDivisionError.
But in Numpy we won't get ZeroDivisionError, only a RuntimeWarning. If the result is undefined (0/0) then it is
treated as nan (not a number). If the result is infinite (10/0) then it is treated as inf.
eg:
>>> a = np.arange(6)
>>> a
array([0, 1, 2, 3, 4, 5])
>>> a/0
<stdin>:1: RuntimeWarning: divide by zero encountered in true_divide
<stdin>:1: RuntimeWarning: invalid value encountered in true_divide
array([nan, inf, inf, inf, inf, inf])
>>> b = a/0
>>> b
array([nan, inf, inf, inf, inf, inf])
eg-2:
>>> a/np.inf
array([0., 0., 0., 0., 0., 0.])
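If the warnings are not wanted, one possible approach (an assumption, not something the course text prescribes) is to wrap the division in np.errstate and then locate the nan and inf results with np.isnan and np.isinf. The names nan_mask and inf_mask are illustrative.

```python
import numpy as np

a = np.arange(6)

# np.errstate temporarily silences the divide-by-zero and
# invalid-value warnings inside the with block only.
with np.errstate(divide='ignore', invalid='ignore'):
    b = a / 0

nan_mask = np.isnan(b)  # True where the result was undefined (0/0)
inf_mask = np.isinf(b)  # True where the result was infinite (n/0)

print(b)
print(nan_mask)
print(inf_mask)
```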
eg-2:
>>> a = np.array([100,200,300,400])
>>> b = np.array([10,20,30,40])
>>> a
array([100, 200, 300, 400])
>>> b
array([10, 20, 30, 40])
>>> a+b
Note: To perform arithmetic operations between numpy arrays, both arrays
should have the same dimension, same shape and same size, unless their shapes are compatible under the broadcasting rules; otherwise we will get an error.
eg:
>>> a = np.array([10,20,30])
>>> b = np.array([40,50,60,70])
>>> a+b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,) (4,)
Note:
For every arithmetic operator, the numpy library defines an equivalent function, and we can use
these functions as well.
a+b ----->np.add(a,b)
a-b ----->np.subtract(a,b)
a*b ----->np.multiply(a,b)
a/b ----->np.divide(a,b)
a//b ----->np.floor_divide(a,b)
a%b ----->np.mod(a,b)
a**b ----->np.power(a,b)
eg-1:
>>> a = np.array([10,20,30,40])
>>> b = np.array([1,2,3,4])
>>> np.add(a,b)
array([11, 22, 33, 44])
>>> np.subtract(a,b)
array([ 9, 18, 27, 36])
>>> np.multiply(a,b)
array([ 10, 40, 90, 160])
>>> np.divide(a,b)
array([10., 10., 10., 10.])
>>> np.mod(a,b)
array([0, 0, 0, 0], dtype=int32)
>>> np.power(a,b)
array([ 10, 400, 27000, 2560000], dtype=int32)
>>> np.floor_divide(a,b)
array([10, 10, 10, 10], dtype=int32)
>>> np.add(a,1)
array([11, 21, 31, 41])
Note: To use these functions as well, both arrays should be of the same dimension, same shape
and same size, or their shapes should be compatible for broadcasting.
Note: Functions which operate element by element on whole arrays are called
universal functions (ufunc). All the above functions are universal functions.
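A short sketch to confirm this: each of the functions listed above is an instance of np.ufunc, and an operator gives the same result as its function form. The tuple funcs and the flag all_ufuncs are illustrative names.

```python
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.array([1, 2, 3, 4])

funcs = (np.add, np.subtract, np.multiply, np.divide,
         np.floor_divide, np.mod, np.power)

# Every arithmetic function from the table is a universal function.
all_ufuncs = all(isinstance(f, np.ufunc) for f in funcs)
print(all_ufuncs)

# The operator and the function form produce the same array.
same_result = np.array_equal(a + b, np.add(a, b))
print(same_result)
```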
Broadcasting:
-------------
Even though the arrays are of different dimensions, different shapes and different sizes, we can still
perform arithmetic operations between them. This is possible through broadcasting.
Broadcasting is not possible in all cases. It has some rules. Only if the rules are satisfied
is broadcasting performed internally while performing arithmetic operations.
Note: If both arrays have the same dimension, same shape and same size, then broadcasting is
not required. Broadcasting is required only when the dimensions, shapes or sizes are
different.
Rules of Broadcasting:
----------------------
Rule-1. If the two arrays have different numbers of dimensions, numpy will make the dimensions equal:
the shape of the array with fewer dimensions is padded with 1's on the left side.
eg:
a: (4,3)
b: (3,)
a is 2-D array where as b is 1-D array. Broadcasting will make these two arrays as 2-D
arrays.
a: (4,3)
b: (1,3)
Rule-2:
If the sizes of the 2 arrays do not match in any dimension, the array with size equal to 1 in
that dimension is expanded to match the size of the other array in the same dimension.
Before:
a: (4,3)
b: (1,3)
After:
a: (4,3)
b: (4,3)
Now both arrays have the same dimension and the same shape, so the arithmetic operation will
be performed normally, element-wise.
Note:
1. If in any dimension the sizes do not match and neither is equal to 1, then we will get an
error.
eg:
a: (4,3)
b: (2,3)
Rule 2 fails, and hence we will get ValueError and broadcasting won't happen.
Note:
Rule-1: to make dimensions same
Rule-2: To make same sizes in every dimension
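The two rules can be sketched as a small helper. broadcast_shape is a hypothetical function written only for illustration (it is not part of numpy); its result is compared against the shape that numpy itself reports through np.broadcast.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Apply Rule-1 and Rule-2 to two shapes (hypothetical helper)."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule-1: pad the shorter shape with 1's on the left side.
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for size_a, size_b in zip(padded_a, padded_b):
        # Rule-2: sizes must match, or one of them must be 1.
        if size_a == size_b or size_a == 1 or size_b == 1:
            result.append(size_b if size_a == 1 else size_a)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))                         # (4, 3)
print(np.broadcast(np.empty((4, 3)), np.empty((3,))).shape)  # (4, 3)
```

Calling broadcast_shape((4, 3), (2, 3)) raises ValueError, which matches the failing Rule-2 case described above.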
eg-1:
a = np.array([10,20,30,40])
b = np.array([1,2,3])
a: (4,)
b: (3,)
The sizes do not match and neither is equal to 1, so we will get an error.
>>> a = np.array([10,20,30,40])
>>> b = np.array([1,2,3])
>>> a.shape
(4,)
>>> b.shape
(3,)
>>> a+b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (4,) (3,)
eg-2:
a = np.array([10,20,30])
b = np.array([40])
After broadcasting:
a = np.array([10,20,30])
b = np.array([40,40,40])
a+b will be performed
>>> a = np.array([10,20,30])
>>> b = np.array([40])
>>> a.shape
(3,)
>>> b.shape
(1,)
>>> a+b
array([50, 60, 70])
eg-3:
a = np.array([[10,20],[30,40],[50,60]])
b = np.array([10,20])
Rule-1:
Before: a's shape:(3,2) and b's shape:(2,)
After: a's shape:(3,2) and b's shape:(1,2)
Rule-2:
Before: a's shape:(3,2) and b's shape:(1,2)
After: a's shape:(3,2) and b's shape:(3,2)
a = np.array([[10,20],[30,40],[50,60]])
b = np.array([[10,20],[10,20],[10,20]])
>>> a = np.array([[10,20],[30,40],[50,60]])
>>> b = np.array([10,20])
>>> a
array([[10, 20],
[30, 40],
[50, 60]])
>>> b
array([10, 20])
>>> a+b
array([[20, 40],
[40, 60],
[60, 80]])
eg-4:
a = np.array([[10],[20],[30]])
b = np.array([10,20,30])
a+b
Rule-1:
a:(3,1)
b:(1,3)
Rule-2:
a:(3,3)
b:(3,3)
Note:
1. In broadcasting, if rows need to be increased, the existing row is repeated.
2. In broadcasting, if columns need to be increased, the existing column is repeated.
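The repetition described in this note can be made visible with np.broadcast_to, which returns the expanded form of an array. This is only a sketch; numpy does not need an explicit call to it when evaluating a+b. The names a_expanded and b_expanded are illustrative.

```python
import numpy as np

a = np.array([[10], [20], [30]])  # shape (3, 1)
b = np.array([10, 20, 30])        # shape (3,)

a_expanded = np.broadcast_to(a, (3, 3))  # the existing column is repeated
b_expanded = np.broadcast_to(b, (3, 3))  # the existing row is repeated

print(a_expanded)
print(b_expanded)
print(a + b)
```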
Even broadcasting also won't work in this case. Hence we will get error.
Syntax:
reshape(a, newshape, order='C')
Gives a new shape to an array without changing its data.
newshape : int or tuple of ints
Here we didn't specify order and hence default value 'C' is considered, which is meant for
row major order.
eg-1:
>>> c = np.reshape(a,(5,2),order='F')
>>> c.shape
(5, 2)
>>> a
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> c
array([[ 1, 6],
[ 2, 7],
[ 3, 8],
[ 4, 9],
[ 5, 10]])
Conclusion-1:
------------
While using the reshape() function, make sure the sizes match; otherwise we will
get an error.
>>> a
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> b = np.reshape(a,(5,3))
ValueError: cannot reshape array of size 10 into shape (5,3)
Conclusion-2:
-------------
reshape() normally won't provide a separate new array object; whenever the memory layout allows it, it provides just a view of the
existing array, because there is no change in data.
Hence if we perform any change in the reshaped array, it will be reflected in the original
array and vice versa.
>>> a = np.arange(1,16)
>>> a
Note: The ndarray class also contains a reshape() method, and hence we can call this method on
any ndarray object.
numpy--->module
ndarray ---->class present in numpy module
reshape()-->method present in ndarray class
reshape()--->function present in numpy module.
eg-1:
>>> a = np.arange(1,25)
>>> a
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24])
>>> b = a.reshape((2,3,4))
>>> b
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]],
>>> c
array([[[ 1, 7, 13, 19],
[ 3, 9, 15, 21],
[ 5, 11, 17, 23]],
[[ 2, 8, 14, 20],
[ 4, 10, 16, 22],
[ 6, 12, 18, 24]]])
From Documentation:
One shape dimension can be -1. In this case, the value is inferred by numpy from the length of the
array and the remaining dimensions.
b = a.reshape((5,-1)) #valid
b = a.reshape((-1,5)) #valid
b = a.reshape((-1,-1)) #invalid
eg-1:
>>> a = np.arange(1,11)
>>> a
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> b = a.reshape((5,-1))
>>> b.shape
(5, 2)
eg-2:
>>> b = a.reshape((-1,5))
>>> b.shape
(2, 5)
eg-3:
>>> b = a.reshape((-1,3))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: cannot reshape array of size 10 into shape (3)
eg-4:
>>> b = a.reshape((-1,-1))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: can only specify one unknown dimension
eg-5:
>>> a = np.arange(1,13)
>>> a
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
>>> b = a.reshape((2,2,-1))
>>> b.shape
(2, 2, 3)
>>> b = a.reshape((3,-1,2))
>>> b.shape
(3, 2, 2)
Conclusions of reshape():
-------------------------
1. It reshapes an array without changing its data.
2. The sizes must match.
3. We can use it as a numpy library function or as an ndarray class method.
4. It normally won't create a new array object; we just get a view (a copy is made only when a view is not possible).
5. We can use -1 for an unknown dimension, but only once.
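A hedged sketch of the view behaviour, using np.shares_memory as the check. The second half shows a case where, as far as numpy's documented behaviour goes, a view is not possible (reshaping a transposed, non-contiguous array), so reshape() falls back to a copy. The names t and c are illustrative.

```python
import numpy as np

a = np.arange(1, 7)            # contiguous 1-D array
b = a.reshape((2, 3))          # common case: a view
print(np.shares_memory(a, b))  # True

t = b.T                        # transposed view, no longer C-contiguous
c = t.reshape(6)               # a view is not possible here, so a copy is made
print(c)                       # [1 4 2 5 3 6]
print(np.shares_memory(a, c))  # False
```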
eg-1:
>>> a = np.arange(1,6)
>>> a
array([1, 2, 3, 4, 5])
>>> b = np.resize(a,(2,4))
>>> b
array([[1, 2, 3, 4],
[5, 1, 2, 3]])
>>> a
array([1, 2, 3, 4, 5])
eg-2:
>>> c = np.resize(a,(2,2))
>>> c
array([[1, 2],
[3, 4]])
Conclusions:
1. A separate new array object will be created.
2. The data may change (elements can be repeated or dropped).
3. The sizes need not be the same.
4. We cannot use -1 for an unknown dimension.
>>> c = np.resize(a,(2,-1))
ValueError: all elements of 'new_shape' must be non-negative
***Note:
If we use the ndarray class resize() method, the modification happens in place, i.e. the
existing array itself is modified. If new_shape requires more elements, then the extra elements are
filled with zeros.
eg-1:
>>> m = np.arange(1,6)
>>> m.resize((4,2))
>>> m
array([[1, 2],
[3, 4],
[5, 0],
[0, 0]])
-------------------------------------------------------
table: np.resize() | ndarray.resize()
3. np.resize(): If the new_shape requires more elements, then repeated copies of the original array will be
used.
3. ndarray.resize(): If the new_shape requires more elements, then the extra elements are filled with zeros.
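The difference between the two can be sketched side by side. The refcheck=False argument is an assumption added here so that the in-place resize also runs in environments that hold extra references to the array (for example interactive shells); the course example calls m.resize((4,2)) without it.

```python
import numpy as np

a = np.arange(1, 6)
b = np.resize(a, (2, 4))  # new array, filled with repeated copies of a
print(b)                  # [[1 2 3 4]
                          #  [5 1 2 3]]
print(a)                  # [1 2 3 4 5], the original is unchanged

m = np.arange(1, 6)
# In-place resize: m itself is modified and padded with zeros.
m.resize((4, 2), refcheck=False)
print(m)
```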
2. It won't create new array object and just we will get view of existing array.
If we perform any changes in the reshaped copy, automatically those changes will be
reflected in original copy.
Syntax:
ndarray.flatten(order='C')
It will create a new 1-Dimensional array with the elements of the given n-Dimensional array, i.e.
it returns a copy of the array collapsed into one dimension.
>>> c
array([ 1, 3, 5, 7, 9, 2, 4, 6, 8, 10])
[[ 7, 8],
[ 9, 10],
[11, 12]],
[[13, 14],
[15, 16],
[17, 18]]])
>>> b = a.flatten()
>>> b
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18])
>>> c = a.flatten(order='F')
>>> c
array([ 1, 7, 13, 3, 9, 15, 5, 11, 17, 2, 8, 14, 4, 10, 16, 6, 12,
18])
Note:
1. flatten() will always create a new array object. Hence if we perform
any changes on the flattened copy, those changes won't be reflected in the original array.
eg:
>>> a = np.arange(1,7).reshape(3,2)
>>> a
array([[1, 2],
[3, 4],
[5, 6]])
>>> b = a.flatten()
>>> b
array([1, 2, 3, 4, 5, 6])
>>> b[0]=777
>>> b
array([777, 2, 3, 4, 5, 6])
>>> a
array([[1, 2],
[3, 4],
[5, 6]])
2. flatten() always returns a 1-D array.
===================================================
flat variable:
--------------
It is a 1-D iterator over the array.
This is a 'numpy.flatiter' instance.
eg:
>>> a = np.arange(1,7).reshape(3,2)
>>> a
array([[1, 2],
[3, 4],
[5, 6]])
>>> a.flat
Syntax:
np.ravel(a, order='C')
eg-1:
>>> a = np.arange(1,7).reshape(3,2)
>>> b = np.ravel(a)
>>> a
array([[1, 2],
[3, 4],
[5, 6]])
>>> b
array([1, 2, 3, 4, 5, 6])
>>> b[0]=7777
>>> b
array([7777, 2, 3, 4, 5, 6])
>>> a
array([[7777, 2],
[ 3, 4],
[ 5, 6]])
Note: For the numpy library ravel() function, the ndarray class contains an equivalent ravel() method.
We can use either one; the functionality is exactly the same.
eg:
>>> a = np.arange(1,19).reshape(2,3,3)
>>> a
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
1. flatten(): It can be used to flatten an n-D array to a 1-D array, and a separate 1-D array object will be
created.
1. ravel(): It can be used to flatten an n-D array to a 1-D array, but whenever possible it won't create a new 1-D array object
and we will get only a view.
2. flatten(): If we perform any changes in the flatten() copy, then those changes won't be reflected
in the original array.
2. ravel(): If we perform any changes in the ravel() result (when it is a view), then those changes will be reflected in
the original array.
3. flatten() operates slower than ravel(), as it is required to create a new array
object.
3. ravel() operates faster than flatten() when it is not required to create a new array
object and it just returns a view.
4. flatten() is not a numpy library level function; it is a method present in the ndarray class.
a.flatten() --->valid
np.flatten(a) --->invalid
4. ravel() is both a numpy library level function and an ndarray class method.
a.ravel() --->valid
np.ravel(a) --->valid
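A minimal sketch of the copy/view difference for a contiguous array (the case in which ravel() can return a view). The names f and r are illustrative.

```python
import numpy as np

a = np.arange(1, 7).reshape(3, 2)

f = a.flatten()  # always a separate copy
r = a.ravel()    # a view here, because a is contiguous

f[0] = 777
print(a[0, 0])   # 1, the original is not affected by the copy

r[0] = 7777
print(a[0, 0])   # 7777, the change through the view is reflected
```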
Syntax:
-------
np.transpose(a,axes=None)
Reverse or permute the axes of an array; returns the modified array (a view, not a copy).
For an array a with two axes, transpose(a) gives the matrix transpose.
If we do not provide the axes argument, the dimensions are simply reversed.
In the transpose() operation, only the dimensions are interchanged, not the content. Hence it is
not required to create a new array, and it returns a view of the existing array.
eg:
(2,3)--->(3,2)
(2,3,4)-->(4,3,2),(2,4,3),(3,2,4),(3,4,2)
(2,3,4):
2--->2 2-Dimensional arrays
3 --->In every 2-D array, 3 rows
4 --->In every 2-D array, 4 columns
24--->Total number of elements
In reshape we can change the size of a dimension, but the total size should be the same.
eg: input: (3,4)
output: (4,3),(2,6),(6,2),(1,12),(12,1),(2,2,3),(3,2,2)
But in transpose we are just interchanging the dimensions; we won't change the size of
any dimension.
In a 2-D array, because of transpose, rows will become columns and columns will become
rows.
(3,2) --->(2,3)
In transpose, we will get only a view.
>>> b[0][0]=7777
>>> b
array([[7777, 3, 5],
[ 2, 4, 6]])
>>> a
array([[7777, 2],
[ 3, 4],
[ 5, 6]])
2--->2-D arrays
3--->3 rows in every 2-D array
4--->4 columns in every 2-D array
If we perform the transpose operation without the axes argument, then the dimensions will be simply
reversed.
>>> b = np.transpose(a)
>>> b
array([[[ 1, 13],
[ 5, 17],
[ 9, 21]],
[[ 2, 14],
[ 6, 18],
[10, 22]],
[[ 3, 15],
[ 7, 19],
[11, 23]],
[[ 4, 16],
[ 8, 20],
[12, 24]]])
>>> a = np.arange(1,6)
>>> a
array([1, 2, 3, 4, 5])
>>> b=np.transpose(a)
>>> b
array([1, 2, 3, 4, 5])
axes parameter:
---------------
If we do not use the axes parameter, the dimensions are simply reversed.
The axes parameter describes the order in which the axes are to be taken.
It is very helpful for 3-D and 4-D arrays.
>>> a = np.arange(1,13).reshape(3,4)
>>> a
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
>>> b = np.transpose(a,axes=(0,1))
>>> b
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
input: (3,4)
The size of axis-0 is 3.
>>> a
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
>>> np.transpose(a,axes=(1,0))
array([[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11],
[ 4, 8, 12]])
eg-2: for 3-D array:
---------------------
input: (2,3,4)
The size of axis-0 is:2
The size of axis-1 is:3
The size of axis-2 is:4
>>> a = np.arange(1,25).reshape(2,3,4)
>>> a
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]],
b = np.transpose(a,axes=(2,0,1))
The shape of b: (4,2,3)
4 --->2-D arrays
2---->2 rows in every 2-D array
3---->3 columns in every 2-D array
>>> b = np.transpose(a,axes=(2,0,1))
>>> b
array([[[ 1, 5, 9],
[13, 17, 21]],
[[ 2, 6, 10],
[14, 18, 22]],
[[ 3, 7, 11],
[15, 19, 23]],
[[ 4, 8, 12],
[16, 20, 24]]])
input: (2,3,4)
The size of axis-0 is:2
The size of axis-1 is:3
The size of axis-2 is:4
b = np.transpose(a,axes=(1,2,0))
The shape of b:(3,4,2)
>>> b = np.transpose(a,axes=(1,2,0))
>>> b
array([[[ 1, 13],
[ 2, 14],
[ 3, 15],
[ 4, 16]],
[[ 5, 17],
[ 6, 18],
[ 7, 19],
[ 8, 20]],
[[ 9, 21],
[10, 22],
[11, 23],
[12, 24]]])
***Note: If we repeat the same axis multiple times, then we will get an error.
>>> b = np.transpose(a,axes=(2,2,1))
ValueError: repeated axis in transpose
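How the axes argument determines the result can be sketched as follows: axis i of the result is axis axes[i] of the input. The names axes and expected_shape are illustrative.

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)
axes = (2, 0, 1)
b = np.transpose(a, axes=axes)

# Axis i of the result is axis axes[i] of the input.
expected_shape = tuple(a.shape[ax] for ax in axes)
print(b.shape, expected_shape)  # (4, 2, 3) (4, 2, 3)

# For axes=(2, 0, 1), element b[i, j, k] is element a[j, k, i].
print(b[3, 1, 2] == a[1, 2, 3])  # True

# Only the axes are permuted; the data is shared with a.
print(np.shares_memory(a, b))    # True
```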
Syntax:
-------
ndarray.transpose(*axes)
axes : None, tuple of ints, or 'n' ints
eg:
>>> a = np.arange(1,25).reshape(2,3,4)
>>> b = a.transpose()
>>> b.shape
(4, 3, 2)
>>> b = a.transpose((2,0,1))
>>> b.shape
(4, 2, 3)
>>> a = np.arange(1,25).reshape(2,3,4)
>>> b=a.T
>>> b.shape
(4, 3, 2)
Syntax:
np.swapaxes(a, axis1, axis2)
Interchange two axes of an array.
Parameters
----------
a : array_like
Input array.
axis1 : int
First axis.
axis2 : int
Second axis.
eg-1:
>>> a = np.arange(1,7).reshape(3,2)
>>> a
array([[1, 2],
[3, 4],
[5, 6]])
>>> b = np.swapaxes(a,0,1)
>>> b.shape
(2, 3)
>>> b
array([[1, 3, 5],
[2, 4, 6]])
Q. What is the difference between the following two lines for a 2-D array a?
b = np.swapaxes(a,0,1)
b = np.swapaxes(a,1,0)
There is no difference between these lines; both just interchange rows and columns.
eg-2:
input: 3-D array of (2,3,4)
b = np.swapaxes(a,0,2)
b.shape --->(4,3,2)
b = np.swapaxes(a,1,0)
b.shape --->(3,2,4)
>>> a = np.arange(1,25).reshape(2,3,4)
>>> b = np.swapaxes(a,0,2)
>>> b.shape
(4, 3, 2)
>>> b = np.swapaxes(a,0,1)
>>> b.shape
(3, 2, 4)
ndarray.swapaxes():
--------------------
The ndarray class also contains a swapaxes() method, which is exactly the same as the numpy library
swapaxes() function.
ndarray.swapaxes(axis1, axis2)
Return a view of the array with 'axis1' and 'axis2' interchanged.
numpy.swapaxes : equivalent function
eg:
>>> a = np.arange(1,25).reshape(2,3,4)
>>> b = a.swapaxes(1,2)
>>> b.shape
(2, 4, 3)
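As a sketch of how swapaxes() relates to transpose(): swapping axis-1 and axis-2 of a 3-D array gives the same result as a transpose with the axes order (0, 2, 1). The name c is illustrative.

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)

b = a.swapaxes(1, 2)      # interchange axis-1 and axis-2
c = a.transpose(0, 2, 1)  # the same permutation written as a transpose

print(b.shape)               # (2, 4, 3)
print(np.array_equal(b, c))  # True
```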
We can join/concatenate multiple ndarrays into a single array by using the following
functions.
1. concatenate()
2. stack()
3. vstack()
4. hstack()
5. dstack()
Rules:
1. We can join any number of arrays, but all arrays should have the same number of dimensions.
2. The sizes of all axes, except the concatenation axis, should be the same.
3. The shapes of the resultant array and the out array must be the same.
Note: The shape of the out parameter and the shape of the resultant array must match;
otherwise we will get an error.
>>> d = np.arange(10)
>>> np.concatenate((a,b,c),out=d)
ValueError: Output array is the wrong shape
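A minimal sketch of a working out argument, assuming three hypothetical 1-D input arrays (the arrays used in the session above are not shown in the text). The out array is created with exactly the size of the concatenated result.

```python
import numpy as np

# Hypothetical inputs, chosen only for illustration.
a = np.arange(4)
b = np.arange(5)
c = np.arange(3)

# The out array must have the shape of the result: 4 + 5 + 3 = 12 elements.
out = np.empty(a.size + b.size + c.size, dtype=a.dtype)
np.concatenate((a, b, c), out=out)

print(out.shape)  # (12,)
print(out)
```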
[ 70, 80],
[ 90, 100]])
>>> np.concatenate((a,b),axis=0)
array([[ 10, 20],
[ 30, 40],
[ 50, 60],
[ 70, 80],
[ 90, 100]])
>>> np.concatenate((a,b),axis=1)
ValueError: all the input array dimensions for the concatenation axis must match exactly,
but along dimension 0, the array at index 0 has size 3 and the array at index 1 has size 2
Note:
np.concatenate((a,b))--->valid
np.concatenate((a,b),axis=0)--->valid
np.concatenate((a,b),axis=1)--->invalid
np.concatenate((a.T,b),axis=1)--->valid
a.T --->(2,3)
b----> (2,2)
>>> np.concatenate((a.T,b),axis=1)
array([[ 10, 30, 50, 70, 80],
[ 20, 40, 60, 90, 100]])
eg-4:
>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6]])
>>> np.concatenate((a, b), axis=0)
array([[1, 2],
[3, 4],
[5, 6]])
>>> np.concatenate((a, b.T), axis=1)
array([[1, 2, 5],
[3, 4, 6]])
Note:
If axis = None, then the arrays will be flattened (converted to 1-D arrays) and then
concatenated.
>>> a
array([[10, 20],
[30, 40],
[50, 60]])
>>> b
array([[ 70, 80],
[ 90, 100]])
>>> np.concatenate((a,b),axis=None)
array([ 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
>>> a = np.arange(12).reshape(2,3,2)
>>> a
array([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]]])
>>> b = np.arange(18).reshape(2,3,3)
>>> b
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]]])
>>> c = np.concatenate((a,b),axis=2)
>>> c.shape
(2, 3, 5)
>>> c
array([[[ 0, 1, 0, 1, 2],
[ 2, 3, 3, 4, 5],
[ 4, 5, 6, 7, 8]],
[[ 6, 7, 9, 10, 11],
[ 8, 9, 12, 13, 14],
[10, 11, 15, 16, 17]]])
>>> np.concatenate((a,b),axis=None)
Q. Sir, if we have 2 arrays of shapes (2,3,3) and (1,3,3), then how will the arrays be concatenated along axis
0? (diagrammatically)
The 'axis' parameter specifies the index of the new axis in the dimensions of the result.
For example, if axis=0 it will be the first dimension and if axis=-1 it will be the last
dimension.
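The effect of the axis parameter of stack() on the result shape can be sketched with two 1-D arrays of the same size; the sample values are illustrative only.

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([40, 50, 60])

first = np.stack((a, b), axis=0)   # new axis is the first dimension
last = np.stack((a, b), axis=-1)   # new axis is the last dimension

print(first.shape)  # (2, 3)
print(last.shape)   # (3, 2)
print(np.array_equal(last, np.stack((a, b), axis=1)))  # True for 1-D inputs
```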
>>> a = np.array([10,20,30])
>>> b = np.array([40,50,60,70])
>>> np.stack((a,b))
ValueError: all input arrays must have the same shape
>>> np.stack((a,b),axis=0)
In 2-D array, stack row-wise all input arrays.
Read row wise from input arrays and arrange column wise in result array.
>>> np.stack((a,b),axis=1)
array([[10, 40],
[20, 50],
[30, 60]])
>>> np.stack((a,b),axis=-1)
array([[10, 40],
[20, 50],
[30, 60]])
[[ 7, 8, 9],
[[ 4, 5, 6],
[10, 11, 12]]])
>>> np.stack((a,b),axis=2)
array([[[ 1, 7],
[ 2, 8],
[ 3, 9]],
[[ 4, 10],
[ 5, 11],
[ 6, 12]]])
>>> a = np.arange(1,10).reshape(3,3)
>>> b = np.arange(10,19).reshape(3,3)
>>> a
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> b
array([[10, 11, 12],
[13, 14, 15],
[16, 17, 18]])
[[ 4, 5, 6],
[13, 14, 15]],
[[ 7, 8, 9],
[[ 4, 13],
[ 5, 14],
[ 6, 15]],
[[ 7, 16],
[ 8, 17],
[ 9, 18]]])
>>> np.stack((a,b,c))
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> np.stack((a,b,c),axis=1)
array([[ 0, 4, 8],
[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11]])
eg-5: stacking of three 2-D arrays:
-----------------------------------
>>> a = np.arange(4).reshape(2,2)
>>> b = np.arange(4,8).reshape(2,2)
>>> c = np.arange(8,12).reshape(2,2)
>>> a
array([[0, 1],
[2, 3]])
>>> b
array([[4, 5],
[6, 7]])
>>> c
array([[ 8, 9],
[10, 11]])
>>> np.stack((a,b,c),axis=0)
array([[[ 0, 1],
[ 2, 3]],
[[ 4, 5],
[ 6, 7]],
[[ 8, 9],
[10, 11]]])
>>> np.stack((a,b,c),axis=1)
array([[[ 0, 1],
[ 4, 5],
[ 8, 9]],
[[ 2, 3],
[ 6, 7],
[10, 11]]])
>>> np.stack((a,b,c),axis=2)
array([[[ 0, 4, 8],
[ 1, 5, 9]],
[[ 2, 6, 10],
[ 3, 7, 11]]])
vstack(tuple of arrays)
Stack arrays in sequence vertically (row wise).
Rules:
1. The input arrays must have the same shape along all axes except the first axis (axis-0).
2. 1-D arrays must have the same size.
3. The array formed by stacking the given arrays will be at least 2-D.
4. vstack() operation is equivalent to concatenation along the first axis after 1-D arrays of
shape (N,) have been reshaped to (1,N).
5. For 2-D or more dimension arrays, vstack() simply acts as concatenation wrt axis-0.
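Rule 4 can be checked with a short sketch: for 1-D inputs, vstack() gives the same result as reshaping each array to (1,N) and concatenating along axis-0. The names v and c are illustrative.

```python
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.array([50, 60, 70, 80])

v = np.vstack((a, b))

# Equivalent form: reshape (N,) to (1, N), then concatenate along axis-0.
c = np.concatenate((a.reshape(1, -1), b.reshape(1, -1)), axis=0)

print(v.shape)               # (2, 4)
print(np.array_equal(v, c))  # True
```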
>>> a = np.array([10,20,30,40])
>>> b = np.array([50,60,70,80])
>>> np.vstack((a,b))
array([[10, 20, 30, 40],
[50, 60, 70, 80]])
>>> a = np.array([10,20,30,40])
>>> b = np.array([50,60,70,80,90,100])
>>> np.vstack((a,b))
ValueError: all the input array dimensions for the concatenation axis must match exactly,
but along dimension 1, the array at index 0 has size 4 and the array at index 1 has size 6
Note: To use the vstack() function for 1-D arrays, the input arrays must be of the same
size/shape.
>>> a
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]],
Syntax:
hstack(tuple of arrays)
Stack arrays in sequence horizontally (column wise).
Rules:
1. This is equivalent to concatenation along the second axis, except for 1-D
arrays where it concatenates along the first axis.
2. All input arrays must have the same number of dimensions.
3. Except for axis-1, all remaining sizes must be equal.
>>> a = np.arange(1,7).reshape(3,2)
>>> b = np.arange(7,16).reshape(3,3)
>>> np.hstack((a,b))
array([[ 1, 2, 7, 8, 9],
[ 3, 4, 10, 11, 12],
[ 5, 6, 13, 14, 15]])
Syntax:
dstack(tuple of input arrays)
Stack arrays in sequence depth wise (along third axis).
Rules:
1. This is equivalent to concatenation along the third axis after 2-D arrays
of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of shape
(N,) have been reshaped to (1,N,1).
2. The arrays must have the same shape along all but the third axis.
1-D or 2-D arrays must have the same shape.
3. The array formed by stacking the given arrays, will be at least 3-D.
eg-1 :
>>> a = np.array([1,2,3])
>>> b = np.array([2,3,4])
>>> np.dstack((a,b))
array([[[1, 2],
[2, 3],
[3, 4]]])
eg-2:
>>> a = np.array([[1],[2],[3]])
>>> b = np.array([[2],[3],[4]])
>>> np.dstack((a,b))
array([[[1, 2]],
[[2, 3]],
[[3, 4]]])
Table: Summary of joining of nd arrays:
---------------------------------
1. concatenate() ---> Join a sequence of arrays along an existing axis.
2. stack()--->Join a sequence of arrays along a new axis.
3. vstack() --->Stack arrays in sequence vertically according to first axis (axis-0).
4. hstack()--->Stack arrays in sequence horizontally according to second axis(axis-1).
5. dstack()---> Stack arrays in sequence depth wise according to third axis(axis-2).
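The summary can be sketched by applying all five functions to the same pair of (2,3) arrays and comparing the result shapes. The sample arrays and the dictionary name shapes are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(6, 12).reshape(2, 3)

shapes = {
    'concatenate': np.concatenate((a, b)).shape,  # joined along existing axis-0
    'stack': np.stack((a, b)).shape,              # joined along a new axis-0
    'vstack': np.vstack((a, b)).shape,            # axis-0
    'hstack': np.hstack((a, b)).shape,            # axis-1
    'dstack': np.dstack((a, b)).shape,            # axis-2
}
for name, shape in shapes.items():
    print(name, shape)
```

Only stack() and dstack() add a dimension here; the other three keep the result 2-D.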
Numpy--->ndarrays
Pandas--->Series, DataFrame
Scipy
Splitting of arrays:
--------------------
We can split an array into multiple subarrays, which are returned as views.
1. split()
2. vsplit()
3. hsplit()
4. dsplit()
5. array_split()
Note: If dividing the array into the specified number of equal sections is not possible, then we will
get an error.
>>> a
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.split(a,4)
ValueError: array split does not result in an equal division
Dividing 9 elements into 4 equal parts is not possible and hence we got an error.
But if we specify axis=1, then splitting happens based on columns (horizontal
splitting).
>>> a = np.arange(1,25).reshape(6,4)
>>> a
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
>>> np.split(a,3,axis=0)
[array([[1, 2, 3, 4],
[5, 6, 7, 8]]), array([[ 9, 10, 11, 12],
[13, 14, 15, 16]]), array([[17, 18, 19, 20],
[21, 22, 23, 24]])]
>>> np.split(a,6)
[array([[1, 2, 3, 4]]), array([[5, 6, 7, 8]]), array([[ 9, 10, 11, 12]]), array([[13, 14, 15, 16]]),
array([[17, 18, 19, 20]]), array([[21, 22, 23, 24]])]
eg-2:
>>> a = np.arange(10,101,10)
>>> a
array([ 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
>>> np.split(a,[2,5,7])
[array([10, 20]), array([30, 40, 50]), array([60, 70]), array([ 80, 90, 100])]
eg-2:
>>> a = np.arange(1,13).reshape(6,2)
>>> a
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10],
[11, 12]])
>>> np.split(a,[3,4])
[array([[1, 2],
[3, 4],
[5, 6]]), array([[7, 8]]), array([[ 9, 10],
[11, 12]])]
eg-3:
>>> np.split(a,[1,3,4])
[array([[1, 2]]), array([[3, 4],
[5, 6]]), array([[7, 8]]), array([[ 9, 10],
[11, 12]])]
eg-4:
>>> a = np.arange(1,19).reshape(3,6)
>>> a
array([[ 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12],
[13, 14, 15, 16, 17, 18]])
>>> np.split(a,[1,3,5],axis=1)
[array([[ 1],
[ 7],
[13]]), array([[ 2, 3],
[ 8, 9],
[14, 15]]), array([[ 4, 5],
[10, 11],
[16, 17]]), array([[ 6],
[12],
[18]])]
eg-5:
>>> np.split(a,[2,4,4],axis=1)
[array([[ 1, 2],
[ 7, 8],
[13, 14]]), array([[ 3, 4],
[ 9, 10],
[15, 16]]), array([], shape=(3, 0), dtype=int32), array([[ 5, 6],
[11, 12],
[17, 18]])]
eg-6:
>>> np.split(a,[0,2,6],axis=1)
[array([], shape=(3, 0), dtype=int32), array([[ 1, 2],
[ 7, 8],
[13, 14]]), array([[ 3, 4, 5, 6],
[ 9, 10, 11, 12],
[15, 16, 17, 18]]), array([], shape=(3, 0), dtype=int32)]
splitting by vsplit()
---------------------
If we want to split based on the first axis (axis-0), then we should go for vsplit().
vsplit--->vertical split (row wise)
Syntax:
vsplit(array, indices_or_sections)
Split an array into multiple sub-arrays vertically (row-wise).
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.vsplit(a,2)
ValueError: vsplit only works on arrays of 2 or more dimensions
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10],
[11, 12]])
>>> np.vsplit(a,3)
[array([[1, 2],
[3, 4]]), array([[5, 6],
[7, 8]]), array([[ 9, 10],
[11, 12]])]
splitting by hsplit():
----------------------
Syntax:
------
hsplit(array,indices_or_sections)
Split an array into multiple sub-arrays horizontally (column-wise).
>>> np.hsplit(a,2)
[array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9])]
[11, 12]])]
eg-3: Based on indices:
----------------------
>>> a = np.arange(10,101,10)
>>> a
array([ 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
>>> np.hsplit(a,[2,4,7])
[array([10, 20]), array([30, 40]), array([50, 60, 70]), array([ 80, 90, 100])]
splitting by dsplit():
----------------------
dsplit --->means depth split
splitting based on the 3rd axis (axis-2).
Syntax:
dsplit(array, indices_or_sections)
--->Split an array into multiple sub-arrays along the 3rd axis (depth).
--->'dsplit' is equivalent to 'split' with 'axis=2'; the array is always split along the third
axis, provided the array has 3 or more dimensions.
eg:
>>> a = np.arange(16).reshape(2,2,4)
>>> a
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[12, 13, 14, 15]]])
>>> np.dsplit(a,2)
[array([[[ 0, 1],
[ 4, 5]],
[[ 8, 9],
[12, 13]]]), array([[[ 2, 3],
[ 6, 7]],
[[10, 11],
[14, 15]]])]
The only difference between split() and array_split() is that array_split() allows
'indices_or_sections' to be an integer that does not equally divide the axis. For an array of
length x that should be split into n sections, it returns x % n sub-arrays of size x//n + 1 and
the rest of size x//n.
eg-1:
10 elements --->3 sections
10%3 = 1 sub-array of size 10//3+1 = 4 and the remaining 2 sub-arrays of size 10//3 = 3
>>> a = np.arange(10,101,10)
>>> a
array([ 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
>>> np.split(a,3)
ValueError: array split does not result in an equal division
>>> np.array_split(a,3)
[array([10, 20, 30, 40]), array([50, 60, 70]), array([ 80, 90, 100])]
eg-2:
11 elements and 3 sections
x=11
n=3
11%3 = 2 sub-arrays of size 11//3+1 = 4 and the remaining 1 sub-array of size 11//3 = 3
>>> a = np.arange(11)
>>> a
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> np.array_split(a,3)
[array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([ 8, 9, 10])]
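The size rule can be sketched as a small helper and compared with what np.array_split actually returns. array_split_sizes is a hypothetical function written only for this check.

```python
import numpy as np

def array_split_sizes(x, n):
    """Sizes predicted by the rule: x % n parts of size x//n + 1,
    the rest of size x//n (hypothetical helper)."""
    return [x // n + 1] * (x % n) + [x // n] * (n - x % n)

for x, n in [(10, 3), (11, 3)]:
    parts = np.array_split(np.arange(x), n)
    actual = [len(p) for p in parts]
    print(x, n, actual, array_split_sizes(x, n))
```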
Sorting of ndarrays:
--------------------
We can sort the elements of an nd array.
The numpy module contains the sort() function.
Syntax:
sort(a)
The default sorting algorithm is quicksort. We can also specify mergesort, heapsort etc.
sort(-a): -70,-60,-50,-40,-30,-20,-10
-sort(-a): 70,60,50,40,30,20,10
>>> a
array([70, 20, 60, 10, 50, 40, 30])
>>> c = -np.sort(-a)
>>> c
array([70, 60, 50, 40, 30, 20, 10])
2nd way:
-------
>>> a
array([70, 20, 60, 10, 50, 40, 30])
>>> d = np.sort(a)[::-1]
>>> d
array([70, 60, 50, 40, 30, 20, 10])
>>> a = np.array([[40,20,70],[30,20,60],[70,90,80]])
>>> a
array([[40, 20, 70],
[30, 20, 60],
[70, 90, 80]])
>>> b = np.sort(a)
>>> b
array([[20, 40, 70],
[20, 30, 60],
[70, 80, 90]])
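For 2-D arrays, np.sort() also accepts an axis argument (default -1, i.e. each row is sorted, as in the session above). A brief sketch of the other choices; the names by_column and flattened are illustrative.

```python
import numpy as np

a = np.array([[40, 20, 70], [30, 20, 60], [70, 90, 80]])

by_column = np.sort(a, axis=0)     # sort each column independently
flattened = np.sort(a, axis=None)  # flatten first, then sort

print(by_column)
print(flattened)
```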
where() function:
-----------------
Syntax:
where(condition, [x, y])
Returns elements chosen from 'x' or 'y' depending on 'condition'.
out : ndarray
An array with elements from x where condition is True, and elements
from y elsewhere.
If we do not provide x and y, then it returns a tuple of ndarrays containing the indexes
where the condition is satisfied.
eg-1: Find the indexes where the element 7 is present in the given 1-D array.
>>> a = np.array([3,5,7,6,7,9,4,6,10,15])
>>> b = np.where(a==7)
>>> b
(array([2, 4], dtype=int64),)
eg-2: Find the indexes where odd numbers are present in the given 1-D array.
>>> a = np.array([3,5,7,6,7,9,4,6,10,15])
>>> b = np.where(a%2 != 0)
>>> b
(array([0, 1, 2, 4, 5, 9], dtype=int64),)
eg-3: Find the indexes where even numbers are present in the given 1-D array.
>>> a = np.array([3,5,7,6,7,9,4,6,10,15])
>>> b = np.where(a%2 == 0)
>>> b
(array([3, 6, 7, 8], dtype=int64),)
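As the outputs above show, where() with only a condition returns a tuple. A minimal sketch of how that tuple can be used: it holds one index array per dimension and can index the original array directly. The 2-D array m is a hypothetical example and does not come from the notes.

```python
import numpy as np

a = np.array([3, 5, 7, 6, 7, 9, 4, 6, 10, 15])

idx = np.where(a % 2 == 0)   # tuple containing one index array, because a is 1-D
even_positions = idx[0]      # positions of the even numbers
even_values = a[idx]         # the even numbers themselves

print(even_positions)
print(even_values)

# Hypothetical 2-D example: the tuple holds one index array per axis
m = np.array([[1, 2], [3, 4]])
rows, cols = np.where(m > 2)
print(rows, cols)
```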
Note:
where(condition,[x,y])
If the condition is satisfied, the element is taken from x; if the condition fails, the
element is taken from y.
eg-3: Replace every even number with 8888 and every odd number with 7777
>>> a = np.array([3,5,7,6,7,9,4,6,10,15])
>>> b = np.where( a%2 == 0, 8888, 7777)
>>> b
array([7777, 7777, 7777, 8888, 7777, 7777, 8888, 8888, 8888, 7777])
eg-4: Replace every odd number in the given 1-D array with 9999 and keep the even
numbers as they are.
>>> a = np.array([3,5,7,6,7,9,4,6,10,15])
>>> b = np.where( a%2 !=0,9999,a)
>>> b
array([9999, 9999, 9999, 6, 9999, 9999, 4, 6, 10, 9999])
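For comparison, here is a sketch of the same replacement done with boolean-mask assignment on a copy. This is an alternative technique and not something the notes prescribe. It shows that where() builds a new array and leaves the original untouched.

```python
import numpy as np

a = np.array([3, 5, 7, 6, 7, 9, 4, 6, 10, 15])

# where(): picks 9999 where the condition holds, otherwise the element of a
b = np.where(a % 2 != 0, 9999, a)

# Alternative: copy the array and assign through a boolean mask
c = a.copy()
c[c % 2 != 0] = 9999

print(b)
print(np.array_equal(b, c))
print(a)  # the original array is unchanged
```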
searchsorted() function:
------------------------
Internally, this function uses the binary search algorithm to perform the search operation.
Syntax:
searchsorted(a, v, side='left', sorter=None)
Find indices where elements should be inserted to maintain order.
To use this function, the array should already be sorted; otherwise we will get incorrect
results.
eg-1:
>>> a = np.arange(0,31,5)
>>> a
array([ 0, 5, 10, 15, 20, 25, 30])
>>> np.searchsorted(a,5)
1
>>> np.searchsorted(a,13)
3
eg-2:
By default (side='left') it returns the first suitable insertion point, i.e. before any
equal elements. If we want the last suitable insertion point, i.e. after any equal elements,
we should use side='right'.
>>> a = np.array([3,5,7,6,7,9,4,6,10,15])
>>> b = np.sort(a)
>>> b
array([ 3, 4, 5, 6, 6, 7, 7, 9, 10, 15])
>>> np.searchsorted(b,6)
3
>>> np.searchsorted(b,6,side='right')
5
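The left and right insertion points can be combined, as in the following sketch. Counting duplicates as right - left and passing a list for v are illustrations added here, not examples from the notes.

```python
import numpy as np

b = np.sort(np.array([3, 5, 7, 6, 7, 9, 4, 6, 10, 15]))

left = np.searchsorted(b, 6)                 # index of the first 6
right = np.searchsorted(b, 6, side='right')  # index just past the last 6
count = right - left                         # number of 6s in the sorted array
print(left, right, count)

# v may also be a sequence: one insertion point is returned per value
points = np.searchsorted(b, [1, 7, 20])
print(points)

# Inserting at the reported index keeps the array sorted
c = np.insert(b, left, 6)
print(c)
```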
Summary:
1. sort()--->To sort given array
2. where() --->To perform search and replace operation
3. searchsorted() --->To identify insertion point in the given sorted array.
1. insert()
2. append()
1. insert():
------------
insert(array, obj, values, axis=None)
Insert values along the given axis before the given indices.
>>> a = np.arange(10)
>>> b = np.insert(a,2,7777)
>>> b
array([ 0, 1, 7777, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = np.insert(a,[2,5,7],[7777,8888])
ValueError: shape mismatch: value array of shape (2,) could not be broadcast to indexing result of shape (3,)
>>> np.insert(a,25,7777)
IndexError: index 25 is out of bounds for axis 0 with size 10
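The error cases above have working counterparts, sketched below on the assumption that a is np.arange(10) as in the outputs shown. The number of values either matches the number of indices or is a single value, and the index stays within 0 to len(a).

```python
import numpy as np

a = np.arange(10)

# One value per index; the indices refer to positions in the original array
b = np.insert(a, [2, 5, 7], [7777, 8888, 9999])

# A single value is broadcast to every index
c = np.insert(a, [2, 5, 7], 7777)

# An index equal to the array size is allowed and inserts at the end
d = np.insert(a, len(a), 7777)

print(b)
print(c)
print(d)
print(a)  # insert() returns a new array; a itself is not modified
```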
***Note: An array should contain only homogeneous elements. If we try to insert an
element of any other type by using the insert() function, that element is automatically
converted to the array's type before insertion. If the conversion is not possible, then we
get an error.
eg-1:
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.insert(a,2,10.5)
array([ 0, 1, 10, 2, 3, 4, 5, 6, 7, 8, 9])
Here the array contains int values, but we are trying to insert a float value, which is
converted to int type automatically (10.5 becomes 10).
eg-2:
>>> np.insert(a,2,True) #True will be converted to 1
array([0, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9])
eg-3:
>>> np.insert(a,2,10+20j)
TypeError: can't convert complex to int
eg-4:
>>> np.insert(a,2,'durga')
ValueError: invalid literal for int() with base 10: 'durga'
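If the fractional part must be preserved, one possible workaround (an assumption added here, not part of the notes) is to convert the array to float before inserting, as in this sketch.

```python
import numpy as np

a = np.arange(10)

# The value is converted to the array's int dtype, so 10.5 becomes 10
truncated = np.insert(a, 2, 10.5)

# Converting the array to float first keeps the inserted value as 10.5
kept = np.insert(a.astype(float), 2, 10.5)

print(truncated.dtype, truncated[2])
print(kept.dtype, kept[2])
```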
eg-3:
>>> a = np.array([[1,2],[3,4]])
>>> np.insert(a,1,10,axis=1)
array([[ 1, 10,  2],
       [ 3, 10,  4]])
eg-4:
>>> np.insert(a,1,[10,20],axis=0)
array([[ 1,  2],
       [10, 20],
       [ 3,  4]])
eg-5:
>>> np.insert(a,1,[10,20],axis=1)
array([[ 1, 10,  2],
       [ 3, 20,  4]])
eg-6:
>>> np.insert(a,1,[10,20,30],axis=0)
ValueError: could not broadcast input array from shape (1,3) into shape (1,2)
eg-7:
>>> np.insert(a,0,[10,20],axis=0)
array([[10, 20],
       [ 1,  2],
       [ 3,  4]])
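The effect of the axis argument on a 2-D array can be summarised in one sketch. It assumes a is the 2x2 array [[1,2],[3,4]], which is inferred from the outputs above. The axis=None case, where the array is flattened first, is added here for completeness.

```python
import numpy as np

# Assumed input, inferred from the outputs in the examples above
a = np.array([[1, 2], [3, 4]])

flat = np.insert(a, 1, 10)                      # axis=None: a is flattened first
new_row = np.insert(a, 1, [10, 20], axis=0)     # inserts a row before row index 1
new_col = np.insert(a, 1, [10, 20], axis=1)     # inserts a column before column index 1

print(flat)
print(new_row)
print(new_col)
print(new_row.shape, new_col.shape)
```

With axis=0 the values must match the row length (2 here), which is why inserting [10,20,30] fails in eg-6.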