NumPy Basics: Array Operations Guide

This document is a comprehensive guide to NumPy, a powerful library for numerical computing in Python, covering topics such as array creation, attributes, operations, indexing, and file handling. It highlights the performance advantages of NumPy over Python lists and provides numerous examples and code snippets for practical understanding. The guide concludes by emphasizing the importance of NumPy for data science and scientific computing.

Uploaded by

i240608

Introduction to NumPy: A Comprehensive Guide
A detailed tutorial based on a Jupyter Notebook
Generated on June 27, 2025
Introduction to NumPy

1 What is NumPy?
NumPy (Numerical Python) is a powerful Python library for numerical computing.
It provides support for large, multi-dimensional arrays and matrices, along
with a collection of mathematical functions to operate on these arrays efficiently.
· Why NumPy? NumPy is faster than Python lists for numerical operations
due to its optimized C-based implementations and contiguous memory
allocation.

2 Key Concepts and Operations


2.1 Installing and Importing NumPy
NumPy is typically installed using pip (pip install numpy) and must be
imported before use. The alias np is commonly used for convenience.
import numpy as np

2.2 Creating NumPy Arrays


NumPy arrays can be created from Python lists or using built-in functions like
np.array, np.zeros, np.ones, np.full, np.arange, and np.linspace.

2.2.1 Example: Converting a List to a NumPy Array


import numpy as np
pyl = [1, 2, 3, 4, 5, 6]
pyl_array = np.array([pyl, pyl])
print("Type of pyl:", type(pyl))
print("Type of pyl_array:", type(pyl_array))
print("Array:\n", pyl_array)

Output:
Type of pyl: <class 'list'>
Type of pyl_array: <class 'numpy.ndarray'>
Array:
[[1 2 3 4 5 6]
[1 2 3 4 5 6]]

2.2.2 Example: Creating Arrays with Zeros, Ones, and Full


x = np.zeros(20).reshape(4, 5)
print("Zeros array:\n", x)
q = np.ones((2, 5))
print("Ones array:\n", q)
w = np.full((2, 5), 10)
print("Full array with 10:\n", w)

Output:

Zeros array:
[[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
Ones array:
[[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]]
Full array with 10:
[[10 10 10 10 10]
[10 10 10 10 10]]
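
The outputs above show floats for zeros and ones but integers for full. A minimal sketch of why, assuming default settings (the variable names are illustrative only): np.zeros and np.ones default to floating point, while np.full infers its type from the fill value.

```python
import numpy as np

# np.zeros and np.ones produce float64 unless a dtype is given.
z_float = np.zeros((2, 3))
z_int = np.zeros((2, 3), dtype=int)

# np.full takes its dtype from the fill value when none is specified.
f_int = np.full((2, 3), 10)
f_float = np.full((2, 3), 10.0)

print(z_float.dtype, z_int.dtype)
print(f_int.dtype, f_float.dtype)
```

Passing dtype explicitly is a reasonable habit when the element type matters for later arithmetic.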

2.2.3 Example: Using arange and linspace


print("np.arange(1, 22, 3):\n", np.arange(1, 22, 3))
print("np.linspace(1, 22, 10):\n", np.linspace(1, 22, 10))

Output:
np.arange(1, 22, 3):
[ 1 4 7 10 13 16 19]
np.linspace(1, 22, 10):
[ 1. 3.33333333 5.66666667 8. 10.33333333 12.66666667
15. 17.33333333 19.66666667 22. ]
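
The two functions treat the stop value differently, which the following sketch tries to make explicit. It reuses the arguments from the example above; the names a, b, c, and step are illustrative.

```python
import numpy as np

# arange excludes the stop value and uses a fixed step.
a = np.arange(1, 22, 3)

# linspace includes both endpoints by default;
# retstep=True also returns the spacing between samples.
b, step = np.linspace(1, 22, 10, retstep=True)

# endpoint=False drops the stop value, similar to arange.
c = np.linspace(0, 1, 5, endpoint=False)

print(a[-1])        # last value stays below 22
print(b[-1], step)  # last value equals the stop value
print(c)
```

In short, arange is specified by a step and linspace by a count of samples.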

2.3 Array Attributes


NumPy arrays have attributes like ndim, size, shape, and dtype to provide
information about the array.
pyl = np.array([[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]])
print("Dimensions:", pyl.ndim)
print("Size:", pyl.size)
print("Shape:", pyl.shape)
print("Data type:", pyl.dtype)

Output:
Dimensions: 2
Size: 12
Shape: (2, 6)
Data type: int64
· Note: The shape attribute returns a tuple, which is immutable, so after
s = pyl.shape an assignment like s[0] = 4 raises a TypeError. Use reshape
to obtain an array with a different shape.
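
The note can be checked with a short sketch. It assumes s is the tuple returned by pyl.shape, which the original does not define, and the name reshaped is illustrative.

```python
import numpy as np

pyl = np.array([[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]])
s = pyl.shape  # a plain Python tuple, here (2, 6)

try:
    s[0] = 4  # tuples do not support item assignment
except TypeError as exc:
    print("TypeError:", exc)

# To change the shape, ask for a reshaped array instead.
reshaped = pyl.reshape(3, 4)
print(reshaped.shape)
```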

2.4 Reshaping and Transposing Arrays


Arrays can be reshaped using reshape, transposed using .T, and converted to
1D using flatten.

x = np.zeros(20)
x = x.reshape(2, 10)
print("Reshaped to (2, 10):\n", x)
x = x.reshape(4, 5)
print("Reshaped to (4, 5):\n", x)
x = x.T
print("Transposed:\n", x)
x = x.flatten()
print("Flattened:\n", x)

Output:
Reshaped to (2, 10):
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
Reshaped to (4, 5):
[[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
Transposed:
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
Flattened:
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
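
Because every element above is zero, the effect of each step is hard to see. The sketch below repeats the same operations on np.arange(6) so the element order is visible; the names m, t, and flat are illustrative.

```python
import numpy as np

x = np.arange(6)      # [0 1 2 3 4 5]
m = x.reshape(2, 3)   # rows: [0 1 2] and [3 4 5]
t = m.T               # shape (3, 2); columns become rows
flat = t.flatten()    # 1D copy, read row by row from t

print(m)
print(t)
print(flat)

# flatten returns a copy, so editing it leaves m untouched.
flat[0] = 99
print(m[0, 0])
```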

2.5 Array Operations


NumPy supports element-wise operations, matrix multiplication, and scalar
operations.

2.5.1 Example: Scalar Operations


arr = np.random.randint(0, 100, (5, 5))
print("Original array:\n", arr)
arr = arr * 2
print("After scalar multiplication by 2:\n", arr)
arr = arr + 2
print("After scalar addition by 2:\n", arr)

Output (example values):


Original array:
[[23 11 49 3 36]
[83 20 16 14 5]
[23 15 95 71 84]
[ 5 1 87 57 95]
[72 22 33 23 24]]
After scalar multiplication by 2:
[[ 46 22 98 6 72]
[166 40 32 28 10]
[ 46 30 190 142 168]
[ 10 2 174 114 190]
[144 44 66 46 48]]
After scalar addition by 2:
[[ 48 24 100 8 74]
[168 42 34 30 12]
[ 48 32 192 144 170]
[ 12 4 176 116 192]
[146 46 68 48 50]]

2.5.2 Example: Element-wise Multiplication


arr1 = arr
print("Element-wise multiplication:\n", arr1 * arr)

Output (example values):


Element-wise multiplication:
[[ 2304 576 10000 64 5476]
[28224 1764 1156 900 144]
[ 2304 1024 36864 20736 28900]
[ 144 16 30976 13456 36864]
[21316 2116 4624 2304 2500]]
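
In the example above, arr1 = arr binds a second name to the same array rather than copying it, which is why the result is simply each element squared. The following sketch, with illustrative names alias and copy, contrasts that with an explicit copy.

```python
import numpy as np

arr = np.array([1, 2, 3])
alias = arr          # same underlying array, second name
copy = arr.copy()    # independent data

arr[0] = 100
print(alias[0])  # follows the change made through arr
print(copy[0])   # keeps the original value

# Element-wise product of two same-shaped arrays.
print(copy * np.array([10, 20, 30]))
```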

2.5.3 Example: Matrix Multiplication


arr = np.arange(10).reshape(2, 5)
arr1 = np.arange(10, 20).reshape(5, 2)
print("Matrix multiplication (arr @ arr1):\n", arr @ arr1)
print("Matrix multiplication (np.dot):\n", np.dot(arr, arr1))

Output:
Matrix multiplication (arr @ arr1):
[[160 170]
[510 545]]
Matrix multiplication (np.dot):
[[160 170]
[510 545]]
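
Matrix multiplication requires the inner dimensions to agree. As a rough sketch using the same two arrays (renamed a and b here for brevity), the shapes can be checked directly:

```python
import numpy as np

a = np.arange(10).reshape(2, 5)       # shape (2, 5)
b = np.arange(10, 20).reshape(5, 2)   # shape (5, 2)

# Inner dimensions match (5 and 5), so the product has shape (2, 2).
product = a @ b
print(product.shape)

# a.T has shape (5, 2); its inner dimension (2) does not match b's rows (5).
try:
    a.T @ b
except ValueError as exc:
    print("ValueError:", exc)

# a.T @ b.T is valid: (5, 2) @ (2, 5) gives shape (5, 5).
print((a.T @ b.T).shape)
```

For 2D arrays, the @ operator and np.dot compute the same product.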

2.6 Indexing and Slicing


NumPy arrays support indexing, slicing, and boolean indexing for accessing and
modifying elements.

2.6.1 Example: Slicing


x = np.zeros(20).flatten()
print("Sliced array (index 5 to end-1):\n", x[5:-1])

Output:
Sliced array (index 5 to end-1):
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
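
With distinct values the slice boundaries are easier to follow. The sketch below uses np.arange(20) instead of zeros, adds a 2D slice, and shows that basic slices are views; grid and view are illustrative names.

```python
import numpy as np

x = np.arange(20)
print(x[5:-1])    # indices 5 through 18; the last element is excluded
print(x[::5])     # every fifth element

grid = x.reshape(4, 5)
print(grid[1:3, :2])   # rows 1 and 2, first two columns

# Basic slices are views: writing through one changes the original array.
view = x[5:8]
view[:] = -1
print(x[4:9])
```

When an independent piece is needed, calling .copy() on the slice avoids modifying the source array.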

2.6.2 Example: Boolean Indexing


arr = np.arange(10)
print("Array:", arr)
print("Elements where arr % 2 == 1:\n", arr[arr % 2 == 1])

Output:
Array: [0 1 2 3 4 5 6 7 8 9]
Elements where arr % 2 == 1:
[1 3 5 7 9]
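
Boolean masks can also be used on the left-hand side of an assignment to modify elements. A minimal sketch, with mask, odds, and small_even as illustrative names:

```python
import numpy as np

arr = np.arange(10)
mask = arr % 2 == 1    # boolean array, True at odd values

odds = arr[mask]       # selection returns a copy
arr[mask] = 0          # assignment through the mask edits arr in place

# Conditions combine with & and |, with parentheses around each comparison.
small_even = arr[(arr % 2 == 0) & (arr > 0) & (arr < 6)]

print(odds)
print(arr)
print(small_even)
```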

2.6.3 Example: Filtering with np.where


print("Filtering with np.where (0 for odd, 1 for even):\n",
      np.where(arr % 2 == 0, 1, 0))

Output:
Filtering with np.where (0 for odd, 1 for even):
[1 0 1 0 1 0 1 0 1 0]
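
np.where has two common call forms. The sketch below shows both on the same array; labels and even_idx are illustrative names.

```python
import numpy as np

arr = np.arange(10)

# Three-argument form: pick from the second or third argument per element.
labels = np.where(arr % 2 == 0, "even", "odd")

# One-argument form: returns a tuple of index arrays where the condition holds.
(even_idx,) = np.where(arr % 2 == 0)

print(labels)
print(even_idx)
```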

2.7 Aggregations
NumPy provides functions like sum, min, max, mean, std, and var for statistical
operations.
arr = np.random.randint(0, 100, 10)
print("Array:", arr)
print("Sum:", arr.sum())
print("Min:", arr.min())
print("Max:", arr.max())
print("Mean:", arr.mean())
print("Standard Deviation:", arr.std())
print("Variance:", arr.var())
print("Sorted array:", arr[arr.argsort()])

Output (example values):

Array: [ 1 24 30 4 85 14 39 23 66 76]
Sum: 362
Min: 1
Max: 85
Mean: 36.2
Standard Deviation: 28.2694 (rounded)
Variance: 799.16 (rounded)
Sorted array: [ 1 4 14 23 24 30 39 66 76 85]
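
On multi-dimensional arrays the same aggregations accept an axis argument. The following sketch uses a small hand-written array (data and values are illustrative names) so the results can be checked by eye.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(data.sum())          # all elements
print(data.sum(axis=0))    # down each column
print(data.sum(axis=1))    # across each row

# Standard deviation is the square root of the variance.
print(np.isclose(data.std() ** 2, data.var()))

# np.sort returns a sorted copy; argsort returns the ordering indices.
values = np.array([30, 10, 20])
print(np.sort(values), values.argsort())
```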

2.8 File Handling


NumPy can save and load arrays to/from files using np.savetxt and np.loadtxt.
arr = np.arange(10)
np.savetxt('data.txt', arr)
print("File saved successfully")
arr1 = np.loadtxt('data.txt')
print("File loaded successfully")
print("File content:", arr1)

Output:
File saved successfully
File loaded successfully
File content: [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
· Note: File handling operations are not supported in some environments
(e.g., Pyodide in browsers). Ensure your environment supports file I/O.
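
The loaded array above comes back as floats because text files carry no type information. A possible round trip that preserves integers is sketched below; the file name data.csv and the use of a temporary directory are assumptions made for this illustration.

```python
import os
import tempfile

import numpy as np

arr = np.arange(6).reshape(2, 3)

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")  # hypothetical file name
    np.savetxt(path, arr, fmt="%d", delimiter=",")
    loaded = np.loadtxt(path, delimiter=",", dtype=int)

print(loaded)
```

The fmt argument controls how values are written, and dtype controls how they are parsed on the way back in.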

2.9 Handling Missing Values


NumPy provides functions like np.nanmean to handle missing values (NaN).
arr1 = np.array([0, 1, 2, 3, np.nan, 5, 6, 7, 8, 9])
print("Array with NaN:", arr1)
print("Mean ignoring NaN:", np.nanmean(arr1))

Output:
Array with NaN: [ 0. 1. 2. 3. nan 5. 6. 7. 8. 9.]
Mean ignoring NaN: 4.555555555555555
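
To see why the NaN-aware variant is needed, the sketch below compares it with the plain mean on the same array and shows how np.isnan can locate the missing entries. The names missing and clean are illustrative.

```python
import numpy as np

arr1 = np.array([0, 1, 2, 3, np.nan, 5, 6, 7, 8, 9])

print(np.mean(arr1))       # nan: a single NaN propagates through the plain mean
print(np.nanmean(arr1))    # mean of the nine remaining values
print(np.nansum(arr1))     # sum with NaN treated as zero

# np.isnan builds a mask that can drop or count missing entries.
missing = np.isnan(arr1)
clean = arr1[~missing]
print(missing.sum(), clean.size)
```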

3 Performance Comparison: Python Lists vs. NumPy Arrays


NumPy arrays are significantly faster than Python lists for numerical operations
due to vectorization.
import numpy as np

# %time is an IPython/Jupyter magic; run these lines in a notebook cell.
list1 = list(range(1000000))
list2 = list(range(1000000))
%time result = [x * y for x, y in zip(list1, list2)]

arr1 = np.arange(1000000)
arr2 = np.arange(1000000)
%time result = arr1 * arr2

Output (example timings):


Python lists:
CPU times: user 90.2 ms, sys: 23.1 ms, total: 113 ms
Wall time: 119 ms

NumPy arrays:
CPU times: user 0 ns, sys: 4.24 ms, total: 4.24 ms
Wall time: 4.32 ms
· Note: NumPy’s performance advantage comes from its ability to perform
operations on entire arrays at once, avoiding Python’s loop overhead.
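
Outside a notebook, a similar comparison can be sketched with the standard timeit module. The array size, repeat counts, and function names below are choices made for this illustration, and actual timings will vary by machine.

```python
import timeit

import numpy as np

n = 100_000  # smaller than the one million above, to keep the sketch quick
list1 = list(range(n))
list2 = list(range(n))
arr1 = np.arange(n, dtype=np.int64)
arr2 = np.arange(n, dtype=np.int64)

def multiply_lists():
    # Pure-Python loop over paired elements.
    return [x * y for x, y in zip(list1, list2)]

def multiply_arrays():
    # Single vectorized multiplication.
    return arr1 * arr2

# Taking the best of several repeats reduces noise from other processes.
list_time = min(timeit.repeat(multiply_lists, number=10, repeat=3))
array_time = min(timeit.repeat(multiply_arrays, number=10, repeat=3))

print(f"lists:  {list_time:.4f} s for 10 runs")
print(f"arrays: {array_time:.4f} s for 10 runs")
```

Both functions compute the same values, so the difference in time reflects how the work is carried out rather than what is computed.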

4 Conclusion
This guide covered the basics of NumPy, including array creation, attributes,
operations, indexing, aggregations, and file handling. NumPy is an essential tool
for numerical computing in Python, offering efficient and versatile array
operations. Practice these examples to build proficiency in using NumPy for data
science and scientific computing tasks.
