
Unit 5 Python Packages

The document provides an overview of Python packages and modules, emphasizing the importance of code reusability and namespace management. It introduces NumPy and Pandas as essential libraries for numerical data manipulation and analysis, highlighting their functionalities, differences from Python lists, and installation instructions. Additionally, it covers basic data visualization techniques using Matplotlib, showcasing how to create and customize plots.

Uploaded by

silaniwal
Copyright
© All Rights Reserved

Unit 5: Python Packages

1. Introduction to Python Packages & Modules


Definition:
A Package is a collection of related modules (Python files containing
definitions, functions, and statements). A Module is a single file that can be
imported to use its functionality in another program. Packages help in
organizing code into logical, reusable components.
Why Use Packages?
 Code Reusability: Write once, use multiple times.
 Namespace Management: Avoid naming conflicts.
 Leverage Community Code: Use powerful libraries like NumPy, Pandas,
etc., without reinventing the wheel.
How to Import:
1. import statement: Imports the entire module.
import math
print(math.sqrt(25)) # Output: 5.0

2. from...import statement: Imports specific functions/attributes.


from math import sqrt, pi
print(sqrt(25)) # Output: 5.0
print(pi) # Output: 3.141592653589793
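The namespace point above can be illustrated with a small sketch. The aliases `m` and `root` and the local `sqrt` function are hypothetical names chosen only for this illustration; they show how a module prefix or an alias keeps an imported name from colliding with your own.

```python
import math as m                 # alias: the whole module is reachable as "m"
from math import sqrt as root    # alias for one imported function


def sqrt(x):
    """A local function that happens to share its name with math.sqrt."""
    return "my own sqrt"


via_module = m.sqrt(16)   # the library function, reached through the module
via_alias = root(16)      # the same library function under another name
local = sqrt(16)          # the local function, not the library one

print(via_module, via_alias, local)
```

Because the library function is reached through `m.` or through the alias, defining a local `sqrt` does not hide it.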
2. NumPy: Numerical Python
Definition:
NumPy (Numerical Python) is the core library that provides essential tools
for working with numerical data.
Most scientific libraries in Python, such as Pandas, SciPy, Matplotlib, and
scikit-learn, are built on top of NumPy, and others such as TensorFlow
interoperate with NumPy arrays.
This makes NumPy the foundation of the Python scientific ecosystem.

NumPy is the fundamental package for scientific computing in Python. It
provides a powerful N-dimensional array object, tools for integrating C/C++
code, and capabilities for linear algebra, Fourier transforms, and random
number generation.

NumPy uses zero-based indexing.

Why NumPy over Lists?

Feature Python List NumPy Array

Data Type Can hold different types Must be homogeneous

Performance Slower for large data Faster, contiguous memory

Functionality Basic operations Rich math operations (element-wise)

Memory More memory intensive Memory efficient


Comparison: Python List vs NumPy Array
1. Data Type
Python List
 A list can store different types of data in the same list.
 Example:
 my_list = [10, "hello", 3.5]
 This is allowed because lists are flexible.
NumPy Array
 A NumPy array must store elements of the same data type
(homogeneous).
 Example:
 np.array([1, 2, 3.5])
 NumPy will convert all elements to a single type automatically (here →
float).

2. Performance
Python List
 Slower for large data because:
o Each element is stored separately in memory.
o Python needs extra information to track each element.
NumPy Array
 Much faster, especially for numerical operations.
 NumPy uses contiguous (continuous) memory blocks, which CPUs
handle more efficiently.
 Also written in optimized C code internally.

3. Functionality
Python List
 Supports basic operations:
o append(), pop(), slicing
o Concatenation, iteration
 But it is not designed for mathematical operations.
NumPy Array
 Supports powerful mathematical operations:
o Element-wise addition, subtraction, multiplication
o Matrix multiplication
o Statistical functions
o Linear algebra
Example:
import numpy as np
a = np.array([1, 2, 3])
print(a * 2) # Output: [2 4 6]
A plain Python list does not do this: list * 2 repeats the list instead of
doubling each element.

4. Memory
Python List
 Takes more memory because each element:
o Stores additional metadata
o Is stored separately
NumPy Array
 Memory efficient:
o Stores data in a compact, continuous block
o No extra metadata per element
This is why NumPy is ideal for big numerical datasets.
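The memory difference can be estimated roughly as follows. This is a sketch, not a precise measurement: `sys.getsizeof` reports per-object sizes in CPython, and the explicit `dtype=np.int64` is an assumption made here so the array size is the same on every platform.

```python
import sys

import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list holds references; each referenced int is a separate Python object.
list_container_bytes = sys.getsizeof(py_list)
list_total_bytes = list_container_bytes + sum(sys.getsizeof(x) for x in py_list)

# A NumPy array stores the raw values back to back: 8 bytes per int64.
array_data_bytes = np_arr.nbytes

print("List (container + int objects):", list_total_bytes, "bytes")
print("NumPy array data:", array_data_bytes, "bytes")
```

The exact list figure varies between Python versions, but it is several times larger than the array's data buffer.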

2. Example Demonstrations
A. Data Type Difference
Python List (Mixed types allowed)
my_list = [10, "hello", 3.14]
print(my_list)

NumPy Array (Converts all to same type)


import numpy as np
arr = np.array([10, 3.5, 2])
print(arr) # Output: [10.   3.5  2. ] (all float)

3. Functionality Difference
Python List — No element-wise math
list1 = [1, 2, 3]
print(list1 * 2) # Output: [1,2,3,1,2,3] (repetition)

NumPy Array — Element-wise math


import numpy as np
arr = np.array([1, 2, 3])
print(arr * 2) # Output: [2 4 6]

4. Memory Diagram
Python List (Scattered memory)
[10] -> stored at location A
["hi"] -> stored at location F
[3.14] -> stored at location K

List only stores references to actual memory cells

NumPy Array (Contiguous memory)


| 10 | 11 | 12 | 13 | 14 |
All stored in one continuous block

This is why NumPy is faster and more memory-efficient for numerical data.

5. Performance Speed Test


import numpy as np
import time

# Python list
lst = list(range(1_000_000))
start = time.time()
lst_result = [x * 2 for x in lst]
end = time.time()
print("List time:", end - start)

# NumPy array
arr = np.array(range(1_000_000))
start = time.time()
arr_result = arr * 2
end = time.time()
print("NumPy time:", end - start)
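A single timed run can be noisy. As an alternative sketch, the standard-library `timeit` module repeats each operation and reports the total time. The array size and the repeat count of 20 are arbitrary choices for this illustration, and the actual timings depend on the machine.

```python
import timeit

import numpy as np

lst = list(range(100_000))
arr = np.arange(100_000)

# Each statement is run 20 times; timeit returns the total elapsed seconds.
list_time = timeit.timeit(lambda: [x * 2 for x in lst], number=20)
numpy_time = timeit.timeit(lambda: arr * 2, number=20)

# Both approaches must produce the same values.
same = [x * 2 for x in lst] == (arr * 2).tolist()

print("List time: ", list_time)
print("NumPy time:", numpy_time)
print("Same result:", same)
```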

What does int32 or int64 mean?


These are data types used by NumPy to store integers efficiently in memory.

1. int32
 Integer stored using 32 bits (4 bytes)
 Can store values from -2,147,483,648 to 2,147,483,647 (about ±2.1 billion)
 Uses less memory
 Was the default integer type on Windows before NumPy 2.0

2. int64
 Integer stored using 64 bits (8 bytes)
 Can store values up to about ±9.2 × 10^18
 Uses more memory
 Default integer type on 64-bit Linux/macOS, and on 64-bit Windows from NumPy 2.0
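The ranges and sizes above can be checked directly. The following sketch uses `np.iinfo` to read the limits of each type and sets the `dtype` explicitly, so the result does not depend on the platform default.

```python
import numpy as np

info32 = np.iinfo(np.int32)
info64 = np.iinfo(np.int64)
print("int32 range:", info32.min, "to", info32.max)
print("int64 range:", info64.min, "to", info64.max)

# Requesting the dtype explicitly gives the same array type on every system.
small = np.array([1, 2, 3], dtype=np.int32)
large = np.array([1, 2, 3], dtype=np.int64)
print("Bytes per element:", small.itemsize, "vs", large.itemsize)
print("Total bytes:", small.nbytes, "vs", large.nbytes)
```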

Installation:
pip install numpy

1. Creating Arrays:

 From a List:
import numpy as np
arr_1d = np.array([1, 2, 3, 4]) # 1-D Array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]]) # 2-D Array

Using a range (like range but returns an array):


arr_range = np.arange(0, 10, 2) # start, stop, step
print(arr_range) # Output: [0 2 4 6 8]
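Besides lists and ranges, NumPy has a few other common constructors. They are not covered in the notes above, so treat this as a supplementary sketch of functions you are likely to meet.

```python
import numpy as np

zeros = np.zeros((2, 3))            # 2x3 array filled with 0.0
ones = np.ones(4, dtype=int)        # four integer ones
points = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values, both ends included
identity = np.eye(3)                # 3x3 identity matrix

print(zeros.shape)
print(ones)
print(points)
print(identity)
```

Unlike `np.arange`, `np.linspace` takes the number of points rather than the step and includes the stop value.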

2. Array Attributes:

arr = np.array([[1, 2, 3], [4, 5, 6]])

print("Dimensions:", arr.ndim) # Output: 2
print("Shape (rows, cols):", arr.shape) # Output: (2, 3)
print("Total elements:", arr.size) # Output: 6
print("Data Type:", arr.dtype) # Output: int32 or int64 (platform-dependent)

3. Array Operations (Element-wise):

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Addition:", a + b) # Output: [5 7 9]
print("Multiplication:", a * b) # Output: [ 4 10 18]
print("Power:", a ** 2) # Output: [1 4 9]

4. Indexing and Slicing:

 Indexing: Access individual elements.


arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr_2d[1, 2]) # Output: 6 (2nd row, 3rd column)
 Slicing: Access sub-parts of the array.
print(arr_2d[0:2, 1:3]) # Rows 0-1, Columns 1-2
# Output: [[2 3]
# [5 6]]
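A few more indexing patterns are worth knowing. The sketch below shows negative indices, selecting a whole column, filtering with a boolean condition, and the fact that a basic slice is a view onto the original data rather than a copy.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

last_row = arr_2d[-1]        # negative index counts from the end
second_col = arr_2d[:, 1]    # every row, column 1
big = arr_2d[arr_2d > 3]     # boolean mask: keeps elements greater than 3

# A basic slice is a view: writing through it changes the original array.
view = arr_2d[0, :2]
view[0] = 99

print(last_row)
print(second_col)
print(big)
print(arr_2d[0, 0])
```

Call `.copy()` on a slice when you need an independent array.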

5. Reshaping and Splitting:

 Reshape: Change dimensions without changing data.


arr = np.arange(12)
reshaped = arr.reshape(3, 4) # 3 rows, 4 columns

 Split: Divide an array into multiple sub-arrays.


arr = np.arange(9).reshape(3, 3)
first, second = np.split(arr, [1]) # Split after 1st row: first has 1 row, second has 2
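To see what reshaping and splitting actually produce, the following sketch prints the resulting shapes. It also uses `-1`, which asks NumPy to infer one dimension from the total number of elements.

```python
import numpy as np

arr = np.arange(12)

grid = arr.reshape(3, -1)    # -1 is inferred as 4, giving shape (3, 4)
flat = grid.reshape(-1)      # back to one dimension

# Split by row index (axis 0): rows [0:1] and rows [1:3].
top, rest = np.split(grid, [1])

# Split into 2 equal parts along the columns (axis 1).
left, right = np.split(grid, 2, axis=1)

print(grid.shape, flat.shape)
print(top.shape, rest.shape)
print(left)
```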

6. Statistical Operations:

data = np.array([10, 20, 30, 40, 50])

print("Max:", data.max()) # Output: 50
print("Min:", data.min()) # Output: 10
print("Mean:", data.mean()) # Output: 30.0
print("Standard Deviation:", data.std()) # Output: ~14.14
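On 2-D arrays the same functions accept an `axis` argument. The sketch below uses a made-up `scores` array, and also shows that `std()` computes the population standard deviation by default; passing `ddof=1` gives the sample standard deviation instead.

```python
import numpy as np

scores = np.array([[10, 20, 30],
                   [40, 50, 60]])

overall_mean = scores.mean()       # over all six values
col_means = scores.mean(axis=0)    # one mean per column
row_max = scores.max(axis=1)       # one maximum per row

data = np.array([10, 20, 30, 40, 50])
pop_std = data.std()               # divides by N (default, ddof=0)
sample_std = data.std(ddof=1)      # divides by N - 1

print(overall_mean, col_means, row_max)
print(round(pop_std, 2), round(sample_std, 2))
```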
3. Pandas: Data Analysis & Manipulation
Definition:
Pandas is a high-level Python library for data analysis, data cleaning, and
data manipulation.
It is built on top of NumPy, so operations on numerical columns are fast and
memory-efficient.
Its data structures, Series (1D) and DataFrame (2D), are fast, flexible, and
expressive, and are designed to make working with structured data intuitive.

Pandas provides two main data structures:


Series (1-Dimensional)
 A Series is like a single column of data.
 It can store integers, strings, floats, or any other Python object.
 Think of it like an Excel column with labels (index).

DataFrame (2-Dimensional)
 A DataFrame is like a table of data with rows and columns.
 Similar to an Excel sheet or SQL table.
 It allows you to easily filter, clean, reshape, group, and analyze data.

✔ Why Pandas is Important?


 It makes working with structured data (rows and columns) very simple.
 It is fast for column-wise operations because it uses NumPy arrays internally.
 It comes with built-in functions for:
o removing missing values
o filtering rows
o merging datasets
o reading/writing CSV, Excel, SQL, JSON
o statistical calculations
o grouping and aggregating data

In simple words:
Pandas helps you take raw data and turn it into clean, organized, and
meaningful information — quickly and easily.
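Several of the built-in functions listed above can be sketched on a tiny table. The `dept` and `marks` columns and their values are made-up illustration data, not taken from any real dataset.

```python
import pandas as pd

raw = pd.DataFrame({
    "dept": ["Civil", "Mech", "Civil", "Mech", "Civil"],
    "marks": [80.0, None, 90.0, 70.0, 60.0],   # None becomes a missing value (NaN)
})

clean = raw.dropna()                                 # remove rows with missing values
passed = clean[clean["marks"] >= 70]                 # filter rows by a condition
avg_by_dept = clean.groupby("dept")["marks"].mean()  # group and aggregate

print(clean)
print(passed)
print(avg_by_dept)
```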

POINTS:
1. Pandas is a high-level data manipulation library
 Pandas is a powerful Python library used to store, analyze, clean, and
manipulate data.
 It makes working with data easy and more convenient compared to
using basic Python or NumPy alone.
2. Built on NumPy
 Pandas is built on top of NumPy.
 By default, numeric columns are backed by NumPy arrays, which is where
the speed comes from.
 Because of this, Pandas is fast and efficient for column-wise data
operations.
3. Provides data structures like Series and DataFrame

🔹 Series (1D)
 A Series is like a one–dimensional array with labels (called index).
 Example:
0 10
1 20
2 30
dtype: int64

🔹 DataFrame (2D)
 A DataFrame is like an Excel table:
o rows + columns
o labeled data
o easy to analyze and manipulate
Example:
   marks  age
0     85   19
1     92   20

Installation:
pip install pandas

1. Series:
A one-dimensional labeled array.
 Creation:
import pandas as pd
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s)
# Output:
# a 10
# b 20
# c 30
# d 40
# dtype: int64

 Indexing & Slicing:


print(s['b']) # Output: 20 (Label-based)
print(s.iloc[1:3]) # Output: b=20, c=30 (Position-based; end position excluded)
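The difference between label-based and position-based slicing is easiest to see side by side. In this sketch, note that a label slice with `.loc` includes both end labels, while a position slice with `.iloc` excludes the end position, as Python slices normally do.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

by_label = s.loc['b':'c']   # labels b through c, both included
by_pos = s.iloc[1:3]        # positions 1 and 2; position 3 is excluded
doubled = s * 2             # element-wise arithmetic, index preserved
over_15 = s[s > 15]         # boolean filtering

print(by_label)
print(by_pos)
print(doubled)
print(over_15)
```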

2. DataFrame:

A two-dimensional labeled data structure, like a spreadsheet or SQL table.


 Creation from Dictionary:
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['NYC', 'London', 'Paris']}
df = pd.DataFrame(data)
print(df)
# Output:
#       Name  Age    City
# 0    Alice   25     NYC
# 1      Bob   30  London
# 2  Charlie   35   Paris

 Adding/Deleting Columns:
df['Salary'] = [70000, 80000, 90000] # Add Column
df = df.drop('Age', axis=1) # Drop Column (axis=1 for columns)

 Indexing with loc and iloc:


o loc is label-based.
print(df.loc[0, 'Name']) # Output: Alice

o iloc is integer position-based.


print(df.iloc[0, 1]) # Output: NYC (1st row, 2nd column; 'Age' was dropped above, so column 1 is 'City')

1. loc → LABEL-based indexing


This means you access data using row labels and column names.

✔ Example:
df.loc[0, 'Name']
 0 → row label
 'Name' → column label

✔ Output:
'Alice'
Use loc when you know the row name/index and column name.

2. iloc → POSITION-based indexing


This means you access data using row number and column number (starting
from 0).

✔ Example:
df.iloc[0, 1]
 0 → first row
 1 → second column ('City', because 'Age' was dropped earlier)

✔ Output:
'NYC'
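With the default index (0, 1, 2, ...), row labels and row positions happen to coincide, which hides the difference between loc and iloc. The sketch below assigns hypothetical row labels 101 to 103 to make the distinction visible.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'],
     'City': ['NYC', 'London', 'Paris']},
    index=[101, 102, 103],   # hypothetical row labels for this illustration
)

by_label = df.loc[101, 'Name']   # row with label 101
by_pos = df.iloc[0, 0]           # first row, first column

# There is no row labelled 0, so loc raises KeyError; iloc[0] still works.
try:
    df.loc[0, 'Name']
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(by_label, by_pos, label_zero_exists)
```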
3. Reading from and Writing to CSV:
 Reading:
df = pd.read_csv('data.csv', sep=',', header=0) # 'data.csv' is a placeholder file name

 Writing:
df.to_csv('output.csv', index=False) # index=False avoids writing row numbers; 'output.csv' is a placeholder file name
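A CSV round trip can be tried without creating any file on disk. This sketch writes to an in-memory text buffer (`io.StringIO`) instead of a file path; the column names and values are illustration data.

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the same text back into a new DataFrame.
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```

Passing a file path instead of the buffer works the same way for real files.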

4. Matplotlib: Data Visualization


Definition:
Matplotlib is a comprehensive library for creating static, animated, and
interactive visualizations in Python. It is the foundation for many other plotting
libraries.
Matplotlib is a powerful and widely used Python library for data visualization.
It allows you to create a wide range of charts, graphs, and plots—such as line
graphs, bar charts, histograms, scatter plots, and more.
Matplotlib can generate static (images), animated (GIFs), and interactive
(zoomable) visualizations, making it useful for data analysis, scientific research,
and presentations.
It also serves as the base library for other visualization tools such as Seaborn
and the Pandas plotting API, which build on top of Matplotlib to offer
simpler or higher-level plotting features.

Key Points (Simplified):


 A Python library used for creating visual graphs and charts.
 Supports static, animated, and interactive visualizations.
 Highly customizable (colors, styles, labels, sizes, etc.).
 Works well with NumPy, Pandas, and other scientific libraries.
 Forms the foundation for many other plotting libraries.
Installation:
pip install matplotlib

1. Basic Line Plot:


import matplotlib.pyplot as plt

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create plot
plt.plot(x, y, marker='o', linestyle='-', color='blue', linewidth=2, label='Linear Growth')

# Customize plot
plt.title("Simple Line Plot")
plt.xlabel("X Axis")
plt.ylabel("Y Axis")
plt.grid(True)
plt.legend()

# Display plot
plt.show()
Explanation for above code:
1. Importing the Library
import matplotlib.pyplot as plt
 matplotlib.pyplot is the module used to create plots.
 Imported as plt for convenience.

2. Creating Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
 x contains values for the X-axis.
 y contains values for the Y-axis.
 This means:
o When x = 1 → y = 2
o When x = 2 → y = 4
o And so on…
 The data represents a linear relationship (y = 2x).

3. Creating the Line Plot


plt.plot(x, y, marker='o', linestyle='-', color='blue', linewidth=2, label='Linear Growth')
Breakdown:
 plt.plot(x, y) → Draws a line graph.
 marker='o' → Puts a circle marker at each data point.
 linestyle='-' → Draws a solid line connecting the points.
 color='blue' → Sets the line color to blue.
 linewidth=2 → Makes the line thicker.
 label='Linear Growth' → Sets a label for the legend.

4. Customizing the Plot


Title
 plt.title("Simple Line Plot")
 Sets the title of the chart.
X-axis Label
 plt.xlabel("X Axis")
 Gives a name to the X-axis.
Y-axis Label
 plt.ylabel("Y Axis")
 Gives a name to the Y-axis.
Gridlines
 plt.grid(True)
 Shows a grid in the background (helps in reading values).
Legend
 plt.legend()
 Displays the label (“Linear Growth”) on the graph.

5. Displaying the Plot

 plt.show()
 Opens a window and displays the final plot.
Final Result
This code creates a simple blue line plot showing a straight-line relationship
between x and y values, with:
 Data point markers
 Axis labels
 Title
 Gridlines
 Legend

2. Customization Parameters:
 marker: Defines the data point style ('o' for circle, 's' for square, '*' for
star).
 linestyle: Defines the line style ('-' solid, '--' dashed, ':' dotted).
 color: Defines the line color ('r' for red, 'g' for green, '#000000' for hex
codes).
 linewidth: Defines the thickness of the line.

3. Example for Engineering Data (Stress vs. Strain):


import numpy as np
import matplotlib.pyplot as plt

strain = np.array([0, 0.1, 0.2, 0.3, 0.4, 0.5])
stress = np.array([0, 50, 95, 130, 155, 170]) # Example material data

plt.figure(figsize=(8, 5))
plt.plot(strain, stress, marker='s', color='red', linestyle='--', linewidth=1.5, label='Material X')
plt.title('Stress-Strain Curve')
plt.xlabel('Strain (ε)')
plt.ylabel('Stress (σ) in MPa')
plt.grid(True, linestyle=':')
plt.legend()
plt.show()
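When a plot needs to go into a report rather than an on-screen window, it can be saved as an image. The sketch below redraws the same example data with Matplotlib's object-oriented interface (`fig`, `ax`) and the non-interactive "Agg" backend. Writing to an in-memory buffer is a choice made here to keep the example self-contained; in practice you would normally pass a file name to `savefig`.

```python
import io

import matplotlib
matplotlib.use("Agg")            # non-interactive backend: no window is opened
import matplotlib.pyplot as plt

strain = [0, 0.1, 0.2, 0.3, 0.4, 0.5]
stress = [0, 50, 95, 130, 155, 170]   # same example material data as above

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(strain, stress, marker='s', color='red', linestyle='--', label='Material X')
ax.set_title('Stress-Strain Curve')
ax.set_xlabel('Strain')
ax.set_ylabel('Stress (MPa)')
ax.grid(True, linestyle=':')
ax.legend()

# Save as PNG into an in-memory buffer, then release the figure.
png = io.BytesIO()
fig.savefig(png, format='png', dpi=100)
plt.close(fig)
png_bytes = png.getvalue()
print("PNG size in bytes:", len(png_bytes))
```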

5. Tkinter: GUI Programming


Definition:
Tkinter is the standard GUI (Graphical User Interface) library that ships
with Python.
With Tkinter, you can create desktop applications that include windows,
buttons, labels, text boxes, menus, frames, and other interactive elements.
Tkinter is:
 Built into Python, so no installation is needed.
 Lightweight, meaning it runs efficiently even on simple systems.
 Beginner-friendly, due to its simple and readable syntax.
 Cross-platform, working on Windows, macOS, and Linux.
Because of these features, Tkinter is ideal for making:
 Small tools
 Educational apps
 Calculators
 Simple data entry forms
 GUI front-ends for Python scripts

Step-by-Step Application Creation:


import tkinter as tk
from tkinter import messagebox

# 1. Create the Main Window
root = tk.Tk()
root.title("Engineering Calculator")
root.geometry("300x200")

# 2. Create Widgets (Components)
label = tk.Label(root, text="Enter a Number:")
entry = tk.Entry(root)

def calculate_square():
    # Read the text from the entry box; float() raises ValueError on bad input
    try:
        number = float(entry.get())
        result = number ** 2
        messagebox.showinfo("Result", f"The square is {result}")
    except ValueError:
        messagebox.showerror("Error", "Please enter a valid number!")

button = tk.Button(root, text="Calculate Square", command=calculate_square)

# 3. Arrange Widgets using Geometry Manager (pack)
label.pack(pady=10)
entry.pack(pady=5)
button.pack(pady=10)

# 4. Start the Event Loop
root.mainloop()
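One optional design choice is to keep the calculation separate from the widgets, so that it can be checked without opening a window. The helper name `square_message` is hypothetical and not part of Tkinter; a button callback could call it with the text from the entry box and pass the message to a message box.

```python
def square_message(raw_text):
    """Return (ok, message) for the text typed into the entry box."""
    try:
        number = float(raw_text)
    except ValueError:
        return False, "Please enter a valid number!"
    return True, f"The square is {number ** 2}"


# The logic can be exercised directly, with no GUI involved.
print(square_message("3"))
print(square_message("2.5"))
print(square_message("abc"))
```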

Common Tkinter Widgets for Engineers:


 Label: Displays text or images.
 Entry: Single-line text input field.
 Button: Clickable button to trigger actions.
 Text: Multi-line text area.
 Canvas: For drawing graphs, shapes, and diagrams.

What is an IDE?
 An IDE enables programmers to combine the different aspects of writing
a computer program.
 IDEs increase programmer productivity by introducing features like
editing source code, building executables, and debugging.

What are IDEs and Code Editors?


IDEs and code editors are tools that software developers use to write and edit
code.

 IDEs, or Integrated Development Environments, are usually more feature-rich
and include tools for debugging, building, and deploying code.
 Code editors are generally more straightforward and focused on code
editing. Many developers use both IDEs and code editors, depending on the
task.

IDE vs. Code Editor: What's the Difference?


 An Integrated Development Environment (IDE) is a software application
that provides tools and resources to help developers write and debug
code. An IDE typically includes
o A source code editor
o A compiler or interpreter
o An integrated debugger
o A graphical user interface (GUI)
 A code editor is a text editor program designed specifically for editing
source code. It typically includes features that help in code development,
such as syntax highlighting and code completion; debugging is usually
available only through plugins.
 The main difference between an IDE and a code editor is scope: an IDE
bundles the editor with tools to build, run, and debug a project, while a
code editor focuses on editing and relies on extensions for everything else.
 Code editors are generally simpler and lighter than IDEs, as they do not
include many other IDE components. As such, code editors are often used by
developers who prefer to configure their development environment manually.

Top Python IDEs


Now that you know what an Integrated Development Environment is, let's
look at a few popular Python IDEs. They are not ranked here, because
different IDEs suit different purposes.

Instead, each entry notes who the tool is best suited for, so you can
choose the one that fits your needs.
1. IDLE

 IDLE (Integrated Development and Learning Environment) is the default
editor that ships with Python
 This IDE is suitable for beginner-level developers
 The IDLE tool can be used on macOS, Windows, and Linux
 Price: Free

The most notable features of IDLE include:

 Ability to search within multiple files
 Interactive interpreter with syntax highlighting, and error and I/O
messages
 Smart indenting, along with basic text editor features
 An integrated debugger
 It's a great Python IDE for Windows

2. PyCharm

 PyCharm is a widely used Python IDE created by JetBrains
 This IDE is suitable for professional developers and facilitates the
development of large Python projects
 Price: Freemium
The most notable features of PyCharm include:

 Support for JavaScript, CSS, and TypeScript
 Smart code navigation
 Quick and safe code refactoring
 Support features like accessing databases directly from the IDE
 It's a great Python IDE for Windows

3. Visual Studio Code

 Visual Studio Code is a free code editor created by Microsoft, with an
open-source core. It is widely used for Python development through its
Python extension
 VS Code is lightweight and comes with powerful features that otherwise only
some of the paid IDEs offer
 Price: Free
The most notable features of Visual Studio Code include:

 Smart code completion (IntelliSense) based on variable types, function
definitions, and imported modules
 Git integration
 Code debugging within the editor
 Extensions that add features such as code linting, themes,
and other services

4. Sublime Text 3

 Sublime Text is a very popular code editor. It supports many languages,
including Python
 It is highly customizable and also offers fast development speeds and
reliability
 Price: Free to evaluate; a paid license is required for continued use
The most notable features of Sublime Text 3 include:

 Syntax highlighting
 Custom user commands for using the editor
 Efficient project directory management
 It supports additional packages for web and scientific Python
development
 It's a great Python editor for Windows

5. Atom

 Atom is an open-source code editor by GitHub that supports Python
development (GitHub archived the project in December 2022)
 Atom is similar to Sublime Text and provides almost the same features, with
an emphasis on speed and usability
 Price: Free

The most notable features of Atom include:

 Support for a large number of plugins

 Smart autocompletion
 Supports custom commands for the user to interact with the editor
 Support for cross-platform development

6. Jupyter

 Jupyter is widely used in the field of data science
 It is easy to use, interactive and allows live code sharing and visualization
 Price: Free
The most notable features of Jupyter include:

 Support for numerical computing and machine learning workflows
 Combines code, text, and images for a better user experience
 Integration with data science libraries like NumPy, Pandas, and Matplotlib

7. Spyder

 Spyder is an open-source IDE most commonly used for scientific development
 Spyder comes with the Anaconda distribution, which is popular for data science
and machine learning
 Price: Free
The most notable features of Spyder include:

 Support for automatic code completion and splitting


 Supports plotting different types of charts and data manipulation
 Integration of data science libraries like NumPy, Pandas, and Matplotlib
 It's a great Python IDE for Windows

8. PyDev

 PyDev is a full-featured Python IDE distributed as a third-party plugin
for the Eclipse IDE
 Being flexible, it is one of the open-source IDEs preferred by developers
 Price: Free
The most notable features of PyDev include:

 Django integration, auto code completion, and code coverage


 Supports type hinting, refactoring, as well as debugging and code analysis
 Good support for Python web development
9. Thonny

 Thonny is an IDE ideal for teaching and learning Python programming


 Price: Free
The most notable features of Thonny include:

 Simple debugger
 Function evaluation
 Automatic syntax error detection
 Detailed view of variables used in a Python program or project

Features of an IDE

Let’s look at some main features of an IDE:

1. Syntax Highlighting

An IDE that knows your language's syntax can color keywords, strings, and
comments differently, making the code easier to read.

[Figure: the same code shown without and with syntax highlighting]

2. Autocomplete

IDEs are generally really good at anticipating what you're more likely to type
next, making coding significantly faster and simpler.

3. Building Executables

The IDE takes care of interpreting the Python code, running Python scripts,
building executables, and debugging the applications.

4. Debugging

If a program does not run correctly, programmers can detect errors in their
code using the debugging tools that IDEs offer.
