0% found this document useful (0 votes)
6 views6 pages

Python Code Style Guide & Structure

Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views6 pages

Python Code Style Guide & Structure

Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Python Code Style Guide & Project Structure

Purpose
Consistency beats cleverness. Write code that any teammate can understand quickly. Favor
simple, explicit constructs and plain, international English in names and comments. For
create a clear codebase and make the future work smoother it is better to stick to these
notes, especially when writing modules as they will be used in future.

Naming
- snake_case for variables, functions, and modules.​
Examples: longName → long_name, getUser → get_user​
- PascalCase for classes. Example: class UserAccount: ...​
- UPPER_SNAKE_CASE for constants. Example: MAX_RETRIES = 3​
- Booleans read as questions. Examples: is_valid, has_items, should_retry​
- Be specific, avoid abbreviations. Examples: customer_count not cnt; amount_in_dollars not
amt_in_dol​
- Use domain words consistently. Pick one term and stick to it (customer vs client).

Imports
- Order: standard library → third-party → local. Group with a blank line between groups.​
- Within a group, alphabetize.​
- No duplicate imports. Avoid importing the same thing twice.​
- Prefer absolute imports inside packages.​
- Avoid wildcard imports (from x import *).​
- After the last import block, leave exactly 2 blank lines.

Example:
python​
import dataclasses​
import json​
from pathlib import Path​

import httpx​
import pandas as pd​

from [Link] import settings​
from [Link] import parse_date


# two blank lines after the last import​
Spacing & Layout
- Line length: target ≤ 80 chars. Break long expressions across lines with parentheses.​
- Top-level definitions: leave 2 blank lines before each function or class.​
- Inside functions: separate logical steps with a single blank line.​
- Around operators and commas: space like normal text (a + b, x = f(a, b)).

Functions & Blocks


- One job per function. If you can’t describe its purpose in one sentence, split it.​
- Docstring for public functions/classes. First line: one-sentence summary.​
- Logical blocks get short comments when intent isn’t obvious (don’t narrate the obvious).​
- Use logging over print, set the logging level at the top of the program to see logging
as prints, this allows a smoother testing and clean up phase
- Loops/conditions: add a blank line before and after when it helps readability.

- When looping over a list of files or pages or more generally loops over a lot of items
that means a lot of time to spend inside the loop add a tqdm progress bar

Example:

python​
def load_customers(path: str) -> list[dict]:​
"""Load customers from a JSONL file."""​
customers: list[dict] = []​

# Read file line by line to avoid high memory usage​
with open(path, "r", encoding="utf-8") as f:​
for line in f:​
# Parse each JSON object; skip malformed lines​
try:​
[Link]([Link](line))​
except [Link]:​
continue​

return customers​
Comments & Docstrings
- Write in simple, standard English. Short sentences. No slang or local idioms.​
- Explain “why”, not “what”. Code already shows “what”.​
- Keep comments close to the code they explain.

Example:

python​
def split_name(full_name: str) -> tuple[str, str]:​
"""Split a full name into first and last parts."""​

Types
- Use type hints for all function signatures and important variables.​
- Prefer list[str] over List[str] (Python 3.9+).​
- Run a type checker (e.g., mypy or pyright) in CI.

Strings & Formatting


- Prefer f-strings for readability: f"User {user_id} not found".​
- Use triple quotes for multi-line strings and docstrings.​
- Always specify encoding when reading/writing files: encoding="utf-8".

Errors & Logging


- Catch only what you can handle. Avoid bare except:.​
- Prefer logging over print.​
- Log actionable context, not secrets or PII.

Tests
- Test names describe behavior: test_raises_on_empty_input.​
- Arrange-Act-Assert with blank lines between the three parts.​
- Keep tests deterministic and isolated.
Performance & Clarity
- Readability first. Optimize only when needed, then document the reason.​
- Avoid premature cleverness. Prefer explicit loops over dense comprehensions if clarity
suffers.

Example: Before → After


Before:

python​
from myapp import utils,config,utils # duplicate import​
import os,sys, json​
def DoThing(a,b):# does things​
x=a+b; y=x*2; ​
for i in range(10): y+=i​
return y​

After:

python​
import json​
import sys​
from pathlib import Path​

from myapp import config​
from myapp import utils​


def do_thing(a: int, b: int) -> int:​
"""Compute doubled sum and add a small series."""​
total = a + b​

# Double once as per requirement​
total *= 2​

# Add arithmetic series 0..9​
for i in range(10):​
total += i​

return total​
Do & Don’t (quick checklist)
Do:​
- snake_case names​
- 2 blank lines after imports​
- 2 blank lines before each top-level def/class​
- Group imports: stdlib, third-party, local​
- ≤ 80-char lines; break with parentheses, not backslashes​
- Docstrings for public APIs; comments for non-obvious intent; simple English​
- Type hints; f-strings; pathlib; logging

Don’t:​
- Use ambiguous abbreviations, regional slang, or unexplained acronyms​
- Catch broad exceptions or log secrets​
- Make one function do three different things

Git & Commit


Please commit only on the datapartners organization on github, this ensure privacy and
security for the code and mistakes. Commit always the code at the end of the day. Commit on
a branch “dev” for developing, when the code is ready for testing merge it into “main”
branch (or into a “test” branch, as you prefer)

Project Structure
Use the following layout for all repos. Keep folder names as shown. Python packages inside
these folders must be in snake_case and include an __init__.py.

📁 Name-Of-The-Project/​
├─ 📁 Data/​
│ └─ <📄 data files>​
├─ 📁 Config/​
│ ├─ 📄 .env​
│ ├─ 📄 [Link]​
│ └─ <📄 [Link] if needed>​
├─ 📁 Modules/​
│ └─ <📄 support files, e.g., db_handlers.py>​
├─ 📁 Notebooks/​
│ └─ <📄 notebooks used in development>​
├─ 📁 Database/​
│ └─ <📄 schemas/migrations/seed data or local .db files>​
├─ 📁 Preprocessing/​
│ └─ <📄 data preprocessing scripts>​
├─ 📁 Test/​
│ └─ <📄 tests or test reports>​
📁
📄
├─ Deploy/​

📄
│ └─ < API/service files for deployment>​

📄
├─ [Link]​
└─ [Link]​

What goes where:​


- Data/: Raw/processed datasets, small sample fixtures. Never commit large or private data;
prefer storage links. Add patterns to .gitignore (e.g., Data/*.csv, Data/*.parquet).​
- Config/: .env (never commit secrets), [Link]; optional [Link] as a central
loader reading [Link] with defaults.​
- Modules/: Reusable app/library code (e.g., db_handlers.py, services/, models/). Make
packages snake_case.​
- Notebooks/: Exploratory work and reports. Load data via relative paths. Save outputs
under Data/ or Test/Reports/.​
- Database/: Schema, migrations, seed SQL, local SQLite DBs.​
- Preprocessing/: Scripts/pipelines to clean and transform data before modeling or serving.​
- Test/: Test code and (optionally) reports. Test files: test_*.py or *_test.py. Reports under
Test/Reports/.​
- Deploy/: Service/API entry points, deployment configs.​
- [Link]: Thin orchestrator. Avoid business logic—import from Modules/.​
- [Link]: Purpose, setup, run, test, data location, env vars.

Import rules within this structure:​


- Use absolute imports from the project root; avoid manipulating [Link].​
Example:

python​
from Modules.db_handlers import get_session​
# or​
from [Link] import get_session​

Config & secrets:​


- Load configuration from Config/.env via [Link] (optionally python-dotenv in local dev
only).​
- Never log secrets. Document required variables in [Link].

- Commit small sample datasets only (if needed for examples/tests).

Common questions

Powered by AI

The code style guide suggests handling errors by catching only exceptions that can be specifically handled, avoiding 'bare except:' clauses, and using logging instead of print statements. Logging should capture actionable context without including secrets or personally identifiable information (PII). These practices enhance error diagnosability and code security while providing insights into application behavior without exposing sensitive information .

The recommended naming convention for variables and functions in Python is 'snake_case'. For example, 'longName' should be converted to 'long_name', and 'getUser' should be converted to 'get_user' .

The style guide suggests organizing imports by grouping them into three sections: standard library imports, third-party library imports, and local application imports, separated by a blank line between groups. Within each group, imports should be alphabetically ordered, and duplicate imports as well as wildcard imports should be avoided. This organization is important as it enhances code readability, maintains consistency, and makes it easier to track dependencies and potential conflicts within the codebase .

Maintaining simplicity and consistency in a Python codebase is essential because it makes the code easier to understand for any teammate, reducing cognitive load. This approach favors simple, explicit constructs and minimizes potential errors when future changes are made. In naming conventions, this means using clear, descriptive names in 'snake_case' for variables and functions and avoiding abbreviations, ensuring clarity and maintainability of the code .

Python developers should prefer absolute imports over relative imports to ensure clarity and avoid ambiguity, particularly in larger projects. Absolute imports explicitly specify the location from which a module is being imported, enhancing readability and reducing potential errors caused by the restructuring of the project's directory. This practice aligns with the style guide's emphasis on maintainability and understandability across a codebase .

The best practices for writing comments and docstrings in Python include using simple, standard English without slang or idioms, explaining 'why' the code does something rather than 'what', and keeping comments close to the code they explain. Docstrings should summarize the purpose of functions or classes in the first line. These practices benefit a codebase by clarifying the developer's intent, aiding future maintenance, and facilitating the onboarding of new team members .

The 'main.py' file in a Python project structure serves as a thin orchestrator. It should not contain business logic; instead, it is responsible for importing functions and classes from modules. This approach ensures that 'main.py' remains focused on initializing the application and coordinating high-level flows, which aids in maintaining a clear separation of concerns and a more modular project architecture .

The benefits of optimizing Python code include improved performance in terms of execution speed and resource utilization. However, the style guide cautions against premature optimization as it can lead to overly complex or convoluted code that sacrifices readability and maintainability. Optimization should be carried out only when necessary and should be well-documented to explain the rationale behind the changes, thus preventing technical debt and ensuring that the codebase remains accessible to future developers .

Type hints in Python enhance code clarity and maintenance by explicitly specifying the expected types of function arguments and return values, which assists both human developers and static type checkers like 'mypy'. This practice reduces errors, improves code readability, and facilitates better understanding and communication among developers, making the codebase easier to navigate and extend .

A Python project repository should be organized into distinct folders such as 'Data/', 'Config/', 'Modules/', 'Notebooks/', 'Database/', 'Preprocessing/', 'Test/', and 'Deploy/'. Each folder has a specific purpose: 'Data/' for datasets, 'Config/' for configuration files, 'Modules/' for reusable code, 'Notebooks/' for exploratory analysis, 'Database/' for schema and seed data, 'Preprocessing/' for data transformation scripts, 'Test/' for testing code, and 'Deploy/' for deployment scripts. This structured layout enhances clarity, improves navigability, and facilitates collaboration among developers .

You might also like