PL 300+HawkEye+Slides
PL 300+HawkEye+Slides
POWER BI
DESKTOP
ETL (EXTRACT, TRANSFORM, LOAD)
Grab / Fetch the data from a source : Apply some transformations to it & Load / Save the data to be used in
CSV, Excel, Database, Data Lake etc. clean it up visualizations!
ETL – PowerBI Terminology
Grab / Fetch the data from a source : Apply some transformations to it & Load / Save the data to be used in
CSV, Excel, Database, Data Lake etc. clean it up visualizations!
ETL – PowerBI Lifecycle
Used to Transform the Data & Push it out to the Data Model
Used to Model & Arrange the tables – define relationships & push it
Ahead for visualization (Model View)
ADLS
Text/CSV
Excel
SQL Database
Normalization
• The process of organizing various tables & establishing relationships between them.
• Don’t have 1 huge table, split it into multiple tables.
• Helps to reduce redundancy and manageability.
1 "Book 1" 1
1 "Author A" "authorA@[Link]"
Relationship
2 "Book 2" 1
Authors Books
Improving this structure
1 "Book 1" 1
1 "Author A" "authorA123@[Link]"
Relationship
2 "Book 2" 1
Authors Books
PRIMARY KEYS (PK)
1 "Book 1" 1
1 "Author A" "authorA123@[Link]"
Relationship
2 "Book 2" 1
Authors Books
Fact and Dimension Tables
• Fact Table – The central table & contains quantitative (numeric) information. E.g. – Sales, Profit etc.
• Can be aggregated & helps to answer business questions.
• They often have a FK relationship to Dimension Tables.
• Dimension Tables – Contain descriptive attributes that provide extra information about the data in the fact table.
• Contain textual/categorical data that provides context to facts.
Fact and Dimension Tables
• Now that we know about fact tables & dimension tables, how we connect them is the
next question.
• This is where Star & Snowflake schema come into the picture!
Star Schema – Looks like a Star! One connection from all dimension tables to the fact table!
103 Desk Chair Furniture Comfort Co. 204 Emily White Australia
103 Desk Chair Furniture Comfort Co. 204 Emily White Australia
103 Desk Chair Furniture Comfort Co. 204 Emily White Australia
Manufacturer
ProductID ProductName Category CustomerID CustomerName Country
ID
1 ABC KKK
201 John Doe USA
101 Laptop Electronics 1
• Indicates how many rows in one table correspond to how many rows in another table.
• Can be 1:1 (One to One), 1:* (One to Many) , *:* (Many to Many)
Cardinality – 1:1 (One to One).
• In a 1:1 relationship, each row in the first table is related to EXACTLY one row in the second table, and vice-
versa.
• In a 1:1 relationship, each row in the first table is related to EXACTLY one row in the second table, and vice-
versa.
1 Alice 101 HR
3 Charlie 103 IT
Cardinality – 1:* (One to Many).
• In a 1:* relationship, each row in the first table can be related to MULTIPLE rows in the second table. However,
each row in the second table is related to only ONE row in the first table.
101 1 Widget A
1 John john@[Link]
102 1 Widget B
2 Jane jane@[Link]
103 2 Widget C
Customers Orders
Cardinality – * : * (Many to Many)
• In a *:* relationship, each row in the first table can be related to MULTIPLE rows in the second table and vice-
versa.
103 Desk Chair Furniture Comfort Co. 204 Emily White Australia
Private Highly sensitive data. Never shared. Excel with salaries, HR files
¡ This leads to indirect data leaks (e.g., values from your private Excel affecting public API calls)
Scenario Combining:
• Private: Excel file with employee salaries and currencies • Public: Web API for exchange rates
WHY PRIVACY LEVELS MATTER?
BEST PRACTICES:
[Link]
Full support for complex transformations, Limited support; some DAX functions and
Modeling Flexibility
DAX functions, and calculated columns. transformations are restricted.
Larger, as data is stored within the Power Smaller, since data isn't stored within Power
File Size
BI file. BI.
Needed for scheduled refreshes from on- Required for accessing on-premises data
Gateway Requirement
premises sources. sources in real-time.
dd/mm/yy hh:mm:ss
Import Import
Composite Dual
DirectQuery DirectQuery
In Power BI, Dual Storage Mode is a feature that allows a table to behave both as Import and DirectQuery, depending on
how it's used in your report.
💡 In simple terms: Dual mode gives you the performance benebts of Import mode and the real-time access of
DirectQuery, intelligently switching between the two as needed.
¡ When a Dual table is used with other Import tables: It behaves as an Import table (fast, in-memory).
Types of Errors
Cell-Level Errors: Occur in specific cells but allow the query to load.
Example: Data type conversion issues, such as converting 'NA' to a number.
Solution: Handle errors using Power Query's built-in functions.
¡ Creates an internal date table behind the scenes for each date field in your model.
¡ Automatically allows time intelligence functions (like YTD, MTD, QTD, etc.) without
creating a custom calendar.
¡ It applies only to columns with Date or Date/Time data types in models using Import
mode. The column also shouldn’t be on the “Many” side of the relationship.
HOW DOES IT WORK?
¡ The table spans full calendar years covering all values in the
associated date column.
¡ For example, dates from March 20, 2016 to October 23, 2019
would result in 1,461 rows spanning all days from Jan 1, 2016
to Dec 31, 2019.
Limitations
Advantage Description
🔧 Quick Setup No need to create and relate a date table - works out of the box
📆 Auto Hierarchy Automatically creates a hierarchy: Year > Quarter > Month > Day
📊 Easy Drill-Down Hierarchies allow intuitive visual interactions without extra setup
🧠 Good for Beginners Helps users unfamiliar with DAX or calendar modeling
CONS
Limitation Description
Creates a hidden date table for every date column, which can
📈 Performance Overhead
slow down large models
Multiple date tables can exist for multiple date columns, bloating
🔁 Duplication
the model
Reason Explanation
Functions
Time Intelligence Functions like YTD, MTD, SAMEPERIODLASTYEAR require a
proper Date table.
Enables unified slicers across multiple date fields (e.g.,
Consistent Filtering
Order, Ship, Delivery).
Custom Business Logic Add fiscal periods, weekends, holidays, sort orders, etc.
R1 North Alice
CY2018
R2 South Bob
CY2019
StoreID StoreName RegionID City
CY2020
1 Elm St R1 Chicago
1 CY2019 85
1 CY2020 100
Total revenue in $220K
2 CY2020 150
CY2020 for the
3 CY2020 120
North region?
2 CY2019 90
WHAT IS DAX?
¡ Definition: DAX (Data Analysis Expressions) is a formula language used in Power BI to create custom
calculations and aggregations for data analysis.
¡ Purpose: Enables users to define measures, calculated columns, and tables to enhance data models and
generate insights.
¡ Key Uses:
o Perform complex calculations (e.g., year-over-year growth, running totals).
¡ Components:
o Measures: Calculations for summarizing data (e.g., total sales).
¡ Example: Total Sales = SUM(Sales[Amount]) (Calculates the sum of the Amount column in the Sales table).
WHY USE DAX?
¡ Power and Flexibility: Allows advanced analytics not possible with standard Power BI features.
¡ Context Awareness:
o Row Context: Calculations applied to individual rows (e.g., in calculated columns).
o Filter Context: Dynamic filtering based on report selections (e.g., slicers, visuals).
¡ Common Functions:
o Aggregation: SUM, AVERAGE, COUNT.
¡ Auto-generated by Power BI when you drag a field (like Sales) into the
Values area of a visual.
¡ Created on the fly and exist only within the visual they're used in. Unlike
calculated columns they don’t take storage space in the model!
¡ It's like writing a quick calculation on a sticky note for one use - you can't save or
reference it elsewhere.
¡ Power BI decides the aggregation type (SUM, COUNT, etc.) automatically based
on the data type.
¡ No way to say “SUM only for the year 2024” or “SUM where Product Category =
‘Electronics’” unless you turn it into an explicit measure using DAX.
EXPLICIT MEASURES
¡ Power BI automatically generates a hidden date table behind the scenes for every Date or Date/Time column in your data model.
¡ These tables are not visible in the Fields pane or in the Data Model section.
¡ Each one includes a full set of time intelligence columns (Year, Month, Quarter, etc.).
🧩 Example
¡ A hidden table like OrderDate (auto) with columns like Year, Month, etc.
¡ A date hierarchy in your Fields pane: OrderDate > Year > Quarter > Month > Day.
[Link]
TIME INTELLIGENCE MEASURES
• DATESWTD • DATESINPERIOD
• DATESMTD • DATEADD
• DATESQTD • SAMEPERIODLASTYEAR
• DATESYTD
• NEXTDAY
• TOTALWTD • NEXTMONTH
• TOTALMTD • NEXTYEAR
• TOTALQTD • PREVIOUSMONTH
• TOTALYTD • PREVIOUSQUARTER
SEMI – ADDITIVE MEASURES
• In DAX, measures are typically additive - meaning they can be summed across all dimensions (e.g., sales
amounts add up across product category, sub-category, dates, and Continents etc).
1. Inventory/stock levels: The total stock across warehouses adds up, but summing stock over multiple days
doesn't make sense—instead, you want the stock at the end of the period (e.g., closing balance).
2. Bank account balances: Additive across accounts, but shows the balance as of the last date in a time range.
3. Headcount/HR metrics: Adds up across departments, but for a quarter, you report the ending headcount, not
the sum of monthly headcounts.
Summing semi-additive measures over time would produce misleading results. Instead, DAX uses functions to
"snapshot" the value at the start, end, or average of a period.
Key Concepts
1. Fully Additive: Sum across all dimensions (e.g., Total Sales = SUM(Sales[Amount])).
2. Semi-Additive: Sum across some dimensions (e.g., geography), but use special logic for time (e.g., last non-
empty value).