Tevta Syllabus Book

The IT Specialist – Data Analytics Project Workbook by LearnKey provides comprehensive training resources for data analytics, including best practices, skills assessments, and project exercises. It covers various topics such as data concepts, variable types, data manipulation, analysis methods, and data visualization, aimed at preparing students for certification and enhancing employability. The workbook is designed for both self-paced learning and instructor-led training, featuring interactive labs and assessments to reinforce knowledge gained from video lessons.
IT Specialist – Data Analytics

Project Workbook

First Edition

LearnKey creates signature multimedia courseware. LearnKey provides expert instruction for popular computer software,
technical certifications, and application development with dynamic video-based courseware and effective learning
management systems. For a complete list of courses, visit [Link]

All rights reserved. Unauthorized reproduction or distribution is prohibited.

© 2025 LearnKey
[Link]
Table of Contents
Introduction
Best Practices Using LearnKey’s Online Training
Using This Workbook
Skills Assessment
IT Specialist – Data Analytics Video Times
Domain 1 Lesson 1
Define the Concept of Data
Basic Data Variable Types
Domain 1 Lesson 2
Tables, Rows, Columns, and Lists
Qualitative Data
Quantitative Data
Structured and Unstructured Data
Metadata and Big Data
Domain 2 Lesson 1
ETL Processes
Data Manipulation Tools
Data Storage File Formats
Domain 2 Lesson 2
Handle Null Values
Handle Special Characters
Trim Spaces
Handle Inconsistent Formatting
Remove Duplicates
Impute Data and Validate Data
Domain 2 Lesson 3
Sort and Filter Data
Slice Data
Transpose and Append Data
Truncate Data
Domain 2 Lesson 4
Group, Join, and Merge Data
Summarize Data
Pivot Data
Domain 3 Lesson 1
Descriptive Analysis
Diagnostic Analysis
Hypothesis Testing
Predictive and Prescriptive Analytics
Domain 3 Lesson 2
Search Data
Filter Data
Find Unique Values
Aggregate Functions
Domain 3 Lesson 3
Find Relationships in Data
Data Drilling and Data Mining
Domain 3 Lesson 4
Calculate Trends and Expected Values
Interpret Predictive Models
Interpret P-Values and T-Tests
Interpret Regression Analyses
Domain 3 Lesson 5
AI, Machine Learning, and Algorithms
AI and Machine Learning in Data Analytics
Domain 4 Lesson 1
Display Information
Disaggregate Data
Domain 4 Lesson 2
Data Visualization Practices
Visualization Types
Domain 4 Lesson 3
Translate Visual Representations
Visualizations vs. Statistics
Domain 5 Lesson 1
Privacy Laws and Standards
Domain 5 Lesson 2
Managing PII
Data Analysis
Domain 5 Lesson 3
Biases
Sampling Methods
Appendix
Glossary
Objectives
IT Specialist – Data Analytics Lesson Plan
Domain 1 Lesson Plan
Domain 2 Lesson Plan
Domain 3 Lesson Plan
Domain 4 Lesson Plan
Domain 5 Lesson Plan
Introduction

1 | Introduction: Best Practices Using LearnKey’s Online Training IT Specialist – Data Analytics Project Workbook, First Edition
Best Practices Using LearnKey’s Online Training
LearnKey offers video-based training solutions that are flexible enough to accommodate private students, educational
facilities, and organizations.

Our course content is presented by top experts in their respective fields and provides clear and comprehensive
information. The full line of LearnKey products has been extensively reviewed to meet superior quality standards. Our
course content has also been endorsed by organizations such as Certiport, CompTIA®, Cisco, Adobe, and Microsoft.
However, it is the testimonials given by countless satisfied customers that truly set us apart as leaders in the information
training world.

LearnKey experts are highly qualified professionals who offer years of job and project experience in their subjects. Each
expert has been certified at the highest level available for their field of expertise. This expertise provides the student with
the knowledge necessary to obtain top-level certifications in their chosen field.

Our accomplished instructors have a rich understanding of the content they present. Effective teaching means
presenting the basic principles of a subject while also understanding and appreciating its organization, real-world
application, and links to related disciplines. Each instructor represents the collective wisdom of their field and of our industry.

Our Instructional Technology


Each course is created independently, based on the standard objectives of the manufacturer for which the course was
developed.

We ensure that the subject matter is up-to-date and relevant. We examine the needs of each student and create training
that is both interesting and effective. LearnKey training provides auditory, visual, and kinesthetic learning materials to fit
diverse learning styles.

Course Training Model


The course training model allows students to undergo basic training, building from primary knowledge and concepts toward
more advanced application and implementation. In this method, students will use the following toolset:

Pre-assessment: The pre-assessment is used to determine the student’s prior knowledge of the subject matter. It will also
identify a student’s strengths and weaknesses, allowing them to focus on the specific subject matter they need to improve
the most. Students should not necessarily expect a passing score on the pre-assessment as it is a test of prior knowledge.

Video training sessions: Each training course is divided into sessions or domains and lessons with topics and subtopics.
LearnKey recommends incorporating all available external resources into your training, such as student workbooks,
glossaries, course support files, and additional customized instructional material. These resources are located in the folder
icon at the top of the page.

Exercise labs: Labs are interactive activities that simulate situations presented in the training videos. Step-by-step
instructions and live demonstrations are provided.

Post-assessment: The post-assessment is used to determine the student’s knowledge gained from interacting with the
training. In taking the post-assessment, students should not consult the training or any other materials. A passing score is
80 percent or higher. If the individual does not pass the post-assessment the first time, LearnKey recommends
incorporating external resources, such as the workbook and additional customized instructional material.

Workbook: The workbook has various activities, including fill-in-the-blank questions, short answer questions, practice
exam questions, and group and individual projects that allow the student to study and apply concepts presented in the
course videos.

2 | Introduction: Best Practices Using LearnKey’s Online Training IT Specialist – Data Analytics Project Workbook, First Edition
Using This Workbook
This project workbook contains practice projects and exercises to reinforce the knowledge you have gained through the
video portion of the IT Specialist – Data Analytics course. The purpose of this workbook is twofold: first, to prepare you
further to pass the IT Specialist – Data Analytics exam, and second, to teach you job-ready skills and increase your
employability in the area of data analysis.

The projects within this workbook follow the order of the video portion of this course. To save your answers in this
workbook, you must first download a copy to your computer. You will not be able to save your answers in the web version.
You can complete the workbook exercises as you go through each section of the course, complete several at the end of
each domain, or complete them after viewing the entire course. The key is to go through these projects to strengthen your
knowledge in this subject.

Each project is based upon a specific video (or videos) in the course and specific test objectives. The materials you will
need for this course include:

• LearnKey’s IT Specialist – Data Analytics courseware.

• The course project files. All applicable project files are located in the support area where you downloaded this
workbook.

• Microsoft Excel, as some projects require this software to complete project steps.

For Teachers
LearnKey is proud to provide extra support to instructors upon request. For your benefit as an instructor, we also provide
an instructor support .zip file containing answer keys, completed versions of the workbook project files, and other teacher
resources. This .zip file is available within your learning platform’s admin portal.

Notes
• Extra teacher notes, when applicable, are in the Project Details box within each exercise.

• Exam objectives are aligned with the course objectives listed in each project, and project file names correspond
with these numbers.

• The Finished folder in each domain has reference versions of each project. These can help you grade projects.

• Short answers may vary but should be similar to those provided in this workbook.

• Teachers may consider asking students to add their initials, student ID, or other personal identifiers at the end of
each saved project.

• Refer to your course representatives for further support.

We value your feedback about our courses. If you have any questions, comments, or concerns, please let us know by
visiting [Link]

3 | Introduction: Using This Workbook IT Specialist – Data Analytics Project Workbook, First Edition
Skills Assessment
Instructions: Rate your skills on the following tasks from 1-5 (1 being needs improvement, 5 being excellent).

Skills 1 2 3 4 5

Define the concept of data.

Describe basic data variable types.

Describe basic structures used in data analytics.

Describe data categories.

Import, store, and export data.

Clean data.

Organize data.

Aggregate data.

Describe and differentiate between different types of data analysis.

Describe and differentiate between data aggregation and interpretation metrics.

Describe and differentiate between exploratory data analysis methods.

Evaluate and explain the results of data analyses.

Define and describe the role of artificial intelligence in data analysis.

Report data.

Create visualizations from data.

Derive conclusions from a data visualization.

Describe data privacy laws and best practices.

Describe best practices for responsible data handling.

Given a scenario, describe types of bias that affect collection and interpretation of data.

IT Specialist – Data Analytics Video Times
Domain 1 Video Time
Define the Concept of Data and Basic Data Variable Types [Link]
Structures Used in Data Analytics and Data Categories [Link]
Total Time [Link]

Domain 2 Video Time


Import, Store, and Export Data [Link]
Clean Data [Link]
Organize Data [Link]
Aggregate Data [Link]
Total Time [Link]

Domain 3 Video Time


Different Types of Data Analysis [Link]
Aggregation and Metrics [Link]
Exploratory Data Analysis Methods [Link]
Data Analysis Results [Link]
AI in Data Analysis [Link]
Total Time [Link]

Domain 4 Video Time


Report Data [Link]
Create Visualizations from Data [Link]
Derive Conclusions [Link]
Total Time [Link]

Domain 5 Video Time


Privacy Laws and Best Practices [Link]
Responsible Data Handling [Link]
Bias [Link]
Total Time [Link]

Domain 1 Lesson 1

Define the Concept of Data

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Define the Concept of Data
Subtopic: Data Concepts and Uses
Objectives covered
1 Data Basics
1.1 Define the concept of data
1.1.1 Data concepts and uses
Notes for the teacher
Ensure students know the definitions of data and data analysis.

The first step in data analysis is defining data – its meaning, use, and why it matters. Understanding data in any form is a foundational step in data analysis.

Purpose
Upon completing this project, you will better understand the concept of data.

Steps for Completion
1. What is data?

a.

2. What is data analysis?

a.

3. Label the following statements as true or false.

a. Data takes nearly any form one can imagine in the real world.

b. Data never needs to be translated or cleaned.

Basic Data Variable Types

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Basic Data Variable Types
Subtopic: Boolean; Numeric; String
Objectives covered
1 Data Basics
1.2 Describe basic data variable types
1.2.1 Boolean
1.2.2 Numeric
1.2.3 String
Notes for the teacher
Ensure students know how to define and identify the basic data variable types.

Several data types exist, including Boolean, numeric, and string data. Each type has a specific use, and some have more than one type.

Purpose
Upon completing this project, you will become more familiar with basic data variable types.

Steps for Completion
1. Match each data type to its description.

A. Boolean
B. Numeric
C. String

a. Data that is primarily used in quantitative analysis.

b. Data that is made up of a sequence of characters, such as letters, numbers, or spaces, that are typically arranged in a specific, meaningful order.

c. Data that is frequently used to create conditions and, in programming, can tell a computer to take a certain action if a condition is met.

2. Refer to each image below. Then, label the data type pictured as Boolean, numeric, or string.

a. b.

c.
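
For readers who also work in Python (one of the tools named later in this workbook), the three basic variable types can be sketched as follows. This is only an illustration: the variable names and values are hypothetical and do not come from the course project files.

```python
# Hypothetical survey record; names and values are illustrative only.
is_member = True              # Boolean: one of two values, True or False
age = 34                      # numeric (whole number)
monthly_fee = 29.99           # numeric (decimal number)
email = "guest@example.com"   # string: a sequence of characters

# A Boolean is often used to build a condition that triggers an action.
if is_member:
    greeting = "Welcome back"
else:
    greeting = "Welcome"

# Numeric values support the arithmetic used in quantitative analysis.
yearly_fee = round(monthly_fee * 12, 2)

# Show the type Python assigns to each value.
for name, value in [("is_member", is_member), ("age", age),
                    ("monthly_fee", monthly_fee), ("email", email)]:
    print(name, type(value).__name__)
```

Note that Python splits numeric data into `int` and `float`; other tools may name their numeric types differently.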

Domain 1 Lesson 2

Tables, Rows, Columns, and Lists

Project Details
Project file
N/A
Estimated completion time
5-10 minutes
Video reference
Domain 1
Topic: Structures Used in Data Analytics
Subtopic: Tables, Rows, Columns; Lists
Objectives covered
1 Data Basics
1.3 Describe basic structures used in data analytics
1.3.1 Tables
1.3.2 Rows
1.3.3 Columns
1.3.4 Lists
Notes for the teacher
Ensure students can define a data table and its different parts. Make sure they understand the difference between a table and a list.

There are several methods of storing and structuring data in data analysis. Tables are one of the most important structuring methods due to their flexibility and ease of use. Tables are made up of rows and columns. Lists differ from tables in that they are less structured.

Purpose
Upon completing this project, you will become more familiar with tables, rows, and columns.

Steps for Completion
1. What is a data table?

a.

2. What are individual data points in a table known as?

a.

3. What do rows in a data table usually represent?

a.

4. What do columns in a data table usually represent?

a.

5. What is a list?

a.

6. When are lists usually used in programming?

a.
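
The difference between a table and a list can also be seen in a short Python sketch. The data and names below are hypothetical and are used only to show the shapes involved. The sketch represents a table as a list of rows under a shared set of column names, which is one common convention rather than the only one.

```python
# Hypothetical gym-visit data; names and values are illustrative only.
# A table: every row holds one value for each column.
columns = ["guest", "visits", "is_member"]
rows = [
    ["Ana", 12, True],
    ["Ben", 3, False],
    ["Cy", 7, True],
]

# One data point sits at the intersection of a row and a column.
visits_index = columns.index("visits")
ben_visits = rows[1][visits_index]

# Reading a column means taking the same position from every row.
visits_column = [row[visits_index] for row in rows]

# A list: an ordered collection with no column structure.
guest_names = ["Ana", "Ben", "Cy"]

print(ben_visits)
print(visits_column)
print(len(guest_names))
```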

Qualitative Data

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Data Categories
Subtopic: Qualitative
Objectives covered
1 Data Basics
1.4 Describe data categories
1.4.1 Qualitative
Notes for the teacher
Ensure students can define qualitative data and understand how it differs from quantitative data. If time permits, discuss other examples of qualitative data.

In addition to data types, there are different data categories, which can be thought of as the larger buckets in which data types fit and serve various purposes. The first data category in this workbook is qualitative data.

Purpose
Upon completing this project, you will better understand qualitative data.

Steps for Completion
1. What is qualitative data?

a.

2. Why is qualitative data more prone to nuance than quantitative data?

a.

3. What are three different examples of sources for qualitative data?

a.

4. Refer to the image below, then answer the following question.

5. Which column represents qualitative data?

6. What methodologies might one use to analyze the qualitative data in this table?

a.

Quantitative Data

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Data Categories
Subtopic: Quantitative
Objectives covered
1 Data Basics
1.4 Describe data categories
1.4.2 Quantitative
Notes for the teacher
Ensure students know how quantitative data differs from qualitative data. Make sure they know the different sources and uses for quantitative data.

Qualitative data’s counterpart is quantitative data, the bucket in which numeric data fits; it is quantifiable and can be analyzed statistically.

Purpose
Upon completing this project, you will better understand quantitative data.

Steps for Completion
1. How is quantitative data different from qualitative data?

a.

2. What are some examples of sources for quantitative data?

a.

3. What are some uses for quantitative data?

a.

4. Refer to the image below, then answer the following question.

5. Which two columns in the table contain quantitative data?

a.

Structured and Unstructured Data

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Data Categories
Subtopic: Structured; Unstructured
Objectives covered
1 Data Basics
1.4 Describe data categories
1.4.3 Structured
1.4.4 Unstructured
Notes for the teacher
Ensure students know the difference between structured and unstructured data. Make sure they know that data is not always structured.

An important distinction in data analysis is whether data is structured or unstructured. Structured data is organized in a predefined format following a specific schema, usually in databases or spreadsheets. Unstructured data has no overarching schema or organization and exists in its raw form.

Purpose
Upon completing this project, you will become more familiar with structured and unstructured data.

Steps for Completion
1. Which type of data is the form in which businesses collect and report data?

a.

2. Refer to the images below, then label each as structured or unstructured data.

a.

b.
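
As a rough illustration of the schema idea described above, the Python sketch below checks whether a value fits a predefined set of fields. The schema, the records, and the `follows_schema` helper are all hypothetical and are not part of the course materials.

```python
# Hypothetical examples; field names and text are illustrative only.
# Structured: every record follows the same predefined schema.
schema = ("guest_id", "visit_date", "rating")
structured = [
    (101, "2025-01-04", 5),
    (102, "2025-01-05", 3),
]


def follows_schema(record):
    """Return True when a record has exactly one value per schema field."""
    return isinstance(record, tuple) and len(record) == len(schema)


# Unstructured: raw text with no fields, so it cannot fit the schema.
unstructured = "Loved the new treadmills, but the locker room was crowded."

print(all(follows_schema(record) for record in structured))
print(follows_schema(unstructured))
```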

Metadata and Big Data

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Data Categories
Subtopic: Metadata; Big Data
Objectives covered
1 Data Basics
1.4 Describe data categories
1.4.5 Metadata
1.4.6 Big data
Notes for the teacher
Ensure students can define metadata and big data. If time permits, discuss more examples of each kind of data.

Metadata and big data are two more data categories available in data analysis. Metadata is data that accompanies and gives context to other types of data. Big data is data in very large quantities. Data analysts often work with extensive datasets with millions or billions of entries.

Purpose
Upon completing this project, you will better understand metadata and big data.

Steps for Completion
1. List two examples of metadata.

a.

2. What are the three Vs of big data?

a.

3. What does big data often require due to its size?

a.

4. List two examples of big data.

a.

Domain 2 Lesson 1

ETL Processes

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Import, Store, and Export Data
Subtopic: ETL Processes
Objectives covered
2 Data Manipulation
2.1 Import, store, and export data
2.1.1 Fundamental understanding of ETL (extract, transform and load) processes
Notes for the teacher
Make sure students understand the different parts of ETL and how they work.

The first step in data analysis is ETL, which stands for extract, transform, load. ETL serves as the foundation for data analysis, initiating the process of preparing data for analysis.

Purpose
Upon completing this project, you will become more familiar with ETL processes.

Steps for Completion
1. What does each word in ETL mean?

a. Extract

b. Transform

c. Load

2. Datacize ran a survey built in Google Forms last month asking guests about their gym experiences. The data cannot be analyzed directly to the degree you wish in Google Forms, so you must perform an ETL. Label each step appropriately as the Extract (E), Transform (T), or Load (L) part of this process.

a. Eliminate unnecessary data columns or attributes and edit headers to remove unnecessary characters.

b. Download the responses in a CSV file.

c. Open the CSV file in Excel.
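
The three ETL stages can also be sketched in Python, using only the standard library. This is a minimal sketch under stated assumptions, not the workflow the course uses: the CSV text, the column names, the `survey` table, and the in-memory SQLite destination are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical survey export; column names and values are illustrative only.
RAW_CSV = """Guest Name*,Rating (1-5),Internal Note
Ana,5,n/a
Ben,3,follow up
"""


def extract(text):
    """Extract: read the rows out of the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop an unneeded column and tidy the header names."""
    return [
        {"guest_name": row["Guest Name*"], "rating": int(row["Rating (1-5)"])}
        for row in rows
    ]


def load(rows, connection):
    """Load: write the prepared rows into the destination table."""
    connection.execute("CREATE TABLE survey (guest_name TEXT, rating INTEGER)")
    connection.executemany(
        "INSERT INTO survey VALUES (:guest_name, :rating)", rows
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), connection)

# Once loaded, the data can be analyzed in the destination.
average = connection.execute("SELECT AVG(rating) FROM survey").fetchone()[0]
print(average)
```

Keeping each stage in its own function makes it easier to see where a problem was introduced when the loaded data looks wrong.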

Data Manipulation Tools

Project Details
Project file
[Link]
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Import, Store, and Export Data
Subtopic: Data Manipulation Tools; Power Query
Objectives covered
2 Data Manipulation
2.1 Import, store, and export data
2.1.2 Data manipulation tools (SQL, R, Python, Microsoft Excel including aspects of Power Query)
Notes for the teacher
Make sure students know the different data manipulation tools available. Ensure they know how to load data into Microsoft Excel.

Data manipulation tools are software designed to process, modify, and transform raw data into a more usable format. Various data manipulation tools, such as Python, SQL, R, Microsoft Excel, and Power Query, are available to help prepare data for analysis.

Purpose
Upon completing this project, you will become more familiar with data manipulation tools.

Steps for Completion
1. Match each data manipulation tool to its description.

A. SQL
B. Python
C. R
D. Microsoft Excel
E. Power Query

a. A high-level, interpreted programming language typically used for the modules and packages from its extensive library. It is the most popular and widely used programming language here.

b. A domain-specific programming language mainly used by organizations for maintaining and manipulating relational databases.

c. A data connection and transformation tool that is available across several Microsoft products and provides powerful data connection, import, and transformation capabilities.

d. A high-level programming language that tends to be used specifically for its extensive statistics libraries.

e. A commonly used data analytics environment that provides many data transformation, analysis, and visualization capabilities. It allows users to store, manipulate, and display data in a tabular format.

2. Open a new blank spreadsheet in Microsoft Excel.

3. Import data from the [Link] file in your Domain 2 Student folder.

4. Save the file as 212-megaGymDataset-Completed

Data Storage File Formats

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Import, Store, and Export Data
Subtopic: Data Storage File Formats
Objectives covered
2 Data Manipulation
2.1 Import, store, and export data
2.1.3 Common data storage file formats (delimited data files, XML, JSON)
Notes for the teacher
Make sure students understand the different file formats and how they are used.

Data comes in many formats, such as text documents, video files, or audio files. Knowing the basics about file formats one is most likely to encounter in data analysis is important, as it can help one understand the tools used to load and manipulate them.

Purpose
Upon completing this project, you will better understand data storage file formats.

Steps for Completion
1. Match each file type to its description.

A. CSV
B. JSON
C. XML
D. .xls and .xlsx

a. A text-based, easy-to-read data format used for structured data in which the data is organized into key-value pairs and nested objects.

b. A plain text file in which a line represents a row of data, with individual values separated by commas (commonly called a delimiter).

c. A markup language used to store and exchange structured data that can also store and represent semi-structured data, such as configuration settings files.

d. Files that are native to Excel and store data in spreadsheets.

2. Which file type is popular for being an easily read and accessed method of storing data in a simple table format?

a.

3. Which two file formats mentioned here are often used with programming languages such as Python?

a.
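
To make the text-based formats more concrete, the Python sketch below reads the same record from CSV, JSON, and XML text using the standard library. The record itself is hypothetical, and real files would normally be opened from disk rather than written inline.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in three text formats (values are illustrative).
csv_text = "name,visits\nAna,12\n"
json_text = '{"name": "Ana", "visits": 12}'
xml_text = "<guest><name>Ana</name><visits>12</visits></guest>"

# Delimited file: one line per row, values separated by a comma delimiter.
csv_row = next(csv.DictReader(io.StringIO(csv_text)))

# JSON: key-value pairs, which may also be nested.
json_row = json.loads(json_text)

# XML: markup in which tags wrap each value.
root = ET.fromstring(xml_text)
xml_row = {child.tag: child.text for child in root}

print(csv_row["visits"], json_row["visits"], xml_row["visits"])
```

One practical difference shows up here: the CSV and XML readers return every value as text, while JSON keeps 12 as a number, so CSV and XML data often needs a type conversion step after loading.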

Domain 2 Lesson 2

Handle Null Values

Project Details
Project file
221-CAT_Survey [Link]
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Handle Null Values
Objectives covered
2 Data Manipulation
2.2 Clean data
2.2.1 Handling Null values
Notes for the teacher
Make sure students can define null values. Ensure they know how to find null values in an Excel spreadsheet.

Data cleaning is a crucial first step in data analytics because raw data is often messy. Cleaning data ensures the information is accurate, reliable, and useful for analysis. There are several issues with datasets that require cleaning, and a common one is null values.

Purpose
In this project, you will practice handling null values.

Steps for Completion
1. Open the 221-CAT_Survey [Link] file from your Domain 2 Student folder.

2. In cell B55, type Null values

3. In cell B56, use the COUNTBLANK formula to determine the number of null values in the Age column.

a. Copy the formula to the rest of the cells in row 56.

4. Save the file as 221-CAT_Survey Results-Completed

5. What are null values?

a.

6. What are three ways to handle null values?

a.
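
Outside Excel, the same kind of count can be sketched in Python. The sample data and the `count_blank` helper below are hypothetical. Note one assumption: the helper treats a cell that contains only spaces as blank, which is a design choice for this sketch and may not match how COUNTBLANK treats such cells.

```python
import csv
import io

# Hypothetical survey extract; column names and values are illustrative only.
RAW_CSV = """Age,City,Rating
34,Provo,5
,Orem,4
51,,
"""

rows = list(csv.DictReader(io.StringIO(RAW_CSV)))


def count_blank(rows, column):
    """Count rows whose value in `column` is missing, empty, or only spaces."""
    return sum(1 for row in rows if not (row.get(column) or "").strip())


# One count per column, similar to copying a formula across a row.
null_counts = {column: count_blank(rows, column) for column in rows[0]}
print(null_counts)
```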

Handle Special Characters

Project Details
Project file
222-CSAT_Survey [Link]
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Handle Special Characters
Objectives covered
2 Data Manipulation
2.2 Clean data
2.2.2 Handling special characters
Notes for the teacher
Ensure students know how to handle special characters using find and replace. Make sure they know there are several ways to handle special characters.

Another issue one might encounter when cleaning data for analysis is special characters. Special characters can be intentionally present in a dataset or result from accidental data entry or parsing.

Purpose
In this project, you will practice handling special characters.

Steps for Completion
1. Open the 222-CSAT_Survey [Link] file from your Domain 2 Student folder.

2. Use Find and Replace to remove all exclamation points, periods, commas, and question marks from column J. (In Excel's Find and Replace, ? is a wildcard that matches any single character, so enter ~? to match a literal question mark.)

3. In cell K1, type Cleaned feedback

4. In cells K2 through K53, use the SUBSTITUTE formula to replace all hyphens with spaces in cells J2 through J53.

5. Copy and paste the column K values onto column J.

6. Save the file as 222-CSAT_Survey Results-Completed

7. What are special characters?

a.

8. What are two methods for handling special characters?

a.
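
The two cleaning actions in this project, deleting certain punctuation and swapping hyphens for spaces, can be sketched in Python as follows. The feedback strings and the `clean` helper are hypothetical and are not taken from the project file.

```python
# Hypothetical feedback strings; values are illustrative only.
feedback = [
    "Great class!",
    "Too crowded, too loud.",
    "Is the pool open-late?",
]

# Characters to delete outright: ! . , ?
remove_table = str.maketrans("", "", "!.,?")


def clean(text):
    """Delete the listed punctuation, then replace hyphens with spaces."""
    return text.translate(remove_table).replace("-", " ")


cleaned = [clean(text) for text in feedback]
print(cleaned)
```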

Trim Spaces

Project Details
Project file
223-CSAT_Survey [Link]
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Trim Spaces
Objectives covered
2 Data Manipulation
2.2 Clean data
2.2.3 Purpose and common practices: trimming spaces
Notes for the teacher
Make sure students know how to use the TRIM function in Excel to remove extra spaces.

Trimming spaces is another common data-cleaning process. It is necessary when data has too many spaces or spaces where there should be none. These additional spaces can create issues with parsing data, especially when passed through a tool or algorithm that does not expect them to be there.

Purpose
In this project, you will practice trimming spaces.

Steps for Completion
1. Open the 223-CSAT_Survey [Link] file from your Domain 2 Student folder.

2. Create a new column between columns E and F.

3. In cell F1, type Trimmed Email

4. In cell F2, use the TRIM function on cell E2.

5. Copy the TRIM function in cell F2 to cells F3 through F53.

6. Copy and paste the values in column F to column E.

7. Save the file as 223-CSAT_Survey Results-Completed
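
A rough Python counterpart to trimming is sketched below. Excel's TRIM removes leading and trailing spaces and reduces repeated spaces between words to one; the hypothetical `trim` helper here does the same, with the assumption that tabs and other whitespace should be treated like spaces too. The sample values are illustrative only.

```python
# Hypothetical email values with stray spaces; values are illustrative only.
emails = ["  ana@example.com", "ben@example.com   ", " cy@example.com "]


def trim(text):
    """Strip outer whitespace and collapse inner runs to a single space."""
    return " ".join(text.split())


trimmed = [trim(email) for email in emails]
print(trimmed)

# Inner runs of spaces are reduced to one space rather than removed.
print(trim("Ana   Maria  Lopez"))
```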

Handle Inconsistent Formatting

Project Details
Project file
224-CSAT_Survey [Link]
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Handle Inconsistent Formatting
Objectives covered
2 Data Manipulation
2.2 Clean data
2.2.4 Handling inconsistent formatting
Notes for the teacher
Make sure students understand why it is important to have consistent formatting for data analysis. If time permits, discuss when one might use different formats and why.

Another common issue when cleaning data is inconsistent formatting. Like extra spaces, inconsistent formatting may be due to input errors, problems with previous data parsing, or a storage issue. Whatever the case, data analysts must address and clean the issue before performing effective data analysis.

Purpose
In this project, you will practice handling inconsistent formatting.

Steps for Completion
1. Open the 224-CSAT_Survey [Link] file from your Domain 2 Student folder.

2. Change the dates in column C to the MM/DD/YY format.

3. Manually adjust any dates that remain unformatted.

4. Save the file as 224-CSAT_Survey Results-Completed

5. Give an example of inconsistent formatting.

a.
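
Date formats are a frequent source of inconsistency, and a similar cleanup can be sketched in Python. The date strings, the list of known formats, and the `to_mm_dd_yy` helper are assumptions made for this illustration. Values that match no known format come back as None so they can be reviewed by hand, much like the manual adjustment step in this project.

```python
from datetime import datetime

# Hypothetical date strings in mixed formats; values are illustrative only.
raw_dates = ["2025-01-04", "01/05/2025", "06.01.2025", "unknown"]

# Formats this sketch knows how to read (an assumption, not a complete list).
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"]


def to_mm_dd_yy(text):
    """Return the date as MM/DD/YY, or None when no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%m/%d/%y")
        except ValueError:
            continue
    return None


normalized = [to_mm_dd_yy(text) for text in raw_dates]
print(normalized)
```

The order of the known formats matters when a string could be read more than one way, so an ambiguous source column should be checked before it is converted in bulk.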

Remove Duplicates

Project Details
Project file
225-CSAT_Survey [Link]
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Remove Duplicates
Objectives covered
2 Data Manipulation
2.2 Clean data
2.2.5 Removing duplicates
Notes for the teacher
Make sure students understand that unintended duplicates can skew data and cause bias. Ensure they know that sometimes duplicates are intentional and important to datasets. If time permits, discuss when duplicates might be important and why.

Another step that is often necessary when cleaning data is removing duplicates. Duplicate entries can bias datasets too far toward one data point or observation, so analysts must remove them in many cases.

Purpose

In this project, you will practice removing duplicate data entries.

Steps for Completion

1. Open the 225-CSAT_Survey [Link] file from your Domain 2 Student folder.

2. Use the Remove Duplicates data tool to remove duplicate entries from the table.

3. Save the file as 225-CSAT_Survey Results-Completed

4. Label the following statements as true or false.

a. Duplicates should always be removed from datasets.

b. Leaving unintended duplicates in a dataset can reduce data quality and skew results.
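
As a rough illustration of what a duplicate-removal tool does, the sketch below keeps the first occurrence of each row and drops later exact copies. The function name and the sample survey rows are hypothetical, and the sketch assumes a duplicate means a match on every column.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row)  # compare on all columns
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical survey rows: (customer ID, email, satisfaction score).
survey = [
    ("C001", "ana@example.com", 5),
    ("C002", "raj@example.com", 3),
    ("C001", "ana@example.com", 5),  # exact duplicate of the first row
]
deduped = remove_duplicates(survey)
print(len(survey), "rows before,", len(deduped), "rows after")
```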

Impute Data and Validate Data

Project Details
Project file
226-CSAT_Survey [Link]
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Impute Data; Validate Data
Objectives covered
2 Data Manipulation
2.2 Clean data
2.2.6 Imputing data
2.2.7 Validating data
Notes for the teacher
Make sure students understand how to impute data to handle null values. Ensure students know what validation is. If time permits, discuss questions one might ask when validating data.

Data analysts must deal with null values in a dataset, as they can affect the accuracy of data analysis and insights. A potential solution to dealing with null values is imputing data.

Validation verifies that analyzed data is reliable, accurate, and complete. It is a higher-level process than other data cleaning steps, meaning there is no single process or formula to follow to validate data effectively.

Purpose

In this project, you will practice imputing and validating data.

Steps for Completion

1. Open the 226-CSAT_Survey [Link] file from your Domain 2 Student folder.

2. In cell B53, use the MEDIAN formula to find the median age of cells B2:B51.

3. Select cells B2:B51 and use Find and Replace to find blank cells and replace them with the median value in cell B53.

4. Use the COUNTBLANK formula in cell B52 to ensure no blank cells are
in the Age column.

5. Save the file as 226-CSAT_Survey Results-Completed

6. What are some descriptive statistics one might use on numeric data for validation?

a.
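
The median imputation in steps 2 through 4 can be sketched in Python as follows. The ages list is hypothetical, with None standing in for a blank cell; the final count plays the role of the COUNTBLANK check.

```python
from statistics import median

# Hypothetical Age column; None represents a blank cell.
ages = [34, None, 29, 41, None, 38]

known = [a for a in ages if a is not None]
fill_value = median(known)  # blanks are excluded, as with MEDIAN in Excel

# Replace each blank with the median of the known values.
imputed = [fill_value if a is None else a for a in ages]

# Validation check comparable to COUNTBLANK: no blanks should remain.
blank_count = sum(a is None for a in imputed)
print(fill_value, imputed, blank_count)
```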

Domain 2 Lesson 3

Sort and Filter Data

Project Details
Project file
231-Feb_visits.xlsx
Estimated completion time
5-10 minutes
Video reference
Domain 2
Topic: Organize Data
Subtopic: Sort and Filter Data
Objectives covered
2 Data Manipulation
2.3 Organize data
2.3.1 Sorting and filtering data
Notes for the teacher
Ensure students know how to sort and filter data in an Excel table.

After cleaning is complete, data organization is another important step in data analysis. Analysts can take a deeper look at data and rearrange it to work more effectively in their analyses. Sorting and filtering in Excel is an easy way to organize data.

Purpose

In this project, you will practice sorting and filtering data.

Steps for Completion

1. Open the 231-Feb_visits.xlsx file from your Domain 2 Student folder.

2. Select the entire table.

3. Use Custom Sort to sort the data in the Date of Last Visit column.

a. Sort the cell values from oldest to newest.

4. Filter all the columns in the table.

5. Use the Class Type dropdown menu to show only Spin/Cycle class data.

6. Save the file as 231-Feb_visits-Completed
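
The same sort-then-filter idea can be sketched in Python. The rows and field names below are invented for illustration and do not come from the project file.

```python
from datetime import date

# Hypothetical visit records.
visits = [
    {"member": "M01", "class_type": "Yoga", "last_visit": date(2024, 2, 9)},
    {"member": "M02", "class_type": "Spin/Cycle", "last_visit": date(2024, 2, 4)},
    {"member": "M03", "class_type": "Spin/Cycle", "last_visit": date(2024, 2, 1)},
]

# Sort by date of last visit, oldest to newest.
by_date = sorted(visits, key=lambda row: row["last_visit"])

# Filter the sorted rows to show only Spin/Cycle classes.
spin_only = [row for row in by_date if row["class_type"] == "Spin/Cycle"]
print([row["member"] for row in spin_only])
```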

Slice Data

Project Details
Project file
232-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Organize Data
Subtopic: Slice Data
Objectives covered
2 Data Manipulation
2.3 Organize data
2.3.2 Slicing data
Notes for the teacher
Make sure students know how to create a PivotTable from data in Excel. Ensure they know how to use the slicing tools.

Another important data manipulation skill is slicing data, or subsetting. This means extracting a subset of data from a larger dataset based on specified conditions or criteria. Slicing data gives analysts more flexibility in manipulating and looking at different pieces of data.

Purpose

In this project, you will practice slicing data.

Steps for Completion

1. Open the 232-Feb_visits.xlsx file from your Domain 2 Student folder.

2. Create a PivotTable from the entire table.

3. Select the Purchased Items field and add the values to the table.

4. Select the Amount Spent field.

5. Open the Insert Slicer tool and select the Date of Last Visit.

a. Filter for the dates 2/4/2024 and 2/9/2024

6. Save the file as 232-Feb_visits-Completed

Transpose and Append Data

Project Details
Project file
233-Feb_visits.xlsx
234-Feb_visits.xlsx
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Organize Data
Subtopic: Transpose Data; Append Data
Objectives covered
2 Data Manipulation
2.3 Organize data
2.3.3 Transposing data
2.3.4 Appending data
Notes for the teacher
Make sure students know how to transpose and append data in Microsoft Excel.

Transposing data is a method of organization specifically used for tabular structured data. It entails swapping the rows and columns in the schema; rows become columns and vice versa.

Appending data means adding new rows or columns of data to an existing dataset. It is typically easiest to do if the data is in table form, and it is important if an analyst wants to combine datasets or add later observations or responses.

Purpose

In this project, you will practice transposing and appending data.

Steps for Completion

1. Open the 233-Feb_visits.xlsx file from your Domain 2 Student folder.

2. Create a new sheet named Transposed data

3. On the 233-Feb_visits sheet, select and copy all the data.

4. Transpose and paste the data to the Transposed data sheet.

5. Save the file as 233-Feb_visits-Completed

6. Open the 234-Feb_visits.xlsx file.

7. Navigate to the Round 2 results sheet.

8. Copy all the data in the table.

9. Append the data to the 234-Feb_visits sheet in cell A78.

10. Save the file as 234-Feb_visits-Completed
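
Both operations can be sketched in a few lines of Python. The small table and the round_2 rows below are hypothetical; the sketch only shows that transposing swaps rows with columns, while appending adds rows to the end of an existing table.

```python
# Hypothetical table: a header row followed by data rows.
table = [
    ["Member", "Class Type", "Amount Spent"],
    ["M01", "Yoga", 12.5],
    ["M02", "Spin/Cycle", 8.0],
]

# Transpose: each original column becomes a row.
transposed = [list(column) for column in zip(*table)]

# Append: add later observations below the existing rows.
round_2 = [["M03", "Aerobics", 5.0]]
appended = table + round_2

print(transposed)
print(len(table), "rows before,", len(appended), "rows after")
```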

Truncate Data

Project Details
Project file
235-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Organize Data
Subtopic: Truncate Data
Objectives covered
2 Data Manipulation
2.3 Organize data
2.3.5 Truncating data
Notes for the teacher
Ensure students understand what it means to truncate data in the context of database management, and for this lesson. Make sure they have truncated the unnecessary data in the table on a new sheet named Truncated data.

While appending data means adding entries, or rows, to the entire dataset, truncating data can mean more than one thing in practice. In a database management context, truncating data often means removing data from a table entirely. For this lesson, truncating data means shortening or reducing the data length on a variable-by-variable basis.

Purpose

Upon completing this project, you will better understand truncating data.

Steps for Completion

1. Open the 235-Feb_visits.xlsx file from your Domain 2 Student folder.

2. Create a new sheet named Truncated data

3. Copy the table from the 235-Feb_visits sheet to the Truncated data sheet.

4. On the Truncated data sheet, truncate the Timestamp and IP Address columns from the table.

5. Save the file as 235-Feb_visits-Completed

6. Label the following statements as true or false.

a. Data can be removed from a table entirely with a database management tool or
programming language such as SQL.

b. Truncating data can also mean eliminating rows or columns in data that are not deemed
valuable or necessary for analysis purposes.

Domain 2 Lesson 4

Group, Join, and Merge Data

Project Details
Project file
241-Feb_visits.xlsx
242-Join_merge.xlsx
242-Survey_results_1-start
242-Survey_results_2-start
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Aggregate Data
Subtopic: Group Data; Join or Merge Data
Objectives covered
2 Data Manipulation
2.4 Aggregate data
2.4.1 Grouping data
2.4.2 Joining/merging data
Notes for the teacher
In file 241-Feb_visits – [Link], make sure students have grouped columns E through H together and found the average Satisfaction Score for the grouped data. In file 242-Join_merge – [Link], ensure students have merged the 223-CSAT_Survey_results-start table and the Truncated data table.

The final set of topics in data manipulation is about data aggregation, or grouping and organizing data together for analysis. Grouping data means organizing it into subsets based on common characteristics or criteria. Grouping is important because it empowers analysts to group relevant data in large datasets and makes their analysis more focused.

Another way to aggregate data is by joining and merging it. Joining and merging are interchangeable terms in most data analytics contexts and refer to combining information from multiple sources into a single dataset.

Purpose

In this project, you will practice grouping, joining, and merging data.

Steps for Completion

1. Open the 241-Feb_visits.xlsx file from your Domain 2 Student folder.

2. Group the Class Type, Purchased Items, Amount Spent, and Satisfaction Score columns together.

3. Use the Subtotal tool to give the average Satisfaction Score for the grouped data.

4. Save the file as 241-Feb_visits-Completed

5. Open the 242-Join_merge.xlsx file from your Domain 2 Student folder.

6. Import data from the 242-Survey_results_1-start file.

a. Load 223-CSAT_Survey_results-start.

7. Import data from the 242-Survey_results_2-start file.

a. Load Truncated data.

8. Merge the 223-CSAT_Survey_results-start and Truncated data sheets.

9. Use the Left Outer Join Kind and select the Customer ID columns in each table.

10. Expand the merged data from the second sheet.

11. Save the file as 242-Join_merge-Completed
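
To show what a left outer join on a shared Customer ID does, here is a minimal Python sketch. The two small tables and their field names are hypothetical, and the sketch assumes each customer ID appears at most once in the second table. Every row from the first (left) table is kept; where the second table has no match, the added field is left empty.

```python
# Hypothetical left table: survey results.
survey = [
    {"customer_id": "C001", "score": 5},
    {"customer_id": "C002", "score": 3},
    {"customer_id": "C003", "score": 4},
]
# Hypothetical right table: visit data (no row for C002).
visits = [
    {"customer_id": "C001", "class_type": "Yoga"},
    {"customer_id": "C003", "class_type": "Spin/Cycle"},
]

# Index the right table by the join key (assumes one row per customer ID).
visits_by_id = {row["customer_id"]: row for row in visits}

merged = []
for row in survey:
    match = visits_by_id.get(row["customer_id"], {})
    # Keep every left row; unmatched rows get None for the added column.
    merged.append({**row, "class_type": match.get("class_type")})

for row in merged:
    print(row)
```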

Summarize Data

Project Details
Project file
243-Feb_visits.xlsx
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Aggregate Data
Subtopic: Summarize Data
Objectives covered
2 Data Manipulation
2.4 Aggregate data
2.4.3 Summarizing data
Notes for the teacher
Make sure students have created the sum and mean of cells G2:G79 in cells G86 and G87, respectively.

Summarizing data is an initial step of analyzing data and refers to the process of obtaining summary statistics. Summarizing data is a great way for data analysts to understand their data better and get summary statistics before beginning deeper analysis.

Purpose

In this project, you will practice summarizing data.

Steps for Completion

1. Open the 243-Feb_visits.xlsx file from your Domain 2 Student folder.

2. Create a new sheet named Drink analysis

3. Copy the table from the Truncated data sheet to the Drink analysis sheet.

4. Filter the table, then filter the Purchased Items column for those cells containing the word drink

5. In cell F86, type Sum

6. In cell F87, type Mean

7. In cell G86, use the SUBTOTAL function to produce the sum of cells G2:G79

8. In cell G87, use the SUBTOTAL function to produce the average of cells G2:G79

9. Save the file as 243-Feb_visits-Completed

Pivot Data

Project Details
Project file
244-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Aggregate Data
Subtopic: Pivot Data
Objectives covered
2 Data Manipulation
2.4 Aggregate data
2.4.4 Pivoting data
Notes for the teacher
Make sure students have added the Class Attended field to the PivotTable and changed the order of rows so it comes before the Purchased Items row. Ensure they know the difference between transposing and pivoting data formats.

Pivoting data is similar to transposing data but with some key differences. While transposing data is typically focused on switching rows with columns and vice versa, pivoting involves extra steps to gain summary statistics and possibly aggregate the data.

Purpose

In this project, you will practice pivoting data.

Steps for Completion

1. Open the 244-Feb_visits.xlsx file from your Domain 2 Student folder.

2. Add the Class Attended field to the PivotTable.

3. Change the order of the rows so that Class Attended is before Purchased Items.

4. Save the file as 244-Feb_visits-Completed

5. Label the following statements as true or false.

a. Transposing and pivoting data always convert data into the same formats.

b. Pivoting data can aggregate or summarize data values and rearrange them.

c. Wide format data has many rows and fewer columns, while tall format data has many
columns and fewer rows.

Domain 3 Lesson 1

Descriptive Analysis

Project Details
Project file
311-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 3
Topic: Different Types of Data Analysis
Subtopic: Descriptive Analysis
Objectives covered
3 Data Analysis
3.1 Describe and differentiate between different types of data analysis
3.1.1 Descriptive analysis
Notes for the teacher
Students’ completed projects should show a PivotTable with the Purchased Items field that includes the field’s values. Ensure students understand that adding a field to the values section of a PivotTable creates a frequency distribution of that data.

Descriptive analysis describes data's characteristics, summarizing and describing its main features. Descriptive analysis often includes summary statistics. It also includes some basic visualization methods to help dig deeper into data features.

Purpose

Upon completing this project, you will better understand how to perform descriptive analysis.

Steps for Completion

1. List four or more descriptive statistics.

a.

2. Open the 311-Feb_visits.xlsx file from your Domain 3 Student folder.

3. Create a PivotTable that includes all data in the document in a new worksheet.

4. Add the Purchased Items field to the Rows and Values sections in the PivotTable.

5. According to the PivotTable, which two items were purchased the least?

a.

6. Save the file as 311-Feb_visits-Completed
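
A frequency distribution like the one the PivotTable produces can be sketched in Python with a counter. The purchases list is made up for illustration and does not reflect the project file.

```python
from collections import Counter

# Hypothetical Purchased Items values.
purchases = ["Water", "Smoothie", "Water", "Towel", "Protein bar", "Water", "Smoothie"]

# Count how often each item appears (the frequency distribution).
frequency = Counter(purchases)
for item, count in frequency.most_common():
    print(item, count)

# Items tied for the lowest count.
lowest = min(frequency.values())
least_purchased = sorted(item for item, n in frequency.items() if n == lowest)
print(least_purchased)
```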

Diagnostic Analysis

Project Details
Project file
312-Feb_visits.xlsx
Estimated completion time
10 minutes
Video reference
Domain 3
Topic: Different Types of Data Analysis
Subtopic: Diagnostic Analysis
Objectives covered
3 Data Analysis
3.1 Describe and differentiate between different types of data analysis
3.1.2 Diagnostic analysis
Notes for the teacher
Students’ completed projects should show the average of the satisfaction scores added to the PivotTable. If time permits, you may choose to discuss the data in the PivotTable and what it reveals about customer satisfaction with the gym.

Diagnostic analysis refers to the process of analyzing data to identify or diagnose trends, patterns, and potential outcomes of events. It involves identifying and defining relationships in the data, exploring potential causal factors, finding potential anomalies, and devising hypotheses based on these findings.

Purpose

Upon completing this project, you will better understand how to perform diagnostic analysis.

Steps for Completion

1. What is the difference between correlation and causation?

a.

2. Why is it important to include statistical analysis as part of diagnostic analysis?

a.

3. Open the 312-Feb_visits.xlsx file from your Domain 3 Student folder.

4. Open the PivotTable worksheet.

5. Add the Satisfaction Score field to the Values section on the PivotTable.

6. Change the sum of the satisfaction scores in the PivotTable to the average of the satisfaction scores.

7. Save the file as 312-Feb_visits-Completed

Hypothesis Testing

Project Details
Project file
N/A
Estimated completion time
5-10 minutes
Video reference
Domain 3
Topic: Different Types of Data Analysis
Subtopic: Hypothesis Testing
Objectives covered
3 Data Analysis
3.1 Describe and differentiate between different types of data analysis
3.1.3 Hypothesis testing
Notes for the teacher
If time permits, you may choose to guide students through an instance of hypothesis testing by creating a null and an alternative hypothesis, determining a test statistic, and setting a significance level together.

Hypothesis testing involves making inferences or drawing conclusions about a parameter based on sample data. To perform hypothesis testing, an analyst formulates two competing hypotheses, known as the null and alternative hypotheses, then collects data and uses statistical methods to evaluate the evidence against the null hypothesis.

Purpose

Upon completing this project, you will better understand how to test hypotheses.

Steps for Completion

1. Determine whether each hypothesis is a null hypothesis or an alternative hypothesis.

a. The number of times customers visit the gym does not affect satisfaction scores.

b. Customers’ satisfaction scores increase the more they visit the gym.

2. What is a significance level?

a.

3. What is a p-value?

a.

4. Match the test statistic to its description.

A. T-statistic B. Chi-square C. R-squared


a. Used to determine if there is a significant association between categorical variables.

b. Used to determine the proportion of variance in one variable explained by another variable.

c. Used to determine if the means of two groups are significantly different from each other.
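
As one concrete illustration of a test statistic, the sketch below computes a two-sample t-statistic. It uses the unequal-variance (Welch) form, which is an assumption chosen for this example, and the two groups of satisfaction scores are invented. Turning the statistic into a p-value requires the t-distribution and is not shown here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Two-sample t-statistic using each group's own sample variance."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical satisfaction scores for frequent and infrequent visitors.
frequent = [5, 4, 5, 4, 5]
infrequent = [3, 4, 3, 2, 3]

t_statistic = welch_t(frequent, infrequent)
print(round(t_statistic, 2))
```

A t-statistic far from zero suggests the difference in group means is large relative to the variation within the groups.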

Predictive and Prescriptive Analytics

Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 3
Topic: Different Types of Data Analysis
Subtopic: Predictive Analysis; Prescriptive Analysis
Objectives covered
3 Data Analysis
3.1 Describe and differentiate between different types of data analysis
3.1.4 Predictive analysis
3.1.5 Prescriptive analysis
Notes for the teacher
Ensure students understand the difference between predictive and prescriptive analysis.

Predictive analytics is commonly used in business and financial planning to reduce risk and aid decision-making and funding allocations. Prescriptive analytics is related to predictive analytics in that it aids decision-making processes but is not concerned with predicting future trends based on historical data. Prescriptive analytics often follows other types of analytics, especially descriptive analysis, to provide a deeper analysis of what an organization can do with the data.

Purpose

Upon completing this project, you will better understand how to perform predictive and prescriptive analytics.

Steps for Completion

1. List at least two predictive analysis models.

a.

2. What is the purpose of predictive analysis?

a.

3. Describe the process of preparing predictive analysis models with testing and training.

a.

4. List at least two prescriptive analytics methodologies.

a.

5. What is the purpose of prescriptive analysis?

a.

Domain 3 Lesson 2

Search Data

Project Details
Project file
321-Feb_visits.xlsx
Estimated completion time
5-10 minutes
Video reference
Domain 3
Topic: Aggregation and Metrics
Subtopic: Search
Objectives covered
3 Data Analysis
3.2 Describe and differentiate between data aggregation and interpretation metrics
3.2.1 Searching
Notes for the teacher
Students’ completed projects should show no instances of the @ symbol in the Class Type column.

Searching a dataset might seem like a basic process, but it serves as a vital tool in data analytics. Understanding how searching a dataset works in one’s analytics tool of choice is essential to effective analysis.

Purpose

Upon completing this project, you will better understand how to search data in Excel.

Steps for Completion

1. Open the 321-Feb_visits.xlsx file from your Domain 3 Student folder.

2. Select all data in the file.

3. Open the Find and Replace dialog box.

4. Find all instances of the word Aerobics in the file.

5. How many cells contain the word Aerobics?

6. Select all data in the Class Type column.

7. Search for all instances of the symbol @ in the column.

8. Delete the instance of the symbol @ once you find it.

9. Save the file as 321-Feb_visits-Completed

Filter Data

Project Details
Project file
322-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 3
Topic: Aggregation and Metrics
Subtopic: Filter
Objectives covered
3 Data Analysis
3.2 Describe and differentiate between data aggregation and interpretation metrics
3.2.2 Filtering
Notes for the teacher
Students’ completed projects should show entries where a class was attended. The entries should be sorted alphabetically from A to Z using the Class Type column.

Filtering is a method of obtaining a subset of a dataset based on one or more criteria. It can be used to look only at observations or respondents with specific demographic characteristics or who provide the same answer to a question. In data analytics, filtering data can help an analyst drill down into the most relevant data quickly and provide quicker, more impactful analysis.

Purpose

Upon completing this project, you will better understand how to filter data in Excel.

Steps for Completion

1. Open the 322-Feb_visits.xlsx file from your Domain 3 Student folder.

2. Add filters to each column in the file.

3. Filter the Class Attended column to show only entries with Yes as a value.

4. Sort data in the Class Type field from A to Z.

5. Save the file as 322-Feb_visits-Completed

Find Unique Values

Project Details
Project file
323-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 3
Topic: Aggregation and Metrics
Subtopic: Unique Values
Objectives covered
3 Data Analysis
3.2 Describe and differentiate between data aggregation and interpretation metrics
3.2.3 Unique values
Notes for the teacher
Students’ completed projects should show a PivotTable that displays the Purchased Items? field and its values.

Unique values appear only once in a dataset. Unique values are important to find and consider in data analytics because they can either provide greater insight and context into the dataset they are in or muddle and distract from analyses and findings. Each unique value should be carefully considered to understand where it fits in the context of the dataset and how it relates to the goals of an analysis.

Purpose

Upon completing this project, you will better understand how to find unique values in Excel.

Steps for Completion

1. Open the 323-Feb_visits.xlsx file from your Domain 3 Student folder.

2. Create a PivotTable from all data in the file.

3. Add fields to the Rows and Values sections of the PivotTable until you find one or more unique values.

4. What field contains unique values?

a.

5. What entries contain unique values in that field?

a.

6. Save the file as 323-Feb_visits-Completed

Aggregate Functions

Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 3
Topic: Aggregation and Metrics
Subtopic: Aggregate Functions
Objectives covered
3 Data Analysis
3.2 Describe and differentiate between data aggregation and interpretation metrics
3.2.4 Aggregate functions (Sum, Max, Min, Count, Avg/Mean, Mode, Median, Std. Dev)
Notes for the teacher
If time permits, you may choose to show students each of these functions used on a dataset in Excel.

Aggregate functions are an effective and simple way to obtain summary statistics, which provide high-level insight about the dataset and its features. In Excel, these functions include SUM, MAX, MIN, COUNT, AVERAGE, MODE, MEDIAN, and STDEV.

Purpose

Upon completing this project, you will better understand how to use aggregate functions in Excel.

Steps for Completion

1. Which Excel function finds the number of instances of a numeric value in a selected data range?

a.

2. Which Excel function finds the mean of a selected data range?

a.

3. Which Excel function finds the amount of variation in a selected range or set of values?

a.

4. Which Excel function finds the smallest value in a selected data range?

a.

5. Which Excel function finds the midpoint of a selected data range?

a.

6. Which Excel function adds numeric data in a selected data range together?

a.

7. Which Excel function finds the largest value in a selected data range?

a.

8. Which Excel function calculates what data value occurs the most often across a selected data range?

a.
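
For comparison, the same summary statistics can be computed in Python. This sketch uses a small invented list of values; the dictionary keys simply borrow the Excel function names as labels. The standard deviation here is the sample standard deviation.

```python
from statistics import mean, median, mode, stdev

# Hypothetical numeric data range.
values = [4, 8, 8, 5, 10]

summary = {
    "SUM": sum(values),
    "MAX": max(values),
    "MIN": min(values),
    "COUNT": len(values),      # number of numeric entries
    "AVERAGE": mean(values),
    "MODE": mode(values),      # most frequent value
    "MEDIAN": median(values),  # midpoint of the sorted values
    "STDEV": stdev(values),    # sample standard deviation
}

for name, result in summary.items():
    print(name, result)
```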

Domain 3 Lesson 3

Find Relationships in Data

Project Details
Project file
331-Datacize Membership 2019-[Link]
Estimated completion time
15 minutes
Video reference
Domain 3
Topic: Exploratory Data Analysis Methods
Subtopic: Identify Data Relationships; Correlation Coefficient
Objectives covered
3 Data Analysis
3.3 Describe and differentiate between exploratory data analysis methods
3.3.1 Identifying data relationships
Notes for the teacher
If time permits, you may choose to have students use the CORREL function to calculate correlation coefficients between other variables in the file.

After cleaning the data and analyzing summary statistics, exploratory data analysis is often the next stage in data analytics. It includes finding initial relationships in data and determining where, in the future, more in-depth data analysis might take place to develop actionable insights. One method of finding initial relationships in data involves visualization. For more subtle relationships, finding correlations may be necessary.

Purpose

Upon completing this project, you will better understand how to find relationships in data.

Steps for Completion

1. Open the 331-Datacize Membership [Link] file from your Domain 3 Student folder.

2. Create a bar chart containing all the data in the file.

3. Remove the Total annual revenue data from the chart.

4. What does the chart reveal about the gross membership revenue over time?

a.

5. In an empty cell near the existing data, add the text Price/membership correlation

6. In the cell below the added text, use the CORREL function to calculate a correlation coefficient to analyze the
relationship between the number of current memberships and the cost of those memberships.

7. Rounded to the hundredth place, what is the correlation coefficient of the number of current memberships and
the cost of those memberships?

8. What does the correlation coefficient reveal about the relationship between the number of current memberships
and the cost of those memberships?

a.

9. Save the file as 331-Datacize Membership 2019-2023-Completed
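
The correlation coefficient calculated by CORREL is the Pearson correlation coefficient, which can be sketched by hand in Python. The price and membership figures below are invented and are not the values in the project file, so the result will differ from the workbook answer.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations from each mean.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical yearly membership price and number of current memberships.
price = [30, 32, 35, 38, 40]
memberships = [520, 510, 480, 470, 450]

r = pearson_r(price, memberships)
print(round(r, 2))
```

A value near -1 indicates a strong negative relationship, a value near 1 a strong positive one, and a value near 0 little or no linear relationship.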

Data Drilling and Data Mining

Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 3
Topic: Exploratory Data Analysis Methods
Subtopic: Data Drilling Concepts; Data Mining Concepts
Objectives covered
3 Data Analysis
3.3 Describe and differentiate between exploratory data analysis methods
3.3.2 Describe data drilling concepts (e.g., granularity)
3.3.3 Describe data mining concepts (anomalies, correlation analysis, patterns, outliers, etc.)
Notes for the teacher
Ensure students understand the difference between data drilling and data mining.

Data drilling, sometimes called data drilling down or data drill-down, is an analytical process of exploring data with increasing levels of granularity to uncover insights or answer business questions. Data drilling is less of a specific technique and more of a concept. Data mining discovers trends, patterns, correlations, and insights from data, especially those found in large datasets. It can include techniques such as classification, regression, clustering, association rule mining, anomaly detection, time series analysis, and text mining.

Purpose

Upon completing this project, you will better understand how to perform data drilling and data mining.

Steps for Completion

1. What data type do data analysts use when starting the data drilling process?

a.

2. What is regression?

a.

3. What is anomaly detection?

a.

4. What is clustering?

a.

5. What is text mining?

a.

6. What is time series analysis?

a.

7. What is classification?

a.

8. What is association rule mining?

a.

Domain 3 Lesson 4

Calculate Trends and Expected Values

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 3
Topic: Data Analysis Results
Subtopic: Calculate Trends; Determine Expected Values
Objectives covered
3 Data Analysis
3.4 Evaluate and explain the results of data analyses
3.4.1 Calculate trends
3.4.2 Determine expected values
Notes for the teacher
If time permits, you may choose to calculate a moving average with a sample dataset with students.

Visual trend identification is an excellent form of exploratory data analysis. Still, analysts will need to incorporate statistical methods to know how strong a trend truly is or to confirm that there is a trend at all. Many of these methods are foundational to data analytics, so it is crucial to know how to implement and understand them. There are several methods to calculate and analyze trends in data, including moving averages and regression. Another important data analytics component is the ability to calculate expected values, as expected values predict future outcomes.

Purpose

Upon completing this project, you will better understand how to calculate trends and expected values.

Steps for Completion

1. What is the purpose of a moving average?

a.

2. How is a moving average calculated?

a.

3. What is the easiest way to calculate expected values when there is a set of data values?

a.

4. What two statistical methods can be used to estimate expected values?

a.
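As a rough illustration of the two ideas in this project, the sketch below computes a simple moving average over a sliding window and an expected value as a probability-weighted average. The function names, the revenue figures, and the probabilities are all hypothetical and are not taken from the course files.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` over a sliding window."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def expected_value(outcomes, probabilities):
    """Return the probability-weighted average of the outcomes."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have equal length")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(o * p for o, p in zip(outcomes, probabilities))


# Hypothetical monthly revenue figures, for illustration only.
monthly_revenue = [120, 130, 125, 140, 150, 145]
three_month_trend = moving_average(monthly_revenue, window=3)

# Hypothetical outcomes (new sign-ups) and their assumed probabilities.
signups = [10, 20, 30]
chances = [0.2, 0.5, 0.3]
expected_signups = expected_value(signups, chances)

print(three_month_trend)
print(expected_signups)
```

A wider window smooths the series more but reacts more slowly to recent changes, so the window size is a judgment call for the analyst.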

Interpret Predictive Models

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 3
Topic: Data Analysis Results
Subtopic: Interpret Predictive Model Results
Objectives covered
3 Data Analysis
3.4 Evaluate and explain the results of data analyses
3.4.3 Interpret results of predictive models
Notes for the teacher
If time permits, you may choose to show students sample datasets with different MSE and R-squared values and explain what the values mean within the context of the dataset.

Interpreting the results of predictive models is crucial to a data analyst, as the interpretation of the data communicates the insights the model provides. Metrics like coefficients, R-squared, and p-values mean very little as numeric data unless analysts can explain their values to the business and how they provide insights and answer business questions. When using a predictive model, analysts should take time to understand how that model works, what the metrics for its performance mean for the data, and the business questions the predictive model is trying to answer.

Purpose

Upon completing this project, you will better understand how to interpret the results of predictive models.

Steps for Completion

1. What does Mean Square Error (MSE) measure?

a.

2. What does R-squared measure?

a.

3. Label the following statements as true or false.

a. The meaning of an R-squared value is relative to the values of the variables in the
dataset.

b. An R-squared value of 0.28 indicates a weak, positive relationship between the variances
of a dependent variable and an independent variable.
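To make the two metrics in this project more concrete, here is a minimal sketch that computes MSE and R-squared from a list of actual values and a list of model predictions. The function names and the sample numbers are hypothetical; this uses the standard textbook formulas and is not tied to any particular tool from the course.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Share of the variance in `actual` that the predictions account for."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total


# Hypothetical observed values and model predictions, for illustration only.
actual = [200, 220, 250, 270, 300]
predicted = [205, 215, 245, 280, 295]

print(mean_squared_error(actual, predicted))
print(r_squared(actual, predicted))
```

MSE is expressed in the squared units of the dependent variable, so it is only meaningful relative to the scale of the data, while R-squared is unitless.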

Interpret P-Values and T-Tests

Project Details
Project file
N/A
Estimated completion time
5-10 minutes
Video reference
Domain 3
Topic: Data Analysis Results
Subtopic: Interpret P-Values and T-Tests
Objectives covered
3 Data Analysis
3.4 Evaluate and explain the results of data analyses
3.4.4 Interpret results of p-values and t-tests
Notes for the teacher
Ensure students understand that the results of both p-values and t-tests are compared to a significance level to determine their significance.

P-values and t-tests are used in hypothesis testing to determine statistically whether to reject the null hypothesis, which represents the status quo. P-values are often associated with linear regression and are used to determine the significance of an observed result, and t-tests are used to determine if there is a significant difference between the means of two groups.

Purpose

Upon completing this project, you will better understand how to interpret p-values and t-tests.

Steps for Completion

1. Determine what should happen based on the statistical analysis results, assuming the significance level is 0.05.

A. Reject the null hypothesis B. Fail to reject the null hypothesis

a. The p-value is 0.01.

b. The p-value is 0.12.

c. The p-value is 0.76.

d. The p-value is 0.04.

2. What are independent sample t-tests?

a.

3. What are paired sample t-tests?

a.
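The sketch below illustrates the decision rule from step 1 and the difference between the two t-test designs, under stated assumptions. It computes only the t statistics (the unequal-variance, or Welch, form for independent samples), not their p-values, which would normally come from a statistics package. All function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def decide(p_value, alpha=0.05):
    """Compare a p-value to the significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


def independent_t_statistic(group_a, group_b):
    """t statistic for two unrelated groups (Welch's unequal-variance form)."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def paired_t_statistic(before, after):
    """t statistic for two measurements taken on the same subjects."""
    differences = [a - b for b, a in zip(before, after)]
    return mean(differences) / (stdev(differences) / sqrt(len(differences)))


# Hypothetical p-values, for illustration only.
print(decide(0.03))
print(decide(0.20))

# Hypothetical scores: two separate groups, then one group measured twice.
print(independent_t_statistic([72, 75, 78, 80], [68, 70, 74, 71]))
print(paired_t_statistic(before=[60, 65, 70, 72], after=[64, 66, 75, 78]))
```

The paired version works on the per-subject differences, which is why both lists must have the same length and the same ordering of subjects.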

Interpret Regression Analyses

Project Details
Project file
345-Datacize-Membership-2019-[Link]
Estimated completion time
10 minutes
Video reference
Domain 3
Topic: Data Analysis Results
Subtopic: Interpret Regression Analyses; Tables and Values
Objectives covered
3 Data Analysis
3.4 Evaluate and explain the results of data analyses
3.4.5 Interpret results of regression analyses
Notes for the teacher
Students' completed projects should show a regression analysis, with current memberships as the dependent variable and membership costs as the independent variable. If time permits, you may choose to go into further analysis of the different statistics included in the regression analysis.

Linear regression is a statistical predictive method that models the relationship between one or more independent variables and a dependent variable. This equation can represent a simple linear regression model: y = B0 + B1x1 + e, where B0 is the intercept, B1 is the coefficient of the independent variable x1, and e is the error term. There are also non-linear regression models that can be used for variables with a non-linear relationship.

Purpose

Upon completing this project, you will better understand how to interpret the results of linear regression analyses.

Steps for Completion

1. Open the [Link] file from your Domain 3 Student folder.

2. Use the Data Analysis tool to create a regression analysis from the current membership data and membership cost data.

a. Use current memberships as the Y variable.

b. Use the cost of memberships as the X variable.

3. Interpret the meaning of the R-square value.

a.

4. Interpret the meaning of the standard error value.

a.

5. What are degrees of freedom?

a.

6. What is an intercept coefficient?

a.

7. Save the file as 345-Datacize-Membership-2019-2023-Completed
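For readers who want to see where the numbers in a regression summary come from, the following sketch fits a one-variable least-squares line and reports the intercept, slope, R-squared, residual degrees of freedom, and standard error of the regression. It is an illustration under assumptions: the function name and the cost and membership figures are hypothetical and are not the values in the project file.

```python
from math import sqrt
from statistics import mean


def simple_regression(x, y):
    """Fit y = b0 + b1*x by ordinary least squares and summarize the fit."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_residual = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    residual_df = n - 2  # observations minus the two estimated coefficients
    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": 1 - ss_residual / ss_total,
        "residual_df": residual_df,
        "standard_error": sqrt(ss_residual / residual_df),
    }


# Hypothetical membership costs (X) and current memberships (Y).
membership_cost = [20, 25, 30, 35, 40]
current_memberships = [500, 460, 430, 380, 350]

summary = simple_regression(membership_cost, current_memberships)
for name, value in summary.items():
    print(name, value)
```

In this sketch the intercept is the predicted Y value when X is zero, and the slope is the predicted change in Y for a one-unit increase in X.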

Domain 3 Lesson 5

AI, Machine Learning, and Algorithms

Project Details
Project file
N/A
Estimated completion time
10-15 minutes
Video reference
Domain 3
Topic: AI in Data Analysis
Subtopic: Define Artificial Intelligence; Define Machine Learning; Define Algorithm
Objectives covered
3 Data Analysis
3.5 Define and describe the role of artificial intelligence in data analysis
3.5.1 Define artificial intelligence
3.5.2 Define machine learning
3.5.3 Define algorithm
Notes for the teacher
Ensure students understand that algorithms are fundamental to machine learning and that machine learning facilitates AI.

Artificial intelligence (AI) is typically characterized by an ability to learn and improve at its tasks much as a human would, and to become capable of some level of predictive problem-solving and decision-making without human input. While machine learning is foundational to the study and implementation of AI, it is also at the heart of other data analytics models and algorithms. Algorithms, in turn, are at the heart of machine learning and advanced data analytics, so having a deeper understanding of them is crucial.

Purpose

Upon completing this project, you will better understand how artificial intelligence, machine learning, and algorithms function.

Steps for Completion

1. What is artificial intelligence (AI)?

a.

2. What is machine learning?

a.

3. What are algorithms?

a.

4. Label the following statements as true or false.

a. Computer vision is often used in the field of robotics.

b. Deep learning and neural networks are types of expert systems.

c. Natural language processing (NLP) is often used for chatbots and sentiment analysis.

d. Generally, more data is better when machine learning is used.

e. Unsupervised learning occurs when models learn from labeled data where the input-
output pairs are provided and labeled during training.

f. Training is the process of evaluating models on unseen data to assess their performance
and ability.

g. Algorithms are finite and end after specified steps have occurred.

h. If given the same input in multiple instances, algorithms will produce different outputs.

AI and Machine Learning in Data Analytics

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 3
Topic: AI in Data Analysis
Subtopic: Using AI in Data Analysis; Machine Learning Algorithms
Objectives covered
3 Data Analysis
3.5 Define and describe the role of artificial intelligence in data analysis
3.5.4 Describe how AI is used in data analysis
3.5.5 Describe how machine learning algorithms are used in data analysis (Note: Specific algorithms are out of scope)
Notes for the teacher
If time permits, you may choose to show students an example of both AI and machine learning being used for data analytics.

AI is increasingly used in data analytics because it both streamlines data processing, cleaning, and exploratory data analysis and enables more advanced data analysis methods. AI in data analysis is helpful in processing, exploring, and analyzing any dataset, especially those difficult for humans to parse and understand. While AI is growing in effectiveness and popularity, machine learning still underlies more commonly used data analysis methods. Machine learning enables computers to learn from data and improve from this learning, making it extremely valuable to data analysts looking to develop strong and accurate analytical models.

Purpose

Upon completing this project, you will better understand how AI and machine learning are used in data analytics.

Steps for Completion

1. List two or more advantages of using AI for data analytics.

a.

2. List two or more advantages of using machine learning algorithms for data analytics.

a.

Domain 4 Lesson 1

Display Information

Project Details
Project file
411-Datacize Membership 2019-[Link]
Estimated completion time
5 minutes
Video reference
Domain 4
Topic: Report Data
Subtopic: Display Information
Objectives covered
4 Data Visualization and Communication
4.1 Report data
4.1.1 Effectively display information in tables and charts
Notes for the teacher
Ensure students understand both how to make their charts more effective and why they need to ensure readability and clarity.

Once data has been collected, the next step is to display the information in a table or a chart. Choosing the right method of displaying the information is important for making sure it is understood. There are a few best practices to follow to ensure that the information in a table or chart is effectively displayed.

Purpose

Upon completing this project, you will better understand how to display information in tables and charts effectively.

Steps for Completion

1. Open the 411-Datacize Membership [Link] file in your Domain 4 Student folder.

2. On the data worksheet, apply bold formatting and add borders and filters to the table headers.

3. On the Averages worksheet, change the chart title to Six-month average of revenue and add data labels to the chart.

4. According to the table, what was the total annual revenue from January
2019 to January 2020?

a.

5. Save the file as 411-Datacize Membership 2019-2023-Completed

Disaggregate Data

Project Details
Project file
412-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 4
Topic: Report Data
Subtopic: Disaggregate Data
Objectives covered
4 Data Visualization and Communication
4.1 Report data
4.1.2 Explain when and why to disaggregate data
Notes for the teacher
Ensure students understand the meaning of aggregation and disaggregation, and when to use each skill. Demonstrate how to disaggregate data if time permits.

Disaggregating data is the process of breaking aggregated data down into individual categories for more granular data analysis. It can be useful for performing more detailed analysis, enhancing transparency, and fostering equity. Keep in mind when aggregating data that you may need to disaggregate it later, so save the underlying data in a form that still allows this.

Purpose

Upon completing this project, you will better understand when and why to disaggregate data.

Steps for Completion

1. Open the 412-Feb_visits.xlsx file in your Domain 4 Student folder.

2. Which column contains disaggregated data, and what is it showing?

a.

3. List three reasons one might need to disaggregate data.
a.
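The contrast between aggregated and disaggregated data can be sketched in a few lines. The records, field layout, and function name below are hypothetical and do not reflect the contents of the project file; the point is only that a single total hides the category-level breakdown, which can be recovered only if the detailed records were kept.

```python
from collections import defaultdict

# Hypothetical visit records: (visit type, week, number of visits).
visits = [
    ("Member", "Week 1", 40),
    ("Guest", "Week 1", 12),
    ("Member", "Week 2", 45),
    ("Guest", "Week 2", 9),
]


def disaggregate(records, field_index):
    """Break the overall total down by the category found at field_index."""
    totals = defaultdict(int)
    for record in records:
        totals[record[field_index]] += record[-1]
    return dict(totals)


# Aggregated view: one number for everything.
total_visits = sum(count for _type, _week, count in visits)

# Disaggregated views: the same total split by visit type or by week.
by_type = disaggregate(visits, field_index=0)
by_week = disaggregate(visits, field_index=1)

print(total_visits)
print(by_type)
print(by_week)
```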

Domain 4 Lesson 2

Data Visualization Practices

Project Details
Project file
421-Datacize Membership 2019-[Link]
Estimated completion time
5 minutes
Video reference
Domain 4
Topic: Create Visualizations from Data
Subtopic: Data Visualization Practices
Objectives covered
4 Data Visualization and Communication
4.2 Create visualizations from data
4.2.1 Identify data visualization practices that minimize the potential for misinterpretation
Notes for the teacher
Ensure students understand why misinterpretation can be harmful and why it is worth taking the time to follow these practices.

Charts and graphs can be taken out of context and misunderstood. Following a few best practices when creating data visualizations helps mitigate this problem and minimize potential misinterpretation.

Purpose

Upon completing this project, you will better understand the data visualization practices that can minimize potential misinterpretations.

Steps for Completion

1. Open the 421-Datacize Membership [Link] file in your Domain 4 Student folder.

2. On the Averages worksheet, adjust the Six-month average for revenue chart to have the same maximum point as the Total annual membership revenue chart.

3. Save the file as 421-Datacize Membership 2019-2023-Completed

4. How can one prevent misinterpretation if a file might be viewed without the analyst providing context?
a.

Visualization Types

Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 4
Topic: Create Visualizations from Data
Subtopic: Identify Visualization Types; Identify Additional Visualization Types
Objectives covered
4 Data Visualization and Communication
4.2 Create visualizations from data
4.2.2 Identify visualization types that represent the underlying data structure and analysis questions (including comparison, time/trend, part-to-whole, relationship, distribution, correlation graphs)
4.2.3 Identify additional visualization types that represent the underlying data structure and analysis questions (including box and whisker diagram, scatter chart, scatter plot, bar chart, Sankey diagram, histogram, pie chart, column chart, etc.)
Notes for the teacher
Ensure students understand the different categories of data visualizations, when to use them, and how they benefit data analysis.

When creating visualizations, it is important to understand the types of data visualizations, how they represent the research question they seek to answer, and the data's underlying structure. Choosing the appropriate visualization is crucial to presenting data in a way that can be easily understood. Sometimes, visualizations of the same type can be used in conjunction with one another to provide a further look at the data.

Purpose

Upon completing this project, you will better understand data visualization types.

Steps for Completion

1. Label each visualization category with its corresponding data visualizations.

a. Comparison:

b. Time/Trend:

c. Part-to-whole:

d. Relationship:

e. Distribution:

f. Correlation:

2. What is a Sankey diagram?

a.

3. What type of chart is best for comparing data in datasets with several values in a similar category?

a.

4. What chart type is best for looking at either percentages or parts of a sum?

a.

Domain 4 Lesson 3

Translate Visual Representations

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 4
Topic: Derive Conclusions
Subtopic: Translate into Words
Objectives covered
4 Data Visualization and Communication
4.3 Derive conclusions from a data visualization
4.3.1 Translate a visual representation of data into words
Notes for the teacher
If time permits, you may choose to present students with another data visualization and ask them to share insights from the visualization about the data.

An essential component of data analysis is discussing the importance of visualizations and translating the insights they display into words. Without this skill, visualizations are dangerously open to interpretation and confusion, which can make it hard to understand and act on their insights properly. A translation of this nature should be concise and complete, addressing all visual components of the data while making points clearly.

Purpose

Upon completing this project, you will better understand how to interpret and translate visual representations of data.

Steps for Completion

1. Use the pie chart to interpret insights about the data represented in the visualization.

a. Which two days of the week have the most class registrations?

i.

b. Which two days of the week have the fewest class registrations?

i.

c. How do the weekly registrations for Wednesday compare to the rest of the week?

i.

Visualizations vs. Statistics

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 4
Topic: Derive Conclusions
Subtopic: Claims vs. Representation
Objectives covered
4 Data Visualization and Communication
4.3 Derive conclusions from a data visualization
4.3.2 Identify differences between claims based on an analysis and its graphical representation
Notes for the teacher
If time permits, you may choose to present students with different analysis scenarios and ask them to decide if they would use visualization or statistics to analyze the data and explain their reasoning.

Visualizations and statistics are both powerful tools that work together in data analytics. Still, they have different strengths and drawbacks when used to substantiate claims analysts make from the data. Knowing these strengths and drawbacks is essential to understanding which approach to employ and where to employ it when substantiating analytical claims.

Purpose

Upon completing this project, you will better understand how to choose between visualizations and statistics for data analysis.

Steps for Completion

1. List at least one advantage of using statistics to analyze data rather than visualizations.

a.

2. List at least one disadvantage of using statistics to analyze data rather than visualizations.

a.

3. List at least one advantage of using visualizations to analyze data rather than statistics.

a.

4. List at least one disadvantage of using visualizations to analyze data rather than statistics.

a.

Domain 5 Lesson 1

Privacy Laws and Standards

Project Details
Project file
N/A
Estimated completion time
5-10 minutes
Video reference
Domain 5
Topic: Privacy Laws and Best Practices
Subtopic: GDPR; FERPA; HIPAA; IRB; PCI
Objectives covered
5 Responsible Analytics Practices
5.1 Describe data privacy laws and best practices
5.1.1 GDPR
5.1.2 FERPA
5.1.3 HIPAA
5.1.4 IRB
5.1.5 PCI
Notes for the teacher
Ensure students understand the differences between the privacy laws outlined in this course.

Responsible analytics practices involve handling data responsibly and knowing and following the laws and regulations concerning data storage and handling. Laws and regulations one must know include the General Data Protection Regulation (GDPR), the Family Educational Rights and Privacy Act (FERPA), the Health Insurance Portability and Accountability Act (HIPAA), and the Payment Card Industry Data Security Standard (PCI DSS), commonly referred to as PCI. Analysts may also need to work with Institutional Review Boards (IRBs).

Purpose

Upon completing this project, you will better understand privacy laws and standards.

Steps for Completion

1. Match the privacy law terms to their functions.

A. GDPR B. FERPA C. HIPAA D. IRB E. PCI

a. A set of security standards established to protect payment card data.

b. A committee responsible for reviewing and overseeing human subjects research and ensuring those participants are protected.

c. A United States federal law protecting the privacy of student education data, applying to all
educational institutions that receive funding from the US federal government.

d. A data protection law that governs the processing and handling of EU residents’ personal data and
the transfer of any personal data to countries or entities outside the EU.

e. A law that governs the protection of sensitive protected health information (PHI), regulating its use,
storage, disclosure, and security.

2. Label the following statements as true or false.

a. GDPR is designed to protect the privacy rights of individuals and has strict requirements for
organizations working with data on its collection, processing, storage, and security.

b. HIPAA gives the parents of students certain rights regarding the education records of their
children.

c. FERPA applies to anyone who works with healthcare data, including hospitals, health insurance
companies, healthcare clearinghouses, and business associates of these organizations who handle PHI.

d. Those pursuing a career in data analytics will likely encounter IRBs at some point.

e. PCI DSS standards apply to any organization handling cardholder information from major
credit card brands, including Mastercard, Visa, American Express, and Discover.

Domain 5 Lesson 2

Managing PII

Project Details
Project file
N/A
Estimated completion time
5-10 minutes
Video reference
Domain 5
Topic: Responsible Data Handling
Subtopic: Handle PII; Anonymize Data
Objectives covered
5 Responsible Analytics Practices
5.2 Describe best practices for responsible data handling
5.2.1 Methods of handling PII, securing data, and protecting anonymity within small datasets
5.2.2 Importance of anonymizing data
Notes for the teacher
Ensure students understand the importance of keeping data secure.

Data analysts must know how best to handle personally identifiable information (PII). Improper handling of PII can result in breaches and expose the people the data describes to negative consequences. A few best practices to follow when working with PII include minimization, anonymization, encryption, secure storage, access controls, and regular audits.

Purpose

Upon completing this project, you will better understand how to manage PII and keep data secure.

Steps for Completion

1. List at least three types of data that are considered PII.

a.

2. Describe minimization.

a.

3. Describe anonymization.

a.

4. Describe encryption.

a.

5. Label the following statements as true or false.

a. Anonymizing data is another level of protection against breaches, identity theft, and fraud.

b. Anonymizing data does not protect datasets when they are shared with internal and external
collaborators and third parties.

c. Regular audits and training on the security of PII should be conducted to ensure continued
compliance with security guidelines.

d. Access controls should be strict, only letting those users who absolutely need to use the data
access it.
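As one possible illustration of two of the practices named in this project, the sketch below keeps only the fields an analysis needs (minimization) and replaces a direct identifier with a salted hash. Note the assumption: salted hashing is a form of pseudonymization, which is weaker than full anonymization because the remaining fields may still identify someone, especially in small datasets. The record, field names, and salt value are hypothetical.

```python
import hashlib


def minimize(record, needed_fields):
    """Keep only the fields the analysis actually requires."""
    return {key: value for key, value in record.items() if key in needed_fields}


def pseudonymize(value, salt):
    """Replace an identifier with a salted SHA-256 digest."""
    return hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()


# Hypothetical member record containing PII, for illustration only.
record = {
    "member_id": "M-1001",
    "name": "Alex Example",
    "email": "alex@example.com",
    "age": 34,
    "visits": 12,
}

# Assumption: in practice the salt would be a secret stored apart from the data.
SALT = "replace-with-a-secret-value"

safe_record = minimize(record, {"member_id", "age", "visits"})
safe_record["member_id"] = pseudonymize(safe_record["member_id"], SALT)

print(safe_record)
```

Encryption, secure storage, access controls, and audits are separate safeguards that this sketch does not cover.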

Data Analysis

Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 5
Topic: Responsible Data Handling
Subtopic: Interpretability and Accuracy; Shortcomings
Objectives covered
5 Responsible Analytics Practices
5.2 Describe best practices for responsible data handling
5.2.3 Trade-offs when balancing interpretability and accuracy
5.2.4 Shortcomings of making population-level generalizations with limited sample data
Notes for the teacher
Remind students that there is no one correct answer for the balance between interpretability and accuracy. Each analysis problem requires a unique approach.

Data analysts must know how to prepare their analyses with their audience, often decision-makers, in mind. Analyses should be both clearly interpretable and accurate, but depending on how they are communicated, they tend to favor one quality over the other. There are pros and cons to both simple and complex visual models displaying data. There is also a weakness in making population-level generalizations with limited sample data.

Purpose

Upon completing this project, you will better understand the differences between simple and complex visual models and the downsides of limited sample data.

Steps for Completion

1. Match the analysis model type to its pros and cons.

Simple Complex

a. Easier for the analyst and audience to interpret.

b. Often more accurate and can highlight difficult-to-locate insights.

c. May sacrifice accuracy and miss subtler data findings.

d. May render findings harder to understand.

e. Focuses only on the most critical features or variables.

f. Includes all variables within a project.

2. Sampling ______, or sampling too much or too little from different parts of a population, can lead to a study
or survey not accurately representing the population about which one is trying to gain insights.

3. Limited sample data increases the risk of encountering Type I errors (false ______) and Type II errors
(false ______).

Domain 5 Lesson 3

Biases

Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 5
Topic: Bias
Subtopic: Confirmation Bias; Human Cognitive Bias; Motivational Bias
Objectives covered
5 Responsible Analytics Practices
5.3 Given a scenario, describe types of bias that affect collection and interpretation of data
5.3.1 Confirmation bias
5.3.2 Human cognitive bias
5.3.3 Motivational bias
Notes for the teacher
Remind students that analysts should work to avoid biases in their research and analyses. Knowing about and recognizing bias types is the first step to being able to avoid or minimize any influence they may have in one's analysis process.

Data analysts should be aware of two main bias categories when researching: cognitive biases and motivational biases. Cognitive biases often find their basis in subconscious mental thought processes and shortcuts. In contrast, motivational biases tend to be driven by conscious and unconscious motivations that can affect the way people interpret information, classify findings, and make decisions. Because objectivity is essential to best practices in data analytics, it is crucial for analysts to understand biases and how best to avoid them. Biases can be overcome by remaining aware of them during research and analysis, staying as objective as possible during these processes, and inviting outside perspectives on the data.

Purpose

Upon completing this project, you will better understand different types of biases when presenting analysis findings.

Steps for Completion

1. Match the cognitive bias type to its definition.

A. Confirmation bias B. Availability heuristic C. Anchoring bias D. Overconfidence bias
E. Loss aversion bias F. Hindsight bias G. Sunk cost fallacy

a. The tendency to see past events as more predictable than they were due to being able to see their factors in hindsight,
leading to a person overestimating their ability to predict an outcome.

b. The tendency to prefer avoiding losses over acquiring equivalent gains, leading to risk-averse
behavior.

c. The tendency for a person to overestimate the likelihood of events based on how well or recently
they remember them.

d. Occurs when a person overestimates their own abilities or knowledge, leading to unwarranted
confidence in their research, analysis, or predictions.

e. The tendency for people to seek out, interpret, or remember information in a manner that confirms
their existing beliefs or hypotheses and for them to ignore evidence to the contrary.

f. The tendency to rely too heavily on the first piece of information encountered in research when
making decisions, with subsequent information viewed as less important.

g. The tendency to continue to invest resources in a project or course of action that is no longer
deemed beneficial due to the resources already invested.

2. Match the motivational bias type to its definition.

A. Self-serving bias B. Desirability bias C. Wishful thinking D. Motivated reasoning

a. This occurs when a person’s existing beliefs, desires, or preferences guide their reasoning and
decision-making processes, making them more likely to arrive at conclusions that align with these rather than
objective evidence.

b. This occurs when a person believes in outcomes or evidence they prefer to be true, despite what
evidence may say to the contrary.

c. The tendency for a person to attribute positive events and outcomes to their own skill and efforts
while attributing negative events and outcomes to external factors beyond their control.

d. The tendency to interpret information or findings in a way that aligns with or serves one’s desires or
goals.

Sampling Methods

Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 5
Topic: Bias
Subtopic: Probability Sampling; Non-Probability Sampling
Objectives covered
5 Responsible Analytics Practices
5.3 Given a scenario, describe types of bias that affect collection and interpretation of data
5.3.4 Sampling
Notes for the teacher
If time permits, you may choose to discuss with students some of the advantages and disadvantages of different probability and non-probability sampling methods.

Sampling is the process of identifying and using a subset of a population to represent it. Employing proper sampling methods is crucial to effective data analysis. Otherwise, analysts risk obtaining samples that are not representative of the population, thus causing bias and skewing the data analysis. While the only perfect way to obtain findings and insights from a population is to survey that whole population directly, this is rarely practical, so data analysts must understand how to use different sampling methods. There are two main types of sampling: probability and non-probability sampling.

Purpose

Upon completing this project, you will better understand different sampling methods.

Steps for Completion

1. How are participants chosen from a population with probability sampling?

a.

2. How are participants chosen from a population with non-probability sampling?

a.

3. Match the probability sampling method to its description.

A. Simple random sampling B. Systematic sampling C. Stratified sampling D. Cluster sampling

a. The first individual or item for the sample is chosen randomly, and after this, every item or
individual is selected at a set interval.

b. Groups are selected at random rather than individuals or items.

c. Random selection is conducted within predefined groups.

d. Every individual or item in the population has an equal probability of being selected for the
sample.

4. Match the non-probability sampling method to its description.

A. Convenience sampling B. Quota sampling C. Purposive sampling D. Snowball sampling

a. Individuals or items are selected from predefined groups.

b. Individuals or items are selected based on their accessibility or availability.

c. Individuals are chosen by asking existing participants to ask people they know to participate.

d. Individuals or items are chosen based on the analyst’s understanding of the research question
and their existing knowledge.
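
For readers who want to see how the probability sampling methods named in step 3 differ mechanically, the following is a minimal Python sketch using only the standard library. The function names, the fixed sample sizes, and the dictionary layout used for groups are illustrative assumptions and do not come from the workbook or its video lessons.

```python
import random


def simple_random_sample(population, k, rng):
    """Every member has the same chance of being picked."""
    return rng.sample(population, k)


def systematic_sample(population, k, rng):
    """Random starting point, then every step-th member after it."""
    step = len(population) // k          # fixed interval between picks
    start = rng.randrange(step)          # random first pick within the first interval
    return [population[start + i * step] for i in range(k)]


def stratified_sample(strata, k_per_stratum, rng):
    """Random selection carried out inside each predefined group."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Whole groups are picked at random; all of their members are kept."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


# Hypothetical population of 100 numbered survey participants.
rng = random.Random(0)  # seeded only so repeated runs give the same picks
population = list(range(100))
groups = {"north": population[:50], "south": population[50:]}

print(simple_random_sample(population, 10, rng))
print(systematic_sample(population, 10, rng))
print(stratified_sample(groups, 5, rng))
print(cluster_sample(groups, 1, rng))
```

Non-probability methods such as convenience or purposive sampling depend on the analyst's judgment or on who is available, so they are not sketched here.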

Appendix

Glossary
Domain 1

Term Definition
Boolean A type of data that can have only one of two possible values, such as true or false.
Continuous data Data that can take on any value in a specified range.
Data Information, facts, or figures that can be collected, stored, and analyzed.
Dataset A group of related data, often organized in a specific structure.
Discrete data Data that can only take on specific, separate values, such as whole-number counts.
Feature A measurable characteristic in a dataset, also known as a variable.
Interval data Data measured on a scale with meaningful intervals.
List An organized, ordered collection of items or elements.
Metadata Data containing details and context about other data.
Qualitative Data or information expressed non-numerically.
Quantitative Numerically expressed data or information.
Ratio data Interval data that has a true zero point.
Schema A structural representation of the organization, arrangement, and relationships of a dataset.
Table A method of storing data in a system containing columns and rows.

Domain 2

Term Definition
Aggregation Combining smaller data categories into larger grouped categories.
Correlation A measure of the connection between two variables in a dataset.
Data Type A characteristic of data, such as numeric, string, or date.
Database An organized collection of structured data or information stored in a manageable, retrievable,
manipulatable format.
ETL Extract, transform, load; the process of obtaining data from the source, altering or reorganizing it
for analysis, and loading it into an analysis tool.
Imputation Replacing missing data values with estimated values.
Validation Checking the accuracy and completeness of data.
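
As a small illustration of the imputation and validation entries above, the sketch below fills missing survey scores with the mean of the observed scores and then checks that every value is present and within an expected range. The sample scores, the 1 to 5 range, and the function names are assumptions made for this example only; mean imputation is just one of several possible estimation strategies.

```python
from statistics import mean


def impute_with_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def is_valid(values, low, high):
    """Completeness and accuracy check: no missing values, all within [low, high]."""
    return all(v is not None and low <= v <= high for v in values)


# Hypothetical satisfaction scores on a 1-5 scale, with two missing responses.
scores = [4, None, 5, 3, None]
imputed = impute_with_mean(scores)

print(is_valid(scores, 1, 5))   # the raw data fails validation because of the gaps
print(imputed)
print(is_valid(imputed, 1, 5))  # the imputed data passes
```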

Domain 3

Term Definition
AI Artificial intelligence, a broad field of computing that includes machine learning; AI systems can learn, grow, and improve their capabilities with limited human interaction.
Algorithm A set of rules used to generate calculations or perform problem-solving operations.
Alternative hypothesis A hypothesis that there is a significant relationship between variables in a dataset or statistical analysis.
Anomaly A data point that deviates from what is expected within the dataset or findings.
Data drilling Exploring and analyzing data in increasing levels of detail and granularity.
Data mining Discovering patterns, trends, and relationships in a dataset.
Hypothesis A proposed explanation or educated guess that can be tested.
Linear Regression A form of regression in which an outcome is a continuous variable, often a number.
Machine learning The development of algorithms and statistical models that can learn and improve without being
explicitly programmed.
Max The maximum value in a set of numeric data.
Mean The average of a set of numeric data.
Median The middle value of a set of numeric data when the values are sorted in order.
Min The minimum value in a set of numeric data.
Mode The value appearing most often in a set of numeric data.
Model A mathematical representation of the relationship between variables in a dataset.
Natural Language Processing A form of AI that reads and interprets human language in text and speech and can include speech recognition.
Null hypothesis A hypothesis that there is no significant relationship between variables in a dataset or statistical
analysis.
Outlier A data point significantly outside the range of most of the rest of a dataset.
Parameter An input value used to help train an algorithm.
Pattern A reoccurring relationship or behavioral tendency in a dataset.
P-value A statistic to help determine the significance of an observed effect in a hypothesis test.
Range The spread of a set of numeric data, calculated as the maximum value minus the minimum value.
Regression A model type that looks at the relationship between a dependent variable and one or more
independent variables.
R-Squared A value that measures the proportion of variance of a dependent variable that is predictable from
independent variables.
Standard deviation A statistical measure of the amount of variation in a set of numeric values.
Sum The combined values of a set of numeric data.
Supervised algorithm An algorithm that uses labeled data, which means data where a target variable is known.
Trend A general direction in which a variable or phenomenon is moving.
T-test A statistical test to determine if there is a significant difference between the means of two groups.
Unsupervised algorithm A type of machine learning model that looks for hidden patterns or structures in data.
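
The aggregate measures defined in this Domain 3 glossary (max, mean, median, min, mode, range, standard deviation, and sum) can be tried out with a short Python sketch. The data values below are made up for illustration, and the sketch assumes the sample standard deviation (`statistics.stdev`); a population standard deviation would use `statistics.pstdev` instead.

```python
import statistics

# Hypothetical set of numeric data, for example daily visit counts.
data = [2, 4, 4, 5, 7, 9, 11]

summary = {
    "max": max(data),
    "min": min(data),
    "sum": sum(data),
    "mean": statistics.mean(data),      # the average
    "median": statistics.median(data),  # middle value of the sorted data
    "mode": statistics.mode(data),      # most frequent value
    "range": max(data) - min(data),     # spread between the extremes
    "std_dev": statistics.stdev(data),  # sample standard deviation (assumption)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```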

Domain 4

Term Definition
Analysis question A question posed for data analysis to solve.
Box and whisker diagram A type of visualization incorporating the data range, quartiles, and median.
Disaggregation Converting aggregated data into smaller, more granular categories.
Histogram A visualization of the distribution of numeric data.
Sankey diagram A type of flow diagram often used to visualize processes.
Visualization Analyzing and representing data and findings visually.



Domain 5

Term Definition
Anonymity The quality of data being unable to be tied to a specific individual.
Bias An imbalance in data that can cause it to be skewed toward a demographic group, which can harm an AI or machine learning model.
Cognitive bias Systematic deviations in thought, perception, or judgment that can lead to inaccurate conclusions.
Encryption The process of converting data into an unreadable format, which can often only be reverted with an
encryption key.
FERPA The Family Educational Rights and Privacy Act, a United States federal law protecting the privacy of
student education records.
GDPR The General Data Protection Regulation, the European Union (EU) standard for handling data for companies doing business in the EU.
HIPAA The Health Insurance Portability and Accountability Act, a United States federal law protecting the
privacy and governing access to personal health records.
IRB Institutional Review Board, a committee responsible for oversight and review of research involving human subjects.
Motivational bias A type of bias that occurs when an individual's beliefs or motivations influence their decision-making processes.
PCI Payment Card Industry Data Security Standard, a set of security standards governing the use and storage of payment card data.
PII Personally identifiable information, data that can be used to identify an individual and their characteristics.
Population A group of individuals or data points with specific characteristics.
Sample A smaller set of a population chosen for analysis processes.



Objectives
IT Specialist - Data Analytics Objectives
Domain 1: Data Basics
1.1 Define the concept of data
1.1.1 Data concepts and uses
1.2 Describe basic data variable types
1.2.1 Boolean
1.2.2 Numeric
1.2.3 String
1.3 Describe basic structures used in data analytics
1.3.1 Tables
1.3.2 Rows
1.3.3 Columns
1.3.4 Lists
1.4 Describe data categories
1.4.1 Qualitative
1.4.2 Quantitative
1.4.3 Structured
1.4.4 Unstructured
1.4.5 Metadata
1.4.6 Big data

Domain 2: Data Manipulation
2.1 Import, store, and export data
2.1.1 Fundamental understanding of ETL (extract, transform and load) processes
2.1.2 Data manipulation tools (SQL, R, Python, Microsoft Excel including aspects of Power Query)
2.1.3 Common data storage file formats (delimited data files, XML, JSON)
2.2 Clean data
2.2.1 Handling null values
2.2.2 Handling special characters
2.2.3 Purpose and common practices: trimming spaces
2.2.4 Handling inconsistent formatting
2.2.5 Removing duplicates
2.2.6 Imputing data
2.2.7 Validating data
2.3 Organize data
2.3.1 Sorting and filtering data
2.3.2 Slicing data
2.3.3 Transposing data
2.3.4 Appending data
2.3.5 Truncating data
2.4 Aggregate data
2.4.1 Grouping data
2.4.2 Joining/merging data
2.4.3 Summarizing data
2.4.4 Pivoting data

Domain 3: Data Analysis
3.1 Describe and differentiate between different types of data analysis
3.1.1 Descriptive analysis
3.1.2 Diagnostic analysis
3.1.3 Hypothesis testing
3.1.4 Predictive analysis
3.1.5 Prescriptive analysis
3.2 Describe and differentiate between data aggregation and interpretation metrics
3.2.1 Searching
3.2.2 Filtering
3.2.3 Unique values
3.2.4 Aggregate functions (Sum, Max, Min, Count, Avg/Mean, Mode, Median, Std. Dev)
3.3 Describe and differentiate between exploratory data analysis methods
3.3.1 Identifying data relationships
3.3.2 Describe data drilling concepts (e.g., granularity)
3.3.3 Describe data mining concepts (anomalies, correlation analysis, patterns, outliers, etc.)
3.4 Evaluate and explain the results of data analyses
3.4.1 Calculate trends
3.4.2 Determine expected values
3.4.3 Interpret results of predictive models
3.4.4 Interpret results of p-values and t-tests
3.4.5 Interpret results of regression analyses
3.5 Define and describe the role of artificial intelligence in data analysis
3.5.1 Define artificial intelligence
3.5.2 Define machine learning
3.5.3 Define algorithm
3.5.4 Describe how AI is used in data analysis
3.5.5 Describe how machine learning algorithms are used in data analysis (Note: Specific algorithms are out of scope)



Domain 4: Data Visualization and Communication
4.1 Report data
4.1.1 Effectively display information in tables and charts
4.1.2 Explain when and why to disaggregate data
4.2 Create visualizations from data
4.2.1 Identify data visualization practices that minimize the potential for misinterpretation
4.2.2 Identify visualization types that represent the underlying data structure and analysis questions (including comparison, time/trend, part-to-whole, relationship, distribution, correlation graphs)
4.2.3 Identify additional visualization types that represent the underlying data structure and analysis questions (including box and whisker diagram, scatter chart, scatter plot, bar chart, Sankey diagram, histogram, pie chart, column chart, etc.)
4.3 Derive conclusions from a data visualization
4.3.1 Translate a visual representation of data into words
4.3.2 Identify differences between claims based on an analysis and its graphical representation

Domain 5: Responsible Analytics Practices
5.1 Describe data privacy laws and best practices
5.1.1 GDPR
5.1.2 FERPA
5.1.3 HIPAA
5.1.4 IRB
5.1.5 PCI
5.2 Describe best practices for responsible data handling
5.2.1 Methods of handling PII, securing data, and protecting anonymity within small data sets
5.2.2 Importance of anonymizing data
5.2.3 Trade-offs when balancing interpretability and accuracy
5.2.4 Shortcomings of making population-level generalizations with limited sample data
5.3 Given a scenario, describe types of bias that affect collection and interpretation of data
5.3.1 Confirmation bias
5.3.2 Human cognitive bias
5.3.3 Motivational bias
5.3.4 Sampling



IT Specialist – Data Analytics
Lesson Plan
Approximately 19.5 hours of videos, labs, and projects.

Domain 1 Lesson Plan
Domain 1 - Data Basics [approximately 3 hours of videos, labs, and projects]

Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Basics: Pre-Assessment

Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Define the Concept of Data; How to Study for This Exam; Data Concepts and Uses; Basic Data Variable Types; Boolean; Numeric; String
Objectives: 1.1 Define the concept of data; 1.1.1 Data concepts and uses; 1.2 Describe basic data variable types; 1.2.1 Boolean; 1.2.2 Numeric; 1.2.3 String
Exercise Labs: Numeric Data
Workbook Projects and Files: Define the Concept of Data – pg. 8 (file: N/A); Basic Data Variable Types – pg. 9 (file: N/A)

Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Structures Used in Data Analytics; Tables, Rows, Columns; Lists; Data Categories; Qualitative; Quantitative; Structured; Unstructured; Metadata; Big Data
Objectives: 1.3 Describe basic structures used in data analytics; 1.3.1 Tables; 1.3.2 Rows; 1.3.3 Columns; 1.3.4 Lists; 1.4 Describe data categories; 1.4.1 Qualitative; 1.4.2 Quantitative; 1.4.3 Structured; 1.4.4 Unstructured; 1.4.5 Metadata; 1.4.6 Big data
Exercise Labs: Data Tables; Quantitative Data
Workbook Projects and Files: Tables, Rows, Columns, and Lists – pg. 11 (file: N/A); Qualitative Data – pg. 12 (file: N/A); Quantitative Data – pg. 13 (file: N/A); Structured and Unstructured Data – pg. 14 (file: N/A); Metadata and Big Data – pg. 15 (file: N/A)

Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Basics: Post-Assessment

Domain 2 Lesson Plan
Domain 2 - Data Manipulation [approximately 5 hours of videos, labs, and projects]

Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Manipulation: Pre-Assessment

Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Import, Store, and Export Data; ETL Processes; Data Manipulation Tools; Power Query; Data Storage File Formats
Objectives: 2.1 Import, store, and export data; 2.1.1 Fundamental understanding of ETL (extract, transform and load) processes; 2.1.2 Data manipulation tools (SQL, R, Python, Microsoft Excel including aspects of Power Query); 2.1.3 Common data storage file formats (delimited data files, XML, JSON)
Exercise Labs: Using Power Query
Workbook Projects and Files: ETL Processes – pg. 17 (file: N/A); Data Manipulation Tools – pg. 18 (file: [Link]); Data Storage File Formats – pg. 19 (file: N/A)

Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Clean Data; Handle Null Values; Handle Special Characters; Trim Spaces; Handle Inconsistent Formatting; Remove Duplicates; Impute Data; Validate Data
Objectives: 2.2 Clean data; 2.2.1 Handling null values; 2.2.2 Handling special characters; 2.2.3 Purpose and common practices: trimming spaces; 2.2.4 Handling inconsistent formatting; 2.2.5 Removing duplicates; 2.2.6 Imputing data; 2.2.7 Validating data
Exercise Labs: Handling NULL Values; Handling Special Characters; Trimming Spaces; Handling Inconsistent Formatting; Removing Duplicates
Workbook Projects and Files: Handle NULL Values – pg. 21 (file: 221-CAT_Survey [Link]); Handle Special Characters – pg. 22 (file: 222-CSAT_Survey [Link]); Trim Spaces – pg. 23 (file: 223-CSAT_Survey [Link]); Handle Inconsistent Formatting – pg. 24 (file: 224-CSAT_Survey [Link]); Remove Duplicates – pg. 25 (file: 225-CSAT_Survey [Link]); Impute Data and Validate Data – pg. 26 (file: 226-CSAT_Survey [Link])

Lesson 3
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Organize Data; Sort and Filter Data; Slice Data; Transpose Data; Append Data; Truncate Data
Objectives: 2.3 Organize data; 2.3.1 Sorting and filtering data; 2.3.2 Slicing data; 2.3.3 Transposing data; 2.3.4 Appending data; 2.3.5 Truncating data
Exercise Labs: Sorting Data; Filtering Data; Slicing Data with PivotTable; Transposing Data
Workbook Projects and Files: Sort and Filter Data – pg. 28 (file: 231-Feb_visits.xls); Slice Data – pg. 29 (file: 232-Feb_visits.xls); Transpose and Append Data – pg. 30 (files: 233-Feb_visits.xls, 234-Feb_visits.xls); Truncate Data – pg. 31 (file: 235-Feb_visits.xls)

Lesson 4
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Aggregate Data; Group Data; Join or Merge Data; Summarize Data; Pivot Data
Objectives: 2.4 Aggregate data; 2.4.1 Grouping data; 2.4.2 Joining/merging data; 2.4.3 Summarizing data; 2.4.4 Pivoting data
Exercise Labs: Grouping Data; Joining or Merging Data; SUBTOTAL Function; Adding a Level to a Pivot Table
Workbook Projects and Files: Group, Join, and Merge Data – pg. 33 (files: 241-Feb_visits.xls, 242-Join_merge.xls, 242-Survey_results_1-start, 242-Survey_results_2-start); Summarize Data – pg. 34 (file: 243-Feb_visits.xls); Pivot Data – pg. 35 (file: 244-Feb_visits.xls)

Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Manipulation: Post-Assessment

Domain 3 Lesson Plan
Domain 3 - Data Analysis [approximately 5.5 hours of videos, labs, and projects]

Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Analysis: Pre-Assessment

Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Different Types of Data Analysis; Descriptive Analysis; Diagnostic Analysis; Hypothesis Testing; Predictive Analysis; Prescriptive Analysis
Objectives: 3.1 Describe and differentiate between different types of data analysis; 3.1.1 Descriptive analysis; 3.1.2 Diagnostic analysis; 3.1.3 Hypothesis testing; 3.1.4 Predictive analysis; 3.1.5 Prescriptive analysis
Exercise Labs: N/A
Workbook Projects and Files: Descriptive Analysis – pg. 37 (file: 311-Feb_visits.xls); Diagnostic Analysis – pg. 38 (file: 312-Feb_visits.xls); Hypothesis Testing – pg. 39 (file: N/A); Predictive and Prescriptive Analytics – pg. 40 (file: N/A)

Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Aggregation and Metrics; Search; Filter; Unique Values; Aggregate Functions
Objectives: 3.2 Describe and differentiate between data aggregation and interpretation metrics; 3.2.1 Searching; 3.2.2 Filtering; 3.2.3 Unique values; 3.2.4 Aggregate functions (Sum, Max, Min, Count, Avg/Mean, Mode, Median, Std. Dev)
Exercise Labs: Search an Excel Sheet; Use Filters; PivotTables
Workbook Projects and Files: Search Data – pg. 42 (file: 321-Feb_visits.xls); Filter Data – pg. 43 (file: 322-Feb_visits.xls); Find Unique Values – pg. 44 (file: 323-Feb_visits.xls); Aggregate Functions – pg. 45 (file: N/A)

Lesson 3
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Exploratory Data Analysis Methods; Identify Data Relationships; Correlation Coefficient; Data Drilling Concepts; Data Mining Concepts
Objectives: 3.3 Describe and differentiate between exploratory data analysis methods; 3.3.1 Identifying data relationships; 3.3.2 Describe data drilling concepts (e.g., granularity); 3.3.3 Describe data mining concepts (anomalies, correlation analysis, patterns, outliers, etc.)
Exercise Labs: Identify Correlations
Workbook Projects and Files: Find Relationships in Data – pg. 47 (file: 331-Datacize Membership [Link]); Data Drilling and Data Mining – pg. 48 (file: N/A)

Lesson 4
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Data Analysis Results; Calculate Trends; Determine Expected Values; Interpret Predictive Model Results; Interpret P-Values and T-Tests; Interpret Regression Analyses; Tables and Values
Objectives: 3.4 Evaluate and explain the results of data analyses; 3.4.1 Calculate trends; 3.4.2 Determine expected values; 3.4.3 Interpret results of predictive models; 3.4.4 Interpret results of p-values and t-tests; 3.4.5 Interpret results of regression analyses
Exercise Labs: Understand Linear Regressions; Evaluate Linear Regressions
Workbook Projects and Files: Calculate Trends and Expected Values – pg. 50 (file: N/A); Interpret Predictive Models – pg. 51 (file: N/A); Interpret P-Values and T-Tests – pg. 52 (file: N/A); Interpret Regression Analyses – pg. 53 (file: 345-Datacize-Membership-[Link])

Lesson 5
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: AI in Data Analysis; Define Artificial Intelligence; Define Machine Learning; Define Algorithm; Using AI in Data Analysis; Machine Learning Algorithms
Objectives: 3.5 Define and describe the role of artificial intelligence in data analysis; 3.5.1 Define artificial intelligence; 3.5.2 Define machine learning; 3.5.3 Define algorithm; 3.5.4 Describe how AI is used in data analysis; 3.5.5 Describe how machine learning algorithms are used in data analysis (Note: Specific algorithms are out of scope)
Exercise Labs: N/A
Workbook Projects and Files: Artificial Intelligence, Machine Learning, and Algorithms – pg. 55 (file: N/A); AI and Machine Learning in Data Analytics – pg. 56 (file: N/A)

Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Analysis: Post-Assessment

Domain 4 Lesson Plan
Domain 4 - Data Visualization and Communication [approximately 3 hours of videos, labs, and projects]

Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Visualization and Communication: Pre-Assessment

Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Report Data; Display Information; Disaggregate Data
Objectives: 4.1 Report data; 4.1.1 Effectively display information in tables and charts; 4.1.2 Explain when and why to disaggregate data
Exercise Labs: Data in Tables; Data in Charts
Workbook Projects and Files: Display Information – pg. 58 (file: 411-Datacize Membership 2019-[Link]); Disaggregate Data – pg. 59 (file: 412-Feb_visits.xlsx)

Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Create Visualizations from Data; Data Visualization Practices; Identify Visualization Types; Identify Additional Visualization Types
Objectives: 4.2 Create visualizations from data; 4.2.1 Identify data visualization practices that minimize the potential for misinterpretation; 4.2.2 Identify visualization types that represent the underlying data structure and analysis questions (including comparison, time/trend, part-to-whole, relationship, distribution, correlation graphs); 4.2.3 Identify additional visualization types that represent the underlying data structure and analysis questions (including box and whisker diagram, scatter chart, scatter plot, bar chart, Sankey diagram, histogram, pie chart, column chart, etc.)
Exercise Labs: Data Visualization Practices; Identify Visualization Types; Identify Additional Visualization Types
Workbook Projects and Files: Data Visualization Practices – pg. 61 (file: 421-Datacize Membership 2019-[Link]); Visualization Types – pg. 62 (file: N/A)

Lesson 3
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Derive Conclusions; Translate into Words; Claims vs. Representation
Objectives: 4.3 Derive conclusions from a data visualization; 4.3.1 Translate a visual representation of data into words; 4.3.2 Identify differences between claims based on an analysis and its graphical representation
Exercise Labs: Translate a Visual Representation
Workbook Projects and Files: Translate Visual Representations – pg. 64 (file: N/A); Visualizations vs. Statistics – pg. 65 (file: N/A)

Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Visualization and Communication: Post-Assessment

Domain 5 Lesson Plan
Domain 5 - Responsible Analytics Practices [approximately 3 hours of videos, labs, and projects]

Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Responsible Analytics Practices: Pre-Assessment

Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Privacy Laws and Best Practices; GDPR; FERPA; HIPAA; IRB; PCI
Objectives: 5.1 Describe data privacy laws and best practices; 5.1.1 GDPR; 5.1.2 FERPA; 5.1.3 HIPAA; 5.1.4 IRB; 5.1.5 PCI
Exercise Labs: N/A
Workbook Projects and Files: Privacy Laws and Standards – pg. 67 (file: N/A)

Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Responsible Data Handling; Handle PII; Anonymize Data; Interpretability and Accuracy; Shortcomings
Objectives: 5.2 Describe best practices for responsible data handling; 5.2.1 Methods of handling PII, securing data, and protecting anonymity within small data sets; 5.2.2 Importance of anonymizing data; 5.2.3 Trade-offs when balancing interpretability and accuracy; 5.2.4 Shortcomings of making population-level generalizations with limited sample data
Exercise Labs: Simple Vs. Complex Analysis
Workbook Projects and Files: Managing PII – pg. 69 (file: N/A); Data Analysis – pg. 70 (file: N/A)

Lesson 3
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Bias; Confirmation Bias; Human Cognitive Bias; Motivational Bias; Probability Sampling; Non-Probability Sampling
Objectives: 5.3 Given a scenario, describe types of bias that affect collection and interpretation of data; 5.3.1 Confirmation bias; 5.3.2 Human cognitive bias; 5.3.3 Motivational bias; 5.3.4 Sampling
Exercise Labs: Sampling Types
Workbook Projects and Files: Biases – pg. 72 (file: N/A); Sampling Methods – pg. 74 (file: N/A)

Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Responsible Analytics Practices: Post-Assessment

