Tevta Syllabus Book
Project Workbook
First Edition
LearnKey creates signature multimedia courseware. LearnKey provides expert instruction for popular computer software,
technical certifications, and application development with dynamic video-based courseware and effective learning
management systems. For a complete list of courses, visit [Link]
© 2025 LearnKey
Table of Contents
Introduction 1
Best Practices Using LearnKey’s Online Training 2
Using This Workbook 3
Skills Assessment 4
IT Specialist – Data Analytics Video Times 6
Domain 1 Lesson 1 7
Define the Concept of Data 8
Basic Data Variable Types 9
Domain 1 Lesson 2 10
Tables, Rows, Columns, and Lists 11
Qualitative Data 12
Quantitative Data 13
Structured and Unstructured Data 14
Metadata and Big Data 15
Domain 2 Lesson 1 16
ETL Processes 17
Data Manipulation Tools 18
Data Storage File Formats 19
Domain 2 Lesson 2 20
Handle Null Values 21
Handle Special Characters 22
Trim Spaces 23
Handle Inconsistent Formatting 24
Remove Duplicates 25
Impute Data and Validate Data 26
Domain 2 Lesson 3 27
Sort and Filter Data 28
Slice Data 29
Transpose and Append Data 30
Truncate Data 31
Domain 2 Lesson 4 32
Group, Join, and Merge Data 33
Summarize Data 34
Pivot Data 35
Domain 3 Lesson 1 36
Descriptive Analysis 37
Diagnostic Analysis 38
Hypothesis Testing 39
Predictive and Prescriptive Analytics 40
Domain 3 Lesson 2 41
Search Data 42
Filter Data 43
Find Unique Values 44
Aggregate Functions 45
Domain 3 Lesson 3 46
Find Relationships in Data 47
Data Drilling and Data Mining 48
Domain 3 Lesson 4 49
Calculate Trends and Expected Values 50
Interpret Predictive Models 51
Interpret P-Values and T-Tests 52
Interpret Regression Analyses 53
Domain 3 Lesson 5 54
AI, Machine Learning, and Algorithms 55
AI and Machine Learning in Data Analytics 56
Domain 4 Lesson 1 57
Display Information 58
Disaggregate Data 59
Domain 4 Lesson 2 60
Data Visualization Practices 61
Visualization Types 62
Domain 4 Lesson 3 63
Translate Visual Representations 64
Visualizations vs. Statistics 65
Domain 5 Lesson 1 66
Privacy Laws and Standards 67
Domain 5 Lesson 2 68
Managing PII 69
Data Analysis 70
Domain 5 Lesson 3 71
Biases 72
Sampling Methods 74
Appendix 76
Glossary 77
Objectives 80
IT Specialist – Data Analytics Lesson Plan 82
Domain 1 Lesson Plan 83
Domain 2 Lesson Plan 84
Domain 3 Lesson Plan 86
Domain 4 Lesson Plan 88
Domain 5 Lesson Plan 89
Introduction
1 | Introduction: Best Practices Using LearnKey’s Online Training IT Specialist – Data Analytics Project Workbook, First Edition
Best Practices Using LearnKey’s Online Training
LearnKey offers video-based training solutions that are flexible enough to accommodate private students, educational
facilities, and organizations.
Our course content is presented by top experts in their respective fields and provides clear and comprehensive
information. The full line of LearnKey products has been extensively reviewed to meet superior quality standards. Our
course content has also been endorsed by organizations such as Certiport, CompTIA®, Cisco, Adobe, and Microsoft.
However, it is the testimonials given by countless satisfied customers that truly set us apart as leaders in the information
training world.
LearnKey experts are highly qualified professionals who offer years of job and project experience in their subjects. Each
expert has been certified at the highest level available for their field of expertise. This expertise provides the student with
the knowledge necessary to obtain top-level certifications in their chosen field.
Our accomplished instructors have a rich understanding of the content they present. Effective teaching involves
presenting the basic principles of a subject as well as understanding and appreciating its organization, real-world
application, and links to other related disciplines. Each instructor represents the collective wisdom of their field and
of our industry.
We ensure that the subject matter is up-to-date and relevant. We examine the needs of each student and create training
that is both interesting and effective. LearnKey training provides auditory, visual, and kinesthetic learning materials to fit
diverse learning styles.
Pre-assessment: The pre-assessment is used to determine the student’s prior knowledge of the subject matter. It will also
identify a student’s strengths and weaknesses, allowing them to focus on the specific subject matter they need to improve
the most. Students should not necessarily expect a passing score on the pre-assessment as it is a test of prior knowledge.
Video training sessions: Each training course is divided into sessions or domains and lessons with topics and subtopics.
LearnKey recommends incorporating all available external resources into your training, such as student workbooks,
glossaries, course support files, and additional customized instructional material. These resources are located in the folder
icon at the top of the page.
Exercise labs: Labs are interactive activities that simulate situations presented in the training videos. Step-by-step
instructions and live demonstrations are provided.
Post-assessment: The post-assessment is used to determine the student’s knowledge gained from interacting with the
training. In taking the post-assessment, students should not consult the training or any other materials. A passing score is
80 percent or higher. If the individual does not pass the post-assessment the first time, LearnKey recommends
incorporating external resources, such as the workbook and additional customized instructional material.
Workbook: The workbook has various activities, including fill-in-the-blank questions, short answer questions, practice
exam questions, and group and individual projects that allow the student to study and apply concepts presented in the
course videos.
Using This Workbook
This project workbook contains practice projects and exercises to reinforce the knowledge you have gained through the
video portion of the IT Specialist – Data Analytics course. The purpose of this workbook is twofold: first, to prepare you
further to pass the IT Specialist – Data Analytics exam, and second, to teach you job-ready skills and increase your
employability in the area of data analysis.
The projects within this workbook follow the order of the video portion of this course. To save your answers in this
workbook, you must first download a copy to your computer. You will not be able to save your answers in the web version.
You can complete the workbook exercises as you go through each section of the course, complete several at the end of
each domain, or complete them after viewing the entire course. The key is to go through these projects to strengthen your
knowledge in this subject.
Each project is based upon a specific video (or videos) in the course and specific test objectives. The materials you will
need for this course include:
• The course project files. All applicable project files are located in the support area where you downloaded this
workbook.
• Microsoft Excel, as some projects require this software to complete project steps.
For Teachers
LearnKey is proud to provide extra support to instructors upon request. For your benefit as an instructor, we also provide
an instructor support .zip file containing answer keys, completed versions of the workbook project files, and other teacher
resources. This .zip file is available within your learning platform’s admin portal.
Notes
• Extra teacher notes, when applicable, are in the Project Details box within each exercise.
• Exam objectives are aligned with the course objectives listed in each project, and project file names correspond
with these numbers.
• The Finished folder in each domain has reference versions of each project. These can help you grade projects.
• Short answers may vary but should be similar to those provided in this workbook.
• Teachers may consider asking students to add their initials, student ID, or other personal identifiers at the end of
each saved project.
We value your feedback about our courses. If you have any questions, comments, or concerns, please let us know by
visiting [Link]
Skills Assessment
Instructions: Rate your skills on the following tasks from 1-5 (1 being needs improvement, 5 being excellent).
Skills 1 2 3 4 5
Define the concept of data.
Clean data.
Organize data.
Aggregate data.
Report data.
IT Specialist – Data Analytics Video Times
Domain 1 Video Time
Define the Concept of Data and Basic Data Variable Types [Link]
Structures Used in Data Analytics and Data Categories [Link]
Total Time [Link]
Domain 1 Lesson 1
Define the Concept of Data
Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Define the Concept of Data
Subtopic: Data Concepts and Uses
The first step in data analysis is defining data – its meaning, use, and why it matters. Understanding data in any form
is a foundational step in data analysis.
Purpose
Upon completing this project, you will better understand the concept of data.
Steps for Completion
a. Data takes nearly any form one can imagine in the real world.
Basic Data Variable Types
Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Basic Data Variable Types
Subtopic: Boolean; Numeric; String
Objectives covered
1 Data Basics
1.2 Describe basic data variable types
1.2.1 Boolean
1.2.2 Numeric
1.2.3 String
Notes for the teacher
Ensure students know how to define and identify the basic data variable types.
Several data types exist, including Boolean, numeric, and string data. Each type has a specific use, and some have more
than one subtype.
Purpose
Upon completing this project, you will become more familiar with basic data variable types.
Steps for Completion
1. Match each data type to its description.
A. Boolean B. Numeric C. String
a. Data that is primarily used in quantitative analysis.
b. Data that is made up of a sequence of characters, such as letters, numbers, or spaces, that are typically arranged in a
specific, meaningful order.
2. Refer to each image below. Then, label the data type pictured as Boolean, numeric, or string.
a. b.
c.
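As a rough illustration of the three basic variable types, the Python sketch below stores one value of each kind and prints the type Python assigns to it. The variable names and values are made up for this example and do not come from the course files.

```python
# Hypothetical values chosen only to illustrate the basic variable types.
is_member = True            # Boolean: one of two values, True or False
visit_count = 12            # Numeric (integer): a whole number
amount_spent = 49.95        # Numeric (floating point): a number with decimals
class_type = "Spin/Cycle"   # String: a sequence of characters

for value in (is_member, visit_count, amount_spent, class_type):
    # type(...).__name__ gives the name of the Python type for each value.
    print(repr(value), "is of type", type(value).__name__)
```

Python reports the two numeric values as different types (int and float), which is one way a single category of data can come in more than one form.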
Domain 1 Lesson 2
Tables, Rows, Columns, and Lists
Project Details
Project file
N/A
Notes for the teacher
Ensure students can define a data table and its different parts. Make sure they understand the difference between a
table and a list.
2. What are individual data points in a table known as?
a.
3. What do rows in a data table usually represent?
a.
a.
5. What is a list?
a.
a.
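One way to picture these structures outside a spreadsheet is the Python sketch below. It is only an illustration: the table contents and the names `table`, `class_types`, and so on are invented for this example.

```python
# A small, made-up table: each dictionary is one row, each key is a column.
table = [
    {"name": "Ana", "class_type": "Yoga", "visits": 4},
    {"name": "Ben", "class_type": "Spin/Cycle", "visits": 7},
]

# A list is a simple ordered collection of values with no columns.
class_types = ["Yoga", "Spin/Cycle", "Pilates"]

first_row = table[0]                               # one row of the table
visits_column = [row["visits"] for row in table]   # one column across all rows
single_value = table[1]["class_type"]              # the value where a row and column meet

print(first_row)
print(visits_column)
print(single_value)
```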
Qualitative Data
Project Details
Project file
N/A
Estimated completion time
5 minutes
In addition to data types, there are different data categories, which can be thought of as the larger buckets in which
data types fit and serve various purposes. The first data category in this workbook is qualitative data.
a.
6. What methodologies might one use to analyze the qualitative data in this table?
a.
Quantitative Data
Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Data Categories
Subtopic: Quantitative
Qualitative data’s counterpart is quantitative data, the bucket in which numeric data fits; it is quantifiable and can be
analyzed statistically.
Purpose
Upon completing this project, you will better understand quantitative data.
Steps for Completion
a.
Structured and Unstructured Data
Project Details
Project file
N/A
a.
b.
Metadata and Big Data
Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 1
Topic: Data Categories
Subtopic: Metadata; Big Data
Objectives covered
1 Data Basics
1.4 Describe data categories
1.4.5 Metadata
1.4.6 Big data
Notes for the teacher
Ensure students can define metadata and big data. If time permits, discuss more examples of each kind of data.
Metadata and big data are two more data categories available in data analysis. Metadata is data that accompanies and
gives context to other types of data. Big data is data in very large quantities. Data analysts often work with extensive
datasets with millions or billions of entries.
Purpose
Upon completing this project, you will better understand metadata and big data.
Steps for Completion
1. List two examples of metadata.
a.
2. What are the three Vs of big data?
a.
a.
a.
Domain 2 Lesson 1
ETL Processes
Project Details
Project file
N/A
Estimated completion time
5 minutes
The first step in data analysis is ETL, which stands for extract, transform, load. ETL serves as the foundation for data
analysis, initiating the process of preparing data for analysis.
a. Eliminate unnecessary data columns or attributes and edit headers to remove unnecessary
characters.
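The three ETL stages can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions: the raw data, the column names, and the in-memory destination are all invented for the example, and a real pipeline would read from and write to actual files or databases.

```python
import csv
import io

# Hypothetical raw export; the header names contain stray characters.
raw_csv = "id#,name*,internal_code\n1,Ana,X9\n2,Ben,Y3\n"

# Extract: read the raw rows.
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Transform: drop a column that is not needed and clean the header names.
keep = {"id#": "id", "name*": "name"}
cleaned = [{new: row[old] for old, new in keep.items()} for row in rows]

# Load: write the cleaned rows to a destination (an in-memory file here).
destination = io.StringIO()
writer = csv.DictWriter(destination, fieldnames=["id", "name"], lineterminator="\n")
writer.writeheader()
writer.writerows(cleaned)

print(destination.getvalue())
```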
Data Manipulation Tools
Project Details
Project file
[Link]
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Import, Store, and Export Data
Subtopic: Data Manipulation Tools; Power Query
Data manipulation tools are software designed to process, modify, and transform raw data into a more usable format.
Various data manipulation tools, such as Python, SQL, R, Microsoft Excel, and Power Query, are available to help
prepare data for analysis.
Purpose
Upon completing this project, you will become more familiar with data manipulation tools.
c. A data connection and transformation tool that is available across several Microsoft products
and provides powerful data connection, import, and transformation capabilities.
d. A high-level programming language that tends to be used specifically for its extensive statistics
libraries.
e. A commonly used data analytics environment that provides many data transformation, analysis,
and visualization capabilities. It allows users to store, manipulate, and display data in a tabular format.
3. Import data from the [Link] file in your Domain 2 Student folder.
Data Storage File Formats
Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Import, Store, and Export Data
Subtopic: Data Storage File Formats
Data comes in many formats, such as text documents, video files, or audio files. Knowing the basics about the file
formats one is most likely to encounter in data analysis is important, as it can help one understand the tools used to
load and manipulate them.
Purpose
Upon completing this project, you will better understand data storage file formats.
c. A markup language used to store and exchange structured data that can also store and
represent semi-structured data, such as configuration settings files.
2. Which file type is popular for being an easily read and accessed method of storing data in a simple table format?
a.
3. Which two file formats mentioned here are often used with programming languages such as Python?
a.
Domain 2 Lesson 2
Handle Null Values
Project Details
Project file
221-CAT_Survey [Link]
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Handle Null Values
Data cleaning is a crucial first step in data analytics because raw data is often messy. Cleaning data ensures the
information is accurate, reliable, and useful for analysis. There are several issues with datasets that require cleaning,
and a common one is null values.
Purpose
In this project, you will practice handling null values.
a.
a.
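Outside Excel, a first pass at null values often means counting them per column and then deciding what to do with the affected rows. The Python sketch below shows one possible approach; the rows, column names, and the `is_null` helper are hypothetical, and dropping incomplete rows is only one of several options.

```python
# Hypothetical survey rows; None and "" stand in for null values.
responses = [
    {"customer_id": 1, "age": 34, "feedback": "Great"},
    {"customer_id": 2, "age": None, "feedback": "Okay"},
    {"customer_id": 3, "age": 51, "feedback": ""},
]

def is_null(value):
    """Treat None and empty or whitespace-only strings as null."""
    return value is None or (isinstance(value, str) and value.strip() == "")

# Count nulls per column to see where the gaps are.
null_counts = {
    column: sum(is_null(row[column]) for row in responses)
    for column in responses[0]
}

# One option: keep only the rows that have no nulls at all.
complete_rows = [row for row in responses if not any(is_null(v) for v in row.values())]

print(null_counts)
print(complete_rows)
```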
Handle Special Characters
Project Details
Project file
222-CSAT_Survey [Link]
Estimated completion time
10 minutes
Objectives covered
2 Data Manipulation
2.2 Clean data
2.2.2 Handling special characters
Notes for the teacher
Ensure students know how to handle special characters using find and replace. Make sure they know there are several
ways to handle special characters.
Another issue one might encounter when cleaning data for analysis is special characters. Special characters can be
intentionally present in a dataset or result from accidental data entry or parsing.
1. Open the 222-CSAT_Survey [Link] file from your Domain 2 Student folder.
2. Use Find and Replace to remove all exclamation points, periods, commas, and question marks from column J.
3. In cell K1, type Cleaned feedback
4. In cells K2 through K53, use the SUBSTITUTE formula to replace all hyphens with spaces in cells J2 through J53.
5. Copy and paste the column K values onto column J.
a.
a.
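The same two cleaning actions, removing punctuation and swapping hyphens for spaces, can be sketched in Python. This is an illustration only: the feedback strings are invented, and the code is not part of the Excel project steps.

```python
# Hypothetical feedback strings, similar in spirit to a survey comment column.
feedback = ["Great class!", "Too crowded, too loud.", "Was it worth it?", "well-run studio"]

# Build a translation table that deletes the four punctuation marks
# named in the steps above.
remove_table = str.maketrans("", "", "!.,?")

cleaned = []
for text in feedback:
    no_punctuation = text.translate(remove_table)
    # Similar to SUBSTITUTE(cell, "-", " ") in Excel.
    cleaned.append(no_punctuation.replace("-", " "))

print(cleaned)
```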
Trim Spaces
Project Details
Project file
223-CSAT_Survey [Link]
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Trim Spaces
Trimming spaces is another common data-cleaning process. It is necessary when data has too many spaces or spaces
where there should be none. These additional spaces can create issues with parsing data, especially when passed
through a tool or algorithm that does not expect them to be there.
Purpose
In this project, you will practice trimming spaces.
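For comparison, here is how trimming might look in Python. The names in the list are made up for the example.

```python
# Hypothetical names with leading, trailing, and repeated inner spaces.
names = ["  Ana Lopez", "Ben   Carter  ", " Chi  Nguyen "]

# strip() removes leading and trailing whitespace only.
stripped = [name.strip() for name in names]

# split() with no arguments splits on runs of whitespace, so joining the
# pieces with one space also collapses repeated spaces inside the text,
# which is close to what Excel's TRIM function does.
trimmed = [" ".join(name.split()) for name in names]

print(stripped)
print(trimmed)
```

Note that `strip()` alone leaves the extra spaces between words in place, which is why the second approach is shown.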
Handle Inconsistent Formatting
Project Details
Project file
224-CSAT_Survey [Link]
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Handle Inconsistent Formatting
Objectives covered
2 Data Manipulation
2.2 Clean data
2.2.4 Handling inconsistent formatting
Notes for the teacher
Make sure students understand why it is important to have consistent formatting for data analysis. If time permits,
discuss when one might use different formats and why.
Another common issue when cleaning data is inconsistent formatting. Like spaces, inconsistent formatting may be due
to input errors, problems with previous data parsing, or a storage issue. Whatever the case, data analysts must address
and clean the issue before performing effective data analysis.
Purpose
In this project, you will practice handling inconsistent formatting.
Steps for Completion
1. Open the 224-CSAT_Survey [Link] file from your Domain 2 Student folder.
2. Change the dates in column C to the MM/DD/YY format.
3. Manually adjust any dates that remain unformatted.
4. Save the file as 224-CSAT_Survey Results-Completed
5. Give an example of inconsistent formatting.
a.
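The idea of converting mixed date formats to MM/DD/YY, and leaving anything unrecognized for manual adjustment, can be sketched in Python as follows. The sample strings and the list of formats to try are assumptions made for this example; they are not taken from the project file.

```python
from datetime import datetime

# Hypothetical date strings in mixed formats.
raw_dates = ["2025-02-03", "02/14/2025", "Feb 21, 2025", "sometime in March"]

# Formats to try, in order. This list is an assumption for the example.
known_formats = ["%Y-%m-%d", "%m/%d/%Y", "%b %d, %Y"]

def to_mm_dd_yy(text):
    """Return the date as MM/DD/YY, or None if no known format matches."""
    for fmt in known_formats:
        try:
            return datetime.strptime(text, fmt).strftime("%m/%d/%y")
        except ValueError:
            continue
    return None  # left for manual review

normalized = [to_mm_dd_yy(text) for text in raw_dates]
print(normalized)
```

Values that come back as `None` correspond to the dates that would need manual adjustment.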
Remove Duplicates
Project Details
Project file
225-CSAT_Survey [Link]
Estimated completion time
5 minutes
Another step that is often necessary when cleaning data is removing duplicates. Duplicate entries can bias datasets too
far toward one data point or observation, so analysts must remove them in many cases.
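A minimal Python sketch of duplicate removal is shown below. It keeps the first occurrence of each row and preserves the original order; the rows themselves are invented for the example.

```python
# Hypothetical rows: (customer ID, visit date, satisfaction score).
rows = [
    ("C001", "2025-02-03", 5),
    ("C002", "2025-02-04", 3),
    ("C001", "2025-02-03", 5),  # exact duplicate of the first row
]

seen = set()
unique_rows = []
for row in rows:
    if row not in seen:        # keep only the first occurrence
        seen.add(row)
        unique_rows.append(row)

print(unique_rows)
```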
Impute Data and Validate Data
Project Details
Project file
226-CSAT_Survey [Link]
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Clean Data
Subtopic: Impute Data; Validate Data
Objectives covered
2 Data Manipulation
2.2 Clean data
2.2.6 Imputing data
2.2.7 Validating data
Notes for the teacher
Make sure students understand how to impute data to handle null values. Ensure students know what validation is. If
time permits, discuss questions one might ask when validating data.
Data analysts must deal with null values in a dataset, as they can affect the accuracy of data analysis and insights. A
potential solution to dealing with null values is imputing data.
Validation verifies that analyzed data is reliable, accurate, and complete. It is a higher-level process than other data
cleaning steps, meaning there is no one process or formula to follow to validate data effectively.
Purpose
In this project, you will practice imputing and validating data.
Steps for Completion
1. Open the 226-CSAT_Survey [Link] file from your Domain 2 Student folder.
2. In cell B53, use the MEDIAN formula to find the median age of cells B2:B51.
3. Select cells B2:B51 and use Find and Replace to find blank cells and replace them with the median value in cell B53.
4. Use the COUNTBLANK formula in cell B52 to ensure no blank cells are in the Age column.
6. What are some descriptive statistics one might use on numeric data for validation?
a.
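The median imputation performed in the steps above has a direct Python counterpart. The sketch below uses a short, made-up list of ages rather than the project file, with `None` standing in for blank cells.

```python
from statistics import median

# Hypothetical Age column; None marks a blank cell.
ages = [34, None, 51, 28, None, 45]

# The median is computed from the known values only,
# playing the role of MEDIAN(B2:B51).
known_ages = [age for age in ages if age is not None]
median_age = median(known_ages)

# Replace each blank with the median.
imputed = [median_age if age is None else age for age in ages]

# Simple validation check, playing the role of COUNTBLANK.
blank_count = sum(age is None for age in imputed)

print(median_age, imputed, blank_count)
```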
Domain 2 Lesson 3
Sort and Filter Data
Project Details
Project file
231-Feb_visits.xlsx
Estimated completion time
5-10 minutes
Video reference
Domain 2
Topic: Organize Data
Subtopic: Sort and Filter Data
After cleaning is complete, data organization is another important step in data analysis. Analysts can take a deeper look
at data and rearrange it to work more effectively in their analyses. Sorting and filtering in Excel is an easy way to
organize data.
Purpose
In this project, you will practice sorting and filtering data.
5. Use the Class Type dropdown menu to show only Spin/Cycle class data.
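Sorting and filtering look much the same in code as they do in a spreadsheet. The Python sketch below uses a few invented rows; the column names are assumptions loosely modeled on the class data described in this project.

```python
# Hypothetical visit records.
visits = [
    {"member": "Ana", "class_type": "Yoga", "amount_spent": 20},
    {"member": "Ben", "class_type": "Spin/Cycle", "amount_spent": 35},
    {"member": "Chi", "class_type": "Spin/Cycle", "amount_spent": 15},
]

# Sort: highest amount spent first.
by_amount = sorted(visits, key=lambda row: row["amount_spent"], reverse=True)

# Filter: keep only the Spin/Cycle rows, like choosing one value in a dropdown.
spin_only = [row for row in visits if row["class_type"] == "Spin/Cycle"]

print([row["member"] for row in by_amount])
print([row["member"] for row in spin_only])
```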
Slice Data
Project Details
Project file
232-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Organize Data
Subtopic: Slice Data
Another important data manipulation skill is slicing data, also called subsetting. This means extracting a subset of data
from a larger dataset based on specified conditions or criteria. Slicing data gives analysts more flexibility in
manipulating and looking at different pieces of data.
Purpose
In this project, you will practice slicing data.
5. Open the Insert Slicer tool and select the Date of Last Visit.
Transpose and Append Data Project Details
Project file
Transposing data is a method of organization specifically used for tabular 233-Feb_visits.xlsx
structured data. It entails swapping the rows and columns in the schema; rows 234-Feb_visits.xlsx
become columns and vice versa.
Estimated completion time
10 minutes
Appending data means adding new rows or columns of data to an existing
dataset. It is typically easiest to do if the data is in table form, and it is important Video reference
if an analyst wants to combine datasets or add later observations or responses. Domain 2
Topic: Organize Data
Purpose Subtopic: Transpose Data; Append
Data
In this project, you will practice transposing and appending data.
Objectives covered
Steps for Completion 2 Data Manipulation
2.3 Organize data
1. Open the 233-Feb_visits.xlsx file from your Domain 2 Student folder. 2.3.3 Transposing data
2.3.4 Appending data
2. Create a new sheet named Transposed data
Notes for the teacher
3. On the 233-Feb_vists sheet, select and copy all the data. Make sure students know how to
transpose and append data in Microsoft
4. Transpose and paste the data to the Transposed data sheet. Excel.
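Transposing and appending can also be sketched in Python. The small table below is invented for the example; the first inner list acts as the header row.

```python
# Hypothetical table stored as a list of rows.
table = [
    ["member", "class_type", "visits"],
    ["Ana", "Yoga", 4],
    ["Ben", "Spin/Cycle", 7],
]

# Transpose: zip(*table) pairs up the first items of every row, then the
# second items, and so on, so rows become columns and vice versa.
transposed = [list(column) for column in zip(*table)]

# Append: add a new row to the end of the original table.
appended = table + [["Chi", "Pilates", 2]]

print(transposed)
print(appended[-1])
```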
Truncate Data
Project Details
Project file
235-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Organize Data
Subtopic: Truncate Data
Objectives covered
2 Data Manipulation
2.3 Organize data
2.3.5 Truncating data
Notes for the teacher
Ensure students understand what it means to truncate data in the context of database management, and for this
lesson. Make sure they have truncated the unnecessary data in the table on a new sheet named Truncated data.
While appending data means adding entries, or rows, to the entire dataset, truncating data can mean more than one
thing in practice. In a database management context, truncating data often means removing data from a table entirely.
For this lesson, truncating data means shortening or reducing the data length on a variable-by-variable basis.
Purpose
Upon completing this project, you will better understand truncating data.
Steps for Completion
1. Open the 235-Feb_visits.xlsx file from your Domain 2 Student folder.
2. Create a new sheet named Truncated data
3. Copy the table from the 235-Feb_visits sheet to the Truncated data sheet.
4. On the Truncated data sheet, truncate the Timestamp and IP Address columns from the table.
a. Data can be removed from a table entirely with a database management tool or
programming language such as SQL.
b. Truncating data can also mean eliminating rows or columns in data that are not deemed
valuable or necessary for analysis purposes.
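As a loose Python counterpart to the project steps, the sketch below removes two columns from a small table and, separately, shortens a value to fewer characters. The rows, column names, and addresses are all hypothetical.

```python
# Hypothetical visit rows with two columns that are not needed for analysis.
rows = [
    {"member": "Ana", "timestamp": "2025-02-03 09:15:42", "ip_address": "203.0.113.7", "visits": 4},
    {"member": "Ben", "timestamp": "2025-02-04 18:02:10", "ip_address": "203.0.113.9", "visits": 7},
]

# Remove columns: keep everything except the two unneeded columns.
drop = {"timestamp", "ip_address"}
without_columns = [{k: v for k, v in row.items() if k not in drop} for row in rows]

# Shorten values: keep only the first 10 characters of each timestamp (the date part).
dates_only = [row["timestamp"][:10] for row in rows]

print(without_columns)
print(dates_only)
```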
Domain 2 Lesson 4
Group, Join, and Merge Data
Project Details
Project file
241-Feb_visits.xlsx
242-Join_merge.xlsx
242-Survey_results_1-start
242-Survey_results_2-start
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Aggregate Data
Subtopic: Group Data; Join or Merge Data
Objectives covered
2 Data Manipulation
2.4 Aggregate data
2.4.1 Grouping data
2.4.2 Joining/merging data
Notes for the teacher
In file 241-Feb_visits – [Link], make sure students have grouped columns E through H together and found the
average Satisfaction Score for the grouped data. In file 242-Join_merge – [Link], ensure students have merged the
223-CSAT_Survey_results-start table and the Truncated data table.
The final set of topics in data manipulation is about data aggregation, or grouping and organizing data together for
analysis. Grouping data means organizing it into subsets based on common characteristics or criteria. Grouping is
important because it empowers analysts to group relevant data in large datasets and makes their analysis more
focused.
Another way to aggregate data is by joining and merging it. Joining and merging are interchangeable terms in most
data analytics contexts and refer to combining information from multiple sources into a single dataset.
Purpose
In this project, you will practice grouping, joining, and merging data.
Steps for Completion
1. Open the 241-Feb_visits.xlsx file from your Domain 2 Student folder.
2. Group the Class Type, Purchased Items, Amount Spent, and Satisfaction Score columns together.
3. Use the Subtotal tool to give the average Satisfaction Score for the grouped data.
4. Save the file as 241-Feb_visits-Completed
5. Open the 242-Join_merge.xlsx file from your Domain 2 Student folder.
6. Import data from the 242-Survey_results_1-start file.
a. Load 223-CSAT_Survey_results-start.
9. Use the Left Outer Join Kind and select the Customer ID columns in each table.
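To show what a left outer join does, the Python sketch below joins two small, invented tables on a shared customer ID. It assumes each ID appears at most once in the right-hand table; the table and column names are hypothetical.

```python
# Hypothetical left table: survey results.
survey = [
    {"customer_id": 1, "satisfaction": 5},
    {"customer_id": 2, "satisfaction": 3},
    {"customer_id": 3, "satisfaction": 4},
]
# Hypothetical right table: visit details.
visits = [
    {"customer_id": 1, "class_type": "Yoga"},
    {"customer_id": 3, "class_type": "Spin/Cycle"},
]

# Index the right table by the join key for quick lookups.
# This assumes each customer_id appears at most once in the right table.
visits_by_id = {row["customer_id"]: row for row in visits}

# Left outer join: every left row is kept; unmatched rows get None.
joined = []
for left in survey:
    right = visits_by_id.get(left["customer_id"])
    joined.append({**left, "class_type": right["class_type"] if right else None})

for row in joined:
    print(row)
```

Customer 2 has no match in the right table, so it still appears in the result with an empty class type.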
Summarize Data
Project Details
Project file
243-Feb_visits.xlsx
Estimated completion time
10 minutes
Video reference
Domain 2
Topic: Aggregate Data
Subtopic: Summarize Data
Summarizing data is an initial step of analyzing data and refers to the process of obtaining summary statistics. It is a
great way for data analysts to understand their data better before beginning deeper analysis.
Purpose
In this project, you will practice summarizing data.
7. In cell G86, use the SUBTOTAL function to produce the sum of cells G2:G79.
8. In cell G87, use the SUBTOTAL function to produce the average of cells G2:G79.
Pivot Data
Project Details
Project file
244-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 2
Topic: Aggregate Data
Subtopic: Pivot Data
Pivoting data is similar to transposing data but with some key differences. While transposing data is typically focused
on switching rows with columns and vice versa, pivoting involves extra steps to gain summary statistics and possibly
aggregate the data.
Purpose
In this project, you will practice pivoting data.
b. Pivoting data can aggregate or summarize data values and rearrange them.
c. Wide format data has many rows and fewer columns, while tall format data has many
columns and fewer rows.
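The aggregation side of pivoting can be sketched in Python as shown below. The purchase records are invented, and summing the amounts is an assumption for the example; a PivotTable could just as well count or average.

```python
from collections import defaultdict

# Hypothetical purchase records: (member, class type, amount spent).
records = [
    ("Ana", "Yoga", 20),
    ("Ana", "Spin/Cycle", 15),
    ("Ben", "Yoga", 10),
    ("Ana", "Yoga", 5),
]

# Sum the amounts for each member and class type while rearranging the data.
totals_by_member = defaultdict(lambda: defaultdict(int))
for member, class_type, amount in records:
    totals_by_member[member][class_type] += amount

# One output column per class type, in alphabetical order.
columns = sorted({class_type for _, class_type, _ in records})
pivot_table = {
    member: [totals.get(class_type, 0) for class_type in columns]
    for member, totals in totals_by_member.items()
}

print(columns)
print(pivot_table)
```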
Domain 3 Lesson 1
Descriptive Analysis
Project Details
Project file
311-Feb_visits.xlsx
Estimated completion time
5 minutes
Descriptive analysis summarizes and describes the main characteristics of a dataset. It often includes summary
statistics, along with some basic visualization methods to help dig deeper into data features.
5. According to the PivotTable, which two items were purchased the least?
a.
Diagnostic Analysis
Project Details
Project file
312-Feb_visits.xlsx
Estimated completion time
10 minutes
Video reference
Domain 3
Topic: Different Types of Data Analysis
Subtopic: Diagnostic Analysis
Objectives covered
3 Data Analysis
3.1 Describe and differentiate between different types of data analysis
3.1.2 Diagnostic analysis
Diagnostic analysis refers to the process of analyzing data to identify or diagnose trends, patterns, and potential
outcomes of events. It involves identifying and defining relationships in the data, exploring potential causal factors,
finding potential anomalies, and devising hypotheses based on these findings.
Purpose
Upon completing this project, you will better understand how to perform diagnostic analysis.
Steps for Completion
1. What is the difference between correlation and causation?
a.
5. Add the Satisfaction Score field to the Values section on the PivotTable.
6. Change the sum of the satisfaction scores in the PivotTable to the average of the satisfaction scores.
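The effect of switching a PivotTable value field from a sum to an average can be sketched in Python as follows. Only the Satisfaction Score field name comes from the steps above; the Class Type grouping field and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; only the Satisfaction Score field name comes from the project steps.
rows = [
    {"Class Type": "Spin", "Satisfaction Score": 4},
    {"Class Type": "Spin", "Satisfaction Score": 2},
    {"Class Type": "Yoga", "Satisfaction Score": 5},
]

# Collect the scores for each group, as the Rows section of a PivotTable would.
scores_by_group = defaultdict(list)
for row in rows:
    scores_by_group[row["Class Type"]].append(row["Satisfaction Score"])

# A sum grows with the number of rows in a group; an average does not,
# which makes averages easier to compare across groups of different sizes.
sum_by_group = {group: sum(scores) for group, scores in scores_by_group.items()}
avg_by_group = {group: mean(scores) for group, scores in scores_by_group.items()}

print(sum_by_group)
print(avg_by_group)
```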
Hypothesis Testing
Project Details
Project file
N/A
Estimated completion time
5-10 minutes
Video reference
Domain 3
Topic: Different Types of Data Analysis
Subtopic: Hypothesis Testing
Objectives covered
3 Data Analysis
3.1 Describe and differentiate between different types of data analysis
3.1.3 Hypothesis testing
Hypothesis testing involves making inferences or drawing conclusions about a
population parameter based on sample data. To perform hypothesis testing, an
analyst formulates two competing hypotheses, known as the null and alternative
hypotheses, collects data, and uses statistical methods to evaluate the
evidence against the null hypothesis.
Purpose
Upon completing this project, you will better understand how to test
hypotheses.
Steps for Completion
1. Determine whether each hypothesis is a null hypothesis or an
alternative hypothesis.
a.
3. What is a p-value?
a.
b. Used to determine the proportion of variance in one variable explained by another variable.
c. Used to determine if the means of two groups are significantly different from each other.
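A p-value is the probability of seeing a result at least as extreme as the observed one if the null hypothesis were true. One way to make that concrete is a permutation test, sketched below in Python. This is an illustrative technique rather than the method used in the course, and the scores, the trial count, and the function name are assumptions.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, trials=2000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling group labels."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(trials):
        # Under the null hypothesis the group labels are interchangeable.
        rng.shuffle(pooled)
        shuffled_a = pooled[:len(group_a)]
        shuffled_b = pooled[len(group_a):]
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed:
            extreme += 1
    return extreme / trials

# Hypothetical satisfaction scores for two groups.
p_same = permutation_p_value([3, 4, 5, 4], [4, 3, 5, 4])
p_different = permutation_p_value([1, 2, 1, 2, 1, 2], [8, 9, 8, 9, 8, 9])
print(p_same, p_different)
```

Groups with identical means give a large p-value, so the null hypothesis is not rejected, while clearly separated groups give a very small one.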
Predictive and Prescriptive Analytics
Project Details
Project file
N/A
Notes for the teacher
Ensure students understand the difference between predictive and
prescriptive analysis.
1. List at least two predictive analysis models.
a.
a.
3. Describe the process of preparing predictive analysis models with testing and training.
a.
a.
a.
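One common way to prepare a predictive model with training and testing is to hold back part of the data, fit the model on the rest, and then check its predictions on the held-back rows. The Python sketch below assumes an 80/20 split, a simple straight-line model, and made-up data; none of these details come from the course.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold back a fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(points):
    """Least-squares fit of y = intercept + slope * x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

# Hypothetical data that follows y = 2x + 1 exactly.
data = [(x, 2 * x + 1) for x in range(10)]
train, test = train_test_split(data)

# Training: the model learns from one part of the data.
intercept, slope = fit_line(train)
# Testing: the model is checked on rows it has not seen.
errors = [abs(y - (intercept + slope * x)) for x, y in test]
print(len(train), len(test), max(errors))
```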
Domain 3 Lesson 2
Search Data
Project Details
Project file
321-Feb_visits.xlsx
Estimated completion time
5-10 minutes
Searching a dataset might seem like a basic process, but it serves as a vital tool
in data analytics. Understanding how searching a dataset works in one’s
analytics tool of choice is essential to effective analysis.
Filter Data
Project Details
Project file
322-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 3
Topic: Aggregation and Metrics
Subtopic: Filter
Objectives covered
3 Data Analysis
3.2 Describe and differentiate between data aggregation and interpretation metrics
3.2.2 Filtering
Notes for the teacher
Students’ completed projects should show entries where a class was
attended. The entries should be sorted alphabetically from A to Z using the
Class Type column.
Filtering is a method of obtaining a subset of a dataset based on one or more
criteria. It can be used to look only at observations or respondents with specific
demographic characteristics or who provide the same answer to a question. In
data analytics, filtering data can help an analyst drill down into the most
relevant data quickly and provide quicker, more impactful analysis.
Purpose
Upon completing this project, you will better understand how to filter data in
Excel.
Steps for Completion
1. Open the 322-Feb_visits.xlsx file from your Domain 3 Student folder.
2. Add filters to each column in the file.
3. Filter the Class attended column to show only entries with Yes as a
value.
4. Sort data in the Class Type field from A to Z.
5. Save the file as 322-Feb_visits-Completed
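The filter and sort steps in this project can also be expressed in a few lines of Python. This is only a sketch of the same logic outside Excel. The Class attended and Class Type column names follow the steps, while the Member column and the sample rows are hypothetical.

```python
# Hypothetical rows; the Class attended and Class Type column names follow the project steps.
visits = [
    {"Member": "Ana", "Class attended": "Yes", "Class Type": "Yoga"},
    {"Member": "Ben", "Class attended": "No", "Class Type": ""},
    {"Member": "Cy", "Class attended": "Yes", "Class Type": "Cycling"},
]

# Filter: keep only the rows where a class was attended.
attended = [row for row in visits if row["Class attended"] == "Yes"]

# Sort: order the remaining rows by Class Type from A to Z.
attended.sort(key=lambda row: row["Class Type"])

print([row["Member"] for row in attended])
```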
Find Unique Values
Project Details
Project file
323-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 3
Topic: Aggregation and Metrics
Subtopic: Unique Values
Objectives covered
3 Data Analysis
3.2 Describe and differentiate between data aggregation and interpretation metrics
3.2.3 Unique values
Notes for the teacher
Students’ completed projects should show a PivotTable that displays the
Purchased Items? field and its values.
Unique values appear only once in a dataset. Unique values are important to
find and consider in data analytics because they can provide greater insight and
context into the dataset they are in, or they can muddle and distract from
analyses and findings. Each unique value should be carefully considered to
understand where it fits in the context of the dataset and how it relates to the
goals of an analysis.
Purpose
Upon completing this project, you will better understand how to find unique
values in Excel.
Steps for Completion
1. Open the 323-Feb_visits.xlsx file from your Domain 3 Student folder.
2. Create a PivotTable from all data in the file.
3. Add fields to the Rows and Values sections of the PivotTable until you
find one or more unique values.
a.
a.
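Counting how often each value occurs is one way to find values that appear only once. The Python sketch below uses made-up values and also lists the distinct values, which is what the Rows section of a PivotTable displays.

```python
from collections import Counter

# Hypothetical values from a single column.
purchased_items = ["Water", "Towel", "Water", "Protein bar", "Water", "Towel"]

counts = Counter(purchased_items)

# Values that appear exactly once, in the sense used in this project.
appear_once = [value for value, count in counts.items() if count == 1]

# Distinct values, as listed in the Rows section of a PivotTable.
distinct = sorted(counts)

print(appear_once)
print(distinct)
```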
Aggregate Functions
Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 3
Topic: Aggregation and Metrics
Subtopic: Aggregate Functions
Objectives covered
3 Data Analysis
3.2 Describe and differentiate between data aggregation and interpretation metrics
3.2.4 Aggregate functions (Sum, Max, Min, Count, Avg/Mean, Mode, Median, Std. Dev)
Notes for the teacher
If time permits, you may choose to show students each of these functions
used on a dataset in Excel.
Aggregate functions are an effective and simple way to obtain summary
statistics, which provide high-level insight about the dataset and its features. In
Excel, these functions include SUM, MAX, MIN, COUNT, AVERAGE, MODE,
MEDIAN, and STDEV.
Purpose
Upon completing this project, you will better understand how to use aggregate
functions in Excel.
Steps for Completion
1. Which Excel function finds the number of instances of a numeric value
in a selected data range?
a.
2. Which Excel function finds the mean of a selected data range?
a.
3. Which Excel function finds the amount of variation in a selected range
or set of values?
a.
4. Which Excel function finds the smallest values in a selected data range?
a.
a.
6. Which Excel function adds numeric data in a selected data range together?
a.
7. Which Excel function finds the largest values in a selected data range?
a.
8. Which Excel function calculates what data value occurs the most often across a selected data range?
a.
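For readers who also work outside Excel, the sketch below shows rough Python counterparts of the aggregate functions named in this project, using the standard library. The dictionary keys reuse the Excel function names only as labels, and the data range is hypothetical.

```python
import statistics

# Hypothetical numeric range.
values = [4, 8, 8, 5, 10]

summary = {
    "SUM": sum(values),
    "MAX": max(values),
    "MIN": min(values),
    "COUNT": len(values),                 # every entry here is numeric
    "AVERAGE": statistics.mean(values),
    "MODE": statistics.mode(values),
    "MEDIAN": statistics.median(values),
    "STDEV": statistics.stdev(values),    # sample standard deviation
}
print(summary)
```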
Domain 3 Lesson 3
Find Relationships in Data
Project Details
Project file
331-Datacize Membership 2019-[Link]
Estimated completion time
15 minutes
Video reference
Domain 3
Topic: Exploratory Data Analysis Methods
Subtopic: Identify Data Relationships; Correlation Coefficient
Objectives covered
3 Data Analysis
3.3 Describe and differentiate between exploratory data analysis methods
3.3.1 Identifying data relationships
Notes for the teacher
If time permits, you may choose to have students use the CORREL function to
calculate correlation coefficients between other variables in the file.
After cleaning the data and analyzing summary statistics, exploratory data
analysis is often the next stage in data analytics. It includes finding initial
relationships in data and determining where, in the future, more in-depth data
analysis might take place to develop actionable insights. One method of finding
initial relationships in data involves visualization. For more subtle relationships,
finding correlations may be necessary.
Purpose
Upon completing this project, you will better understand how to find
relationships in data.
Steps for Completion
1. Open the 331-Datacize Membership [Link] file from your
Domain 3 Student folder.
2. Create a bar chart containing all the data in the file.
3. Remove the Total annual revenue data from the chart.
4. What does the chart reveal about the gross membership revenue over
time?
a.
5. In an empty cell near the existing data, add the text Price/membership correlation
6. In the cell below the added text, use the CORREL function to calculate a correlation coefficient to analyze the
relationship between the number of current memberships and the cost of those memberships.
7. Rounded to the hundredth place, what is the correlation coefficient of the number of current memberships and
the cost of those memberships?
8. What does the correlation coefficient reveal about the relationship between the number of current memberships
and the cost of those memberships?
a.
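Excel's CORREL function returns the Pearson correlation coefficient. The Python sketch below computes the same statistic by hand so that each part of the formula is visible. The yearly figures are invented for illustration and are not the values in the project file.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient, the statistic Excel's CORREL returns."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - x_bar) ** 2 for x in xs))
    spread_y = sqrt(sum((y - y_bar) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical yearly figures, not the values in the project file.
membership_cost = [30, 32, 35, 38, 40]
current_memberships = [500, 480, 470, 430, 420]

r = pearson(membership_cost, current_memberships)
print(round(r, 2))
```

A coefficient close to -1 or 1 indicates a strong linear relationship, and the sign gives its direction. A value near 0 indicates little or no linear relationship.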
Data Drilling and Data Mining
Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 3
Topic: Exploratory Data Analysis Methods
Subtopic: Data Drilling Concepts; Data Mining Concepts
Objectives covered
3 Data Analysis
3.3 Describe and differentiate between exploratory data analysis methods
3.3.2 Describe data drilling concepts (e.g., granularity)
3.3.3 Describe data mining concepts (anomalies, correlation analysis, patterns, outliers, etc.)
Data drilling, sometimes called data drilling down or data drill-down, is an
analytical process of exploring data with increasing levels of granularity to
uncover insights or answer business questions. Data drilling is less of a specific
technique and more of a concept. Data mining discovers trends, patterns,
correlations, and insights from data, especially those found in large datasets. It
can include techniques such as classification, regression, clustering, association
rule mining, anomaly detection, time series analysis, and text mining.
Purpose
Upon completing this project, you will better understand how to perform data
drilling and data mining.
Steps for Completion
1. What data type do data analysts use when starting the data drilling
process?
a.
a.
4. What is clustering?
a.
a.
a.
7. What is classification?
a.
a.
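The idea of drilling down through levels of granularity can be sketched in Python by totaling the same records first by year and then by month. The sales records and the function name below are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records: (year, month, amount).
sales = [
    (2023, 1, 100), (2023, 1, 50), (2023, 2, 70),
    (2024, 1, 90), (2024, 3, 110),
]

def totals_by(records, level):
    """Total the amounts at a chosen granularity: 'year' or 'month'."""
    totals = defaultdict(int)
    for year, month, amount in records:
        key = year if level == "year" else (year, month)
        totals[key] += amount
    return dict(totals)

by_year = totals_by(sales, "year")    # coarse, summarized view
by_month = totals_by(sales, "month")  # one level of drill-down
print(by_year)
print(by_month)
```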
Domain 3 Lesson 4
Calculate Trends and Expected Values
Project Details
Project file
N/A
a.
a.
3. What is the easiest way to calculate expected values when there is a set of data values?
a.
a.
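In general, an expected value is the sum of each possible value multiplied by its probability. When every value in a set is equally likely, this reduces to the mean. The Python sketch below shows both cases with made-up numbers.

```python
def expected_value(outcomes):
    """Sum of each value multiplied by its probability."""
    return sum(value * probability for value, probability in outcomes)

# Hypothetical outcomes: (new memberships in a month, probability of that outcome).
outcomes = [(10, 0.25), (20, 0.5), (40, 0.25)]
weighted = expected_value(outcomes)

# With a plain set of equally likely data values, the expected value is the mean.
values = [10, 20, 30, 40]
simple = sum(values) / len(values)
print(weighted, simple)
```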
Interpret Predictive Models
Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 3
Topic: Data Analysis Results
Subtopic: Interpret Predictive Model Results
Notes for the teacher
If time permits, you may choose to show students sample datasets with
different MSE and R-squared values and explain what the values mean within the
context of the dataset.
Interpreting the results of predictive models is crucial to a data analyst, as the
interpretation of the data communicates the insights the model provides.
Metrics like coefficients, R-squared, and p-values mean very little as numeric
data unless analysts can explain their values to the business and how they
provide insights and answer business questions. When using a predictive model,
analysts should take time to understand how that model works, what the
metrics for its performance mean for the data, and the business questions the
predictive model is trying to answer.
1. What does Mean Square Error (MSE) measure?
a.
2. What does R-squared measure?
a.
a. The meaning of an R-squared value is relative to the values of the variables in the
dataset.
b. An R-squared value of 0.28 indicates a weak, positive relationship between the variances
of a dependent variable and an independent variable.
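To make the two metrics in this project concrete, the Python sketch below computes MSE and R-squared from a small set of actual values and model predictions. The numbers are invented for illustration.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Share of the variance in the actual values that the predictions account for."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total

# Hypothetical actual values and model predictions.
actual = [3, 5, 7, 9]
predicted = [2, 5, 8, 9]

mse = mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(mse, r2)
```

MSE is expressed in the squared units of the target variable, so its size only means something relative to the scale of the data, while R-squared is unitless.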
Interpret P-Values and T-Tests
Project Details
Project file
N/A
Estimated completion time
5-10 minutes
Video reference
Domain 3
Topic: Data Analysis Results
Subtopic: Interpret P-Values and T-Tests
Objectives covered
3 Data Analysis
3.4 Evaluate and explain the results of data analyses
3.4.4 Interpret results of p-values and t-tests
Notes for the teacher
Ensure students understand that the results of both p-values and t-tests are
compared to a significance level to determine their significance.
P-values and t-tests are used in hypothesis testing to determine statistically
whether to reject the null hypothesis, which represents the status quo. P-values
are often associated with linear regression and are used to determine the
significance of an observed result, and t-tests are used to determine if there is a
significant difference between the means of two groups.
Purpose
Upon completing this project, you will better understand how to interpret
p-values and t-tests.
Steps for Completion
1. Determine what should happen based on the statistical analysis results,
assuming the significance level is 0.05.
A. Reject the null hypothesis B. Fail to reject the null hypothesis
a. The p-value is 0.01.
b. The p-value is 0.12.
a.
a.
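The comparison of a p-value against a significance level can be written as a short Python function. This sketch assumes the common convention of rejecting the null hypothesis when the p-value is below the significance level. The function name is hypothetical.

```python
def decide(p_value, significance_level=0.05):
    """Compare a p-value to the significance level and return the decision."""
    if p_value < significance_level:
        return "Reject the null hypothesis"
    return "Fail to reject the null hypothesis"

for p in (0.01, 0.12):
    print(p, decide(p))
```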
Interpret Regression Analyses
Project Details
Project file
345-Datacize-Membership-2019-[Link]
Estimated completion time
10 minutes
Video reference
Domain 3
Topic: Data Analysis Results
Subtopic: Interpret Regression Analyses; Tables and Values
Objectives covered
3 Data Analysis
3.4 Evaluate and explain the results of data analyses
3.4.5 Interpret results of regression analyses
Notes for the teacher
Students’ completed projects should show a regression analysis, with current
memberships as the dependent variable and membership costs as the
independent variable. If time permits, you may choose to go into further
analysis of the different statistics included in the regression analysis.
Linear regression is a statistical predictive method that models the relationship
between one or more independent variables and a dependent variable. This
equation can represent a linear regression model with one independent
variable: y = B0 + B1x1 + e. There are also non-linear regression models that
can be used for variables with a non-linear relationship.
Purpose
Upon completing this project, you will better understand how to interpret the
results of linear regression analyses.
Steps for Completion
1. Open the [Link] file from your
Domain 3 Student folder.
2. Use the Data Analysis tool to create a regression analysis from the
current membership data and membership cost data.
a. Use current memberships as the Y variable.
b. Use the cost of memberships as the X variable.
3. Interpret the meaning of the R-square value.
a.
a.
a.
a.
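A simple linear regression of the form y = B0 + B1*x1 + e can be fitted by least squares in a few lines of Python. The sketch below estimates the intercept (B0), the slope (B1), and the R-square value. The figures are hypothetical and are not the values in the project file, and the output is far less detailed than what Excel's Data Analysis tool reports.

```python
def fit_simple_regression(xs, ys):
    """Least-squares estimates of B0 and B1 in y = B0 + B1*x1 + e, plus R-square."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    ss_residual = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return b0, b1, 1 - ss_residual / ss_total

# Hypothetical values, not the figures in the project file.
membership_cost = [30, 32, 35, 38, 40]            # X variable (independent)
current_memberships = [500, 480, 470, 430, 420]   # Y variable (dependent)

b0, b1, r_square = fit_simple_regression(membership_cost, current_memberships)
print(round(b0, 1), round(b1, 2), round(r_square, 3))
```

The R-square value is the proportion of the variation in the Y variable that the fitted line accounts for.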
Domain 3 Lesson 5
AI, Machine Learning, and Algorithms
Project Details
Project file
N/A
a.
a.
c. Natural language processing (NLP) is often used for chatbots and sentiment analysis.
e. Unsupervised learning occurs when models learn from labeled data where the input-
output pairs are provided and labeled during training.
f. Training is the process of evaluating models on unseen data to assess their performance
and ability.
g. Algorithms are finite and end after specified steps have occurred.
h. If given the same input in multiple instances, algorithms will produce different outputs.
AI and Machine Learning in Data Analytics
Project Details
Project file
N/A
2. List two or more advantages of using machine learning algorithms for data analytics.
a.
Domain 4 Lesson 1
Display Information
Project Details
Project file
411-Datacize Membership 2019-[Link]
Estimated completion time
5 minutes
Once data has been collected, the next step is to display the information in a
table or a chart. Choosing the right method of displaying the information is
important for making sure it is understood. There are a few best practices to
follow to ensure that the information in a table or chart is effectively displayed.
a.
Disaggregate Data
Project Details
Project file
412-Feb_visits.xlsx
Estimated completion time
5 minutes
Video reference
Domain 4
Topic: Report Data
Subtopic: Disaggregate Data
Objectives covered
4 Data Visualization and Communication
4.1 Report Data
4.1.2 Explain when and why to disaggregate data
Notes for the teacher
Ensure students understand the meaning of aggregation and
disaggregation, and when to use each skill. Demonstrate how to disaggregate
data if time permits.
Disaggregating data is the process of breaking aggregated data down into
individual categories for more granular data analysis. It can be useful for
performing more detailed analysis, enhancing transparency, and fostering
equity. Keep in mind when aggregating data that you may need to disaggregate
it later, so save the data appropriately.
Purpose
Upon completing this project, you will better understand when and why to
disaggregate data.
Steps for Completion
1. Open the 412-Feb_visits.xlsx file in your Domain 4 Student Folder.
2. Which column contains disaggregated data, and what is it showing?
a.
3. List three reasons one might need to disaggregate data.
a.
Domain 4 Lesson 2
Data Visualization Practices
Project Details
Project file
421-Datacize Membership 2019-[Link]
Estimated completion time
5 minutes
Notes for the teacher
Ensure students understand why misinterpretation can be harmful, and
why it is worth it to take the time to follow these practices.
Following a few best practices can help minimize misinterpretation when
creating data visualizations. Sometimes, charts and graphs can be taken out of
context and misunderstood, but following these best practices can help mitigate
this problem.
3. Save the file as 421-Datacize Membership 2019-2023-Completed
4. How can one prevent misinterpretation if a file might be viewed without
the analyst providing context?
a.
Visualization Types
Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 4
Topic: Create Visualizations from Data
Subtopic: Identify Visualization Types; Identify Additional Visualization Types
Objectives covered
4 Data Visualization and Communication
4.2 Create visualizations from data
4.2.2 Identify visualization types that represent the underlying data
structure and analysis questions (including comparison, time/trend,
part-to-whole, relationship, distribution, correlation graphs)
4.2.3 Identify additional visualization types that represent the underlying
data structure and analysis questions (including box and whisker diagram,
scatter chart, scatter plot, bar chart, Sankey diagram, histogram, pie chart,
column chart, etc.)
When creating visualizations, it is important to understand the types of data
visualizations, how they represent the research question they seek to answer,
and the data’s underlying structure. Choosing the appropriate visualization is
crucial to presenting data in a way that can be easily understood. Sometimes,
visualizations of the same type can be used in conjunction with one another to
provide a further look at the data.
Purpose
Upon completing this project, you will better understand data visualization
types.
Steps for Completion
1. Label each visualization category with its corresponding data
visualizations.
a. Comparison:
b. Time/Trend:
c. Part-to-whole:
d. Relationship:
e. Distribution:
f. Correlation:
2. What is a Sankey diagram?
a.
4. What chart type is best for looking at either percentages or parts of a sum?
a.
Domain 4 Lesson 3
Translate Visual Representations
Project Details
Project file
N/A
a. Which two days of the week have the most class registrations?
i.
b. Which two days of the week have the least class registrations?
i.
c. How do the weekly registrations for Wednesday compare to the rest of the week?
i.
Visualizations vs. Statistics
Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 4
Topic: Derive Conclusions
Subtopic: Claims vs. Representation
Objectives covered
4 Data Visualization and Communication
4.3 Derive conclusions from a data visualization
4.3.2 Identify differences between claims based on an analysis and its
graphical representation
Visualizations and statistics are both powerful tools that work together in data
analytics. Still, they have different strengths and drawbacks when used to
substantiate claims analysts make from the data. Knowing these strengths and
drawbacks is essential to understanding which approach to employ and where
to employ it when substantiating analytical claims.
Purpose
Upon completing this project, you will better understand how to choose
between visualizations and statistics for data analysis.
Steps for Completion
1. List at least one advantage of using statistics to analyze data rather than
visualizations.
a.
3. List at least one advantage of using visualizations to analyze data rather than statistics.
a.
4. List at least one disadvantage of using visualizations to analyze data rather than statistics.
a.
Domain 5 Lesson 1
Privacy Laws and Standards
Project Details
Project file
N/A
Estimated completion time
5-10 minutes
Video reference
Domain 5
Topic: Privacy Laws and Best Practices
Subtopic: GDPR; FERPA; HIPAA; IRB; PCI
Objectives covered
5 Responsible Analytics Practices
5.1 Describe data privacy laws and best practices
5.1.1 GDPR
5.1.2 FERPA
5.1.3 HIPAA
5.1.4 IRB
5.1.5 PCI
Notes for the teacher
Ensure students understand the differences between the privacy laws
outlined in this course.
Responsible analytics practices involve being responsible with data and ensuring
one is aware of and following the laws and regulations concerning data storage
and handling. There are some laws, regulations, and standards one must know,
including the General Data Protection Regulation (GDPR), the Family Educational
Rights and Privacy Act (FERPA), the Health Insurance Portability and
Accountability Act (HIPAA), and the Payment Card Industry Data Security
Standard (PCI DSS), commonly referred to as PCI. Institutional Review Boards
(IRBs) are organizations that data analysts may also work with.
Purpose
Upon completing this project, you will better understand privacy laws and
standards.
Steps for Completion
1. Match the privacy law terms to their functions.
A. GDPR C. HIPAA E. PCI
B. FERPA D. IRB
a. A set of security standards established to protect payment card data.
c. A United States federal law protecting the privacy of student education data, applying to all
educational institutions that receive funding from the US federal government.
d. A data protection law that governs the processing and handling of EU residents’ personal data and
the transfer of any personal data to countries or entities outside the EU.
e. A law that governs the protection of sensitive personal health information (PHI), regulating its use,
storage, disclosure, and security.
a. GDPR is designed to protect the privacy rights of individuals and has strict requirements for
organizations working with data on its collection, processing, storage, and security.
b. HIPAA gives the parents of students certain rights regarding the education records of their
children.
c. FERPA applies to anyone who works with healthcare data, including hospitals, health insurance
companies, healthcare clearinghouses, and business associates of these organizations who handle PHI.
d. Those pursuing a career in data analytics will likely encounter IRBs at some point.
e. PCI DSS standards apply to any organization handling cardholder information from major
credit card brands, including Mastercard, Visa, American Express, and Discover.
Domain 5 Lesson 2
Managing PII
Project Details
Project file
N/A
Estimated completion time
5-10 minutes
Video reference
Domain 5
Topic: Responsible Data Handling
Subtopic: Handle PII; Anonymize Data
Objectives covered
5 Responsible Analytics Practices
5.2 Describe best practices for responsible data handling
5.2.1 Methods of handling PII, securing data, and protecting anonymity
within small datasets
5.2.2 Importance of anonymizing data
Data analysts must know how best to handle personally identifiable information
(PII). Improper handling of PII can result in breaches and expose those whose
data is at risk to negative consequences. A few best practices to follow when
working with PII include minimization, anonymization, encryption, secure
storage, access controls, and regular audits.
Purpose
Upon completing this project, you will better understand how to manage PII and
keep data secure.
Steps for Completion
1. List at least three types of data that are considered PII.
a.
2. Describe minimization.
3. Describe anonymization.
a.
4. Describe encryption.
a.
a. Anonymizing data is another level of protection against breaches, identity theft, and fraud.
b. Anonymizing data does not protect datasets when they are shared with internal and external
collaborators and third parties.
c. Regular audits and training on the security of PII should be conducted to ensure continued
compliance with security guidelines.
d. Access controls should be strict, only letting those users who absolutely need to use the data
access it.
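Two of the practices named in this project, minimization and the replacement of direct identifiers, can be sketched in Python as follows. Note that a keyed hash is a form of pseudonymization, which is weaker than full anonymization because anyone holding the key can recompute the mapping. The key, the record, and the function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-securely-stored-key"

def minimize(record, needed_fields):
    """Minimization: keep only the fields the analysis actually needs."""
    return {field: record[field] for field in needed_fields}

def pseudonymize(value):
    """Replace an identifier with a keyed hash so rows can still be linked."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical member record.
record = {"name": "Ana Lopez", "email": "ana@example.com", "visits": 12, "plan": "Gold"}

safe = minimize(record, ["visits", "plan"])
safe["member_id"] = pseudonymize(record["email"])
print(safe)
```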
Data Analysis
Project Details
Project file
N/A
Estimated completion time
5 minutes
Video reference
Domain 5
Topic: Responsible Data Handling
Subtopic: Interpretability and Accuracy; Shortcomings
Objectives covered
5 Responsible Analytics Practices
5.2 Describe best practices for responsible data handling
5.2.3 Trade-offs when balancing interpretability and accuracy
5.2.4 Shortcomings of making population-level generalizations with
limited sample data
Notes for the teacher
Remind students that there is no one correct answer for the balance between
interpretability and accuracy. Each analysis problem requires a unique
approach.
Data analysts must know how to prepare their analyses with their audience,
often decision-makers, in mind. Analyses should be clearly interpretable and
accurate, but depending on how one communicates those analyses, they will
likely skew toward either interpretability or accuracy. There are pros and cons
to both simple and complex visual models displaying data. There is also a
weakness in making population-level generalizations with limited sample data.
Purpose
Upon completing this project, you will better understand the differences
between simple and complex visual models and the downsides of limited
sample data.
Steps for Completion
1. Match the analysis model type to its pros and cons.
Simple Complex
a. Easier for the analyst and audience to interpret.
b. Often more accurate and can highlight difficult-to-locate insights.
c. May sacrifice accuracy and miss subtler data findings.
2. Sampling , or sampling too much or too little from different parts of a population, can lead to a study
or survey not accurately representing the population about which one is trying to gain insights.
3. Limited sample data increases the risk of encountering Type I errors (false ) and Type II errors
(false ).
Domain 5 Lesson 3
Biases
Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 5
Topic: Bias
Subtopic: Confirmation Bias; Human Cognitive Bias; Motivational Bias
Objectives covered
5 Responsible Analytics Practices
5.3 Given a scenario, describe types of bias that affect collection and
interpretation of data
5.3.1 Confirmation bias
5.3.2 Human cognitive bias
5.3.3 Motivational bias
Notes for the teacher
Remind students that analysts should work to avoid biases in their research
and analyses. Knowing about and recognizing bias types is the first step
to being able to avoid or minimize any influence they may have in one’s
analysis process.
Data analysts should be aware of two main bias categories when researching:
cognitive biases and motivational biases. Cognitive biases often find their basis
in subconscious mental thought processes and shortcuts. In contrast,
motivational biases tend to be driven by conscious and unconscious motivations
that can affect the way people interpret information, classify findings, and make
decisions. Because objectivity is essential to best practices in data analytics, it is
crucial for analysts to understand biases and how best to avoid them. Biases can
be overcome by remaining aware of them during research and analysis, staying
as objective as possible during these processes, and inviting outside
perspectives on the data.
Purpose
Upon completing this project, you will better understand different types of
biases when presenting analysis findings.
Steps for Completion
1. Match the cognitive bias type to its definition.
A. Confirmation bias E. Loss aversion bias
B. Availability heuristic F. Hindsight bias
C. Anchoring bias G. Sunk cost fallacy
D. Overconfidence bias
b. The tendency to prefer avoiding losses over acquiring equivalent gains, leading to risk-averse
behavior.
c. The tendency for a person to overestimate the likelihood of events based on how well or recently
they remember them.
d. Occurs when a person overestimates their own abilities or knowledge, leading to unwarranted
confidence in their research, analysis, or predictions.
e. The tendency for people to seek out, interpret, or remember information in a manner that confirms
their existing beliefs or hypotheses and for them to ignore evidence to the contrary.
f. The tendency to rely too heavily on the first piece of information encountered in research when
making decisions and subsequent information is viewed as less important.
g. The tendency to continue to invest resources in a project or course of action that is no longer
deemed beneficial due to the resources already invested.
2. Match the motivational bias type to its definition.
a. This occurs when a person’s existing beliefs, desires, or preferences guide their reasoning and
decision-making processes, making them more likely to arrive at conclusions that align with these rather than
objective evidence.
b. This occurs when a person believes in outcomes or evidence they prefer to be true, despite what
evidence may say to the contrary.
c. The tendency for a person to attribute positive events and outcomes to their own skill and efforts
while attributing negative events and outcomes to external factors beyond their control.
d. The tendency to interpret information or findings in a way that aligns with or serves one’s desires or
goals.
Sampling Methods
Project Details
Project file
N/A
Estimated completion time
10 minutes
Video reference
Domain 5
Topic: Bias
Subtopic: Probability Sampling; Non-Probability Sampling
Sampling is the process of identifying and using a subset of a population to represent it. Employing proper sampling methods is crucial to effective data analysis. Otherwise, analysts risk obtaining samples that are not representative of the population, thus causing bias and skewing the data analysis. While the only perfect way to obtain findings and insights from a population is to survey that whole population directly, this is rarely practical, so data analysts must understand how to use different sampling methods. There are two main types of sampling: probability and non-probability sampling.
Notes for the teacher
If time permits, you may choose to discuss with students some of the advantages and disadvantages of different probability and non-probability sampling methods.
1. How are participants chosen from a population with probability sampling?
a.
a.
a. The first individual or item for the sample is chosen randomly, and after this, every item or
individual is selected at a set interval.
d. Every individual or item in the population has an equal probability of being selected for the
sample.
c. Individuals are chosen by asking existing participants to ask people they know to participate.
d. Individuals or items are chosen based on the analyst’s understanding of the research question
and their existing knowledge.
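Two of the definitions above describe probability sampling approaches that are commonly called simple random sampling (every item has an equal chance of selection) and systematic sampling (a random starting item, then every item at a set interval). The following is a minimal Python sketch of both ideas, not part of the workbook's own materials. The function names, the `members` list, and the interval of 10 are hypothetical choices made only for illustration.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k items so that every item has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def systematic_sample(population, interval, seed=None):
    """Pick a random starting item, then take every `interval`-th item after it."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start within the first interval
    return population[start::interval]


# Hypothetical population of 100 member IDs (illustrative data only).
members = list(range(1, 101))

random_sample = simple_random_sample(members, k=10, seed=42)
systematic = systematic_sample(members, interval=10, seed=42)

print("Simple random sample:", sorted(random_sample))
print("Systematic sample:", systematic)
```

Passing a seed makes the draw repeatable for classroom discussion. Snowball and judgment-based selection, also described above, are non-probability methods and depend on people rather than on a random draw, so they are not sketched here.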
Appendix
Glossary
Domain 1
Term Definition
Boolean A type of data with only one of two values possible.
Continuous data Data that can take on any value in a specified range.
Data Information, facts, or figures that can be collected, stored, and analyzed.
Dataset A group of related data, often organized in a specific structure.
Discrete data Data that can only take on specific values from a finite set of possible values.
Feature A measurable characteristic in a dataset, also known as a variable.
Interval data Data measured on a scale with meaningful intervals.
List An organized, ordered collection of items or elements.
Metadata Data containing details and context about other data.
Qualitative Data or information expressed non-numerically.
Quantitative Numerically expressed data or information.
Ratio data Interval data that has a true zero point.
Schema A structural representation of the organization, arrangement, and relationships of a dataset.
Table A method of storing data in a system containing columns and rows.
Domain 2
Term Definition
Aggregation Combining smaller data categories into larger grouped categories.
Correlation A measure of the relationship between two variables in a dataset.
Data Type A characteristic of data, such as numeric, string, or date.
Database An organized collection of structured data or information stored in a manageable, retrievable,
manipulatable format.
ETL Extract, transform, load; the process of obtaining data from the source, altering or reorganizing it
for analysis, and loading it into an analysis tool.
Imputation Replacing missing data values with estimated values.
Validation Checking the accuracy and completeness of data.
Domain 3
Term Definition
AI Artificial intelligence, the field of building systems that perform tasks normally associated with
human intelligence, such as learning and problem-solving. Machine learning is a subset of AI.
Algorithm A set of rules used to generate calculations or perform problem-solving operations.
Alternative hypothesis A hypothesis that there is a significant relationship between variables in a dataset or statistical
analysis.
Anomaly A data point that deviates from what is expected within the dataset or findings.
Data drilling Exploring and analyzing data in increasing levels of detail and granularity.
Data mining Discovering patterns, trends, and relationships in a dataset.
Hypothesis A proposed explanation or educated guess that can be tested.
Linear Regression A form of regression in which an outcome is a continuous variable, often a number.
Machine learning The development of algorithms and statistical models that can learn and improve without being
explicitly programmed.
Max The maximum value in a set of numeric data.
Mean The average of a set of numeric data.
Median The value closest to the midpoint of a set of numeric data.
Min The minimum value in a set of numeric data.
Mode The value appearing most often in a set of numeric data.
Model A mathematical representation of the relationship between variables in a dataset.
Natural Language Processing A form of AI that reads data from text and images and can include speech recognition and object
detection.
Null hypothesis A hypothesis that there is no significant relationship between variables in a dataset or statistical
analysis.
Outlier A data point significantly outside the range of most of the rest of a dataset.
Parameter An input value used to help train an algorithm.
Pattern A reoccurring relationship or behavioral tendency in a dataset.
P-value A statistic to help determine the significance of an observed effect in a hypothesis test.
Range The spread of a set of numeric data.
Regression A model type that looks at the relationship between a dependent variable and one or more
independent variables.
R-Squared A value that measures the proportion of variance of a dependent variable that is predictable from
independent variables.
Standard deviation A statistical measure of the amount of variation in a set of numeric values.
Sum The combined values of a set of numeric data.
Supervised algorithm An algorithm that uses labeled data, which means data where a target variable is known.
Trend A general direction in which a variable or phenomenon is moving.
T-test A statistical test to determine if there is a significant difference between the means of two groups.
Unsupervised algorithm A type of machine learning model that looks for hidden patterns or structures in data.
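The summary terms defined in the Domain 3 glossary above (sum, min, max, range, mean, median, mode, and standard deviation) can be illustrated with a short Python sketch using only the standard library. The `values` list is hypothetical sample data, range is computed here as max minus min, and `statistics.stdev` returns the sample standard deviation, which is one of several conventions.

```python
import statistics

# Hypothetical numeric data used only for illustration.
values = [4, 8, 8, 15, 16, 23, 42]

summary = {
    "sum": sum(values),                       # combined values
    "min": min(values),                       # minimum value
    "max": max(values),                       # maximum value
    "range": max(values) - min(values),       # spread, taken as max minus min
    "mean": statistics.mean(values),          # average
    "median": statistics.median(values),      # middle value of the sorted data
    "mode": statistics.mode(values),          # most frequent value
    "std_dev": statistics.stdev(values),      # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```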
Domain 4
Term Definition
Analysis question A question posed for data analysis to solve.
Box and whisker diagram A type of visualization incorporating the data range and median.
Disaggregation Converting aggregated data into smaller, more granular categories.
Histogram A visualization of the distribution of numeric data.
Sankey diagram A type of flow diagram often used to visualize processes.
Visualization Analyzing and representing data and findings visually.
Domain 5
Term Definition
Anonymity The quality of data being unable to be tied to a specific individual.
Bias An imbalance in data that can cause it to be skewed toward a demographic group, which can harm
an AI or machine learning model.
Cognitive bias Systematic deviations in thought, perception, or judgment that can lead to inaccurate conclusions.
Encryption The process of converting data into an unreadable format, which can often only be reverted with an
encryption key.
FERPA The Family Educational Rights and Privacy Act, a United States federal law protecting the privacy of
student education records.
GDPR General Data Protection Regulation, the European Union (EU) standard for handling data
for companies doing business in the EU.
HIPAA The Health Insurance Portability and Accountability Act, a United States federal law protecting the
privacy and governing access to personal health records.
IRB Institutional Review Board, or a committee responsible for oversight and review of research
involving human subjects.
Motivational bias A type of bias that occurs when an individual's beliefs or motivations influence their decision-
making processes.
PCI Payment Card Industry Data Security Standard, or a set of security standards governing the use and
storage of payment card data.
PII Personally identifiable information, or data that can be used to identify an individual and their
characteristics.
Population A group of individuals or data points with specific characteristics.
Sample A smaller set of a population chosen for analysis processes.
Domain 1 Lesson Plan
Domain 1 - Data Basics [approximately 3 hours of videos, labs, and projects]
Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Basics: Pre-Assessment
Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Define the Concept of Data; How to Study for This Exam; Data Concepts and Uses; Basic Data Variable Types; Boolean; Numeric; String
Objectives:
1.1 Define the concept of data
1.1.1 Data concepts and uses
1.2 Describe basic data variable types
1.2.1 Boolean
1.2.2 Numeric
1.2.3 String
Exercise Labs: Numeric Data
Workbook Projects and Files:
Define the Concept of Data – pg. 8 (file: N/A)
Basic Data Variable Types – pg. 9 (file: N/A)
Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Structures Used in Data Analytics; Tables, Rows, Columns; Lists; Data Categories; Qualitative; Quantitative; Structured; Unstructured; Metadata; Big Data
Objectives:
1.3 Describe basic structures used in data analytics
1.3.1 Tables
1.3.2 Rows
1.3.3 Columns
1.3.4 Lists
1.4 Describe data categories
1.4.1 Qualitative
1.4.2 Quantitative
1.4.3 Structured
1.4.4 Unstructured
1.4.5 Metadata
1.4.6 Big data
Exercise Labs: Data Tables; Quantitative Data
Workbook Projects and Files:
Tables, Rows, Columns, and Lists – pg. 11 (file: N/A)
Qualitative Data – pg. 12 (file: N/A)
Quantitative Data – pg. 13 (file: N/A)
Structured and Unstructured Data – pg. 14 (file: N/A)
Metadata and Big Data – pg. 15 (file: N/A)
Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Basics: Post-Assessment
Domain 2 Lesson Plan
Domain 2 - Data Manipulation [approximately 5 hours of videos, labs, and projects]
Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Manipulation: Pre-Assessment
Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Import, Store, and Export Data; ETL Processes; Data Manipulation Tools; Power Query; Data Storage File Formats
Objectives:
2.1 Import, store, and export data
2.1.1 Fundamental understanding of ETL (extract, transform and load) processes
2.1.2 Data manipulation tools (SQL, R, Python, Microsoft Excel including aspects of Power Query)
2.1.3 Common data storage file formats (delimited data files, XML, JSON)
Exercise Labs: Using Power Query
Workbook Projects and Files:
ETL Processes – pg. 17 (file: N/A)
Data Manipulation Tools – pg. 18 (file: [Link])
Data Storage File Formats – pg. 19 (file: N/A)
Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Clean Data; Handle Null Values; Handle Special Characters; Trim Spaces; Handle Inconsistent Formatting; Remove Duplicates; Impute Data; Validate Data
Objectives:
2.2 Clean data
2.2.1 Handling null values
2.2.2 Handling special characters
2.2.3 Purpose and common practices: trimming spaces
2.2.4 Handling inconsistent formatting
2.2.5 Removing duplicates
2.2.6 Imputing data
2.2.7 Validating data
Exercise Labs: Handling NULL Values; Handling Special Characters; Trimming Spaces; Handling Inconsistent Formatting; Removing Duplicates
Workbook Projects and Files:
Handle NULL Values – pg. 21 (file: 221-CAT_Survey [Link])
Handle Special Characters – pg. 22 (file: 222-CSAT_Survey [Link])
Trim Spaces – pg. 23 (file: 223-CSAT_Survey [Link])
Handle Inconsistent Formatting – pg. 24 (file: 224-CSAT_Survey [Link])
Remove Duplicates – pg. 25 (file: 225-CSAT_Survey [Link])
Impute Data and Validate Data – pg. 26 (file: 226-CSAT_Survey [Link])
Lesson 3
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Organize Data; Sort and Filter Data; Slice Data; Transpose Data; Append Data; Truncate Data
Objectives:
2.3 Organize data
2.3.1 Sorting and filtering data
2.3.2 Slicing data
2.3.3 Transposing data
2.3.4 Appending data
2.3.5 Truncating data
Exercise Labs: Sorting Data; Filtering Data; Slicing Data with PivotTable; Transposing Data
Workbook Projects and Files:
Sort and Filter Data – pg. 28 (file: 231-Feb_visits.xls)
Slice Data – pg. 29 (file: 232-Feb_visits.xls)
Transpose and Append Data – pg. 30 (files: 233-Feb_visits.xls; 234-Feb_visits.xls)
Truncate Data – pg. 31 (file: 235-Feb_visits.xls)
Lesson 4
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Aggregate Data; Group Data; Join or Merge Data; Summarize Data; Pivot Data
Objectives:
2.4 Aggregate data
2.4.1 Grouping data
2.4.2 Joining/merging data
2.4.3 Summarizing data
2.4.4 Pivoting data
Exercise Labs: Grouping Data; Joining or Merging Data; SUBTOTAL Function; Adding a Level to a Pivot Table
Workbook Projects and Files:
Group, Join, and Merge Data – pg. 33 (files: 241-Feb_visits.xls; 242-Join_merge.xls; 242-Survey_results_1-start; 242-Survey_results_2-start)
Summarize Data – pg. 34 (file: 243-Feb_visits.xls)
Pivot Data – pg. 35 (file: 244-Feb_visits.xls)
Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Manipulation: Post-Assessment
Domain 3 Lesson Plan
Domain 3 - Data Analysis [approximately 5.5 hours of videos, labs, and projects]
Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Analysis: Pre-Assessment
Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Different Types of Data Analysis; Descriptive Analysis; Diagnostic Analysis; Hypothesis Testing; Predictive Analysis; Prescriptive Analysis
Objectives:
3.1 Describe and differentiate between different types of data analysis
3.1.1 Descriptive analysis
3.1.2 Diagnostic analysis
3.1.3 Hypothesis testing
3.1.4 Predictive analysis
3.1.5 Prescriptive analysis
Exercise Labs: N/A
Workbook Projects and Files:
Descriptive Analysis – pg. 37 (file: 311-Feb_visits.xls)
Diagnostic Analysis – pg. 38 (file: 312-Feb_visits.xls)
Hypothesis Testing – pg. 39 (file: N/A)
Predictive and Prescriptive Analytics – pg. 40 (file: N/A)
Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Aggregation and Metrics; Search; Filter; Unique Values; Aggregate Functions
Objectives:
3.2 Describe and differentiate between data aggregation and interpretation metrics
3.2.1 Searching
3.2.2 Filtering
3.2.3 Unique values
3.2.4 Aggregate functions (Sum, Max, Min, Count, Avg/Mean, Mode, Median, Std. Dev)
Exercise Labs: Search an Excel Sheet; Use Filters; PivotTables
Workbook Projects and Files:
Search Data – pg. 42 (file: 321-Feb_visits.xls)
Filter Data – pg. 43 (file: 322-Feb_visits.xls)
Find Unique Values – pg. 44 (file: 323-Feb_visits.xls)
Aggregate Functions – pg. 45 (file: N/A)
Lesson 3
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Exploratory Data Analysis Methods; Identify Data Relationships; Correlation Coefficient; Data Drilling Concepts; Data Mining Concepts
Objectives:
3.3 Describe and differentiate between exploratory data analysis methods
3.3.1 Identifying data relationships
3.3.2 Describe data drilling concepts (e.g., granularity)
3.3.3 Describe data mining concepts (anomalies, correlation analysis, patterns, outliers, etc.)
Exercise Labs: Identify Correlations
Workbook Projects and Files:
Find Relationships in Data – pg. 47 (file: 331-Datacize Membership [Link])
Data Drilling and Data Mining – pg. 48 (file: N/A)
Lesson 4
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Data Analysis Results; Calculate Trends; Determine Expected Values; Interpret Predictive Model Results; Interpret P-Values and T-Tests; Interpret Regression Analyses; Tables and Values
Objectives:
3.4 Evaluate and explain the results of data analyses
3.4.1 Calculate trends
3.4.2 Determine expected values
3.4.3 Interpret results of predictive models
3.4.4 Interpret results of p-values and t-tests
3.4.5 Interpret results of regression analyses
Exercise Labs: Understand Linear Regressions; Evaluate Linear Regressions
Workbook Projects and Files:
Calculate Trends and Expected Values – pg. 50 (file: N/A)
Interpret Predictive Models – pg. 51 (file: N/A)
Interpret P-Values and T-Tests – pg. 52 (file: N/A)
Interpret Regression Analyses – pg. 53 (file: 345-Datacize-Membership-[Link])
Lesson 5
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: AI in Data Analysis; Define Artificial Intelligence; Define Machine Learning; Define Algorithm; Using AI in Data Analysis; Machine Learning Algorithms
Objectives:
3.5 Define and describe the role of artificial intelligence in data analysis
3.5.1 Define artificial intelligence
3.5.2 Define machine learning
3.5.3 Define algorithm
3.5.4 Describe how AI is used in data analysis
3.5.5 Describe how machine learning algorithms are used in data analysis (Note: Specific algorithms are out of scope)
Exercise Labs: N/A
Workbook Projects and Files:
Artificial Intelligence, Machine Learning, and Algorithms – pg. 55 (file: N/A)
AI and Machine Learning in Data Analytics – pg. 56 (file: N/A)
Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Analysis: Post-Assessment
Domain 4 Lesson Plan
Domain 4 - Data Visualization and Communication [approximately 3 hours of videos, labs, and projects]
Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Visualization and Communication: Pre-Assessment
Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Report Data; Display Information; Disaggregate Data
Objectives:
4.1 Report data
4.1.1 Effectively display information in tables and charts
4.1.2 Explain when and why to disaggregate data
Exercise Labs: Data in Tables; Data in Charts
Workbook Projects and Files:
Display Information – pg. 58 (file: 411-Datacize Membership 2019-[Link])
Disaggregate Data – pg. 59 (file: 412-Feb_visits.xlsx)
Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Create Visualizations from Data; Data Visualization Practices; Identify Visualization Types; Identify Additional Visualization Types
Objectives:
4.2 Create visualizations from data
4.2.1 Identify data visualization practices that minimize the potential for misinterpretation
4.2.2 Identify visualization types that represent the underlying data structure and analysis questions (including comparison, time/trend, part-to-whole, relationship, distribution, correlation graphs)
4.2.3 Identify additional visualization types that represent the underlying data structure and analysis questions (including box and whisker diagram, scatter chart, scatter plot, bar chart, Sankey diagram, histogram, pie chart, column chart, etc.)
Exercise Labs: Data Visualization Practices; Identify Visualization Types; Identify Additional Visualization Types
Workbook Projects and Files:
Data Visualization Practices – pg. 61 (file: 421-Datacize Membership 2019-[Link])
Visualization Types – pg. 62 (file: N/A)
Lesson 3
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Derive Conclusions; Translate into Words; Claims vs. Representation
Objectives:
4.3 Derive conclusions from a data visualization
4.3.1 Translate a visual representation of data into words
4.3.2 Identify differences between claims based on an analysis and its graphical representation
Exercise Labs: Translate a Visual Representation
Workbook Projects and Files:
Translate Visual Representations – pg. 64 (file: N/A)
Visualizations vs. Statistics – pg. 65 (file: N/A)
Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Data Visualization and Communication: Post-Assessment
Domain 5 Lesson Plan
Domain 5 - Responsible Analytics Practices [approximately 3 hours of videos, labs, and projects]
Pre-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Responsible Analytics Practices: Pre-Assessment
Lesson 1
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Privacy Laws and Best Practices; GDPR; FERPA; HIPAA; IRB; PCI
Objectives:
5.1 Describe data privacy laws and best practices
5.1.1 GDPR
5.1.2 FERPA
5.1.3 HIPAA
5.1.4 IRB
5.1.5 PCI
Exercise Labs: N/A
Workbook Projects and Files:
Privacy Laws and Standards – pg. 67 (file: N/A)
Lesson 2
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Responsible Data Handling; Handle PII; Anonymize Data; Interpretability and Accuracy; Shortcomings
Objectives:
5.2 Describe best practices for responsible data handling
5.2.1 Methods of handling PII, securing data, and protecting anonymity within small data sets
5.2.2 Importance of anonymizing data
5.2.3 Trade-offs when balancing interpretability and accuracy
5.2.4 Shortcomings of making population-level generalizations with limited sample data
Exercise Labs: Simple Vs. Complex Analysis
Workbook Projects and Files:
Managing PII – pg. 69 (file: N/A)
Data Analysis – pg. 70 (file: N/A)
Lesson 3
Video time - [Link]
Exercise Lab time - [Link]
Workbook time - [Link]
Lesson Topic and Subtopics: Bias; Confirmation Bias; Human Cognitive Bias; Motivational Bias; Probability Sampling; Non-Probability Sampling
Objectives:
5.3 Given a scenario, describe types of bias that affect collection and interpretation of data
5.3.1 Confirmation bias
5.3.2 Human cognitive bias
5.3.3 Motivational bias
5.3.4 Sampling
Exercise Labs: Sampling Types
Workbook Projects and Files:
Biases – pg. 72 (file: N/A)
Sampling Methods – pg. 74 (file: N/A)
Post-Assessment
Assessment time - [Link]
Lesson Topic and Subtopics: Responsible Analytics Practices: Post-Assessment