IBM AI Enterprise Workflow Exam Guide

This document contains an exam for the IBM AI Enterprise Workflow V1 Data Scientist Specialist certification. The exam contains 18 multiple choice questions covering topics like machine learning, data cleaning, model evaluation, natural language processing, and deploying models to production. It tests knowledge of techniques like logistic regression, decision trees, dimensionality reduction, and handling missing data.

Uploaded by

Alb Fir

Exam C1000-059: IBM AI Enterprise Workflow V1 Data Scientist Specialist (Sample Questions)

1. To reduce the overall time to complete a data ingestion job, what two actions should be taken?

A. Assemble the data pipeline into a series of immutable transformations, which can be combined after the processing.
B. Partition the data within each pipeline to take advantage of parallel
processing (multiple server cores, processors, etc.).
C. Look for outliers in the data, missing values, and skewness of the
data.
D. Build a dedicated pipeline for each dataset to ensure that all of them
can be processed independently and concurrently.
E. Apply a chi-squared statistical test to rank the impact of each feature
on the concept label and discard the less impactful features before
model training.

2. A design thinking project at a large corporation is in progress, and most of the project activities involve conducting interviews and the creation and review of photo journals. Which phase of the design thinking process is currently being executed?

A. Empathize
B. Define
C. Ideate
D. Prototype

3. A client requests a general artificial intelligence (AI) tool that they can plug into their data warehouse. What is the best response to this request?

A. There is no general AI tool currently that works universally.
B. Apply neural networks to your data.
C. IBM Watson is the tool you are looking for.
D. AI can create value without any human intervention.

4. What is a key advantage to a machine learning system versus a rule-based system for making business decisions?

A. Machine learning systems can be implemented by business users.
B. Machine learning systems generalize better than a rule-based
system.
C. Machine learning systems are always more accurate than
rule-based systems.
D. Rule-based systems can only deal with nominal and ordinal
categorical data, whereas machine learning systems can deal with all
types of data.

5. What is a class of machine learning problems where the algorithm builds a mathematical model from a small amount of labeled data with a large amount of unlabeled data?

A. semi-supervised learning
B. partially labeled learning
C. nearest-neighbor clustering
D. imperfect knowledge clustering

6. What should be the first step to begin the task of collecting initial data?

A. Copy data from several sources to a central repository to review the data
B. Determine if a poll is required to collect data
C. Verify the technical skills that are required to collect data
D. Understand the business requirement to find out what would be the
relevant data needed

7. What are two common ways to handle missing values when cleaning data?

A. delete records
B. replace with '1'
C. replace with mean
D. replace with '100'
E. replace with standard deviation

8. A client, a tomato grower, provides a dataset of measurements of tomato plants and environmental data. A data scientist thinks the features probably have a significant amount of redundancy. The data scientist decides to apply dimensionality reduction to the data features.

Which three techniques are examples of dimensionality reduction?

A. k-means clustering
B. batch normalization
C. combinatorial optimization
D. autoencoder neural network
E. principal component analysis (PCA)
F. t-distributed stochastic neighbor embedding (t-SNE)

9. Which is an accurate statement regarding logistic regression?

A. Logistic regression is a non-linear classifier.
B. Logistic regression can be used for unsupervised learning.
C. Logistic regression can be used for binary classification.
D. The logistic function f(x) = 1/(1 + exp(-(wx + b))) can take values
between [0, inf].

10. What are three hyperparameters that are used when building a
simple decision tree model?

A. kernel
B. learning rate
C. maximum depth
D. split criterion
E. number of nearest neighbors
F. minimum number of samples in a leaf node

11. What is used to update coefficients in logistic regression?

A. number of features
B. kernel
C. slope
D. gradient descent

12. Which two statements are true in the context of evaluating machine learning models?

A. Accuracy of 95% is always a good result.
B. Random guessing can be used as a baseline.
C. The F2-score puts equal weight on precision and recall.
D. F-score is the harmonic mean between precision and recall.
E. Evaluation metrics on training data are more important than on test
data.

13. What is the main benefit of adjusted R-squared compared to R-squared?

A. all samples are considered in the formula
B. the number of features is considered in the formula
C. the average R-squared is calculated
D. train and test split is respected

14. Which model evaluation metric is best suited for imbalanced data sets?

A. precision-recall curve
B. ROC curve
C. misclassification curve
D. lift curve

15. Which IBM offering enables data scientists to deploy their trained machine learning models to production in a scalable environment?

A. Watson Machine Learning
B. Watson Studio
C. Watson Knowledge Catalog
D. Watson OpenScale

16. Which Python function would allow a data analyst to convert strings of dates (such as "10 June 1964") into struct_time objects to be used for further data cleansing?

A. import [Link]()
B. import timobj.str2obj()
C. import [Link]()
D. import [Link]()

17. The "aperture problem" in machine vision is best defined as?

A. Identifying a whole object or scene based on seeing only a small part of that object or scene
B. generating "snakes" of active contours based on boundary curves
C. pattern matching based on an undertrained model
D. over-fitting a model based on close-up images

18. What is an example of a relation type that can be detected with Watson Natural Language Understanding?

A. partOf
B. describedBy
C. assistant
D. during

Answer Key:

1. BD
2. A
3. A
4. B
5. A
6. D
7. AC
8. DEF
9. C
10. CDF
11. D
12. BD
13. B
14. A
15. A
16. A
17. A
18. A
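
The option text for question 16 is garbled in this copy, so the following is not taken from the answer options. It is only a minimal sketch of one standard-library call that turns a date string such as "10 June 1964" into a struct_time, namely time.strptime. The format string is an assumption chosen to match that example, and %B (full month name) depends on the active locale.

```python
import time

# Format string is an assumption chosen to match "10 June 1964":
# %d = day of month, %B = full month name (locale-dependent), %Y = 4-digit year.
parsed = time.strptime("10 June 1964", "%d %B %Y")

# The result is a time.struct_time whose fields can be read by name.
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
```

Date strings in a different layout would need a different format string; strptime raises ValueError when the string and the format do not match.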

Common questions

A non-linear classifier can capture complex patterns that logistic regression cannot, because logistic regression is a linear classifier with a linear decision boundary. Logistic regression remains a good choice when the classes are approximately linearly separable, since it is interpretable and cheap to train, whereas non-linear classifiers may need more computational resources and risk overfitting, particularly on small datasets.

Assembling a data pipeline into a series of immutable transformations allows for efficient data processing by enabling parallelization and reducing the need for intermediate storage or duplication. Partitioning the data within each pipeline allows parallel processing across multiple server cores or processors, which reduces the overall time to complete a data ingestion job.
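
Partitioning for parallel processing can be sketched as follows. This is an illustrative example only: the function names partition, clean_chunk, and ingest, and the cleaning step itself, are hypothetical. A thread pool is used to keep the sketch simple; CPU-bound transformations would more likely use a process pool or a distributed engine.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(records, n_parts):
    """Split records into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(records), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks

def clean_chunk(chunk):
    """Hypothetical per-record transformation: trim whitespace and lowercase."""
    return [value.strip().lower() for value in chunk]

def ingest(records, n_workers=4):
    """Process each partition on its own worker, then recombine in order."""
    chunks = partition(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        cleaned = list(pool.map(clean_chunk, chunks))  # map preserves chunk order
    return [row for chunk in cleaned for row in chunk]

raw = ["  Alpha", "BETA ", " Gamma ", "delta", "EPSILON  "]
result = ingest(raw, n_workers=2)
print(result)
```

Because clean_chunk returns a new list rather than modifying its input, each partition can be processed independently and the results combined afterwards.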

The precision-recall curve is well suited to imbalanced datasets because precision and recall are both computed from the positive (usually minority) class and ignore true negatives. Accuracy, by contrast, can be misleadingly high simply because the majority class dominates.
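
The effect can be illustrated with a small made-up dataset of 5 positives and 95 negatives. The labels and both sets of predictions below are invented for illustration. Two hypothetical models reach the same accuracy, yet their precision, recall, and F1 (the harmonic mean of precision and recall) differ sharply.

```python
def precision_recall_fbeta(y_true, y_pred, beta=1.0):
    """Return (precision, recall, F-beta) for binary labels, positive class = 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta ** 2 * precision + recall
    fbeta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return precision, recall, fbeta

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented data: 5 positives followed by 95 negatives.
y_true = [1] * 5 + [0] * 95

# Model A never predicts the positive class.
model_a = [0] * 100
# Model B finds 4 of the 5 positives and raises 4 false alarms.
model_b = [1, 1, 1, 1, 0] + [1] * 4 + [0] * 91

for name, y_pred in (("A", model_a), ("B", model_b)):
    p, r, f1 = precision_recall_fbeta(y_true, y_pred)
    print(name, accuracy(y_true, y_pred), p, r, round(f1, 3))
```

Each model corresponds to a single point on a precision-recall curve; sweeping a classifier's decision threshold would trace out the full curve. With beta=2 the same function gives the F2-score, which weights recall more heavily than precision.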

Gradient descent updates the coefficients of a logistic regression model iteratively: at each step it moves them in the direction that reduces the cost function, until the model converges to coefficients that separate the classes based on the feature values.
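
A minimal sketch of that update loop for a single feature is shown below. The toy data, the learning rate, and the number of epochs are assumptions made for illustration, and the cost function is taken to be the average log loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(w, b, xs, ys):
    """Average negative log-likelihood of the labels under the model."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(w * x + b), eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)

def fit(xs, ys, lr=0.5, epochs=200):
    """Batch gradient descent on the log loss for one weight and one bias."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w  # step against the gradient
        b -= lr * grad_b
    return w, b

# Toy data (invented): negative x values are class 0, positive are class 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

initial_loss = log_loss(0.0, 0.0, xs, ys)
w, b = fit(xs, ys)
final_loss = log_loss(w, b, xs, ys)
print(initial_loss, final_loss, w, b)
```

On perfectly separable data like this the weight keeps growing as training continues, which is one reason practical implementations usually add regularization or a stopping criterion.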

Common techniques for dimensionality reduction include principal component analysis (PCA), which finds linear combinations of the variables that account for the largest variance in the data; autoencoder neural networks, which learn compressed representations through a bottleneck layer; and t-distributed stochastic neighbor embedding (t-SNE), a nonlinear technique used mainly to embed high-dimensional data in two or three dimensions for visualization.
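
The PCA idea can be sketched with the standard library alone. The example below estimates only the first principal component, using power iteration on the covariance matrix; this is a simplification chosen for brevity, not how a production library computes PCA. The two-feature data is invented so that the second feature is exactly twice the first, that is, fully redundant.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length feature rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(cov, iterations=100):
    """Power iteration: returns (unit direction, variance along it).

    Assumes cov is non-zero and the start vector is not orthogonal
    to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

# Invented measurements: the second feature is exactly 2x the first.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

cov = covariance_matrix(data)
direction, variance = first_principal_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_ratio = variance / total_variance
print(direction, explained_ratio)
```

Because the two features are perfectly correlated here, one component explains essentially all of the variance, so the data could be reduced from two dimensions to one with no loss.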

The phase of the design thinking process that involves conducting interviews and creating photo journals is the 'Empathize' phase. This phase focuses on understanding users through observation and engagement, gathering insights about their needs and experiences.

A request for a general AI tool to plug into a data warehouse may be unreasonable because there is currently no general AI tool that works universally across all tasks and data. AI solutions are typically tailored to specific applications and require integration with specific datasets and goals.

Machine learning systems differ from rule-based systems in that they can generalize better by learning from data, adapting to new patterns without predefined rules, whereas rule-based systems operate on a fixed set of conditions and often fail to generalize beyond the scenarios they were specifically coded for.

The first step when beginning the task of collecting initial data should be to understand the business requirement to determine what relevant data is needed. This ensures that the data collected aligns with the project's goals and prevents unnecessary data processing.

Adjusted R-squared is more reliable than R-squared when comparing models with different numbers of predictors, because it accounts for the number of predictors in the model. R-squared never decreases as variables are added, so it can overestimate goodness of fit; the adjustment penalizes extra variables and gives a more accurate measure of a model's explanatory power.
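
The adjustment can be made concrete with the usual formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of samples and p the number of predictors. The numbers below are invented for illustration.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R-squared for a model with an intercept."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need more samples than features plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

# Same R-squared of 0.90 on 50 samples, with 5 versus 20 predictors.
few = adjusted_r2(0.90, n_samples=50, n_features=5)
many = adjusted_r2(0.90, n_samples=50, n_features=20)
print(round(few, 4), round(many, 4))
```

With the same raw R-squared, the model that uses more predictors receives the lower adjusted score.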
