Module 1: Fundamentals of Machine Learning and AI
Q1. What is the primary objective of Machine Learning?
A. To make machines learn from data
B. To replace human intelligence
C. To store large datasets
D. To improve hardware performance
Correct Answer: A. To make machines learn from data
Q2. How does AI relate to Machine Learning?
A. ML is a subset of AI
B. AI is a subset of ML
C. AI and ML are completely unrelated
D. ML and AI are the same
Correct Answer: A. ML is a subset of AI
Q3. Which of the following is a goal of Machine Learning?
A. Creating new programming languages
B. Developing traditional databases
C. Automating decision-making
D. Replacing all human jobs
Correct Answer: C. Automating decision-making
Q4. What is a common misconception about AI?
A. AI is limited by hardware capabilities
B. AI will replace all human intelligence
C. AI cannot learn from data
D. AI does not require human intervention
Correct Answer: B. AI will replace all human intelligence
Q5. What is a major limitation of Machine Learning?
A. It does not require computational resources
B. It eliminates the need for human input
C. It depends on the quality and quantity of data
D. It works without algorithms
Correct Answer: C. It depends on the quality and quantity of data
Q6. What is one of the key goals of Machine Learning?
A. To manually program intelligence
B. To make accurate predictions from data
C. To store large amounts of information
D. To replace traditional programming
Correct Answer: B. To make accurate predictions from data
Q7. How do hardware limitations impact Machine Learning?
A. They only affect supervised learning
B. They do not affect ML at all
C. They limit processing speed and storage
D. They improve ML model accuracy
Correct Answer: C. They limit processing speed and storage
Q8. Why is defining Machine Learning important?
A. To create new programming languages
B. To eliminate manual processes
C. To set clear expectations and goals
D. To make computers self-aware
Correct Answer: C. To set clear expectations and goals
Q9. Which characteristic is crucial for ML algorithms?
A. Length of the source code
B. Number of software engineers
C. Size of the development team
D. Computational efficiency
Correct Answer: D. Computational efficiency
Q10. What is an unrealistic expectation of AI and ML?
A. Machines will fully replace human intelligence
B. ML models require data for training
C. AI has hardware limitations
D. AI can be used in multiple domains
Correct Answer: A. Machines will fully replace human intelligence
Module 2: Big Data and Data Sources
Q11. What is Big Data?
A. A large volume of structured and unstructured data
B. A database management system
C. A type of machine learning algorithm
D. A new programming language
Correct Answer: A. A large volume of structured and unstructured data
Q12. Which of the following is a source of Big Data?
A. Handwritten notes
B. Social media
C. Small-scale surveys
D. Personal diaries
Correct Answer: B. Social media
Q13. What does training a machine learning model mean?
A. Ignoring patterns in data
B. Building new computer hardware
C. Manually programming the rules
D. Providing data for the model to learn patterns
Correct Answer: D. Providing data for the model to learn patterns
Q14. Which of the following is an example of using existing data sources?
A. Ignoring data while training a model
B. Creating a new dataset from scratch
C. Using an open-source dataset for analysis
D. Manually labeling all data
Correct Answer: C. Using an open-source dataset for analysis
Q15. Why is the role of statistics important in Machine Learning?
A. It helps in understanding and processing data patterns
B. It eliminates the need for data
C. It replaces machine learning algorithms
D. It is only useful in manual calculations
Correct Answer: A. It helps in understanding and processing data patterns
Q16. What is a test data source in Machine Learning?
A. A dataset used to evaluate model performance
B. A dataset used for training models
C. A dataset used to store information
D. A dataset manually labeled by humans
Correct Answer: A. A dataset used to evaluate model performance
Q17. How do algorithms contribute to Machine Learning?
A. They define how models learn from data
B. They store data for future use
C. They eliminate the need for data processing
D. They only work with supervised learning
Correct Answer: A. They define how models learn from data
Q18. What does defining ‘training’ involve in ML?
A. Removing the need for datasets
B. Manually writing rules for models
C. Understanding how models improve with data
D. Making ML models without any input
Correct Answer: C. Understanding how models improve with data
Q19. Why is locating reliable test data sources important?
A. To increase dataset size without verification
B. To evaluate model performance effectively
C. To replace training data
D. To reduce computational requirements
Correct Answer: B. To evaluate model performance effectively
Q20. What is the purpose of building a new data source in ML?
A. To collect fresh, unbiased data
B. To remove the need for data preprocessing
C. To reduce the role of algorithms
D. To replace supervised learning methods
Correct Answer: A. To collect fresh, unbiased data
Module 3: Probability and Statistics in ML
Q21. What role does probability play in Machine Learning?
A. It helps quantify uncertainty in predictions
B. It replaces all data preprocessing
C. It is only used in deep learning
D. It is not relevant to ML models
Correct Answer: A. It helps quantify uncertainty in predictions
Q22. Which theorem is commonly used for probability-based ML models?
A. Central Limit Theorem
B. Pythagorean Theorem
C. Bayes’ Theorem
D. Chebyshev’s Theorem
Correct Answer: C. Bayes’ Theorem
Q23. How is probability used in Naïve Bayes classification?
A. To store dataset values
B. To compute class probabilities based on feature likelihoods
C. To replace training data
D. To generate random numbers
Correct Answer: B. To compute class probabilities based on feature likelihoods
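To illustrate Q23, here is a minimal sketch of a categorical Naïve Bayes classifier. The toy weather data, the function names, and the add-one smoothing with two possible values per feature are all assumptions made for the example, not part of the quiz material.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (weather, wind) -> decision.
training_rows = [
    (("sunny", "weak"), "play"),
    (("sunny", "strong"), "stay"),
    (("rainy", "weak"), "play"),
    (("rainy", "strong"), "stay"),
    (("sunny", "weak"), "play"),
]

def train(rows):
    """Count class frequencies and per-feature value frequencies."""
    class_counts = Counter(label for _, label in rows)
    # feature_counts[(position, label)][value] -> number of occurrences
    feature_counts = defaultdict(Counter)
    for features, label in rows:
        for position, value in enumerate(features):
            feature_counts[(position, label)][value] += 1
    return class_counts, feature_counts

def predict(features, class_counts, feature_counts, n_values=2):
    """Return normalized class probabilities (add-one smoothing assumed)."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for position, value in enumerate(features):
            seen = feature_counts[(position, label)][value]
            # smoothed likelihood P(value | class)
            score *= (seen + 1) / (count + n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

class_counts, feature_counts = train(training_rows)
probabilities = predict(("sunny", "weak"), class_counts, feature_counts)
print(probabilities)
```

Each class score is the prior multiplied by the per-feature likelihoods, which is the "naïve" independence assumption; normalizing the scores turns them into class probabilities.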
Q24. What does conditioning a probability with Bayes’ theorem mean?
A. Ignoring prior probabilities
B. Updating probabilities based on new evidence
C. Removing features from datasets
D. Replacing statistics with AI
Correct Answer: B. Updating probabilities based on new evidence
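The updating described in Q24 can be sketched in a few lines. The spam-filter numbers below are hypothetical, and applying the update twice assumes the two pieces of evidence are independent given the hypothesis.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H) via Bayes' theorem."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 1% of emails are spam; a keyword appears in
# 90% of spam emails and in 5% of legitimate ones.
after_one = posterior(0.01, 0.90, 0.05)
# Seeing a second, independent keyword with the same rates updates again,
# using the previous posterior as the new prior.
after_two = posterior(after_one, 0.90, 0.05)
print(after_one, after_two)
```

Each new piece of evidence moves the probability further from the 1% prior, which is what "updating probabilities based on new evidence" means in practice.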
Q25. Why is optimization important when working with Big Data?
A. To replace all statistical methods
B. To eliminate the need for training models
C. To handle large-scale datasets efficiently
D. To remove biases from data
Correct Answer: C. To handle large-scale datasets efficiently
Q26. What is the purpose of statistical analysis in ML?
A. To identify patterns and trends in data
B. To create datasets manually
C. To eliminate all errors
D. To replace algorithms
Correct Answer: A. To identify patterns and trends in data
Q27. Which of the following is a key challenge in working with Big Data?
A. Data storage and processing speed
B. Reducing dataset size manually
C. Eliminating probabilities
D. Replacing feature selection
Correct Answer: A. Data storage and processing speed
Q28. Which type of learning finds patterns in data without labeled examples?
A. Unsupervised learning
B. Supervised learning
C. Reinforcement learning
D. Feature engineering
Correct Answer: A. Unsupervised learning
Q29. How does probability affect decision-making in ML models?
A. It allows models to predict outcomes based on likelihood
B. It removes the need for feature selection
C. It eliminates biases from datasets
D. It ensures 100% accuracy in predictions
Correct Answer: A. It allows models to predict outcomes based on likelihood
Q30. What is the learning process in Machine Learning?
A. Hardcoding outputs in the model
B. Removing all data from datasets
C. Manually adjusting predictions
D. Improving model accuracy through experience
Correct Answer: D. Improving model accuracy through experience
Module 4: Model Validation and Testing
Q31. What is the purpose of model validation in Machine Learning?
A. To ignore out-of-sample errors
B. To replace training data with test data
C. To manually adjust model parameters
D. To assess model performance on unseen data
Correct Answer: D. To assess model performance on unseen data
Q32. What is an out-of-sample error?
A. Error caused by overfitting
B. Error within the training dataset
C. Error on data not used during training
D. Error due to missing values
Correct Answer: C. Error on data not used during training
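As a rough illustration of Q32, the sketch below compares in-sample and out-of-sample error for a 1-nearest-neighbour predictor. The data, the noisy label at x = 4, and the helper names are invented for the example.

```python
def nearest_label(x, examples):
    """Predict the label of the training example closest to x."""
    return min(examples, key=lambda ex: abs(ex[0] - x))[1]

def error_rate(data, examples):
    """Fraction of points in data that the predictor gets wrong."""
    wrong = sum(1 for x, y in data if nearest_label(x, examples) != y)
    return wrong / len(data)

# Hypothetical 1-D data: the true rule is label 1 when x >= 5,
# but the training set contains one noisy label at x = 4.
train_set = [(1, 0), (2, 0), (3, 0), (4, 1), (6, 1), (7, 1), (8, 1)]
test_set = [(0.5, 0), (3.8, 0), (4.4, 0), (5.5, 1), (9, 1)]

in_sample_error = error_rate(train_set, train_set)
out_of_sample_error = error_rate(test_set, train_set)
print(in_sample_error, out_of_sample_error)
```

The predictor memorizes the training set, so its in-sample error is zero, while the memorized noisy label causes mistakes on the unseen test points.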
Q33. Which method helps detect and guard against overfitting?
A. Cross-validation
B. Increasing model complexity
C. Using the entire dataset for training
D. Reducing training data
Correct Answer: A. Cross-validation
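A minimal sketch of how k-fold cross-validation partitions a dataset is shown below. The function name is hypothetical, and the folds are contiguous for readability; in practice the indices are usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, validation
        start += size

folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Every sample is held out exactly once, so each score is computed on data the model did not see during that round of training.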
Q34. What is the purpose of training a Machine Learning model?
A. To learn patterns from data
B. To optimize test data
C. To eliminate the need for validation
D. To create manual rules
Correct Answer: A. To learn patterns from data
Q35. What is a common issue in dataset validation?
A. Sample bias
B. Excessive data collection
C. Increasing training accuracy
D. Removing labels from test data
Correct Answer: A. Sample bias
Q36. Why is cross-validation important in ML?
A. To check that the model generalizes to unseen data
B. To maximize model overfitting
C. To eliminate testing needs
D. To remove data biases
Correct Answer: A. To check that the model generalizes to unseen data
Q37. What is the goal of validating a Machine Learning model?
A. To increase dataset size
B. To replace training with testing
C. To verify its ability to make accurate predictions
D. To ignore test accuracy
Correct Answer: C. To verify its ability to make accurate predictions
Q38. What does sample bias refer to?
A. Overfitting to test data
B. Random errors in the dataset
C. Missing values in training data
D. The dataset not representing the actual population
Correct Answer: D. The dataset not representing the actual population
Q39. What happens when a model falls into a leakage trap?
A. Validation scores become more trustworthy
B. Model generalization improves
C. Information from the test set leaks into training, inflating performance estimates
D. Overfitting issues are eliminated
Correct Answer: C. Information from the test set leaks into training, inflating performance estimates
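One common way leakage creeps in is through preprocessing statistics computed on the full dataset. The sketch below, with made-up values and hypothetical helper names, contrasts a leaky scaler with one fitted on the training split only.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Return the (mean, standard deviation) used for scaling."""
    return mean(values), pstdev(values)

def apply_scaler(values, params):
    """Scale values with previously fitted parameters."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]

# Hypothetical feature values.
train_values = [10.0, 12.0, 14.0, 16.0]
test_values = [30.0, 34.0]

# Leaky: the statistics already "know" about the test set.
leaky_params = fit_scaler(train_values + test_values)

# Safe: fit on the training split only, then reuse on the test split.
safe_params = fit_scaler(train_values)
scaled_test = apply_scaler(test_values, safe_params)
print(leaky_params, safe_params, scaled_test)
```

The two parameter sets differ, which shows that the leaky version lets test-set information shape what the model sees during training.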
Q40. What is the purpose of testing in ML?
A. To increase training accuracy
B. To evaluate the model’s real-world performance
C. To remove bias from data
D. To improve computational efficiency
Correct Answer: B. To evaluate the model’s real-world performance
Module 5: Data Preprocessing and Feature Engineering
Q41. What is the purpose of data preprocessing in Machine Learning?
A. To generate new models
B. To clean and transform raw data for analysis
C. To store data without modification
D. To manually adjust data values
Correct Answer: B. To clean and transform raw data for analysis
Q42. Which technique is used to handle missing data?
A. Imputation
B. Feature scaling
C. One-hot encoding
D. Principal Component Analysis
Correct Answer: A. Imputation
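As a small illustration of Q42, the sketch below fills missing entries with the mean of the observed values. Mean imputation is only one of several strategies; the data and function name are assumptions for the example.

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]

# Hypothetical column with two missing ages.
ages = [25, None, 35, 30, None]
imputed = impute_mean(ages)
print(imputed)
```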
Q43. How do we measure similarity between vectors?
A. Feature elimination
B. Sorting techniques
C. Matrix multiplication
D. Using distance metrics
Correct Answer: D. Using distance metrics
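Two widely used measures for Q43 are Euclidean distance and cosine similarity. The sketch below implements both with the standard library; the example vectors are arbitrary.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

u = [3.0, 4.0]
v = [6.0, 8.0]
print(euclidean_distance(u, v), cosine_similarity(u, v))
```

The two vectors point in the same direction, so their cosine similarity is at its maximum even though they are a nonzero distance apart. Which measure fits best depends on whether magnitude or direction matters for the task.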
Q44. Which of the following is an example of an anomaly detection technique?
A. Apriori Algorithm
B. Linear Regression
C. K-Means Clustering
D. Isolation Forest
Correct Answer: D. Isolation Forest
Q45. Why is repairing missing data crucial in ML?
A. To increase dataset size artificially
B. To remove unnecessary features
C. To prevent biased model training
D. To reduce the number of algorithms required
Correct Answer: C. To prevent biased model training
Q46. What is the purpose of transforming data distributions?
A. To speed up computation
B. To reduce dataset size
C. To eliminate irrelevant features
D. To normalize or standardize features
Correct Answer: D. To normalize or standardize features
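The two transformations named in Q46 can be sketched as follows. The income values are hypothetical, and the population standard deviation is an arbitrary choice for the example.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max_normalize(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

incomes = [20.0, 40.0, 60.0, 80.0]
z_scores = standardize(incomes)
scaled = min_max_normalize(incomes)
print(z_scores, scaled)
```

Both functions assume the values are not all identical; otherwise the denominator would be zero.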
Q47. Which of the following measures is commonly used to find clusters in ML?
A. Entropy
B. Mean Squared Error
C. Log-Likelihood
D. Euclidean distance
Correct Answer: D. Euclidean distance
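To show how Euclidean distance drives clustering, here is a sketch of the assignment step used by centroid-based methods such as K-Means: each point joins its nearest centroid. The points and centroids are made up, and a full algorithm would also recompute the centroids and repeat.

```python
import math

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of the closest centroid (Euclidean)."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]

# Hypothetical 2-D points and two fixed centroids.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centroids = [(1.0, 1.5), (8.5, 9.0)]
labels = assign_to_nearest(points, centroids)
print(labels)
```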
Q48. What is feature engineering?
A. Creating new relevant features from raw data
B. Eliminating redundant features
C. Manually labeling all data
D. Ignoring outliers
Correct Answer: A. Creating new relevant features from raw data
Q49. What is the benefit of identifying and isolating anomalous data (outliers)?
A. To improve model accuracy by removing noise
B. To reduce dataset size
C. To replace feature selection
D. To increase data complexity
Correct Answer: A. To improve model accuracy by removing noise
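A simple way to separate anomalous values is a modified z-score based on the median absolute deviation (MAD). This is a basic rule of thumb, not the Isolation Forest technique from Q44; the readings, the function name, and the 3.5 threshold are assumptions for the example.

```python
from statistics import median

def split_outliers(values, threshold=3.5):
    """Separate values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation; assumed non-zero for this sketch.
    mad = median(abs(v - med) for v in values)
    inliers, outliers = [], []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        (outliers if score > threshold else inliers).append(v)
    return inliers, outliers

# Hypothetical sensor readings with one obvious anomaly.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0]
inliers, outliers = split_outliers(readings)
print(inliers, outliers)
```

Using the median instead of the mean keeps the rule from being distorted by the very outliers it is trying to find.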
Q50. How does data gathering impact Machine Learning?
A. It eliminates biases completely
B. It replaces feature engineering
C. It ensures diverse and relevant data is collected
D. It reduces the need for training
Correct Answer: C. It ensures diverse and relevant data is collected