
GUI Machine Learning with Orange3 Guide

This document is a lab handout for a course on Deep Learning at the Sukkur Institute of Business Administration, focusing on GUI-based machine learning using Orange3. It outlines lab activities, including installing Orange3, importing data, building models, and evaluating performance, specifically using the MNIST dataset for digit classification. The handout also includes lab questions and exercises to reinforce learning objectives.

Uploaded by

Erfaan Mughal
Copyright
© All Rights Reserved

Sukkur Institute of Business Administration University

Department of Computer Systems Engineering


Deep Learning
Handout # 04:
Introduction to GUI-Based Machine Learning with Orange3

Lab Conduction Date: ___________________

Instructor: Engr. Muhammad Irfan Younas

Note: Submit this lab hand-out in the next lab with attached solved activities and exercises

S. No. | Criterion | 0.5 | 0.25 | 0.125 | Score
1 | Accuracy | Desired output | Minor mistake | Critical mistake |
2 | Timing | Submitted within the given time | 1 day late | More than 3 days |
Total Score Achieved:

Submission Profile

Name: Submission date:

Marks obtained: Receiving authority name and signature:

Comments:

________________________________________________________________________________

Instructor Signature
Introduction to GUI-Based Machine Learning
Graphical User Interface (GUI)-based machine learning software provides an intuitive, visual
approach to building and analyzing machine learning models without requiring extensive
programming knowledge. These tools simplify data preprocessing, model selection, and
evaluation using drag-and-drop interfaces.
Common GUI-Based Machine Learning Tools
Several GUI-based machine learning platforms are widely used for educational and professional
purposes:
1. Orange3
• An open-source data visualization and machine learning platform.
• Provides interactive workflows for data analysis.
• Supports classification, clustering, and regression tasks.
• Well-suited for beginners and educational purposes.
2. WEKA (Waikato Environment for Knowledge Analysis)
• Developed at the University of Waikato, New Zealand.
• Offers various machine learning algorithms for data mining.
• Includes tools for pre-processing, classification, clustering, and association rule mining.
3. KNIME (Konstanz Information Miner)
• A powerful tool for data analytics and machine learning.
• Provides a drag-and-drop workflow interface.
• Suitable for big data processing and integration with Python and R.
4. RapidMiner
• A user-friendly tool with a strong focus on predictive analytics.
• Includes automated machine learning (AutoML) features.
• Supports deep learning, time-series analysis, and text mining.
5. Google AutoML
• A cloud-based machine learning tool from Google.
• Designed for non-experts to build custom models with minimal coding.
• Focuses on image, text, and tabular data analysis.
Why Orange3?
Orange3 is selected for this lab due to its ease of use, interactive visualizations, and strong
educational focus. It allows users to explore machine learning workflows without writing code,
making it an ideal choice for students and beginners.

Common GUI-Based ML Tools
Tool | Description | Strengths | Tutorial Link
Orange3 | Open-source, visual programming tool for ML and data mining. | Drag-and-drop interface, great for education. | Getting Started with Orange
WEKA | Java-based ML tool with many built-in algorithms. | Strong in feature selection and classification tasks. | Weka tutorial
KNIME | Advanced data analytics platform with ML integration. | Great for enterprise applications, workflow-based. | Machine Learning with KNIME
RapidMiner | Business-oriented data science tool with automation. | Suitable for real-world business analytics. | RapidMiner Studio Tutorial Videos
Google AutoML | Cloud-based ML model automation by Google. | Best for automated ML with minimal expertise. | Easy Path to Machine Learning on Google

Among these, Orange3 is ideal for teaching machine learning due to its interactive visual
programming and easy workflow-based model building.

2. Lab Activities Using Orange3


Activity 1: Installing Orange3
1. Download and install Orange3 from [Link]
2. Launch the Orange3 application.
3. Explore the interface, including the workspace, widgets, and available tools.
Activity 2: Importing and Exploring Data
1. Open Orange3 and add the "File" widget to the workspace.
2. Load a sample dataset (e.g., "Iris" or "Titanic").
3. Connect the "File" widget to the "Data Table" widget.
4. Examine the dataset, its attributes, and statistics.
Activity 3: Data Preprocessing
1. Add the "Select Columns" widget to filter relevant features.
2. Use the "Data Sampler" widget to split data into training and test sets.
3. Apply the "Feature Constructor" widget (named "Formula" in recent Orange3 releases) to create new attributes.
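
For students who want to see what column selection and feature construction amount to outside the GUI, the following is a minimal Python sketch. It is not Orange3's implementation; the rows, the helper names `select_columns` and `add_feature`, and the `petal_area` attribute are all made up for illustration.

```python
# Toy rows standing in for a loaded dataset (values are illustrative only).
rows = [
    {"sepal_length": 5.1, "petal_length": 1.4, "petal_width": 0.2, "species": "setosa"},
    {"sepal_length": 6.4, "petal_length": 4.5, "petal_width": 1.5, "species": "versicolor"},
    {"sepal_length": 6.3, "petal_length": 6.0, "petal_width": 2.5, "species": "virginica"},
]

def select_columns(rows, keep):
    """Keep only the named columns, like filtering features in the GUI."""
    return [{name: row[name] for name in keep} for row in rows]

def add_feature(rows, name, func):
    """Add a new column computed from the existing values of each row."""
    return [{**row, name: func(row)} for row in rows]

selected = select_columns(rows, ["petal_length", "petal_width", "species"])
prepared = add_feature(
    selected,
    "petal_area",  # hypothetical constructed attribute
    lambda r: round(r["petal_length"] * r["petal_width"], 2),
)
```

Each row in `prepared` keeps the two petal measurements and the class label, plus the constructed attribute.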

Activity 4: Building a Classification Model
1. Add the "Logistic Regression", "Random Forest", and "SVM" widgets.
2. Connect the models to the "Test & Score" widget.
3. Compare the classification performance using accuracy, precision, and recall.
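
When several learners are connected to "Test & Score", each one is trained and evaluated on the same data splits so that the scores are comparable. The sketch below illustrates that idea with k-fold cross-validation in plain Python. The two toy learners (a majority-class baseline and a 1-nearest-neighbour rule on a single feature) and the data are assumptions chosen to keep the example short; they stand in for the widgets named above and are not Orange3 code.

```python
from collections import Counter

def k_fold_indices(n, k):
    """Yield (train, test) index lists; fold i tests every k-th item starting at i."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test

def majority_learner(train_x, train_y):
    """Baseline: always predict the most frequent training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common

def nearest_neighbour_learner(train_x, train_y):
    """Predict the label of the closest training value (1-NN on one feature)."""
    def predict(x):
        nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
        return train_y[nearest]
    return predict

def cross_val_accuracy(learner, xs, ys, k=5):
    """Fraction of items classified correctly when each item is tested once."""
    correct = 0
    for train, test in k_fold_indices(len(xs), k):
        model = learner([xs[i] for i in train], [ys[i] for i in train])
        correct += sum(model(xs[i]) == ys[i] for i in test)
    return correct / len(xs)

xs = [1.0, 1.2, 1.1, 0.9, 1.3, 3.0, 3.2, 2.9, 3.1, 3.3]
ys = ["a"] * 5 + ["b"] * 5
scores = {
    "majority": cross_val_accuracy(majority_learner, xs, ys),
    "1-NN": cross_val_accuracy(nearest_neighbour_learner, xs, ys),
}
```

On this toy data the baseline is right half of the time, while the nearest-neighbour rule classifies every item correctly, which is the kind of side-by-side comparison the widget's score table provides.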
Activity 5: Model Evaluation and Visualization
1. Use the "Confusion Matrix" widget to analyze classification results.
2. Add the "ROC Analysis" widget to compare model performance.
3. Inspect how well the classes separate in feature space using the "Scatter Plot" widget.
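
ROC analysis compares models by how well their scores rank positive examples above negative ones. The area under the ROC curve (AUC) equals the fraction of positive/negative pairs that are ranked correctly, which the sketch below computes directly. The labels and the scores of the two models are invented for illustration, and the function name `roc_auc` is our own.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]  # hypothetical scores for the positive class
model_b = [0.6, 0.3, 0.7, 0.8, 0.4, 0.2]

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
```

Here model A ranks 8 of the 9 pairs correctly and model B only 5, so A's curve would lie closer to the top-left corner of the ROC plot.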
Activity 6: Clustering and Unsupervised Learning
1. Load an unlabeled dataset (or one whose class column you ignore) using the "File" widget.
2. Apply the "k-Means" clustering widget.
3. Visualize clusters with the "Silhouette Plot" and "Scatter Plot" widgets.
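
To make the clustering activity concrete, here is a small plain-Python sketch of k-means (Lloyd's algorithm) followed by the silhouette value of one point. The six points and the starting centroids are assumptions chosen so the result is easy to verify by hand; the silhouette helper covers only the two-cluster case and expects the point's own cluster to contain at least two points.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm with caller-supplied starting centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

def silhouette(point, own, other):
    """Silhouette of one point: (b - a) / max(a, b), two-cluster case only."""
    a = sum(math.dist(point, q) for q in own if q != point) / (len(own) - 1)
    b = sum(math.dist(point, q) for q in other) / len(other)
    return (b - a) / max(a, b)

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
score = silhouette(points[0], clusters[0], clusters[1])
```

A silhouette close to 1 means the point sits well inside its own cluster; values near 0 or below suggest it lies between clusters or was assigned to the wrong one.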

Activity 7: Saving and Exporting Results


1. Use the "Save Data" widget to export processed data.
2. Save and share Orange3 workflows for future use.

Classifying Handwritten Digits (MNIST Dataset)


In this project, we will use Orange3 to classify handwritten digits from the MNIST dataset.
The MNIST dataset consists of 28x28 grayscale images of digits (0-9).
We will train a Deep Learning model using Orange3’s built-in tools.
Step 1: Install and Launch Orange3
1. Download and install Orange3 from [Link]
2. Open Orange3 and select "New" to create a new workflow.
3. Install the "Image Analytics" add-on:
o Click Options → Add-ons
o Search for Image Analytics and install it.

Step 2: Load the MNIST Dataset


1. Drag and drop the "Import Images" widget into the workspace.
2. Download the MNIST dataset from
[Link]

3. Extract the dataset and browse to the folder containing digit images.
4. Click on the Import Images widget and ensure the images are loaded.

Step 3: Convert Images to Features


1. Drag and drop the "Image Embedding" widget.
2. Connect Import Images → Image Embedding.
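3. Open the Image Embedding widget and choose "ResNet-50" as the embedding
model (a pre-trained deep learning model). If "ResNet-50" is not listed in your version of the add-on, choose one of the embedders it does offer, such as "Inception v3" or "SqueezeNet".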
4. Click Apply to extract numerical features from the images.

Step 4: Assign Labels to Images


1. Drag and drop the "Data Table" widget to inspect the dataset.
2. Drag and drop the "Select Columns" widget.
3. Connect Image Embedding → Select Columns.
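4. Set the target (class) variable to the column containing digit labels. Import Images creates a "category" column from the subfolder names, so the images should be stored in one subfolder per digit.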
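
The labelling step relies on the folder layout: each image takes its label from the folder that contains it. The short sketch below shows that mapping in Python. The file paths are hypothetical, and `label_from_path` is our own helper, not part of Orange3.

```python
from pathlib import PurePosixPath

def label_from_path(path):
    """Return the name of the folder that directly contains the image."""
    return PurePosixPath(path).parent.name

# Hypothetical file layout: one subfolder per digit.
paths = ["mnist/0/img_0001.png", "mnist/7/img_0420.png", "mnist/7/img_0421.png"]
labels = [label_from_path(p) for p in paths]
```

If all images sit in a single flat folder, there is no label to derive, which is why the folder structure matters before importing.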

Step 5: Split Data into Training and Testing Sets


1. Drag and drop the "Data Sampler" widget.
2. Connect Select Columns → Data Sampler.
3. Select "Fixed proportion of data" and set it to 80%: the sampled 80% is used for training and the remaining 20% for testing.
4. Click Apply to split the data.
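
An 80/20 split is most useful when every digit appears in both parts in the same proportion (a stratified split). The sketch below shows one way to do this in plain Python. The function name, the fixed seed, and the toy label list are assumptions for illustration; it is not the Data Sampler's own code.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices so each class contributes the same fraction to the training set."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)  # fixed seed makes the split repeatable
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)

labels = [digit for digit in range(10) for _ in range(10)]  # 10 samples per digit
train_idx, test_idx = stratified_split(labels)
```

With 10 samples per digit, each digit ends up with 8 training and 2 test samples, and no index appears in both lists.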

Step 6: Train a Neural Network Model


1. Drag and drop the "Neural Network" widget.
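2. Connect Data Sampler → Neural Network using the "Data Sample" output (the 80% training portion).
3. Open the Neural Network widget and:
o Set "Neurons in hidden layers" to two comma-separated values (e.g., 100,100) to get 2 hidden layers.
o Set "Maximal number of iterations" to 20 (the widget's counterpart to epochs).
o Click Apply to train the model.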
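
To see what "2 hidden layers" means computationally, the sketch below runs one input through a tiny fully connected network: two hidden layers with ReLU activation followed by a softmax output. The weights are hand-picked for illustration rather than learned, the layer sizes are far smaller than anything used on MNIST, and training (the part the iterations setting controls) is deliberately left out.

```python
import math

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer: weights[j] holds the weights of output neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Apply ReLU after every hidden layer and softmax after the output layer."""
    *hidden, output = layers
    for weights, biases in hidden:
        inputs = relu(dense(inputs, weights, biases))
    return softmax(dense(inputs, *output))

# Illustrative weights: 2 inputs -> 3 neurons -> 2 neurons -> 2 classes.
layers = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, -1.0]),
    ([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]
probabilities = forward([2.0, 1.0], layers)
```

The output is one probability per class; the predicted class is the one with the largest probability.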

Step 7: Test the Model on New Images


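1. Drag and drop the "Test & Score" widget.
2. Connect Neural Network → Test & Score (the Learner signal).
3. Connect Data Sampler → Test & Score twice: "Data Sample" to the Data input and "Remaining Data" to the Test Data input.
4. In Test & Score, select "Test on test data"; the widget then evaluates the model and reports metrics such as: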
o Accuracy
o Precision
o Recall
o F1 Score
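
For a ten-class problem such as digit recognition, precision, recall, and F1 are computed per class and then combined into a single number. The sketch below uses a three-class toy example and a support-weighted average. The label lists are invented, and whether your Orange3 version averages in exactly this way is an assumption, so treat the numbers as an illustration of the definitions only.

```python
from collections import Counter

def per_class_scores(actual, predicted):
    """Precision, recall and F1 for every class that occurs in `actual`."""
    scores = {}
    for cls in sorted(set(actual)):
        tp = sum(a == cls and p == cls for a, p in zip(actual, predicted))
        fp = sum(a != cls and p == cls for a, p in zip(actual, predicted))
        fn = sum(a == cls and p != cls for a, p in zip(actual, predicted))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[cls] = {"precision": precision, "recall": recall, "f1": f1}
    return scores

def weighted_average(scores, actual, metric):
    """Average a per-class metric, weighting each class by its number of true samples."""
    support = Counter(actual)
    return sum(scores[cls][metric] * support[cls] for cls in scores) / len(actual)

actual    = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
predicted = [0, 0, 1, 1, 1, 1, 2, 2, 2, 0]

scores = per_class_scores(actual, predicted)
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
avg_recall = weighted_average(scores, actual, "recall")
avg_f1 = weighted_average(scores, actual, "f1")
```

Note that the support-weighted recall equals the overall accuracy, while weighted precision and F1 generally differ from it.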

Step 8: Make Predictions on New Images


1. Drag and drop the "Predictions" widget.
2. Connect Neural Network → Predictions.
3. Connect Data Sampler → Predictions using the "Remaining Data" output (the 20% test portion).
4. Click on Predictions to view the model’s classification results.

Step 9: Visualize Results


1. Drag and drop the "Confusion Matrix" widget.
2. Connect Test & Score → Confusion Matrix.
3. Open the Confusion Matrix to analyze model performance.
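
The confusion matrix is a table of counts: rows are the actual digits, columns are the predicted digits. The sketch below builds one from two label lists in plain Python. The labels are invented for illustration, and only three digits are used to keep the table small.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {cls: i for i, cls in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

actual    = [3, 3, 5, 5, 5, 8, 8, 8]
predicted = [3, 5, 5, 5, 3, 8, 8, 3]
matrix = confusion_matrix(actual, predicted, classes=[3, 5, 8])
```

Correct predictions lie on the diagonal. In this toy example the first column shows that a 5 and an 8 were each mistaken for a 3 once, the kind of pattern to look for when analyzing the digits the model confuses.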

Final Observations
• The ResNet-50 embeddings help extract meaningful features from images.
• The Neural Network classifier is expected to reach high accuracy on handwritten digits; check the actual figure in Test & Score.
• If accuracy is low, try:
o Increasing the number of epochs.
o Using a different embedding model.
o Adjusting hidden layer size.

Lab Questions
1. What are the advantages of GUI-based machine learning tools over coding-based
implementations?
2. Why is Orange3 particularly useful for beginners in machine learning?
3. How can the "Test & Score" widget help evaluate classification models?
4. What is the role of the "ROC Analysis" widget in model evaluation?
5. Why can clustering be applied to a dataset that has no class labels?
6. What is the role of the Image Embedding widget?
7. Why do we split data into training and testing sets?
8. What is the significance of the Confusion Matrix?
9. How does changing the number of hidden layers impact performance?
10. What happens if we replace Neural Network with kNN or SVM?

4. Lab Exercises
Q1: How do you load and visualize a dataset in Orange3?
Q2: How can you preprocess data in Orange3?
Q3: How do you train a classification model in Orange3?
Q4: How do you visualize data distributions?
Q5: How do you compare multiple ML models in Orange3?

Common questions


GUI-based machine learning tools like Orange3 offer an intuitive, visual approach to building and analyzing machine learning models without requiring extensive programming knowledge; they simplify data preprocessing, model selection, and evaluation through drag-and-drop interfaces. This makes them particularly beneficial for beginners, who can explore machine learning concepts and workflows interactively without writing complex code.

Orange3 simplifies the process of building machine learning models by offering a drag-and-drop interface, which allows users to perform data preprocessing, model training, and evaluation through interactive workflows rather than writing complex code. This graphical approach facilitates understanding and reduces the cognitive load associated with coding, making the model-building process more accessible to users with limited programming skills.

Machine learning models can be compared effectively in Orange3 using the 'Test & Score' widget, which reports metrics such as accuracy, precision, recall, F1 score, and AUC (area under the ROC curve). These metrics allow users to quantitatively assess and contrast model performance. Additionally, visual tools such as the Confusion Matrix and ROC Analysis widgets show where each model is strong or weak, supporting an informed choice between models.

The 'Select Columns' widget in Orange3 allows users to choose the relevant features from the dataset, so that only pertinent data is used for model training. The 'Data Sampler' widget complements this by dividing the data into training and testing subsets. Evaluating on the held-out subset makes it possible to detect overfitting, because the model is scored on data it did not see during training.

The 'Image Embedding' widget in Orange3 plays a crucial role in extracting numerical features from images by using pre-trained models like ResNet-50. It transforms raw image data into a format that can be processed by machine learning algorithms. This step is necessary to convert high-dimensional image data into a lower-dimensional feature space, enabling efficient model training and improving the ability to classify images from the MNIST dataset.

Varying the number of hidden layers in a neural network model can significantly impact both the complexity and learning capacity of the model. Increasing the number of hidden layers can enable the network to capture more complex patterns in the data, potentially improving performance. However, it may also lead to overfitting if the model becomes too complex, requiring careful tuning and validation to maintain a balance between bias and variance.

The choice of embedding models such as ResNet-50 impacts classification performance significantly by determining the quality and type of features extracted from the image data. ResNet-50's sophisticated architecture captures intricate patterns and details in images, allowing for more accurate and robust feature representations. This enhanced feature quality can lead to improved accuracy and generalization of the classification model, enabling it to better distinguish between different classes in the MNIST dataset.

A confusion matrix is important in evaluating the performance of a classification model because it provides detailed insights into the model's predictions, including true positives, false negatives, false positives, and true negatives. This helps identify not only the overall accuracy but also areas where the model might be misclassifying or biased towards certain classes, allowing for more informed adjustments and improvements.

The 'Test & Score' widget in Orange3 assists in model evaluation by providing metrics such as accuracy, precision, recall, and F1 score, which are crucial for assessing a model's performance. These metrics offer insights into the model's ability to generalize to new data, thereby helping users identify areas for improvement or optimization. Evaluating models using these metrics is essential to ensure that the models function effectively in real-world scenarios.

Splitting data into training and testing sets is critical because it allows for an unbiased evaluation of the model's performance on unseen data, which is essential for assessing its generalization capabilities. Neglecting this step hides overfitting: a model can score well on the data it was trained on and still fail to generalize to new instances, resulting in poor real-world performance.
