
Predicting 3D Print Quality with Regression

This assignment report from Ho Chi Minh University of Technology focuses on developing a linear regression model to predict the quality and strength of 3D prints based on various printer settings. The study utilizes a dataset from TR/Selcuk University, analyzing nine setting parameters and three output parameters through statistical methods including ANOVA and regression analysis. The report outlines the methodology for data preprocessing, analysis, and the significance of statistical tools in mechanical engineering applications.

Uploaded by

Minh Triết Huỳnh


HO CHI MINH UNIVERSITY OF TECHNOLOGY

FACULTY OF MECHANICAL ENGINEERING

ASSIGNMENT REPORT

TITLE:
LINEAR REGRESSION MODEL FOR PREDICTING 3D PRINT QUALITY AND
STRENGTH
COURSE: PROBABILITY AND STATISTICS (MT2013)
SUPERVISOR: PHAN THỊ HƯỜNG
CLASS: CC07
GROUP: 2

Ho Chi Minh City, 2024

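CONTRIBUTION TABLE

| No. | Student ID | Name | Tasks | Contribution |
|-----|------------|------|-------|--------------|
| 1 | | | | 20% |
| 2 | | | | 20% |
| 3 | | | | 20% |
| 4 | | | | 20% |
| 5 | 2252838 | Huỳnh Minh Triết | Code and Data Availability + Write Report | 20% |

Comment Evaluations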
INTRODUCTION
Probability and statistics are two branches of mathematics which involve
collecting, analyzing, and interpreting numerical data to make informed decisions and
predictions. Together, they provide powerful tools for understanding uncertainty,
identifying patterns, and making evidence-based decisions in various fields in both social
and natural science. By quantifying uncertainty and uncovering insights from data,
probability and statistics enable deeper understanding of a subject.
Precision and accuracy are paramount in mechanical engineering in general and in
3D printing in particular. Statistics plays a crucial role here by providing tools to
analyze and interpret data from experiments, simulations, and real-world observations.
By applying the theory and practice of ANOVA and the linear regression model used
in statistical analysis, this study aims to determine how strongly the adjustable
parameters of a 3D printer affect print quality, accuracy and strength.
1. DATA INTRODUCTION
1.1. Dataset Description
₋ Title: 3D Printer Dataset for Mechanical Engineers
₋ Access Link: [Link]
₋ Source information:
+ Dataset comes from research by TR/Selcuk University Mechanical
Engineering department.
+ Work is based on the Ultimaker S5 3-D printer settings and filaments.
+ Material and strength tests were carried out on a Sincotec GMBH tester
capable of pulling 20 kN.
+ There are nine setting parameters and three measured output parameters
(described in section 1.2).
+ Number of observations/measurements: 50
1.2. Variables Description

| Variable | Type | Unit | Description |
|----------|------|------|-------------|
| Setting Parameters | | | |
| Layer Height | Quantitative | mm | Vertical thickness of each layer deposited by the 3D printer |
| Wall Thickness | Quantitative | mm | Width of walls or outer shell of printed object |
| Infill Density | Quantitative | % | Percentage of interior space filled with material |
| Infill Pattern | Categorical | none | Geometric pattern used to fill interior of the object |
| Nozzle Temperature | Quantitative | ℃ | Temperature of extruder nozzle |
| Bed Temperature | Quantitative | ℃ | Temperature of print bed |
| Print Speed | Quantitative | mm/s | Speed at which printer's nozzle moves along X, Y, and Z axes |
| Material | Categorical | none | Type of material used for printing |
| Fan Speed | Quantitative | % | Speed of cooling fan |
| Output Parameters (Measured) | | | |
| Roughness | Quantitative | µm | Surface roughness or texture of printed object |
| Tension (ultimate) Strength | Quantitative | MPa | Maximum stress printed object can withstand before breaking under tension |
| Elongation | Quantitative | % | Percentage by which printed object can stretch or deform before breaking |
1.3. Procedure Steps
₋ Step 1: Data preprocessing
+ Data reading
+ Dealing with missing data, data transformation
+ Defining new features
+ Adding, removing, converting variables (if necessary)
₋ Step 2: Descriptive statistics
+ Using sample statistics and plots
₋ Step 3: Inferential statistics (main method)
+ Applying the linear regression model to analyse the relationships between the
setting parameters and to find their impact on the printer's output parameters
₋ Step 4: Prediction
+ Testing the assumptions from the results
2. BACKGROUND
2.1. Regression Analysis
2.1.1. Definition
Regression Analysis is a statistical method used to study the relationship between a
dependent variable (Y) and one or more random variables (X), also known as explanatory
variables. The main objective of regression analysis is to make predictions or describe the
dependent variable based on the random variables. The relationship between X and Y can
be represented in the form of a linear function or equation.
General idea: estimate a random variable $Y$ (the dependent variable) approximately as a
function $F(X_1, \dots, X_n)$ of other random variables $X_1, \dots, X_n$ (control variables, or
independent variables). This means that when we have the values of $X_1, \dots, X_n$, we want
to estimate the value of $Y$ from them. Function $F$ may depend on some parameters
$\theta = (\theta_1, \dots, \theta_k)$. We can write $Y$ as follows:
$$Y = F_\theta(X_1, \dots, X_n) + \epsilon$$
₋ Where $\epsilon$ is the error (also a random variable). We want to choose the most
appropriate function $F$ and parameters $\theta$ so that the error is as small as possible.
₋ $\sqrt{E(\epsilon^2)}$ is called the standard error of the regression model: a model with a lower
standard error is considered more accurate.
₋ In the functional relationship, for each value of X, we find a unique value of Y.
However, in statistics, one value of X can correspond to multiple different values
of Y because besides the main variable X, the variable Y may also be influenced
by other factors.
2.1.2. Simple Linear Regression Model
A simple linear regression model involving a dependent variable $Y$ and a random
variable $X$ is expressed as the equation:
$$Y = \beta_0 + \beta_1 X + \epsilon$$

Where:
₋ $\beta_0$ and $\beta_1$ are unknown parameters (referred to as the intercept and slope
coefficients of the regression line).
₋ $Y$ is the dependent variable, and $X$ is the random variable.
₋ $\epsilon$ represents the error component, assumed to have a normal distribution $N(0, \sigma^2)$.
₋ The term "linear" in the simple linear regression model refers to linearity in the
regression coefficients, not necessarily in the variables $Y$ and $X$.
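As a rough illustration of the simple linear regression model, the sketch below fits it to simulated data in R. The true values 3 and 2, the noise level and the sample size are arbitrary assumptions chosen for the example, not values from the report.

```r
# Least-squares fit of Y = b0 + b1 * X + error on simulated data.
# The true intercept (3), slope (2) and noise level are arbitrary (assumption).
set.seed(1)
x <- runif(40, min = 0, max = 10)
y <- 3 + 2 * x + rnorm(40, sd = 1)

# Closed-form least-squares estimates of slope and intercept
b1 <- cov(x, y) / var(x)
b0 <- mean(y) - b1 * mean(x)

# The same estimates obtained with lm()
fit <- lm(y ~ x)
print(rbind(by_hand = c(b0, b1), lm = unname(coef(fit))))

# Residual standard error: the estimate of sigma (n - 2 degrees of freedom)
sigma_hat <- sqrt(sum(residuals(fit)^2) / (length(y) - 2))
print(sigma_hat)
```

The hand-computed coefficients and the ones returned by lm() should agree, since lm() solves the same least-squares problem.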
2.2. Multiple Regression Model
2.2.1. Definition
Multiple linear regression is an extension of simple linear regression. It is used to
predict the value of a response variable based on the values of two or more explanatory
variables. The variable we want to predict is called the response variable (dependent
variable). The variables we use to predict the value of the response variable are called
explanatory variables (predictor variables, independent variables).
The general form of a multiple linear regression model is:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_i X_i + \mu$$

Where:
₋ $Y$ is the dependent variable.
₋ $X_i$ are the explanatory variables.
₋ $\beta_i$ are the regression coefficients.
₋ $\beta_0$ is the intercept.
₋ $\mu$ is the random error.

The coefficients $\beta_i$ represent the change in the expected value of $Y$ for a one-unit change
in $X_i$, holding other variables constant.
₋ If $\beta_i > 0$: the relationship between $Y$ and $X_i$ is positive, meaning that when $X_i$
increases (or decreases) with other independent variables held constant, $Y$ also
increases (or decreases).
₋ If $\beta_i < 0$: the relationship between $Y$ and $X_i$ is negative, meaning that when $X_i$
increases (or decreases) with other independent variables held constant, $Y$
decreases (or increases).
₋ If $\beta_i = 0$: there is no linear relationship between $Y$ and $X_i$ once the other
variables are held constant, meaning that $X_i$ does not influence $Y$ in this model.
2.2.2. Testing the Model
In multiple regression models, the null hypothesis states that the model has no
significance, meaning all slope coefficients are equal to zero.
The F-test (a form of the Wald test) is conducted as follows:
₋ Step 1: State the null hypothesis $H_0$: $\beta_1 = \beta_2 = \dots = \beta_i = 0$.
₋ Step 2: Regress $Y$ on a constant term and $X_1, X_2, \dots, X_i$ to obtain the unrestricted
residual sum of squares $RSS_U$, and regress $Y$ on the constant term alone to obtain the
restricted residual sum of squares $RSS_R$. With $n$ observations and $k = i + 1$ estimated
coefficients, the test statistic is
$$F_c = \frac{(RSS_R - RSS_U)/(k - 1)}{RSS_U/(n - k)}$$
Under $H_0$ it follows an F-distribution, which is the distribution of the ratio of two
independent chi-square variables, each divided by its degrees of freedom.
₋ Step 3: Look up the value in the F-table corresponding to the degrees of freedom
$(k - 1)$ for the numerator and $(n - k)$ for the denominator, at a
predetermined significance level $\alpha$.
₋ Step 4: Reject the null hypothesis $H_0$ at significance level $\alpha$ if $F_c > F(\alpha, k-1, n-k)$.
For the p-value method, calculate the p-value $= P(F > F_c \mid H_0)$ and reject $H_0$ if $p < \alpha$.
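The four steps can be sketched in R as follows. This is only an illustration: the built-in mtcars dataset and the chosen predictors are stand-ins (assumptions), not the 3D printer data.

```r
# Overall F-test for a multiple regression, computed by hand.
# mtcars is a built-in R dataset used purely as a stand-in (assumption).
fit_u <- lm(mpg ~ wt + hp + qsec, data = mtcars)  # unrestricted model
fit_r <- lm(mpg ~ 1, data = mtcars)               # restricted: intercept only

rss_u <- sum(residuals(fit_u)^2)
rss_r <- sum(residuals(fit_r)^2)

n <- nrow(mtcars)
k <- length(coef(fit_u))   # number of coefficients, intercept included

f_c    <- ((rss_r - rss_u) / (k - 1)) / (rss_u / (n - k))
f_crit <- qf(0.95, df1 = k - 1, df2 = n - k)   # critical value at alpha = 0.05
p_val  <- pf(f_c, df1 = k - 1, df2 = n - k, lower.tail = FALSE)

# The hand-computed statistic should match the one reported by summary()
f_lm <- unname(summary(fit_u)$fstatistic["value"])
print(c(F_c = f_c, F_lm = f_lm, F_crit = f_crit, p_value = p_val))
```

Comparing f_c with f_crit corresponds to the table lookup in Steps 3 and 4, while p_val corresponds to the p-value method.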
2.2.3. Testing the Assumptions of the Multiple Regression Model
Recalling the regression model and its assumptions:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_i X_i + \mu$$

₋ Assumption 1: Linearity of data: the relationship between the predictor variables
$X_1, \dots, X_i$ and the response variable $Y$ is assumed to be linear.
₋ Assumption 2: Normal distribution of errors.
₋ Assumption 3: Constant variance of errors.
₋ Assumption 4: Expectation of errors = 0.
₋ Assumption 5: Independence of the errors $\mu_1, \dots, \mu_n$ of the $n$ observations.
2.3. ANOVA
2.3.1. Definition
Analysis of Variance (ANOVA) is a parametric statistical technique used to compare
groups of data based on the mean values of observed samples from these groups. It
evaluates the equality of these group means through hypothesis testing. In research,
ANOVA is used as a tool to examine the influence of a factor on an outcome variable.
ANOVA is essentially an extension of the t-test for independent samples to the
comparison of means across multiple groups of independent observations. Unlike the
t-test, ANOVA can compare more than two groups. Note that ANOVA does not
compare variances; it analyzes variances in order to compare means.
ANOVA is used to test the hypothesis that the population means of groups are
equal.
This technique is based on calculating the variability within groups and between
group means.
There are two procedures for ANOVA: One-way ANOVA and Two-way ANOVA.
2.3.2. Two-Way Analysis of Variance
Two-way ANOVA is an extension of one-way ANOVA. With one-way ANOVA, we
have one independent variable affecting the dependent variable. With two-way
ANOVA, there are two independent variables.
Assumptions of two-way ANOVA:
₋ The population has a normal distribution.
₋ Each combination of factor levels is observed once, without repetition.
Steps to conduct hypothesis testing: We take non-repeated samples, then the units of the
first factor are grouped into K groups (columns), and the units of the second factor are
arranged into H blocks (rows). Thus, we have a combined table of the two causal
factors consisting of K columns and H rows, with (K x H) data cells. The total number of
observed samples is n = (K x H).
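| Rows (Blocks) \ Columns (Groups) | 1 | 2 | … | K |
|----------------------------------|---|---|---|---|
| 1 | $X_{11}$ | $X_{12}$ | … | $X_{1K}$ |
| 2 | $X_{21}$ | $X_{22}$ | … | $X_{2K}$ |
| … | … | … | … | … |
| H | $X_{H1}$ | $X_{H2}$ | … | $X_{HK}$ |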

₋ Step 1: Calculate sample means of groups

Individual group means (K columns):
$$\bar{X}_{\cdot j} = \frac{\sum_{i=1}^{H} X_{ij}}{H} \quad (j = 1, 2, \dots, K)$$
Individual block means (H rows):
$$\bar{X}_{i \cdot} = \frac{\sum_{j=1}^{K} X_{ij}}{K} \quad (i = 1, 2, \dots, H)$$
Overall sample mean:
$$\bar{X} = \frac{\sum_{i=1}^{H} \sum_{j=1}^{K} X_{ij}}{n} = \frac{\sum_{i=1}^{H} \bar{X}_{i \cdot}}{H} = \frac{\sum_{j=1}^{K} \bar{X}_{\cdot j}}{K}$$
₋ Step 2: Calculate sums of squared deviations
SST: Total sum of squares, reflecting the variability of the outcome factor due to the
influence of all factors.
Formula:
$$SST = \sum_{i=1}^{H} \sum_{j=1}^{K} (X_{ij} - \bar{X})^2$$

SSK: Sum of squares between groups (columns), reflecting the variability of the outcome
factor due to the influence of the first factor (arranged by column).
Formula:
$$SSK = H \sum_{j=1}^{K} (\bar{X}_{\cdot j} - \bar{X})^2$$

SSH: Sum of squares between blocks (rows), reflecting the variability of the outcome
factor due to the influence of the second factor (arranged by row).
Formula:
$$SSH = K \sum_{i=1}^{H} (\bar{X}_{i \cdot} - \bar{X})^2$$

SSE: Sum of squares of residuals, reflecting the variability of the outcome factor due to
the influence of other unrelated factors.
Formula:
$$SSE = SST - SSK - SSH$$
₋ Step 3: Calculate variances
$$MSK = \frac{SSK}{K - 1}, \qquad MSH = \frac{SSH}{H - 1}, \qquad MSE = \frac{SSE}{(K - 1)(H - 1)}$$

₋ Step 4: Hypothesis testing

Calculate the F-test statistics (F experimental):
$$F_1 = \frac{MSK}{MSE}, \qquad F_2 = \frac{MSH}{MSE}$$
Where: $F_1$ and $F_2$ are used for the first factor and the second factor respectively.
Find the theoretical F for the 2 causal factors:
+ Factor 1:
Standard $F = F(K-1, (K-1)(H-1), \alpha)$ is the critical value obtained from the F-distribution
table with $K-1$ degrees of freedom for the numerator and $(K-1)(H-1)$ degrees of freedom
for the denominator at significance level $\alpha$.
+ Factor 2:
Standard $F = F(H-1, (K-1)(H-1), \alpha)$ is the critical value obtained from the F-distribution
table with $H-1$ degrees of freedom for the numerator and $(K-1)(H-1)$ degrees of freedom
for the denominator at significance level $\alpha$.
If $F_1$ experimental > $F_1$ theoretical, reject $H_0$, meaning the means of the K groups (columns)
are not all equal.
If $F_2$ experimental > $F_2$ theoretical, reject $H_0$, meaning the means of the H blocks (rows) are
not all equal.
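Two-Way Analysis of Variance Table:

| Source of variation | Sum of squares (SS) | Degrees of freedom (df) | Mean Square (MS) | F-ratio |
|---------------------|---------------------|-------------------------|------------------|---------|
| Between columns (groups) | SSK | K - 1 | MSK | $F_1$ |
| Between rows (blocks) | SSH | H - 1 | MSH | $F_2$ |
| Residual | SSE | (K - 1)(H - 1) | MSE | |
| Total | SST | n - 1 | | |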
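The two-way procedure without replication can be sketched in R as below. The 3 x 4 table of values is made up for illustration (an assumption), and the cross-check with aov() is added here only to confirm the hand calculation.

```r
# Two-way ANOVA without replication, following Steps 1 to 4.
# The 3 x 4 table below is made-up illustrative data (assumption).
X <- matrix(c(12, 15, 11, 14,
              10, 13,  9, 15,
              14, 18, 13, 17),
            nrow = 3, byrow = TRUE)   # H = 3 blocks (rows), K = 4 groups (columns)
H <- nrow(X)
K <- ncol(X)

col_means <- colMeans(X)   # group means
row_means <- rowMeans(X)   # block means
grand     <- mean(X)       # overall mean

SST <- sum((X - grand)^2)
SSK <- H * sum((col_means - grand)^2)
SSH <- K * sum((row_means - grand)^2)
SSE <- SST - SSK - SSH

MSK <- SSK / (K - 1)
MSH <- SSH / (H - 1)
MSE <- SSE / ((K - 1) * (H - 1))
F1  <- MSK / MSE
F2  <- MSH / MSE

# Critical values at alpha = 0.05
F1_crit <- qf(0.95, K - 1, (K - 1) * (H - 1))
F2_crit <- qf(0.95, H - 1, (K - 1) * (H - 1))

# Cross-check against aov() on the same data in long format
# (as.vector() reads the matrix column by column)
long <- data.frame(y     = as.vector(X),
                   block = factor(rep(seq_len(H), times = K)),
                   group = factor(rep(seq_len(K), each = H)))
tab <- summary(aov(y ~ group + block, data = long))[[1]]
print(tab)
print(c(F1 = F1, F1_crit = F1_crit, F2 = F2, F2_crit = F2_crit))
```

The F values in the aov() table should match F1 and F2, because the design is balanced with one observation per cell.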

3. DATA ANALYSIS
3.1 Data reading
First, read the data by using [Link] and display it in the terminal to check that the
data was imported successfully.
Figure 1: 3D Printer Dataset for Mechanical Engineers
3.2 Checking missing values
The command [Link](data) returns a new data frame that marks which entries are null.
Therefore, the sum command can be applied to it to count the total number of null
entries.

Result of checking missing values: our data does not contain any null values, so no
further handling is needed and we move to the next step.
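A minimal sketch of reading a file and counting missing values is shown below. The function names behind the "[Link]" placeholders are not recoverable from the text, so read.csv and is.na are assumptions here; the small data frame and temporary file are invented stand-ins for the real dataset.

```r
# Sketch: read a CSV file and count missing values.
# read.csv / is.na are assumed; the data and file are invented stand-ins.
demo <- data.frame(layer_height = c(0.02, 0.06, NA, 0.10),
                   material     = c("abs", "pla", "abs", NA),
                   roughness    = c(25, 32, 40, 68))
path <- tempfile(fileext = ".csv")
write.csv(demo, path, row.names = FALSE)

dat <- read.csv(path)   # read the file back in
print(head(dat))        # quick visual check of the import

na_cells <- sum(is.na(dat))             # missing cells in the whole table
na_cols  <- colSums(is.na(dat))         # missing cells per column
na_rows  <- sum(!complete.cases(dat))   # rows with at least one missing value
print(list(cells = na_cells, per_column = na_cols, rows = na_rows))
```

Note that sum(is.na(dat)) counts missing cells, not rows; complete.cases() is one way to count affected rows instead.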
3.3 Data summary
Data statistics
First, we convert material and infill_pattern into factors by using
dat$material=[Link](dat$material)
dat$infill_pattern=[Link](dat$infill_pattern)
and then display an overview of the data using summary(dat).
Figure 2: Overview of dataset
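The conversion step might look like the sketch below. The call as.factor is an assumption about what the "[Link]" placeholder stood for, and the four-row data frame is invented for illustration.

```r
# Sketch: convert text columns to factors before summarising.
# as.factor is assumed; the small data frame is invented for illustration.
dat <- data.frame(material       = c("abs", "pla", "abs", "pla"),
                  infill_pattern = c("grid", "honeycomb", "grid", "grid"),
                  roughness      = c(25, 32, 40, 68),
                  stringsAsFactors = FALSE)

dat$material       <- as.factor(dat$material)
dat$infill_pattern <- as.factor(dat$infill_pattern)

str(dat)              # both columns are now factors with 2 levels
print(summary(dat))   # factors are summarised as counts per level
```

Once a column is a factor, lm() encodes it with dummy variables automatically, which is why coefficient names such as infill_patternhoneycomb appear in regression output.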
Next, we examine the influence of the infill_pattern factor on the output parameters
by using the boxplot() function.

The result:
Figure 3: Boxplot between grid and honeycomb
Comment:
In the first graph, the median for grid is slightly higher than the median for
honeycomb, while in the two remaining graphs the median for grid is slightly lower.
Overall, the choice of infill_pattern has no large effect on the output parameters.
After that, we repeat the comparison with the material factor and the output parameters.
The result:

Figure 4: Boxplot between abs and pla

Comment:
In the first graph, the median for abs is higher than the median for pla, while in the
two remaining graphs the median for abs is clearly lower. In conclusion, the choice of
material strongly affects the output parameters.
In the next step, we use histograms to examine the distribution of the output parameters.
Figure 5: Histogram
From the three histograms above, we can see that the values are not evenly distributed.
In the histogram of roughness, most values are concentrated between 50 and 200.
In the histogram of elongation, most values are concentrated in the middle, between
1 and 2. In the histogram of tension_strength, the values are concentrated toward the
right, between 25 and 30.
3.4 Correlation coefficients between variables.
To see the linear relationship between each pair of variables, we plot the correlation
coefficients of all variables using the corrplot function and display these coefficients
in the terminal.
Figure 6: Correlogram of the data

Figure 7: Correlation parameters summary
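A correlation matrix can be computed with base R's cor() and scanned for near-perfect correlations, as in the sketch below. The data are simulated, and the exact relation between fan_speed and bed_temperature is an assumption made only to produce a correlation of 1.

```r
# Sketch: correlation matrix and detection of perfectly correlated pairs.
# All data are simulated; the fan_speed relation is invented (assumption).
set.seed(42)
demo <- data.frame(nozzle_temperature = runif(20, 200, 250),
                   bed_temperature    = rep(c(60, 65, 70, 75, 80), times = 4))
demo$fan_speed <- (demo$bed_temperature - 60) * 5   # exact linear function
demo$roughness <- 2 * demo$nozzle_temperature + rnorm(20, sd = 10)

cor_mat <- cor(demo)
print(round(cor_mat, 3))

# Pairs with |r| essentially equal to 1 (upper triangle only, so each pair once)
high <- which(abs(cor_mat) > 0.999 & upper.tri(cor_mat), arr.ind = TRUE)
pairs_found <- data.frame(var1 = rownames(cor_mat)[high[, "row"]],
                          var2 = colnames(cor_mat)[high[, "col"]])
print(pairs_found)
```

Pairs flagged this way carry the same information, so only one variable of each pair can have its coefficient estimated in a regression.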

4. INFERENTIAL STATISTICS
4.1 Build linear regression model
From the given data set, we build an appropriate regression model to analyze how the
adjustment parameters affect the product after printing.
First, we define the variables as follows:
₋ The dependent variable: roughness.
₋ The independent variables: layer_height, wall_thickness, infill_density,
infill_pattern, nozzle_temperature, bed_temperature, print_speed, material,
and fan_speed.
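The model is displayed as follows:
$$\begin{aligned}
\text{roughness} = {} & \beta_0 + \beta_1 \times \text{layer\_height} + \beta_2 \times \text{wall\_thickness} + \beta_3 \times \text{infill\_density} \\
& + \beta_4 \times \text{infill\_pattern} + \beta_5 \times \text{nozzle\_temperature} + \beta_6 \times \text{bed\_temperature} \\
& + \beta_7 \times \text{print\_speed} + \beta_8 \times \text{material} + \beta_9 \times \text{fan\_speed} + \varepsilon
\end{aligned}$$
We estimate the coefficients $\beta_i$ with $i = 0, 1, 2, \dots, 9$:
#Linear regression model
model_1=lm(roughness~layer_height+wall_thickness+infill_density+infill_pattern+
nozzle_temperature+bed_temperature+print_speed+material+fan_speed,data=data1)
summary(model_1)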
Figure 8: Results when building linear regression model 1
Comment: The figure shows that the estimated values of $\beta_0$ to $\beta_9$ are -2371;
1269; 2.334; -0.04231; -0.1255; 15.06; -16.13; 0.6496; 298.5; NA respectively.
Notably, the value of $\beta_9$ cannot be estimated due to singularities.
In order to examine these singularities, we check the correlations which were
calculated in section 3.4.

Figure 9: correlation summary

Comment:
The figure shows that the correlation coefficient between bed_temperature and
fan_speed equals 1, which means that fan_speed is an exact linear function of
bed_temperature and adds no independent information to the model.
=> Rebuild the linear regression without the fan_speed parameter.
Model 1_2: new linear regression
#New linear regression after eliminating fan_speed
model_1_2=lm(roughness~layer_height+wall_thickness+infill_density+infill_pattern+
nozzle_temperature+bed_temperature+print_speed+material,data=data1)
summary(model_1_2)

Figure 10: New linear regression
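The effect of a perfectly collinear predictor can be reproduced on simulated data, as sketched below. The variable names mirror the report, but every value and the fan_speed relation are invented assumptions.

```r
# Sketch: a perfectly collinear predictor yields an NA coefficient in lm().
# All values are simulated; the fan_speed relation is invented (assumption).
set.seed(7)
n <- 30
demo <- data.frame(layer_height    = runif(n, 0.02, 0.2),
                   bed_temperature = sample(c(60, 65, 70, 75, 80), n, replace = TRUE))
demo$fan_speed <- (demo$bed_temperature - 60) * 5
demo$roughness <- 1200 * demo$layer_height + 2 * demo$bed_temperature +
  rnorm(n, sd = 15)

fit_full <- lm(roughness ~ layer_height + bed_temperature + fan_speed, data = demo)
print(coef(fit_full))             # fan_speed is NA: it adds no new information
print(alias(fit_full)$Complete)   # fan_speed expressed via the earlier terms

# Dropping the aliased term leaves the other estimates and the fit unchanged
fit_red <- update(fit_full, . ~ . - fan_speed)
print(coef(fit_red))
```

This matches the observation in the comment that follows: the estimates do not change when the aliased variable is removed, because lm() had already excluded it from the fit.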

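Comment: The values in the column "Estimate" remain the same, hence we get the
equation:
$$\begin{aligned}
\text{roughness} = {} & -2371 + 1269 \times \text{layer\_height} + 2.334 \times \text{wall\_thickness} - 0.04231 \times \text{infill\_density} \\
& - 0.1255 \times \text{infill\_patternhoneycomb} + 15.06 \times \text{nozzle\_temperature} - 16.13 \times \text{bed\_temperature} \\
& + 0.6496 \times \text{print\_speed} + 298.5 \times \text{material}
\end{aligned}$$
Now, we use the p-values from the column Pr(>|t|) to test the hypotheses, which
helps examine whether each independent variable considerably affects the output.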
4.2 Use P-values for hypotheses testing
₋ Assuming significance level α = 0.05:
+ Hypothesis H0: βi = 0 => The variable is not statistically significant for the output
value.
+ Hypothesis H1: βi ≠ 0 => The variable is statistically significant for the output
value.
₋ From the column Pr(>|t|), the p-value of layer_height is
$2 \times 10^{-16} < \alpha = 0.05$, so we can reject H0: adjusting this parameter has a
considerable effect on roughness.
₋ Similarly, nozzle_temperature, bed_temperature, print_speed and material have a
significant effect on roughness.
₋ On the other hand, the p-values of β2 (wall_thickness), β3 (infill_density) and β4
(infill_patternhoneycomb) are 0.29259, 0.85742 and 0.99117, respectively. We cannot
reject the null hypothesis H0 with these p-values. Hence, these variables are not
statistically significant in the regression model.
=> Eliminate wall_thickness, infill_density and infill_patternhoneycomb from the model,
one at a time, starting with the largest p-value.
₋ Model 2: removing infill_patternhoneycomb
#eliminate infill_patternhoneycomb
model_2=lm(roughness~layer_height+wall_thickness+infill_density+
nozzle_temperature+bed_temperature+print_speed+material,data=data1)
summary(model_2)
Figure 11: Eliminate infill_patternhoneycomb
₋ Model 3: removing infill_density
#eliminate infill_density
model_3=lm(roughness~layer_height+wall_thickness+
nozzle_temperature+bed_temperature+print_speed+material,data=data1)
summary(model_3)

Figure 12: Eliminate infill_density

₋ Model 4: removing wall_thickness
#eliminate wall_thickness
model_4=lm(roughness~layer_height+
nozzle_temperature+bed_temperature+print_speed+material,data=data1)
summary(model_4)

Figure 13: Eliminate wall thickness
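The elimination steps can also be written with update(), which avoids retyping the formula. The sketch below uses simulated data in which wall_thickness deliberately has no effect; the variable names follow the report, but all values are assumptions.

```r
# Sketch: read p-values from summary() and drop the weakest term with update().
# Simulated data; wall_thickness has no true effect by construction (assumption).
set.seed(3)
n <- 50
demo <- data.frame(layer_height   = runif(n, 0.02, 0.2),
                   wall_thickness = sample(1:10, n, replace = TRUE),
                   print_speed    = sample(c(40, 60, 120), n, replace = TRUE))
demo$roughness <- 1200 * demo$layer_height + 0.6 * demo$print_speed +
  rnorm(n, sd = 20)

fit <- lm(roughness ~ layer_height + wall_thickness + print_speed, data = demo)
p <- summary(fit)$coefficients[, "Pr(>|t|)"]
print(round(p, 4))

# Term with the largest p-value, intercept excluded.
# This works directly for numeric predictors; for factors the coefficient
# name (e.g. infill_patternhoneycomb) differs from the term name.
worst <- names(which.max(p[-1]))
fit2  <- update(fit, as.formula(paste(". ~ . -", worst)))
print(formula(fit2))
```

Removing one term at a time and refitting matters because p-values of the remaining coefficients can change after each removal.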

4.3 Use anova to compare models
In this section, we use the anova test to find the most suitable regression model for the
output data. Each test compares two nested models and checks whether the extra variable
in the larger model explains significantly more of the variation in the dependent variable.
₋ Assuming significance level α = 0.05:
+ Hypothesis H0: βi = 0 for the removed variable => The reduced model is adequate.
+ Hypothesis H1: βi ≠ 0 for the removed variable => The larger model is more effective.
₋ After identifying the most appropriate model, we check the assumptions of the linear
regression model.
4.3.1 Compare models
₋ model 1_2 vs model 2
#compare 1_2 vs 2
anova(model_1_2,model_2)
Figure 14: Anova of model 1_2 and model 2
Comment:
The p-value is 0.9912, which is greater than the significance level α = 0.05, so
we cannot reject H0. Therefore, the simpler model 2 is preferred over model 1_2.
₋ model 2 vs model 3
#compare 2 vs 3
anova(model_2,model_3)

Figure 15: Anova of model 2 and model 3

Comment:
The p-value is 0.8557, which is greater than the significance level α = 0.05, so
we cannot reject H0. Therefore, the simpler model 3 is preferred over model 2.
₋ model 3 vs model 4
#compare 3 vs 4
anova(model_3,model_4)
Figure 16: Anova of model 3 and model 4

Comment:
The p-value is 0.2828, which is greater than the significance level α = 0.05, so we
cannot reject H0. Therefore, the simpler model 4 is preferred over model 3.
Conclusion:
From these comparisons, model 4 is the most effective, hence it is the most appropriate
linear regression model for roughness.
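What anova() computes for two nested models can be sketched as follows. The data are simulated and wall_thickness has no true effect by construction (assumptions); the variable names merely echo the report.

```r
# Sketch: comparing two nested models with anova() and by hand.
# Simulated data; wall_thickness has no true effect (assumption).
set.seed(11)
n <- 50
demo <- data.frame(layer_height   = runif(n, 0.02, 0.2),
                   wall_thickness = sample(1:10, n, replace = TRUE))
demo$roughness <- 1200 * demo$layer_height + rnorm(n, sd = 20)

full    <- lm(roughness ~ layer_height + wall_thickness, data = demo)
reduced <- lm(roughness ~ layer_height, data = demo)

cmp   <- anova(reduced, full)
p_val <- cmp[["Pr(>F)"]][2]
print(cmp)

# Partial F statistic by hand: one coefficient is dropped, so numerator df = 1
rss_r  <- deviance(reduced)   # residual sum of squares, reduced model
rss_f  <- deviance(full)      # residual sum of squares, full model
f_hand <- ((rss_r - rss_f) / 1) / (rss_f / df.residual(full))

# With a single dropped coefficient, F equals the squared t statistic
t_wall <- summary(full)$coefficients["wall_thickness", "t value"]
print(c(F_anova = cmp[["F"]][2], F_hand = f_hand, t_squared = t_wall^2))
```

Because F equals the squared t statistic when exactly one coefficient is dropped, the anova p-value for such a comparison coincides with the Pr(>|t|) value of that coefficient in the larger model.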
4.3.2 Checking the assumption of the linear regression model
₋ The assumptions of the regression model
$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_i x_i + \varepsilon$ are:
₋ There must be a linear relationship between the outcome variable and the independent
variables.
₋ The error has a normal distribution.
₋ The variance of the errors is constant.
₋ Errors ε have expectation = 0
We plot the residual analysis graphs to make these assumptions easier to examine.
#plot graph
plot(model_4)
Figure 17: Residual analysis graphs

Comment:
₋ Graph 1 displays the residuals against the fitted values, in order to
verify the assumption of data linearity and the expectation of zero errors:
+ The red line appears almost straight, indicating that the linearity assumption of the
data is met.
+ The residuals are mostly clustered around the zero line y = 0, with only a few
outliers. This supports the assumption that the expectation of the errors is 0.
₋ In Graph 2, the standardized residuals are plotted against normal quantiles to assess
the normal distribution assumption:
+ The standardized residuals align closely with a straight line, suggesting that the
assumption of normal distribution is satisfied.
₋ Graph 3 depicts the square root of the absolute standardized residuals, examining the
constant variance assumption:
+ While some outliers are present, the points primarily cluster around
the red line, indicating an acceptable degree of variance stability.
₋ In Graph 4, any high-influence points within the dataset are identified:
+ Points 23, 25, and 5 exhibit relatively high influence. However, since
they do not fall beyond the Cook's distance contours, these points are not deemed highly
influential and do not require exclusion during analysis.
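The same checks can be backed by numbers as well as plots. The sketch below uses the built-in mtcars dataset as a stand-in (assumption) and a Cook's distance threshold of 1, which is a common rule of thumb rather than a value taken from the report.

```r
# Sketch: numeric counterparts of the four diagnostic graphs.
# mtcars is a built-in stand-in dataset; the threshold 1 is a rule of thumb.
fit <- lm(mpg ~ wt + hp, data = mtcars)

res <- residuals(fit)
print(mean(res))   # essentially 0 for least squares with an intercept

sw <- shapiro.test(rstandard(fit))   # normality of standardized residuals
print(sw$p.value)

# Scale-location quantity shown in Graph 3
scale_loc <- sqrt(abs(rstandard(fit)))
print(summary(scale_loc))

# Influence measures behind Graph 4
cd  <- cooks.distance(fit)
lev <- hatvalues(fit)
influential <- names(cd)[cd > 1]
print(head(sort(cd, decreasing = TRUE), 3))   # three largest Cook's distances
print(influential)
```

A large Shapiro-Wilk p-value is consistent with normal residuals, and an empty influential vector corresponds to no point lying beyond the outer Cook's distance contour.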
5. DISCUSSION AND EXTENSION
6. DATA AND CODE AVAILABILITY
Data link: Dataset
Code Link: Codelink
7. REFERENCES


32 pages
Machine Learning for Japan's Inflation Forecasts
No ratings yet
Machine Learning for Japan's Inflation Forecasts
23 pages
IMDB Movie Success Prediction Analysis
No ratings yet
IMDB Movie Success Prediction Analysis
9 pages
Exam: Quantitative Research Methods
No ratings yet
Exam: Quantitative Research Methods
13 pages
Factors Influencing Hotel Stays in Pangkalan Bun
No ratings yet
Factors Influencing Hotel Stays in Pangkalan Bun
8 pages
Univariate Time Series Modelling and Forecasting
No ratings yet
Univariate Time Series Modelling and Forecasting
62 pages
Business Analysis Report: Income Factors
No ratings yet
Business Analysis Report: Income Factors
13 pages
BS in Economics & Statistics at IISER
No ratings yet
BS in Economics & Statistics at IISER
6 pages
Lineweaver-Burk Analysis in Excel
No ratings yet
Lineweaver-Burk Analysis in Excel
3 pages
Tutorial in Biostatistics Handling Drop-Out in Longitudinal Studies
No ratings yet
Tutorial in Biostatistics Handling Drop-Out in Longitudinal Studies
43 pages
Understanding Statistical Modeling Basics
No ratings yet
Understanding Statistical Modeling Basics
39 pages
Statistical Techniques Classification Guide
No ratings yet
Statistical Techniques Classification Guide
32 pages
Liu Parameter Estimation in Regression
No ratings yet
Liu Parameter Estimation in Regression
11 pages
ANOVA: One-Way and Two-Way Analysis
No ratings yet
ANOVA: One-Way and Two-Way Analysis
55 pages
Understanding Heteroskedasticity in Finance
No ratings yet
Understanding Heteroskedasticity in Finance
17 pages
STA201 Subject Outline: Scientific Statistics
No ratings yet
STA201 Subject Outline: Scientific Statistics
23 pages

Variable Type Unit Description

Layer Vertical thickness of each layer deposited by the

Material Categorical none Type of material used for printing

Fan Speed Quantitative % Speed of cooling fan

Output Parameters (Measured)

Roughness Quantitative µm Surface roughness or texture of printed object

₋ Assumption 1: Linearity of data: the relationship between the predictor variable

₋ Step 1: Calculate sample means of groups

₋ Step 4: Hypothesis testing

Residual SSE (k - 1)(h - 1) MSE

Figure 4: Boxplot between ABS and PLA

Figure 7: Correlation parameters summary

Figure 9: Correlation summary

Figure 10: New linear regression

Figure 12: Eliminate infill_density

Figure 13: Eliminate wall thickness

Figure 15: ANOVA of model 2 and model 3

