Real Estate Agent Performance Analysis

The document analyzes data from 500 real estate transactions to understand agent performance and property characteristics. Various statistical analyses were conducted using SPSS including frequency tables, charts, and distributions. Frequency tables show the number of transactions by agent, with Carter having the most. Charts show properties are most commonly located in Area 5 and that half have pools. Distributions examine number of bedrooms, days on market, and property prices.

Uploaded by

Hello

DATA ANALYSIS

In this report, we present our findings on the working performance of all agents currently
employed by Real Estate Sdn Bhd and on the types of property sold in a sample of the past
500 business transactions. The currently employed agents are Carter, Isaacs, Marty,
Peterson and Rose.

We used SPSS to carry out 20 statistical analyses on the data. Before carrying out the
analysis and deciding which tools to use, we need to understand the type of each variable
in the data. (See Appendices 1 & 2 for the data view and variable view in SPSS.)

No. Type of variables and data used in report

1. Categorical (qualitative) variables: values such as "yes" or "no" that can be divided
   into categories.
   a) Agent: 1=Carter, 2=Isaacs, 3=Marty, 4=Peterson, 5=Rose
   b) Pool: does the home have a pool (1=yes, 2(0)=no)
   c) Garage: does the home have an attached garage (1=yes, 2(0)=no)
   d) Mortgage type: fixed or adjustable (1=fixed mortgage, a 30-year fixed interest rate
      loan; 2=adjustable rate loan that begins with an introductory interest rate of 3% for
      the first 5 years, after which the interest rate is based on the current interest
      rates plus 1%)
   e) Township: area where the property is located
   f) Default: is the mortgage loan in default? (1=yes, 2(0)=no)
2. Numerical (quantitative) variables: values that represent a counted or measured quantity.
   Discrete variables arise from a counting process.
   a) Bedrooms: number of bedrooms
   b) Baths: number of bathrooms
   c) Days: number of days the property was on the market
   d) Years: number of years that the mortgage loan has been paid
3. Continuous variables arise from a measuring process and depend on the precision of the
   measuring instrument used.
   a) Price: market price in dollars
   b) Size: livable square feet of the property
   c) FICO: the credit score of the mortgage loan holder. The highest score is 850; an
      average score is 680; a low score is below 680. The score reflects a person's
      ability to pay their debts.

Independent variables can be changed or controlled in a given model or equation.

Dependent variables are the outcomes that result from the independent variables.

1. Frequency Table
Statistics
Agent
N Valid 500
Missing 0

Agent
           Frequency   Percent   Valid Percent   Cumulative Percent
Carter     119         23.8      23.8            23.8
Isaacs     100         20.0      20.0            43.8
Marty      92          18.4      18.4            62.2
Peterson   104         20.8      20.8            83.0
Rose       85          17.0      17.0            100.0
Total      500         100.0     100.0

The table above is a frequency table for a qualitative variable, in this case the Agent.
The sample contains 500 sales transactions, divided among 5 agents: Carter, Isaacs, Marty,
Peterson and Rose. Each agent has its own frequency in the data. Carter has the highest
frequency of sales transactions, 119, which means Carter completed the most property
sales. Peterson has 104, Isaacs has 100 and Marty has 92. Rose has the lowest frequency,
85, which means Rose completed the fewest property sales.
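
As a rough illustration of how the Percent and Cumulative Percent columns follow from the frequencies, here is a minimal Python sketch. The counts are copied from the Agent table above; the function name `frequency_table` is our own and is not part of SPSS.

```python
from itertools import accumulate

# Frequencies copied from the Agent frequency table in this report.
agent_counts = {"Carter": 119, "Isaacs": 100, "Marty": 92, "Peterson": 104, "Rose": 85}


def frequency_table(counts):
    """Return rows of (category, frequency, percent, cumulative percent)."""
    total = sum(counts.values())
    running_totals = accumulate(counts.values())
    return [
        (name, freq, round(100 * freq / total, 1), round(100 * cum / total, 1))
        for (name, freq), cum in zip(counts.items(), running_totals)
    ]


rows = frequency_table(agent_counts)
for name, freq, pct, cum_pct in rows:
    print(f"{name:<10}{freq:>5}{pct:>8.1f}{cum_pct:>8.1f}")
```

Each percentage is the frequency divided by the total of 500, and the cumulative percentage adds up the rows in order until it reaches 100.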

2. Bar Chart

The bar chart above visualizes the areas in which the properties are located. The variable
is area of town, a categorical variable. It is clear that most of the properties are
located in Area 5 (116), followed by Area 4 (115), Area 3 (104) and Area 2 (92), with the
fewest properties located in Area 1 (73). The reason for using a bar chart is that it
visualizes a categorical variable, and the length of each bar represents the tally for
that category. It is also useful for showing relative values between categories.

3. Pie Chart

The chart above indicates whether the 500 sample properties in the sales transactions
have a pool or not. The pie chart shows that 50% of the properties have a pool and 50% do
not, so the two groups are in the same proportion. The reason for using a pie chart is
that it is visually simpler than other types of graph. A pie chart uses parts of a circle
to represent the percentage of each category of a categorical variable.

4. Frequency distribution

A frequency distribution shows how frequencies are distributed over values. It can be
presented as a list, table or graph. The frequency distribution shown above is displayed
as a bar chart of the number of days taken to sell a property, across every agent in the
company. The lowest value on this graph is day 2 and the highest is day 60. Day 20 has the
highest frequency, followed by day 32; the lowest frequency is at day 14. The reason for
using days as the variable is that we want to see how many days a property waits before it
is sold, and the chart gives information about the shape of the distribution.

5. Histogram

Bedroom
        Frequency   Percent   Valid Percent   Cumulative Percent
2       68          13.6      13.6            13.6
3       68          13.6      13.6            27.2
4       78          15.6      15.6            42.8
5       74          14.8      14.8            57.6
6       72          14.4      14.4            72.0
7       61          12.2      12.2            84.2
8       79          15.8      15.8            100.0
Total   500         100.0     100.0

The graph below is known as histogram. It is used to represent a quantitative variable.

A histogram is a graph in which the values of the variable, here the number of bedrooms,
are marked on the horizontal axis and the class frequency is on the vertical axis.
Histograms are used with interval and ratio data. The class frequencies are represented by
the heights of the bars, and the bars are drawn adjacent to each other. The variable used
in this graph is the number of bedrooms, and there are 500 observations. 68 properties
(13.6%) have 2 bedrooms and another 68 (13.6%) have 3 bedrooms. 78 properties (15.6%) have
4 bedrooms and 74 (14.8%) have 5 bedrooms. In addition, 72 properties (14.4%) have 6
bedrooms and 61 (12.2%) have 7 bedrooms. The largest group, 79 properties (15.8%), has 8
bedrooms.
The reason for using bedrooms as the variable is that we want to see the frequency of
each number of bedrooms, and the histogram gives a clearer picture of the frequency of
values in the dataset.

6. Frequency Polygon

A frequency polygon is a graphical chart for understanding the shapes of distributions and
is helpful for comparing sets of data. It represents a quantitative variable, here the
price of the properties. The points of the graph are plotted at the midpoint of each class
and connected. From the frequency polygon above, we can see that the highest frequency is
between $300,000 and $350,000, followed by $500,000 to $550,000, then $800,000 to
$850,000.
The reason for using a frequency polygon is that two or more frequency polygons can be
superimposed on the same axes to make comparisons between sets of data.

7. Cumulative frequency polygon

A cumulative frequency polygon is also known as an ogive (oh-jive). It is a type of
frequency polygon that shows cumulative frequencies: a curve showing the cumulative
frequency for a given set of data. For ungrouped data, the cumulative frequency is plotted
on the y-axis against the data on the x-axis. This graph shows the cumulative frequency of
property prices, ordered from the smallest to the largest sale price. The highest sale
value of a property among the 500 sales transactions is over $900,000.

8. Central tendency

Statistics
                   Price               Size
N Valid            500                 500
N Missing          0                   0
Mean               $559,977.93         4644.20
Median             $569,929.50         4660.00
Mode               $168,256 (a)        3141 (a)
Std. Deviation     $216,050.045        1750.683
Variance           46677622100.610     3064890.871
Range              $750,306            6090
Percentiles 25     $360,501.00         3086.25
Percentiles 50     $569,929.50         4660.00
Percentiles 75     $751,988.50         6093.75
a. Multiple modes exist. The smallest value is shown

Central tendency is the extent to which the values of a numerical variable group around
a typical, or central, value. In other words, it is a way to describe the center of a
data set. There are three measures of central tendency: the mean, the median, and the
mode. The mean is the "balance point" of a set of data.
Mean = Sum of the values / Number of values
The table shows that the mean price of a property is $559,977.93 and the mean property
size is 4,644.20 square feet.
The median is the middle value in an ordered array of data that has been ranked from
smallest to largest.
Median = the (n+1)/2 ranked value
The table shows that the median price of a property is $569,929.50 and the median property
size is 4,660.00 square feet. The mode is the value that appears most frequently. The
table reports a mode of $168,256 for price and 3,141 square feet for size; because
multiple modes exist, SPSS shows only the smallest one.
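
The three measures can be checked by hand on a small example. The sketch below uses Python's standard `statistics` module on a hypothetical list of eight property sizes (these values are made up for illustration and are not taken from the 500 transactions). It also mirrors the SPSS footnote by picking the smallest value when several modes exist.

```python
import statistics

# Hypothetical property sizes in square feet, for illustration only.
sizes = [2400, 3100, 3100, 4600, 4600, 5200, 6100, 6900]

# Mean = sum of the values / number of values.
mean_size = statistics.mean(sizes)

# Median = the (n+1)/2 ranked value; with n = 8 that is the 4.5th value,
# i.e. the average of the 4th and 5th ranked values.
median_size = statistics.median(sizes)

# multimode returns every value that shares the highest frequency.
modes = statistics.multimode(sizes)
smallest_mode = min(modes)  # what SPSS reports when multiple modes exist

print(mean_size, median_size, modes, smallest_mode)
```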

9. Dot Plot

A dot plot is a simple histogram-like chart used in statistics for relatively small data
sets where values fall into a number of discrete bins (categories). Each value is
represented by a dot. In the dot plot above, each dot represents the credit score (FICO)
of a property mortgage loan holder. We can see that the credit scores are concentrated
between 670 and 690. The highest score is 850; an average score is 680; a low score is
below 680. The score reflects a person's ability to pay their debts. The dot plot also
shows that the credit scores of the mortgage loan holders are spread fairly evenly. The
reason for using a dot plot is that it is less cluttered.

10. Percentiles

Statistics
years
N Valid          500
N Missing        0
Mean             10.67
Median           10.00
Percentiles 25   6.00
Percentiles 50   10.00
Percentiles 75   16.00

years
        Frequency   Percent   Valid Percent   Cumulative Percent
1       21          4.2       4.2             4.2
2       22          4.4       4.4             8.6
3       22          4.4       4.4             13.0
4       21          4.2       4.2             17.2
5       27          5.4       5.4             22.6
6       27          5.4       5.4             28.0
7       30          6.0       6.0             34.0
8       28          5.6       5.6             39.6
9       29          5.8       5.8             45.4
10      29          5.8       5.8             51.2
11      27          5.4       5.4             56.6
12      20          4.0       4.0             60.6
13      24          4.8       4.8             65.4
14      19          3.8       3.8             69.2
15      23          4.6       4.6             73.8
16      28          5.6       5.6             79.4
17      20          4.0       4.0             83.4
18      23          4.6       4.6             88.0
19      25          5.0       5.0             93.0
20      35          7.0       7.0             100.0
Total   500         100.0     100.0

Percentiles split a variable into 100 equal parts. The first quartile is equivalent to the
25th percentile, the second quartile to the 50th percentile, and the third quartile to the
75th percentile. Here the 25th percentile is 6, the 50th percentile is 10 and the 75th
percentile is 16.
Percentiles can be used to describe the distribution of the values of a numerical
variable. We can therefore describe the distribution of the number of years that the
mortgage loan has been paid by using the five-number summary method.
Five-number summary method:
X smallest - Q1 - Median - Q3 - X largest
For the number of years that the mortgage loan has been paid:
1 - 6 - 10 - 16 - 20
Since the distance from Q1 to the median (10-6=4) is less than the distance from the
median to Q3 (16-10=6), the distribution of years of mortgage loan is right-skewed.
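
The quartiles and the skewness argument can be reproduced from the frequency table of years. The following is a minimal sketch: the counts are copied from the table above, and we assume that the `"exclusive"` method of Python's `statistics.quantiles`, which interpolates at position p(n+1), matches the way SPSS computed these percentiles. It does reproduce 6, 10 and 16 for this data.

```python
import statistics

# Frequency of each value of "years" (1 to 20), copied from the table above.
year_counts = [21, 22, 22, 21, 27, 27, 30, 28, 29, 29,
               27, 20, 24, 19, 23, 28, 20, 23, 25, 35]

# Rebuild the 500 individual observations from the frequency table.
years = [year for year, count in enumerate(year_counts, start=1)
         for _ in range(count)]

mean_years = statistics.mean(years)
quartiles = statistics.quantiles(years, n=4, method="exclusive")
q1, median, q3 = quartiles
five_number_summary = (min(years), q1, median, q3, max(years))

# Compare the two halves of the box: a longer upper half suggests right skew.
if q3 - median > median - q1:
    shape = "right-skewed"
elif q3 - median < median - q1:
    shape = "left-skewed"
else:
    shape = "symmetric"

print(five_number_summary, round(mean_years, 2), shape)
```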

11. Box Plot

The function of a box plot is similar to that of percentiles: to visualize the
distribution of the values of a numerical variable. A box plot is a graphical rendition of
statistical data based on the minimum, first quartile, median, third quartile, and
maximum. The term "box plot" comes from the fact that the graph looks like a rectangle
with lines extending from the top and bottom. The reason for using it is that it shows the
distribution of a variable directly, compared with a list of percentiles. Box plots are
also a quick and efficient way to visualize a relationship between variables. The
distribution of price for agent 4, Peterson, is right-skewed; the distributions for the
other agents are left-skewed.

12. Skewness

Skewness is asymmetry in a statistical distribution, in which the curve appears distorted
or skewed either to the left or to the right. Skewness can be quantified to define the
extent to which a distribution differs from a normal distribution. The reason for using
skewness is that it helps us determine the overall shape of the distribution curve,
whether it is positive or negative. The coefficient also helps us determine whether the
right tail or the left tail of the distribution is more pronounced.

• Skewness > 0: right-skewed distribution. Most values are concentrated on the
left of the mean, with extreme values to the right.
• Skewness < 0: left-skewed distribution. Most values are concentrated on
the right of the mean, with extreme values to the left.
• Skewness = 0: mean = median, and the distribution is symmetrical around the
mean.

Descriptive Statistics
                     N           Skewness
                     Statistic   Statistic   Std. Error
Number of bathroom   500         .066        .109
Valid N (listwise)   500

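N represents the number of observations. The skewness statistic for the number of
bathrooms is .066. Since skewness > 0, the distribution is slightly right-skewed. The
statistic is smaller than its standard error (.109), so the distribution is close to
symmetric.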
A positive skew could be good or bad, depending on the mean. A positive mean with a
positive skew is good, while a negative mean with a positive skew is not good. If a data
set has a positive skew, but the mean of the returns is negative, it means that overall
performance is negative, but the outlier months are positive. A negative skew is generally
not good, because it highlights the risk of left tail events or what are sometimes referred
to as “black swan events.” While a consistent and steady track record with a positive
mean would be a great thing, if the track record has a negative skew then you should
proceed with caution.
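
To judge whether a skewness of .066 is meaningfully different from zero, it can be compared with its standard error. The sketch below assumes the usual large-sample formula for the standard error of sample skewness, which depends only on n; for n = 500 it reproduces the .109 shown in the SPSS table. The function name is our own.

```python
import math


def skewness_standard_error(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


n = 500
skewness = 0.066  # skewness statistic for number of bathrooms, from the table

se = skewness_standard_error(n)
z = skewness / se  # how many standard errors the skewness is away from zero

print(round(se, 3), round(z, 2))
```

A ratio well inside plus or minus 1.96 suggests that the right skew is too small to distinguish from a symmetric distribution at the 5% level.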

13. Scatter diagram

A scatter diagram (or "scatter plot") is a graph used to plot the relationship between two
continuous variables, with the x-axis representing one variable and the y-axis
representing the other.
The reason for using a scatter diagram is that it makes the relationship between
variables easy to understand. This diagram shows the relationship between the price and
the size of the property.
If the points cluster in a band running from lower left to upper right, there is a positive
correlation (if x increases, y increases). If the points cluster in a band from upper left
to lower right, there is a negative correlation (if x increases, y decreases). Imagine
drawing a straight line or curve through the data so that it "fits" as well as possible.
The more closely the points cluster around the imaginary line of best fit, the stronger
the relationship between the two variables. If it is hard to see where you would draw a
line, and the points show no significant clustering, there is probably no correlation. In
conclusion, there is no visible correlation between the price and the size of the
property, and we are unable to draw a straight line through the points in the graph.
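
The visual impression from a scatter diagram can be backed up with the Pearson correlation coefficient r, which is close to +1 or -1 when the points lie near a straight line and close to 0 when they do not. The sketch below computes r from its definition. Both data sets are hypothetical and are only meant to show the two extremes; they are not the report's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sizes (square feet) and two hypothetical sets of prices.
sizes = [2000, 3000, 4000, 5000, 6000]
prices_linear = [300000, 400000, 500000, 600000, 700000]  # rises with size
prices_flat = [520000, 510000, 530000, 510000, 520000]    # no trend with size

r_linear = pearson_r(sizes, prices_linear)
r_flat = pearson_r(sizes, prices_flat)
print(r_linear, r_flat)
```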

14. Contingency table

A contingency table, sometimes called a two-way frequency table, is a table with at least
two rows and two columns used in statistics to present categorical data in terms of
frequency counts. The reason for using a contingency table is to study patterns that may
exist between the variables.

Township * Agent Crosstabulation

                                 Agent
Township                         1       2       3       4       5       Total
1       Count                    17      14      10      19      13      73
        % within Township        23.3%   19.2%   13.7%   26.0%   17.8%   100.0%
2       Count                    23      19      17      16      17      92
        % within Township        25.0%   20.7%   18.5%   17.4%   18.5%   100.0%
3       Count                    29      19      19      26      11      104
        % within Township        27.9%   18.3%   18.3%   25.0%   10.6%   100.0%
4       Count                    32      27      17      18      21      115
        % within Township        27.8%   23.5%   14.8%   15.7%   18.3%   100.0%
5       Count                    18      21      29      25      23      116
        % within Township        15.5%   18.1%   25.0%   21.6%   19.8%   100.0%
Total   Count                    119     100     92      104     85      500
        % within Township        23.8%   20.0%   18.4%   20.8%   17.0%   100.0%

From this table, we can see that Agent 1, Carter, handled the most property sales
transactions overall. Carter handled the largest share of the sales in Area 2 (25.0%),
Area 3 (27.9%) and Area 4 (27.8%), but a smaller share in Area 5 (15.5%). Agent 3, Marty,
handled the largest share of the sales in Area 5 (25.0% of the properties in Area 5),
while Agent 4, Peterson, handled the largest share in Area 1 (26.0%).
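
The "% within Township" rows are row percentages: each count divided by the total for that township. A minimal sketch of that calculation, using the counts copied from the crosstabulation above (the helper name `row_percentages` is our own):

```python
agents = ["Carter", "Isaacs", "Marty", "Peterson", "Rose"]

# Counts per township (rows) and agent (columns), copied from the crosstabulation.
counts = {
    1: [17, 14, 10, 19, 13],
    2: [23, 19, 17, 16, 17],
    3: [29, 19, 19, 26, 11],
    4: [32, 27, 17, 18, 21],
    5: [18, 21, 29, 25, 23],
}


def row_percentages(row):
    """Express each count as a percentage of its row total."""
    total = sum(row)
    return [round(100 * count / total, 1) for count in row]


within_township = {town: row_percentages(row) for town, row in counts.items()}

# Agent with the largest share of the sales in each township.
top_agent = {town: agents[row.index(max(row))] for town, row in counts.items()}

for town in counts:
    print(town, within_township[town], top_agent[town])
```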

Is the mortgage loan default? * Mortgage type Crosstabulation

                                                        Mortgage type
Is the mortgage loan default?                           Fixed    Adjustable   Total
0       Count                                           138      124          262
        % within Is the mortgage loan default?          52.7%    47.3%        100.0%
Yes     Count                                           128      110          238
        % within Is the mortgage loan default?          53.8%    46.2%        100.0%
Total   Count                                           266      234          500
        % within Is the mortgage loan default?          53.2%    46.8%        100.0%

Of the 500 property mortgage loans, 53.2% are fixed mortgages with a period of 30 years,
on which the mortgage loan holders pay a fixed interest rate. The other 46.8% are
adjustable rate loans that begin with an introductory interest rate of 3% for the first 5
years, after which the interest rate is based on the current interest rates plus 1%. Of
the mortgage loans in default, 53.8% are fixed mortgage loans; of the loans not in
default, 52.7% are fixed mortgage loans.

15. Normal probability of distribution

There are several tests that can be used to determine whether a selected variable is
approximately normally distributed. We use several of them on our continuous variables:
price, size and FICO.
(1) Q-Q plots for the property size

As long as the points follow approximately along the diagonal line, we conclude that
the data are approximately normally distributed. If the points curve away from the line,
the data are skewed to the right or to the left. If the points follow an "S-curve" shape,
the data are likely to be uniform (flat).

(Source: A Q-Q Plot Dissection Kit)

(2) P-P plot for the price and FICO

As with the Q-Q plot, as long as the points follow approximately along the diagonal line,
we conclude that the data for price and FICO are approximately normally distributed. As
the points in both plots follow an S shape, we conclude the data are likely to be uniform.

16. Confidence interval for the mean

A confidence interval estimate is a range of numbers, called an interval, constructed
around the point estimate. The confidence interval is constructed such that the probability
that the interval includes the population parameter is known. We use the confidence
interval for the mean with the population standard deviation unknown to find out whether
the 95% confidence interval contains the average credit score of 680.
Step 1:
Make sure the assumptions are fulfilled: 1) the population is normally distributed, 2) the
sample has at least 30 observations (n=500)
Step 2: Set the level of significance, α = 0.05, with sample size n = 500 and μ = 680.
Step 3: Since the population standard deviation is not provided, we use the t distribution.
Step 4: Degrees of freedom (df) = 499; the critical value t(α/2) is approximately 1.965
Step 5: Perform the test

One-Sample Statistics
                                           N     Mean     Std. Deviation   Std. Error Mean
Credit score of the mortgage loan holder   500   704.44   67.670           3.026

Descriptives
Credit score of the mortgage loan holder
                                                 Statistic   Std. Error
Mean                                             704.44      3.026
95% Confidence Interval for Mean: Lower Bound    698.49
95% Confidence Interval for Mean: Upper Bound    710.38
5% Trimmed Mean                                  704.47
Median                                           706.00
Variance                                         4579.208
Std. Deviation                                   67.670
Minimum                                          583
Maximum                                          825
Range                                            242
Interquartile Range                              110
Skewness                                         .027        .109
Kurtosis                                         -1.108      .218

From the Descriptives table, we are 95% confident that the population mean credit score
lies somewhere between 698.49 and 710.38. The average credit score of 680 does not fall
within the confidence interval.
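
The bounds reported by SPSS can be reproduced from the summary statistics alone, as mean ± t × standard error. The sketch below uses only the Python standard library. Because the library has no Student t quantile function, the critical value is approximated with a standard series expansion around the normal quantile, which is an assumption of this sketch and is accurate for large degrees of freedom such as 499. The function name `t_critical` is our own.

```python
import math
from statistics import NormalDist


def t_critical(df, confidence=0.95):
    """Approximate two-sided Student t critical value for large df."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    return z + g1 / df + g2 / df**2


# Summary statistics copied from the One-Sample Statistics table.
n = 500
mean = 704.44
std_dev = 67.670

std_error = std_dev / math.sqrt(n)
t_crit = t_critical(n - 1)
lower = mean - t_crit * std_error
upper = mean + t_crit * std_error

print(round(std_error, 3), round(t_crit, 3), round(lower, 2), round(upper, 2))
print("680 inside interval:", lower <= 680 <= upper)
```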

17. One sample T-test of hypothesis testing

Hypothesis testing tests a claim or an assertion about a particular parameter of a
population. We use this method because we want to test whether the average FICO credit
score of the mortgage loan holders is 680.
Step 1:
Null hypothesis, H0: μ = 680
Alternative hypothesis, H1: μ ≠ 680
Step 2: Set the level of significance, α = 0.05, with sample size n = 500
Step 3: Since the population standard deviation is not provided, we use the t-test.
Step 4: Degrees of freedom (df) = 499; the critical value for the two-tailed t-test is
approximately +/-1.965
Step 5: Perform the test

One-Sample Statistics
                                           N     Mean     Std. Deviation   Std. Error Mean
Credit score of the mortgage loan holder   500   704.44   67.670           3.026

One-Sample Test
Test Value = 680
                                                                                      95% Confidence Interval
                                                                                      of the Difference
                                           t       df    Sig. (2-tailed)   Mean Difference   Lower   Upper
Credit score of the mortgage loan holder   8.075   499   .000              24.436            18.49   30.38
Step 6: Make decision
The t value is 8.075, which gives a p-value (2-tailed significance value) of .000. This
is a significant result for any realistic alpha level.
Using the t-value approach, the t value of 8.075 is greater than the critical value of
1.965 and falls within the rejection region, so we reject the null hypothesis. We conclude
that there is sufficient evidence that the mean credit score is not equal to 680.
Using the p-value approach, the p-value of .000 is smaller than the level of significance,
0.05. We again reject the null hypothesis: there is sufficient evidence that the mean
credit score is not equal to 680. The conclusion to reject the null hypothesis is the same
under either method.
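
The t value in the One-Sample Test table is the mean difference divided by the standard error of the mean. A minimal sketch using the figures from the tables above follows. The two-tailed p-value is approximated here with the normal distribution, which is an assumption of this sketch that is reasonable for df = 499; SPSS itself uses the t distribution.

```python
import math
from statistics import NormalDist

# Figures copied from the One-Sample Statistics and One-Sample Test tables.
n = 500
std_dev = 67.670
mean_difference = 24.436   # sample mean minus the test value of 680
critical_value = 1.965     # approximate two-tailed critical t for df = 499

std_error = std_dev / math.sqrt(n)
t_stat = mean_difference / std_error

# Normal approximation to the two-tailed p-value (assumption: large df).
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

reject_null = abs(t_stat) > critical_value
print(round(t_stat, 3), p_value, reject_null)
```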

18. Two sample tests on hypothesis: Independent samples test

The independent samples t test (also called the unpaired samples t test) is the most
common form of the t test. It compares the means of two sets of data. We use this tool
because we want to test whether properties with a swimming pool differ significantly in
market value from properties without one.
Step 1:
Null hypothesis, H0: μ1 = μ2
Alternative hypothesis, H1: μ1 ≠ μ2
* μ1 is the mean market value for properties that have a pool, μ2 is the mean market value
for properties that do not have a pool.
Step 2: Set the level of significance, α = 0.05, with sample size n = 500 (250 properties
in each group)
Step 3: Since the population standard deviation is not provided, we use the t-test.
Step 4: Degrees of freedom (df) = n1 + n2 - 2 = 498; the critical value for the two-tailed
t-test is approximately +/-1.965
Step 5: Perform the test

Group Statistics
                          Does the home have pool?   N     Mean        Std. Deviation   Std. Error Mean
Market price in dollars   Yes                        250   565851.65   209453.192       13246.983
                          0                          250   554104.20   222716.831       14085.849

Step 6: Make decision

Properties with a swimming pool have a mean market value of $565,851.65, compared with
$554,104.20 for properties without a pool. SPSS reports a t value of .608 and a 2-tailed
p-value of .544.
Using the t-value approach, the t value of 0.608 is below the critical value of 1.965 and
falls within the non-rejection region, so we do not reject the null hypothesis. We conclude
that there is insufficient evidence that the mean market values of the two groups of
properties differ.
Using the p-value approach, the p-value of .544 is larger than the level of significance,
0.05, so again we do not reject the null hypothesis. There is no statistically significant
difference in market value between properties that have a swimming pool and properties
that do not. The conclusion not to reject the null hypothesis is the same under either
method.
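
The reported t value of .608 can be reproduced from the Group Statistics table with the pooled-variance formula for two independent samples. The sketch below is a hand calculation under the equal-variance assumption, not the SPSS procedure itself; the function name is our own, and the p-value uses a normal approximation, which is an assumption that is reasonable for 498 degrees of freedom.

```python
import math
from statistics import NormalDist


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, degrees of freedom) for a pooled-variance t test."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Means, standard deviations and group sizes copied from Group Statistics.
t_stat, df = pooled_t_test(565851.65, 209453.192, 250,   # homes with a pool
                           554104.20, 222716.831, 250)   # homes without a pool

# Normal approximation to the two-tailed p-value (assumption: large df).
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(round(t_stat, 3), df, round(p_value, 3))
```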

19. One-way ANOVA

One-way ANOVA, also known as analysis of variance, allows you to compare multiple
populations, unlike the hypothesis-testing methods mentioned before. We take samples from
each group to examine the effects of differences among two or more groups. The criteria
that distinguish the groups are called factors, or factors of interest. Factors contain
levels, which are analogous to the categories of a categorical variable.
We select this tool because we want to examine differences among more than 2 groups: the
value of property sales (market value in dollars) achieved by the 5 different agents. At
the 0.05 significance level, is there a difference in mean property values? The question
does not state a null and an alternative hypothesis. For a one-way ANOVA, the implied null
hypothesis is that all five agents have the same mean market value, against the
alternative that at least one mean differs.

Descriptives
Market price in dollars
                                                           95% Confidence Interval for Mean
           N     Mean        Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Carter     119   590969.21   221647.848       20318.425    550733.20     631205.22     171642    909687
Isaacs     100   569605.64   214708.514       21470.851    527002.81     612208.47     176815    917088
Marty      92    554026.32   217477.100       22673.555    508988.08     599064.55     185294    908010
Peterson   104   515099.80   220478.429       21619.689    472222.24     557977.35     168256    918562
Rose       85    566614.86   197613.315       21434.174    523990.65     609239.06     183757    917593
Total      500   559977.93   216050.045       9662.052     540994.61     578961.24     168256    918562

The table shows the mean market value of the properties sold by each of the five agents.
Carter recorded the highest mean value, $590,969.21, while Peterson recorded the lowest
mean market value, $515,099.80.

Robust Tests of Equality of Means

Market price in dollars
        Statistic (a)   df1   df2       Sig.
Welch   1.736           4     243.533   .143
a. Asymptotically F distributed.

A requirement for the ANOVA test is that the variances of the comparison groups are
equal. The Levene statistic tests this assumption of homogeneity of variance. The table
above, however, reports Welch's robust test of equality of means, which does not rely on
equal variances. Its significance value of .143 is greater than .05, so the group means
are not statistically different even without assuming homogeneity of variance, which
supports treating the ANOVA result as robust.

ANOVA
Market price in dollars
                 Sum of Squares        df    Mean Square       F       Sig.
Between Groups   340027761044.550      4     85006940261.138   1.833   .121
Within Groups    22952105667159.710    495   46367890236.686
Total            23292133428204.260    499

Since it is a one-way ANOVA, we use the F test statistic. The formula is Fstat = MSA/MSW,
where MSA = SSA/(c-1) and MSW = SSW/(n-c).
                 Sum of Squares   df    Mean Square       F       Sig.
Between Groups   SSA              c-1   MSA=SSA/(c-1)     Fstat   p-value
Within Groups    SSW              n-c   MSW=SSW/(n-c)
Total            SST              n-1

In the ANOVA table above, MSA is 85006940261.138 and MSW is 46367890236.686. Therefore,
the value of F is 1.833, with a p-value of .121. Because the p-value is more than the .05
alpha level, the result does not reach significance: there is no statistically significant
difference between the means of the different levels of the agent variable.
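
The mean squares and the F statistic can be recomputed directly from the sums of squares in the ANOVA table, following Fstat = MSA/MSW. A minimal sketch with the values copied from the table; c is the number of agents and n the number of transactions:

```python
# Sums of squares copied from the ANOVA table.
ssa = 340027761044.550        # between groups
ssw = 22952105667159.710      # within groups
c = 5                         # number of groups (agents)
n = 500                       # number of observations

sst = ssa + ssw               # total sum of squares
msa = ssa / (c - 1)           # mean square between groups, df = c - 1
msw = ssw / (n - c)           # mean square within groups, df = n - c
f_stat = msa / msw

print(round(msa, 3), round(msw, 3), round(f_stat, 3))
```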

We look at the result of the post hoc Tukey HSD test to know between which of the various
pairs of means the difference is significant.

Multiple Comparisons
Dependent Variable: Market price in dollars
                                                                             95% Confidence Interval
               (I) Agent   (J) Agent   Mean Difference (I-J)   Std. Error   Sig.    Lower Bound   Upper Bound
Tukey HSD      Carter      Isaacs      21363.570               29211.728    .949    -58614.04     101341.18
                           Marty       36942.895               29893.895    .730    -44902.39     118788.18
                           Peterson    75869.412               28904.865    .067    -3268.05      155006.88
                           Rose        24354.351               30580.234    .932    -59370.04     108078.74
               Isaacs      Carter      -21363.570              29211.728    .949    -101341.18    58614.04
                           Marty       15579.325               31107.519    .987    -69588.70     100747.34
                           Peterson    54505.842               30158.316    .370    -28063.39     137075.08
                           Rose        2990.781                31767.649    1.000   -83984.58     89966.14
               Marty       Carter      -36942.895              29893.895    .730    -118788.18    44902.39
                           Isaacs      -15579.325              31107.519    .987    -100747.34    69588.70
                           Peterson    38926.517               30819.538    .714    -45453.05     123306.09
                           Rose        -12588.544              32396.040    .995    -101284.35    76107.27
               Peterson    Carter      -75869.412              28904.865    .067    -155006.88    3268.05
                           Isaacs      -54505.842              30158.316    .370    -137075.08    28063.39
                           Marty       -38926.517              30819.538    .714    -123306.09    45453.05
                           Rose        -51515.061              31485.706    .475    -137718.51    34688.38
               Rose        Carter      -24354.351              30580.234    .932    -108078.74    59370.04
                           Isaacs      -2990.781               31767.649    1.000   -89966.14     83984.58
                           Marty       12588.544               32396.040    .995    -76107.27     101284.35
                           Peterson    51515.061               31485.706    .475    -34688.38     137718.51
Games-Howell   Carter      Isaacs      21363.570               29560.715    .951    -59967.64     102694.78
                           Marty       36942.895               30445.500    .744    -46877.64     120763.43
                           Peterson    75869.412               29668.996    .082    -5744.59      157483.41
                           Rose        24354.351               29534.085    .923    -56977.74     105686.44
               Isaacs      Carter      -21363.570              29560.715    .951    -102694.78    59967.64
                           Marty       15579.325               31226.392    .987    -70430.64     101589.29
                           Peterson    54505.842               30469.795    .383    -29364.69     138376.38
                           Rose        2990.781                30338.445    1.000   -80602.14     86583.70
               Marty       Carter      -36942.895              30445.500    .744    -120763.43    46877.64
                           Isaacs      -15579.325              31226.392    .987    -101589.29    70430.64
                           Peterson    38926.517               31328.917    .726    -47350.72     125203.75
                           Rose        -12588.544              31201.184    .994    -98592.60     73415.52
               Peterson    Carter      -75869.412              29668.996    .082    -157483.41    5744.59
                           Isaacs      -54505.842              30469.795    .383    -138376.38    29364.69
                           Marty       -38926.517              31328.917    .726    -125203.75    47350.72
                           Rose        -51515.061              30443.960    .441    -135382.49    32352.37
               Rose        Carter      -24354.351              29534.085    .923    -105686.44    56977.74
                           Isaacs      -2990.781               30338.445    1.000   -86583.70     80602.14
                           Marty       12588.544               31201.184    .994    -73415.52     98592.60
                           Peterson    51515.061               30443.960    .441    -32352.37     135382.49

Significance values have been generated for the mean differences in market value between
each pair of agents, in both directions (for example, Carter - Rose and Rose - Carter).
In the Tukey HSD (Honestly Significant Difference) test, the mean difference between
Carter and Peterson comes closest to significance (p = .067), but no pair reaches the .05
level. This is consistent with the overall ANOVA p-value of .121.

There was no statistically significant difference between groups as demonstrated by
one-way ANOVA (F(4,495) = 1.833, p = .121). A Tukey post hoc test showed that Carter sold
properties with a higher mean market value than Peterson, but the difference was not
statistically significant at the .05 level (p = .067). There was also no statistically
significant difference between Carter and Isaacs (p = .949), Carter and Marty (p = .730)
or Carter and Rose (p = .932). This is supported by the diagram below, which shows the
relationship between agents and mean market price in dollars.

20. Linear multiple regression
Multiple regression models use two or more independent variables to predict the value of
a dependent variable. We use linear multiple regression to find out how the square feet of
a property, its number of bathrooms and its number of bedrooms affect its market price in
dollars.

Variables Entered/Removed (a)
Model   Variables Entered                                                    Variables Removed   Method
1       Number of bedrooms, Square feet of property, Number of bathroom (b)  .                   Enter
a. Dependent Variable: Market price in dollars
b. All requested variables entered.

Based on the table, the dependent variable is market price in dollars, and the independent
variables are square feet of property, number of bathrooms and number of bedrooms.

Coefficients (a)
                                 Unstandardized Coefficients   Standardized Coefficients
Model                            B            Std. Error       Beta                        t        Sig.
1   (Constant)                   507521.434   45985.934                                    11.036   .000
    Square feet of property      7.923        5.530            .064                        1.433    .153
    Number of bathroom           4448.753     6861.117         .029                        .648     .517
    Number of bedrooms           -376.957     4848.053         -.003                       -.078    .938
a. Dependent Variable: Market price in dollars

The regression equation for market price in dollars is 507,521.44 + 7.92 × square feet of
property + 4,448.75 × number of bathrooms - 376.96 × number of bedrooms.
ŷ = a + β1x1 + β2x2 + β3x3
ŷ = market price in dollars
x1 = square feet of property
x2 = number of bathrooms
x3 = number of bedrooms
a = estimate of ŷ when all independent variables are 0
Therefore, ŷ = 507,521.44 + 7.92 x1 + 4,448.75 x2 - 376.96 x3
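
To show how the fitted equation is used, the sketch below plugs a property's characteristics into the unstandardized coefficients from the Coefficients table. The example property (4,000 square feet, 3 bathrooms, 4 bedrooms) is hypothetical, and the function name `predict_price` is our own.

```python
# Unstandardized coefficients (B) copied from the Coefficients table.
INTERCEPT = 507521.434
B_SQUARE_FEET = 7.923
B_BATHROOMS = 4448.753
B_BEDROOMS = -376.957


def predict_price(square_feet, bathrooms, bedrooms):
    """Predicted market price in dollars from the fitted regression equation."""
    return (INTERCEPT
            + B_SQUARE_FEET * square_feet
            + B_BATHROOMS * bathrooms
            + B_BEDROOMS * bedrooms)


# Hypothetical property, for illustration only.
example_price = predict_price(square_feet=4000, bathrooms=3, bedrooms=4)
print(round(example_price, 2))
```

Most of the prediction comes from the constant term, which reflects how little the three predictors contribute in this model.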

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .070 (a)   .005       -.001               216173.156
a. Predictors: (Constant), Number of bedrooms, Square feet of property, Number of bathroom

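R square is the coefficient of determination. It measures the proportion of the total
variation in the dependent variable that is explained by variation in the independent
variables.
An R square of 0.005 tells us that only 0.5% of the variation in market price in dollars is
explained by the variation in number of bedrooms, square feet of property and number of
bathrooms. Such a low R square indicates a very weak linear relationship between the
dependent variable and the independent variables. The adjusted R square of -.001 and the
significance values in the Coefficients table (.153, .517 and .938, all above .05) point the
same way: none of the three variables is a statistically significant predictor of market
price in this sample.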
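The adjusted R square in the Model Summary can be reproduced from R with the standard adjustment formula, 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below assumes that all 500 transactions entered the regression (n = 500) with k = 3 predictors; the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Return R square adjusted for the number of predictors in the model."""
    return 1 - (1 - r_squared) * (n_observations - 1) / (
        n_observations - n_predictors - 1
    )


r = 0.070            # R from the Model Summary table
r_squared = r ** 2   # about 0.0049, shown by SPSS as .005
n = 500              # assumption: every sampled transaction was used
k = 3                # bedrooms, square feet, bathrooms

adj = adjusted_r_squared(r_squared, n, k)
print(f"R square = {r_squared:.3f}, adjusted R square = {adj:.3f}")
```

A negative adjusted R square arises when the predictors explain less variation than would be expected by chance for that many variables, which is consistent with the weak relationship described above.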

CONCLUSION

As a result, Carter has the highest number of sales transactions at 119, followed by
Peterson (104), Isaacs (100), Marty (92) and Rose (85) in this sample of 500 sales. Of the
500 properties in the sample, most are located in Area 5 (116), followed by Area 4 (115),
Area 3 (104) and Area 2 (92), with the fewest in Area 1 (73). The properties are spread
fairly evenly across every area of town, so we cannot conclude which area is most
attractive to buyers. Carter has the best working performance, with the most sales
transactions (119) and the highest mean market value of $590,969.21. Further, Carter's
transactions were mainly for properties located in Area 2 (25%), Area 3 (27.9%) and
Area 4. Marty was largely in charge of properties located in Area 5. Peterson was mainly
responsible for properties located in Area 1; although Peterson handled the second-highest
number of transactions (104), the mean market value of the properties sold by him was
$515,099.80.
As the pie chart demonstrates, 50% of the properties (250) do not have a pool while the
remaining 50% do. However, the analysis found no difference in market value between
properties with a swimming pool and properties without one. We therefore do not recommend
that the company build swimming pools in order to increase market value.
The number of bedrooms in a property does not appear to affect how readily it sells, as
the properties are spread fairly evenly across every bedroom category. Properties with 2
bedrooms and properties with 3 bedrooms each number 68 (13.6%). 78 properties (15.6%) have
4 bedrooms and 74 (14.8%) have 5 bedrooms. In addition, 72 properties (14.4%) have 6
bedrooms, 61 (12.2%) have 7 bedrooms and 79 (15.8%) have 8 bedrooms. We cannot conclude
whether having a pool, having an attached garage, or any particular number of bedrooms or
bathrooms makes a property more attractive to buyers, since the percentages are fairly
equal across the categories of each variable.
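The bedroom percentages quoted above can be recomputed from the counts, and compared with the share each category would have if properties were spread perfectly evenly. This is a minimal sketch using only the counts stated in this conclusion; the variable names are illustrative.

```python
# Number of properties by bedroom count, as reported in the conclusion.
bedroom_counts = {2: 68, 3: 68, 4: 78, 5: 74, 6: 72, 7: 61, 8: 79}

total = sum(bedroom_counts.values())

# Percentage of the sample in each bedroom category, to one decimal place.
percentages = {
    bedrooms: round(100 * count / total, 1)
    for bedrooms, count in bedroom_counts.items()
}

# Share each category would hold under a perfectly even spread.
even_share = round(100 / len(bedroom_counts), 1)

for bedrooms, pct in percentages.items():
    print(f"{bedrooms} bedrooms: {pct}% (even share would be {even_share}%)")
```

The counts sum to 500, and every category lies within about two percentage points of the even share of 14.3%.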
The frequency distribution indicates that most of the properties were sold within 60
days; day 20 recorded the highest frequency, followed by day 32.
The shares of defaulted and non-defaulted mortgage loans are fairly equal, at 53.2% and
46.8% respectively. The same applies to fixed-interest and adjustable-interest mortgage
loans, which account for 52.4% and 47.6% of the 500 sampled transactions. The number of
years mortgage loan holders take to pay off the debt is spread fairly evenly from 1 year
to 20 years.
FICO, the credit score of the mortgage loan holder, reflects a person’s ability to pay their
debts. The highest score is 850; an average score is 680; a low score is below 680.
According to the SPSS calculation, the sample mean is 704.4, and we can be 95% confident
that the population mean credit score lies between 698.49 and 710.38.
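A confidence interval for the mean is centred on the sample mean and extends one margin of error to each side. The sketch below works backwards from the reported limits to recover the centre, the margin of error and the approximate standard error they imply. It uses the normal critical value as an approximation; SPSS uses the t distribution, which differs only slightly for a sample of 500.

```python
from statistics import NormalDist

# 95% confidence limits for the mean FICO score, as reported above.
lower, upper = 698.49, 710.38
confidence = 0.95

midpoint = (lower + upper) / 2   # centre of the interval (the sample mean)
margin = (upper - lower) / 2     # margin of error on each side

# Two-sided critical value from the standard normal distribution
# (an approximation to the t value SPSS would use).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Standard error of the mean implied by the margin of error.
implied_se = margin / z

print(f"Centre of interval: {midpoint:.2f}")
print(f"Margin of error:    {margin:.3f}")
print(f"Implied std. error: {implied_se:.2f}")
```

The centre of the interval matches the reported mean of 704.4 after rounding, which confirms that 704.4 is the sample estimate around which the interval was built.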
We conclude that the company should set up a strategy for long-term prosperity, such as
raising the selling price of properties that have more bathrooms, have a swimming pool or
are located in certain areas, in order to gain more revenue. The company can also train
agents to become experts in certain types of property, such as affordable or luxury
properties.
REFERENCES

A Q-Q Plot Dissection Kit [Link]

[Link]
Box plot [Link]
Confidence Intervals in SPSS [Link]
Independent and Dependent Variables [Link]and-dependent-variables/
Levine, D.M., Krehbiel, T.C. & Berenson, M.L. (2016). Business Statistics: A First Course (7th ed). Upper Saddle River, NJ: Pearson Global Edition.
Measures of Relative Standing [Link]statistics/chapter/measures-of-relative-standing/
Normal Probability Plots in SPSS [Link]
Ogive Graph / Cumulative Frequency Polygon in Easy Steps [Link]
One Way ANOVA in SPSS Including Interpretation [Link]in-spss-including-interpretation/
Scatter Plots [Link]
Skewness [Link]
SPSS TUTORIALS: CROSSTABS [Link]
Types of data [Link]
What is Skewness? [Link]

APPENDICES
1) Data View (Data 3)

2) Variables View

