Real Estate Agent Performance Analysis
This report presents our findings on the performance of the agents currently employed by
Real Estate Sdn Bhd and on the types of properties sold in a sample of 500 past business
transactions. The agents currently employed by Real Estate Sdn Bhd are Carter, Isaacs,
Marty, Peterson and Rose.
We used SPSS to run 20 statistical analyses on the data. Before carrying out the analysis
and deciding which tools to use, we need to understand the type of each variable.
(See Appendices 1 and 2 for the data view and variable view in SPSS.)
... a counted or measured quantity
   d) Years - the number of years that the mortgage loan has been paid
3  Continuous variables arise from a measuring process; their values depend on the
   precision of the measuring instrument used
   a) Price - market price in dollars
   b) Size - livable square feet of the property
   c) FICO - the credit score of the mortgage loan holder. The highest score is 850; an
      average score is 680; a low score is below 680. The score reflects a person’s
      ability to pay their debts.
1. Frequency Table
Statistics
Agent
N   Valid     500
    Missing     0
Agent
           Frequency   Percent   Valid Percent   Cumulative Percent
Carter           119      23.8            23.8                 23.8
Isaacs           100      20.0            20.0                 43.8
Marty             92      18.4            18.4                 62.2
Peterson         104      20.8            20.8                 83.0
Rose              85      17.0            17.0                100.0
Total            500     100.0           100.0
The table above is a frequency table for a qualitative variable, Agent. The sample
contains 500 sales transactions, divided among five agents: Carter, Isaacs, Marty,
Peterson and Rose. Carter has the highest frequency, 119, which means Carter completed
the most property sales. Peterson follows with 104, then Isaacs with 100 and Marty with
92. Rose has the lowest frequency, 85, and so completed the fewest property sales.
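The Percent and Cumulative Percent columns follow directly from the counts. The sketch below shows one way to reproduce them; the function name `frequency_table` is a hypothetical helper for illustration, not part of SPSS.

```python
# Sales counts per agent, copied from the frequency table above.
counts = {"Carter": 119, "Isaacs": 100, "Marty": 92, "Peterson": 104, "Rose": 85}


def frequency_table(counts):
    """Return rows of (category, frequency, percent, cumulative percent)."""
    total = sum(counts.values())
    rows = []
    running = 0
    for category, freq in counts.items():
        running += freq
        percent = round(100 * freq / total, 1)
        cumulative = round(100 * running / total, 1)
        rows.append((category, freq, percent, cumulative))
    return rows


table = frequency_table(counts)
for row in table:
    print(row)
```

Because there are no missing values, Percent and Valid Percent are identical here.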
2. Bar Chart
The bar chart above shows the area of town of the properties owned by the company. The
variable is area of town, a categorical variable. Most of the properties are located in
Area 5 (116), followed by Area 4 (115), Area 3 (104) and Area 2 (92); the fewest are
located in Area 1 (73). We use a bar chart because it visualizes a categorical variable,
with the length of each bar representing the tally for that category. It is also useful
for showing relative values between categories.
3. Pie Chart
The pie chart above indicates whether each of the 500 sample properties in the sales
transactions has a pool. It shows that 50% of the properties have a pool and 50% do not,
so the two categories are in equal proportion. We use a pie chart because it is visually
simpler than other types of graph. A pie chart uses the parts of a circle to represent
the percentage of each category of a categorical variable.
4. Frequency distribution
A frequency distribution shows how frequencies are distributed over values. It can be
presented as a list, table or graph. The frequency distribution above is displayed as a
bar chart of the number of days taken to sell a property, across all agents in the
company. The lowest value on the graph is 2 days and the highest is 60 days. Day 20 has
the highest frequency, followed by day 32; day 14 has the lowest frequency. We use days
as the variable because we want to see how long it takes for a property to sell, and the
chart also shows the shape of the distribution.
5. Histogram
Bedroom
        Frequency   Percent   Valid Percent   Cumulative Percent
2              68      13.6            13.6                 13.6
3              68      13.6            13.6                 27.2
4              78      15.6            15.6                 42.8
5              74      14.8            14.8                 57.6
6              72      14.4            14.4                 72.0
7              61      12.2            12.2                 84.2
8              79      15.8            15.8                100.0
Total         500     100.0           100.0
6. Frequency Polygon
A frequency polygon is a chart for understanding the shape of a distribution and is
helpful for comparing sets of data. It represents a quantitative variable, here the price
of the properties, and is drawn by connecting the midpoints of each class. From the
frequency polygon above, we can see that the highest peak lies above $300,000: the
highest frequency is in the $300,000 to $350,000 class, followed by $500,000 to $550,000
and then $800,000 to $850,000.
We use a frequency polygon because two or more frequency polygons can be superimposed on
the same axes to make comparisons between sets of data.
7. Cumulative Frequency Polygon
A cumulative frequency polygon is also known as an ogive (oh-jive). It is a type of
frequency polygon that shows cumulative frequencies, drawn as a curve for a given set of
data. For ungrouped data, the cumulative frequency is plotted on the y-axis against the
data values on the x-axis. This graph shows the cumulative frequency of property prices
in the sales transactions, from the smallest price to the largest. The highest sale
value of a property among the 500 sales transactions is over $900,000.
8. Central tendency
Statistics
                      Price              Size
N            Valid    500                500
             Missing  0                  0
Mean                  $559,977.93        4644.20
Median                $569,929.50        4660.00
Mode                  $168,256(a)        3141(a)
Std. Deviation        $216,050.045       1750.683
Variance              46677622100.610    3064890.871
Range                 $750,306           6090
Percentiles  25       $360,501.00        3086.25
             50       $569,929.50        4660.00
             75       $751,988.50        6093.75
a. Multiple modes exist. The smallest value is shown.
Central tendency is the extent to which the values of a numerical variable group around
a typical, or central, value. In other words, it describes the center of a data set.
There are three measures of central tendency: the mean, the median and the mode. The
mean is the “balance point” of a set of data.
Mean = Sum of the values / Number of values
The table shows that the mean price of the properties is $559,977.93 and the mean
property size is 4,644.20 square feet.
The median is the middle value in an ordered array of data that has been ranked from
smallest to largest.
Median = the (n+1)/2 ranked value
The table shows that the median price of the properties is $569,929.50 and the median
property size is 4,660.00 square feet. The mode is the value that appears most
frequently. Multiple modes exist for both variables, and the smallest is reported: the
mode of price is $168,256 and the mode of property size is 3,141 square feet.
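As a small illustration of the three measures, the sketch below applies Python's `statistics` module to a short list of prices. The prices are hypothetical and are not taken from the report's data set; they are chosen so that two modes exist, mirroring the "smallest value is shown" note in the SPSS table.

```python
import statistics

# Hypothetical prices in dollars (NOT the report's data), already in ranked order.
prices = [250_000, 310_000, 310_000, 420_000, 560_000, 560_000, 780_000]

# Mean = sum of the values / number of values.
mean_price = statistics.mean(prices)

# Median = the (n + 1) / 2 ranked value; here n = 7, so the 4th value.
median_price = statistics.median(prices)

# multimode returns every value tied for the highest frequency.
modes = statistics.multimode(prices)

# When several modes exist, SPSS reports the smallest one.
smallest_mode = min(modes)

print(mean_price, median_price, modes, smallest_mode)
```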
9. Dot Plot
A dot plot is a simple histogram-like chart used in statistics for relatively small data
sets whose values fall into a number of discrete bins (categories). Each value is
represented by a dot. In the dot plot above, each dot represents the credit score (FICO)
of one mortgage loan holder. The scores are most concentrated between 670 and 690. The
highest score is 850; an average score is 680; a low score is below 680. The score
reflects a person’s ability to pay their debts. Overall, the dot plot shows that the
credit scores of the mortgage loan holders are fairly evenly distributed. We use a dot
plot because it is less cluttered.
10. Percentiles
Statistics
years
N            Valid    500
             Missing  0
Mean                  10.67
Median                10.00
Percentiles  25       6.00
             50       10.00
             75       16.00
Percentiles split a variable into 100 equal parts. The first quartile is equivalent to
the 25th percentile, the second quartile to the 50th percentile and the third quartile to
the 75th percentile. Here the 25th percentile is 6, the 50th percentile is 10 and the
75th percentile is 16. Percentiles can be used to describe the distribution of the values
of a numerical variable. We can therefore describe the distribution of the number of
years that the mortgage loan has been paid by using the five-number summary.
Five-number summary:
X smallest - Q1 - Median - Q3 - X largest
For the number of years that the mortgage loan has been paid:
1 - 6 - 10 - 16 - 20
Since the distance from Q1 to the median (10 - 6 = 4) is less than the distance from the
median to Q3 (16 - 10 = 6), the distribution of years of mortgage loan is right-skewed.
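The quartile comparison used above can be written as a short check. This is a minimal sketch of that rule only; `shape_from_quartiles` is a hypothetical helper name, and the rule is a rough guide rather than a formal test.

```python
def shape_from_quartiles(q1, median, q3):
    """Classify a distribution by comparing the two halves of the middle 50%."""
    left = median - q1    # distance from Q1 to the median
    right = q3 - median   # distance from the median to Q3
    if right > left:
        return "right-skewed"
    if right < left:
        return "left-skewed"
    return "symmetric"


# Five-number summary for years of mortgage loan paid, from the text above.
x_smallest, q1, median, q3, x_largest = 1, 6, 10, 16, 20

shape = shape_from_quartiles(q1, median, q3)
print(shape)
```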
11. Box Plot
12. Skewness
• Skewness > 0 – Right-skewed distribution – most values are concentrated on
the left of the mean, with extreme values to the right.
• Skewness < 0 – Left-skewed distribution – most values are concentrated on
the right of the mean, with extreme values to the left.
• Skewness = 0 – mean = median; the distribution is symmetrical around the
mean.
Descriptive Statistics
                      N           Skewness
                      Statistic   Statistic   Std. Error
Number of bathroom    500         .066        .109
Valid N (listwise)    500
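The standard error in the table can be checked from the sample size alone. The sketch below uses the usual formula for the standard error of sample skewness; that SPSS uses exactly this formula is an assumption here, although the result agrees with the reported .109. Dividing the statistic by its standard error gives a rough z-score for judging whether the skewness differs from zero.

```python
import math


def skewness_std_error(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


n = 500           # sample size from the table
skewness = 0.066  # skewness statistic for number of bathrooms

se = skewness_std_error(n)
z = skewness / se  # rough z-score; |z| below about 1.96 suggests near symmetry

print(round(se, 3), round(z, 2))
```

With a z-score well below 1.96, the number of bathrooms can be treated as approximately symmetric.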
13. Scatter diagram
A scatter diagram (or "scatter plot") is a graph used to plot the relationship between
two continuous variables, with the x-axis representing one variable and the y-axis
representing the other.
We use a scatter diagram because it makes the relationship between variables easy to
see. This diagram shows the relationship between the price and the size of the
property.
If the points cluster in a band running from lower left to upper right, there is a
positive correlation (if x increases, y increases). If the points cluster in a band from
upper left to lower right, there is a negative correlation (if x increases, y
decreases). Imagine drawing a straight line or curve through the data so that it "fits"
as well as possible. The more closely the points cluster around this imaginary line of
best fit, the stronger the relationship between the two variables. If it is hard to see
where you would draw a line, and the points show no significant clustering, there is
probably no correlation. In conclusion, there is no apparent correlation between the
price and the size of the property; we are unable to draw a straight line through the
points in the graph.
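The direction and strength described above are usually summarized by the Pearson correlation coefficient r, which lies between -1 and +1. The sketch below is a minimal implementation applied to made-up points; the data are purely illustrative and are not the report's price and size values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative points only (hypothetical, not the report's data).
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # band from lower left to upper right
falling = [10, 8, 6, 4, 2]  # band from upper left to lower right

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
print(r_rising, r_falling)
```

Values of r near zero correspond to the "no significant clustering" case described above.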
14. Contingency table
From this table, we can see that Agent 1, Carter, handled the most property sales
transactions. His properties are mainly located in Area 2 (25%), Area 3 (27.9%) and
Area 4 (27.8%), with fewer located in Area 5 (15.5%). Agent 3, Marty, is largely in
charge of properties located in Area 5 (25% of the properties in Area 5 are handled by
him). Agent 4, Peterson, is mainly responsible for properties located in Area 1 (26%).
                                                               Fixed    Adjustable   Total
Is the mortgage  Yes  Count                                    128      110          238
loan default?         % within Is the mortgage loan default?   53.8%    46.2%        100.0%
Total                 Count                                    266      234          500
                      % within Is the mortgage loan default?   53.2%    46.8%        100.0%
Of the 500 mortgage loans, 53.2% are fixed-rate mortgages with a term of 30 years, on
which the loan holder pays a fixed interest rate. The other 46.8% are adjustable-rate
loans that begin with an introductory interest rate of 3% for the first 5 years, after
which the rate is the current interest rate plus 1%. Among the loans in default, 53.8%
are fixed-rate mortgages; among the loans not in default, a similar majority of 52.7%
are fixed-rate mortgages.
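The row percentages can be recomputed from the counts. In the sketch below, the "Yes" and "Total" counts come from the table; the column labels (fixed, adjustable) are inferred from the paragraph above, and the "No" row is derived by subtraction because it is not shown in the table.

```python
# Counts from the crosstab: loans in default ("Yes") and all loans ("Total").
# Column labels are inferred from the accompanying paragraph.
yes = {"fixed": 128, "adjustable": 110}
total = {"fixed": 266, "adjustable": 234}

# The "No" row is not shown, so derive it as Total minus Yes.
no = {loan_type: total[loan_type] - yes[loan_type] for loan_type in total}


def row_percent(row):
    """Percent of each column within a single row, rounded to one decimal."""
    n = sum(row.values())
    return {key: round(100 * value / n, 1) for key, value in row.items()}


yes_pct = row_percent(yes)
no_pct = row_percent(no)
total_pct = row_percent(total)
print(yes_pct, no_pct, total_pct)
```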
As long as the points lie approximately along the diagonal line, we conclude that the
data are approximately normally distributed. Points that bend away from the line on one
side indicate data skewed to the right or to the left. If the points follow an
“S-curve” shape, the data are likely to be uniform (flat).
The same reading applies here as for the Q-Q plot: points lying approximately along the
diagonal line would indicate that the data for price and FICO are approximately normally
distributed. Since the points for both variables form an S shape, we conclude that the
data are likely to be uniform.
One-Sample Statistics
                                            N     Mean     Std. Deviation   Std. Error Mean
Credit score of the mortgage loan holder    500   704.44   67.670           3.026
Descriptives
Credit score of the mortgage loan holder
                                                 Statistic   Std. Error
Mean                                             704.44      3.026
95% Confidence Interval for Mean   Lower Bound   698.49
                                   Upper Bound   710.38
5% Trimmed Mean                                  704.47
Median                                           706.00
Variance                                         4579.208
Std. Deviation                                   67.670
Minimum                                          583
Maximum                                          825
Range                                            242
Interquartile Range                              110
Skewness                                         .027        .109
Kurtosis                                         -1.108      .218
From the tables above, we are 95% confident that the population mean credit score lies
somewhere between 698.49 and 710.38. The average credit score of 680 does not fall
within this confidence interval.
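The interval can be approximated from the summary statistics alone. The sketch below uses the standard normal quantile from Python's standard library instead of the t distribution that SPSS uses; with n = 500 the two are very close, so the bounds agree with the table to within a few hundredths. The variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics from the One-Sample Statistics table.
n = 500
mean = 704.44
sd = 67.670

# Standard error of the mean.
se = sd / math.sqrt(n)

# Normal approximation to the 97.5th percentile (SPSS uses t with df = 499).
z = NormalDist().inv_cdf(0.975)

lower = mean - z * se
upper = mean + z * se
print(round(se, 3), round(lower, 2), round(upper, 2))
```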
Step 4: Degrees of freedom (df) = 499; the critical value for a two-tailed t-test at the
0.05 significance level is ±1.965
Step 5: Perform the test
One-Sample Statistics
                                            N     Mean     Std. Deviation   Std. Error Mean
Credit score of the mortgage loan holder    500   704.44   67.670           3.026
One-Sample Test
Test Value = 680
                                                                                     95% Confidence Interval
                                                                                     of the Difference
                                            t       df    Sig. (2-tailed)   Mean Difference   Lower   Upper
Credit score of the mortgage loan holder    8.075   499   .000              24.436            18.49   30.38
Step 6: Make decision
The t value is 8.075, which gives a p-value (2-tailed significance value) of .000. This
is a significant result for any realistic alpha level.
Using the t-value approach, the t value of 8.075 exceeds the critical value of 1.965 and
falls within the rejection region, so we reject the null hypothesis. We conclude that
there is sufficient evidence that the mean credit score is not equal to 680.
Using the p-value approach, the p-value of .000 is smaller than the 0.05 level of
significance, so we again reject the null hypothesis and conclude that there is
sufficient evidence that the mean credit score is not equal to 680. Both methods lead to
the same conclusion.
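The t statistic in the One-Sample Test table can be reproduced from the summary statistics. In this sketch the sample mean is taken as the test value plus the reported mean difference (680 + 24.436), which carries one more decimal than the rounded mean of 704.44; the variable names are illustrative.

```python
import math

# Values from the One-Sample Statistics and One-Sample Test tables.
n = 500
test_value = 680
sample_mean = 704.436  # test value plus the reported mean difference of 24.436
sd = 67.670

# Standard error of the mean, then the one-sample t statistic.
se = sd / math.sqrt(n)
t_stat = (sample_mean - test_value) / se
df = n - 1

print(round(t_stat, 3), df)
```

A statistic this large lies far beyond the critical value at any conventional significance level, which is why the reported p-value rounds to .000.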
Group Statistics
                          Does the home
                          have pool?      N     Mean        Std. Deviation   Std. Error Mean
Market price in dollars   Yes             250   565851.65   209453.192       13246.983
                          0               250   554104.20   222716.831       14085.849
One-way ANOVA (analysis of variance) allows us to compare the means of multiple
populations, unlike the hypothesis-testing methods mentioned before. We take samples
from each group to examine the differences among two or more groups. The criteria that
distinguish the groups are called factors, or factors of interest. Factors contain
levels, which are analogous to the categories of a categorical variable.
We select this tool because we want to examine differences among more than two groups:
whether the five agents differ in the value of the properties they sell (market price in
dollars). At the 0.05 significance level, is there a difference in mean property
values? No null and alternative hypotheses were provided in the question, so we cannot
carry out a formal hypothesis test.
Descriptives
Market price in dollars
                                                          95% Confidence Interval for Mean
           N     Mean        Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Carter     119   590969.21   221647.848       20318.425    550733.20     631205.22     171642    909687
Isaacs     100   569605.64   214708.514       21470.851    527002.81     612208.47     176815    917088
Marty      92    554026.32   217477.100       22673.555    508988.08     599064.55     185294    908010
Peterson   104   515099.80   220478.429       21619.689    472222.24     557977.35     168256    918562
Rose       85    566614.86   197613.315       21434.174    523990.65     609239.06     183757    917593
Total      500   559977.93   216050.045       9662.052     540994.61     578961.24     168256    918562
The table shows the mean market value of the properties sold by each of the five
agents. Carter recorded the highest mean value, $590,969.21, while Peterson recorded the
lowest, $515,099.80.
A requirement for the ANOVA test is that the variances of the comparison groups are
equal. The Levene statistic tests this assumption of homogeneity of variance. Its
significance value of 0.143 is greater than 0.05, so the group variances are not
significantly different. The requirement of homogeneity of variance has therefore been
met, and the ANOVA test can be considered robust.
ANOVA
Market price in dollars
                 Sum of Squares        df    Mean Square       F       Sig.
Between Groups   340027761044.550      4     85006940261.138   1.833   .121
Within Groups    22952105667159.710    495   46367890236.686
Total            23292133428204.260    499
Since this is a one-way ANOVA, we use the F test statistic: Fstat = MSA/MSW, where
MSA = SSA/(c-1) and MSW = SSW/(n-c). Here c is the number of groups and n is the total
number of observations.
                 Sum of Squares   df    Mean Square       F       Sig.
Between Groups   SSA              c-1   MSA = SSA/(c-1)   Fstat   p-value
Within Groups    SSW              n-c   MSW = SSW/(n-c)
Total            SST              n-1
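As a worked check of these formulas, the sketch below plugs in the sums of squares from the SPSS ANOVA table for market price, with c = 5 agents and n = 500 transactions. The variable names are illustrative.

```python
# Sums of squares from the SPSS ANOVA table (market price in dollars).
ssa = 340027761044.550     # between groups
ssw = 22952105667159.710   # within groups
c = 5                      # number of groups (agents)
n = 500                    # total number of observations

# Mean squares: each sum of squares divided by its degrees of freedom.
msa = ssa / (c - 1)
msw = ssw / (n - c)

# F statistic is the ratio of the two mean squares.
f_stat = msa / msw
print(round(msa, 3), round(msw, 3), round(f_stat, 3))
```

The result matches the F value of 1.833 reported by SPSS.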
We look at the result of the post hoc Tukey HSD test to know between which of the various
pairs of means the difference is significant.
Multiple Comparisons
Dependent Variable: Market price in dollars
                                                                               95% Confidence Interval
               (I) Agent   (J) Agent   Mean Difference (I-J)   Std. Error   Sig.    Lower Bound   Upper Bound
Tukey HSD      Carter      Isaacs      21363.570               29211.728    .949    -58614.04     101341.18
                           Marty       36942.895               29893.895    .730    -44902.39     118788.18
                           Peterson    75869.412               28904.865    .067    -3268.05      155006.88
                           Rose        24354.351               30580.234    .932    -59370.04     108078.74
               Isaacs      Carter      -21363.570              29211.728    .949    -101341.18    58614.04
                           Marty       15579.325               31107.519    .987    -69588.70     100747.34
                           Peterson    54505.842               30158.316    .370    -28063.39     137075.08
                           Rose        2990.781                31767.649    1.000   -83984.58     89966.14
               Marty       Carter      -36942.895              29893.895    .730    -118788.18    44902.39
                           Isaacs      -15579.325              31107.519    .987    -100747.34    69588.70
                           Peterson    38926.517               30819.538    .714    -45453.05     123306.09
                           Rose        -12588.544              32396.040    .995    -101284.35    76107.27
               Peterson    Carter      -75869.412              28904.865    .067    -155006.88    3268.05
                           Isaacs      -54505.842              30158.316    .370    -137075.08    28063.39
                           Marty       -38926.517              30819.538    .714    -123306.09    45453.05
                           Rose        -51515.061              31485.706    .475    -137718.51    34688.38
               Rose        Carter      -24354.351              30580.234    .932    -108078.74    59370.04
                           Isaacs      -2990.781               31767.649    1.000   -89966.14     83984.58
                           Marty       12588.544               32396.040    .995    -76107.27     101284.35
                           Peterson    51515.061               31485.706    .475    -34688.38     137718.51
Games-Howell   Carter      Isaacs      21363.570               29560.715    .951    -59967.64     102694.78
                           Marty       36942.895               30445.500    .744    -46877.64     120763.43
                           Peterson    75869.412               29668.996    .082    -5744.59      157483.41
                           Rose        24354.351               29534.085    .923    -56977.74     105686.44
               Isaacs      Carter      -21363.570              29560.715    .951    -102694.78    59967.64
                           Marty       15579.325               31226.392    .987    -70430.64     101589.29
                           Peterson    54505.842               30469.795    .383    -29364.69     138376.38
                           Rose        2990.781                30338.445    1.000   -80602.14     86583.70
               Marty       Carter      -36942.895              30445.500    .744    -120763.43    46877.64
                           Isaacs      -15579.325              31226.392    .987    -101589.29    70430.64
                           Peterson    38926.517               31328.917    .726    -47350.72     125203.75
                           Rose        -12588.544              31201.184    .994    -98592.60     73415.52
               Peterson    Carter      -75869.412              29668.996    .082    -157483.41    5744.59
                           Isaacs      -54505.842              30469.795    .383    -138376.38    29364.69
                           Marty       -38926.517              31328.917    .726    -125203.75    47350.72
                           Rose        -51515.061              30443.960    .441    -135382.49    32352.37
               Rose        Carter      -24354.351              29534.085    .923    -105686.44    56977.74
                           Isaacs      -2990.781               30338.445    1.000   -86583.70     80602.14
                           Marty       12588.544               31201.184    .994    -73415.52     98592.60
                           Peterson    51515.061               30443.960    .441    -32352.37     135382.49
Significance values have been generated for the mean differences in market value between
each pair of agents, in both directions (for example Carter - Rose and Rose - Carter).
In the Tukey HSD (Honestly Significant Difference) results, no pairwise difference is
significant at the 0.05 level. The difference between Carter and Peterson comes closest
(p = .067). This is consistent with the overall ANOVA result, whose p-value is .121.
20. Linear multiple regression
Multiple regression models use two or more independent variables to predict the value of
a dependent variable. We use multiple linear regression to find out how the square feet
of a property, its number of bathrooms and its number of bedrooms affect its market
price in dollars.
Variables Entered/Removed(a)
Model   Variables Entered                        Variables Removed   Method
1       Number of bedrooms, Square feet of       .                   Enter
        property, Number of bathroom(b)
a. Dependent Variable: Market price in dollars
b. All requested variables entered.
Based on the table, the dependent variable is market price in dollars, and the
independent variables are square feet of property, number of bathrooms and number of
bedrooms.
Coefficients(a)
                               Unstandardized Coefficients   Standardized Coefficients
Model                          B            Std. Error       Beta                        t        Sig.
1   (Constant)                 507521.434   45985.934                                    11.036   .000
    Square feet of property    7.923        5.530            .064                        1.433    .153
    Number of bathroom         4448.753     6861.117         .029                        .648     .517
    Number of bedrooms         -376.957     4848.053         -.003                       -.078    .938
a. Dependent Variable: Market price in dollars
The regression equation for market price in dollars is 507,521.43 + 7.92 × square feet
of property + 4,448.75 × number of bathrooms – 376.96 × number of bedrooms.
ŷ = a + β1x1 + β2x2 + β3x3
ŷ = predicted market price in dollars
x1 = square feet of property
x2 = number of bathrooms
x3 = number of bedrooms
a = estimate of ŷ when all independent variables are 0
Therefore, ŷ = 507,521.43 + 7.92 x1 + 4,448.75 x2 – 376.96 x3
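To show how the fitted equation is applied, the sketch below evaluates it for one property using the unrounded B values from the Coefficients table. The function name and the example property (1,000 square feet, 2 bathrooms, 3 bedrooms) are hypothetical.

```python
# Unstandardized coefficients (B column) from the Coefficients table.
INTERCEPT = 507521.434
B_SQFT = 7.923
B_BATHROOMS = 4448.753
B_BEDROOMS = -376.957


def predict_price(sqft, bathrooms, bedrooms):
    """Predicted market price in dollars from the fitted regression equation."""
    return (INTERCEPT
            + B_SQFT * sqft
            + B_BATHROOMS * bathrooms
            + B_BEDROOMS * bedrooms)


# Hypothetical property used only to illustrate the calculation.
example = predict_price(sqft=1000, bathrooms=2, bedrooms=3)
print(round(example, 2))
```

None of the three predictors is statistically significant in the table (all Sig. values exceed 0.05), so such predictions should be read with caution.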
Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .070(a)   .005       -.001               216173.156
a. Predictors: (Constant), Number of bedrooms, Square feet of property, Number of
bathroom
R square is the coefficient of determination. It is the proportion of the total
variation in the dependent variable that is explained by variation in the independent
variables.
An R square of 0.005 tells us that 0.5% of the variation in market price in dollars is
explained by the variation in number of bedrooms, square feet of property and number of
bathrooms. Such a low R square indicates a weak linear relationship between the
dependent variable and the independent variables.
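The negative Adjusted R Square in the Model Summary follows from the usual adjustment for the number of predictors. The sketch below applies that standard formula to the rounded R square of 0.005, with n = 500 observations and k = 3 predictors; the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust R squared for k predictors fitted on n observations."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


r_squared = 0.005  # from the Model Summary table (rounded)
n = 500            # number of sales transactions
k = 3              # bedrooms, square feet, bathrooms

adj = adjusted_r_squared(r_squared, n, k)
print(round(adj, 3))
```

A value at or below zero means the three predictors explain no more variation than would be expected by chance.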
CONCLUSION
In summary, among the 500 sample sales, Carter has the highest frequency of sales
transactions with 119, followed by Peterson (104), Isaacs (100), Marty (92) and Rose
(85). Most of the properties are located in Area 5 (116), followed by Area 4 (115),
Area 3 (104) and Area 2 (92), and the fewest are located in Area 1 (73). The properties
are spread fairly evenly across the areas of town, so we cannot conclude which area is
most attractive to buyers. Carter has the best performance: he handled the most sales
transactions (119) and recorded the highest mean market value, $590,969.21. Carter's
properties are mainly located in Area 2 (25%), Area 3 (27.9%) and Area 4 (27.8%). Marty
is largely in charge of properties located in Area 5. Peterson is mainly responsible for
properties located in Area 1; although Peterson handled the second-highest number of
transactions (104), the mean market value of the properties he sold is the lowest, at
$515,099.80.
As the pie chart shows, 50% of the properties (250) do not have a pool and the remaining
50% do. However, there is no significant difference in market value between properties
with a swimming pool and properties without one. We therefore do not recommend that the
company build swimming pools in order to increase market value.
The number of bedrooms does not appear to affect how readily a property sells, as the
properties are spread fairly evenly across the bedroom categories. Properties with 2
bedrooms and with 3 bedrooms each account for 68 properties (13.6%). 78 properties
(15.6%) have 4 bedrooms and 74 (14.8%) have 5 bedrooms. In addition, 72 properties
(14.4%) have 6 bedrooms, 61 (12.2%) have 7 bedrooms and 79 (15.8%) have 8 bedrooms. We
cannot conclude whether having a pool, having an attached garage, or any particular
number of bedrooms or bathrooms makes a property more attractive to buyers, since the
percentages are fairly equal across the categories of each variable.
The frequency distribution indicates that most of the properties sold within 60 days;
day 20 recorded the highest frequency, followed by day 32.
The proportions of mortgage loans in default and not in default are fairly equal, at
47.6% and 52.4% respectively. The same applies to fixed-rate and adjustable-rate
mortgage loans, which account for 53.2% and 46.8% of the 500 sample transactions. The
number of years for which the mortgage loans have been paid is spread fairly evenly from
1 year to 20 years.
FICO, the credit score of the mortgage loan holder, reflects a person’s ability to pay
their debts. The highest score is 850; an average score is 680; a low score is below
680. According to the SPSS calculation, the sample mean is 704.44, and we are 95%
confident that the population mean credit score lies between 698.49 and 710.38.
We conclude that the company should set up a strategy for long-term prosperity, such as
raising the selling price of properties that have more bathrooms or a swimming pool, or
that are located in certain areas, in order to gain more revenue. The company can also
train agents to become experts in certain types of property, such as affordable or
luxury properties.
APPENDICES
1) Data View (Data 3)
2) Variables View