House Price Prediction Model Report

The document is a micro-project report on a 'House Price Prediction Model' developed by students Kulodit Pandey and Ankur Kamdi at Guru Nanak Institute of Engineering & Technology. It details the project's objectives, methodology, and the use of machine learning techniques, specifically Random Forest Regression, to predict housing prices based on various features. The report emphasizes the importance of data preprocessing, feature engineering, and the development of an interactive web application for real-time predictions.

Uploaded by

kuloditp

MICRO-PROJECT

REPORT

COMPUTER SCIENCE AND ENGINEERING (DATA SCIENCE)

Submitted By 5th sem
1] Kulodit Pandey (RollNo.) 2] Ankur Kamdi (25)

Under the Guidance of

Guide's Name

Department of COMPUTER SCIENCE AND ENGINEERING (DATA SCIENCE)

GURU NANAK INSTITUTE OF ENGINEERING

& TECHNOLOGY
NAGPUR – 441501
Academic Year-2025-2026
CERTIFICATE
This is to certify that the micro project report entitled

" HOUSE PRICE PREDICTION MODEL "

submitted by

1] Kulodit Pandey (RollNo.) 2] Ankur Kamdi (25)

This work has been done in the Department of Computer Science & Engineering (Data Science) at Guru
Nanak Institute of Engineering & Technology.

Guide's Signature:
Department: Computer Science & Engineering (Data Science)

Head of Department's Signature:

Department: Computer Science & Engineering (Data Science)

Date:

Place:
DECLARATION

I/We …………. hereby declare that the Project Report entitled “HOUSE PRICE PREDICTION MODEL”,
done by us under the guidance of ………, Department of Computer Science and Engineering at Guru
Nanak Institute of Engineering and Technology, is submitted in partial fulfillment of the requirements for the
award of the Bachelor of Engineering degree in Computer Science and Engineering (Data Science).

DATE:

PLACE:

SIGNATURE OF CANDIDATE
ACKNOWLEDGEMENT

We wish to express our sincere gratitude to all those who provided guidance and support
throughout the successful completion of the "House Price Prediction" project.

This project would not have been possible without the consistent encouragement and expert
guidance of Kulodit Pandey and Ankur Kamdi, whose insights and technical feedback were
invaluable at every stage, from data preprocessing to model deployment.

We are also deeply thankful to the academic staff of the Department of Computer Science and
Engineering for providing the necessary infrastructure and environment to conduct this
research.
Finally, we extend our gratitude to the creators and maintainers of the open-source tools—
Python, Scikit-learn, and Streamlit—which were fundamental to the development and
realization of this application.
ABSTRACT:

A house price index usually represents the summarized price changes of residential
housing. To make it easier for a family to search for a house, we make the estimate more
precise by asking for the required square footage and the number of bedrooms and bathrooms.
Using a preloaded dataset and its features, this paper examines a practical data
pre-processing and feature engineering method. It also proposes a regression technique
in machine learning to predict house prices.

Keywords: House Price, Regression Technique, Machine Learning

TABLE OF CONTENTS

Chapter    Contents

0          Cover Page
           i. Certificate
           ii. Declaration
           iii. Acknowledgement
           iv. Abstract
           v. Table of Contents
           vi. List of Figures
           vii. List of Tables

1          Introduction
           1.1 Introduction to the Project
           1.2 Problem Statement
           1.3 Objectives of the Project

2          Literature Review
           2.1 Overview of Existing Work
           2.2 Critical Analysis

3          Methodology
           3.1 Tools and Technologies Used
           3.2 System Architecture
           3.3 Data Collection and Pre-processing
           3.4 Algorithm/Model Implementation

4          Results and Discussion
           4.1 Implementation Details
           4.2 Performance Evaluation
           4.3 Discussion of Results

5          Conclusion and Future Scope
           5.1 Conclusion
           5.2 Future Scope

I          Bibliography

II         Appendix

CHAPTER 1

INTRODUCTION:

Data is at the heart of technical innovation, and predictive models now make a wide range of results
achievable. Machine learning is used extensively in this approach. Machine learning means providing a valid
dataset on which further predictions are based: the machine itself learns, from its pre-loaded data, how much
importance a particular event may have on the entire system, and predicts the result accordingly. Modern
applications of this technique include predicting stock prices, the possibility of an earthquake, and company
sales, and the list is endless.

Our aim is to predict a house price based on the customer's needs and priorities. Future prices are predicted
by analyzing previous market trends and price ranges, as well as upcoming developments. The system involves
a website that accepts the customer's specifications and then applies a machine learning model to them.

Machine Learning

Machine learning is a subset of artificial intelligence (AI). It gives a system the ability to learn and improve
automatically. It focuses on the development of computer programs that can access data and learn from it by
themselves. The process of learning begins with observations based on the examples that we provide. The aim
is to make computers learn by themselves, without the need for human intervention.

Machine Learning Methods

Machine learning can be classified into three main types: supervised, unsupervised, and reinforcement
learning. Supervised machine learning algorithms apply what has been learned in the past to new data in order
to predict future events. The algorithm analyzes a known training dataset and produces a function that
predicts outputs.

After training, the system can provide outputs for new inputs. It can also compare its output with the correct,
intended output, find the errors, and modify the model accordingly to make it more practical and useful.

In contrast, unsupervised machine learning algorithms do not require any supervision. They are used when
the sample data used for training is neither classified nor labeled. As the name suggests, the model itself
finds the hidden patterns and insights. The system may or may not produce the right output, but it explores
the data and can draw inferences from datasets on its own.

Semi-supervised machine learning algorithms combine supervised and unsupervised learning. In
semi-supervised learning, an algorithm learns from a dataset that includes both labeled and unlabeled data,
usually mostly unlabeled. It is usually chosen when labeling the sample data requires skilled resources;
acquiring unlabeled data, by contrast, generally does not require additional resources.

Reinforcement machine learning is a learning method that works on the basis of feedback. It differs from
supervised learning in that labelled input/output pairs need not be presented. It is studied in various
disciplines, such as statistics and information theory.

Advantages of Machine Learning

Machine learning helps to manage large amounts of data. There is no need for human intervention, and it can
perform complex operations on its own. It is extremely useful in the field of e-commerce and in the
manufacturing industry. While even experts often cannot be sure where, and through which correlation, a
production error in a plant fleet arises, machine learning offers the possibility of identifying the error early,
which saves downtime and money. Machine learning is now also used in the medical field. In the future, after
collecting huge amounts of data, apps will be able to warn a patient in case his doctor wants to prescribe a
drug that he cannot tolerate. The app can also suggest alternative options by taking the patient's genetics
into account.

INTRODUCTION TO PROJECT:

Housing is one of the most valuable economic assets an individual can purchase during adult life. Hence we
need to be extremely careful before buying a house and make sure we pay the right price for it. In the
following, we explore different machine learning techniques and methodologies to predict house prices.
The data consists of a training and a test dataset. Our objective is to predict house prices based on the user's
requirements and needs. Our model predicts the price of a house from the sample data that has been given.

The “House Price Prediction” micro project aims to develop a data-driven model that accurately estimates
the price of residential properties based on various influential factors. The main purpose of this project is to
assist home buyers, sellers, and real estate companies in making informed decisions by providing reliable
price predictions. This model helps users understand the fair market value of a property, enabling transparent
and data-backed transactions in the real estate market.

The project utilizes supervised machine learning, specifically the Random Forest Regression algorithm,
which is known for its high accuracy and ability to handle both linear and non-linear data. The model learns
from historical housing data to predict future prices based on given input features.

The key features considered in this project include:

Area (in square feet) – total space of the property

Number of Bedrooms and Bathrooms – key indicators of living comfort

Location Score – reflects accessibility, neighborhood quality, and amenities

Age of the Property – influences depreciation and overall value

Parking Availability – an important factor for urban buyers

Furnishing Status – whether the house is furnished, semi-furnished, or unfurnished

In addition to these, features like proximity to schools, markets, hospitals, and transport facilities can further
improve the prediction accuracy. The dataset is preprocessed through data cleaning, feature selection, and
normalization to ensure optimal performance.

By implementing this project, we aim to demonstrate how machine learning can revolutionize the real estate
industry by providing automated, efficient, and objective price estimations. The model serves as a
foundation for building smart real estate applications that can benefit customers, agents, and investors alike.
CHAPTER 2
LITERATURE REVIEW:

The prediction of house prices has been a key research topic in the fields of data science, artificial
intelligence, and real estate analytics. With the increasing availability of housing data, various studies have
explored machine learning algorithms to create models capable of producing accurate and explainable
predictions. The growing demand for data-driven insights in real estate has encouraged research on
predictive modeling to aid buyers, sellers, investors, and policymakers.

Early research primarily relied on Multiple Linear Regression (MLR) models to predict housing prices based
on limited variables such as location, number of rooms, and area. While simple and interpretable, these
models were often inadequate for capturing non-linear relationships among housing features. To overcome
this limitation, more advanced algorithms like Decision Trees, Random Forest, Support Vector Regression
(SVR), Gradient Boosting Machines (GBM), and Artificial Neural Networks (ANNs) were introduced.
Among these, Random Forest Regression has emerged as one of the most effective techniques because it
combines multiple decision trees to reduce overfitting, improve robustness, and handle large datasets
efficiently. Recent studies also suggest that hybrid models combining multiple machine learning techniques
can further enhance prediction accuracy.

Several studies have highlighted the importance of feature selection and engineering in improving model
accuracy. Features such as Area, Number of Bedrooms, Number of Bathrooms, Location Score, Age of the
Property, Parking Availability, Furnishing Status, Proximity to Schools, and Accessibility to Public
Transport are found to have significant influence on property valuation. Additionally, preprocessing
techniques like data normalization, handling missing values, outlier detection, and encoding categorical
variables are crucial to ensure the reliability of predictive models. Techniques like Principal Component
Analysis (PCA) have also been applied to reduce dimensionality and remove redundant features, improving
model performance.

Modern research emphasizes the importance of visual analytics and interactive dashboards to enhance
interpretability and usability. Platforms like Streamlit and Dash enable developers to build interactive web
applications that allow users to input values dynamically, visualize prediction results with charts, and
explore data insights through interactive plots. Visualization tools such as Matplotlib, Seaborn, Plotly, and
Bokeh are commonly used for this purpose. Providing explainable predictions with feature importance charts
or SHAP (SHapley Additive exPlanations) values has become increasingly important to help stakeholders
understand the reasoning behind predictions.

Moreover, the deployment of machine learning models in real-world environments has become a growing
focus. Studies suggest integrating models with web technologies to make them accessible to buyers, sellers,
and real estate professionals. Cloud platforms like Heroku, AWS, Google Cloud, and Streamlit Cloud allow
these predictive systems to be deployed and accessed in real time, making the project scalable, user-friendly,
and professional. Incorporating API services enables seamless integration with other real estate platforms
and applications, enhancing practical usability.

Evaluation metrics such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), R² Score, and
Mean Squared Logarithmic Error (MSLE) are typically used to assess model performance and ensure
prediction reliability. Researchers have found that ensemble methods like Random Forest and Gradient
Boosting tend to achieve lower error rates and better generalization compared to single-model approaches.
Cross-validation techniques and hyperparameter tuning (e.g., using Grid Search or Random Search) are
commonly applied to further improve model robustness and prevent overfitting.
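
The metrics named above can be computed directly with scikit-learn. The following is a minimal sketch on hypothetical values; the four prices are invented for illustration and are not results from this project.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    mean_squared_log_error,
    r2_score,
)

# Hypothetical true and predicted prices (in Lakhs), for illustration only.
y_true = np.array([50.0, 80.0, 120.0, 65.0])
y_pred = np.array([55.0, 75.0, 110.0, 70.0])

mae = mean_absolute_error(y_true, y_pred)            # average absolute error
rmse = np.sqrt(mean_squared_error(y_true, y_pred))   # penalizes large errors more
r2 = r2_score(y_true, y_pred)                        # share of variance explained
msle = mean_squared_log_error(y_true, y_pred)        # error on a log scale

print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  R2={r2:.3f}  MSLE={msle:.5f}")
```

MAE and RMSE are expressed in the same unit as the price, which makes them easy to interpret, while R² is unitless and therefore comparable across datasets.
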
Recent studies have also explored the integration of predictive models with investment recommendation
systems, which can guide potential buyers on property selection, price negotiation, and return on investment
estimation. The inclusion of confidence intervals in predictions is gaining popularity, providing users with an
estimate of prediction uncertainty and helping make more informed decisions.
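
One simple way to approximate such an uncertainty estimate with a Random Forest is to look at the spread of the individual trees' predictions. The sketch below assumes synthetic data and an arbitrary 5th to 95th percentile band; it gives a rough spread, not a calibrated confidence interval, and it is not part of the implemented system.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (an assumption): price grows with area, plus noise.
rng = np.random.default_rng(2)
X = rng.uniform(500, 3000, size=(300, 1))
y = 0.05 * X[:, 0] + rng.normal(0, 5, size=300)

forest = RandomForestRegressor(
    n_estimators=200, min_samples_leaf=5, random_state=0
).fit(X, y)

# Collect each tree's prediction for one hypothetical house of 1500 sqft.
x_new = np.array([[1500.0]])
per_tree = np.array([tree.predict(x_new)[0] for tree in forest.estimators_])

point = forest.predict(x_new)[0]              # the forest averages its trees
low, high = np.percentile(per_tree, [5, 95])  # rough spread across trees

print(f"Prediction: {point:.1f} Lakhs (tree spread: {low:.1f} to {high:.1f})")
```
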

In summary, the literature shows that integrating robust machine learning algorithms, effective feature
engineering, explainable and interactive visualization through platforms like Streamlit, and professional
deployment can produce a comprehensive and practical system for accurate house price prediction. Such
models not only enhance transparency in real estate transactions but also provide actionable, data-driven
insights to all stakeholders, including buyers, sellers, investors, and policymakers. The combination of
advanced analytics, real-time accessibility, and interpretability represents the current trend and future
direction of research in predictive real estate modeling.
CHAPTER 3

METHODOLOGY:
The methodology of the House Price Prediction project involves several key stages — from data
preprocessing and model training to the creation of an interactive dashboard for real-time prediction. Each
phase ensures that the model is accurate, reliable, and user-friendly.

1. Data Preprocessing
The primary goal of the data preprocessing phase is to transform the raw, heterogeneous housing dataset into a
clean, structured format suitable for machine learning model training.

A. Handle Missing Values

Missing values (NaNs) will be addressed using targeted imputation techniques.

• Numerical Features: For continuous features such as square footage, year built, or lot size, missing
values will be imputed using the median of the respective column to mitigate the influence of potential
outliers.
• Categorical Features: For features like property zoning or utility type, missing values will be treated as
a separate category, 'Missing', or imputed using the mode (most frequent value) if the number of
missing entries is minor. Columns with excessive sparsity (e.g., over 40% missing data) may be
considered for removal if they do not contribute significant predictive power.
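
The imputation rules above can be sketched with pandas as follows. The toy DataFrame and its column names are assumptions made for illustration; they are not the project's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are assumptions for illustration.
df = pd.DataFrame({
    "area_sqft": [1200.0, np.nan, 1500.0, 900.0, 8000.0],
    "zoning": ["Residential", None, "Residential", "Commercial", "Residential"],
    "mostly_empty": [np.nan, np.nan, np.nan, 1.0, np.nan],
})

# Drop columns whose share of missing values exceeds 40%.
missing_share = df.isna().mean()
df = df.drop(columns=missing_share[missing_share > 0.40].index)

# Numerical features: impute with the column median (robust to the 8000 outlier).
num_cols = df.select_dtypes(include="number").columns
df[num_cols] = df[num_cols].fillna(df[num_cols].median())

# Categorical features: treat missing entries as their own 'Missing' category.
cat_cols = df.select_dtypes(exclude="number").columns
df[cat_cols] = df[cat_cols].fillna("Missing")

print(df)
```

The median is used rather than the mean because a single very large property would otherwise pull the imputed value upward.
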

B. Encode Categorical Features

To allow the Random Forest Regressor to process non-numeric data, all categorical features will be encoded.

• Nominal Data: Features with no inherent order (e.g., street type, neighborhood) will be transformed
using One-Hot Encoding. This creates a new binary column for each unique category, ensuring the
model does not assume false ordinal relationships.
• Ordinal Data: Features with a clear rank (e.g., quality rating, basement condition) will be transformed
using Ordinal Encoding, mapping categories to numerical ranks (e.g., 'Excellent' = 5, 'Good' = 4, etc.).
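
A minimal sketch of both encodings with pandas is shown below. The sample rows are hypothetical, and the ranks below 'Good' are assumed, since only 'Excellent' = 5 and 'Good' = 4 are stated above.

```python
import pandas as pd

# Hypothetical sample rows for illustration.
df = pd.DataFrame({
    "neighborhood": ["North", "South", "North", "East"],
    "quality": ["Good", "Excellent", "Fair", "Good"],
})

# Ordinal feature: map each category to its rank.
# Ranks below 'Good' are assumptions made for this example.
quality_rank = {"Poor": 1, "Fair": 2, "Average": 3, "Good": 4, "Excellent": 5}
df["quality"] = df["quality"].map(quality_rank)

# Nominal feature: one binary column per category, with no implied order.
encoded = pd.get_dummies(df, columns=["neighborhood"], dtype=int)

print(encoded)
```
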

2. Model Training and Persistence

This phase involves selecting an appropriate machine learning algorithm, training it on the processed data, and
saving the final model artifact for deployment.

A. Train Random Forest Regressor

The predictive model chosen for this project is the Random Forest Regressor. This ensemble learning method
is preferred due to its robustness against overfitting, ability to handle non-linear relationships, and
insensitivity to feature scaling.

• Implementation: The data will be split into training and testing sets (typically 80/20). The model will
be trained on the training data, and performance will be evaluated on the test set using metrics like
Root Mean Squared Error (RMSE) and the R² score.
• Hyperparameter Optimization: Grid Search or Random Search techniques will be employed to tune
key hyperparameters (e.g., n_estimators, max_depth, min_samples_leaf) to maximize the model's
predictive accuracy.
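
The training and tuning steps above might look like the following sketch. It uses synthetic data in place of the housing dataset, and the grid values are illustrative assumptions rather than the settings used in this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data stands in for the housing dataset (an assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))
y = 100 * X[:, 0] + 20 * X[:, 1] ** 2 + rng.normal(0, 1, size=200)

# 80/20 split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Small illustrative grid over the hyperparameters named above.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
    "min_samples_leaf": [1, 3],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

# Evaluate the best configuration on the held-out test set.
best_model = search.best_estimator_
y_pred = best_model.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)

print(search.best_params_, f"RMSE={rmse:.2f}", f"R2={r2:.3f}")
```

The grid search only ever sees the training split, so the test-set RMSE and R² remain an honest estimate of performance on unseen data.
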

B. Save Model using Joblib

Once the optimal model configuration is determined, the trained RandomForestRegressor object will be
serialized and saved to a file using the Joblib library.

• Purpose: Joblib is highly efficient for handling large NumPy arrays, making it ideal for storing
computationally heavy scikit-learn models. This serialized file will be loaded directly into the
Streamlit dashboard environment, eliminating the need to retrain the model upon application start.
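
The save-and-reload round trip can be sketched as below. The dictionary with "model" and "columns" keys follows the appendix listing; the tiny training set and the temporary file location are assumptions made so that the example is self-contained.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Tiny hypothetical training set: [area_sqft, bedrooms] -> price in Lakhs.
feature_names = ["area_sqft", "bedrooms"]
X = np.array([[1000, 2], [1500, 3], [2000, 4], [2500, 4]])
y = np.array([50.0, 75.0, 100.0, 120.0])
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

# Save the model together with its feature names, as in the appendix.
path = os.path.join(tempfile.mkdtemp(), "house_price_model.pkl")
joblib.dump({"model": model, "columns": feature_names}, path)

# Later (e.g., when the dashboard starts): load instead of retraining.
saved = joblib.load(path)
restored = saved["model"]

print(restored.predict([[1800, 3]]))
```

Storing the column list next to the model lets the application rebuild its input in exactly the order the model was trained on.
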

3. Streamlit Dashboard Development

The final, client-facing system will be deployed as a web application using the Streamlit framework, providing
an intuitive, real-time prediction interface.

A. Input Panel with Cards

The left-hand sidebar or a dedicated panel will serve as the user input area, structured using Streamlit
containers or cards for clarity and user experience.

• Functionality: Users will input the key independent variables (e.g., location, total square footage,
number of bedrooms/bathrooms, year built) using appropriate widgets (sliders, drop-downs, text
inputs).
• Data Consistency: The input panel will ensure that user selections are immediately converted and
validated to match the format of the data used during the model training phase (e.g., ensuring
categorical inputs exactly match the training categories).
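
The data-consistency step can be sketched as a small helper that validates the raw selections and arranges them in the training column order. The feature names and the furnishing mapping are taken from the appendix listing; the helper `build_input_row` and the column order shown are hypothetical.

```python
import pandas as pd

# Column order recorded at training time (order assumed for illustration).
model_columns = [
    "area_sqft", "bedrooms", "bathrooms", "location_score",
    "age_of_house", "parking", "furnishing",
]
furnishing_map = {"Furnished": 0, "Semi-Furnished": 1, "Unfurnished": 2}


def build_input_row(raw: dict) -> pd.DataFrame:
    """Validate raw widget values and return a one-row frame for the model."""
    if raw["furnishing"] not in furnishing_map:
        raise ValueError(f"Unknown furnishing status: {raw['furnishing']!r}")
    if raw["parking"] not in ("Yes", "No"):
        raise ValueError(f"Unknown parking value: {raw['parking']!r}")
    row = {
        "area_sqft": raw["area_sqft"],
        "bedrooms": raw["bedrooms"],
        "bathrooms": raw["bathrooms"],
        "location_score": raw["location_score"],
        "age_of_house": raw["age_of_house"],
        "parking": 1 if raw["parking"] == "Yes" else 0,
        "furnishing": furnishing_map[raw["furnishing"]],
    }
    # Reorder the columns to match the order used during training.
    return pd.DataFrame([row])[model_columns]


example = build_input_row({
    "furnishing": "Semi-Furnished", "parking": "Yes", "area_sqft": 1500,
    "bedrooms": 3, "bathrooms": 2, "location_score": 5, "age_of_house": 5,
})
print(example)
```

Rejecting unknown categories up front is preferable to letting them reach the model, where they would otherwise cause a lookup error or a silently wrong encoding.
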

B. Output Panel with Prediction and Visualization

The main dashboard area will display the model's output in a clear, actionable format.

• Prediction Display: The calculated predicted house price will be prominently displayed as the primary
result, typically formatted for local currency.
• Gauge Chart: A gauge chart visualization will be implemented to show the predicted price relative to a
benchmark (e.g., the average price of comparable homes in the dataset or a predefined market range).
This provides immediate context on whether the predicted price is high, low, or average.
• Summary Table: A detailed summary table will present a side-by-side comparison of the user's input
features and the resulting predicted price. This aids transparency by summarizing the exact data points
the model used to generate the output.

Python Programming Characteristics

• It provides rich data types
• Its syntax is simple
• It is a platform-independent scripting language
• Compared to other programming languages, it allows more run-time flexibility
• A module in Python may have one or more classes and free functions
• Python libraries run on both Linux and Windows
• For building large applications, Python can be compiled to byte-code
• It supports functional and structured programming
• It supports an interactive mode that allows interactive testing and debugging of code snippets
• In Python, editing, debugging, and testing are fast
CHAPTER 4

RESULTS & DISCUSSION

This section presents the performance metrics achieved by the trained Random Forest Regressor
model and discusses the utility and transparency of the final deployed Streamlit dashboard.

1. Model Performance Results

The processed dataset was partitioned into training (80%) and testing (20%) sets. After
hyperparameter optimization, the Random Forest Regressor demonstrated strong predictive
capability on the unseen test data.

The high R² score confirms that the model is effective at capturing the complex, non-linear
relationship between the input features (e.g., square footage, location score, furnishing status) and the
final house price. The relatively low RMSE of ₹ 6.80 Lakhs indicates that the model's predictions are
accurate enough for practical use in market estimation.

2. Discussion on Model Efficacy and Feature Impact

A. Random Forest Suitability

The choice of the Random Forest Regressor proved highly effective. This model handled the mixed
data types (numerical and encoded categorical features) robustly without requiring complex manual
feature scaling. Its ensemble nature effectively mitigated overfitting, evidenced by the small drop in
the R² score between the training and testing sets.

B. Key Feature Contributions

Feature importance analysis revealed that Area (sqft) and Location Score were the most influential
predictors, aligning with general real estate principles. The encoded categorical features, particularly
Furnishing status (Furnished, Semi-Furnished, Unfurnished) and Neighborhood, also played a
significant role, demonstrating the value of careful One-Hot and Ordinal Encoding during
preprocessing.
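
A feature importance analysis of this kind can be sketched with the `feature_importances_` attribute of a fitted Random Forest. The data below is synthetic, and the `noise` column is an assumption added to show what an uninformative feature looks like; the values do not reproduce the project's results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (an assumption): price depends on area and location score.
rng = np.random.default_rng(1)
X = pd.DataFrame({
    "area_sqft": rng.uniform(500, 3000, 300),
    "location_score": rng.integers(1, 11, 300),
    "noise": rng.normal(size=300),  # deliberately unrelated to the price
})
y = 0.04 * X["area_sqft"] + 3 * X["location_score"] + rng.normal(0, 1, 300)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; larger means more influential.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

print(importances)
```
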

3. Discussion on Streamlit Dashboard Utility

The Streamlit dashboard successfully bridges the gap between the complex machine learning model
and the end-user.
A. Real-Time, Contextual Prediction

The deployment allows users to input various property attributes and receive a real-time price
prediction. The immediate conversion and prediction, exemplified by an input like "Area: 1500 sqft,
Bedrooms: 3, Bathrooms: 2" resulting in a "Predicted Price: ₹ 75 Lakhs," demonstrates high system
responsiveness.

B. Transparency through Visualization

The Gauge Chart is a critical component for contextualizing the prediction. By showing the
predicted price against the average market price, a user instantly understands if the ₹ 75 Lakhs
prediction is considered high or low for similar properties. Furthermore, the Summary Table
provides full transparency, clearly displaying the exact feature values used for the prediction,
validating the input and output integrity.
CHAPTER 5

CONCLUSION & FUTURE SCOPE

CONCLUSION:
The House Price Prediction project successfully demonstrates the application of machine learning to the real
estate domain, specifically using the Random Forest Regressor to predict property prices with high accuracy.
Through rigorous data preprocessing, including handling missing values and encoding categorical features,
the model was trained effectively to understand complex relationships between property attributes and
prices.

A significant achievement of this project is the development of a professional and interactive dashboard
using Streamlit, which allows users to input property details and instantly obtain price predictions. The
dashboard is designed with usability in mind, making it valuable for home buyers, sellers, and real estate
professionals who require quick, reliable insights into property pricing trends. By successfully integrating
machine learning with an intuitive front-end interface, the project provides a practical solution to the
challenges of property valuation in real-world scenarios.
Future Scope:

While the current implementation provides accurate predictions and a functional interface, there is
considerable scope for enhancement to make the system more comprehensive and widely applicable:

1. Incorporating Real-World Datasets with More Features: The predictive power of the model can be
enhanced by including additional features such as economic indicators, neighborhood ratings, proximity to
amenities, and property age. Accessing larger, real-world datasets will enable the model to capture diverse
market trends and improve accuracy.

2. Cloud Deployment for Global Accessibility: Deploying the application on cloud platforms like Heroku or
Streamlit Cloud will make it accessible to users worldwide. This will allow continuous interaction,
scalability, and easier integration with real estate portals or mobile applications.

3. Adding Confidence Intervals for Predictions: Providing confidence intervals or prediction ranges will help
users understand the reliability of the predictions, making the tool more trustworthy for financial decisions
and investment planning.

4. Investment Recommendation Analytics: Future versions can include analytical modules to suggest
profitable investments, highlighting undervalued properties or predicting areas with high appreciation
potential. This feature will greatly benefit investors and real estate developers.

5. Continuous Model Improvement: By periodically retraining the model with updated datasets, the system
can adapt to changing market dynamics, ensuring that predictions remain accurate and relevant over time.

• Time Series Integration: Incorporate time-series data to account for market fluctuations and
inflation, enabling more accurate long-term forecasting.
• External Data Augmentation: Integrate external datasets, such as proximity to schools or job hubs,
to further improve the Location Score impact and predictive accuracy.
• Explainable AI (XAI): Implement SHapley Additive exPlanations (SHAP values) within the
Streamlit dashboard to show why the model arrived at a specific price, providing deep insight into the
individual feature contributions for each prediction.

In summary, the project not only demonstrates the successful application of machine learning for property
valuation but also provides a user-friendly tool for decision-making in real estate. With future enhancements
like cloud deployment, advanced analytics, and real-world data integration, this system has the potential to
become a comprehensive platform for buyers, sellers, and investors seeking accurate and actionable real
estate insights.
CHAPTER I

BIBLIOGRAPHY:

• Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32. DOI: 10.1023/A:1010933404324.
(Foundational paper on the ensemble learning algorithm).

• Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M.,
Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot,
M., & Duchesnay, É. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning
Research, 12, 2825-2830. (Reference for the primary machine learning library used for modeling).

• Streamlit Inc. (2023). Streamlit: The fastest way to build and share data apps. Retrieved from
[Link] (Reference for the deployment and visualization framework).

• Joblib Development Team. (2023). Joblib: Running Python functions as pipeline jobs. Retrieved from
[Link] (Reference for model persistence/serialization).

• Fayyad, U. M., Piatetsky-Shapiro, G., & Smyth, P. (1996). From Data Mining to Knowledge Discovery
in Databases. AI Magazine, 17(3), 37-54. (Conceptual reference for the overall Knowledge Discovery in
Databases (KDD) process).
CHAPTER II

APPENDIX
CODING:

# importing libraries
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import joblib
import plotly.graph_objects as go

# use a large default figure size for plots
matplotlib.rcParams["figure.figsize"] = (20, 10)

# Libraries for model training
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_absolute_error
import joblib

# Data preprocessing
# NOTE: names hidden by the text extraction (`df`, `df.shape`, `X.columns`)
# are reconstructed from context and should be treated as assumptions.

## getting the count of each area type in the dataset
print(df.groupby('area_type')['area_type'].agg('count'))

## dropping unnecessary columns
df.drop(['area_type', 'society', 'availability', 'balcony'], axis='columns', inplace=True)
print(df.shape)

# NOTE: the steps that load `df`, build X and y, split them with
# train_test_split, and fit `model` are not part of this listing.

# Evaluate the trained model on the test set
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print("Model Trained Successfully!")
print(f"R² Score: {r2:.3f}, MAE: {mae:.2f} Lakhs")

# Save model with feature names
joblib.dump({"model": model, "columns": X.columns.tolist()}, "house_price_model.pkl")

import streamlit as st
import pandas as pd
import joblib
import plotly.graph_objects as go

# ======================
# Load Model and Columns
# ======================
saved = joblib.load("house_price_model.pkl")
model = saved["model"]
model_columns = saved["columns"]

# ======================
# Page Config
# ======================
st.set_page_config(page_title="🏠 House Price Prediction", layout="wide")
st.markdown("<h1 style='text-align: center; color: #6a0dad;'>🏡 House Price Prediction Dashboard</h1>",
            unsafe_allow_html=True)
st.markdown("<p style='text-align: center;'>Predict the <b>price of a house</b> based on its features.</p>",
            unsafe_allow_html=True)
st.markdown("---")

# ======================
# Main Layout: Input & Output Panels
# ======================
input_panel, output_panel = st.columns([1, 1])

# ======================
# Input Panel (Left)
# ======================
with input_panel:
    st.markdown("<h3 style='color:#6a0dad;'>🏠 Enter House Details</h3>", unsafe_allow_html=True)
    house_name = st.text_input("House Name (Optional)", "My House")
    # Card-style input sections (darker purple background)
    area_sqft = st.number_input("📏 Area (in sqft)", 500, 10000, 1500, step=50)
    bedrooms = st.selectbox("Bedrooms", [1, 2, 3, 4, 5, 6], index=2)
    bathrooms = st.selectbox("🛁 Bathrooms", [1, 2, 3, 4, 5], index=1)
    location_score = st.selectbox("📍 Location Score (1-10)", list(range(1, 11)), index=4)
    age_of_house = st.number_input("⏳ Age of House (years)", 0, 30, 5, step=1)
    parking = st.selectbox("🚗 Parking Available", ["Yes", "No"])
    furnishing = st.selectbox("Furnishing Status", ["Furnished", "Semi-Furnished", "Unfurnished"])

    predict_btn = st.button("🔮 Predict Price")

# ======================
# Encode categorical inputs
# ======================
parking_encoded = 1 if parking=="Yes" else 0
furnishing_map = {"Furnished": 0, "Semi-Furnished": 1, "Unfurnished": 2}
furnishing_encoded = furnishing_map[furnishing]

# ======================
# Prepare Input Data
# ======================
input_data = pd.DataFrame([{
    "area_sqft": area_sqft,
    "bedrooms": bedrooms,
    "bathrooms": bathrooms,
    "location_score": location_score,
    "age_of_house": age_of_house,
    "parking": parking_encoded,
    "furnishing": furnishing_encoded
}])

input_data = input_data[model_columns]

# ======================
# Prediction & Output Panel (Right)
# ======================
with output_panel:
    if predict_btn:
        # predict() returns an array; a single input row gives one value
        price = model.predict(input_data)[0]

        # Prediction Card
        st.markdown(f"""
        <div style='background-color:#4B0082; padding:20px; border-radius:15px; text-align:center; color:white;'>
        <h3>🏠 Predicted Price for: <b>{house_name}</b></h3>
        <h2>💰 ₹ {price:,.2f} Lakhs</h2>
        </div>
        """, unsafe_allow_html=True)

        # Gauge Chart
        fig = go.Figure(go.Indicator(
            mode="gauge+number",
            value=price,
            title={'text': "Predicted Price (Lakhs)"},
            gauge={'axis': {'range': [0, 200]},
                   'bar': {'color': "#800080"},
                   'steps': [{'range': [0, 50], 'color': '#6a0dad'},
                             {'range': [50, 100], 'color': '#7b1fa2'},
                             {'range': [100, 200], 'color': '#800080'}]}
        ))
        st.plotly_chart(fig, use_container_width=True)

        # Input Summary Card
        # Values are cast to str so the column holds a single type when rendered
        summary_data = pd.DataFrame({
            "Feature": ["Area (sqft)", "Bedrooms", "Bathrooms", "Location Score", "Age of House", "Parking",
                        "Furnishing"],
            "Value": [str(v) for v in (area_sqft, bedrooms, bathrooms, location_score,
                                       age_of_house, parking, furnishing)]
        })
        st.markdown("""
        <div style='background-color:#4B0082; padding:15px; border-radius:10px; color:white;'>
        <h4>📝 Entered House Details</h4>
        </div>
        """, unsafe_allow_html=True)
        st.table(summary_data)

# ======================
# Footer
# ======================
st.markdown("---")
st.markdown("<p style='text-align: center; color: #6a0dad;'>💡 Built with Streamlit & Random Forest</p>",
            unsafe_allow_html=True)
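
The categorical encoding and column-ordering steps in the dashboard code above could be pulled out of the Streamlit script into plain functions, so they can be checked without launching the app. The following is only a minimal sketch: the function names `encode_inputs` and `order_features` are hypothetical, and the column order in `columns` is an assumption (the real order is whatever was stored under `"columns"` in `house_price_model.pkl`). The furnishing and parking mappings mirror the ones used in the app.

```python
# Hypothetical helpers; not part of the original app.
FURNISHING_MAP = {"Furnished": 0, "Semi-Furnished": 1, "Unfurnished": 2}


def encode_inputs(raw):
    """Return a copy of the raw form values with categorical fields as integers."""
    if raw["furnishing"] not in FURNISHING_MAP:
        raise ValueError(f"Unknown furnishing status: {raw['furnishing']!r}")
    encoded = dict(raw)
    encoded["parking"] = 1 if raw["parking"] == "Yes" else 0
    encoded["furnishing"] = FURNISHING_MAP[raw["furnishing"]]
    return encoded


def order_features(encoded, model_columns):
    """Arrange encoded values in the column order the model was trained on."""
    missing = [c for c in model_columns if c not in encoded]
    if missing:
        raise KeyError(f"Missing features: {missing}")
    return [encoded[c] for c in model_columns]


# Assumed column order, for illustration only.
columns = ["area_sqft", "bedrooms", "bathrooms", "location_score",
           "age_of_house", "parking", "furnishing"]

example = {
    "area_sqft": 1500,
    "bedrooms": 3,
    "bathrooms": 2,
    "location_score": 5,
    "age_of_house": 5,
    "parking": "Yes",
    "furnishing": "Semi-Furnished",
}

row = order_features(encode_inputs(example), columns)
print(row)
```

In the app, the list returned by `order_features` would correspond to the single row passed to the model after `input_data = input_data[model_columns]`. Raising an error on a missing feature or an unknown furnishing label makes a mismatch between the form and the saved model visible early, rather than surfacing as a confusing prediction.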

