MICRO-PROJECT
REPORT
COMPUTER SCIENCE AND ENGINEERING (DATA SCIENCE)
Submitted By 5th sem
1] Kulodit Pandey (RollNo.) 2] Ankur Kamdi (25)
Under the Guidance of
Guide's Name
Department of COMPUTER SCIENCE AND ENGINEERING (DATA SCIENCE)
GURU NANAK INSTITUTE OF ENGINEERING
& TECHNOLOGY
NAGPUR – 441501
Academic Year-2025-2026
CERTIFICATE
This is to certify that the micro project report entitled
" HOUSE PRICE PREDICTION MODEL "
submitted by
1] Kulodit Pandey (RollNo.) 2] Ankur Kamdi(25)
This work has been done in the Department of Computer Science & Engineering (Data Science) at Guru
Nanak Institute of Engineering & Technology.
Guide's Signature:
Department: Computer Science & Engineering (Data Science)
Head of Department's Signature:
Department: Computer Science & Engineering (Data Science)
Date:
Place:
DECLARATION
I/We …………. hereby declare that the Project Report entitled “HOUSE PRICE PREDICTION MODEL”
was done by us under the guidance of ………, Department of Computer Science and Engineering at Guru
Nanak Institute of Engineering and Technology, and is submitted in partial fulfillment of the requirements for the
award of the Bachelor of Engineering degree in Computer Science and Engineering (Data Science).
DATE:
PLACE: SIGNATURE OF
CANDIDATE
ACKNOWLEDGEMENT
We wish to express our sincere gratitude to all those who provided guidance and support
throughout the successful completion of the "House Price Prediction" project.
This project would not have been possible without the consistent encouragement and expert
guidance of Kulodit Pandey and Ankur Kamdi, whose insights and technical feedback were
invaluable at every stage, from data preprocessing to model deployment.
We are also deeply thankful to the academic staff of the Department of Computer Science and
Engineering for providing the necessary infrastructure and environment to
conduct this research.
Finally, we extend our gratitude to the creators and maintainers of the open-source tools—
Python, Scikit-learn, and Streamlit—which were fundamental to the development and
realization of this application.
ABSTRACT:
A house price index usually summarizes the price changes of residential
housing. To make it easier for a family to search for a house, our system makes the estimate more
precise by asking for the required area in square feet and the number of bedrooms and bathrooms.
Using a preloaded dataset and its features, this report examines practical data pre-processing
and feature engineering methods. It also proposes a regression technique
in machine learning to predict house prices.
Keywords: House Price, Regression Technique, Machine Learning
TABLE OF CONTENTS
Chapter   Contents
0         Cover Page
          i. Certificate
          ii. Declaration
          iii. Acknowledgement
          iv. Abstract
          v. Table of Contents
          vi. List of Figures
          vii. List of Tables
1         Introduction
          1.1 Introduction to the Project
          1.2 Problem Statement
          1.3 Objectives of the Project
2         Literature Review
          2.1 Overview of Existing Work
          2.2 Critical Analysis
3         Methodology
          3.1 Tools and Technologies Used
          3.2 System Architecture
          3.3 Data Collection and Pre-processing
          3.4 Algorithm/Model Implementation
4         Results and Discussion
          4.1 Implementation Details
          4.2 Performance Evaluation
          4.3 Discussion of Results
5         Conclusion and Future Scope
          5.1 Conclusion
          5.2 Future Scope
I         Bibliography
II        Appendix
CHAPTER 1
INTRODUCTION:
Data is at the heart of technical innovation, and predictive models now make it possible to estimate many
kinds of outcomes. Machine learning is used extensively in this approach. In machine learning, a valid dataset
is provided and predictions are based on it: the machine itself learns from its pre-loaded data how much
importance a particular event has for the entire system and predicts the result accordingly. Modern
applications of this technique include predicting stock prices, the possibility of an earthquake, and
company sales, among many others.
Our aim is to predict the price of a house based on the customer's needs and priorities. By analyzing previous
market trends and price ranges, as well as upcoming developments, future prices will be predicted. The system
consists of a website that accepts the customer's specifications and applies a trained machine learning model to them.
Machine Learning
Machine learning is a subset of artificial intelligence (AI). It gives systems the ability to learn and improve
automatically from experience. It focuses on the development of computer programs that can access data and
learn from it by themselves. The process of learning begins with observations drawn from the examples that we
provide. The aim is for computers to learn on their own, with minimal human intervention.
Machine Learning Methods
Machine learning is commonly classified into supervised, unsupervised, semi-supervised, and reinforcement
learning. Supervised machine learning algorithms apply what has been learned from past data to new data in
order to predict future events. The algorithm analyzes a known, labeled training dataset and produces a
function that predicts outputs. During training, the system compares its outputs with the correct, intended
outputs, finds the errors, and adjusts the model to make it more practical and useful. After training, the
system can provide outputs for new inputs.
In contrast, unsupervised machine learning algorithms do not require any supervision. They are used when the
sample data used for training is neither classified nor labeled. As the name suggests, the model itself finds
the hidden patterns and insights. There is no single correct output to check against, but the system explores
the data and can draw inferences from datasets on its own.
Semi-supervised machine learning algorithms combine supervised and unsupervised learning. In
semi-supervised learning, an algorithm learns from a dataset that includes both labeled and unlabeled data,
usually mostly unlabeled. It is usually chosen when labeling the sample data requires skilled resources;
otherwise, it does not require additional resources.
Reinforcement learning is a learning method that works on the basis of feedback. It differs from supervised
learning in that labelled input/output pairs need not be presented. It is studied in various disciplines such
as statistics and information theory.
Advantages of Machine Learning
It helps to manage large amounts of data. It reduces the need for human intervention. It can also perform
complex operations on its own. It is extremely useful in fields such as e-commerce. It is also extremely
useful in the manufacturing industry: while even experts often cannot be sure where and through which
correlation a production error in a plant fleet arises, machine learning offers the possibility of identifying
the error early, which saves downtime and money.
Machine learning is now also used in the medical field. In the future, after collecting large amounts of data,
apps will be able to warn a patient when a doctor wants to prescribe a drug that the patient cannot tolerate.
Such an app can also suggest alternative options by taking the patient's genetics into account.
INTRODUCTION TO PROJECT:
Housing is one of the most valuable economic assets an individual purchases during adult life. Buyers
therefore need to be careful and pay a fair price for a house. In
the following, we explore different machine learning techniques and methodologies to predict house prices.
The data is divided into a training and a test dataset. Our objective is to predict house prices based on users'
requirements and needs. Our model predicts the price of a house from the sample data that has been given.
The “House Price Prediction” micro project aims to develop a data-driven model that accurately estimates
the price of residential properties based on various influential factors. The main purpose of this project is to
assist home buyers, sellers, and real estate companies in making informed decisions by providing reliable
price predictions. This model helps users understand the fair market value of a property, enabling transparent
and data-backed transactions in the real estate market.
The project utilizes supervised machine learning, specifically the Random Forest Regression algorithm,
which is known for its high accuracy and ability to handle both linear and non-linear data. The model learns
from historical housing data to predict future prices based on given input features.
The key features considered in this project include:
Area (in square feet) – total space of the property
Number of Bedrooms and Bathrooms – key indicators of living comfort
Location Score – reflects accessibility, neighborhood quality, and amenities
Age of the Property – influences depreciation and overall value
Parking Availability – an important factor for urban buyers
Furnishing Status – whether the house is furnished, semi-furnished, or unfurnished
In addition to these, features like proximity to schools, markets, hospitals, and transport facilities can further
improve the prediction accuracy. The dataset is preprocessed through data cleaning, feature selection, and
normalization to ensure optimal performance.
By implementing this project, we aim to demonstrate how machine learning can revolutionize the real estate
industry by providing automated, efficient, and objective price estimations. The model serves as a
foundation for building smart real estate applications that can benefit customers, agents, and investors alike.
CHAPTER 2
LITERATURE REVIEW:
The prediction of house prices has been a key research topic in the fields of data science, artificial
intelligence, and real estate analytics. With the increasing availability of housing data, various studies have
explored machine learning algorithms to create models capable of producing accurate and explainable
predictions. The growing demand for data-driven insights in real estate has encouraged research on
predictive modeling to aid buyers, sellers, investors, and policymakers.
Early research primarily relied on Multiple Linear Regression (MLR) models to predict housing prices based
on limited variables such as location, number of rooms, and area. While simple and interpretable, these
models were often inadequate for capturing non-linear relationships among housing features. To overcome
this limitation, more advanced algorithms like Decision Trees, Random Forest, Support Vector Regression
(SVR), Gradient Boosting Machines (GBM), and Artificial Neural Networks (ANNs) were introduced.
Among these, Random Forest Regression has emerged as one of the most effective techniques because it
combines multiple decision trees to reduce overfitting, improve robustness, and handle large datasets
efficiently. Recent studies also suggest that hybrid models combining multiple machine learning techniques
can further enhance prediction accuracy.
Several studies have highlighted the importance of feature selection and engineering in improving model
accuracy. Features such as Area, Number of Bedrooms, Number of Bathrooms, Location Score, Age of the
Property, Parking Availability, Furnishing Status, Proximity to Schools, and Accessibility to Public
Transport are found to have significant influence on property valuation. Additionally, preprocessing
techniques like data normalization, handling missing values, outlier detection, and encoding categorical
variables are crucial to ensure the reliability of predictive models. Techniques like Principal Component
Analysis (PCA) have also been applied to reduce dimensionality and remove redundant features, improving
model performance.
Modern research emphasizes the importance of visual analytics and interactive dashboards to enhance
interpretability and usability. Platforms like Streamlit and Dash enable developers to build interactive web
applications that allow users to input values dynamically, visualize prediction results with charts, and
explore data insights through interactive plots. Visualization tools such as Matplotlib, Seaborn, Plotly, and
Bokeh are commonly used for this purpose. Providing explainable predictions with feature importance charts
or SHAP (SHapley Additive exPlanations) values has become increasingly important to help stakeholders
understand the reasoning behind predictions.
Moreover, the deployment of machine learning models in real-world environments has become a growing
focus. Studies suggest integrating models with web technologies to make them accessible to buyers, sellers,
and real estate professionals. Cloud platforms like Heroku, AWS, Google Cloud, and Streamlit Cloud allow
these predictive systems to be deployed and accessed in real time, making the project scalable, user-friendly,
and professional. Incorporating API services enables seamless integration with other real estate platforms
and applications, enhancing practical usability.
Evaluation metrics such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), R² Score, and
Mean Squared Logarithmic Error (MSLE) are typically used to assess model performance and ensure
prediction reliability. Researchers have found that ensemble methods like Random Forest and Gradient
Boosting tend to achieve lower error rates and better generalization compared to single-model approaches.
Cross-validation techniques and hyperparameter tuning (e.g., using Grid Search or Random Search) are
commonly applied to further improve model robustness and prevent overfitting.
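The four metrics named above can be written out directly from their definitions. The following is a minimal sketch using NumPy; the function name `regression_metrics` and the price values are illustrative assumptions, not results from this project.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R^2 and MSLE for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mae = np.mean(np.abs(errors))                 # average absolute error
    rmse = np.sqrt(np.mean(errors ** 2))          # penalizes large errors more
    ss_res = np.sum(errors ** 2)                  # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                    # share of variance explained
    # MSLE compares log-scaled values, so it measures relative error.
    msle = np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2)
    return {"mae": mae, "rmse": rmse, "r2": r2, "msle": msle}


# Made-up prices in lakhs, used only to exercise the function.
actual = [50.0, 75.0, 100.0, 125.0]
predicted = [55.0, 70.0, 110.0, 120.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

MAE and RMSE are expressed in the same unit as the price, which makes them easy to communicate, while R² is unit-free and MSLE is less sensitive to a few very expensive properties.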
Recent studies have also explored the integration of predictive models with investment recommendation
systems, which can guide potential buyers on property selection, price negotiation, and return on investment
estimation. The inclusion of confidence intervals in predictions is gaining popularity, providing users with an
estimate of prediction uncertainty and helping make more informed decisions.
In summary, the literature shows that integrating robust machine learning algorithms, effective feature
engineering, explainable and interactive visualization through platforms like Streamlit, and professional
deployment can produce a comprehensive and practical system for accurate house price prediction. Such
models not only enhance transparency in real estate transactions but also provide actionable, data-driven
insights to all stakeholders, including buyers, sellers, investors, and policymakers. The combination of
advanced analytics, real-time accessibility, and interpretability represents the current trend and future
direction of research in predictive real estate modeling.
CHAPTER 3
METHODOLOGY:
The methodology of the House Price Prediction project involves several key stages — from data
preprocessing and model training to the creation of an interactive dashboard for real-time prediction. Each
phase ensures that the model is accurate, reliable, and user-friendly.
1. Data Preprocessing
The primary goal of the data preprocessing phase is to transform the raw, heterogeneous housing dataset into a
clean, structured format suitable for machine learning model training.
A. Handle Missing Values
Missing values (NaNs) will be addressed using targeted imputation techniques.
Numerical Features: For continuous features such as square footage, year built, or lot size, missing
values will be imputed using the median of the respective column to mitigate the influence of potential
outliers.
Categorical Features: For features like property zoning or utility type, missing values will be treated as
a separate category, 'Missing', or imputed using the mode (most frequent value) if the number of
missing entries is minor. Columns with excessive sparsity (e.g., over 40% missing data) may be
considered for removal if they do not contribute significant predictive power.
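The imputation rules above can be sketched with pandas as follows. The column names, the sample values, and the helper `impute_missing` are assumptions made for illustration; the sketch uses the mode for categorical columns, which is one of the two options described above.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame; the column names are hypothetical.
df = pd.DataFrame({
    "area_sqft": [1200.0, np.nan, 1500.0, 900.0, 5000.0],
    "zoning": ["residential", None, "residential", "commercial", "residential"],
    "utility_type": [None, None, "all", None, None],
})


def impute_missing(frame, sparse_threshold=0.40):
    """Drop very sparse columns, then fill the remaining gaps."""
    out = frame.copy()
    # Columns with more than `sparse_threshold` missing values are removed.
    too_sparse = [c for c in out.columns if out[c].isna().mean() > sparse_threshold]
    out = out.drop(columns=too_sparse)
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            # Median is robust to outliers such as the 5000 sqft entry.
            out[col] = out[col].fillna(out[col].median())
        else:
            # Mode (most frequent value) for categorical columns.
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


clean = impute_missing(df)
print(clean)
```

In a real pipeline the median and mode should be computed on the training split only and then reused for the test split and for user inputs, so that no information leaks from the test data.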
B. Encode Categorical Features
To allow the Random Forest Regressor to process non-numeric data, all categorical features will be encoded.
Nominal Data: Features with no inherent order (e.g., street type, neighborhood) will be transformed
using One-Hot Encoding. This creates a new binary column for each unique category, ensuring the
model does not assume false ordinal relationships.
Ordinal Data: Features with a clear rank (e.g., quality rating, basement condition) will be transformed
using Ordinal Encoding, mapping categories to numerical ranks (e.g., 'Excellent' = 5, 'Good' = 4, etc.).
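The two encoding schemes can be illustrated on a small, made-up frame. The column names and the five-level quality scale are assumptions; only the 'Excellent' = 5 and 'Good' = 4 ranks come from the description above.

```python
import pandas as pd

# Hypothetical sample with one nominal and one ordinal column.
df = pd.DataFrame({
    "neighborhood": ["north", "south", "north", "east"],
    "quality": ["Good", "Excellent", "Fair", "Good"],
    "area_sqft": [1200, 1500, 900, 1100],
})

# Ordinal encoding: an explicit mapping preserves the rank order.
quality_rank = {"Poor": 1, "Fair": 2, "Average": 3, "Good": 4, "Excellent": 5}
df["quality"] = df["quality"].map(quality_rank)

# One-hot encoding: one binary column per neighborhood, no implied order.
encoded = pd.get_dummies(df, columns=["neighborhood"], dtype=int)
print(encoded)
```

The set of one-hot columns produced at training time has to be stored, because a single user input at prediction time will contain only one category and must be expanded to the same columns.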
2. Model Training and Persistence
This phase involves selecting an appropriate machine learning algorithm, training it on the processed data, and
saving the final model artifact for deployment.
A. Train Random Forest Regressor
The predictive model chosen for this project is the Random Forest Regressor. This ensemble learning method
is preferred due to its robustness against overfitting, its ability to handle non-linear relationships, and its
insensitivity to feature scaling, since tree splits depend only on the ordering of feature values.
Implementation: The data will be split into training and testing sets (typically 80/20). The model will
be trained on the training data, and performance will be evaluated on the test set using metrics like
Root Mean Squared Error (RMSE) and the R² score.
Hyperparameter Optimization: Grid Search or Random Search techniques will be employed to tune
key hyperparameters (e.g., n_estimators, max_depth, min_samples_leaf) to maximize the model's
predictive accuracy.
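The split, tuning, and evaluation steps can be sketched with scikit-learn as follows. The data is synthetic and the parameter grid is a deliberately small assumption chosen so the example runs quickly; neither reflects the project's actual dataset or final hyperparameters.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the housing data (assumption, not project data).
rng = np.random.default_rng(42)
n = 300
area = rng.uniform(500, 3000, n)
bedrooms = rng.integers(1, 6, n)
location = rng.integers(1, 11, n)
X = np.column_stack([area, bedrooms, location])
y = 0.03 * area + 4.0 * bedrooms + 2.5 * location + rng.normal(0, 3, n)

# 80/20 split; the test set is not touched during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Small illustrative grid over the hyperparameters named in the text.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 8],
    "min_samples_leaf": [1, 3],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

# Evaluate the best configuration once on the held-out test set.
best_model = search.best_estimator_
y_pred = best_model.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
r2 = r2_score(y_test, y_pred)
print(search.best_params_, round(rmse, 2), round(r2, 3))
```

Grid Search tries every combination in the grid with cross-validation on the training data only, so the reported test RMSE and R² remain an honest estimate of performance on unseen houses.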
B. Save Model using Joblib
Once the optimal model configuration is determined, the trained RandomForestRegressor object will be
serialized and saved to a file using the Joblib library.
Purpose: Joblib is highly efficient for handling large NumPy arrays, making it ideal for storing
computationally heavy scikit-learn models. This serialized file will be loaded directly into the
Streamlit
dashboard environment, eliminating the need to retrain the model upon application start.
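A minimal sketch of the save and load round trip is shown below. It stores the model together with its column order in one dictionary, as the appendix code does; the toy data, the three feature names, and the temporary directory are assumptions made so the example is self-contained.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy training data; the feature names are placeholders.
feature_names = ["area_sqft", "bedrooms", "bathrooms"]
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(50, 3))
y = X @ np.array([3.0, 2.0, 1.0])

model = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

# Persist the model and the column order in a single artifact.
path = os.path.join(tempfile.mkdtemp(), "house_price_model.pkl")
joblib.dump({"model": model, "columns": feature_names}, path)

# Later, for example at dashboard start-up, load without retraining.
saved = joblib.load(path)
restored = saved["model"]
same = np.allclose(model.predict(X), restored.predict(X))
print(saved["columns"], same)
```

Because Joblib files are pickle-based, they should only be loaded from trusted sources and ideally with the same scikit-learn version that was used for training.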
3. Streamlit Dashboard Development
The final, client-facing system will be deployed as a web application using the Streamlit framework, providing
an intuitive, real-time prediction interface.
A. Input Panel with Cards
The left-hand sidebar or a dedicated panel will serve as the user input area, structured using Streamlit
containers or cards for clarity and user experience.
Functionality: Users will input the key independent variables (e.g., location, total square footage,
number of bedrooms/bathrooms, year built) using appropriate widgets (sliders, drop-downs, text
inputs).
Data Consistency: The input panel will ensure that user selections are immediately converted and
validated to match the format of the data used during the model training phase (e.g., ensuring
categorical inputs exactly match the training categories).
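The consistency check described above can be sketched as a small helper that validates the categorical choices and returns a one-row frame in the training column order. The mappings and column names follow the appendix code; the function name `build_input_row` and the error handling are assumptions added for illustration.

```python
import pandas as pd

# Encodings and column order must mirror those used during training.
PARKING_MAP = {"No": 0, "Yes": 1}
FURNISHING_MAP = {"Furnished": 0, "Semi-Furnished": 1, "Unfurnished": 2}
MODEL_COLUMNS = [
    "area_sqft", "bedrooms", "bathrooms", "location_score",
    "age_of_house", "parking", "furnishing",
]


def build_input_row(raw, model_columns=MODEL_COLUMNS):
    """Validate raw widget values and return a one-row DataFrame."""
    if raw["parking"] not in PARKING_MAP:
        raise ValueError(f"unknown parking value: {raw['parking']!r}")
    if raw["furnishing"] not in FURNISHING_MAP:
        raise ValueError(f"unknown furnishing value: {raw['furnishing']!r}")

    row = dict(raw)
    row["parking"] = PARKING_MAP[raw["parking"]]
    row["furnishing"] = FURNISHING_MAP[raw["furnishing"]]

    missing = [c for c in model_columns if c not in row]
    if missing:
        raise ValueError(f"missing features: {missing}")
    # Reorder columns so they match the order seen at training time.
    return pd.DataFrame([row])[model_columns]


raw_input = {
    "area_sqft": 1500, "bedrooms": 3, "bathrooms": 2, "location_score": 5,
    "age_of_house": 5, "parking": "Yes", "furnishing": "Semi-Furnished",
}
frame = build_input_row(raw_input)
print(frame)
```

Keeping this logic in one function means the dashboard and any later API would encode inputs in exactly the same way.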
B. Output Panel with Prediction and Visualization
The main dashboard area will display the model's output in a clear, actionable format.
Prediction Display: The calculated predicted house price will be prominently displayed as the primary
result, typically formatted for local currency.
Gauge Chart: A gauge chart visualization will be implemented to show the predicted price relative to a
benchmark (e.g., the average price of comparable homes in the dataset or a predefined market range).
This provides immediate context on whether the predicted price is high, low, or average.
Summary Table: A detailed summary table will present a side-by-side comparison of the user's input
features and the resulting predicted price. This aids transparency by summarizing the exact data points
the model used to generate the output.
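The context that the gauge chart conveys can be expressed as a small classification relative to a benchmark price. The sketch below is one possible rule; the 10% tolerance band, the function names, and the sample prices are assumptions and are not taken from the project code.

```python
def price_context(predicted, benchmark, tolerance=0.10):
    """Label a predicted price as low, average, or high versus a benchmark."""
    if benchmark <= 0:
        raise ValueError("benchmark must be positive")
    ratio = predicted / benchmark
    if ratio > 1 + tolerance:
        return "high"
    if ratio < 1 - tolerance:
        return "low"
    return "average"


def gauge_bands(benchmark, axis_max, tolerance=0.10):
    """Return three contiguous ranges that a gauge could colour differently."""
    low_edge = benchmark * (1 - tolerance)
    high_edge = benchmark * (1 + tolerance)
    if not 0 < low_edge < high_edge < axis_max:
        raise ValueError("axis_max must exceed the upper band edge")
    return [
        {"range": [0, low_edge], "label": "low"},
        {"range": [low_edge, high_edge], "label": "average"},
        {"range": [high_edge, axis_max], "label": "high"},
    ]


# Made-up prices in lakhs against a benchmark of 60 lakhs.
labels = [price_context(p, 60.0) for p in (40.0, 62.0, 75.0)]
bands = gauge_bands(60.0, axis_max=200.0)
print(labels)
print(bands)
```

The ranges returned by `gauge_bands` have the same shape as the `steps` entries of a Plotly gauge, so they could be passed to the chart once a colour is added to each band.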
Python Programming Characteristics
It provides rich data types
Its syntax is simple
It is a platform-independent scripting language
Compared to other programming languages, it allows more run-time flexibility
A module in Python may have one or more classes and free functions
Python libraries run on both Linux and Windows
For building large applications, Python can be compiled to byte-code
It supports functional and structured programming
It supports an interactive mode that allows interactive testing and debugging of snippets of code
In Python, editing, debugging, and testing are fast.
CHAPTER 4
RESULTS & DISCUSSION
This section presents the performance metrics achieved by the trained Random Forest Regressor
model and discusses the utility and transparency of the final deployed Streamlit dashboard.
1. Model Performance Results
The processed dataset was partitioned into training (80%) and testing (20%) sets. After
hyperparameter optimization, the Random Forest Regressor demonstrated strong predictive
capability on the unseen test data.
The high R² score confirms that the model is effective at capturing the complex, non-linear
relationship between the input features (e.g., square footage, location score, furnishing status) and the
final house price. The relatively low RMSE of ₹ 6.80 Lakhs indicates that the model's predictions are
accurate enough for practical use in market estimation.
2. Discussion on Model Efficacy and Feature Impact
A. Random Forest Suitability
The choice of the Random Forest Regressor proved highly effective. This model handled the mixed
data types (numerical and encoded categorical features) robustly without requiring manual
feature scaling. Its ensemble nature effectively mitigated overfitting, evidenced by the small drop in
the R² score between the training and testing sets.
B. Key Feature Contributions
Feature importance analysis revealed that Area (sqft) and Location Score were the most influential
predictors, aligning with general real estate principles. The encoded categorical features, particularly
Furnishing status (Furnished, Semi-Furnished, Unfurnished) and Neighborhood, also played a
significant role, demonstrating the value of careful One-Hot and Ordinal Encoding during
preprocessing.
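A feature importance ranking of the kind described above can be obtained from the fitted forest's `feature_importances_` attribute. The sketch below uses synthetic data in which area dominates by construction; the feature names and coefficients are assumptions and do not reproduce the project's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: price depends on area and location, not on "noise".
rng = np.random.default_rng(7)
n = 400
features = ["area_sqft", "location_score", "noise"]
area = rng.uniform(500, 3000, n)
location = rng.integers(1, 11, n).astype(float)
noise = rng.normal(size=n)
X = np.column_stack([area, location, noise])
y = 0.03 * area + 2.5 * location + rng.normal(0, 2, n)

model = RandomForestRegressor(n_estimators=100, random_state=7).fit(X, y)

# Impurity-based importances sum to 1; sort from most to least important.
ranking = sorted(
    zip(features, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances can favour features with many distinct values, so permutation importance on the test set is a common cross-check.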
3. Discussion on Streamlit Dashboard Utility
The Streamlit dashboard successfully bridges the gap between the complex machine learning model
and the end-user.
A. Real-Time, Contextual Prediction
The deployment allows users to input various property attributes and receive a real-time price
prediction. The immediate conversion and prediction, exemplified by an input like "Area: 1500 sqft,
Bedrooms: 3, Bathrooms: 2" resulting in a "Predicted Price: ₹ 75 Lakhs," demonstrates high system
responsiveness.
B. Transparency through Visualization
The Gauge Chart is a critical component for contextualizing the prediction. By showing the
predicted price against the average market price, a user instantly understands if the ₹ 75 Lakhs
prediction is considered high or low for similar properties. Furthermore, the Summary Table
provides full transparency, clearly displaying the exact feature values used for the prediction,
validating the input and output integrity.
CHAPTER 5
CONCLUSION & FUTURE SCOPE
CONCLUSION:
The House Price Prediction project successfully demonstrates the application of machine learning to the real
estate domain, specifically using the Random Forest Regressor to predict property prices with high accuracy.
Through rigorous data preprocessing, including handling missing values and encoding categorical features,
the model was trained effectively to understand complex relationships between property attributes and
prices.
A significant achievement of this project is the development of a professional and interactive dashboard
using Streamlit, which allows users to input property details and instantly obtain price predictions. The
dashboard is designed with usability in mind, making it valuable for home buyers, sellers, and real estate
professionals who require quick, reliable insights into property pricing trends. By successfully integrating
machine learning with an intuitive front-end interface, the project provides a practical solution to the
challenges of property valuation in real-world scenarios.
Future Scope:
While the current implementation provides accurate predictions and a functional interface, there is
considerable scope for enhancement to make the system more comprehensive and widely applicable:
1. Incorporating Real-World Datasets with More Features: The predictive power of the model can be
enhanced by including additional features such as economic indicators, neighborhood ratings, proximity to
amenities, and property age. Accessing larger, real-world datasets will enable the model to capture diverse
market trends and improve accuracy.
2. Cloud Deployment for Global Accessibility: Deploying the application on cloud platforms like Heroku or
Streamlit Cloud will make it accessible to users worldwide. This will allow continuous interaction,
scalability, and easier integration with real estate portals or mobile applications.
3. Adding Confidence Intervals for Predictions: Providing confidence intervals or prediction ranges will help
users understand the reliability of the predictions, making the tool more trustworthy for financial decisions
and investment planning.
4. Investment Recommendation Analytics: Future versions can include analytical modules to suggest
profitable investments, highlighting undervalued properties or predicting areas with high appreciation
potential. This feature will greatly benefit investors and real estate developers.
5. Continuous Model Improvement: By periodically retraining the model with updated datasets, the system
can adapt to changing market dynamics, ensuring that predictions remain accurate and relevant over time.
6. Time Series Integration: Incorporate time-series data to account for market fluctuations and
inflation, enabling more accurate long-term forecasting.
7. External Data Augmentation: Integrate external datasets, such as proximity to schools or job hubs,
to further improve the Location Score impact and predictive accuracy.
8. Explainable AI (XAI): Implement SHapley Additive exPlanations (SHAP values) within the
Streamlit dashboard to show why the model arrived at a specific price, providing deep insight into the
individual feature contributions for each prediction.
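For the prediction ranges proposed in item 3, one simple option with a Random Forest is to look at how much the individual trees disagree. The sketch below is an illustration on synthetic data; the helper `prediction_range` and the 5th/95th percentile choice are assumptions, and the resulting band reflects the spread between trees rather than a calibrated statistical confidence interval.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic one-feature data: price grows with area, plus noise.
rng = np.random.default_rng(1)
X = rng.uniform(500, 3000, size=(300, 1))
y = 0.03 * X[:, 0] + rng.normal(0, 5, 300)

model = RandomForestRegressor(
    n_estimators=200, min_samples_leaf=5, random_state=1
).fit(X, y)


def prediction_range(forest, X_new, lower=5, upper=95):
    """Return (low, point, high) from the spread of per-tree predictions."""
    # One row per tree, one column per input sample.
    per_tree = np.stack([tree.predict(X_new) for tree in forest.estimators_])
    low = np.percentile(per_tree, lower, axis=0)
    high = np.percentile(per_tree, upper, axis=0)
    point = forest.predict(X_new)  # the forest's usual averaged prediction
    return low, point, high


low, point, high = prediction_range(model, np.array([[1500.0]]))
print(f"{low[0]:.1f} <= {point[0]:.1f} <= {high[0]:.1f}")
```

A wide band signals that the trees disagree for this kind of property, which is a useful warning to show next to the predicted price.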
In summary, the project not only demonstrates the successful application of machine learning for property
valuation but also provides a user-friendly tool for decision-making in real estate. With future enhancements
like cloud deployment, advanced analytics, and real-world data integration, this system has the potential to
become a comprehensive platform for buyers, sellers, and investors seeking accurate and actionable real
estate insights.
CHAPTER I
BIBLIOGRAPHY:
Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32. DOI: 10.1023/A:1010933404324.
(Foundational paper on the ensemble learning algorithm).
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M.,
Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot,
M., & Duchesnay, É. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning
Research, 12, 2825-2830. (Reference for the primary machine learning library used for modeling).
Streamlit Inc. (2023). Streamlit: The fastest way to build and share data apps. Retrieved from
the official Streamlit website. (Reference for the deployment and visualization framework).
Joblib Development Team. (2023). Joblib: Running Python functions as pipeline jobs. Retrieved from
the official Joblib documentation. (Reference for model persistence/serialization).
Fayyad, U. M., Piatetsky-Shapiro, G., & Smyth, P. (1996). From Data Mining to Knowledge Discovery
in Databases. AI Magazine, 17(3), 37-54. (Conceptual reference for the overall Knowledge Discovery in
Databases (KDD) process).
CHAPTER II
APPENDIX
CODING:
# importing libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
matplotlib.rcParams["figure.figsize"] = (20, 10)
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_absolute_error
import joblib
# Data preprocessing
# NOTE: `df` is assumed to be the loaded housing DataFrame; its name was lost in the source.
## getting the count of area type in the dataset
print(df.groupby('area_type')['area_type'].agg('count'))
## dropping unnecessary columns
df.drop(['area_type', 'society', 'availability', 'balcony'], axis='columns', inplace=True)
print(df.shape)

# Model evaluation (the model is assumed to have been fitted on X_train, y_train)
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print("Model Trained Successfully!")
print(f"R² Score: {r2:.3f}, MAE: {mae:.2f} Lakhs")

# Save model with feature names
# NOTE: `X` is assumed to be the training feature DataFrame; its name was lost in the source.
joblib.dump({"model": model, "columns": X.columns.tolist()}, "house_price_model.pkl")
import streamlit as st
import pandas as pd
import joblib
import plotly.graph_objects as go
# ======================
# Load Model and Columns
# ======================
saved = joblib.load("house_price_model.pkl")
model = saved["model"]
model_columns = saved["columns"]
# ======================
# Page Config
# ======================
st.set_page_config(page_title="🏠 House Price Prediction", layout="wide")
st.markdown("<h1 style='text-align: center; color: #6a0dad;'>🏡 House Price Prediction Dashboard</h1>",
            unsafe_allow_html=True)
st.markdown("<p style='text-align: center;'>Predict the <b>price of a house</b> based on its features.</p>",
            unsafe_allow_html=True)
st.markdown("---")
# ======================
# Main Layout: Input & Output Panels
# ======================
input_panel, output_panel = st.columns([1, 1])
# ======================
# Input Panel (Left)
# ======================
with input_panel:
    st.markdown("<h3 style='color:#6a0dad;'>🏠 Enter House Details</h3>", unsafe_allow_html=True)
    house_name = st.text_input("House Name (Optional)", "My House")
    # Card-style input sections (darker purple background)
    area_sqft = st.number_input("📏 Area (in sqft)", 500, 10000, 1500, step=50)
    bedrooms = st.selectbox("Bedrooms", [1, 2, 3, 4, 5, 6], index=2)
    bathrooms = st.selectbox("🛁 Bathrooms", [1, 2, 3, 4, 5], index=1)
    location_score = st.selectbox("📍 Location Score (1-10)", list(range(1, 11)), index=4)
    age_of_house = st.number_input("⏳ Age of House (years)", 0, 30, 5, step=1)
    parking = st.selectbox("🚗 Parking Available", ["Yes", "No"])
    furnishing = st.selectbox("Furnishing Status", ["Furnished", "Semi-Furnished", "Unfurnished"])
    predict_btn = st.button("🔮 Predict Price")
# ======================
# Encode categorical inputs
# ======================
parking_encoded = 1 if parking=="Yes" else 0
furnishing_map = {"Furnished": 0, "Semi-Furnished": 1, "Unfurnished": 2}
furnishing_encoded = furnishing_map[furnishing]
# ======================
# Prepare Input Data
# ======================
input_data = pd.DataFrame([{
    "area_sqft": area_sqft,
    "bedrooms": bedrooms,
    "bathrooms": bathrooms,
    "location_score": location_score,
    "age_of_house": age_of_house,
    "parking": parking_encoded,
    "furnishing": furnishing_encoded
}])
# Reorder columns to match the order used during training
input_data = input_data[model_columns]
# ======================
# Prediction & Output Panel (Right)
# ======================
with output_panel:
    if predict_btn:
        price = model.predict(input_data)[0]
        # Prediction Card
        st.markdown(f"""
        <div style='background-color:#4B0082; padding:20px; border-radius:15px; text-align:center; color:white;'>
        <h3>🏠 Predicted Price for: <b>{house_name}</b></h3>
        <h2>💰 ₹ {price:,.2f} Lakhs</h2>
        </div>
        """, unsafe_allow_html=True)
        # Gauge Chart
        fig = go.Figure(go.Indicator(
            mode="gauge+number",
            value=price,
            title={'text': "Predicted Price (Lakhs)"},
            gauge={'axis': {'range': [0, 200]},
                   'bar': {'color': "#800080"},
                   'steps': [{'range': [0, 50], 'color': '#6a0dad'},
                             {'range': [50, 100], 'color': '#7b1fa2'},
                             {'range': [100, 200], 'color': '#800080'}]}
        ))
        st.plotly_chart(fig, use_container_width=True)
        # Input Summary Card
        summary_data = pd.DataFrame({
            "Feature": ["Area (sqft)", "Bedrooms", "Bathrooms", "Location Score", "Age of House", "Parking",
                        "Furnishing"],
            # Cast to str so the mixed numeric/text column renders consistently
            "Value": [str(v) for v in (area_sqft, bedrooms, bathrooms, location_score, age_of_house,
                                       parking, furnishing)]
        })
        st.markdown("""
        <div style='background-color:#4B0082; padding:15px; border-radius:10px; color:white;'>
        <h4>📝 Entered House Details</h4>
        </div>
        """, unsafe_allow_html=True)
        # The display call was garbled in the source; st.table is assumed here
        st.table(summary_data)
# ======================
# Footer
# ======================
st.markdown("---")
st.markdown("<p style='text-align: center; color: #6a0dad;'>💡 Built with Streamlit & Random Forest