Housing Price Prediction Model for D. M. Pan National Real Estate Company


Introduction

  1. This report applies statistical analysis techniques to a business problem. Specifically, it uses regression analysis to develop a model that predicts housing prices for homes sold in 2019 for D. M. Pan National Real Estate Company. The results will help real-estate agents predict the median price of houses from their area in square feet.
  2. The study asks whether a relationship exists between the listing price of a house and its area in square feet. The answer will determine whether the company can use the model for price prediction.
  3. A regression model is appropriate for this analysis because it shows the relationship between the two variables as well as its direction, which can also be seen in a scatter plot with a fitted regression line. The scatter plot may show an upward linear trend, a downward linear trend, or no linear relationship at all.
  4. The predictor variable is the variable whose changes are used to explain or predict the outcome, while the response variable is the variable that changes as the predictor changes.

Data Collection

  1. A sample of 50 homes is taken from the Excel sheet of house square footage values and corresponding prices by selecting the first 50 data rows. This selection is random only if the rows of the sheet are in no particular order.
  2. Square footage affects the listing price, so square footage is taken as the predictor variable and listing price as the response variable.
  3. Scatter Plot for the Variables
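
The difference between taking the first 50 rows and drawing a simple random sample can be sketched as follows. This is only an illustration: the `rows` list is hypothetical placeholder data standing in for the Excel sheet, and the function names and the seed are assumptions, not part of the original analysis.

```python
import random

def first_n(rows, n):
    """Return the first n rows, as described in the data collection step."""
    return rows[:n]

def simple_random_sample(rows, n, seed=None):
    """Return n rows drawn without replacement, each row equally likely."""
    rng = random.Random(seed)
    return rng.sample(rows, n)

# Hypothetical placeholder data: (square_feet, listing_price) pairs.
rows = [(1000 + 10 * i, 150000 + 900 * i) for i in range(200)]

head = first_n(rows, 50)
drawn = simple_random_sample(rows, 50, seed=2019)
```

If the sheet happens to be sorted by price or size, `head` would cover only one end of the range, while `drawn` would not depend on the row order.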

Data Analysis

Statistic             Listing Price    Square Feet
Mean                  229130           1758.5
Standard Error        8912.724795      92.1583121
Median                216200           1630
Mode                  254500           2087
Standard Deviation    63022.48141      651.657674
Sample Variance       3971833163       424657.724
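
The rows of the table are related to one another: the standard error is the standard deviation divided by the square root of the sample size, and the sample variance is the square of the standard deviation. A minimal sketch of that check, assuming the sample size of 50 stated in the Data Collection section (the function names are illustrative):

```python
import math

n = 50  # sample size stated in the Data Collection section

# Standard deviations copied from the descriptive statistics table.
sd_price = 63022.48141
sd_sqft = 651.657674

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

def variance(sd):
    """Variance is the square of the standard deviation."""
    return sd ** 2

se_price = standard_error(sd_price, n)
se_sqft = standard_error(sd_sqft, n)
var_price = variance(sd_price)
var_sqft = variance(sd_sqft)
```

The computed values agree with the Standard Error and Sample Variance rows of the table to within rounding.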

Interpret the graphs and statistics: For both variables, the mean is larger than the median (229,130 versus 216,200 for listing price and 1,758.5 versus 1,630 for square footage). A mean above the median implies that both distributions are positively skewed, with most values concentrated toward the lower end and a tail of larger values pulling the mean upward. The standard deviation of the listing price (about 63,022) is numerically larger than that of the square footage (about 652), but the two are measured in different units and cannot be compared directly.
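
One rough way to put a number on the gap between mean and median is Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation. The sketch below applies it to the values in the table. The report itself does not compute this measure, so it is offered only as a supporting check.

```python
def pearson_second_skewness(mean, median, sd):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd

# Values copied from the descriptive statistics table.
skew_price = pearson_second_skewness(229130, 216200, 63022.48141)
skew_sqft = pearson_second_skewness(1758.5, 1630, 651.657674)

print(f"listing price skewness: {skew_price:.2f}")
print(f"square feet skewness:   {skew_sqft:.2f}")
```

Both coefficients are positive (about 0.6), which points to a moderate positive skew in each variable.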

The national mean listing price is 342,365, so the sample mean listing price (229,130) is lower than the national mean. However, in both the sample and the national data, the mean listing price and the mean square footage exceed their respective medians. This suggests that the sample and the national distributions share a similar, positively skewed shape (Turner, 2020), although the lower sample mean indicates that the sampled homes are priced below the national average.

The Regression Model

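From the graph above, a regression model can be developed from the data. The model is obtained as y = 0.009x - 307.91. Although the fit is not perfect, the model explains about 76% of the variation in the response variable. Note that these coefficients are consistent with the sample means only if x denotes the listing price and y the square footage (0.009 × 229,130 - 307.91 ≈ 1,754, close to the mean square footage of 1,758.5). Predicting listing price from square footage requires fitting the regression with the two variables in the opposite roles.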
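
The slope and intercept of a least-squares line can be approximated from summary statistics alone: slope = r × (SD of response / SD of predictor), and intercept = mean of response − slope × mean of predictor. The sketch below applies these formulas to the table values and the reported r of 0.87, in both directions. Because r is rounded to two decimals, the results are approximations rather than the exact Excel output.

```python
def line_from_summary(r, mean_x, sd_x, mean_y, sd_y):
    """Least-squares slope and intercept of y on x from summary statistics."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

r = 0.87  # correlation reported in the regression output (rounded)

# Summary statistics copied from the descriptive statistics table.
mean_price, sd_price = 229130, 63022.48141
mean_sqft, sd_sqft = 1758.5, 651.657674

# Square feet as response, listing price as predictor.
slope_sqft, icept_sqft = line_from_summary(r, mean_price, sd_price, mean_sqft, sd_sqft)

# Listing price as response, square feet as predictor.
slope_price, icept_price = line_from_summary(r, mean_sqft, sd_sqft, mean_price, sd_price)

print(f"sqft  = {slope_sqft:.4f} * price + {icept_sqft:.1f}")
print(f"price = {slope_price:.2f} * sqft + {icept_price:.0f}")
```

The first line has a slope of about 0.009, which appears to match the reported model. The second line, with a slope of roughly 84 per square foot, is the form an agent would need to predict a listing price from square footage.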

Discuss associations: The scatter plot and the trend line show a positive association between square footage and listing price, as indicated by the upward trend: larger homes tend to have higher listing prices. Removing outliers would likely produce a better-fitting model.
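
The report does not say how outliers would be identified. One common convention, assumed here purely for illustration, is the 1.5 × IQR rule: values more than 1.5 interquartile ranges below the first quartile or above the third quartile are flagged. The square footage values below are hypothetical, not taken from the sample.

```python
import statistics

def iqr_outliers(values):
    """Return the values outside the 1.5 * IQR fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical square footage values with one unusually large home.
sqft = [1200, 1400, 1500, 1600, 1700, 1800, 2000, 2100, 5200]

outliers = iqr_outliers(sqft)
print(outliers)
```

Flagged points should be inspected before removal, since an unusual but genuine listing is still part of the market the model is meant to describe.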

Find r: From the regression output, the r value is 0.87. This shows a strong positive linear relationship between square footage and listing price.
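
Two quantities follow directly from r. Squaring it gives the coefficient of determination, the share of variation explained by the model. Combining it with the sample size gives a t statistic, t = r × sqrt(n − 2) / sqrt(1 − r²), which is one standard way to test whether the correlation differs from zero. The sketch assumes n = 50 from the Data Collection section. The t-test is an addition for illustration and is not part of the reported output.

```python
import math

r = 0.87  # correlation reported in the regression output
n = 50    # sample size stated in the Data Collection section

# Coefficient of determination: proportion of variation explained.
r_squared = r ** 2

# t statistic for testing whether the correlation is zero (n - 2 degrees of freedom).
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"r^2 = {r_squared:.4f}")
print(f"t   = {t_stat:.2f} with {n - 2} degrees of freedom")
```

An r² of about 0.76 matches the 76% figure quoted for the regression model, and a t statistic above 12 on 48 degrees of freedom is far beyond conventional critical values.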
