I thought this was a great article on regression that I wanted to reprint from WorkingRE. You can find the link to the entire article below.
Editor’s Note: In life after Collateral Underwriter (CU), appraisers are eager to understand how they can create statistical support for their adjustments and value results. In this story, author James Swartz provides a good primer for understanding regression and appraising.
Appraising with Regression
By James A. Swartz, Ph.D.
As a real estate appraiser, you are interested in determining how a set of characteristics, such as the number of baths and bedrooms, total square feet, and others, affects the value of a property.
Property characteristics that affect value are called independent or predictor variables because they help predict what the property is worth. An estimated sales price is called the dependent variable because it depends on the predictor variables.
One statistical tool used to estimate value is called regression analysis. Regression techniques have long been central to the field of economic statistics (“econometrics”). Increasingly, they have become important tools for appraisers as well.
In an appraisal regression model, the dependent variable, or sales price, is “regressed” on a set of property characteristics to determine how much of the variation in the sales prices of geographically comparable properties is due to the variation in that set of property characteristics. The higher the percentage of variation in the sales prices of like properties that can be “explained” by the included set of property characteristics, the more accurate the prediction of the property value. A thorough regression analysis will therefore improve your ability to accurately appraise property values beyond what can be done by “guesstimating” or by looking at only one or a few property characteristics and comparing only a few similar properties.
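The "percentage of variation explained" is usually reported as R-squared. As a rough sketch, assuming you already have actual sale prices and a model's predicted prices for the same properties, it can be computed as follows. The five prices and the function name r_squared are made up for illustration.

```python
def r_squared(actual_prices, predicted_prices):
    """Share of the variation in sale prices that the predictions account for."""
    mean_price = sum(actual_prices) / len(actual_prices)
    # Total variation: how far each sale price sits from the average price
    total_variation = sum((a - mean_price) ** 2 for a in actual_prices)
    # Unexplained variation: how far each prediction misses the actual price
    unexplained = sum(
        (a - p) ** 2 for a, p in zip(actual_prices, predicted_prices)
    )
    return 1.0 - unexplained / total_variation


# Hypothetical sale prices and model predictions for five properties
actual = [200_000, 250_000, 300_000, 350_000, 400_000]
predicted = [210_000, 240_000, 305_000, 345_000, 400_000]

print(round(r_squared(actual, predicted), 3))
```

A value near 1 means the included characteristics account for nearly all of the price differences among the properties; a value near 0 means they account for almost none.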
The best way to begin to understand how regression modeling works in the context of property appraisal is to explain the approaches used and then walk through an example.
Regression Modeling Approaches
Let’s begin by looking at a few different methods of regression analysis. Simple linear regression, also called bivariate regression, assesses the relationship or association between a single dependent variable, such as sales price, and a single independent or predictor variable, such as square footage.
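To make this concrete, here is a minimal sketch of a simple linear regression of sales price on square footage, written in plain Python. The five sales are invented for illustration, and simple_regression is a helper name chosen here, not something from the article.

```python
def simple_regression(x, y):
    """Least-squares line y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # How x and y move together, and how much x varies on its own
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical comparable sales
square_feet = [1200, 1500, 1800, 2100, 2400]
sale_price = [172_000, 198_000, 235_000, 262_000, 293_000]

slope, intercept = simple_regression(square_feet, sale_price)
print(f"Each additional square foot adds about ${slope:,.0f}")
print(f"Estimated price at 2,000 sq ft: ${intercept + slope * 2000:,.0f}")
```

The slope is the estimated dollar change in price per additional square foot, based on this one characteristic alone.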
Multiple linear regression assesses the relationship or association between a single dependent variable, such as sales price, and multiple independent or predictor variables, such as square footage, lot size and age of the property.
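The sketch below fits a multiple linear regression with three predictors using only the Python standard library. The six sales are not real data: the prices are built from a known rule ($110 per square foot, $4 per square foot of lot, minus $800 per year of age) so you can see that the fit recovers those values. The function names are illustrative, and in practice a statistics package would do this work.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def multiple_regression(columns, y):
    """Ordinary least squares; returns (intercept, list of coefficients)."""
    n, k = len(y), len(columns)
    means = [sum(col) / n for col in columns]
    mean_y = sum(y) / n
    # Centering each variable keeps the arithmetic well behaved
    xc = [[v - mu for v in col] for col, mu in zip(columns, means)]
    yc = [v - mean_y for v in y]
    xtx = [[sum(p * q for p, q in zip(xc[i], xc[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(xc[i], yc)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    intercept = mean_y - sum(c * mu for c, mu in zip(coefs, means))
    return intercept, coefs


# Hypothetical properties; prices follow a known rule for illustration
square_feet = [1500, 1800, 2100, 2400, 1650, 2000]
lot_size = [6000, 7500, 7000, 9000, 8000, 6500]
age_years = [30, 12, 20, 5, 25, 15]
prices = [30_000 + 110 * s + 4 * l - 800 * a
          for s, l, a in zip(square_feet, lot_size, age_years)]

intercept, coefs = multiple_regression(
    [square_feet, lot_size, age_years], prices
)
print([round(c, 2) for c in coefs], round(intercept, 2))
```

Each coefficient is the estimated effect of one characteristic with the other two held constant, which is what distinguishes this from running three separate simple regressions.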
Hedonic regression is an even higher level of analysis because it can be used to estimate the contributory value of each property characteristic to an estimated property value. There are four steps in the process:
First, a group of properties in geographic proximity to the subject is selected from the MLS database. The number of properties selected for analysis should be large enough (no fewer than 200 properties) to allow for the examination of the variation in property characteristics as well as in sales prices. Greater variation in the characteristics studied is desired, rather than a narrow comp selection based on strict comparability, because greater variation improves the accuracy of the estimated effects on sales price for each characteristic across a broader range of possible values for that characteristic.
Then, a set of simple regression models is run using the selected properties. A separate simple, or bivariate (i.e., two-variable), regression is run for each property characteristic. Each simple regression is run on the full data set containing information on the 200+ properties. The purpose of this step is to screen for those property characteristics that can be reliably associated with variations in the sales prices of the properties in the data set.
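A screening pass of this kind might look like the following sketch. It assumes a 5% significance cutoff and, because the data set is large, uses a normal approximation for the p-value; both are choices made here, not taken from the article. The 200 properties are synthetic, and even_address is a deliberately irrelevant trait included to show one characteristic being screened out.

```python
from math import sqrt
from statistics import NormalDist


def pearson_r(x, y):
    """Correlation between two equally long lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def screen(characteristics, prices, alpha=0.05):
    """Keep characteristics whose bivariate association with price is significant."""
    n = len(prices)
    kept = []
    for name, values in characteristics.items():
        r = pearson_r(values, prices)
        # In a one-predictor regression, the slope's t-statistic equals this
        t = r * sqrt((n - 2) / (1 - r * r))
        # Normal approximation to the t distribution (assumed fine for n >= 200)
        p_value = 2 * (1 - NormalDist().cdf(abs(t)))
        if p_value < alpha:
            kept.append(name)
    return kept


# Synthetic data set of 200 properties (illustrative only)
index = range(200)
square_feet = [1000 + 10 * i for i in index]
even_address = [i % 2 for i in index]  # hypothetical irrelevant trait
noise = [5000 * ((7 * i) % 11 - 5) for i in index]
prices = [50_000 + 100 * s + e for s, e in zip(square_feet, noise)]

candidates = {"square_feet": square_feet, "even_address": even_address}
print(screen(candidates, prices))
```

Only the characteristics returned by the screen would move on to the multivariable model in the next step.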
In the third step of the process, those property characteristics that are found to be reliably associated (i.e., statistically significant) with the sales prices of properties are included in a multivariable regression model (i.e., a model with many variables). Using the full data set, the multivariable regression assesses the association between sales price, the single dependent variable, and a set of multiple predictors such as square footage, lot size, and age of the property.
This model determines how well the remaining property characteristics, individually and as a set, predict variation in sales prices. Importantly, the model examines the associations among the property characteristics themselves. Only those characteristics that uniquely predict sales price are retained in the model. Characteristics that overlap substantially with others already in the model are dropped. Only the best predictor set is retained.
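One simple way to spot characteristics that overlap is to look at how strongly the predictors correlate with each other. The sketch below flags any pair whose correlation reaches 0.8 in absolute value. That cutoff, the six properties, and the function names are assumptions made for illustration; the article does not say which diagnostic its model uses.

```python
from math import sqrt


def pearson_r(x, y):
    """Correlation between two equally long lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def overlapping_pairs(characteristics, threshold=0.8):
    """List pairs of characteristics that are strongly correlated with each other."""
    names = list(characteristics)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(characteristics[first], characteristics[second])
            if abs(r) >= threshold:
                flagged.append((first, second, round(r, 2)))
    return flagged


# Hypothetical characteristics for six properties
characteristics = {
    "square_feet": [1000, 1200, 1500, 1800, 2000, 2400],
    "bedrooms": [2, 3, 3, 4, 4, 5],
    "age_years": [30, 5, 22, 14, 40, 9],
}

for first, second, r in overlapping_pairs(characteristics):
    print(f"{first} and {second} overlap (r = {r})")
```

When two characteristics are flagged, they are carrying much of the same information, and the one that adds less to the prediction is the natural candidate to drop.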
In the fourth and final step, the appraiser selects a small subset (three) of properties comparable to the subject property. A map displaying properties by their proximity to the subject assists in this step. Using the predicted sales prices for these properties and the valuation for each property characteristic in the final multivariable model, a final value is obtained for the appraised property.
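The article does not spell out the arithmetic of this last step. One common reading, sketched below, is an adjustment grid: each comparable's price is adjusted by the model's per-unit value for every characteristic on which it differs from the subject, and the adjusted prices are then reconciled (here, by a plain average). The per-unit values, the three comparables, and the use of an average are all hypothetical.

```python
# Hypothetical per-unit values taken from a fitted multivariable model
coefficients = {"square_feet": 100.0, "bedrooms": 5_000.0, "age_years": -800.0}

subject = {"square_feet": 2000, "bedrooms": 3, "age_years": 20}

comparables = [
    {"sale_price": 250_000, "square_feet": 1900, "bedrooms": 3, "age_years": 25},
    {"sale_price": 262_000, "square_feet": 2100, "bedrooms": 3, "age_years": 15},
    {"sale_price": 270_000, "square_feet": 2050, "bedrooms": 4, "age_years": 20},
]


def adjusted_price(comp, subject, coefficients):
    """Adjust a comparable's price to what it would be with the subject's features."""
    adjustment = sum(
        value * (subject[name] - comp[name])
        for name, value in coefficients.items()
    )
    return comp["sale_price"] + adjustment


adjusted = [adjusted_price(c, subject, coefficients) for c in comparables]
indicated_value = sum(adjusted) / len(adjusted)

print(adjusted)
print(round(indicated_value))
```

For example, the first comparable is 100 square feet smaller and five years older than the subject, so it is adjusted upward by $10,000 for size and $4,000 for age.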
Overcoming Math Anxiety
People new to regression modeling often have questions about the technique and concerns about their ability to use and interpret the more complex but more accurate multivariable models. However, once a level of familiarity is developed and “math anxiety” is overcome, multivariable regression models become an indispensable tool for developing appraisals.
Although simple, or bivariate, linear regression models are appealing because of their simplicity, they will not produce as accurate a result as a multivariable regression analysis, for a number of reasons. First, the selection of which variable(s) to use might leave out potentially important factors. Second, bivariate regression does not control for the associations among the predictor variables themselves. For instance, the number of bedrooms and square footage might both have a very strong association with sale price when considered independently in separate bivariate models. When considered simultaneously in a multivariable model, the association of the number of bedrooms with sales price might not be as substantial relative to the association with square footage.
Multivariable models will automatically adjust for the interdependency of these two (or more) predictors, while bivariate models will not. As a consequence, if you use two bivariate models to develop your estimate, you will overestimate the sale price because you will be giving too much weight to the number of bedrooms within the context of the property’s square footage. The degree of error introduced by adding together the results of bivariate models increases as you consider more and more predictors. In the end, it is much easier to use a multivariable model to make the correct adjustments and estimate a value without having to do all of the tedious mental and mathematical work of combining separate estimates.
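The bedroom example can be illustrated with a small sketch. The six properties below are invented, and their prices are built from a known rule ($100 per square foot plus $5,000 per bedroom), so the "true" value of a bedroom is known in advance. The code compares the slope from a bedrooms-only regression with the bedroom coefficient from a two-predictor regression, using the standard closed-form least-squares solution for two predictors.

```python
def cross_product_sum(x, y):
    """Sum of products of deviations from the means of x and y."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))


# Hypothetical properties: larger homes tend to have more bedrooms
square_feet = [1000, 1200, 1500, 1800, 2000, 2400]
bedrooms = [2, 3, 3, 4, 3, 5]
# Illustrative prices: $100 per square foot plus $5,000 per bedroom
prices = [40_000 + 100 * s + 5_000 * b for s, b in zip(square_feet, bedrooms)]

s_ff = cross_product_sum(square_feet, square_feet)
s_bb = cross_product_sum(bedrooms, bedrooms)
s_fb = cross_product_sum(square_feet, bedrooms)
s_fp = cross_product_sum(square_feet, prices)
s_bp = cross_product_sum(bedrooms, prices)

# Two separate bivariate regressions, one predictor each
bivariate_sqft = s_fp / s_ff
bivariate_bedroom = s_bp / s_bb

# One multivariable regression with both predictors together
determinant = s_ff * s_bb - s_fb ** 2
multi_sqft = (s_bb * s_fp - s_fb * s_bp) / determinant
multi_bedroom = (s_ff * s_bp - s_fb * s_fp) / determinant

print(f"Bedroom value, bivariate model:     ${bivariate_bedroom:,.0f}")
print(f"Bedroom value, multivariable model: ${multi_bedroom:,.0f}")
```

In this made-up data the bedrooms-only model credits each bedroom with roughly $48,000, because bedrooms are standing in for the extra square footage that tends to come with them. The two-predictor model recovers the $5,000 that was built into the prices.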
Remember, regression analysis is not a substitute for traditional appraisal practices as much as it is a complement to your experience and judgment. It will help you identify the most salient features for estimating the value for a given property, possibly including some you might not have thought were important but which turn out to be, based on sales of other properties in the same area.
About the Author
James Swartz, Ph.D., is an Associate Professor in the Jane Addams College of Social Work at the University of Illinois at Chicago. He obtained his doctorate in clinical psychology from the Northwestern University Feinberg School of Medicine (1990) and also has a master’s degree in research methods and statistics from Loyola University of Chicago (1982). He has authored over 50 publications in peer-reviewed journals, the majority of which use advanced statistical analytic techniques.