Category Archives: R

Moving from Excel to R: What Do I Need to Replicate in R?

This is the second post in a series describing my journey to move my residential appraisal business workflow from Microsoft Excel to R. Last time out, I made the case for why I’m making the change. This post will be a catalog of the ways I use Excel today to serve as a guide for where I need to go.

I use Excel a lot. I start a separate Excel file for each appraisal I work on. We appraisers are required to retain, for each report, a work file that supports our conclusions and lets someone auditing us understand what we did.

Here’s what I do with Excel today:

  • Store my Neighborhood Market Data downloaded from MLS. I grab all sales in a competitive market area for at least five years back, sometimes ten years. This goes into a Market worksheet.
  • Store my Competitive Market Data downloaded from MLS. I grab all potential competitive comparable sales and listings going back at least 12 months but frequently further. This goes into a Comparables worksheet.
  • Create Neighborhood Price Per Square Foot and/or Sale Price trendlines from the Market data. If I have questions about trends, I’ll also look at changes in floor area over time. I create Charts for each data run. I then spend time formatting and labeling so I can include the charts in my reports.
  • Create Pivot Table summaries of the Neighborhood Market Data. My normal summary table includes all sales summarized by month in a neighborhood with homes sold, mean Days on Market, Low Price, High Price, Mean Sale Price, and Mean PSF. I use a template and replace my old market data with the new data, then refresh, so right now this is really fast in Excel. However, I can’t do Median summaries in Pivot Tables easily, an issue that I expect to be able to handle in R. For most of my reports, I include this pivot table summary.
  • I use the pivot table summaries to create a column chart showing 12 month change in mean PSF and/or mean sale price as another tool to understand and report changes in my neighborhoods. This is especially important in seasonal markets like Davis, California, where home selling revolves around the university schedule. I’ll include this chart in every Davis appraisal and in other appraisals where necessary to explain market trends.
  • I occasionally create histograms to show the shape of a market with regard to one variable, primarily sale price or floor area. They are great for showing where the subject lies in relation to the rest of the market. I’ve seen an example of how easy it is to create a histogram in R, and I have high expectations that R will be an improvement over Excel for histograms.
  • Create PSF and Sale Price scatter graphs of competitive sales. I use the trendline coefficients to determine daily price adjustments for market change. I’ll also look at floor area over time to see if my comparables are changing over time or not to help understand what my market is really doing. I include the scatter graphs in my reports so clients and intended users can understand the subject’s sub-market.
  • I use pivot tables linked to my comparable sales data for contrasting one variable. For example, I’ll use pivot tables to examine the difference between homes sold with pools and without pools, a significant factor in the Sacramento Valley. I’ll create a table that shows how many comparables sold with pools vs. without pools, the difference in mean sale price, and the difference in mean PSF. To understand my data, I also include mean floor area and mean year built, which tell me whether I’m dealing with an apples-to-apples or an apples-to-oranges comparison. If the homes with pools are similar in size and age to the ones without pools, my adjustment is more likely to be strictly the pool. If homes with pools tend to be bigger than those without, I have to consider covariance as part of the explanation for the differences noted. (Covariance is a significant issue in residential real estate markets.)
  • I have a Calculators page that I use for random modeling and calculations I need to do by hand. The most significant calculators I’ll need to move to R are one that does the math for time adjustments on comparables and others that model lot size adjustments. These should be painless to move over to R.
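The time-adjustment arithmetic mentioned in the list above (a trendline coefficient turned into a daily price adjustment) can be sketched in base R. This is only an illustration under stated assumptions, not the actual calculator: the data are made up, the column names `sale_date` and `sale_price` and the effective date are hypothetical, and a straight-line trend is assumed.

```r
# Illustrative data only: five made-up comparable sales, not real MLS records.
# The prices lie exactly on a line rising $50 per day, so the fitted slope
# is easy to check by hand.
comps <- data.frame(
  sale_date  = as.Date(c("2023-01-15", "2023-03-01", "2023-05-10",
                         "2023-07-20", "2023-09-30")),
  sale_price = c(500000, 502250, 505750, 509300, 512900)
)

# Fit a straight trendline: sale price as a function of sale date in days.
fit <- lm(sale_price ~ as.numeric(sale_date), data = comps)

# The slope is the estimated price change in dollars per day.
daily_change <- unname(coef(fit)[2])

# Assumed effective date of the appraisal (hypothetical).
effective_date <- as.Date("2023-10-31")

# Days from each sale to the effective date, times the daily change,
# gives each comparable's market-change adjustment.
days_elapsed    <- as.numeric(effective_date - comps$sale_date)
comps$time_adj  <- round(daily_change * days_elapsed)
comps$adj_price <- comps$sale_price + comps$time_adj

print(round(daily_change, 2))
print(comps)
```

Because the made-up prices sit exactly on a $50-per-day line, every comparable adjusts to the same price at the effective date. Real sales data will scatter around the trendline, and whether a straight line is the right model is a judgment call for each market.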

This is the bulk of what I do with Excel today. As I start to shift this workflow over to R, I plan to go into more detail about the special or not so special challenges I encounter. I also have high expectations that R will inspire me to come up with new ways of analyzing and presenting my data.

Reminder for appraiser readers in particular: R is a tool. Excel is a tool. Most of what I plan to discuss in this series is about changing tools. Occasionally, I’ll talk about modeling decisions (like covariance above). However, all of what I’m doing is rooted in the Stats, Graphs, and Data Sciences classes I’ve taken. You need to understand the theory so you can make informed decisions about your modeling choices.

Take classes from George; he’s very willing to help. https://georgedell.com/

Why I’m Switching to R

I’ve come to rely heavily on Microsoft Excel over the years to do my work as a residential appraiser. So much so that I teach classes to other appraisers on how to use Excel in their work. However, after taking George Dell’s first R for Appraisers class recently, I’ve decided to completely revise my workflow and replace what I do in Excel as much as possible with R.

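“What’s R?” might be your first question. R is a free programming language and software environment for statistical computing and graphics, and it is widely used for data analysis. Many university economics programs teach with R these days. The official description is on the R Project website, and a free copy can be downloaded from CRAN, the Comprehensive R Archive Network.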

If you’re going to use R, you’ll also want RStudio, the free integrated development environment for R. It puts the script editor, console, plots, and data viewer in one window, which makes working with the basic R language much easier. A free desktop version and more information are available from its developer’s website.

Why would someone who has invested heavily in developing skills in Excel move to a brand new software package? Here are my reasons:

  • Data Analytics vs. Spreadsheets (Data Retention): R is software designed for data analytics from the ground up. That’s what appraisers do: relatively specialized data analytics. Spreadsheets were designed to replace paper ledgers. You can scribble all over a sheet. If you’re not careful, you’re very likely to write over your numbers and make a mess. This is a problem if you need to preserve your data, as appraisers do when maintaining the workfiles they are legally required to keep. R helps with this data retention issue because the analysis lives in a script that reads the source data and leaves it untouched.
  • Reusable Processes: R is designed from the ground up to be reusable. Drop your data in and get your results. I’ve done a lot to get Excel to work that way for me, but I still do a ton of manual processes each time I work on an appraisal. Once I know what I’m doing in R, I’ll have a lot fewer manual processes to deal with. It will also be easier for someone else to audit my analysis, since the script records each step.
  • Superior Analysis: I use pivot tables a lot in Excel. In fact, I teach a class on using pivot tables for appraisers. The big drawback with Excel pivot tables is that I can’t include median values as part of my summaries without a lot of work (coding or buying someone else’s product). This is not an issue in R, where median() is a built-in function.
  • Personal Growth: R gives me an opportunity to learn new ways to analyze data, the underlying job function that has given me the most satisfaction throughout my career. I’m excited to learn new ways to do my job better and I expect big changes once I’ve completed the move to R. Also, if this appraising gig doesn’t work out, with R I’ll have a job skill in demand in other industries.
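To make the reusable-process and median points above concrete, here is a minimal base R sketch of a pivot-table-style summary that includes the median. Everything in it is an assumption for illustration: the helper name `summarize_sales`, the column names (`sale_date`, `sale_price`, `sqft`, `dom`, `pool`), and the five made-up sales.

```r
# Hypothetical helper: summarise sales by any grouping column.
# Assumes columns sale_price, sqft, and dom (days on market).
summarize_sales <- function(sales, by) {
  groups <- split(sales, sales[[by]])
  rows <- lapply(names(groups), function(g) {
    d <- groups[[g]]
    data.frame(
      group        = g,
      homes_sold   = nrow(d),
      mean_dom     = mean(d$dom),
      low_price    = min(d$sale_price),
      high_price   = max(d$sale_price),
      mean_price   = mean(d$sale_price),
      median_price = median(d$sale_price),
      mean_psf     = mean(d$sale_price / d$sqft)
    )
  })
  do.call(rbind, rows)
}

# Illustrative data only: five made-up sales, not real MLS records.
sales <- data.frame(
  sale_date  = as.Date(c("2023-01-05", "2023-01-20", "2023-01-28",
                         "2023-02-03", "2023-02-17")),
  sale_price = c(400000, 450000, 650000, 500000, 520000),
  sqft       = c(1600, 1800, 2000, 2000, 2000),
  dom        = c(10, 20, 30, 14, 16),
  pool       = c("No", "Yes", "Yes", "No", "Yes")
)

# Derive a year-month label to group by.
sales$month <- format(sales$sale_date, "%Y-%m")

# The same function serves both summaries: drop the data in, get results.
monthly <- summarize_sales(sales, by = "month")
by_pool <- summarize_sales(sales, by = "pool")

print(monthly)
print(by_pool)
```

The grouping column is a parameter, so one function covers both the monthly neighborhood summary and a with/without comparison such as pools. Swapping in new data means changing only the data frame, not the steps.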

I plan to document my migration from Excel to R here. I’ll share resources I find useful and will discuss issues I run into. I also plan to describe the benefits and drawbacks. This is mainly for me, and maybe for Abdur Abdul-Malik and Bruce Hahn, the other appraisers I know making the same journey. And maybe I can help George Dell come up with ideas on how to spread the word to the rest of the industry. Thanks again, George, for the inspiration.