Francisco Riaño

Machine Learning: Airbnb Optimal Price Estimator App

How can we use machine learning and R Shiny to develop an app that estimates optimal accommodation prices?

Machine learning is currently one of the most popular fields within data science because of its ability to identify patterns, simplify data, and produce forecasts.

Motivation

Machine learning lets us build models from training data and evaluate them on testing data in order to make accurate predictions. On this basis, an app was developed to help new landlords in Amsterdam estimate the optimal price for a property they want to list on Airbnb. The app also gives new landlords an overview of the properties already listed on Airbnb, broken down by neighborhood.

Method

The project was based on a random forest algorithm, which allows us not only to predict the outcome variable (the optimal price for an Airbnb accommodation, based on attributes such as neighborhood or room type) but also to identify the most important variables to include in the model. The following steps were carried out to obtain the desired outcome.

  1. Download the data and clean it.
  2. Determine the most influential variables, using the “randomForest” function in R, to decide which ones to include in the regression.
  3. Compute a correlation matrix to avoid including two or more highly correlated variables in the model.
  4. Split the data into training data (70% of the dataset) and testing data (the remaining 30%), and fit the model on the training data.
  5. Use the “predict” function in R to predict the outcome variable from the specific inputs selected at the beginning of the process.

Figure 1: Random forest output showing the variables and their level of influence on the outcome variable, the accommodation price.

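The steps above can be sketched in R roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the data are synthetic stand-ins for the cleaned Amsterdam listings, every column name (`neighbourhood`, `room_type`, `accommodates`, `bedrooms`, `minimum_nights`, `price`) is hypothetical, and the 0.8 correlation cutoff and `ntree = 200` are arbitrary choices made for the example.

```r
# Sketch only: synthetic data and hypothetical column names throughout.
library(randomForest)

set.seed(42)
n <- 500

# Step 1 (stand-in): a small "cleaned" data set instead of the real download
listings <- data.frame(
  neighbourhood  = factor(sample(c("Centrum", "Noord", "Zuid"), n, replace = TRUE)),
  room_type      = factor(sample(c("Entire home/apt", "Private room"), n, replace = TRUE)),
  accommodates   = sample(1:6, n, replace = TRUE),
  minimum_nights = sample(1:7, n, replace = TRUE)
)
# bedrooms is built to track accommodates closely, so step 3 has something to catch
listings$bedrooms <- pmax(1, round(listings$accommodates / 2))
listings$price <- 40 + 20 * listings$accommodates +
  30 * (listings$room_type == "Entire home/apt") +
  25 * (listings$neighbourhood == "Centrum") + rnorm(n, sd = 10)

# Step 2: rank variables by permutation importance (%IncMSE)
rf_all <- randomForest(price ~ ., data = listings, ntree = 200, importance = TRUE)
imp <- importance(rf_all, type = 1)
print(imp[order(imp[, 1], decreasing = TRUE), , drop = FALSE])

# Step 3: correlation matrix of the numeric predictors; flag pairs above the cutoff
num_vars <- c("accommodates", "bedrooms", "minimum_nights")
cor_mat <- cor(listings[, num_vars])
high <- which(abs(cor_mat) > 0.8 & upper.tri(cor_mat), arr.ind = TRUE)
drop_vars <- unique(colnames(cor_mat)[high[, "col"]])  # drop one variable of each pair
model_data <- listings[, setdiff(names(listings), drop_vars)]

# Step 4: 70% training data, 30% testing data; fit on the training part only
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- model_data[train_idx, ]
test  <- model_data[-train_idx, ]
rf <- randomForest(price ~ ., data = train, ntree = 200)

# Step 5: predict on the held-out data and on one hypothetical new listing
pred <- predict(rf, newdata = test)
rmse <- sqrt(mean((pred - test$price)^2))

new_listing <- data.frame(
  neighbourhood  = factor("Centrum", levels = levels(train$neighbourhood)),
  room_type      = factor("Private room", levels = levels(train$room_type)),
  accommodates   = 2L,
  minimum_nights = 2L
)
estimate <- predict(rf, newdata = new_listing)
```

Building the new listing's factors with the training levels keeps `predict` from rejecting the input, and the held-out `rmse` gives a rough sense of how far the estimates are from the observed prices.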
Results and conclusions

Once the analysis was complete, an app built with R Shiny was developed to present the main results of the model in a more dynamic, user-friendly, and practical way. The app is available at the following link: www.franciscoriano.shinyapps.io/Airbnb-price-estimation/
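As a rough idea of how a fitted random forest can sit behind a Shiny interface, here is a minimal sketch. It is not the code of the app linked above: the data are synthetic, and the input names, the helper `estimate_price`, and the layout are all assumptions made for illustration.

```r
# Sketch only: synthetic data, hypothetical inputs, not the published app.
library(shiny)
library(randomForest)

set.seed(1)
n <- 300
listings <- data.frame(
  neighbourhood = factor(sample(c("Centrum", "Noord", "Zuid"), n, replace = TRUE)),
  room_type     = factor(sample(c("Entire home/apt", "Private room"), n, replace = TRUE)),
  accommodates  = sample(1:6, n, replace = TRUE)
)
listings$price <- 40 + 20 * listings$accommodates +
  30 * (listings$room_type == "Entire home/apt") + rnorm(n, sd = 10)

rf <- randomForest(price ~ ., data = listings, ntree = 100)

# Hypothetical helper: turn the user's choices into a one-row data frame
# with the same factor levels as the training data, then predict.
estimate_price <- function(neighbourhood, room_type, accommodates) {
  new_listing <- data.frame(
    neighbourhood = factor(neighbourhood, levels = levels(listings$neighbourhood)),
    room_type     = factor(room_type, levels = levels(listings$room_type)),
    accommodates  = as.integer(accommodates)
  )
  unname(predict(rf, newdata = new_listing))
}

ui <- fluidPage(
  titlePanel("Airbnb price estimate (sketch)"),
  sidebarLayout(
    sidebarPanel(
      selectInput("neighbourhood", "Neighbourhood", levels(listings$neighbourhood)),
      selectInput("room_type", "Room type", levels(listings$room_type)),
      sliderInput("accommodates", "Guests", min = 1, max = 6, value = 2, step = 1)
    ),
    mainPanel(textOutput("estimate"))
  )
)

server <- function(input, output, session) {
  # Re-runs whenever one of the three inputs changes
  output$estimate <- renderText({
    price <- estimate_price(input$neighbourhood, input$room_type, input$accommodates)
    sprintf("Estimated nightly price: %.0f", price)
  })
}

app <- shinyApp(ui, server)  # start it locally with shiny::runApp(app)
```

Keeping the prediction logic in a plain function such as `estimate_price` makes it possible to check the model outside the app before wiring it to the inputs.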

Note that this web app is only a prototype; new features and adjustments are planned for upcoming versions.

The main conclusion is that the random forest is a functional and practical machine learning tool, both for ranking the input variables by their influence and for predicting outcome variables. In addition, carrying out a machine learning solution successfully is not just a matter of code and mathematics; it is also important to understand the context behind the variables and information included in the model.

This project was carried out in collaboration with Lorenzo Taddei.
