A practical data analysis post with Python code.
Geospatial data science is one of my areas of interest. I find it fascinating how we can visualize data on a map and how often the relationships between the data points reveal useful insights very quickly.
I believe this subfield of data science is useful for many kinds of business, such as grocery stores, car rentals, logistics and real estate. In this post, we will go over an Airbnb dataset for the city of Asheville, NC, USA.
Side note: that city is home to one of the most amazing estates in America and, I would dare to say, in the world. The property belongs to the Vanderbilt family and, for a long time, it was the largest private property in the country. It is well worth a visit, but that’s not the core subject here.
The datasets used in this exercise are the Airbnb rental listings for the city of Asheville. They can be downloaded directly from the Inside Airbnb website at http://insideairbnb.com/get-the-data, under the Creative Commons Attribution 4.0 International License.
Let’s get to work.
Most of the knowledge in this post comes from the book referenced below (Applied Geospatial Data Science with Python, by David S. Jordan). So let’s begin by importing some modules into our session.
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import pysal
import splot
import re
import seaborn as sns
import folium  # For points map
import geoplot.crs as gcrs
import geoplot as gplt
Notice that some of these modules might be new to you, as they were to me. If needed, use pip install module_name to install any missing package. In my case, pysal and geoplot were new, so they had to be installed.
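If you are not sure which of these packages are already available, a quick check before installing can save some time. The snippet below is only a minimal sketch: the helper name missing_modules is my own (hypothetical), and it assumes that each import name matches the name you would pass to pip, which holds for the modules listed above but not for every package.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Third-party modules imported in this post (re is part of the standard library)
required = ["pandas", "geopandas", "matplotlib", "pysal",
            "splot", "seaborn", "folium", "geoplot"]

missing = missing_modules(required)
if missing:
    # Each name printed here can be installed with: pip install <name>
    print("Missing modules:", ", ".join(missing))
else:
    print("All modules are available.")
```

Anything reported as missing can then be installed with pip before running the imports.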
Next, we will read the data from Airbnb.
# Open listings file
listings = pd.read_csv('/content/listings.csv',
usecols=['id', 'property_type', 'neighbourhood_cleansed',
'bedrooms', 'beds', 'bathrooms_text', 'price'…