Data preparation is one of the most difficult steps in any machine learning (ML) project. Every dataset is different and specific to its project, but most predictive modeling projects share a common set of preparation steps.

This post assumes that the reader has a basic understanding of ML concepts and terminology.

Data Cleaning

Machine learning models can be divided into four broad categories based on what they are used for:

  1. Classification: assigning instances to categories
  2. Regression: predicting continuous values
  3. Clustering: finding logical groupings that exist in a dataset
  4. Dimensionality reduction…

geopy is a Python 2 and 3 client for several popular geocoding web services.

geopy makes it easy for Python developers to locate the coordinates of addresses, cities, countries, and landmarks across the globe using third-party geocoders and other data sources.

Geocoding is provided by a number of different services, which are not affiliated with geopy in any way. These services provide APIs, which anyone could implement, and geopy is just a library which provides these implementations for many different services in a single package.

It’s possible to geocode a pandas DataFrame with geopy; however, rate limiting must be taken into…
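The rate-limiting idea mentioned above can be sketched with the standard library alone. The snippet below is a minimal illustration, not geopy's implementation: `rate_limited` and `fake_geocode` are hypothetical names, and `fake_geocode` is a stand-in lookup table so the example runs without network access or a real geocoding service.

```python
import time


def rate_limited(func, min_delay_seconds):
    """Wrap func so that consecutive calls start at least min_delay_seconds apart."""
    last_call = None

    def wrapper(*args, **kwargs):
        nonlocal last_call
        if last_call is not None:
            # Sleep only for whatever part of the delay has not yet elapsed.
            wait = min_delay_seconds - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        last_call = time.monotonic()
        return func(*args, **kwargs)

    return wrapper


def fake_geocode(address):
    """Hypothetical stand-in for a geocoder call; returns (lat, lon) or None."""
    lookup = {"Paris": (48.8566, 2.3522), "Berlin": (52.52, 13.405)}
    return lookup.get(address)


# Assumed delay for illustration; real services publish their own limits.
geocode = rate_limited(fake_geocode, min_delay_seconds=0.05)

addresses = ["Paris", "Berlin", "Atlantis"]
coordinates = [geocode(address) for address in addresses]
```

With a real DataFrame, the same wrapped callable could be passed to something like `df["address"].apply(geocode)`. geopy also ships its own helper for this purpose, `geopy.extra.rate_limiter.RateLimiter`, which takes a `min_delay_seconds` argument and is likely the better choice in practice.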


