South Africa - Crime Data Visualization

Interactive maps with Python and Geemap


Introduction

Crime data from SAPS (the South African Police Service) is available on Kaggle. It contains the history of crime statistics from 2004 to 2015 per province and station in South Africa. The dataset contains a set of shapefiles and CSV files.

Data Source: kaggle.com/slwessels/crime-statistics-for-s..

Let's see if we can generate an insightful analysis and report from this data using Python and Geemap.

Credits of Cover Image - Photo by Simone Busatto on Unsplash

Implementation

We will do the entire analysis in Python. To do so, we begin by importing the necessary packages.


Import Packages

We are going to use packages such as geopandas and geemap to read data from shapefiles easily. We need to import these along with some additional packages.
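The original import cell is not reproduced here, so the following is only a sketch of what it might look like. The package list is an assumption based on the tools this article mentions (pandas, geopandas, geemap, Earth Engine's `ee`, `openrouteservice` for geocoding) plus `matplotlib` for charts, and `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Top-level module names this walkthrough is assumed to rely on.
REQUIRED = ["pandas", "geopandas", "geemap", "ee", "openrouteservice", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Install these packages first:", ", ".join(missing))
else:
    import pandas as pd
    import geopandas as gpd
    import geemap
    import ee
    import openrouteservice as ors
    import matplotlib.pyplot as plt
```

Checking for missing packages first gives a readable message instead of an `ImportError` halfway through the notebook.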

Earth Engine Initialization

The next step is to initialize the Earth Engine module. We do this so that we can visualize the spatial data with geemap.

(You will need to authenticate the Earth Engine package after signing up on the Google Earth Engine website.)
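As a minimal sketch of this step, assuming the `earthengine-api` package is installed, we can try to initialize first and fall back to authentication only when no credentials are stored yet. The wrapper name `init_earth_engine` is hypothetical; `ee.Initialize()` and `ee.Authenticate()` are the Earth Engine client calls.

```python
def init_earth_engine():
    """Initialise Earth Engine, prompting for authentication only if needed."""
    import ee  # provided by the earthengine-api package

    try:
        ee.Initialize()
    except Exception:
        # No stored credentials yet: run the one-time sign-in flow, then retry.
        ee.Authenticate()
        ee.Initialize()
    return ee
```

Calling `ee = init_earth_engine()` once at the top of the notebook is enough. Depending on the client version, `ee.Initialize` may also expect a Google Cloud project to be supplied.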

Geocoder

To load the map centered on South Africa, or on one of its provinces, we need a location. For that purpose, we use a geocoder, which avoids hard-coded coordinate values.

(To learn more about OpenRouteService, you can refer to this blog.)

ORS Token

Location Fetcher

South Africa location
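The token, fetcher, and lookup steps could be sketched as follows, assuming the `openrouteservice` Python client and its Pelias geocoding search. The function names and the `ORS_TOKEN` environment variable are hypothetical; only `openrouteservice.Client` and `pelias_search` come from the library.

```python
import os


def first_lat_lon(geojson):
    """Return (lat, lon) of the first feature in a GeoJSON FeatureCollection."""
    features = geojson.get("features", [])
    if not features:
        raise ValueError("geocoder returned no results")
    # GeoJSON stores coordinates as [lon, lat], so swap them for map centering.
    lon, lat = features[0]["geometry"]["coordinates"][:2]
    return lat, lon


def fetch_location(place, token=None):
    """Geocode a place name with OpenRouteService and return (lat, lon)."""
    import openrouteservice  # pip install openrouteservice

    # ORS_TOKEN is a hypothetical environment variable holding the API key.
    client = openrouteservice.Client(key=token or os.environ["ORS_TOKEN"])
    return first_lat_lon(client.pelias_search(text=place))
```

With a valid token, `fetch_location("South Africa")` would return a latitude and longitude pair that can be used as the map center. Keeping the token in an environment variable avoids committing it to the notebook.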

Data Loading / Reading

The dataset contains two CSV files and one shapefile. We can load the CSV files using pandas and the shapefile using geopandas or geemap.

The size of each dataset can be checked using the shape attribute.

String data manipulation can be done with the apply() method.
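Here is a small sketch of loading, checking the size, and cleaning strings with `apply()`. The column names and the three rows are made up for illustration and only mimic an assumed layout of the crime CSV; an in-memory string stands in for the real file.

```python
import io

import pandas as pd

# Tiny stand-in for the crime CSV; the rows and numbers are made up.
sample_csv = io.StringIO(
    "Province,Station,Category,2005-2006\n"
    "western cape,Station A,Murder,10\n"
    "eastern cape,Station B,Murder,20\n"
    "gauteng,Station C,Murder,30\n"
)

crime_df = pd.read_csv(sample_csv)  # with the real data, pass the CSV path instead
print(crime_df.shape)  # (3, 4): three rows, four columns

# apply() runs a function on every value; here it normalises province names.
crime_df["Province"] = crime_df["Province"].apply(lambda s: s.strip().title())
print(crime_df["Province"].tolist())
```

The shapefile can be read in a similar way with `geopandas.read_file`, which returns a GeoDataFrame that also exposes `shape`.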

To analyze the data per province and district, we also need some additional data (shapefiles), which we can easily obtain from diva-gis.org. Because we need South African states and districts, we can simply search for South Africa there to get the shapefile data.

Exploratory Data Analysis

Since South Africa has 9 provinces, we can draw a bar chart to visualize the population of each province.
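A bar chart like this could be drawn with matplotlib as sketched below. The province names and population values are placeholders, not real figures, and the `Province` and `Population` column names are assumptions about the population CSV.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values only; replace with the population CSV from the dataset.
popn_province_df = pd.DataFrame(
    {"Province": ["Province A", "Province B", "Province C"], "Population": [10, 30, 20]}
)

# Sort so the tallest bar comes first, then draw one bar per province.
ordered = popn_province_df.sort_values("Population", ascending=False)
fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(ordered["Province"], ordered["Population"])
ax.set_xlabel("Province")
ax.set_ylabel("Population")
ax.set_title("Population per province")
ax.tick_params(axis="x", rotation=45)  # long province names need rotated labels
fig.tight_layout()
```

Sorting before plotting makes it easier to compare provinces at a glance.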

We can also directly visualize the population data on the map in the form of a choropleth map. We can leverage geemap to get an interactive map plot. For that, we first need to merge sa_states with popn_province_df.
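The merge step could look like the sketch below. Plain DataFrames with placeholder values stand in for the real objects, and the key columns `NAME_1` and `Province` are assumptions about how the shapefile and the population CSV name the provinces.

```python
import pandas as pd

# Stand-ins: in the article, sa_states is a GeoDataFrame read from the shapefile.
sa_states = pd.DataFrame(
    {"NAME_1": ["Province A", "Province B"], "geometry": ["polygon A", "polygon B"]}
)
popn_province_df = pd.DataFrame(
    {"Province": ["Province B", "Province A"], "Population": [30, 10]}
)

# A left join keeps every province shape, even one without a population row.
sa_states_popn = sa_states.merge(
    popn_province_df,
    left_on="NAME_1",
    right_on="Province",
    how="left",
    validate="one_to_one",  # fail loudly if a province name is duplicated
)

# Provinces whose names did not match end up with a missing population.
unmatched = sa_states_popn.loc[sa_states_popn["Population"].isna(), "NAME_1"].tolist()
print(unmatched)
```

With real data, calling `merge` on the GeoDataFrame keeps the geometry column, so the merged result can be passed straight to the mapping step. Checking `unmatched` catches spelling differences between the two sources.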

Choropleth Map of Population

popn_choropleth.PNG

In the above map, darker regions indicate a higher population and lighter regions a lower one.

We can also list all the crime categories present in the dataset.

To get the number of cases reported at each station of a particular province for a selected crime category and year, we can write a function that takes these inputs and returns the matching result.
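Such a function might be sketched as follows. The name `cases_per_station`, the column names, and the year label `2006-2007` are assumptions about the layout of the crime table, and the rows are made up.

```python
import pandas as pd


def cases_per_station(crime_df, province, category, year_col):
    """Return stations and case counts for one province, crime category and year."""
    mask = (crime_df["Province"] == province) & (crime_df["Category"] == category)
    result = crime_df.loc[mask, ["Station", year_col]]
    # Highest counts first, with a clean 0..n-1 index.
    return result.sort_values(year_col, ascending=False).reset_index(drop=True)


# Made-up rows that only mimic the assumed layout of the crime table.
crime_df = pd.DataFrame(
    {
        "Province": ["Western Cape", "Western Cape", "Eastern Cape", "Western Cape"],
        "Station": ["Station A", "Station B", "Station C", "Station A"],
        "Category": ["Murder", "Murder", "Murder", "Burglary"],
        "2006-2007": [3, 8, 5, 40],
    }
)

print(cases_per_station(crime_df, "Western Cape", "Murder", "2006-2007"))
```

Changing any of the three arguments gives the result for a different province, category, or year without touching the function body.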

Murders in the year 2006 at Western Cape

Murders in the year 2010 at Eastern Cape

Mapping Stations per district at a selected province

We can also plot the stations where the crimes are reported, shown per district of a selected province. The following function helps us achieve that.

Murders reported in the year 2006 at the stations of Western Cape

stations_crime_reported.PNG

Murders reported in the year 2010 at the stations of Eastern Cape

stations_crime_reported_ec.PNG

Choropleth Map of Districts

We can also plot a choropleth map of the total crimes reported in each district of a selected province, irrespective of the station at which each crime was reported. This is done by filtering the data based on the year that is passed.

We will incorporate some amazing geopandas functionality, such as the spatial join.

(Spatial joins will be explained in more detail in upcoming posts.)
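To give a rough idea in the meantime, the sketch below assumes the stations are available as a GeoDataFrame of points and the districts as a GeoDataFrame of polygons in the same coordinate reference system. The function name and the district column `NAME_2` are assumptions; `sjoin` with `predicate="within"` is the geopandas spatial join.

```python
def crimes_per_district(stations_gdf, districts_gdf, value_col, district_col="NAME_2"):
    """Sum `value_col` over the station points that fall inside each district polygon.

    Both inputs are expected to be geopandas GeoDataFrames sharing one CRS.
    """
    # Spatial join: attach to every station the attributes of the district containing it.
    joined = stations_gdf.sjoin(districts_gdf, how="inner", predicate="within")

    # Aggregate the station values per district.
    totals = joined.groupby(district_col, as_index=False)[value_col].sum()

    # Merge the totals back onto the polygons so the result still has geometry to colour.
    result = districts_gdf.merge(totals, on=district_col, how="left")
    result[value_col] = result[value_col].fillna(0)  # districts without any station
    return result
```

The returned GeoDataFrame has one row per district with its summed value, which is the shape of data a choropleth layer needs.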

Murders in the year 2010 at Eastern Cape

choropleth_ec.PNG

Murders in the year 2008 at Gauteng

choropleth_g.PNG

If we explore the first five rows of the crime data, we can see that it contains years as columns.

We can sum the values across all the year columns to get the total crimes reported per station. Using these totals, we can plot a choropleth map (of a different kind) based on province and districts. Again, we will use spatial join techniques for the district-wise visualization.
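The summing step can be sketched like this. The rows are made up, and the identifier columns and year labels are assumptions about the wide layout of the crime table.

```python
import pandas as pd

# Made-up rows in the assumed wide layout: one column per reporting year.
crime_df = pd.DataFrame(
    {
        "Province": ["Western Cape", "Western Cape", "Gauteng"],
        "Station": ["Station A", "Station B", "Station C"],
        "Category": ["Murder", "Murder", "Murder"],
        "2005-2006": [1, 4, 2],
        "2006-2007": [2, 5, 3],
    }
)

# Every column that is not an identifier is treated as a year column.
id_cols = ["Province", "Station", "Category"]
year_cols = [c for c in crime_df.columns if c not in id_cols]

# axis=1 adds across the year columns, giving one total per row.
crime_df["Total"] = crime_df[year_cols].sum(axis=1)
print(crime_df[["Station", "Total"]])
```

Selecting the year columns by exclusion means the same code keeps working if the real table has more years than this example.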

Total Murders at Western Cape

wc_allcrimes.PNG

We can change the parameters in every function to get the result according to the province, year, and crime category that are passed.

You can find my notebook with the full code here.


Well, that's it for this article. You can subscribe to my newsletter for more content like this. Thanks, all.