The new French election geography

By Datapleth.io | September 21, 2019

The last European elections were held in France on May 26th, 2019. One of the major observations was the increase in far-right votes, and more particularly the shift toward the far right of areas that used to vote for the left. In this post we visualize the results of this election, drawing a new shape for France. The objective is to create a choropleth map for each of the six main lists of candidates, showing the percentage of votes per list in each French commune.

The process to build such maps is pretty straightforward: we need the election results, commune by commune, for each list of candidates. We’ll then merge these data with the commune boundaries and produce the choropleth maps.

# load necessary libraries
library(data.table)
library(knitr)
library(kableExtra)
library(dplyr)
library(geojsonio) # provides geojson_read(), used below to load the communes
library(broom) # replacing ggplot2::fortify
library(ggplot2)
library(ggthemes)
library(rmapshaper) # used to simplify geojson large files

Election results

We first need to download the official results from data.gouv.fr. The data provided by the French government are not directly usable, so we will have to clean and reshape them. Some alternatives exist on open data portals, but we prefer to get the source data directly.

# load election results data
results_uri <- "https://www.data.gouv.fr/en/datasets/r/35170deb-e5f3-4e79-889f-5b9a3f547742"
election_results <- data.table::fread(
  results_uri, encoding = "Latin-1", dec= ","
)

The election results file does not contain the INSEE code of the “communes” (the smallest French administrative division, roughly comparable to municipalities in the USA). This code will be needed to merge the results with the detailed map of France, so we have to build it from the department number and the commune code. This works well for mainland France; department codes that are not numeric, such as the overseas codes starting with Z, need special handling.

## ZZ for other countries -> 99
election_results[ , insee := as.numeric(`Code du département`)]
election_results[ 
  `Code du département` == "ZZ",
  insee := 99
  ]
election_results[ 
  `Code du département` %in% c("ZX","ZS","ZM","ZD","ZC","ZB","ZA"),
  insee := 97
  ]
election_results[ 
  `Code du département` %in% c("ZW","ZP","ZN"),
  insee := 98
  ]
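election_results[, insee := as.character(insee*1000 + `Code de la commune`)]
election_results[ nchar(insee) == 4, insee := paste0("0",insee)]
## Corsica: as.numeric() turned "2A"/"2B" into NA above, so build the code by pasting
election_results[ 
  `Code du département` %in% c("2A","2B"),
  insee := paste0(`Code du département`, sprintf("%03d", as.integer(`Code de la commune`)))
  ]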
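As a cross-check of the steps above, the same code can be built with string operations only, which avoids the numeric conversion altogether. This is a minimal sketch and not the code used for this post: `build_insee()` is a hypothetical helper, and the mapping of the Z* department codes to the prefixes 97, 98 and 99 simply mirrors the one used above.

```r
# Hypothetical helper (not part of the original script): build the
# 5-character INSEE code from a department code and a commune code,
# using strings only.
build_insee <- function(dept, commune) {
  dept <- as.character(dept)
  prefix <- dept
  # Same mapping as in the data.table code above
  prefix[dept %in% c("ZX", "ZS", "ZM", "ZD", "ZC", "ZB", "ZA")] <- "97"
  prefix[dept %in% c("ZW", "ZP", "ZN")] <- "98"
  prefix[dept == "ZZ"] <- "99"
  # Left-pad the commune code to 3 digits: 1 -> "001"
  paste0(prefix, sprintf("%03d", as.integer(commune)))
}

build_insee(c("01", "2A", "ZA"), c(1, 4, 101))
# returns "01001" "2A004" "97101"
```

Because the department code stays a character string, no leading zero is lost and codes such as "2A" pass through unchanged.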

The format of the data we obtained is really awful: the headers do not cover all columns. We first select the columns of interest and rename them, starting with the results of the main lists. Commune data are stored in columns 1 to 18, then the results of each list are stored in 7 consecutive columns.

extract_list <- function(data,list_id = 23){
  # compute list index and extract it
  idx <-  c(1:18, (19+(list_id -1)*7):(19+(list_id -1)*7+6),257)
  sub_data <- data[ , .SD, .SDcols=idx]
  names_clean <- c(
    "departement_id", #[1] "Code du département"   
    "deparement_name", #[2] "Libellé du département"
    "commune_id", #[3] "Code de la commune"    
    "commune_name", #[4] "Libellé de la commune" 
    "inscrits", #[5] "Inscrits"              
    "abstentions", #[6] "Abstentions"           
    "absentions_perc", #[7] "% Abs/Ins"             
    "votants", #[8] "Votants"               
    "votants_perc", #[9] "% Vot/Ins"             
    "blancs", #[10] "Blancs"                
    "blancs_perc", #[11] "% Blancs/Ins"          
    "blancs_perc_votants", #[12] "% Blancs/Vot"          
    "nuls", #[13] "Nuls"                  
    "nuls_perc", #[14] "% Nuls/Ins"            
    "nuls_perc_votants", #[15] "% Nuls/Vot"            
    "exprimes", #[16] "Exprimés"              
    "exprimes_perc", #[17] "% Exp/Ins"             
    "exprimes_perc_votants", #[18] "% Exp/Vot"             
    "liste_id", #[19] "N°Liste"               
    "liste_shortname", #[20] "Libellé Abrégé Liste"  
    "liste_name", #[21] "Libellé Etendu Liste"  
    "tete_de_liste", #[22] "Nom Tête de Liste"     
    "nb_voix", #[23] "Voix"                  
    "nb_voix_perc", #[24] "% Voix/Ins"            
    "nb_voix_perc_exprimes", #[25] "% Voix/Exp" 
    "code_insee"
  )
  setnames(sub_data, names_clean)
  return(sub_data)
}

election_results_rn <- extract_list(data = election_results, list_id = 23)
election_results_lrem <- extract_list(data = election_results, list_id = 5)
election_results_lfi <- extract_list(data = election_results, list_id = 1)
election_results_ps <- extract_list(data = election_results, list_id = 12)
election_results_lr <- extract_list(data = election_results, list_id = 29)
election_results_eelv <- extract_list(data = election_results, list_id = 30)

election_results_sub <- rbind(
  election_results_rn,
  election_results_lrem,
  election_results_lfi,
  election_results_ps,
  election_results_lr,
  election_results_eelv
  )
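The index arithmetic inside extract_list() can be checked in isolation. The sketch below only reproduces the formula for the 7-column block of a list; `list_columns()` is a hypothetical name introduced for illustration.

```r
# Column positions of one list's 7-column block, using the same arithmetic
# as extract_list(): list 1 starts at column 19, list 2 at column 26, etc.
list_columns <- function(list_id) {
  start <- 19 + (list_id - 1) * 7
  start:(start + 6)
}

list_columns(1)   # 19 20 21 22 23 24 25
list_columns(23)  # 173 174 175 176 177 178 179
list_columns(34)  # 250 251 252 253 254 255 256
```

A 34th list would end at column 256, which is consistent with the INSEE code we appended being picked up at column 257.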

We finally obtain a clean table with data for the lists of the main parties, as shown in the extract below.

# we use kable to generate an html table with nice formatting
knitr::kable(head(election_results_sub, 2)) %>%
  kable_styling(
    bootstrap_options = c(
      "striped"
      , "hover"
      , "condensed"
      , "responsive"
      )
    ) %>% scroll_box(width = "100%")
| departement_id | deparement_name | commune_id | commune_name | inscrits | abstentions | absentions_perc | votants | votants_perc | blancs | blancs_perc | blancs_perc_votants | nuls | nuls_perc | nuls_perc_votants | exprimes | exprimes_perc | exprimes_perc_votants | liste_id | liste_shortname | liste_name | tete_de_liste | nb_voix | nb_voix_perc | nb_voix_perc_exprimes | code_insee |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01 | Ain | 1 | L’Abergement-Clémenciat | 601 | 268 | 44.59 | 333 | 55.41 | 1 | 0.17 | 0.30 | 18 | 3.00 | 5.41 | 314 | 52.25 | 94.29 | 23 | PRENEZ LE POUVOIR | PRENEZ LE POUVOIR, LISTE SOUTENUE PAR MARINE LE PEN | BARDELLA Jordan | 78 | 12.98 | 24.84 | 01001 |
| 01 | Ain | 2 | L’Abergement-de-Varey | 210 | 69 | 32.86 | 141 | 67.14 | 4 | 1.90 | 2.84 | 2 | 0.95 | 1.42 | 135 | 64.29 | 95.74 | 23 | PRENEZ LE POUVOIR | PRENEZ LE POUVOIR, LISTE SOUTENUE PAR MARINE LE PEN | BARDELLA Jordan | 22 | 10.48 | 16.30 | 01002 |
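To make the percentage columns concrete, here is a small worked example that recomputes them from the raw counts of the first row of the extract. It is only a sanity check on the published figures, not part of the original pipeline.

```r
# Values for L'Abergement-Clémenciat, copied from the first row of the extract
inscrits <- 601  # registered voters
votants  <- 333  # voters who turned out
exprimes <- 314  # valid votes cast
nb_voix  <- 78   # votes for list 23

round(100 * votants / inscrits, 2)   # votants_perc: 55.41
round(100 * exprimes / votants, 2)   # exprimes_perc_votants: 94.29
round(100 * nb_voix / inscrits, 2)   # nb_voix_perc: 12.98
round(100 * nb_voix / exprimes, 2)   # nb_voix_perc_exprimes: 24.84
```

The last ratio, votes for the list divided by valid votes cast, is the one shown on the maps.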

France ‘communes’ shapefiles

Once we have the election results for each commune, we need data to draw the map of all the communes of France. Such data files are often called shapefiles; they contain polygons describing the geographical limits of each commune.

We get these boundaries in GeoJSON format from data.gouv.fr. As they are distributed as a zip archive, we store an unzipped copy in our cloud storage.

## url of the geojson file downloaded and unzipped from data.gouv.fr
france_sp_uri <- "https://data.datapleth.io/ext/france/spatial/communes-simple/communes-20190101.json"

## let's read directly the file
france_communes <- geojsonio::geojson_read(france_sp_uri,  what = "sp")

Let’s keep mainland France only, otherwise we would get a very large map including all the French territories located in the Pacific Ocean, off the east coast of Africa, or in the Caribbean. These areas could be interesting but … in a later post.

communes_lim <- france_communes[ ! substr(france_communes@data$insee,1,2)  %in% c(
    "97","98","99"
    ), ]
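The prefix filter can be tried on a handful of codes to see what it keeps. The vector below is a made-up sample of INSEE-style codes used for illustration only; it is not taken from the boundary file.

```r
# Made-up sample of 5-character codes (illustration only)
codes <- c("01001", "2A004", "75056", "97101", "98735", "99001")

# Same test as above: the first two characters identify overseas areas
is_overseas <- substr(codes, 1, 2) %in% c("97", "98", "99")

codes[!is_overseas]
# returns "01001" "2A004" "75056"
```

Note that Corsican codes (2A, 2B) pass this filter, so Corsica stays on the map as long as its communes find a match in the election results.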

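The dataset we obtain is quite large. As we are going to plot the whole of France on a single map, we don’t need such a high resolution for the commune boundaries. We therefore simplify the polygons with rmapshaper::ms_simplify(), which keeps only a small fraction of the original vertices (5% by default, controlled by its keep argument).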

communes_lim <- rmapshaper::ms_simplify(communes_lim)

Let’s have a look at this map showing the subdivision of mainland France into communes.

# Fortify the data AND keep trace of the commune code.
communes_lim_fortified <- broom::tidy(communes_lim, region = "insee")

# Now we can plot this shape:
ggplot() +
  geom_polygon(data = communes_lim_fortified, 
               aes( x = long, y = lat, group = group),
               fill="white", 
               color="grey", size = 0.2
               ) +
  coord_map() +
  theme_tufte() +
  theme(
    axis.line=element_blank()
    , axis.text=element_blank()
    , axis.ticks=element_blank()
    , axis.title=element_blank()
    ) +
  ggtitle("France Subdivisions - Communes")

Election choropleth

We now have election results per commune and a spatial file per commune. It’s time to merge both datasets and draw a choropleth of the results as a percentage of valid votes cast (“exprimés”, that is, excluding abstentions, blank ballots and spoiled ballots).

communes_results <- merge(
  x = election_results_sub
  , y = communes_lim_fortified
  , by.x = "code_insee"
  , by.y = "id"
  , allow.cartesian=TRUE
)
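Two properties of this merge are worth checking on a toy example: every result row is repeated once per polygon vertex, and keys present on one side only are silently dropped because merge() performs an inner join by default. The sketch below uses base R data frames as stand-ins for election_results_sub and communes_lim_fortified; the coordinates are invented, and the third commune is deliberately left without a polygon.

```r
# Toy stand-ins for election_results_sub and communes_lim_fortified
results <- data.frame(
  code_insee = c("01001", "01002", "2A004"),
  nb_voix_perc_exprimes = c(24.84, 16.30, NA),  # NA: no value assumed here
  stringsAsFactors = FALSE
)
shapes <- data.frame(  # three invented vertices per commune
  id   = c("01001", "01001", "01001", "01002", "01002", "01002"),
  long = c(4.90, 4.95, 4.92, 5.40, 5.45, 5.42),
  lat  = c(46.15, 46.15, 46.18, 46.00, 46.00, 46.03),
  stringsAsFactors = FALSE
)

merged <- merge(results, shapes, by.x = "code_insee", by.y = "id")
nrow(merged)  # 6: each matched result row is repeated once per vertex

# Result rows that found no polygon, and polygons that found no result
setdiff(results$code_insee, shapes$id)  # "2A004"
setdiff(shapes$id, results$code_insee)  # character(0)
```

On the real data, the same two setdiff() calls on election_results_sub$code_insee and communes_lim_fortified$id would list the communes that drop out of the map. The row multiplication is also why allow.cartesian = TRUE is needed: each of the six result rows of a commune matches every vertex of its polygon, so the merged table is larger than either input.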

A choropleth map (from Greek χῶρος “area/region” and πλῆθος “multitude”) is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map, such as population density or per-capita income. (Wikipedia)

Finally, we can plot the results with one map per list for the six main lists.

p <- ggplot() +
  geom_polygon(
    data = communes_results, 
    aes(fill = nb_voix_perc_exprimes, x = long, y = lat, group = group),
    color="grey", size = 0.02
  ) +
  scale_fill_viridis_c(option = "A", direction = -1) +
  facet_wrap(facets = . ~ liste_shortname) +
  coord_map() +
  theme_tufte() +
  theme(
    axis.line=element_blank()
    , axis.text=element_blank()
    , axis.ticks=element_blank()
    , axis.title=element_blank()
    ) +
  ggtitle("European election - 2019 - France") +
  labs(fill = "% of votes")
p