
Required packages

Let’s load the required packages for this tutorial.
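The chunk that attaches the packages is not shown here. Judging only by the verbs used later in this tutorial (filter(), select()), dplyr is needed at a minimum. The package that ships the datasets is not named in this section, so it is left out of this sketch and must be attached separately.

```r
# Minimum needed for the filter()/select() calls used later on.
# The package providing the datasets is not named in this section,
# so it is deliberately not attached here.
library(dplyr)
```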

Dictionary for parties and candidacies

global_dict_parties
#> # A tibble: 2,124 × 9
#>    id_elec abbrev_candidacies name_candidacies id_candidacies_nat id_candidacies
#>    <chr>   <chr>              <chr>            <chr>              <chr>         
#>  1 02-200… A                  ASAMBLEA DE AND… 000001             000001        
#>  2 02-200… A                  ASAMBLEA DE AND… 000001             000001        
#>  3 02-200… A-IZ               ASAMBLEA DE IZQ… 000002             000002        
#>  4 02-202… AA                 ADELANTE ANDALU… 000012             000012        
#>  5 02-200… ABA                ALIANCA BALEAR   000010             000010        
#>  6 02-199… ABA                ALIANCA BALEAR   000001             000001        
#>  7 02-199… ABE                ALTERNATIVA BAL… 000092             000092        
#>  8 02-200… ABLA               ALTERNATIVA EN … 000012             000013        
#>  9 02-200… ABLA               ALTERNATIVA EN … 000012             000012        
#> 10 02-198… AC-CC              ASAMBLEA CANARI… 000001             000001        
#> # ℹ 2,114 more rows
#> # ℹ 4 more variables: name_candidacies_nat <chr>, color <chr>,
#> #   abbrev_candidacies_unified <chr>, id_candidacies_unified <int>

The dataset global_dict_parties contains, for each election, the abbreviation and name of each candidacy, as well as its national id and its constituency-level id. A hexadecimal color code is proposed for the most important candidacies for data visualization purposes. The dataset is of our own creation: it was collected from all the election files by applying the import_candidacies_data() function to each election.

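The tibble printed above has 9 columns. The following 7 are documented here (abbrev_candidacies_unified and id_candidacies_unified also appear in the output):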

  • id_elec: id of elections (type of elections + date).
  • abbrev_candidacies: party abbreviation.
  • name_candidacies: party name (at each constituency).
  • id_candidacies_nat: national id.
  • id_candidacies: id at each constituency.
  • name_candidacies_nat: party name (common at national level).
  • color: hexadecimal (color) code for some parties.
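As a minimal sketch of how the color column could be used for plotting, the snippet below turns a dictionary into a named palette. It runs on a toy stand-in for global_dict_parties: the hex codes are invented for illustration and are not the values stored in the dataset.

```r
library(dplyr)

# Toy stand-in for global_dict_parties; the hex codes are made up
dict_toy <- tibble(
  abbrev_candidacies = c("PSOE", "PSOE", "PP", "ABA"),
  color              = c("#E30613", "#E30613", "#0056A3", NA)
)

# Keep one color per abbreviation and drop parties without a proposed color
palette_tbl <- dict_toy |>
  filter(!is.na(color)) |>
  distinct(abbrev_candidacies, color)

# Named character vector: names are abbreviations, values are hex codes
party_palette <- setNames(palette_tbl$color, palette_tbl$abbrev_candidacies)
party_palette
```

A named vector of this shape can be passed to ggplot2::scale_fill_manual(values = party_palette) so that each party keeps the same color across plots.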

Several clarifications should be made regarding the construction of this dataset. Firstly, a party can have multiple records within a given election. Its national-level identifier, id_candidacies_nat, is the same across all of those entries, but the party may appear under different id_candidacies, which are its constituency-level identifiers. For instance, in the 2023 elections, the PSOE is assigned id_candidacies_nat = "000002" throughout the country, but it is listed under different id_candidacies in some provinces, reflecting its participation through various regional federations in the respective constituencies.

global_dict_parties |>
  filter(abbrev_candidacies == "PSOE" & id_elec == "02-2023-07-24") |> 
  select(abbrev_candidacies:name_candidacies_nat)
#> # A tibble: 6 × 5
#>   abbrev_candidacies name_candidacies          id_candidacies_nat id_candidacies
#>   <chr>              <chr>                     <chr>              <chr>         
#> 1 PSOE               PARTIT DELS SOCIALISTES … 000002             000047        
#> 2 PSOE               PARTIDO DOS SOCIALISTAS … 000002             000064        
#> 3 PSOE               PARTIDO SOCIALISTA DE EU… 000002             000076        
#> 4 PSOE               PARTIDO SOCIALISTA DE IS… 000002             000029        
#> 5 PSOE               PARTIDO SOCIALISTA DE NA… 000002             000072        
#> 6 PSOE               PARTIDO SOCIALISTA OBRER… 000002             000002        
#> # ℹ 1 more variable: name_candidacies_nat <chr>

In the previous output, the name_candidacies_nat variable holds a single common name for all of these rows. This is the name used when aggregating data at the national level, and it corresponds to the most widely recognized branding of the party.
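The relationship between the two identifiers can be sketched by collapsing the constituency-level rows into one national-level row. The toy tibble below copies the ids from the PSOE output above. The value of name_candidacies_nat is not visible in that output, so the one used here is an assumption.

```r
library(dplyr)

# Toy rows mirroring the PSOE output; name_candidacies_nat is an assumed value
psoe_toy <- tibble(
  id_elec              = "02-2023-07-24",
  abbrev_candidacies   = "PSOE",
  id_candidacies_nat   = "000002",
  id_candidacies       = c("000047", "000064", "000076",
                           "000029", "000072", "000002"),
  name_candidacies_nat = "PARTIDO SOCIALISTA OBRERO ESPAÑOL"
)

# One row per election and national id, counting constituency-level ids
national_toy <- psoe_toy |>
  group_by(id_elec, id_candidacies_nat, name_candidacies_nat) |>
  summarise(n_constituency_ids = n_distinct(id_candidacies), .groups = "drop")

national_toy
```

Grouping by id_candidacies_nat rather than by abbreviation is a deliberate choice in this sketch, since the national id is the key described as constant across a party's entries.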

Secondly, certain groupings have been applied in order to simplify the dataset returned to the user. These adjustments are practical, but they are worth noting, particularly for political science experts:

Notes about acronyms
  • The predecessor of the Partido Popular (PP) was, in some instances, listed as AP-PDP, in others as AP-PL, and occasionally as AP-PDP-PL. In all such cases, votes for Alianza Popular (AP) and its allied formations have been grouped under the acronym AP-PDP-PL. Additionally, parties identified as PP-FORO, PP-PAR, PP-UP, PP-EU, and PP-CDEG have been consolidated under the PP acronym. The Unión del Pueblo Navarro (UPN) has retained its own acronym, although it shares the same id_candidacies_nat as PP due to their electoral coalition.

  • The Socialist Party of the Basque Country (PSE-EE) has consistently been grouped under the acronym PSOE, despite having run independently in the 1982 and 1986 elections.

  • All federations and coalitions led by Podemos have been grouped under the acronym PODEMOS, even when they appeared on ballots as UNIDAS/OS PODEMOS. Unless explicitly indicated as a coalition, En Comú Podem has been treated as an independent label, using the acronym ECP (also including Guanyem).

  • The Basque Nationalist Party (PNV), whose name in Basque sometimes appears as PNV - EAJ, has been grouped under the acronym PNV.

  • The Andalusian Party has been unified under the acronym PSA-PA across all elections.

  • The Basque party HB has been unified under the full name HERRI BATASUNA.

  • In the context of the Catalan independence movement, the historical party CIU ran in the 2023 elections under the name Partit Demòcrata Europeu Català - Espai CIU (PDECAT-E-CIU), after splitting from JUNTS.

  • All federations and coalitions led by Izquierda Unida have been grouped under the acronym IU, including the 2015 UNIDAD POPULAR: ... candidacies, which ran separately from the newly created party PODEMOS. Also grouped under IU are IU-UPEC and candidacies such as Esquerra Unida, Ezker Batua, and Ezker Anitza.

  • Candidacies of the Spanish Communist Party have been grouped under the acronym PCE, although they sometimes appeared officially as PCA-PCE.

  • The now-defunct Partido Socialista Galego (PSG), also appearing as BLOQUE PSG or B-PSG, has been unified under the acronym of the Bloque Nacionalista Galego (BNG), since its various offshoots ultimately merged into BNG. Similarly, BNG includes candidacies such as BNG-NOS.

  • The party or candidacy Navarra Suma has been grouped under the acronym NA-SUMA.

  • The party Coalición Canaria ran under the acronym CC-PNC in 2008, and as CC-NC in 2011 and 2019; all such cases have been grouped under Coalición Canaria.
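A few of the groupings listed above can be expressed as a recoding rule. The following is only an illustrative sketch on a toy tibble, not the code in data-raw/dict_parties.R. The output column name abbrev_unified is hypothetical, and only rules whose source and target acronyms are both spelled out in the notes are included.

```r
library(dplyr)

# Toy input with some of the acronyms mentioned in the notes above
raw_toy <- tibble(
  abbrev_candidacies = c("AP-PDP", "AP-PL", "PP-FORO", "PNV - EAJ",
                         "BNG-NOS", "UPN", "PSOE")
)

unified_toy <- raw_toy |>
  mutate(
    # abbrev_unified is a hypothetical column name used for this sketch
    abbrev_unified = case_when(
      abbrev_candidacies %in% c("AP-PDP", "AP-PL", "AP-PDP-PL") ~ "AP-PDP-PL",
      abbrev_candidacies %in% c("PP-FORO", "PP-PAR", "PP-UP",
                                "PP-EU", "PP-CDEG")             ~ "PP",
      abbrev_candidacies == "PNV - EAJ"                         ~ "PNV",
      abbrev_candidacies == "BNG-NOS"                           ~ "BNG",
      # Anything else (e.g. UPN, which retains its own acronym) is left as is
      TRUE                                                      ~ abbrev_candidacies
    )
  )

unified_toy
```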

Last update: 2025/05/25. The code used to generate the dataset can be found in the data-raw/dict_parties.R script.

Datasets about elections

Documentation pending.

Datasets about surveys

Documentation pending.

Auxiliary datasets

Dates of Spanish elections

dates_elections_spain
#> # A tibble: 62 × 8
#>    cod_elec type_elec  date        year month   day topic                pdf_CEB
#>    <chr>    <chr>      <date>     <dbl> <dbl> <int> <chr>                <chr>  
#>  1 01       referendum 1976-12-15  1976    12    15 Ref. Proyecto Ley R… NA     
#>  2 01       referendum 1978-12-06  1978    12     6 Constitución         NA     
#>  3 01       referendum 1986-03-12  1986     3    12 OTAN                 NA     
#>  4 01       referendum 2005-02-20  2005     2    20 Constitución UE      NA     
#>  5 02       congress   1982-10-28  1982    10    28 NA                   NA     
#>  6 02       congress   1986-06-22  1986     6    22 NA                   NA     
#>  7 02       congress   1989-10-29  1989    10    29 NA                   NA     
#>  8 02       congress   1993-06-06  1993     6     6 NA                   NA     
#>  9 02       congress   1996-03-03  1996     3     3 NA                   NA     
#> 10 02       congress   2000-03-12  2000     3    12 NA                   NA     
#> # ℹ 52 more rows

The dataset dates_elections_spain contains the dates of Spanish referendums and of congress, senate, local (municipal), regional, cabildo (Canarian council) and European Parliament elections. The dataset is of our own creation. The tibble printed above has 62 rows and 8 variables, including:

  • cod_elec: code of type of elections.
  • type_elec: type of elections (“referendum”, “congress”, “senate”, “local”, “regional”, “cabildo” or “EU”).
  • date: date of election in "YYYY-MM-DD" format.
  • year, month, day: year, month and day of election.
  • topic: topic (just for referendum).
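As a small sketch of how these columns fit together, the snippet below filters congress elections in a toy subset of the rows printed above and rebuilds an election id as "code-date". That pattern is inferred from values such as "02-2023-07-24" seen earlier and should be read as an assumption about how id_elec is formed.

```r
library(dplyr)

# Toy subset of dates_elections_spain, copied from the printed rows
dates_toy <- tibble(
  cod_elec  = c("01", "02", "02"),
  type_elec = c("referendum", "congress", "congress"),
  date      = as.Date(c("1986-03-12", "1986-06-22", "2000-03-12"))
)

congress_toy <- dates_toy |>
  filter(type_elec == "congress") |>
  mutate(
    # Year derived from the date, as in the year column
    year    = as.numeric(format(date, "%Y")),
    # Assumed pattern for the election id: election code, a dash, then the date
    id_elec = paste(cod_elec, format(date, "%Y-%m-%d"), sep = "-")
  )

congress_toy
```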

Last update: 2025/05/25. The code used to generate the dataset can be found in the data-raw/dates_elections_spain.R script.

INE’s code for ccaa and provinces

cod_INE_prov_ccaa
#> # A tibble: 52 × 5
#>    cod_INE_ccaa cod_MIR_ccaa ccaa      cod_INE_prov prov   
#>    <chr>        <chr>        <chr>     <chr>        <chr>  
#>  1 01           01           Andalucía 04           Almería
#>  2 01           01           Andalucía 11           Cádiz  
#>  3 01           01           Andalucía 14           Córdoba
#>  4 01           01           Andalucía 18           Granada
#>  5 01           01           Andalucía 21           Huelva 
#>  6 01           01           Andalucía 23           Jaén   
#>  7 01           01           Andalucía 29           Málaga 
#>  8 01           01           Andalucía 41           Sevilla
#>  9 02           02           Aragón    22           Huesca 
#> 10 02           02           Aragón    44           Teruel 
#> # ℹ 42 more rows

The dataset cod_INE_prov_ccaa contains the INE codes of Spanish provinces and regions (ccaa). It is a tibble with 52 rows and 5 variables:

  • cod_INE_ccaa: code of region according to INE.
  • cod_MIR_ccaa: code of region according to the Spanish Ministry of the Interior (MIR).
  • ccaa: name of region.
  • cod_INE_prov: code of province.
  • prov: name of province.
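A typical use of such a lookup table is to attach region information to province-level data. The sketch below does this on toy data: prov_toy copies a few rows from the output above, while votes_toy and its ballots column are hypothetical and invented only for the example.

```r
library(dplyr)

# Toy subset of cod_INE_prov_ccaa, copied from the printed rows
prov_toy <- tibble(
  cod_INE_ccaa = c("01", "01", "02"),
  ccaa         = c("Andalucía", "Andalucía", "Aragón"),
  cod_INE_prov = c("04", "11", "22"),
  prov         = c("Almería", "Cádiz", "Huesca")
)

# Hypothetical province-level table; the figures are invented
votes_toy <- tibble(
  cod_INE_prov = c("04", "11", "22"),
  ballots      = c(100, 250, 80)
)

# Attach the region to each province, then aggregate by region
votes_by_ccaa <- votes_toy |>
  left_join(prov_toy, by = "cod_INE_prov") |>
  group_by(cod_INE_ccaa, ccaa) |>
  summarise(ballots = sum(ballots), .groups = "drop")

votes_by_ccaa
```

The codes are stored as character strings with leading zeros, so the join key in any other table should be kept as character too; a numeric 4 would not match "04".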

Data was extracted from https://www.ine.es/daco/daco42/codmun/cod_ccaa_provincia.html.

Last update: 2025/05/25 (data updated by INE on 2023/02/25). The code used to generate the dataset can be found in the data-raw/cod_INE_prov_ccaa.R script.

INE’s code for municipalities

cod_INE_mun
#> # A tibble: 8,131 × 10
#>    id_INE_mun id_MIR_mun cod_INE_ccaa cod_MIR_ccaa cod_INE_prov cod_INE_mun
#>    <glue>     <glue>     <chr>        <chr>        <chr>        <chr>      
#>  1 16-01-051  14-01-051  16           14           01           051        
#>  2 16-01-001  14-01-001  16           14           01           001        
#>  3 16-01-002  14-01-002  16           14           01           002        
#>  4 16-01-049  14-01-049  16           14           01           049        
#>  5 16-01-003  14-01-003  16           14           01           003        
#>  6 16-01-006  14-01-006  16           14           01           006        
#>  7 16-01-037  14-01-037  16           14           01           037        
#>  8 16-01-008  14-01-008  16           14           01           008        
#>  9 16-01-004  14-01-004  16           14           01           004        
#> 10 16-01-009  14-01-009  16           14           01           009        
#> # ℹ 8,121 more rows
#> # ℹ 4 more variables: cd_INE_mun <dbl>, mun <glue>, ccaa <chr>, prov <chr>

The dataset cod_INE_mun contains the INE codes of Spanish municipalities. Data was extracted from https://www.ine.es/daco/daco42/codmun/codmun20/20codmun.xlsx and is stored as a tibble with 8,131 rows and 10 variables:

  • id_INE_mun, id_MIR_mun: full id of municipalities (combining ccaa, province and mun’s id).
  • cod_INE_ccaa, cod_MIR_ccaa: code of regions.
  • ccaa: name of regions.
  • cod_INE_prov: code of provinces.
  • prov: name of provinces.
  • cod_INE_mun: code of municipalities.
  • cd_INE_mun: check digit (see https://www.ine.es/daco/daco42/codmun/codmun00i.htm).
  • mun: name of municipalities.
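The full ids shown in the output appear to be the region, province and municipality codes joined with dashes. A minimal sketch of that composition on a few of the printed rows follows. It uses paste(), so the result is a plain character vector rather than the glue type shown in the output.

```r
library(dplyr)

# Codes copied from the first rows of the printed output
mun_toy <- tibble(
  cod_INE_ccaa = "16",
  cod_MIR_ccaa = "14",
  cod_INE_prov = "01",
  cod_INE_mun  = c("051", "001", "002")
)

# Compose the full ids as region-province-municipality
mun_ids <- mun_toy |>
  mutate(
    id_INE_mun = paste(cod_INE_ccaa, cod_INE_prov, cod_INE_mun, sep = "-"),
    id_MIR_mun = paste(cod_MIR_ccaa, cod_INE_prov, cod_INE_mun, sep = "-")
  )

mun_ids
```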

The municipality data (names and codes) were extracted from the version published by the Spanish National Statistics Institute (INE) on February 6, 2025. Over the years, various municipal mergers have taken place in Spain, which means that not all elections feature the same set of municipalities or the same identifying codes. To unify and standardize the results provided to users, all output tables refer to the most recent INE municipality recoding. The function recod_mun() transparently returns the updated code based on that recoding, reassigning codes for merged municipalities accordingly.
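The idea of reassigning codes after a merger can be illustrated with a lookup join. This is not the implementation of recod_mun(), whose arguments are not documented in this section. The lookup table, the ids and the ballots column below are all invented and do not describe real mergers.

```r
library(dplyr)

# Hypothetical lookup: two old municipalities merged into one new code
merge_lookup <- tibble(
  id_old = c("99-99-001", "99-99-002"),
  id_new = c("99-99-901", "99-99-901")
)

# Hypothetical results recorded under the codes valid at election time
results_toy <- tibble(
  id_INE_mun = c("99-99-001", "99-99-002", "99-99-003"),
  ballots    = c(10, 20, 30)
)

recoded_toy <- results_toy |>
  left_join(merge_lookup, by = c("id_INE_mun" = "id_old")) |>
  # Use the new code where a merger applies, otherwise keep the original
  mutate(id_INE_mun = coalesce(id_new, id_INE_mun)) |>
  group_by(id_INE_mun) |>
  summarise(ballots = sum(ballots), .groups = "drop")

recoded_toy
```

After recoding, rows that now share a code are summed, so results from different elections can be compared on the same set of municipalities.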


Last update: 2025/05/25 (data updated by INE on 2020/01/01). The code used to generate the dataset can be found at recod_mun()