Summaries of the electoral and candidacies ballots data for a given aggregation level (ccaa, prov, etc.)
Source: R/get_elections_data.R (summary_election_data.Rd)
[pending] Import, preprocess and aggregate election data in a single call for a given election and aggregation level. This function also lets the user remove parties below a given vote share threshold.
Usage
summary_election_data(
type_elec,
year = NULL,
date = NULL,
level = "all",
by_parties = TRUE,
method = NULL,
threshold = 0.03,
short_version = TRUE,
verbose = TRUE,
filter_porc_ballots = NA,
filter_candidacies = NA,
prec_round = 3,
CERA_remove = FALSE,
col_abbrev_candidacies = "abbrev_candidacies",
col_id_elec = "id_elec",
col_id_poll_station = "id_INE_poll_station",
col_id_mun = "id_INE_mun",
cols_mun_var = c("pop_res_mun", "census_counting_mun"),
col_id_candidacies = c(id_prov = "id_candidacies", id_nat = "id_candidacies_nat")
)
Arguments
- type_elec
Type of election for which data is available. It should be one of the following values: "referendum", "congress", "senate", "local", "cabildo" (Canarian council) or "EU".
- year
A single value or vector representing the years of the elections to be considered. Please check in dates_elections_spain that elections of the specified type are available for the provided year.
- date
A single value or vector representing the dates of the elections to be considered. If date is provided, it should be in %Y-%m-%d format (e.g., '2000-01-01'). Defaults to NULL. If no date is provided, year should be provided as a numerical variable. Please check in dates_elections_spain that elections of the specified type are available.
- level
A string providing the level of aggregation at which the data is to be provided. The allowed values are the following: 'all', 'ccaa', 'prov', 'mun', 'mun_district', 'sec' or 'poll_station'. Defaults to "all".
- by_parties
A flag indicating whether the user wants a summary by candidacies/parties or just global results at the given level. Defaults to TRUE.
- method
A string vector providing the methods of apportionment to be used. The allowed values are the following: "D'Hondt" (or "Hondt" or "hondt"), "Hamilton" (or "hamilton" or "Vinton" or "vinton"), "Webster" (or "webster" or "Sainte-Lague" or "sainte-lague"), "Hill" (or "hill" or "Huntington-Hill" or "huntington-hill"), "Dean" (or "dean"), "Adams" (or "adams"), "Hagenbach-Bischoff" (or "hagenbach" or "bischoff") or "First Past the Post" (or "first" or "fptp"). Defaults to "Hondt".
- threshold
A numerical value (between 0 and 1) indicating the minimum share of votes needed to obtain representation in a given electoral district. Defaults to 0.03.
- short_version
Flag to indicate whether a short version of the data (just key variables) should be returned. Defaults to TRUE.
- verbose
Flag to indicate whether detailed messages should be printed during execution. Defaults to TRUE.
- filter_porc_ballots
A numerical value representing the vote percentage threshold (out of 100) that the user wants to use to filter the parties (as long as by_parties = TRUE). Defaults to NA.
- filter_candidacies
A character string (or vector of strings) containing the abbreviations of the parties whose ballots will be filtered (as long as by_parties = TRUE). Defaults to NA.
- prec_round
Rounding accuracy. Defaults to 3.
- CERA_remove
Flag to indicate whether the ballots related to CERA constituencies should be removed. Defaults to FALSE.
- col_abbrev_candidacies
Column name to uniquely identify the party abbreviations. Defaults to "abbrev_candidacies".
- col_id_elec, col_id_poll_station, col_id_mun, col_id_candidacies, cols_mun_var
(Optional) Column names for the election id, polling station id, municipality id and candidacy ids, plus the column names of the variables that are only available at mun level or higher.
- candidacies_data
A database containing the information of candidacies. The database should contain the col_abbrev_candidacies and col_id_candidacies columns. Defaults to NULL.
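To make the interplay between method and threshold concrete, here is a minimal base R sketch of a D'Hondt allocation with a vote-share threshold. The helper dhondt_seats() and the toy ballot counts are hypothetical and not part of the package; in particular, the sketch assumes the threshold is applied to each party's share of the ballots passed in, which may differ from how summary_election_data() computes it internally.

```r
# Hypothetical helper for illustration only (not the package's implementation).
# Allocates `n_seats` by D'Hondt among parties whose share of `ballots`
# reaches `threshold` (a proportion between 0 and 1).
dhondt_seats <- function(ballots, n_seats, threshold = 0.03) {
  stopifnot(is.numeric(ballots), !is.null(names(ballots)),
            n_seats >= 1, threshold >= 0, threshold <= 1)

  # Assumption: shares are computed over the ballots supplied here
  share <- ballots / sum(ballots)
  eligible <- ballots[share >= threshold]

  # Quotient matrix: one row per eligible party, one column per divisor 1..n_seats
  quotients <- outer(eligible, seq_len(n_seats), "/")

  # Positions of the n_seats largest quotients (ties go to the earlier position)
  top <- order(quotients, decreasing = TRUE)[seq_len(n_seats)]

  # Matrices are stored column-major, so recover the row (party) of each position
  winners <- (top - 1) %% length(eligible) + 1

  seats <- setNames(integer(length(ballots)), names(ballots))
  seats[names(eligible)] <- tabulate(winners, nbins = length(eligible))
  seats
}

# Toy ballots, invented for this example
toy_ballots <- c(A = 340000, B = 280000, C = 160000, D = 60000, E = 15000)
toy_seats <- dhondt_seats(toy_ballots, n_seats = 7, threshold = 0.03)
print(toy_seats)
```

With these toy figures, party E (about 1.75% of the ballots) falls below the 3% threshold and is excluded before any quotient is computed, and the seven seats go to A (3), B (3) and C (1).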
Value
A tibble with rows corresponding to the level of aggregation for each election, including the following variables:
- id_elec
election's id constructed from the election code cod_elec and date date_elec.
- id_INE_xxx
id for the xxx constituency provided in level: id_INE_ccaa, id_INE_prov, etc. It is only provided for long version.
- xxx
names for the xxx constituency provided in level: ccaa, prov, etc.
- ballots_1, ballots_2
number of total ballots and turnout percentage in the first and second round (if applicable). It is only provided for long version.
- blank_ballots, invalid_ballots
blank and invalid ballots.
- party_ballots, valid_ballots, total_ballots
ballots to candidacies/parties, valid ballots (sum of blank_ballots and party_ballots) and total ballots (sum of valid_ballots and invalid_ballots).
- porc_candidacies_parties, porc_candidacies_valid, porc_candidacies_census
percentage (%) values of ballots for each candidacy relative to party_ballots, valid_ballots and census_counting_xxx, respectively.
- n_poll_stations
number of polling stations. It is only provided for long version.
- pop_res_xxx
population census of residents (CER + CERA) at xxx level. It is only provided for long version.
- census_counting_xxx
population eligible to vote after claims at xxx level. It is only provided for long version.
- id_candidacies
id for candidacies: national ids when level = "all" and province ids otherwise.
- abbrev_candidacies, name_candidacies
acronym and full name of the candidacies.
- ballots
number of ballots obtained for each candidacy at each level section.
- seats
number of seats
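The relationships among the ballot variables listed above can be checked with a short worked example in base R. All figures are invented toy values, and the sketch assumes that the porc_* columns are expressed out of 100 and rounded to prec_round decimal places with round(); the package may round differently.

```r
# Toy figures, invented for illustration (not real election data)
blank_ballots   <- 200
invalid_ballots <- 100
census_counting <- 15000
ballots <- c(PartyA = 6000, PartyB = 3000, PartyC = 700)

# Identities stated in the variable descriptions
party_ballots <- sum(ballots)                    # ballots to candidacies/parties
valid_ballots <- blank_ballots + party_ballots   # blank + party ballots
total_ballots <- valid_ballots + invalid_ballots # valid + invalid ballots

# Assumption: percentages are out of 100 and rounded with round()
prec_round <- 3
porc_candidacies_parties <- round(100 * ballots / party_ballots, prec_round)
porc_candidacies_valid   <- round(100 * ballots / valid_ballots, prec_round)
porc_candidacies_census  <- round(100 * ballots / census_counting, prec_round)

print(data.frame(ballots,
                 porc_candidacies_parties,
                 porc_candidacies_valid,
                 porc_candidacies_census))
```

Because the three percentages use different denominators, the same candidacy always gets its largest value in porc_candidacies_parties and its smallest in porc_candidacies_census.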
Details
This function chains the two lower-level helpers get_election_data(), which imports and cleans polling-station and candidacy ballots, and aggregate_election_data(), which rolls those data up to the requested territorial level. Then, this function performs a final round of post-processing so that the user obtains, in a single call, a tidy table with information of the chosen date and aggregation level that is ready for analysis or visualisation.
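As a rough illustration of the roll-up step described above, the following base R sketch sums polling-station ballots up to province level. The data frame, its column names (other than abbrev_candidacies and ballots) and the station labels are hypothetical toy values; the sketch does not call get_election_data() or aggregate_election_data() and makes no claim about how they work internally.

```r
# Toy polling-station table, invented for illustration
poll_station_ballots <- data.frame(
  prov               = c("Madrid", "Madrid", "Madrid", "Madrid", "Sevilla", "Sevilla"),
  poll_station       = c("st-1", "st-1", "st-2", "st-2", "st-3", "st-3"),
  abbrev_candidacies = rep(c("PartyA", "PartyB"), times = 3),
  ballots            = c(120, 80, 60, 90, 70, 30)
)

# Roll up: sum ballots per province and candidacy, dropping the station level
prov_ballots <- aggregate(ballots ~ prov + abbrev_candidacies,
                          data = poll_station_ballots, FUN = sum)

# Sort for readability
prov_ballots <- prov_ballots[order(prov_ballots$prov,
                                   prov_ballots$abbrev_candidacies), ]
print(prov_ballots)
```

The same pattern applies to any other level: only the grouping column changes, while the ballots are always summed from the finest level available.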
Examples
## Correct examples
# Summary 2023 election data at prov level,
# aggregating the candidacies ballots, in a short version
summary_prov <-
  summary_election_data(type_elec = "congress", year = 2023,
                        level = "prov")
#> Summary election data
#> [x] Checking if parameters are allowed...
#>
#> Get and join election data
#> [x] Checking if parameters are allowed...
#> [x] Importing the following poll station data ...
#> - congress elections on 2023-07-24
#>
#> [x] Importing candidacies and ballots data (at poll station level) ...
#> ... Please be patient, volume of data downloaded and internet connection may take a few seconds
#> Aggregate election data
#> [x] Checking if parameters are allowed...
#> [x] Aggregating data at prov level ...
#>
#> [x] Join information sources and last summaries ...
#> [x] Including candidacies info and summaries ...
if (FALSE) { # \dontrun{
# Summary 2023 and November 2019 election data at mun_district level,
# aggregating the candidacies ballots, in a long version
summary_mun_district <-
  summary_election_data(type_elec = "congress",
                        year = 2023,
                        date = "2019-11-10",
                        level = "mun_district",
                        short_version = FALSE)
# Summary 2023 election data at prov level,
# aggregating the candidacies ballots, in a long version, and
# removing the CERA votes
summary_prov <-
  summary_election_data(type_elec = "congress",
                        year = 2023,
                        level = "prov",
                        short_version = FALSE,
                        CERA_remove = TRUE)
# Summary 2023 and June 2016 election data at prov level, aggregating
# the candidacies ballots, in a long version, calculating the number
# of seats for each party in each province and filtering ballots
# above 45% (percentage between 0 and 100)
summary_prov <-
  summary_election_data(type_elec = "congress", year = 2023,
                        date = "2016-06-26", level = "prov",
                        short_version = FALSE,
                        method = "d'hondt",
                        filter_porc_ballots = 45)
# Summary 2023 and June 2016 election data at mun level, aggregating
# the candidacies ballots, in a long version, filtering ballots
# above 45% (percentage between 0 and 100) and keeping just the
# PP and PSOE parties
summary_mun <-
  summary_election_data(type_elec = "congress", year = 2023,
                        date = "2016-06-26", level = "mun",
                        short_version = FALSE,
                        filter_candidacies = c("PSOE", "PP"),
                        filter_porc_ballots = 45)
# ----
# Incorrect examples
# ----
# Invalid election type: "national" is not a valid election type
summary_election_data(type_elec = "national", year = 2019)
# Invalid date format: date should be in %Y-%m-%d format
summary_election_data(type_elec = "congress", date = "26-06-2016")
# Invalid short version flag: short_version should be a
# logical variable
summary_election_data(type_elec = "congress", year = 2019,
                      short_version = "yes")
# Invalid aggregation level
summary_election_data("congress", 2019, level = "district")
# Invalid method
summary_election_data("congress", 2019, method = "don")
# threshold falls outside the valid range of 0 to 1
summary_election_data("congress", 2019, method = "dhondt",
                      threshold = 1.3)
# filter_porc_ballots falls outside the valid range of 0 to 100
summary_election_data("congress", 2019,
                      filter_porc_ballots = 150)
# filter_porc_ballots supplied while by_parties = FALSE
summary_election_data("congress", 2019,
                      by_parties = FALSE,
                      filter_porc_ballots = 5)
# filter_candidacies supplied while by_parties = FALSE
summary_election_data("congress", 2019,
                      by_parties = FALSE,
                      filter_candidacies = c("PP", "PSOE"))
} # }