Aggregate elections data at provided level (ccaa, prov, etc)
Source: R/get_elections_data.R
aggregate_election_data.Rd
Aggregates polling-station election results to any chosen territorial level, providing party-level ballots, total ballots, the number of polling stations and contextual sums.
Usage
aggregate_election_data(
  election_data,
  level = "all",
  by_parties = TRUE,
  prec_round = 3,
  verbose = TRUE,
  short_version = TRUE,
  col_id_elec = "id_elec",
  col_id_poll_station = "id_INE_poll_station",
  col_id_mun = "id_INE_mun",
  cols_mun_var = c("pop_res_mun", "census_counting_mun"),
  col_id_candidacies = c(id_prov = "id_candidacies", id_nat = "id_candidacies_nat")
)
Arguments
- election_data
A database containing general election data already provided (by other functions or by the user). The database should contain the col_id_elec, col_id_poll_station, cols_mun_var and col_id_candidacies columns. Defaults to NULL.
- level
A string giving the level of aggregation at which the data is to be provided. The allowed values are: 'all', 'ccaa', 'prov', 'mun', 'mun_district', 'sec' or 'poll_station'. Defaults to "all".
- by_parties
A flag indicating whether the user wants a summary by candidacies/parties or just global results at the given level. Defaults to TRUE.
- prec_round
Rounding accuracy. Defaults to prec_round = 3.
- verbose
Flag indicating whether detailed messages should be printed during execution. Defaults to TRUE.
- short_version
Flag indicating whether a short version of the data (just key variables) should be returned. Defaults to TRUE.
- col_id_elec, col_id_poll_station, col_id_mun, col_id_candidacies, cols_mun_var
(Optional) Column names for the election id, polling station id, municipality id and candidacy ids, and the names of the variables available only at municipality level or above.
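As a rough illustration of the constraint on level, the allowed values listed above can be checked before calling the function. This is a base R sketch, not the package's own validation; check_level() and allowed_levels are hypothetical names introduced only for this example.

```r
# Hypothetical helper (not part of the package): checks a `level` value
# against the allowed values listed in the argument description.
allowed_levels <- c("all", "ccaa", "prov", "mun",
                    "mun_district", "sec", "poll_station")

check_level <- function(level) {
  # `level` must be a single character string
  if (!is.character(level) || length(level) != 1) {
    stop("`level` must be a single string")
  }
  # ... drawn from the documented set of values
  if (!(level %in% allowed_levels)) {
    stop("`level` must be one of: ",
         paste(allowed_levels, collapse = ", "))
  }
  invisible(level)
}

# A documented value passes silently
check_level("prov")

# An undocumented value raises an error, caught here for illustration
ok <- tryCatch(check_level("district"), error = function(e) FALSE)
```

Checking the value up front gives a clearer message than letting an unsupported level fail later in the aggregation.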
Value
A tibble with rows corresponding to the level of aggregation for each election, including the following variables:
- id_elec
election id constructed from the election code cod_elec and the date date_elec.
- cod_elec
code representing the type of election: "01" (referendum), "02" (congress), "03" (senate), "04" (local elections), "06" (cabildo - Canarian council - elections), "07" (European Parliament elections).
- id_INE_xxx
id for the xxx constituency provided in level: id_INE_ccaa, id_INE_prov, etc.
- xxx
names for the xxx constituency provided in level: ccaa, prov, etc.
- blank_ballots, invalid_ballots
blank and invalid ballots.
- party_ballots, valid_ballots, total_ballots
ballots to candidacies/parties, valid ballots (sum of blank_ballots and party_ballots) and total ballots (sum of valid_ballots and invalid_ballots).
- n_poll_stations
number of polling stations.
- id_candidacies
id for candidacies (at province level).
- id_candidacies_nat
id for candidacies at national level.
- ballots
number of ballots obtained by each candidacy at the chosen aggregation level.
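The relationships between the ballot columns can be illustrated with a small base R sketch. The data below are made-up numbers, not real results, and the code only mirrors the definitions given above (valid = blank + party, total = valid + invalid); it is not the package's implementation.

```r
# Toy polling-station data (made-up numbers, not real results)
stations <- data.frame(
  id_INE_prov     = c("01", "01", "02"),
  blank_ballots   = c(3, 2, 5),
  invalid_ballots = c(1, 0, 2),
  party_ballots   = c(96, 48, 193)
)

# Sum each ballot type within province
prov <- aggregate(
  cbind(blank_ballots, invalid_ballots, party_ballots) ~ id_INE_prov,
  data = stations, FUN = sum
)

# Count polling-station rows per province
station_counts <- table(stations$id_INE_prov)
prov$n_poll_stations <-
  as.vector(station_counts[as.character(prov$id_INE_prov)])

# Derived totals, following the definitions above
prov$valid_ballots <- prov$blank_ballots + prov$party_ballots
prov$total_ballots <- prov$valid_ballots + prov$invalid_ballots
```

Because the derived columns are sums of sums, they can be computed after aggregation without changing the result.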
Details
This function is a helper that, given an electoral data file with a specific structure, aggregates the information to the level specified in level. Data that are only available at the provincial or municipal level are handled differently when the aggregation level is below those levels (for example, CERA data cannot be aggregated below the province, in which case 52 special constituencies are added). This function is not intended as a final-use tool for basic users, but rather as an intermediate step for the summary_election_data() function.
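One reason municipality-level variables need separate handling is that a value such as pop_res_mun may be repeated on every polling-station row, so summing rows directly would count each municipality once per station. The base R sketch below shows the deduplicate-then-sum idea on toy data. It is an assumption about the general approach, not the package's internal code, and the identifiers and numbers are invented for illustration.

```r
# Toy data: two stations in municipality "01-001", one in "01-002"
# (made-up ids and numbers). pop_res_mun is repeated on every station row.
stations <- data.frame(
  id_INE_prov = c("01", "01", "01"),
  id_INE_mun  = c("01-001", "01-001", "01-002"),
  pop_res_mun = c(1000, 1000, 500),
  ballots     = c(300, 250, 200)
)

# Station-level counts can be summed directly
ballots_prov <- sum(stations$ballots)

# A naive sum of the municipal variable double-counts municipality "01-001"
naive_pop <- sum(stations$pop_res_mun)

# Keep one row per municipality first, then sum
mun <- unique(stations[, c("id_INE_prov", "id_INE_mun", "pop_res_mun")])
pop_prov <- sum(mun$pop_res_mun)
```

Here the naive sum counts the first municipality twice, while the deduplicated sum counts each municipality once.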
Examples
## Correct examples
# Election data from 2023 and 1989
election_data <-
  get_election_data(type_elec = "congress", year = 2023,
                    date = "1989-10-29")
#> Get and join election data
#> [x] Checking if parameters are allowed...
#> [x] Importing the following poll station data ...
#> - congress elections on 2023-07-24
#> - congress elections on 1989-10-29
#>
#> [x] Importing candidacies and ballots data (at poll station level) ...
#> ... Please be patient, volume of data downloaded and internet connection may take a few seconds
#> Be careful! Some poll stations does not match individual ballots with summaries provided by MIR. The discrepancies were resolved by using votes by candidacies.
#> ! A short version was asked (if you want all variables, run with `short_version = FALSE`)
# National level results (without parties)
nat_agg <-
  election_data |>
  aggregate_election_data(level = "all", by_parties = FALSE)
#> Aggregate election data
#> [x] Checking if parameters are allowed...
#> [x] Aggregating data at national level ...
# Province level results (with parties)
prov_agg <-
  election_data |>
  aggregate_election_data(level = "prov")
#> Aggregate election data
#> [x] Checking if parameters are allowed...
#> [x] Aggregating data at prov level ...
if (FALSE) { # \dontrun{
# ----
# Incorrect examples
# ----
# Invalid 'level' argument: "district" is not allowed
aggregate_election_data(election_data, level = "district")
# Invalid 'by_parties' flag: it must be logical, not character
aggregate_election_data(election_data, level = "prov",
                        by_parties = "yes")
# Invalid parameters: col_id_candidacies should match
# the variable names in the data
aggregate_election_data(election_data, level = "ccaa",
                        col_id_candidacies = "wrong_id")
} # }