Package 'uncertainUCDP'

Title: Parametric Mixture Models for Uncertainty Estimation of Fatalities in UCDP Conflict Data
Description: Provides functions for estimating uncertainty in the number of fatalities in the Uppsala Conflict Data Program (UCDP) data. The package implements a parametric reported-value inflated Gumbel mixture distribution that accounts for this uncertainty. The model is based on a survey of UCDP coders about how they view the uncertainty of the number of fatalities from UCDP events. The package provides functions for making random draws of fatalities from the mixture distribution, as well as for estimating percentiles, quantiles, means, and other statistics of the distribution. Full details on the survey and estimation procedure can be found in Vesco et al. (2024).
Authors: David Randahl [cre, aut]
Maintainer: David Randahl <[email protected]>
License: MIT + file LICENSE
Version: 0.5.2
Built: 2025-01-29 03:22:07 UTC
Source: https://github.com/doktorandahl/uncertainucdp

Help Index


Mean, median, and quantiles of the parametric uncertainty distributions for UCDP events

Description

Mean, median, and quantiles of the parametric uncertainty distributions for UCDP events. The parametric uncertainty distributions are based on the reported-value inflated Gumbel mixture distribution. The median and quantile functions are shortcuts for the quncertainUCDP function.

Usage

mean_uncertainUCDP(fatalities, tov = c("sb", "ns", "os", "any"))

median_uncertainUCDP(fatalities, tov = c("sb", "ns", "os", "any"))

quantiles_unceartainUCDP(probs, fatalities, tov = c("sb", "ns", "os", "any"))

Arguments

fatalities

A vector of non-negative integers representing the number of fatalities of the UCDP events. Non-integer values are allowed but should be considered experimental.

tov

A character string representing the type of violence of the UCDP event. Must be one of "sb", "ns", "os", or "any". The options are:

* "sb" for state-based violence
* "ns" for non-state violence
* "os" for one-sided violence
* "any" for parameters estimated across all types of violence. This option is somewhat experimental and should be used with caution. It may be useful when the type of violence is unknown or when the user wants to combine all types of violence into a single category.

probs

A numeric vector of probabilities with values in [0,1]. The quantiles to calculate.

Value

A numeric vector of the same length as the input vector of fatalities, giving the mean, median, or quantile of the parametric uncertainty distribution for each UCDP event.

Examples

data(ucdpged)

# Calculate the mean for an arbitrary UCDP event
mean_uncertainUCDP(fatalities = 100, tov = 'sb')

# Calculate the mean for the first event in the UCDP GED sample
mean_uncertainUCDP(ucdpged$best[1], tov = ucdpged$type_of_violence[1])

# Calculate the median for an arbitrary UCDP event
median_uncertainUCDP(fatalities = 100, tov = 'sb')

# Calculate the median for the first event in the UCDP GED sample
median_uncertainUCDP(ucdpged$best[1], tov = ucdpged$type_of_violence[1])

# Calculate the 90th percentile for an arbitrary UCDP event
quantiles_unceartainUCDP(probs = 0.9, fatalities = 100, tov = 'sb')

# Calculate the 90th percentile for the first event in the UCDP GED sample
quantiles_unceartainUCDP(probs = 0.9, fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])
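
Since the median and quantile functions are described as shortcuts for quncertainUCDP, the two routes should agree. The following is a minimal sketch of that check, assuming the uncertainUCDP package is installed (the code is skipped otherwise). Exact equality is an expectation drawn from the description above, not a documented guarantee.

```r
# Assumption: the uncertainUCDP package is installed; otherwise nothing is run.
pkg_available <- requireNamespace("uncertainUCDP", quietly = TRUE)

if (pkg_available) {
  # Median via the shortcut and via the quantile function at p = 0.5
  med <- uncertainUCDP::median_uncertainUCDP(fatalities = 100, tov = "sb")
  q50 <- uncertainUCDP::quncertainUCDP(p = 0.5, fatalities = 100, tov = "sb")

  # 90th percentile via the shortcut and via the quantile function
  q90_short <- uncertainUCDP::quantiles_unceartainUCDP(probs = 0.9, fatalities = 100, tov = "sb")
  q90_full  <- uncertainUCDP::quncertainUCDP(p = 0.9, fatalities = 100, tov = "sb")

  # Compare the two routes (all.equal allows for small numerical differences)
  print(all.equal(med, q50))
  print(all.equal(q90_short, q90_full))
}
```

Naming the arguments, as done here, avoids mixing up probs and fatalities, which appear in a different order than in the mean and median functions.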

Parametric uncertainty distributions for UCDP events

Description

Density, distribution, quantile and random number generation functions for the parametric reported-value inflated Gumbel mixture distribution for UCDP events. The functions estimate the parameters of the distribution based on the number of fatalities and the type of violence of the UCDP event.

Usage

runcertainUCDP(n, fatalities, tov = c("sb", "ns", "os", "any"))

puncertainUCDP(q, fatalities, tov = c("sb", "ns", "os", "any"))

duncertainUCDP(x, fatalities, tov = c("sb", "ns", "os", "any"))

quncertainUCDP(p, fatalities, tov = c("sb", "ns", "os", "any"))

Arguments

n

Number of observations to generate random values for

fatalities

A vector of non-negative integers representing the number of fatalities of the UCDP events. Non-integer values are allowed but should be considered experimental.

tov

A character string representing the type of violence of the UCDP event. Must be one of "sb", "ns", "os", or "any". The options are:

* "sb" for state-based violence
* "ns" for non-state violence
* "os" for one-sided violence
* "any" for parameters estimated across all types of violence. This option is somewhat experimental and should be used with caution. It may be useful when the type of violence is unknown or when the user wants to combine all types of violence into a single category.

x, q

Vector of quantiles

p

Vector of probabilities

Details

The reported-value inflated Gumbel mixture distribution is a parametric distribution for modeling the uncertainty in the number of fatalities of UCDP events. It is a mixture of a Gumbel distribution and a point mass at the reported number of fatalities. The location, scale, and weight parameters of the distribution are estimated by a set of regression models based on the number of fatalities and the type of violence of the UCDP event.
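
To make the mixture structure concrete, the following base R sketch builds a generic distribution of this kind: a point mass at the reported value with weight w, plus a Gumbel component with weight 1 - w. It is an illustration under stated assumptions, not the package's implementation. The helper names (p_gumbel, q_gumbel, p_mix, r_mix) and the parameter values are hypothetical, and the package may handle details such as discreteness or non-negativity differently.

```r
# Hypothetical parameter values, chosen for illustration only
reported <- 100   # reported number of fatalities (location of the point mass)
loc      <- 110   # Gumbel location
scale    <- 30    # Gumbel scale
w        <- 0.4   # assumed weight of the point mass at the reported value

# Standard closed forms for the Gumbel CDF and quantile function
p_gumbel <- function(q, loc, scale) exp(-exp(-(q - loc) / scale))
q_gumbel <- function(p, loc, scale) loc - scale * log(-log(p))

# Mixture CDF: weight w jumps in at the reported value,
# the remaining 1 - w follows the Gumbel CDF
p_mix <- function(q, reported, loc, scale, w) {
  w * (q >= reported) + (1 - w) * p_gumbel(q, loc, scale)
}

# Random draws: each draw is the reported value with probability w,
# otherwise a Gumbel draw obtained by inverse-transform sampling
r_mix <- function(n, reported, loc, scale, w) {
  from_point_mass <- runif(n) < w
  draws <- q_gumbel(runif(n), loc, scale)
  draws[from_point_mass] <- reported
  draws
}

set.seed(1)
draws <- r_mix(1e5, reported, loc, scale, w)

# Share of draws equal to the reported value; should be close to w
mean(draws == reported)
```

The jump of size w in the CDF at the reported value is what "reported-value inflated" refers to in this sketch: the reported count is treated as far more likely than any single neighbouring value.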

Value

* duncertainUCDP gives the density function
* puncertainUCDP gives the distribution function
* quncertainUCDP gives the quantile function
* runcertainUCDP generates random values as a vector of length n

Examples

data(ucdpged)

# Generate 10 random values for an arbitrary UCDP event
runcertainUCDP(n = 10, fatalities = 100, tov = 'sb')

# Generate 10 random values for the first event in the GED sample
runcertainUCDP(n = 10, fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])

# Obtaining the probability that an arbitrary UCDP event has at most 150 fatalities
puncertainUCDP(q = 150, fatalities = 100, tov = 'ns')

# Obtaining the probability that the first event in the GED sample has at most 5 fatalities
puncertainUCDP(q = 5, fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])

# Obtaining the 90th percentile for an arbitrary UCDP event and one-sided violence
quncertainUCDP(p = 0.9, fatalities = 100, tov = 'os')

# Obtaining the 90th percentile for the first event in the GED sample
quncertainUCDP(p = 0.9, fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])

# Obtaining the density for an arbitrary UCDP event and state-based violence
duncertainUCDP(x = seq(from = 0, to = 500), fatalities = 100, tov = 'sb')

# Obtaining the density for the first event in the GED sample
duncertainUCDP(x = seq(0, 50), fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])

UCDP Georeferenced Event Dataset (GED) sample

Description

A sample of the UCDP Georeferenced Event Dataset (GED) from the 2023 data release. The data contains information about the date, location, and type of conflict events. The data is a sample of the full dataset, which can be downloaded from the UCDP website <https://ucdp.uu.se/downloads/>.

Usage

ucdpged

Format

A tibble with 1000 rows and 49 columns.

Source

<https://ucdp.uu.se/downloads/>


Parameter extraction for uncertainUCDP-functions

Description

Extracts the parameters of the reported-value inflated Gumbel mixture distribution for UCDP events. Primarily intended for internal use by the uncertainUCDP functions, but can also be used to extract the parameters of the distribution manually.

Usage

uncertainUCDP_parameters(fatalities, tov)

Arguments

fatalities

A vector of non-negative integers representing the number of fatalities of the UCDP event. Non-integer values are allowed but should be considered experimental.

tov

A character string or integer value representing the type of violence of the UCDP event. Must be one of "sb", "ns", "os", or "any", or their numeric equivalents. The options are:

* "sb" or 1 for state-based violence
* "ns" or 2 for non-state violence
* "os" or 3 for one-sided violence
* "any" or 4 for parameters estimated across all types of violence. This option is somewhat experimental and should be used with caution. It may be useful when the type of violence is unknown or when the user wants to combine all types of violence into a single category.

Value

A list with three elements: loc, scale, and w. loc and scale are the location and scale parameters of the Gumbel distribution, respectively. w is the weight parameter for the reported-value inflation.
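
A minimal sketch of manual parameter extraction, assuming the uncertainUCDP package is installed (the call is skipped otherwise) and that the returned list has the elements loc, scale, and w described above:

```r
# Assumption: the uncertainUCDP package is installed; otherwise params stays NULL.
pkg_available <- requireNamespace("uncertainUCDP", quietly = TRUE)
params <- NULL

if (pkg_available) {
  # Parameters for a state-based event with 100 reported fatalities
  params <- uncertainUCDP::uncertainUCDP_parameters(fatalities = 100, tov = "sb")

  # Inspect the structure of the returned list
  str(params)
}
```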