Title: | Utilities for Aggregating Probabilistic Forecasts |
---|---|
Description: | Provides several methods for aggregating probabilistic forecasts. You have a group of people who have made probabilistic forecasts for the same event. You want to take advantage of the "wisdom of the crowd" and combine these forecasts in some sensible way. This package provides implementations of several strategies, including geometric mean of odds, an extremized aggregate (Neyman, Roughgarden (2021) <doi:10.1145/3490486.3538243>), and "high-density trimmed mean" (Powell et al. (2022) <doi:10.1037/dec0000191>). |
Authors: | Molly Hickman [aut, cre], Zach Jacobs [aut] |
Maintainer: | Molly Hickman <[email protected]> |
License: | MIT + file LICENSE |
Version: | 2.0.0 |
Built: | 2024-10-31 05:05:33 UTC |
Source: | https://github.com/forecastingresearch/aggutils |
Calculate the geometric mean of a vector of forecasts. We handle 0s by replacing them with the qth quantile of the non-zero forecasts.
geoMeanCalc(x, q = 0.05)
x | Vector of forecasts in 0 to 1 range |
q | The quantile to use for replacing 0s (between 0 and 1) |
(numeric) The geometric mean of the vector
Note that for this aggregation method, the aggregate for an event and the aggregate for its complement, agg(a) + agg(not a), do not generally sum to 1.
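To illustrate the zero handling and the point about complements, here is a minimal base-R sketch. `geo_mean_sketch` is a hypothetical helper written for this example, not the package's `geoMeanCalc()`. It assumes zeros are replaced by the qth quantile of the non-zero forecasts, using R's default quantile type, before the geometric mean is taken.

```r
# Hypothetical helper for illustration only; not the package's geoMeanCalc().
geo_mean_sketch <- function(x, q = 0.05) {
  nonzero <- x[x > 0]
  # Assumption: zeros become the q-th quantile of the non-zero forecasts.
  x[x == 0] <- quantile(nonzero, probs = q, names = FALSE)
  # Geometric mean computed on the log scale.
  exp(mean(log(x)))
}

forecasts <- c(0.1, 0.2, 0.4, 0.8)
agg_a     <- geo_mean_sketch(forecasts)      # aggregate for the event
agg_not_a <- geo_mean_sketch(1 - forecasts)  # aggregate for its complement
total     <- agg_a + agg_not_a               # not equal to 1 for this input
```

For these four forecasts the two aggregates add up to well under 1, so a caller who needs a coherent pair of probabilities would have to renormalize or use a different method.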
Convert probabilities to odds, and calculate the geometric mean of the odds. We handle 0s by replacing them with the qth quantile of the non-zero forecasts, before converting.
geoMeanOfOddsCalc(x, q = 0.05, odds = FALSE)
x | A vector of forecasts (probabilities, unless odds = TRUE) |
q | The quantile to use for replacing 0s (between 0 and 1) |
odds | Whether x is already in odds form (TRUE) or in probabilities (FALSE) |
(numeric) The geometric mean of the odds
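The conversion and pooling steps can be sketched in base R as follows. `geo_mean_odds_sketch` is a hypothetical name used only here, and the zero replacement mirrors the description above as an assumption. The sketch does not treat forecasts of exactly 1, which would produce infinite odds.

```r
# Hypothetical helper for illustration only; not the package's geoMeanOfOddsCalc().
geo_mean_odds_sketch <- function(x, q = 0.05, odds = FALSE) {
  if (!odds) {
    # Assumption: zeros become the q-th quantile of the non-zero forecasts.
    x[x == 0] <- quantile(x[x > 0], probs = q, names = FALSE)
    x <- x / (1 - x)  # probability -> odds
  }
  exp(mean(log(x)))   # geometric mean of the odds
}

pooled_odds <- geo_mean_odds_sketch(c(0.2, 0.5, 0.8))
same_odds   <- geo_mean_odds_sketch(c(0.25, 1, 4), odds = TRUE)
pooled_prob <- pooled_odds / (1 + pooled_odds)  # odds -> probability
```

Because the result is in odds form, the last line shows one way to map it back to a probability if that is what downstream code expects.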
From Powell et al. (2022) doi:10.1037/dec0000191. You find the shortest interval containing (1-p) * 100% of the data and take the mean of the forecasts within that interval.
hd_trim(x, p = 0.1)
x | Vector of forecasts in 0 to 1 range |
p | The proportion of forecasts to trim (between 0 and 1) |
(numeric) The highest-density trimmed mean of the vector
As p gets larger, this behaves like a mode, in much the same way that the symmetrically trimmed mean behaves like a median.
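One way to picture the shortest-interval search is a sliding window over the sorted forecasts. The sketch below is an illustration under stated assumptions, not the package's `hd_trim()`: `hd_trim_sketch` is a hypothetical name, the window size is rounded up to `ceiling((1 - p) * n)`, and ties go to the lowest window.

```r
# Hypothetical helper for illustration only; not the package's hd_trim().
hd_trim_sketch <- function(x, p = 0.1) {
  x <- sort(x)
  n <- length(x)
  # Assumption: keep ceiling((1 - p) * n) forecasts, and at least one.
  k <- max(1, ceiling((1 - p) * n))
  starts <- seq_len(n - k + 1)
  # Width of every window of k consecutive sorted forecasts.
  widths <- x[starts + k - 1] - x[starts]
  best <- which.min(widths)  # first window with the smallest width
  mean(x[best:(best + k - 1)])
}

crowd   <- c(0.05, 0.6, 0.62, 0.65, 0.7, 0.95)
hd_mean <- hd_trim_sketch(crowd, p = 0.5)
```

With half of the six forecasts trimmed, the narrowest window is the cluster 0.6, 0.62, 0.65, so the two outliers at 0.05 and 0.95 do not influence the result.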
Takes the arithmetic mean of the log odds of the forecasts, then extremizes the mean by a factor d, where d is
(n*(sqrt((3n^2) - (3n) + 1) - 2))/(n^2 - n - 1)
where n is the number of forecasts.
neymanAggCalc(x)
x | Vector of forecasts in 0 to 1 range |
(numeric) The extremized mean of the vector
Neyman, E. and Roughgarden, T. (2021). Are you smarter than a random expert? The robust aggregation of substitutable signals. doi:10.1145/3490486.3538243. Also Jaime Sevilla's EAF post "Principled extremizing of aggregated forecasts."
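The factor d and the extremizing step can be sketched in base R. Both function names below are hypothetical. The sketch assumes that "extremizing by a factor d" means multiplying the mean log odds by d, and that the result is mapped back to a probability; the package's `neymanAggCalc()` may differ in these details and in how it preprocesses 0s and 1s.

```r
# Hypothetical helpers for illustration only; not the package's neymanAggCalc().
neyman_factor_sketch <- function(n) {
  # d = n * (sqrt(3n^2 - 3n + 1) - 2) / (n^2 - n - 1)
  (n * (sqrt(3 * n^2 - 3 * n + 1) - 2)) / (n^2 - n - 1)
}

neyman_agg_sketch <- function(x) {
  d <- neyman_factor_sketch(length(x))
  mean_log_odds <- mean(qlogis(x))  # qlogis(p) is log(p / (1 - p))
  # Assumption: extremize by scaling the mean log odds, then return a probability.
  plogis(d * mean_log_odds)
}

d_one  <- neyman_factor_sketch(1)  # a single forecast is left unchanged
d_two  <- neyman_factor_sketch(2)
pooled <- neyman_agg_sketch(c(0.6, 0.7, 0.8))
```

The factor is 1 for a single forecast and grows with n toward sqrt(3), so larger crowds are pushed further from 0.5.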
This does the preprocessing steps that all the agg methods have in common.
preprocess(x, q = 0)
x | A vector of forecasts |
q | The quantile to use for replacing 0s and 1s (between 0 and 1) |
A vector of forecasts in which 0s are replaced by the qth quantile and 1s are replaced by the (1 - q)th quantile.
Assumes forecasts are in the range 0 to 1, inclusive.
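A rough base-R sketch of this replacement step is shown below. `preprocess_sketch` is a hypothetical name, and the choice of which values feed each quantile is an assumption made for the example: the lower quantile is taken over the non-zero forecasts and the upper quantile over the forecasts below 1. The package's `preprocess()` may compute them differently.

```r
# Hypothetical helper for illustration only; not the package's preprocess().
preprocess_sketch <- function(x, q = 0) {
  stopifnot(all(x >= 0 & x <= 1))  # forecasts must lie in [0, 1]
  # Assumption: quantiles are taken over the forecasts that are not 0 (or not 1).
  lo <- quantile(x[x > 0], probs = q, names = FALSE)
  hi <- quantile(x[x < 1], probs = 1 - q, names = FALSE)
  out <- x
  out[x == 0] <- lo
  out[x == 1] <- hi
  out
}

raw     <- c(0, 0.2, 0.5, 0.9, 1)
cleaned <- preprocess_sketch(raw, q = 0.1)
```

After this step every value lies strictly between 0 and 1, which keeps later log and odds transformations finite.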
If the mean is > .5, trim the top (p*100)% of forecasts; if it is < .5, trim the bottom (p*100)%. Return the mean of the remaining forecasts (i.e. soften the mean).
soften_mean(x, p = 0.1)
x | Vector of forecasts in 0 to 1 range |
p | The proportion of forecasts to trim (between 0 and 1) |
(numeric) The softened mean of the vector
This goes against the usual wisdom of extremizing the mean, but it performs well when the crowd includes some overconfident forecasters.
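The one-sided trimming rule can be sketched in base R as follows. `soften_mean_sketch` is a hypothetical name, and the number of forecasts dropped is assumed to be `floor(p * n)`; the package's `soften_mean()` may round differently or preprocess the input first.

```r
# Hypothetical helper for illustration only; not the package's soften_mean().
soften_mean_sketch <- function(x, p = 0.1) {
  x <- sort(x)
  n <- length(x)
  k <- floor(p * n)  # assumption: number of forecasts to drop, rounded down
  m <- mean(x)
  if (k == 0 || m == 0.5) {
    return(m)               # nothing to trim
  }
  if (m > 0.5) {
    mean(x[seq_len(n - k)]) # drop the k highest forecasts
  } else {
    mean(x[(k + 1):n])      # drop the k lowest forecasts
  }
}

confident <- c(0.55, 0.6, 0.7, 0.8, 0.99)
softened  <- soften_mean_sketch(confident, p = 0.2)
mirrored  <- soften_mean_sketch(1 - confident, p = 0.2)
```

In both calls the single most extreme forecast on the side the crowd already leans toward is removed, which pulls the aggregate back toward 0.5.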
Trim the top and bottom (p*100)% of forecasts and return the mean of those that remain.
trim(x, p = 0.1)
x | Vector of forecasts in 0 to 1 range |
p | The proportion of forecasts to trim from each end (between 0 and 1) |
(numeric) The trimmed mean of the vector
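For comparison, a symmetric trimmed mean can be written by hand or obtained from base R's `mean()` with its `trim` argument. `trim_sketch` below is a hypothetical helper, not the package's `trim()`, and it assumes `floor(p * n)` forecasts are dropped from each end, which is also what base R does.

```r
# Hypothetical helper for illustration only; not the package's trim().
trim_sketch <- function(x, p = 0.1) {
  x <- sort(x)
  n <- length(x)
  k <- floor(p * n)  # assumption: forecasts dropped from each end
  mean(x[(k + 1):(n - k)])
}

crowd <- c(0.01, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99)
by_hand <- trim_sketch(crowd, p = 0.1)
base_r  <- mean(crowd, trim = 0.1)  # base R's built-in trimmed mean
```

Both calls drop the lowest and highest forecast here and average the remaining eight.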