The sequential probability ratio test (SPRT) is a statistical test of one simple hypothesis against another. Often a parametric form is assumed for the underlying density or (discrete) probability function, and the two hypotheses are specified by two values of the parameter. The SPRT then consists of taking observations sequentially and, after each observation, comparing the likelihood ratio to two constants chosen to achieve specified type I and type II error probabilities. When the likelihood ratio crosses one of the constants, the hypothesis corresponding to that constant is accepted. When the two proposed densities are close, the expected number of steps to a decision may be large, though it is generally smaller than the sample size of a fixed-n design. In this situation data compression may be necessary to reduce data storage requirements. In this paper we explore the effects of binning sequential data in two cases: (1) the exact binned (histogram) versions of the densities are known; and (2) only finite-sample approximations of the exact histogram densities are known. We show the effects of binning in both cases on the expected number of steps to a decision and on the type I and type II error probabilities. Optimal choices of the binning parameter for common densities, as well as formulae for general densities, are also given.
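The decision rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the stopping thresholds use Wald's standard approximations A = (1 - beta)/alpha and B = beta/(1 - alpha), and the Gaussian hypotheses in the test are illustrative choices, not densities from the paper.

```python
import math

def sprt(samples, logpdf0, logpdf1, alpha=0.05, beta=0.05):
    """Wald's SPRT of H0 vs H1 with target error probabilities alpha, beta.

    Accumulates the log-likelihood ratio one observation at a time and
    stops as soon as it crosses either threshold:
      accept H1 when LLR >= log((1 - beta) / alpha)
      accept H0 when LLR <= log(beta / (1 - alpha))
    Returns (decision, number of observations used).
    """
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    llr = 0.0
    n = 0
    for x in samples:
        n += 1
        llr += logpdf1(x) - logpdf0(x)
        if llr >= log_a:
            return "H1", n
        if llr <= log_b:
            return "H0", n
    return "undecided", n
```

For example, testing N(0, 1) against N(1, 1) with alpha = beta = 0.05, each observation equal to 1.0 adds exactly 0.5 to the log-likelihood ratio, so the test accepts H1 after six observations.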
The Good Judgment Team, led by psychologists P. Tetlock and B. Mellers of the University of Pennsylvania, was the most successful of five research projects sponsored through 2015 by IARPA to develop improved group forecast aggregation algorithms. Each team had as many as ten algorithms under continuous development and evaluation over the four-year project. The mean Brier score was used to rank the algorithms on approximately 283 questions concerning categorical geopolitical events each year. An algorithm would return aggregate probabilities for each question based on the probabilities provided per question by thousands of individuals, who had been recruited by the Good Judgment Team each year. This paper summarizes the theoretical basis and implementation of one of the two most accurate algorithms at the conclusion of the Good Judgment Project. The algorithm incorporated a number of pre- and post-processing steps and relied upon a minimum-distance robust regression method called L2E; see Scott (2001). The algorithm was edged out only narrowly by a variation of logistic regression, which has been described elsewhere; see Mellers et al. (2014) and GJP (2015a). Work since the official conclusion of the project has narrowed the gap further.
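For a categorical question, the Brier score used in this ranking is the standard quadratic scoring rule: the sum over categories of the squared difference between the forecast probability and the 0/1 outcome indicator, averaged over questions. A minimal sketch, with illustrative function names (this is the textbook definition, not the project's scoring code):

```python
def brier_score(probs, outcome):
    """Multi-category Brier score: sum_k (p_k - o_k)^2,
    where o_k is 1 for the realized category and 0 otherwise."""
    return sum((p - (1.0 if k == outcome else 0.0)) ** 2
               for k, p in enumerate(probs))

def mean_brier(forecasts):
    """Mean Brier score over a list of (probs, outcome) pairs,
    one pair per resolved question."""
    return sum(brier_score(p, o) for p, o in forecasts) / len(forecasts)
```

Under this rule a perfect categorical forecast scores 0 and a uniform binary forecast scores 0.5, so lower mean Brier scores indicate more accurate aggregation.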