TY  - CONF
ID  - mk:epiDAMIK2019
T1  - Improving Outbreak Detection with Stacking of Statistical Surveillance Methods
A1  - Kulessa, Moritz
A1  - Loza Mencía, Eneldo
A1  - Fürnkranz, Johannes
T2  - Workshop Proceedings of epiDAMIK: Epidemiology meets Data Mining and Knowledge discovery (held in conjunction with ACM SIGKDD 2019)
Y1  - 2019
CY  - Anchorage, USA
N1  - also as arXiv preprint arXiv:1907.07464
UR  - http://people.cs.vt.edu/~badityap/epidamik/kdd-epidamik19-proceedings.pdf
KW  - Fusion
KW  - Outbreak Detection
KW  - Stacking
KW  - Surveillance
N2  - Epidemiologists use a variety of statistical algorithms for the early detection of outbreaks. The practical usefulness of such methods depends heavily on the trade-off between the detection rate of outbreaks and the chance of raising a false alarm. Recent research has shown that using machine learning to fuse multiple statistical algorithms improves outbreak detection. Instead of relying only on the binary output (alarm or no alarm) of the statistical algorithms, we propose to use their p-values for training a fusion classifier. We also show that adding further features and adapting the labeling of an epidemic period may improve performance further. For comparison and evaluation, we introduce a new measure that captures the performance of an outbreak detection method at low false alarm rates more precisely than the measures used in previous work. Our results on synthetic data show that it is challenging to improve performance with a trainable fusion method based on machine learning. In particular, a fusion classifier based only on the binary outputs of the statistical surveillance methods can perform worse overall than the underlying algorithms used directly. However, using p-values and additional information for learning is promising, as it enables the identification of more valuable patterns for detecting outbreaks.
ER  -