|
CMU-CS-03-154
Computer Science Department
School of Computer Science, Carnegie Mellon University
CMU-CS-03-154
A Fast Multi-Resolution Method for Detection of
Significant Spatial Overdensities
Daniel B. Neill, Andrew W. Moore
June 2003
CMU-CS-03-154.ps
CMU-CS-03-154.pdf
Keywords: Algorithms, biosurveillance, cluster detection,
spatial scan statistic
Given an N x N grid of squares, where each square sij has a count cij and an underlying population
pij, our goal is to find the square region S with
the highest density, and to calculate the significance of
this region by Monte Carlo testing. Any density measure D,
which depends on the total count and total population of the region, can
be used. For example, if each count cij represents the
number of disease cases occurring in that square, we can use Kulldorff's
spatial scan statistic DK to find the most significant spatial
disease cluster. A naive approach to finding the region of maximum
density would be to calculate the density measure for every square region:
this requires O(RN3) calculations, where
R is the number of Monte Carlo replications, and hence is
generally computationally infeasible. We present a novel multi-resolution
algorithm which partitions the grid into overlapping regions, bounds the
maximum score of subregions contained in each region, and prunes
regions which cannot contain the maximum density region. For
sufficiently dense regions, this method finds the maximum density
region in optimal O(RN2) time, and in
practice it results in significant (10-200x) speedups as compared
to the naive approach.
17 pages
|