Distributed Nearest Neighbors Algorithm

Related Research Areas
Atmospheric Composition, Carbon Cycle & Ecosystems, Climate Variability & Change, Earth Surface & Interior, Water & Energy Cycles, Weather
Project Description
Outlier detection becomes more challenging when the data is stored at several different locations and the goal is to find the outliers in the union of all the data without necessarily moving every point to a centralized location. In this work we develop a partly synchronized, unsupervised approach to outlier detection on distributed datasets. A distance-based outlier detection technique, built on a nearest-neighbor algorithm akin to Orca, was implemented in Java. The implementation gains speed from the number of processors available, and additional improvements to the algorithm make data transfer and the evaluation of points across the distributed nodes more efficient.
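To make the distance-based idea concrete, the following is a minimal single-node sketch in Java of an Orca-style nested loop with pruning. It is not the project's code: the class name OrcaSketch, the method names, the choice of Euclidean distance, and the score (distance to the k-th nearest neighbor) are all assumptions made for illustration. The distributed part of the project, including how points and cutoffs are exchanged between nodes, is not shown.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; names and scoring choice are assumptions.
class OrcaSketch {

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the indices of the top n outliers, strongest first.
    // A point's score is the distance to its k-th nearest neighbor.
    static List<Integer> topOutliers(double[][] data, int k, int n) {
        // Min-heap of {score, index}: the head is the weakest current outlier.
        PriorityQueue<double[]> top =
            new PriorityQueue<>(Comparator.comparingDouble((double[] e) -> e[0]));
        double cutoff = 0.0; // score a point must reach to enter the top n

        for (int i = 0; i < data.length; i++) {
            // Max-heap holding the k smallest distances seen so far.
            PriorityQueue<Double> nearest = new PriorityQueue<>(Comparator.reverseOrder());
            boolean pruned = false;

            for (int j = 0; j < data.length; j++) {
                if (i == j) continue;
                double d = distance(data[i], data[j]);
                if (nearest.size() < k) {
                    nearest.add(d);
                } else if (d < nearest.peek()) {
                    nearest.poll();
                    nearest.add(d);
                }
                // Once k neighbors are known, the largest of them is an upper
                // bound on the final score. If it is already below the cutoff,
                // the point cannot be a top-n outlier, so stop scanning early.
                if (nearest.size() == k && nearest.peek() < cutoff) {
                    pruned = true;
                    break;
                }
            }
            if (pruned || nearest.size() < k) continue;

            top.add(new double[] {nearest.peek(), i});
            if (top.size() > n) top.poll();               // drop the weakest
            if (top.size() == n) cutoff = top.peek()[0];  // tighten the cutoff
        }

        // Drain the min-heap, inserting at the front so the strongest is first.
        List<Integer> result = new ArrayList<>();
        while (!top.isEmpty()) result.add(0, (int) top.poll()[1]);
        return result;
    }
}
```

Calling OrcaSketch.topOutliers(data, k, n) on an array of points returns the indices of the n highest-scoring points. The early exit in the inner loop is what keeps the nested loop from being fully quadratic in practice: as the cutoff rises, most ordinary points are discarded after only a few distance computations.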
Project Administrator(s):
Bryan Matthews


Bryan Matthews
Kanishka Bhaduri
Petr Votava