Research


Data faults in Sensor Networks

Overview: Various sensor network measurement studies have reported instances of transient faults in sensor readings. In this work, we seek to answer a simple question: How often are such faults observed in real deployments? To do this, we first explore and characterize four qualitatively different classes of fault detection methods. Rule-based methods leverage domain knowledge to develop heuristic rules for detecting and identifying faults. Estimation methods predict "normal" sensor behavior by leveraging sensor correlations, and flag anomalous sensor readings as faults. Methods based on time series analysis start with an a priori model for sensor readings; a sensor measurement is compared against its predicted value, computed using time series forecasting, to determine whether it is faulty. Learning-based methods infer a model for the "normal" sensor readings from training data, and then statistically detect and identify classes of faults.
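To make two of these classes concrete, the following is a minimal sketch, not the detectors used in this work. It assumes a simple spike rule (a reading that differs from both neighbors by more than a hypothetical `max_jump`) as the rule-based method, and an exponentially weighted moving average as a stand-in forecasting model for the time series method. The function names, the parameters `max_jump`, `alpha`, and `threshold`, and the sample readings are all illustrative assumptions.

```python
def spike_faults(readings, max_jump):
    """Rule-based sketch: flag an interior reading that differs from BOTH
    of its neighbors by more than max_jump (a hypothetical heuristic)."""
    flagged = []
    for i in range(1, len(readings) - 1):
        jump_in = abs(readings[i] - readings[i - 1])
        jump_out = abs(readings[i] - readings[i + 1])
        if jump_in > max_jump and jump_out > max_jump:
            flagged.append(i)
    return flagged


def forecast_faults(readings, alpha, threshold):
    """Time series sketch: forecast each reading with an exponentially
    weighted moving average (an assumed model) and flag readings whose
    forecast error exceeds threshold."""
    forecast = readings[0]
    flagged = []
    for i, x in enumerate(readings[1:], start=1):
        if abs(x - forecast) > threshold:
            # Do not fold a suspected fault into the forecast.
            flagged.append(i)
        else:
            forecast = alpha * x + (1 - alpha) * forecast
    return flagged


# Illustrative temperature-like readings with one injected spike at index 3.
sample = [20.0, 20.1, 20.2, 35.0, 20.3, 20.4]
spikes = spike_faults(sample, max_jump=5.0)
forecast_hits = forecast_faults(sample, alpha=0.5, threshold=5.0)
```

Both sketches flag only the injected spike on this sample. Neither attempts fault classification, and the estimation and learning-based classes would additionally need correlated sensors or training data, which this example does not model.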

We find that these four classes of methods sit at different points on the accuracy/robustness spectrum. Rule-based methods can be highly accurate, but their accuracy depends critically on the choice of parameters. Learning methods can be cumbersome to train, but can accurately detect and classify faults. Estimation methods are accurate, but cannot classify faults. Methods based on time series analysis are more effective at detecting short-duration faults than long-duration ones, and incur more false positives than the other methods. We apply these techniques to four real-world sensor data sets and find that both the prevalence and the type of faults vary across data sets. All four methods are qualitatively consistent in identifying sensor faults, lending credence to our observations. Our work is a first step towards automated on-line fault detection and classification.
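The sensitivity of rule-based methods to parameter choice can be illustrated with a small sketch. It assumes a hypothetical spike rule (a reading that differs from both neighbors by more than a threshold) and a made-up series with a single injected fault. The data, the rule, and the threshold values are illustrative assumptions, not results from the data sets studied here.

```python
# Made-up readings; index 4 holds the single injected fault.
READINGS = [20.0, 20.4, 19.9, 20.5, 30.0, 20.2, 20.6, 20.1]
FAULT_INDEX = 4


def flag_spikes(readings, threshold):
    """Flag interior readings that differ from both neighbors by more
    than threshold (a hypothetical heuristic rule)."""
    return [
        i
        for i in range(1, len(readings) - 1)
        if abs(readings[i] - readings[i - 1]) > threshold
        and abs(readings[i] - readings[i + 1]) > threshold
    ]


def evaluate(threshold):
    """Return (fault_detected, false_positive_count) for one threshold."""
    flagged = flag_spikes(READINGS, threshold)
    detected = FAULT_INDEX in flagged
    false_positives = len([i for i in flagged if i != FAULT_INDEX])
    return detected, false_positives


# Sweep three thresholds: too low, reasonable, too high.
results = {t: evaluate(t) for t in (0.3, 1.0, 15.0)}
for t, (detected, fp) in results.items():
    print(f"threshold={t}: detected={detected}, false positives={fp}")
```

On this toy series, the lowest threshold catches the fault but also flags ordinary noise, the middle one isolates the fault, and the highest misses it entirely.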

Further details are available at the CENS projects webpage.

Papers:

Talk:

Collaborators: Leana Golubchik, Ramesh Govindan
