Computer Science Department
School of Computer Science, Carnegie Mellon University
FastCARS: Fast, Correlation-Aware Sampling
Jia-Yu Pan, Srinivasan Seshan, Christos Faloutsos
Our proposed method, "FastCARS", naturally captures statistics for packets that are 1, 2, or more steps apart. It has the following properties: it (a) provides accurate measurements of the full trace's statistics, (b) is simple and scalable to implement, (c) captures correlations between successive packets as well as packets that are further apart, (d) spreads sampling effort evenly over time, and (e) generalizes previously proposed sampling methods, including them as special cases.
We also propose several new tools for network data mining and use them to demonstrate the quality of the information provided by FastCARS. These tools include (a) the n-step histograms, which give statistics at different levels of temporal correlation, (b) the convolution test, which can be used to examine the level of dependence between packet arrivals, (c) the n-step packet-size/delay graph, which provides accurate bandwidth estimation and load monitoring, and (d) the n-step flow graph, which effectively visualizes flow patterns hidden in a trace.
Experimental results on multiple real-world datasets (479Mb in total) show that the proposed FastCARS sampling method and these new data mining tools are effective. With these tools, we show that the independence assumption of packet arrivals does not hold, and that packet trains may not be the only cause of dependence among arrivals. The tools may be useful in applications such as monitoring link load and traffic flows.
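As a rough illustration of how an n-step histogram and a convolution-style independence check might fit together, the sketch below works on a list of packet arrival timestamps. It is not the procedure defined in this report; it relies only on the standard fact that, if successive interarrival gaps were independent and identically distributed, the 2-step gap distribution would equal the 1-step distribution convolved with itself. The function names (`n_step_histogram`, `convolve`, `l1_distance`), the integer-tick timestamps, and the L1 distance as a dependence score are all assumptions made for this example.

```python
from collections import Counter


def n_step_histogram(arrivals, n):
    """Relative frequencies of the gap between packets that are n steps apart.

    Assumes `arrivals` is a sorted list of integer timestamps (e.g. ticks),
    so each distinct gap value can serve as its own histogram bin.
    """
    gaps = [arrivals[i + n] - arrivals[i] for i in range(len(arrivals) - n)]
    counts = Counter(gaps)
    return {gap: c / len(gaps) for gap, c in counts.items()}


def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q."""
    out = {}
    for a, pa in p.items():
        for b, qb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * qb
    return out


def l1_distance(p, q):
    """Sum of absolute differences between two discrete distributions."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# Toy trace (hypothetical): gaps strictly alternate 1, 3, 1, 3, ...
arrivals = [0, 1, 4, 5, 8, 9, 12, 13, 16]

one_step = n_step_histogram(arrivals, 1)    # gaps 1 and 3, each half the time
two_step = n_step_histogram(arrivals, 2)    # every 2-step gap is exactly 4
predicted = convolve(one_step, one_step)    # what independence would predict
score = l1_distance(two_step, predicted)    # large value suggests dependence
```

In this toy trace the observed 2-step gap is always 4, whereas independence would also predict gaps of 2 and 6, so the distance between the two distributions is large. On real traces the timestamps would need to be binned, and the threshold for calling the arrivals dependent is a separate design choice not addressed here.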