Computer Science Department
School of Computer Science, Carnegie Mellon University
Network Monitoring and Diagnosis
Network monitoring and diagnosis systems are used by ISPs for daily network management operations and by popular network applications, such as peer-to-peer systems, for performance optimization. However, the high overhead of some monitoring and diagnostic techniques can limit their applicability. End-to-end available bandwidth estimation is one example: previously developed tools for available-bandwidth monitoring and diagnosis often have high overhead and are difficult to use.
This dissertation puts forth the claim that end-to-end available bandwidth and bandwidth bottlenecks can be efficiently and effectively estimated using packet-train probing techniques. By using source and sink tree structures that can capture network edge information, and with the support of a properly designed measurement infrastructure, bandwidth-related measurements can also be scalable and convenient enough to be used routinely by both ISPs and regular end users.
These claims are supported by four techniques presented in this dissertation: the IGI/PTR end-to-end available bandwidth measurement technique, the Pathneck bottleneck-locating technique, the BRoute large-scale available bandwidth inference system, and the TAMI monitoring and diagnostic infrastructure. The IGI/PTR technique implements two available-bandwidth measurement algorithms, which estimate the background traffic load (IGI) and the packet transmission rate (PTR), respectively. It demonstrates that end-to-end available bandwidth can be measured both accurately and efficiently, thus solving the path-level available-bandwidth monitoring problem. The Pathneck technique uses a carefully constructed packet train to locate bottleneck links, making it easier to diagnose available-bandwidth-related problems. Pathneck needs only single-end control and is extremely lightweight. These properties make it attractive for both regular network users and ISP network operators. The BRoute system uses a novel concept, source and sink trees, to capture end-user routing structures and network-edge bandwidth information. Equipped with path-edge inference algorithms, BRoute can infer the available bandwidth of all N^2 paths in an N-node system with only O(N) measurement overhead. That is, BRoute solves the system-level available-bandwidth monitoring problem. The TAMI measurement infrastructure introduces measurement scheduling and topology-aware capabilities to systematically support all the monitoring and diagnostic techniques proposed in this dissertation. TAMI not only supports network monitoring and diagnosis but can also effectively improve the performance of network applications such as peer-to-peer systems.
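As a rough illustration of the scaling claim made for BRoute, the sketch below compares the number of directed paths in an N-node system with a measurement budget that grows linearly in N. The function names and the per-node constant `measurements_per_node` are hypothetical and chosen only for illustration; they are not figures or interfaces from this dissertation.

```python
def directed_paths(n: int) -> int:
    """Ordered (source, sink) pairs among n nodes, excluding self-pairs.

    This is n * (n - 1), which grows on the order of N^2.
    """
    return n * (n - 1)


def linear_budget(n: int, measurements_per_node: int = 2) -> int:
    """Hypothetical O(N) budget: a constant number of measurements per node.

    The default of 2 is an assumption made for this sketch only.
    """
    return measurements_per_node * n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        paths = directed_paths(n)
        budget = linear_budget(n)
        # The ratio shows how many paths each measurement must "cover"
        # when per-path probing is replaced by inference.
        print(f"N={n}: paths={paths}, linear budget={budget}, "
              f"paths per measurement={paths / budget:.1f}")
```

The gap between the two counts widens as N grows, which is why inferring path bandwidth from a linear number of measurements matters for system-level monitoring.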