Laboratory for Networks and Applied Graph Theory


Research Sponsors


Basic Research

1999-2002, National Science Foundation, Algorithms and Routing Schemes for Scalable Networks, NSF Grant No. CCR-9877139

Abstract: This project conducts research on the design, implementation, and analysis of algorithms for the support of scalable and reliable communication services in distributed networks. The focus is on important challenges that affect the national network-communication infrastructure and the needs of increasing numbers of network users and services. These challenges include mitigating the stress on the network backbone caused by the diminishing locality of network traffic, easing the addition of large numbers of geographically distributed network hosts, implementing fault-tolerant and congestion-tolerant communication methods, and integrating diverse technologies such as fast-switching and mobile communication devices into the national networking infrastructure. A significant obstacle in meeting these challenges is the difficulty of routing, switching, addressing, and guaranteeing delivery of messages in such a large and growing global network. This is a program of fundamental and applied research attacking these scalability and reliability problems using techniques of network algorithms, computational combinatorics, and graph theory. The analysis of the topological structure of networks and the algorithmic properties associated with network routing schemes will be the primary focus. Extensive implementations and empirical studies will be performed, building on experience with the network platform provided by the Ohio Supercomputer Center and the OCARNet dedicated ATM research network.

1998-2000, Ohio Board of Regents, Scalable Communication on Network-based Computing Systems, Research Investment Fund Grant. Collaborative research with D.K. Panda and T. Page of Ohio State University.

Project Summary: Network-Based Computing (NBC) systems are becoming increasingly popular for providing cost-effective and affordable high-performance computing and high-throughput computing for day-to-day computational needs. Such systems generally consist of clusters of workstations connected by Local Area Networks (LANs) and Wide Area Networks (WANs). Emerging networking technologies like ATM, Myrinet, Fast Ethernet, Gigabit Ethernet, and SCRAMNet are currently being used to build such NBC systems. Hardware and software LAN/WAN technology was not initially developed for high-performance/high-throughput computing using workstation clusters. Because this technology was designed primarily to optimize bandwidth, the communication overhead between workstations can be quite high. Communication on such systems is also not very reliable. Hence, messages need to be routed in an adaptive and fault-tolerant manner. High-performance/high-throughput execution often depends on frequent communication (both point-to-point and collective) and synchronization between tasks. Many research projects are currently being undertaken to support fast point-to-point communication in such systems. However, applications on these systems are not able to achieve maximum performance due to the absence of scalable collective communication support. In this research, we plan to take on new challenges in providing fast and scalable collective communication support for network-based computing systems. Specifically, the work will proceed along the following four directions: 1) Collective communication in heterogeneous networked environments, 2) Fault-tolerant collective communication algorithms, 3) Exploiting routing adaptivity for scalable collective communication, and 4) Collective communication on NOWs with multiple networking technologies. A combination of theoretical, analytical, and experimental approaches is planned. The OCARNet testbed, together with the local area clusters available at OSU and UC, will be used to validate the new research. This research promises direct applicability to the emerging network-based computing systems. It demonstrates potential for application developers to obtain a multi-fold increase in performance from their network-based computing environments while maintaining portability.

Infrastructure

1997-2000, National Science Foundation Grant for Major Research Infrastructure: Acquisition of a Research Network for Distributed Computing, NSF Grant No. CISE-9871345.

OCARNet Research Network link provided by Ohio Board of Regents.