Graduate research: Working to speed up supercomputers
“Data replication means resilience against failures, but it also takes up limited disk bandwidth. The system gets congested and it processes data much slower.”
That’s how Simbarashe Dzinamarira, a third-year graduate student in computer science (CS), poses a familiar dilemma in high-performance computing. When processing even routine workloads, replication traffic contends arbitrarily for limited bandwidth with time-sensitive data flows.
“As a result, task execution times are increased and cluster-wide resource utilization becomes highly inefficient,” said Dzinamarira, a native of Harare, Zimbabwe, who received a bachelor of engineering degree in CS from the Hong Kong University of Science and Technology in 2013.
Dzinamarira’s research suggests that a customized, flow-controlled file system called Pfimbi will help clean up the digital clutter. “Pfimbi” in Shona, a Bantu language of the Shona people in Zimbabwe, means “the safe keeping of goods in a warehouse.” In a departure from the conventional design, it would decouple replication from primary data writes and perform job-aware flow control on file system traffic. Just as users do not back up their computers when they are busy, Pfimbi pauses the replication process when computers in a cluster are processing data.
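The idea of pausing replication while a node is busy can be pictured with a small credit scheme. The sketch below is illustrative only and is not Pfimbi's code: the names `ReplicaReceiver`, `ReplicaSender`, and `simulate`, and the notion of a fixed per-tick disk capacity, are assumptions made for the example. A receiving node hands out replication credits only for the bandwidth that time-sensitive writes leave unused, so replication stalls while the node is saturated and resumes when it goes idle.

```python
from collections import deque


class ReplicaReceiver:
    """Hypothetical storage node that can write `capacity_per_tick` blocks per tick."""

    def __init__(self, capacity_per_tick):
        self.capacity_per_tick = capacity_per_tick

    def grant_credits(self, primary_blocks_this_tick):
        # Time-sensitive (primary) writes are served first; only the
        # leftover capacity is offered to replication as credits.
        return max(0, self.capacity_per_tick - primary_blocks_this_tick)


class ReplicaSender:
    """Hypothetical upstream node holding blocks that still need a replica."""

    def __init__(self, blocks):
        self.pending = deque(blocks)
        self.sent = []

    def send(self, credits):
        # Send at most one block per credit; with zero credits nothing moves.
        count = min(credits, len(self.pending))
        for _ in range(count):
            self.sent.append(self.pending.popleft())
        return count


def simulate(primary_load, capacity, replica_blocks):
    """Return how many replica blocks were sent in each tick."""
    receiver = ReplicaReceiver(capacity)
    sender = ReplicaSender(replica_blocks)
    sent_per_tick = []
    for load in primary_load:
        credits = receiver.grant_credits(load)
        sent_per_tick.append(sender.send(credits))
    return sent_per_tick


if __name__ == "__main__":
    # The node is saturated for two ticks, then mostly idle.
    print(simulate([4, 4, 1, 0], capacity=4,
                   replica_blocks=["b1", "b2", "b3", "b4", "b5"]))
```

In this toy run, no replica blocks move during the two busy ticks, and the backlog drains once primary traffic subsides.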
“Pfimbi uses established flow-control techniques such as credit-based flow control and weighted fair queuing in novel ways. This speeds up the execution of tasks,” said Dzinamarira, whose faculty adviser is T. S. Eugene Ng, associate professor of CS and of electrical and computer engineering, and director of BOLD (Big Data and Optical Lightpaths Driven Lab) at Rice.
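Weighted fair queuing, one of the techniques Dzinamarira names, can be sketched in a few lines. This is a simplified textbook version, not Pfimbi's implementation: it assumes every flow is backlogged from time zero, and the function name `wfq_order`, the flow names, and the weights are hypothetical. Each block gets a virtual finish time equal to the previous finish time of its flow plus its size divided by the flow's weight, and blocks are served in order of finish time.

```python
import heapq


def wfq_order(flows, weights):
    """Return the service order for flows that are all backlogged at time zero.

    flows   -- dict mapping a flow name to a list of block sizes
    weights -- dict mapping a flow name to its (positive) weight
    """
    heap = []
    sequence = 0  # tie-breaker so equal finish times keep insertion order
    for name, sizes in flows.items():
        finish = 0.0
        for size in sizes:
            # A higher weight makes the virtual finish time advance more slowly,
            # so that flow is served more often.
            finish += size / weights[name]
            heapq.heappush(heap, (finish, sequence, name, size))
            sequence += 1

    order = []
    while heap:
        _, _, name, size = heapq.heappop(heap)
        order.append((name, size))
    return order


if __name__ == "__main__":
    flows = {"primary": [1, 1, 1, 1], "replica": [1, 1]}
    weights = {"primary": 2, "replica": 1}  # assumed weights, for illustration
    for name, size in wfq_order(flows, weights):
        print(name, size)
```

With these assumed weights, the primary flow receives two blocks of service for every replica block, so replication still makes progress without starving the time-sensitive writes.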
Dzinamarira has augmented HDFS (Hadoop Distributed File System), a Java-based file system that offers scalable, reliable data storage, to create Pfimbi. “We can halve the duration of a job’s write phase, and at the same time reduce the average job’s runtime in a realistic workload by up to 20 percent using our version of HDFS,” he said.
In October, Dzinamarira’s pitch at the fourth annual Screech Competition, sponsored by the Rice Center for Engineering Leadership, was titled “Processing Big Data in Little Time.” In November, he was one of eight students at Rice to receive a graduate fellowship from the Ken Kennedy Institute for Information Technology. That same month, he defended his master’s thesis in CS, and he expects to earn his Ph.D. in 2017.
“Sadly, the capacity to move around data is limited. When the system is busy, we slow down backup so the time-critical flows can move faster, and time is always critical,” Dzinamarira said.