News Brief
June 06, 2017
RICHLAND, Wash. - Analysis of big data that can reveal early signs of an Ebola outbreak or the first traces of a cyberattack requires a different kind of processor than has been developed for large-scale scientific studies. Since the data might come from disparate sources - say, medical records and GPS locations in the case of Ebola - they are organized in such a way that conventional computer processors handle them inefficiently.
Now, the military research organization DARPA has announced a new effort to build a processor for this kind of data - and the software to run on it. A group of computer scientists at the Department of Energy's Pacific Northwest National Laboratory will receive $7 million over five years to create a software development kit for big data analysis.
"Our software development kit will support a high-level, easy-to-use programming environment for both average and expert programmers," said computer scientist John Feo at PNNL. "We also expect it to achieve the program's goal of one thousand-fold improvement over current technology in data processing efficiency."
Conventional processors work best with structured data such as that found in science or an online store, with items arranged in tables of price, descriptions and other categories. But for applications such as cybersecurity, tracking disease outbreaks, or analyzing the power grid, data comes from a variety of sources: emails, webpages or social media apps in the case of cybersecurity, or generating stations, transformers and homes in the case of the power grid.
This type of data - unstructured - is arranged as nodes linked by edges, like stars in constellations. In this arrangement, the nodes - the computers in a network or power plants on the grid - are connected by edges that represent the relationships among them - the Wi-Fi links between computers or the power lines on the grid. The nodes and edges form a structure called a graph, which the new hardware and software will be designed to process and analyze.
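For readers who want to see the idea concretely, here is a minimal Python sketch of a graph whose nodes and edges carry attributes, using the power grid as the example. It is an illustration only and is not the HAGGLE software: the class name, method names and sample data are all hypothetical.

```python
from collections import defaultdict, deque


class AttributedGraph:
    """Hypothetical toy graph: nodes and edges each carry a dict of attributes."""

    def __init__(self):
        self.node_attrs = {}                 # node name -> attributes
        self.adjacency = defaultdict(dict)   # node -> {neighbor: edge attributes}

    def add_node(self, name, **attrs):
        self.node_attrs[name] = attrs
        self.adjacency[name]                 # register the node even if it has no edges

    def add_edge(self, a, b, **attrs):
        # Edges are undirected here: a power line connects both ends.
        for name in (a, b):
            self.node_attrs.setdefault(name, {})
        self.adjacency[a][b] = attrs
        self.adjacency[b][a] = attrs

    def reachable_from(self, start):
        """Breadth-first search: every node connected to `start` by some path."""
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbor in self.adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return seen


# Made-up sample data: a plant feeds a transformer, which feeds two homes.
grid = AttributedGraph()
grid.add_node("plant", kind="generating station")
grid.add_node("transformer", kind="transformer")
grid.add_node("home_a", kind="home")
grid.add_node("home_b", kind="home")
grid.add_node("home_c", kind="home")         # not connected to anything
grid.add_edge("plant", "transformer", line="high voltage")
grid.add_edge("transformer", "home_a", line="low voltage")
grid.add_edge("transformer", "home_b", line="low voltage")

served = grid.reachable_from("plant")
print(sorted(served))
```

Following edges from one node to the next, as the search does here, means jumping between scattered records rather than reading neat rows of a table, which is the kind of access pattern the brief says conventional processors handle inefficiently.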
Andrew Lumsdaine and John Feo will lead a team of researchers from the Northwest Institute for Advanced Computing and PNNL's High Performance Computing group on the HAGGLE plan - Hybrid Attributed Generic Graph Library Environment. Read more about the HIVE plan - Hierarchical Identify Verify & Exploit - at DARPA.
Interdisciplinary teams at Pacific Northwest National Laboratory address many of America's most pressing issues in energy, the environment and national security through advances in basic and applied science. Founded in 1965, PNNL employs 4,400 staff and has an annual budget of nearly $1 billion. It is managed by Battelle for the U.S. Department of Energy's Office of Science. As the single biggest supporter of basic research in the physical sciences in the United States, the Office of Science is working to address some of the most pressing challenges of our time. For more information on PNNL, visit the PNNL News Center, or follow PNNL on Facebook, Google+, Instagram, LinkedIn and Twitter.