From NetSysLab

Jump to: navigation, search

Pattern matching, that is, finding subgraphs that match smaller template graph(s) within the large background graph is fundamental to graph analytics and serves a rich set of applications. Today, there are applications that operate on graphs consist of tens of billions of nodes and over one trillion edges. Unfortunately, existing pattern matching solutions do not scale for modern large graph datasets and there are limitations to the intricacy of the patterns that can be used.

The goal of this research is to enable high-performance and scalable pattern matching on metadata (i.e., labeled) graphs using distributed memory systems. Our early explorations suggest that systematic graph pruning is a viable alternative for enabling scalable, distributed pattern matching on massive metadata graphs. Our work aims to achieve the following goals. First, identify key design requirements for distributed pattern matching following the graph pruning approach and evaluate the feasibility and effectiveness of this mechanism trough an initial set of experiments. Second, devise necessary design refinements, and algorithm and infrastructure development to enable support for a rich set of subgraph matching scenarios (e.g., support for patterns with not only vertex metadata but also edge metadata, temporal graphs) and demonstrate applications on real-world graphs.


Tahsin Reza
Roger Pearce (LLNL)
Geoff Sanders (LLNL)
Matei Ripeanu


[1] Towards Practical and Robust Labeled Pattern Matching in Trillion-Edge Graphs, Tahsin Reza, Christine Klymko, Geoffrey Sanders, Roger Pearce, Matei Ripeanu, IEEE Cluster, Honolulu, HI September 2017. (acceptance rate: 47/216 = 21.8% -- Best Paper Award). pdf slides