New for: D1, D2, D3, D4, D5
with many practical applications. An enormous research effort has been invested in to the task
of managing and querying graphs, yet a lot challenges are left unsolved. In this thesis, we advance
the state-of-the-art for the following query models by proposing a distributed solution to process
them in an efficient and scalable manner.
• Set Reachability: A generalization of basic notion of graph reachability, set reachability deals
with finding all reachable pairs for a given source and target sets. We present a non-iterative
distributed solution that takes only a single round of communication for any set reachability query.
• Basic Graph Patterns (BGP): BGP queries are a common mode of querying knowledge graphs,
biological datasets, etc. We present a novel distributed architecture that relies on the concepts
of asynchronous executions, join-ahead pruning, and a multi-threaded query processing framework
to process BGP queries in an efficient and scalable manner.
• Generalized Graph Patterns (GGP). These queries combine the semantics of pattern matching
(BGP) and navigational queries. We present a distributed solution with bimodal indexing layout
that individually support efficient and scalable processing of BGP queries and navigational queries.
We propose a prototype distributed engine, coined “TriAD” (Triple Asynchronous and Distributed)
that supports all the aforementioned query models. We also provide a detailed empirical evaluation
of TriAD in comparison to several state-of-the-art systems over multiple real-world and synthetic
datasets.