Download Free Query Processing In Uncertain Graphs Book in PDF and EPUB Free Download. You can read online Query Processing In Uncertain Graphs and write the review.

Large-scale, highly interconnected networks, which are often modeled as graphs, pervade both our society and the natural world around us. Uncertainty, on the other hand, is inherent in the underlying data due to a variety of reasons, such as noisy measurements, lack of precise information needs, inference and prediction models, or explicit manipulation, e.g., for privacy purposes. Therefore, uncertain, or probabilistic, graphs are increasingly used to represent noisy linked data in many emerging application scenarios, and they have recently become a hot topic in the database and data mining communities. Many classical algorithms such as reachability and shortest path queries become #P-complete and, thus, more expensive over uncertain graphs. Moreover, various complex queries and analytics are also emerging over uncertain networks, such as pattern matching, information diffusion, and influence maximization queries. In this book, we discuss the sources of uncertain graphs and their applications, uncertainty modeling, as well as the complexities and algorithmic advances on uncertain graphs processing in the context of both classical and emerging graph queries and analytics. We emphasize the current challenges and highlight some future research directions.
We study the semantics of queries over uncertain graphs, which are directed graphs in which each edge is associated with a value in [0,1] representing its cer- tainty. In this work, we consider the certainty values as probabilities and show the challenges involved in evaluating the reachability and transitive closure queries over uncertain/probabilistic graphs. As the evaluation method, we adopted graph re- duction from automata theory used for finding regular expressions for input finite state machines. However, we show that different order of eliminating nodes may yield different certainty associated with the results. We then formulate the notion of "correct" results for queries over uncertain graphs, justified based on the notion of common sub-expressions, and identify common paths and avoid their redundant multiple contributions during the reduction. We identify a set of possible patterns to facilitate the reduction process. We have implemented the proposed ideas for answering reachability and transitive closure queries. We evaluated the effectiveness of the proposed solutions using a library of many uncertain graphs with different sizes and structures. We believe the proposed ideas and solution techniques can yield query processing tools for uncertain data management systems.
Incomplete data is part of life and almost all areas of scientific studies. Users tend to skip certain fields when they fill out online forms; participants choose to ignore sensitive questions on surveys; sensors fail, resulting in the loss of certain readings; publicly viewable satellite map services have missing data in many mobile applications; and in privacy-preserving applications, the data is incomplete deliberately in order to preserve the sensitivity of some attribute values. Query processing is a fundamental problem in computer science, and is useful in a variety of applications. In this book, we mostly focus on the query processing over incomplete databases, which involves finding a set of qualified objects from a specified incomplete dataset in order to support a wide spectrum of real-life applications. We first elaborate the three general kinds of methods of handling incomplete data, including (i) discarding the data with missing values, (ii) imputation for the missing values, and (iii) just depending on the observed data values. For the third method type, we introduce the semantics of k-nearest neighbor (kNN) search, skyline query, and top-k dominating query on incomplete data, respectively. In terms of the three representative queries over incomplete data, we investigate some advanced techniques to process incomplete data queries, including indexing, pruning as well as crowdsourcing techniques.
In the world of data management, one of the most formidable challenges faced by academic scholars is the effective handling of spatiotemporal data within the semantic web. As our world continues to change dynamically with time, nearly every aspect of our lives, from environmental monitoring to urban planning and beyond, is intrinsically linked to time and space. This synergy has given rise to an avalanche of spatiotemporal data, and the pressing question is how to manage, model, and query this voluminous information effectively. The existing approaches often fall short in addressing the intricacies and uncertainties that come with spatiotemporal data, leaving scholars struggling to unlock its full potential. Uncertain Spatiotemporal Data Management for the Semantic Web is the definitive solution to the challenges faced by academic scholars in the realm of spatiotemporal data. This book offers a visionary approach to an all-encompassing guide in modeling and querying spatiotemporal data using innovative technologies like XML and RDF. Through a meticulously crafted set of chapters, this book sheds light on the nuances of spatiotemporal data and also provides practical solutions that empower scholars to navigate the complexities of this domain effectively.
The LNCS journal Transactions on Large-Scale Data- and Knowledge-Centered Systems focuses on data management, knowledge discovery, and knowledge processing, which are core and hot topics in computer science. Since the 1990s, the Internet has become the main driving force behind application development in all domains. An increase in the demand for resource sharing across different sites connected through networks has led to an evolution of data- and knowledge-management systems from centralized systems to decentralized systems enabling large-scale distributed applications providing high scalability. Current decentralized systems still focus on data and knowledge as their main resource. Feasibility of these systems relies basically on P2P (peer-to-peer) techniques and the support of agent systems with scaling and decentralized control. Synergy between grids, P2P systems, and agent technologies is the key to data- and knowledge-centered systems in large-scale environments. This, the 18th issue of Transactions on Large-Scale Data- and Knowledge-Centered Systems, contains extended and revised versions of seven papers presented at the 24th International Conference on Database and Expert Systems Applications, DEXA 2013, held in Prague, in the Czech Republic, in August 2013. Following the conference, and two further rounds of reviewing and selection, five extended papers and two invited keynote papers were chosen for inclusion in this special issue. The subject areas covered include argumentation, e-government, business processes, predictive traffic estimation, semantic model integration, top-k query processing, uncertainty handling, graph comparison, community detection, genetic programming, and web services.
Incomplete data is part of life and almost all areas of scientific studies. Users tend to skip certain fields when they fill out online forms; participants choose to ignore sensitive questions on surveys; sensors fail, resulting in the loss of certain readings; publicly viewable satellite map services have missing data in many mobile applications; and in privacy-preserving applications, the data is incomplete deliberately in order to preserve the sensitivity of some attribute values. Query processing is a fundamental problem in computer science, and is useful in a variety of applications. In this book, we mostly focus on the query processing over incomplete databases, which involves finding a set of qualified objects from a specified incomplete dataset in order to support a wide spectrum of real-life applications. We first elaborate the three general kinds of methods of handling incomplete data, including (i) discarding the data with missing values, (ii) imputation for the missing values, and (iii) just depending on the observed data values. For the third method type, we introduce the semantics of k-nearest neighbor (kNN) search, skyline query, and top-k dominating query on incomplete data, respectively. In terms of the three representative queries over incomplete data, we investigate some advanced techniques to process incomplete data queries, including indexing, pruning as well as crowdsourcing techniques.
Ranking queries are widely used in data exploration, data analysis and decision making scenarios. While most of the currently proposed ranking techniques focus on deterministic data, several emerging applications involve data that are imprecise or uncertain. Ranking uncertain data raises new challenges in query semantics and processing, making conventional methods inapplicable. Furthermore, the interplay between ranking and uncertainty models introduces new dimensions for ordering query results that do not exist in the traditional settings. This lecture describes new formulations and processing techniques for ranking queries on uncertain data. The formulations are based on marriage of traditional ranking semantics with possible worlds semantics under widely-adopted uncertainty models. In particular, we focus on discussing the impact of tuple-level and attribute-level uncertainty on the semantics and processing techniques of ranking queries. Under the tuple-level uncertainty model, we describe new processing techniques leveraging the capabilities of relational database systems to recognize and handle data uncertainty in score-based ranking. Under the attribute-level uncertainty model, we describe new probabilistic ranking models and a set of query evaluation algorithms, including sampling-based techniques. We also discuss supporting rank join queries on uncertain data, and we show how to extend current rank join methods to handle uncertainty in scoring attributes. Table of Contents: Introduction / Uncertainty Models / Query Semantics / Methodologies / Uncertain Rank Join / Conclusion