Knowledge and Information Systems
An International Journal
ISSN: 0219-1377 (printed version)
ISSN: 0219-3116 (electronic version)
by Springer

Volume 1, Number 4, November 1999

Regular Papers

Exploration of Ordinal Data Using Association Rules

Oliver Buchter and Rudiger Wirth

Abstract. The discovery of association rules is a very efficient data mining technique that is especially suitable for large amounts of categorical data. This paper shows how the discovery of association rules can be of benefit for numeric data as well. Based on a review of previous approaches we introduce Q2, a faster algorithm for the discovery of multi-dimensional association rules over ordinal data. We experimentally compare the new algorithm with the previous approach, obtaining performance improvements of more than an order of magnitude on supermarket data. In addition, a new absolute measure for the interestingness of quantitative association rules is introduced. It is based on the view that quantitative association rules have to be interpreted with respect to their Boolean generalizations. This measure has two major benefits compared to the previously used relative interestingness measure; first, it speeds up rule extraction and evaluation and second, it is easier to interpret for a user. Finally we introduce a rule browser which supports the exploration of ordinal data with quantitative association rules.

An Axiom Foundation for Uncertain Reasonings in Rule-Based Expert Systems: NT-Algebra

Xudong Luo and Chengqi Zhang

Abstract. This paper identifies an axiom foundation for uncertain reasonings in rule-based expert systems: a near topological algebra (NT-algebra for short), which holds some basic notions hidden behind the uncertain reasoning models in rule-based expert systems. According to the basic means of topological connection in an inference network, an NT-algebraic structure has five basic operators, i.e. AND, OR, NOT, Sequential combination and Parallel combination, which obey some axioms. An NT-algebraic structure is defined on a near-degree space introduced by the authors, which is a special topological space. The continuities of real functions, of fuzzy functions and functions in other senses can be uniformly considered in the framework of a near-degree space. This paper also proves that EMYCIN's and PROSPECTOR's uncertain reasoning models correspond to good NT-algebras. Moreover, the existence of any finite NT-algebraic structure is constructively proved. Compared to other related research efforts, the NT-algebra as an axiom foundation has the following characteristics: (1) various cases of assessments for uncertainties of evidence and rules are put into a unified algebraic structure; and (2) major emphasis has been placed on the basic laws of the propagation for them in an inference network, especially the continuity of propagation operations and the relationships between propagation operations.

Run Placement Policies for Concurrent Mergesorts Using Parallel Prefetching

Kun-Lung Wu, Philip S. Yu and James Z. Teng

Abstract. We study the performance of various run placement policies on disks for the merge phase of concurrent mergesorts using parallel prefetching. The initial sorted runs (input) of a merge and its final sorted run (output) are stored on multiple disks but each run resides only on a single disk. In this paper, we examine through detailed simulations three different run placement policies and the impact of buffer thrashing. The results show that, with buffer thrashing avoidance, the best performance can be achieved by a run placement policy that uses a proper subset of the disks dedicated for writing the output runs while the rest of the disks are used for prefetching the input runs in parallel. However, the proper number of write disks is workload dependent, and if not carefully chosen, it can adversely affect the system performance. In practice, a reasonably good performance can be achieved by a run placement policy that does not place the output run of a merge on any of the disks that store its own input runs but allows the output run to share the same disk with some of the input runs of other merges.

Imprecise Reliability of General Structures

Lev V. Utkin and Sergey V. Gurov

Abstract. This paper discusses the important aspects of the reliability of systems with an imprecise general model of the structure function. It is assumed that the information about reliability behavior of components is restricted by the mean levels of component performance. In this case the classical reliability theory cannot provide a way for analyzing the reliability of systems. The theory of imprecise probabilities may be a basis in developing a general reliability theory which allows us to solve such problems. The basic tool for computing new reliability measures is the natural extension which can be regarded as a linear optimization problem. However, the linear programming computations will become impracticable when the number of components in the system is large. Therefore, the main aim of this paper is to obtain explicit expressions for computing the system reliability measures. We analyze the reliability of general structures and typical systems. The numerical examples illustrate the usefulness of the presented approach to reliability analyzing.

Efficient Join Processing Using Partial Precomputation

Kian-Lee Tan, Cheng Hian Goh, Mong Li Lee, and Beng Chin Ooi

Abstract. In this paper, we generalize conventional join indexes to a cluster-based join index, in which objects are grouped into clusters based on proximity. Each record of our join index represents a pair of clusters in which the join condition is satisfied by some members of the cluster. This strategy is especially useful for spatial and high-dimensional databases because of their typically large data volume and complex operations. Our approach leverages on the structure of R-trees by exploiting the internal nodes of an R-tree in effectively determining the precomputed clusters which can be used in our join index. By varying the size of the cluster, we are able to fine-tune the join index to achieve a balance between update cost and retrieval cost to suit individual applications. Different implementations of the join index are examined to determine how the join index can be efficiently maintained. To this end, we also conduct a number of experiments on intersection join and window queries, and the results confirm that semi-precomputation of join results is a robust and cost effective approach to join processing.


This page has been accessed times since October 9, 1999.