[1]
G. Coghill, A. Srinivasan, and R.D. King. Qualitative System Identification from Imperfect Data. Journal of Artificial Intelligence Research, 2008. (To appear).

[2]
S. Joshi, G. Ramakrishnan, and A. Srinivasan. Feature construction using theory-guided sampling and randomised search. In Proceedings of the Eighteenth International Conference on Inductive Logic Programming (ILP 2008), LNAI, Berlin, 2008. Springer. (To appear).

[3]
N. Fonseca, A. Srinivasan, F. Silva, and R. Camacho. Parallel ILP for Distributed-Memory Architectures. Machine Learning Journal, 2008. (Conditionally accepted).

[4]
G. Ramakrishnan, S. Joshi, S. Balakrishnan, and A. Srinivasan. Using ILP to Assist Information Extraction from Semi-Structured Text. In Proceedings of the Seventeenth International Conference on Inductive Logic Programming, LNAI 4894, pages 211-224, Berlin, 2008. Springer.

[5]
A. Srinivasan and R.D. King. Incremental Identification of Qualitative Models of Biological Systems. Journal of Machine Learning Research, 2008. (To appear).

[6]
S. Tyagi, C. Vaz, V. Gupta, R. Bhatia, S. Maheshwari, A. Srinivasan, and A. Bhattacharya. CID-miRNA: A web server for prediction of novel miRNA precursors in the human genome. Biochemical and Biophysical Research Communications, 2008. (To appear).

[7]
A. Paes, F. Zelezny, G. Zaverucha, C.D. Page, and A. Srinivasan. ILP through Propositionalization and Stochastic k-Term DNF Learning. In Proceedings of the Sixteenth International Conference on Inductive Logic Programming, LNAI 4455, pages 379-393, Berlin, 2007. Springer.

[8]
L. Specia, A. Srinivasan, G. Ramakrishnan, and M. Nunes. Word Sense Disambiguation with ILP. In Proceedings of the Sixteenth International Conference on Inductive Logic Programming, LNAI 4455, pages 409-423, Berlin, 2007. Springer. Winner of the Best Application Prize.
The identification of the correct sense of a word is necessary for most tasks in automatic natural language processing, like machine translation, information retrieval, and speech and text processing. Automatic Word Sense Disambiguation (WSD) is difficult, and accuracies with state-of-the-art methods are substantially lower than in other areas of text understanding like part-of-speech tagging. One shortcoming of these methods is that they do not utilise substantial sources of background knowledge, like parsers, semantic taxonomies, dictionaries and so on, that are now available in electronic form (the methods largely use shallow syntactic features). Empirical results from the use of Inductive Logic Programming (ILP) have repeatedly shown the ability of ILP systems to use diverse sources of background knowledge. In this paper, we investigate the use of ILP for WSD in two different ways: (a) as a stand-alone constructor of models for WSD; and (b) to build interesting features, which can then be used by a standard model-builder such as an SVM. Our investigation takes the form of experiments that examine a monolingual WSD task using the 32 English verbs contained in the SENSEVAL-3 benchmark data; and a bilingual WSD task using 7 highly ambiguous verbs that arise in machine translation from English to Portuguese. The background knowledge available comes from eight sources that provide a wide range of potentially relevant syntactic and semantic information. For both WSD tasks, experimental results show that ILP-constructed models, and models constructed using ILP-generated features, have higher average (mean and median) accuracies than those obtained using a state-of-the-art feature-based technique equipped with shallow syntactic features. This suggests that the use of ILP with diverse sources of background knowledge provides one way of making substantial progress in automatic WSD.
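As an illustration of route (b) above, the sketch below (Python with scikit-learn; the binary feature matrix is a synthetic stand-in for ILP-generated clause-coverage features, not data from the paper) shows how such features can be handed to a standard model-builder such as an SVM.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    # Hypothetical data: each column records whether an ILP-constructed clause
    # covers the example; labels are word senses (binary here for brevity).
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(200, 50)).astype(float)  # 200 examples, 50 ILP features
    y = rng.integers(0, 2, size=200)                      # sense labels

    clf = SVC(kernel="linear", C=1.0)                     # standard model-builder
    print(cross_val_score(clf, X, y, cv=5).mean())        # cross-validated accuracy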

[9]
A. Srinivasan, C.D. Page, R. Camacho, and R.D. King. Quantitative Pharmacophore Models with Inductive Logic Programming. Machine Learning Journal, 64(1,2,3):65-90, 2006.
Three-dimensional models, or pharmacophores, describing Euclidean constraints on the location of functional groups (like hydrophobic groups, hydrogen acceptors and donors, etc.) on small molecules, are often used in drug design to describe the medicinal activity of potential drugs (or `ligands'). This medicinal activity is produced by interaction of the functional groups on the ligand with a binding site on a target protein. In identifying structure-activity relations of this kind there are three principal issues: (1) It is often difficult to ``align'' the ligands in order to identify common structural properties that may be responsible for activity; (2) Ligands in solution can adopt different shapes (or `conformations') arising from torsional rotations about bonds. The 3-D molecular substructure is typically sought on one or more low-energy conformers; and (3) Pharmacophore models must, ideally, predict medicinal activity on some quantitative scale. It has been shown that the logical representation adopted by Inductive Logic Programming (ILP) naturally resolves many of the difficulties associated with the alignment and multi-conformation issues. However, the predictions of models constructed by ILP have hitherto only been nominal, predicting medicinal activity to be present or absent. In this paper, we investigate the construction of two kinds of quantitative pharmacophoric models with ILP: (a) Models that predict the probability that a ligand is ``active''; and (b) Models that predict the actual medicinal activity of a ligand. Quantitative predictions are obtained by utilising the following statistical procedures as background knowledge: logistic regression and naive Bayes, for probability prediction; linear and kernel regression, for activity prediction. The multi-conformation issue and, more generally, the relational representation used by ILP result in some special difficulties in the use of any statistical procedure. We present the principal issues and some solutions. Specifically, using data on the inhibition of the protease Thermolysin, we demonstrate that it is possible for an ILP program to construct good quantitative structure-activity models. We also comment on the relationship of this work to other recent developments in statistical relational learning.

[10]
F. Zelezny, A. Srinivasan, and C.D. Page. Randomised Restarted Search in ILP. Machine Learning Journal, 64(1,2,3):183-208, 2006.
Recent statistical performance surveys of search algorithms in difficult combinatorial problems have demonstrated the benefits of randomising and restarting the search procedure. Specifically, it has been found that if the search cost distribution of the non-restarted randomised search exhibits a slower-than-exponential decay (that is, a ``heavy tail''), restarts can reduce the search cost expectation. We report on an empirical study of randomised restarted search in ILP. Our experiments conducted on a high-performance distributed computing platform provide an extensive statistical performance sample of five search algorithms operating on two principally different classes of ILP problems, one represented by an artificially generated graph problem and the other by three traditional classification benchmarks (mutagenicity, carcinogenicity, finite element mesh design). The sample allows us to (1) estimate the conditional expected value of the search cost (measured by the total number of clauses explored) given the minimum clause score required and a ``cutoff'' value (the number of clauses examined before the search is restarted), (2) estimate the conditional expected clause score given the cutoff value and the invested search cost, and (3) compare the performance of randomised restarted search strategies to a deterministic non-restarted search. Our findings indicate striking similarities across the five search algorithms and the three domains, in terms of the basic trends of both the statistics (1) and (2). Also, we observe that the cutoff value is critical for the performance of the search algorithm, and using its optimal value in a randomised restarted search may yield a mean search cost several orders of magnitude lower than that obtained with a deterministic non-restarted search.
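For intuition, a generic restart-with-cutoff loop is sketched below; the `random_start`, `neighbours` and `score` callables are placeholders for a randomised clause generator, a refinement operator and a clause-scoring function, and the sketch is not the search code of any of the five algorithms studied.

    import random

    def restarted_search(random_start, neighbours, score, cutoff,
                         min_score, max_restarts=100):
        """Explore at most `cutoff` clauses per run, restarting from a fresh
        random clause until a clause scoring at least `min_score` is found."""
        best, best_score, explored = None, float("-inf"), 0
        for _ in range(max_restarts):
            current = random_start()
            for _ in range(cutoff):
                explored += 1                      # search cost: clauses examined
                s = score(current)
                if s > best_score:
                    best, best_score = current, s
                if best_score >= min_score:
                    return best, best_score, explored
                successors = neighbours(current)   # refinement step
                if not successors:
                    break
                current = random.choice(successors)
        return best, best_score, explored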

[11]
H. Blockeel, C.D. Page, and A. Srinivasan. Multi-Instance Tree Learning. In Proceedings of the Twenty Second International Conference on Machine Learning (ICML 2005), 2005.
We introduce a novel algorithm for decision tree learning in the multi-instance setting as originally defined by Dietterich et al. It differs from existing multi-instance tree learners in a few crucial, well-motivated details. Experiments on synthetic and real-life datasets confirm the beneficial effect of these differences and show that the resulting system outperforms the existing multi-instance decision tree learners.

[12]
G.M. Coghill, S.M. Garrett, A. Srinivasan, and R.D. King. Qualitative System Identification from Imperfect Data. Technical Report TR0501, Dept. of Computer Science, University of Aberdeen, Aberdeen, 2005.
Experience in the physical sciences suggests that the only realistic means of understanding complex systems is through the use of mathematical models. Typically, this has come to mean the identification of quantitative models expressed as differential equations. Quantitative modelling works best when the structure of the model, i.e., the form of the equations, is known (theoretically) and the primary concern is one of estimating the values of the constant parameters in the model. For complex biological systems, the model-structure is rarely known and the modeller has to deal with both model-identification and parameter-estimation. In this paper, we are concerned with providing automated assistance to the first of these problems. Specifically, we examine the identification, by machine, of the structural relationship between experimentally observed variables. The relationship will be expressed in the form of qualitative abstractions of a quantitative model. Such qualitative models can not only provide clues to the precise quantitative model, but also assist in understanding the essence of that model. Our position in this paper is that background knowledge incorporating system modelling principles can be used to constrain effectively the set of good qualitative models. Utilising the model-identification framework provided by Inductive Logic Programming (ILP), we present empirical support for this position using a series of increasingly complex artificial datasets. The results are obtained with qualitative and quantitative data subject to varying amounts of noise and different degrees of sparsity. The results also point to the presence of a set of qualitative states, which we term kernel subsets, that may be necessary for a qualitative model-learner to learn correct models. We also demonstrate scalability to biological system modelling by identification of the core of the glycolysis metabolic pathway from data.

[13]
A. Srinivasan and R. Kothari. A study of applying dimensionality reduction to restrict the size of a hypothesis space. In Proceedings of the Fifteenth International Conference on Inductive Logic Programming (ILP2005), LNAI 3625, pages 348-365, Berlin, 2005. Springer.
Given sample data and background knowledge encoded in the form of logic programs, a predictive Inductive Logic Programming (ILP) system attempts to find a set of rules (or clauses) for predicting classification labels in the data. Most present-day systems for this purpose rely on some variant of a generate-and-test procedure that repeatedly examines a set of potential candidates (termed here the ``hypothesis space''). On each iteration a search procedure is employed to find the best clause. The worst-case time-complexity of such systems depends critically on: (1) the size of the hypothesis spaces examined; and (2) the cost of estimating the goodness of a clause. To date, attempts to improve the efficiency of such ILP systems have concentrated either on examining fewer clauses within a given hypothesis space, or on efficient means of estimating the goodness of clauses. The principal means of restricting the size of the hypothesis space itself has been through the use of language and search constraints. Given such constraints, this paper is concerned with investigating the use of a dimensionality reduction method to reduce further the size of the hypothesis space. Specifically, for a particular kind of ILP system, clauses in the search space are represented as points in a high-dimensional space. Using a sample of points from this geometric space, feature selection is used to discard dimensions of little or no (statistical) relevance. The resulting lower-dimensional space translates directly, in the worst case, to a smaller hypothesis space. We evaluate this approach on one controlled domain (graphs) and two real-life datasets concerning problems from biochemistry (mutagenesis and carcinogenesis). In each case, we obtain unbiased estimates of the size of the hypothesis space before and after feature selection, and compare the results obtained using a complete search of the two spaces.
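The toy sketch below (synthetic data; scikit-learn's chi-squared filter standing in for whichever statistical relevance test one prefers) illustrates the general idea: clauses sampled from the hypothesis space are represented as points in a high-dimensional binary space, and dimensions showing no statistical relevance to clause goodness are discarded.

    import numpy as np
    from sklearn.feature_selection import SelectKBest, chi2

    # Synthetic stand-in: sampled clauses encoded as binary vectors (one
    # dimension per candidate literal), with a toy "good clause" label that
    # actually depends on only two of the dimensions.
    rng = np.random.default_rng(1)
    clauses = rng.integers(0, 2, size=(500, 40))
    goodness = (clauses[:, 0] & clauses[:, 3]).astype(int)

    selector = SelectKBest(chi2, k=10).fit(clauses, goodness)
    kept = np.flatnonzero(selector.get_support())
    print("dimensions kept:", kept)   # the lower-dimensional clause representation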

[14]
R. Camacho, R.D. King, and A. Srinivasan, editors. Inductive Logic Programming. LNAI 3194. Springer, Berlin, 2004.

[15]
F. Zelezny, A. Srinivasan, and C.D. Page. A Monte-Carlo Study of Randomised Restarted Search in ILP. In Proceedings of the Fourteenth International Conference on Inductive Logic Programming (ILP2004), LNAI 3194, pages 341-358, Berlin, 2004. Springer.
Recent statistical performance surveys of search algorithms in difficult combinatorial problems have demonstrated the benefits of randomising and restarting the search procedure. Specifically, it has been found that if the search cost distribution (SCD) of the non-restarted randomised search exhibits a slower-than-exponential decay (that is, a ``heavy tail''), restarts can reduce the search cost expectation. Recently, this heavy-tail phenomenon was observed in the SCDs of benchmark ILP problems. Following on from this work, we report on an empirical study of randomised restarted search in ILP. Our experiments, conducted over a cluster of a few hundred computers, provide an extensive statistical performance sample of five search algorithms operating on two principally different ILP problems (artificially generated graph data and the well-known ``mutagenesis'' problem). The sample allows us to (1) estimate the conditional expected value of the search cost (measured by the total number of clauses explored) given the minimum clause score required and a ``cutoff'' value (the number of clauses examined before the search is restarted); and (2) compare the performance of randomised restarted search strategies to a deterministic non-restarted search. Our findings indicate that the cutoff value is significantly more important than the choice of (a) the specific refinement strategy; (b) the starting element of the search; and (c) the specific data domain. We find that the optimal value for the cutoff parameter remains roughly stable across variations of these three factors and that the mean search cost using this value in a randomised restarted search is up to three orders of magnitude (i.e. 1000 times) lower than that obtained with a deterministic non-restarted search.

[16]
A. Srinivasan, R.D. King, and M.E. Bain. An empirical study of the use of relevance information in Inductive Logic Programming. Journal of Machine Learning Research, 4(Jul):369-383, 2003.
Inductive Logic Programming (ILP) systems construct models for data using domain-specific background information. When using these systems, it is typically assumed that sufficient human expertise is at hand to rule out irrelevant background information. Such irrelevant information can, and typically does, hinder an ILP system's search for good models. Here, we provide evidence that if additional expertise is available that can provide a partial ordering on sets of background predicates in terms of relevance to the analysis task, then this can be used to good effect by an ILP system. In particular, using data from biochemical domains, we investigate an incremental strategy of including sets of predicates in decreasing order of relevance. Results obtained suggest that: (a) the incremental approach identifies, in less time, a model that is comparable in predictive accuracy to that obtained with all background information in place; and (b) the incremental approach using the relevance ordering performs better than one that does not (that is, one that adds sets of predicates randomly). For a practitioner concerned with the use of ILP, the implications of these findings are two-fold: (1) when not all background information can be used at once (either due to limitations of the ILP system, or the nature of the domain), expert assessment of the relevance of background predicates can assist substantially in the construction of good models; and (2) good ``first-cut'' results can be obtained quickly by a simple exclusion of information known to be less relevant.
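A schematic version of the incremental strategy is sketched below; `learn` and `evaluate` are placeholders for an ILP run and an accuracy estimate, and the stopping rule shown is one plausible choice rather than the paper's.

    def incremental_model(predicate_sets, learn, evaluate, tolerance=0.0):
        """Add background predicate sets in decreasing order of relevance,
        stopping once a further addition fails to improve estimated accuracy."""
        background, best_model, best_acc = [], None, float("-inf")
        for predicates in predicate_sets:            # most relevant first
            candidate = background + list(predicates)
            model = learn(candidate)
            acc = evaluate(model)
            if acc <= best_acc + tolerance:
                break                                # no useful gain: stop early
            background, best_model, best_acc = candidate, model, acc
        return best_model, best_acc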

[17]
H. Toivonen, A. Srinivasan, R.D. King, S. Kramer, and C. Helma. Statistical Evaluation of The Predictive Toxicology Challenge 2000-2001. Bioinformatics, 19:1183-1193, 2003.
The development of in silico models to predict chemical carcinogenesis from molecular structure would help greatly to prevent environmentally caused cancers. The Predictive Toxicology Challenge (PTC) competition was organized to test the state-of-the-art in applying machine learning to form such predictive models. Fourteen machine learning groups generated 111 models. The use of Receiver Operating Characteristic (ROC) space allowed the models to be uniformly compared regardless of the error cost function. We developed a statistical method to test if a model performs significantly better than random in ROC space. Using this test as the criterion, five models performed better than random guessing at a significance level p of 0.05. Statistically the best predictor was the Viniti model for female mice, with a p-value below 0.002. The toxicologically most interesting models were Leuven2 for male mice, and Kwansei for female rats. These models performed well in the statistical analysis and they are in the middle of ROC space, i.e., distant from extreme cost assumptions. These predictive models were also independently judged by domain experts to be among the three most interesting, and are believed to include a small but significant amount of empirically learned toxicological knowledge.

[18]
V. S. Costa, A. Srinivasan, R. C. Camacho, H. Blockeel, B. Demoen, G. Janssens, J. Struyf, H. Vandecasteele, and W. Van Laer. Query Transformations for Improving the Efficiency of ILP Systems. Journal of Machine Learning Research, 4(Aug):465-491, 2003.
Relatively simple transformations can speed up the execution of queries for data mining considerably. While some ILP systems use such transformations, relatively little is known about them or how they relate to each other. This paper describes a number of such transformations. Not all of them are novel, but there have been no studies comparing their efficacy. The main contributions of the paper are: (a) it clarifies the relationship between the transformations; (b) it contains an empirical study of what can be gained by applying the transformations; and (c) it provides some guidance on the kinds of problems that are likely to benefit from the transformations.

[19]
C.D. Page and A. Srinivasan. ILP: A Short Look Back and a Longer Look Forward. Journal of Machine Learning Research, 4(Aug):415-430, 2003.
Inductive logic programming (ILP) is built on a foundation laid by research in machine learning and computational logic. Armed with this strong foundation, ILP has been applied to important and interesting problems in the life sciences, engineering and the arts. In turn, the applications have brought into focus the need for more research into specific topics. We enumerate five of these: (1) novel search methods; (2) incorporation of explicit probabilities; (3) incorporation of special-purpose reasoners; (4) parallel execution using commodity components; and (5) enhanced human interaction. It is our hypothesis that progress in each of these areas can greatly improve the contributions that can be made with ILP; and that, with assistance from research workers in other areas, significant progress in each of these areas is possible.

[20]
A. Srinivasan. The Applicability to ILP of Results Concerning the Ordering of Binomial Populations. In Proceedings of the Twelfth International Conference on Inductive Logic Programming (ILP2002), LNAI, Berlin, 2002. Springer.
Results obtained in the 1950's on the ordering of k binomial populations are directly applicable to computing the size of the data sample required to identify the best amongst k theories in an ILP search. The basic problem considered is that of selecting the best one of k (k ≥ 2) binomial populations on the basis of the observed frequency of `success' on n observations. In the statistical formulation of the problem the experimenter specifies that he or she would like to guarantee that the probability of correct selection is at least P* (0 ≤ P* < 1) whenever the true difference between the best and second best population is at least d* (0 < d* ≤ 1). The principal result of interest is that for large values of k, the value of n required to meet any fixed specification (d*,P*) is approximately equal to some constant multiple of ln k. Depending on the k, d* and P* specified, this value of n can be substantially lower than that obtained within the corresponding PAC formulation. Applied to the search conducted by an ILP algorithm, the result guarantees with confidence level P*, that the true accuracy of the theory selected will be no more than d* below that of the best theory. In this paper, we review this basic result and test its predictions empirically. We also comment on the relevance to ILP of more recent work in the field.
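In notation, the specification and the asymptotic result described above can be restated as follows (a restatement of the abstract, not a derivation):

    % Indifference-zone selection of the best of k binomial populations,
    % with p_{(k)} and p_{(k-1)} the best and second-best success probabilities:
    \[
      \Pr(\text{correct selection}) \;\ge\; P^{*}
      \quad\text{whenever}\quad p_{(k)} - p_{(k-1)} \;\ge\; d^{*},
    \]
    \[
      n(k; d^{*}, P^{*}) \;\approx\; c(d^{*}, P^{*})\,\ln k
      \qquad\text{for large } k,
    \]
    % where c(d^*, P^*) is a constant depending only on the specification.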

[21]
F. Zelezny, A. Srinivasan, and C.D. Page. Lattice-Search Runtime Distributions May Be Heavy-Tailed. In Proceedings of the Twelfth International Conference on Inductive Logic Programming (ILP2002), LNAI, Berlin, 2002. Springer.
Recent empirical studies show that runtime distributions of backtrack procedures for solving hard combinatorial problems often have intriguing properties. Unlike standard distributions (such as the normal), such distributions decay slower than exponentially and have ``heavy tails''. Procedures characterised by heavy-tailed runtime distributions exhibit large variability in efficiency, but a very straightforward method called rapid randomized restarts has been designed to improve their average performance substantially. We show on two experimental domains that heavy-tailed phenomena can be observed in ILP, namely in the search for a clause in the subsumption lattice. We also reformulate the technique of rapid randomized restarts to make it applicable to ILP and show that it can reduce the average search-time.
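As a reminder of the property in question, a heavy (Pareto-like) tail in a runtime distribution is commonly characterised as below; this is the standard working definition from the combinatorial-search literature, not a result of the paper:

    \[
      \Pr(X > x) \;\sim\; C\,x^{-\alpha}
      \qquad (x \to \infty,\; C > 0,\; \alpha > 0),
    \]
    % i.e., the survival function decays more slowly than e^{-\lambda x} for
    % every \lambda > 0, so very long runs retain non-negligible probability.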

[22]
C. Helma, R.D. King, S. Kramer, and A. Srinivasan. The Predictive Toxicology Challenge 2000-2001. Bioinformatics, 17:107-108, 2001. (Application Note).
We initiated the Predictive Toxicology Challenge (PTC) to stimulate the development of advanced SAR techniques for predictive toxicology models. The goal of this challenge is to predict the rodent carcinogenicity of new compounds based on the experimental results of the US National Toxicology Program (NTP). Submissions will be evaluated on quantitative and qualitative scales to select the most predictive models and those with the highest toxicological relevance.

[23]
R. D. King, A. Srinivasan, and L. DeHaspe. WARMR: A Data Mining Tool for Chemical Data. Computer Aided Molecular Design, 15:173-181, 2001.
Data mining techniques are becoming increasingly important in chemistry as databases become too large to examine manually. Data mining methods from the field of Inductive Logic Programming (ILP) have potential advantages for structural chemical data. In this paper we present Warmr, the first ILP data mining algorithm to be applied to chemoinformatic data. We illustrate the value of Warmr by applying it to a well-studied database of chemical compounds tested for carcinogenicity in rodents. Data mining was used to find all frequent substructures in the database, and knowledge of these frequent substructures is shown to add value to the database. One use of the frequent substructures was to convert them into probabilistic prediction rules relating compound description to carcinogenesis. These rules were found to be accurate on test data and to give some insight into the relationship between structure and activity in carcinogenesis. In addition, the substructures were used to prove that there existed no accurate probabilistic rule, based purely on atom-bond substructure with less than seven conditions, that could predict carcinogenicity. This result put a lower bound on the complexity of the relationship between chemical structure and carcinogenicity. Only by using a data mining algorithm, and by doing a complete search, is it possible to prove such a result. The frequent substructures were also found to add value by increasing the accuracy of statistical and machine learning programs that were trained to predict chemical carcinogenicity. We conclude that Warmr, and ILP data mining methods generally, are an important new tool for analysing chemical databases.

[24]
S.H. Muggleton, C.H. Bryant, A. Srinivasan, A. Whittaker, S. Topp, and C. Rawlings. Are grammatical representations useful for learning from biological sequence data? Journal of Computational Biology, 8(5):493-522, 2001.
This paper investigates whether Chomsky-like grammar representations are useful for learning cost-effective, comprehensible predictors of members of biological sequence families. The Inductive Logic Programming (ILP) Bayesian approach to learning from positive examples is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). Collectively, five of the co-authors of this paper have extensive expertise on NPPs and general bioinformatics methods. Their motivation in generating an NPP grammar was that none of the existing bioinformatic methods could provide sufficient cost-savings during the search for new NPPs. Prior to this project, experienced specialists at SmithKline Beecham had tried for many months to hand-code such a grammar without success. Our best predictor makes the search for novel NPPs more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. As far as these authors are aware, this is both the first biological grammar learnt using ILP and the first real-world scientific application of the ILP Bayesian approach to learning from positive examples. A group of features is derived from this grammar. Other groups of features of NPPs are derived using other learning strategies. Amalgams of these groups are formed. A recognition model is generated for each amalgam using C4.5 and C4.5rules and its performance measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The highest RA was achieved by a model which includes grammar-derived features. This RA is significantly higher than the best RA achieved without the use of the grammar-derived features. Predictive accuracy is not a good measure of performance for this domain because it does not discriminate well between NPP recognition models: despite covering varying numbers of (the rare) positives, all the models are awarded a similar (high) score by predictive accuracy because they all exclude most of the abundant negatives.

[25]
A. Srinivasan. Extracting context-sensitive models in Inductive Logic Programming. Machine Learning, 44:301-324, 2001.
Given domain-specific background knowledge and data in the form of examples, an Inductive Logic Programming (ILP) system extracts models in the data-analytic sense. We view the model-selection step facing an ILP system as a decision problem, the solution of which requires knowledge of the context in which the model is to be deployed. In this paper, ``context'' will be defined by the current specification of the prior class distribution and the client's preferences concerning errors of classification. Within this restricted setting, we consider the use of an ILP system in situations where: (a) contexts can change regularly. This can arise, for example, from changes to class distributions or misclassification costs; and (b) the data are from observational studies. That is, they may not have been collected with any particular context in mind. Some repercussions of these are: (a) any one model may not be the optimal choice for all contexts; and (b) not all the background information provided may be relevant for all contexts. Using results from the analysis of Receiver Operating Characteristic curves, we investigate a technique that can equip an ILP system to reject those models that cannot possibly be optimal in any context. We present empirical results from using the technique to analyse two datasets concerned with the toxicity of chemicals (in particular, their mutagenic and carcinogenic properties). Clients can, and typically do, approach such datasets with quite different requirements. For example, a synthetic chemist would require models with a low rate of commission errors which could be used to direct efficiently the synthesis of new compounds. A toxicologist, on the other hand, would prefer models with a low rate of omission errors. This would enable a more complete identification of toxic chemicals at a calculated cost of misidentification of non-toxic cases as toxic. The approach adopted here attempts to obtain a solution that contains models that are optimal for each such user according to the cost function that he or she wishes to apply. In doing so, it also provides one solution to the problem of how the relevance of background predicates is to be assessed in ILP.

[26]
A. Srinivasan. Four Suggestions and a Rule Concerning the Application of ILP. In Nada Lavrac and Saso Dzeroski, editors, Relational Data Mining, pages 365-374. Springer-Verlag, Berlin, 2001.
Since the late 1980s there has been a sustained research effort directed at investigating the application of Inductive Logic Programming (ILP) to problems in biology and chemistry. This essay is a personal view of some interesting issues that have arisen during my involvement in this enterprise. Many of the concerns of the broader field of Knowledge Discovery from Databases manifest themselves during the application of ILP to analyse bio-chemical data. Addressing them in this microcosm has given me some directions on the wider application of ILP, and I present these here in the form of four suggestions and one rule. Readers are invited to consider them in the context of a hypothetical Recommended Codes and Practices for the application of ILP.

[27]
V. S. Costa, A. Srinivasan, and R. C. Camacho. A note on two simple transformations for improving the efficiency of an ILP system. In J. Cussens and A. Frisch, editors, Proceedings of the Tenth International Conference on Inductive Logic Programming (ILP2000), LNAI, pages 225-242, Berlin, 2000. Springer.
Inductive Logic Programming (ILP) systems have had noteworthy successes in extracting comprehensible and accurate models for data drawn from a number of scientific and engineering domains. These results suggest that ILP methods could enhance the model-construction capabilities of software tools being developed for the emerging discipline of ``knowledge discovery from databases''. One significant concern in the use of ILP for this purpose is that of efficiency. The performance of modern ILP systems is principally affected by two issues: (1) they often have to search through very large numbers of possible rules (usually in the form of definite clauses); (2) they have to score each rule on the data (usually in the form of ground facts) to estimate ``goodness''. Stochastic and greedy approaches have been proposed to alleviate the complexity arising from each of these issues. While these techniques can result in order-of-magnitude improvements in the worst-case search complexity of an ILP system, they do so at the expense of exactness. As this may be unacceptable in some situations, we examine two methods that result in admissible transformations of clauses examined in a search. While the methods do not alter the size of the search space (that is, the number of clauses examined), they can alleviate the theorem-proving effort required to estimate goodness. The first transformation simply involves eliminating literals using a weak test for redundancy. The second involves partitioning the set of literals within a clause into groups that can be executed independently of each other. The efficacy of these transformations is evaluated empirically on a number of well-known ILP datasets. The results suggest that for problems that require the use of highly non-determinate predicates, the transformations can provide significant gains as the complexity of clauses sought increases.
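The second transformation can be pictured with the toy sketch below, which groups body literals that are connected through shared variables (variables bound by the clause head are ignored for brevity; the literal and predicate names are made up, and this is not the paper's code):

    from collections import defaultdict

    def independent_groups(literals):
        """`literals` is a list of (predicate, variables) pairs, e.g.
        [("bond", ("A", "B")), ("atm", ("C",)), ("charge", ("C", "X"))]."""
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x

        def union(x, y):
            parent[find(x)] = find(y)

        # Link each literal to the variables it mentions.
        for i, (_, variables) in enumerate(literals):
            for v in variables:
                union(("lit", i), ("var", v))

        groups = defaultdict(list)
        for i, lit in enumerate(literals):
            groups[find(("lit", i))].append(lit)
        return list(groups.values())

    print(independent_groups([("bond", ("A", "B")),
                              ("atm", ("C",)),
                              ("charge", ("C", "X"))]))
    # -> two groups that can be tested against the data independently:
    #    [("bond", ...)] and [("atm", ...), ("charge", ...)]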

[28]
S. H. Muggleton, C. H. Bryant, and A. Srinivasan. Learning Chomsky-like grammars for biological sequence families. In Proceedings of the Seventeenth International Conference on Machine Learning, San Francisco, CA, 2000. Morgan Kaufmann.
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). As far as these authors are aware, this is both the first biological grammar learnt using ILP and the first real-world scientific application of the positive-only learning framework of CProgol. Performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The RA results show that searching for NPPs by using our best NPP predictor as a filter is more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. The highest RA was achieved by a model which includes grammar-derived features. This RA is significantly higher than the best RA achieved without the use of the grammar-derived features.

[29]
S. H. Muggleton, C. H. Bryant, and A. Srinivasan. Measuring performance when positives are rare: Relative Advantage versus predictive accuracy - a biological case-study. In Proceedings of the Eleventh European Conference on Machine Learning, LNCS, Berlin, 2000. Springer.
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). Performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The RA results show that searching for NPPs by using our best NPP predictor as a filter is more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. Predictive accuracy is not a good measure of performance for this domain because it does not discriminate well between NPP recognition models: despite covering varying numbers of (the rare) positives, all the models are awarded a similar (high) score by predictive accuracy because they all exclude most of the abundant negatives.

[30]
L. Todorovski, S. Dzeroski, A. Srinivasan, J. Whiteley, and D. Gavaghan. Discovering the Structure of Partial Differential Equations from Example Behavior. In Proceedings of the Seventeenth International Conference on Machine Learning, San Francisco, CA, 2000. Morgan Kaufmann.
One of the most powerful and widely accepted analytical formalisms for modeling biological and physical systems is that of the partial differential equation (PDE). Establishing an acceptable PDE model for a dynamic system occupies a major portion of the work of the mathematical modeler. There are two main aspects to this activity. First, an appropriate structure has to be determined for the equations involved (the model identification problem). Second, acceptably accurate values for parameters are to be determined (the parameter estimation problem). Of these, the first is more challenging, and is the focus of this paper. We propose a method for discovering the structure of PDE models from example behavior. For simple PDE models, we illustrate that a straightforward adaptation of existing equation discovery methods is sufficient. However, complex PDE models require a more sophisticated approach: a two-stage method is described in the paper. The efficacy of the approach is demonstrated initially by rediscovering the PDE models for several artificial problems. We also use it to obtain the structure of the classic FitzHugh-Nagumo model. This represents a very wide class of biological systems, making the model discovery method of interest to scientists concerned with the enterprise of obtaining a mathematical understanding of dynamic processes occurring in the life sciences.

[31]
A. Srinivasan. A study of two probabilistic methods for searching large spaces with ILP. Technical Report PRG-TR-16-00, Oxford University Computing Laboratory, Oxford, 2000.
Given sample data and background knowledge encoded in the form of logic programs, a predictive Inductive Logic Programming (ILP) system attempts to find a set of rules (or clauses) for predicting classification labels in the data. Most present-day systems for this purpose rely on some variant of a generate-and-test procedure that repeatedly examines a set of potential candidates (termed here as the ``search space'') and selects one or more clauses according to some criterion of ``goodness''. The worst-case time-complexity of such systems depends critically on: (1) the size of the search space; and (2) the cost of estimating the goodness of a clause. This paper is concerned with addressing the first issue and is motivated by two principal factors. First, the representation adopted by an ILP system often engenders a search space whose size dominates complexity calculations. Straightforward arguments show that examining fewer clauses should lead to faster execution times. Second, evidence from feature-based methods suggests that increased search also increases the probability of ``flukes'' -- rules whose goodness values are high simply due to chance effects. We examine two simple probabilistic schemes that restrict the search space for a predictive ILP system by sacrificing optimality. The first, ``stochastic clause selection'' is derived from a technique termed ordinal optimisation and randomly selects a fixed-size sample of clauses from the search space which, with high probability, contains a good clause. This directly addresses the concern of searching intractably large spaces. The second, ``iterative beam search'' is a direct adaptation of a technique developed for feature-based methods to counter the increased chance of flukes. An ILP system equipped with each of these methods is used to construct hypotheses for three real-life datasets concerning problems from biochemistry (mutagenesis and carcinogenesis) and engineering (finite-element mesh design). In each case, we obtain estimates of the predictive accuracies of the hypotheses and the time for their construction. These are then compared against estimates obtained with two extreme search options, namely greedy search and restricted branch-and-bound. For the problems considered, the results here show that there is no significant gain in accuracy to be accrued with admissible techniques like branch-and-bound, and that time savings comparable to that from greedy search are possible with the use of a random search scheme like stochastic clause selection.
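The flavour of the stochastic clause selection argument can be seen from the standard ordinal-optimisation style calculation below (a generic calculation, not necessarily the report's exact derivation): a uniform sample of n clauses misses the best alpha-fraction of the space with probability (1-alpha)^n, so a modest, space-size-independent n already gives a high-probability guarantee of sampling a good clause.

    import math

    def sample_size(alpha, confidence):
        """Smallest n with 1 - (1 - alpha)**n >= confidence, i.e. a uniform
        random sample of n clauses contains, with probability >= confidence,
        at least one clause from the best alpha fraction of the space."""
        return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - alpha))

    print(sample_size(alpha=0.001, confidence=0.99))  # about 4600 clauses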

[32]
A. Srinivasan. The Aleph Manual. Available at http://www.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/, 1999.

[33]
A. Srinivasan, R.D. King, and S. Muggleton. The role of background knowledge: using a problem from chemistry to examine the performance of an ILP program. Technical Report PRG-TR-08-99, Oxford University Computing Laboratory, Oxford, 1999.
Inductive Logic Programming (ILP) systems construct explanations for data in terms of domain-specific background information. How does the quality of this information affect the performance of an ILP system? Results from experiments concerned with learning simple programs for list processing suggest that performance is sensitive to the type and amount of background knowledge provided. In particular, background knowledge that contains large amounts of information that is known to be irrelevant to the problem being considered can, and typically does, prevent an ILP system from finding a correct explanation. In this paper, we examine a more realistic scenario where the relevance of background predicates is graded from being more to less relevant to the problem considered, and no ``correct explanation'' is known. Theoretical or empirical answers to the performance of an ILP system under such conditions address more directly the concerns of a practitioner. Using a problem from chemistry, this paper reports on an empirical study of the change in the performance of an ILP program as increasingly generic chemical definitions are added to background knowledge. The problem examined is the task of obtaining rules describing the mutagenic activity of small molecules. The experiments describe the use of the ILP program Progol, and the effect of adding atom and bond structure, bulk molecular properties, and 2- and 3-dimensional descriptions of the molecules as background knowledge. The results show that adding highly relevant background predicates causes significant improvements in predictive accuracy, simplicity and time for theory construction. These effects are less pronounced, although still noticeable, along the scales of accuracy and simplicity as the background knowledge includes increasingly more generic -- and hence not as directly relevant -- chemical knowledge. For a practitioner concerned with the use of ILP, the results suggest that good ``first-cut'' results can be obtained quickly by a simple exclusion of information known to be less relevant.

[34]
A. Srinivasan. A study of two sampling methods for analysing large datasets with ILP. Data Mining and Knowledge Discovery, 3(1):95-123, 1999.
This paper is concerned with problems that arise when submitting large quantities of data to analysis by an Inductive Logic Programming (ILP) system. Complexity arguments usually make it prohibitive to analyse such datasets in their entirety. We examine two schemes that allow an ILP system to construct theories by sampling from this large pool of data. The first, ``subsampling'', is a single-sample design in which the utility of a potential rule is evaluated on a randomly selected sub-sample of the data. The second, ``logical windowing'', is a multiple-sample design that tests and sequentially includes errors made by a partially correct theory. Both schemes are derived from techniques developed to enable propositional learning methods (like decision trees) to cope with large datasets. The ILP system CProgol, equipped with each of these methods, is used to construct theories for two datasets -- one artificial (a chess endgame) and the other naturally occurring (a language tagging problem). In each case, we ask the following questions of CProgol equipped with sampling: (1) Is its theory comparable in predictive accuracy to that obtained if all the data were used (that is, no sampling was employed)?; and (2) Is its theory constructed in less time than the one obtained with all the data? For the problems considered, the answers to these questions are ``yes''. This suggests that an ILP program equipped with an appropriate sampling method could begin to address satisfactorily problems that have hitherto been inaccessible simply due to data extent.
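A schematic of the second ("logical windowing") scheme is sketched below; `learn` and `misclassified` are placeholders for theory construction and error detection, and this is the generic windowing loop rather than CProgol's implementation.

    import random

    def windowing(examples, learn, misclassified, window_size, max_rounds=10):
        """Learn on a random window, then repeatedly add the examples the
        current theory gets wrong until none remain (or rounds are exhausted)."""
        window = random.sample(examples, min(window_size, len(examples)))
        theory = learn(window)
        for _ in range(max_rounds):
            errors = [e for e in examples if misclassified(theory, e)]
            if not errors:
                break
            window.extend(errors)
            theory = learn(window)
        return theory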

[35]
A. Srinivasan. Note on the location of optimal classifiers in n-dimensional ROC space. Technical Report PRG-TR-2-99, Oxford University Computing Laboratory, Oxford, 1999.
The comparative study of classifier performance is a worthwhile concern in Machine Learning. Empirical comparisons typically examine unbiased estimates of predictive accuracy of different algorithms -- the assumption being that the classifier with the highest accuracy would be the ``optimal'' choice of classifier for the problem. The qualification on optimality is needed here, as choice is restricted to the classifiers being compared, and the estimates are typically subject to sampling errors. Comparisons based on predictive accuracy overlook two important practical concerns, namely: (a) class distributions cannot be specified precisely, and distributions of classes in the training set are thus rarely matched exactly on new data; and (b) the costs of different types of errors may be unequal. Using techniques developed in signal detection, Provost and Fawcett describe an elegant method for the comparative assessment of binary classifiers that takes these considerations into account. Their principal result is that optimal classifiers lie on the edge of the convex hull of points representing classifier performance in a Lorentz diagram. The authors leave open the question of whether this result extends to classifiers that discriminate between more than two classes. Here we show that the result that optimal classifiers are located on the edge of the convex hull does extend to an arbitrary number of classes. This follows directly from results about convex sets that form the basis of linear programming algorithms.
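For the familiar two-class case, the result can be pictured with the short sketch below, which keeps only the classifiers on the upper convex hull of their (false-positive rate, true-positive rate) points; the n-class generalisation in the note replaces this with the facets of a higher-dimensional hull. The numbers are illustrative only.

    def roc_upper_hull(points):
        """Return the (fpr, tpr) points on the upper convex hull, i.e. the
        classifiers that can be optimal for some class distribution and cost
        assignment; (0,0) and (1,1) are the trivial always-negative and
        always-positive classifiers."""
        pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
        hull = []
        for p in pts:
            # Pop the last hull point while it lies on or below the chord
            # from the point before it to the new point p.
            while len(hull) >= 2:
                (x1, y1), (x2, y2) = hull[-2], hull[-1]
                if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) >= 0:
                    hull.pop()
                else:
                    break
            hull.append(p)
        return hull

    print(roc_upper_hull([(0.1, 0.6), (0.3, 0.7), (0.4, 0.9), (0.6, 0.85)]))
    # -> [(0.0, 0.0), (0.1, 0.6), (0.4, 0.9), (1.0, 1.0)]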

[36]
A. Srinivasan and R.C. Camacho. Numerical reasoning with an ILP program capable of lazy evaluation and customised search. Journal of Logic Programming, 40(2,3):185-214, 1999.
Using problem-specific background knowledge, computer programs developed within the framework of Inductive Logic Programming (ILP) have been used to construct restricted first-order logic solutions to scientific problems. However, their approach to the analysis of data with substantial numerical content has been largely limited to constructing clauses that: (a) provide qualitative descriptions (``high'', ``low'' etc.) of the values of response variables; and (b) contain simple inequalities restricting the ranges of predictor variables. This has precluded the application of such techniques to scientific and engineering problems requiring a more sophisticated approach. A number of specialised methods have been suggested to remedy this. In contrast, we have chosen to take advantage of the fact that the existing theoretical framework for ILP places very few restrictions on the nature of the background knowledge. We describe two issues of implementation that make it possible to use background predicates that implement well-established statistical and numerical analysis procedures. Any improvements in analytical sophistication that result are evaluated empirically using artificial and real-life data. Experiments utilising artificial data are concerned with extracting constraints for response variables in the text-book problem of balancing a pole on a cart. They illustrate the use of clausal definitions of arithmetic and trigonometric functions, inequalities, multiple linear regression, and numerical derivatives. A non-trivial problem concerning the prediction of mutagenic activity of nitroaromatic molecules is also examined. In this case, expert chemists have been unable to devise a model for explaining the data. The result demonstrates the combined use by an ILP program of logical and numerical capabilities to achieve an analysis that includes linear modelling, clustering and classification. In all experiments, the predictions obtained compare favourably against benchmarks set by more traditional quantitative methods, namely regression and neural networks.

[37]
A. Srinivasan and R.D. King. Feature construction with Inductive Logic Programming: a study of quantitative predictions of biological activity aided by structural attributes. Data Mining and Knowledge Discovery, 3(1):37-57, 1999.
Recently, computer programs developed within the field of Inductive Logic Programming (ILP) have received some attention for their ability to construct restricted first-order logic solutions using problem-specific background knowledge. Prominent applications of such programs have been concerned with determining ``structure-activity'' relationships in the areas of molecular biology and chemistry. Typically the task here is to predict the ``activity'' of a compound (for example, toxicity) from its chemical structure. A summary of the research in the area is: (a) ILP programs have largely been restricted to qualitative predictions of activity (``high'', ``low'' etc.); (b) when appropriate attributes are available, ILP programs have equivalent predictivity to standard quantitative analysis techniques like linear regression, although ILP programs usually perform better when such attributes are unavailable; and (c) by using structural information as background knowledge, an ILP program can provide comprehensible explanations for biological activity. This paper examines the use of ILP programs as a method of ``discovering'' new attributes. These attributes could then be used by methods like linear regression, thus allowing for quantitative predictions while retaining the ability to use structural information as background knowledge. Using structure-activity tasks as a test-bed, the utility of ILP programs in constructing new features was evaluated by examining the prediction of biological activity using linear regression, with and without the aid of ILP-learnt logical attributes. In three out of the five data sets examined, the addition of ILP attributes produced statistically better results. In addition, six important structural features that had escaped the attention of expert chemists were discovered. The method used here to construct new attributes is not specific to the problem of predicting biological activity, and the results obtained suggest a wider role for ILP programs in aiding the process of scientific discovery.

[38]
A. Srinivasan and R.D. King. Using Inductive Logic Programming to construct Structure-Activity Relationships. In G.C. Gini and A.R. Katrizsky, editors, Predictive toxicology of chemicals: experiences and impact of AI tools (Papers from the 1999 AAAI Spring Symposium), pages 64-73. AAAI Press, Menlo Park, CA, 1999.
The existence and rapid growth of chemical databases have brought into focus the utility of methods that can assist the discovery of predictive patterns in data, and communicate them in a manner designed to provoke insight. This has turned attention to machine learning techniques capable of extracting ``symbolic'' descriptions from data. At the cutting edge of such techniques is Inductive Logic Programming (ILP). Given a set of observations and background knowledge encoded as a set of logical descriptions, an ILP system attempts to construct explanations for the observations. The explanations are in the same language as the observations and background knowledge -- usually a subset of first-order logic. The use of first-order logic contrasts with algorithms like decision trees and neural networks, which employ simple propositional logic representations. The increased representational power, along with the flexibility to include background knowledge -- which can even include propositional algorithms -- allows a form of data analysis and decision-support that is, in principle, unmatched by first-generation methods. Biochemical applications of ILP have largely been concerned with determining ``structure-activity'' relationships (SARs). The task here is to obtain rules that predict the activity of a compound, like toxicity, from its chemical structure. The representation language adopted by ILP systems allows the development of compact, chemist-friendly ``theories'', and ILP systems have progressively been shown to be capable of handling 1-, 2-, and 3-dimensional descriptions of chemical structure. Empirical results in predicting mutagenicity and carcinogenicity suggest that structure-activity relations found by ILP systems achieve at least the same predictive power as traditional SAR techniques, with fewer limitations (like the need for alignment, pre-determination of structural features, etc.). In some cases, they have found novel structural features that significantly improve the predictive capabilities of traditional 1- and 2-dimensional SAR methods. Here we summarise the progress achieved so far in the use of ILP in these areas, including ideas emerging from a recent toxicology prediction challenge which suggest that a combination of ILP and established prediction methods can provide a powerful form of relating chemical activity to structure.

[39]
A. Srinivasan, R.D. King, and D.W. Bristol. An assessment of submissions made to the Predictive Toxicology Evaluation Challenge. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99). Morgan Kaufmann, Los Angeles, CA, 1999.
Constructing ``good'' models for chemical carcinogenesis was identified in IJCAI-97 as providing a substantial challenge to ``knowledge discovery'' programs. Attention was drawn to a comparative exercise which called for predictions on the outcome of 30 rodent carcinogenicity bioassays. This -- the Predictive Toxicology Evaluation (or PTE) Challenge -- was seen to provide AI programs with an opportunity to participate in an enterprise of scientific merit, and a yardstick for comparison against strong competition. Here we provide an assessment of the machine learning (ML) submissions made. Models submitted are assessed on: (1) their accuracy, in comparison to models developed with expert collaboration; and (2) their explanatory value for toxicology. The principal findings were: (a) using structural information available from a standard modelling package, layman-devised features, and outcomes of established biological tests, results from ML-derived models were at least as good as those with expert-derived techniques. This was surprising; (b) the combined use of structural and biological features by ML-derived models was unusual, and suggested new avenues for toxicology modelling. This was also unexpected; and (c) significant effort was required to interpret the output of even the most ``symbolic'' of ML-derived models. Much of this could have been alleviated with measures for converting the results into a more ``toxicology-friendly'' form. As it stands, their absence is sufficient to prevent a whole-hearted acceptance of these promising methods by toxicologists. This suggests that ML techniques have been able to respond -- not fully, but nevertheless substantially -- to the PTE Challenge.

[40]
A. Srinivasan, R.D. King, and D.W. Bristol. An assessment of ILP-assisted models for toxicity and the PTE-3 experiment. In S. Dzeroski and P.A. Flach, editors, Proceedings of the Ninth International Workshop on Inductive Logic Programming (ILP99), volume 1634 of LNAI, pages 291-302, Berlin, 1999. Springer.
The Predictive Toxicology Evaluation (or PTE) Challenge provided Machine Learning techniques with the opportunity to compete against specialised techniques for toxicology prediction. Toxicity models that used findings from ILP programs have performed creditably in the PTE-2 experiment proposed under this challenge. We report here on an assessment of such models along scales of: (1) quantitative performance, in comparison to models developed with expert collaboration; and (2) potential explanatory value for toxicology. Results appear to suggest the following: (a) across a range of class distributions and error costs, some explicit models constructed with ILP-assistance appear closer to optimal than most expert-assisted ones. Given the paucity of test-data, this is to be interpreted cautiously; (b) a combined use of propositional and ILP techniques appears to yield models that contain unusual combinations of structural and biological features; and (c) significant effort was required to interpret the output, strongly indicating the need to invest greater effort in transforming the output into a ``toxicologist-friendly'' form. Based on the lessons learnt from these results, we propose a new predictive toxicology evaluation experiment -- PTE-3 -- which will address some important shortcomings of the previous study.

[41]
P. Finn, S. Muggleton, D. Page, and A. Srinivasan. Pharmacophore Discovery using the Inductive Logic Programming system Progol. Machine Learning, 30:241-270, 1998.
This paper is a case study of a machine-aided knowledge discovery process within the general area of drug design. More specifically, the paper describes a sequence of experiments in which an Inductive Logic Programming (ILP) system is used for pharmacophore discovery. Within drug design, a pharmacophore is a description of the substructure of a ligand (a small molecule) which is responsible for medicinal activity. This medicinal activity is produced by interaction between the ligand and the binding site of the target protein. ILP was chosen by the domain expert (first author) at Pfizer since active molecules are most naturally described, in relational terms, as requiring a substructure with various 3-D relations which hold among the atoms involved. The results described in this paper build on previous investigations into prediction of mutagenicity using ILP with a 2-D (bond connectivity only) representation of molecules. The case study supports general lessons for knowledge discovery, as well as more specific lessons for pharmacophore discovery, the use of ILP for 3-D problems, and for the particular medicinal activity of ACE inhibition, a treatment for hypertension.
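
As a rough illustration of the kind of relational description involved, a pharmacophore can be pictured as a set of pairwise 3-D distance constraints over functional groups. The sketch below is a minimal, hypothetical rendering in Python; the group names, distances and tolerances are illustrative assumptions, not values taken from the paper.

    from math import dist  # Euclidean distance, Python 3.8+

    # Hypothetical pharmacophore: required functional groups and the
    # pairwise distances (in Angstroms, with tolerances) between them.
    PHARMACOPHORE = {
        ("zinc_binder", "h_acceptor"):  (3.5, 0.5),
        ("zinc_binder", "ring_centre"): (7.0, 0.5),
        ("h_acceptor",  "ring_centre"): (5.0, 0.5),
    }

    def matches(conformation):
        """conformation maps each group label to (x, y, z) coordinates
        in one 3-D conformation of a candidate ligand."""
        for (a, b), (target, tol) in PHARMACOPHORE.items():
            if a not in conformation or b not in conformation:
                return False
            if abs(dist(conformation[a], conformation[b]) - target) > tol:
                return False
        return True

An ILP system searches for constraints of this form automatically from the coordinates of active molecules; the dictionary above only fixes the shape of the answer, not how it is found.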

[42]
S.H. Muggleton, A. Srinivasan, R.D. King, and M.J.E. Sternberg. Biochemical knowledge discovery using Inductive Logic Programming. In H. Motoda, editor, First Conference on Discovery Science. Springer-Verlag, 1998.
Machine Learning algorithms are being increasingly used for knowledge discovery tasks. Approaches can be broadly divided by distinguishing discovery of procedural from that of declarative knowledge. Client requirements determine which of these is appropriate. This paper discusses an experimental application of machine learning in an area related to drug design. The bottleneck here is in finding appropriate constraints to reduce the large number of candidate molecules to be synthesised and tested. Such constraints can be viewed as declarative specifications of the structural elements necessary for high medicinal activity and low toxicity. The first-order representation used within Inductive Logic Programming (ILP) provides an appropriate description language for such constraints. Within this application area, knowledge accreditation requires not only a demonstration of predictive accuracy but also, and crucially, a certification of novel insight into the structural chemistry. This paper describes an experiment in which the ILP system Progol was used to obtain structural constraints associated with mutagenicity of molecules. In doing so, Progol found a new indicator of mutagenicity within a subset of previously published data. This subset was already known not to be amenable to statistical regression, though its complement was adequately explained by a linear model. According to the combined accuracy/explanation criterion provided in this paper, comparative trials on both subsets show that Progol's structurally-oriented hypotheses are preferable to those of other machine learning algorithms.

[43]
R.R. Saith, A. Srinivasan, D. Michie, and I.L. Sargent. The relationship between embryo, oocyte and follicular features and the developmental potential of human IVF embryos. Human Reproduction Update, 4(2):121-134, 1998.

[44]
R. King and A. Srinivasan. The discovery of indicator variables for QSAR using inductive logic programming. Journal of Computer-Aided Molecular Design, 11:571-580, 1997.

[45]
A. Srinivasan, R.D. King, S.H. Muggleton, and M.J.E. Sternberg. The Predictive Toxicology Evaluation Challenge. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97). Morgan Kaufmann, Los Angeles, CA, 1997.
Can an AI program contribute to scientific discovery? An area where this gauntlet has been thrown is that of understanding the mechanisms of chemical carcinogenesis. One approach is to obtain Structure-Activity Relationships (SARs) relating molecular structure to cancerous activity. Vital to this are the rodent carcinogenicity tests conducted within the US National Toxicology Program by the National Institute of Environmental Health Sciences (NIEHS). This has resulted in a large database of compounds classified as carcinogens or otherwise. The Predictive Toxicology Evaluation project of the NIEHS provides the opportunity to compare carcinogenicity predictions on previously untested chemicals. This presents a formidable challenge for programs concerned with knowledge discovery. Desirable features of this problem are: (1) involvement in genuine scientific discovery; (2) availability of a large database with expert-certified classifications; (3) strong competition from methods used by chemists; and (4) participation in true blind trials, with results available by the next IJCAI. We describe the materials and methods constituting this challenge, and provide some initial benchmarks. These show the Inductive Logic Programming tool Progol to be competitive with the current state of the art. The challenge described here is aimed at encouraging AI programs to avail themselves of the opportunity to contribute to an enterprise with immediate scientific value.

[46]
A. Srinivasan and R.D. King. Carcinogenesis predictions using ILP. In N. Lavrac and S. Dzeroski, editors, Proceedings of the Seventh International Workshop on Inductive Logic Programming (ILP97), volume 1297 of LNAI, pages 273-287, Berlin, 1997. Springer. A version also in Intelligent Data Analysis in Medicine, Kluwer.
Obtaining accurate structural alerts for the causes of chemical cancers is a problem of great scientific and humanitarian value. This paper follows up on earlier research that demonstrated the use of Inductive Logic Programming (ILP) for the related problem of predicting mutagenic activity amongst nitroaromatic molecules. Here we are concerned with predicting carcinogenic activity in rodent bioassays using data from the U.S. National Toxicology Program conducted by the National Institute of Environmental Health Sciences. The 330 chemicals used here are significantly more diverse than those in the previous study, and form the basis for obtaining Structure-Activity Relationships (SARs) relating molecular structure to cancerous activity in rodents. We describe the use of the ILP system Progol to obtain SARs from this data. The rules obtained from Progol are comparable in accuracy to those from expert chemists, and more accurate than most state-of-the-art toxicity prediction methods. The rules can also be interpreted to give clues about the biological and chemical mechanisms of carcinogenesis, and they make use of the rules learnt earlier by Progol for mutagenesis. Finally, we present details of, and predictions for, an ongoing international blind trial aimed specifically at comparing prediction methods. This trial provides ILP algorithms with an opportunity to participate at the leading edge of scientific discovery.

[47]
R.D. King, S.H. Muggleton, A. Srinivasan, and M.J.E. Sternberg. Structure-activity relationships derived by machine learning: The use of atoms and their bond connectivities to predict mutagenicity by inductive logic programming. Proc. of the National Academy of Sciences, 93:438-442, 1996.

[48]
R.D. King and A. Srinivasan. Prediction of rodent carcinogenicity bioassays from molecular structure using inductive logic programming. Environmental Health Perspectives, 104(5):1031-1040, 1996.

[49]
A. Srinivasan, S.H. Muggleton, R.D. King, and M.J.E. Sternberg. Theories for mutagenicity: a study of first-order and feature based induction. Artificial Intelligence, 85:277-299, 1996.
A classic problem from chemistry is used to test a conjecture that in domains for which data are most naturally represented by graphs, theories constructed with Inductive Logic Programming (ILP) will significantly outperform those using simpler feature-based methods. One area that has long been associated with graph-based or structural representation and reasoning is organic chemistry. In this field, we consider the problem of predicting the mutagenic activity of small molecules: a property that is related to carcinogenicity, and an important consideration in developing less hazardous drugs. By providing an ILP system with progressively more structural information concerning the molecules, we compare the predictive power of the logical theories constructed against benchmarks set by regression, neural, and tree-based methods.

[50]
S. Muggleton, C.D. Page, and A. Srinivasan. An initial experiment into stereochemistry-based drug design using ILP. In S.H. Muggleton, editor, Proceedings of the Sixth Inductive Logic Programming Workshop, volume 1314 of LNAI, pages 25-40, Berlin, 1996. Springer-Verlag. Extended version appears in the Machine Learning Journal, 1998.
Previous applications of Inductive Logic Programming to drug design have not addressed stereochemistry, or the three-dimensional aspects of molecules. While some success is possible without consideration of stereochemistry, researchers within the pharmaceutical industry consider stereochemistry to be central to most drug design problems. This paper reports on an experimental application of the ILP system P-Progol to stereochemistry-based drug design. The experiment tests whether P-Progol can identify the structure responsible for the activity of ACE (angiotensin-converting enzyme) inhibitors from 28 positive examples, that is, from 28 molecules that display the activity of ACE inhibition. ACE inhibitors are a widely-used form of medication for the treatment of hypertension. It should be stressed that this structure was already known prior to the experiment and therefore is not a new discovery; the experiment was proposed by a researcher within the pharmaceutical industry to test the applicability of ILP to stereochemistry-based drug design. While the result of the experiment is quite positive, one challenge remains before ILP can be applied to a multitude of drug design problems.

[51]
A. Srinivasan and R.D. King. Feature construction with inductive logic programming: a study of quantitative predictions of biological activity aided by structural attributes. In S.H. Muggleton, editor, Proceedings of the Sixth Inductive Logic Programming Workshop, volume 1314 of LNAI, pages 89-104, Berlin, 1996. Springer-Verlag. Extended version appears in Data Mining and Knowledge Discovery, 1999.
Recently, computer programs developed within the field of Inductive Logic Programming (ILP) have received some attention for their ability to construct restricted first-order logic solutions using problem-specific background knowledge. Prominent applications of such programs have been concerned with determining ``structure-activity'' relationships in the areas of molecular biology and chemistry. Typically the task here is to predict the ``activity'' of a compound, like toxicity, from its chemical structure. Research in the area shows that: (a) ILP programs have been restricted to qualitative predictions of activity (``high'', ``low'' etc.); (b) when appropriate attributes are available, ILP programs have not been able to better the performance of standard quantitative analysis techniques like linear regression, although ILP programs perform creditably when such attributes are unavailable; and (c) when both are applicable, ILP programs are usually slower than their propositional counterparts. This paper examines the use of ILP programs, not for obtaining theories complete for the sample, but as a method of ``discovering'' new attributes. These could then be used by methods like linear regression, thus allowing for quantitative predictions and the ability to use structural information as background knowledge. Using structure-activity tasks as a test-bed, the utility of ILP programs in constructing new features was evaluated by examining the prediction of chemical activity using linear regression, with and without the aid of ILP-learnt logical attributes. In three of the five datasets examined the addition of ILP attributes produced statistically better results (P < 0.01). In addition, six important structural features that had escaped the attention of expert chemists were discovered. The method used here to construct new attributes is not specific to the problem of predicting biological activity, and the results obtained suggest a useful role for ILP programs in aiding the process of scientific discovery.
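
The two-step scheme sketched in the abstract, with ILP proposing boolean structural attributes that an ordinary regression procedure then consumes, might look as follows in outline. This is a sketch under assumed interfaces: the feature predicates stand in for ILP-learnt clause bodies and are not taken from the paper.

    import numpy as np

    def augment_with_ilp_features(X, molecules, ilp_features):
        """Append one boolean column per ILP-learnt feature.

        X            : (n, p) array of conventional attributes
        molecules    : n molecule descriptions (any representation)
        ilp_features : predicates standing in for ILP-learnt clause bodies
        """
        cols = [np.array([1.0 if f(m) else 0.0 for m in molecules])
                for f in ilp_features]
        return np.column_stack([X] + cols)

    def fit_linear(X, y):
        # Ordinary least squares on the augmented attribute matrix.
        Xb = np.column_stack([np.ones(len(y)), X])  # intercept column
        coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
        return coef

The point of the arrangement is that the regression step stays quantitative and entirely standard, while the structural background knowledge enters only through the ILP-constructed columns.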

[52]
M.J.E. Sternberg, R.D. King, A. Srinivasan, and S.H. Muggleton. Drug Design by Machine Learning. In D. Michie, S. Muggleton, and K. Furukawa, editors, Machine Intelligence 15. Oxford University Press, Oxford, 1996.

[53]
M. Bain and A. Srinivasan. Inductive logic programming with large-scale unstructured data. In D. Michie, S. Muggleton, and K. Furukawa, editors, Machine Intelligence 14, pages 233-267. Oxford University Press, Oxford, 1995.
We report some recent developments from an ongoing project in which a chess endgame domain is providing benchmark experimental tests for the study of concept learning. The King and Rook against King (KRK) endgame is simple enough in chess terms but provides concept learning tasks which can be demanding, as evidenced in previous studies by a number of authors. For learning systems these tasks have highlighted problems of representation, such as the ability to express the structural relationships to be found in learning examples, and other issues like correctness, compression and comprehensibility. Our current focus is on Inductive Logic Programming methods which are based on previously developed systems for the generalisation and specialisation of normal logic programs. In the current work we are principally concerned with improving these methods to be able to handle more examples during learning. The main contribution of this work consists of a new method of incremental learning, together with new results on a set of concept learning problems taken from the KRK endgame domain.
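
The representational point made here can be illustrated by the sort of background predicates a relational learner is typically given for KRK positions. The definitions below are assumed examples for illustration, not the ones used in the chapter.

    # Assumed background predicates over (file, rank) coordinates in 0..7.
    def adjacent(a, b):
        return abs(a - b) <= 1

    def same_file_or_rank(p, q):
        return p[0] == q[0] or p[1] == q[1]

    def kings_too_close(white_king, black_king):
        return (adjacent(white_king[0], black_king[0])
                and adjacent(white_king[1], black_king[1]))

    # A KRK position is then just three such points; learning examples are
    # built from relations like these rather than fixed-length vectors.
    position = {"wk": (0, 0), "wr": (1, 7), "bk": (0, 2)}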

[54]
R.D. King, A. Srinivasan, and M.J.E. Sternberg. Relating chemical activity to structure: an examination of ILP successes. New Generation Computing, 13(3,4):411-433, 1995.
An important test-bed for Inductive Logic Programming (ILP) systems has been the task of relating the activity of chemical compounds to their structure. In this paper we examine the structure-activity problems that have been addressed by ILP, and evaluate empirically the extent to which a first-order representation was required. This is done by comparing ILP theories against those constructed by standard linear regression and a decision-tree learner. When propositional encodings are feasible for the feature-based algorithms, we present evidence that they are capable of matching the predictive accuracies of an ILP theory. However, as the diversity of compounds considered increases, propositional encodings become intractable. In such cases, our results provide support for the claim that ILP programs will continue to construct accurate, understandable theories. Based on this evidence, we propose future work to realise fully the potential of ILP in structure-activity problems.

[55]
A. Srinivasan, S.H. Muggleton, and R.D. King. Comparing the use of background knowledge by inductive logic programming systems. In L. De Raedt, editor, Proceedings of the Fifth International Inductive Logic Programming Workshop. Katholieke Universiteit, Leuven, 1995. Withdrawn from publication. New version submitted to IEEE Transactions on Knowledge and Data Engineering, 1999.

[56]
R.R. Saith, A. Srinivasan, I.L. Sargent, and D.H. Barlow. Embryo selection: can Artificial Intelligence help? In Tenth Annual Meeting of the European Society of Human Reproduction and Embryology (ESHRE), 1994. Extended version appears in Human Reproduction Update, 1998.

[57]
A. Srinivasan, S.H. Muggleton, and M.E. Bain. The justification of logical theories based on data compression. In D. Michie, S. Muggleton, and K. Furukawa, editors, Machine Intelligence 13, pages 87-121. Oxford University Press, Oxford, 1994.
Non-demonstrative or inductive reasoning is a crucial component in the skills of a learner. A leading candidate for this form of reasoning involves the automatic formation of hypotheses. Initial successes in the construction of propositional theories have now been followed by algorithms that attempt to generalise sentences in the predicate calculus. An important defect in these new-generation systems is the lack of a clear model for theory justification. In this paper we describe a method of evaluating the significance of a hypothesis based on the degree to which it allows compression of the observed data with respect to prior knowledge. This can be measured by comparing the lengths of the input and output tapes of a reference Turing machine which will generate the examples from the hypothesis and a set of derivational proofs. The model extends an earlier approach of Muggleton by allowing for noise. The truth values of noisy instances are switched by making use of correction codes. The utility of compression as a significance measure is evaluated empirically in three independent domains. In particular, the results show that the existence of compression distinguishes a larger number of significant clauses than other significance tests. The method also appears to distinguish noise as incompressible data.
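
The test can be summarised as a comparison of encoding lengths: a hypothesis is retained only if it, its derivational proofs, and the correction codes flagging noisy examples together cost fewer bits than the examples they generate. A minimal numerical sketch follows, with all component lengths supplied by the caller and a simpler correction code than the chapter's.

    from math import comb, log2

    def correction_bits(n_examples, n_noisy):
        # Bits to state how many truth values are switched, plus an index
        # identifying which subset of the examples they are.
        return log2(n_examples + 1) + log2(comb(n_examples, n_noisy))

    def compression(example_bits, hypothesis_bits, proof_bits,
                    n_examples, n_noisy):
        # Output tape: the examples generated by the reference machine.
        # Input tape: hypothesis + derivational proofs + corrections.
        input_bits = (hypothesis_bits + proof_bits
                      + correction_bits(n_examples, n_noisy))
        return example_bits - input_bits  # positive => compression achieved

On this reading, a clause whose compression never becomes positive is describing incompressible data, which the chapter treats as noise.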

[58]
A. Srinivasan, S.H. Muggleton, R.D. King, and M.J.E. Sternberg. Mutagenesis: ILP experiments in a non-determinate biological domain. In S. Wrobel, editor, Proceedings of the Fourth International Inductive Logic Programming Workshop, pages 217-232. Gesellschaft für Mathematik und Datenverarbeitung mbH, 1994. GMD-Studien Nr. 237.
This paper describes the use of Inductive Logic Programming as a scientific assistant. In particular, it details the application of the ILP system Progol to discovering structural features that can result in mutagenicity in small molecules. To discover these concepts, Progol only had access to the atomic and bond structure of the molecules. With such a primitive description and no further assistance from chemists, Progol corroborated some existing knowledge and proposed a new structural alert for mutagenicity in compounds. In the process, the experiments act as a case study in which, even with extremely limited background knowledge, an Inductive Logic Programming tool firstly complements a complex statistical model developed by skilled chemists, and secondly continues to provide understandable theories when the statistical model fails. The experiments also constitute the first demonstrations of a prototype of the Progol system. Progol allows the construction of hypotheses with bounded non-determinacy by performing a best-first search within the subsumption lattice. The results here provide evidence that such searches are both viable and desirable.
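
The "atomic and bond structure" referred to here amounts to a small set of ground facts per molecule. The fragment below shows one way such facts might be held and queried in Python; the atom typings, charges, and the nitro-group test are illustrative assumptions rather than the paper's actual background knowledge.

    # One molecule as bare atom/bond facts: element, atom type and partial
    # charge for atoms; bond order for bonds. All values are made up.
    atoms = {
        1: ("c", 22, -0.11),
        2: ("c", 22,  0.02),
        3: ("n", 38,  0.81),
        4: ("o", 40, -0.38),
        5: ("o", 40, -0.38),
    }
    bonds = {(1, 2): 7, (2, 3): 1, (3, 4): 2, (3, 5): 2}

    def has_nitro(atoms, bonds):
        """Crude stand-in for a structural alert: a nitrogen doubly
        bonded to two oxygens."""
        for n, (element, _, _) in atoms.items():
            if element != "n":
                continue
            doubly_bonded_oxygens = [
                pair for pair, order in bonds.items()
                if n in pair and order == 2
                and atoms[pair[0] if pair[1] == n else pair[1]][0] == "o"
            ]
            if len(doubly_bonded_oxygens) >= 2:
                return True
        return False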

[59]
M. Sternberg, J. Hirst, R. Lewis, R. King, A. Srinivasan, and S. Muggleton. Application of machine learning to protein structure prediction and drug design. In S. Schulze-Kremer, editor, Advances in Molecular Bioinformatics, pages 1-8. IOS Press, 1994.

[60]
J. Cussens, A. Hunter, and A. Srinivasan. Generating explicit orderings for non-monotonic logics. In Proc. of the Eleventh National Conference on Artificial Intelligence (AAAI-93), pages 420-425. MIT Press, 1993.
For non-monotonic reasoning, explicit orderings over formulae offer an important solution to problems such as `multiple extensions'. However, a criticism of such a solution is that it is not clear, in general, from where such orderings should be obtained. Here we show how orderings can be derived from statistical information about the domain which the formulae cover. For this we provide an overview of prioritised logics -- a general class of logics that incorporate explicit orderings over formulae. This class of logics has been shown elsewhere to capture a wide variety of proof-theoretic approaches to non-monotonic reasoning, and in particular, to highlight the roles of preferences -- both explicit and implicit -- in such proof theory. We take one particular prioritised logic, called SF logic, and describe an experimental approach for comparing this logic with an important example of a logic that does not use explicit orderings of preference -- namely Horn clause logic with negation-as-failure. Finally, we present the results of this comparison, showing how SF logic is more sceptical and more accurate than negation-as-failure.
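
One way to picture "orderings derived from statistical information" is to rank each defeasible rule by how reliably its conclusion holds on the examples its premise covers. The sketch below does exactly that; the interface (rules as premise/conclusion predicates) is an assumption made for illustration and is not SF logic itself.

    def rule_ordering(rules, examples):
        """rules: iterable of (name, premise, conclusion), where premise and
        conclusion are boolean functions of an example. Returns rule names
        from highest to lowest empirical reliability."""
        reliability = {}
        for name, premise, conclusion in rules:
            covered = [e for e in examples if premise(e)]
            if covered:
                reliability[name] = (sum(conclusion(e) for e in covered)
                                     / len(covered))
            else:
                reliability[name] = 0.0
        # Higher reliability earns higher priority in the explicit ordering.
        return sorted(reliability, key=reliability.get, reverse=True)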

[61]
G. Edwards, P. Compton, R. Malor, A. Srinivasan, and L. Lazarus. PEIRS: a pathologist maintained expert system for the interpretation of chemical pathology reports. Pathology, 25:27-34, 1993.

[62]
A. Srinivasan and J.A. Richards. Analysis of GIS spatial data using knowledge-based methods. International Journal of Geographical Information Systems, 7(6):479-500, 1993.
A symbolic approach to the analysis of mixed data types found in the spatial database of a Geographic Information System is developed. Unlike statistical techniques, which are essentially based on the manipulation of numeric quantities, the method here attempts to emulate photo-interpretation by manipulating symbolic representations of photo-interpretive knowledge. This representation consists of a mixture of normal and `defeasible' rules. The latter enable analysis to proceed in the face of uncertainty by making (and possibly revising) assumptions about the content of the scene being analysed. The analysis procedure is decomposed into three stages. The first deals with the analysis of data from groups of sensors (sources) providing the same type of information, the second combines the inferences of these individual analyses, and the third imposes spatial consistency constraints. This decomposition implies a structuring of the knowledge, an issue that appears to be essential for a knowledge-based analysis to be efficient. Examples are presented which demonstrate the system's ability to handle mixed multi-spectral and radar imagery, and images at different spatial resolutions.
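
The three-stage decomposition is essentially a pipeline, and can be summarised as below. The argument names and the dictionary-of-sources data flow are assumptions made for this sketch, not the system's actual interfaces.

    def analyse(sources_by_type, analyse_type, combine, make_consistent):
        """sources_by_type : {data_type: [images from sensors of that type]}
        analyse_type    : stage 1, analyses one group of like sensors
        combine         : stage 2, merges the per-type inferences
        make_consistent : stage 3, imposes spatial consistency constraints
        """
        per_type = {t: analyse_type(t, images)
                    for t, images in sources_by_type.items()}
        return make_consistent(combine(per_type))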

[63]
P. Compton, G. Edwards, B. Kang, L. Lazarus, R. Malor, P. Preston, and A. Srinivasan. Ripple down rules: turning knowledge acquisition into knowledge maintenance. Artificial Intelligence in Medicine, 4:463-475, 1992.

[64]
R.D. King, S. Muggleton, A. Srinivasan, C. Feng, R.A. Lewis, and M.J.E. Sternberg. Drug design using inductive logic programming. In Proceedings of the 26th Hawaii International Conference on System Sciences, pages 646-655, Los Alamitos, 1993. IEEE Computer Society Press.

[65]
S.H. Muggleton, A. Srinivasan, and M.E. Bain. Compression, significance and accuracy. In D. Sleeman and P. Edwards, editors, Ninth International Machine Learning Conference, pages 338-347, San Mateo, CA, 1992. Morgan Kaufmann.
Inductive Logic Programming (ILP) involves learning relational concepts from examples and background knowledge. To date, all ILP learning systems make use of tests inherited from propositional and decision tree learning for evaluating the significance of hypotheses. None of these significance tests takes account of the relevance or utility of the background knowledge. In this paper we describe a method, called HP-compression, of evaluating the significance of a hypothesis based on the degree to which it allows compression of the observed data with respect to the background knowledge. This can be measured by comparing the lengths of the input and output tapes of a reference Turing machine which will generate the examples from the hypothesis and a set of derivational proofs. The model extends an earlier approach of Muggleton by allowing for noise. The truth values of noisy instances are switched by making use of correction codes. The utility of compression as a significance measure is evaluated empirically in three independent domains. In particular, the results show that the existence of positive compression distinguishes a larger number of significant clauses than other significance tests. The method is also shown to reliably distinguish artificially introduced noise as incompressible data.

[66]
A. Srinivasan, S.H. Muggleton, and M.E. Bain. Distinguishing noise from exceptions in non-monotonic learning. In S. Muggleton and K. Furukawa, editors, Second International Inductive Logic Programming Workshop (ILP92). Institute for New Generation Computer Technology, 1992.
It is important for a learning program to have a reliable method of deciding whether to treat errors as noise or to include them as exceptions within a growing first-order theory. We explore the use of an information-theoretic measure to decide this problem within the non-monotonic learning framework defined by Closed-World-Specialisation. The approach adopted uses a model that consists of a reference Turing machine which accepts an encoding of a theory and proofs on its input tape and generates the observed data on the output tape. Within this model, the theory is said to ``compress'' data if the length of the input tape is shorter than that of the output tape. Data found to be incompressible are deemed to be ``noise''. We use this feature to implement a compression-guided specialisation procedure that searches for the best-fitting theory for the data (that is, the one with the shortest input tape length). The approach is empirically evaluated on the standard Inductive Logic Programming problem of learning classification rules for the KRK chess end-game.

[67]
A. Srinivasan, S.H. Muggleton, and M.E. Bain. Compression-guided correction of first-order theories. In ECAI92 Workshop on Logical Approaches to Learning, 1992. A shorter version of the ILP92 paper.

[68]
P. Compton, A. Srinivasan, G. Edwards, R. Malor, and L. Lazarus. Knowledge base maintenance without a knowledge engineer. In J. Liebowitz, editor, First World Congress on Expert Systems, pages 668-675. Pergamon, London, 1991.

[69]
P. Compton, A. Srinivasan, G. Edwards, R. Malor, and L. Lazarus. Ripple down rules. In IJCAI Workshop on Representing Knowledge in Medical Decision Support Systems, pages 7.1-7.8, 1991.

[70]
A. Srinivasan and J.A. Richards. Rule-based techniques for multi-source classification. International Journal of Remote Sensing, 11:502-525, 1990.

[71]
A. Srinivasan and J.A. Richards. Knowledge-based multi-source analysis. In Fifth Australasian Remote Sensing Conference, 1990.

[72]
A. Srinivasan, C.A. Sammut, and J.A. Richards. A knowledge-based system for multi-source classification in remote sensing. In Fifth Australian Conference on the Application of Expert Systems, pages 102-119, 1989.