Change in posh syntax

The syntax for posh has changed in the latest release on the “posh” branch.

Previously “q”, “l” and other shortcut commands were unary predicates. This caused a few problems when combining with other programs.

Now there is a single generic infix predicate “\” for all shortcut commands. The first argument is the command.




init/1 remains the same:

init pellet.

Direct rdf mapping in Thea

The current implementation of Thea makes use of the SWI-Prolog semweb package as a means of parsing OWL encoded as RDF/XML, according to the Mapping to RDF Graphs specification. After the triples have been translated into owl2_model facts such as equivalentClasses/1, they are discarded. This is somewhat analogous to how the Java OWL API views RDF – merely as an exchange format.

This means it is possible to use Thea to convert an OWL ontology encoded in RDF to native prolog facts such as:


(real URIs have been replaced by names in the above)
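
As a rough illustration of the shape such facts take, here is a hypothetical toy example. The class and property names are invented, and the term shapes (subClassOf/2, equivalentClasses/1, someValuesFrom/2, unionOf/1) are assumed to follow the owl2_model conventions mentioned in this post:

```prolog
% Hypothetical toy ontology written as plain prolog facts.
% All names are invented; real output would use full URIs.
class(tooth).
class(mouth).
class(organ).
class(tooth_or_mouth).
objectProperty(partOf).

% tooth SubClassOf organ
subClassOf(tooth, organ).
% tooth SubClassOf partOf some mouth
subClassOf(tooth, someValuesFrom(partOf, mouth)).

% tooth_or_mouth EquivalentTo tooth or mouth
equivalentClasses([tooth_or_mouth, unionOf([tooth, mouth])]).
```

Because these are ordinary facts, a query such as `subClassOf(tooth, someValuesFrom(P, C))` needs no RDF machinery at all.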

And then use another prolog system that lacks the rich RDF libraries of SWI to process the ontology.

This has many advantages when working in a purely OWL world, but some disadvantages when working with mixed RDF and OWL views. From an engineering perspective it would be nice to be able to take more advantage of the useful features of the SWI semweb library (in particular, namespace support). Ideally the programmer could choose whether the OWL predicates were served from an RDF store or from native prolog facts.

This is now possible, to a certain extent, using the newrdf branch in github. Note that the posh and pkb branches frequently merge in from this branch.

This branch includes a module owl2_rdf that serves owl2_model predicates directly from semweb/ in SWI. Check the comments in the code for how DCGs are used for a very compact declarative coding of the mapping.

The idea is to allow seamless switching between backing stores. To use the direct RDF store, specify “rdf_direct” as the format. In prolog:


On the command line:

thea --format rdf_direct foo.owl --query select A where "axiom(A)"

For most purposes, there should be no noticeable difference. However, if you now wish to mix and match RDF, SPARQL and OWL in prolog the picture is much better. E.g. try the following:

thea-poshj --format rdf_direct ceph.owl
?- use_module(library(semweb/rdf_db)).
?- rdf_has(X, rdfs:subClassOf, Y).
?- subClassOf(X,Y).

The direct rdf query returns the same axioms, with the difference that in the rdf query Y may bind to bNodes, whereas with subClassOf/2, Y binds to prolog terms corresponding to class expressions.
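
The bNode difference can be seen without Thea at all. The following is a minimal sketch using only library(semweb/rdf_db); the `ex` prefix, the toy triples and the predicate names load_toy/0 and named_superclass/2 are assumptions made for illustration. Note that it simply skips bNodes, whereas Thea's subClassOf/2 maps them to class expression terms:

```prolog
:- use_module(library(semweb/rdf_db)).

% Hypothetical namespace for the toy data.
:- rdf_register_prefix(ex, 'http://example.org/anatomy#').

% load_toy: assert one named superclass and one anonymous (bNode)
% superclass, as an RDF/XML parser would for a class expression.
load_toy :-
    rdf_assert(ex:tooth, rdfs:subClassOf, ex:organ),
    rdf_bnode(B),
    rdf_assert(ex:tooth, rdfs:subClassOf, B).

% named_superclass(?Sub, ?Sup): RDF-level subclass pairs whose
% superclass is a named resource rather than a bNode.
named_superclass(Sub, Sup) :-
    rdf_has(Sub, rdfs:subClassOf, Sup),
    \+ rdf_is_bnode(Sup).
```

After calling `load_toy`, `rdf_has(ex:tooth, rdfs:subClassOf, Y)` yields two solutions (one of them a bNode), while `named_superclass(ex:tooth, Y)` yields only the named class.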

In fact, from within Posh, you can now type the command “clio.” to launch a ClioPatria semantic web server complete with custom OWL views.

There are still a few quirks that need ironed out before newrdf is merged into master:

  • Efficiency is a challenge. When mapping a query such as subClassOf/2 to rdf_has/3, additional calls have to be made to map bNodes to prolog terms. This has to be done in the correct order for efficiency. This has now been done for a few predicates, but others are noticeably slow on ontologies such as SNOMED.
  • This is particularly challenging for OWL axioms that take sets as arguments (e.g. equivalentClasses/1) and have to be mapped to pairwise RDF calls.
  • assert_axiom/2 needs to be mapped

However, for many purposes the current behavior should be fine.

Generating a variant ontology using POPL

Ontologies such as the FMA represent reference anatomical entities. Many actual existing anatomical entities would not be classified in reference anatomical ontologies, due to widespread variation found in nature. This applies across multiple scales and modalities: genes and proteins are typically represented using some reference structure, and pathways are abstractions that conveniently ignore all the messy crosstalk and stochastic events ubiquitous in cells.

From a practical point of view it makes sense to ignore the majority of this variation and represent some possibly hypothetical reference model. This is what most bio-ontologies do. Sometimes it can be useful to generate an ontology of variants, together with abstractions over the union of the variant and the reference. I call this here a Reference-Variant-Abstraction triad model, with a nod to SNOMED-CT SEP triples.

Generation of a skeleton variation ontology can be automated using the following POPL script:

:- [idfixer].

% ========================================
% ========================================
% we add both variant classes and abstract classes, in a R-V-A triad
(   class(CV),
    CV == variant and variantOf some C,

    CA == CV or C,
where (
       atom_concat('variant ',CN,CVN),
       atom_concat('abstract ',CN,CAN)

% ========================================
% ========================================
% we make a property chain for each object property,
% traversing the variantOf property (it is assumed the ontology
% already has this). We also add the reflexive form
(   objectProperty(PV),
    variantOf*P @< PV,
    variantOf*P @< PVR,
where (
       atom_concat('variantOf ',PN,PVN),
       atom_concat('reflexive variantOf ',PN,PVRN)       

This relies on an additional program (loaded as idfixer at the top of the script):

extend_iri(Iri,Suffix,New) :-
        % HASH-style URIs
extend_iri(Iri,Suffix,New) :-
        % OBO-style URIs

If we save the popl file as rvs.popl, we can execute it like this:

thea --popl-file rvs.popl myont.owl --to owl

The resulting ontology will have 3x the number of classes. Use a reasoner to classify this.

If the original ontology contained “tooth”, “mouth” and “tooth SubClassOf partOf some mouth”, then the new ontology would include:

Class: 'variant tooth'
EquivalentTo: variantOf some tooth

Class: 'abstract tooth'
EquivalentTo: 'variant tooth' or tooth

(annotations omitted)

You can try DL queries within Protege.

A query such as:

partOf some mouth

Will return the reference class for “tooth”, but not the abstract tooth or a variant tooth. This is because “variant tooth” encompasses ectopic teeth (e.g. a tooth may be part of a teratoma in the lung). At a stretch, we would also include shark dermal denticles as variants of human teeth. This is all well and good, but we might want to query for what is “typical”. In this case we can ask:

'reflexive variantOf partOf' some mouth

The property name is not very intuitive, but what we mean here is “anything that is a variant of something that is part of a mouth, or that is itself part of a mouth”. The following should be equivalent:

('variantOf partOf' some mouth) or partOf some mouth

As should this:

(variantOf some (partOf some mouth)) or partOf some mouth

The named property chain just makes querying easier.

In this case we get “abstract tooth” and “variant tooth” in the descendants. If we manually classify teratoma teeth or dermal denticles here, we will get these too.


graphviz and blip ontol

blip includes a generic grammar/writer for the graphviz language ‘dot’.

dot is actually quite powerful, and allows for specification of boxes inside boxes. For example, the following blip command line call:

blip -r fma ontol-subset -n Heart -cr subclass -to display

will generate and display a png such as this:

The ontol/conf directory specifies a number of configuration modules for the ontol library. These can be specified with the “-u” option on the command line. They allow things such as color-coding by ontology. For example, “ontol_config_uberon” allows generation of diagrams such as:

Robot Scientist

This story received a bit of coverage last month: [BBC]

There are more details in the Science publication, “The Automation of Science”:

The system generates scientific hypotheses about gene functions, then devises and carries out experiments to test these hypotheses. The system appears to be written in Prolog, with the hypothesis generation using Inductive/Abductive Logic Programming, in particular the Progol system.

More details can be found on the group’s website.

SWI-Prolog + GMP rocks

SWI-Prolog provides unbounded integer and rational number arithmetic based on the GMP library. Support comes out of the box: there is no need to explicitly use a library in your code.
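
A small sketch of what this looks like in practice, assuming an SWI-Prolog build with GMP support. The predicate names factorial/2 and harmonic/2 are invented for this example; only is/2 and the rdiv function come from SWI-Prolog itself:

```prolog
% factorial(+N, -F): F is N!, computed with ordinary is/2 arithmetic.
% The result silently grows past 64 bits when needed.
factorial(0, 1) :- !.
factorial(N, F) :-
    N > 0,
    N1 is N - 1,
    factorial(N1, F1),
    F is N * F1.

% harmonic(+N, -H): H is the exact rational 1/1 + 1/2 + ... + 1/N.
% rdiv performs rational division, so no rounding takes place.
harmonic(0, 0) :- !.
harmonic(N, H) :-
    N > 0,
    N1 is N - 1,
    harmonic(N1, H1),
    H is H1 + 1 rdiv N.
```

For instance, `factorial(25, F)` gives a value well beyond the 64-bit range, and `harmonic(4, H)` gives exactly 25/12 rather than a floating point approximation.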

Even if you aren’t working with number ranges near the limits of most programming languages, unbounded integer support turns out to be fantastically useful for blazingly fast set operations by representing the sets as bit vectors. Given a mapping ix from the n elements of a universe to the integers [0..n-1], we can assign any set S an integer in the range [0..2**n-1], with each bit indicating whether an element is in the set, i.e. v(S) = ∑_{e ∈ S} 2**ix(e). We can then use the bitwise arithmetic operators /\ and \/ to perform set intersection and set union. The SWI-Prolog arithmetic function popcount/1 returns the cardinality of the set (corresponding to the number of bits set).
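
The encoding can be sketched in a few lines. This is an illustration of the idea only, not the blip code; the predicate names are invented, and ix(e) is assumed to be the 0-based position of e in an explicit universe list:

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

% set_vector(+Set, +Universe, -V): V is the integer with bit ix(e)
% set for every e in Set, where ix(e) is the 0-based position of e
% in the list Universe.
set_vector(Set, Universe, V) :-
    foldl(add_bit(Universe), Set, 0, V).

% add_bit(+Universe, +E, +V0, -V): switch on the bit for element E.
add_bit(Universe, E, V0, V) :-
    nth0(Ix, Universe, E), !,
    V is V0 \/ (1 << Ix).

% Cardinality of the intersection and union of two encoded sets.
intersection_size(V1, V2, N) :- N is popcount(V1 /\ V2).
union_size(V1, V2, N)        :- N is popcount(V1 \/ V2).
```

With the universe `[a,b,c,d]`, the set `[a,c]` encodes as 5 and `[c,d]` as 12; their intersection has one element and their union three. The same code works unchanged when the universe has thousands of elements, since the integers simply get longer.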

This pattern is used in the blip simmatrix module, which computes similarity scores between two features based on the attributes they share. For example, given a database with a predicate likes/2 relating individuals to bands, we can determine the similarity of two individuals using metrics such as cosine similarity and the Jaccard coefficient.
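
As a sketch of how such metrics fall out of the bit vector encoding (this is not the simmatrix implementation; the likes/2 facts and all predicate names below are invented for illustration):

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

% Toy data, invented for this example.
likes(ann, pixies).
likes(ann, wire).
likes(ann, low).
likes(bob, wire).
likes(bob, low).
likes(bob, can).
likes(cat, can).

% band_index(-Bands): sorted, duplicate-free list of all bands.
% The position of a band in this list is its bit index.
band_index(Bands) :-
    findall(B, likes(_, B), Bs),
    sort(Bs, Bands).

% person_vector(+Person, -V): bit vector of the bands Person likes.
person_vector(Person, V) :-
    band_index(Bands),
    findall(Ix, (likes(Person, B), nth0(Ix, Bands, B)), Ixs),
    foldl(set_bit, Ixs, 0, V).

set_bit(Ix, V0, V) :- V is V0 \/ (1 << Ix).

% jaccard(+P1, +P2, -J): size of intersection over size of union.
jaccard(P1, P2, J) :-
    person_vector(P1, V1),
    person_vector(P2, V2),
    Union is popcount(V1 \/ V2),
    Union > 0,
    J is popcount(V1 /\ V2) / Union.

% cosine(+P1, +P2, -C): for binary attribute vectors this is the
% size of the intersection over sqrt(|A| * |B|).
cosine(P1, P2, C) :-
    person_vector(P1, V1),
    person_vector(P2, V2),
    Na is popcount(V1),
    Nb is popcount(V2),
    Na > 0, Nb > 0,
    C is popcount(V1 /\ V2) / sqrt(Na * Nb).
```

Here ann and bob share two of four bands in total, so their Jaccard coefficient is one half, while ann and cat share nothing. A real implementation would compute each vector once and cache it rather than rebuilding the index per call.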

Benchmarks to follow, but even on a 32 bit machine it appears possible to perform super-fast comparisons for databases of millions of facts utilizing 10k+ attributes (and thus integers in the range [0..2**10000]).

A prolog library for OWL2 and SWRL

Ontologies are vital for the life sciences. The Web Ontology Language (OWL) offers decidability of reasoning, and now with OWL2 and SWRL reasonably high levels of expressivity.

Vangelis Vassiliadis and I are writing Thea2, based on his original Thea library. The redesign introduces prolog predicates for every OWL2 axiom, and prolog terms for OWL class and property expressions. We use the SWI-Prolog semweb library for reading/writing to RDF. There is also an (optional) JPL bridge wrapping the Manchester OWLAPI.

There are a number of different reasoning strategies, including:

  • simple but limited backward chaining reasoning
  • using Grosof’s translation to DLP in conjunction with systems such as Yap, XSB or DLV
  • using standard OWL reasoners via JPL (DIG interface from Thea1 still needs ported)
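
To give a flavour of the first strategy, here is a minimal backward chaining sketch over axioms stored as prolog facts. It is not the Thea2 reasoner; the class names and the predicate subsumed_by/2 are invented, and the subclass hierarchy is assumed to be acyclic:

```prolog
% Toy asserted axioms in the predicate-per-axiom style (names invented).
subClassOf(incisor, tooth).
subClassOf(tooth, organ).
subClassOf(organ, anatomical_entity).

% subsumed_by(?Sub, ?Sup): Sup is an asserted or inferred named
% superclass of Sub, found by chaining backwards over subClassOf/2.
% Assumes no cycles in the asserted hierarchy; otherwise this loops.
subsumed_by(Sub, Sup) :-
    subClassOf(Sub, Sup).
subsumed_by(Sub, Sup) :-
    subClassOf(Sub, Mid),
    subsumed_by(Mid, Sup).
```

The query `subsumed_by(incisor, X)` enumerates tooth, organ and anatomical_entity on backtracking. This is "simple but limited" in exactly the sense above: it covers transitivity of named subclasses only and says nothing about class expressions.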

Source: github
Documentation: pldoc

Documentation server up

Blipkit pldoc server now up and running