Generating a variant ontology using POPL

Ontologies such as the FMA represent reference anatomical entities. Many actually existing anatomical entities would not be classified in reference anatomical ontologies, because of the widespread variation found in nature. This applies across multiple scales and modalities: genes and proteins are typically represented using some reference structure, and pathways are abstractions that conveniently ignore all the messy crosstalk and stochastic events ubiquitous in cells.

From a practical point of view it makes sense to ignore the majority of this variation and represent some, possibly hypothetical, reference model. This is what most bio-ontologies do. Sometimes, however, it can be useful to generate an ontology of variants, together with abstractions over the union of the variant and the reference. I call this a Reference-Variant-Abstraction (R-V-A) triad model, with a nod to SNOMED-CT SEP triples.

Generation of a skeleton variation ontology can be automated using the following POPL script:


:- [idfixer].

% ========================================
% GENERATION OF VARIANT CLASSES
% ========================================
% we add both variant classes and abstract classes, in a R-V-A triad
add
(   class(CV),
    CV == variant and variantOf some C,
    label(CV,CVN),

    class(CA),
    CA == CV or C,
    label(CA,CAN)
)
where (
       class(C),
       labelAnnotation_value(C,CN),
       C\=variant,
       extend_iri(C,'variant_',CV),
       atom_concat('variant ',CN,CVN),
       extend_iri(C,'abstract_',CA),
       atom_concat('abstract ',CN,CAN)
      ).

% ========================================
% PROPERTY CHAINS FOR VARIANTS
% ========================================
% we make a property chain for each object property,
% traversing the variantOf property (it is assumed the ontology
% already has this). We also add the reflexive form
add
(   objectProperty(PV),
    variantOf*P @< PV,
    label(PV,PVN),
    variantOf*P @< PVR,
    subPropertyOf(P,PVR),
    label(PVR,PVRN)
)
where (
       objectProperty(P),
       P\=variantOf,
       labelAnnotation_value(P,PN),
       extend_iri(P,'variantOf_',PV),
       atom_concat('variantOf ',PN,PVN),
       extend_iri(P,'reflexive_variantOf_',PVR),
       atom_concat('reflexive variantOf ',PN,PVRN)       
       ).

This relies on an additional program called idfixer.pl:


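% extend_iri(+Iri, +Suffix, -New): build a new IRI by adding Suffix to Iri
extend_iri(Iri,Suffix,New) :-
        % HASH-style URIs: insert Suffix directly after the '#'
        atomic_list_concat([A,B],'#',Iri),
        !,   % commit, so fragments containing '_' are not also treated as OBO-style
        atomic_list_concat([A,'#',Suffix,B],New).
extend_iri(Iri,Suffix,New) :-
        % OBO-style URIs: append Suffix to the part before the last '_'
        atomic_list_concat(Parts,'_',Iri),
        reverse(Parts,[A,B|Rest]),
        atom_concat(B,Suffix,B2),
        % reverse/2 is called with its first argument bound so it stays deterministic
        reverse([A,B2|Rest],Parts2),
        atomic_list_concat(Parts2,'_',New).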
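
To see what IRIs this produces, here is a small standalone sketch for SWI-Prolog. It restates extend_iri/3 (written so that it leaves no choice points) so the file loads on its own. The two example IRIs and the demo/0 predicate are made up for illustration and are not part of Thea or POPL.

```prolog
% Standalone sketch: extend_iri/3 restated from idfixer.pl so this file
% loads on its own. The example IRIs below are invented for illustration.
:- use_module(library(lists)).   % reverse/2, member/2

extend_iri(Iri, Suffix, New) :-
        % hash-style IRI: insert Suffix right after the '#'
        atomic_list_concat([A,B], '#', Iri),
        !,
        atomic_list_concat([A,'#',Suffix,B], New).
extend_iri(Iri, Suffix, New) :-
        % OBO-style IRI: append Suffix to the part before the last '_'
        atomic_list_concat(Parts, '_', Iri),
        reverse(Parts, [A,B|Rest]),
        atom_concat(B, Suffix, B2),
        reverse([A,B2|Rest], Parts2),
        atomic_list_concat(Parts2, '_', New).

% demo/0 is a hypothetical helper: print the variant IRI for each example.
demo :-
        forall(member(Iri, ['http://example.org/anatomy#tooth',
                            'http://purl.obolibrary.org/obo/FMA_12345']),
               (   extend_iri(Iri, 'variant_', New),
                   format("~w -> ~w~n", [Iri, New])
               )).
```

Running `?- demo.` shows the hash-style IRI becoming `http://example.org/anatomy#variant_tooth`. In the OBO-style case the text is attached before the final underscore and the parts are then rejoined with `_`, so the result is `http://purl.obolibrary.org/obo/FMAvariant__12345`, with a doubled underscore.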

If we save the POPL file as rvs.popl, we can execute it like this:


thea --popl-file rvs.popl myont.owl --to owl

The resulting ontology will have roughly 3x the number of classes, since each class gains a variant and an abstract counterpart. Use a reasoner to classify it.

If the original ontology contained “tooth”, “mouth” and “tooth SubClassOf partOf some mouth”, then the new ontology would include:


Class: 'variant tooth'
EquivalentTo: variant and (variantOf some tooth)

Class: 'abstract tooth'
EquivalentTo: 'variant tooth' or tooth

(annotations omitted)

You can try DL queries within Protege. A query such as:

partOf some mouth

will return the reference class for “tooth”, but not “abstract tooth” or “variant tooth”. This is because “variant tooth” encompasses ectopic teeth (e.g. a tooth may be part of a teratoma in the lung), which are not part of any mouth. At a stretch, we would also include shark dermal denticles as variants of human teeth. This is all well and good, but we might want to query for what is “typical”. In this case we can ask:


'reflexive variantOf partOf' some mouth

The property name is not very intuitive, but what we mean here is “anything that is a variant of something that is part of a mouth, or that is itself part of a mouth”. The following should be equivalent:


('variantOf partOf' some mouth) or partOf some mouth

As should this:


(variantOf some (partOf some mouth)) or partOf some mouth

The named property chain just makes querying easier.

In this case we get “abstract tooth” and “variant tooth” in the descendants. If we manually classify teratoma teeth or dermal denticles here, we will get these too.
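
To spell out why the reasoner places these classes under the query, here is a sketch of the entailment in description-logic notation. It assumes the only axioms involving the generated property are the ones the script adds for partOf. R is shorthand introduced here for 'reflexive variantOf partOf', and the sketch uses only the fact that every variant tooth is a variantOf some tooth.

```latex
\begin{align*}
% axioms the script adds for P = partOf (reflexive form)
\mathit{variantOf} \circ \mathit{partOf} &\sqsubseteq R \\
\mathit{partOf} &\sqsubseteq R \\
% from the original ontology
\text{tooth} &\sqsubseteq \exists \mathit{partOf}.\text{mouth} \\
% consequences
\text{tooth} &\sqsubseteq \exists R.\text{mouth} \\
\text{variant tooth} &\sqsubseteq \exists \mathit{variantOf}.\text{tooth}
   \sqsubseteq \exists \mathit{variantOf}.(\exists \mathit{partOf}.\text{mouth})
   \sqsubseteq \exists R.\text{mouth} \\
\text{abstract tooth} &\equiv \text{variant tooth} \sqcup \text{tooth}
   \sqsubseteq \exists R.\text{mouth}
\end{align*}
```

Note that the property axioms only state that R is a superproperty of the chain and of partOf. The longer query forms therefore return the same results as the named-property query only as long as nothing else in the ontology asserts R directly.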
