Unfolding T-boxes in GO using POPL

“Unfolding a T-box” may sound like some quaint tea ceremony ritual, but in description logics it refers to the iterative replacement of named classes by their equivalent anonymous class expressions.

Many reasoners take advantage of T-box unfolding behind the scenes. But there may be reasons to unfold your T-box in a more public fashion.

For example, highly specific, wordy GO terms are frequently criticized. One such term is:
GO:2001043 positive regulation of cytokinetic cell separation involved in cell cycle cytokinesis

Whilst not as unwieldy as some of the infamous ICD9 examples, some feel that this is taking pre-composition too far. In fact these detailed pre-composed terms are very useful for systems that aren’t capable of consuming anonymous class expressions. However, for some purposes it may be useful to replace this with a nested class expression.

This can be done in 3 lines with a POPL script:


class(X) ===> null where X==_.
annotationAssertion(_,X,_) ===> null where X==_.

X ===> Y where X==Y.

The first two lines remove class declarations and annotation assertions (e.g. label assignments) for any defined class, i.e. any class that has an equivalence axiom. The final line does all the work: it replaces every occurrence of a defined class with its equivalent class expression.

This means if we have the following gene association (e.g. from a GAF file):


Class: :Gene1234
Types: capable_of some 'positive regulation of cytokinetic cell separation involved in cell cycle cytokinesis'

it will be translated to:

Class: :Gene1234
Types:
  capable_of some 
    ('biological regulation'
     and (positively_regulates some 
        ('cytokinetic cell separation'
         and (part_of some 
            (cytokinesis
             and (part_of some 'cell cycle'))))))
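
To make the effect of the final rule more concrete, here is a minimal Python sketch of the same kind of unfolding. It is not POPL and makes no claim about how POPL works internally. The nested-tuple encoding, the `unfold` function and the intermediate definitions in `DEFINITIONS` are assumptions made for illustration (they are inferred from the output above, and the real GO equivalence axioms may differ in detail). The sketch also assumes the definitions are acyclic.

```python
# Class expressions as nested tuples (a hypothetical encoding, for illustration only):
#   "name"                     a named class
#   ("and", expr, expr, ...)   an intersection
#   ("some", property, expr)   an existential restriction

# Hypothetical equivalence axioms, inferred from the example output above.
DEFINITIONS = {
    "positive regulation of cytokinetic cell separation involved in cell cycle cytokinesis":
        ("and", "biological regulation",
         ("some", "positively_regulates",
          "cytokinetic cell separation involved in cell cycle cytokinesis")),
    "cytokinetic cell separation involved in cell cycle cytokinesis":
        ("and", "cytokinetic cell separation",
         ("some", "part_of", "cell cycle cytokinesis")),
    "cell cycle cytokinesis":
        ("and", "cytokinesis", ("some", "part_of", "cell cycle")),
}


def unfold(expr, definitions):
    """Recursively replace every defined class with its equivalent expression."""
    if isinstance(expr, str):
        if expr in definitions:
            # The replacement may itself mention defined classes, so keep going.
            return unfold(definitions[expr], definitions)
        return expr
    if expr[0] == "some":
        _, prop, filler = expr
        # Only the filler is a class expression; the property is left alone.
        return ("some", prop, unfold(filler, definitions))
    # Intersection: unfold each operand.
    return (expr[0],) + tuple(unfold(arg, definitions) for arg in expr[1:])


gene_type = (
    "some", "capable_of",
    "positive regulation of cytokinetic cell separation involved in cell cycle cytokinesis",
)
unfolded = unfold(gene_type, DEFINITIONS)
print(unfolded)
```

Under these assumed definitions, `unfolded` has the same nesting as the class expression shown above, with no defined classes left in it.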

We can also choose to selectively unfold – e.g. unfold all regulation terms:


X ===> Y where (X==Y, Y='biological regulation' and _).

Resulting in:

Class: :Gene1234
Types:
  capable_of some
    ('biological regulation'
      and (positively_regulates some 'cytokinetic cell separation involved in cell cycle cytokinesis'))
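
The selective variant can be sketched in Python in the same spirit: a defined class is replaced only if its definition passes a test. Again this is an illustration rather than POPL. The tuple encoding, the `unfold_if` and `is_regulation` names and the two definitions are hypothetical, and `is_regulation` is only a rough stand-in for the pattern `'biological regulation' and _`.

```python
# Class expressions as nested tuples (a hypothetical encoding, for illustration only):
#   "name", ("and", expr, expr, ...), ("some", property, expr)

# Hypothetical equivalence axioms, inferred from the example in the text.
DEFINITIONS = {
    "positive regulation of cytokinetic cell separation involved in cell cycle cytokinesis":
        ("and", "biological regulation",
         ("some", "positively_regulates",
          "cytokinetic cell separation involved in cell cycle cytokinesis")),
    "cytokinetic cell separation involved in cell cycle cytokinesis":
        ("and", "cytokinetic cell separation",
         ("some", "part_of", "cell cycle cytokinesis")),
}


def is_regulation(definition):
    """Accept definitions of the form 'biological regulation' and <something>."""
    return (
        isinstance(definition, tuple)
        and definition[0] == "and"
        and definition[1] == "biological regulation"
    )


def unfold_if(expr, definitions, accept):
    """Unfold a defined class only when accept(definition) is true."""
    if isinstance(expr, str):
        definition = definitions.get(expr)
        if definition is not None and accept(definition):
            return unfold_if(definition, definitions, accept)
        return expr  # undefined, or defined but not selected: leave as-is
    if expr[0] == "some":
        _, prop, filler = expr
        return ("some", prop, unfold_if(filler, definitions, accept))
    return (expr[0],) + tuple(unfold_if(a, definitions, accept) for a in expr[1:])


gene_type = (
    "some", "capable_of",
    "positive regulation of cytokinetic cell separation involved in cell cycle cytokinesis",
)
selective = unfold_if(gene_type, DEFINITIONS, is_regulation)
print(selective)
```

Here the regulation term is unfolded, but the 'cytokinetic cell separation involved in cell cycle cytokinesis' term is kept as a named class because its assumed definition does not start with 'biological regulation'.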

The same thing could be done in Java, but it would require significantly more code and some messing around with visitor classes. The result would be less declarative and harder to customize.

What if we want to perform the reverse operation? This is similar to finding the most specific subsuming class, which is a standard reasoner operation. However, in this case we want to find a more specific class expression, which is slightly more difficult. This might be the topic of a future post.