improving on SWI indexing on large databases of facts

SWI, like most prologs provides fast first-argument indexing. Accessing a large database of facts via other arguments can be very slow, as a sequential scan is used. SWI provides index/1, but it doesn’t appear to be very effective.

The index_util module provides faster indexing by rewriting fact clauses to provide multiple entry points.

From the documentation:

This is designed to be a swap-in replacement for index/1. Indexing a
fact with M arguments on N of those arguments will generate N sets of
facts with arguments reordered to take advantage of first-argument
indexing. The original fact will be rewritten.

For example, calling:


materialize_index(my_fact(1,0,1)).

will retract all my_fact/3 facts and generate the following clauses in its place:


my_fact(A,B,C) :-
nonvar(A),
!,
my_fact__ix_1(A,B,C).
my_fact(A,B,C) :-
nonvar(C),
!,
my_fact__ix_3(C,A,B).
my_fact(A,B,C) :-
my_fact__ix_1(A,N,C).

here my_fact__ix_1 and my_fact__ix_3 contain the same data as the original my_fact/3 clause. In the second case, the arguments have been reordered

Speed Improvements

Some users have reported perfomance gains of 1000x. For example, this post

Limitations

  • Single key indexing only. Could be extended for multikeys.
  • Reindexing is not a good idea. It could be smarter about this.
  • Should not be used on dynamic databases.

Does not have to be used with fact (unit clauses) – but the clauses should enumerable

Advertisements
Post a comment or leave a trackback: Trackback URL.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: