Various notes and links on Lucene

Joins in Lucene

joins using BlockJoinQuery

joins using the JoinUtil

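JoinUtil is slower than BlockJoinQuery, but it seems more flexible and does not require special indexing (BlockJoinQuery needs parent and child documents to be indexed together as one block).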
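
To make the trade-off concrete, here is a minimal sketch of the general idea behind a query-time join: a first pass collects join keys from the matching "from" documents, and a second pass matches "to" documents against those keys. It is a conceptual model in plain Java, not Lucene API: the class `QueryTimeJoinSketch`, the method `join`, and the use of maps as stand-ins for documents are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual model only: plain maps stand in for indexed documents.
// Nothing here is Lucene API; all names are hypothetical.
class QueryTimeJoinSketch {

    // Pass 1: collect join-key values from "from" documents that match
    //         the condition matchField == matchValue.
    // Pass 2: return positions of "to" documents whose join field holds
    //         one of the collected keys.
    static List<Integer> join(List<Map<String, String>> fromDocs,
                              String fromField,
                              String matchField,
                              String matchValue,
                              List<Map<String, String>> toDocs,
                              String toField) {
        Set<String> keys = new HashSet<>();
        for (Map<String, String> doc : fromDocs) {
            String key = doc.get(fromField);
            if (key != null && matchValue.equals(doc.get(matchField))) {
                keys.add(key);
            }
        }

        List<Integer> hits = new ArrayList<>();
        for (int docId = 0; docId < toDocs.size(); docId++) {
            String value = toDocs.get(docId).get(toField);
            if (value != null && keys.contains(value)) {
                hits.add(docId);
            }
        }
        return hits;
    }
}
```

The two passes and the intermediate key set are where the extra cost at search time comes from. In exchange, the two sides can be indexed independently, which a block-based join does not allow.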


General rules

  • If you are using multiple filters, place them inside a Boolean Filter. The Boolean Filter benefits internally from the BitSets produced by the filters.

  • If you need to use And/Or/Not Filters, always place your “heaviest” filter last, so that the cheaper filters can exclude documents before it runs. The heaviest filters are typically Geo filters, since they can perform some heavy computations to determine distance.

  • And/Or/Not are more performant when you are working with filters that do not return a BitSet. These are operations that must iterate over every document anyway. For example, a custom script is not BitSet-able because it performs a computation on every document. In these cases, And/Or/Not is a better choice than the Bool. The list of non-BitSet filters is very small, so it is easy to remember; every other filter should be placed inside a Bool:

    • Geo* filters
    • Scripts
    • Numeric_range
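
The ordering rule above can be sketched in plain Java: precomputed BitSets are intersected first, and the expensive per-document check runs only on the documents that survive. This is an illustration of the reasoning, not Lucene code. The class `FilterOrderSketch`, the method `apply`, and the `heavyFilter` predicate (standing in for something like a geo-distance or script check) are assumptions.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

// Illustration only; all names are hypothetical and no Lucene API is used.
class FilterOrderSketch {

    // cachedFilters: one BitSet per BitSet-able filter (bit i set = doc i matches).
    // heavyFilter:   a per-document check that cannot be expressed as a BitSet.
    static BitSet apply(int maxDoc, BitSet[] cachedFilters, IntPredicate heavyFilter) {
        // Start with every document as a candidate.
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc);

        // Cheap step: intersect the precomputed BitSets.
        for (BitSet bits : cachedFilters) {
            result.and(bits);
        }

        // Expensive step, run last: only surviving documents are evaluated.
        for (int doc = result.nextSetBit(0); doc >= 0; doc = result.nextSetBit(doc + 1)) {
            if (!heavyFilter.test(doc)) {
                result.clear(doc);
            }
        }
        return result;
    }
}
```

If the heavy check ran first, it would be evaluated for all `maxDoc` documents; placed last, it runs only as many times as there are documents left after the cheap intersection.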

Links to docs

Updated by Katja Luther over 1 year ago · 13 revisions