Inverted Index Strategies

batch-based: use file-sorting algorithms (textbook)
  + fastest to build
  + fastest to search
  - slow to update
B-tree based: update in place (http://www.lucene.com/papers/sigir90.ps)
  + fast to search
  - update/build does not scale
  - complex implementation
segment based: lots of small indexes (Verity)
  + fast to build
  + fast to update
  - slower to search
hash-file based (Ultraseek ISTK?)
  + fast to build/update/search
  - unsorted dictionary
  - no suffix matching
  - slower index merging, more seeks
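
The segment-based trade-off above (fast to build and update, slower to search) can be illustrated with a minimal in-memory sketch. It is not the implementation of any product named above. The function names build_segment, search and merge_segments are hypothetical, tokenization is a plain whitespace split, and each segment is assumed to be a sorted list of (term, doc_ids) pairs. A search has to probe every segment, which is why search slows down as small segments pile up. Merging the segments restores a single sorted dictionary.

```python
import bisect
import heapq
from collections import defaultdict


def build_segment(docs):
    """Build one small, immutable segment from {doc_id: text}.

    Hypothetical helper: returns a list of (term, sorted doc ids),
    sorted by term, so the dictionary supports binary search.
    """
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return [(term, sorted(ids)) for term, ids in sorted(postings.items())]


def search(segments, term):
    """Look a term up in every segment and combine the hits.

    Cost grows with the number of segments: one binary search each.
    """
    hits = set()
    for segment in segments:
        # (term,) sorts just before (term, ids), so bisect_left lands on
        # the entry for `term` if the segment contains it.
        i = bisect.bisect_left(segment, (term,))
        if i < len(segment) and segment[i][0] == term:
            hits.update(segment[i][1])
    return sorted(hits)


def merge_segments(segments):
    """Merge sorted segments into one, in the style of a merge sort."""
    merged = []
    for term, ids in heapq.merge(*segments, key=lambda entry: entry[0]):
        if merged and merged[-1][0] == term:
            # Same term seen in an earlier segment: union the postings.
            combined = sorted(set(merged[-1][1]) | set(ids))
            merged[-1] = (term, combined)
        else:
            merged.append((term, list(ids)))
    return merged
```

In this sketch an update is just another call to build_segment on the new documents, with the result appended to the list of segments. Calling merge_segments from time to time keeps the number of segments, and therefore the per-query cost, bounded.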