batch-based: use file-sorting algorithms (textbook)
+ fastest to build
+ fastest to search
- slow to update
b-tree based: update in place (http://www.lucene.com/papers/sigir90.ps)
+ fast to search
- update/build does not scale
- complex implementation
segment based: lots of small indexes (Verity)
+ fast to build
+ fast to update
- slower to search
hash-file based (Ultraseek ISTK?)
+ fast to build/update/search
- unsorted dictionary
- no suffix matching
- slower index merging, more seeks
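
The trade-off listed for the segment-based strategy can be made concrete with a small sketch. It is only an illustration under stated assumptions, not how Verity or any other product implements it: the names `build_segment` and `SegmentedIndex` are hypothetical, everything lives in memory rather than in files, and document ids are assumed to be unique integers. Each segment is built batch-style by sorting (term, doc id) pairs, an update just appends a new segment, and a search has to consult every segment until they are merged.

```python
from collections import defaultdict
from heapq import merge


def build_segment(docs):
    """Batch-build one small index: collect (term, doc_id) pairs, sort, group."""
    pairs = sorted({(term, doc_id)
                    for doc_id, text in docs.items()
                    for term in text.lower().split()})
    index = defaultdict(list)
    for term, doc_id in pairs:
        index[term].append(doc_id)  # postings stay sorted because pairs are sorted
    return dict(index)


class SegmentedIndex:
    """Hypothetical in-memory stand-in for a set of small on-disk indexes."""

    def __init__(self):
        self.segments = []

    def add_batch(self, docs):
        # Fast to update: build a new small segment, leave the others untouched.
        self.segments.append(build_segment(docs))

    def search(self, term):
        # Slower to search: every segment must be consulted, then results merged.
        hits = [segment.get(term, []) for segment in self.segments]
        return list(merge(*hits))

    def merge_segments(self):
        # Merging rewrites all postings into one segment, so later searches
        # only need a single lookup.
        merged = defaultdict(list)
        for segment in self.segments:
            for term, postings in segment.items():
                merged[term].extend(postings)
        self.segments = [{term: sorted(p) for term, p in merged.items()}]
```

A possible usage: call `add_batch` once per incoming batch of documents, call `search` at any time, and call `merge_segments` occasionally when the number of segments starts to hurt search time.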