55 lines
1.5 KiB
Text
55 lines
1.5 KiB
Text
- Project name
|
|
|
|
marisa-trie
|
|
http://code.google.com/p/marisa-trie/
|
|
|
|
- Project summary
|
|
|
|
Matching Algorithms with Recursively Implemented StorAge
|
|
|
|
- Version
|
|
|
|
0.1.4
|
|
|
|
- Description
|
|
|
|
This project *marisa-trie* provides a C++ library *libmarisa* and command line tools *`marisa-*`* for building and operating nesting patricia tries. The brand-new dictionary structure is designed to be static and space efficient. Also, *marisa-trie* enables not only simple lookups but also prefix searches and predictive searches.
|
|
|
|
* Prefix searches are to find keys from prefixes of a given query.
|
|
* Predictive searches are to find keys starting with a given query.
|
|
|
|
The biggest advantage of *marisa-trie* is that it can build a considerably compact dictionary. See below for the size of dictionaries built with various trie implementations.
|
|
|
|
* Input
|
|
* Source: enwiki-20110115-all-titles-in-ns0.gz
|
|
* Contents: all page titles of English Wikipedia (Jan. 2011)
|
|
* Number of keys: 8,279,325
|
|
* Total size: 167,223,035 bytes (plain) / 46,344,646 bytes (gzipped)
|
|
|
|
|| *Implementation* || *Size (bytes)* || *Remarks* ||
|
|
|| darts-clone || 316,065,792 || Compacted double-array trie ||
|
|
|| tx-trie || 107,119,864 || LOUDS-based trie ||
|
|
|| *marisa-trie* || *42,688,271* || Nesting patricia trie ||
|
|
|
|
* Documentation (in Japanese)
|
|
* HowTo
|
|
* ListOfTools
|
|
* LibraryInterface
|
|
* BenchmarkResults
|
|
|
|
- Version control system
|
|
|
|
Subversion
|
|
|
|
- Source code license
|
|
|
|
New BSD License
|
|
|
|
- Project labels
|
|
|
|
Nesting
|
|
Patricia
|
|
Trie
|
|
Static
|
|
Dictionary
|
|
CPlusPlus
|