Three full-text desktop search engines
Needle in a Haystack
Conclusions
An alternative full-text search tool on the Linux desktop quickly pays dividends if you frequently work with very large directories containing many files. The search engines presented here (Table 2) are usually capable enough for the job, but their features are not always easy to reach or user friendly. This is especially true of Tracker, the otherwise powerful Gnome standard tool, whose feature set is not fully exploited by any of the associated desktop applications.
Table 2
Search Engine Features
Feature | Tracker | DocFetcher | Recoll |
---|---|---|---|
Character encoding detection | + | – | – |
Search term highlighting | + | + | + |
Boolean join operators | + | + | + |
Proximity operators | – | + | + |
Multilingual word stemming | – | – | + |
Faceting search results | – | – | + |
Indexing multimedia files | + | – | + |
Bindings/open APIs | + | – | + |
Weighting of search terms | – | + | + |
Phrase search | – | + | + |
Synonym searching | – | – | + |
Mobile version | – | + | – |
Indexing SQL databases | – | – | + |
Desktop integration | + | – | + |
Support for multiple operating systems | – | + | + |
Wildcard (placeholder) searching | – | + | + |
Autocompleting search queries | – | – | + |
Time (min:sec) to index 554 PDFs (13GB) | 7:05 | 14:40 | 2:30 |
No. of indexing processes | 2 | 1 | 5 |
The test shows that both DocFetcher and Recoll perform well on the Gnome desktop and meet demanding requirements. Each also has a unique selling point that could tempt some users: In the case of DocFetcher, this might be the mobile version, and in the case of Recoll, the sophisticated indexing and faceting options.
In the past, search engines often came with complex search masks and query languages; today, they are usually content with a single input line and a very low-key query language. This trend is also reaching the desktop. The earlier approaches are giving way to a sensible sorting of (large) result sets, combined with the ability to break these sets down into ever smaller subsets through smart faceting choices, so that you arrive at useful matches without wasting time.
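To make the idea of narrowing a result set by facets more concrete, the following minimal Python sketch counts facet values and then filters on a chosen value. It does not reproduce the API of Tracker, DocFetcher, or Recoll; the `Hit` class, the `facet_counts` and `narrow` helpers, and the sample file paths are all hypothetical and serve only as an illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search result with metadata fields usable as facets (hypothetical)."""
    path: str
    mime: str
    year: int


def facet_counts(hits, field):
    """Count how many hits fall under each value of the given field."""
    return Counter(getattr(hit, field) for hit in hits)


def narrow(hits, field, value):
    """Keep only the hits whose field equals the chosen facet value."""
    return [hit for hit in hits if getattr(hit, field) == value]


# Made-up result set for a simple one-line query; not real test data.
hits = [
    Hit("notes/haystack.txt", "text/plain", 2021),
    Hit("papers/search.pdf", "application/pdf", 2021),
    Hit("papers/index.pdf", "application/pdf", 2019),
    Hit("mail/needle.eml", "message/rfc822", 2021),
]

# Step 1: show the user which file types occur in the result set.
print(facet_counts(hits, "mime"))

# Step 2: the user picks one facet value; the set shrinks accordingly.
pdf_hits = narrow(hits, "mime", "application/pdf")
print(facet_counts(pdf_hits, "year"))

# Step 3: a second facet choice narrows the set further.
recent_pdfs = narrow(pdf_hits, "year", 2021)
print([hit.path for hit in recent_pdfs])
```

Each facet choice only filters the hits already found, so the query itself can stay a single simple input line while the narrowing happens afterward.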
Infos
- Tracker: https://gnome.pages.gitlab.gnome.org/tracker/
- DocFetcher: https://sourceforge.net/projects/docfetcher/
- Recoll: https://www.lesbonscomptes.com/recoll/
- Solr: https://solr.apache.org
- Regain: http://regain.sourceforge.net
- "Tracker 3.0: Where do we go from here?" by Sam Thursfield: https://samthursfield.wordpress.com/2020/11/05/tracker-3-0-where-do-we-go-from-here/
- SPARQL 1.1 Overview: https://www.w3.org/TR/sparql11-overview/
- Tracker CLI documentation: https://gnome.pages.gitlab.gnome.org/tracker/docs/commandline/
- Nepomuk: https://nepomuk.semanticdesktop.org
- Tracker Ontology Reference Manual: https://developer-old.gnome.org/ontology/stable/
- DocFetcher Pro: https://docfetcherpro.com/features/
- Xapian: https://xapian.org