mu-query.cc:
- make_related_enquire: don't include first query in qvec, we already have all
thread IDs we need to query in thread_ds.
- run_related: always sort first query by date, explained by the comment.
- run_related: include qflags (in particular ascending vs descending) in
leader_qflags.
- run_threaded: don't limit results to maxnum, since that can result in
threads being cut off.
mu-server.cc:
- output_sexp: don't limit results to maxnum so as to match the behaviour of
mu find (and avoid cutting off threads).
Fixes #1924 and #1911.
For threading, we still get the _full_ set of messages (since the mset is
limited, but not the enquire); so no need to warn about docids we
haven't seen before.
Also, ensure the unwanted docids are sorted after the wanted ones.
Fixes: #1926.
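
The ordering requirement above (unwanted docids after wanted ones) can be
illustrated with a small sketch. This is not the code from the commit: the
DocId alias, the wanted_first function and the use of std::stable_partition
are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Hypothetical alias; the real code uses Xapian document ids.
using DocId = unsigned;

// Return the docids with all "wanted" ones first and the unwanted ones
// after them. The relative order inside each group is preserved.
std::vector<DocId> wanted_first(std::vector<DocId> docids,
                                const std::unordered_set<DocId>& wanted)
{
    std::stable_partition(docids.begin(), docids.end(),
                          [&](DocId id) { return wanted.count(id) != 0; });
    return docids;
}
```

A stable partition is used here so that any existing sort order (for example
by date) survives within the wanted and the unwanted groups.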
Rewrite the query machinery in c++:
- use an MSet decorator instead of the mu-msg-iter stuff
- use mu-query-decider to mark duplicates/unreadable/related messages
- use mu-query-threader to replace the older container/thread code
Algorithm did not substantially change, but the implementation details
did.
- Move the lib/query/ stuff up a level into lib/
- Associate directly with the Query object
- Rework the Query object to be C++ rather than mixed with C
- Update all dependencies, tests
reimplement the old mu-log.[ch] into mu-logging.{cc,hh}
If available (and using an appropriately equipped glib), log to the
systemd journal
Only g_criticals have stderr output, all the other g_* go to the log
file / journal.
* mu-store.h, mu-store-read.cc, mu-store-write.cc, mu-store-priv.hh have been reworked
into mu-store.{cc,hh}, and the mix of c/c++ has been improved
* update all the dependent modules
* make it easier to upgrade a database in place (without user intervention)
* remove the xbatch-size option
The current threading algorithm is applied to the entire result of a query, even
if maxnum is specified, and then the result of the threading algorithm is
truncated to maxnum. This improves threading results by returning the entire
thread even when only a single message makes it into the top maxnum results.
This commit applies the threading algorithm to the related message set of the
maxnum-truncated query result instead of to the entire query result. For a given
set of messages, the set of messages which will share threads with any of the
original messages is exactly the related message set. Put another way, either
any messages returned by the original query but removed by the maxnum truncation
will also be returned by the related message query, or they would not have been
needed anyway because they would not be members of any visible thread.
To maintain backward compatibility and allow threading to be used without
including related messages, the related message set is found for the threading
calculation, but any messages which would not have matched the original query
are then pruned, resulting in a superset of the truncated query, but a subset of
the untruncated query.
This does not improve (or degrade) the run time of a threading calculation when
maxnum is not set, but significantly improves it when maxnum is set by making it
scale (roughly) linearly in terms of maxnum. On a maildir with ~200k messages
and maxnum set to 500 (the default), the run time of a threading calculation is
lowered from ~1m to ~0.1s.
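
The selection described above can be sketched as follows. This is a simplified
illustration, not the actual implementation: the Msg struct, the
threading_input function and the in-memory "store" vector are hypothetical
stand-ins for the Xapian queries mu really runs, and the store is assumed to
be already in the query's sort order.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified message record (not mu's real types).
struct Msg {
    int         docid;
    std::string thread_id;
    bool        matches_query; // would the original query match this message?
};

// Select the messages to feed into the threading calculation.
std::vector<Msg> threading_input(const std::vector<Msg>& store,
                                 std::size_t maxnum, bool include_related)
{
    // 1. Collect the thread ids of the maxnum-truncated query result.
    std::set<std::string> thread_ids;
    std::size_t taken = 0;
    for (const auto& m : store) {
        if (!m.matches_query)
            continue;
        if (taken == maxnum)
            break;
        thread_ids.insert(m.thread_id);
        ++taken;
    }

    // 2. Take the related set (every message sharing one of those threads),
    // 3. and, unless related messages were requested, prune the ones the
    //    original query would not have matched.
    std::vector<Msg> out;
    for (const auto& m : store)
        if (thread_ids.count(m.thread_id) != 0 &&
            (include_related || m.matches_query))
            out.push_back(m);
    return out;
}
```

With include_related set to false, the result contains every message of the
truncated query plus the matching messages from the same threads, so it is a
superset of the truncated query and a subset of the untruncated one. The work
done depends on the threads touched by the first maxnum matches rather than on
the size of the full result.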
Perform threading calculation on related set instead of entire result.