1
0
mirror of https://github.com/djcb/mu.git synced 2024-07-06 09:01:17 +02:00
Commit Graph

5 Commits

Author SHA1 Message Date
djcb
f794cea6e7 parser: small regex optimization 2017-11-04 14:32:41 +02:00
djcb
3cd150f289 parser: handle implicit 'and not' 2017-11-04 12:59:48 +02:00
djcb
65863e46cd parser: fix and-not precedence
For now, don't treat "and not" specially; this gets us back into a
somewhat working state. At some point, we probably _do_ want to
special-case and_not though (since Xapian supports it).
2017-10-31 07:18:14 +02:00
djcb
6ce7c89488 phrases: only allow for index fields 2017-10-27 18:42:58 +03:00
djcb
b75f9f508b lib: implement new query parser
mu's query parser is the piece of software that turns your queries
into something the Xapian database can understand. So, if you query
"maildir:/inbox and subject:bla" this must be translated into a
Xapian::Query object which will retrieve the sought after messages.

Since mu's beginning, almost a decade ago, this parser was based on
Xapian's default Xapian::QueryParser. It works okay, but wasn't really
designed for the mu use-case, and had a bit of trouble with anything
that's not A..Z (think: spaces, special characters, unicode etc.).

Over the years, mu added quite a bit of pre-processing trickery to
deal with that. Still, there were corner cases and bugs that were
practically unfixable.

The solution to all of this is to have a custom query processor that
replaces Xapian's, and write it from the ground up to deal with the
special characters etc. I wrote one, as part of my "future, post-1.0
mu" reseach project, and I have now backported it to the mu 0.9.19.

From a technical perspective, this is a major cleanup, and allows us
to get rid of much of the fragile preprocessing both for indexing and
querying. From and end-user perspective this (hopefully) means that
many of the little parsing issues are gone, and it opens the way for
some new features.

From an end-user perspective:
- better support for special characters.
- regexp search! yes, you can now search for regular expressions, e.g.
      subject:/h.ll?o/
  will find subjects with hallo, hello, halo,  philosophy, ...

  As you can imagine, this can be a _heavy_ operation on the database,
  and might take quite a bit longer than a normal query; but it can be
  quite useful.
2017-10-24 22:55:35 +03:00