Commit Graph

17 Commits

Author SHA1 Message Date
Dirk-Jan C. Binnema 264bb092f0 support xapian ngrams
Xapian supports an "ngrams" option to help with languages/scripts
without explicit wordbreaks, such as Chinese / Japanese / Korean.

Add some plumbing for supporting this in mu as well. Experimental for
now.
2023-09-09 17:26:20 +03:00
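
To illustrate why n-grams help with scripts that lack explicit word breaks, here is a minimal sketch of overlapping character bigrams. It is not Xapian's or mu's implementation; the helper name `char_ngrams` and the default n=2 are assumptions made for illustration only.

```python
def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Return overlapping character n-grams of `text` (hypothetical helper)."""
    # Drop whitespace; the remaining characters have no word boundaries.
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return []
    if len(chars) < n:
        # Too short for a full n-gram: keep the text as a single term.
        return ["".join(chars)]
    # Slide a window of width n over the characters.
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


terms = char_ngrams("東京都庁")
```

Indexing every overlapping window means a query for any two adjacent characters can match, at the cost of a larger term list.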
Dirk-Jan C. Binnema 1018f0f0a1 mu-document: Make sexp() lazy (optimization)
This makes queries where we don't need the sexp much faster; e.g.

before:
  mu find "a" --include-related  47,51s user 2,68s system 99% cpu 50,651 total
after:
  mu find "a" --include-related  7,12s user 1,97s system 87% cpu 10,363 total
2023-08-04 00:09:02 +03:00
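
The lazy-evaluation idea in the commit above can be sketched roughly as follows. This is not mu's C++ code: the `Document` class, its `fields` dictionary and the `builds` counter are hypothetical stand-ins, and the s-expression layout is invented for the example.

```python
from functools import cached_property


class Document:
    """Hypothetical stand-in for a stored message document."""

    def __init__(self, fields: dict[str, str]):
        self.fields = fields
        self.builds = 0  # counts how often the expensive step runs

    @cached_property
    def sexp(self) -> str:
        # Built on first access only, then cached on the instance.
        self.builds += 1
        body = " ".join(f':{k} "{v}"' for k, v in sorted(self.fields.items()))
        return f"({body})"


doc = Document({"subject": "hello", "from": "a@example.com"})
```

Queries that never read `doc.sexp` never pay for building it, which is the kind of saving the timings above suggest.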
Dirk-Jan C. Binnema b795242d5a message: use html-to-text scraper for html parts
We were dumping the HTML-parts as-is in the Xapian indexer; however,
it's better to remove the html decoration first, and just pass the text.

We use the new built-in html->text scraper for that.
2023-07-25 21:26:36 +03:00
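
As a rough illustration of stripping HTML decoration before indexing, the sketch below uses Python's standard `html.parser`. It is not mu's built-in scraper; the `TextScraper` class and `html_to_text` function are hypothetical names, and skipping `script` and `style` content is an assumption about what such a scraper would reasonably drop.

```python
from html.parser import HTMLParser


class TextScraper(HTMLParser):
    """Collect visible text, ignoring tags and script/style content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.skip = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if not self.skip and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    parser = TextScraper()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The indexer would then receive only the extracted text rather than the raw markup.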
Dirk-Jan C. Binnema 31f0c40893 migrate to fmt-based logging in some more places
and improve logging.
2023-07-08 10:30:36 +03:00
Dirk-Jan C. Binnema 4920b56671 update to use fmt-based apis
Not complete, but a first big stab at converting users of Mu::Error,
various g_warning & friends, and format to the new libfmt-based APIs.
2023-07-05 23:10:13 +03:00
Dirk-Jan C. Binnema abfa6f277c mu: index html text as if it were plain text
This is a bit of a hack to include html text in results.

Of course, html text is not really plain text, so this is only a
stopgap until we introduce some html parsing step.
2023-01-31 23:41:57 +02:00
Dirk-Jan C. Binnema 58176f8438 message: updates for new sexp
Update for API changes.
2022-11-07 18:38:03 +02:00
Dirk-Jan C. Binnema 317fe53ff7 tests: update test helpers and users
Move test-mu-common to mu-test-utils. Use mu_test_init as a wrapper for
g_test_init. Update users.
2022-08-11 22:55:10 +03:00
Dirk-Jan C. Binnema ca8836b631 document: cosmetic
2022-06-29 22:20:34 +03:00
Dirk-Jan C. Binnema df80935c2e document: index some sub-parts as well
1. Also add 'normal' terms for some indexable fields
2. Add terms for e-mail address components

And add some tests.

This helps for some corner-case queries (see tests).

Fixes #2278
Fixes #2281
2022-06-29 08:00:43 +03:00
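
One way to picture "terms for e-mail address components" is the sketch below. Which components mu actually indexes is not stated in the commit, so the splitting rule here (lowercase, split on any non-alphanumeric ASCII character) and the helper name `address_terms` are assumptions for illustration.

```python
import re


def address_terms(address: str) -> list[str]:
    """Return the full address plus its components as terms (hypothetical)."""
    addr = address.strip().lower()
    terms = [addr]
    # Assumed rule: split on anything that is not an ASCII letter or digit.
    for part in re.split(r"[^a-z0-9]+", addr):
        if part and part not in terms:
            terms.append(part)
    return terms
```

With component terms in the index, a query for just the domain or just part of the local name can match the message.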
Dirk-Jan C. Binnema 8c3d1ae90a message: cosmetics
2022-05-06 22:17:53 +03:00
Dirk-Jan C. Binnema 85fed37870 message/document: update sexp on the fly
Keep the sexp for the document up to date during scan / change, instead of
having a separate step.
2022-05-05 01:40:17 +03:00
Dirk-Jan C. Binnema a4f39819ee message/document: allow updating flags
Some flags (such as 'personal') can only be set just before storing, so allow
updating the flags.
2022-05-05 01:38:25 +03:00
Dirk-Jan C. Binnema 263e122a13 contacts: expose contact type
Instead of the Field::Id, keep a specific Contact::Type so we can distinguish
Sender, ReplyTo as well.

Update dependents.

Some cleanup.
2022-05-05 01:38:25 +03:00
Dirk-Jan C. Binnema 9a8741f0dd message:document/fields: update and tie down
Update many of the field flags; remove obsolete ones.

Ensure they are handled correctly in mu-document.
2022-04-30 10:40:45 +03:00
Dirk-Jan C. Binnema 37988b5a26 message: update implementation
Add more of the Message class (and various helpers), which are to replace all
the `mu-msg-*` code.

Add more tests.
2022-03-26 17:19:10 +02:00
Dirk-Jan C. Binnema 4c4fb1759f message: move to lib/message, update naming
Basically, move/rename lib/mu-message* to lib/mu-*.

Add the beginnings of a Message class.
2022-03-26 17:19:10 +02:00