lib: implement new query parser

mu's query parser is the piece of software that turns your queries
into something the Xapian database can understand. So, if you query
"maildir:/inbox and subject:bla" this must be translated into a
Xapian::Query object which will retrieve the sought-after messages.

Since mu's beginning, almost a decade ago, this parser was based on
Xapian's default Xapian::QueryParser. It works okay, but wasn't really
designed for the mu use-case, and had a bit of trouble with anything
that's not A..Z (think: spaces, special characters, unicode etc.).

Over the years, mu added quite a bit of pre-processing trickery to
deal with that. Still, there were corner cases and bugs that were
practically unfixable.

The solution to all of this is to have a custom query processor that
replaces Xapian's, and write it from the ground up to deal with the
special characters etc. I wrote one, as part of my "future, post-1.0
mu" research project, and I have now backported it to mu 0.9.19.

From a technical perspective, this is a major cleanup, and allows us
to get rid of much of the fragile preprocessing both for indexing and
querying. From an end-user perspective this (hopefully) means that
many of the little parsing issues are gone, and it opens the way for
some new features.

From an end-user perspective:

- better support for special characters.

- regexp search! yes, you can now search for regular expressions, e.g.

    subject:/h.ll?o/

  will find subjects with hallo, hello, halo, philosophy, ...

  As you can imagine, this can be a _heavy_ operation on the database,
  and might take quite a bit longer than a normal query; but it can be
  quite useful.
2017-10-24 21:55:35 +02:00
/*
** Copyright (C) 2017-2022 Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>
**
** This library is free software; you can redistribute it and/or
** modify it under the terms of the GNU Lesser General Public License
** as published by the Free Software Foundation; either version 2.1
** of the License, or (at your option) any later version.
**
** This library is distributed in the hope that it will be useful,
** but WITHOUT ANY WARRANTY; without even the implied warranty of
** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
** Lesser General Public License for more details.
**
** You should have received a copy of the GNU Lesser General Public
** License along with this library; if not, write to the Free
** Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
** 02110-1301, USA.
*/

#include <vector>
#include <glib.h>

#include <iostream>
#include <sstream>
#include <functional>
#include <array>

#include "mu-utils.hh"
#include "mu-error.hh"

using namespace Mu;

struct Case {
	const std::string expr;
	bool is_first{};
	const std::string expected;
};
using CaseVec = std::vector<Case>;
using ProcFunc = std::function<std::string(std::string, bool)>;

static void
test_cases(const CaseVec& cases, ProcFunc proc)
{
	for (const auto& casus : cases) {
		const auto res = proc(casus.expr, casus.is_first);
		if (g_test_verbose()) {
			std::cout << "\n";
			std::cout << casus.expr << ' ' << casus.is_first << std::endl;
			std::cout << "exp: '" << casus.expected << "'" << std::endl;
			std::cout << "got: '" << res << "'" << std::endl;
		}

		g_assert_true(casus.expected == res);
	}
}

static void
test_date_basic()
{
	const auto hki = "Europe/Helsinki";

	// ensure we have the needed TZ or skip the test.
	if (!timezone_available(hki)) {
		g_test_skip("timezone Europe/Helsinki not available");
		return;
	}

	g_setenv("TZ", hki, TRUE);
	constexpr std::array<std::tuple<const char*, bool/*is_first*/, int64_t>, 13> cases = {{
		{"2015-09-18T09:10:23", true, 1442556623},
		{"1972-12-14T09:10:23", true, 93165023},
		{"1854-11-18T17:10:23", true, 0},

		{"2000-02-31T09:10:23", true, 951861599},
		{"2000-02-29T23:59:59", true, 951861599},

		{"20220602", true, 1654117200},
		{"20220605", false, 1654462799},

		{"202206", true, 1654030800},
		{"202206", false, 1656622799},

		{"2016", true, 1451599200},
		{"2016", false, 1483221599},

		// {"fnorb", true, -1},
		// {"fnorb", false, -1},
		{"", false, G_MAXINT64},
		{"", true, 0}
	}};

	for (auto& test: cases) {
		if (g_test_verbose())
			g_debug("checking %s", std::get<0>(test));
		g_assert_cmpuint(parse_date_time(std::get<0>(test),
						 std::get<1>(test)).value_or(-1),==,
				 std::get<2>(test));
	}
}

static void
test_date_ymwdhMs(void)
{
	struct testcase {
		std::string expr;
		int64_t diff;
		int tolerance;
	};

	std::array<testcase, 7> cases = {{
		{"7s", 7, 1},
		{"3M", 3 * 60, 1},
		{"3h", 3 * 60 * 60, 1},
		{"21d", 21 * 24 * 60 * 60, 3600 + 1},
		{"2w", 2 * 7 * 24 * 60 * 60, 3600 + 1},
		{"2y", 2 * 365 * 24 * 60 * 60, 24 * 3600 + 1},
		{"3m", 3 * 30 * 24 * 60 * 60, 3 * 24 * 3600 + 1}
	}};

	for (auto&& tcase: cases) {
		const auto date = parse_date_time(tcase.expr, true);
		g_assert_true(date);
		const auto diff = ::time({}) - *date;
		if (g_test_verbose())
			std::cerr << tcase.expr << ' ' << diff << ' ' << tcase.diff << '\n';

		g_assert_true(tcase.diff - diff <= tcase.tolerance);
	}

	// note: perhaps it'd be nice if we'd detect this error;
	// currently we're being rather tolerant
	// g_assert_false(!!parse_date_time("25q", false));
}

static void
|
2022-04-28 21:49:45 +02:00
|
|
|
test_parse_size()
|
lib: implement new query parser
mu's query parser is the piece of software that turns your queries
into something the Xapian database can understand. So, if you query
"maildir:/inbox and subject:bla" this must be translated into a
Xapian::Query object which will retrieve the sought after messages.
Since mu's beginning, almost a decade ago, this parser was based on
Xapian's default Xapian::QueryParser. It works okay, but wasn't really
designed for the mu use-case, and had a bit of trouble with anything
that's not A..Z (think: spaces, special characters, unicode etc.).
Over the years, mu added quite a bit of pre-processing trickery to
deal with that. Still, there were corner cases and bugs that were
practically unfixable.
The solution to all of this is to have a custom query processor that
replaces Xapian's, and write it from the ground up to deal with the
special characters etc. I wrote one, as part of my "future, post-1.0
mu" reseach project, and I have now backported it to the mu 0.9.19.
From a technical perspective, this is a major cleanup, and allows us
to get rid of much of the fragile preprocessing both for indexing and
querying. From and end-user perspective this (hopefully) means that
many of the little parsing issues are gone, and it opens the way for
some new features.
From an end-user perspective:
- better support for special characters.
- regexp search! yes, you can now search for regular expressions, e.g.
subject:/h.ll?o/
will find subjects with hallo, hello, halo, philosophy, ...
As you can imagine, this can be a _heavy_ operation on the database,
and might take quite a bit longer than a normal query; but it can be
quite useful.
2017-10-24 21:55:35 +02:00
|
|
|
{
|
2022-06-02 20:02:11 +02:00
|
|
|
constexpr std::array<std::tuple<const char*, bool, int64_t>, 6> cases = {{
|
2022-04-28 21:49:45 +02:00
|
|
|
{ "456", false, 456 },
|
|
|
|
{ "", false, G_MAXINT64 },
|
|
|
|
{ "", true, 0 },
|
|
|
|
{ "2K", false, 2048 },
|
2022-06-02 20:02:11 +02:00
|
|
|
{ "2M", true, 2097152 },
|
|
|
|
{ "5G", true, 5368709120 }
|
2022-04-28 21:49:45 +02:00
|
|
|
}};
|
|
|
|
for(auto&& test: cases) {
|
|
|
|
g_assert_cmpint(parse_size(std::get<0>(test), std::get<1>(test))
|
|
|
|
.value_or(-1), ==, std::get<2>(test));
|
|
|
|
}
|
2022-06-02 20:02:11 +02:00
|
|
|
|
|
|
|
g_assert_false(!!parse_size("-1", true));
|
|
|
|
g_assert_false(!!parse_size("scoobydoobydoo", false));
|
}

static void
test_flatten()
{
	CaseVec cases = {
		{"Менделе́ев", true, "менделеев"},
		{"", false, ""},
		{"Ångström", true, "angstrom"},
	};

	test_cases(cases, [](auto s, auto f) { return utf8_flatten(s); });
}

static void
test_remove_ctrl()
{
	CaseVec cases = {
		{"Foo\n\nbar", true, "Foo bar"},
		{"", false, ""},
		{" ", false, " "},
		{"Hello World ", false, "Hello World "},
		{"Ångström", false, "Ångström"},
	};

	test_cases(cases, [](auto s, auto f) { return remove_ctrl(s); });
}

static void
test_clean()
{
	CaseVec cases = {
		{"\t a\t\nb ", true, "a b"},
		{"", false, ""},
		{"Ångström", true, "Ångström"},
	};

	test_cases(cases, [](auto s, auto f) { return utf8_clean(s); });
}

static void
test_format()
{
	g_assert_true(format("hello %s", "world") == "hello world");
	g_assert_true(format("hello %s, %u", "world", 123) == "hello world, 123");
}

static void
test_split()
{
	using svec = std::vector<std::string>;
	auto assert_equal_svec=[](const svec& sv1, const svec& sv2) {
		g_assert_cmpuint(sv1.size(),==,sv2.size());
		for (auto i = 0U; i != sv1.size(); ++i)
			g_assert_cmpstr(sv1[i].c_str(),==,sv2[i].c_str());
	};

	assert_equal_svec(split("axbxc", "x"), {"a", "b", "c"});
	assert_equal_svec(split("axbxcx", "x"), {"a", "b", "c", ""});
	assert_equal_svec(split("", "boo"), {});
	assert_equal_svec(split("ayybyyc", "yy"), {"a", "b", "c"});
	assert_equal_svec(split("abc", ""), {"a", "b", "c"});
	assert_equal_svec(split("", "boo"), {});

	assert_equal_svec(split("axbxc", 'x'), {"a", "b", "c"});
	assert_equal_svec(split("axbxcx", 'x'), {"a", "b", "c", ""});
	assert_equal_svec(split("", "boo"), {});
}

static void
test_join()
{
	assert_equal(join({"a", "b", "c"}, "x"), "axbxc");
	assert_equal(join({"a", "b", "c"}, ""), "abc");
	assert_equal(join({},"foo"), "");
	assert_equal(join({"d", "e", "f"}, "foo"), "dfooefoof");
}

enum struct Bits { None = 0, Bit1 = 1 << 0, Bit2 = 1 << 1 };
MU_ENABLE_BITOPS(Bits);

static void
test_define_bitmap()
{
	g_assert_cmpuint((guint)Bits::None, ==, (guint)0);
	g_assert_cmpuint((guint)Bits::Bit1, ==, (guint)1);
	g_assert_cmpuint((guint)Bits::Bit2, ==, (guint)2);

	g_assert_cmpuint((guint)(Bits::Bit1 | Bits::Bit2), ==, (guint)3);
	g_assert_cmpuint((guint)(Bits::Bit1 & Bits::Bit2), ==, (guint)0);

	g_assert_cmpuint((guint)(Bits::Bit1 & (~Bits::Bit2)), ==, (guint)1);

	{
		Bits b{Bits::Bit1};
		b |= Bits::Bit2;
		g_assert_cmpuint((guint)b, ==, (guint)3);
	}

	{
		Bits b{Bits::Bit1};
		b &= Bits::Bit1;
		g_assert_cmpuint((guint)b, ==, (guint)1);
	}
}

static void
test_to_from_lexnum()
{
	assert_equal(to_lexnum(0), "g0");
	assert_equal(to_lexnum(100), "h64");
	assert_equal(to_lexnum(12345), "j3039");

	g_assert_cmpuint(from_lexnum(to_lexnum(0)), ==, 0);
	g_assert_cmpuint(from_lexnum(to_lexnum(7777)), ==, 7777);
	g_assert_cmpuint(from_lexnum(to_lexnum(9876543)), ==, 9876543);
}

static void
test_locale_workaround()
{
	g_assert_true(locale_workaround());

	g_setenv("LC_ALL", "BOO", 1);

	g_assert_true(locale_workaround());
}

static void
test_error()
{
	GError *err;
	err = g_error_new(MU_ERROR_DOMAIN, 77, "Hello, %s", "world");
	Error ex{Error::Code::Crypto, &err, "boo"};
	g_assert_cmpstr(ex.what(), ==, "boo: Hello, world");

	ex.fill_g_error(&err);
	g_assert_cmpuint(err->code, ==, static_cast<unsigned>(Error::Code::Crypto));
	g_clear_error(&err);
}

int
main(int argc, char* argv[])
{
	g_test_init(&argc, &argv, nullptr);

	g_test_add_func("/utils/date-basic", test_date_basic);
	g_test_add_func("/utils/date-ymwdhMs", test_date_ymwdhMs);
	g_test_add_func("/utils/parse-size", test_parse_size);
	g_test_add_func("/utils/flatten", test_flatten);
	g_test_add_func("/utils/remove-ctrl", test_remove_ctrl);
	g_test_add_func("/utils/clean", test_clean);
	g_test_add_func("/utils/format", test_format);
	g_test_add_func("/utils/split", test_split);
	g_test_add_func("/utils/join", test_join);
	g_test_add_func("/utils/define-bitmap", test_define_bitmap);
	g_test_add_func("/utils/to-from-lexnum", test_to_from_lexnum);
	g_test_add_func("/utils/locale-workaround", test_locale_workaround);
	g_test_add_func("/utils/error", test_error);

	return g_test_run();
}