Add ADRs

Add some (back-dated) architecture decision records [1] to document some of the more significant historical design choices.

[1]: https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions
2022-11-26 13:12:51 +01:00 · 48f4d34baf
parent 83ab69aedd
commit 48f4d34baf
5 changed files with 101 additions and 0 deletions
--- a/.adr-dir
+++ b/.adr-dir
@@ -0,0 +1 @@
+adr
--- a/adr/0001-require-valid-utf-8.md
+++ b/adr/0001-require-valid-utf-8.md
@@ -0,0 +1,28 @@
+# Require valid UTF-8
+
+ADR #: 1 \
+Date: 2020-11-28 \
+Author: [Nick Groenen](https://github.com/zoni/)
+
+## Context
+
+Rust's native [String] types are always UTF-8 encoded (whereas an [OsString] can hold arbitrary byte sequences), but filesystem paths (represented by the [Path] and [PathBuf] structs) may consist of arbitrary encodings/byte sequences.
+Similarly, note content that we read from files could be encoded in any arbitrary encoding; it may not consist of valid UTF-8.
+
+In many cases we will need to look up strings found within notes against a list of paths (for example to find the path in the vault when encountering a `[[WikiLinkedNote]]`).
+
+We must decide whether to treat everything as valid UTF-8, or to treat it as arbitrary bytes, as we cannot mix these two together.
+
+## Decision
+
+Treating everything as arbitrary byte slices is technically the more correct thing to do, but it would complicate the internal design and is more difficult to get right.
+With byte slices, we could no longer trivially perform certain operations such as upper/lowercasing, splitting or appending, as doing so might lead to mixed encoding schemes.
+
+To simplify the code and eliminate many sources of edge cases introduced by possible mixed encoding schemes, we will shift the responsibility to end-users to ensure all input to obsidian-export is valid UTF-8.
+
+Where applicable, we will use lossy conversion functions such as `to_string_lossy()` and `from_utf8_lossy()` to simplify code by not having to handle the error case of attempting to convert bytes that are not valid UTF-8.
+
+[String]: https://doc.rust-lang.org/std/string/struct.String.html
+[OsString]: https://doc.rust-lang.org/std/ffi/struct.OsString.html
+[Path]: https://doc.rust-lang.org/std/path/struct.Path.html
+[PathBuf]: https://doc.rust-lang.org/std/path/struct.PathBuf.html
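As an illustration of the lossy conversions named in ADR 1, here is a minimal sketch using only the standard library. The helper names `path_to_string_lossy` and `bytes_to_string_lossy` are hypothetical and are not taken from obsidian-export; only `to_string_lossy()` and `from_utf8_lossy()` come from the ADR text.

```rust
use std::path::Path;

/// Hypothetical helper: convert a path to a `String`, replacing any
/// invalid UTF-8 with U+FFFD (the Unicode replacement character).
fn path_to_string_lossy(path: &Path) -> String {
    path.to_string_lossy().into_owned()
}

/// Hypothetical helper: convert raw note bytes to a `String`, replacing
/// invalid sequences with U+FFFD instead of returning an error.
fn bytes_to_string_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // Valid UTF-8 passes through unchanged.
    let valid = bytes_to_string_lossy(b"[[WikiLinkedNote]]");
    assert_eq!(valid, "[[WikiLinkedNote]]");

    // 0xFF can never appear in valid UTF-8, so it is replaced.
    let invalid = bytes_to_string_lossy(&[b'a', 0xFF, b'b']);
    assert_eq!(invalid, "a\u{FFFD}b");

    let path = path_to_string_lossy(Path::new("vault/Note.md"));
    assert_eq!(path, "vault/Note.md");
}
```

Lossy conversion never fails, but it silently rewrites invalid input, which is consistent with the ADR's choice to make valid UTF-8 the end-user's responsibility.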
--- a/adr/0002-percent-encode-questionmark-in-filenames.md
+++ b/adr/0002-percent-encode-questionmark-in-filenames.md
@@ -0,0 +1,15 @@
+# Percent-encode `?` in filenames
+
+ADR #: 2 \
+Date: 2021-02-16 \
+Author: [Nick Groenen](https://github.com/zoni/)
+
+## Context
+
+A recent Obsidian update expanded the list of allowed characters in filenames, which now includes `?` as well.
+Most static site generators break when they encounter a bare `?` in markdown links, so this should be percent-encoded to ensure we export valid links.
+
+## Decision
+
+We'll add `?` to the hardcoded list of characters to escape (`const PERCENTENCODE_CHARS`).
+Making this list configurable is desirable, but this is left for a future improvement given other priorities.
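The decision in ADR 2 can be sketched as follows. This is a standard-library stand-in, not the project's implementation: the ADR only confirms that `?` joins a hardcoded list named `PERCENTENCODE_CHARS`. The contents of `ESCAPE_CHARS` other than `?` and the `percent_encode` helper are assumptions made for illustration.

```rust
/// Hypothetical escape list. Only `?` is confirmed by the ADR; the space
/// and `%` entries are assumptions so that the output stays unambiguous.
const ESCAPE_CHARS: &[u8] = b" %?";

/// Percent-encode every ASCII character of `input` found in `ESCAPE_CHARS`.
/// All other characters (including non-ASCII) are copied through unchanged.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        if ch.is_ascii() && ESCAPE_CHARS.contains(&(ch as u8)) {
            // Two-digit uppercase hex, e.g. `?` (0x3F) becomes `%3F`.
            out.push_str(&format!("%{:02X}", ch as u8));
        } else {
            out.push(ch);
        }
    }
    out
}

fn main() {
    let link = percent_encode("What now?.md");
    assert_eq!(link, "What%20now%3F.md");
    println!("{link}");
}
```

A hardcoded constant keeps the change small; making the list configurable, as the ADR suggests for the future, would amount to passing the set in as a parameter.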
--- a/adr/0003-extensibility-through-postprocessors.md
+++ b/adr/0003-extensibility-through-postprocessors.md
@@ -0,0 +1,39 @@
+# Extensibility through postprocessors
+
+ADR #: 3 \
+Date: 2021-02-20 \
+Author: [Nick Groenen](https://github.com/zoni/)
+
+## Context
+
+It's desirable for end-users to have some control over the logic that is used to export notes and the transformation of their content from Obsidian-flavored markdown to regular markdown.
+
+One use-case would be to tailor the output for consumption by a specific static site generator, for example [Hugo].
+This requires emitting specific frontmatter elements and converting certain syntax elements to Hugo [shortcodes].
+
+However, to ease maintenance, the core of the library should remain as narrowly scoped and limited as possible.
+Ideally, all such customization would be expressed through some kind of hook, callback or plugin mechanism that keeps it entirely out of the core obsidian-export library modules.
+
+## Decision
+
+We introduce the concept of _postprocessors_, which are (user-supplied) Rust functions that are called for every exported note right after it's been parsed, but before it is written out to the filesystem.
+
+Postprocessors may be chained (they'll be called in order, with the output of the first being the input to the second, etc.) and will have access to and be able to modify:
+
+1. The stream of markdown events which makes up the note
+2. The note context, containing information such as the filename, path, frontmatter, etc.
+
+In addition, the return value of a postprocessor will be used to affect how the note is treated further: either to prevent later postprocessors from running (`PostprocessorResult::StopHere`) or to cause a note to be skipped entirely (`PostprocessorResult::StopAndSkipNote`) and omitted from the export.
+
+In code, the function signature for a postprocessor looks like:
+
+```rust
+pub type Postprocessor = dyn Fn(Context, MarkdownEvents) -> (Context, MarkdownEvents, PostprocessorResult) + Send + Sync;
+```
+
+The `Exporter` will receive a new method `add_postprocessor()` to allow users to register their desired postprocessors.
+
+Initially, we'll introduce support for this without anything else, but if any sufficiently generic use cases can be identified, we may add certain postprocessors to obsidian-export directly for users to opt in to via CLI args.
+
+[Hugo]: https://gohugo.io/
+[shortcodes]: https://gohugo.io/content-management/shortcodes/
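The chaining behaviour described in ADR 3 can be sketched with stand-in types. Everything below is an assumption except the `Postprocessor` signature, the `add_postprocessor()` method name, and the `StopHere` and `StopAndSkipNote` variants, which come from the ADR. In particular, the `Continue` variant, the `Context` fields, the `Vec<String>` event stream, the boxed storage, and the `run` method are hypothetical simplifications.

```rust
/// Stand-in for the note context (the real one also holds path, frontmatter, etc.).
#[derive(Debug, Clone, PartialEq)]
struct Context {
    filename: String,
}

/// Stand-in for the stream of markdown events.
type MarkdownEvents = Vec<String>;

#[derive(Debug, PartialEq)]
enum PostprocessorResult {
    Continue, // assumed variant: carry on with the next postprocessor
    StopHere,
    StopAndSkipNote,
}

type Postprocessor =
    dyn Fn(Context, MarkdownEvents) -> (Context, MarkdownEvents, PostprocessorResult) + Send + Sync;

#[derive(Default)]
struct Exporter {
    postprocessors: Vec<Box<Postprocessor>>,
}

impl Exporter {
    fn add_postprocessor(&mut self, p: Box<Postprocessor>) -> &mut Self {
        self.postprocessors.push(p);
        self
    }

    /// Run the chain in registration order. Returns `None` when the note is skipped.
    fn run(&self, mut ctx: Context, mut events: MarkdownEvents) -> Option<(Context, MarkdownEvents)> {
        for p in &self.postprocessors {
            let (c, e, result) = p(ctx, events);
            ctx = c;
            events = e;
            match result {
                PostprocessorResult::Continue => {}
                PostprocessorResult::StopHere => break,
                PostprocessorResult::StopAndSkipNote => return None,
            }
        }
        Some((ctx, events))
    }
}

/// Example postprocessor: omit notes whose filename starts with `draft-`.
fn skip_drafts(ctx: Context, events: MarkdownEvents) -> (Context, MarkdownEvents, PostprocessorResult) {
    let result = if ctx.filename.starts_with("draft-") {
        PostprocessorResult::StopAndSkipNote
    } else {
        PostprocessorResult::Continue
    };
    (ctx, events, result)
}

/// Example postprocessor: keep the note as-is and stop the chain.
fn freeze(ctx: Context, events: MarkdownEvents) -> (Context, MarkdownEvents, PostprocessorResult) {
    (ctx, events, PostprocessorResult::StopHere)
}

/// Example postprocessor: uppercase every event.
fn uppercase(ctx: Context, events: MarkdownEvents) -> (Context, MarkdownEvents, PostprocessorResult) {
    let events = events.into_iter().map(|e| e.to_uppercase()).collect();
    (ctx, events, PostprocessorResult::Continue)
}

fn main() {
    let mut exporter = Exporter::default();
    exporter
        .add_postprocessor(Box::new(skip_drafts))
        .add_postprocessor(Box::new(uppercase));

    let note = Context { filename: "note.md".to_string() };
    let out = exporter.run(note, vec!["hello".to_string()]);
    assert_eq!(out.unwrap().1, vec!["HELLO".to_string()]);

    let draft = Context { filename: "draft-idea.md".to_string() };
    assert!(exporter.run(draft, vec!["wip".to_string()]).is_none());

    // `freeze` stops the chain, so `uppercase` never runs.
    let mut frozen = Exporter::default();
    frozen
        .add_postprocessor(Box::new(freeze))
        .add_postprocessor(Box::new(uppercase));
    let note = Context { filename: "note.md".to_string() };
    let out = frozen.run(note, vec!["hello".to_string()]);
    assert_eq!(out.unwrap().1, vec!["hello".to_string()]);
}
```

Passing the context and events by value and returning them keeps each postprocessor a plain function with no shared mutable state, which fits the ADR's goal of keeping customization out of the library core.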
--- a/adr/templates/template.md
+++ b/adr/templates/template.md
@@ -0,0 +1,18 @@
+# TITLE
+
+ADR #: NUMBER \
+Date: DATE \
+Author: [Nick Groenen](https://github.com/zoni/)
+
+## Context
+
+## Decision
+
+
+
+<!-- Optional sections for further information. Delete when unused -->
+
+## Further reading
+
+## References
+