doc: Improve check documentation

Make it clearer what the difference is between regular check and check with --read-data.
This commit is contained in:
rawtaz 2019-12-10 21:15:59 +01:00 committed by GitHub
parent 6c700f95b5
commit fccb579471
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 55 additions and 25 deletions

View File

@ -82,54 +82,85 @@ Furthermore you can group the output by the same filters (host, paths, tags):
1 snapshots
Checking a repo's integrity and consistency
===========================================
Checking integrity and consistency
==================================
Imagine your repository is saved on a server that has a faulty hard
drive, or even worse, attackers get privileged access and modify your
backup with the intention to make you restore malicious data:
drive, or even worse, attackers get privileged access and modify the
files in your repository with the intention to make you restore
malicious data:
.. code-block:: console
$ echo "boom" >> backup/index/d795ffa99a8ab8f8e42cec1f814df4e48b8f49129360fb57613df93739faee97
$ echo "boom" > /srv/restic-repo/index/de30f3231ca2e6a59af4aa84216dfe2ef7339c549dc11b09b84000997b139628
In order to detect these things, it is a good idea to regularly use the
``check`` command to test whether everything is alright, your precious
backup data is consistent and the integrity is unharmed:
Trying to restore a snapshot which has been modified as shown above
will yield an error:
.. code-block:: console
$ restic -r /srv/restic-repo --no-cache restore c23e491f --target /tmp/restore-work
...
Fatal: unable to load index de30f323: load <index/de30f3231c>: invalid data returned
In order to detect these things before they become a problem, it's a
good idea to regularly use the ``check`` command to test whether your
repository is healthy and consistent, and that your precious backup
data is unharmed. There are two types of checks that can be performed:
- Structural consistency and integrity, e.g. snapshots, trees and pack files (default)
- Integrity of the actual data that you backed up (enabled with flags, see below)
To verify the structure of the repository, issue the ``check`` command.
If the repository is damaged like in the example above, ``check`` will
detect this and yield the same error as when you tried to restore:
.. code-block:: console
$ restic -r /srv/restic-repo check
Load indexes
ciphertext verification failed
...
load indexes
error: error loading index de30f323: load <index/de30f3231c>: invalid data returned
Fatal: LoadIndex returned errors
Trying to restore a snapshot which has been modified as shown above will
yield the same error:
If the repository structure is intact, restic will show that no errors were found:
.. code-block:: console
$ restic -r /srv/restic-repo restore 79766175 --target /tmp/restore-work
Load indexes
ciphertext verification failed
$ restic -r /src/restic-repo check
...
load indexes
check all packs
check snapshots, trees and blobs
no errors were found
By default, ``check`` command does not check that repository data files
are unmodified. Use ``--read-data`` parameter to check all repository
data files:
By default, the ``check`` command does not verify that the actual data files
on disk in the repository are unmodified, because doing so requires reading
a copy of every data file in the repository. To tell restic to also verify the
integrity of the data files in the repository, use the ``--read-data`` flag:
.. code-block:: console
$ restic -r /srv/restic-repo check --read-data
...
load indexes
check all packs
check snapshots, trees and blobs
read all data
[0:00] 100.00% 3 / 3 items
duration: 0:00
no errors were found
Use the ``--read-data-subset=n/t`` parameter to check only a subset of the
repository data files at a time. The parameter takes two values, ``n`` and
``t``. When the check command runs, all data files in the repository are
logically divided in ``t`` (roughly equal) groups, and only files that
belong to the group number ``n`` are checked. For example, the following
commands check all repository data files over 5 separate invocations:
.. note:: Since ``--read-data`` has to download all data files in the
repository, beware that it might incur higher bandwidth costs than usual
and also that it takes more time than the default ``check``.
Alternatively, use the ``--read-data-subset=n/t`` parameter to check only a
subset of the repository data files at a time. The parameter takes two values,
``n`` and ``t``. When the check command runs, all data files in the repository
are logically divided in ``t`` (roughly equal) groups, and only files that
belong to group number ``n`` are checked. For example, the following commands
check all repository data files over 5 separate invocations:
.. code-block:: console
@ -138,4 +169,3 @@ commands check all repository data files over 5 separate invocations:
$ restic -r /srv/restic-repo check --read-data-subset=3/5
$ restic -r /srv/restic-repo check --read-data-subset=4/5
$ restic -r /srv/restic-repo check --read-data-subset=5/5