Commit Graph

134 Commits

Author SHA1 Message Date
Michael Eischer 6a6d313c9a prune: reduce priority of repacking small packs 2022-08-05 23:47:12 +02:00
Kyle Brennan 0269381b8d prune: add repack-small parameter 2022-08-05 23:47:12 +02:00
Michael Eischer e85a21eda2 prune: move code 2022-07-30 17:37:07 +02:00
Michael Eischer d0590b7841 prune: Add internal integrity check
After repacking every blob that should be kept must have been repacked.
We have seen a few cases in which a single blob went missing, which
could have been caused by a bitflip somewhere. This sanity check might
help catch some of these cases.
2022-07-30 17:37:07 +02:00
Michael Eischer 5cbde03eae prune: split into smaller functions 2022-07-30 17:37:07 +02:00
Alexander Weiss 7643237da5 prune: separate collecting/printing/pruning 2022-07-30 17:37:07 +02:00
Michael Eischer 715d457aad prune: code cleanups 2022-07-17 11:41:56 +02:00
Michael Eischer 9be1bd2acc prune: handle very high duplication of some blobs
Suggested-By: Alexander Weiss <alex@weissfam.de>
2022-07-17 11:39:56 +02:00
Alexander Weiss 7478cbf70e prune: Enhance treatment of duplicates 2022-07-17 00:22:23 +02:00
Michael Eischer ec4dfa3c66 Wording: replace further repo occurrences with repository 2022-07-12 20:48:01 +02:00
Michael Eischer ed8aa15376 repository: add Save method to MasterIndex interface 2022-07-02 18:38:56 +02:00
Alexander Neumann 74f7fe2b98
Merge pull request #3767 from MichaelEischer/fix-prune-empty-snapshot
prune: Fix crash on snapshot loading error
2022-05-29 16:52:45 +02:00
Michael Eischer c8e1ac4049 prune: Don't print stack trace if snapshot can't be loaded 2022-05-23 22:38:45 +02:00
Michael Eischer 173695104c prune: Fix crash on empty snapshot 2022-05-23 22:32:59 +02:00
Michael Eischer 381bd94c6c prune: Add option to repack uncompressed data 2022-05-09 22:31:30 +02:00
Michael Eischer 5406743102 prune: Automatically repack uncompressed trees for repo v2
Tree packs are cached locally at clients and thus benefit a lot from
being compressed. Ensure this be having prune always repack pack files
containing uncompressed trees.
2022-05-09 22:31:30 +02:00
Michael Eischer dbbeac7174 prune: Add unsafe option to recover from no free space
The new option allows prune to operate with nearly no scratch space by only removing
no longer necessary pack files and first deleting the index before
rebuilding it. By first deleting the index it becomes safe to just
delete no longer necessary pack files. However, as a downside there's
now the risk that the repository becomes inaccessible if prune fails.

To recover from that problem a user might have to manually delete the
repository index and then run (a full) `rebuild-index` again.
2022-04-30 19:21:07 +02:00
Michael Eischer f5609d1d3c prune: Fail early if too few backend connections 2022-04-23 11:32:52 +02:00
Michael Eischer 3d29083e60 copy/find/ls/recover/stats: Memorize snapshot listing before index
These commands filter the snapshots according to some criteria which
essentially requires loading the index before filtering the snapshots.
Thus create a copy of the snapshots list beforehand and use it later on.
2022-04-09 12:26:30 +02:00
Michael Eischer 2ec0f3303a backup/diff/dump/restore/stats: List snapshots before index
During a backup the index is written before the corresponding snapshots.
To ensure that a concurrent/later restic run can read a snapshot's data,
restic thus must first load the snapshots and only afterwards the index.
Otherwise it is not possible to ensure that the loaded index is recent
enough to cover all of the snapshot's data.
2022-04-09 12:24:09 +02:00
Michael Eischer f78bd14e28 repository: Remove pack implementation details from MasterIndex 2022-03-28 22:09:49 +02:00
Michael Eischer 537b4c310a copy: Implement by reusing repack
The repack operation copies all selected blobs from a set of pack files
into new pack files. For prune the source and destination repositories
are identical. To implement copy, just use a different source and
destination repository.
2022-03-26 20:47:15 +01:00
rawtaz abfbacf3d3
Merge pull request #3591 from MichaelEischer/prune-fix-max-repack
prune: Handle --max-repack-size=0 as expected
2022-01-13 03:53:20 +01:00
mattxtaz 6ff32ee4d3 Add missing colon in prune stats output and change padding to 14 chars to align the fields 2022-01-06 21:15:15 +00:00
Michael Eischer 0cfdb82ea4 prune: Handle --max-repack-size=0 as expected
Previously the flag was ignored and `--max-repack-size=1` had to be
used.
2021-12-27 15:48:56 +01:00
greatroar 7f0aa49f45 cmd/restic: Streamline progress printing
* PrintProgress no longer does unnecessary Sprintf calls, and performs
  fewer allocations in general
* newProgressMax's callback checks whether the terminal supports
  line updates once instead of once per call
* the callback looks up the terminal width once per call instead of
  twice (on Windows)
* the status shortening now uses the Unicode-aware version from
  internal/ui/termstatus (future-proofing)
2021-09-03 11:48:22 +02:00
Michael Eischer aad8864835 prune: Add missing newlines in error descriptions 2021-06-20 14:25:40 +02:00
Alexander Neumann 027a51529d prune: Improve error message for missing files
This commit changes the error message so that a list of file names is
printed. Before, just the raw map was printed, which is not a great user
interface.
2021-01-31 11:31:27 +01:00
Alexander Weiss e08e65dc30 prune: Simplify logic selecting packs to repack 2021-01-29 22:27:22 +01:00
Alexander Weiss daeb4cdf8f prune: Fix statistics for --repack-cacheable-only 2021-01-29 22:27:22 +01:00
Alexander Neumann a16ce65295
Merge pull request #3244 from MichaelEischer/better-damage-reports
Print more details about possible repository damages
2021-01-29 11:45:45 +01:00
Michael Eischer 58b5679f14 prune: reword missing blobs error
The previous wording could be understood such that the prune run did
damage the repository.
2021-01-28 21:48:24 +01:00
Michael Eischer 7b8886c052 prune: report missing but unneeded pack files
This indicates a damaged repository so add some output to help with
debugging.
2021-01-28 21:46:01 +01:00
MichaelEischer 82140967d3
Merge pull request #3228 from aawsome/prune-all-trees
prune: Remove all unused trees
2021-01-28 21:04:27 +01:00
MichaelEischer 35033d9b79
Merge pull request #3177 from MichaelEischer/fix-2759
prune: don't print stacktrace on console
2021-01-28 20:28:45 +01:00
Michael Eischer eda8c67616 restic: let FindUsedBlobs handle multiple snapshots at once 2021-01-28 11:08:43 +01:00
Alexander Weiss d7dc19a496 prune: Always repack packs containing tree blobs 2021-01-15 16:42:04 +01:00
Alexander Weiss f0113139ea prune: Correct error message 2020-12-29 20:20:05 +01:00
Alexander Weiss f6df94a50e prune: Add self-healing
Allow prune to heal situations where blobs in the index are missing or
the corresponding packfiles are damaged if those blobs are not needed.
2020-12-29 20:20:05 +01:00
Alexander Weiss 68b74e359e Count packs directly in RebuildIndexFiles 2020-12-22 23:01:58 +01:00
Michael Eischer cfea79d0c5 prune: don't print stacktrace on console 2020-12-19 14:28:48 +01:00
Alexander Weiss e329623771 Remove LoadAllSnapshots 2020-12-06 05:22:27 +01:00
Alexander Weiss 92bd448691 Make BlobHandle substruct of Blob 2020-11-22 20:41:10 +01:00
Alexander Weiss 67c938f232 Use PackSize in prune 2020-11-21 22:13:54 +01:00
Alexander Weiss cb5ec7ea6b Fix statistics in prune for duplicates
Note that this fix only solves the statistics problem, if
all duplicates are marked for repacking.

If not all duplicates are marked for repacking, we lack the
information which

The situation that not all duplicates are marked for repacking can occur
when using the `max-repack-size` option
2020-11-18 22:30:22 +01:00
Alexander Weiss 1ec628ddf5 Add extraObsolete to MasterIndex.Save 2020-11-15 07:05:09 +01:00
greatroar 21b787a4d1 Stop Counters where they're constructed and started 2020-11-09 13:03:31 +01:00
greatroar ddca699cd2 Replace restic.Progress with new progress.Counter
This fixes two race conditions while cleaning up the code.
2020-11-09 12:12:35 +01:00
Alexander Neumann a8d21b5dcf prune: Warn if no cache is present 2020-11-08 20:25:35 +01:00
Alexander Neumann 47277c4b4c Add comments, clarify computation 2020-11-06 20:23:30 +01:00
Alexander Weiss fd33030556 Use in-memory index to rebuild index in prune 2020-11-06 20:23:30 +01:00
Alexander Neumann 1ca60bccfb Refactor condition for MaxRepackBytes
Don't depend on the string (opts.MaxRepackSize) for the condition,
instead check if there's a (positive) limit configured.
2020-11-03 16:42:21 +01:00
Alexander Neumann 7f86eb4ec0 Move helper function 2020-11-03 16:42:21 +01:00
Alexander Neumann c1a3de4a6e Refactor max-unused calculation, add `unlimited` option
Add a callback to the PruneOptions struct which calculates the number of
bytes allowed to be unused after prune is done. This way, the logic is
closer to the option parsing code.

Also, add an explicit option `unlimited` for the use case when storage
does not matter but bandwidth and time do. Internally, this sets the
maximum number of unused bytes to MaxUint64.

Rework the documentation slightly so that no more "packs" are
mentioned and it talks about "files" instead.

Make it clear in the documentation that the percentage given to
`--max-unused` is relative to the whole repository size after pruning is
done. If specified, it must be below 100%, otherwise the repository
would contain 100% of unused data, which is pointless.

I had a hard time coming up with the correct formula to calculate the
maximum number of unused bytes based on the number of used bytes. For a
fraction `p` (0 ≤ p < 1), a repo with `u` bytes used, and the number of
unused bytes `x` the following holds:

      x ≤ p * (u+x)
    ⇔ x ≤ p*u + p*x
    ⇔ x - p*x ≤ p*u
    ⇔ x * (1-p) ≤ p*u
    ⇔ x ≤ p/(1-p) * u
2020-11-03 16:42:21 +01:00
Alexander Neumann f8c4dd7b1a Split packe rewrite logic into two case branches
The comma is too sublte, let's split this into two separate branches.
2020-11-03 16:42:21 +01:00
Alexander Neumann a5b80452fe Add comment that usedBlobs is modified 2020-11-03 16:42:21 +01:00
Alexander Neumann aff1e220f5 Split struct members, add comments 2020-11-03 16:42:21 +01:00
Alexander Neumann 095155d9ce Remove RepackSmall 2020-11-03 16:42:21 +01:00
Alexander Weiss b2f5381737 Make realistic forget --prune --dryrun 2020-11-03 16:42:21 +01:00
Alexander Weiss 7f9a0a5907 Reimplementation of prune 2020-11-03 16:42:21 +01:00
Michael Eischer 0c9efa9c2a Pass context to lockRepo 2020-10-09 22:39:06 +02:00
Michael Eischer 17995dec7a Unify progress bar of check and prune commands 2020-08-25 22:47:38 +02:00
Michael Eischer 08d24ff99e prune: Include ID of all missing blobs in error message 2020-08-16 11:36:14 +02:00
Michael Eischer 3ba19869be prune: Abort if any used blobs are missing
The previous check only approximately verified whether all required
blobs were found. However, after forgetting a few snapshots the
repository contains lots of unused blobs whose number can be sufficient
to make up for missing packs.

When coupled with a malfunctioning backend that temporarily returns broken
data this could cause restic to regard the corresponding packs as
invalid and thereby delete data that's still in use. This change lets
restic play it safe and refuse to delete anything if data is missing.
2020-08-16 11:34:01 +02:00
aawsome 0fed6a8dfc
Use "pack file" instead of "data file" (#2885)
- changed variable names, especially changed DataFile into PackFile
- changed in some comments
- always use "pack file" in docu
2020-08-16 11:16:38 +02:00
Michael Eischer 05116e4787 prune: Cleanup progress bar handling while repacking 2020-08-03 19:32:46 +02:00
Michael Eischer 04f79b9642 prune: Stop progress bar after searching used blobs 2020-08-03 19:31:49 +02:00
MichaelEischer 66d089e239
Merge pull request #2841 from aawsome/optimize-getUsedBlobs
Extract get used blobs in prune as separate function
2020-08-01 22:40:33 +02:00
Alexander Weiss 9762bec091 Use optimized getUsedBlobs in prune 2020-08-01 21:07:31 +02:00
Alexander Weiss 2ee654763b Delete files in parallel 2020-08-01 20:39:24 +02:00
Michael Eischer 184103647a FindUsedBlobs: merge seen into blobs BlobSet
The seen BlobSet always contained a subset of the entries in blobs.
Thus use blobs instead and avoid the memory overhead of the second set.

Suggested-by: Alexander Weiss <alex@weissfam.de>
2020-08-01 12:29:16 +02:00
Alexander Weiss 70347e95d5 disable index uploads for prune command
+ modifications of changelog
2020-06-12 09:24:38 +02:00
Erik Rigtorp 94f4f13388 Add documentation on exit status codes to man pages
This is step one to start defining useful exit codes for all the commands.
2020-02-12 23:09:26 +01:00
Alexander Neumann 1ab5703404 prune: Fix calculation for removed bytes 2018-08-14 22:06:05 +02:00
Alexander Neumann 663c57ab4d debug: Remove manual Str() call Log() 2018-01-25 20:49:41 +01:00
Alexander Neumann b0c6e53241 Fix calls to repo/backend.List() everywhere 2018-01-21 21:15:09 +01:00
Alexander Neumann 56fccecd06 prune: Repack mixed pack files 2017-09-24 21:54:53 +02:00
Stefan Völkel 7f927d4774 fix duplicate bytes in prune output 2017-09-22 10:07:24 +02:00
Alexander Neumann d7e644272f prune: Add plausibility check 2017-09-19 10:50:07 +02:00
Shayne Holmes 9eb39cef05 Capitalize short help commands
Unify existing Cobra help command, and git-help's style.
2017-09-11 09:32:44 -07:00
Alexander Neumann 6bc43a4198 manpage: Remove auto gen tag from man page 2017-08-06 21:31:01 +02:00
Alexander Neumann 23c903074c Move restic package to internal/restic 2017-07-24 17:43:32 +02:00
Alexander Neumann 6caeff2408 Run goimports 2017-07-23 14:21:03 +02:00
Alexander Neumann 83d1a46526 Moves files 2017-07-23 14:19:13 +02:00