My understanding is that the only reliable way of long-term digital archival storage is to refresh the media you are storing things on every few years, copying the previous archives to the fresh storage.
Related, CD-Rs. When I left my submarine in 2013, they (by which I mean the entire Virginia class) were still using them to store archived logs, despite my explanation that they’d be lucky to get a decade out of them. The first chosen storage location was literally the hottest part of the engine room, right in between the main engines. Easily 120+ F at all times. After protest, we moved ours to a somewhat cooler location. Still hot, and still with atmospheric oil and other fun chemicals floating around.
Policy at $Job - all important data is backed up to a rotation of high-quality hard drives, which are stored off-site, powered down. Every N weeks, each one of them is powered up (in an off-line system) and checked, both with the SMART long test and `zpool scrub` (which verifies ZFS's additional anti-bit-rot checksums for the data).
simonw ·126 days ago
Since storage constantly gets cheaper, 100GB first stored in 2001 can be stored on updated media for a fraction of that original cost in 2024.
sgarland ·126 days ago
I look forward to the first time logs from a few decades ago are required, and the media is absolutely dead.
EDIT: they weren’t even Azo dye, they were phthalocyanine. A decade was probably generous.
Clamchop ·126 days ago
1. Incomplete copies with missing dependencies.
2. Old software and their file formats with a poor virtualization story.
3. Poor cataloging.
4. Obsolete physical interfaces, file systems, etc.
5. Long-term cold storage on media neither proven nor marketed for the task.
Managing archives is just a cost center until it isn't, and it's hard to predict what will have value. The worst part of this is that TFA discusses mostly music industry materials. Outside parties and the public would have a huge interest in preserving all this, but of course it's impossible. All private, proprietary, copyrighted, and likely doomed to be lost one way or another.
Oh well.
lizknope ·126 days ago
https://en.wikipedia.org/wiki/Linear_Tape-Open#Generations
LTO-1 launched in 2000 and the current LTO-9 spec is from 2021, but LTO drives are only backward compatible for 1 to 2 generations. You can't read an LTO-6 tape in an LTO-9 drive.
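To make the compatibility window concrete, here is a minimal sketch of the read rule as I understand it from the published generation table: drives up to LTO-7 read two generations back, while LTO-8 and LTO-9 drives read only one. The function name `can_read` is made up for illustration, and the rule is only encoded for generations 1 through 9.

```python
def can_read(drive_gen: int, tape_gen: int) -> bool:
    """Return True if an LTO drive of generation drive_gen can read a
    tape of generation tape_gen (hypothetical helper, LTO-1..LTO-9 only)."""
    if not (1 <= drive_gen <= 9 and 1 <= tape_gen <= 9):
        raise ValueError("this sketch only covers LTO-1 through LTO-9")
    if tape_gen > drive_gen:
        return False  # a drive never reads tapes newer than itself
    # Drives up to LTO-7 read two generations back; LTO-8 and LTO-9 read one.
    read_back = 2 if drive_gen <= 7 else 1
    return drive_gen - tape_gen <= read_back


if __name__ == "__main__":
    for tape in range(1, 10):
        print(f"LTO-{tape} tape in LTO-9 drive: {can_read(9, tape)}")
```

The practical upshot is that an archive on tape has to be migrated every couple of generations, or you have to keep old drives alive alongside the tapes.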
https://en.wikipedia.org/wiki/Sticky-shed_syndrome
> Sticky-shed syndrome is a condition created by the deterioration of the binders in a magnetic tape, which hold the ferric oxide magnetizable coating to its plastic carrier, or which hold the thinner back-coating on the outside of the tape.[1] This deterioration renders the tape unusable.
Stiction Reversal Treatment for Magnetic Tape Media
https://katalystdm.com/digital-transformation/tape-transcrip...
> Stiction can, in many cases, be reversed to a sufficient degree, allowing data to be recovered from previously unreadable tapes. This stiction reversal method involves heating tapes over a period of 24 or more hours at specific temperatures (depending on the brand of tape involved). This process hardens the binder and will provide a window of opportunity during which data recovery can be performed. The process is by no means a permanent cure nor is it effective on all brands of tape. Certain brands of tape (eg. Memorex Green- see picture below) respond very well to this treatment. Others such as Mira 1000 appear to be largely unaffected by it.
Data migration and periodic verification are the answer, but that takes more money, because you have to hire people to actually do it.
I've got files from 1992 but I didn't just leave them on a 3.5" floppy disk. They have migrated from floppy disk -> hard drive -> PD phase change optical disk -> CD-R -> DVD-R -> back to hard drive
I verify all checksums twice a year and have 2 independent backups.
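For anyone wanting to try the checksum routine, here is a minimal sketch of one way to do it, not necessarily how it is done above. It assumes SHA-256 and an in-memory manifest keyed by relative path; `sha256_of`, `build_manifest`, and `verify` are names made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose hash changed."""
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            bad.append(rel)
    return bad
```

Build the manifest once when the archive is written, store it alongside each copy, and run `verify` against every backup on whatever schedule you can sustain. A non-empty result tells you which files to restore from another copy.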
bell-cot ·126 days ago
Yes, it's a bit of a PITA. OTOH, modern HDs are huge, so relatively few are needed. And we've lost 0 bits of our off-site data in our >25 years of using that system.
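A rough sketch of what one power-up check could look like when scripted, assuming `smartctl` (smartmontools) and OpenZFS's `zpool` are installed. The device path and pool name are placeholders, `check_commands` and `run_checks` are hypothetical helpers, and this is not the actual procedure used at $Job.

```python
import subprocess


def check_commands(device: str, pool: str) -> list[list[str]]:
    """Commands for one power-up check of an off-site drive.
    device and pool are placeholders supplied by the caller."""
    return [
        ["smartctl", "-t", "long", device],  # start the SMART extended self-test
        ["zpool", "import", pool],           # attach the pool on the off-line box
        ["zpool", "scrub", pool],            # re-read all data, verify checksums
        ["zpool", "status", "-v", pool],     # show scrub progress and any errors
    ]


def run_checks(device: str, pool: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them in order."""
    for cmd in check_commands(device, pool):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_checks("/dev/sdX", "offsite")  # dry run: only prints the commands
```

Both the SMART long test and the scrub run in the background, so a real script would have to wait and then read the results (`smartctl -l selftest` for the drive, `zpool status` for the pool) before powering the drive back down.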