Vint Cerf is sometimes called one of the "fathers of the Internet" for his role in developing the TCP/IP protocol suite that's in use on every Internet-connected device. So when he warns of a forgotten century of data, it's worth paying attention to. What's more, we've seen some of these dangers already, as Mac users — dangers we must stay vigilant for.
During a recent speech reported by The Guardian, Cerf warned that bit rot will lead to a forgotten generation, or perhaps a forgotten century, of data.
The bit rot that Cerf describes is what happens to data as it ages; software capable of reading it changes over time. Some software is discontinued. And we've seen it happen on the Mac, time and time again. Think Microsoft Word files are eternal? Talk to people trying to extract useful information from old FrameMaker templates or Aldus Freehand files.
These are more than just data files. They're the sum total of our creative output and our analytical output, our ability to interpret the world around us. Within data there is meaning and there is structure.
It is important to preserve what we do and how we do it for future generations, the same way we learn about the creation of the cotton gin, the harnessing of electricity, and the development of transcontinental railroads.
But this sort of preservation affects us on deeply personal levels, too.
A few weeks ago a customer came into the computer store I work at on the weekends. He was using a pre-Intel Macintosh: a PowerPC-based iMac that sported a copy of ClarisWorks, an integrated productivity software suite that Apple developed.
The Mac still worked, and his ClarisWorks database still contained files he needed: a mailing list of donors to a non-profit charity to which he dedicated his time. If he bought a new Mac, how would he access the files? ClarisWorks doesn't run on Intel Macs anymore, because Apple itself stopped supporting the technology that allowed it to work when OS X Lion came out in 2011.
The solution, it turns out, is using a free software program called LibreOffice, an office productivity suite that actually supports ClarisWorks file imports.
This might make for an interesting Mac how-to someday, but the point is this: There was a way to transform the data and make it usable again, but it was several steps removed from the owner of the data simply being able to double-click on the file and expect something to happen. If we hadn't been able to intervene, what would have happened to his fund-raising efforts? Would they have changed? Would he have recreated the database from scratch?
That's one very real danger associated with the type of bit rot that Vint Cerf is talking about. Cerf, who's now a vice president at Google, advocates the development of what he calls "digital vellum," to help preserve the way old software and hardware works for the benefit of future generations.
Moving from PowerPC to Intel processors on the Macintosh forced Apple's hand here: Eventually the technology to smooth out that transition, a translation technology called Rosetta, was deprecated in OS X. When that happened, people who needed access to apps that ran on PowerPC, and their datasets, were left behind.
Forced upgrades sometimes leave data behind
My ClarisWorks example is one case of Apple not doing the best job it can for customers. Certainly ClarisWorks is the narrowest of edge cases these days, but it's a real problem faced by a loyal Apple user.
Compare that to a more recent example, however: Apple discontinued iPhoto for iOS with the release of iOS 8. What's more, though, it prevented iOS 8 users from opening iPhoto at all, even if they owned it from a previous release.
Essentially, iOS 8 killed iPhoto and forced a messy transition to Photos. Text and layouts from certain projects you created with iPhoto were lost in the transition without any easy way to convert that data into another useable format.
Apple has shown that it's willing to impose short-term discomfort on its users in order to pursue a long-term strategy of iterative software and hardware improvement. That improves things for the widest possible audience, but it can have a profoundly disruptive effect on those of us who are not ready for the transition.
Apple must always, always be mindful of those people caught in the crossfire. Progress and profit should never get in the way of doing the right thing for the greater good.
Good post. To be honest, bit rot is something I never considered and it's kind of scary when you think about it. What do you think may be the ultimate solution to it?
Mostly an Apple platform problem. You don't see this kind of widespread abandonment on other platforms (I use them all, not being biased here). One reason Windows is so criticized is because it has limitations and struggles related to maintaining backward compatibility with many formats. Apple has always disregarded backward compatibility arrogantly and it is a much easier road for a hardware/software company to go when you don't have to worry about it!
Backwards compatibility that Vint Cerf is talking about has little to do with the OS, and everything to do with the applications. Peter's example of ClarisWorks is an Apple (software) on Apple (OS) on Apple (hardware) scenario, so a foul at any level can kill it. Microsoft has been pretty good about supporting backwards compatible file formats or import/upgrade options for applications like Access, Excel, MSSQL, Outlook, etc., but they are less forgiving in the software development areas like Visual Basic, .Net, and the like. MS forces developers forward just as hard as Apple does.
As someone working at Microsoft, your assessment is slightly off, if only recently.
We've started letting older tech go because of the myriad complications that have arisen from trying to support everything since the beginning of tech. The only way to be truly nimble in this industry is to let go of dying tech. Twenty-six people worldwide may be inconvenienced, but it's a better product for everyone.
Agreed. I recently came across some old 3.5-inch floppy disks from my college days that had papers I had written in WordPerfect, Works, and what I remember to be the first version of Office. I have an external 3.5-inch drive that still works and guess what, the disks still work, but Office 2002 on an older Windows PC can't even open any of the documents. Granted, these are not hugely sentimental, but the fact remains that this data is lost.
I have been a Windows expert and professional for 30 years, and involved in the Macintosh since the Mac 512. (No, not the 256...). I can assure you, bit rot is something that affects all computing, and even more so on Windows because of the proliferation of formats and software. I think what W3C has done with XML standards has made a great contribution towards interoperability. Adobe introduced Portable Document Format, but that is not a usable format for editing, only for publishing. What we need is standards-based document formats. In my current profession, the Engineering world has suffered from this for decades, but attempts to develop intermediate, "standard" formats like STEP have pretty much failed. Why? I believe because of several factors. One, the STEP format is somehow much more limiting than the native CAD / CAM / CAE formats. Another, the software does provide export and import from these formats, but it does not natively work in these formats. Further, the vendors have not forgone their own proprietary formats for the standardized one. I think, for something like a text document (word processor, slide, publication, etc.) this is really a much easier thing to accomplish - but how do you take established powerhouses like MS Office, Apple's office suite, LibreOffice and OpenOffice, and get them to natively support a standard they don't control? It seems to be something the software developers have to come to on their own.
Wow, that example of the guy with the old Mac running ClarisWorks is a real problem that a lot of people don't think about too much, because nowadays a lot of software makers continue to support their products as long as the product is still around (like Microsoft Office, for example), and people don't believe their computer files are going to be obsolete one day (I still have a couple of .doc files from almost 10 years ago).
Nice to think about it.
But I don't believe Apple is helping this mess. I believe, as you say in the article, that these are needed changes: sometimes bad in the short term, but ultimately changes we need to adapt to. It's that, or use something else. Also, no one can make everybody happy.
Hindering, for sure, but Apple is not the only guilty party. Apple is slightly more guilty than MS, in that Apple is always more ready to kill its own backwards compatibility, but over the long run that historians and archivists deal with, all proprietary formats are risky, be they Word, Keynote, or Photoshop. Standards alone are not enough, unfortunately, as we found in 2007 when MS rammed through OOXML as a "standard" despite the lack of other implementations or even a fully defined spec. Gruber had it right when he created Markdown - use a simple, openly specified format* if you care about it being preserved and readable for more than a few years.
* yeah, yeah, his specification has some holes that he is not willing to fill or even implement properly in his own parser; it is still a great idea with multiple solid implementations
Open formats are the answer. I have found floppy disks when cleaning up my office that had data from the early 1990s on them that I was able to get at because it was saved in an open format (ASCII text). Old examples for anyone who is not familiar:
ASCII txt - Words, punctuation, and maybe a carriage return/line feed if you are feeling frisky.
CSV - comma separated variable (or values depending on your religion) is the way to keep the raw data from a spreadsheet or database. It is actually just ASCII text with a structured text format.
PDF - can be images or docs, and Adobe has not really owned this for 20 years, so it's safe
JPEG - image
Good list, but has some really obscure ones:
If companies used, and users insisted on, open standards, open formats and open source tools, this wouldn't be a problem. But greed always wins the day the way the system is setup.
The ClarisWorks example makes me wonder who's really at fault though. I've worked in all-Mac environments for many years and had to help people out with ClarisWorks documents innumerable times. Now ClarisWorks was completely discontinued at one point and yet Apple still supported its files on the Macs for years after that. Then they stopped even doing that. Then because of public outcry over that decision, a special version of the program was created for no other reason than to allow the files to still be read on Macs. It literally had no new features at all other than it enabled people to work with their old files. Then they eventually discontinued that. I mean people who use ClarisWorks had literally YEARS of warnings from Apple about the program eventually going away. They had YEARS of special offers and free workshops in order to enable them to move to the new products. Then they got special treatment when they whined about it anyway. At the time that ClarisWorks was "king" also, a little thing called "Graphic Converter" was also on pretty much every Mac, and could translate Claris drawing files (the hardest ones to read nowadays), into ANY other format, for free, and instantly. Now it's years later and STILL there is some dude with ClarisWorks files he can't convert? He had his (multiple, good) chances IMO. I have no sympathy for seniors who still use ClarisWorks. These are users that would have the same problem regardless, because they don't listen, they don't update their software, and they basically don't really understand computers. It's nothing to do with Apple IMO (or Microsoft for that matter), it's a problem to do with these types of people who will literally always have a problem no matter what.
The problem with that line of reasoning is that you don't know you want something until you want it - or, if you are a historian, until you discover it. For example, the British have put together http://www.operationwardiary.org/, a collection of diaries from the front lines of World War 1's Western Front. They capture the horror, sacrifice, and beauty the common soldier endured, but this sort of historical treasure is only possible because they were in a format (paper) historians could understand decades later when they were recovered. The soldiers of the early 21st century may put down those experiences in Word, or Pages. 50 years hence, it is highly unlikely MS will be supporting Word 2011, or Apple iWork 09. That is what the article means by a lost century - not that you won't be able to open your documents, but that history may not be able to get a complete picture of our time, because so much of it is locked in formats that may not last.
You've nailed it pretty much. Most of the time it's the way people handle their data and keep their stuff up to date. A bigger problem than file formats is, at least in my own experience, the availability of pure physical access to old(er) storage media, e.g. Zip or Jaz cartridges by Iomega, floppy drives, interfaces to connect old(er) hard drive technology … things like that.
Use industry standard formats and you won't have an issue. Archive files as PDFs and use MS standard doc and xls file formats. These formats will probably outlive us all.
Would that be doc or docx, xls or xlsx or xlsm? : / )
'Bit rot' is certainly the wrong term here, as it means something different altogether (e.g. data corruption in non-ECC RAM or on storage media)... and that would indeed be a topic Apple should address at some point, as HFS+ is quite a bit behind in dealing with this issue, compared to e.g. ZFS or ReFS. As Gazoobee pointed out correctly, ClarisWorks users had something like a decade to migrate their data to other applications, or at least to established archival or interchange formats. A former colleague of mine is one of the world's most sought-after experts for the design of runway lighting systems. He still visits several airport conferences every year, still lugging around his Harvard Graphics presentations from the 1980s, and yelling at every technician who can't make them work... really no support for this nonsense here. And no, it is not better in Windows-land, unless you have been using the surviving products from the beginning. Try opening some SGML files from 1990s WordPerfect in any modern software, good luck... and SGML, unlike MS Word or Excel, IS a standard. Try most Lotus products, QuattroPro, dozens of lost and forgotten database formats that existed in the Windows hemisphere before MS and their illegal doings killed them one after the other. This 'problem' is everywhere. Sound advice on proper archiving has been around for almost three decades, but people's unlimited feeling of entitlement seems to prevail. Did the death of iPhoto kill any photos? Are the photos edited and saved in iPhoto still in industry standard image formats that can be opened in hundreds of applications? Great. No problem. What is the next world-ending problem? Xcode not supporting Algol, Fortran and Cobol?
See above -- it's history that needs the open format, not Joe User going to see their own files.
I had a similar experience some years ago. My father was a pianist who played in dance bands in the 1940s. Some of his rehearsals were recorded on the now defunct format of wire spools. Wire recorders predate tape recorders and used a spool of hair-thin steel wire moving through a magnetic head. I had the wire spool but no working playback device. It took me months of searching the Internet to find an individual who had the equipment and could transcribe the recordings onto an audio CD. I think the issue is not so much the file format but the physical format. The original article about this talked about what archeologists and historians would do a thousand years from now. Writings on stone tablets have preserved data for thousands of years. What happens to hard drives, USB sticks, DVDs and the like after a thousand years? Will the data continue to be transcribed to newer technology as the years pass? What if an archeologist digs up an SSD 500 years in the future? How will he/she read the data contained on it, no matter the file format?
I think Apple is actually taking the p!ss in this area. I went to open a document in Pages the other day and it won't open as it's too old for the latest version. That is a joke. It can't be that hard to include backwards compatibility for their own invented format. Pure laziness.
Almost every day you can read the laments of users who lose their photographs, their contacts, their calendars, their music to a hardware or software failure and do not have a backup. We still have the letters Civil War soldiers sent home to friends and family 150 years ago. Will we still have the love letters sent via email? Think about that.
The problem with the "you should keep supporting every format you've ever created until the end of time" argument, which is what this is saying, is that over time, that cost keeps going up, and will always go up. There's a lot of valuable data in HyperCard stacks that no one's thought about in years. Good luck reading it. But who pays for it? Someone with decades-old data is going to foot the bill? No. No one is going to be able to afford that, and the implication that Apple, Microsoft or anyone should just eat that ever-escalating cost is ridiculous. And that's assuming the company is in business. If a company tanks and no one buys, or cares about those old assets, how is that dealt with? Should there be some kind of law requiring the maintenance of decades-old software just in case someone else needs it? One of the things that anyone in the IT field has to deal with is storage updates. When you update your RAIDs, you have to deal with migrating old data to the new storage. When you upgrade your tape drives or backup software, you have to deal with this migration. Well, it's the same thing with old data. If you have old data in older formats, this is something that you, as the owner of that data, have to consider. Is that data valuable enough to you to convert it to something that is supported? Open formats are not a panacea either, as the SGML example shows. SGML is an open format, but if you've no software that can read it correctly, then exactly what good is it? PDF is an open format, but that only benefits you as long as someone with the skill cares to maintain PDF reader applications. If you don't have that skill, how much are you going to pay someone to do that? Assuming someone will spend a non-trivial amount of time on doing that is a fool's errand. "Apple should..." or "Microsoft should..." sounds great, but it's not *Apple's* data, it is not *Microsoft's* data. It is your data or my data and at some point, *we* have to accept some responsibility for it.
Great that you address the problem of digital preservation. There is actually a community around this problem. There is a yearly international conference (iPRES) on the topic, and there are several research projects about it. I work for the German National Library, and cultural heritage institutions all over the world are struggling to maintain access to old digital items. Emulation and file format migration are the usual strategies to deal with it. But companies like Apple are not very helpful in preserving access to old digital objects. There is no emulation for iOS apps. As of now, content in iOS apps is not collected and preserved, not in the Library of Congress, not in the British Library, because there is no technical solution to archive these apps and maintain access.
It's not just the software. I noticed that my AirPort Express and AirPort don't work with the latest OS X. I'm lucky that I have my pre-Intel PowerPC to control my home wifi. The thing that worries me is that Apple keeps changing its OS every year and as a result makes perfectly good equipment obsolete. I now need a new AirPort and a new Express if my old PowerPC dies.