February 10, 2004
More on Microsoft Metadata
Back on January 6th, I reported the release of Microsoft's "Remove Hidden Data add-in for Office 2003 and Office XP".
With Microsoft's track record, I was somewhat skeptical that such a free utility would live up to its hype. With that in mind, I cautioned:
"I mentioned the readme file so that savvy users could compare its functionality to other metadata removers on the market. Although it's free, I strongly suggest that you make sure this tool removes everything you need it to remove. If it doesn't, then I recommend obtaining a program that will do the necessary job rather than rely upon this free utility. Otherwise, it could create a false sense of security, which when relied upon can cause many of the same problems as not using a metadata remover at all. Still, if you do not currently have a metadata remover and use the Office XP or Office 2003 suites, then using this add-in is probably better than the alternative."
Microsoft recently posted "Known issues with the Remove Hidden Data add-in for Office 2003 and Office XP". Also, Microsoft's Knowledge Base Article 834427 provides more information on the types of data this add-in can remove.
Therefore, it's up to each person to decide whether or not this tool properly suits their needs, and how it stacks up against leading programs such as Payne Consulting Group's Metadata Assistant for Word, Excel and PowerPoint. If the Microsoft tool removes what you need it to remove, then it may be worth using. The problem is that many people are just not tech savvy enough to know how to determine this, hence my caution about false reliance on a metadata remover. My best advice is to convert Word documents to HTML or PDF before sharing them whenever you can, because as a general rule the converted files do not contain revisions and other Word metadata. If you must share or send MS Office files, then make sure they are properly cleansed before sending. As part of one's due diligence in this regard, I believe a bit of in-house testing is required. If you don't know how to do this, then I heartily recommend engaging someone who does, such as Donna Payne.
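For those wondering what a bit of in-house testing might look like, here is a rough Python sketch. It is only an illustration, not something taken from Microsoft's add-in or any commercial product, and the function names are my own. It assumes you know some strings that should not survive cleaning (an author name, a client name, a deleted phrase) and simply checks whether they still appear in the raw bytes of the cleaned file, either as single-byte text or as UTF-16 text, both of which can turn up in binary Office files.

```python
from pathlib import Path

# Encodings to try: single-byte Western text and little-endian UTF-16.
ENCODINGS = ("cp1252", "utf-16-le")


def find_leftover_strings(data, needles):
    """Return {needle: [encodings]} for every needle still present in data.

    The comparison ignores the case of plain ASCII letters.
    """
    haystack = data.lower()
    hits = {}
    for needle in needles:
        found = []
        for encoding in ENCODINGS:
            try:
                pattern = needle.encode(encoding).lower()
            except UnicodeEncodeError:
                continue  # this needle cannot be written in this encoding
            if pattern and pattern in haystack:
                found.append(encoding)
        if found:
            hits[needle] = found
    return hits


def check_file(path, needles):
    """Hypothetical convenience wrapper: scan one file on disk."""
    return find_leftover_strings(Path(path).read_bytes(), needles)
```

A call such as check_file("cleaned.doc", ["Jane Doe", "ACME"]) (the file name and strings are made up) returns an empty dictionary when none of the strings are found. Keep in mind that an empty result proves very little: a crude scan like this misses anything stored in compressed or otherwise encoded form, and it only looks for the strings you thought to list. A hit, on the other hand, is a clear sign that the cleaning step left something behind.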
A good example of why we need to understand and care about metadata is this intriguing article by Preston Gralla. Mr. Gralla, a noted technology author, outlines how savvy privacy experts were able to expose a supposedly valid high-level U.K. intelligence dossier about Iraq as little more than a "cut-and-paste job" from three publicly available articles, one of which had been written by a postgraduate student in the U.S. I've also read of similar approaches being used on college research papers and even attorneys' briefs to see who really wrote them and how much editing time was involved (cut-and-paste jobs take much less time than actual drafting), compared against the time billed.