Records Matter, Declaration Doesn’t – Revisited

This is a response to comments that Jürg Meier made recently on something I posted a while ago. Jürg is a very smart and personable guy, whom I had the pleasure of meeting in person at ARMA Switzerland’s inaugural event on November 29, 2011. I urge you to check him out on Twitter and here, where he works.

I was going to simply reply to Jürg’s comments on my blog, but I figured that the points he brought up are pretty substantial and would be of interest to a broader audience. I asked Jürg if I could paste his comments into a post and respond to them. You’re reading this so either: a) Jürg agreed; or 2) I’m in deep doo-doo.

From the original post: “Users know what business process they’re involved in. …”

JM: Chris, not sure here. What about knowledge workers, who “often advance the overall understanding of that subject through focused analysis, design and/or development” (Wikipedia). Are they in a business process? Perhaps, but more often than not in a very large one, like a product development, an IT or marketing project. These people send email, Word and PowerPoint docs back and forth, take notes. Notes? Karl Alexander Mueller, Nobel Prize winner in physics 1987, discovered a material for high-temperature superconductors. He took the decisive note a few years earlier at a congress – on a single page of his pocket notepad.

MoReq2010 comes up with a similar example. In the introduction chapter, they show a handwritten shopping list and the resulting cash register receipt. They consider only the latter as a record.

I would say that regardless of what the duration or intended outcome of a process is, it’s still a business process with measurable business objectives. Projects and cases (as in case management) cross multiple business processes and can last several years. Product development takes ages, is complex, involves large numbers of people and huge volumes of content. However, it can still be tied to business processes and the participants (usually) know what they’re doing. In that type of scenario I think I would recommend using a case aggregation for the users to plunk their content into, and applying appropriate retention to the aggregate.

I would assume that Müller knew what he was working towards when he wrote the decisive note in his pocket notepad (how different is the pocket notepad from a tablet these days?). If my assumption is correct then it stands to reason that the note is part of the research documentation, which must be filed and retained. The big question in my mind is related to ownership of the intellectual property; does it belong to Müller, to IBM, or to both?

Another question that I have concerns an outcome that is unintended, but beneficial nonetheless. I’m sure we all remember what Sildenafil was originally intended for, and what its current use is. What, if anything, are the impacts on categorization and retention? Research & knowledge based processes are really tricky to deal with, but I think the key is that you can apply (business) rules & automation to the mundane aspects, use aggregations to capture the content, and let the participants do what they are engaged to do. I would certainly rather have medical / pharma researchers figuring out cures than worrying about where to file something.

The MoReq2010 shopping list / receipt example is analogous to an order / invoice example. Each document provides a part of the complete picture, and therefore is required. I also think that particular example is nonsensical unless for some official reason (e.g.: personal taxes) you need to hold on to the receipt. Frankly, I need to keep the list to prove to my wife that I didn’t bugger something up when she “let” me go shopping for her.

JM: In my experience, it is really a question of who will consume the information. There are the usual suspects:
– business
– legal
– long-term (historical) archive

As you pointed out during your speech at the Swiss ARMA Chapter inaugural meeting, different people have different views on the same information. So, it would be compelling to classify information multiple times by different consumers… and I’m inclined to say: as late as possible. Only if we know the purpose of the classification can we do it right. E.g. legal actually only know what they are looking for once litigation arises. By then though, they know very well what they need.

But what’s wrong with classifying as soon as possible, and adding further classifications as they are identified? The classification with the longest retention drives how long any content needs to be kept. This only works when classification and retention/disposition are segregated. In litigation situations simply applying a hold / freeze will do the job. There’s no reason to apply additional classification to the content because you create a legal case file aggregation and dump the content into it.
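
To make the rule concrete, here is a minimal Python sketch of "longest retention wins, and a hold trumps everything". It is only an illustration under my own assumptions: the classification names, the retention periods, and the Item, add_years and dispose_after names are all made up for the example and come from no standard or product. I also assume that unclassified content is simply kept until somebody classifies it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# Hypothetical retention schedule: classification -> years to keep.
# None stands for permanent / archival retention.
RETENTION_YEARS = {
    "transitory": 1,
    "project-file": 7,
    "financial": 10,
    "archival": None,
}


@dataclass
class Item:
    name: str
    created: date
    classifications: set = field(default_factory=set)
    on_hold: bool = False  # legal hold / freeze, kept separate from classification


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; 29 February falls back to 1 March."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, month=3, day=1)


def dispose_after(item: Item) -> Optional[date]:
    """Earliest disposal date, or None if the item must be kept for now."""
    # A hold overrides every schedule; unclassified content is kept (assumption).
    if item.on_hold or not item.classifications:
        return None
    periods = [RETENTION_YEARS[c] for c in item.classifications]
    if None in periods:  # any permanent classification wins outright
        return None
    return add_years(item.created, max(periods))  # longest retention drives


memo = Item("budget memo", date(2012, 1, 15), {"transitory"})
print(dispose_after(memo))  # 2013-01-15

memo.classifications.add("financial")  # a second classification, added later
print(dispose_after(memo))  # 2022-01-15

memo.on_hold = True  # litigation: freeze it, no reclassification needed
print(dispose_after(memo))  # None
```

Note that the hold is a flag on the item rather than one more classification, which is the segregation I'm talking about: lifting the hold puts the item straight back on its normal schedule.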

Content that has archival value, but no risk is easy – just keep it. I mean, I know we’ll want to keep all of my blog posts for the next 300 years or so. :-) It’s tougher when content that has archival value has some potential risk associated to it (privacy issues, legal exposure). I think at that point it’s really a judgement call. Frankly, I’m in favour of preserving because I’d like to think that sometime in the future there are going to be people that are interested in what we’ve been thinking and doing, and that the information they want is available. I’m also hoping that we’re not so stupid that we evaluate everything in terms of whether or not we’re going to get sued.

JM: However, the case of “late classification” does not answer one key question: for how long should we retain? The only reliable basis here is law and the retention schedule. And for that, by nature, we must classify upfront. That isn’t too difficult for “real business processes” (e.g. selling a ticket), but becomes tricky with output from knowledge workers. Here, to some extent, we need their support. Classifying draft/final is a good start, formally assigning it to a project would be very helpful, as well as identifying ownership and the document type.

For the most part I agree with this paragraph. For the knowledge workers, especially those that are involved in a lot of trial and error, I think we can come up with some reasonable classifications and retentions for them to use. Imagine how different things would be if the people that were working on Sildenafil tossed everything away once they realized they weren’t going to achieve what they set out to do.

3 Comments on “Records Matter, Declaration Doesn’t – Revisited”

  1. Chris,

    I’d like to come back to the essential statement from your original blog entry, which states “we can argue that all business content qualifies as a record by virtue of it being business content”.

    Thinking about it, the simplest approach would of course be to declare anything a record outright. You are right, if something has been created outside of a business activity, what justifies the existence of that information asset on a business system? Taking that thought to the extreme leads us to the backup-as-archive perspective. And we know that’s something we strongly discourage customers from doing, for many good reasons, one of them being the sheer volume of data and the huge problems in restoring and retrieving records.

    Again in your original post, you’re coming up with the following table, on which you don’t expatiate further (at least in that post):

    From a retention management perspective there are three types of records:
    1. Permanent / Archival;
    2. Long term temporary (>= 1yr, < forever);
    3. Transitory (< 1yr).

    So assuming that "everything is a record", my take on this is that you would retain any information asset for up to one year, then eventually decide (classify) whether it is worth moving to the long term temporary category (2). Well, that would make sense in situations like Müller's note on oxides, which led after several years to a Nobel Prize but had an equal chance of ending up in the paper tray as a mistaken research assumption, for instance.

    Cheryl McKinnon tells a nice story in one of her blog posts about a 19th century adventurer and his records. One of her key statements: "Time is what can transform the ephemeral, transitory content in our systems of engagement to something of value in a system of record."

    It looks compelling to classify info assets into retention buckets upfront, and keep the huge amount of transient assets (coming out of systems of engagement) for some time (e.g. < 1yr). Eventually, some of them grow into "real records" (category 2 above) and even make it to historic archives.

    However, the complexities of doing so are many and large. I have repeatedly heard from firms specializing in search and classification that when filtering important or relevant emails from Exchange servers, for example, they reach at most 70% accuracy, and that only if private emails have already been sorted out and the system has been trained.

    Thus, sorting the sheep from the goats might be harder than expected. Algorithms are not (yet) reliable enough, and for humans the volume is simply too large.
    So from that perspective, keeping everything forever might really be an option. As you state, in the Web world we expect anyway that any link in our bookmarks will still work even after decades.


