Email management and archiving

Users facing email records.
The “email management and archiving” working group at CR2PA questions vendors, 2010.

What is this?

E-mail is a complex issue, given the privacy protection constraints that apply to it and its daily impact on countless users who have long grown accustomed to their messaging systems!

Messaging systems have been optimised for individual usage. Organisations must comply with these standards. How can we ensure that relevant e-mail becomes a shared record and a genuine corporate asset?

In 2009 the group working on managing and archiving e-mails at the CR2PA published a white paper: Users facing email records. The 2009-2010 work plan was intended to share this statement of user requirements with a set of software vendors in order to compare it against their messaging and archiving offers.

Offer summary

There are many offers in records management and archiving, including for e-mails.

In 2010 the French e-mail archiving market has yet to mature, with few reference deployments in French organisations.

Offers combine existing solutions that reflect each vendor's origins (messaging, ECM, search engines, archiving, storage infrastructure, etc.).

Vendors therefore have varied strategies:

  • transversal archiving framework or specialist silos (Symantec)
  • global coverage of the document life cycle (IBM, EMC, Open Text) or specialisation in recordkeeping including paper (Klee) and structured documents (RSD)
  • intra-company system or outsourcing (Google)
  • autonomous archiving module for e-mails (Autonomy, Microsoft) or connected to an ERMS.

Offers address several end purposes at the same time:

  • individual preservation
  • archiving driven by eDiscovery (defensive)
  • optimisation of the information system through archiving (technical)
  • records management approach, although a distinction must be made between offers capable of simple / repetitive preservation rules and those capable of complex ones. Vendors are increasingly capable of handling such rules. All that remains is for records managers to establish them!

In sum, what still needs to improve is day-to-day assistance to users (for sent or received e-mails):

  • identification of the e-mail as a record or as working information
  • choice of retention and preservation rules depending on the content
  • classification by context and content.

Semantic analysis is on the horizon and opens up interesting technological possibilities. At the same time, functions relating to e-mail templates are insufficient for records management and archiving!

Our statement of the requirements

An e-mail, like almost any information, may provide evidence of business transactions and be required by:

  • the organisation in relation to another organisation, a private individual or an employee
  • a unit in the organisation in relation to another
  • an employee in relation to his or her unit or organisation.

Functions expected by users

1. Assistance in sorting / identifying / capturing

Expectation: create e-mails from templates; create them within a system so that they inherit its context (sending from a business system, drag and drop into a collaborative “business” file); semantic content analysis (help in identifying record value – taxonomy of sensitive terms?) / indexing

2. Secure e-mail storage

Expectation: storage on the server limited by time period (not by quota), keeping each e-mail complete, “as sent or received”

3. Continuous mailbox purging

4. Streamlining with document and records management systems

The right technical approach should therefore:

  • connect individual management of e-mails with the organisation’s information and records management processes
  • preserve the fluidity of the messaging system user’s “gesture”:
    – help with the decision: record or non-record?
    – automated addition of descriptive information concerning context / classification / techniques.

What CR2PA discovered among software vendors

1. Identification and Capture

We found that the task is handled either automatically by the system or manually by the user:

  • automation is based on rules and above all on the “technical” characteristics of the e-mail (date, recipients, size, etc.), not or only slightly on the content (Google, Symantec)
  • manually, the user adds “tags” to qualify or moves / files in a directory that indicates the content (Open Text, EMC).
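The automatic, rule-based approach described above can be sketched as follows. This is only an illustration using Python's standard `email` package, not any vendor's actual mechanism: the sample message, the `technical_profile` and `should_capture` helpers, the watched address and the size limit are all hypothetical. It shows a capture decision taken purely on “technical” characteristics (date, recipients, size), without looking at the content.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical sample message, used only for illustration.
RAW = """\
From: alice@example.org
To: bob@example.org, legal@example.org
Date: Mon, 04 Jan 2010 09:30:00 +0100
Subject: Contract draft

Please find the draft attached.
"""


def technical_profile(raw: str) -> dict:
    """Extract the technical characteristics a capture rule can rely on."""
    msg = message_from_string(raw)
    recipients = [addr for _, addr in getaddresses(msg.get_all("To", []))]
    return {
        "date": parsedate_to_datetime(msg["Date"]),
        "recipients": recipients,
        "size": len(raw.encode("utf-8")),
    }


def should_capture(profile: dict,
                   watched=frozenset({"legal@example.org"}),
                   max_size=10_000_000) -> bool:
    """Assumed rule: capture when a watched recipient is present
    and the message is under the size limit. The content is never read."""
    has_watched = any(r in watched for r in profile["recipients"])
    return has_watched and profile["size"] <= max_size


profile = technical_profile(RAW)
decision = should_capture(profile)
```

Such a rule never reads the body of the message, so it cannot tell a record from working information when both share the same technical characteristics.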

At this stage, all software vendors (except Microsoft) carry out a technical duplicate check on attachments and place shortcuts in the user mailboxes concerned.
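One common way to implement this kind of check is content hashing. The source does not say how the vendors do it, so the sketch below is an assumption: a hypothetical `AttachmentStore` keeps a single copy of each attachment, keyed by its SHA-256 digest, and hands back that digest as the “shortcut” to place in each mailbox.

```python
import hashlib


class AttachmentStore:
    """Hypothetical single-instance store: one copy per distinct attachment."""

    def __init__(self):
        self._blobs = {}  # digest -> attachment bytes

    def archive(self, content: bytes) -> str:
        """Store the attachment once and return its shortcut (the digest)."""
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)  # no-op if already stored
        return digest

    def resolve(self, shortcut: str) -> bytes:
        """Follow a shortcut back to the archived attachment."""
        return self._blobs[shortcut]

    def __len__(self) -> int:
        return len(self._blobs)


store = AttachmentStore()
# The same file received in two mailboxes yields the same shortcut.
shortcut_a = store.archive(b"quarterly report, version 1")
shortcut_b = store.archive(b"quarterly report, version 1")
shortcut_c = store.archive(b"quarterly report, version 2")
```

Two byte-identical attachments share one stored copy, while any difference in the bytes produces a separate entry.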

What is missing: content / context analysis for assistance in identification.

Vendors want to involve the user as little as possible or even not at all, which means not using the content – a decisive factor for retention rules!

What does emerge, however, is the categorisation of contents (Autonomy, RSD, etc.) based on semantic analysis.

Original ideas:

  • complete archiving / messaging integration (Microsoft, IBM),
  • “scan” PCs to recover current files (pst, nsf): Open Text (pst), etc.
  • periodic transfer (Klee) as a continuation of the “gesture” for paper archiving.

2. Retention and preservation

We found: this task is completely handled (an essentially technical approach):

  • storage optimisation by the archiving solution (Symantec)
  • integrity of the archive object
  • continuity of the e-mail format: the native format is kept (a condition for restoring the e-mail with its full original appearance).

Original ideas: to ensure long-term readability of e-mails in Notes format, retain an EML version (RFC 2822) in parallel (RSD, Klee) or provide a universal viewer (Autonomy).
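To illustrate why a parallel EML copy helps readability, the sketch below writes a message in the plain-text Internet message format and reads it back using only Python's standard `email` package. It does not convert from Notes format, which would require vendor tooling; the `to_eml_bytes` and `read_eml` helpers and the sample values are hypothetical. Note that RFC 2822 has since been superseded by RFC 5322, which is the format Python's library targets.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def to_eml_bytes(sender: str, recipient: str, subject: str, body: str) -> bytes:
    """Serialise a message as EML: plain-text headers followed by the body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg.as_bytes()


def read_eml(data: bytes) -> EmailMessage:
    """Parse EML bytes back into a message object, with no proprietary client."""
    return BytesParser(policy=policy.default).parsebytes(data)


# Hypothetical sample values, for illustration only.
eml = to_eml_bytes("alice@example.org", "bob@example.org",
                   "Quarterly report", "See figures attached.")
restored = read_eml(eml)
```

Because the EML copy is plain text, it remains readable with generic tools even if the original messaging client is no longer available.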

3. Restoration

We found the following depending on the various usages:

  • restitution of e-mails to the user (RSD)
  • centralised search using a search engine or by browsing (Open Text, Microsoft, IBM, EMC).

This requires collective access management rules to be adjusted over time (and depends on reorganisations).

Original ideas: reconstruct the exchanges surrounding an e-mail (Autonomy).

4. Destruction

Destruction (except in case of hold) depends on the end purpose and offers are tailored to each:

  • individual archiving: the user deletes his own files, but e-mails are not necessarily destroyed
  • systematic archiving using technical criteria (Google): retention period is set for all e-mails (approx. five years)
  • archiving depending on the content: destruction is performed according to records management requirements.

What is missing: detailed management of destructions using records management requirements, which may be complex (combinations).
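As a rough illustration of what content-dependent destruction involves, the sketch below combines a retention period per category with a hold flag. It is a minimal example under stated assumptions, not a description of any vendor's product: the `ArchivedEmail` type, the categories and the retention periods are all hypothetical, and real records management rules may combine many more criteria.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention schedule, in years, per content category.
RETENTION_YEARS = {"contract": 10, "working": 1}
DEFAULT_YEARS = 5  # assumed fallback for uncategorised e-mails


@dataclass
class ArchivedEmail:
    archived_on: date
    category: str
    on_hold: bool = False  # a hold suspends destruction


def expiry_date(email: ArchivedEmail) -> date:
    """Date from which the e-mail may be destroyed."""
    years = RETENTION_YEARS.get(email.category, DEFAULT_YEARS)
    try:
        return email.archived_on.replace(year=email.archived_on.year + years)
    except ValueError:
        # 29 February archived in a leap year: fall back to 28 February.
        return email.archived_on.replace(year=email.archived_on.year + years, day=28)


def destruction_due(email: ArchivedEmail, today: date) -> bool:
    """An e-mail is due for destruction once expired, unless it is on hold."""
    if email.on_hold:
        return False
    return today >= expiry_date(email)
```

For example, `destruction_due(ArchivedEmail(date(2010, 1, 4), "working"), date(2011, 1, 4))` evaluates to true, while the same e-mail with `on_hold=True` is kept regardless of its age.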