Archiving emails: how and why?

Emails often contain important or useful information. To keep this information accessible, it's essential to archive and store important emails with the rest of your digital documents.
In this article, you’ll learn:

  • Why is it important to archive your emails?
  • How do you prepare your mailbox for archiving?
  • How do you archive your emails and what should you pay attention to?

Digitally archiving emails is a challenge for many organisations. Mailboxes are bursting with incoming and outgoing messages, and are increasingly becoming a repository for information and knowledge, partly because they can include all sorts of attached files. Emails are born-digital documents, and their digital nature is an essential characteristic that needs to be preserved.[1]

When storing emails, retaining the link between attachments and the email itself poses an extra challenge.


Why?

There’s little point in keeping the vast majority of emails in our inboxes for the long term. Their value is often very temporary and trivial. But it’s also not uncommon for important agreements to be made via email, or for conversations to occur that are relevant in the long term and for the entire organisation. This could include informal agreements – for example about copyright and usage rights – that aren’t formalised in a contract, or instructions from an artist on how an artwork should be constructed or displayed. In these cases, it’s important that the information doesn’t get locked away in an employee’s personal email account. Confidentiality of correspondence applies here, which means you’re not allowed to simply access a (former) colleague’s mailbox and – as an employer – you also can't store former employees’ mailboxes in their entirety. It is therefore crucial that important emails end up in a logical place in the folder structure, and that your organisation makes clear agreements about this.

If you want to temporarily keep a personal mailbox to ensure continuity after an employee leaves, it’s advisable to make agreements about this before the employment starts. This can be done in an appendix to the employment contract, but it's essential to also ensure your agreements align with Data Protection Authority legislation and guidelines.[2][3] Key points for such an appendix include:

  • employees must ensure that they place important content in the folder structure before leaving, reducing the need for the employer to access the mailbox;
  • a clear timeline indicating when this should happen;
  • employees no longer have access to their personal mailbox after leaving;
  • former employees’ mailboxes are deleted one month after their departure;
  • a description of the exceptional cases in which the employer can still access the mailbox, stating that this is possible for at most one month after departure.

For shared inboxes or for personal backup purposes, you might consider archiving entire mailboxes. You can find detailed instructions on how best to do this below.

Email clients, such as Microsoft Outlook or Apple Mail, use their own closed file format. This can mean that not all metadata is stored, and that emails become unreadable if the email client is no longer available. People are also increasingly using webmail services such as Gmail or Outlook (formerly Hotmail); if these services stop functioning or start charging (high) fees, you risk losing all your emails.

How?

Organise your emails

You can organise your emails by structuring your mailbox[4] according to your organisation's folder structure. (See Draw up an organisational plan/folder structure). If you also use your personal email address, you can create separate folders for your personal emails and emails that you've sent and received to perform work or other tasks for your organisation. This makes it easier to find emails again, and to save emails and attachments outside your email client in the right folders in your folder structure at a later date.

Clean up your mailbox

Your email archive is more accessible if you regularly clean up your mailbox. After all, email correspondence always contains a lot of dead weight – such as spam, adverts and news articles – which doesn't need to be kept. So cleaning up your mailbox makes it easier to find emails and ensures you only keep emails that are worth archiving. You can agree a selection list in your organisation to decide which emails need to be stored and which don't.[5] Emails that are sent or received as part of your organisation's work or other activities are kept; purely informative emails that don't have any direct link with your organisation can be deleted.

Use a suitable protocol

Emails are retrieved from the mail server for archiving, and you should use the IMAPS protocol for this. IMAPS is IMAP, a standard protocol[6] for retrieving emails from the mail server into an email application, run over an encrypted (and therefore more secure) connection. IMAP preserves the folder structure you use to organise your mailbox and doesn't delete emails from the server when they are retrieved.
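As an illustration of retrieving emails over IMAPS without altering the server copy, here is a minimal sketch using Python's standard imaplib module. The function name export_folder, the server name, the account and the output folder are all hypothetical, and the files are named after IMAP sequence numbers purely for simplicity; a real export would probably want more stable names.

```python
import imaplib
from pathlib import Path


def export_folder(conn, folder: str, out_dir: Path) -> int:
    """Save every message in one IMAP folder as a separate .eml file.

    `conn` is an already logged-in imaplib.IMAP4_SSL connection.
    Returns the number of messages written.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    # readonly=True opens the folder without changing any flags on the server.
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot open folder {folder!r}")
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError(f"search failed in folder {folder!r}")
    saved = 0
    for num in data[0].split():
        # BODY.PEEK[] asks for the complete raw message (headers, body,
        # attachments) without marking it as read.
        status, parts = conn.fetch(num, "(BODY.PEEK[])")
        if status != "OK":
            continue
        for part in parts:
            if isinstance(part, tuple):  # (response info, raw message bytes)
                (out_dir / f"{num.decode()}.eml").write_bytes(part[1])
                saved += 1
    return saved


# Usage (hypothetical server and credentials; IMAPS normally uses port 993):
#
#   with imaplib.IMAP4_SSL("imap.example.org", 993) as conn:
#       conn.login("user@example.org", "app-password")
#       export_folder(conn, "INBOX", Path("export/INBOX"))
```

Because the folder is opened read-only and the messages are only peeked at, the emails stay on the server exactly as they were.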

Store all the essential email properties

Authenticity and integrity are central concepts for archiving. Authenticity assures you that an archive item is what it claims to be, and integrity ensures that the content of an archive item is complete and true. The following elements need to be saved in order to preserve these two properties for emails:

  • Origin context: all the details that situate an archive item in relation to the archive creator's activities. This clarifies the subject or matter that the email relates to, its origin, and the mutual relationship between related emails, attachments and archive documents.
  • Structure: the relationships between the different components of an email (header, body[7] and attachments) and between related emails (e.g. when replying to or forwarding an email).
  • Content: the email subject, the text that is sent, and the attachments.
  • Appearance: layout is not an essential feature of emails, as it depends on the email client and the device on which you open the email. If an email has artistic value, or if the layout clarifies the structure or content of the message, it can nevertheless be important to save this aspect as well.

An email consists of a header, a body and, optionally, one or more attachments. The relationship between these components is an essential feature that needs to be retained.

You can store these essential features by choosing a file format that saves emails in accordance with the Internet Message Format (IMF, specified in RFC 5322). IMF is a standard format for transporting emails, which makes it possible for your email application to read any email that someone sends to you via their email client or webmail service, even if you don't use the same application.

You should therefore never print out emails as a way of archiving them: emails contain hidden metadata that you don't see when you open them in your email application. This metadata carries information about the elements you want to save, and you lose it when you print the email. You can also preserve the origin context of emails by filing them in a well-organised folder structure (see Organise your emails).

[Image: an email's hidden header metadata.]
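To make the hidden header metadata concrete, the following sketch parses a small message with Python's standard email package and lists the header fields that a mail application usually does not display. The sample message, its addresses and IDs, the function name hidden_headers, and the choice of which fields count as "visible" are all invented for illustration.

```python
from email import message_from_string, policy

# Invented sample message; every address and ID is a placeholder.
RAW = """\
Received: from mail.example.org by mx.example.net; Mon, 04 Mar 2024 09:15:02 +0100
Message-ID: <abc123@example.org>
In-Reply-To: <xyz789@example.net>
Date: Mon, 04 Mar 2024 09:15:00 +0100
From: Artist <artist@example.org>
To: Registrar <registrar@example.net>
Subject: Installation instructions
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Please hang the work at eye level.
"""

# Assumption: these are the fields a typical mail application shows by default.
VISIBLE = {"from", "to", "cc", "subject", "date"}


def hidden_headers(raw: str) -> list:
    """Return (name, value) pairs for header fields outside VISIBLE."""
    msg = message_from_string(raw, policy=policy.default)
    return [
        (name, str(value))
        for name, value in msg.items()
        if name.lower() not in VISIBLE
    ]


for name, value in hidden_headers(RAW):
    print(f"{name}: {value}")
```

Fields such as Received, Message-ID and In-Reply-To record the route a message took and how it relates to other messages, which is exactly the information a printout loses.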

Separate emails and attachments

When attachments are not sent in a file format suited to long-term storage, there's a risk of obsolescence. Emails written in HTML also sometimes reference images that are stored on an external web server to keep the message small, which means you can lose these images if you don't store them separately. You should therefore save images and attachments separately from the email in a suitable archiving format, but make sure that the relationship between the emails, images and attachments remains clear by using the same filename for the different components.
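One possible way to keep that relationship visible is to write each attachment next to the EML file under a name that starts with the EML file's own name. The sketch below does this with Python's standard email package; the function name split_attachments and the naming pattern (EML name, sequence number, original attachment name) are assumptions, not a prescribed convention. It only extracts the attachments as they are and does not convert them to an archiving format.

```python
from email import message_from_bytes, policy
from pathlib import Path


def split_attachments(eml_path: Path) -> list:
    """Write the attachments of one .eml file next to it.

    Each attachment is named '<eml name>_<nn>_<original name>' so that
    the link with the email stays visible in the folder listing.
    """
    msg = message_from_bytes(eml_path.read_bytes(), policy=policy.default)
    written = []
    for index, part in enumerate(msg.iter_attachments(), start=1):
        data = part.get_payload(decode=True)
        if data is None:
            # Multipart attachments (e.g. a forwarded email) have no single
            # byte payload; this sketch simply skips them.
            continue
        # Path(...).name drops any directory component from the filename.
        original = Path(part.get_filename() or f"attachment{index}").name
        target = eml_path.with_name(f"{eml_path.stem}_{index:02d}_{original}")
        target.write_bytes(data)
        written.append(target)
    return written
```

For an email saved as 2024-03-04_agreement.eml with one attachment called contract.pdf, this would produce 2024-03-04_agreement_01_contract.pdf in the same folder, while the EML file itself stays untouched.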

Choose a suitable file format

A suitable file format is a format that is standardised, with an open file specification, and which can be read by different applications, so you're not reliant on a particular software supplier. It is crucial that the file format stores all the essential email properties.

EML and MBOX are the de facto standards for storing emails.[8] These file formats save emails with attachments in accordance with the Internet Message Format (IMF) and can be opened by most email applications and text editors. EML saves emails separately (one email is one EML file), whereas MBOX can save an entire email archive (one email archive is one MBOX file), preserving the structure in which the emails are organised in the email client. Because MBOX stores an entire email archive in one file, it is difficult to store attachments and emails separately and still maintain the connection between them by using the filename. So try to use EML to store important emails in a logical place in your folder structure. MBOX can, however, be a useful format when you want to bulk export an email archive, for example when an employee leaves the organisation and wants to transfer all their emails.
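If you receive a bulk export as one MBOX file and later want individual EML files, the two formats can be converted with little effort. The following is a minimal sketch using Python's standard mailbox module; the function name mbox_to_eml and the numbered output filenames are hypothetical choices.

```python
import mailbox
from pathlib import Path


def mbox_to_eml(mbox_path: Path, out_dir: Path) -> int:
    """Write every message in an MBOX file as its own .eml file.

    Returns the number of messages written.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    # create=False: raise an error instead of silently creating an empty
    # mailbox if the path is wrong.
    box = mailbox.mbox(str(mbox_path), create=False)
    count = 0
    try:
        for key in box.keys():
            # get_bytes returns the raw message without the MBOX "From "
            # separator line, i.e. the content an .eml file holds.
            raw = box.get_bytes(key)
            (out_dir / f"{key + 1:05d}.eml").write_bytes(raw)
            count += 1
    finally:
        box.close()
    return count
```

The MBOX file is only read, so the original export can be kept alongside the individual EML files.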

Save emails permanently

Email clients and mail servers are not designed to store emails permanently, so emails that are worth archiving always need to be saved outside the email application. Some email clients and webmail applications provide an archiving function, but this is not permanent storage. Email clients use proprietary and compressed formats that can result in a loss of metadata and information. It is not clear how emails are archived with webmail, and you remain reliant on companies such as Google and Microsoft to manage your files.

With webmail applications, it’s not always possible to easily export emails. If you can’t export individual emails as EML or an entire mailbox as MBOX from your webmail application, it’s best to use an email client like Mozilla Thunderbird. This is a free and open-source email application that allows you to export to the EML and MBOX formats. Email clients have the advantage of allowing you to build a folder structure in the mailbox.[9]

Apart from that, the general rules for long-term digital storage apply. Always make sure that you use good back-up procedures and that you store different back-ups of your files in different (geographical) locations. Use checksums to safeguard the integrity of your files and check the files periodically, and keep a close eye on developments in file formats. This is particularly important for attachments because of the wide variety of file formats available. (See Storing your digital archive).
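The checksum step can be sketched as follows: compute a checksum for every archived file once, keep that list, and recompute it periodically to see whether anything has changed or gone missing. This is only an illustration using Python's standard hashlib module; SHA-256 is one common choice of algorithm, and the function names and the in-memory manifest are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file as a hexadecimal string."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large mailboxes do not fill the memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file under root (by relative path) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: dict) -> list:
    """List files from the manifest that are now missing or different."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

In practice you would store the manifest in a file next to the archive (and in your back-ups) and run the comparison on a fixed schedule; an empty result from changed_files means every listed file is still intact.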

Get started with some tools


Authors: Nastasia Vanderperren (meemoo) with help from Rony Vissers (meemoo), Bart Magnus (meemoo) and Pieter De Praetere

  1. See also: Richtlijn 1 van Edavid (in Dutch).
  2. https://legalnews.be/arbeid-sociale-zekerheid/de-mailbox-van-een-ex-werknemer-of-ex-medewerker-wat-zijn-de-dos-en-donts-reyns-advocaten/ (in Dutch)
  3. https://www.vanroey.be/en/what-to-do-with-a-personal-mailbox-of-an-employee-leaving-the-company/
  4. Here, "mailbox" refers to both the inbox and the outbox.
  5. A selection list is a document that determines which documents to keep or delete. You can find an example of a selection list for emails here (in Dutch).
  6. Protocols are rules that computers need to follow to communicate with each other. POP and IMAP are protocols for retrieving emails from the server. Internet Message Access Protocol, usually abbreviated to IMAP, makes it possible to synchronise emails so that you can look up your emails on all your devices – smartphone, tablet, laptop, computer, etc. POP, short for Post Office Protocol, typically removes the emails from the mail server when you retrieve them. IMAPS is IMAP over an encrypted connection, which makes it more secure.
  7. The body is the text field in an email, the section in which the sender writes their message.
  8. EML and MBOX are standardised and widely supported, but not open.
  9. See also: https://kadoc.kuleuven.be/2_erfgoed/22_uwerfgoed/handleiding_erfgoed_bewaren_beheren/digitaalorde#email (in Dutch)