High-quality text and image digitisation
Digitisation is the conversion of analogue media, such as paper or photographs, into a digital form. But how do you ensure high-quality digitisation of your archive and collections?
There are four issues, which we explain in this article, that determine digitisation quality:
- the location where digitised items are stored;
- the information (metadata) saved for the file;
- the visual quality of the capture;
- the quality of the file format used to save the image information.
Good to know before reading
You can read more about digitising text and image content – such as photos, posters and drawings – in this section. If you have other media, such as audio tapes or film reels, then go to the Digitising audio and video recordings section.
For your born-digital archive content (i.e. files created on a computer), please see the Digital storage section.
You can also outsource your digitisation assignments to a professional company, which will often have more in-house expertise and can deliver higher quality. Their expertise and quality should still be assessed against the issues listed above, however. You can find more information in Outsourcing a digitisation assignment.
The storage location
Imagine you spend months digitising your entire photographic collection, saving all the photos on your computer, and then your computer is stolen or you spill coffee on it! Or what if all your images somehow disappear following a system update?
All sorts of things can go wrong with your digitised archive, so you need to make sure your digital files are stored properly. At the very least, you need a good back-up strategy. Read more about this in the How do you make a back-up? section.

The description
A digital reproduction loses much of its value if you don’t know what the original is, or who made the copy and when. You should therefore keep full records of what has been digitised and where the original can be found.
It’s best to register or describe the collection before starting the digitisation process. Another option is to do this systematically during the digitisation process itself, but you need to make sure you work out exactly how you want to do this in advance. You can create the description in a spreadsheet such as Excel or in a database, for example. It’s best not to use Word or other unstructured text formats.
Ideally there will already be a finding aid or inventory for the collection, which you can use as the basis for registering your digitisation work. If there isn’t one but you want to start digitising anyway, always note what each file is and where the analogue source can be found in your collection, so you always know where the original is.
A digitisation spreadsheet
A spreadsheet can serve as an overview of your digitisation work and record all the links between originals and their digital reproductions. Keep a record of at least the following details:
| Column | Column contents |
|---|---|
| Unique number | A unique number ensures clear identification of the original and its reproduction. It is very important to include this number as part of the reproduction’s filename. It can also be added to the original (e.g. in pencil), and is often a combination of your inventory number with a serial number (e.g. for photo albums) |
| Type of document | If your collection contains different types of content, you can indicate this here, e.g.: "photo", "text document", "poster"... |
| Brief description | A brief description of the original content, e.g.: "Photo taken during a study trip to Prague", "Poster of a show at the Beursschouwburg"... |
| Place code | If the inventory number alone doesn’t provide sufficient information about where the original is located, you can record the location in this field, e.g. the number of the box where the original is kept. |
These columns are the minimum requirements and are sufficient to get started with the high-quality digitisation process. You can of course add columns of your own choice depending on the content or your requirements. The most typical columns are:
- start and end dates;
- the projects that the photo relates to (e.g. exhibitions for art houses, productions by performing arts organisations);
- the people in the image.
In general, the following rule applies: the simpler the registration, the smoother the digitisation process itself will be. Bear in mind that you can also add content descriptions after digitisation, based on the reproductions.
Think carefully about adding extra descriptive metadata in your spreadsheet or your inventory and placement list.
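As a minimal sketch of such a register, the Python snippet below writes the four minimum columns to a CSV file that any spreadsheet program can open. The field names, the example records and the `write_register` helper are assumptions made for illustration; adapt them to your own inventory.

```python
import csv
import io

# The four minimum columns from the table above (English field names are our own choice).
COLUMNS = ["unique_number", "document_type", "description", "place_code"]

# Hypothetical example records, invented purely for illustration.
records = [
    {"unique_number": "ORG_0012_001", "document_type": "photo",
     "description": "Photo taken during a study trip to Prague", "place_code": "Box 4"},
    {"unique_number": "ORG_0013_001", "document_type": "poster",
     "description": "Poster of a show at the Beursschouwburg", "place_code": "Drawer 2"},
]

def write_register(rows, handle):
    """Write the register as CSV and refuse duplicate unique numbers."""
    seen = set()
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        number = row["unique_number"]
        if number in seen:
            raise ValueError(f"duplicate unique number: {number}")
        seen.add(number)
        writer.writerow(row)

# Written to memory here; pass open("register.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
write_register(records, buffer)
register_csv = buffer.getvalue()
print(register_csv)
```

Rejecting duplicates at write time is one simple way to keep the link between each original and its reproduction unambiguous.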
The filename and folder structure for your reproductions
As well as a description in a spreadsheet or database, it’s also important to consider what filenames to use for your digital reproductions. As mentioned previously, there should always be a link between your filenames and the unique numbers in your spreadsheet. Each filename preferably starts with the unique number (possibly preceded by a unique code that refers to your organisation). You can then add further text or information after an underscore if desired.
See the Naming files and folders section for this.
When you’re digitising documents that require multiple reproductions (such as photo albums, books or magazines), you need to pay careful attention to the filenames. Make sure they display the correct page order.
The codes "-r" (recto) and "-v" (verso) are often used in filenames if you need to digitise both the front and rear of an original.
Magazines are even more complicated. They have volumes, issues and sometimes also supplements. You will therefore need to consider how to record this information logically in your filenames or folder structure. You shouldn’t have to rely solely on your spreadsheet to recreate the structure of the magazine.
It’s not difficult, but make sure the files are ordered and organised in a clear and logical way.
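The naming conventions described above could be sketched in Python as follows. The function name, its parameters and the four-digit zero-padded page number are assumptions for illustration, not a prescribed standard; only the unique number, the optional organisation code, the underscore and the "-r"/"-v" codes come from the text.

```python
import re

def reproduction_filename(unique_number, page=None, side=None, note=None,
                          org_code=None, extension="tif"):
    """Build a filename that starts with the (optional) organisation code and unique number."""
    name = f"{org_code}_{unique_number}" if org_code else unique_number
    if page is not None:
        # Zero-padding (an assumption) makes alphabetical order match page order.
        name += f"_{page:04d}"
    if side is not None:
        if side not in ("r", "v"):
            raise ValueError("side must be 'r' (recto) or 'v' (verso)")
        name += f"-{side}"
    if note:
        # Keep only letters and digits so the name is safe on any file system.
        safe_note = re.sub(r"[^A-Za-z0-9]+", "-", note).strip("-").lower()
        name += f"_{safe_note}"
    return f"{name}.{extension}"

print(reproduction_filename("0012", org_code="ORG"))                      # ORG_0012.tif
print(reproduction_filename("0012", page=3, side="v", org_code="ORG"))    # ORG_0012_0003-v.tif
print(reproduction_filename("0012", note="Study trip Prague"))            # 0012_study-trip-prague.tif
```

Without the zero-padding, page 10 would sort before page 2 in most file browsers, which is why the sketch pads page numbers.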
The visual quality of the recording
The visual quality of a recording starts with the recording equipment. The better your photographic device or camera, the better your images. But the better your equipment, the more knowledge you need as a user to be able to configure it correctly.
The sharpness that your scanner or photographic device achieves, and how precisely it reproduces colours, depends on how well it is calibrated (see below). Your reproduction environment has a strong influence too, especially if you’re taking photographs. You need to be able to control the brightness at all times. Many recordings will also need post-processing, such as straightening and cropping.
You can find a good basic guide for configuring equipment on the FARO website (in Dutch).

General rules
- Scan the entire document, leaving a margin of approximately 0.5 cm around the edge to prove that the full document has been digitised. You can always remove the margin again later, e.g. for publication.
- The image must have a resolution of at least 300 ppi at full size (ppi stands for ‘pixels per inch’). This means that 300 pixels are recorded for every inch of your document. The more pixels there are, the sharper the image is and the more you can zoom in without losing quality. If you’re digitising documents that you know will need to be magnified (e.g. passport photos and slides), the standard value of 300 ppi is not sufficient. If you want to magnify the document 2x as standard, choose 600 ppi; use 1200 ppi to magnify 4x, and so on.
- If you’re scanning or taking photographs in colour, choose a bit depth of 24 bit. This is the number of bits (zeros and ones) used to register the colour per pixel. The greater the bit depth, the greater the range of colours that can be saved.
- If you’re scanning or taking photographs in grey tones, choose a bit depth of 8 or 16 bit.
- Make sure the colours are captured and saved using a sufficiently rich colour profile. An RGB colour profile is usual for digitisation projects. The world of heritage mostly opts for the colour profiles ECI RGB v2 or Adobe RGB. Another common colour profile is sRGB, but do not use it for your archive or master files (see below), as the range of colours that sRGB can save is not rich enough.
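The arithmetic behind the resolution and bit-depth rules can be sketched in a few lines of Python. The function names are our own, and the size estimate covers uncompressed pixel data only; a real file will be slightly larger because of headers and metadata.

```python
CM_PER_INCH = 2.54

def required_ppi(magnification, base_ppi=300):
    """Scan resolution needed to keep base_ppi after enlarging by the given factor."""
    return round(base_ppi * magnification)

def pixel_dimensions(width_cm, height_cm, ppi):
    """Pixel width and height of a document of the given size at the given resolution."""
    return (round(width_cm / CM_PER_INCH * ppi),
            round(height_cm / CM_PER_INCH * ppi))

def uncompressed_size_bytes(width_px, height_px, bit_depth):
    """Approximate size of the raw pixel data: pixels x bits per pixel / 8."""
    return width_px * height_px * bit_depth // 8

# An A4 page (21.0 x 29.7 cm) scanned at 300 ppi in 24-bit colour.
width_px, height_px = pixel_dimensions(21.0, 29.7, 300)
size = uncompressed_size_bytes(width_px, height_px, 24)
print(width_px, height_px)                  # 2480 3508
print(round(size / 1_000_000, 1), "MB")     # 26.1 MB
print(required_ppi(2), required_ppi(4))     # 600 1200
```

Doubling the resolution quadruples the number of pixels, so the same A4 page at 600 ppi needs roughly four times the storage.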
Calibrate the recording equipment
You’re already on the right track if you follow the general rules above, but they’re not enough on their own. In order to create high-quality reproductions, you need to properly calibrate your recording equipment and screen. Environmental factors such as lighting also need to be optimal.
Unless you’re outsourcing to a professional, you need to be prepared to spend time delving into the subject matter to work it all out. (See the Outsourcing a digitisation assignment section for this.) Go through the user guides and do some experiments or seek advice and training if you want to do it yourself. Make sure you keep the aforementioned general rules in mind.
Calibrating recording equipment and achieving the required standards for high-quality digitisation is quite a technical affair. You can of course create reproductions of a decent standard (without attaching great importance to exact colour reproduction) if you don’t have time to delve into all this in more detail. A digital recording is better than no recording at all. But try to stick to the general rules.
Equipment
What kind of equipment should I buy? A scanner or a photographic device? If you buy good-quality equipment, you can meet the standards required for a high-quality scan in both scenarios in principle.
A scanner is often simpler for beginners to use, but a good photographic device usually offers more possibilities for taking good pictures because you can configure more parameters. Bear in mind that this is a steep learning curve and you need a good environment where you can control the light. Photos taken without good knowledge of photography or in poorly lit conditions result in worse quality images than those produced by scanners.
If you buy a scanner, make sure the software at least allows you to configure the resolution, bit depth and colour profile, and that the scanner can produce uncompressed TIFF files (see below).
Tip: read user reviews for the device, and seek advice from sellers or TRACKS partners.
Software
Good image-editing software for editing files and saving them in the correct file format (see below) is recommended. Adobe Photoshop is very well known and highly suitable. You can achieve a lot by combining it with Lightroom, a tool that lets you apply the same edits to multiple images at once. Capture One is another software package that professionals often use.
There are also free alternatives. GIMP is the free alternative to Photoshop. Examples of free software for batch-editing images (without us wishing to recommend these in particular) include XnView and FastStone Image Viewer.
The quality of the file format
Which file format should you choose: JPEG, TIFF or PNG? The answer is that you save your file in more than one copy. Create at least an archive file and an access file. You can also save the master file if you wish.
The archive file
The archive file is the copy in which you store all your information at the highest possible quality, without any risk of information loss. It serves as the back-up that you can always fall back on when you need the highest quality.
Choose uncompressed TIFF, of the type Uncompressed Baseline TIFF v6.0, for your archive copy. This file format takes up more storage space than the others, but it is the format used worldwide for storing high-quality image data. Make sure you do not select any compression for your archive files. Compression is often achieved by cutting out information that is not immediately visible to the eye, but whose absence does become visible when you edit the file in Photoshop (e.g. for publication in a book).
Make sure the colours in the archive file are encoded in the ECI RGB v2 or Adobe RGB colour space. You can set this in Photoshop.
Check the TIFF. Not every TIFF is a well-formed TIFF. The TIFF is created by your scanner’s software, and that software is written by people, so something can go wrong. Read here how you can check that your TIFF is structured as it should be.
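To give a rough idea of what such a check looks at, the Python sketch below reads the first image directory of a TIFF and reports whether the image is stored without compression. It is a simplified illustration using only the standard library, not a full validation against the Baseline TIFF 6.0 specification, and the function names are our own.

```python
import struct

# TIFF 6.0 tag numbers read by this sketch, and the two field types it understands.
TAGS = {256: "width", 257: "height", 258: "bits_per_sample",
        259: "compression", 277: "samples_per_pixel"}
TYPE_FORMATS = {3: "H", 4: "I"}  # SHORT (2 bytes) and LONG (4 bytes)

def read_tiff_basics(data):
    """Return selected tags from the first image directory (IFD) of a TIFF."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF: unknown byte-order mark")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF: magic number is not 42")
    (entry_count,) = struct.unpack_from(order + "H", data, ifd_offset)
    info = {}
    for i in range(entry_count):
        entry_offset = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, field_type, count = struct.unpack_from(order + "HHI", data, entry_offset)
        if tag not in TAGS or field_type not in TYPE_FORMATS:
            continue
        fmt = TYPE_FORMATS[field_type]
        if struct.calcsize(order + fmt) * count <= 4:
            value_offset = entry_offset + 8  # small values are stored inside the entry
        else:
            (value_offset,) = struct.unpack_from(order + "I", data, entry_offset + 8)
        values = struct.unpack_from(f"{order}{count}{fmt}", data, value_offset)
        info[TAGS[tag]] = values[0] if count == 1 else list(values)
    return info

def is_uncompressed(info):
    """Compression value 1 means 'no compression'; it is also the default if the tag is absent."""
    return info.get("compression", 1) == 1

def check_file(path):
    """Read a TIFF from disk (whole file in memory, fine for a sketch) and report the basics."""
    with open(path, "rb") as handle:
        info = read_tiff_basics(handle.read())
    return info, is_uncompressed(info)
```

Calling `check_file("scan.tif")` on one of your own files (the filename is a placeholder) returns the tag values and `True` if the image data is uncompressed. For real quality control, a dedicated validation tool will check far more than these few tags.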
The access file
The uncompressed TIFF is usually too large for everyday use and publication on the web. Use a JPEG copy for this instead. We call this copy the access file. The easiest way to create such a file is with software such as Adobe Lightroom or an alternative, which lets you batch-convert TIFFs to JPEG.
The master file
If you wish, you can also distinguish between a master file and an archive file. Both are high-quality TIFFs, but your master file contains the information unprocessed, exactly as it comes out of the scanner or camera. Your archive file is then an edited image, neatly straightened, cropped to the edge, and so on.
If you keep a master file, an archive file and an access file, you can be sure that you can use the capture for every possible purpose. It does mean that you have to store a large TIFF twice.
Read more
Credits
This article was originally based on a text by Wim Lowet (Vlaams Architectuurinstituut), in collaboration with Nastasia Vanderperren and Bart Magnus (meemoo).