Validating TIFF files with DPF Manager


Validating digital files allows you to check whether they meet their file format’s quality standards. For TIFF files, you can use DPF Manager to do this.
In this article, you’ll learn:

  • Why should you validate digital files and when should you do this?
  • Why should you validate TIFF files?
  • How do you install and use DPF Manager?
  • How can you correct embedded metadata in TIFF files?

File format validation verifies whether a digital file's contents and structure satisfy the requirements set by that file format's specification.

DPF Manager is a particularly user-friendly open source tool for checking TIFF files, with a simple interface that shows whether your TIFF file satisfies the right TIFF specification. If your file does not satisfy these requirements, the tool also explains why not.

Why validate?

Validating file formats is very important for the long-term preservation of digital files. One major stumbling block when developing a digital preservation strategy is that we often don't have a clear overview of the file formats in our digital archive, even though this overview is needed to check regularly whether the files can still be opened with available software. After all, it's possible that this software might cease to exist in the future. File identification and validation can help you to detect in good time whether a format is going to become obsolete, so you can take action by converting the affected files into a different format.

It's also important to check that files delivered in an outsourced digitisation assignment satisfy the set quality requirements.

When to validate?

Quality requirements are set in advance of a digitisation project, for example with regard to which file format to use. Guidelines in the High-quality text and image content digitisation article recommend using Uncompressed baseline TIFF v6.0. When the digitisation is complete, you should therefore check that the TIFF files received satisfy this specification. If the validation reveals errors at this stage, the digitisation company can still convert the files into the correct format.

So you're not just checking that files with a .tif extension are actually TIFF files, but also that they satisfy the requirements in the Uncompressed baseline TIFF v6.0 specification. The file's structure is analysed to check for any errors introduced when the file was created, which could result in not all software being able to read it.
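The first of these checks, whether a file with a .tif extension is a TIFF file at all, can be illustrated with a minimal Python sketch. It only inspects the four-byte header defined in the TIFF 6.0 specification (a byte-order mark 'II' or 'MM', followed by the number 42) and is no substitute for the full validation that DPF Manager performs. The function names are hypothetical and chosen for this illustration only.

```python
import struct


def looks_like_tiff(header: bytes) -> bool:
    """Return True if the first four bytes match a classic TIFF header."""
    if len(header) < 4:
        return False
    byte_order = header[:2]
    if byte_order == b"II":      # little-endian byte order
        fmt = "<H"
    elif byte_order == b"MM":    # big-endian byte order
        fmt = ">H"
    else:
        return False
    # Bytes 2-3 hold the number 42, written in the declared byte order.
    (magic,) = struct.unpack(fmt, header[2:4])
    return magic == 42


def file_looks_like_tiff(path: str) -> bool:
    """Read the first four bytes of a file and test them (hypothetical helper)."""
    with open(path, "rb") as handle:
        return looks_like_tiff(handle.read(4))
```

A file that passes this test can still violate the baseline TIFF v6.0 specification in many other ways, which is why a dedicated validator is needed.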

DPF Manager for TIFF file validation

There is a DPF Manager tutorial on YouTube.

Install DPF Manager

Download and install DPF Manager. It is available for Windows, macOS and Linux. Note: macOS users need to take an additional step before they can open the installation file. Please read the instructions located in the folder with the installation file for guidance.

Select files to validate

Open the DPF Manager program on your computer.

DPFManager 2 rv.jpg

Drag the folder containing the TIFF files that you want to validate to the 'Files/Folders' window.

DPFManager 3 rv.jpg

...or click 'Select' to choose the folder containing the TIFF files for validation.

DPFManager 5en6 rv.jpg

Select the 'Default' option, and click the 'Full check' button.

DPFManager 7 rv.jpg

The 'Tasks' window opens below, where you can follow the progress of the validation process. The validation is finished when the bar is fully green. Close the window by clicking on 'Tasks' at the bottom left.

DPFManager 8en9 rv.jpg

Analyse the results

When the validation is complete, you can view the report with validation results by clicking on 'Reports' in the menu bar at the top.

DPFManager 13 rv.jpg

You will then see a general overview showing:

  • when the validation was performed;
  • how many TIFF files were validated;
  • which folder was validated;
  • how many errors were detected;
  • how many warnings there are;
  • how many TIFF files passed the validation;
  • the score.

DPFManager 14 rv.jpg

Click the folder symbol to go straight to the reports. You can also open the results by clicking on the corresponding line in the overview.

DPFManager 15en16 rv.jpg

You will then see an overview of the results per file. This shows a summary of the general report for the entire folder at the top, followed by summaries of the reports for the individual TIFF files. The overview shows you, for each TIFF file:

  • a colour code indicating whether the validation was successful;
  • which files have been validated;
  • how many errors were detected;
  • how many warnings there are.

DPFManager 18 rv.jpg

Click on the HTML symbol to see a brief visual summary of the validation results for the entire folder.

All reports, both for the entire folder and for individual TIFF files, are available in four file formats: HTML, PDF, XML and JSON. Simply click on the 'HTML', 'PDF', 'XML' and/or 'JSON' symbol. For the validation report for an individual TIFF file, click on the 'HTML', 'PDF', 'XML' and/or 'JSON' symbol next to that file.

DPFManager 20 rv.jpg

The HTML validation report for the entire folder

DPFManager 19 5 assemblage sm.jpg

Click HERE to download a PDF of an example validation report for a folder of TIFF files without any errors.

The HTML validation report for an individual file

DPFManager 21c rv assemblage.jpg

Click HERE to download a PDF of an example validation report for an individual TIFF file without any errors.

Example error messages

Not all file validations result in a report without any error messages. You will find a number of example error messages, with solutions for correcting them, below.

Example 1: use of special characters

DPFManager copyright overzicht.jpg

The validation report indicates that the TIFF file does not comply with baseline TIFF v6.0 specifications. The error message is 'Only 7-bits ASCII-codes are accepted'. Hover your cursor over the error message to see more details.

DPFManager copright.jpg

DPFManager copyright toelichting.jpg

ASCII is a character encoding for letters, numbers and punctuation marks. It consists of 128 characters in total, and you can find an overview on Wikipedia. The error message indicates a problem with the embedded metadata in 'tag 33432 Copyright'. You can find the value of this tag higher up in the report, in the list of IFD tags: '© Rony Vissers'. The copyright symbol is not a 7-bit ASCII character, and that's the reason for the error message.
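To see why this value is rejected, the 7-bit ASCII rule can be reproduced in a few lines of Python. This is only a sketch of the rule itself, not of how DPF Manager implements its check, and the function name is hypothetical.

```python
def non_ascii_characters(value: str) -> list:
    """Return (position, character) pairs for every character outside 7-bit ASCII."""
    # 7-bit ASCII covers the code points 0 to 127 inclusive.
    return [(index, char) for index, char in enumerate(value) if ord(char) > 127]


print(non_ascii_characters("© Rony Vissers"))           # [(0, '©')]
print(non_ascii_characters("copyright: Rony Vissers"))  # []
```

An empty list means that the text contains only 7-bit ASCII characters.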

Fortunately, it's easy to rectify. If you open the file with image editing software (e.g. Adobe Photoshop or GIMP) and view the embedded metadata, you can simply change '© Rony Vissers' to 'copyright: Rony Vissers'. You can access the embedded metadata in Adobe Photoshop by clicking on 'File info' in the 'File' menu. In GIMP, access the embedded metadata by clicking on 'Metadata' in the 'Image' menu, and then 'Edit Metadata'. Don't forget to save the updated TIFF file once you have modified it. See also the embedded metadata article for information about modifying embedded metadata.

DPFManager correctie 2.jpg

When you check the updated TIFF file with DPF Manager, you will see that the previously reported error has disappeared and the file is now valid.

DPFManager correctie.jpg

If the TIFF files are the result of a digitisation project carried out by a specialist digitisation company, ask them to fix the errors rather than doing it yourself.

Example 2: use of compression

Although TIFF is mainly known as an uncompressed file format, it also supports compression. Compression is not recommended for digitisation, however. DPF Manager can detect TIFF files that have been compressed.

Here are the validation reports for the same image: saved without compression on the left, and saved with JPEG compression on the right. The TIFF file with JPEG compression gives an error message.

Validatierapport vergelijking compressie rv2.jpg

To fix this error, perform the capture or scan again and save the result as Baseline TIFF v6.0 without compression. Alternatively, if the RAW file used to create the TIFF file is still available, you can use that to create a Baseline TIFF v6.0 without compression.
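The compression setting that triggers this error is stored in the TIFF file itself, in tag 259 (Compression). The following Python sketch reads that tag from the first image file directory (IFD) of a classic TIFF file. It is an illustration under simplifying assumptions: it expects well-formed input, looks only at the first IFD, and is not how DPF Manager is implemented. The function name and the small lookup table are hypothetical.

```python
import struct

COMPRESSION_TAG = 259  # TIFF tag "Compression"
# A few common values; the list is deliberately incomplete.
COMPRESSION_NAMES = {1: "none", 5: "LZW", 7: "JPEG", 32773: "PackBits"}


def read_compression(data: bytes) -> int:
    """Return the Compression value from the first IFD of a classic TIFF."""
    if data[:2] == b"II":
        endian = "<"   # little-endian
    elif data[:2] == b"MM":
        endian = ">"   # big-endian
    else:
        raise ValueError("missing TIFF byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for index in range(entry_count):
        start = ifd_offset + 2 + index * 12   # every IFD entry is 12 bytes long
        tag, field_type, count = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag == COMPRESSION_TAG:
            # Compression is a single SHORT, left-justified in the 4-byte value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return 1   # TIFF 6.0 default when the tag is absent: no compression
```

For a file on disk, you could pass the file's bytes to `read_compression` and look the result up in `COMPRESSION_NAMES`; a value of 1 means the image data is stored without compression.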