Metadata in uploaded files


Image file metadata

The forum default for the "strip image metadata" setting is now false. As a result, image files keep any open license notices embedded in their metadata when they are uploaded to the site.

This topic therefore explains how to manually strip and add metadata using the exiftool utility. Other software can also do this, but exiftool supports a wide array of file formats and metadata fields.

Note in passing that the metadata in PDF files and other non-image files is not modified on upload.

Stripping metadata

The following commands will strip all metadata from a PNG file and a JPG file:

$ exiftool -all= foo.png
$ exiftool -all= bar.jpg

In addition, exiftool can process some or all of the files in a directory, for instance all PNG files in the foo subdirectory:

$ exiftool -all= -ext png foo/

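Note also the options -recurse (process subdirectories as well), -overwrite_original (do not keep the backup copy that exiftool otherwise saves with _original appended to the file name), and -preserve (keep the file's modification time).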
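As a rough sketch of how these three options combine, the following wraps them in a small shell function. The name strip_tree is my own invention and not part of exiftool. The command removes metadata in place without keeping backups, so it is worth trying on a copy of the directory first.

```shell
# strip_tree is a hypothetical helper name, not an exiftool feature.
# It strips all metadata from every PNG file under the given directory.
#   -recurse             also process subdirectories
#   -overwrite_original  do not leave *_original backup files behind
#   -preserve            keep each file's modification date/time
strip_tree() {
    exiftool -all= -recurse -overwrite_original -preserve -ext png "$1"
}
```

After sourcing the definition, it would be called as strip_tree foo/ from the directory that contains foo.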

Adding metadata

The tags used in these examples are my suggestions. Feel free to omit some tags and add others. Note too the tags -rights and -CopyrightNotice. There should be no problem using multiple tags with the same information, although some may be shorthand for others and the processing may be order dependent. Unicode characters can be used.

Creative Commons also offers recommendations concerning XMP metadata, although their comments relate mostly to PDF files.

The following code will add metadata to a PNG file. The tags used are keywords registered in the PNG specification and do not derive from informal conventions, such as those used by ImageMagick.

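# Add author, description, license, and timestamp metadata to a PNG file.
target="foo.png"
# Last modification time of the file (GNU stat syntax; BSD/macOS stat differs).
lastmod=$(stat --format="%y" "$target")
exiftool \
    -Author="Erika Mustermann <erika@isp.eu>" \
    -Description="Image description goes here" \
    -Copyright="This work is licensed under the Creative Commons Attribution 4.0 International License" \
    -CreationTime="$lastmod" \
    "$target"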

Put the above in a text file called foo.cmd and run it under bash:

$ bash foo.cmd

Similarly for a JPG file:

# Add author, description, and license metadata to a JPG file,
# plus the Creative Commons license URL as an XMP tag.
target="bar.jpg"
exiftool \
    -Artist="Erika Mustermann <erika@isp.eu>" \
    -ImageDescription="Image description goes here" \
    -Copyright="This work is licensed under the Creative Commons Attribution 4.0 International License" \
    -XMP-cc:License="http://creativecommons.org/licenses/by/4.0/" \
    "$target"

Save this in a text file called bar.cmd and again run it from the command line:

$ bash bar.cmd
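To check what was actually written, and to see which metadata group each license-related tag ended up in, something along the following lines may help. This is only a sketch: the function name show_license_tags is hypothetical, and the list of tags is simply the set mentioned in this topic, not a complete inventory.

```shell
# show_license_tags is a hypothetical helper name, not an exiftool feature.
# It prints the license-related tags of one file.
#   -a   show duplicate tags instead of suppressing them
#   -G1  prefix each tag with its metadata group (for example an EXIF or XMP group)
#   -s   print tag names rather than descriptions
show_license_tags() {
    exiftool -a -G1 -s -Copyright -Rights -CopyrightNotice -XMP-cc:License "$1"
}
```

It would be called as show_license_tags bar.jpg or show_license_tags foo.png. Tags that are not present in the file are simply not listed.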

I am not aware of SPDX open license identifiers being used in image file metadata and doubt whether there is a corresponding tag (although there should be).