Open source software for the keen photographer: file formats

by Ramón Casero Cañas on 17 January 2007

Archived This page has been archived. Its content will not be updated. Further details of our archive policy.

Introduction

This article presents the general idea of workflow in digital photography, and then focuses on how to deal with different file formats of interest (JPEG, TIFF and raw) using free and open source software. While this subject can be approached from many different starting points, the examples that follow are based on the use of the Ubuntu GNU/Linux distribution, using the GNOME desktop environment. Of course many of the programs discussed are also available for other operating systems such as Windows and OSX.

In my experience, many people assume that the GNU Image Manipulation Program (GIMP) is all a photographer needs if he or she wants to use open source software. Other photographers may assume the photos will look worse or that “my camera would not work in Linux”, but both assumptions are misplaced.

My goal in this article is to help computer-proficient people understand the needs of a photographer, and to help photographers - from casual snappers to keen amateurs - discover that they can work with free and open source software as well as with proprietary software.

Digital workflow

Digital workflow is everything you do after you take a photo with your camera and before it is ready to be printed or published. (See I. Farrell, Digital workflow, Photography Monthly, Aug 2006, Essex, UK, pp. 70-75, and R. Sheppard, Digital darkroom workflow, Outdoor Photographer, Nov 2006, Los Angeles, USA, pp. 54–55.)

Some examples of specific actions that a photographer may want to perform as part of his or her workflow are:

  • save photos from camera to computer
  • remove red eye from a wedding photo
  • stitch together several photos from the Lake District to make a panorama
  • resize and upload photos to a blog
  • tag the people in each photo so that they can be found more easily
  • shift the white balance and boost the saturation to give that extra oomph to a photo

Fig. 1 shows a typical digital workflow. While image editing is part of the workflow, it can be readily seen that the needs of a photographer cover a wider scope.

I will focus solely on the first two steps of the workflow in this article, i.e. moving files from the camera to the computer and opening them (yellow ellipsoidal area in Fig. 1). While the topic sounds straightforward enough, it raises interesting issues with file formats, in particular with raw files.

Digital photos from camera to computer: general issues

Digital cameras store photos as digital files on memory cards in the camera. Most digital cameras accept only one type of memory card, usually one of the following: Memory Stick™ (MS), CompactFlash® (CF), Secure Digital™ (SD or SDHC), SmartMedia™ (SM) or xD-Picture Card™ (xD).

The first step in our digital workflow then is to move the photo files from the memory card somewhere else, typically a computer, so that they can be displayed, edited, etc. This in turn depends on three factors:

  • the operating system of the computer receiving the photos
  • the software application doing the data transfer
  • the hardware connection between the memory card and the computer, typically a dedicated cable for the camera or a USB card reader

Most digital camera makers supply proprietary software applications on a CD along with the camera you purchase. Once installed, this application facilitates automatic detection of the camera when it is connected to the computer, transfer of files from the camera to the computer, conversion from raw to JPEG file formats, setting of parameters such as white balance or exposure, printing and more. Not surprisingly perhaps, many photographers assume that this software is necessary in order to connect their camera to their computer or to perform these functions. In fact this is rarely the case.

Fig. 2 shows a screenshot of the Canon ZoomBrowser EX software. This software may be useful, but it is not necessary in order to communicate with your Canon digital camera.

As manufacturer programs are seldom available for GNU/Linux, photographers may get the impression that “cameras don’t work in linux”. But actually, just transferring files depends on the operating system rather than on a particular application. According to H. Figuière, Digital Camera Support for UNIX, Linux and BSD, apart from some very cheap ones, most cameras are supported in Linux. The reason for this lies in standards.

First, most cameras use standard connections:

  • USB (Universal Serial Bus)
  • IEEE 1394 (a.k.a. FireWire™ or i.Link®)

The former is a royalty-free industry standard, the latter an industry standard subject to a royalty. Both, however, are well supported by Linux (see the Linux USB project and the IEEE 1394 for Linux FAQ sheet).

Second, data transfer is performed with two standard communication protocols, that also make the type of memory card transparent to the computer:

  • Mass Storage: this is the simplest configuration; the computer sees the camera/memory card as an external hard drive
  • PTP (Picture Transfer Protocol, international standard ISO 15740): in this configuration the computer can control the camera (e.g. fire the shutter or change the settings), as well as access the memory card

It is therefore possible to transfer photo files from the camera/memory card of a digital camera, in most cases, without additional software.

Although archiving and image editing can be performed offline with other programs, some photographers prefer to use a single application that detects the camera, allows basic image editing, file format conversion, archiving, etc., just as the Canon software does. For them, these features are provided by the libgphoto2 library which is available for Linux and MacOS, but not for Windows. Even cameras that implement non-standard protocols are often supported by libgphoto2. The gphoto website has a comprehensive list of digital cameras supported by libgphoto2.
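As a rough sketch of what libgphoto2 access looks like from the shell, the function below uses gphoto2, the command line client from the same project. It assumes that client is installed (it is usually packaged separately from the library) and that a supported camera is plugged in; the function name fetch_from_camera and the destination directory are made up for illustration.

```shell
#!/bin/sh
# Sketch: fetch photos from a camera with the gphoto2 command line client.
# Assumes gphoto2 is installed and a supported camera is connected.
# fetch_from_camera is a hypothetical helper name, not part of gphoto2.

fetch_from_camera() {
    dest=$1
    if ! command -v gphoto2 >/dev/null 2>&1; then
        echo "gphoto2 not found; install it first" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    (
        cd "$dest" || exit 1
        gphoto2 --auto-detect &&     # list the cameras that were detected
        gphoto2 --list-files &&      # show the files stored on the camera
        gphoto2 --get-all-files      # download them into the current directory
    )
}

# Example: fetch_from_camera "$HOME/photos/incoming"
```

The download runs in a subshell so that the caller's working directory is left unchanged.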

Of course, most photographers are probably more interested in the user-friendly applications that they can use to get the job done, rather than the technical details that have to do with the operating systems or libgphoto2. However, before turning to software applications, it is worth taking a brief look at the question of hardware connections.

There are two devices that allow access to the memory card from the computer:

  • dedicated cable
  • card reader

Dedicated cable

Here the memory card stays inside the camera, and the camera is plugged into the computer with a dedicated cable (USB or FireWire). Different cameras need different cables, because although the end that goes into the computer is standard, the camera end changes from model to model.

For example, a Nikon Coolpix 5600 requires a USB Type A Male to USB Mini 8 Pin Male cable (Ref. UC-E6), but a Nikon D70s requires a USB Type A Male to USB Mini Pin Male (Ref. UC-E4 part 25262). The price of a new cable typically ranges from £8 to £15.

This type of connection can be used to operate the camera from the computer using PTP, as explained above, but using a cable has some drawbacks too. Principally, file transfer is very slow and it drains the camera’s battery power quickly. You may also find the plastic or rubber lids for the camera connector port to be inconvenient and easy to break, which could leave your camera with an open connector.

Card reader

A card reader is a simple device that accepts memory cards and plugs into the computer via USB. Card readers cost between £5 and £10 and fit any memory card of a given type (some are multicard).

File transfer is much faster than with a cable, no camera battery power is consumed, and card readers are sturdy and easy to handle. On the downside, the camera cannot be operated from the computer.

File formats for digital photography

Digital photos can be stored as JPEG, TIFF or raw files. Each format has different characteristics, and is produced at a different stage of the image formation flowchart as seen in Fig. 3.

Briefly, the characteristics of each format are as follows. (For further information on each of these file formats, see JPEG, TIFF, and Raw.)

JPEG
JPEG is the most popular format for photographic images on the Internet. It is a lossy compression format, i.e. some information in the image is lost when the file is created. The final image quality is lower, but on the other hand JPEG files are smaller than TIFF and raw files, so they are written to the camera’s storage more quickly, allowing pictures to be taken at a faster rate. For these reasons, JPEG represents a good trade-off for most people. The loss in quality depends on the level of compression. Whether this loss matters depends on what one wants to do with the photo, e.g. how big it will be printed.
TIFF
TIFF allows storing images in a lossless format (it can also work as a container for other formats like JPEG, and raw is based on TIFF too). Files are bigger, but no information is lost, and it is the format preferred by digital scanners.
Raw
Raw is not an image format as such, but the way to refer to unprocessed data as produced by the sensor of a camera or scanner after the ISO (“film sensitivity”) is set. Some raw schemes use lossless compression to reduce the file size without losing any information. A good analogy for raw files is to think of them as “digital negatives”.

A comparison of photos taken with a Nikon Coolpix 5400, a camera with 5 levels of image quality, from “Basic” to “Raw”, demonstrates the differences between file formats, as seen in Table 1.

Comparison of file size for different settings of image quality.

Image Quality | File type | File size | Number of shots in a 512M memory card

Basic | JPEG | 527K | 780
Normal | JPEG | 1.2M | 400
Fine | JPEG | 2.8M | 202
Hi | TIFF | 15M | 33
Raw | NEF | 7.9M | 62

Showing the difference in image quality is difficult as that can usually only be noticed when zooming in to the image or printing to large format. Below, in Table 2, is a 115x120 px detail from a set of photos at the above quality levels, resized by 400% to 460x480 px with no interpolation, and saved as JPEG with 85% compression. (The detail obtained from the raw image looks softer compared to JPEG and TIFF because it has not been sharpened).

The “Basic” setting displays noticeable JPEG artifacts, that are also visible in the “Normal” setting, but not in the “Fine”, “Hi” or “Raw” settings. However, the TIFF file is actually almost twice as large as the raw one. As all the information in TIFF is also in raw (but not necessarily the other way around), at least in this camera it does not make much sense to use the “Hi” setting.
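The “almost twice as large” figure can be checked against Table 1 with a line of arithmetic. The helper below is only a convenience for this check; its name, size_ratio, is made up here.

```shell
# Sketch: ratio of two numbers, rounded to two decimals.
# size_ratio is a hypothetical helper; the values come from Table 1.
size_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

size_ratio 15 7.9    # TIFF size / raw size: prints 1.90
size_ratio 62 33     # raw shots / TIFF shots per card: prints 1.88
```

Both ratios come out close to 1.9, so on this camera a card holds nearly twice as many raw files as TIFF files.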

Comparison of image quality at different file sizes.

Basic image quality | Normal image quality | Fine image quality | High image quality | Raw image quality

JPEG has been an ISO and ITU-T standard since 1994 (see ISO/IEC 10918-1:1994 and ITU-T Recommendation T.81). The most important contribution to the JPEG standard has been attributed to the open source software implementation of the Independent JPEG Group (IJG). JPEG is therefore widely supported and offers no problems for open source software.

TIFF uses tags to declare and describe the content of the file (for further information on tags see the Tags for TIFF and Related Specifications article from the Library of Congress). Because the format allows for non-standard private tags, this has led to companies developing proprietary formats based on TIFF. This in turn means that there is no software application (open source or otherwise) that is or will be able to read all types of TIFF files. Fig. 9 illustrates this problem with a TIFF file generated by the Nikon 5400. The image is correctly displayed, except for the top left corner, which contains a thumbnail.

Lack of compatibility between TIFF files is not a critical issue for digital photography. As JPEG files are much smaller, they tend to be used in most cases. And if maximum quality is required, then raw is arguably better. However, raw suffers from the same problem as TIFF, impaired compatibility due to proprietary formats.

Raw format in more detail

As indicated above, raw is not a standard format, but a generic way to refer to the way unprocessed data is stored by the camera, and this is why raw files are often considered digital negatives. That is, for raw data to be displayed on screen, processing (e.g. setting the exposure and white balance) is necessary, as opposed to JPEG or TIFF data, that only require decoding (e.g. uncompressing).

There are some advantages to working with raw files:

  • For any image, maximum quality can only be obtained from the raw file because any further processing, as can be seen in the chart in Fig. 3, reduces the amount of information. For example, as JPEG has only 8 bits per channel for colours, banding becomes apparent with heavy editing.
  • Raw data also allows maximum flexibility and customization. For instance, it is possible to compensate the exposure to get the most out of the dynamic range, or set the level of sharpening according to the final size of the print.
  • JPEG and TIFF files produced by the camera are constrained by what the limited onboard firmware can achieve. Offline processing of raw files takes advantage of ever improving computers and algorithms.

This combination of maximum quality and flexibility makes raw very attractive for some photographers. The 2006 RAW survey found that 77% of respondents shot in raw mode all or most of the time. This figure could be biased if respondents to the survey were mainly photographers interested in raw in the first place, but current photography magazines consistently show interest in raw formats too.

So is it possible to work in raw with open source software? In a nutshell: Yes, in most cases, and results might even be better than with proprietary software. But there are limitations and legal issues. These are collectively known as the raw problem.

Open source software solutions

As discussed above, it is possible to attach the memory card to a card reader, or plug the camera into the computer with a dedicated cable. The hotplug service will detect the connection and mount the memory card as a new directory in the file system, e.g. /media/usbdisk-1/. The photo files can then be handled with a number of open source software tools.

Shell or command line

For many open source software users, the shell remains the favoured tool to interact with the computer. My experience is that proprietary software users usually hate it with a passion, but for me, the simplest and fastest way to copy files from the memory card to the computer is e.g.

$ cp /media/usbdisk-1/DCIM/100NIKON/* ~/photos/
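A plain cp silently overwrites files with the same name, which matters because cameras reuse file names once their counter wraps around. The following is a minimal sketch of a more cautious copy; the function name copy_photos is made up, and the paths in the usage line are the examples from this article.

```shell
#!/bin/sh
# Sketch: copy photos from a mounted card without overwriting existing files.
# copy_photos is a hypothetical helper; adjust the paths to your system.

copy_photos() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue                 # skip subdirectories and empty globs
        target=$dest/$(basename "$f")
        if [ -e "$target" ]; then
            echo "skipping $(basename "$f"): already in $dest" >&2
        else
            cp -p "$f" "$target" || return 1    # -p keeps the original timestamps
        fi
    done
}

# Example: copy_photos /media/usbdisk-1/DCIM/100NIKON "$HOME/photos"
```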

Nautilus file manager

After the memory card is mounted, GNOME will open a Nautilus file manager window showing the contents of the new directory, as in Fig. 10. Files can then be copied by drag-and-drop or Copy & Paste in the usual way.

To see thumbnails of raw files, install the package gnome-raw-thumbnailer

$ sudo apt-get install gnome-raw-thumbnailer

and in Nautilus go to Edit -> Preferences -> Preview, and in “Other Previewable files” change “Only for files smaller than” to 10MB or an appropriate value for your typical raw file size.

dcraw

dcraw (by D. Coffin) is a stand-alone command line program to read raw files and save them as PPM or TIFF. PPM is a very basic format for graphic files. It is not used by cameras, but merely as a standard format that can be opened by any program. dcraw has become a standard for open source developers. Its code is the base for other tools also reviewed in this section, e.g. F-Spot, Gimp’s plugins rawphoto and UFRaw.

Install package dcraw

$ sudo apt-get install dcraw

The program can then be run from the shell. For example,

$ dcraw -a photo.nef

produces the file photo.ppm with automatic colour balance.

Let’s look at an example. The JPEG files in Fig. 11-12 were created from the raw file produced by a Canon 400D, a fairly recent camera. The first one was generated using Canon ZoomBrowser EX, while the second one was generated using dcraw with default camera settings. The results look quite similar; in my opinion, Canon has better black density, while dcraw has better whites and white balance, but it would be very easy to adjust either result to match the other.

Comparison of raw file processing.

Fig. 11. Using Canon software with default settings (Photo: Courtesy of S. Yeates).

Fig. 12. Using dcraw with default camera settings (Photo: Courtesy of S. Yeates).

To convert the file to TIFF instead of PPM:

$ dcraw -a -T photo1.nef

To display the metadata in the file:

$ dcraw -i -v photo1.nef
Filename: photo1.nef
Timestamp: Thu Nov 16 23:13:17 2006
Camera: NIKON E5400
ISO speed: 0
Shutter: 1/6.8 sec
Aperture: f/4.6
Focal Length: 24 mm
Secondary pixels: no
Embedded ICC profile: no
Decodable with dcraw: yes
Thumb size: 1600 x 1200
Full size: 2608 x 1950
Image size: 2608 x 1950
Output size: 2608 x 1950
Raw colors: 3
Filter pattern: BGGRBGGRBGGRBGGR
Daylight multipliers: 2.079447 0.934016 1.229720
Camera multipliers: 1.242188 1.000000 2.417969 0.000000
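Because this output is plain “label: value” text, a single field is easy to pick out in a script, for instance to sort files by camera model. The following is a small sketch with awk; the helper name dcraw_field is made up, and it assumes the output layout shown above.

```shell
# Sketch: print the value of one field from "dcraw -i -v" output read on stdin.
# dcraw_field is a hypothetical helper; the label is matched exactly against
# the text before the first ": " on each line.
dcraw_field() {
    awk -F': ' -v key="$1" '$1 == key { print substr($0, length(key) + 3); exit }'
}

# Example: dcraw -i -v photo1.nef | dcraw_field "Camera"
```

With the output listed above, the example would print NIKON E5400.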

Further options of dcraw are:

  • Extract the camera-generated thumbnail
  • Use different interpolation methods for speed/quality trade off
  • Apply noise reduction while preserving edges.
  • Change gamma values
  • Set black point and highlights
  • Change colour balance
  • Use ICC profiles
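To give an idea of how some of these options look on the command line, here is a hedged sketch. The flag letters come from the usage text of later dcraw releases and may differ in the version you have installed, so compare them with the dcraw manual page first. The function name develop_raw and the chosen values are illustrative only.

```shell
# Sketch: a few dcraw options combined in one hypothetical helper.
# Flag letters are an assumption for your dcraw version; check "man dcraw".
develop_raw() {
    raw=$1
    # Extract the camera-generated thumbnail
    dcraw -e "$raw" || return 1
    # -w    use the white balance recorded by the camera
    # -q 3  slowest, highest-quality interpolation
    # -n    noise reduction threshold
    # -H 2  blend clipped highlights
    # -T    write TIFF instead of PPM
    dcraw -w -q 3 -n 100 -H 2 -T "$raw"
}

# Example: develop_raw photo1.nef
```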

An important limitation of dcraw is that it cannot decode all the metadata in the raw file. D. Coffin explained this in an interview: “Yes, the metadata is much more complicated [to reverse engineer than the actual sensor data]. That’s why dcraw reads only metadata necessary to decode the image, and ignores the rest.”

rawphoto

rawphoto is the Gimp plugin form of dcraw. The graphic interface is very simple to use. When you try to open a raw file in Gimp, a menu pops up with options to change some default parameters, as in Fig. 13. There are no white balance or curve adjustment options in this plugin.

UFRaw

UFRaw can be used either as a stand-alone application or as a Gimp plugin, with the same graphic interface. As with rawphoto, the interface pops up when a raw file is opened, but it offers more options. It is possible, for example, to adjust white balance, tone curves or exposure.

To process a set of files with user interaction:

$ ufraw *.nef *.cr2

This will open each raw file one by one, and show a graphical interface to allow previewing and tweaking of parameters. Files can be saved as PPM, TIFF or JPEG (preserving metadata).

F-Spot

F-Spot is in fact a personal photo management application. It features file import, image editing, organizing, and more. It is possible to import photos directly from the camera (F-Spot uses the libgphoto2 library) or from any directory in the system, including memory cards in a card reader, external hard drives, a network filesystem, etc. Photos can be saved to directory ~/Photos/ or kept in the original directory. In the former case, a directory tree by year, month and day is created, e.g.

$ tree Photos/
Photos/
`-- 2006
    `-- 11
        |-- 16
        |   `-- DSCN4837.nef
        `-- 29
            |-- DSCN4827.jpg
            |-- DSCN4828.jpg
            |-- DSCN4829.jpg
            |-- DSCN4830.jpg
            |-- DSCN4831.jpg
            |-- DSCN4832.jpg
            |-- DSCN4833.jpg
            |-- DSCN4834.jpg
            |-- DSCN4835.jpg
            `-- DSCN4836.jpg
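A similar year/month/day layout can be produced from the shell for photos that are not managed by F-Spot. The sketch below is not F-Spot's own code: it files each photo by its modification time, whereas F-Spot may read the capture date from the photo itself. It also assumes GNU date, whose -r option takes a file name. The function name sort_by_date is made up.

```shell
# Sketch: copy photos into a year/month/day directory tree.
# sort_by_date is a hypothetical helper. Assumes GNU date ("date -r FILE")
# and uses the file modification time as a stand-in for the capture date.
sort_by_date() {
    src=$1
    dest=$2
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        day_dir=$dest/$(date -r "$f" +%Y/%m/%d) || return 1
        mkdir -p "$day_dir" || return 1
        cp -p "$f" "$day_dir/" || return 1      # -p keeps the timestamp
    done
}

# Example: sort_by_date /media/usbdisk-1/DCIM/100NIKON "$HOME/Photos"
```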

Once imported, photos can be managed from the F-Spot main window. Files can be tagged, copied, moved, edited (all versions of an edited photo are saved), removed from the catalogue, deleted or searched by tag or date range.

F-Spot also allows the export of photos to an online sharing service like Flickr, Picasa Web or SmugMug, websites powered by photo album organizers like Gallery and O.r.i.g.i.n.a.l., to a CD or a directory.

Batch processing of raw files

Batch processing here means a series of transformations that can be applied repeatedly to a set of images without interaction between the computer and the photographer. For raw files, this usually refers to converting a set of raw files to JPEG using fixed settings. As this is a topic of particular interest to photographers, I have created a new section for it, even though the programs to be used are the same as the ones presented above.

dcraw

dcraw is intrinsically a batch program, like most command line programs in Linux. For example, doing

$ dcraw -a photo1.nef photo2.cr2

converts 2 raw files from different manufacturers (photo1.nef, photo2.cr2) to PPM. Doing

$ dcraw -a *

converts all the files in the current directory to PPM.
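Since the usual goal of batch processing is JPEG output, and a bare * also hands non-raw files to dcraw, a loop over known raw extensions is often more practical. The following is a minimal sketch, assuming dcraw and the cjpeg tool from the IJG libjpeg distribution are installed. The function name raw_to_jpeg, the extension list and the quality value are illustrative choices.

```shell
# Sketch: convert every raw file in a directory to JPEG with fixed settings.
# raw_to_jpeg is a hypothetical helper; assumes dcraw and cjpeg are installed.
raw_to_jpeg() {
    dir=$1
    for f in "$dir"/*.nef "$dir"/*.cr2; do
        [ -f "$f" ] || continue                  # pattern matched nothing
        out=${f%.*}.jpg
        # -c sends the PPM to standard output so it can be piped into cjpeg
        dcraw -c -a "$f" | cjpeg -quality 90 > "$out" || return 1
        echo "saved $out"
    done
}

# Example: raw_to_jpeg "$HOME/photos"
```

Piping through standard output avoids leaving a large intermediate PPM file next to each photo.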

UFRaw

UFRaw provides the program ufraw-batch (in Ubuntu both are bundled in the same package) for batch processing of raw files from the shell. The parameters for the image conversion can be set in the shell as usual

$ ufraw-batch --exposure=3.0 *.nef

But it is also possible to use the graphic interface of ufraw to set those parameters, and save them in a so-called ID file. For example, Fig. 17 shows the interface after opening photo1.nef, changing the exposure level to 3.0 and selecting the option to save an ID file.

This generates a text file called photo1.ufraw that contains the conversion parameters:

1.245920 1.000000 1.126845 3.243823 0.300000 4

It is then possible to process a batch of files using these parameters from the shell

$ ufraw-batch --conf=photo1.ufraw *.nef
ufraw-batch: loaded photo1.nef
ufraw-batch: saved photo1.jpg
ufraw-batch: loaded photo2.nef
ufraw-batch: saved photo2.jpg
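The output format and destination can also be given on the command line. The sketch below wraps such a call in a function. The option names other than --exposure follow the ufraw-batch manual page of later releases and are an assumption for the version you have installed, so check ufraw-batch --help first. The function name batch_to_jpeg and the UFRAW_BATCH variable (a hook for dry runs) are made up for illustration.

```shell
# Sketch: batch conversion to JPEG with ufraw-batch using fixed settings.
# batch_to_jpeg and UFRAW_BATCH are hypothetical names; the --out-type,
# --compression and --out-path options are assumed to exist in your version.
batch_to_jpeg() {
    out_dir=$1
    shift
    mkdir -p "$out_dir" || return 1
    "${UFRAW_BATCH:-ufraw-batch}" --exposure=3.0 \
        --out-type=jpeg --compression=90 \
        --out-path="$out_dir" "$@"
}

# Example: batch_to_jpeg "$HOME/photos/jpeg" *.nef
# Dry run: UFRAW_BATCH=echo; batch_to_jpeg /tmp/jpeg *.nef
```

Setting UFRAW_BATCH to echo prints the command line instead of running it, which is a cheap way to check the settings before converting a large batch.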

Conclusions

This article has presented the idea of workflow in digital photography and then focused on technical and practical aspects of dealing with JPEG, TIFF, and raw files using open source software.

In practice, it is feasible and simple to download photo files from the vast majority of cameras onto the computer, either using a dedicated cable or a card reader for the memory card. Raw files are well supported; this support is even integrated into the file manager of the GNOME desktop. There are also programs that can convert raw files to JPEG or TIFF, either in interactive or batch mode.

Acknowledgements

Fig. 1 includes the icons:

  • scanner by Machovka from the Open Clip Art Library
  • Wilber by Tuomas Kuosmanen
  • the rest by Steven Garrity, Lapo Calamandrei, Ryan Collier, Rodney Dawes, Andreas Nilsson, Tuomas Kuosmanen, Garrett LeSage and Jakub Steiner from the Human and Tango Icon Themes for Gnome (Creative Commons Legal Code Attribution-ShareAlike 2.5)