Graphics with scientific data become clearer when the colours are chosen carefully. It is convenient to have good default schemes ready for each type of data, with colours that are:
My default colour scheme for qualitative data is the bright scheme in Fig. 1. Colour coordinates (R,G,B) are given in the RGB colour system (red R, green G and blue B), decimal at the top and hexadecimal below. An alternative when fewer colours are enough is the high-contrast scheme in Fig. 2, which also works when converted to greyscale. A second alternative is the vibrant scheme in Fig. 3, designed for the data visualization framework TensorBoard. A third alternative is the muted scheme in Fig. 4, which has more colours, but lacks a clear red or medium blue. A fourth alternative is the medium-contrast scheme in Fig. 5 with three colour pairs that can work in greyscale, but not as well as the high-contrast scheme.
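The decimal and hexadecimal notations carry the same information, so one can be derived from the other. A minimal Python sketch of the conversion follows; the function names are illustrative, and the sample value 4477BB is one of the blue shades quoted later in this text, not a colour taken from Fig. 1.

```python
def hex_to_rgb(hex_colour):
    """Convert a six-digit hexadecimal colour such as '4477BB' to decimal (R, G, B)."""
    digits = hex_colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Each pair of hexadecimal digits encodes one channel in the range 0-255.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert decimal (R, G, B) in the range 0-255 to a six-digit hexadecimal string."""
    return "".join(f"{channel:02X}" for channel in rgb)


print(hex_to_rgb("4477BB"))
print(rgb_to_hex((68, 119, 187)))
```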
The bright, high-contrast, vibrant, muted and medium-contrast schemes work well for plot lines and map regions, but the colours are too strong to use as backgrounds for marking (black) text, typically in a table. The pale scheme was designed for that purpose (Fig. 6, top). The colours are inherently not very distinct from each other, but they are clear in a white area. The dark scheme (Fig. 6, bottom) is meant for the text itself on a white background, for example to mark a large block of text. The idea is to use a single dark colour as support, not all of them combined and not for just one word.
There are situations where a scheme is needed between the bright and pale schemes, for example (Fig. 10) for backgrounds in a table where more colours are needed than available in the pale scheme and where the coloured areas are small. For this purpose, the light scheme of Fig. 7 is designed.
Colour names have been added to the scheme definitions as mnemonics for the maker of a figure, not necessarily for use in text: a reader should not have to guess what olive looks like. Colours are identified uniquely by their names within the collective of the bright, pale, dark and light schemes, whereas the high-contrast, vibrant, muted and medium-contrast schemes reuse some names for different colours.
The colours within a qualitative scheme are given in order of changing hue (or luminance in the case of the high- and medium-contrast schemes), but the colours can be picked at random. Often, a data type suggests an appropriate choice or similar data types can be grouped by giving them similar colours. If the colours have to be picked in a fixed sequence, a good order for each scheme is as follows.
Examples of the use of the qualitative schemes are given in Figs. 8 and 9 for lines of the Tokyo metro and in Fig. 10 for cell backgrounds and text blocks. The application to maps is shown in Fig. 11. It is a stylized variation of the diagnostic map used by the ColorBrewer website. In one area, all colours are shown in a random pattern. In other areas, basically one colour is shown, but with one small area of each other colour included. This indicates how well the colours within a scheme are identifiable when all of them are used.
The design of the qualitative schemes involved four types of calculations:
All colours on this site are defined in sRGB colour space, the default used by most software and displays. Printers work in a different colour space that also varies from model to model. When they conform to international standard ISO 12647-2 and the exact printing conditions are not known beforehand, it is recommended to assume the CMYK colour space provided by colour profile ISO Coated v2 300 %. All scheme colours are taken from the overlap between this and the sRGB colour spaces. Individual printers may deviate, probably not so much that colours become unrecognizable, but enough to push some colours closer together. However, it is not possible to take individual printers into account.
The Netherlands Standardization Institute NEN has issued a code of practice which includes a recommended scheme with eight colours, three greys and white. The colours are bright, but the scheme has three drawbacks: the differences between the colours in colour-blind vision are often much smaller than the smallest difference in the bright, vibrant or muted schemes; two colours are not print-friendly; and the colours cannot be quoted without infringing copyright.
Diverging schemes are for ordered data between two extremes where the midpoint is important. Such schemes could be constructed simply by scaling the colour coordinates linearly, e.g. from blue to white to red. However, by including subtle hue changes, the colours are more distinct and the schemes more attractive. Figures 12, 13 and 14 show the sunset, BuRd and PRGn schemes, which are tweaked versions of schemes on the ColorBrewer website. The darkest shades of the original versions have been removed, because they are too dark and similar to be used in practice. The circled colour is meant for bad data, without drawing attention away from good data with a large deviation from zero. The sunset scheme was designed for situations where bad data have to be shown white. The three schemes look similar in colour-blind vision, so if more than one is used, do not reverse the direction in one of them. If more colours than shown are needed from a given scheme, use a continuous version of the scheme instead of the discrete colours, by linearly interpolating the colour coordinates. If fewer colours are needed, pick colours at equidistant points in the continuous version. Examples of the use of the diverging schemes for maps are given in Figs. 15 and 16.
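The interpolation and equidistant sampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the anchor colours are the plain blue-white-red example mentioned in the text, not the coordinates of the sunset, BuRd or PRGn schemes (those are given in the figures), the anchors are assumed to be evenly spaced, and the function names are hypothetical.

```python
def interpolate(anchors, t):
    """Colour at position t in [0, 1] along evenly spaced anchor colours."""
    if len(anchors) < 2:
        raise ValueError("need at least two anchor colours")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in the range 0 to 1")
    position = t * (len(anchors) - 1)
    # Index of the lower anchor; the last segment also covers t = 1.
    i = min(int(position), len(anchors) - 2)
    fraction = position - i
    low, high = anchors[i], anchors[i + 1]
    # Linear interpolation of each colour coordinate, rounded to an integer.
    return tuple(round(a + (b - a) * fraction) for a, b in zip(low, high))


def sample(anchors, n):
    """Pick n colours at equidistant points of the continuous scheme."""
    if n == 1:
        return [interpolate(anchors, 0.5)]
    return [interpolate(anchors, k / (n - 1)) for k in range(n)]


# Placeholder anchors (assumption): pure blue, white, pure red.
blue_white_red = [(0, 0, 255), (255, 255, 255), (255, 0, 0)]
for colour in sample(blue_white_red, 5):
    print(colour)
```

With an odd number of samples the middle colour coincides with the midpoint of the scheme, which matters for diverging data where the midpoint is important.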
Sequential schemes are for ordered data from low to high. The YlOrBr scheme given in Fig. 17 is a tweaked version of the ColorBrewer YlOrBr scheme. The most distinct grey is also given, useful for data gaps; it is not meant for extreme values. If more colours than shown are needed from this scheme, use a continuous version of the scheme instead of the discrete colours, by linearly interpolating the colour coordinates. If fewer colours are needed, pick colours at equidistant points in the continuous version. An alternative continuous scale is provided by the iridescent scheme, which is the linear interpolation of the colours specified in Fig. 18. The luminance varies linearly, so this scheme also works well for people with monochrome vision and in a monochrome printout.
There are many warnings that ordered data should not be shown with a rainbow scheme. The arguments are:
The discrete rainbow colour scheme is inspired by the temperature map of the weather forecast in the newspaper de Volkskrant: unconnected curves in CIELAB colour space for purples, blues, greens and oranges, each sampled three times, with two extra samples on the last curve for yellow and red, giving 14 colours in total. The curves were straightened, shifted and sampled equidistantly to make the colours more distinct, reasonably colour-blind safe and print-friendly. Later, the lines were resampled with smaller distances and the scheme was extended towards white and black, to get 23 colours. Figure 21 shows how the two sets can be combined to make a scheme with any number of colours up to 23.
People usually find out at an early age whether they are colour-blind. However, there are subtle variants of colour-vision deficiency. The two main types are:
To simulate green-blindness, all RGB colours in an image are converted to R′G′B′ colours with
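\[
\begin{aligned}
R' &= \left(4211 + 0.677\,G^{2.2} + 0.2802\,R^{2.2}\right)^{1/2.2},\\
G' &= \left(4211 + 0.677\,G^{2.2} + 0.2802\,R^{2.2}\right)^{1/2.2},\\
B' &= \left(4211 + 0.95724\,B^{2.2} + 0.02138\,G^{2.2} - 0.02138\,R^{2.2}\right)^{1/2.2},
\end{aligned}
\]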
with parameters R, G and B in the range 0–255 and the output values rounded. To simulate red-blindness, colours are shifted as follows:
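\[
\begin{aligned}
R' &= \left(782.7 + 0.8806\,G^{2.2} + 0.1115\,R^{2.2}\right)^{1/2.2},\\
G' &= \left(782.7 + 0.8806\,G^{2.2} + 0.1115\,R^{2.2}\right)^{1/2.2},\\
B' &= \left(782.7 + 0.992052\,B^{2.2} - 0.003974\,G^{2.2} + 0.003974\,R^{2.2}\right)^{1/2.2}.
\end{aligned}
\]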
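For a single colour, the two sets of formulas above can be evaluated directly. The following Python sketch is one possible transcription, not the ImageMagick route used on this page; the function name and the `kind` argument are illustrative. One detail is an assumption on my part: for a few extreme inputs (for example pure red in the green-blind formula) the expression for B′ becomes very slightly negative, so it is clamped to zero before the root is taken.

```python
def simulate_colour_blindness(rgb, kind):
    """Apply the green-blind or red-blind formulas to one (R, G, B) colour, range 0-255."""
    r, g, b = (float(channel) ** 2.2 for channel in rgb)
    if kind == "green":
        red_green = 4211 + 0.677 * g + 0.2802 * r
        blue = 4211 + 0.95724 * b + 0.02138 * g - 0.02138 * r
    elif kind == "red":
        red_green = 782.7 + 0.8806 * g + 0.1115 * r
        blue = 782.7 + 0.992052 * b - 0.003974 * g + 0.003974 * r
    else:
        raise ValueError("kind must be 'green' or 'red'")

    def to_output(value):
        # Clamp before the root (assumption, see text), then round and
        # keep the result in the range 0-255.
        rounded = round(max(value, 0.0) ** (1 / 2.2))
        return min(255, max(0, rounded))

    # R' and G' are identical by construction.
    return (to_output(red_green), to_output(red_green), to_output(blue))


print(simulate_colour_blindness((0, 0, 0), "green"))
print(simulate_colour_blindness((0, 0, 0), "red"))
```

Because R′ and G′ are always equal, every simulated colour lies on a line between yellow and blue tones, which is why pure hue differences between red and green disappear.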
These conversions should be applied in sRGB colour space, i.e. they work on a standard video display, but not necessarily on paper. The conversion can be performed with the free software suite ImageMagick. The following two commands make green-blind and red-blind versions of original image original.png, respectively:
Table: each scheme shown in normal vision, green-blind vision and red-blind vision.
According to the Web Content Accessibility Guidelines, a contrast ratio between colours of at least 3 is recommended by ISO-9241-3 for standard text and vision, but the Guidelines define a stronger criterion of at least 4.5 to make the colours useful for people with moderately low vision. This includes people with monochrome vision, who only see brightness variations. The criterion actually only applies to body text, but "charts, graphs, diagrams, and other non-text-based information [...] should also have good contrast to ensure that more users can access the information."
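The contrast ratio used by the Guidelines can be computed from the sRGB coordinates of two colours. The sketch below follows the relative-luminance definition of WCAG 2.x as I understand it (including its threshold of 0.03928); the function names are illustrative. The grey 757575 is used as the example because this text singles it out as meeting the 4.5 criterion.

```python
def relative_luminance(hex_colour):
    """Relative luminance of an sRGB colour given as six hexadecimal digits."""
    digits = hex_colour.lstrip("#")
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]

    def linearize(c):
        # Undo the sRGB transfer curve, with the constants used in WCAG 2.x.
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in channels)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(colour_a, colour_b):
    """Contrast ratio between two colours, from 1 (identical) to 21 (black on white)."""
    lum_a = relative_luminance(colour_a)
    lum_b = relative_luminance(colour_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


for background in ("FFFFFF", "000000"):
    print(background, round(contrast_ratio("757575", background), 2))
```

For a whole scheme, the number that matters is the smallest ratio over all pairs of colours, including white and black.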
The criterion cannot be met using more than one print-friendly websmart colour plus white and black. The only blue shades are 4477BB and 5577AA, almost the same as the blue from the bright scheme. A websmart shade of grey is not available: only 757575 meets the criterion. The largest minimum contrast ratio in a set of two print-friendly websmart colours plus white and black is 2.8 and in a set with three such colours 2.1. One example with three colours is the high-contrast scheme. However, with some precautions it can still be applied to lines and symbols: use the colours in the order
The largest minimum contrast ratio with six colours is 1.5. The medium-contrast scheme uses print-friendly websmart colours not darker than the blue in the high-contrast scheme (from light to dark):
All other schemes fail the contrast-ratio criterion completely, as they contain too many colours and were designed for standard, red-blind and green-blind vision, relying not only on brightness differences, but also on hue differences. If one of the other qualitative schemes is used, the best subsets for greyscale conversion are (from light to dark):
The YlOrBr and iridescent sequential schemes work well (Fig. 26). The latter was designed for this purpose, with a linearly varying luminance. Python's default sequential scheme viridis has a similar property, but it is not print-friendly and seems to have fewer discernible colours. The rainbow schemes do not work. By definition, none of the diverging schemes work after greyscale conversion either.
Some data sets need a very specific colour scheme. An example is the global land cover classification, as generated by the University of Maryland Department of Geography from AVHRR data acquired between 1981 and 1994, available at a resolution of 1 km. There is a recommended colour scheme, but the colours are not distinct, some not even in normal vision. Figure 27 gives a more subtle and logical scheme where all colours are distinct in all visions. Figure 28 shows the world with a reduced resolution of 20 km using this scheme. Figure 29 shows only North America at a resolution of 5 km, using almost all classes.
The following figures show the true physical figure of the Earth using two colour schemes: the smooth sunset scheme defined here and a traditional rainbow scheme as defined by many programs (e.g. IDL). Most rainbow schemes contain bands of almost constant hue with sharp transitions between them, which are perceived as jumps in the data. In this example large areas are green without features, while for example the yellow line in northern Australia implies a sudden change that does not exist in reality. The oceanic trench at top right can be seen over the whole length with the sunset scheme, while it becomes difficult to see near Japan with the traditional scheme. This shows that it is more important that a scheme is smooth than that it contains many colours. In addition, colour-blind people have difficulty distinguishing some colours of the rainbow. With the traditional scheme, the yellow spot in the middle of the Sahara is not visible in red-blind vision, making the green areas effectively even larger than they are in normal vision. Last but not least, the sunset scheme emphasizes more clearly the average (light colours) and the extremes (dark colours of contrasting hues).
In this context the meaning of the figures is not really important, but if you're interested: they show the distance between the WGS84 ellipsoid and the geoid calculated with the EGM96 gravity model.
Currently I produce maps in two steps. First, I perform the analysis in whatever program and export the data as a particular type of ASCII table. Then, I apply a colour look-up table to the data and export the result as a PNG file. This way, I don't have to redo the analysis if I want a different colour scale, e.g. to emphasize a different value range. The second step is performed on the command line with the command "convert" of ImageMagick, a free, cross-platform and open-source program suite for image manipulation.
The data are exported to a plain portable graymap (PGM) file, which is an ASCII file starting with "P2 w h 65535 ", where w is the width and h the height of the map, followed by a list of integers in the range 0 to (in my case) 65535 with a space or newline between the values. Because the width and height are given, the spaces and newlines can be put wherever you like. The "P2" identifies the file as a PGM file. I scale the data so all values are in the range 0–65532 (but you don't have to fill this whole range). The other possible values are reserved: 65533 for bad data, 65534 for text and lines and 65535 for no data (e.g. outside the projection of the world).
The colour look-up table (clut) is exported to a plain portable pixmap (PPM) file, which is an ASCII file starting with "P3 1 n 255 ", where n is the number of colours, followed by a list of RGB coordinates of the colours as integers in the range 0–255 with a space or newline between the values. I don't use perfectly white, because that may become transparent later on.
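The two file formats described above are simple enough to write by hand. The following Python sketch shows one way to produce them; it is not the export routine I use, and the function names, the `None` convention for bad data and the sample values are assumptions made for the illustration. Values are written one per line, which the format allows because the width and height are given in the header.

```python
MAX_DATA = 65532   # largest value used for real data
BAD_DATA = 65533   # reserved: bad data
TEXT_LINE = 65534  # reserved: text and lines
NO_DATA = 65535    # reserved: no data, e.g. outside the projection


def scale(value, vmin, vmax):
    """Map a data value to 0-65532; None is treated as bad data (assumption)."""
    if value is None:
        return BAD_DATA
    fraction = (value - vmin) / (vmax - vmin)
    # Clamp so that out-of-range data cannot collide with the reserved values.
    return round(min(1.0, max(0.0, fraction)) * MAX_DATA)


def plain_pgm(rows):
    """Plain PGM text for a list of rows of integers in the range 0-65535."""
    height, width = len(rows), len(rows[0])
    lines = [f"P2 {width} {height} 65535"]
    lines += [str(value) for row in rows for value in row]
    return "\n".join(lines) + "\n"


def plain_ppm_clut(colours):
    """Plain PPM text for a look-up table: an image 1 pixel wide and n pixels high."""
    lines = [f"P3 1 {len(colours)} 255"]
    lines += [f"{r} {g} {b}" for r, g, b in colours]
    return "\n".join(lines) + "\n"


data = [[scale(0.0, 0, 1), scale(0.5, 0, 1)], [scale(None, 0, 1), NO_DATA]]
print(plain_pgm(data), end="")
# Sample colours only; 254 instead of 255 avoids a perfectly white entry.
print(plain_ppm_clut([(68, 119, 187), (254, 254, 254)]), end="")
```

The returned strings can be written to input.pgm and clut.ppm with an ordinary text-mode file write.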
Now the image can be produced. The data is stored in input.pgm, the clut in clut.ppm and the image should be output.png. If none of the 3 reserved values are used, then the command is:
An example: given a map of the albedo at 750 nm including the legend (input.pgm) and a list of colours (clut.ppm), the output of the last two commands above (with "-level 0,9999" and without "-interpolate integer") is this image.
The SRON style for presentations has a dark grey background. I have produced a design template for PowerPoint 2003 or earlier and a theme for PowerPoint 2007 or later with a palette including the regular colours white (for titles), light yellow (for normal text) and orange (for stressed text), but also three extra colours that are suitable for use with projectors: the light blue of the footer and shades of green and red that work well.
The current official style of SRON reports, technotes, etc. can also be achieved in LaTeX and pdfLaTeX. Two files are needed. Under Debian, you may first need to run (once) the command "sudo apt-get install texlive-fonts-extra".
I am an Instrument Scientist with a PhD in atomic physics, working on the TROPOMI and SPEXone projects in the Earth programme of SRON. I have normal colour vision, but many colleagues do not. My email address: firstname.lastname@example.org.