On rainbows
25 December 2011
(Halfway through writing this I double-checked that the techniques it shows were actually new – and, trying different search terms, found that Jim Bumgardner, a Processing expert, had already explained them perfectly well. With his encouragement I’m publishing this anyway.)
Here are two techniques for picking useful colors for data visualization: a rainbow function and a generator of distinct colors.
A rainbow function
Let’s say we want to turn a one-dimensional value into a color: something like rainbow(n mod 1) → 8 bits × RGB. This is useful to show inherently circular information, for example time of day, longitude, or angle, and can be abused for many other purposes. (Keep in mind, of course, that about 1 in 25 people is colorblind.) The usual solution is to use n as the hue of an HSV color with S and V both at the maximum. In Python 3, that might look like this, cribbing from Wikipedia:
    def HSV_vivid(h):
        h %= 1  # wrap the hue into [0, 1)
        h *= 6  # six 60° sectors around the wheel
        c = 1   # chroma: S and V are both at maximum
        x = 1 - abs((h % 2) - 1)  # the channel that ramps up or down within a sector
        rgb = ()
        if h < 1:
            rgb = (c, x, 0)
        elif h < 2:
            rgb = (x, c, 0)
        elif h < 3:
            rgb = (0, c, x)
        elif h < 4:
            rgb = (0, x, c)
        elif h < 5:
            rgb = (x, 0, c)
        else:
            rgb = (c, 0, x)
        return (int(chan*255) for chan in rgb)
Which we can visualize thus:
The white line in the graph is the sum of R, G, and B, at the same scale. Assuming that you perceive each channel as equally and linearly bright,✼ you’ll see different points on this color wheel as having brightness varying by a factor of 2 between (a) the troughs at the three angles where one channel is at max and two are off and (b) the peaks at the angles bisecting those. This is why most digital color wheels have spokes.
✼ Which you don’t. For one thing, greens and yellows are easier to distinguish than blues. On a typical monitor, the point at 60° is often hard to distinguish against white, while the deepest blue at 240° seems almost as dark as charcoal gray. So we can’t use HSV like this when we want the colored elements to have anything close to equal visual weight against any solid background.
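To put rough numbers on that factor of 2, here is a minimal sketch that sums the three channels at the troughs and at the peaks. It leans on the standard library's colorsys.hsv_to_rgb instead of the function above, on the assumption that the two agree for fully saturated colors; the helper name channel_sum is made up for illustration.

```python
import colorsys

def channel_sum(hue):
    """Sum of R, G, and B (each 0..1) for a fully saturated, full-value hue."""
    return sum(colorsys.hsv_to_rgb(hue % 1, 1, 1))

# Troughs: one channel at max, two off (0°, 120°, 240°).
# Peaks: the angles bisecting those (60°, 180°, 300°).
for degrees in range(0, 360, 60):
    print(degrees, channel_sum(degrees / 360))
```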
To make something better, notice that the HSV function is based on a single sharp, mesa-shaped curve. It’s repeated, evenly offset and wrapped around the color wheel (in other words, modulo’d by one rotation), for each channel. We could use a replacement for the mesa curve such that evenly offset-and-wrapped copies sum to a constant. When I wondered aloud about this, Sam immediately pointed to sin². The code looks like this:
    from math import pi, sin

    def sinebow(h):
        h += 1/2
        h *= -1
        r = sin(pi * h)
        g = sin(pi * (h + 1/3))
        b = sin(pi * (h + 2/3))
        return (int(255*chan**2) for chan in (r, g, b))
The first two lines of the function body are just setup to start the 0..1 cycle on red and run in ROYGBIV order clockwise, and π appears because the sine function works on radians. (If you’re into micro-optimization, you can see ways to refactor for constant folding and FMA here, but let’s keep it readable.) Here’s classic HSV again, plus a sin² version:
Depending on your eyes and monitor, the difference could be large or small, but it should be clear that the sine-based one is smoother. Its colors are relatively muted, but only relatively, and they look consistent. Compare 0° with 180° in each wheel, for example. To me, the HSV’s looks like rich, candy apple red v. pale, robin’s egg blue, while the sinebow’s is a much more balanced contrast of medium red v. medium turquoise. We lose the sharp, daffodil yellow at 60° and the deep, inky blue at 240° – but that’s exactly what we set out to do: they may be fine colors, but they aren’t team players. The sinebow gives us a set of colors that really do seem to vary mainly in hue and relatively little in saturation or value.
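The property doing the work here is that three squared sines, offset from each other by a third of their period, always sum to 3/2. A quick numerical check, as a sketch: sinebow_sum is a hypothetical helper that mirrors the sinebow function above but skips the 8-bit rounding.

```python
from math import pi, sin

def sinebow_sum(h):
    """Sum of the three unrounded sinebow channels (each 0..1) at hue h."""
    h = -(h + 1/2)  # same setup as sinebow: start on red, run in ROYGBIV order
    return sum(sin(pi * (h + offset)) ** 2 for offset in (0, 1/3, 2/3))

# Sample the wheel once per degree and report the spread of the channel sum.
sums = [sinebow_sum(i / 360) for i in range(360)]
print(min(sums), max(sums))
```

The rounded 8-bit channels will wobble by a unit or two around this constant, since each one is truncated separately.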
It’s still not perfect, and can’t be without fairly sophisticated knowledge of human vision and the user’s monitor. And, of course, a lot of uses of rainbow color scales are completely inappropriate in the first place. But I think that using the sinebow in place of standard HSV will rarely make things worse.
Incidentally, the sinebow’s code in its most natural form has no branches. This makes it suitable for GPGPU, while a textbook implementation of the classic HSV function has to run each of the 6 cases serially.
A generator of distinct colors
Suppose we want a sequence of m distinct colors, for example to distinguish the layers of a map, the handles of people in a chat room, or objects in a game. To give them relatively similar perceptual weight, we can take them from the sinebow’s color wheel. We start at 0 and move 1/m of the way around to generate the next color. This is simple, fast, and effective. But there are two things we might want that it can’t give us: (1) even where m is large, each color in the sequence should be easily distinguished from its dozen or so sequence neighbors, and (2) we should have a way to deal with situations where we don’t know m when we set up the generator, for example in a chat room that an arbitrary number of people can join. (In fact, if you know the number of colors you want beforehand, you should probably be hand-picking them anyway.)
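As a sketch of the fixed-m case: this assumes a sinebow that returns a tuple rather than a generator (a small change from the version above), and the name even_palette is made up for this example.

```python
from math import pi, sin

def sinebow(h):
    """Sinebow color for hue h (1 = one revolution) as an (r, g, b) tuple of 0..255 ints."""
    h = -(h + 1/2)
    return tuple(int(255 * sin(pi * (h + offset)) ** 2) for offset in (0, 1/3, 2/3))

def even_palette(m):
    """m colors spaced 1/m of a revolution apart, starting at 0."""
    return [sinebow(i / m) for i in range(m)]

for color in even_palette(6):
    print(color)
```

Note that neighbors in this sequence are also neighbors on the wheel, which is exactly the first shortcoming listed above once m gets large.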
Here’s a scarecrow: use random numbers. This actually works fairly well in practice, and it’s certainly simple. But it’s irritating to know that nothing but the odds are protecting us from, like, five successive indistinguishable colors.
We could use some kind of interleaving scheme involving bit-shifting, angle bisection, or prime numbers, but if you pursue this for even a few minutes (which I did), you’ll find yourself dealing with an awful lot of computational complexity or saved state just to pick colors. It would be nice to have an algorithm that’s firmly O(1) in time and space.
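For a flavor of what such a scheme involves, here is one possible reading of the angle-bisection idea: reverse the bits of the index to get a hue, which is the base-2 van der Corput sequence. This is an illustration only, not necessarily the scheme I tried, and bisect_hue is a hypothetical name.

```python
def bisect_hue(n):
    """Hue for index n by repeated bisection: mirror n's bits around the binary point."""
    hue, step = 0.0, 0.5
    while n:
        if n & 1:
            hue += step
        n >>= 1
        step /= 2
    return hue

print([bisect_hue(n) for n in range(8)])
```

The loop runs once per bit of n, so this version is O(log n) per color rather than O(1), and successive hues are not guaranteed to be far apart.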
We could get fancier with randomness and place a point uniformly somewhere in the half of the wheel across from the last point. This is better, but it doesn’t protect us from runs of alternate identical colors. And if we go further on the path of narrowing the window(s) in which we place points randomly, we’re only working toward placing them deterministically, which we can’t do without repetition.
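A minimal sketch of that opposite-half idea, with made-up names next_hue and circular_distance. It prints each new hue's distance from the previous hue and from the one before that; only the first is guaranteed to be at least a quarter turn.

```python
import random

def next_hue(last):
    """Pick a hue uniformly from the half of the wheel opposite `last`."""
    return (last + 0.25 + 0.5 * random.random()) % 1

def circular_distance(a, b):
    """Distance between two hues around the wheel, in revolutions (0..0.5)."""
    d = abs(a - b) % 1
    return min(d, 1 - d)

before_last, last = 0.0, next_hue(0.0)
for _ in range(8):
    new = next_hue(last)
    print(round(circular_distance(new, last), 3),
          round(circular_distance(new, before_last), 3))
    before_last, last = last, new
```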
Wait, no. What we can’t do without repetition is use a rational ratio as an angle. But there is a certain ratio✼ which, when used as a stride around a circle, minimizes total point nearness as the point count increases. It’s the same constant that many plants approximate for the analogous problem of phyllotaxis, or placing leaves around a stalk so they don’t block lower leaves’ sun and dew: φ, the golden ratio. As an angle, it’s about 137.507°. There are a lot of boring, pareidolic claims about φ having special esthetic properties and such, but all you need to believe here is that it’s the most irrational number. Therefore a sequence of colors that are φ apart is as far from repeating as possible in the long run.
(✼ You might know the punk/funk band A Certain Ratio from various songs like Life’s a Scream and Anthem. Their name is taken from Brian Eno’s The True Wheel, possibly the least accessible track of Taking Tiger Mountain (By Strategy), an album named after «智取威虎山», one of the eight model plays approved for performance during the Cultural Revolution. Here’s the most famous film version. In America, the best known of the eight is probably the ballet The Red Detachment of Women, because it was performed for Nixon on his famous visit. This is fictionalized in Nixon in China: in a scene where a woman is whipped, Pat Nixon (represented as a kindly fool), apparently forgetting that it’s fiction, objects and steps onstage. Jiāng Qīng (a sort of needy monster) is provoked by this and sings the famously disturbing aria I Am the Wife of Mao Tse-tung, using the actors as props to praise Máo and prefigure her betrayal, in the Gang of Four, of Zhōu Ēnlái in the second half of the Cultural Revolution – in some versions coldly, in others shrilly – and it all gets very hammy and subtle. For me, it’s the creepiest part of a creepy opera, and it carries a lot of the larger interpretation. In any case, I think Eno claimed the lyrics of The True Wheel came to him in a dream or something, but out and about on the internet one certainly sees people assuming the phrase “a certain ratio” has numerological significance.)
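The numbers are easy to check. The sketch below derives the golden angle from φ and compares the smallest gap between the first few hues for a rational stride and for φ; min_gap and the stride of 1/4 are illustrative choices of mine.

```python
from math import pi

PHI = (1 + 5 ** 0.5) / 2                 # golden ratio, about 1.61803
frac = PHI % 1                           # only the fractional part of a revolution matters
golden_angle_deg = 360 * (1 - frac)      # the smaller arc between successive points
golden_angle_rad = 2 * pi * (1 - frac)

def min_gap(step, count):
    """Smallest circular gap between the first `count` hues at this stride."""
    hues = sorted((i * step) % 1 for i in range(count))
    gaps = [b - a for a, b in zip(hues, hues[1:])]
    gaps.append(1 - hues[-1] + hues[0])  # wrap-around gap
    return min(gaps)

print(golden_angle_deg, golden_angle_rad)
print(min_gap(0.25, 8), min_gap(PHI, 8))
```

A stride of 1/4 lands back on its own first hue after four steps, so its smallest gap collapses to zero; the φ stride keeps every pair apart.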
Here’s an animation that helped me visualize this:
You can write the code as a very small class or generator. All you have to do is keep adding φ ≈ 1.61803 revolutions (≈ 2.39996 radians ≈ 137.50776°) to your last number. Or, of course:
    def nthcolor(n):
        phi = (1+5**0.5)/2
        return sinebow(n * phi)
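For the case where the count isn't known up front, the generator form might look like the sketch below. It assumes a sinebow that returns a tuple instead of a generator, and golden_colors is a name invented for this example.

```python
from itertools import islice
from math import pi, sin

PHI = (1 + 5 ** 0.5) / 2

def sinebow(h):
    """Sinebow color for hue h (1 = one revolution) as an (r, g, b) tuple of 0..255 ints."""
    h = -(h + 1/2)
    return tuple(int(255 * sin(pi * (h + offset)) ** 2) for offset in (0, 1/3, 2/3))

def golden_colors(start=0.0):
    """Yield sinebow colors forever, stepping PHI revolutions each time."""
    h = start
    while True:
        yield sinebow(h)
        h = (h + PHI) % 1  # wrap so h stays in [0, 1) however long this runs

# Take as many colors as happen to be needed, e.g. one per person joining a chat.
for color in islice(golden_colors(), 5):
    print(color)
```

The only saved state is the last hue, so each new color costs constant time and space.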
Let’s see what this looks like compared to some other strides: