Apparently, adjacent notes on a piano (white or black) are always separated by a semitone. Why the distinction, then? Why not just have scales with 12 notes? (apparently there’s a musical scale called Swara that does just that)
I’ve asked several musician friends, but they lack the math skills to give me a valid answer. “Notes are like that because they are like that.”
I need some mathematician with musical knowledge (or a musician with mathematical knowledge) to help me out with this.
Mathematically, is there any difference between white and black notes, or do we make the distinction for historical reasons only?
The first thing you have to understand is that notes are not uniquely defined. Everything depends on what tuning you use. I’ll assume we’re talking about equal temperament here. In equal temperament, a half-step is the same as a frequency ratio of $\sqrt[12]{2}$; that way, twelve half-steps make up an octave. Why twelve?
At the end of the day, what we want out of our musical frequencies are nice ratios of small integers. For example, a perfect fifth is supposed to correspond to a frequency ratio of $3:2$, or $1.5:1$, but in equal temperament it doesn’t; instead, it corresponds to a ratio of $2^{7/12}:1 \approx 1.498:1$. As you can see, this is not a perfect fifth; however, it is quite close.
Similarly, a perfect fourth is supposed to correspond to a frequency ratio of $4:3$, or $1.333\ldots:1$, but in equal temperament it corresponds to a ratio of $2^{5/12}:1 \approx 1.335:1$. Again, this is not a perfect fourth, but is quite close.
And so on. What’s going on here is a massively convenient mathematical coincidence: several of the powers of $\sqrt[12]{2}$ happen to be good approximations to ratios of small integers, and there are enough of these to play Western music.
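To make the coincidence concrete, here is a small Python sketch that compares a few powers of $2^{1/12}$ with the small-integer ratios mentioned in this answer. The function names and the use of cents (1200 times the base-2 logarithm of a ratio) as an error measure are my own choices for illustration, not something the argument depends on.

```python
import math
from fractions import Fraction

# Interval name -> (number of half-steps, target ratio of small integers).
# Only intervals mentioned in this answer are listed.
JUST_INTERVALS = {
    "major third": (4, Fraction(5, 4)),
    "perfect fourth": (5, Fraction(4, 3)),
    "perfect fifth": (7, Fraction(3, 2)),
    "octave": (12, Fraction(2, 1)),
}


def tempered_ratio(half_steps):
    """Frequency ratio of `half_steps` equal-tempered half-steps."""
    return 2 ** (half_steps / 12)


def error_in_cents(half_steps, just_ratio):
    """Signed gap between the tempered and the small-integer ratio, in cents."""
    return 1200 * math.log2(tempered_ratio(half_steps) / float(just_ratio))


for name, (k, just) in JUST_INTERVALS.items():
    print(
        f"{name:15s} 2^({k}/12) = {tempered_ratio(k):.4f}  "
        f"target = {float(just):.4f}  "
        f"error = {error_in_cents(k, just):+.2f} cents"
    )
```

Run as written, this should show the fifth and the fourth each off by about 2 cents, and the major third off by noticeably more, which matches the remark that thirds are only "squeezed" into equal temperament.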
Here’s how this coincidence works. You get the white keys from C using (part of) the circle of fifths. Start with C and go up a fifth to get G, then D, then A, then E, then B. Then go down a fifth to get F. These are the “neighbors” of C in the circle of fifths. You get the black keys from here using the rest of the circle of fifths. After you’ve gone up a “perfect” perfect fifth twelve times, you get a frequency ratio of $3^{12}:2^{12} \approx 129.7:1$. This happens to be rather close to $2^7:1$, or seven octaves! And if we replace $3:2$ by $2^{7/12}:1$, then we get exactly seven octaves. In other words, the reason you can afford to identify these intervals is because $3^{12}$ happens to be rather close to $2^{19}$. Said another way,
$$\log_2 3 \approx \frac{19}{12}$$
happens to be a good rational approximation, and this is the main basis of equal temperament. (The other main coincidence here is that $\log_2 \frac{5}{4} \approx \frac{4}{12}$; this is what allows us to squeeze major thirds into equal temperament as well.)
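The arithmetic behind this can be checked exactly with Python's `fractions` module. This is only a sketch of the numbers above; the variable names are mine, and the leftover ratio $3^{12}/2^{19}$ is what is usually called the Pythagorean comma.

```python
import math
from fractions import Fraction

# Twelve "perfect" perfect fifths stacked on top of each other.
twelve_fifths = Fraction(3, 2) ** 12

# Seven octaves.
seven_octaves = Fraction(2) ** 7

# The leftover ratio is 3^12 / 2^19: close to 1, but not equal to it.
comma = twelve_fifths / seven_octaves

print("twelve fifths :", float(twelve_fifths))
print("seven octaves :", float(seven_octaves))
print("3^12 / 2^19   :", comma, "=", float(comma))

# The same coincidences, written as logarithms.
print("log2(3)   =", math.log2(3), " vs 19/12 =", 19 / 12)
print("log2(5/4) =", math.log2(5 / 4), " vs  4/12 =", 4 / 12)
```

The approximation for the fifth is visibly tighter than the one for the major third, which is why fifths sound better than thirds in equal temperament.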
It is a fundamental fact of mathematics that $\log_2 3$ is irrational, so it is impossible for any kind of equal temperament to have “perfect” perfect fifths regardless of how many notes you use. However, you can write down good rational approximations by looking at the continued fraction of $\log_2 3$ and writing down convergents, and these will correspond to equal-tempered scales with more notes.
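As a rough illustration of the convergents, the following Python sketch expands $\log_2 3$ as a continued fraction and prints the resulting fractions. The helper names are hypothetical, and because it works in floating point I only trust the first handful of terms; exact arithmetic would be needed to go much further.

```python
import math
from fractions import Fraction


def continued_fraction_terms(x, count):
    """First `count` continued-fraction terms of x (floating point, so keep count small)."""
    terms = []
    for _ in range(count):
        a = math.floor(x)
        terms.append(a)
        frac = x - a
        if frac == 0:
            break
        x = 1 / frac
    return terms


def convergents(terms):
    """Convergents h/k built with the standard recurrence h = a*h + h_prev, k = a*k + k_prev."""
    h_prev, h = 1, terms[0]
    k_prev, k = 0, 1
    result = [Fraction(h, k)]
    for a in terms[1:]:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        result.append(Fraction(h, k))
    return result


terms = continued_fraction_terms(math.log2(3), 7)
for c in convergents(terms):
    # The denominator is the number of equal steps per octave;
    # the numerator is how many steps make up an octave plus a fifth.
    print(f"{c.numerator}/{c.denominator}  error = {abs(float(c) - math.log2(3)):.6f}")
```

The denominator of each convergent is a candidate number of notes per octave. The convergent $19/12$ is the 12-note scale discussed above, and the next two have denominators 41 and 53.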
Of course, you can use other types of temperament, such as well temperament; if you stick to 12 notes (which not everybody does!), you will be forced to make some intervals sound better and some intervals sound worse. In particular, if you don’t use equal temperament then different keys sound different. This is a major reason many Western composers composed in different keys; during their time, this actually made a difference. As a result when you’re playing certain sufficiently old pieces you aren’t actually playing them as they were intended to be heard – you’re using the wrong tuning.
Edit: I suppose it is also good to say something about why we care about frequency ratios which are ratios of small integers. This has to do with the physics of sound, and I’m not particularly knowledgeable here, but this is my understanding of the situation.
You probably know that sound is a wave. More precisely, sound is a longitudinal wave carried by air molecules. You might think that there is a simple equation for the sound created by a single note, perhaps $\sin 2\pi f t$ if the corresponding tone has frequency $f$. Actually this only occurs for tones which are produced electronically; any tone you produce in nature carries with it overtones and has a Fourier series
$$\sum_n \left( a_n \sin 2\pi n f t + b_n \cos 2\pi n f t \right)$$
where the coefficients $a_n, b_n$ determine the timbre of the sound; this is why different instruments sound different even when they play the same notes, and has to do with the physics of vibration, which I don’t understand too well. So any tone which you hear at frequency $f$ almost certainly also has components at frequency $2f, 3f, 4f, \ldots$.
If you play two notes of frequencies $f, f'$ together, then the resulting sound corresponds to what you get when you add their Fourier series. Now it’s not hard to see that if $\frac{f}{f'}$ is a ratio of small integers, then many (but not all) of the overtones will match in frequency with each other; the result sounds like a more complex note with certain overtones. Otherwise, you get dissonance as you hear both types of overtones simultaneously and their frequencies will be similar, but not similar enough.
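Here is a toy Python sketch of the overtone-matching claim, under simplifying assumptions of my own: each tone has exactly its first 12 harmonics, the lower note has frequency 1, and two overtones "match" only if they are exactly equal. The function name and the ratio $45:32$ (chosen as an example of a ratio of not-so-small integers) are illustrative, not taken from the discussion above.

```python
from fractions import Fraction


def shared_overtones(ratio, n_harmonics=12):
    """Frequencies that are harmonics of both a tone at 1 and a tone at `ratio`.

    Only the first `n_harmonics` harmonics of each tone are considered.
    """
    lower = {Fraction(n) for n in range(1, n_harmonics + 1)}
    upper = {ratio * n for n in range(1, n_harmonics + 1)}
    return sorted(lower & upper)


for label, ratio in [
    ("3:2", Fraction(3, 2)),
    ("4:3", Fraction(4, 3)),
    ("45:32", Fraction(45, 32)),
]:
    shared = shared_overtones(ratio)
    print(f"{label:6s} shared overtones: {[int(s) for s in shared]}")
```

With these assumptions the $3:2$ pair shares several overtones, the $4:3$ pair slightly fewer, and the $45:32$ pair none at all. Real hearing is more forgiving than exact equality, so treat this as a cartoon of the effect rather than a model of it.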
Edit: You should probably check out David Benson’s “Music: A Mathematical Offering”, the book Rahul Narain recommended in the comments for the full story. There was a lot I didn’t know, and I’m only in the introduction!