Why God plays dice: A pedagogical and accurate explanation of the second law

The second law of thermodynamics has taken a supreme place in the hall of fame of universal laws, owing to its time-asymmetric character and its countless applications in engineering, biology, chemistry, physics and astronomy.[that is: everywhere] Yet, it also holds a supreme place as one of the most mystified laws, one that “nobody really understands”. Luckily, exactly the opposite is the case. In this post I will give you an accurate explanation of the second law that even a child (12+) can understand.

Required background: familiarity with playing dice.

Warning: This post is longer than usual, but it’s worth the effort. 😉


We start with the fact that the Universe is made of elementary particles that have certain properties such as a mass, an electric charge, a position, etc. Those particles (and their interactions) are crucial to understand the second law, but it is actually not crucial that these particles are truly elementary (such as electrons or quarks). Any sort of particle (atoms, molecules or even billiard balls) works for our explanation below. If you like, this is already an important hint at why the second law is so universal.

Indeed, we will use an abstract model and represent each particle simply by a 6-faced die. Each face represents a different state of our particle, for instance, the position of an oxygen molecule in the air surrounding you. Surely, just 6 faces are not enough to represent all states of our particle in an accurate way, but for our explanation it is unimportant whether the die has 6 faces or 6,000,000,000 faces or even more (another hint at the universality of the second law). So for simplicity, we will be happy with 6 faces only, and—believe it or not—this description is surprisingly close to how physicists model a particle mathematically.

Before we get into the details, let me mention for the advanced reader that there is a subtle assumption we are using here, namely that all particles (or all dice) are the same. But the advanced reader can easily check on their own that the explanation remains accurate for multiple sorts of particles (represented, for instance, by dice of different sizes). Moreover, even though I will give a quantum mechanically accurate treatment of the problem at the end, I need to skip one detail related to the indistinguishability of elementary particles. This would require some greater care and advanced mathematics (physics students are not exposed to this before year 3). But again: even this subtlety will not change the basic argumentation below. So after all these preliminary remarks, let’s go…

Warm-up: some simple statistics about Rolling Dice

Take as many dice as fit into your hand and roll them once. Will the dice show any regularity in their behavior? At first sight, one might not think so since each die behaves completely randomly, showing each of its faces with the same probability 1/6 (we assume fair dice here, which—as we will see below—is even justified in our Universe). But think twice!

Even a child knows that certain regularities appear if one rolls many dice (or a single die many times). For instance, many games (such as Monopoly) reward the player for rolling doubles with two dice (such as ⚀⚀, ⚁⚁, ⚂⚂, etc.) because it is an unlikely event: only 6 of the 36 possible outcomes for rolling two dice give doubles. Thus, the probability for doubles is 6/36 = 1/6. If you are a lucky and optimistic player, you might still bet on rolling doubles, after all 1/6 is not a terribly small number. But would you also bet on rolling triples (which have probability 1/6^2 = 1/36) or quadruples (with probability 1/6^3 ≈ 0.005)? Or a decuple (involving 10 dice)? We conclude: highly regular sequences are very uncommon—one typically rolls a sequence where each face appears roughly equally many times. Hence, we get a fantastic play on words: the “irregularity” of sequences can be seen as a “regularity” of rolling dice!
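If you want to check these odds yourself, here is a minimal Python sketch (my own addition, not part of the original argument) that brute-forces the probability of rolling N of a kind and compares it with the closed form 1/6^(N−1):

```python
from itertools import product

def n_of_a_kind_probability(n_dice):
    """Probability that all n_dice dice show the same face (doubles, triples, ...)."""
    outcomes = list(product(range(1, 7), repeat=n_dice))     # all 6**n_dice sequences
    hits = sum(1 for seq in outcomes if len(set(seq)) == 1)  # sequences showing one face only
    return hits / len(outcomes)

for n in (2, 3, 4):
    print(n, n_of_a_kind_probability(n), 1 / 6 ** (n - 1))
# 2: 1/6 ≈ 0.167, 3: 1/36 ≈ 0.028, 4: 1/216 ≈ 0.005
```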

A related example concerns the expected face value of a die. Formally, the expectation value is defined by adding up each possible face value multiplied by its respective probability. Since the probability is the same for each face value, we get (1+2+3+4+5+6)×(1/6) = 3.5. But this number does not quite capture what we “expect to see”. Not only is it impossible to roll a 3.5, but in a single roll the values 3 or 4 (which are close to 3.5) are as likely as the values 1 or 6 (which are far from 3.5).

What we really mean when talking about expecting something around 3.5 is again related to rolling many dice. Suppose you roll 10 of them. Saying that the expectation value is 3.5 means that most of the 6^10 = 60,466,176 possible sequences¹ of faces give a total value close to 3.5×10 = 35, where “close to 35” means, say, a value between 30 and 40. This fact is actually related to the above-mentioned regularity of irregular sequences, but the best way to convince yourself of it is to do the counting. Below is a table showing for N dice how many sequences have a total face value ∑ within some boundary (to make ∑ comparable for different N, we divide it by N and consider ∑/N).

N  | 1 ≤ ∑/N < 2 | 2 ≤ ∑/N < 3 | 3 ≤ ∑/N ≤ 4 | 4 < ∑/N ≤ 5 | 5 < ∑/N ≤ 6
---|-------------|-------------|-------------|-------------|------------
1  | 1           | 1           | 2           | 1           | 1
2  | 3           | 7           | 16          | 7           | 3
3  | 10          | 46          | 104         | 46          | 10
4  | 35          | 275         | 676         | 275         | 35
10 | 89,518      | 9,373,672   | 41,539,796  | 9,373,672   | 89,518

Table 1: Number of sequences that have a given value ∑/N (within some boundary) for different N.
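For readers who would rather let a computer do the counting, here is a short Python sketch (my own addition) that reproduces the rows of Table 1 without enumerating all 6^N sequences, by building up the counts one die at a time:

```python
from collections import Counter

def sum_counts(n_dice):
    """counts[s] = number of sequences of n_dice faces (1..6) that add up to s."""
    counts = Counter({0: 1})
    for _ in range(n_dice):
        new_counts = Counter()
        for total, ways in counts.items():
            for face in range(1, 7):
                new_counts[total + face] += ways
        counts = new_counts
    return counts

def table_row(n_dice):
    """One row of Table 1: number of sequences per interval of the average value."""
    row = [0] * 5
    for total, ways in sum_counts(n_dice).items():
        avg = total / n_dice
        if avg < 2:        row[0] += ways   # 1 <= avg < 2
        elif avg < 3:      row[1] += ways   # 2 <= avg < 3
        elif avg <= 4:     row[2] += ways   # 3 <= avg <= 4
        elif avg <= 5:     row[3] += ways   # 4 <  avg <= 5
        else:              row[4] += ways   # 5 <  avg <= 6
    return row

for n in (1, 2, 3, 4, 10):
    print(n, table_row(n))   # n = 10 gives [89518, 9373672, 41539796, 9373672, 89518]
```

Dividing each row by 6^N reproduces Table 2, and running the same sketch for N = 100 (or, with a little patience, N = 1000) shows the concentration around ∑/N ≈ 3.5 becoming ever sharper.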

By carefully inspecting the table you should observe: for increasing N most sequences start to concentrate around a value ∑/N ≈ 3.5 (that is, the expectation value!) whereas only a (relatively) small number of sequences is far from the expectation value. To make this “relative” character apparent, it is useful to divide each entry in the table above by the total number 6^N of sequences. The result is (up to some rounding error):

N  | 1 ≤ ∑/N < 2 | 2 ≤ ∑/N < 3 | 3 ≤ ∑/N ≤ 4 | 4 < ∑/N ≤ 5 | 5 < ∑/N ≤ 6
---|-------------|-------------|-------------|-------------|------------
1  | 16.7%       | 16.7%       | 33.3%       | 16.7%       | 16.7%
2  | 8.3%        | 19.4%       | 44.4%       | 19.4%       | 8.3%
3  | 4.6%        | 21.3%       | 48.1%       | 21.3%       | 4.6%
4  | 2.7%        | 21.2%       | 52.2%       | 21.2%       | 2.7%
10 | 0.15%       | 15.5%       | 68.7%       | 15.5%       | 0.15%

Table 2: Relative number of sequences that have a given value ∑/N (within some boundary) for different N.

The very attentive reader might complain that the definition of the boundaries above is somewhat arbitrary. For instance, we defined the center region by 3 ≤ ∑/N ≤ 4; that is, we included the boundary values 3 and 4 (please recall the difference between the “(strictly) less” sign < and the “less or equal” sign ≤). The table below shows what happens if we exclude the boundary values from the center region (but include them in the adjacent regions, of course).

N  | 1 ≤ ∑/N ≤ 2 | 2 < ∑/N ≤ 3 | 3 < ∑/N < 4 | 4 ≤ ∑/N < 5 | 5 ≤ ∑/N ≤ 6
---|-------------|-------------|-------------|-------------|------------
1  | 33.3%       | 16.7%       | 0.0%        | 16.7%       | 33.3%
2  | 16.7%       | 25%         | 16.7%       | 25%         | 16.7%
3  | 9.3%        | 28.2%       | 25.0%       | 28.2%       | 9.3%
4  | 5.4%        | 28.2%       | 32.9%       | 28.2%       | 5.4%
10 | 0.29%       | 20.2%       | 59.0%       | 20.2%       | 0.29%
Table 3: The same as Table 2 but for different boundaries.

The result is: whereas there are very strong deviations for N = 1, already for N = 10 the results look similar. Increasing N further, say to N = 100, would make them almost indistinguishable for all practical purposes. Indeed, this is guaranteed by the central limit theorem, but we will not go into further details here. Moreover, it also does not matter that we divided the interval [1,6] into five partitions. We could choose a much finer partition, but if N becomes large we will always see the same effect emerging.² Therefore, we summarize:

Summary: An individual sequence obtained from rolling N dice gives for large N very likely a value ∑/N close to the expectation value of 3.5. That is, “it behaves as expected.” This is so because the vast majority of sequences look irregular, whereas regular sequences deviating strongly from ∑/N ≈ 3.5 are the rare exception (e.g., a sequence consisting of N ⚀'s). This phenomenon is very robust; for instance, it is insensitive to the precise definition of the boundaries as long as N is large enough.³ This is the law of large numbers.

Introducing Entropy: the queen of science

It might not come as a big surprise to the reader that the number of sequences that look regular or irregular is an important quantity, not only for rolling dice but in a broad range of sciences (essentially everywhere some sort of statistical reasoning is needed). For instance, in Table 1 we saw that for 10 dice there are 41,539,796 sequences whose total face value (divided by 10) lies in the interval [3,4], whereas only 179,036 sequences fall into the interval [1,2) or (5,6].

Whenever a quantity is important in science, one introduces a special symbol for it, and for historical reasons we label the number of sequences having some property by the letter W. So, in the last example we have W = 41,539,796 for the property: “10 dice show a total face value (divided by 10) within the interval [3,4].” Moreover, since specifying the property we are interested in can take quite a few words, we introduce another abbreviation and label a property by the letter P. Thus, in short, W(P) denotes the number of sequences with property P.

Now, more than 150 years ago the physicist Ludwig Boltzmann introduced a very important quantity that he called entropy[also for historical reasons] and defined as:

Entropy S = log(W)

Here, “log” denotes the logarithm. If you do not know this function, it doesn’t matter too much because the arguments below remain true even if we forget about the logarithm in all expressions.⁴
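As a tiny illustration (my own addition), take the two properties from Table 1 for N = 10 dice and compare their entropies; the numbers below assume the natural logarithm, as in footnote 4:

```python
import math

W_center = 41_539_796   # sequences with average face value in [3, 4]
W_edge = 89_518         # sequences with average face value in [1, 2)

print(math.log(W_center))   # ≈ 17.54
print(math.log(W_edge))     # ≈ 11.40
# The property backed by more sequences has the higher entropy.
```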

Why is this entropy quantity important? Well, because the second law of thermodynamics says:

Second law of thermodynamics: No process in the Universe can decrease the entropy of the Universe.

Note that I will not explain here what the second law implies and why it is important (this is covered at many other places, and I critically discussed this with Matteo Polettini recently). Instead, I assume that the reader knows at least some of the important implications of the second law. In this post we are concerned with explaining the second law, i.e., with the question: why is it true that entropy increases?

An important hint at the answer is directly contained in the definition of entropy itself. Properties P1 that are implied by a large number of sequences W(P1) have a higher entropy than properties P2 that are implied by a smaller number of sequences W(P2). That is: if W(P1) > W(P2), then S(P1) > S(P2). Hence,

More common sequences have higher entropy.

At this point, it is probably advisable to establish a connection between the abstract definition of entropy above with some “real world” example. So let us consider the perhaps most common example: temperature.

The temperature of a body is related to the average kinetic energy of its constituent particles, e.g., the temperature of the air surrounding you is felt by you due to the collision of many air molecules with your skin. The important point here is: temperature is related to an average perception in which many atoms or molecules participate—a single air molecule hitting you will not generate any temperature perception. You can easily test this with a candle at home:[Only do this with health insurance!] move your finger quickly through the flame and it will not feel hot, but leave your finger in the flame for a while and you will certainly feel the heat.

Everything we say in this post is concerned with such average properties, whose effect is generated by many different particles, because this is what thermodynamics, entropy and the second law are about. The second law tells you how heat (or energy) flows between bodies of different temperature, but it does not tell you the state of every single molecule in the body. Returning to the example of rolling dice: entropy is a powerful concept if you are interested in simple quantities such as the total face value of all dice, but it does not tell you much about the individual sequence that you rolled. In particular, notice that, if you use the faces of the dice to represent velocities of particles, then the expectation value of the dice corresponds to the average kinetic energy of the particles, i.e., it becomes a measure of temperature. Our example of rolling dice is closer to reality than you might have thought initially…

Adding Dynamics

It is good to pause for a moment and to recap. Whenever there are many particles (whether they are represented by dice or atoms), and whenever we ask questions about the average behavior of many of them, then we can group all sequences according to a property P (e.g., the total face value of all dice). We then find that some properties P are implied by a much larger number W of sequences than other properties, and this implies that the entropy S of these properties is also higher. For rolling dice this “dominant” property was the one where the total face value is close to the expectation value.⁵

However, all this reasoning was very “static”, based solely on counting sequences according to some criterion or property. There was no dynamics. Once the dice are rolled, they do not change their state anymore; and perhaps by mere chance we roll a very unusual sequence of dice with low entropy. But the second law of thermodynamics is about processes or changes. It does not say that the entropy now is maximal, it says that the entropy tomorrow is greater than today, which is greater than yesterday. Or more symbolically: S_future > S_past.

In reality there is lots of dynamics and movement even if we humans cannot see it: blood flows in our veins, electric charges flow along our nervous system, tiny molecules are flying through the air, and atoms shake in a crystal. Permanently, all the time! So, if we want to model the blood, electric charges, molecules or atoms in an abstract way using dice, we need to introduce some dynamical rule for them, some law of evolution, some way they can change their faces. Here are three possible examples of how two dice could change their faces during one time step starting at t = 0 and ending at t = 1:

Obviously, which changes are allowed and how fast they happen depends on many difficult physical details. But surprisingly, the second law does not depend on these details. All that matters is that many ways of change are allowed, thereby connecting many different face values. Observe, for instance, that the above examples allow us to turn the pair ⚀⚅ into ⚃⚁ by linking (A) → (B) → (C) at consecutive times.

Does the above dynamics cause an increase in entropy? Clearly, with only three examples this is hard to tell, but the point I want to make is that it is possible to postulate some dynamics that violates the second law!

Example: Suppose the dynamical rules are ⚀→⚀, ⚁→⚀, ⚂→⚀, ⚃→⚀, ⚄→⚀ and ⚅→⚀: whatever the current face value is, it always gets changed to ⚀ (and then stays there forever). This would turn every sequence into the highly regular and uncommon sequence ⚀⚀⚀⚀…⚀. The average face value of this sequence is 1, i.e., it is far from the expectation value 3.5, and it has very low entropy. Thus, this is an example where the dynamics would always decrease entropy, in violation of the second law.
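Here is a minimal simulation of this (deliberately unphysical) rule, my own addition: the rule is applied to one randomly chosen die per time step, and the average face value sinks from roughly 3.5 towards 1, i.e., towards an ever more regular, lower-entropy sequence:

```python
import random

random.seed(1)

# Start from a typical, irregular sequence of 1000 dice.
dice = [random.randint(1, 6) for _ in range(1000)]

for step in range(5001):
    if step % 1000 == 0:
        print(step, sum(dice) / len(dice))        # average face value
    # The second-law-violating rule from the text: whatever the chosen die shows, it becomes 1.
    dice[random.randrange(len(dice))] = 1
```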

But this is not what we observe. The second law is empirically very well tested. So, what forces are at work preventing such “unusual” dynamics?

Time reversal symmetry

Whether it is classical mechanics, electrodynamics, general relativity or quantum mechanics: all microscopic physical theories that we have discovered so far share one fundamental property known as time reversal symmetry.

Roughly speaking, time reversal symmetry guarantees that if some particles can change from one state to another, then it is also possible that the reverse change takes place, and even more: this change takes place at the same speed (physicists also call this “microreversibility”). Which direction of change is realized depends only on the initial condition.

Example: If you throw a ball to me, you know it is possible that I can throw the ball back to you. In which direction the ball flies depends only on which of us holds the ball initially. Moreover, you know that when we both throw the ball at the same speed, the ball has the same time of flight, whether it flies from me to you or from you to me. Thus, there is a complete symmetry between both directions as far as the dynamics is concerned: which direction we observe then depends only on the initial state of the ball.

Another slightly more complicated example: You know that an apple can fall from the tree down to the ground. Now, the apple usually stays down at the ground and does not jump back up. But in principle, you know that you can throw the apple back from the ground up to the tree. Moreover, it will reach exactly the position from which it has fallen if you throw it with the same speed the apple had when it hit the ground. Again, there is a symmetry in the dynamics. That the apple does not bounce back from the ground to the tree is a consequence of the second law: when the apple hits the ground, its energy gets absorbed in the grass, sand and dust of the ground (physicists say it gets “dissipated”), but it is possible in principle that the grass, sand and dust give back their energy to the apple and let it jump up to the tree again. It is the goal of this post that, by the end, you understand why this possibility is so extremely unlikely in our world.

So, let us return to our abstract dice example. If we want to reflect physical reality, we have to build time reversal symmetry into our description. Luckily, this is very simple. In the case of our three examples above, we simply need to allow the reverse direction to also take place. Symbolically, instead of drawing a one-sided arrow →, we simply draw a two-sided arrow ⇄. That’s it!

If you want, time reversal symmetry is a democracy principle built into the laws of the Universe: all particle states have the same rights, no state is preferred over the others. At this point it is worth recalling the statement above about fair dice: all that we said was true only because all dice were “equal”. It is time reversal symmetry that justifies this assumption on a microscopic level.

Time reversal symmetry should also make clear why our “counterexample” violating the second law (given at the end of the previous section) is not legitimate. The laws of the Universe forbid that all dice turn into ⚀ and stay there forever: the reverse transition must be equally possible.

The increase of entropy

We have now all ingredients at hand to explain the second law of thermodynamics for our abstract example, where dice are used to model microscopic particles. Consider the following facts that we have found:

  • Each individual dice sequence is equally probable.
  • Yet, as soon as we consider some average property of many dice, there are “dominant” properties: a certain “expected” property is implied by the vast majority of sequences, whereas the more one deviates from this expectation, the fewer sequences with this property exist (Tables 1-3).
  • Dynamically, the faces of the dice permanently change (for instance, due to interactions between two dice, as in the examples above).
  • Time reversal symmetry guarantees that the direction of change is unbiased: no face value is preferred.

So, what happens if we start from some particular sequence of dice? Suppose this sequence is a rare regular sequence (for instance ⚀⚁⚀⚁⚀⚁ … ⚀⚁). The average face value 1.5 of this sequence deviates strongly from the expected value 3.5, which means its entropy is low. Now, the dynamics start to change the values of the faces, and even though this does not happen randomly, it happens in a way that treats all faces ⚀, ⚁, ⚂, ⚃, ⚄ and ⚅ equally. Since our starting sequence has many ⚀ and ⚁, it is unlikely that the dynamics only turn ⚀ into ⚁ or ⚁ into ⚀. Instead, it is much more likely that face values ⚂, ⚃, ⚄ or ⚅ are created. In fact, we found that the irregular sequences with roughly an equal number of ⚀, ⚁, ⚂, ⚃, ⚄ and ⚅ are by far the most common sequences. Hence, it is much more likely that a regular sequence gets turned into an irregular sequence, that is: its entropy increases.
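To see this concretely, here is a toy simulation (my own addition). As a stand-in for the unbiased microscopic dynamics, one randomly chosen die is re-rolled per time step, so a change from face a to face b is exactly as likely as the reverse change; the entropy is computed as the logarithm of the number of sequences with the current total face value, re-implementing the counting from the sketch below Table 1:

```python
import math
import random
from collections import Counter

random.seed(1)

def log_W(total, n_dice):
    """log of the number of sequences of n_dice faces adding up to `total`."""
    counts = Counter({0: 1})
    for _ in range(n_dice):
        new_counts = Counter()
        for t, ways in counts.items():
            for face in range(1, 7):
                new_counts[t + face] += ways
        counts = new_counts
    return math.log(counts[total])

N = 100
dice = [1, 2] * (N // 2)                # the regular low-entropy sequence 121212...12

for step in range(2001):
    if step % 500 == 0:
        print(step, sum(dice) / N, log_W(sum(dice), N))
    # Unbiased update: re-roll one randomly chosen die (a -> b as likely as b -> a).
    dice[random.randrange(N)] = random.randint(1, 6)

# The average face value drifts from 1.5 towards 3.5 and log W grows:
# the sequence wanders into the vastly more numerous irregular sequences.
```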

Finally, consider what happens if we start from an irregular sequence whose average face value is very close to the expected value 3.5. In this case, we start from a high entropy state, so couldn’t it happen that the dynamics generates a low entropy state over time (such as ⚀⚁⚀⚁⚀⚁ … ⚀⚁) because, after all, all sequences are treated democratically? Yes, this can happen, but it is very unlikely! Why? Because the number of irregular sequences is enormous and unimaginably larger than the number of regular sequences. Go back to Tables 1 & 2 and consider the numbers for N = 10. Then, recall that even a single cubic centimeter of air contains approximately 10^19 = 10,000,000,000,000,000,000 molecules. By extrapolating how the numbers in the tables grow from N = 1 to N = 10, imagine the numbers that arise in any realistic situation with 10^19 or more particles!!! The chance that the Universe turns an irregular sequence into a regular sequence is virtually zero: the time it would take to do this is literally beyond our imagination.

If this explanation went too fast for you, go back and consider each step again. It is not easy to connect all the steps the first time, but the important point that I want to emphasize is: all you need to explain the second law is a basic understanding of the behavior of many rolled dice. There is no complicated algebra, analysis or geometry involved.

Below, I will consider two more important details that make our exposition much closer to physical reality, but they do not change the essential point of our argument. Feel free to skip them upon a first reading and just continue with the Summary at the end.

Conservation laws

I’ve said time reversal symmetry guarantees democracy among all possible particle states, but there is a caveat. Some constraints in the world put a barrier to absolute democracy and these constraints are known as conservation laws.

Indeed, we have already tacitly included one conservation law in our treatment. Namely, we have assumed that the number of dice is constant. That is, if you roll N dice and you count n1 many ⚀, n2 many ⚁, n3 many ⚂, n4 many ⚃ and n5 many ⚄, then you know that there must be N − n1 − n2 − n3 − n4 − n5 many ⚅. Translated into a physical context, this is closely related to the conservation of particles (such as electrons), which can neither be destroyed nor created.

Another important conservation law in physics is conservation of energy. We can build this into our abstract model, for instance, by assuming that the face value of the dice corresponds to the energy of the particle (in some meaningful units). Conservation of energy then implies that the total face value of all dice must remain constant. This has consequences for the dynamics we considered above:

Indeed, you can easily compute that the total face value is conserved in examples (A) and (C), but not in (B). Hence, (B) would violate conservation of energy in this model. Does this change the conclusions we have reached about the second law? No, it doesn’t, but it requires some further thinking!

Let us denote the total face value of all dice by E (instead of ∑ as done in the tables above), where we choose E for energy. If E is constant, the argument above no longer applies directly. For constant E, the number of states W(E) compatible with E cannot change, and consequently the entropy S(E) also remains constant. However, there is still a very interesting second law we can construct in this case that follows the same argumentation as above.

To this end, imagine we divide the N dice into two sets A and B containing NA and NB dice such that NA + NB = N.⁶ Physically, we can think of two different bodies A and B that contain NA and NB atoms, respectively, and that are put into contact to exchange energy. Indeed, while the total energy E = EA + EB is constant, the energies EA and EB of each body can change.

Now, we follow the same procedure as above. Let W(EA,EB) be the number of sequences of all dice that add up to energies EA and EB in A and B, respectively, and let the total entropy be S(EA,EB) = log[W(EA,EB)]. Next, observe that we can write EB = E – EA because the total energy E is conserved. Thus, we can actually write the entropy as S(EA,E-EA).

Let us compare this with our previous situation. Previously, we did not assume E to be constant and the total entropy was S(E) (note that previously we talked about a property P, so now we focus on the property of energy, P = E). We then said that, if we start with a state with low S(E), there is a natural tendency for S(E) to increase because the vast majority of sequences belongs to values of E with large S(E), and time-reversal symmetry guarantees democracy among all states.

Now, we find again a very similar situation: the entropy S(EA,E-EA) depends on the variable property EA and it is larger for values of EA that are implied by a larger number of sequences (by definition). Furthermore, even though time reversal symmetry does not guarantee democracy among all states, it is nevertheless present and guarantees democracy among all states that are compatible with the total energy E. Thus, if we start from a small value of S(EA,E-EA), EA is more likely to evolve in a way such that S(EA,E-EA) increases. The second law continues to hold and the explanation is still the same, we only need to look more carefully.

Indeed, if we look even more carefully, we can derive a well known result. If we label the number of sequences adding up to energy EA by WA(EA) and similarly for WB(EB), the total number of sequences compatible with energies EA and EB equals their product: W(EA,EB) = WA(EA)×WB(EB). At this point, we need one property of the logarithm, which appears in the definition of entropy: namely, the logarithm of a product of numbers is equal to the sum of the logarithms of each number. We thus obtain the following expression for the total entropy:

S(EA,E-EA) = SA(EA) + SB(E-EA).

Next, we consider the case where the total entropy has reached its maximum and we recall some slightly advanced mathematics.[sorry for that!] Namely, since the maximum is an extreme value, the derivative of the above expression with respect to EA must vanish. In equations: dS(EA,E-EA)/dEA = 0. But using that the total entropy S is the sum of the two entropies SA and SB, the condition of maximum entropy implies S’A(EA) = S’B(E-EA), where we used a prime ‘ to indicate a derivative with respect to the argument, i.e., f'(x) = df(x)/dx. Finally, we use one more fact whose justification would require some detour: physicists formally define the inverse temperature 1/T as the derivative of entropy S with respect to energy E. So, in general 1/T = S'(E), and since equal slopes mean equal temperatures, for the particular case above we find the neat and well-known result

TA = TB.

This calls for a little summary. We started from two interacting bodies A and B, whose total energy was conserved, but whose individual energies could change due to the interaction. Applying the second law, and assuming the bodies had enough time to exchange energy such that the total entropy became maximal, we found that the state of maximal entropy is characterized by equal temperatures of the two bodies: TA = TB. Physicists therefore also call the state of maximum entropy thermal equilibrium. Thus, the second law predicts that two bodies exchanging energy will reach the same temperature, in line with what everyone knows intuitively.
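A small numerical check of this statement (my own addition; the group sizes and total energy below are arbitrary choices): split 100 dice into a small body A and a large body B, fix the total face value E, and search for the split EA that maximizes the total entropy. At the maximum, the discrete slopes of SA and SB with respect to energy, i.e., the inverse temperatures, come out approximately equal:

```python
import math
from collections import Counter

def sum_counts(n_dice):
    """counts[s] = number of sequences of n_dice faces (1..6) adding up to s."""
    counts = Counter({0: 1})
    for _ in range(n_dice):
        new_counts = Counter()
        for total, ways in counts.items():
            for face in range(1, 7):
                new_counts[total + face] += ways
        counts = new_counts
    return counts

N_A, N_B = 20, 80           # two "bodies" made of dice (arbitrary example sizes)
E = 150                     # conserved total face value ("energy")

W_A, W_B = sum_counts(N_A), sum_counts(N_B)

# Total entropy S(E_A, E - E_A) = log W_A(E_A) + log W_B(E - E_A); find its maximum.
candidates = [ea for ea in range(N_A, 6 * N_A + 1) if W_A[ea] > 0 and W_B[E - ea] > 0]
best_EA = max(candidates, key=lambda ea: math.log(W_A[ea]) + math.log(W_B[E - ea]))
print("maximum entropy split:", best_EA, E - best_EA)

# Discrete slopes dS/dE (the inverse temperatures 1/T) at the maximum:
slope_A = math.log(W_A[best_EA + 1]) - math.log(W_A[best_EA])
slope_B = math.log(W_B[E - best_EA + 1]) - math.log(W_B[E - best_EA])
print("1/T_A ≈", round(slope_A, 3), "  1/T_B ≈", round(slope_B, 3))
```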

Quantum nuances

This post wouldn’t be complete if I didn’t give you a quantum mechanically accurate treatment of the problem. In fact, this problem is almost ideal for introducing two basic (and perhaps the most important) properties of quantum mechanics.

First of all, you might not be completely happy with the way I introduced dynamics. For instance, consider example (A). I told you that the pair ⚀⚅ at time t = 0 could flip over to the pair ⚁⚄ at time t = 1. “Fine,” you might say, “but what is the state of the dice, for instance, at t = 0.5? Is the state somehow one half of ⚀⚅ plus one half of ⚁⚄? Clearly, this doesn’t make sense: the face of a die is either ⚀ or ⚁, but it can not be something in between.”

Well, what can I say… The face can be ⚀/2 + ⚁/2! Physicists call such a state a “superposition”: the simultaneous existence of two different values ⚀ and ⚁ in the same system. This is a very important consequence of quantum mechanics. Because quantum particles tend to have discrete states (such as the faces of a die), but time itself passes continuously, it must be possible to continuously transform a face value ⚀ into a face value ⚁. This profound insight leads to the requirement that states like α⚀ + β⚁, where α and β are some numbers, must exist. You might find this weird, but the earlier you start accepting it the better…
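If it helps, here is a tiny toy sketch (my own addition) of such a continuous passage from ⚀ to ⚁. Real quantum amplitudes are complex numbers and the actual time dependence is set by the Schrödinger equation, which we ignore here; the only point is that the state interpolates smoothly while the weights always add up to 1:

```python
import math

def state(t):
    """Amplitudes (alpha, beta) of the superposition alpha*|1> + beta*|2> at time t in [0, 1]."""
    alpha = math.cos(math.pi / 2 * t)    # weight of face 1
    beta = math.sin(math.pi / 2 * t)     # weight of face 2
    return alpha, beta

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    alpha, beta = state(t)
    print(t, round(alpha, 3), round(beta, 3), round(alpha**2 + beta**2, 3))
# t = 0 is purely face 1, t = 1 is purely face 2, in between both faces coexist.
```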

The second quantum phenomenon I should mention here is the complementarity of different properties. For instance, initially I wrote (very carelessly) that particles “have certain properties such as a mass, an electric charge, a position, etc.” But from Heisenberg’s uncertainty relation you probably know that some properties can not be precisely defined simultaneously such as the position and velocity of a particle.

We can take care of this complementarity in our toy model by equipping the dice with further properties besides their face value. For instance, in addition to their face value you can imagine that the dice can take on different colors, say blue, red and yellow. Of course, as we said above for the face value, it must also be possible that the dice take on colors “in between”. For instance, you can think of red/2 + yellow/2 as orange.

Now, all this would sound normal and fine (after all, also classical dice can have different colors and face values), but the bizarre quantum world demands the following. Whenever the die has a definite color (blue, red or yellow), then its face value must be in one of these strange superpositions such as ⚀/2 + ⚁/2. Vice versa, if the die has a definite face value (⚀, ⚁, ⚂, ⚃, ⚄ or ⚅), then its color must be “mixed” (e.g., orange, purple or green) and it can not be “pure” blue, red or yellow. This is Bohr’s complementarity: you might be able to define one property well, but the other property becomes blurry and undefined (further philosophical discussion of this can be found in this post).

But do any of these bizarre quantum features jeopardize our explanation of the second law? No, they don’t! The reason is that, as long as we stick to one property (be it face value or color), the above argumentation is just fine. We can define some average property of many dice, count the number of sequences that imply this property, and define entropy as (the logarithm of) the number of these sequences. This “counting of possible sequences” argument is independent of the question whether the face value of a die is ⚀ or ⚁ or ⚀/2 + ⚁/2. This only influences the dynamics (and actually makes it smoother).

The fact that each individual die has complementary properties also does not need to bother us much, because the second law is about average properties of many particles. This then becomes a question of scale (“one versus many”). Recall: rolling a single die has a lot of uncertainty (it makes little sense to talk about “expecting to roll a 3.5”), but many dice will almost certainly have an average face value close to 3.5. Similarly, one “quantum die” has no definite face value and color, but many “quantum dice” can tend to a definite average face value and color.

Summary

The second law of thermodynamics is one of the most charismatic laws of the Universe, and it is also one of the simplest! All you need to explain it is some basic statistical understanding (how do many dice behave?) plus the ability to count.

Personally, I find it hilarious and sad at the same time (and certainly remarkable) how much mystification surrounds the second law in scientific discourse to this day, as epitomized by the famous statements of famous physicists (such as Arnold Sommerfeld and John von Neumann) that “no one really understands thermodynamics or entropy”. But as Lebowitz emphasizes: “there is really no excuse for this [misbehavior].”

Honestly: entropy is the simplest concept we have in physics! If you doubt this, please give me a similarly simple explanation (e.g., based only on counting) of other concepts in physics such as energy, fields, particles, forces, space, time, etc. At least I have no clue what they really are… because they just… are.

So here is a summary of the ingredients you need to make sense of entropy and the second law:

  • Many particles.
  • Focus on some simple average property (technically, physicists call this a “coarse-graining”).
  • Dynamics obeying time reversal symmetry.⁷
  • Start from a low entropy state (if you want entropy to increase).

It is particularly curious to observe that time reversal symmetry is necessary to explain the time-asymmetric second law. Conventionally, the second law is said to be so surprising (so “mysterious”) because microscopic physics has time reversal symmetry, whereas the second law dictates an arrow of time.[Shame on me! I, too, fell into this trap.] But exactly the opposite is the case! Entropy is such a universal concept because time reversal symmetry treats all different sorts of particles and interactions democratically. This is actually something I learned myself while writing this post, so, to never forget it, I summarize it once more:

Time reversal symmetry (plus three other conditions) causes the second law.

Where to go from here?

So, if everything is so crystal clear about entropy and the second law, why is there still research on it? Is it only because people are confused and reinvent the wheel over and over again? Partially yes, but there are some major questions we have not addressed in this post. Those that I am aware of are:

  • For the examples considered here there was one dominant property, i.e., one property that had the vast majority of sequences behind it. But sometimes there are two or more properties that are backed up by a similar number of sequences, or sometimes quantum or classical uncertainties between different properties become relevant. In this case, one needs to generalize Boltzmann’s entropy concept, and one ends up with an entropy that researchers call coarse-grained or observational entropy. Look it up; it has become very popular these days (there is even an observational entropy appreciation club now).
  • If you look at the four assumptions required for the explanation of the second law, you find that they are actually quite different in character. The third (time reversal symmetry) is just a fact of the Universe. But the first two are epistemic and related to what kind of systems we look at and what kind of questions we ask. Whereas in daily life we are always concerned with asking simple average questions about many-particle systems, in modern labs this is not necessarily so, and this is a major driving force behind the emergent field of quantum thermodynamics.
  • Then, there is the real elephant in the room: the fourth assumption (“start from a low entropy state”). Why this assumption is justified is the truly puzzling question in our world—with possible answers ranging from “it has to be like that” to “this will involve new physics”.
  • But wait, before you attempt to answer the previous question, you should question the premise that we can even apply entropy and thermodynamics to the entire Universe. Indeed, the Universe is believed to be unbounded, but theoretical physicists have so far only made sense of entropy and thermodynamics for bounded systems.
  • And even for bounded systems there seem to be some puzzling and worrying questions left, in particular for long-range systems such as gravitating systems. Until today, researchers couldn’t really make sense of entropy for these systems, despite the success of (and all the fuss about) black hole entropy.

Well then, good luck with solving these problems. 🙂


  1. If you have trouble understanding how I get these numbers, consider the following. For one die there are 6 possible sequences; namely, ⚀, ⚁, ⚂, ⚃, ⚄ and ⚅. If you have two dice, then for each of the 6 sequences of the first die the second die can show 6 different faces, so there are 6×6 = 36 = 6^2 possible sequences (which you can easily confirm by counting). Next, consider 3 dice. The first two dice can show 36 different sequences (as we just found out), to which the third die adds 6 further options for each sequence, so the total number of sequences is 36×6 = 6^3. And so it continues: N dice can show 6^N different sequences, which becomes a giant (“exponentially large”) number for large N. ↩︎
  2. Readers who doubt this point are invited to do their own calculations for a much simpler example: a die with two faces only or, put differently, a coin tossing experiment. ↩︎
  3. It also works for many other quantities and not just the expectation value, but for our purposes it is sufficient to focus on the expectation value only. ↩︎
  4. [For the experts.] The logarithm has two nice properties. First, it turns large numbers into small numbers—for instance, log(41,539,796) ≈ 17.54—and in physical applications W will typically be extremely large (beyond imagination, really!). Second, the logarithm of a product of two numbers is equal to the sum of the logarithms of each number. In equations: log(W1×W2) = log(W1) + log(W2). We will use this property only once in this post, but it is a truly important property for many more applications. ↩︎
  5. Indeed, we can even turn things around and define the “expected value” by the property P that is implied by the vast majority of sequences. ↩︎
  6. If you like to push the analogy with rolling dice further, you can imagine dividing the table in front of you into two halves labeled A and B. You then roll the dice and see which of them end up in A or B. This defines NA and NB. ↩︎
  7. For completeness I should mention that one also must demand that the number of conservation laws is small compared to the number of particles (can you see why?). But this is satisfied in nature and shall not bother us here. ↩︎
