This post explains quantum mechanics (QM) without any advanced math. Unlike most introductions, I will focus on the interpretation of QM: what the objects in the theory mean and how they fit into a broader philosophy of doing physics. Specifically, I explain why the Von Neumann-Wigner interpretation, a variant of the standard Copenhagen interpretation, is the correct one. I also explain why a popular alternative to Copenhagen, the many-worlds interpretation, is incorrect.
The footnotes will contain details for more advanced readers. Also, see here for a shorter and more math-heavy version of this post.
What is science?
Let’s start with what we know. As Descartes said, “I think, therefore I am.” We know that subjective experience exists. In philosophy, subjective experiences are called qualia (singular quale). One purpose of science (including physics) is to predict what qualia we will experience, based on our past experiences. This is simply because qualia are, by definition, all that we can experience, so any attempt to verify a scientific theory necessarily involves qualia as inputs and outputs.
This focus on subjective experience may sound fuzzy and unrigorous, especially for those used to classical physics. However, it is actually a very conservative viewpoint. Some may say that the goal of science is instead to understand the objective world around us. That may be the case, but at a minimum, a theory must also be able to make predictions about our experiences. More on this as we go along.
The wavefunction and many-worlds
In this section, I will explain the basic ideas of QM, in the language of the many-worlds interpretation (MWI). MWI provides a convenient way to visualize QM as the continual splitting of a system’s state into many branches, or “worlds”. I will then show that MWI alone cannot be used to make predictions, for both practical and mathematical reasons. However, we can fix it by adding the concept of wavefunction collapse. This produces the Copenhagen interpretation.
Quantum mechanics describes the universe using a mathematical object called a wavefunction, with the symbol $\psi$. In the quantum world, a system can be in a combination of classical states instead of being in one state at a time. For example, a particle can be in two places at once. This is called a superposition.
Fig. 1 shows an example. The particle starts at position A, then over time, it evolves into an equal superposition of positions A and B. (The boxes show instants in time.) At this time, if the experimenter measures the position of the particle, they will obtain either A or B, each with 50% probability1. This is indicated by the “probability amplitude” on top of each box. In QM, probabilities are given by the square of this amplitude. This is called Born’s rule. At any time, the squared amplitudes of all the branches must sum to $1$. We get the number on each box as follows. When a box branches into multiple scenarios, we first multiply its amplitude by the number on each outgoing arrow ($1/\sqrt{2}$ in this example). Then, for each new scenario, we sum over all the incoming arrows. For example, the $1/\sqrt{2}$ on the top box in the superposition comes from $1$ on the initial box times $1/\sqrt{2}$ from the one incoming arrow.
The numbers on the arrows depend on the particular interactions between the particle and its environment. We will not be concerned with those here.
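The box-and-arrow bookkeeping can be sketched in a few lines of Python. This is only an illustration: the function names `propagate` and `born_probabilities` are hypothetical, and the arrow numbers are assumed values chosen to produce an equal superposition like the one described for Fig. 1, not values read off the figure.

```python
import math

def propagate(amplitudes, arrows):
    """Advance the wavefunction by one time step.

    amplitudes: dict mapping a state label to its current amplitude.
    arrows: dict mapping (source, target) to the number on that arrow.
    Returns a dict mapping each target state to its new amplitude.
    """
    new_amplitudes = {}
    for (source, target), weight in arrows.items():
        # Multiply the source amplitude by the number on the outgoing arrow,
        # then sum over every arrow that arrives at the same target.
        contribution = amplitudes.get(source, 0.0) * weight
        new_amplitudes[target] = new_amplitudes.get(target, 0.0) + contribution
    return new_amplitudes

def born_probabilities(amplitudes):
    """Born's rule: the probability is the squared magnitude of the amplitude."""
    return {state: abs(amp) ** 2 for state, amp in amplitudes.items()}

# Assumed setup: the particle starts at A with amplitude 1, and both
# outgoing arrows carry 1/sqrt(2).
initial = {"A": 1.0}
arrows = {("A", "A"): 1 / math.sqrt(2), ("A", "B"): 1 / math.sqrt(2)}

superposition = propagate(initial, arrows)
probabilities = born_probabilities(superposition)
print(probabilities)
```

Up to floating-point rounding, both probabilities come out as one half, and they sum to one as required.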
Of course, the experimenter is also composed of many particles, so should also be included as part of the wavefunction. This is shown in Fig. 2. When the experimenter measures the position, her brain’s particles record a state corresponding to seeing it at either A or B. We say that the experimenter’s state has become entangled with that of the particle.
This shows how physics fundamentally works. To make predictions about qualia, a physical theory associates certain mathematical objects, or states, with qualia such as “seeing the particle in position A”. Given an initial state, classical physics predicts a certain future state, which is confirmed or denied by perceiving its associated qualia. In contrast, QM only predicts probabilities of obtaining future states. One way to confirm QM is then to do many identical experiments and see whether the results converge to the right probabilities2.
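As a rough illustration of what “converging to the right probabilities” means, the sketch below simulates many repetitions of a measurement with two outcomes. The function name `run_experiments`, the number of trials, and the fixed random seed are all assumptions made for the example; the 50/50 probabilities are taken from the equal superposition discussed earlier.

```python
import random

def run_experiments(probabilities, trials, seed=0):
    """Sample `trials` independent outcomes and return observed frequencies."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    states = list(probabilities)
    weights = [probabilities[state] for state in states]
    counts = {state: 0 for state in states}
    for _ in range(trials):
        outcome = rng.choices(states, weights=weights)[0]
        counts[outcome] += 1
    return {state: counts[state] / trials for state in states}

# Predicted probabilities for the equal superposition of A and B.
predicted = {"A": 0.5, "B": 0.5}
frequencies = run_experiments(predicted, trials=100_000)
print(frequencies)
```

With this many trials, the observed frequencies should land within a fraction of a percent of the predicted values, though any single run will not match them exactly.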
Does this mean that we must know the entire state of our brain in order to make or verify any predictions? Of course not. In practice, we rely on our eyes, ears, and other measuring devices to sense the world. This is because external inputs to these devices can reliably induce certain states in our brain. For example, light with a wavelength of 700 nm that goes into our eyes can reliably induce the sensation of “seeing red”. More on this when we discuss measuring devices and decoherence later.
A prediction rule
Is the wavefunction all you need? No. As the experimenter, simply knowing the wavefunction at a given time does not allow you to make predictions, for the very obvious reason that you don’t know which branch you are on. At the least, you must also keep track of your current branch. For example, if you observe the particle at A, you know you are on the top branch of Fig. 2. Then, for future predictions, you must only use the arrows coming out of that state. Since the total probability must still equal one, you must then divide the probability (squared amplitude) on each future box by the current one on your box.
This is shown in Fig. 3 for multiple splittings. (Here, instead of drawing pictures in the boxes, I use letters A, B, etc. to show general states.) Let’s say you observe that you are in state B. Then in the future, you have some chance of being in state D and some chance of being in state E. Each chance comes from dividing the squared amplitude on that box by the squared amplitude on B. Even though the wavefunction contains states F and G at the same time as D and E, there is no probability of reaching those states because there are no arrows coming from B.
This seems like a workable rule for making predictions: whenever you make a measurement, select your branch of the wavefunction and “follow the arrows” from there to predict future measurement results. Note that this rule does not discard the other branches entirely. All branches are still “there” at least mathematically, although most are unreachable in practice.
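Here is a small sketch of this “follow the arrows” rule. The function name `predict` is hypothetical, and the amplitude on B and the arrow numbers are assumed for illustration; they are not the values in Fig. 3.

```python
import math

def predict(current_amplitude, outgoing_arrows):
    """Predict future probabilities from the branch you are currently on.

    current_amplitude: amplitude on your current box.
    outgoing_arrows: dict mapping each reachable state to its arrow number.
    Only arrows leaving the current box are considered.
    """
    current_probability = abs(current_amplitude) ** 2
    predictions = {}
    for target, weight in outgoing_arrows.items():
        future_amplitude = current_amplitude * weight
        # Divide the future squared amplitude by the current one so that
        # the probabilities on this branch sum to one.
        predictions[target] = abs(future_amplitude) ** 2 / current_probability
    return predictions

# Assumed values: you observed B, whose box carries amplitude 1/sqrt(2).
amplitude_b = 1 / math.sqrt(2)
arrows_from_b = {"D": math.sqrt(1 / 3), "E": math.sqrt(2 / 3)}

predictions = predict(amplitude_b, arrows_from_b)
print(predictions)
```

States such as F and G never appear in the result, because no arrows lead to them from B.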
The wavefunction in this picture is globally shared among all observers. However, each person might perceive themselves to be in a different branch, depending on their random measurement results. This is shown in Fig. 4. Experimenters E1 and E2 measure the particle in turn. E1 may get A, so she selects the top branch. At the end of this branch, she perceives that both agree on position A. However, E2 may get B, so she selects the bottom branch, and perceives that both agree on position B. The key point is that in the end, each observer perceives an agreement on the position, so the measurement results are consistent from their own perspective.
This example is similar to a famous thought experiment called Wigner’s friend. Wigner’s friend has historically been very confusing (as you can see from the Wiki article), so let me elaborate. Clearly, E1’s perceptions only depend on the particles in her own brain, not those in E2’s. When I say that she “perceives an agreement”, I mean that she treats E2 as a physical system and interacts with it, by asking her/it about the particle’s position, perhaps. That system then responds, by saying “A” or “B”, for example. This information gets received and stored in her brain in some form. From E1’s perspective, everything is a physical system, including other humans, animals, her own brain, etc. Only a subset of this system (her brain) corresponds to her perceptions3. Again, this is a very conservative viewpoint, since it does not assume other parts of the system correspond to some other entity’s perceptions. In other words, we do not assume other humans/animals/rocks/etc are “conscious”4.
So far so good, right? Unfortunately, this prediction rule does not quite work. Mathematically, you must completely discard the other branches every time you make an observation, and only keep the branch you are on. In other words, there can be no globally shared wavefunction. This is because probability amplitudes, unlike probabilities, can be negative. Quantum interference can cause the amplitude of a given scenario to be zero in a global wavefunction, even when that scenario is reachable in practice. If that branch is selected, it gives $0/0$ for any future probabilities, which is undefined.
As usual, Fig. 5 shows an example. Assume you measure B. By the rule, you predict a 50% probability of either D or E. See Fig. 5(a). Note that we only consider arrows coming from B in this prediction. Then assume D is measured. We now try to apply the rule starting from D. See Fig. 5(b). However, the amplitude of D is zero! This comes from adding the two incoming arrows: the contribution from B and the contribution from C are equal in size but opposite in sign, adding up to zero.
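The failure can be reproduced numerically. In the sketch below, the amplitudes on B and C and all arrow numbers are assumed values (with a minus sign on the arrow from C to D) chosen to produce this kind of cancellation. The next state H and its arrow are also hypothetical.

```python
import math

s = 1 / math.sqrt(2)

# Assumed amplitudes on the boxes B and C in the global wavefunction.
amplitudes = {"B": s, "C": s}

# Assumed arrow numbers. The minus sign on C -> D causes the interference.
arrows = {("B", "D"): s, ("B", "E"): s, ("C", "D"): -s, ("C", "E"): s}

# Global wavefunction at the next step: sum over all incoming arrows.
global_next = {}
for (source, target), weight in arrows.items():
    global_next[target] = global_next.get(target, 0.0) + amplitudes[source] * weight

# Prediction from B, using only the arrows that leave B.
p_d_given_b = abs(amplitudes["B"] * arrows[("B", "D")]) ** 2 / abs(amplitudes["B"]) ** 2

# Now suppose D is observed and we try to predict a hypothetical next state H
# reached by an arrow with number 1. The rule divides by the squared
# amplitude on D, which is zero in the global wavefunction.
try:
    p_h_given_d = abs(global_next["D"] * 1.0) ** 2 / abs(global_next["D"]) ** 2
    rule_failed = False
except ZeroDivisionError:
    rule_failed = True

print(p_d_given_b, global_next["D"], rule_failed)
```

The rule happily predicts a 50% chance of reaching D from B, yet it cannot continue from D, because the global amplitude there has cancelled to zero.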
The solution is to discard all other branches upon each measurement, and set the amplitude of the measured branch equal to 1. This is called wavefunction collapse. It is shown in Fig. 6. When B is measured, we remove C and give B amplitude 1. Then when D is measured, we remove E and give D amplitude 1. This guarantees that probabilities are always well-defined.
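A minimal sketch of collapse, under assumed numbers: the helper names `collapse` and `propagate` are hypothetical, and the arrow values are illustrative ones in which the paths through B and C would cancel at D if both branches were kept.

```python
import math

def collapse(observed):
    """Discard every other branch and give the observed branch amplitude 1."""
    return {observed: 1.0}

def propagate(amplitudes, arrows):
    """Multiply by arrow numbers and sum over incoming arrows."""
    new_amplitudes = {}
    for (source, target), weight in arrows.items():
        if source not in amplitudes:
            continue  # discarded branches contribute nothing
        new_amplitudes[target] = new_amplitudes.get(target, 0.0) + amplitudes[source] * weight
    return new_amplitudes

s = 1 / math.sqrt(2)
# Assumed arrow numbers, including a minus sign on C -> D.
arrows = {("B", "D"): s, ("B", "E"): s, ("C", "D"): -s, ("C", "E"): s}

# Measure B: remove C and give B amplitude 1.
after_b = collapse("B")
next_step = propagate(after_b, arrows)
next_probabilities = {state: abs(amp) ** 2 for state, amp in next_step.items()}

# Measure D: remove E and give D amplitude 1.
after_d = collapse("D")
print(next_probabilities, after_d)
```

Because C has been discarded before the next step, its arrow into D no longer cancels anything, and the amplitude on the observed branch is never zero.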
Wavefunction collapse is the most controversial aspect of QM. However, from the discussion above, we see that it is basically just a mathematical formality, since the prediction rule is unchanged except in special cases. Remember, we are only concerned with making predictions, not “modeling the world”. This avoids meaningless philosophical issues about whether the wavefunction or its collapse is “real”. Many are uncomfortable with collapse because it differs from classical physics in the following ways:
- Different observers use different wavefunctions. In MWI, although observers may find themselves in different branches, there is only one wavefunction. Similarly, the classical universe is in a single big classical state. However, by discarding the other branches, different observers use entirely different mathematical objects (wavefunctions) to describe the universe. Of course, the physics stays the same, since as just mentioned, the prediction rule is almost the same.
- Wavefunction collapse happens instantaneously. In classical physics, the state evolves continuously in time under Newton’s laws. In quantum physics, apart from wavefunction collapse, the wavefunction also evolves continuously in time under an equation called Schrödinger’s equation5. (We have summarized this continuous evolution using the arrows with numbers on them.) Wavefunction collapse instantly discards the other branches and assigns a new amplitude to the observed branch. How is such a discontinuous process allowed? Because any predictions must specify a time when the measurement yields a definite result. This is when collapse occurs6. More on this later.
The Copenhagen interpretation
This theory of wavefunction evolution plus collapse is loosely called the Copenhagen interpretation. Actually, there is no widely-agreed-upon definition of the Copenhagen interpretation, but one hallmark is the separation of the world into classical and quantum systems. QM was originally developed to describe small objects such as single particles using a wavefunction. In contrast, large objects such as photon detectors or human beings were treated as classical systems that cause wavefunction collapse. For example, a particle detector appears to “collapse” the wavefunction of a superposition state like Fig. 1 into a state with definite position, either A or B. In this picture, the particle detector is not part of the wavefunction.
Of course, this led to much confusion about where exactly to draw the line between classical and quantum. How large does a system have to be in order to become classical? As we have argued above, there is no inherent difference between objects such as particles and humans; they are all quantum systems and all part of the wavefunction. In other words, we draw the line at the observer’s “consciousness”. The act of observation causes collapse. This variant of Copenhagen is sometimes called the Von Neumann-Wigner interpretation, or “consciousness causes collapse”.
Consciousness is a dirty word among serious physicists, almost always for good reason. However, we simply use it to mean the ability to have subjective experiences, which was our very first assumption.
Measuring devices and decoherence
This raises the question of why large systems like particle detectors tend to “look” classical. In fact, this was not fully understood until the theory of decoherence emerged in the 1950s-1970s, decades after QM was developed. The basic idea is quite simple. Take a small system in one of a few states $s_1$, $s_2$, $s_3$, etc. When it interacts with an environmental system $E$, this environment turns into a corresponding state $E_1$, $E_2$, $E_3$, etc. For a large environment, these environmental states tend to become well-separated very quickly. This is because there are many more microscopic states that the large environment can take.
For example, Fig. 7 shows a single particle bouncing around in a box. This is a small environmental system. If another particle is placed at position A (top left), eventually they will hit each other, affecting the path of the first particle in some way. If instead the second particle is placed at position B (bottom left), it will affect the first particle in a different way. However, there is a good chance that at some future time, the first particle will happen to be at (nearly) the same location for both scenarios, as seen in Fig. 7.
Now consider a huge number of particles bouncing around in the box. This is a large environmental system. If a new particle is introduced at position A, it will rapidly scramble the paths of all the other particles as they interact with it and with each other. If instead the new particle is introduced at position B, it will scramble the paths in a very different way. At any future time, there is very little chance that all the original particles will be at all the same locations in the two scenarios. The environmental states for the two scenarios, $E_A$ and $E_B$, are well-separated.
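One way to put a number on “well-separated” is the overlap between the two environmental states, where 1 means identical and 0 means fully distinguishable. The toy model below assumes each environmental particle is disturbed independently, so that the total overlap is the product of the single-particle overlaps. The function name `environment_overlap` and the value 0.99 are assumptions for illustration only.

```python
def environment_overlap(single_particle_overlap, num_particles):
    """Overlap between two environment states in an independent-particle toy model.

    For a product of independent single-particle states, the total overlap
    is the single-particle overlap raised to the number of particles.
    """
    return single_particle_overlap ** num_particles

# Assumed: each particle ends up only slightly different in the two scenarios.
small_environment = environment_overlap(0.99, 1)
large_environment = environment_overlap(0.99, 10_000)

print(small_environment, large_environment)
```

Even though each particle is barely affected, the overlap for the large environment is astronomically small, while the single-particle environment remains almost indistinguishable between the two scenarios.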
Fig. 8 shows a more accurate version of the measurement in Fig. 2, incorporating decoherence. The wavefunction initially splits into an equal superposition of position states A and B of the particle. At this time, the experimenter is in the same initial state for both branches. The experimenter then measures the particle by interacting with it. For example, there may be some light illuminating the particle, which goes into the experimenter’s eyes, which sends an electrical signal to the brain, etc. After a short amount of time, the experimenter’s brain is in very different states for the two scenarios A and B. This is seen by the nearly zero amplitude of the “observed B” state when the particle is at A (top-most branch), and the nearly zero amplitude of the “observed A” state when the particle is at B (bottom-most branch).
To summarize: a measuring device looks classical if it causes decoherence. Therefore, you might think that decoherence can be used to define measurement, so that we do not need wavefunction collapse. This is not the case, for a couple of reasons. First, decoherence is never complete. In most decoherence models, the amplitude of the “wrong” branch approaches zero exponentially with time, but never reaches it. Therefore, we cannot define a time when the measurement is complete. Second, decoherence is only an emergent property of large systems. Why should conscious observers be limited to these systems? Indeed, how do we set a lower limit on the size or amount of decoherence anyway? Clearly, we cannot. The theory must still apply to general quantum systems as observers.
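The first point can be illustrated with a toy decay model. The sketch assumes the “wrong” branch amplitude falls off as $e^{-t/\tau}$ with an arbitrary time constant $\tau$. The function names, the value of `tau`, and the thresholds are all hypothetical.

```python
import math

def wrong_branch_amplitude(t, tau=1.0):
    """Assumed toy model: exponential decay of the wrong-branch amplitude."""
    return math.exp(-t / tau)

def time_to_fall_below(threshold, tau=1.0):
    """Time at which the toy amplitude reaches the chosen threshold."""
    return tau * math.log(1 / threshold)

# Any definition of "measurement complete" needs an arbitrary threshold.
thresholds = [1e-3, 1e-6, 1e-9]
completion_times = [time_to_fall_below(eps) for eps in thresholds]

for eps, t in zip(thresholds, completion_times):
    print(eps, t, wrong_branch_amplitude(t))
```

Each stricter threshold pushes the “completion time” later, and the amplitude is still nonzero at every one of those times, so the model never singles out a moment when the measurement is finished.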
For example, consider an observer system that fluctuates rapidly in time, as in Fig. 9. The theory must still be able to associate states of this system with the observer’s perceptions. Since the branches do not remain separated over time, we cannot rely on decoherence. We also cannot say a state must be stable for a minimum amount of time in order to be measured. The observation, and thus collapse, must happen instantaneously.
Other interpretations
The Copenhagen interpretation has always been the standard one taught in textbooks. In the last few decades, many other interpretations have sprung up. I myself believed in MWI until I started thinking deeply about QM a few years ago. In my opinion, these other interpretations all stem from misunderstanding either the Copenhagen interpretation or the purpose of a physical theory. I will list some of them and their flaws here without further detail.
- MWI is incomplete, as argued above.
- Bohmian mechanics and consistent histories are ugly and overly complicated.
- Quantum Bayesianism and relational quantum mechanics just dress up Copenhagen with some fancy words.
Summary
- The minimum requirement for a scientific theory is that it makes predictions about an observer’s qualia. It does not have to predict the qualia of other entities, since they are not observable.
- A theory does this by associating mathematical objects, or states, to certain qualia.
- Classical physics predicts one future state, while quantum physics only predicts probabilities of each future state. This is done using a wavefunction that splits into multiple scenarios.
- The wavefunction collapses upon an observation to the observed branch. Thus, different observers use different objects (wavefunctions) to describe the universe. Collapse is required mathematically for the theory to work.
- Collapse must be instantaneous for the theory to apply to all possible observers.
- Decoherence explains why certain objects look like classical measuring devices. However, it is only an approximation and does not replace the need for collapse.
1 Why can’t we observe the particle in two places at once? There are two ways to interpret this question in QM. 1) Why do we prefer the position basis instead of another basis? This is known as the preferred-basis problem. The short answer is that the preferred basis must be empirically determined, just as the perception of the color “red” must be correlated with certain wavelengths of light. More in the advanced version of this post. 2) Why can’t we perceive that we are in a superposition, in general? Because then we could prepare an identical state, violating the no-cloning theorem. More on this here, or see Nielsen & Chuang’s textbook.
2 To be pedantic, no experiments can be truly identical, because 1) the initial states cannot be exactly the same, and 2) the state of your brain has to include the memory of previous experiments. Of course, we really mean that for a series of experiments where we control all the relevant inputs, the results stored in your brain will converge to the predicted probabilities. Also, it goes without saying that many states are associated with the same quale: shifting the position of one molecule in your brain by a tiny amount has no observable effect.
3 This raises the question: how do we know what subset we can observe? As usual, we must determine this empirically!
4 Yes, this is basically solipsism. Unfortunately, that is where the logic of QM leads us. Don’t take it so seriously as to affect your personal moral code or anything.
5 Or more generally, the operator generated by the Hamiltonian.
6 Another common belief is that collapse is incompatible with relativity. This is false. Of course, we do not have a complete theory of quantum gravity, but for QFT in curved space, we can choose the collapse to occur on any spacelike hypersurface. This is because spacelike-separated operators commute, so they can be simultaneously measured.