Quantum mechanics explained

After being a strong believer in the many-worlds interpretation of quantum mechanics for years, I have now completely changed my mind. Many-worlds is seriously flawed, and the good old Copenhagen interpretation is not so bad.

Specifically, the correct interpretation of quantum mechanics is the von Neumann-Wigner interpretation, a flavor of Copenhagen that puts the Heisenberg cut at the observer’s consciousness. The orthodox Copenhagen interpretation, which allows placing the cut at a physical measuring device, is a useful approximation that works because of decoherence.

What is physics?

Understanding quantum mechanics requires thinking carefully about what physics is and is not. The point of a physical theory is to make predictions about sensory experience. It is only about modeling the world if this helps to make predictions. Thus, the observer’s consciousness[1] is just as fundamental as the mathematical objects of the theory. In classical physics, this is obscured because the mathematical objects of the theory are shared among all observers, rendering the observer apparently redundant. Quantum mechanics relaxes this assumption and allows different observers to use different mathematical objects (wavefunctions).

Quantum and classical compared

Let me elaborate on classical and quantum physics.

Classical mechanics describes a system of particles with positions and momenta that evolve in time under Newton’s laws. Quantum mechanics is quite similar: it describes a system of particles with a field called the wavefunction that evolves in time under Schrödinger’s equation[2]. If that were the whole story, quantum mechanics would be pretty much the same as classical mechanics.

However, these are just mathematical constructs so far. How do we actually verify classical mechanics? We can only sense the set of particles corresponding to our body/brain, so we must find a way to cause the system of interest to interact with these particles. In other words, we must split the universe into system and observer[3]. Then we must assign different states of our state space to different perceptions corresponding to the results of a measurement.

This is exactly what happens in quantum mechanics as well. The difference is that quantum mechanics contains superposition states, while observers can only distinguish between orthogonal states. Thus, there must be a rule to say which orthogonal state in a superposition the observer actually perceives: Born’s rule[4].

Why many-worlds fails

Many-worlds seems like a simple and attractive idea that accomplishes the goal: it tells you what an observer perceives using only unitary evolution of a global wavefunction, similar to classical physics. However, it is seriously flawed. Many-worlds models a measurement as follows:

\displaystyle \left(\sum_i c_i | s_i \rangle \right) \otimes |O_0\rangle \rightarrow \sum_i c_i |s_i'\rangle \otimes |O_i\rangle

where |s_i\rangle are the system basis states, |s_i'\rangle are the new system states for each |s_i\rangle, |O_0\rangle is the initial observer state and |O_i\rangle are the final observer states. The |s_i'\rangle are left arbitrary to include both destructive and non-destructive measurements. Measurement is complete upon decoherence, when \langle O_i|O_j\rangle \approx \delta_{ij}. Then the states |O_i\rangle are interpreted as the different perceptions of the observer.
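The measurement model above can be checked numerically on a toy case. The following is a minimal sketch, not part of the original argument: it assumes a two-state system, a three-level observer (one "ready" level plus one record level per outcome), a non-destructive measurement with |s_i'\rangle = |s_i\rangle, and arbitrary example amplitudes. The names `basis`, `ready` and `saw` are made up for illustration.

```python
import numpy as np

def basis(dim, k):
    """Return the k-th standard basis vector of a dim-dimensional space."""
    v = np.zeros(dim, dtype=complex)
    v[k] = 1.0
    return v

# Observer levels (hypothetical labels): 0 = ready state |O_0>,
# 1 and 2 = records of the first and second system state.
ready, saw = 0, {0: 1, 1: 2}

# Build U as a permutation of product basis states |s_i> (x) |O_k>,
# with combined index 3*i + k. It swaps |s_i, ready> with |s_i, record_i>
# and leaves everything else alone, so it is unitary and non-destructive.
U = np.eye(6, dtype=complex)
for i in (0, 1):
    a = 3 * i + ready
    b = 3 * i + saw[i]
    U[[a, b]] = U[[b, a]]          # swap the two rows

c = np.array([0.6, 0.8j])                   # example amplitudes c_i (assumed)
psi_in = np.kron(c, basis(3, ready))        # (sum_i c_i |s_i>) (x) |O_0>
psi_out = U @ psi_in                        # sum_i c_i |s_i> (x) |O_i>

expected = sum(c[i] * np.kron(basis(2, i), basis(3, saw[i])) for i in (0, 1))

# Observer's reduced density matrix: trace out the system.
A = psi_out.reshape(2, 3)                   # A[system index, observer index]
rho_O = A.T @ A.conj()
print(np.round(np.diag(rho_O).real, 3))     # populations of the observer levels
```

In this idealized case the record states are exactly orthogonal, so the observer's reduced density matrix is diagonal with entries |c_i|^2 on the record levels.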

This has several problems. In order of least to most serious:

1. Decoherence is never complete.

What happens in this case? Observers can only distinguish between orthogonal states. An idea is to rewrite the final wavefunction as a sum of direct products in some orthonormal observed basis |O_i''\rangle:

\sum_i c_i'' |s_i''\rangle \otimes |O_i''\rangle

Then the observed system states c_i'' |s_i''\rangle would simply be slightly different from the original ones c_i |s_i'\rangle, corresponding to a small error in the measurement.
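One concrete way to do this rewriting is the Schmidt decomposition, computed with a singular value decomposition. This is only a sketch of one possible choice: the text above does not say which orthonormal basis to use, and the Schmidt basis additionally makes the system states orthonormal. The overlap `eps` and the amplitudes are assumed example values.

```python
import numpy as np

eps = 0.1                                    # residual overlap of the two observer states (assumed)
c = np.array([0.6, 0.8])                     # example amplitudes c_i (assumed)
s = np.eye(2)                                # rows: the two system states |s_i'>
O = np.array([[1.0, 0.0],
              [eps, np.sqrt(1 - eps**2)]])   # rows: observer states |O_i>, not quite orthogonal

# Coefficient matrix A[a, b] of sum_i c_i |s_i'> (x) |O_i> in the product basis.
A = sum(c[i] * np.outer(s[i], O[i]) for i in range(2))

# SVD: A = W diag(sigma) Vh, i.e. the state equals
# sum_k sigma_k |w_k> (x) |v_k> with w_k = W[:, k] and v_k = Vh[k, :].
W, sigma, Vh = np.linalg.svd(A)

print("overlap of original observer states:", O[0] @ O[1])
print("Schmidt coefficients:", np.round(sigma, 4))
print("original |c_i| (sorted):", np.sort(np.abs(c))[::-1])
```

The Schmidt coefficients come out close to, but not equal to, the original |c_i|, which is the "small error in the measurement" described above.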

2. It assumes the observer is not entangled with the system before measurement.

This is obviously false most of the time! Everything is usually entangled with everything else. To generalize the above, what we actually want is some rule for “hopping” between perceived states of the observer, given an arbitrary entangled state \psi(t). I invite you to come up with such a hopping rule. Seriously, try it.

For example, consider this plausible attempt at a hopping rule. The probability of hopping from state i at time t, to state j at time t+\Delta t, is:

p_{i\rightarrow j} = \displaystyle \frac{\text{tr}\left( P_j e^{-iH\Delta t} P_i \rho(t) P_i e^{iH\Delta t}\right)}{\text{tr}\left( P_i \rho(t)\right)}

where P_i is a projection operator corresponding to state i and \rho(t) is the density matrix[5]. This has the required property that \sum_j p_{i\rightarrow j} = 1, since \sum_j P_j = 1, the trace is cyclic, and P_i^2 = P_i. This gives the same probabilities that would be observed if the state had collapsed to i at time t, but without actually collapsing the state. The problem is that the denominator can be zero, since there is a nonzero probability that the previous hop landed in the state i even if \text{tr}\left(P_i \rho(t)\right) = 0. The state actually has to collapse to ensure this doesn’t happen.
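The attempted hopping rule is easy to try out numerically. The sketch below assumes a two-level observer tensored with a two-level "rest of the world", a random Hermitian H, and a random pure state; the function name `hop_probabilities` is made up for illustration. Rows where the denominator vanishes are returned as NaN to make the failure visible.

```python
import numpy as np

def hop_probabilities(rho, H, projectors, dt):
    """Return p[i, j] from the hopping rule; rows with tr(P_i rho) = 0 stay NaN."""
    w, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * w * dt)) @ V.conj().T   # e^{-iH dt}
    n = len(projectors)
    p = np.full((n, n), np.nan)
    for i, Pi in enumerate(projectors):
        denom = np.trace(Pi @ rho).real
        if np.isclose(denom, 0.0):
            continue                                     # the rule is undefined here
        evolved = U @ Pi @ rho @ Pi @ U.conj().T
        for j, Pj in enumerate(projectors):
            p[i, j] = np.trace(Pj @ evolved).real / denom
    return p

rng = np.random.default_rng(0)
dim_o, dim_s = 2, 2
# P_i = P_{Oi} (x) 1, with the observer as the first tensor factor.
projectors = [np.kron(np.diag(row), np.eye(dim_s)) for row in np.eye(dim_o)]

A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2                                 # random Hermitian Hamiltonian
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

p = hop_probabilities(rho, H, projectors, dt=0.1)
print(p, p.sum(axis=1))                                  # each row sums to 1

# Edge case: the global state has no weight at all on observer state 1.
psi_edge = np.zeros(4, dtype=complex)
psi_edge[0] = 1.0
p_edge = hop_probabilities(np.outer(psi_edge, psi_edge.conj()), H, projectors, dt=0.1)
print(p_edge)                                            # second row is undefined (NaN)
```

For a generic state the rows are valid probability distributions, but the second call shows the division-by-zero case the text describes.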

3. It assumes the many worlds never re-merge or overlap.

Consider the observer’s density matrix \rho_O(t)=\text{tr}_S(\rho(t)). The diagonal elements in the observed basis \rho_{Oii} = \langle O_i | \rho_O(t) | O_i\rangle are constantly evolving into each other, with \sum_i \rho_{Oii} = 1. A hopping rule is impossible because you cannot tell which previous state a certain \rho_{Oii} “came from” in the past, unless you assume each state comes from just one past state. This is clearly not true in general.
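To see the diagonal elements flowing into each other, here is a small sketch with an assumed toy Hamiltonian H = X \otimes X on one observer qubit and one system qubit (chosen only because its time evolution has a closed form). The helper names `observer_state` and `populations` are illustrative.

```python
import numpy as np

def observer_state(rho, dim_o, dim_s):
    """Partial trace over the system: rho_O = tr_S(rho), observer factor first."""
    return np.einsum("asbs->ab", rho.reshape(dim_o, dim_s, dim_o, dim_s))

X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.kron(X, X)                      # toy coupling (assumed)
psi0 = np.zeros(4, dtype=complex)
psi0[0] = 1.0                          # observer and system both start in state 0

def populations(t):
    """Diagonal of rho_O(t) in the observed basis."""
    # (X (x) X)^2 = 1, so e^{-iHt} = cos(t) 1 - i sin(t) H exactly.
    U = np.cos(t) * np.eye(4) - 1j * np.sin(t) * H
    psi = U @ psi0
    rho = np.outer(psi, psi.conj())
    return np.diag(observer_state(rho, 2, 2)).real

for t in (0.0, 0.5, 1.2):
    print(t, np.round(populations(t), 4))
```

The populations always sum to 1 while weight moves back and forth between the two observer states, and nothing in \rho_O(t) itself records which earlier state a given population came from.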

Many-worlds proponents sometimes argue that macroscopic systems in different states are unlikely to revisit the same state. However, then one must pick a certain size (dimensionality) above which re-merging becomes “acceptably” unlikely. There is clearly no fixed size. For an exact theory of physics, one cannot ignore edge cases like this just because they are rare. Ironically, while many-worlds proponents like to point to the seemingly arbitrary nature of wavefunction collapse, it is many-worlds that places arbitrary restrictions on what systems can be considered observers.

Why Copenhagen is fine

The key insight of the Copenhagen interpretation (i.e. quantum mechanics itself) is that a global (objective) reality is not required to make predictions.

One way to understand this is with the Wigner’s friend thought experiment, which I have slightly extended below.

Wigner prepares a two-state system in a superposition, alongside his friend, in the product state

(a|\uparrow\rangle + b|\downarrow\rangle)\otimes |\psi_{friend}\rangle

When the friend measures the system, he may obtain the state |\uparrow\rangle. He then tells Wigner his result, so that in the friend’s view, Wigner knows that |\uparrow\rangle was measured. However, Wigner models this measurement as the total state

a |\uparrow\rangle\otimes |\uparrow_{observed}\rangle + b |\downarrow\rangle\otimes |\downarrow_{observed}\rangle

When Wigner measures his friend (by asking him about it, perhaps), he may see a different state |\downarrow\rangle\otimes |\downarrow_{observed}\rangle, so he believes that |\downarrow\rangle was measured. Thus, they may both experience totally different things. But each observer sees an internally consistent story, so the theory is consistent. That’s it.

Measuring devices

This subjective view of physics implies that measurements are made on the observer’s Hilbert space, not on external measuring devices. Then why can some objects be considered classical measuring devices in practice? The answer comes down to decoherence. I will explain this in a somewhat roundabout way that highlights the behavior of real measuring devices.

Recall the textbook measurement postulate: a measurement collapses the system to an eigenstate of the measured Hermitian operator, with probability given by Born’s rule. This is often false in practice! For example, in quantum optics, a photodetector may measure the position of a photon, but it collapses the system to the state of “no photon”.

Real-world measurements are described by so-called general measurements[6]. These are defined by a set of operators M_i corresponding to the results of the measurement. The probability for result i is:

p_i = \langle \psi | M_i^\dagger M_i | \psi\rangle

upon which the wavefunction collapses to

\displaystyle |\psi\rangle \rightarrow \frac{M_i |\psi\rangle}{\sqrt{\langle \psi | M_i^\dagger M_i | \psi\rangle}}

The measurement operators satisfy the completeness relation

\sum_i M_i^\dagger M_i = 1

The M_i do not have to be Hermitian. For a photodetector, they would be something like M_\textbf{n}=|0\rangle\langle \textbf{n}|, where \textbf{n} are some properties of the photon, like position and polarization. General measurements reduce to conventional (projective) measurements when the M_i are Hermitian and orthogonal projectors: M_i M_j = \delta_{ij} M_i.
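A toy version of the photodetector makes these formulas concrete. The sketch below assumes a three-dimensional space spanned by the vacuum |0\rangle and a photon at one of two positions; to satisfy the completeness relation on the whole space it also adds a "no click" operator M_0 = |0\rangle\langle 0|, which is my assumption and not something stated above. The helper names `ket` and `measure` are illustrative.

```python
import numpy as np

def ket(k, dim=3):
    """Basis vector: k = 0 is the vacuum, k = 1, 2 a photon at position k."""
    v = np.zeros(dim, dtype=complex)
    v[k] = 1.0
    return v

vac = ket(0)
# M_n = |0><n|; n = 0 is the assumed "no click" operator, n = 1, 2 are clicks.
M = [np.outer(vac, ket(n).conj()) for n in range(3)]

def measure(psi, ops):
    """Return (probabilities, post-measurement states) for measurement operators ops."""
    probs, posts = [], []
    for M_i in ops:
        unnormalized = M_i @ psi
        p_i = np.vdot(unnormalized, unnormalized).real   # <psi| M_i^dagger M_i |psi>
        probs.append(p_i)
        posts.append(unnormalized / np.sqrt(p_i) if p_i > 1e-12 else None)
    return np.array(probs), posts

completeness = sum(M_i.conj().T @ M_i for M_i in M)
psi = 0.6 * ket(1) + 0.8j * ket(2)                       # one photon, two positions
probs, posts = measure(psi, M)
print(np.round(probs, 3))                                # click probabilities per position
print(posts[1], posts[2])                                # both are the vacuum, up to a phase
```

Whatever position is registered, the post-measurement state is the vacuum rather than a position eigenstate, and the click operators are visibly not Hermitian.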

General measurements are equivalent to unitary interaction of a system with an ideal environment, followed by a projective measurement on the environment. Specifically, consider coupling the system to an environmental Hilbert space: \mathcal{H} = \mathcal{H}_s \otimes \mathcal{H}_e. The environment is initially in the state |0\rangle. Introduce the operator U such that

U(|\psi\rangle \otimes |0\rangle)=\displaystyle \sum_i M_i|\psi\rangle \otimes |i_E\rangle

where |i_E\rangle are orthonormal states of the environment corresponding to the M_i.

You can check that U preserves inner products of the system Hilbert space:

(\langle v|\otimes \langle 0|) U^\dagger U (| w\rangle \otimes |0\rangle) = \displaystyle \sum_i \langle v | M_i^\dagger M_i | w\rangle = \langle v | w\rangle

It can be shown that such a U can be extended to a unitary operator U' on the entire Hilbert space. Now if we measure an operator on the environment with eigenstates |i_E\rangle, we obtain one of the system states

\displaystyle \frac{M_i |\psi\rangle}{\sqrt{\langle \psi | M_i^\dagger M_i | \psi\rangle}}

with probability

p_i = \langle \psi | M_i^\dagger M_i | \psi\rangle

just as above.
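The construction can be verified on a small example. This is a sketch under stated assumptions: the two measurement operators model a detector that absorbs a photon with probability g (an amplitude-damping pair I picked for illustration), and the extension of U to a full unitary is done by filling the remaining columns with an orthonormal basis of the orthogonal complement obtained from an SVD, which is one choice among many.

```python
import numpy as np

g = 0.3                                                   # absorption probability (assumed)
M = [np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex),   # no absorption
     np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)]       # photon absorbed
d_s, d_e = 2, len(M)
env = np.eye(d_e)                                         # columns: environment states |i_E>

# Isometry V |psi> = sum_i M_i |psi> (x) |i_E>, system factor first.
V = sum(np.kron(M[i], env[:, [i]]) for i in range(d_e))   # shape (d_s * d_e, d_s)

# Extend to a unitary U' on the full space: columns belonging to |s> (x) |0_E>
# are the columns of V, the rest span the orthogonal complement of range(V).
Uf, _, _ = np.linalg.svd(V, full_matrices=True)
complement = Uf[:, d_s:]
U_full = np.zeros((d_s * d_e, d_s * d_e), dtype=complex)
U_full[:, 0::d_e] = V
other = [k for k in range(d_s * d_e) if k % d_e != 0]
U_full[:, other] = complement

psi = np.array([0.6, 0.8], dtype=complex)
out = U_full @ np.kron(psi, env[:, 0])                    # U'(|psi> (x) |0>)
branches = out.reshape(d_s, d_e)                          # column i equals M_i |psi>
p_env = np.array([np.linalg.norm(branches[:, i]) ** 2 for i in range(d_e)])
p_direct = np.array([np.vdot(M_i @ psi, M_i @ psi).real for M_i in M])
print(np.round(p_env, 4), np.round(p_direct, 4))
```

Projecting the environment onto |i_E\rangle leaves the system in M_i|\psi\rangle up to normalization, with the same probabilities as the general-measurement formula.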

Look familiar? This interaction U is a more general version of the many-worlds “decoherence equation” above. Thus, the condition for a quantum object to implement a general measurement is that its internal states must interact with the system in this way. Decoherence propagates to the next object and so on until it reaches the observer, who makes the measurement.


In a nutshell: quantum mechanics relaxes the assumption of an objective description of the universe, while still being a predictive physical theory.


Q: How is the system measurement basis determined (the preferred-basis problem)?

A: First, recall that we do not measure the system directly, only our brain/body after it has interacted with the system. As to which of our internal states correspond to which perceptions, note that the same question applies to classical physics. In both cases, we must determine this empirically.

Q: Isn’t the boundary between system and observer also arbitrary? How do we determine which degrees of freedom can be perceived?

A: Again, the same question applies to classical physics, and must be determined empirically.

Q: What objects have consciousness?

A: No physical objects have consciousness. From your perspective, all physical objects are part of the wavefunction, and nothing else has the power to collapse the wavefunction. (Yes, this unfortunately leads to a kind of solipsism. It’s a lonely world out there.)

[1] “Consciousness” is a dirty word among physicists, usually for good reason. Here, it simply means the ability to perceive things: cogito, ergo sum. In the formalism of quantum mechanics, this translates to the ability to collapse the wavefunction by inquiring about a measurement result. Much confusion results from trying to ascribe consciousness to physical objects or from giving the word additional meanings.

[2] Or its field theory generalizations.

[3] Semantic note: I sometimes use “observer” to refer to the subspace of the state space that is perceived, and sometimes to the conscious entity that does the measurement to collapse the wavefunction. Many-worlds says the latter does not exist. It should be clear from context which one is meant.

[4] Can Born’s rule be derived? No, since probability is nowhere to be found in unitary time evolution, so there must be some axiom introducing probability into the theory. Regardless, whether Born’s rule is fundamental or derived has no bearing on the next section.

[5] P_i = P_{Oi} \otimes 1, where P_{Oi} is a projector on the observer’s space and 1 is the identity on the rest of the space.

[6] This section mostly comes from Nielsen and Chuang, Quantum Computation and Quantum Information, Ch. 2.2.

