VOLUME 29. NUMBER 3. JULY. 1957
"Relative State" Formulation of Quantum Mechanics*
Palmer Physical Laboratory, Princeton University, Princeton, New Jersey
___________________________
* Thesis submitted to Princeton University March 1, 1957 in partial fulfillment of the requirements for the Ph.D. degree. An earlier draft dated January, 1956 was circulated to several physicists whose comments were helpful. Professor Niels Bohr, Dr. H. J. Groenewald, Dr. Aage Peterson, Dr. A. Stern, and Professor L. Rosenfeld are free of any responsibility, but they are warmly thanked for the useful objections that they raised. Most particular thanks are due to Professor John A. Wheeler for his continued guidance and encouragement. Appreciation is also expressed to the National Science Foundation for fellowship support.
† Present address: Weapons Systems Evaluation Group, The Pentagon, Washington, D.C.
1. INTRODUCTION
The task of quantizing general relativity raises serious questions about the meaning
of the present formulation and interpretation of
quantum mechanics when applied to so fundamental a structure as the space-time geometry itself. This paper seeks to clarify the foundations of quantum mechanics. It presents a reformulation of quantum theory
in a form believed suitable
for application to general relativity.
The aim is not to deny or contradict the conventional formulation of quantum theory,
which has demonstrated
its usefulness in an overwhelming variety of problems, but rather to supply
a new, more general and complete formulation, from
which the conventional
interpretation can be deduced.
The relationship of this new formulation to
the older formulation is therefore that of a metatheory to a theory, that is, it is an underlying theory in which the nature and consistency, as well
as the realm of applicability,
of the older theory can
be investigated and clarified.
The new theory
is not based on any radical departure from the conventional one. The special postulates in the old theory which deal with observation are omitted in the
new theory. The altered
theory thereby acquires a new character.
It has to be analyzed in and for itself before any
identification becomes
possible between the quantities of the theory and the properties of the world of experience. The identification, when made, leads back to the omitted postulates of the conventional theory that deal with observation, but in a manner which clarifies their role and logical position.
We begin with a brief discussion of the conventional
formulation, and some of the reasons which motivate one to seek a modification.
2. REALM OF APPLICABILITY OF THE CONVENTIONAL
OR "EXTERNAL OBSERVATION" FORMULATION OF QUANTUM MECHANICS
We take the conventional or "external observation" formulation of quantum mechanics to be essentially the following¹: A physical system is completely described by a state function $\psi$, which is an element of a Hilbert space, and which furthermore gives information only to the extent of specifying the probabilities of the results of various observations which can be made on the system by external observers. There are two fundamentally different ways in which the state function can change:
___________________________
1 We use the terminology and notation of J. von Neumann, Mathematical Foundations of Quantum Mechanics, translated by R. T. Beyer (Princeton University Press, Princeton, 1955).
Process 1: The discontinuous change brought about by the observation of a quantity with eigenstates $\phi_1, \phi_2, \ldots$, in which the state $\psi$ will be changed to the state $\phi_j$ with probability $|(\psi, \phi_j)|^2$.
Process 2: The continuous, deterministic change of state of an isolated system with time according to a wave equation $\partial\psi/\partial t = A\psi$, where $A$ is a linear operator.
This formulation describes a wealth of experience. No experimental evidence is known which contradicts it.
Not all conceivable situations fit the framework of this mathematical formulation. Consider
for example an isolated system consisting of an observer or measuring apparatus, plus an object system. Can the change with time of the state of the total system be described by Process 2? If so, then it would appear that no discontinuous
probabilistic process like Process 1 can take place. If not, we are forced
to admit that systems which contain observers are
not subject to the same
kind of quantum-mechanical description as we admit
for all other physical
systems. The question cannot be ruled out as lying in
the domain of psychology. Much of the discussion of "observers" in quantum mechanics has to do with photoelectric cells, photographic plates, and similar
devices where a mechanistic attitude can
hardly be contested. For the following one can limit himself to this class of problems, if he
is unwilling to consider observers in the more familiar sense on
the same mechanistic level of analysis.
What mixture of Processes
1 and 2 of the conventional formulation is to be
applied to the case where only an approximate measurement is effected; that is, where an apparatus or
observer interacts only weakly
and for a limited time with an object system? In this case
of an approximate measurement a proper theory must specify (1) the new state
of the object system that corresponds to any particular reading of the
apparatus and (2) the probability with which this reading will occur. Von Neumann
showed how to treat a special class of
approximate measurements by the method of projection operators.² However, a general
treatment of all approximate measurements by the method of projection operators
can be shown (Sec. 4) to be impossible.
_______________________________
2 Reference 1, Chap. 4, Sec. 4.
How is one to apply the conventional formulation of quantum mechanics to
the space-time geometry itself? The issue becomes especially acute in the case
of a closed universe.³
There is no place to stand outside the system to observe it. There is nothing
outside it to produce transitions from one state to another. Even the familiar
concept of a proper state of the energy is completely inapplicable. In the
derivation of the law of conservation of energy, one defines the total energy
by way of an integral extended over a surface large enough to include all parts
of the system and their interactions.⁴
But in a closed space, when a surface is made to include more and more of the
volume, it ultimately disappears into nothingness. Attempts to define a total
energy for a closed space collapse to the vacuous statement, zero equals zero.
_______________________________
3 See A. Einstein, The Meaning of Relativity (Princeton University Press, Princeton, 1950), third edition, p. 107.
4 L. Landau and E. Lifshitz, The Classical Theory of Fields, translated by M. Hamermesh (Addison-Wesley Press, Cambridge, 1951), p. 343.
How is a quantum description of a closed universe, of approximate
measurements, and of a system that contains an observer to be made? These three
questions have one feature in common, that they all inquire about the quantum mechanics that is internal to an isolated system.
No way is evident to apply the conventional formulation of quantum
mechanics to a system that is not subject to external observation. The whole interpretive scheme of that
formalism rests upon the notion of external observation. The probabilities of
the various possible outcomes of the observation are prescribed exclusively by
Process 1. Without that part of the formalism there is no means whatever to
ascribe a physical interpretation to the conventional machinery. But Process 1
is out of the question for systems not subject to external observation.⁵
_______________________________
5 See in particular the discussion of this point by N. Bohr and L. Rosenfeld, Kgl. Danske Videnskab. Selskab, Mat.-fys. Medd. 12, No. 8 (1933).
3. QUANTUM MECHANICS INTERNAL TO AN ISOLATED SYSTEM
This paper proposes to regard pure wave mechanics (Process
2 only) as a complete theory. It postulates that a wave function that obeys a
linear wave equation everywhere and at all times supplies a complete mathematical
model for every isolated physical system without exception. It further
postulates that every system that is subject to external observation can be
regarded as part of a larger isolated system.
The wave function is taken as the basic physical entity with no a priori interpretation.
Interpretation only comes after an
investigation of the logical structure of the theory. Here as always the theory
itself sets the framework for its interpretation.⁵
For any interpretation it is necessary to put the mathematical model of
the theory into correspondence with experience. For this purpose it is
necessary to formulate abstract models for observers that can be treated within
the theory itself as physical systems, to consider isolated systems containing
such model observers in interaction with other subsystems, to deduce the
changes that occur in an observer as a consequence of interaction with the
surrounding subsystems, and to interpret the changes in the familiar language
of experience.
Section 4 investigates representations of the state of a composite
system in terms of states of constituent subsystems. The mathematics leads one
to recognize the concept of the relativity
of states, in the following sense: a constituent subsystem cannot be said
to be in any single well-defined state, independently of the remainder of the
composite system. To any arbitrarily chosen state for one subsystem there will
correspond a unique relative state
for the remainder of the composite system. This relative state will usually
depend upon the choice of state for the first subsystem. Thus the state of one
subsystem does not have an independent existence, but is fixed only by the
state of the remaining subsystem. In other words, the states occupied by the
subsystems are not independent, but correlated.
Such correlations between systems arise whenever systems interact. In the
present formulation all measurements and observation processes are to be
regarded simply as interactions between the physical systems involved — interactions
which produce strong correlations. A simple model for a measurement, due to von
Neumann, is analyzed from this viewpoint.
Section 5 gives an abstract treatment of the problem of observation.
This uses only the superposition principle, and general rules by which
composite system states are formed of subsystem states, in order that the
results shall have the greatest generality and be applicable to any form of
quantum theory for which these principles hold. Deductions are drawn about the
state of the observer relative to the state of the object system. It is found
that experiences of the observer (magnetic tape memory, counter system, etc.) are in
full accord with predictions of the conventional "external observer"
formulation of quantum mechanics, based on Process 1.
Section 6 recapitulates the "relative state" formulation of
quantum mechanics.
4. CONCEPT OF RELATIVE STATE
We now investigate some consequences of the wave mechanical formalism of
composite systems. If a composite system $S$ is composed of two subsystems $S_1$ and $S_2$, with associated Hilbert spaces $H_1$ and $H_2$, then, according to the usual formalism of composite systems, the Hilbert space for $S$ is taken to be the tensor product of $H_1$ and $H_2$ (written $H = H_1 \otimes H_2$). This has the consequence that if the sets $\{\xi_i^{S_1}\}$ and $\{\eta_j^{S_2}\}$ are complete orthonormal sets of states for $S_1$ and $S_2$, respectively, then the general state of $S$ can be written as a superposition:
$$\psi^S = \sum_{i,j} a_{ij}\, \xi_i^{S_1} \eta_j^{S_2}. \qquad (1)$$
From (1), although $S$ is in a definite state $\psi^S$, the subsystems $S_1$ and $S_2$ do not possess anything like definite states independently of one another (except in the special case where all but one of the $a_{ij}$ are zero).
We can, however, for any choice of a state in one subsystem, uniquely assign a corresponding relative state in the other subsystem.
For example, if we choose $\xi_k$ as the state for $S_1$, while the composite system $S$ is in the state $\psi^S$ given by (1), then the corresponding relative state in $S_2$, $\psi(S_2; \mathrm{rel}\,\xi_k, S_1)$, will be:
$$\psi(S_2; \mathrm{rel}\,\xi_k, S_1) = N_k \sum_j a_{kj}\, \eta_j^{S_2} \qquad (2)$$
where $N_k$ is a normalization constant. This relative state for $\xi_k$ is independent of the choice of basis $\{\xi_i\}$ $(i \neq k)$ for the orthogonal complement of $\xi_k$, and is hence determined uniquely by $\xi_k$ alone. To find the relative state in $S_2$ for an arbitrary state of $S_1$, therefore, one simply carries out the above procedure using any pair of bases for $S_1$ and $S_2$ which contains the desired state as one element of the basis for $S_1$. To find states in $S_1$ relative to states in $S_2$, interchange $S_1$ and $S_2$ in the procedure.
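The relative-state construction can be illustrated numerically for finite-dimensional subsystems. The following is a minimal sketch, not part of the original paper: it assumes the composite state is given by a coefficient matrix `a[i][j]` in an orthonormal product basis, and the function names `relative_state` and `reconstruct` are hypothetical. It computes the normalized relative state of the second subsystem for a chosen basis state of the first, and then checks that summing the pairs with weights 1/N_k recovers the original coefficients.

```python
import math

def relative_state(a, k):
    """Relative state of S2 for basis state number k of S1.

    `a[i][j]` are the expansion coefficients of the composite state in an
    orthonormal product basis (illustrative representation, assumed here).
    Returns (N_k, list of normalized coefficients over the S2 basis).
    """
    row = a[k]
    norm_sq = sum(abs(c) ** 2 for c in row)
    if norm_sq == 0:
        raise ValueError("basis state k does not occur in the composite state")
    n_k = 1.0 / math.sqrt(norm_sq)
    return n_k, [n_k * c for c in row]

def reconstruct(a):
    """Rebuild the coefficient matrix as a sum of (1/N_k) * (basis state, relative state)."""
    rebuilt = []
    for k in range(len(a)):
        if all(c == 0 for c in a[k]):
            rebuilt.append([0.0] * len(a[k]))  # this basis state carries no weight
            continue
        n_k, rel = relative_state(a, k)
        rebuilt.append([c / n_k for c in rel])
    return rebuilt

# Example: a correlated state of two two-level subsystems.
a = [[0.5, 0.5],
     [0.5, -0.5]]
n0, rel0 = relative_state(a, 0)
n1, rel1 = relative_state(a, 1)
print(rel0, rel1)
```

In this example the two basis states of the first subsystem lead to two different (here orthogonal) relative states of the second, which is the correlation the text describes.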
In the conventional or "external observation" formulation, the relative state in $S_2$, $\psi(S_2; \mathrm{rel}\,\phi, S_1)$, for a state $\phi^{S_1}$ in $S_1$, gives the conditional probability distributions for the results of all measurements in $S_2$, given that $S_1$ has been measured and found to be in state $\phi^{S_1}$; i.e., that $\phi^{S_1}$ is the eigenfunction of the measurement in $S_1$ corresponding to the observed eigenvalue.
For any choice of basis in $S_1$, $\{\xi_i\}$, it is always possible to represent the state of $S$, (1), as a single superposition of pairs of states, each consisting of a state from the basis $\{\xi_i\}$ in $S_1$ and its relative state in $S_2$. Thus, from (2), (1) can be written in the form:
$$\psi^S = \sum_i \frac{1}{N_i}\, \xi_i^{S_1}\, \psi(S_2; \mathrm{rel}\,\xi_i, S_1). \qquad (3)$$
This is an important representation used frequently.
Summarizing: There does not,
in general, exist anything like a single state for one subsystem of a composite
system. Subsystems do not possess states that are independent of the states of
the remainder of the system, so that the subsystem states are generally correlated with one another. One can
arbitrarily choose a state for one subsystem, and be led to the relative state
for the remainder. Thus we are faced with a fundamental relativity of states, which is implied by the formalism of composite systems. It is
meaningless to ask the absolute state of a subsystem; one can only ask the
state relative to a given state of the remainder of the system.
At this point we consider a simple example, due to von Neumann, which
serves as a model of a measurement process. Discussion of this example prepares
the ground for the analysis of "observation." We start with a system
of only one coordinate, $q$ (such as position of a particle), and an apparatus of one coordinate $r$ (for example the position of a meter needle). Further suppose that they are initially independent, so that the combined wave function is $\psi_0^{S+A} = \phi(q)\eta(r)$, where $\phi(q)$ is the initial system wave function and $\eta(r)$ is the initial apparatus function. The Hamiltonian is such that the two systems do not interact except during the interval $t = 0$ to $t = T$, during which time the total Hamiltonian consists only of a simple interaction,
$$H_I = -i\hbar\, q\, (\partial/\partial r). \qquad (4)$$
Then the state
$$\psi_t^{S+A}(q, r) = \phi(q)\,\eta(r - qt) \qquad (5)$$
is a solution of the Schrödinger equation,
$$i\hbar\, (\partial \psi_t^{S+A}/\partial t) = H_I\, \psi_t^{S+A}, \qquad (6)$$
for the specified initial conditions at time $t = 0$.
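That the state (5) satisfies (6) can be checked by direct differentiation. The short verification below is added for convenience and is not part of the original text; it assumes the symbols read as $\phi$ for the system function, $\eta$ for the apparatus function, and $\hbar$ for the reduced Planck constant, and it writes $\eta'$ for the derivative of $\eta$ with respect to its argument.

```latex
\begin{align*}
i\hbar\,\frac{\partial}{\partial t}\bigl[\phi(q)\,\eta(r - qt)\bigr]
  &= i\hbar\,\phi(q)\,(-q)\,\eta'(r - qt)
   = -i\hbar\,q\,\phi(q)\,\eta'(r - qt), \\
H_I\,\bigl[\phi(q)\,\eta(r - qt)\bigr]
  &= -i\hbar\,q\,\frac{\partial}{\partial r}\bigl[\phi(q)\,\eta(r - qt)\bigr]
   = -i\hbar\,q\,\phi(q)\,\eta'(r - qt).
\end{align*}
```

The two right-hand sides agree, and setting $t = 0$ gives $\phi(q)\eta(r)$, the stated initial condition.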
From (5) at time t = T (at
which time interaction stops) there is no longer any definite independent
apparatus state, nor any independent system state. The apparatus therefore does
not indicate any definite object-system value, and nothing like process 1 has
occurred.
Nevertheless, we can look upon the total wave function (5) as a superposition of pairs of subsystem states, each element of which has a definite $q$ value and a correspondingly displaced apparatus state. Thus after the interaction the state (5) has the form:
$$\psi_T^{S+A} = \int \phi(q')\,\delta(q - q')\,\eta(r - q'T)\,dq', \qquad (7)$$
which is a superposition of states $\psi_{q'} = \delta(q - q')\,\eta(r - q'T)$. Each of these elements, $\psi_{q'}$, of the superposition describes a state in which the system has the definite value $q = q'$, and in which the apparatus has a state that is displaced from its original state by the amount $q'T$. These elements $\psi_{q'}$ are then superposed with coefficients $\phi(q')$ to form the total state (7).
Conversely, if we transform to the representation where the apparatus coordinate is definite, we write (5) as
$$\psi_T^{S+A} = \int (1/N_{r'})\, \xi^{r'}(q)\, \delta(r - r')\, dr',$$
where
$$\xi^{r'}(q) = N_{r'}\, \phi(q)\, \eta(r' - qT) \qquad (8)$$
and
$$(1/N_{r'})^2 = \int \phi^*(q)\, \phi(q)\, \eta^*(r' - qT)\, \eta(r' - qT)\, dq.$$
Then the $\xi^{r'}(q)$ are the relative system state functions⁶ for the apparatus states $\delta(r - r')$ of definite value $r = r'$.
_______________________________
6 This example provides a model of an approximate measurement. However, the relative system state after the interaction, $\xi^{r'}(q)$, cannot ordinarily be generated from the original system state $\phi$ by the application of any projection operator, $E$. Proof: Suppose on the contrary that $\xi^{r'}(q) = N E\phi(q) = N'\phi(q)\eta(r' - qt)$, where $N$, $N'$ are normalization constants. Then
$$E(N E\phi(q)) = N E^2\phi(q) = N''\phi(q)\eta^2(r' - qt)$$
and $E^2\phi(q) = (N''/N)\phi(q)\eta^2(r' - qt)$. But the condition $E^2 = E$, which is necessary for $E$ to be a projection, implies that $(N'/N'')\eta(q) = \eta^2(q)$, which is generally false.
If $T$ is sufficiently large, or $\eta(r)$ sufficiently sharp (near $\delta(r)$), then $\xi^{r'}(q)$ is nearly $\delta(q - r'/T)$ and the relative system states $\xi^{r'}(q)$ are nearly eigenstates for the values $q = r'/T$.
We have seen that (8) is a superposition of states $\psi_{r'}$, for each of which the apparatus has recorded a definite value $r'$, and the system is left in approximately the eigenstate of the measurement corresponding to $q = r'/T$. The discontinuous "jump" into an eigenstate is thus only a
relative proposition, dependent upon the mode of decomposition of the total
wave function into the superposition, and relative to a particularly chosen
apparatus-coordinate value. So far as the complete theory is concerned all
elements of the superposition exist simultaneously, and the entire process is
quite continuous.
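The sharpening of the relative system state with increasing interaction time can be seen in a small numerical experiment. The sketch below is an illustration only and rests on assumptions not made in the paper: Gaussian forms for the initial system and apparatus functions, a finite grid in q, and hypothetical helper names. It evaluates the probability density of the relative system state, proportional to the squared modulus of phi(q) eta(r' - qT), and reports its mean and spread for two values of T, with r' chosen so that r'/T is the same in both cases.

```python
import math

def gaussian(width):
    """Amplitude whose squared modulus is a Gaussian of standard deviation `width`."""
    return lambda x: math.exp(-x * x / (4.0 * width * width))

def relative_system_density(phi, eta, r_prime, T, q_grid):
    """Normalized |phi(q) * eta(r' - q T)|^2 on the grid (relative system state for reading r')."""
    weights = [abs(phi(q) * eta(r_prime - q * T)) ** 2 for q in q_grid]
    total = sum(weights)
    return [w / total for w in weights]

def mean_and_spread(density, q_grid):
    """Mean and standard deviation of q under a discrete probability density."""
    mean = sum(p * q for p, q in zip(density, q_grid))
    var = sum(p * (q - mean) ** 2 for p, q in zip(density, q_grid))
    return mean, math.sqrt(var)

q_grid = [-5.0 + 0.001 * n for n in range(10001)]
phi = gaussian(1.0)   # broad initial system function (assumed form)
eta = gaussian(0.1)   # fairly sharp initial apparatus function (assumed form)
target_q = 0.5        # apparatus reading chosen as r' = target_q * T

results = {}
for T in (1.0, 10.0):
    density = relative_system_density(phi, eta, target_q * T, T, q_grid)
    results[T] = mean_and_spread(density, q_grid)
    print(T, results[T])
```

For the longer interaction the density is much narrower and centered closer to q = r'/T, in line with the statement that the relative system states approach eigenstates when T is large or the apparatus function is sharp.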
von Neumann's example is only a special case of a more general
situation. Consider any measuring apparatus interacting with any object
system. As a result of the interaction the state of the measuring apparatus is
no longer capable of independent definition. It can be defined only relative to the state of the object
system. In other words, there exists only a correlation between the states of
the two systems. It seems as if nothing can ever be settled by such a
measurement.
This indefinite behavior seems to be quite at variance with our
observations, since physical objects always appear to us to have definite
positions. Can we reconcile this feature of wave mechanical theory built purely on
Process 2 with experience, or must the theory be abandoned as untenable? In
order to answer this question we consider the problem of observation itself
within the framework of the theory.
5. OBSERVATION
We have the task of making deductions about the appearance of phenomena
to observers which are considered as purely physical systems and are treated
within the theory. To accomplish this it is necessary to identify some present
properties of such an observer with features of the past experience of the
observer.
Thus, in order to say that an observer $O$ has observed the event $\alpha$, it is necessary that the state of $O$ has become changed from its former state to a new state which is dependent upon $\alpha$.
It will suffice for our purposes to consider the observers to possess
memories (i.e., parts of a relatively permanent nature whose states are in
correspondence with past experience of the observers). In order to make
deductions about the past experience of an observer it is sufficient to deduce
the present contents of the memory as it appears within the mathematical model.
As models for observers we can, if we wish, consider automatically
functioning machines, possessing sensory apparatus and coupled to recording devices
capable of registering past sensory data and machine configurations. We can
further suppose that the machine is so constructed that its present actions
shall be determined not only by its present sensory data, but by the contents
of its memory as well. Such a machine will then be capable of performing a
sequence of observations (measurements), and furthermore of deciding upon its
future experiments on the basis of past results. If we consider that current
sensory data, as well as machine configuration, is immediately recorded in the
memory, then the actions of the machine at a given instant can be regarded as a
function of the memory contents only, and all relevant experience of the
machine is contained in the memory.
For such machines we are justified in using such phrases as "the
machine has perceived A" or
"the machine is aware of A"
if the occurrence of A is represented
in the memory, since the future behavior of the machine will be based upon the
occurrence of A. In fact, all of the
customary language of subjective experience is quite applicable to such
machines, and forms the most natural and useful mode of expression when dealing
with their behavior, as is well known to individuals who work with complex
automata.
When dealing with a system representing an observer quantum mechanically we ascribe a state function, $\psi^O$, to it. When the state $\psi^O$ describes an observer whose memory contains representations of the events $A, B, \ldots, C$ we denote this fact by appending the memory sequence in brackets as a subscript, writing:
$$\psi^O_{[A, B, \ldots, C]} \qquad (9)$$
The symbols $A, B, \ldots, C$, which we assume to be ordered time-wise, therefore stand for memory configurations which are in correspondence with the past experience of the observer. These configurations can be regarded as punches in a paper tape, impressions on a magnetic reel, configurations of a relay switching circuit, or even configurations of brain cells. We require only that they be capable of the interpretation: "The observer has experienced the succession of events $A, B, \ldots, C$." (We sometimes write dots in a memory sequence, $\ldots A, B, \ldots, C$, to indicate the possible presence of previous memories which are irrelevant to the case being considered.)
The mathematical model seeks to treat the interaction of such observer
systems with other physical systems (observations), within the framework of
Process 2 wave mechanics, and to deduce the resulting memory configurations,
which are then to be interpreted as records of the past experiences of the
observers.
We begin by defining what constitutes a "good" observation. A
good observation of a quantity $A$, with eigenfunctions $\phi_i$, for a system $S$, by an observer whose initial state is $\psi^O$, consists of an interaction which, in a specified period of time, transforms each (total) state
$$\psi^{S+O} = \phi_i\, \psi^O_{[\ldots]} \qquad (10)$$
into a new state
$$\psi'^{S+O} = \phi_i\, \psi^O_{[\ldots \alpha_i]} \qquad (11)$$
where $\alpha_i$ characterizes⁷ the state $\phi_i$. (The symbol $\alpha_i$ might stand for a recording of the eigenvalue, for example.) That is, we require (1) that the system state, if it is an eigenstate, shall be unchanged, and (2) that the observer state shall change so as to describe an observer that is "aware" of which eigenfunction it is; that is, some property is recorded in the memory of the observer which characterizes $\phi_i$, such as the eigenvalue.
The requirement that the eigenstates for the system be unchanged is necessary
if the observation is to be significant (repeatable), and the requirement that
the observer state change in a manner which is different for each eigenfunction
is necessary if we are to be able to call the interaction an observation at
all. How closely a general interaction satisfies the definition of a good
observation depends upon (1) the way in which the interaction depends upon the
dynamical variables of the observer system —including memory variables — and
upon the dynamical variables of the object system and (2) the initial state of
the observer system. Given (1) and (2), one can for example solve the wave
equation, deduce the state of the composite system after the end of the
interaction, and check whether an object system that was originally in an
eigenstate is left in an eigenstate, as demanded by the repeatability
postulate. This postulate is satisfied, for example, by the model of von
Neumann that has already been discussed.
_______________________________
7 It should be understood that $\psi^O_{[\ldots \alpha_i]}$ is a different state for each $i$. A more precise notation would write $\psi^O_{i[\ldots \alpha_i]}$, but no confusion can arise if we simply let the $\psi^O_i$ be indexed only by the index of the memory configuration symbol.
From the definition of a good observation we first deduce the result of
an observation upon a system which is not
in an eigenstate of the observation. We know from our definition that the
interaction transforms states $\phi_i\, \psi^O_{[\ldots]}$ into states $\phi_i\, \psi^O_{[\ldots \alpha_i]}$. Consequently these solutions of the wave equation can be superposed to give the final state for the case of an arbitrary initial system state. Thus if the initial system state is not an eigenstate, but a general state $\sum_i a_i \phi_i$, the final total state will have the form:
$$\psi'^{S+O} = \sum_i a_i\, \phi_i\, \psi^O_{[\ldots \alpha_i]}. \qquad (12)$$
This superposition principle continues to apply in the presence of
further systems which do not interact during the measurement. Thus, if systems $S_1, S_2, \ldots, S_n$ are present as well as $O$, with original states $\psi^{S_1}, \psi^{S_2}, \ldots, \psi^{S_n}$, and the only interaction during the time of measurement takes place between $S_1$ and $O$, the measurement will transform the initial total state:
$$\psi^{S_1 + S_2 + \cdots + S_n + O} = \psi^{S_1} \psi^{S_2} \cdots \psi^{S_n}\, \psi^O_{[\ldots]} \qquad (13)$$
into the final state:
$$\psi'^{S_1 + S_2 + \cdots + S_n + O} = \sum_i a_i\, \phi_i^{S_1} \psi^{S_2} \cdots \psi^{S_n}\, \psi^O_{[\ldots \alpha_i]} \qquad (14)$$
where $a_i = (\phi_i^{S_1}, \psi^{S_1})$ and $\phi_i^{S_1}$ are eigenfunctions of the observation.
Thus we arrive at the general rule for the transformation of total
state functions which describe systems within which observation processes
occur:
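Rule 1: The observation of a quantity $A$, with eigenfunctions $\phi_i^{S_1}$, in a system $S_1$ by the observer $O$, transforms the total state according to:
$$\psi^{S_1} \psi^{S_2} \cdots \psi^{S_n}\, \psi^O_{[\ldots]} \rightarrow \sum_i a_i\, \phi_i^{S_1} \psi^{S_2} \cdots \psi^{S_n}\, \psi^O_{[\ldots \alpha_i]} \qquad (15)$$
where $a_i = (\phi_i^{S_1}, \psi^{S_1})$.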
If we next consider a second observation
to be made, where our total state is now a superposition, we can apply Rule 1
separately to each element of the superposition, since each element separately
obeys the wave equation and behaves independently of the remaining elements,
and then superpose the results to obtain the final solution. We formulate this
as:
Rule 2: Rule 1 may be applied separately to each element of a superposition of total system states, the results being superposed to obtain the final total state. Thus, a determination of $B$, with eigenfunctions $\eta_j^{S_2}$, on $S_2$ by the observer $O$ transforms the total state
$$\sum_i a_i\, \phi_i^{S_1} \psi^{S_2} \cdots \psi^{S_n}\, \psi^O_{[\ldots \alpha_i]} \qquad (16)$$
into the state
$$\sum_{i,j} a_i b_j\, \phi_i^{S_1} \eta_j^{S_2} \psi^{S_3} \cdots \psi^{S_n}\, \psi^O_{[\ldots \alpha_i, \beta_j]} \qquad (17)$$
where $b_j = (\eta_j^{S_2}, \psi^{S_2})$, which follows from the application of Rule 1 to each element $\phi_i^{S_1} \psi^{S_2} \cdots \psi^{S_n}\, \psi^O_{[\ldots \alpha_i]}$, and then superposing the results with the coefficients $a_i$.
These two rules, which follow directly from the superposition principle,
give a convenient method for determining final total states for any number of
observation processes in any combinations. We now seek the interpretation of such final total states.
Let us consider the simple case of a single observation of a quantity $A$, with eigenfunctions $\phi_i$, in the system $S$ with initial state $\psi^S$, by an observer $O$ whose initial state is $\psi^O_{[\ldots]}$. The final result is, as we have seen, the superposition
$$\psi'^{S+O} = \sum_i a_i\, \phi_i\, \psi^O_{[\ldots \alpha_i]}. \qquad (18)$$
There is no longer any independent system state or observer state,
although the two have become correlated in a one-one manner. However, in each element of the superposition, $\phi_i\, \psi^O_{[\ldots \alpha_i]}$, the object-system state is a particular eigenstate of the observation, and furthermore the observer-system state describes the observer as definitely perceiving that particular
system state. This correlation is what allows one to
maintain the interpretation that a measurement has been performed.
We now consider a situation where the observer system comes into
interaction with the object system for a second time. According to Rule 2 we arrive at the total state after the second observation:
$$\psi''^{S+O} = \sum_i a_i\, \phi_i\, \psi^O_{[\ldots \alpha_i, \alpha_i]}. \qquad (19)$$
Again, each element $\phi_i\, \psi^O_{[\ldots \alpha_i, \alpha_i]}$ describes a system eigenstate, but this time also describes the
observer as having obtained the same
result for each of the two observations. Thus for every separate state of
the observer in the final superposition the result of the observation was
repeatable, even though different for different states. This repeatability is a
consequence of the fact that after an observation the relative system state for a particular observer state is the
corresponding eigenstate.
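The two rules and the repeatability argument can be traced with a small bookkeeping sketch. It is an illustration under stated assumptions, not the paper's own construction: every system is a finite-dimensional vector of amplitudes in the eigenbasis of the single observed quantity, a total state is a dictionary from (system states, memory sequence) to a coefficient, and the names `unit` and `observe` are hypothetical. A memory entry `(s, i)` stands for "result i was recorded for system s".

```python
def unit(i, dim):
    """Eigenstate number i written as a tuple of amplitudes in the eigenbasis."""
    return tuple(1.0 if n == i else 0.0 for n in range(dim))

def observe(state, s):
    """Observation of system s: Rule 1 applied to each element of the superposition (Rule 2).

    `state` maps (systems, memory) -> coefficient, where `systems` is a tuple of
    amplitude tuples and `memory` is a tuple of recorded (system, result) pairs.
    """
    new_state = {}
    for (systems, memory), coeff in state.items():
        vec = systems[s]
        for i, a_i in enumerate(vec):
            if a_i == 0:
                continue  # this eigenfunction does not occur in the element
            # The observed system is left in eigenstate i and the result is recorded.
            branch_systems = systems[:s] + (unit(i, len(vec)),) + systems[s + 1:]
            key = (branch_systems, memory + ((s, i),))
            new_state[key] = new_state.get(key, 0.0) + coeff * a_i
    return new_state

# One system in the state 0.6*phi_0 + 0.8*phi_1, observer memory initially empty.
initial = {(((0.6, 0.8),), ()): 1.0}
once = observe(initial, 0)    # corresponds to the form of Eq. (18)
twice = observe(once, 0)      # corresponds to the form of Eq. (19)
for (systems, memory), coeff in twice.items():
    print(memory, coeff)
```

After the second observation there are still only two elements, each with the same result recorded twice, because within each element the system was already in the corresponding eigenstate.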
Consider now a different situation. An observer-system $O$, with initial state $\psi^O_{[\ldots]}$, measures the same quantity $A$ in a number of separate, identical systems which are initially in the same state, $\psi^{S_1} = \psi^{S_2} = \cdots = \psi^{S_n} = \sum_i a_i \phi_i$ (where the $\phi_i$ are, as usual, eigenfunctions of $A$). The initial total state function is then
$$\psi_0^{S_1 + S_2 + \cdots + S_n + O} = \psi^{S_1} \psi^{S_2} \cdots \psi^{S_n}\, \psi^O_{[\ldots]}. \qquad (20)$$
We assume that the measurements are performed on the systems in the
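order $S_1, S_2, \ldots, S_n$. Then the total state after the first measurement is, by Rule 1,
$$\psi_1^{S_1 + S_2 + \cdots + S_n + O} = \sum_i a_i\, \phi_i^{S_1} \psi^{S_2} \cdots \psi^{S_n}\, \psi^O_{[\ldots \alpha_i^1]} \qquad (21)$$
(where $\alpha_i^1$ refers to the first system, $S_1$).
After the second measurement it is, by Rule 2,
$$\psi_2^{S_1 + S_2 + \cdots + S_n + O} = \sum_{i,j} a_i a_j\, \phi_i^{S_1} \phi_j^{S_2} \psi^{S_3} \cdots \psi^{S_n}\, \psi^O_{[\ldots \alpha_i^1, \alpha_j^2]} \qquad (22)$$
and in general, after $r$ measurements have taken place ($r \leq n$), Rule 2 gives the result:
$$\psi_r = \sum_{i,j,\ldots,k} a_i a_j \cdots a_k\, \phi_i^{S_1} \phi_j^{S_2} \cdots \phi_k^{S_r} \psi^{S_{r+1}} \cdots \psi^{S_n}\, \psi^O_{[\ldots \alpha_i^1, \alpha_j^2, \ldots, \alpha_k^r]}. \qquad (23)$$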
We can give this state, $\psi_r$, the following interpretation. It consists of a superposition of states:
$$\psi'_{ij \ldots k} = \phi_i^{S_1} \phi_j^{S_2} \cdots \phi_k^{S_r} \psi^{S_{r+1}} \cdots \psi^{S_n}\, \psi^O_{[\alpha_i^1, \alpha_j^2, \ldots, \alpha_k^r]} \qquad (24)$$
each of which describes the observer with a definite memory sequence $[\alpha_i^1, \alpha_j^2, \ldots, \alpha_k^r]$. Relative to him the (observed) system states are the corresponding eigenfunctions $\phi_i^{S_1}, \phi_j^{S_2}, \ldots, \phi_k^{S_r}$, the remaining systems, $S_{r+1}, \ldots, S_n$, being unaltered.
A typical element $\psi'_{ij \ldots k}$ of the final superposition describes a state of affairs wherein the observer has perceived an apparently random sequence of definite results for the observations. Furthermore the object systems have been left in the corresponding eigenstates of the observation. At this stage suppose that a redetermination of an earlier system observation ($S_l$) takes place. Then it follows that every element of the resulting final superposition will describe the observer with a memory configuration of the form $[\alpha_i^1, \ldots, \alpha_j^l, \ldots, \alpha_k^r, \alpha_j^l]$ in which the earlier memory coincides with the later; i.e., the memory states are correlated. It will thus appear to the observer, as described by
a typical element of the superposition, that each initial observation on a
system caused the system to "jump" into an eigenstate in a random
fashion and thereafter remain there for subsequent measurements on the same
system. Therefore — disregarding for the moment quantitative questions of
relative frequencies — the probabilistic assertions of Process 1 appear to be valid to the observer
described by a typical element of the final superposition.
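The structure of the superposition after r measurements on identical systems can be enumerated directly. The sketch below is a minimal illustration with hypothetical names (`branches_after`, `redetermine`); it assumes a finite list of expansion coefficients shared by all systems and labels each element of the superposition by its memory sequence of result indices, with the coefficient given by the product of the corresponding amplitudes. A redetermination of an earlier system simply appends a copy of the earlier record, since within each element that system is already in the recorded eigenstate.

```python
from itertools import product

def branches_after(amplitudes, r):
    """Map each memory sequence (i, j, ..., k) of length r to its coefficient a_i a_j ... a_k."""
    out = {}
    for seq in product(range(len(amplitudes)), repeat=r):
        coeff = 1.0
        for i in seq:
            coeff *= amplitudes[i]
        out[seq] = coeff
    return out

def redetermine(branches, l):
    """Re-observe the system with 0-based index l: the earlier result is recorded again."""
    return {seq + (seq[l],): coeff for seq, coeff in branches.items()}

amplitudes = (0.6, 0.8)              # illustrative coefficients a_0, a_1
branches = branches_after(amplitudes, 3)
repeated = redetermine(branches, 1)  # redetermination on the second system
for seq, coeff in sorted(repeated.items()):
    print(seq, coeff)
```

With two possible results and three measurements there are eight memory sequences, and in every one the appended record agrees with the earlier record for the same system.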
We thus arrive at the following picture: Throughout all of a sequence of
observation processes there is only one physical system representing the
observer, yet there is no single unique state
of the observer (which follows from the representations of interacting
systems). Nevertheless, there is a representation in terms of a superposition, each element of which
contains a definite observer state and a corresponding system state. Thus with
each succeeding observation (or interaction), the observer state
"branches" into a number of different states. Each branch represents
a different outcome of the measurement and the corresponding eigenstate for the object-system state. All branches
exist simultaneously in the superposition after any given sequence of
observations.‡ The "trajectory" of the memory configuration of an observer
performing a sequence of measurements is thus not a linear sequence of memory
configurations, but a branching tree, with all possible outcomes existing
simultaneously in a final superposition with various coefficients in the
mathematical model. In any familiar memory device the branching does not
continue indefinitely, but must stop at a point limited by the capacity of the
memory.
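The branching picture described above can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: the function name `branch`, the outcome labels, and the coefficients 0.6 and 0.8i are hypothetical choices, and each observation is taken to be made on a fresh, identically prepared system.

```python
def branch(amplitudes, n_observations):
    """Enumerate the elements of the final superposition.

    amplitudes: dict mapping an outcome label to its complex coefficient
    (hypothetical values, assumed normalized).
    Returns a dict mapping each memory sequence (a tuple of labels) to the
    coefficient of the corresponding element of the superposition.
    """
    branches = {(): 1 + 0j}  # before any observation: one element, empty memory
    for _ in range(n_observations):
        new_branches = {}
        for memory, coeff in branches.items():
            for outcome, a in amplitudes.items():
                # every existing element splits once per possible outcome
                new_branches[memory + (outcome,)] = coeff * a
        branches = new_branches
    return branches


amps = {"up": 0.6 + 0j, "down": 0.8j}  # illustrative coefficients only
tree = branch(amps, 3)
for memory, coeff in tree.items():
    print(memory, coeff)
```

After three observations with two possible outcomes each, the dictionary holds eight memory sequences, all present at once with their own coefficients, which is the "branching tree" rather than a single linear record.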
‡ Note added in proof. — In reply to a preprint of
this article some correspondents have raised the question of the "transition
from possible to actual," arguing that in "reality" there is,
as our experience testifies, no such splitting of observer states, so that
only one branch can ever actually exist. Since this point may occur to other
readers the following is offered in explanation.
The whole issue of the transition from "possible" to
"actual" is taken care of in the theory in a very simple way — there
is no such transition, nor is such a transition necessary for the theory to be
in accord with our experience. From the viewpoint of the theory all elements of
a superposition (all "branches") are "actual," none
any more "real" than the rest. It is unnecessary to suppose that all
but one are somehow destroyed, since all the separate elements of a
superposition individually obey the wave equation with complete indifference to
the presence or absence ("actuality" or not) of any other elements.
This total lack of effect of one branch on another also implies that no
observer will ever be aware of any "splitting" process.
Arguments that the world picture presented by this theory is
contradicted by experience, because we are unaware of any branching process,
are like the criticism of the Copernican theory that the mobility of the earth
as a real physical fact is incompatible with the common sense interpretation of
nature because we feel no such motion. In both cases the argument fails when it
is shown that the theory itself predicts that our experience will be what it in
fact is. (In the Copernican case the addition of Newtonian physics was required
to be able to show that the earth's inhabitants would be unaware of any motion
of the earth.)
In order to establish quantitative results, we must put some sort of
measure (weighting) on the elements of a final superposition. This is necessary
to be able to make assertions which hold for almost all of the observer states
described by elements of a superposition. We wish to make quantitative
statements about the relative frequencies of the different possible results of
observation — which are recorded in the memory — for a typical observer state;
but to accomplish this we must have a method for selecting a
typical element from a superposition of orthogonal states.
We therefore seek a general scheme to assign a measure to the elements
of a superposition of orthogonal states $\sum_i a_i \phi_i$. We require a positive
function $m$ of the complex
coefficients of the elements of the superposition, so that $m(a_i)$ shall be the measure
assigned to the element $\phi_i$. In order that this general
scheme be unambiguous we must first require that the states themselves always
be normalized, so that we can distinguish the coefficients from the states. However,
we can still only determine the coefficients,
in distinction to the states, up to an arbitrary phase factor. In order to
avoid ambiguities the function $m$ must
therefore be a function of the amplitudes of the coefficients alone, $m(a_i) = m(|a_i|)$.
We now impose an additivity requirement. We can regard a subset
of the superposition, say $\sum_{i=1}^{n} a_i \phi_i$, as a single element $\alpha\phi'$:
$$\alpha\phi' = \sum_{i=1}^{n} a_i \phi_i . \tag{25}$$
We then demand that the measure assigned to $\phi'$
shall be the sum of the measures assigned to the $\phi_i$ ($i$ from 1 to $n$):
$$m(\alpha) = \sum_{i=1}^{n} m(a_i). \tag{26}$$
Then we have already restricted the choice of $m$ to the square amplitude alone; in other words, we have $m(a_i) = a_i^* a_i$,
apart from a multiplicative constant.
To see this, note that the normality of $\phi'$
requires that $|\alpha| = (\sum a_i^* a_i)^{1/2}$. From our remarks about
the dependence of $m$ upon the
amplitude alone, we replace the $a_i$ by their amplitudes $u_i = |a_i|$.
Equation (26) then imposes the requirement,
$$m(\alpha) = m\big((\textstyle\sum a_i^* a_i)^{1/2}\big) = m\big((\textstyle\sum u_i^2)^{1/2}\big) = \sum m(u_i) = \sum m\big((u_i^2)^{1/2}\big). \tag{27}$$
Defining a new function $g(x)$
$$g(x) = m(\sqrt{x}) \tag{28}$$
we see that (27) requires that
$$g\big(\textstyle\sum u_i^2\big) = \sum g(u_i^2). \tag{29}$$
Thus $g$ is restricted to be
linear and necessarily has the form:
$$g(x) = cx \quad (c \text{ constant}). \tag{30}$$
Therefore $g(x^2) = cx^2 = m(\sqrt{x^2}) = m(x)$ and we have deduced that $m$
is restricted to the form
$$m(a_i) = m(u_i) = c u_i^2 = c a_i^* a_i. \tag{31}$$
We have thus shown that the only choice of measure consistent with our
additivity requirement is the square amplitude measure, apart from an arbitrary
multiplicative constant which may be fixed, if desired, by normalization
requirements. (The requirement that the total measure be unity implies that
this constant is 1.)
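The additivity requirement can be checked numerically for a few candidate measures. The sketch below is illustrative only: the helper name `additivity_gap`, the three candidate functions, and the sample coefficients are assumptions introduced here, not part of the derivation. It evaluates the difference between the measure of the combined element and the sum of the measures of its parts, using the normalization condition that the modulus of the combined coefficient equals the square root of the sum of the squared moduli.

```python
import math


def additivity_gap(m, coefficients):
    """Return m(|alpha|) - sum_i m(|a_i|), where normalization of the
    combined state fixes |alpha| = sqrt(sum_i |a_i|^2)."""
    moduli = [abs(a) for a in coefficients]
    alpha = math.sqrt(sum(u * u for u in moduli))
    return m(alpha) - sum(m(u) for u in moduli)


coeffs = [0.3 + 0.4j, -0.5, 0.2j]  # arbitrary illustrative coefficients

candidates = {
    "u": lambda u: u,          # modulus itself
    "u^2": lambda u: u ** 2,   # square amplitude
    "u^4": lambda u: u ** 4,   # fourth power
}

gaps = {name: additivity_gap(m, coeffs) for name, m in candidates.items()}
for name, gap in gaps.items():
    print(name, gap)
```

Of the three candidates tried, only the square amplitude gives a gap that vanishes (up to rounding), in line with the result that additivity singles out this measure up to a constant factor.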
The situation here is fully analogous to that of classical statistical
mechanics, where one puts a measure on trajectories of systems in the phase
space by placing a measure on the phase space itself, and then making
assertions (such as ergodicity, quasi-ergodicity, etc.) which hold for
"almost all" trajectories. This notion of "almost all" depends
here also upon the choice of measure, which is in this case taken to be the Lebesgue
measure on the phase space. One could contradict the statements of classical
statistical mechanics by choosing a measure for which only the exceptional
trajectories had nonzero measure. Nevertheless the choice of Lebesgue measure
on the phase space can be justified by the fact that it is the only choice for
which the "conservation of probability" holds (Liouville's theorem),
and hence the only choice which makes possible any reasonable statistical
deductions at all.
In our case, we wish to make statements about "trajectories"
of observers. However, for us a trajectory is constantly branching
(transforming from state to superposition) with each successive measurement. To
have a requirement analogous to the "conservation of probability" in
the classical case, we demand that the measure assigned to a trajectory at one
time shall equal the sum of the measures of its separate branches at a later
time. This is precisely the additivity requirement which we imposed and which
leads uniquely to the choice of square-amplitude measure. Our procedure is
therefore quite as justified as that of classical statistical mechanics.
Having deduced that there is a unique measure which will satisfy our
requirements, the square-amplitude measure, we continue our deduction. This
measure then assigns to the $i, j, \ldots, k$th
element of the superposition (24),
$$\phi_i^{S_1}\phi_j^{S_2}\cdots\phi_k^{S_r}\,\psi^{S_{r+1}}\cdots\psi^{S_n}\,\psi^{O}[\alpha_i^1, \alpha_j^2, \ldots, \alpha_k^r] \tag{32}$$
the measure (weight)
$$M_{ij\ldots k} = (a_i a_j \cdots a_k)^*(a_i a_j \cdots a_k) \tag{33}$$
so that the observer state with memory configuration $[\alpha_i^1, \alpha_j^2, \ldots, \alpha_k^r]$ is assigned the measure $a_i^* a_i\, a_j^* a_j \cdots a_k^* a_k = M_{ij\ldots k}$. We see immediately that this is a product measure, namely,
$$M_{ij\ldots k} = M_i M_j \cdots M_k \tag{34}$$
where
$$M_i = a_i^* a_i$$
so that the measure assigned to a particular memory sequence $[\alpha_i^1, \alpha_j^2, \ldots, \alpha_k^r]$ is simply the product of
the measures for the individual components of the memory sequence.
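The product structure of the measure can be verified directly for short memory sequences. This is a minimal sketch under assumed values: the three coefficients in `a` and the sequence length of three are hypothetical, and outcomes are labeled by their index in the list.

```python
from itertools import product
from math import prod, sqrt

# Hypothetical normalized coefficients a_i for three possible outcomes.
a = [0.5 + 0j, 0.5j, complex(1 / sqrt(2))]

# Single-observation measures M_i = a_i^* a_i.
M = [(x.conjugate() * x).real for x in a]


def sequence_measure(seq):
    """Square amplitude of the coefficient product a_i a_j ... a_k."""
    coeff = 1 + 0j
    for i in seq:
        coeff *= a[i]
    return (coeff.conjugate() * coeff).real


# Measures of all memory sequences of length three.
measures = {seq: sequence_measure(seq)
            for seq in product(range(len(a)), repeat=3)}

print(measures[(0, 1, 2)], prod(M[i] for i in (0, 1, 2)))
print(sum(measures.values()))
```

Each sequence measure equals the product of the single-observation measures for its entries, and the measures of all sequences add up to one when the coefficients are normalized.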
There is a direct correspondence of our measure structure to the
probability theory of random sequences. If
we regard the $M_{ij\ldots k}$ as probabilities
for the sequences then the sequences are equivalent to the random sequences
which are generated by ascribing to each term the independent probabilities $M_i = a_i^* a_i$.
Now probability theory is equivalent to measure theory mathematically, so that
we can make use of it, while keeping in mind that all results should be
translated back to measure theoretic language.
Thus, in particular, if we consider the sequences to become longer and
longer (more and more observations performed) each memory sequence of the final superposition will satisfy any
given criterion for a randomly generated sequence, generated by the independent
probabilities $a_i^* a_i$, except for a set of
total measure which tends toward zero as the number of observations becomes
unlimited. Hence all averages of functions over any memory sequence, including the special case of frequencies, can
be computed from the probabilities $a_i^* a_i$, except
for a set of memory sequences of measure zero. We have therefore shown that the
statistical assertions of Process 1 will appear to be valid to the observer,
in almost all elements of the
superposition (24), in the limit as the number of observations goes to
infinity.
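The statement that exceptional memory sequences form a set whose measure tends to zero can be illustrated for the simplest case of two outcomes. The sketch below rests on assumptions made here for illustration: the single-observation measure `p = 0.36`, the tolerance `0.1`, and the function name `deviant_measure` are all hypothetical. It sums the square-amplitude measure over every memory sequence whose relative frequency of the first outcome differs from `p` by more than the tolerance, grouping sequences by their count of that outcome.

```python
from math import comb


def deviant_measure(p, n, eps):
    """Total measure of length-n memory sequences whose relative frequency
    of the first outcome differs from p by more than eps.

    There are comb(n, k) sequences with k occurrences of the first outcome,
    each of product measure p**k * (1 - p)**(n - k).
    """
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - p) > eps:
            total += comb(n, k) * p ** k * (1 - p) ** (n - k)
    return total


p = 0.36  # illustrative single-observation measure
measures = [deviant_measure(p, n, 0.1) for n in (10, 100, 1000)]
print(measures)
```

The total measure of the deviant sequences shrinks as the number of observations grows, which is the measure-theoretic form of the law of large numbers used in the text.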
While we have so far considered only sequences of observations of the
same quantity upon identical systems, the result is equally true for arbitrary
sequences of observations, as may be verified by writing more general sequences
of measurements, and applying Rules 1 and 2 in the same manner as presented
here.
We can therefore summarize the situation when the sequence of
observations is arbitrary, when these observations are made upon the same or
different systems in any order, and when the number of observations of each
quantity in each system is very large, with the following result:
Except for a set of memory sequences of measure nearly
zero, the averages of any functions over a memory sequence can be calculated
approximately by the use of the independent probabilities given by Process 1
for each initial observation on a system, and by the use of the usual
transition probabilities for succeeding observations upon the same system. In
the limit, as the number of all types of observations goes to infinity, the
calculation is exact, and the exceptional set has measure zero.
This prescription for the calculation of averages over memory sequences
by probabilities assigned to individual elements is precisely that of the
conventional "external observation" theory (Process 1). Moreover,
these predictions hold for almost all memory sequences. Therefore all
predictions of the usual theory will appear to be valid to the observer in
almost all observer states.
In particular, the uncertainty principle is never violated since the
latest measurement upon a system supplies all possible information about the
relative system state, so that there is no direct correlation between any
earlier results of observation on the system, and the succeeding observation.
Any observation of a quantity B,
between two successive observations of quantity A (all on the same system) will destroy the one-one correspondence
between the earlier and later memory states for the result of A. Thus for alternating observations of
different quantities there are fundamental limitations upon the correlations
between memory states for the same observed quantity, these limitations
expressing the content of the uncertainty principle.
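The loss of correlation caused by an intervening observation can be sketched for a two-state system. Everything concrete here is an assumption for illustration: the bases `A_BASIS` and `B_BASIS` (chosen as two mutually unbiased real bases), the initial state (0.6, 0.8), and the helper names. Each branch carries a memory sequence, a coefficient, and the relative system state left behind by the latest observation.

```python
import math

S = 1 / math.sqrt(2)
A_BASIS = {"A+": (1, 0), "A-": (0, 1)}    # eigenstates of quantity A
B_BASIS = {"B+": (S, S), "B-": (S, -S)}   # eigenstates of a non-commuting B


def inner(u, v):
    """Inner product <u|v> of two 2-component vectors."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def observe(branches, basis):
    """Split each branch by outcome; the relative state becomes the eigenstate."""
    out = []
    for memory, coeff, state in branches:
        for label, vec in basis.items():
            amp = inner(vec, state)
            if abs(amp) > 1e-12:  # drop branches with zero coefficient
                out.append((memory + (label,), coeff * amp, vec))
    return out


def agreement_measure(branches, first, last):
    """Square-amplitude measure of branches whose two records coincide."""
    return sum(abs(c) ** 2 for mem, c, _ in branches if mem[first] == mem[last])


initial = [((), 1 + 0j, (0.6, 0.8))]  # hypothetical normalized system state
aa = observe(observe(initial, A_BASIS), A_BASIS)
aba = observe(observe(observe(initial, A_BASIS), B_BASIS), A_BASIS)

print(agreement_measure(aa, 0, 1))    # repeated A with nothing in between
print(agreement_measure(aba, 0, 2))   # A, then B, then A again
```

With nothing in between, the two records of A agree in every branch. With B observed in between, the agreeing branches carry only half of the total measure for this choice of bases, so the one-one correspondence between the two records is lost.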
As a final step one may investigate the consequences of allowing several
observer systems to interact with (observe) the same object system, as well as
to interact with one another (communicate). The latter interaction can be
treated simply as an interaction which correlates parts of the memory
configuration of one observer with another. When these observer systems are
investigated, in the same manner as we have already presented in this section
using Rules 1 and 2, one finds that in all
elements of the final superposition:
1. When several observers have separately observed the same quantity in
the object system and then communicated the results to one another they find
that they are in agreement. This agreement persists even when an observer
performs his observation after the
result has been communicated to him by another observer who has performed the
observation.
2. Let one observer perform an observation of a quantity A in the object
system, then let a second perform an observation of a quantity B in this object system which does not
commute with A, and finally let the
first observer repeat his observation of A.
Then the memory system of the first observer will not in general show the same result for both observations. The
intervening observation by the other observer of the non-commuting quantity B prevents the possibility of any
one-to-one correlation between the two observations of A.
3. Consider the case where the states of two object systems are
correlated, but where the two systems do not interact. Let one observer perform
a specified observation on the first system, then let another observer perform
an observation on the second system, and finally let the first observer repeat
his observation. Then it is found that the first observer always gets the same
result both times, and the observation by the second observer has no effect
whatsoever on the outcome of the first's observations. Fictitious paradoxes
like that of Einstein, Podolsky, and Rosen⁸
which are concerned with such correlated, noninteracting systems are easily
investigated and clarified in the present scheme.
⁸ Einstein, Podolsky, and Rosen, Phys. Rev. 47, 777 (1935). For a thorough discussion of the physics of
observation, see the chapter by N. Bohr in Albert
Einstein, Philosopher-Scientist (The Library of Living Philosophers, Inc.,
Evanston, 1949).
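The third case, two correlated but noninteracting systems, can be sketched numerically as well. The concrete choices are assumptions made for illustration: the correlated state (|00> + |11>)/sqrt(2), the real bases `Z_BASIS` and `X_BASIS`, the observer labels "O1" and "O2", and the helper names. Each branch stores the memory records and the unnormalized joint state, whose squared norm is the measure of the branch.

```python
import math

S = 1 / math.sqrt(2)
Z_BASIS = {"0": (1, 0), "1": (0, 1)}
X_BASIS = {"+": (S, S), "-": (S, -S)}


def observe(branches, which, basis, observer):
    """Split every branch by an observation of subsystem `which` (0 or 1).

    The joint state is a dict {(i, j): amplitude}; basis vectors are real,
    so <e|k> is simply e[k].
    """
    out = []
    for memory, state in branches:
        for label, e in basis.items():
            new = {}
            for (i, j), amp in state.items():
                k = (i, j)[which]
                overlap = e[k] * amp  # component along the eigenvector e
                for k2 in (0, 1):
                    key = (k2, j) if which == 0 else (i, k2)
                    new[key] = new.get(key, 0) + e[k2] * overlap
            if measure(new) > 1e-12:  # drop branches of zero measure
                out.append((memory + ((observer, label),), new))
    return out


def measure(state):
    """Squared norm of an unnormalized joint state."""
    return sum(abs(amp) ** 2 for amp in state.values())


def first_result_measures(branches):
    """Total measure attached to each value of the first record."""
    totals = {}
    for memory, state in branches:
        totals[memory[0][1]] = totals.get(memory[0][1], 0.0) + measure(state)
    return totals


initial = [((), {(0, 0): S, (1, 1): S})]  # hypothetical correlated pair
after_first = observe(initial, 0, Z_BASIS, "O1")
with_second = observe(observe(after_first, 1, X_BASIS, "O2"), 0, Z_BASIS, "O1")
without_second = observe(after_first, 0, Z_BASIS, "O1")

print(first_result_measures(with_second))
print(first_result_measures(without_second))
```

In every surviving branch the first observer's two records coincide, and the measure attached to each of the first observer's results is the same whether or not the second observer has looked at the other system.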
Many further combinations of several observers and systems can be
studied within the present framework. The results of the present "relative
state" formalism agree with those of the conventional "external
observation" formalism in all those cases where that familiar machinery
is applicable.
In conclusion, the continuous evolution of the state function of a
composite system with time gives a complete mathematical model for processes
that involve an idealized observer. When interaction occurs, the result of the
evolution in time is a superposition of states, each element of which assigns a
different state to the memory of the observer. Judged by the state of the
memory in almost all of the observer states, the probabilistic conclusions of
the usual "external observation" formulation of quantum theory are
valid. In other words, pure Process 2 wave mechanics, without any initial
probability assertions, leads to all the probability concepts of the familiar
formalism.
6. DISCUSSION
The theory based on pure wave mechanics is a conceptually simple, causal
theory, which gives predictions in accord with experience. It constitutes a
framework in which one can investigate in detail, mathematically, and in a logically
consistent manner a number of sometimes puzzling subjects, such as the
measuring process itself and the interrelationship of several observers.
Objections have been raised in the past to the conventional or "external
observation" formulation of quantum theory on the grounds that its
probabilistic features are postulated in advance instead of being derived from
the theory itself. We believe that the present "relative-state"
formulation meets this objection, while retaining all of the content of the standard
formulation.
While our theory ultimately justifies the use of the probabilistic
interpretation as an aid to making practical predictions, it forms a broader
frame in which to understand the consistency of that interpretation. In this
respect it can be said to form a metatheory for the standard theory. It transcends the
usual "external observation" formulation, however, in its ability to deal
logically with questions of imperfect observation and approximate measurement.
The "relative state" formulation will apply to all forms of
quantum mechanics which maintain the superposition principle. It may therefore
prove a fruitful framework for the quantization of general relativity. The
formalism invites one to construct the formal theory first, and to supply the
statistical interpretation later. This method should be particularly useful for
interpreting quantized unified field theories where there is no question of
ever isolating observers and object systems. They all are represented in a single structure, the field. Any
interpretative rules can probably only be deduced in and through the theory
itself.
Aside from any possible practical advantages of the theory, it remains a
matter of intellectual interest that the statistical assertions of the usual
interpretation do not have the status of independent hypotheses, but are
deducible (in the present sense) from the pure wave mechanics that starts
completely free of statistical postulates.