Abstract
We present a macroscopic theory of electroencephalogram (EEG) dynamics based on the laws of motion that govern atomic and molecular motion. The theory is an application of ZwanzigMori projection operators. The result is a simple equation of motion that has the form of a generalized Langevin equation (GLE), which requires knowledge only of macroscopic properties. The macroscopic properties can be extracted from experimental data by one of two possible variational principles. These variational principles are our principal contribution to the formalism. Potential applications are discussed, including applications to the theory of critical phenomena in the brain, Granger causality and Kalman filters.
PACS code: 87.19.lj
1. Introduction
The electrical activity of the brain has intrigued scientists since the invention of the electroencephalogram (EEG) [1,2]. Scalp and intracranial EEG's are now in widespread clinical use in the diagnosis and management of epilepsy and other neurological disorders. These applications rely on empirical correlations between certain EEG patterns with specific neurological disorders. Intense interest exists in trying to understand EEG dynamics at a deeper level, so as to extract ever more information about brain health and function. These efforts fall into two classes: those which are largely empirical, based on traditional correlations between EEG patterns and clinical observations, and those which are theorybased, where one has in mind a certain model of brain dynamics and then one tries to interpret EEG patterns in terms of the theoretical model. In the empirical class are recent efforts to correlate high frequency oscillations with epileptogenic tissue [3]. In the theorybased class, the most celebrated approach is the cable theory of Hodgkin and Huxley [4]. This theory can be scaled up using compartmental models to describe networks of thousands or even millions of neurons using high power computers. In the hands of a master, much insight can come from such simulations [5]. However, these methods are computationally intensive. They are not easily scaled to truly macroscopic levels and they are not amenable to the clinical diagnostic situation where one wants to know, for specific EEG samples from specific individuals, whether a certain brain pathology is present.
Mesoscopic and macroscopic level theories of EEG dynamics have also been proposed, each based on a plausible basic postulate or mathematical model of the electrical activity of the brain [2,68]. The methods of nonlinear dynamics (chaos theory) also fall into this class and have been applied to seizure prediction [9,10].
It would be desirable to base a macroscopic theory of EEG dynamics on fundamental physical principles, for instance, on the laws of motion that govern atomic and molecular motion. In this paper, we discuss such a theory. The approach we take was developed by Zwanzig [11] and Mori [12] to explain thermodynamic irreversibility (why entropy always increases). The result is a simple equation of motion that has the form of a generalized Langevin equation (GLE), which requires knowledge only of macroscopic properties. The macroscopic properties can be extracted from experimental data by one of two variational principles. Potential applications are discussed, including applications to epilepsy research, the theory of critical phenomena in brains, Granger causality and Kalman filters.
2. Theory
EEG dynamics results from the motion of microscopic charged particles in the brain. The charged particles interact with other charged particles as well as with uncharged particles. The total number of particles is astronomical, on the order of 10^{23}. A macroscopic EEG electrode either on the scalp or inserted into the brain detects voltages set up by the local distribution of the charged particles. The changes in these voltages reflect the local flux of charged particles. The forces exerted by the particles on each other are highly nonlinear. How can we possibly hope to derive an equation of motion for the macroscopic EEG voltages? The most elegant solution is to apply ZwanzigMori projection operators [11,12]. We follow the discussion of Zwanzig [13]; see also Berne and Pecora [14] for more detailed discussions.
Let the voltage measured by electrode n at time t be represented as x(n, t) where n = 1 to N. If the reference ground is taken to be infinitely far away, then x(n, t) is given by:
where N_{0 }is the total number of charged particles, Q_{i }is the charge of particle i, is its position at time t, and is the position of the n^{th }electrode. If the reference ground is chosen differently, then x(n, t) will be given by some linear combination of the terms on the right hand side of Eq (1). The choice of reference ground does not affect what follows. Next, let x(n, t) represent the n^{th }component of an Ndimensional vector X(t). Hence, the voltage readouts from an Nchannel EEG are the N components of the vector X(t). For convenience, we will take X(t) to represent the EEG voltage after the time average of the raw data has been subtracted out: X(t) = X_{raw }(t)  〈(X_{raw }(t))〉.
In what follows, we will refer to the x(n, t)'s as the explicit degrees of freedom, while the 's and all other degrees of freedom are the implicit or "bath" degrees of freedom. At this point, readers who wish to skip some of the more mathematical discussion may jump to Eq. (17).
To continue, if the microscopic particles of the system all obey the laws of physics, then the dynamics of X(t) can be written:
Here L is the Liouville operator. The classical and quantum Liouville operators, respectively, are defined as:
Here and are the coordinate and momentum of the i^{th }particle, H is the total Hamiltonian, ħ is Planck's constant, and the square brackets denote the quantum commutator. In what follows, one may choose the dynamics to be determined by either the classical or quantum Liouville operator.
The Hamiltonian H in Eq (3) contains kinetic and potential energy terms for all the microscopic particles in the system. The potential energy terms describe how every microscopic particle interacts with every other microscopic particle. These terms are in general highly nonlinear. We shall not need to know the details of these interactions. The ZwanzigMori approach only requires that the microscopic dynamics obeys the form of Eq (2). The realm of validity for what follows thus includes all of classical and quantum mechanics.
Equations (2)–(3) are formally correct but they are not useful for describing macroscopic systems in practice because we cannot possibly specify L for all N_{0 }≈ 10^{23 }microscopic particles. If we could formally "average" or integrate Eq (2) over all the microscopic degrees of freedom and (that is, over all the implicit degrees of freedom) without integrating out the explicit degrees of freedom contained in X(t), then we can transform Eq (2) into a useful equation of motion for X(t). Here is where projection operators come in. Let Γ represent all the microscopic particle coordinates and momenta (i.e., it represents a point in phase space), and let . Define an inner product
where ρ(Γ) gives a microscopic distribution of Γ, not necessarily at equilibrium, and where the asterisk denotes a complex conjugate. Here A and B are explicit or implicit degrees of freedom which for the time being we will allow to be complex (i.e., they may have both real and imaginary components). We will later show how to recover convenient expressions for purely real observables, such as the EEG voltages and their time derivatives. Note that Eq (4) is essentially a phase space average of the product of A times B*.
Now imagine that we identify the vector A as the explicit degree of freedom, and we wish to project or integrate out all other degrees of freedom from Eq (2). The projection operator with respect to A is:
This operator integrates out all degrees of freedom of the system except those that affect A. Applying this projection operator on Eq (2) will result in an equation of motion for A only. We will not reproduce here the mathematical manipulations described in detail in Refs [1214]. One needs the operator identity:
We will take t_{0 }to represent the initial time within an arbitrary time window. After some work, for times such that t ≥ t_{0}, one arrives at the generalized Langevin equation (GLE):
where
Here Ω has the interpretation of a frequency, F(t) is a "random force" due to the dynamics of the implicit degrees of freedom, and Γ(τ) is referred to as a memory function that defines the effect of prior values of A on its current time rate of change. Equation (7) is the desired equation of motion for the explicit degrees of freedom. An important formal result is that
which states that the random force acting on the macroscopic observable A is not correlated with A. This result is rigorously true and it underlies the foundation of the Onsager regression hypothesis [14]. It will also suggest one of the variational principles by which we extract model parameters from experiment, as we will discuss shortly.
Other forms of the GLE may be derived from Eq (7). For convenience, let us rewrite Eq (7) as
where the open circle denotes a convolution and the terms in Ω and Γ(τ) have both been absorbed into the convolution:
The macroscopic observable A(t) thus far is allowed to have both real and imaginary parts. Now consider a purely real vector A(t) of the form:
We choose this form because the underlying dynamics is governed by the laws of physics, for which the coordinate and momentum (here represented as the velocity) are independent degrees of freedom. The coordinate X(t) is constrained such that
In practice, recall that X(t) represents the N channels of EEG voltages. Hence, V(t) is the first time derivative of X(t), which in practice can be constructed from experimental time series data by taking finite differences, e.g., V(t) ≈ [X(t + δt)  X(t  δt)]/(2δt), where δt is the EEG sampling time.
Taking the formal time derivative of Eq (14) and inserting Eq (12), we find
Applying the constraint of Eq (15), we have
where , K = W_{21}, G = W_{22 }and F_{R}(t) = F_{2}(t). This equation may be written in more expanded form as
Equation (17b) has the form of an equation of motion for a timedelayed damped harmonic oscillator with force constant matrix K, friction constant matrix G and random force F_{R}(t). The purely real matrix K has convolution components K(m, n, τ) giving the influence of x(n, t  τ) on x(m, t) with time delay τ. That is, a fluctuation in the n^{th }macroscopic coordinate of magnitude x(n) at a certain point in time, results in a force on the n^{th }macroscopic coordinate of magnitude K(m, n, τ)x_{n }at a time τ later. Similarly, G(τ) is a purely real timedelayed friction matrix with components G(m, n, τ) giving the frictional force of on the m^{th }macroscopic coordinate with time delay τ.
Recall that X(t) represents the EEG voltages. Hence Eq (17) tells us that we can think of EEG dynamics as being driven by three kinds of forces: one due to interactions between neurons at the sites of the electrodes (the Kterm), another due to frictional forces (the Gterm), and a third due to random environmental noise (the F_{R}(t)term). The influence of the enormous number of implicit degrees of freedom are hidden in microscopic expressions for the K(τ), G(τ) and F_{R}(t) terms as expressed through Eqs (8)–(10). In certain highly idealized situations, one may be able to estimate these functions quantitatively. In the general case, however, Eqs (8)–(10) are computationally intractable. Nonetheless, the functions K(τ), G(τ) and F_{R}(t) also represent macroscopic properties of the system, and it should be possible to extract them from experiment. What we need is a way to relate these functions to experimental data, preferably by a variational principle.
There are two obvious approaches to a variational principle by which K(τ), G(τ) and F_{R}(t) may be extracted from experiment: (I) minimization of the amplitude of the random force F_{R}(t) and (II) minimization of correlation of the random force with the macroscopic coordinate.
2.1. Variational principle I: minimization of random force amplitude
Equation (17) represents a way of dividing up the total ''force'' acting on the macroscopic observables into three components: (1) that due to explicit interactions between the macroscopic observables, represented by the term in K, (2) that due to frictional dissipation, represented by the term in G and (3) that due to environmental fluctuations, represented by the term in F_{R}(t). One may plausibly argue that one should ascribe as much of the total force as possible to the explicit interactions between the macroscopic observables, and as little as possible to environmental fluctuations. In this case, one may define an error functional which adds up all the square amplitudes of F_{R}(t) over the entire time interval under observation, t = [t_{a}, t_{b}], with the goal of minimizing it:
To minimize this error functional, one may take each value of K(m, n, τ) and G(m, n, τ), for each m, n and τ, as an independent parameter, and vary them so as to make E_{I}(t_{a}, t_{b}) as small as possible. To do this, first insert Eq (17) into Eq (18), to express E_{I}(t_{a}, t_{b}) in terms of X(t), , , K(τ), and G(τ). Next, insert experimental values for X(t), and into E_{I}(t_{a}, t_{b}). One may then minimize E_{I}(t_{a}, t_{b}) with respect to K(τ) and G(τ) by standard algorithms [15]. Because E_{I}(t_{a}, t_{b}) is quadratic in K(τ) and G(τ), one is guaranteed a unique solution. This procedure represents our first variational principle.
For future reference, we list the derivatives of E_{I}(t_{a}, t_{b}) with respect to the K(m, n, τ)'s and G(m, n, τ)'s. These equations are useful in global minimization algorithms [15]:
Alternatively, after some experience, if one finds that K(m, n, τ) and G(m, n, τ) tend to decay exponentially, possibly with some oscillations, then one may try to save some computational effort by parameterizing K(m, n, τ) and G(m, n, τ), for example, using
The minimization of E_{I}(t_{a}, t_{b}) would then be with respect to A_{K}(m, n), b_{K}(m, n), ω_{K}(m, n), etc. The drawback here is that the minimization algorithm would require an iterative algorithm suitable for nonlinear fits, and a unique solution is not guaranteed [15].
2.2. Variational principle II: minimization of random force correlation
To formulate our second variational principle, consider Eq (11), which states that the random force should not be correlated with any macroscopic observable when averaged over the phase space available to the microscopic degrees of freedom, as defined in Eq (4). One may hypothesize that this phase space average may be replaced by an average over time. This hypothesis is known as the ergodic hypothesis. Note that in Eq (4), we do not assume the phase space average is necessarily an equilibrium average. If for some possibly nonequilibrium distribution function, the ergodic hypothesis holds true, then we may take the assumption that the bath random force is not timecorrelated with macroscopic observables as the basis for our second variational principle. There are subtleties in this approach which must be treated with care.
It is generally not possible to be certain whether an experimental system is ergodic or not. For instance, the explicit degrees of freedom may happen to be trapped in a portion of phase space that is separated from other regions of phase space by very high free energy barriers. Even if over time the explicit degrees of freedom sample all of the phase space available within this bounded region, we can never know that there are not other regions that would have been equally sampled, were it not for the impenetrable energy barriers separating the regions. In this case, we can still define the random force term in Eq (17) to be such that it is not timecorrelated with the macroscopic observables. If the random force were correlated with a macroscopic observable, then of course it would not really be random.
In what follows, we will assume that the bath degrees of freedom are ergodic. We will be careful, however, not to invoke the ergodic hypothesis for correlations between two explicit degrees of freedom, because we wish to allow for the possibility that the explicit degrees of freedom are nonergodic and far from equilibrium.
To proceed, let us define a time averaged correlation function with the time average being over a time period [t_{a}, t_{b}], with t_{a }≤ t_{0 }≤ t_{b}:
Let C(n, m, t) represent the (n, m) matrix element of a matrix C(t). Evaluate Eq (17b) at time t + t_{0}, multiply both sides by X(t_{0}), and integrate over t_{0 }over the range [t_{a}, t_{b}]. We will assume that K(τ), G(τ) and F_{R}(t) do not depend on the time t_{0}, i.e., we assume these functions are stationary. The result is:
which may be written in more expanded form as
where
Here C_{RX}(t) is the time correlation between the random force and the macroscopic observables, when averaged over time. If we take the bath random force to be ergodic, then the time averaged correlation C_{RX}(t) from Eq (25) is equal to the phase space averaged correlation (F(t), X(t_{0})) from Eq (11), and therefore C_{RX}(t) should be equal to zero. Minimizing the square amplitude of every element of the matrix C_{RX}(t) over the time interval [t_{a}, t_{b}] represents our second variational principle.
The careful reader will note that Eq (24) is not quite the same as the typical equation of motion for the time correlation function, as defined in standard texts [13,14]. In the standard derivations, an equation of motion that is identical in form as Eq (24) is derived but the time correlation functions involve an average over phase space, not over time. One then invokes the ergodic hypothesis to equate these phase space averaged correlation functions with time averaged correlation functions. However, we wish to avoid making the ergodic hypothesis for the explicit degrees of freedom. In our derivation, we invoke the ergodic hypothesis only for the bath degrees of freedom.
To continue, define an error functional that consists of the sum of the square of every element of the matrix C_{RX}(t), summed over all the elements of the matrix and over every instant in time in the time interval [t_{a}, t_{b}]:
Here Tr signifies a matrix trace and T a matrix transpose. To minimize the error functional of Eq (26), one may take each value of K(m, n, τ) and G(m, n, τ), for each m, n and τ, as an independent parameter, and vary them so as to make E_{II}(t_{a}, t_{b}) as small as possible. To do this, first insert Eq (24) into Eq (26), to express E_{II}(t_{a}, t_{b}) in terms of C(t), , , K(τ), and G(τ). Next, insert experimental values for C(t), , into E_{II}(t_{a}, t_{b}). One may then minimize E_{II}(t_{a}, t_{b}) with respect to K(τ) and G(τ) by standard algorithms [15]. Because E_{II}(t_{a}, t_{b}) is quadratic in K(τ) and G(τ), one is guaranteed a unique solution. This procedure represents our second variational principle.
For future reference, we also list the derivatives of E_{II}(t_{a}, t_{b})with respect to the K(m, n, τ)'s and G(m, n, τ)'s. These equations are useful in global minimization algorithms [15]:
Alternatively, after some experience, if one finds that K(m, n, τ) and G(m, n, τ) tend to decay exponentially, possibly with some oscillations, then one may try to save some computational effort by parameterizing K(m, n, τ) and G(m, n, τ), for example, using
The minimization of E_{II}(t_{a}, t_{b}) would then be with respect to A_{K}(m, n), b_{K}(m, n), ω_{K}(m, n), etc. The drawback here is that the minimization algorithm would require an iterative algorithm suitable for nonlinear fits, and a unique solution is not guaranteed [15].
3. Discussion
The mathematical or physical model that one chooses to represent EEG dynamics influences how one interprets experimental observations. The model itself acts as a filter which biases one's interpretations. It is desirable therefore that these models be based on fundamental physical principles. The model we present here is suitable for interpreting electrophysiological data acquired from extracellular recordings. It is a macroscopic level model which complements more microscopic HodgkinHuxley type models. The assumptions that we make are fourfold. First, we assume that the underlying dynamics obeys the laws of either classical or quantum physics. This assumption is quite safe for biological systems.
Second, we assume that the total system under consideration is isolated and free from external influences, which allows us to write the Liouville operator in Eq (2) as being free of an explicit time dependence, i.e., the overall system is stationary. If there is an external driving force, such as environmental stimuli driving a learning experience, then the Liouville operator will have an explicit time dependence and the operator identity of Eq (6) is no longer valid. On the other hand, one could always subsume the external degrees of freedom into the Liouville operator, to make the external degrees of freedom part of the total system under consideration. It does not matter how many degrees of freedom are subsumed into the Liouville operator in this way, since we integrate them out anyway. One could in principle claim to include the entire universe in one's Liouville operator, in which case, neglecting relativistic and other cosmological effects, one recovers the time independence of the Liouville operator and the derivation of Eq (17) proceeds as above.
Thirdly, we have made the ergodic assumption for the bath degrees of freedom, which assumes that the bath degrees of freedom can quickly (on time scales much faster than the time scales of the explicit degrees of freedom) explore all of the phase space available to the bath. Importantly, we have avoided making the ergodic hypothesis for the explicit degrees of freedom, and thus our principal theoretical result, Eq (17), is capable of describing systems far from equilibrium.
The fourth assumption is comprised of either variational principle I or II, either of which allows us to extract the model parameters from experimental data. Variational principle I tries to explain as much of the dynamics as possible using the macroscopic observables, ascribing as little as possible to the influence of random unseen forces. Variational principle II does not assume that the random forces are small in amplitude, but rather that they are not correlated with the dynamics of the macroscopic observables.
We feel that variational principle II is better justified, as there is no reason to suppose that random environmental noise should generally be small. Variational principle II requires us to make either the ergodic hypothesis for the bath random forces, or one must take the definition of a random force to mean that it is not correlated with macroscopic observables, either within a phase space average or in a time average. If experimentally we find that the random force is in fact correlated with a macroscopic observable, then we would know that there is an unidentified degree of freedom in the experimental system that is important to the dynamics of the identified macroscopic observables. Such a scenario can arise if there are neurons very far from the experimental electrodes that strongly drive the observed neurons, but that are too far from any electrode to be directly observed. In this case, there very well may be strong correlations between the macroscopic observables and the presumptive random force term. Thus, difficulty in minimizing the error functional E_{II}(t_{a}, t_{b}) may sometimes signify that we have not identified all the key macroscopic observables. We may need to insert a few more electrodes in other parts of the brain.
In this case, if it is not possible to insert more electrodes into the brain or if we do not know where to insert them, what we can do is to take shorter time intervals of observation, given by [t_{a}, t_{b}], and to extract the K's and G's for each time interval individually, a different set of K's and G's for each time interval. If the time intervals are short enough, then it will be possible to make E_{II}(t_{a}, t_{b}) as small as one likes, simply because there will be fewer data points to be fitted with a given number of free fitting parameters. The K's and G's will then vary across time intervals. These variations reflect the influence of the unseen variables.
3.1. Nonlinear interactions
It may seem curious that Eq (7) is linear in the macroscopic variables X(t) and even though we have made no assumptions about linearity in the microscopic variables, the 's and 's. The reason is a subtle one but important to understand. In Eq (2), every component of the vector X(t), x(n, t), is itself a vector in Hilbert space which can be expanded in terms of an infinite basis set of eigenfunctions π_{j }of the Liouville operator L:
Higher order functions, such as x(n, t)^{2}, x(n, t)^{3}, etc, have their own expansion coefficients in terms of these eigenfunctions, and the expansion coefficients will not necessarily have a simple relationship to those for x(n, t). What this means is that in Hilbert space, higher order functions of the macroscopic variable x(n, t) are considered additional dimensions in the abstract space. If they are important for describing the dynamics of X(t), then one will not be able to satisfy one's variational principle and E_{II}(t_{a}, t_{b}) will be in some sense too large. Difficulty in satisfying the variational principle thus can signal either that there are important unseen variables at play or that there are nonlinear dependences on the macroscopic variables. In the latter case, one remedy would be to define a vector Z(t) with components z_{N+k}(n, t) = x(n, t)^{k}. One can include higher order terms up to whatever order k one likes, in order to capture some of the nonlinear dependences. To generalize even further, one could begin including other functions as well, for example, sigmoidal functions for modeling nonlinear neuronal responses. Regardless of what one chooses, the vector Z(t) will evolve according to the Liouville equation:
The rest of the development follows as before, and one finds a corresponding equation of motion for Z(t):
One can still appeal to either variational principle I or II to extract the K's and G's from experiment data.
In general, there is no simple prescription for determining which macroscopic observables are important nor the best nonlinear forms for the interactions between the macroscopic observables. The choices that one makes will depend on the specifics of one's experimental setup (where one has inserted electrodes) and on one's intuition about how best to model explicit interactions. Nonetheless, the form of Eq (17) or Eq (33) constrains the mathematical model to one that is consistent with the laws of physics, and the variational principles I and II guarantee the best fit of one's model to experimental data.
As an example, consider the situation where there are two intracranial electrodes, and where the neurons near one electrode interact with neurons near the other through both a linear term and a sigmoidal term. There may also be a frictional force acting at each site. A reasonable set of equations of motion may then look like this:
One may then appeal to either variational principle to obtain the K's and G's, and even λ if one wishes. The ZwanzigMori approach thus allows for a very flexible approach to the mathematical modeling of macroscopic observables. One does one's best to set up an equation of motion, with a certain set of fitting parameters, and then one appeals to the variational principle of one's choice to extract these fitting parameters from experiment. The variational principle guarantees the best description of the experimental data, given one's choice of a mathematical model.
3.2. Piecewise linear approximation
Alternatively, one can ignore all macroscopic observables that are not linear functions of the x(n, t)'s and simply take sequential time intervals [t_{a}, t_{b}] to be short enough that E_{I}(t_{a}, t_{b}) or E_{II}(t_{a}, t_{b}) is satisfactorily small. How short this time interval has to be is however short is necessary to obtain a value of E_{I}(t_{a}, t_{b}) or E_{II}(t_{a}, t_{b}) that is below one's desired threshold. In this case, one is assuming that the dynamics is linear within each time interval [t_{a}, t_{b}], but not necessarily linear across time intervals. This approach is the piecewise linear approximation. It is worthwhile considering when it is valid.
If the underlying microscopic dynamics obeys classical dynamics (i.e., Newtonian physics), then there is no loss of generality in making the piecewise linear approximation, at least as far as the validity Eq (17) is concerned. The reason is that the force exerted on a classical particle is proportional to the slope of a potential energy function at a single point on that surface, with that point being given by the instantaneous coordinate of the particle X(t). For a short enough period of time, one can perform a Taylor expansion of the potential energy surface about the instantaneous coordinate of the particle, expand to just the quadratic order term, and ignore the higher order terms. The force, since it is proportional to the slope of the potential energy surface, is then always piecewise linear in X(t).
In contrast, the instantaneous force on a quantum wavepacket depends on the shape of the potential energy surface over the instantaneous spatial extent of the entire wavepacket. If this wavepacket extends over a substantial patch of the potential energy surface, then an accurate Taylor expansion of the potential energy surface may have to include terms higher than just the quadratic order term. Thus quantum dynamics may not always be accurately described by a piecewise linear assumption.
Fortunately, biological systems are typically far from the quantum regime, and in this case, we can always simply take the time intervals [t_{a}, t_{b}] shorter and shorter until we satisfy our variational criterion. Piecewise linear analysis (also known as instantaneous normal modes) has been surprisingly successful in describing even highly nonlinear dynamics, including that of liquids [1620].
Once we have obtained the K's and G's, we can interpret interactions between groups of neurons with these functions. Which neuronal groups are linked by either the K's or G's? What are the time scales for the time delays? In particular, the eigenstates and eigenvalues of the Kmatrix would be of high interest, as these eigenstates represent spatiotemporal patterns created by the interaction between the explicit degrees of freedom, i.e., these eigenstates may represent ''memory traces''.
To explore this idea, first ignore the convolutions in Eq (17) and consider the eigenstates of K where the eigenvalues are purely real and positive. These eigenstates are oscillatory states which reside in free energy "wells". In terms of the EEG, one will see "standing wave" oscillations distributed over the spatial distribution of the respective eigenstates. These are stable states that can be used to store information. In contrast, eigenstates of K where the eigenvalues are purely real and negative are unstable states which map onto free energy "barriers". In terms of the EEG, these states are evanescent states which do not recur, or at least not in a periodic way. These unstable states are not useful for storing information because they are transitory and one cannot reliably design a trajectory to return to such states.
Because K is purely real but not necessarily symmetric, it can also have pairs of complexvalued eigenstates which have eigenvalues that are also complex. In each pair, the eigenvalues and eigenstates are complex conjugates of each other. The EEG dynamics represented by these pairs is that of a "traveling wave" that travels between the real and imaginary parts of the eigenstates. For instance, if one eigenstate is with complex eigenvalue λ_{1 }= λ_{R }+ iλ_{I}, then the other eigenstate is with eigenvalue λ_{2 }= λ_{R } iλ_{I}. In terms of the EEG dynamics, one will see activations of EEG voltages that morph continuously between the spatial distribution represented by and that represented by . If λ_{R }> 0, then these states are stable and can also be used to store information.
The presence of the convolution between K(τ) and X(t) in Eq (17) opens up the possibility of supereigenstates that are not only spatially distributed, but also extended in time. To see how these arise, define a supervector X(t) by concatenating (M+1) consecutive Ndimensional Xvectors:
The number M is determined by the number of time steps that contribute to the convolutions involving the K and G matrices. A supervector F_{R}(t) can be defined in an analogous way, as well as N(M + 1) × N(M + 1) dimensional supermatrices K and G. Equation (17) can then be written without convolution operators as:
The eigenstates of K now span not only space but also time over an interval given by τ_{M }= Mδt. We suggest that those eigenstates of K that are stable represent spatiotemporal memory traces.
3.3. Criticality, neural and thermodynamic
Of increasing interest in neurodynamics is the idea that the brain may poise itself near a critical point [2123]. Near this neural critical point, neuronal elements may exhibit spatiotemporal patterns of correlated activation that span many length scales and time scales, from the length scales of just a few neurons to that encompassing macroscopic brain regions, and from time scales of individual action potentials (millisecs) to the much longer time scales of a completed thought (seconds to minutes, and possibly to even longer times). For neural systems, criticality can be defined in terms of the connectivity of the system [24,25]. It has been shown that a neural system at critical connectivity exhibits maximal information storage capacity, and optimal information transmission and processing [21,22,24,2628]. Because the survival of an animal depends on how well its brain processes, stores and retrieves information, we have hypothesized that biological neural systems must maintain themselves at or near critical connectivity [29]. If this hypothesis is true, then failure to remain at or near critical connectivity may represent neurological disease, for instance, epilepsy [30]. If this hypothesis is false, it cannot be too far wrong, and it would be important to characterize how far a neural system can be from critical connectivity before it fails to be useful as an information processing system.
It would be of great interest to have a reliable measure of connectivity in the intact brain, as measured by clinically available electrodes either on the scalp of a human subject, placed on the surface of the brain, or inserted into the brain. For practical reasons, all such systems monitor only a tiny subset of all neural activity in the intact brain. An important consequence is the subsampling problem, where monitoring of only a portion of the total system is likely to misrepresent the true state of the system, including the possibility of mistaking a critical system for a subcritical or supercritical one [31]. It appears that one needs to monitor on the order of 25% of the total system in order to have some hope of a reliable measurement of system connectivity, at least using current measures of connectivity (Priesemann V; Personal communication, 2009).
Can the projection operator approach afford a more reliable measure of criticality, and can it connect the neural concept of criticality to the more traditional thermodynamic concept [32]? We do not know as yet, but the conceptual framework of the projection operator approach is suggestive of an approach to this problem. The reasoning is as follows.
The vector of macroscopic observables, X(t), can be regarded as a generalized coordinate of an abstract neural ''superparticle'' that exists in a very highdimensional space. The coordinate reflects the collective movement of a macroscopic number of microscopic charged ions and molecules in a very complicated, nonlinear way, as given by Eq (1). Notwithstanding the complexity of the microscopic dynamics, it is a characteristic of critical phenomena to span all length and time scales [32], and thus one may ask, if we define a heat capacity for the macroscopic neural superparticle, will it behave in the same way as the heat capacity of the microscopic system if either system is at or near its critical point? The answer is yes, it should, and therefore the critical point of the microscopic system should be identical to that of the macroscopic system. The heat capacity of the neural system should diverge at the critical point in the same way, with the same critical exponent.
To explore this idea, note that changes in the internal energy U(t) of the oscillator are driven by the force K ◦ X(t), and so incremental changes in this energy are given by
If we take the kinetic energy to be and the temperature to be related to the kinetic energy through
where T(t) is the instantaneous temperature, N is the total number of EEG channels and k_{B }is Boltzmann's constant [33], we can then define a dimensionless internal heat capacity as
Using Eq (40), we may then investigate divergences of the heat capacity near phase transitions including the critical point. Other thermodynamic quantities of interest may be similarly defined [34]. A caution is that more sophisticated definitions of temperature than Eq (39) may be needed for systems that are not at thermal equilibrium.
One can also begin to think of EEG dynamics in terms of established particle dynamics concepts. What does the free energy surface U(t) look like on which brain dynamics moves? Are there deep energy wells that trap neural superparticle trajectories? Are there frequent hops between energy wells? What proportion of time is spent inside a well vs on an energy barrier? How high are the energy barriers to transitions between free energy wells? Can one hope to bring into play such landmarks of physical chemistry as transition state theory and other reaction rate theories? What happens in disease, for example, in epilepsy? Do the energy wells become shallower? Is there less friction? We have previously taken small steps in investigating these questions [35], but a great deal more is possible. Our prior experience shows that the calculations required to utilize variational principle II are feasible.
3.4. Causality
According to Newtonian physics, the force exerted by one body on another acts instantaneously, with no time delay. Projection operator theory shows us that if the force exerted by one macroscopic body on another is mediated through many microscopic degrees of freedom, the response of the second body may be delayed in time. This time delay is represented by the convolution in Eq (17). The macroscopic time delay is intuitively obvious from our daily experience, and does not surprise us. The time delay is suggestive of but not proof of causality. Can causality be demonstrated using the ZwanzigMori equation of Eq (17)?
In this regard, it is instructive to consider Granger causality [36], which is finding increasing application in neuroscience and many other areas. In Granger causality, one assumes that the value of X(t + δt) at time t + δt depends on all prior values at earlier times. One then asks if X(t + δt) also depends on prior values of another vector Y(t). To answer this question, one can construct two time series:
One solves for G in Eq (41) by minimizing an error functional over some time interval [t_{a}, t_{b}]:
Similarly one solves for H_{1 }and H_{2 }by minimizing the error functional
If E_{XY }(t_{a}, t_{b}) is significantly less than E_{X }(t_{a}, t_{b}) by some statistical measure, then one says that Y(t) Granger causes X(t).
The ZwanzigMori formulation can be applied to demonstrate causality in an analogous way. The Granger error terms R_{X}(t) and R_{XY}(t) correspond to random force fluctuations in the ZwanzigMori equation. The error functionals of Granger causality in Eqs (43) and (44) are equivalent to the ZwanzigMori error functional E_{I}(t_{a}, t_{b}), corresponding to our variational principle I. That the ZwanzigMori equation contains both a coordinate and a velocity term is not a major difference, as within the Granger formulation, one could simply take the velocity as another degree of freedom, in effect, another coordinate.
However, within the ZwanzigMori formalism, one also has the option of appealing to variational principle II. This variational principle may be more useful for noisy systems. For noisy systems, it is not justifiable to assume that the random force contribution to dynamics should be as small as possible. Random environmental forces may in fact be dominant. Nonetheless, one can still define the random force to have no time correlation with the macroscopic observables, for reasons discussed above. Thus, we feel that variational principle II has advantages over variational principle I.
3.5. Prediction and control: relation to Kalman filters
Kalman filters are also beginning to find application in engineering applications of neural prediction and control [37]. The idea here is to understand how a system responds to externally applied perturbations so that one can apply the perturbations in a rationally planned way so as to achieve a desired response.
ZwanzigMori theory also allows one to probe system response to external perturbations. Let the system observables be described by a vector X(t) and let the external perturbations be described by a timedependent vector Y(t). On the experimental system, one may apply a variety of representative test perturbations Y(t). One then constructs a vector Z(t) containing both vectors X(t) and Y(t):
Going through all the steps as before, we find the ZwanzigMori equation for Z(t)
Appealing to either variational principle I or II, we can extract all the K's or G's in Eq (46) as a result of the test perturbations. There will be offdiagonal elements of the K's and G's that describe the effect of a given test perturbation Y(t) on the future dynamics of X(t). Knowing these elements allows one to predict the future response of the system to any variation of the test perturbations.
3.6. Related approaches
Hänggi and coworkers have also taken advantage of the ZwanzigMori formulation of the GLE to construct a hierarchy of statistical measures of system memory [38]. These measures have been applied to a patient with photosensitive epilepsy with striking results [39]. The ZwanzigMori approach has also been applied to construct a renormalized kinetic theory of dense fluids [40].
In terms of other theoretical approaches for deriving macroscopic equations of motion based on microscopic dynamics, a moment expansion method is also possible. In this approach, one defines moments of the macroscopic observables, e.g., where m and n are positive integers and where the angular brackets signify a phase space average over a distribution function, which need not be an equilibrium distribution function. Taking time derivatives and applying microscopic laws within the angular brackets results in coupled differential equations linking the dynamics of the different order moments of X(t). In general, the lower order moments will depend on higher order moments, and one needs a way of closing the moment expansion [e.g., see Refs [41,42]].
Yet another method is to expand the distribution function in terms of a "basis set", and then derive equations of motion for the expansion coefficients. It is possible to allow the basis functions themselves to change in time by appealing to the DiracFrenkel variational principle [43], which resembles our variational principle II. Indeed, the DiracFrenkel variational principle was the motivation for variational principle II. The variationally optimized "mobile basis set" approach has been applied to quantum dynamics [44,45] and can also be applied to statistical mechanics. A suggestion for workers who wish to try this mobile basis set approach is that one should expand the square root of the distribution function in a basis set, not the distribution itself, or else one would not be able to maintain normalization simultaneously with energy conservation.
3.7. Summary
The ZwanzigMori formalism is a simple and flexible mathematical framework for interpreting macroscopic dynamics even in the presence of significant environmental noise. Its range of applicability is very broad, including the dynamics of all natural systems governed by classical or quantum physics. It is based on sound physical principles, and it allows one to extract model parameters from experimental data using one of two variational principles. These variational principles are our principal contribution to the formalism.
Acknowledgements
DH was supported by NIH 1KL2RR02501201 and by the NIH Loan Repayment Program. We thank Drs. John Straub, Tom Keyes and Viola Priesemann for helpful comments.
References

Buzsaki G: Rhythms of the Brain. New York: Oxford University Press; 2006.

Nunez PL: Neocortical Dynamics and Human EEG Rhythms. New York: Oxford University Press; 1995.

Engel J Jr, Bragin A, Staba R, Mody I:
Epilepsia. 2009, 50:598604. PubMed Abstract  Publisher Full Text

J Physiol. 1952, 117:500544. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Traub RD, Miles R: Neuronal Networks of the Hippocampus. New York: Cambridge University Press; 1991.

Freeman WJ: Neurodynamics: An exploration in mesoscopic brain dynamics. London: Springer Verlag; 2000.

Phys Rev E. 1995, 51:50745083. Publisher Full Text

Biophys J. 1972, 12:124. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Epilepsy Behav. 2008, 12:128135. PubMed Abstract  Publisher Full Text

Mormann F, Andrzejak RG, Elger CE, Lehnertz K:
Brain. 2007, 130:314333. PubMed Abstract  Publisher Full Text

J Chem Phys. 1960, 33:13381341. Publisher Full Text

Prog Theor Phys. 1965, 33:423455. Publisher Full Text

Zwanzig R: Nonequilibrium statistical mechanics. New York: Oxford University Press; 2001.

Berne BJ, Pecora R: Dynamic Light Scattering. New York: Wiley; 1976.

Press WH, Flannery BP, Teukolsky SA, Vetterling WT: Numerical recipes: The art of scientific computing. New York: Cambridge University Press; 1989.

J Phys Chem A. 1997, 101:29212930. Publisher Full Text

La Nave E, Scala A, Starr FW, Sciortino F, Stanley HE:
Phys Rev Lett. 2000, 84:46054608. PubMed Abstract  Publisher Full Text

J Chem Phys. 1990, 93:1332. Publisher Full Text

Phys Rev E. 2000, 62:79057908. Publisher Full Text

Buchner M, Ladanyi BM, Stratt RM:
J Chem Phys. 1992, 97:8522. Publisher Full Text

Chialvo DR arXiv:qbio/0610041v1
2006.
arXiv:qbio/0610041v1

Trends Neurosci. 2007, 30:101110. PubMed Abstract  Publisher Full Text

Philos Transact A Math Phys Eng Sci. 2008, 366:329343. PubMed Abstract  Publisher Full Text

J Neurosci. 2003, 23:1116711177. PubMed Abstract  Publisher Full Text

J Neurosci. 2004, 24:52165229. PubMed Abstract  Publisher Full Text

Nat Phys. 2006, 2:348352. Publisher Full Text

Bertschinger N, Natschlager T:
Neural Comput. 2004, 16:14131436. PubMed Abstract  Publisher Full Text

Phys Rev Lett. 2005, 94:058101. PubMed Abstract  Publisher Full Text

Hsu D, Tang A, Hsu M, Beggs JM:
Phys Rev E. 2007, 76:041909. Publisher Full Text

Hsu D, Chen W, Hsu M, Beggs JM:
Epilepsy Behav. 2008, 13:511522. PubMed Abstract  Publisher Full Text

Priesemann V, Munk MH, Wibral M:
BMC Neurosci. 2009, 10:40. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Stanley HE: Introduction to Phase Transitions and Critical Phenomena. New York: Oxford University Press; 1971.

McQuarrie DA: Statistical Mechanics. Sausalito, CA: University Science Books; 2000.

Allen MP, Tildesley DJ: Computer Simulation of Liquids. Oxford: Clarendon Press; 1987.

Neurocomputing. 2005, 65–66:469474. Publisher Full Text

Econometrica. 1969, 37:424438. Publisher Full Text

J Neural Eng. 2008, 5:18. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Mokshin AV, Yulmetyev RM, Hanggi P:
Phys Rev Lett. 2005, 95:200601. PubMed Abstract  Publisher Full Text

Yulmetyev RM, Khusaenova EV, Yulmetyeva DG, Hanggi P, Shimojo S, Watanabe K, Bhattacharya J:
Math Biosci Eng. 2009, 6:189206. PubMed Abstract  Publisher Full Text

Ann Rev Phys Chem. 1979, 30:547577. Publisher Full Text

Physical Review A. 1982, 26:631. Publisher Full Text

Ma J, Hsu D, Straub JE: [http:/ / scitation.aip.org/ getabs/ servlet/ GetabsServlet?prog=normal&id=JCPSA6 000099000005004024000001&idtype=cvi ps&gifs=yes] webcite
J Chem Phys. 1993, 99:40244035. PubMed Abstract  Publisher Full Text

Frenkel J: Wave Mechanics: Advanced General Theory. Oxford: Clarendon Press; 1934.

J Chem Phys. 1992, 96:42664271. Publisher Full Text

J Chem Phys. 1989, 91:170179. Publisher Full Text