Modeling and Simulation of Reaction Networks

Our Boolean approach to gene regulation invoked the logic rule of each operon without questioning the underlying biophysical mechanism. The key players are the gene (stretch of DNA) and its reader (RNA Polymerase). Governing genes regulate the transcription of their subjects by either promoting or prohibiting the binding of RNAP to the subject's stretch of DNA.

This suggests that we might benefit from a tool that simulates the interactions that occur in a network of reacting chemical species. In the next two weeks we will consider two standard means for modeling and simulating such systems. The first is associated with the name of Gillespie and the second with the names of Michaelis and Menten.


Stochastic Chemical Kinetics


This is a beautifully simple procedure for capturing the dynamics (kinetics) of a randomly interacting (stochastic) bag of chemicals (us?).

We begin with N reacting chemical species, S1, S2, ..., SN, and their initial quantities

    X1, X2, ..., XN

We suppose that these species interact via M distinct reactions

    R1, R2, ..., RM

and that these reactions occur with individual propensities.

    c1, c2, ..., cM

For example, if R1 is

 
    X1+X2&rarr X3

then we suppose that the average probability that a particular S1S2 pair react within time dt is c1dt. If, at time t, there are X1 molecules of S1 and X2 molecules of S2 then there are X1X2 possible pairs and hence the probabilty of reaction R1 occuring in the window (t,t+dt) is X1X2c1dt.

For more general reactions we will write hm for the number of distinct Rm molecular reactant combinations available in the state (X1,...,Xn). We then arrive at the reaction probability am=hmcm. We shall have occasion to call on

    a0 = a1+a2+...+aM

Given this list of reactions and their propensities one naturally asks Which reaction is likely to happen next? and When is it likely to occurr?

Gillespie answered both questions at once by calculating

   P(T,m)dT = probability that, given the state (X1,...,Xn) at time t, 
              the next reaction will be reaction m and it will occur in the interval (t+T,t+T+dT).

A careful reading permits us to write P(T,m)dT as the product of two more elementary terms, namely, the probability that no reaction occurs in the window (t,t+T), and the probability that reaction m occurs within time dT of T. We have already seen the latter, and so, denoting the former by P0(T), we find that

   P(T,m)dT = P0(T)amdT

Regarding P0(T), as the probabilty that no reaction will occur in (t,t+dT) is 1-a0dT it follows that

   P0(T+dT) =  P0(T)(1-a0dT)

or

   (P0(T+dT)-P0(T))/dT = -a0P0(T)

and which, in the limit of small dT, states

   P0'(T) = -a0P0(T)

and so

   P0(T) = exp(-a0T)

It now comes down to drawing or "generating" a pair (T,m) from the set of random pairs whose probablity distribution is

   P(T,m) = amexp(-a0T)

Gillespie argues that this is equivalent to drawing two random numbers, r1 and r2, from the [0,1]-uniform distribution and then solving

(G1)          r1 = exp(-a0T)

for T and solving

(G2)          a1+a2+...+am-1 < r2a0 < a1+a2+...+am

for m. We have now assembled all of the ingredients of Gillespie's original algorithm:

    Gather propensities, reactions, initial molecular counts, 
    and maxiter from user/driver.  Set t = 0 and iter = 0;

    While iter < maxiter

          Plot each Xj at time t

          Calculate each aj

          Generate r1 and r2

          Solve (G1) and (G2) for T and m

          Set t = t + T

          Adjust each Xj according to Rm

          Set iter = iter + 1

     End

The gathering of reactions, like the gathering of wire and rule, may take some thought. For small networks we can meet the problem head on. We reproduce two examples from Gillespie.

The first is his equation (29)

      X&infin + Y &rarr 2Y   with propensity c1
(G29)
      2Y &rarr Z   with propensity c2

Here the subscript on X&infin indicates that its level remains constant (either because it corresponds to a constant feed or it is in such abundance that the first reaction will hardly diminish it) throughout the simulation. Here is the code that produced the figure at right. Though poorly documented you can nonetheless see each step of the algorithm.

Our second example is Gillespie's equation (33)

        X&infin + Y1 &rarr 2Y1   with propensity c1
(G33)   Y1 + Y2 &rarr 2Y2   with propensity c2
        Y2 &rarr Z   with propensity c3

Roughly speaking, the Y1 species procreates, is consumed by the Y2 species, which itself dies off at a fixed rate. Such models are known as predator-prey and, one can imagine, that the two populations often keep one another in check.

For example, here is the code that generated the lovely figure below.

Of course for larger reaction networks we, I mean you, must proceed more systematically. On the road to a more general robust method we wish to address

    1. The user is unsure of how to choose giter and would prefer
       to provide the final time, tfin.
    2. The user may enter the reaction list as a cell and code
       the update of h in a subfunction and so preclude a very 
       lengthy if clause.
    3. A single run of Gillespie is rarely sufficient - for it delivers
       but one possible time evolution. The user would prefer that we
       made nr runs and collected and presented statistics across runs.

We may incorporate the first point by exchanging Gillespie's for for a while.

To address the second point, I imagine a cell where each entry encodes a full reaction, say, e.g., by adopting the following convention

   rtab{k}(1:2:end) = indicies of reaction species
   rtab{k}(2:2:end) = actions for species above

for the kth reaction. In the Lotka example, this would read

   rtab = {[1 1]	% Y1 -> Y1 + 1
           [1 -1 2 1]   % Y1 -> Y1 - 1  and  Y2 -> Y2 + 1
           [2 -1]}      % Y2 -> Y2 - 1

and the update function would look like

    function h = update(y)
    h(1) = y(1);
    h(2) = y(1)*y(2);
    h(3) = y(2);

The use of update is relatively straightforward, e.g.,

    a = c.*h;

With this a in hand you may now use cumsum and find to find the next reaction index (in just a couple of lines - and with NO ifs). With this index in hand you may visit the proper element of rtab.

To address the third point above, we should suppress plotting on the fly and instead return the time and desired solution vectors to a master control program that will collect and plot statistics. In particular, if we lay our Gillespie runs along the rows of a big matrix and takes means doen the columns, e.g.,

    for j=1:nr
        [t,x] = mygill(tfin,rtab,x0,c);
        X(j,:) = x;
    end
    avg = mean(X,1);

we "ought" to be on the right track. The trouble is that each of our Gillespie runs are loyal to particular time vectors. One way around this is to legislate a uniform time vector

    tvec = 0:tinc:tfin

and bring the individual x vectors into X only after interpolating them with interp1.


Deterministic Chemical Kinetics


The stochastic view is a bare-handed means of following the dynamics of individual molecules. For large reacting systems this approach may be prohibitively expensive and/or unnecessarily precise. In such cases one may rely on more coarse grained tools, e.g., the

Law of Mass Action: The rate of a chemical reaction is directly proportional to the product of the effective concentrations of each participating molecule.

We will come to understand this law by its application. We revisit (G29) in the context of a fixed volume V and reaction rates k1 and k2 (in units of 1/M/s) where M designates Molar (which itself is short for moles per volume) and s designates seconds, and write

      X&infin + Y -> 2Y   (propensity c1)
(G29)
      2Y -> Z       (propensity c2)

If we denote the concentration of X by [X] then the Law of Mass Action dictates that

(G30)         d[Y(t)]/dt = 2c1[X&infin][Y(t)] - (c1[X&infin][Y(t)]+2(c2/2)[Y(t)]2)
                             gains      -     losses

The connection bewteen the rates and the propensities

       k1 = c1V   and    k2 = c2V/2

is worked out in section IIC on page 2343 of Gillespie.

Now, if, we denote the initial concentration

      [Y(0)] = Y0 

we may solve the ordinary differential equation (G30) by hand (via calculus or dsolve('DY = k1X*Y-2*k2*Y^2','Y(0)=Y0') in matlab) and find

                          Y0k1[X&infin]
      [Y(t)] =  ---------------------------------
                2Y0k2 - (2Y0k2-k1[X&infin])exp(-k1[X&infin]t)

This function is easy to graph and so compare with its stochastic cousin. Assuming V=1 we arrive at the figure below on running gill1o.m

We see that the deterministic model does an excellent job of tracking the stochastic trajectory. This is, of course, but one simple example.

The Law of Mass Action applied to the Lotka system

         X&infin + Y1 -> 2Y1   (propensity c1)
(G38)    Y1 + Y2 -> 2Y2   (propensity c2)
         Y2 -> Z          (propensity c3)

yields

        d[Y1]/dt = c1[X&infin][Y1] - c2[Y1][Y2]
(G39)  
        d[Y2]/dt = c2[Y1][Y2] - c3[Y2]

Now each of the ks differ from the cs by a power of V. This solution of this system is not presentable and so we turn to approximate, or numerical, means. Matlab sports an entire suite of programs for solving differential equations. The default program is ode23. We invoke it in our Lotka 2-way code and arrive at the figure below. As above blue is the Gillespie trajectory and red is the ode trajectory.

As further practice we consider a third example. Namely, the Brusselator of Gillespie, equation (44)

         X&infin,1 -> Y1            (propensity c1)
         X&infin,2 + Y1 -> Y2 + Z1   (propensity c2)
(G44)   
         2Y1 + Y2 -> 3Y1        (propensity c3) 
         Y1 -> Z2               (propensity c4)

becomes

         d[Y1]/dt = c1[X&infin,1]- c2[X&infin,2][Y1] + (3-2)(c3/2)[Y1]2[Y2] - c4[Y1]
(G45)
         d[Y2]/dt = c2[X&infin,2][Y1] - (c3/2)[Y1]2[Y2]

This associated code produces the figure below.