Stochasticity and noise-induced transition of genetic toggle switch
 Wei-Yin Chen^{1}
https://doi.org/10.1186/2195546821
© Chen; licensee Springer. 2014
Received: 12 July 2013
Accepted: 27 November 2013
Published: 8 January 2014
Abstract
The ability to predict and analyze the function of genetic circuits will enhance the design of autonomous, programmable, complex regulatory genetic structures. An abundance of modeling techniques has recently been developed to delineate simple genetic structures in terms of their constituents. Simple systems with characteristics of feedback inhibition, multistability, switching, and oscillatory expression have often been the focus. The present work is an attempt to improve on existing deterministic models, which fail to account for the crucial role of noise in genetic modeling.
The objective of this work is to analyze, model, and simulate the protein populations in gene expression mechanisms by resorting to stochastic algorithms. The system involves two types of genes; the protein produced from the expression of one gene is capable of turning off the expression of the other gene. Rates of degradation of these proteins are assumed to be proportional to their concentrations. The master equation of this ‘genetic toggle switch’ is formulated using the probabilistic population balance around a particular state and by considering five mutually exclusive events. The efficacy of the present methodology is mainly attributable to the ability to derive the governing equations for the means, variances, and covariance of the random variables by the method of system-size expansion of the nonlinear master equation. A less laborious approach based on Kurtz’s limit theorems for the derivation of the stochastic characteristics is also presented for comparison. Solving the resultant ordinary differential equations governing the means, variances, and covariance simultaneously, using the published data, yields information concerning not only the means of the two populations of proteins but also the minimal uncertainties of the populations inherent in the expressions. It is demonstrated that systems with small populations are susceptible to large internal fluctuations (or uncertainties) in their population evolution. Large uncertainties are observed after the populations enter the proximity of the saddle node, which is likely to cause a transition of the system’s steady state from one to another. Independent Monte Carlo simulation runs clearly demonstrate the occurrence of such internal noise-induced transitions.
Introduction
One of the earliest examples of a bistable genetic switch is represented in the rightward operator of bacteriophage lambda [1, 2]. The essential elements of this type of genetic switch are a pair of promoters, each of which produces a repressor protein capable of inhibiting the production of the opposing repressor. Overlaid on these essential elements are several layers of regulatory nuance. To elucidate the impacts of these essential elements of a simplified regulatory circuit, a series of synthetic toggle switches was created.
McAdams and Arkin’s [4, 5] Monte Carlo simulations of gene expression revealed the importance of fluctuations, or noises or uncertainties, in small systems. In such small systems, proteins are produced from an activated promoter in short bursts of variable numbers of proteins that occur at random time intervals. As a result, there can be large differences in the time between successive events in regulatory cascades across a cell population, which, in turn, creates both spatial and temporal heterogeneity of cell populations in biological systems. Soon after the discovery of the potential impacts of the stochasticity of genetic regulatory systems, stochastic algorithms developed by chemical physicists were introduced in analyzing gene expression (e.g., [6, 7]). The stochastic nature of a competitive expression mechanism can produce probabilistic outcomes in switching mechanisms that select between alternative regulatory paths, such as a toggle switch.
Stochastic algorithms have been developed for analyzing noise of different origins, namely internal and external noises (e.g., [8–10]). External noises are the fluctuations created in an otherwise deterministic system by the application of an external random force, whose stochastic properties are supposed to be known. A Langevin equation is commonly adopted in the analysis of dynamics caused by external noises. Internal noise arises from discrete systems where only a limited number of variables affecting the populations of the discrete entities can be included in the analysis. Small discrete systems, such as genes of small populations, often exhibit notable internal fluctuations. A master equation, derived from probabilistic population balance around a particular state of the system by taking into account all mutually exclusive events, has been adopted for this type of discrete-state, continuous-time stochastic process.
The stochasticity of gene expression is complicated by its nonlinearity. Multiple steady states, stability, and bifurcation in gene expressions (e.g., [11]) could mingle with the analysis of noise, or fluctuations. The efficacy of the master equation algorithm in gene expression is mainly attributable to its powerful ability to solve the nonlinear master equations through the system size expansion [9, 12]. In this approach, a suitable expansion parameter must be identified in the master equation. The expansion parameter represents the size of the fluctuations, and therefore, the magnitude of the jumps, or transitions, of the system’s state. Since the internal noises are expected to be low when the system size is large, the system size has been proposed as an expansion parameter. Master equation formulation along with the system-size expansion has indeed been applied to the analysis of noise in gene expression. It should be mentioned that the limit theorems of Kurtz [13–15] have rendered the complex procedure of system size expansion simple and highly accessible. Kurtz’s proof demonstrated that the solution of a Langevin equation approaches van Kampen’s system size expansion as the system size approaches infinity.
Kepler and Elston [16] examined the stochastic dynamics of the single-gene system with and without feedback and a switching system composed of two mutually repressed genes. Several assumptions were made in their simplified model: the two genes share the same operator and the same degradation rate, proteins bind to the operator as dimers, and the rate of dimerization is fast. Both the master equation and Monte Carlo simulation were adopted in their study. Scott et al. [17] adopted the master equation along with the system size expansion algorithm in the estimation of internal noise of the single-gene system that involves mRNA formation and degradation and protein formation and degradation.
The system size expansion has several limitations in modeling the gene regulatory process. It is a good approximation to the master equation for small internal noise and large system size. Moreover, the noise should be well within the boundary of attraction [9]. Thus, noises in oscillatory processes and those away from the steady states have been a focus of several studies. Tao et al. [18] studied the noise far from the steady states and revealed that during the approach to equilibrium, the noise is not always reduced by the strength of the feedback. This is contrary to results seen in the equilibrium limit, which show decreased noise with feedback strength. Ito and Uchida [19] found that the internal noise of a regulatory single-gene system grows without bound in oscillatory networks and developed an alternative method for estimating the evolution of internal noise in such systems.
Kepler and Elston’s simulation work [16] demonstrated that simple noisy genetic switches have rich bifurcation structures. Among them are bifurcations driven solely by changing the rate of operator fluctuations even as the underlying deterministic system remains unchanged. They found stochastic bistability where the deterministic equations predict monostability and vice versa. Ochab-Marcinek [20] investigated the stationary behavior of a nonlinear system, a reduced, deterministic Yildirim and Mackey [21] model of the gene regulatory system, and discovered the transition of a steady state induced by noise. A perturbed Gaussian white noise term was introduced in the deterministic model, followed by numerical simulations. Turcotte et al. [22] studied noise-induced stabilization of an unstable state of a genetic switch that undergoes a variety of bifurcations in response to parameter changes. Their Monte Carlo simulations showed that near one such bifurcation, noise induces oscillations around an unstable spiral point and thus effectively stabilizes this unstable fixed point.
In addition to the master equation algorithm, Monte Carlo simulation has been adopted in simulating the dynamic behaviors of genetic regulatory systems under the influence of internal noise (e.g., [11, 23, 24]). The Monte Carlo simulation shares the same assumption, the Markov property, as the master equation, and the noise can be obtained directly from the master equation’s deterministic counterpart. Moreover, the Monte Carlo simulation is capable of revealing the various characteristics of nonlinear dynamic systems, such as the number of steady states, bifurcation, and internal noises.
In this expositional work, the master equations are formulated by stochastic population balance. Van Kampen’s system size expansion of the resultant nonlinear master equation gives rise to the variances of the processes. We demonstrate that the implementation of Kurtz’s limit theorems can efficiently achieve the same goal. Simulations are conducted based on both the master equations and the Monte Carlo procedure for three systems: bistable, monostable, and on the bifurcation curve. Finally, we demonstrate the possibility of transition induced by internal noises for a bistable system.
Model formulation
A genetic toggle switch with negative feedback to the genes consists of two mutually coupled genes. The transcription products of these genes are two inhibitory repressor proteins competing to shut off the production of two constitutive promoters [1, 3, 25]; the protein transcribed by a gene of one type is capable of deactivating the transcription of the other gene. A toggle switch typically has more than one possible stable steady state depending on the reaction parameters under consideration [3]. There are a number of instances in nature where this switch-like behavior is utilized. The lysogeny/lysis switch of the bacteriophage λ virus infecting the bacterium Escherichia coli is a representative example and has been discussed in detail by Ptashne [1] and Ptashne and Gann [25].
Moreover, by assuming the total number of unrepressed genes is much larger than that of R so that G remains constant during the process, it can be shown that the rate of production of protein is proportional to $\frac{1}{1+K{R}^{m}}$ where K is the equilibrium constant of the above reaction and R the concentration of the repressor monomer [27–29]. The second process in the model of Gardner et al. is degradation of the protein that is assumed to be first order.
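The two rate laws described above can be illustrated with a minimal Python sketch. The function names, the prefactor `alpha` (an assumed unrepressed production rate), and the constant `k_deg` are hypothetical labels introduced only for this illustration; the exponent `m` is taken here to be the cooperativity of repressor binding.

```python
def production_rate(alpha, K, R, m):
    """Production rate proportional to 1 / (1 + K * R**m).

    alpha : assumed maximal (unrepressed) production rate (illustrative)
    K     : equilibrium constant of the repressor-gene binding reaction
    R     : concentration of the repressor monomer
    m     : exponent of R, assumed to reflect the cooperativity of binding
    """
    return alpha / (1.0 + K * R**m)


def degradation_rate(k_deg, protein):
    """First-order degradation: the rate is proportional to the protein level."""
    return k_deg * protein


if __name__ == "__main__":
    # Production falls monotonically as the repressor concentration rises.
    for R in (0.0, 1.0, 2.0, 4.0):
        print(R, production_rate(alpha=10.0, K=1.0, R=R, m=2))
```

With no repressor present the production rate equals `alpha`; each increase in `R` lowers it, which is the negative feedback that the toggle switch relies on.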
Similar to the work of Gardner, we will assume in the current work that the genes are in equilibrium with their repressed genes. The stochastic nature of a competitive expression mechanism can produce probabilistic outcomes in switching mechanisms that select between alternative regulatory paths, such as a toggle switch.
The master equation describing the stochastic nature of the toggle switch is developed through the probabilistic population balance. The formulation of the master equation given below follows what Oppenheim et al. [8], Gardiner [10], and van Kampen [9] established. We have previously adopted this algorithm in the analysis of disease spread [30].
Mathematical assumptions
1. The random vector, N(t), is Markovian, i.e., for any set of successive times, t_{1} < t_{2} < … < t_{q}, we have P [N(t_{q}) | N(t_{1}), N(t_{2}), …, N(t_{q−1})] = P [N(t_{q}) | N(t_{q−1})].
2. The number of increments or decrements in population numbers of the classes depends only on the time interval, Δt, but not on time, i.e., it is temporally homogeneous, signifying that N(Δt) and [N(t + Δt) − N(t)] are identically distributed.
3. The probability of an individual producing or degrading is proportional to the duration of the time interval, (t, t + Δt), if the value of Δt is sufficiently small.
4. The probability of two or more transitions taking place during the time interval, (t, t + Δt), is negligible, so that at most one transition occurs during this period.
5. Individual proteins in the same class have the same probability of contacting the genes, and therefore, have the same probability of repressing the genes. Similarly, the individual proteins in the same class have the same probability of being degraded.
Transition intensity functions
On the basis of the assumptions given in the preceding subsection, the transition probability of each event can be written in terms of the transition intensity functions, k_{1}, k_{2}, α_{2}, and α_{4}, as follows:
where f_{1} is the ratio of the population of active genes to that of total, active and repressed, genes of type-1. In writing the last line of the above statement, we assume that the total number of genes remains constant during the process of interest. Thus, the parameter, α_{1}, is the probability that a particular active gene will transcribe and produce a type-1 protein per unit time multiplied by the total number of genes.
Here, K_{b} is the equilibrium constant of the combination reaction of the active gene of type-2 and the repressor of type-1, an M-mer, and M is the number of protein monomers of type-1 in the repressor.
It should be noted that the rates adopted in deterministic models and discussed earlier at the outset of the ‘Model Formulation’ section are used in defining the transition intensity functions. The transition intensity functions have pivotal importance in master equation models and Monte Carlo simulations. More importantly, the adoption of deterministic rate constants in the master equation is a cornerstone in the interpretation of intrinsic (or internal) noise (van Kampen [9]).
Master equations
Based on the transition intensity functions defined above, the master equation can be obtained by taking probability balance of the following five mutually exclusive events leading to the evolution of the state of the system:

a R_{1} is produced while R_{2} remains constant

a R_{1} is degraded while R_{2} remains constant

a R_{2} is produced while R_{1} remains constant

a R_{2} is degraded while R_{1} remains constant

both R_{1} and R_{2} remain the same.
The solution to the equation with the step operator yields the timedependent joint probability distribution of the populations of repressor proteins.
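For orientation, a one-step master equation that balances the four state-changing events listed above can be sketched in a generic form. The notation is assumed for illustration only: $W_{i}^{+}$ and $W_{i}^{-}$ stand for the production and degradation intensity functions of the type-i repressor, and $\mathbf{E}_{i}^{\pm 1}$ is the step operator that shifts $n_{i}$ by $\pm 1$. This is not a transcription of the numbered equations of this paper.

```latex
\frac{dP_{n}(t)}{dt} =
  \left(\mathbf{E}_{1}^{-1}-1\right) W_{1}^{+}(n_{2})\,P_{n}(t)
+ \left(\mathbf{E}_{1}-1\right)      W_{1}^{-}(n_{1})\,P_{n}(t)
+ \left(\mathbf{E}_{2}^{-1}-1\right) W_{2}^{+}(n_{1})\,P_{n}(t)
+ \left(\mathbf{E}_{2}-1\right)      W_{2}^{-}(n_{2})\,P_{n}(t),
\qquad
\mathbf{E}_{i}^{\pm 1} f(n_{i}) = f(n_{i}\pm 1).
```

In each bracket, the shifted term collects the probability flowing into state n through that event, and the "−1" term collects the probability leaving state n; the fifth event (no change) accounts for the probability that remains in place.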
System-size expansion based on van Kampen’s procedure
The approximation of the master equation, Equation 10 or 12, leads to the evolution of the joint probability distribution of the populations of the two competing repressors, P_{ n }(t). Equation 10 comprises a set of ordinary differential equations with the joint probability function, P_{ n }(t), as its unknown. Each equation in the set represents a particular outcome of n; thus, solving Equation 12 for the joint probability distribution over the exceedingly large number of all possible values of n is extremely difficult, if not impossible. In practice, however, it often suffices to determine only the expressions that govern a limited number of moments, especially the first and second moments, of the resultant population distribution. These expressions yield the means, variances, and covariances that can be correlated or compared with the experimental data.
Moreover, Equation 12 is nonlinear, which prevents the moments from being evaluated by averaging techniques or joint probability generating function techniques [9]. This difficulty is circumvented by resorting to the system-size expansion, a rational approximation technique based on the power series expansion [9, 12, 31]. The technique gives rise to the deterministic macroscopic equations as well as the equations of fluctuations for the master equation.
To apply the system-size expansion, a suitable expansion parameter must be identified in the master equation, specifically in the transition intensity functions. The expansion parameter must govern the size of the fluctuations, and therefore, the magnitude of the jumps, or transitions. The macroscopic features are determined by the average behavior of all particles, while internal fluctuations are caused by the discrete nature of matter. Hence, we expect the fluctuations to be relatively small when the system size is large. The system size, Ω, has been proposed as an expansion parameter because it measures the relative importance of the fluctuations [9, 12, 31]. In the current genetic regulatory network, the total initial promoter population, or the total number of initial reactants, is chosen as Ω so that the noises estimated based on both the master equation and the Monte Carlo simulations discussed below represent the standard deviations from the means.
Accordingly, the joint probability of n_{1} and n_{2}, i.e., P_{ n }(t), is now transformed into that of y_{1} and y_{2}, i.e., Ψ_{ y }(t). Subsequently, the new random vector, Y, the new joint probability distribution, Ψ_{ y }(t), and the definition of the one-step operator, E, Equation 11, are substituted into Equation 12. By expanding the right-hand side of the resultant expression into a Taylor series, the master equation in terms of the new variables is obtained; see Appendix 1. All appendices to this paper can be found in the supporting materials for this journal.
Equations 17 and 18 are of the same forms as the macroscopic equations of Gardner [26].
A Fokker-Planck equation is considered linear if the coefficient matrix A, the drift term, is a linear function of Y and the coefficient matrix B, the diffusion term, is constant [9]. Note that the macroscopic trajectories are functions of t only and can be obtained by integrating Equations 17 and 18. Thus, the coefficients of the equation governing the fluctuations, A and B in Equations 22 and 23, are independent of the fluctuations, Y. For a linear Fokker-Planck equation, the ordinary differential equations governing the means and variances of the fluctuations, Y, can be derived by taking the first and second moments of Equation 21.
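As a hedged illustration of this moment-taking step, the covariance matrix S of the fluctuations of a linear Fokker-Planck equation obeys the standard relation dS/dt = A S + S Aᵀ + B. The Python sketch below integrates that relation with a forward-Euler scheme. The function names are hypothetical, and A and B are held constant for simplicity, whereas in the present model they vary with t through the macroscopic trajectories.

```python
def covariance_rhs(A, B, S):
    """Right-hand side of dS/dt = A S + S A^T + B (square matrices as nested lists)."""
    n = len(A)
    return [[sum(A[i][k] * S[k][j] for k in range(n))      # (A S)[i][j]
             + sum(S[i][k] * A[j][k] for k in range(n))    # (S A^T)[i][j]
             + B[i][j]
             for j in range(n)]
            for i in range(n)]


def integrate_covariance(A, B, S0, t_end, dt=1e-3):
    """Forward-Euler integration with constant A and B (an illustrative simplification)."""
    S = [row[:] for row in S0]
    n = len(S)
    for _ in range(int(round(t_end / dt))):
        dS = covariance_rhs(A, B, S)
        S = [[S[i][j] + dt * dS[i][j] for j in range(n)] for i in range(n)]
    return S


if __name__ == "__main__":
    # Two uncoupled, linearly damped fluctuations starting from zero variance.
    A = [[-1.0, 0.0], [0.0, -2.0]]
    B = [[2.0, 0.0], [0.0, 2.0]]
    S = integrate_covariance(A, B, [[0.0, 0.0], [0.0, 0.0]], t_end=10.0)
    print(S)  # the variances relax toward B_ii / (2 |A_ii|)
```

Starting from zero variance mirrors a sharply defined initial state: the variances first grow and then settle where the damping in A balances the noise input in B.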
System size expansion based on Kurtz’s limit theorems
where δ(n) and δ_{ i,j } are the Dirac and Kronecker delta functions, respectively. The four parameters on the right-hand side of Equation 33 are obtained from the definitions of the transition intensity functions.
where $L\left(\tilde{t}\right)$ denotes a Gaussian white noise with a unit strength, and C_{ i }(ñ) denotes the effects of interactions of the noise and the system on the random variable. The discontinuity of Gaussian white noise has been the source of several algorithms for interpreting C_{ i }(ñ) during the process, and thus for converting a Langevin equation to its Fokker-Planck counterpart. In Ito’s algorithm, the value of C_{ i }(ñ) before the arrival of white noise is used in averaging. In Stratonovich’s algorithm, the averaged value of C_{ i }(ñ) during the time of noise is used in averaging, which yields an extra term in the macroscopic part of the Fokker-Planck equation. Since $L\left(\tilde{t}\right)$ is never infinitely sharp and it lasts a finite time, the Ito and Stratonovich calculi are more appropriate in modeling internal and external noises, respectively [9].
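The Ito rule described above can be illustrated with an Euler-Maruyama step, in which both coefficients are evaluated at the state before the noise increment arrives. The Python sketch below is illustrative only: the function names, the single-variable birth-death example, and the constants `b` and `k` are assumptions and are not taken from the equations of this paper.

```python
import math
import random


def euler_maruyama_path(drift, noise_amp, n0, t_end, dt, rng):
    """Integrate dn = drift(n) dt + noise_amp(n) dW in the Ito sense.

    Both coefficients are evaluated at the state *before* the noise
    increment is applied, which is the defining feature of the Ito rule.
    """
    n = n0
    for _ in range(int(round(t_end / dt))):
        dW = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment with variance dt
        n = n + drift(n) * dt + noise_amp(n) * dW
    return n


if __name__ == "__main__":
    rng = random.Random(1)
    b, k = 100.0, 1.0   # hypothetical production and degradation constants

    def drift(n):
        return b - k * n

    def amp(n):
        return math.sqrt(max(b + k * n, 0.0))   # guard against a negative argument

    finals = [euler_maruyama_path(drift, amp, 0.0, 5.0, 0.01, rng) for _ in range(200)]
    print(sum(finals) / len(finals))   # should lie near b / k
```

A Stratonovich scheme would instead evaluate the noise amplitude at an intermediate state, which is the origin of the extra drift term mentioned in the text.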
With this Langevin representation in hand, the equations derived in the last section, i.e., Equations 17, 18, 22, and 23, can be readily obtained. Specifically, substituting Equations 37 and 38 into Equation 43 and ignoring the noise term yields Equations 17 and 18. Since the drift coefficient in a Fokker-Planck equation, matrix A in Equation 21, is the Jacobian matrix of the functions on the right-hand side of Equations 17 and 18 [9], Equation 22 can be obtained by taking derivatives. Finally, it is obvious that the elements of the covariance matrix, Equations 39 through 42, are identical to those shown in Equation 23.
System size expansion based on Kurtz’s theorems is substantially simpler than the original procedure proposed by van Kampen [9]. This efficiency was previously utilized by Aparicio and Solari [32] and Chua et al. [33] in their studies of stochastic population dynamics of disease transmission and chemical vapor deposition, respectively.
It should be mentioned that the system size expansion method discussed in this and the last sections suffers from several limitations. Simulation with the system-size expansion converges to the steady state within its boundary of attraction, just like its deterministic counterpart, and it cannot generate noise-induced transitions, as will be discussed later in the simulation section [9]. The system size expansion near the boundary of attraction of a steady state (i.e., away from the steady state) yields noises that are not compatible with those generated near the steady states [18].
Simulations
The genetic toggle switch model presented in the preceding section has been simulated by two approaches. The first approach relies on the solution of the governing equations for the first and second moments of the random variables derived from the master equations. The second approach resorts to the event-driven Monte Carlo algorithm.
Simulation based on the master equations
Equations 49, 50, and 54 through 58 can be integrated simultaneously to obtain the statistical characteristics of the dynamical processes. Equations 49 and 50 yield the means of the populations while Equations 54 and 55 yield the means of the fluctuations, which are essentially zero due to the assumption of symmetric noises around the means, i.e., Equations 13 and 14. Equations 56 through 58 generate the variance and covariance of the two constituent populations. The integration was conducted in Matlab with ode45, a solver based on an explicit Runge-Kutta (4,5) formula (the Dormand-Prince pair) intended for nonstiff sets of ordinary differential equations.
As marked in Figure 4, three possible sets of ${\alpha}_{1}^{\prime\prime}$ and ${\alpha}_{2}^{\prime\prime}$ are sufficient to characterize the different cases of population dynamics: monostable, bistable, and bifurcation. Thus, the following three sets of parameter values are chosen in our simulations for characterizing the dynamics in different regions:
Case A, in bistable region: ${\alpha}_{1}^{\prime\prime}$ = 15.6 and ${\alpha}_{2}^{\prime\prime}$ = 15.6,
Case B, on bifurcation curve: ${\alpha}_{1}^{\prime\prime}$ = 15.6 and ${\alpha}_{2}^{\prime\prime}$ = 4.0,
Case C, in monostable region: ${\alpha}_{1}^{\prime\prime}$ = 15.6 and ${\alpha}_{2}^{\prime\prime}$ = 1.2.
We assume m = 2 and M = 2 for all the simulations presented herein.
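The distinction between the bistable and monostable cases can be checked numerically with a short Python sketch. It assumes the dimensionless macroscopic form of Gardner et al., du/dt = α₁/(1 + v^m) − u and dv/dt = α₂/(1 + u^M) − v, and counts the steady states by eliminating v and looking for sign changes of the residual on a grid. The function names, the grid step, and the assignment of m and M to the two equations are illustrative assumptions; with m = M = 2 the assignment does not affect the result.

```python
def fixed_point_residual(u, alpha1, alpha2, m=2, M=2):
    """Residual g(u) = alpha1 / (1 + v**m) - u, with v = alpha2 / (1 + u**M)."""
    v = alpha2 / (1.0 + u**M)
    return alpha1 / (1.0 + v**m) - u


def count_steady_states(alpha1, alpha2, m=2, M=2, step=0.01):
    """Count sign changes of the residual on a grid over [0, alpha1 + 1].

    Every steady state satisfies 0 <= u <= alpha1, so this range covers them all.
    Grid points where the residual is exactly zero are skipped, so a crossing
    is still detected by the neighbouring points.
    """
    n_steps = int(round((alpha1 + 1.0) / step))
    count = 0
    prev = fixed_point_residual(0.0, alpha1, alpha2, m, M)
    for i in range(1, n_steps + 1):
        cur = fixed_point_residual(i * step, alpha1, alpha2, m, M)
        if cur == 0.0:
            continue
        if prev * cur < 0.0:
            count += 1
        prev = cur
    return count


if __name__ == "__main__":
    print("Case A:", count_steady_states(15.6, 15.6))  # bistable region
    print("Case C:", count_steady_states(15.6, 1.2))   # monostable region
```

Under these assumptions, Case A yields three steady states (two stable states separated by a saddle) and Case C yields a single one. Case B is omitted because a point on the bifurcation curve is numerically marginal for a sign-change count.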
Initial protein populations are also important to the evolution of the dynamics in several aspects. It is established in nonlinear dynamics that different initial conditions could lead to different steady states, and the evolution of the dynamics may be altered significantly by a small variation of the initial conditions. In this work, we will demonstrate that noise could induce a system transition from one steady state to another when the populations pass through the neighborhood of an unstable steady state (or the saddle node) of a bistable system. Moreover, for very small initial populations, the numerical equations become invalid as the protein values tend to become so small that they drive the bifurcation lines beyond the domain of application. Thus, the initial populations should be chosen in the range of tens to hundreds. In the present work, the initial populations are chosen as u(0) = 155 and v(0) = 154 for the three cases discussed above. To further illustrate the effects of the initial populations, a simulation is conducted with u(0) = 15, v(0) = 155, ${\alpha}_{1}^{\prime\prime}$ = 15.6, and ${\alpha}_{2}^{\prime\prime}$ = 15.6 for the bistable system, or Case A. The population trajectories do not pass through the neighborhood of the saddle node in this simulation and pose no risk of noise-induced transition.
where Ω stands for the system size, i.e., the total initial population; and − R_{ i }, the population converted attributable to transition type i protein per unit time. A detailed discussion of the relationship between the deterministic rate constant and the intensity function can be found in [9].
Simulation based on Monte Carlo simulation
Linear or nonlinear dynamic processes have been simulated either deterministically or stochastically by Monte Carlo procedures. It is worth noting that a well-developed class of Monte Carlo simulation procedures essentially shares identical computational bases with the master equation algorithm presented in the preceding sections. Specifically, the assumptions of the Markov property and temporal homogeneity of the random variables lead to the definitions of transition intensity functions [33, 40, 41]. As discussed in the “Model Formulation” section, probability balances of various events on the basis of these intensity functions give rise to the master equations. In the Monte Carlo simulation, the system’s state is simulated by a stepwise, random-walk scheme based on the same intensity functions.
Process systems or phenomena can be simulated by time-driven and event-driven Monte Carlo procedures [42]. The difference between these two procedures is in the manner of updating the time clock of the evolution of the system. The time-driven procedure advances the simulation clock by a prespecified time increment, Δt, which is sufficiently small so that at most one event will occur in this interval. The probability of an event occurring is determined by the nature and magnitudes of the transition intensity functions. In contrast, the event-driven procedure updates the simulation clock by randomly generating the waiting time, τ_{ w }, which has an exponential distribution [43, 44]; this distribution signifies that a population transition takes place completely randomly. At the end of each waiting interval, one event will occur, and the state to which the system will transfer is also determined by the nature of the transition intensity functions.
The process of interest here, i.e., the genetic toggle switch, has been simulated by the event-driven procedure; it is usually computationally faster than the time-driven procedure. The simulation starts with a given initial distribution of population; the essential task is to obtain the probability distributions of the protein numbers at any subsequent time. To determine the system transition in each time step, two random numbers are generated for two different purposes. The first random number in (0, 1), i.e., r_{ 1 }, is for estimating the waiting time during which a possible transition of the system’s state will take place. The second random number in (0, 1), i.e., r_{ 2 }, is for identifying the transition type.
Waiting time
Note that H_{ n }(τ_{ w }) is the probability function of T_{ n }.
It can be verified that if the waiting time, T_{ n }, whose realization is τ_{ w }, is exponentially distributed, then the random variable, U, whose realization is u, is uniformly distributed over interval [0, 1], see Appendix 3.
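The relation between the uniform variable U and the exponentially distributed waiting time is what allows τ_{ w } to be sampled by inverse transformation. The Python sketch below illustrates the idea under the assumption that the rate of the exponential distribution is the sum of all transition intensities out of the current state; the function name and the argument `total_intensity` are hypothetical.

```python
import math
import random


def sample_waiting_time(total_intensity, rng):
    """Draw an exponentially distributed waiting time by inverse-transform sampling.

    total_intensity : sum of all transition intensities out of the current
                      state (assumed to be the rate of the distribution)
    rng             : a random.Random instance supplying the uniform number
    """
    u = 1.0 - rng.random()            # uniform on (0, 1], which avoids log(0)
    return -math.log(u) / total_intensity


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [sample_waiting_time(4.0, rng) for _ in range(100000)]
    print(sum(samples) / len(samples))   # sample mean, to be compared with 1 / 4
```

The sample mean approaches the reciprocal of the total intensity, so states with many fast transitions are left sooner on average.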
Probabilities of four possible transitions
implies that the population of type-2 protein increases by 1.
Simulation algorithm
1. Define the initial populations of the two types of proteins, and let the system size, Ω, be the sum of the two protein populations. This Ω will also be the total number of independent simulations to be conducted before taking their statistics. Start the random walk from this point.
2. Select the total length of time of each simulation, T_{f}. For the current work, T_{f} was chosen to be either 15 or 50 s.
3. Determine the length of the waiting time, τ_{w}. First, generate a random number, r_{1}, from a uniform distribution in [0, 1]; then, calculate τ_{w} for the system’s current state n(t) = (n_{1}(t), n_{2}(t)) according to Equation 66.
4. Update the computer clock by letting t = t + τ_{w}.
5. Calculate the transition probabilities, Q_{i}, that the system will transfer from state n to each of the other states by Equations 67 through 70. Then, generate another random number, r_{2}, from a uniform distribution in [0, 1]. Determine the transition type by examining in which of the intervals given by Equations 71 through 74 r_{2} is located.
6. Repeat steps 3 to 5 until the total time exceeds T_{f}; this terminates one replication of simulation.
7. Repeat steps 2 to 6 Ω times, and store the resultant number of proteins of type i during the jth replication at time t, n_{ij}(t). This yields the mean number of proteins of type i at time t as
$E\left[{N}_{i}\left(t\right)\right]=\frac{\sum_{j=1}^{\Omega}{n}_{ij}\left(t\right)}{\Omega}$ (75)
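The seven steps above can be sketched as an event-driven simulation in Python. This is a minimal sketch under stated assumptions, not the code used in this work: the intensity functions follow the repressed-production and first-order-degradation forms described in the model formulation, and every parameter name and value (`alpha1`, `K1`, `k1`, and so on) is hypothetical.

```python
import math
import random

# State changes for: R1 produced, R1 degraded, R2 produced, R2 degraded.
JUMPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def intensities(n1, n2, p):
    """Assumed transition intensities for the four state-changing events."""
    return [
        p["alpha1"] / (1.0 + p["K1"] * n2 ** p["m"]),   # R1 produced, repressed by R2
        p["k1"] * n1,                                   # R1 degraded (first order)
        p["alpha2"] / (1.0 + p["K2"] * n1 ** p["M"]),   # R2 produced, repressed by R1
        p["k2"] * n2,                                   # R2 degraded (first order)
    ]


def one_replication(n1, n2, t_final, p, rng):
    """Steps 3 to 6: random walk until the clock exceeds t_final."""
    t = 0.0
    while True:
        w = intensities(n1, n2, p)
        total = sum(w)
        if total <= 0.0:
            break                                        # no event can occur
        t += -math.log(1.0 - rng.random()) / total       # steps 3 and 4
        if t > t_final:
            break                                        # step 6
        r2 = rng.random() * total                        # step 5
        cumulative = 0.0
        for rate, (d1, d2) in zip(w, JUMPS):
            cumulative += rate
            if r2 < cumulative:
                n1, n2 = n1 + d1, n2 + d2
                break
    return n1, n2


def mean_populations(n1_0, n2_0, t_final, p, seed=0):
    """Steps 1 and 7: average over Omega = n1_0 + n2_0 replications."""
    rng = random.Random(seed)
    omega = n1_0 + n2_0
    finals = [one_replication(n1_0, n2_0, t_final, p, rng) for _ in range(omega)]
    return (sum(f[0] for f in finals) / omega, sum(f[1] for f in finals) / omega)


if __name__ == "__main__":
    params = {"alpha1": 50.0, "alpha2": 50.0, "K1": 0.01, "K2": 0.01,
              "m": 2, "M": 2, "k1": 1.0, "k2": 1.0}   # hypothetical values
    print(mean_populations(20, 20, 5.0, params))
```

Scaling the second random number by the total intensity is equivalent to comparing r_{2} with the cumulative transition probabilities, since each probability is an intensity divided by the total. The transition drawn after the clock passes T_{f} is discarded so that the returned state is the one occupied at T_{f}.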
As mentioned at the outset of this section, both the Monte Carlo simulation and the simulation based on the master equations adopted in the current work are rooted in an identical set of transition intensity functions derived from the same set of assumptions. Thus, integrating the equations for the first and second moments of the master equations, Equations 49 and 50 together with Equations 54 through 58, is expected to generate results nearly identical to those from the Monte Carlo simulations, i.e., Equations 75 through 77.
Results and discussion
The present stochastic analysis of the genetic toggle switch yields the transition probabilities of mutually exclusive events through the definitions of the transition intensity functions of protein production as well as degradation. This analysis renders it possible to formulate the nonlinear master equations of the process as well as to derive the event-driven Monte Carlo simulation. Even though each of these was simulated separately, they portrayed interesting analogies.
The stochastic algorithms developed here allow us to analyze the stochastic nature of the two-state toggle switch quantitatively. The master equations governing the numbers of the two types of protein are formulated from stochastic population balance. The stochastic pathways of the two proteins, i.e., their means and the fluctuations around these means, have been numerically simulated independently by the algorithm derived from the master equations, as well as by an event-driven Monte Carlo algorithm. Both algorithms have given rise to nearly identical results. Moreover, these analyses render it possible to circumvent the possibility of noise-induced transitions.
Simulation based on the master equations
Case A, ${\alpha}_{2}^{\prime\prime}$ = 15.6, represents a bistable system, as marked in the bifurcation diagram in Figure 4. Figure 5 presents the simulated results of this system based on the master equations. As expected, the populations eventually reach the stable steady state #2 marked in Figure 3 since the initial conditions constitute a point below the separatrix, and our analysis of the vector field depicting the flow of dynamics [45] suggests this outcome. The protein populations decrease rapidly and stay in the proximity of the saddle node for a while before they depart for their steady states, an observation consistent with classical dynamics. During this period, the populations of the two proteins are very similar to each other. The fluctuations around the mean trajectories increase initially from zero and then decrease when they approach the steady states. In a stable system, the standard deviation of the number of either type-1 or type-2 proteins attains a maximum because the state of the system is usually well defined at the outset of the process and the uncertainties eventually decline until they vanish upon stabilization [46]. The uncertainty in the population of the type-2 protein in Figure 5 appears to remain constant; a special computer experiment was conducted with a long simulation time to ensure that it indeed decreases over time.
The formulator is often confronted with a myriad of interacting factors related to a gene’s expression mechanisms before settling on a strategy to assess their impact. A mathematical description of this complex process usually relies on a manageable number of system variables. This lumping procedure inevitably results in a high degree of freedom and in fluctuations, or uncertainties, in the predictions of populations of discrete systems [9]. The behavior of an individual protein molecule in a discrete system with such a high degree of freedom is thus difficult to predict even when the system is monitored experimentally. The parameters in the equations, e.g., the transition intensity functions of the master equation algorithm adopted here, are presumed to depend only on the major variables of the system and to be independent of the variables of secondary importance. Neglecting these secondary variables is, in essence, the source of the internal (also called system or minimal) noises that should be analyzed stochastically. Thus, the internal noises caused by the discrete nature of a system are inherent in the system, and they govern the minimum scattering expected of the random variable of interest. The experimentally observed scattering should always be larger than the predicted scattering induced by internal noises because of inevitable external noises attributable to experimental errors and the imprecision of measuring devices. This implies that there is little to gain from replicating experiments excessively in an attempt to reduce the scattering far below what is predicted. It is interesting to note that the fluctuations reported by Gardner et al. are significantly higher than what the master equations predict. The number of cells used in their fluorescence analysis was 40,000, and the actual number of cells in the sample is much larger than this number. Therefore, the noise levels reported by Gardner et al. [3] certainly involve not only internal but also external noises. External noises are the fluctuations created in an otherwise deterministic system by the application of a random force whose stochastic properties are supposed to be known [9].
The two proteins do not have the well-defined states that a deterministic model depicts when they pass the saddle node. Instead, their populations are probabilistically distributed. The two proteins have not only similar populations but also similar uncertainties in their populations. In fact, as shown in the left-hand side of Figure 5, the uncertainties in their populations are of the same order of magnitude. These characteristics imply a high probability that the relative sizes of the two protein populations switch when the system approaches the unstable steady state. Such a switch brings the populations to the region above the separatrix in the phase diagram in Figure 3, and the vector field in that region eventually leads the process to steady state #1 marked in the same figure. Noise-induced phase transitions have been examined in detail by Nicolis and Turner [47], Malek Mansour et al. [48], and Horsthemke and Lefever [49]. Nicolis and Turner have shown that the fluctuations are enhanced at a ‘critical point’ (populations closest to the unstable steady state); the variances are of the order of Ω^{−1/2}, a result consistent with that derived by van Kampen for the system-size expansion of the master equation. Thus, systems with low populations are more susceptible to noise-induced transitions. The noise enhancement near the instability is illustrated in Figure 5. Once the system moves away from the instability, the noises decrease and a noise-induced transition becomes less likely. Internal fluctuations do not change the local stability of the system, and the position of the transition points is in no way modified by the presence of these fluctuations.
It should be mentioned that the parameter values for our simulation are carefully chosen to illustrate the possibility of noise-induced transition. Gardner et al. [3] did not observe this possibility, probably because the populations of their system are very large and the difference between the two protein populations at the critical point is large, as discussed earlier.
Case B, ${\alpha}_{2}^{\prime\prime}$ = 4.0, represents a system on the bifurcation line, as marked in the bifurcation diagram in Figure 4. Gardner et al. [3] give a good exposition of the dependence of the bifurcation diagram and phase diagram on the parameters. There are two steady states on the phase diagram, similar to Figure 3; one is stable and the other unstable. Figure 6 presents the simulated results of this system based on the master equations. Similar to Case A, the populations eventually reach the stable steady state. The populations do not stay in the vicinity of the saddle node for as long as they do in Case A. Although the two protein populations are very close to each other and the fluctuations are of the same order as the populations during this time period, the existence of a single stable steady state guarantees the system’s final destination.
Case C, ${\alpha}_{2}^{\prime\prime}$ = 1.2, is a monostable system, as marked in the bifurcation diagram in Figure 4. Figure 7 presents the simulated results of this system based on the master equations. Similar to Case B, the populations eventually reach the stable steady state. Although the fluctuations of the two protein populations pass through maxima during the evolution of the dynamics, they eventually vanish.
Simulation based on Monte Carlo procedure
Monte Carlo simulations have yielded results essentially indistinguishable from those generated from the master equations. This is expected since the algorithms based on the event-driven Monte Carlo procedure and the master equations derived in the present work are rooted in identical assumptions, i.e., the Markov property and temporal homogeneity of the random variables. These assumptions lead to the definitions of the transition intensity functions that are the cornerstones of the formulation of the master equations and of the Monte Carlo procedure.
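The event-driven Monte Carlo procedure can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The transition intensity functions below assume a dimensionless Gardner-type form (synthesis of each protein repressed by the other, first-order degradation), and the names `toggle_rates`, `simulate`, and `ensemble_end_points`, as well as the parameter values `a1`, `a2`, `beta`, and `gamma`, are hypothetical choices made only for illustration.

```python
import math
import random

# Population changes (dn1, dn2) for the four population-changing events,
# in the same order as the intensities returned by toggle_rates.
JUMPS = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def toggle_rates(n1, n2, a1, a2, beta, gamma):
    """Assumed transition intensities (illustrative, not the paper's values)."""
    return [
        a1 / (1.0 + n2 ** beta),   # synthesis of protein 1, repressed by protein 2
        a2 / (1.0 + n1 ** gamma),  # synthesis of protein 2, repressed by protein 1
        float(n1),                 # first-order degradation of protein 1
        float(n2),                 # first-order degradation of protein 2
    ]


def simulate(n1, n2, t_end, a1=20.0, a2=20.0, beta=2.0, gamma=2.0, rng=None):
    """Generate one sample path as a list of (time, n1, n2) tuples."""
    rng = rng or random.Random()
    t = 0.0
    path = [(t, n1, n2)]
    while True:
        rates = toggle_rates(n1, n2, a1, a2, beta, gamma)
        total = sum(rates)
        # Exponentially distributed waiting time until the next event.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Select the event with probability proportional to its intensity.
        threshold = rng.random() * total
        cumulative = 0.0
        for rate, (d1, d2) in zip(rates, JUMPS):
            cumulative += rate
            if threshold < cumulative:
                n1 += d1
                n2 += d2
                break
        path.append((t, n1, n2))
    return path


def ensemble_end_points(runs, n1, n2, t_end, seed=0):
    """Final (n1, n2) of independent runs, for estimating means and variances."""
    rng = random.Random(seed)
    return [simulate(n1, n2, t_end, rng=rng)[-1][1:] for _ in range(runs)]
```

Averaging the end points returned by `ensemble_end_points` over many runs gives estimates of the means, variances, and covariance that can be compared with those obtained from the moment equations. For a bistable parameter set, the end points may also be counted by the steady state they fall near.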
The fact that the two algorithms have yielded essentially the same results implies that both indeed define the evolution of the dynamic process in an equivalent way. The master-equation algorithm generates the equations governing the statistical moments of the process, which can be readily integrated for a wide range of initial conditions, whereas the Monte Carlo procedure requires far more computational time and storage space under such circumstances.
As mentioned in the ‘Introduction’ section, the master equation and its system-size expansion suffer from a few limitations. One such limitation is that the algorithm is valid only for dynamics well within the boundary of attraction [9]. For bistable dynamics starting in a region outside this boundary, such as Case A, the Monte Carlo simulation converges to two possible steady states, whereas the master-equation algorithm converges to only one.
Some comparisons of the three algorithms are worth mentioning. The governing equations for the system-size expansion can be derived in a straightforward manner, though the detailed derivations may be cumbersome and time consuming. The expansion requires only a minor transformation of variables for some unstable stochastic processes, such as the diffusion process, well beyond the initial transient period [9]. Unlike the Monte Carlo simulation, the derived moment equations can be repeatedly integrated for different sets of parameters and initial conditions. Consequently, the system-size expansion has been widely adopted in the derivation of the governing equations of stochastic processes driven by internal noises.
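How moment equations of this kind can be integrated repeatedly for different parameters and initial conditions is sketched below in Python. The sketch uses the generic linear-noise form of the first two moments, dC/dt = JC + CJ^T + D, with J the Jacobian of the macroscopic rates and D the sum of the synthesis and degradation intensities. It is not a transcription of Equations 21 to 23. The dimensionless Gardner-type rates, the function names, the parameter values, and the choice to express the variances directly in numbers of molecules are all assumptions made for illustration.

```python
def moment_rhs(state, a1, a2, beta, gamma):
    """Right-hand sides for the means (phi, theta) and covariances (c11, c12, c22)."""
    phi, theta, c11, c12, c22 = state
    s1 = a1 / (1.0 + theta ** beta)   # assumed synthesis rate of protein 1
    s2 = a2 / (1.0 + phi ** gamma)    # assumed synthesis rate of protein 2
    # Jacobian of the macroscopic rate equations.
    j11 = j22 = -1.0
    j12 = -a1 * beta * theta ** (beta - 1.0) / (1.0 + theta ** beta) ** 2
    j21 = -a2 * gamma * phi ** (gamma - 1.0) / (1.0 + phi ** gamma) ** 2
    # Diagonal diffusion terms: synthesis plus degradation intensity.
    d1 = s1 + phi
    d2 = s2 + theta
    return [
        s1 - phi,
        s2 - theta,
        2.0 * (j11 * c11 + j12 * c12) + d1,
        (j11 + j22) * c12 + j12 * c22 + j21 * c11,
        2.0 * (j21 * c12 + j22 * c22) + d2,
    ]


def rk4_step(state, dt, *params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return [s + h * ki for s, ki in zip(state, k)]

    k1 = moment_rhs(state, *params)
    k2 = moment_rhs(shifted(k1, dt / 2.0), *params)
    k3 = moment_rhs(shifted(k2, dt / 2.0), *params)
    k4 = moment_rhs(shifted(k3, dt), *params)
    return [
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]


def integrate_moments(phi0, theta0, t_end, a1=20.0, a2=20.0,
                      beta=2.0, gamma=2.0, dt=0.001):
    """Integrate from a sharply defined initial state (zero initial covariances)."""
    state = [float(phi0), float(theta0), 0.0, 0.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, a1, a2, beta, gamma)
    return state
```

Calling `integrate_moments` with a new initial state or parameter set costs one more deterministic integration, whereas a Monte Carlo estimate of the same moments requires a fresh ensemble of sample paths. The square roots of `c11` and `c22` give the standard deviations around the mean trajectories.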
Kurtz’s algorithm is highly compact and convenient. The implementation of the rigorous Kurtz algorithm requires knowledge of the relations among the master, Langevin, and Fokker-Planck equations. It allows direct derivation of the equations governing the moments. However, the algorithm merely describes the dynamics in the initial transient period of unstable systems for selected processes, such as the diffusion process [9].
The Monte Carlo method is easy to implement because it bypasses all derivations of equations. It is most efficient when the number of random variables is large and the master equation is difficult to derive. Repeated simulations have to be carried out for different sets of parameters and initial conditions. The required computational time and disk space are usually high.
Conclusions
The current model adopts the essential concepts of a nonlinear toggle switch model for analyzing a protein-regulated system. The master equation algorithm, along with its system-size expansion, involves the stochastic probability balance of the two types of populations. The resultant master equation should yield not only the deterministic evolution of protein populations during gene expression but also the fluctuations, or uncertainties, inherent in the prediction or measurement. Kurtz’s limit theorems significantly reduce the complex and laborious exercise of system-size expansion. In fact, they will be indispensable tools for the analysis of highly complex genetic networks.
The validity of the model is demonstrated by numerically calculating the evolution of the populations of both types and their fluctuations over time through two simulation algorithms, one based on the master equations and the other based on the event-driven Monte Carlo procedure. These two algorithms are implemented totally independently of each other but with the same set of system parameters, i.e., the transition intensity functions. Hence, it is indeed remarkable that the two algorithms have yielded essentially identical results.
Both simulation results demonstrate the possibility of noise-induced transition when the dynamics passes through the proximity of the saddle node. It happens when the protein populations are low and the noises are of the same order of magnitude as the populations. This property may have practical applications in developing gene therapy, cell cycle control, and protein sensors.
Nomenclature
E, one-step operator; N_{1}, random variable representing population of repressor 1; N_{2}, random variable representing population of repressor 2; n_{1}, realization of random variable, N_{1}(t); n_{2}, realization of random variable, N_{2}(t); N, random vector, i.e., [N_{1}(t),N_{2}(t),N_{3}(t)]; n, realization of random vector N(t); P_{n}, probability that the system is at state n at time t; t, time; Y, random variable denoting the fluctuations about macroscopic behavior; y, realization of random variable Y; Q, the transition probability; u, Gardner’s concentration of repressor 1; v, Gardner’s concentration of repressor 2; K_{1}, effective reaction rate for repressor 1 formation; K_{2}, effective reaction rate for repressor 2 formation.
Greek letters
α_{1}, the rate of production of repressor 1; α_{2}, the rate of production of repressor 2; α_{3}, the rate of degradation of repressor 1; α_{4}, the rate of degradation of repressor 2; ${\alpha}_{1}^{\prime}$, the effective rate of synthesis of repressor 1 on system size; ${\alpha}_{2}^{\prime}$, the effective rate of synthesis of repressor 2 on system size; ∅, macroscopic number of repressor 1; θ, macroscopic number of repressor 2; τ_{w}, the waiting time; λ, transition intensity function; Ψ, joint probability distribution in terms of random vector Y; Ω, total number of repressors or system size; β, the multimerization constant of repressor 1; γ, the multimerization constant of repressor 2; Θ, the vector representing the two mean numbers of proteins in system-size expansion.
Subscripts
1, repressor 1; 2, repressor 2
Appendices
Appendix 1: system-size expansion
Accordingly, the joint probability of n_{1} and n_{2}, P_{ n }(t), is now transformed into that of y_{1} and y_{2}, i.e., Ψ_{ y }(t).
where ${K}_{a}^{\prime}$ and ${K}_{b}^{\prime}$ are independent of the system size, Ω.
Equations 103, 104, and 105 are Equations 21, 22, and 23 in the text, respectively.
Appendix 2: distribution functions of waiting time
Chapter 14 of the book by Karlin and Taylor [43, 44] contains a more rigorous proof of Equation 112.
Equations 112 and 115 are Equations 61 and 64 in the text, respectively. The latter signifies that the probability density function of the waiting time for the population to make the next transition is exponential.
Appendix 3: random number transformation
Obviously, u = 0 at τ_{w} = 0 and u = 1 when τ_{w} → ∞. We are to prove that the random variable U, whose realization is u, is uniformly distributed in [0, 1].
Equations 117 and 118 signify that the probability density function of T_{n}, i.e., h_{n}(τ_{w}), is transformed into that of U, i.e., f(u), such that the probability represented by h_{n}(τ_{w})dτ_{w} and that represented by f(u)du are identical.
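The transformation can be checked numerically with a short Python sketch. It assumes that the mapping between the two variables has the form u = 1 − exp(−λτ_{w}), which is consistent with the limits u = 0 at τ_{w} = 0 and u = 1 as τ_{w} → ∞ stated above. The function names and the symbol `total_intensity` (standing for λ) are hypothetical.

```python
import math
import random


def waiting_time(total_intensity, u):
    """Invert u = 1 - exp(-lambda * tau) to obtain the waiting time tau."""
    return -math.log(1.0 - u) / total_intensity


def sample_waiting_times(total_intensity, count, rng=None):
    """Draw waiting times by transforming uniform random numbers on [0, 1)."""
    rng = rng or random.Random()
    return [waiting_time(total_intensity, rng.random()) for _ in range(count)]


def sample_mean(values):
    """Arithmetic mean, used to compare against the theoretical value 1/lambda."""
    return sum(values) / len(values)
```

With a large sample, `sample_mean(sample_waiting_times(lam, count))` should approach 1/`lam`, the mean of an exponential distribution with intensity `lam`, which is what the equality of h_{n}(τ_{w})dτ_{w} and f(u)du implies.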
Declarations
Acknowledgements
Professor Michael Mossing of the Department of Chemistry and Biochemistry of the University of Mississippi provided valuable advice during this study. Assad Mohammed and Oluseye Adeyemi provided valuable technical support for the completion of this work.