Stochasticity and noise-induced transition of genetic toggle switch
Journal of Uncertainty Analysis and Applications volume 2, Article number: 1 (2014)
Abstract
The ability to predict and analyze the function of genetic circuits will enhance the design of autonomous, programmable, complex regulatory genetic structures. An abundance of modeling techniques has recently been developed to delineate simple genetic structures in terms of their constituents. Simple systems with characteristics of feedback inhibition, multistability, switching, and oscillatory expression have often been the focus. The present work is an attempt to improve existing deterministic models, which fail to account for the crucial role of noise in genetic modeling.
The objective of this work is to analyze, model, and simulate the protein populations in gene expression mechanisms by resorting to stochastic algorithms. The system involves two types of genes; the protein produced from the expression of one gene is capable of turning off the expression of the other gene. Rates of degradation of these proteins are assumed to be proportional to their concentrations. The master equation of this ‘genetic toggle switch’ is formulated using the probabilistic population balance around a particular state and by considering five mutually exclusive events. The efficacy of the present methodology is mainly attributable to the ability to derive the governing equations for the means, variances, and covariance of the random variables by the method of system-size expansion of the nonlinear master equation. A less laborious approach based on Kurtz’s limit theorems for the derivation of the stochastic characteristics is also presented for comparison. Solving the resultant ordinary differential equations governing the means, variances, and covariance simultaneously, using published data, yields information concerning not only the means of the two populations of proteins but also the minimal uncertainties of the populations inherent in the expressions. It is demonstrated that systems with small populations are susceptible to large internal fluctuations (or uncertainties) in their population evolution. Large uncertainties are observed after the populations enter the proximity of the saddle node, which is likely to cause a transition of the system’s steady state from one to another. Independent Monte Carlo simulation runs clearly demonstrate the occurrence of such internal noise-induced transitions.
Introduction
One of the earliest examples of a bistable genetic switch is represented in the rightward operator of bacteriophage lambda [1, 2]. The essential elements of this type of genetic switch are a pair of promoters, each of which produces a repressor protein capable of inhibiting the production of the opposing repressor. Overlaid on these essential elements are several layers of regulatory nuance. To elucidate the impacts of these essential elements of a simplified regulatory circuit, a series of synthetic toggle switches were created.
Figure 1 shows the two-state genetic toggle switch consisting of two protein repressor genes and two promoters, which was investigated by Gardner et al. [3]. Each promoter enables the production of one repressor and is inhibited by the other. They elegantly designed experiments that demonstrated switching of a toggle circuit from one steady state to another by moving the system’s parameters across the bifurcation curve to a bistable region through either thermal inactivation of Repressor A or ligand binding-induced dissociation of the Repressor B-DNA complex. In the proximity of the bifurcation point, the final steady-state protein population possesses a bimodal distribution in its green fluorescent protein (GFP) fluorescence. It does not have a sharp jump from one fluorescence level to another, as the deterministic model predicts. The authors surmise that the stochastic nature of the dynamics blurs the bifurcation point.
McAdams and Arkin’s [4, 5] Monte Carlo simulations of gene expression revealed the importance of fluctuations (also called noise or uncertainties) in small systems. In such small systems, proteins are produced from an activated promoter in short bursts of variable numbers of proteins that occur at random time intervals. As a result, there can be large differences in the time between successive events in regulatory cascades across a cell population, which, in turn, creates both spatial and temporal heterogeneity of cell populations in biological systems. Soon after the discovery of the potential impacts of the stochasticity of genetic regulatory systems, stochastic algorithms developed by chemical physicists were introduced for analyzing gene expression (e.g., [6, 7]). The stochastic nature of a competitive expression mechanism can produce probabilistic outcomes in switching mechanisms that select between alternative regulatory paths, such as a toggle switch.
Stochastic algorithms have been developed for analyzing noise of different origins, namely internal and external noise (e.g., [8–10]). External noise consists of the fluctuations created in an otherwise deterministic system by the application of an external random force, whose stochastic properties are supposed to be known. A Langevin equation is commonly adopted in the analysis of dynamics caused by external noise. Internal noise arises from discrete systems where only a limited number of variables affecting the populations of the discrete entities can be included in the analysis. Small discrete systems, such as genes of small populations, often exhibit notable internal fluctuations. A master equation, derived from the probabilistic population balance around a particular state of the system by taking into account all mutually exclusive events, has been adopted for this type of discrete-state, continuous-time stochastic process.
The stochasticity of gene expression is complicated by its nonlinearity. Multiple steady states, stability, and bifurcation in gene expression (e.g., [11]) could mingle with the analysis of noise, or fluctuations. The efficacy of the master equation algorithm in gene expression is mainly attributable to its powerful ability to solve the nonlinear master equations through the system size expansion [9, 12]. In this approach, a suitable expansion parameter must be identified in the master equation. The expansion parameter represents the size of the fluctuations, and therefore, the magnitude of the jumps, or transitions, of the system’s state. Since the internal noise is expected to be low when the system size is large, the system size has been proposed as an expansion parameter. The master equation formulation along with the system-size expansion has indeed been applied to the analysis of noise in gene expression. It should be mentioned that the limit theorems of Kurtz [13–15] have rendered the complex procedure of system size expansion simple and highly accessible. Kurtz’s proof demonstrated that the solution of a Langevin equation approaches van Kampen’s system size expansion as the system size approaches infinity.
Kepler and Elston [16] examined the stochastic dynamics of the single-gene system with and without feedback and a switching system composed of two mutually repressed genes. Several assumptions were made in their simplified model: the two genes share the same operator and the same degradation rate, proteins bind to the operator as dimers, and the rate of dimerization is fast. Both the master equation and Monte Carlo simulation were adopted in their study. Scott et al. [17] adopted the master equation along with the system size expansion algorithm in the estimation of internal noise of the single-gene system that involves mRNA formation and degradation and protein formation and degradation.
The system size expansion has several limitations in modeling the gene regulatory process. It is a good approximation to the master equation for small internal noise and large system size. Moreover, the noise should be well within the boundary of attraction [9]. Thus, noise in oscillatory processes and noise away from the steady states have been a focus of several studies. Tao et al. [18] studied the noise far from the steady states and revealed that during the approach to equilibrium, the noise is not always reduced by the strength of the feedback. This is contrary to results seen in the equilibrium limit, which show decreased noise with feedback strength. Ito and Uchida [19] found that the internal noise of a regulatory single-gene system grows without bound in oscillatory networks and developed an alternative method for estimating the evolution of internal noise in such systems.
Kepler and Elston’s simulation work [16] demonstrated that simple noisy genetic switches have rich bifurcation structures. Among them are bifurcations driven solely by changing the rate of operator fluctuations even as the underlying deterministic system remains unchanged. They found stochastic bistability where the deterministic equations predict monostability and vice versa. Ochab-Marcinek [20] investigated the stationary behavior of a nonlinear system, a reduced, deterministic Yildirim and Mackey [21] model of the gene regulatory system, and discovered the transition of a steady state induced by noise. A perturbed Gaussian white noise term was introduced in the deterministic model followed by numerical simulations. Turcotte et al. [22] studied noise-induced stabilization of an unstable state of a genetic switch that undergoes a variety of bifurcations in response to parameter changes. Their Monte Carlo simulations showed that near one such bifurcation, noise induces oscillations around an unstable spiral point and thus effectively stabilizes this unstable fixed point.
In addition to the master equation algorithm, Monte Carlo simulation has been adopted in simulating the dynamic behaviors of genetic regulatory systems under the influence of internal noise (e.g., [11, 23, 24]). The Monte Carlo simulation shares the same assumption, the Markov property, as the master equation, and the noise can be obtained directly from the master equation’s deterministic counterpart. Moreover, the Monte Carlo simulation is capable of revealing the various characteristics of a nonlinear dynamic system, such as the number of steady states, bifurcation, and internal noise.
In this expository work, the master equations are formulated by stochastic population balance. Van Kampen’s system size expansion of the resultant nonlinear master equation gives rise to the variances of the processes. We demonstrate that the implementation of Kurtz’s limit theorems can efficiently achieve the same goal. Simulations are conducted based on both the master equations and the Monte Carlo procedure for three systems: bistable, monostable, and on the bifurcation curve. Finally, we demonstrate the possibility of transition induced by internal noise for a bistable system.
Model formulation
A genetic toggle switch with negative feedback to the genes consists of two mutually coupled genes. The transcription products of these genes are two inhibitory repressor proteins competing to shut off the production of two constitutive promoters [1, 3, 25]; the protein transcribed by a gene of one type is capable of deactivating the transcription of the other gene. A toggle switch typically has more than one possible stable steady state depending on the reaction parameters under consideration [3]. There are a number of instances in nature where this switch-like behavior is utilized. The lysogeny/lysis switch of the bacteriophage λ virus infecting the bacterium Escherichia coli is a representative example and has been discussed in detail by Ptashne [1] and Ptashne and Gann [25].
Gardner [26] discussed results generated from their deterministic model of a negative feedback toggle switch. Each type of the repressor protein is involved in two types of processes. The first process corresponds to the production of the protein. The rate of protein production is proportional to the concentration of mRNA, which, in turn, is proportional to the concentration of the unrepressed gene, G. The repressor binding on the unrepressed gene is commonly assumed to be in a quasi-steady state with the repressor, R, and the repressed gene, GR_{m}, i.e.,
Moreover, by assuming the total number of unrepressed genes is much larger than that of R so that G remains constant during the process, it can be shown that the rate of production of protein is proportional to 1/(1 + KR^{m}), where K is the equilibrium constant of the above reaction and R the concentration of the repressor monomer [27–29]. The second process in the model of Gardner et al. is degradation of the protein, which is assumed to be first order.
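The proportionality quoted above can be recovered in two steps. The following is a sketch, not the authors' derivation: it assumes the binding reaction is G + mR ⇌ GR_{m} with equilibrium constant K, and that the total gene count is conserved. The symbol G_T for that total is introduced here for illustration only.

```latex
\begin{align}
K &= \frac{[GR_m]}{[G]\,[R]^{m}}
  \quad\Longrightarrow\quad [GR_m] = K\,[G]\,[R]^{m},\\
G_T &= [G] + [GR_m] = [G]\left(1 + K[R]^{m}\right)
  \quad\Longrightarrow\quad
  \frac{[G]}{G_T} = \frac{1}{1 + K[R]^{m}} .
\end{align}
```

Under these assumptions the fraction of unrepressed genes, and hence the production rate, falls off as 1/(1 + KR^{m}) with increasing repressor concentration.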
Similar to the work of Gardner, we will assume in the current work that the genes are in equilibrium with their repressed genes. The stochastic nature of a competitive expression mechanism can produce probabilistic outcomes in switching mechanisms that select between alternative regulatory paths, such as a toggle switch.
The master equation describing the stochastic nature of the toggle switch is developed through the probabilistic population balance. The formulation of the master equation given below follows what Oppenheim et al. [8], Gardiner [10], and van Kampen [9] established. We have previously adopted this algorithm in the analysis of disease spread [30].
Mathematical assumptions
Let the random variables N_{1}(t) and N_{2}(t) represent the populations of the repressor protein R_{1} and repressor protein R_{2} at time t, respectively. The random vector of the system is N(t) such that N(t) = [N_{1}(t), N_{2}(t)], and the realization of this random vector representing the state of the system at time t is given by n(t), where n(t) = [n_{1}(t), n_{2}(t)]. Moreover, the probability of the system being in state n at time t is denoted by P_{n_{1},n_{2}}(t) or P[n_{1}(t), n_{2}(t);t]. The following assumptions are imposed in deriving the master equation governing the transition of the system among various states.

1. The random vector, N(t), is Markovian, i.e., for any set of successive times, t_{1} < t_{2} < … < t_{q}, we have P[N(t_{q}) | N(t_{1}), N(t_{2}), …, N(t_{q−1})] = P[N(t_{q}) | N(t_{q−1})].

2. The number of increments or decrements in population numbers of the classes depends only on the time interval, Δt, but not on time, i.e., the process is temporally homogeneous, signifying that N(Δt) and [N(t + Δt) − N(t)] are identically distributed.

3. The probability that an individual is produced or degraded is proportional to the duration of the time interval, (t, t + Δt), if the value of Δt is sufficiently small.

4. The probability that two or more transitions take place during the time interval, (t, t + Δt), is negligible, so that at most one transition occurs during this period.

5. Individual proteins in the same class have the same probability of contacting the genes, and therefore, have the same probability of repressing the genes. Similarly, the individual proteins in the same class have the same probability of being degraded.
Transition intensity functions
On the basis of the assumptions given in the preceding subsection, the transition probability of each event can be written in terms of the transition intensity functions, k_{1}, k_{2}, α_{2}, and α_{4}, as follows:
The first transition-intensity function, k_{1}, is the production probability of a type-1 repressor protein from a particular active (not repressed) gene of type 1 per unit time. Based on the assumption of temporal homogeneity, we have
where lim_{Δt→0} o(Δt)/Δt = 0. By considering all active type-1 genes in the system, the probability that the population of the type-1 protein will increase by one is k_{1}G_{a1}, where G_{a1} denotes the number of active genes of type 1, i.e., the genes that are not repressed. Mathematically,
where f_{1} is the ratio of the population of active genes to the total population, active and repressed, of genes of type 1. In writing the last line of the above statement, we assume that the total number of genes remains constant during the process of interest. Thus, the parameter, α_{1}, is the probability that a particular active gene will transcribe and produce a type-1 protein per unit time multiplied by the total number of genes.
For a negative feedback genetic circuit, Goodwin [27, 28] and Griffith [29] showed that
where K_{a} is the equilibrium constant of the combination reaction of the active gene of type 1 and the repressor, an m-mer, and m is the number of protein monomers of type 2 in the repressor. Combining the last two equations yields
The second transition-intensity function, α_{2}, is the overall consumption probability of a particular active protein of type 1 in the time interval, (t, t + Δt), including its function in repressing protein type 2. Mathematically,
By considering all type-1 repressor proteins in the system, the probability that the population of the type-1 protein will decrease by one is α_{2}n_{1}, or,
By analogy, the third transition-intensity function, k_{2}, is the production probability of a type-2 repressor protein from a particular active (not repressed) gene of type 2 per unit time, or,
This definition will lead to the following transition probability:
where G_{a2} denotes the number of active genes of type 2, f_{2} the ratio of the population of active genes to the total population, active and repressed, of genes of type 2, or,
K_{b} the equilibrium constant of the combination reaction of the active gene of type 2 and the repressor of type 1, an M-mer, and M is the number of protein monomers of type 1 in the repressor.
Also by analogy, the fourth transition-intensity function, α_{4}, is the consumption probability of a particular active protein of type 2 during the time interval, (t, t + Δt), or,
By considering all type-2 repressor proteins in the system, we have,
It should be noted that the rates adopted in deterministic models and discussed at the outset of the ‘Model Formulation’ section are used in defining these transition intensity functions. The transition intensity functions have pivotal importance in master equation models and Monte Carlo simulations. More importantly, the adoption of deterministic rate constants in the master equation is a cornerstone in the interpretation of intrinsic (or internal) noise [9].
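The four transition intensities described in this subsection can be collected in one small function. The following is a minimal sketch assembled from the verbal definitions above (production suppressed by the opposing repressor, first-order consumption); the function and argument names are hypothetical and are not taken from the original work.

```python
def transition_intensities(n1, n2, alpha1, alpha2, alpha3, alpha4, Ka, Kb, m, M):
    """Return the four transition intensities (per unit time) at state (n1, n2).

    Assumed forms, inferred from the text:
      R1 production:  alpha1 / (1 + Ka * n2**m)
      R1 consumption: alpha2 * n1
      R2 production:  alpha3 / (1 + Kb * n1**M)
      R2 consumption: alpha4 * n2
    """
    w_prod1 = alpha1 / (1.0 + Ka * n2 ** m)  # repressed by an m-mer of R2
    w_deg1 = alpha2 * n1                     # first order in n1
    w_prod2 = alpha3 / (1.0 + Kb * n1 ** M)  # repressed by an M-mer of R1
    w_deg2 = alpha4 * n2                     # first order in n2
    return w_prod1, w_deg1, w_prod2, w_deg2
```

Multiplying each intensity by a short interval Δt gives the probability of the corresponding event in (t, t + Δt), up to terms of order o(Δt).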
Master equations
Based on the transition intensity functions defined above, the master equation can be obtained by taking probability balance of the following five mutually exclusive events leading to the evolution of the state of the system:

1. One R_{1} is produced while R_{2} remains constant.

2. One R_{1} is degraded while R_{2} remains constant.

3. One R_{2} is produced while R_{1} remains constant.

4. One R_{2} is degraded while R_{1} remains constant.

5. Both R_{1} and R_{2} remain the same.
As illustrated in Figure 2, the probabilities that these five exclusive events will lead the system to state n at arbitrary time (t + Δt) can be written as follows:
where W_{n,n′}(t) is the conditional probability of the system’s transition from state n′(t) to state n(t + Δt) per unit time.
Since these five events are mutually exclusive, we have
By substituting all the transition probabilities discussed in Equations 5 through 9 into the above expression, we obtain the probability of the system at state n at arbitrary time (t + Δt) as follows:
By rearranging the above equation and taking the limit as Δt → 0, we obtain the following master equation:
For convenience, the one-step operator, E, is defined through its effect on an arbitrary function f(n), following van Kampen [9]:
The master equation is rewritten compactly in terms of the one-step operator as follows:
The solution to the equation with the step operator yields the time-dependent joint probability distribution of the populations of repressor proteins.
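For orientation, a master equation consistent with the four transition intensities described earlier can be written in step-operator form. This is a sketch reconstructed from the verbal definitions in the text, not a transcription of the original Equations 10 through 12; the subscripted operators E_1 and E_2 (acting on n_1 and n_2, respectively) are notation assumed here.

```latex
\begin{align}
E_i^{\pm 1} f(\ldots, n_i, \ldots) &= f(\ldots, n_i \pm 1, \ldots),\\
\frac{dP_{n}(t)}{dt} &=
  \frac{\alpha_1}{1 + K_a n_2^{m}}\left(E_1^{-1} - 1\right) P_{n}(t)
  + \alpha_2 \left(E_1 - 1\right) n_1 P_{n}(t) \nonumber\\
 &\quad + \frac{\alpha_3}{1 + K_b n_1^{M}}\left(E_2^{-1} - 1\right) P_{n}(t)
  + \alpha_4 \left(E_2 - 1\right) n_2 P_{n}(t).
\end{align}
```

Each term pairs a gain (the shifted state) with a loss (the current state): production terms shift the population down by one to find the source state, and consumption terms shift it up by one.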
System-size expansion based on van Kampen’s procedure
The master equation, Equation 10 or 12, governs the evolution of the joint probability distribution of the populations of the two competing repressors, P_{n}(t). Equation 10 comprises a set of ordinary differential equations with the joint probability function, P_{n}(t), as its unknown. Each equation in the set represents a particular outcome of n; thus, solving Equation 12 for the joint probability distribution over the exceedingly large number of possible values of n is extremely difficult, if not impossible. In practice, however, it often suffices to determine only the expressions that govern a limited number of moments, especially the first and second moments, of the resultant population distribution. These expressions yield the means, variances, and covariances that can be correlated or compared with the experimental data.
Moreover, Equation 12 is nonlinear, which prevents the moments from being evaluated by averaging techniques or joint probability generating function techniques [9]. This difficulty is circumvented by resorting to the system-size expansion, a rational approximation technique based on the power series expansion [9, 12, 31]. The technique gives rise to the deterministic macroscopic equations as well as the equations of fluctuations for the master equation.
To apply the system-size expansion, a suitable expansion parameter must be identified in the master equation, specifically in the transition intensity functions. The expansion parameter must govern the size of the fluctuations, and therefore, the magnitude of the jumps, or transitions. The macroscopic features are determined by the average behavior of all particles, while internal fluctuations are caused by the discrete nature of matter. Hence, we expect the fluctuations to be relatively small when the system size is large. The system size, Ω, has been proposed as an expansion parameter because it measures the relative importance of the fluctuations [9, 12, 31]. In the current genetic regulatory network, the total initial promoter population, or the total number of initial reactants, is chosen as Ω so that the noise estimated from both the master equation and the Monte Carlo simulations discussed below represents the standard deviations from the means.
For a linear system, fluctuations are of the order of Ω^{1/2} in a collection of Ω entities. As a result, their effect on the macroscopic properties is of the order of Ω^{−1/2} [9, 12]. In the system under consideration, therefore, we expect that the joint probability, P_{n}(t), will have a sharp maximum around the macroscopic value, n(t) = ΩΘ(t), with a width of the order of Ω^{1/2}. Here, Θ(t) is a vector whose elements are the mean numbers of the two protein populations, φ(t) and θ(t), obtained through the solution of the macroscopic equations, as will be elaborated later. To exploit these characteristics of the system, a new random vector Y(t) is defined as follows:
The equations of realizations of these expressions are given, respectively, by
Accordingly, the joint probability of n_{1} and n_{2}, i.e., P_{n}(t), is now transformed into that of y_{1} and y_{2}, i.e., Ψ_{y}(t). Subsequently, the new random vector, Y, the new joint probability distribution, Ψ_{y}(t), and the definition of the one-step operator, E, Equation 11, are substituted into Equation 12. By expanding the right-hand side of the resultant expression into a Taylor’s series, the master equation in terms of the new variables is obtained; see Appendix 1. All appendices to this paper can be found in the supporting materials for this Journal.
Collecting the terms of order Ω^{1/2} on the right-hand side of the expanded equation gives rise to the following expressions governing the evolution of the macroscopic equation of the system:
where the constants, α_{1}^{′}, K_{ a }^{′}, α_{3}^{′}, and K_{ b }^{′} correspond respectively to the parameters α_{1}, K_{ a }, α_{3}, and K_{ b }, normalized with Ω or a specific power of Ω so that collected terms in system size expansion have the same order of magnitude, i.e.,
Equations 17 and 18 are of the same forms as the macroscopic equations of Gardner [26].
Similarly, collecting the terms of order Ω^{0} gives rise to the following linear Fokker-Planck equation [9] (see Appendix 1), which governs the first and the second moments associated with the fluctuations of the system:
where the two matrices A and B are
A Fokker-Planck equation is considered linear if the coefficient matrix A, the drift term, is a linear function of Y and the coefficient matrix B, the diffusion term, is constant [9]. Note that the macroscopic trajectories, φ and θ, are functions of t only and can be obtained by integrating Equations 17 and 18. Thus, the coefficients of the equation governing the fluctuations, A and B in Equations 22 and 23, are independent of the fluctuations, Y. For a linear Fokker-Planck equation, the ordinary differential equations governing the means and variances of the fluctuations, Y, can be derived by taking the first and second moments of Equation 21.
Taking the first moment of Equation 21 yields the expression governing the mean of the fluctuations, Y:
Substituting Equations 22 and 23 into the above expression gives rise to
Similarly, taking the second moment of Equation 21 yields the expression governing the second moment of the fluctuations, Y:
Substituting Equations 22 and 23 into the above expression gives rise to
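For reference, the moment equations of a linear Fokker-Planck equation have a standard general form [9]. The following is a sketch in generic index notation; the angle brackets for averages and the component indices i, j, k are assumptions of this sketch and may differ from the layout of Equations 21 and 24 through 28.

```latex
\begin{align}
\frac{\partial \Psi}{\partial t}
  &= -\sum_{i,j} A_{ij}\,\frac{\partial}{\partial y_i}\bigl(y_j \Psi\bigr)
     + \frac{1}{2}\sum_{i,j} B_{ij}\,
       \frac{\partial^{2}\Psi}{\partial y_i\,\partial y_j},\\
\frac{d\langle y_i\rangle}{dt} &= \sum_{j} A_{ij}\,\langle y_j\rangle,\\
\frac{d\langle y_i y_j\rangle}{dt}
  &= \sum_{k} A_{ik}\,\langle y_k y_j\rangle
   + \sum_{k} A_{jk}\,\langle y_i y_k\rangle + B_{ij}.
\end{align}
```

With two populations, the second-moment equation yields three independent equations (two variances and one covariance), and the first-moment equation yields two, which matches the count of five fluctuation equations referred to in the text.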
System size expansion based on Kurtz’s limit theorems
The approximation of the master equation discussed in the preceding section, i.e., the system size expansion method, can be derived and stated compactly in a general form based on Kurtz’s limit theorems [13–15] under the condition Ω → ∞. First, the master equation, Equation 11, can be written in the following continuous-state, gain-loss form [9]:
where W(n;n + r) is the transition probability from state n to state n + r per unit time. Both n and r in Equation 29 are now treated as continuously varying vectors. The convergence of the system size expansion procedure relies on two criteria for the transition probability rate: small jumps and slow variation [9]. Mathematically, the small-jump criterion implies that there is a small δ so that
and the slowly varying assumption means that there is a small δ so that
To satisfy these criteria, the unit jumps associated with the mutually exclusive events in the formulation of the master equation are replaced by jumps of size Ω^{−1}, where Ω is the system size, or the largeness parameter. Thus, the random vector N(t) = (n_{1}(t), n_{2}(t)) is replaced by Ñ(t) = N(t)/Ω and time is replaced by t̃ = t/Ω. The resultant master equation of Equation 29 becomes
Comparing Equations 10 and 29 shows that the transition probability per unit time for the current problem can be stated in the following form:
where δ(n) and δ_{i,j} are the Dirac and Kronecker delta functions, respectively. The four parameters on the right-hand side of Equation 33 are obtained from the definitions of the transition intensity functions.
Kurtz’s limit theorems state that, as Ω → ∞ with an error of O(ln Ω/Ω), the statistical properties of the master equation, Equation 32, can be approximated by the following Fokker-Planck equation:
where the deterministic drift, K_{i}^{(1)∞}(ñ), and diffusion coefficients, K_{ij}^{(2)∞}(ñ), are
Substituting Equation 33 into Equation 35 yields the first moments of W̃:
Similarly, substituting Equation 33 into Equation 36 yields the following second moments of W̃:
The approximation of the master equation, Equation 12, can be found based on the fact that the Fokker-Planck equation, Equation 34, can be obtained by integrating the following nonlinear Langevin equation in Ito’s interpretation [9]
where the first term on the right-hand side of the above equation represents the deterministic, or macroscopic, characteristic of the process, and η_{i}(ñ, t̃) denotes a Gaussian white noise having the following means and covariance matrix
L(t̃) denotes a Gaussian white noise with unit strength, and C_{i}(ñ) denotes the effects of interactions of the noise and the system on the random variable. The discontinuity of Gaussian white noise has given rise to several algorithms for interpreting C_{i}(ñ) during the process, and thus for converting a Langevin equation to its Fokker-Planck counterpart. In Ito’s algorithm, the value of C_{i}(ñ) before the arrival of white noise is used in averaging. In Stratonovich’s algorithm, the averaged value of C_{i}(ñ) during the time of noise is used in averaging, which yields an extra term in the macroscopic part of the Fokker-Planck equation. Since L(t̃) is never infinitely sharp and lasts a finite time, the Ito and Stratonovich calculi are more appropriate in modeling internal and external noise, respectively [9].
With this Langevin representation in hand, the equations derived in the last section, i.e., Equations 17, 18, 22, and 23, can be readily obtained. Specifically, substituting Equations 37 and 38 into Equation 43 and ignoring the noise term yields Equations 17 and 18. Since the drift coefficient in a Fokker-Planck equation, matrix A in Equation 21, is the Jacobian matrix of the functions on the right-hand sides of Equations 17 and 18 [9], Equation 22 can be obtained by taking derivatives. Finally, it is obvious that the elements of the covariance matrix, Equations 39 through 42, are identical to those shown in Equation 23.
System size expansion based on Kurtz’s theorems is substantially simpler than the original procedure proposed by van Kampen [9]. This efficiency was previously utilized by Aparicio and Solari [32] and Chua et al. [33] in their studies of stochastic population dynamics of disease transmission and chemical vapor deposition, respectively.
It should be mentioned that the system size expansion method discussed in this and the last sections suffers from several limitations. Simulation with the system-size expansion converges to the steady state within its boundary of attraction just like its deterministic counterpart, and it cannot generate noise-induced transitions, as will be discussed later in the simulation section [9]. The system size expansion near the steady-state boundary of attraction (i.e., away from the steady state) yields noise that is not compatible with that generated near the steady states [18].
Simulations
The genetic toggle switch model presented in the preceding section has been simulated by two approaches. The first approach relies on the solution of the governing equations for the first and second moments of the random variables derived from the master equations. The second approach resorts to the event-driven Monte Carlo algorithm.
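An event-driven Monte Carlo run can be sketched with Gillespie's direct method, which is one common event-driven scheme; the source does not state which variant was used. The sketch below assumes the four transition intensities described in the 'Transition intensity functions' subsection. The function name, argument names, and the parameter values in the usage lines are illustrative assumptions only.

```python
import math
import random


def gillespie_toggle(n1, n2, t_end, alpha1, alpha2, alpha3, alpha4,
                     Ka, Kb, m, M, rng):
    """Simulate one trajectory; returns a list of (time, n1, n2) tuples."""
    t = 0.0
    path = [(t, n1, n2)]
    while True:
        rates = (
            alpha1 / (1.0 + Ka * n2 ** m),  # R1 produced
            alpha2 * n1,                    # R1 consumed
            alpha3 / (1.0 + Kb * n1 ** M),  # R2 produced
            alpha4 * n2,                    # R2 consumed
        )
        total = sum(rates)  # positive whenever alpha1 or alpha3 is positive
        # Exponentially distributed waiting time to the next event.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose which event fires, with probability proportional to its rate.
        pick = rng.random() * total
        if pick < rates[0]:
            n1 += 1
        elif pick < rates[0] + rates[1]:
            n1 -= 1
        elif pick < rates[0] + rates[1] + rates[2]:
            n2 += 1
        else:
            n2 -= 1
        path.append((t, n1, n2))
    return path


# Illustrative usage with made-up parameter values.
rng = random.Random(0)
path = gillespie_toggle(0, 0, 5.0, 15.6, 1.0, 15.6, 1.0, 1.0, 1.0, 2, 2, rng)
```

Averaging many such independent trajectories on a common time grid gives Monte Carlo estimates of the means, variances, and covariance that can be compared with the moment equations.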
Simulation based on the master equations
To effectively analyze the impact of system parameters, the equations governing the first and second moments are converted to dimensionless forms. Following Gardner’s procedure [3], we introduce the following variables, with the assumption α_{2} = α_{4}:
Substituting these three variables into Equations 17 and 18 yields the following compact set of equations:
where
When the effective rates of synthesis of the two proteins are comparable, we have
Then Equations 24 through 28 can be transformed into the following compact forms
Equations 49, 50, and 54 through 58 can be integrated simultaneously to obtain the statistical characteristics of the dynamical processes. Equations 49 and 50 yield the means of the populations, while Equations 54 and 55 yield the means of the fluctuations, which are essentially zero due to the assumption of symmetric noise around the means, i.e., Equations 13 and 14. Equations 56 through 58 generate the variances and covariance of the two constituent populations. The integration was conducted in Matlab with ode45, an explicit Runge-Kutta (4,5) solver intended for nonstiff sets of ordinary differential equations; Gear-type methods for stiff sets are provided in Matlab by ode15s.
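The simultaneous integration of mean and covariance equations can be illustrated with a short, dependency-free script. This is a sketch under stated assumptions, not the authors' code: it assumes the dimensionless macroscopic equations take Gardner's form du/dτ = a1/(1 + v^m) − u and dv/dτ = a2/(1 + u^M) − v, that the drift matrix A is their Jacobian, and that the diffusion matrix B is diagonal with entries equal to production plus degradation. Any extra scaling factors carried by the original Equations 54 through 58 are set to one here. A fixed-step classical Runge-Kutta scheme stands in for ode45, and all names are hypothetical.

```python
def lna_rhs(state, a1, a2, m, M):
    """Right-hand side for (u, v, s11, s12, s22): means plus covariance."""
    u, v, s11, s12, s22 = state
    prod_u = a1 / (1.0 + v ** m)
    prod_v = a2 / (1.0 + u ** M)
    # Jacobian of the macroscopic right-hand sides (assumed drift matrix A).
    a11 = a22 = -1.0
    a12 = -a1 * m * v ** (m - 1) / (1.0 + v ** m) ** 2
    a21 = -a2 * M * u ** (M - 1) / (1.0 + u ** M) ** 2
    # Assumed diffusion matrix B: production plus degradation on the diagonal.
    b11 = prod_u + u
    b22 = prod_v + v
    # Covariance dynamics: dS/dt = A S + S A^T + B, written component-wise.
    ds11 = 2.0 * (a11 * s11 + a12 * s12) + b11
    ds12 = a11 * s12 + a12 * s22 + a21 * s11 + a22 * s12
    ds22 = 2.0 * (a21 * s12 + a22 * s22) + b22
    return [prod_u - u, prod_v - v, ds11, ds12, ds22]


def rk4(f, y0, t_end, dt, *args):
    """Classical fourth-order Runge-Kutta with a fixed step size."""
    y = list(y0)
    for _ in range(int(round(t_end / dt))):
        k1 = f(y, *args)
        k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)], *args)
        k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)], *args)
        k4 = f([yi + dt * ki for yi, ki in zip(y, k3)], *args)
        y = [yi + dt * (p + 2.0 * q + 2.0 * r + s) / 6.0
             for yi, p, q, r, s in zip(y, k1, k2, k3, k4)]
    return y


# Illustrative run: start at (u, v) = (10, 1) with zero initial covariance.
u, v, s11, s12, s22 = rk4(lna_rhs, [10.0, 1.0, 0.0, 0.0, 0.0],
                          30.0, 0.01, 15.6, 15.6, 2, 2)
```

Because the covariance equations are driven by the macroscopic trajectory but do not feed back into it, the means computed this way always settle in the steady state selected by the initial condition, which is consistent with the remark that the system size expansion cannot produce noise-induced transitions.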
As we will demonstrate later, some of the simulation results, including noise-induced transitions, depend on the parameter values and initial conditions, which, in turn, are closely related to the properties of the deterministic system, i.e., Equations 49 and 50. For the nonlinear system governed by Equations 49 and 50, the location of the parameters α″_{1} and α″_{2} in the bifurcation diagram and the location of the initial populations in the phase diagram have significant effects on the evolution of the system's state. In order to analyze the process under selected conditions, the values of the four parameters used for simulation, α″_{1}, α″_{2}, m, and M, are taken from published experimental results [3, 5, 34, 35] and from inferences drawn from the phase and bifurcation diagrams. A thorough review of the protein and mRNA reaction rates involved in the control mechanism can be found in Santallin and Mackey [36]. The values of several of these variables can also be found in other regulatory modeling literature [7, 37–39]. As shown in Figure 3, for m = M = 2 and α″_{1} = α″_{2} = 15.6, the traces of (u, v) obtained by setting the right-hand sides of Equations 49 and 50 to zero intersect at three points, each of which is a steady state. Liapunov stability analysis reveals that two of these steady states are stable and that the one in the middle is unstable, i.e., a saddle node. The bifurcation analysis for m = M = 2 illustrates that the system has one or two stable steady states, depending on the values of α″_{1} and α″_{2} (see Figure 4 and [3]).
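The steady-state count and stability described above can be checked numerically. The sketch below again assumes the Gardner-type form du/dτ = α″_{1}/(1 + v^{m}) − u, dv/dτ = α″_{2}/(1 + u^{M}) − v for Equations 49 and 50; the helper names `steady_states` and `classify` are hypothetical. It locates intersections of the two nullclines by bracketing sign changes on a grid and classifies each one from the eigenvalues of the Jacobian, which is a linear-stability check rather than a full Liapunov analysis.

```python
import numpy as np
from scipy.optimize import brentq


def steady_states(alpha1, alpha2, m=2, M=2, n_grid=20001):
    """Return the (u, v) intersections of the nullclines of the assumed equations."""

    def v_null(u):
        return alpha2 / (1.0 + u**M)  # dv/dtau = 0

    def f(u):
        return alpha1 / (1.0 + v_null(u) ** m) - u  # du/dtau = 0 along the v-nullcline

    grid = np.linspace(0.0, alpha1 + 1.0, n_grid)
    vals = f(grid)
    roots = []
    for a, b, fa, fb in zip(grid[:-1], grid[1:], vals[:-1], vals[1:]):
        if fa * fb < 0.0:  # a sign change brackets a root
            u = brentq(f, a, b)
            roots.append((u, v_null(u)))
    return roots


def classify(u, v, alpha1, alpha2, m=2, M=2):
    """Label a steady state from the signs of the Jacobian eigenvalues."""
    jac = np.array(
        [
            [-1.0, -alpha1 * m * v ** (m - 1) / (1.0 + v**m) ** 2],
            [-alpha2 * M * u ** (M - 1) / (1.0 + u**M) ** 2, -1.0],
        ]
    )
    eig = np.linalg.eigvals(jac)
    return "stable" if np.all(eig.real < 0.0) else "unstable"


for u, v in steady_states(15.6, 15.6):
    print(round(u, 4), round(v, 4), classify(u, v, 15.6, 15.6))
```

Because the search relies on sign changes, a tangential intersection, such as the one expected exactly on the bifurcation curve, may be missed; a finer treatment would be needed there.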
As marked in Figure 4, three possible sets of α″_{1} and α″_{2} are sufficient to characterize the different cases of population dynamics: monostable, bistable, and bifurcation. Thus, the following three sets of parameter values are chosen in our simulations for characterizing the dynamics in different regions:
Case A, in the bistable region: α″_{1} = 15.6 and α″_{2} = 15.6,
Case B, on the bifurcation curve: α″_{1} = 15.6 and α″_{2} = 4.0,
Case C, in the monostable region: α″_{1} = 15.6 and α″_{2} = 1.2.
We assume m = 2 and M = 2 for all the simulations presented herein.
Initial protein populations are also important to the evolution of the dynamics in several respects. It is established in nonlinear dynamics that different initial conditions can lead to different steady states, and that the evolution of the dynamics may be altered significantly by a small variation of the initial conditions. In this work, we will demonstrate that noise can induce a transition of the system from one steady state to another when the populations pass through the neighborhood of an unstable steady state (or the saddle node) of a bistable system. Moreover, for very small initial populations, the numerical equations become invalid, as the protein values tend to become so small that they drive the bifurcation lines beyond the domain of application. Thus, the initial populations should be chosen in the range of tens to hundreds. In the present work, the initial populations are chosen as u(0) = 155 and v(0) = 154 for the three cases discussed above. To further illustrate the effects of the initial populations, a simulation is conducted with u(0) = 15, v(0) = 155, α″_{1} = 15.6, and α″_{2} = 15.6 for the bistable system, or Case A. The population trajectories do not pass through the neighborhood of the saddle node in this simulation and pose no risk of noise-induced transition.
It should be mentioned that the process of interest is characterized by the transition intensity functions, k_{1}, k_{2}, α_{2}, and α_{4}, which define the probabilities of transitions of each type of population per unit time. If the fraction of the population converted per unit time is taken to represent the intensity function, its significance is equivalent to that of the deterministic rate constant of the specific rate. In other words, from the change in the population of a particular protein type i due to the conversion of type i during the time interval (t, t + Δt), we have
where Ω stands for the system size, i.e., the total initial population, and −R_{i} is the population of type-i protein converted per unit time. A detailed discussion of the relationship between the deterministic rate constant and the intensity function can be found in [9].
Simulation based on the Monte Carlo procedure
Linear or nonlinear dynamic processes have been simulated either deterministically or stochastically by Monte Carlo procedures. It is worth noting that a well-developed class of Monte Carlo simulation procedures essentially shares identical computational bases with the master equation algorithm presented in the preceding sections. Specifically, the assumptions of the Markov property and temporal homogeneity of the random variables lead to the definitions of transition intensity functions [33, 40, 41]. As discussed in the “Model Formulation” section, probability balances of various events on the basis of these intensity functions give rise to the master equations. In the Monte Carlo simulation, the system’s state is simulated by a stepwise, random-walk scheme based on the same intensity functions.
Process systems or phenomena can be simulated by time-driven and event-driven Monte Carlo procedures [42]. The difference between these two procedures lies in the manner of updating the time clock of the evolution of the system. The time-driven procedure advances the simulation clock by a prespecified time increment, Δt, which is sufficiently small that at most one event will occur in this interval. The probability of an event occurring is determined by the nature and magnitudes of the transition intensity functions. In contrast, the event-driven procedure updates the simulation clock by randomly generating the waiting time, τ_{ w }, which has an exponential distribution [43, 44]; this distribution signifies that a population transition takes place completely randomly. At the end of each waiting interval, one event will occur, and the state to which the system will transfer is also determined by the nature of the transition intensity functions.
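For contrast with the event-driven procedure adopted in this work, a single step of a time-driven procedure might be sketched as follows. The function name `time_driven_step`, the generic list of rates, and the first-order approximation "probability of event i in Δt ≈ rate_i · Δt" are illustrative assumptions; the step size must be small enough that these probabilities sum to no more than one.

```python
import numpy as np


def time_driven_step(rates, dt, rng):
    """Return the index of the event occurring in (t, t + dt), or None if no event occurs.

    Assumes dt is small enough that at most one event occurs, so that
    event i has probability rates[i] * dt and "no event" has the remainder.
    """
    probs = np.asarray(rates, dtype=float) * dt
    if probs.sum() > 1.0:
        raise ValueError("dt is too large: event probabilities exceed 1")
    r = rng.random()  # uniform in [0, 1)
    edges = np.cumsum(probs)  # right ends of the sub-intervals assigned to each event
    idx = int(np.searchsorted(edges, r, side="right"))
    return idx if idx < len(probs) else None


rng = np.random.default_rng(0)
print(time_driven_step([2.0, 0.5], dt=0.01, rng=rng))
```

Most steps of such a scheme produce no event at all, which is one reason the event-driven procedure tends to be faster.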
The process of interest here, i.e., the genetic toggle switch, has been simulated by the event-driven procedure; it is usually computationally faster than the time-driven procedure. The simulation starts with a given initial distribution of population; the essential task is to obtain the probability distributions of the protein numbers at any subsequent time. To determine the system transition in each time step, two random numbers are generated for two different purposes. The first random number in (0, 1), i.e., r_{ 1 }, is for estimating the waiting time during which a possible transition of the system’s state will take place. The second random number in (0, 1), i.e., r_{ 2 }, is for identifying the transition type.
Waiting time
Let T_{ n } be the random variable representing the waiting time of the population of the system of interest at state n prior to its transition due to the production or consumption of a protein, and let τ_{ w } be the realization of T_{ n }. Moreover, let G_{ n }(τ_{ w }) be the probability that no transition takes place during τ_{ w }. Thus,
This can be expressed as (see derivation in Appendix 2)
The complement of G_{ n }(τ_{ w })
expresses the cumulative probability distribution of T_{ n } up to τ_{ w }. The probability density function of T_{ n }, i.e.,
Therefore, h_{ n }(τ_{ w }) has the following exponential form (see Appendix 2)
Note that H_{ n }(τ_{ w }) is the cumulative probability distribution function of T_{ n }.
Equation 64 indicates that to estimate the waiting time of a protein-regulated gene expression, τ_{ w }, a sequence of exponentially distributed random numbers must be generated. The sequences of computer-generated random numbers, however, are usually uniformly distributed in the interval [0, 1]. This uniform distribution, therefore, needs to be transformed into the exponential distribution, which can be accomplished by defining a new random variable, denoted by U, whose realization, denoted by u, assumes the value of H_{ n }(τ_{ w }) at τ_{ w } [43, 44], i.e.,
or, inversely,
It can be verified that if the waiting time, T_{ n }, whose realization is τ_{ w }, is exponentially distributed, then the random variable, U, whose realization is u, is uniformly distributed over interval [0, 1], see Appendix 3.
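The inverse transformation can be sketched in a few lines of Python. The sketch assumes that the waiting time is exponential with a total transition intensity λ, so that u = H_{ n }(τ_{ w }) = 1 − exp(−λ τ_{ w }) and hence τ_{ w } = −ln(1 − u)/λ; the name `waiting_time` and the symbol `total_intensity` are hypothetical, and the exact form of Equation 66 in the paper may differ in notation.

```python
import numpy as np


def waiting_time(total_intensity, r1):
    """Map a uniform random number r1 in [0, 1) to an exponential waiting time.

    Inverts u = 1 - exp(-lambda * tau), the assumed cumulative distribution of T_n.
    """
    return -np.log(1.0 - r1) / total_intensity


rng = np.random.default_rng(42)
samples = waiting_time(4.0, rng.random(100_000))
print("sample mean:", samples.mean(), "expected about", 1.0 / 4.0)
```

The sample mean of the generated waiting times should be close to 1/λ, which offers a quick check that the transformation produces the intended exponential distribution.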
Probabilities of four possible transitions
After residing in state n = (n_{1}, n_{2}) for a waiting time of τ_{ w }, the system will transfer to one of its adjacent states. During the process, the transition intensity functions governing the four possible transitions of protein populations from state (n_{1}, n_{2}) to states (n_{1} − 1, n_{2}), (n_{1} + 1, n_{2}), (n_{1}, n_{2} − 1), and (n_{1}, n_{2} + 1) are α_{2}, k_{1}, α_{4}, and k_{2}, respectively. These transitions are exact equivalents of the transitions from states (n_{1} + 1, n_{2}), (n_{1} − 1, n_{2}), (n_{1}, n_{2} + 1) and (n_{1}, n_{2} − 1) to state (n_{1}, n_{2}), as shown in Figure 2. These four possible transitions are mutually exclusive events. Moreover, as discussed in the last section, one and only one of the four possible transitions takes place during the waiting time determined by the random number r_{ 1 }. Thus, the probability of the system transferring from (n_{1}, n_{2}) to (n_{1} − 1, n_{2}) is
The probability of the system transferring from state (n_{1}, n_{2}) to (n_{1} + 1, n_{2}) is
The probability of the system transferring from state (n_{1}, n_{2}) to (n_{1}, n_{2} − 1) is
Similarly, the probability of the system transferring from state (n_{1}, n_{2}) to (n_{1}, n_{2} + 1) is
Since the sum of Q_{1} through Q_{4} is 1, the transition type can be identified by the randomly generated number, r_{ 2 }. Specifically, r_{ 2 } falling within the interval,
implies that the population of type-1 protein decreases by 1, see Equation 67; r_{ 2 } falling within the interval,
implies that the population of type-1 protein increases by 1; r_{ 2 } falling within the interval,
implies that the population of type-2 protein decreases by 1; r_{ 2 } falling within the interval,
implies that the population of type-2 protein increases by 1.
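The interval test on r_{ 2 } can be sketched as a cumulative-sum lookup. In the hypothetical helper below, `intensities` holds the four transition intensities in the order used in the text (type-1 decrease, type-1 increase, type-2 decrease, type-2 increase); normalizing them gives Q_{1} through Q_{4}, and the running sums of the Q values serve as the interval boundaries. The names `MOVES` and `choose_transition` are illustrative, not taken from the paper.

```python
import numpy as np

# Population changes (dn1, dn2) in the order of Q_1 .. Q_4 described in the text.
MOVES = ((-1, 0), (+1, 0), (0, -1), (0, +1))


def choose_transition(intensities, r2):
    """Pick the transition whose cumulative-probability interval contains r2."""
    q = np.asarray(intensities, dtype=float)
    q = q / q.sum()  # transition probabilities Q_1 .. Q_4, summing to 1
    edges = np.cumsum(q)  # right ends of the four consecutive intervals
    idx = int(np.searchsorted(edges, r2, side="right"))
    idx = min(idx, len(q) - 1)  # guard against round-off at the upper end
    return MOVES[idx]


print(choose_transition([1.0, 1.0, 1.0, 1.0], 0.6))
```

With four equal intensities the intervals are [0, 0.25), [0.25, 0.5), [0.5, 0.75), and [0.75, 1), so r_{ 2 } = 0.6 selects the third transition, a decrease of the type-2 population by 1.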
Simulation algorithm
The event-driven Monte Carlo procedure is conducted according to Rajamani [40]. A stepwise description of the procedure is given below.

1. Define the initial populations of the two types of proteins, and let the system size, Ω, be the sum of the two protein populations. This Ω will also be the total number of independent simulations to be conducted before taking their statistics. Start the random walk from this point.
2. Select the total length of time of each simulation, T_{ f }. For the current work, T_{ f } was chosen to be either 15 or 50 s.
3. Determine the length of the waiting time, τ_{ w }. First, generate a random number, r_{1}, from a uniform distribution in [0, 1]; then, calculate τ_{ w } for the system's state n(t) = (n_{1}(t), n_{2}(t)) according to Equation 66.
4. Update the computer clock by letting t = t + τ_{ w }.
5. Calculate the probabilities, Q_{ i }, that the system will transfer from state n to each of the adjacent states by Equations 67 through 70. Then, generate another random number, r_{2}, from a uniform distribution in [0, 1]. Determine the transition type by examining in which of the intervals given by Equations 71 through 74 r_{2} is located.
6. Repeat steps 3 to 5 until the total time exceeds T_{ f }; this terminates one replication of the simulation.
7. Repeat steps 2 to 6 Ω times, and store the resultant number of proteins of type i during the j th replication at time t, n_{ ij }(t). This yields the mean number of proteins of type i at time t as
E\left[N_{i}(t)\right] = \frac{\sum_{j=1}^{\Omega} n_{ij}(t)}{\Omega} \qquad (75)
The variance of population of type i at time t can be calculated from its definition, i.e.,
The covariance around the means between the two types of populations i and j, at time t can be calculated from its definition, i.e.,
As mentioned at the outset of this section, both the Monte Carlo simulation and the simulation based on the master equations adopted in the current work are rooted in the identical set of transition intensity functions derived from the same set of assumptions. Thus, integrating the equations for the first and second moments of the master equations, i.e., Equations 49 and 50 together with Equations 54 through 58, is expected to generate results nearly identical to those from the Monte Carlo simulations, i.e., Equations 75 through 77.
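The seven steps and the ensemble statistics can be put together in a compact Python sketch. It is not the authors' implementation: the propensity forms (unit-rate degradation and Hill-type repressed synthesis), the hypothetical parameter `scale` that converts molecule numbers to dimensionless concentrations, and all function names are assumptions chosen so that the deterministic limit matches the Gardner-type equations du/dτ = α″_{1}/(1 + v^{m}) − u and dv/dτ = α″_{2}/(1 + u^{M}) − v. The mean follows Equation 75; the variance and covariance are computed from their usual population definitions.

```python
import numpy as np

# Population changes in the order: n1 - 1, n1 + 1, n2 - 1, n2 + 1.
MOVES = np.array([[-1, 0], [1, 0], [0, -1], [0, 1]])


def propensities(n1, n2, alpha1, alpha2, m, M, scale):
    """Assumed transition intensities of the four events (hypothetical forms)."""
    u, v = n1 / scale, n2 / scale
    return np.array([
        n1,                             # degradation of protein 1
        scale * alpha1 / (1.0 + v**m),  # synthesis of protein 1, repressed by protein 2
        n2,                             # degradation of protein 2
        scale * alpha2 / (1.0 + u**M),  # synthesis of protein 2, repressed by protein 1
    ])


def one_replication(n0, t_final, rng, **params):
    """Steps 3 to 6: one event-driven random walk, returning the state at t_final."""
    n = np.array(n0, dtype=int)
    t = 0.0
    while True:
        a = propensities(n[0], n[1], **params)
        total = a.sum()
        tau = -np.log(1.0 - rng.random()) / total  # exponential waiting time
        if t + tau > t_final:
            return n  # the next event would fall beyond t_final
        t += tau
        k = int(np.searchsorted(np.cumsum(a), rng.random() * total, side="right"))
        n = n + MOVES[min(k, 3)]


def ensemble_statistics(n0, t_final, n_runs, seed=0, **params):
    """Step 7: mean vector and covariance matrix over independent replications."""
    rng = np.random.default_rng(seed)
    finals = np.array([one_replication(n0, t_final, rng, **params) for _ in range(n_runs)])
    mean = finals.mean(axis=0)
    cov = np.cov(finals, rowvar=False, bias=True)  # divides by the number of runs
    return mean, cov


if __name__ == "__main__":
    mean, cov = ensemble_statistics((155, 1), t_final=5.0, n_runs=20,
                                    alpha1=15.6, alpha2=15.6, m=2, M=2, scale=10.0)
    print("means:", mean)
    print("covariance matrix:\n", cov)
```

Recording the state on a grid of intermediate times, rather than only at the final time, would give the full temporal profiles of the means and standard deviations; that extension is omitted here for brevity.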
Results and discussion
The present stochastic analysis of the genetic toggle switch yields the transition probabilities of mutually exclusive events through the definitions of the transition intensity functions of protein production as well as degradation. This analysis renders it possible to formulate the nonlinear master equations of the process as well as to derive the event-driven Monte Carlo simulation. Even though the two were simulated separately, they exhibit interesting analogies.
The stochastic algorithms developed here allow us to analyze the stochastic nature of the two-state toggle switch quantitatively. The master equations governing the numbers of the two types of protein are formulated from a stochastic population balance. The stochastic pathways of the two proteins, i.e., their means and the fluctuations around these means, have been numerically simulated independently by the algorithm derived from the master equations as well as by an event-driven Monte Carlo algorithm. Both algorithms have given rise to essentially identical results. Moreover, these analyses render it possible to circumvent the possibility of noise-induced transitions.
Simulation based on the master equations
Figures 5, 6, and 7 present the temporal profiles of Cases A through C discussed earlier. The left-hand parts of these figures are expanded views of the early portion of the more complete simulations shown on the right. These simulations were conducted with m = M = 2, α″_{1} = 15.6, and the same set of initial conditions, u(0) = 155 and v(0) = 154. These initial conditions correspond to a point below the separatrix in the phase diagram, see Figure 3. The value of α″_{2} is varied to illustrate the characteristics of three different cases of dynamics: in the bistable region, on the bifurcation curve, and in the monostable region. The standard deviation envelopes are plotted around the macroscopic trajectories.
Case A, α″_{2} = 15.6, represents a bistable system, as marked in the bifurcation diagram in Figure 4. Figure 5 presents the simulated results of this system based on the master equations. As expected, the populations eventually reach the stable steady state #2 marked in Figure 3, since the initial conditions constitute a point below the separatrix, and our analysis of the vector field depicting the flow of dynamics [45] suggests this outcome. The protein populations decrease rapidly and stay in the proximity of the saddle node for a while before they depart for their steady states, an observation consistent with classical dynamics. During this period, the populations of the two proteins are very similar to each other. The fluctuations around the mean trajectories increase initially from zero and then decrease as the populations approach the steady states. In a stable system, the standard deviation of the number of either type-1 or type-2 proteins passes through a maximum: the state of the system is usually well defined at the outset of the process, and the uncertainty eventually declines until it vanishes upon stabilization [46]. The uncertainty in the population of the type-2 protein in Figure 5 appears to remain constant; a special computer experiment was conducted with a long simulation time to ensure that it indeed decreases over time.
The modeler is often confronted with a myriad of interacting factors related to a gene’s expression mechanisms before settling on a strategy to assess their impact. A mathematical description of this complex process usually relies on a manageable number of system variables. This lumping procedure inevitably results in a high degree of freedom and in fluctuations, or uncertainties, in the predictions of populations of discrete systems [9]. The behavior of an individual protein molecule in a discrete system with such a high degree of freedom is thus difficult to predict even when the system is monitored experimentally. The parameters in the equations, e.g., the transition intensity functions of the master equation algorithm adopted here, are presumed to depend only on the major variables of the system and to be independent of the variables of secondary importance. Neglecting these secondary variables is, in essence, the source of the internal, or system, or minimal noises that should be appropriately analyzed stochastically. Thus, the internal noises caused by the discrete nature of a system are inherent in the system, and they govern the minimum scattering expected of the random variable of interest. The experimentally observed scattering should always be larger than the predicted scattering induced by internal noises because of inevitable external noises attributable to experimental errors and the imprecision of measuring devices. This implies that we should be cautious not to replicate the experiments excessively in an attempt to reduce the scattering far below what is predicted. It is interesting to note that the fluctuations reported by Gardner et al. are significantly higher than what the master equations predict. The number of cultures used in their fluorescence analysis was 40,000, and the actual number of cultures in the sample is much larger than this number. Therefore, the noise levels reported by Gardner et al. [3] certainly involve not only internal but also external noises. External noises are the fluctuations created in an otherwise deterministic system by the application of a random force, whose stochastic properties are supposed to be known [9].
When they pass the saddle node, the two proteins do not have the well-defined states that a deterministic model depicts. Instead, their populations are probabilistically distributed. The two proteins have not only similar populations but also similar uncertainties in their populations. In fact, as shown in the left-hand side of Figure 5, the uncertainties in their populations are of the same order of magnitude. These characteristics imply that there is a high probability that the relative sizes of the two protein populations are switched when the system approaches the unstable steady state. This switch brings the populations to the region above the separatrix in the phase diagram in Figure 3, and the vector field in that region eventually leads the process to the steady state #1 marked in the same figure. The noise-induced phase transition has been examined in detail by Nicolis and Turner [47], Malek Mansour et al. [48], and Horsthemke and Lefever [49]. Nicolis and Turner have shown that the fluctuations are enhanced at a ‘critical point’ (populations closest to the unstable steady state); the variances are of the order of Ω^{−1/2}, a result consistent with that derived by van Kampen for expanding the master equation by system-size expansion. Thus, systems with low populations are more susceptible to noise-induced transitions. The noise enhancement near the instability is illustrated in Figure 5. Once the system moves away from the instability, the noises decrease and a noise-induced transition becomes more difficult. Internal fluctuations do not change the local stability of the system, and the position of the transition points is in no way modified by the presence of these fluctuations.
It should be mentioned that the parameter values for our simulation are carefully chosen to illustrate the possibility of noise-induced transition. Gardner et al. [3] did not observe this possibility, probably because the populations of their system are very large and the difference between the two protein populations at the critical point is large, as discussed earlier.
Case B, α″_{2} = 4.0, represents a system on the bifurcation line, as marked in the bifurcation diagram in Figure 4. Gardner [3] gives a good exposition of the dependence of the bifurcation diagram and phase diagram on the parameters. There are two steady states on the phase diagram, similar to Figure 3; one is stable and the other, unstable. Figure 6 presents the simulated results of this system based on the master equations. Similar to Case A, the populations eventually reach the stable steady state. The populations do not stay in the vicinity of the saddle node for as long as they do in Case A. Although the two protein populations are very close to each other and the fluctuations are of the same order as the populations during this time period, the existence of a single stable steady state guarantees the system’s final destination.
Case C, α″_{2} = 1.2, is a monostable system, as marked in the bifurcation diagram in Figure 4. Figure 7 presents the simulated results of this system based on the master equations. Similar to Case B, the populations eventually reach the stable steady state. Although the fluctuations of the two protein populations pass through maxima during the evolution of the dynamics, they eventually vanish.
Figure 8 presents the results from a simulation very similar to Case A. It uses the set of parameters of Case A with a different set of initial conditions: u(0) = 14 and v(0) = 154. It is a bistable system, and the initial conditions represent a point above the separatrix in the phase diagram in Figure 3. As expected, the process eventually reaches the steady state #1 shown in Figure 3. Unlike Case A, however, the dynamics do not pass through the proximity of the saddle node, and the fluctuations around the means do not permit easy switching between the two populations, see Figure 8.
Simulation based on Monte Carlo procedure
Monte Carlo simulations have yielded results essentially indistinguishable from those generated from the master equations. This is expected, since the algorithms based on the event-driven Monte Carlo procedure and the master equations derived in the present work are rooted in identical assumptions, i.e., the Markov property and temporal homogeneity of the random variables. These assumptions lead to the definitions of transition intensity functions that are the cornerstones of the formulation of the master equations and of the Monte Carlo procedure.
The fact that the two algorithms have yielded essentially the same results implies that both indeed define the evolution of the dynamic process in a precisely equivalent way. The master-equation algorithm generates the equations governing the statistical moments of the process, which can be readily varied to cover a wide range of initial conditions, whereas the Monte Carlo procedure will require far more computational time and storage space under such circumstances.
Internal noise-induced transition was clearly observed during Monte Carlo simulation for Case A. Figure 9 shows the traces from two independent Monte Carlo simulations with parameters and initial conditions identical to those for Case A. These two independent Monte Carlo simulations result in two different steady states, which is a consequence of internal noise-induced transition. It should be mentioned that the results based on the master equation, as shown in Figure 5, represent the averaged outcome of Ω independent Monte Carlo simulations (Ω = 155 + 154 = 309 for this case), which is indeed observed in our simulation experiments. As mentioned in the last section, systems of small populations are susceptible to large internal fluctuations (or uncertainties) in the evolution of their dynamics. The evolutions of protein statistics shown in Figure 5 also illustrate the large uncertainties after the populations enter the proximity of the saddle node. In fact, the uncertainty is of the same magnitude as the mean number of particles. Internal fluctuations are inherent characteristics of discrete systems that are beyond the regulation of external means. The results on the right-hand side of Figure 9 show a clear transition in protein numbers in a particular Monte Carlo simulation. The transition takes place soon after the populations enter the proximity of the saddle node. It occurs because the populations of both proteins are low and, therefore, they are susceptible to large internal fluctuations and noise-induced transitions. Noise-induced transition has been discussed by Nicolis and Turner [47], Malek Mansour et al. [48], and Horsthemke and Lefever [49].
As mentioned in the ‘Introduction’ section, the master equation and its system-size expansion suffer from a few limitations. One such limitation is that the algorithm is valid only for dynamics well within the boundary of attraction [9]. For bistable dynamics starting in a region outside this boundary, such as Case A, the Monte Carlo simulation converges to two possible steady states, whereas the master equation algorithm converges to only one.
Some comparisons of the three algorithms are worth mentioning. The governing equations for the system-size expansion can be derived in a straightforward manner, though the detailed derivations may be cumbersome and time consuming. The method requires only a minor transformation of variables for some unstable stochastic processes, such as the diffusion process, well beyond the initial transient period [9]. Unlike the Monte Carlo simulation, the derived moment equations can be repeatedly integrated for different sets of parameters and initial conditions. Consequently, system-size expansion has been widely adopted in the derivation of the governing equations of stochastic processes governed by internal noises.
Kurtz’s algorithm is highly compact and convenient. The implementation of the rigorous Kurtz algorithm requires knowledge about the relations among the master, Langevin, and Fokker-Planck equations. It allows direct derivation of the equations governing the moments. However, the algorithm merely describes the dynamics in the initial transient period of unstable systems for selected processes, such as the diffusion process [9].
The Monte Carlo method is easy to implement because it bypasses all derivations of equations. It is most efficient when the number of random variables is large and the master equation is difficult to derive. Repeated simulations have to be carried out for different sets of parameters and initial conditions. The required computational time and disk space are usually high.
Conclusions
The current model adopts the essential concepts of a nonlinear toggle switch model for analyzing a protein-regulated system. The master equation algorithm, along with its system-size expansion, involves the stochastic probability balance of the two types of populations. The resultant master equation should yield not only the deterministic evolution of protein populations during gene expression, but also the fluctuations, or uncertainties, inherent in the prediction or measurement. Kurtz’s limit theorems significantly reduce the complex and laborious exercise of system-size expansion. In fact, they will be indispensable tools for the analysis of highly complex genetic networks.
The validity of the model is amply demonstrated by numerically calculating the evolution of the populations of both types and their fluctuations over time through two simulation algorithms, one based on the master equations and the other based on the event-driven Monte Carlo procedure. These two algorithms are implemented totally independently of each other but with the same set of system parameters, i.e., the transition intensity functions. Hence, it is indeed remarkable that the two algorithms have yielded essentially identical results.
Both simulation results demonstrate the possibility of noise-induced transition when the dynamics pass through the proximity of the saddle node. It happens when the protein populations are low and the noises are of the same order of magnitude as the populations. This property may have practical applications in developing gene therapy, cell cycle control, and protein sensors.
Nomenclature
E, one-step operator; N_{ 1 }, random variable representing population of repressor 1; N_{ 2 }, random variable representing population of repressor 2; n_{ 1 }, realization of random variable, N_{ 1 }(t); n_{ 2 }, realization of random variable, N_{ 2 }(t); N, random vector, i.e., [N_{ 1 }(t), N_{ 2 }(t)]; n, realization of random vector N(t); P_{ n }, probability that the system is at state n at time t; t, time; Y, random variable denoting the fluctuations about macroscopic behavior; y, realization of random variable Y; Q, the transition probability; u, Gardner’s concentration of repressor 1; v, Gardner’s concentration of repressor 2; K_{1}, effective reaction rate for repressor 1 formation; K_{2}, effective reaction rate for repressor 2 formation.
Greek letters
α_{1}, the rate of production of repressor 1; α_{2}, the rate of production of repressor 2; α_{3}, the rate of degradation of repressor 1; α_{4}, the rate of degradation of repressor 2; α′_{1}, the effective rate of synthesis of repressor 1 on system size; α′_{2}, the effective rate of synthesis of repressor 2 on system size; ϕ, macroscopic number of repressor 1; θ, macroscopic number of repressor 2; τ_{ w }, the waiting time; λ, transition intensity function; Ψ, joint probability distribution in terms of random vector Y; Ω, total number of repressors or system size; β, the multimerization constant of repressor 1; γ, the multimerization constant of repressor 2; Θ, the vector representing the two mean numbers of proteins in system-size expansion.
Subscripts
1, repressor 1; 2, repressor 2
Appendices
Appendix 1: system-size expansion
The constituent populations at any time in the genetic toggle switch system can be represented by the random vector N(t) = [N_{1}(t), N_{2}(t)], their mean values can be taken as a deterministic vector Θ(t) = [ϕ(t), θ(t)] and their fluctuations can be taken as another random vector given by Y (t), where Y(t) = [Y_{1}(t), Y_{2}(t)]. As stated in Equations 13 and 14 in the text:
Their realizations of these expressions are given, respectively, by
Accordingly, the joint probability of n_{1} and n_{2}, P_{ n }(t), is now transformed into that of y_{1} and y_{2}, i.e., Ψ_{ y }(t).
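For orientation, the change of variables can be written out as a short worked equation. This is a sketch assuming the standard van Kampen form, which is consistent with the Ω^{1/2} scaling and the Ω^{−1/2} shift of y_{1} used elsewhere in this appendix; the paper's own numbered equations may differ in notation.

```latex
% Assumed van Kampen ansatz (standard form, not copied from the paper's numbered equations)
n_{1} = \Omega\,\phi(t) + \Omega^{1/2}\, y_{1}, \qquad
n_{2} = \Omega\,\theta(t) + \Omega^{1/2}\, y_{2}

% Consequence: a unit step in n_1 at fixed t corresponds to a step of \Omega^{-1/2} in y_1
n_{1} \to n_{1} \pm 1 \;\Longleftrightarrow\; y_{1} \to y_{1} \pm \Omega^{-1/2}
```

The macroscopic parts ϕ and θ scale with Ω while the fluctuations scale with Ω^{1/2}, which is why the relative size of the fluctuations grows as the system size decreases.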
Recall that in the context of deriving the master equation, the state or dependent variable of interest is the joint probability of the population distribution, P_{ n }(t), and the realizations of the random variables at time t, i.e., n_{1} and n_{2}, are invariant with respect to time. Consequently, the time derivatives of Equations 80 and 81 are, respectively,
For the convenience of the subsequent expansion of the master equation, Eq. 11 in the text is restated below
Substituting Equations 82 and 83 into the righthand side of the above expression yields
Without causing confusion, the subscript y of Ψ_{ y }(t) is dropped in the subsequent discussion. The step operators, E_{n_{1}} and E_{n_{1}}^{−1}, convert n_{1} to n_{1} + 1 and n_{1} − 1, respectively. Similarly, Equation 80 suggests that E_{n_{1}} shifts y_{1} to y_{1} + Ω^{−1/2}. Therefore, the operations of the step operators in Equation 84 are equivalent to evaluating the values of the target functions at shifted points through the following Taylor series expansions, i.e.,
Substituting Equations 85 through 89 into 84 yields
In order to collect the terms of the same power of Ω in the subsequent expansion, the Ω dependence of the parameters in the above equation has to be examined, and the parameters converted to their Ω-independent counterparts. The definitions of α_{1} and α_{3} in the ‘Model formulation’ section suggest that they are proportional to the system size, Ω, i.e.,
where α′_{1} and α′_{3} are independent of the system size, Ω. Moreover, the definitions of the equilibrium constants, K_{a} and K_{b}, for the gene repression, G + mR ⇆ GR_{m}, at the beginning of the ‘Model formulation’ section suggest
where K′_{a} and K′_{b} are independent of the system size, Ω.
Substituting Equations 91 and 92 into Equation 90 gives
The first and third terms on the right-hand side of the above expression can be expanded in powers of Ω through known power and binomial expansions. Specifically, for small K′_{a} Ω^{−m} [Ωθ(t) + Ω^{1/2} y_{2}(t)]^{m}, we have
where \binom{m}{i} denotes a binomial coefficient. Lumping the terms of the same power of Ω in the above expansion gives
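The binomial step can be checked numerically. The sketch below assumes the prefactor is Ω^{−m}, so that Ω^{−m}[Ωθ + Ω^{1/2} y_{2}]^{m} reduces to [θ + Ω^{−1/2} y_{2}]^{m}; the function names and the numerical values are illustrative assumptions, not taken from the paper.

```python
from math import comb


def binomial_terms(theta, y2, omega, m):
    """Terms C(m, i) * theta**(m - i) * (omega**-0.5 * y2)**i for i = 0..m.

    Term i carries the factor omega**(-i / 2), so the list is already
    sorted by descending power of omega.
    """
    eps = omega ** -0.5
    return [comb(m, i) * theta ** (m - i) * (eps * y2) ** i for i in range(m + 1)]


def scaled_power(theta, y2, omega, m):
    """Direct evaluation of omega**(-m) * (omega*theta + omega**0.5 * y2)**m."""
    return omega ** (-m) * (omega * theta + omega ** 0.5 * y2) ** m


# Hypothetical values chosen only to exercise the identity.
theta, y2, omega, m = 1.5, 0.7, 400.0, 2
terms = binomial_terms(theta, y2, omega, m)
direct = scaled_power(theta, y2, omega, m)
print("sum of binomial terms:", sum(terms))
print("direct evaluation:    ", direct)
```

The leading term is θ^{m}, the macroscopic contribution, and each later term is smaller by a further factor of Ω^{−1/2}, which is why only the first few terms survive when orders Ω^{1/2} and Ω^{0} are collected.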
Substituting the above expression into the first term on the right-hand side of Equation 93 yields
The expansion of the second term on the right-hand side of Equation 93 gives
Following the same procedure, the third and fourth terms on the right-hand side of Equation 93 can be expanded into the following power series of Ω^{1/2}, respectively:
and
Substituting Equations 96 through 99 into Equation 93 and collecting the terms of order Ω^{1/2} on both sides of Equation 100 yields the macroscopic, or deterministic, equations:
These are Equations 17 and 18 in the text. Collecting terms of order Ω^{0} yields
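To illustrate what macroscopic equations of this kind describe, the sketch below integrates a generic mutual-repression model of the widely used form dϕ/dt = a1/(1 + ka·θ^m) − d1·ϕ and dθ/dt = a2/(1 + kb·ϕ^m) − d2·θ. This functional form, the parameter names (a1, a2, d1, d2, ka, kb, m), and all numerical values are assumptions made for illustration; they are not Equations 17 and 18 or the published data used in the text.

```python
def toggle_rhs(phi, theta, a1=10.0, a2=10.0, d1=1.0, d2=1.0, ka=1.0, kb=1.0, m=2):
    """Right-hand side of a generic two-repressor toggle model (hypothetical form)."""
    dphi = a1 / (1.0 + ka * theta ** m) - d1 * phi      # synthesis repressed by theta
    dtheta = a2 / (1.0 + kb * phi ** m) - d2 * theta    # synthesis repressed by phi
    return dphi, dtheta


def integrate(phi0, theta0, dt=0.01, steps=5000):
    """Explicit Euler integration from (phi0, theta0) over steps * dt time units."""
    phi, theta = phi0, theta0
    for _ in range(steps):
        dphi, dtheta = toggle_rhs(phi, theta)
        phi, theta = phi + dt * dphi, theta + dt * dtheta
    return phi, theta


# Two initial conditions on opposite sides of the diagonal phi = theta.
high_phi = integrate(5.0, 1.0)
high_theta = integrate(1.0, 5.0)
print("start (5, 1) ->", high_phi)
print("start (1, 5) ->", high_theta)
```

With these illustrative parameters the two runs settle into different steady states, one dominated by each repressor. This bistability is the deterministic backdrop against which the fluctuation terms of order Ω^{0} act.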
This equation can be rearranged into a linear Fokker-Planck equation form [9] as follows:
or
where
Equations 103, 104, and 105 are Equations 21, 22, and 23 in the text, respectively.
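For reference, the generic structure of a linear Fokker-Planck equation and the moment equations it implies are sketched below. The symbols A (drift matrix), B (diffusion matrix), and Ξ (covariance matrix of the fluctuations) are placeholders introduced here, not the paper's notation; for the toggle switch their entries would be read off from the equation obtained at order Ω^{0}.

```latex
\begin{align*}
\frac{\partial \Psi}{\partial t} &= -\sum_{i,j} A_{ij}(t)\,\frac{\partial}{\partial y_i}\bigl(y_j \Psi\bigr)
 + \frac{1}{2}\sum_{i,j} B_{ij}(t)\,\frac{\partial^2 \Psi}{\partial y_i\,\partial y_j},\\
\frac{d\langle y_i\rangle}{dt} &= \sum_j A_{ij}(t)\,\langle y_j\rangle,\\
\frac{d\,\Xi}{dt} &= A\,\Xi + \Xi\,A^{\mathsf T} + B,
 \qquad \Xi_{ij} = \langle y_i y_j\rangle - \langle y_i\rangle\langle y_j\rangle.
\end{align*}
```

Because the drift is linear in y and the diffusion does not depend on y, the equations for the means, variances, and covariance close on themselves and form a set of ordinary differential equations.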
Appendix 2: distribution functions of waiting time
Equation 9 in the text indicates that the probability of no transition in the small time interval, (τ_{w}, τ_{w} + Δτ_{w}), is
The Markov property implies that succeeding time intervals, (0, τ_{w}) and (τ_{w}, τ_{w} + Δτ_{w}), are independent of each other [30], thus
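The two statements above can be sketched as follows, under the assumption that Equation 9 has the usual first-order form in which a transition occurs in a short interval with probability λΔτ_{w} + o(Δτ_{w}). The symbol G(τ_{w}), denoting the probability that no transition occurs in (0, τ_{w}), is a placeholder introduced here.

```latex
\begin{align*}
\Pr\bigl[\text{no transition in } (\tau_w,\ \tau_w + \Delta\tau_w)\bigr]
 &= 1 - \lambda\,\Delta\tau_w + o(\Delta\tau_w),\\
G(\tau_w + \Delta\tau_w) &= G(\tau_w)\,\bigl[\,1 - \lambda\,\Delta\tau_w + o(\Delta\tau_w)\bigr].
\end{align*}
```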
Rearranging the above equation yields
Dividing the expression by Δτ_{w} and taking the limit as Δτ_{w}