## Abstract

Engineered bacterial sensors have potential applications in human health monitoring, environmental chemical detection, and materials biosynthesis. While such bacterial devices have long been engineered to differentiate between combinations of inputs, their potential to process signal timing and duration has been overlooked. In this work, we present a two‐input temporal logic gate that can sense and record the order of the inputs, the timing between inputs, and the duration of input pulses. Our temporal logic gate design relies on unidirectional DNA recombination mediated by bacteriophage integrases to detect and encode sequences of input events. For an *E. coli* strain engineered to contain our temporal logic gate, we compare predictions of Markov model simulations with laboratory measurements of final population distributions for both step and pulse inputs. Although single cells were engineered to have digital outputs, stochastic noise created heterogeneous single‐cell responses that translated into analog population responses. Furthermore, when single‐cell genetic states were aggregated into population‐level distributions, these distributions contained unique information not encoded in individual cells. Thus, final differentiated sub‐populations could be used to deduce order, timing, and duration of transient chemical events.

## Synopsis

A synthetic temporal logic gate is combined with stochastic single‐cell digital response to achieve population‐level encoding of the order, timing, and duration of two chemical inputs.

Bacteriophage integrases can be used to create a synthetic temporal logic gate that records the timing and duration of two chemical inputs. Based on the sequence of events, single cells have four possible final genetic states.

While single cells have digital states, stochastic noise translates into population distributions that are analog with response to timing and duration of input signals.

A combination of mathematical model simulations and

*in vivo E. coli*experiments was used to show that the final proportion of cells in each state provides detailed information about past events that cannot be deduced from single cell analysis alone.The stochasticity and delays inherent in biological systems can and should be utilized to provide additional avenues of information and control in synthetic systems.

## Introduction

Engineered bacteria could one day be powerful self‐replicating biosensors with environmental, health, and industrial applications. Synthetic biology has made important strides in identifying and optimizing genetic components for building such devices. In particular, much work has focused on Boolean logic gates that detect the presence or absence of static chemical signals (Gardner *et al*, 2000; Anderson *et al*, 2007; Wang *et al*, 2011; Moon *et al*, 2013; Shis *et al*, 2014) and compute a digital response.

Temporal logic gates, which process time‐varying chemical signals, have been much less explored. Pioneering work by Friedland *et al* (2009) used serine integrase‐based recombination for the counting and detection of sequential pulses of inducers. But thus far, no work has studied the potential for temporal logic gates to provide information about the duration of a signal, or the time between two chemical events. Here, we present a temporal logic gate that allows us to infer analog signal timing and duration information about the sequential application of two inducer molecules to a population of bacterial cells.

Similar to previous temporal logic gates, our design takes advantage of the irreversibility of serine integrase recombination. While bistable switches have been successfully deployed as memory modules in genetic circuits (Kotula *et al*, 2014), such switches require constant protein production to maintain state and are sensitive to cell division rates and growth phase. The large serine integrases, on the other hand, reliably and irreversibly flip or excise unique fragments of DNA (Yuan *et al*, 2008). Thus, logic circuits built from integrases intrinsically include DNA‐level memory that requires virtually no cellular resources to maintain state, thus enabling permanent and low‐cost genetic differentiation of individual bacterial cells based on transient integrase induction. Further advantages of the serine integrases include the short length (40–50 bp) and directionality of their attachment sites. Serine integrases recognize flanking DNA binding domains (attB, attP) and subsequently digest, flip or excise, and re‐ligate the DNA between the attachment sites. Flipping or excision activity is determined by the relative orientation of the sites, which allows complex orientation‐dependent behavior to be programmed into integrase circuits. Well‐known serine integrases include Bxb1, TP901‐1, and ΦC31, all of which have been used to demonstrate static‐input logic gates (Bonnet *et al*, 2013; Siuti *et al*, 2013), and some have cofactors that can reverse directionality (Khaleel *et al*, 2011; Bonnet *et al*, 2012). Recently, an entirely new set of 11 orthogonal integrases was characterized, greatly expanding the set of circuits that can be built (Yang *et al*, 2014).

In contrast to previous studies of temporal logic gates, our work leverages the stochastic nature of single‐cell switching to create a robust population‐level response to a time‐varying chemical signal. The fundamental nature of living cells that makes them so attractive for engineering—their extremely low energy operation in the limit of using small numbers of molecules to represent information—is also inextricably linked to stochasticity and noise. By traditional engineering standards, synthetic circuits would ideally perform identically within every cell of a population. When this ideal is applied to biology, the stochastic nature of molecular processes, particularly at low‐copy numbers, presents a significant barrier to reliable outputs from engineered cells. Thus, while natural cellular dynamics and differentiation take advantage of noisy gene expression (Elowitz *et al*, 2002; Süel *et al*, 2007), synthetic circuits often require noise reduction for proper function (Dunlop *et al*, 2008). Recent work has taken a different direction, toward understanding of population‐level dynamics. This includes analysis of both stochastic cellular responses to inputs (Uhlendorf *et al*, 2012; Ruess *et al*, 2015) and changes in collective population‐level memory in response to stress (Mathis & Ackermann, 2016). Such efforts suggest that a deeper understanding of the inherent heterogeneity in biological systems might eventually lead to circuit designs that operate on distributions of cellular responses, rather than depending on homogeneous responses from all cells.

It is with this vision in mind that we designed a two‐input temporal logic gate using strategically interleaved and oriented integrase (Bxb1, TP901‐1) DNA recombination sites and used this gate to engineer an *E. coli* strain with four possible genetically differentiated end states. This strain contains single genomic copies of the temporal logic gate, ensuring digital‐yet‐stochastic responses from individual cells. We then utilized the heterogeneity of individual cellular responses to encode sequences of chemical inputs into the overall population response and used a stochastic model of single‐cell trajectories to predict the population response. By analyzing the distributions of final cell states, we can deduce the timing and pulse duration of transient chemical pulses and show that cumulative population‐level distributions contain additional event information not encoded in any single cell. Furthermore, because the states are genetically encoded, we can recover details of a chemical event long after its occurrence.

## Results

### Design of a two‐integrase temporal logic gate

We have designed a two‐input temporal logic gate that differentiates between the start times of two chemical inputs and produces unique outputs accordingly (Fig 1A). The design relies on a system of two integrases with nested DNA attachment sites (Fig 1B). The use of integrases irreversibly inverts segments of DNA, resulting in a memory feature that can be maintained for multiple generations (Bonnet *et al*, 2012).

The design of the integrase temporal logic gate hinges on interleaving the attB attachment site of integrase B (intB) with the attP site of integrase A (intA), thus ensuring that the possible DNA flipping outcomes are mutually exclusive (Fig 1B). The serine integrases used in this design are TP901‐1 (intA) and Bxb1 (intB). The fluorescent proteins mKate2‐RFP (RFP) and superfolder‐GFP (GFP) are used both as placeholders for future downstream gene activation and as real‐time readouts of the logic gate. The design also features a terminator (Bba‐B0015) and a strong constitutive promoter (P7). In the case where there are no inputs, the terminator prevents expression of RFP from the constitutive promoter.

There are five possible basic events that could occur in a two‐input system (Fig 1C): no input, inducer **a** only (*E*_{a}), inducer **b** only (*E*_{b}), inducer **a** followed by **b** at a later time (*E*_{ab}), and inducer **b** followed by **a** at a later time (*E*_{ba}). Consequently, in a perfectly resolved temporal logic gate, there should be five unique DNA states corresponding to the five types of events: *S*_{o} (the initial state), *S*_{a}, *S*_{b}, *S*_{ab}, and *S*_{ba}. This design is limited to only four DNA states due to excision when *E*_{b} occurs (*S*_{b} = *S*_{ba}). The two fluorescent outputs correspond to the two states that occur when inducer **a** is detected first—RFP is produced when the cell is in state *S*_{a}, and GFP is produced when the cell is in state *S*_{ab}.

Fig 1D illustrates the sequence of recombination that occurs during an event *E*_{ab} that results in DNA state *S*_{ab} and the production of GFP. Upon addition of inducer **a** at time *t*_{1}, TP901‐1 flips the DNA between its attachment sites, reversing the directionality of the terminator and the Bxb1 attB recognition site (state *S*_{a}). Then, when inducer **b** is added at some time *t*_{2} that is greater than *t*_{1}, the directionality of the Bxb1 sites is such that the DNA is flipped to reverse the directionality of the P7 constitutive promoter (state *S*_{ab}). If inducer **b** is added first (Fig 1E), the Bxb1 attachment sites are unidirectional—a configuration that results not in recombination, but in excision of the DNA between the sites (state *S*_{b}).

Once DNA recombination has occurred, it is irreversible. The unique attB and attP attachment sites are recombined into attL and attR sites, respectively, with which the integrases cannot bind to without additional cofactors (Ghosh *et al*, 2005). The nesting of the integrase attachment sites is the key design feature that produces the temporal *a**then *** b** logic, and the irreversibility of the recombination records the event in DNA memory. The result is a genetic record that can both be sequenced later and immediately read via constitutive production of fluorescent outputs.

### A Markov model for integrase recombination

The most compelling advantage of engineered biological systems over man‐made sensors lies in their inherent capabilities for replication and parallel sensing with minimal energy and resource requirements. Thus, deployment of synthetic bacterial devices would almost certainly involve populations of cells, never just a single cell. It is therefore important to understand how stochastic single‐cell responses affect overall population‐level distributions and outcomes. We created a Markov model of integrase‐mediated DNA flipping and then used a stochastic simulation algorithm (Gillespie, 1977) to simulate individual cell trajectories (Fig 2A). All of the four possible DNA states are represented in the model: the original state (*S*_{o}), the intB excision state (*S*_{b}), the intA single flip state (*S*_{a}), and the *a**then *** b** double flip state (

*S*

_{ab}). We have implemented the system experimentally by chromosomally integrating the target DNA into the genome of the

*E. coli*cell. This allows us to assume that each cell only has one copy of the temporal logic gate (Haldimann & Wanner, 2001) and that each cell can be characterized by the tuple (DNA; IntA; IntB) (Fig 2B). The DNA terms are

*S*

_{o},

*S*

_{a},

*S*

_{b}, or

*S*

_{ab}, and IntA and IntB are non‐negative integers representing the molecular copy number of each integrase. Once a DNA cassette has flipped into any of the states other than the original state

*S*

_{o}, there is no reverse process. The logic gate is designed such that if integrase B is expressed prior to integrase A, the DNA cassette is excised and the chain reaches the dead‐end

*S*

_{b}state. In order for a cell to successfully detect

*E*

_{ab}, it first needs to switch into state

*S*

_{a}then transition into state

*S*

_{ab}upon addition of inducer

**b**.

Since each cell contains only a single copy of the temporal logic gate DNA, we can expect each cell to behave differently and to be highly susceptible to internal and external noise. This stochastic behavior will create a heterogeneous population response that can be analyzed for a more complex profile of event than if all the cells behaved uniformly. In order to capture the heterogeneity of cell population, we model the temporal logic gate using a stochastic model. Specifically, the stochastic transitions between the DNA states and the production/degradation of integrases are mathematically modeled by a continuous‐time Markov chain over the state space (DNA; IntA; IntB) as illustrated in Fig 2B. Definitions of transition rates can be found in Appendix Table S1.

*In silico*, the dynamics of a single cell translates to each stochastic simulation of the Markov model starting with (DNA = *So*; IntA = 0; IntB = 0) state. We define ;;; and as the probability that the DNA state of a single cell is *S _{o}*;

*S*;

_{a}*S*and

_{b};*S*at time

_{ab}*t*, respectively.

The temporal dynamics of the probability can be modeled by the following ordinary differential equation (ODE) (see equation (1)).

(1)

Where the notation stands for the conditional expected value at time *t* (Full derivation, Appendix Section 12.1).

Serine integrases are produced as monomers that form dimers, search for specific attB and attP sequences, and, once both attB and attP sites are occupied, form a tetramer (dimer of dimers) that digests, flips, and re‐ligates the DNA (Yuan *et al*, 2008; Rutherford *et al*, 2013). Though some cooperativity in ΦC31 binding to attB has been found (McEwan *et al*, 2009), cooperativity in Bxb1 or TP901‐1 integrase binding to attB and attP not been observed (Ghosh *et al*, 2005; Singh *et al*, 2014).

Rather than account for all individual DNA‐integrase interactions, we have created a minimal model of stochastic transitions where only the final DNA states (*S*_{o}, *S*_{a}, *S*_{b}, *S*_{ab}) and the number of integrase monomer molecules (intA, intB) are tracked and all integrase activity is encompassed in the term. Since no cooperativity has been observed in Bxb1 or TP901‐1 DNA binding (i.e., occupation of attB does not increase the probability of attP binding), we represent the required tetramerization as a fraction where flipping efficiency is zero unless at least four molecules are present. Thus, the propensity functions for state transitions as a function of integrase concentration, , are defined in equation (2)

(2)

where is integrase concentration; is the dissociation constant; is the rate of flipping if the tetramer is formed; *i* = 1; 2; 3; and * = A;B (See Appendix Fig S1 for visualization of and Appendix Section 12.2 for full derivation).

We also define the time between the introduction of the first inducer () and the arrival of the second inducer () as the inducer separation time (), such that (3) as shown in Fig 2C.

In the following set of simulations and experiments, we will consider cases with step inputs (Fig 2C), where the inducers are either present or not present. Concentrations of the inducers when they are “on” will be held constant. Also, it is important to note that inducer **a** is still present during and after time Δt when inducer **b** is introduced.

Simulations of the Markov model were done with biologically plausible parameters in order to predict qualitative circuit behavior (Appendix Table S1). We limited the parameters to only the basic processes (integrase production, degradation, and DNA flipping), and parameter values were chosen to be within biological orders of magnitude. The single production rate constants, and , combine the transcription and translation rates of each integrase. When an integrase in the model is induced, its production rate, , is the sum of and any leaky transcriptional expression, (*= intA or intB). The integrase monomer disassociation constant, , was estimated from measured Bxb1 binding constants (Singh *et al*, 2013). Parameter values for preliminary simulations were , (2.3 h half‐life), , , and molecules.

Our analysis of initial numerical simulation results highlights the significant role that the inducer separation time, ∆*t*, plays in setting the final population distributions (Fig 2D). For each ∆*t*, individual cell trajectories were generated with the assumption that each cell only has one copy of the target DNA (*N* = 5,000 trajectories). Then, at every time point, the total number of cells in each DNA state is counted (Appendix Fig S2). Fig 2D shows the contrast between adding both inducers simultaneously (∆*t* = 0 h) and adding inducer **b** after a 5‐h delay (∆*t* = 5 h). Since both inducers are present by the end of simulation, all of the cells must have a final state that is either the *S*_{ab} state or the *S*_{b} state. No cells remain in the original *S*_{o} configuration. *S*_{a} is a transient state that builds up prior to the addition of inducer **b** and begins to convert to *S*_{ab} immediately after the introduction of **b**. These initial simulation results suggest that ∆*t* may be a way to reliably tune the final population fractions of *S*_{ab} versus *S*_{b} state cells.

### Population distributions reflect inducer order and separation time

We used the model to further investigate the effects of varying both inducer order and separation time on population distributions in our experimental system and to understand the possible outcomes. In Fig 3, we simulate *in silico* cell populations that have been exposed to a sequence of overlapping step functions (*N* = 5,000 trajectories).

In the case of an *E*_{ab} event, the proportion of cells that successfully detect *a**then *** b** and switch to state

*S*

_{ab}is a function of the inducer separation time, ∆

*t*(Fig 3A). High ∆

*t*means increasing the time that cells spend in only inducer

**a**, allowing for most of the population to transition from

*S*

_{o}→

*S*

_{a}before the addition of any inducer

**b**. Exposing cells to the inverse sequence of events,

*E*

_{ba}, results in a decrease of

*S*

_{ab}cells proportional to increasing ∆

*t*(Fig 3B). High ∆

*t*in an

*E*

_{ba}event means that

*S*

_{o}→

*S*

_{b}is the dominant reaction and cells that are partitioned into

*S*

_{b}will not respond to inducer

**a**. If we plot the final number of

*S*

_{ab}cells from both

*E*

_{ab}and

*E*

_{ba}as a function of ∆

*t*(Fig 3C), we see that the two curves do not overlap.

*S*

_{ab}fractions exposed to

*E*

_{ab}increase monotonically with ∆

*t*, while those exposed to

*E*

_{ba}decrease monotonically with ∆

*t*. Thus, measuring the fraction of

*S*

_{ab}cells by itself is sufficient to determine both the order of events and the timing, ∆

*t*, between them.

Additionally, we can define a detection limit, ∆*t*_{90}, for which the inducer separation time results in ≥ 90% of the population switching into the *S*_{ab} state (Fig 3C). This ∆*t*_{90} limit provides a way to capture the two response regimes of the population. If the inducer separation time is less than the detection limit (∆*t *< ∆*t*_{90}), then the rate of population switching is fast enough such that the number of *S*_{ab} cells will correspond uniquely to some ∆*t* value. If ∆*t *> ∆*t*_{90}, then most cells have already switched to a final state, and the differences in *S*_{ab} cell count are too small to uniquely determine ∆*t*.

The single‐cell limitations of the temporal logic gate circuit can be overcome by measuring the number of *S*_{ab} cells as a fraction of total cells. Though the logic gate itself does not have a unique genetic *S*_{ba} state and cannot distinguish between a *b**only* event versus a *b**then *** a** event, these simulation results suggest that population‐level fractional phenotypes can provide this additional information (Fig 3D). In the case of

*E*

_{ab}, fractions of

*S*

_{ab}will always be above 50%, while

*S*

_{ab}fractions less than 50% indicate

*E*

_{ba}. Additional figures showing how populations of

*S*

_{a},

*S*

_{b}, and

*S*

_{o}cells change with ∆

*t*can be found in Appendix Figure S3.

*In vivo* step induction data supported model predictions and showed that population fractions of *S*_{ab} cells could be tuned using ∆*t* (Fig 4). DH5α‐Z1 cells were chromosomally integrated with one copy of the integrase target DNA and then transformed with a high‐copy plasmid containing Ptet‐Bxb1 and PBAD‐TP901‐1. When ∆*t* was varied from 0 to 8 h, we observed results qualitatively similar to model predictions. In Fig 4A, the cells have been exposed to an *E*_{ab} event, where inducer **a** is present from time *t* = 0 h to *t*_{end}, and **b** is present from *t* = ∆*t* h to *t*_{end}. GFP expression during time course measurements is used as a proxy for *S*_{ab} state cells, and flow cytometry was used to measure final populations. Comparisons of bulk fluorescence versus cytometry cell counts suggest that in single‐copy integrants, overall GFP fluorescence is a good approximation of population *S*_{ab} levels (Appendix Fig S12).

Source Data for Figure 4 [msb156663-sup-0004-SDataFig4.xlsx]

In Fig 4A, the number of cells in the GFP‐expressing *S*_{ab} state increases proportionally with increasing ∆*t* and continues to be responsive even when the two inducers are separated by 8 h. There is some expression of GFP in the presence of only inducer **a** (*E*_{a}), indicating some basal levels of intB. RFP expression, a proxy for the number of cells in state *S*_{a}, begins to increase at *t* = 0 h and drops at time *t* = ∆*t* when inducer **b** is added (Appendix Fig S4A). Aligning all of the GFP expression curves by ∆*t* (Appendix Fig S5) shows that lower values of ∆*t* not only have lower final GFP expression values, but also have slower rates of GFP production. This is consistent with modeling results because if we assume inducer **b** has an equal probability of entering any one cell, then in case of small ∆*t* (∆*t*_{90} ≤ 4 h), there are a much larger number of *S*_{o} cells and so the rate of *S*_{a} → *S*_{ab} state conversion will be lower. In the case of ∆*t* > 4 h, the majority of cells in the population are already in the *S*_{a} state configuration, and so the rate of cell state conversion to *S*_{ab} will be much higher. When cells are exposed to *E*_{ba}, the number of *S*_{ab} cells decreases monotonically with increasing ∆*t* (Fig 4B), and there is no RFP expression above background (Appendix Fig S4B). In both types of events, the cells maintained their state for up to 30 h in liquid culture and when re‐streaked as single colonies. (Additional data with a more distinct color scheme and OD curves for this set of experiments can be found in Appendix Figs S6 and S7. Single‐colony analysis in Appendix Fig S11.)

Final *S*_{ab} (GFP) population fractions are sufficient to differentiate between populations that have been exposed to *E*_{ab} versus *E*_{ba} within 1 h of separation time between inducers (Fig 4C). Final populations after 30 h of growth were measured via flow cytometry and plotted against ∆*t*. As ∆*t* increases, so does the *S*_{ab} sub‐population. The cells that encountered *E*_{ba} have lower *S*_{ab} fractions with high ∆*t*, and at ∆*t* = 6 h, the final *S*_{ab} sub‐population is equal to the baseline expression of the *b**only* population, indicating that the addition of inducer **a** after a 6‐h exposure to only inducer **b** has no effect at all. Based on where the GFP fraction exceeds 90% of the maximum *S*_{ab} population fraction, the ∆*t*_{90} detection limit for the experimental system is ~4 h. These experimental results show that the *S*_{ab} population fraction clearly diverges for *E*_{ab} and *E*_{ba} when ∆*t * 0 h, indicating that *S*_{ab} fractions alone can be used to determine both event order and separation time.

Further analysis of population‐level data for all of the measurable fluorescent cell states can provide additional insights into differences in sub‐population growth rates and leaky integrase expression (Fig EV1, Appendix Figs S8–S10). In Fig EV1, experimental populations from the step input experiments have been gated into quadrants such that *S*_{ab}, *S*_{a}, and *S*_{o} + *S*_{b} populations can be counted. Even with maximum induction at highest ∆*t*, the maximum population fraction that can be switched appears to be approximately 60% of the total population. We believe this is due to the non‐fluorescent cells (*S*_{o}, *S*_{b}) having a slight growth advantage over differentiated cells. Studies have shown that unnecessary protein production has inverse effects on cell growth (Tan *et al*, 2009; Scott *et al*, 2010), and even with single‐copy integrants, this would result in some overrepresentation of non‐fluorescent sub‐populations within the population. Single‐colony analysis of the final populations shows that *S*_{o} cells persist in the population even with 30–40 h of inducer exposure (Appendix Fig S11E).

Leaky expression of intA and intB can also be inferred from the *no inducer*,** a**

*only*, and

*b**only*populations (Fig EV1A and B), and we can conclude that leaky expression is quite low, not exceeding ~0.5–3%. Even accounting for the overrepresentation of non‐fluorescent cells, the baseline population split when both

**a**and

**b**are added simultaneously (∆

*t*= 0 h) is just under 50% of the total GFP population fraction. This suggests that the integrase flipping rates, and , may not be equal and that the basal expression rates, should be nonzero.

### Varying model parameters for integrase activity and basal expression

Prior to proceeding with additional model‐driven experimental designs, model parameters were modified to better represent asymmetrical integrase activity. The parameters for integrase flipping and leaky basal expression were tuned to account for the asymmetrical population responses to *E*_{ab} versus *E*_{ba} events (Fig 4C). We hypothesized that this asymmetry arises from a combination of unequal integrase activity when searching for and flipping the DNA, as well as leaky background expression of the integrases (Fig 5).

To understand overall trends in model behavior, we varied and while holding the other parameters constant. When the relative flipping efficiency of intA () was varied from 0.2 to 0.5 h^{−1} (*k*_{flipB} = 0.3 h^{−1}), we observed a bias in the baseline population split when both inducers are introduced simultaneously, ∆*t* = 0 h (Fig 5A, *N* = 3,000). Previously in the preliminary model (Fig 3C), the two integrases were assigned equal flipping rates, and the population split was expected to be 50/50 for *S*_{ab}/*S*_{b}. As the flipping rate of intA decreases relative to that of intB, that baseline shifts downwards to favor the more active integrase, intB. Varying the basal expression of intB () from 1% to 20% of the intB production rate () monotonically decreases the maximum *S*_{ab} population fraction that can be reached in an *E*_{ab} event (Fig 5B, *N* = 3,000). If there is a constant level of un‐induced intB, then there will always be a minimum population of *S*_{b} cells inhibiting the maximum fraction of *S*_{ab} cells.

These simulation results showed that by varying and , we could tune the baseline shift at ∆*t* = 0 and the maximum *S*_{ab} ceiling at high ∆*t* to better approximate our experimental system. However, experimental measurements of leaky integrase expression showed that background expression was actually quite low (1% for intA, 1–3% for intB) (Figs 4C and EV1, *b**only*,** a**

*only*). Given actual measurements for , we constrained those parameters and fit the model by varying .

In order to find the best pair of values for and , the flipping efficiency parameters for both integrases were varied from 0.1 to 0.6 h^{−1} *in silico* (*N* = 500 cell trajectories), creating a matrix of simulated *S*_{ab} population fractions for each combination (Appendix Fig S13). Leaky basal expression of the integrases was held constant based on experimentally measured values ( = 1% of , = 2% of ), and experimental data were normalized to a 70% population maximum for fitting purposes. Mean squared error was found by comparing model fits with experimental data (Appendix Fig S13A), and the combination with the minimum MSE was chosen (Appendix Fig S13B).

Fig 5C shows ∆*t* versus *S*_{ab} population simulation results for final revised parameters. The final parameters were set to be , , , and (Appendix Table S2). The introduction of leaky integrase expression into the model suggests that due to leaky expression of intB, around 2% of the population will “detect” *E*_{ab} and be in state *S*_{ab} even when no inducer **a** has been introduced. Additionally, preliminary simulation results suggest that the ∆*t*_{90} detection limit can be tuned by increasing or decreasing the overall production rate (*= A or B) (Appendix Fig S14), though this remains to be experimentally verified in future work.

*In silico* parameter space exploration shows that varying and parameters enables tuning of baseline ∆*t* = 0 h split for *E*_{ab}/*E*_{ba} and the maximum ceiling for *S*_{ab} population fraction. Fold‐change variations in relative rates allowed us to understand overall trends in the final populations, and we adjusted the model to account for inequalities in integrase flipping and leaky basal expression. Since leaky expression was measured to be small, we primarily tuned flipping rates. This process led us to more relevant model‐informed predictions of experimental outcomes. With the refined model, we were interested to see whether distributions of the RFP‐expressing *S*_{a} state could provide information that measuring *S*_{ab} fractions alone could not.

### Deducing pulse width from *S*_{a} population fractions

Using the fraction of *S*_{ab} (GFP) cells alone, we can determine ∆*t* values up to a ∆*t*_{90} limit for any given sequence of two step inputs. Now consider a pulse type of event, in which inducer **a** begins at time *t* = 0 h, remains constant throughout, and inducer **b** is introduced as a finite pulse at time *t* = ∆*t* h (Fig 6A). The start time of inducer **a** then becomes a reference for when the entire system is activated and ready to detect inducer **b**. Cell states are measured via flow cytometry at time *t*_{end}, where *t*_{end} > 24 h. Modeling results presented in this section are using the refined set of parameters defined in Fig 5C and Appendix Table S2.

If either of the two inducers is present in the media to some limit *t*_{end}, we would expect all of the *S*_{o} cells to end up in one of two populations (Fig 6B). Cells that encounter inducer **b** first will be in the *S*_{b} state, while cells that encounter **a** first will either be in the *S*_{a} or *S*_{ab} states. In the previous sections, once an inducer was added to the population, it was not removed, and the assumption was made that at times > 24 h, only a negligible number of *S*_{o} cells remained. This type of step function induction also meant that only the final number of *S*_{ab} cells (GFP) was needed to uniquely determine the separation time ∆*t* because *any and all* cells that had switched to *S*_{a} would eventually become *S*_{ab}.

However, in the case of a transient pulse, some cells that are in the *S*_{a} state (RFP) will not ever encounter inducer **b**. Assuming that is small, these cells will remain in the *S*_{a} state. Therefore, the population of *a**first* cells equals *S*_{a} + *S*_{ab}. We simulated a matrix of populations exposed to varying inducer separation times (∆*t*) and inducer **b** pulse widths (*PW*_{b}) to measure the resolution of detectable events (Fig 6C). In simulation (Fig 6D), we can see that the two populations mirror each other to add up to 100% of the total cells (*N* = 3,000 cells, additional simulations in Appendix Fig S16).

Given that the step induction of **b** is equivalent to a pulse of infinite length (*PW*_{b} = ∞) and our prior experimental evidence showed that virtually no cells remain in state *S*_{a} when *PW*_{b} = ∞, we reasoned that the final number of *S*_{a} cells could be used to deduce information about the pulse width of **b**. This hypothesis was tested *in silico* by running a matrix of simulations with varying ∆*t* and *PW*_{b}. In Fig 6E, we see that the fraction of *S*_{a} cells over the total number of cells decreases monotonically with increasing *PW*_{b}, and the curves overlap regardless of ∆*t*. The overlap occurs despite nonzero leaky expression of intA and intB. The maximum number of *S*_{a} cells does not go to 1 at *PW*_{b} = 0 h because of leaky intB expression ( = 0.02 ).

Analytically, we solved equation (1) for to ensure that the *S*_{a} population fraction is only dependent on *PW*_{b}. If inducer **a** is used as a constant reference signal, all cells transition into one of *S*_{a}; *S*_{b}; or *S*_{ab} states, thus . If we assume that the basal leaky expression of intB is zero ( = 0), holds for , since there is no intB to switch the DNA state into *S*_{b} or *S*_{ab}. Thus, we can show that is dependent only on *PW*_{b}, the duration of the pulse width of inducer B, for *t* > ∆*t*. This conclusion holds as long as is negligibly small compared to other kinetic constants (, , , , and ) (See Appendix Section 12.3 for full derivation).

If *S*_{a} population fractions can be modulated by changing *PW*_{b}, then conversely, we should be able to use measured experimental RFP population fractions as a way to determine *PW*_{b}. Once *PW*_{b} is known, then the *S*_{ab} fraction can be used to uniquely determine the time between inducers, ∆*t* (Fig 6F). Furthermore, the genetically encoded state means that these population fractions should be maintained and measurable at a time, *t*_{end}, that is much later than the time of the events.

These conclusions can be extended in simulation to create a scatterplot of *S*_{a} cells versus *S*_{ab} cells in a population (Fig 7A) over an 11 × 11 parameter matrix varying ∆*t* and *PW*_{b} from 0 to 6 h in increments of 0.5 h (Additional plots in Appendix Fig S17). Each point on the chart in Fig 7A represents a simulated population (*N* = 3,000) exposed to a unique combination of ∆*t* and *PW*_{b} values. Vertical lines represent the same *PW*_{b} value, and points with the same shape and color have the same ∆*t* value. The simulation results suggest sufficient resolution of events as long as *PW*_{b} and ∆*t* values are between 0 and 4 h. For any single value of *PW*_{b}, we can follow the increasing ∆*t* values vertically and see that the population response saturates after 4.5 h resulting in overlapping between populations with 4.5 < ∆*t* < 6 h. We can trace any individual ∆*t* value horizontally from right to left and observe that the points begin to cluster and overlap when 4.5 < *PW*_{b} < 6 h. These simulation data suggest that there should be some defined detection range of ∆*t* and *PW*_{b} where every possible combination of the two is uniquely identifiable.

Source Data for Figure 7 [msb156663-sup-0005-SDataFig7BC.xlsx]

Experimentally, we tested a 7 × 7 matrix of varying ∆*t* and *PW*_{b} (0–6 h, 1 h increments) on independent populations of the temporal logic gate *E. coli* strain (Fig 7B). All populations, except for the control, were exposed to inducer **a** (L‐ara 0.01%/vol) at time *t*_{0} to *t*_{end}. Pulses of inducer **b** (aTc, 200 ng/ml) were achieved by sampling 5 μl of the population and diluting 1:100 into fresh media with only inducer **a** (M9CA + 0.01%/vol L‐ara). Populations were collected and measured via flow cytometry after 24 additional hours of growth in inducer **a** (~36 h after start of experiment) (Fig EV2). For all values of ∆*t*, the number of *S*_{a} cells (RFP) is highest when there is no exposure to inducer **b** (*PW*_{b} = 0 h) and decreases monotonically as a function of *PW*_{b} (Fig 7B, top). We see a more pronounced separation of the ∆*t* curves when we look at *S*_{ab} (GFP) cell fractions (Fig 7B, bottom). The number of *S*_{ab} cells is dependent on both ∆*t* and *PW*_{b} and increases proportionally with both increasing **b** pulse duration and inducer separation time.

By counting population fractions of RFP versus GFP‐expressing cells, we can resolve the different populations that result from varying ∆*t* and *PW*_{b} values (Fig 7C). As with Fig 7A, each point on the graph represents an independent population of cells (OD~0.7, ~10^{6} cells counted per population). All of the populations exposed to either or both of the inducers occupy fractional coordinates that are unique from that of the *no inducer* controls (indicated by dotted circle). We see that if ∆*t* is constant and *PW*_{b} increases (Fig 7C, right to left), then the *S*_{a} fraction decreases as *S*_{ab} fractions increase. For constant *PW*_{b} with increasing ∆*t* (Fig 7C, bottom to top), the *S*_{a} cell fraction remains mostly constant relative to increasing *S*_{ab}. In the case where there is no **b** pulse (*PW*_{b} = 0 h), we see maximum *S*_{a} (RFP) cell fractions of about 60% with minimal *S*_{ab} populations that are about the same as *no inducer S*_{ab} levels. Overall, populations with different *PW*_{b} exposures are well separated by *S*_{a} (RFP) fraction up to 4 h. Even for *PW*_{b} at 5 and 6 h, the populations have unique *S*_{a}/*S*_{ab} coordinates, just not unique *S*_{a} fractional values.

This method of profiling is only valid if the fraction of *S*_{a} state cells can be used as a measure of *PW*_{b} that is independent of ∆*t*. In previous experiments with step inputs (Fig EV1), there would be a significant population of cells with both GFP and RFP fluorescence, since they had transitioned to *S*_{ab} but had not yet fully diluted out built up RFP protein levels from being in *S*_{a} for extended periods. If a significant percentage of the population remained in this transition state (Q2), that would make RFP an unreliable measure of *S*_{a} state cells. However, flow cytometry analysis of the pulse‐modulated populations (Fig EV2) showed that although there were some cells expressing both RFP and GFP (Q2), these cells were always < 3% of the total population. (Additional flow cytometry analysis can be found in Appendix Figs S18–S23.) Thus, RFP was measured to be a reliable determinant of *S*_{a} state cells, and subsequently, of *PW*_{b}.

For any given *PW*_{b}, we observed higher experimental *S*_{a} (RFP) population fractions with lower ∆*t* (Fig 7A top), resulting in a diagonal slant for each value of *PW*_{b} (Fig 7C). Upon further investigation, we believe this is due to a slower transition than we anticipated. In our model, we assume is equal to , since both transitions are mediated by intB. However, the gradual decrease in *S*_{a} fractions with increasing ∆*t* for each value of *PW*_{b} suggests that the α_{1} transition rate may be actually be slower than α_{2} or α_{3}. Simulation results with adjusted transition rates (α_{1} < α_{2} = α_{3}) recapitulated the slanting *S*_{a} population fractions (Appendix Fig S25). This inequality in transition rates could have arisen from differences in DNA sequence length or from differences in the DNA excision required for *S*_{o} transition to *S*_{a} instead of the recombination that occurs in the other transitions. Differences in DNA excision or recombination for a single integrase are important experimental parameters, but do not ultimately affect our conclusions about the overall system. Despite unequal intB transition rates, experimental implementation of the temporal logic gate still produces unique (*S*_{a}, *S*_{ab}) fractional coordinates for each combination of ∆*t*,* PW*_{b}, even though *S*_{a} values are not unique for higher *PW*_{b}.

Model‐informed predictions on population fractions in response to pulses of inducer **b** led to experiments that could produce unique *S*_{a} and *S*_{ab} coordinates for different combinations of ∆*t* and *PW*_{b}. However, experimental data also revealed areas in which the model had been oversimplified. While it is important to have a model to understand overall properties and limitations of the experimental system, it is also impractical to design simulations that can account for all possible variations that might occur in the implementation of biological devices. Therefore, we believe that future workflows should also involve calibration protocols for specific applications of engineered biological populations.

### Practical use and calibration of populations for event detection

Curve‐fitting methods were used to automatically convert experimentally measured RFP and GFP population fractions into *PW*_{b} and ∆*t* values and to evaluate the resolution with which population ratios can be used to determine inducer separation time and pulse duration. Using the experimental data from Fig 7B and C, we generated fitting curves for PWb as a function of RFP population percentage (*R*) and for ∆*t* as a function of both GFP population percentage (*G*) and *PW*_{b} (Appendix Figs S26 and S27, Appendix Table S3). We will denote these functions with *PW*_{b}(*R*) and ∆*t*(*G*,* PW*_{b}), respectively. The functions *PW*_{b}(*R*) and ∆*t*(*G*,* PW*_{b}) can then be used to generate a mesh of estimated *PW*_{b} and ∆*t* values for any given normalized fluorescence values (Fig 8A, Appendix equations 8–11).

Source Data for Figure 8 [msb156663-sup-0006-SDataFig8.zip]

The estimated values were compared against the actual values to determine the approximate time window with which a specific *PW*_{b} or ∆*t* value can be resolved. For each actual value of *PW*_{b} and ∆*t*, we calculated the average and standard deviation for the set of estimated values. The standard deviation allows us to visualize the range for which the majority of predictions will fall for any given actual value. For instance, a *PW*_{b} of 1 h can be detected ± 0.25 h, but as *PW*_{b} increases, this prediction window widens and for *PW*_{b} ≥ 3 h, the resolution of detection is closer to ± 1 h (Fig 8B). Similarly, predicted values of ∆*t* fall within ± 0.5 h for 0 < ∆*t *<* *3 h and increase to ± 1 h when ∆*t* ≥ 3 h (Fig 8C). Using these fitting functions, we can also pre‐generate a reference table that converts RFP and GFP population fractions into predicted *PW*_{b} and ∆*t* values (Appendix Table S4).

## Discussion

Engineered biological systems have inherent capabilities for replication, parallel processing, and energy efficiency. These advantages rely on the existence of bacteria not as single cells, but as populations. As the field moves forward with synthetic gene circuits, it is important to understand outcomes not just as single‐cell outputs but as overall population‐level distributions.

We have designed and implemented a temporal logic gate that takes advantage of the population dynamics to collectively sense and record sequences of transient chemical inputs. We show both that single cells independently sense and record events and that aggregate population fractions create unique outcomes that provide information not encoded in single cells. As with all engineered systems, proper calibration of these temporal logic gate populations will be required prior to deployment in the “field”. We envision a process similar to the one described in this report. First, experimental populations are exposed to a matrix of *PW*_{b} and ∆*t* values. This will set the maximum and minimum RFP and GFP population fractions and provide necessary data for determining the ∆*t*_{90} limit and producing the fitting functions *PW*_{b}(*R*) and ∆*t*(*G*,* PW*_{b}). Once the fitting functions have been determined, values for *PW*_{b} and ∆*t* for experimental samples can be estimated within ± 0.25 to 1 h of the actual values. A calibrated table could also be generated and used for as a reference for samples that have been exposed to unknown conditions.

The stochastic nature of molecular processes often presents a significant barrier to homogenous outputs from an engineered population of cells. This implementation of event detection via population fractions takes advantage of stochastic and heterogeneous individual responses to environmental conditions in order to map final population fractions back to unique sequences and durations of chemical events. The sensitivity of the system and the ∆*t*_{90} detection limit could potentially be modulated by increasing or decreasing protein production rates via tuning of plasmid copy numbers, signal concentration, or transcription/translation sequences. The use of digital cellular outputs combined with the analog population response creates event detection systems that are more robust to stochasticity and can be tuned more easily. We plan to explore these possibilities in future work.

As a proof‐of‐concept, we have used the common laboratory inducers L‐arabinose and aTc as inputs, but we anticipate that our temporal logic gate system could be modularly adapted to any pair of biosensors. In particular, we believe there are possibilities for detection of miRNAs and biofilm formation. Stable populations of microRNAs (miRNAs) circulating in the blood have generated a lot of interest as biomarkers for human health (Cortez *et al*, 2011). These short (~20–30nt) regulatory RNAs have been shown to have sequential tissue‐specific expression signatures that correlate with pregnancy, tumor formation, and other diseases (Gilad *et al*, 2008; Mitchell *et al*, 2008), and synthetic biology has developed many customizable RNA sensors (Friedland *et al*, 2009; Green *et al*, 2014). Detection of miRNAs would require implementation of the temporal logic gate in mammalian cells. Though recombinase‐based synthetic circuits have not been shown in mammalian cells, serine integrases have been used quite effectively in a wide variety of mammalian cell types, primarily for genome editing and integration (Keravala *et al*, 2006; Xu *et al*, 2013).

Another possible application would involve the detection of harmful biofilms. Biofilms are self‐assembling, highly structured, multi‐species consortia that develop in stages and have sophisticated networks of interaction and function (Stoodley *et al*, 2002; Flemming & Wingender, 2010; Elias & Banin, 2012). Unnatural biofilm development in environments such as industrial water sources or waste streams can be both harmful for both the natural environment and the industrial mechanisms. Detection of biomarkers for known strains of biofilm colonizers would provide early warning of changing ecosystems, and although we do not yet fully understand these networks, it is known that quorum sensing plays a critical role in the process. Quorum‐sensing molecules and receptors are available in the synthetic biology toolbox and so may provide an accessible way of detecting the sequential colonization of different microbes. Field deployment of engineered bacteria will likely involve transient signals, low‐nutrient environments, and possibly even other microbial competitors (i.e., soil, flowing rivers, the digestive tract). We used minimal media in this study to better approximate low‐nutrient environments, and anticipate further characterization in more customized “local” environments (i.e., gut model or air model or soil model) and with hardier microbial chassis.

Finally, this study focused on the population outputs as indicators of past events, but we believe that this temporal logic gate could be used to reliably differentiate a single strain into controlled sub‐populations via input pulse order, duration, and frequency. In recent years, it has been recognized that many natural systems modulate cellular behavior not only by changing the concentration of signaling molecules but also by regulating signal pulse frequency (Cai *et al*, 2008; Lin *et al*, 2015). If we consider the fluorescent proteins GFP and RFP in this circuit as simply placeholders for downstream genes, then this system could easily be applied as a top‐down population differentiator. By modulating the sequence of inputs, one could systematically predict and create mixed populations of genetically differentiated cells. This greatly expands our capability to design synthetic systems that have controllable distributions as outcomes, not just digital on/off phenotypes. Furthermore, we can then begin to develop frameworks for understanding the role of feedback and control theory in modulating these sub‐populations given different starting distributions or uneven growth rates due to resource limitations. As the scientific community turns toward further understanding of microbiomes and multi‐cellular consortia, engineered bacteria populations could be used not only as a tool for investigating the activities of natural communities but also as a way to build synthetic communities from the ground up.

## Materials and Methods

### Cell strains and plasmids

All plasmids used in this study were designed in Geneious 7.1 (Biomatters, Ltd.) and made using standard Gibson isothermal cloning techniques. Integrases Bxb1 and TP901‐1 are on a high‐copy plasmid (pVHed05, plasmid map in Appendix Fig S9) with a ColE1 origin of replication (original template from the dual recombinase controller (Bonnet *et al*, 2013), Addgene Plasmid 44456). Integrase A (Bxb1) is behind a Ptet promoter and integrase B (TP901‐1) is behind a PBAD promoter. The plasmid has been modified with an additional TetR gene. The temporal logic gate was integrated into the Phi80 site on the *E. coli* chromosome using CRIM integration (Haldimann & Wanner, 2001) and screened for single integrant colonies. The integration plasmid template and DH5α‐Z1 strain were generously provided by J. Bonnet and D. Endy and modified to contain the temporal logic gate (pVHed07, plasmid map in Appendix Fig S9). Additional DNA and oligonucleotides primers were ordered from Integrated DNA Technologies (IDT, Coralville, Iowa). A custom formulation of M9CA media was used for all experiments. The media contained 1× M9 salts (Teknova, M1906) augmented with 100 mM NH_{4}CL, 2 mM MGSO_{4}, 0.01% casamino acids, 0.15 μg/ml biotin, 1.5 μM thiamine, and 0.2% glycerol and then sterile‐filtered (0.2 μm).

### Model simulations

The stochastic simulation algorithm by Gillespie (Gillespie, 1977) was implemented to generate the sample paths of individual cells using the Markov model (see Appendix Table S6 for the definitions of Markov transitions and transition rates). All simulation runs and their analyses were done with MATLAB (R2014b, The MathWorks, Inc.). Simulated populations were done with 3,000–5,000 individual cell trajectories. Source code for MATLAB simulations is available as Code EV1.

### Experimental methods

Prior to all experiments, cells were grown overnight from plate cultures in M9CA for 2 days, then diluted to OD 0.1 and recovered for 4–6 h at 37°C. L‐arabinose and anhydrous tetracycline (aTc) were used as inducers **a** and **b**, respectively. L‐ara was used a concentration of 0.01% by volume, and aTc was used a concentration of 200 ng/ml (450 nM). All media contained the antibiotics chloramphenicol (Sigma‐Aldrich, Inc (C0378); 50 μg/ml) and kanamycin (Sigma‐Aldrich, Inc (K1876); 30 μg/ml). All experiments were performed with the aid of timed liquid handling by a Hamilton STARlet Liquid Handling Robot (Hamilton Company).

For step function experiments, the cells were diluted to OD 0.06–0.1 into a 96‐well matriplate (Brooks Automation, Inc., MGB096‐1‐2‐LG‐L) with 500 μl total volume in M9CA. Cultures were incubated at 37°C in a BioTek Synergy H1F plate reader with linear shaking (1096 cycles per minute) (BioTek Instruments, Inc.), and inducers were added at appropriate time by the Hamilton robot. OD and fluorescence measurements (superfolder‐GFP ex488/em520, mKate2‐RFP ex580/em610) were taken by the BioTek every 10 min. Each experimental condition was done on the plate in triplicate.

For the pulse experiments, single 500‐μl cultures were grown at 37°C in the BioTek plate reader (linear shaking, 1096 cycles per minute) and inducers added at time ∆*t* by the Hamilton liquid handler. Pulses were achieved through dilution of the culture into fresh M9CA media containing 0.01% L‐arabinose. The Hamilton was programmed to sample 5 μl of the culture and dilute it into 500 μl of fresh M9CA + 0.01% L‐ara to achieve pulsatile exposure to aTc. This was done in three independent triplicates for each experimental condition. About 96‐well deep‐well plates containing the diluted cultures were then incubated at 37°C incubated for an additional 24 h (~36–40 h from start of experiment). Final end point populations were measured using the plate reader and also stored and further analyzed using flow cytometry.

Analysis of experimental data was done using custom MATLAB scripts. All depicted error bars are standard error of the mean. Fitting of curves was done in MATLAB.

### Flow cytometry

Experimental cultures were spun down, washed, resuspended in sterile PBS with 15% glycerol, and stored at −80°C (Jahn *et al*, 2013). Cultures were then thawed on ice and diluted to 10^{6} cells/ml in sterile PBS prior to running on the flow cytometer. Flow cytometry was done using a MACSQuant VYB (Miltenyi Biotec, Germany) at the Caltech flow cytometry core facility. Flow data analysis and gating was done with FlowJo version 10.0.8r1 (Flowjo, LLC, Ashland, OR). For inducer separation time experiments shown in Figs 4 and EV1, ~10^{5} cells were measured per population. For pulse induction experiments shown in Figs 7 and EV2, ~10^{6} cells were measured per population.

## Acknowledgements

The authors would like to thank J. Bonnet and D. Endy for the initial plasmids used in this work, S. Sanchez for critical assistance in automation and liquid handling, D. Perez for flow cytometry assistance, and C. Hayes for discussions. V.H. is supported by the U.S. Department of Defense (DoD) through the National Defense Science & Engineering Graduate Fellowship (NDSEG) Program. Y.H. is supported by JSPS Fellowship for Research Abroad. Research supported in part by the Institute for Collaborative Biotechnologies through grant W911NF‐09‐0001 from the U.S. Army Research Office. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.

## Author contributions

VH conceived of the circuit design, constructed the necessary experimental strains, performed experimental work and data analysis, ran model simulations, and wrote the manuscript. YH developed the stochastic model and derived the mathematical results. PWKR provided feedback and guidance on data analysis and interpretation. RMM provided feedback and guidance on overall project vision, circuit design, and interpretation of results.

## Conflict of interest

The authors declare that they have no conflict of interest.

## Expanded View

Appendix [msb156663-sup-0001-Appendix.pdf]

Expanded View Figures PDF [msb156663-sup-0002-EVFigs.pdf]

Code EV1 [msb156663-sup-0003-CodeEV1.zip]

Source Data for Expanded View and Appendix [msb156663-sup-0007-SDataEVAppendix.zip]

## Funding

U.S. Department of Defense (DoD) through the National Defense Science & Engineering Graduate Fellowship (NDSEG) Programhttp://dx.doi.org/10.13039/100000005## References

This is an open access article under the terms of the Creative Commons Attribution 4.0 License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

- © 2016 The Authors. Published under the terms of the CC BY 4.0 license