What is Open Access?

Open Access is an initiative that aims to make scientific research freely available to all. To date our community has made over 100 million downloads. It’s based on principles of collaboration, unobstructed discovery, and, most importantly, scientific progression. As PhD students, we found it difficult to access the research we needed, so we decided to create a new Open Access publisher that levels the playing field for scientists across the world. How? By making research easy to access, and puts the academic needs of the researchers before the business interests of publishers.

Our Authors and Editors

We are a community of more than 103,000 authors and editors from 3,291 institutions spanning 160 countries, including Nobel Prize winners and some of the world’s most-cited researchers. Publishing on IntechOpen allows authors to earn citations and find new collaborators, meaning more people see your work not only from your own field of study, but from other related fields too.

Content Alerts

Brief introduction to this section that descibes Open Access especially from an IntechOpen perspective

How it Works Manage preferences

Contact

Want to get in touch? Contact our London head office or media team here

Careers

Our team is growing all the time, so we’re always on the lookout for smart people who want to help us reshape the world of scientific publishing.

Home > Books > Exploring the Benefits of Numerical Simulation and Modelling [Working Title]

Open access peer-reviewed chapter - ONLINE FIRST

Written By

Pavel Loskot

Submitted: 12 February 2024 Reviewed: 20 February 2024 Published: 21 June 2024

DOI: 10.5772/intechopen.1005516

From the Edited Volume

## Exploring the Benefits of Numerical Simulation and Modelling [Working Title]

Dr. Mykhaylo I. Andriychuk

## Abstract

Random processes are a key component in modeling stochastic systems. They are often defined by stochastic differential or, in discrete time, difference equations (SDEs), which precisely describe how the instantaneous time-dependent and for multidimensional processes also location-dependent values vary. The solutions of SDEs are Markov processes, which is important in the analysis and simulations. The aim of this chapter is to examine various SDE models of random processes and their specific cases, for example, the random processes that arise in statistical filtering of random signals. The problem of discretizing continuous processes and linearizing nonlinear processes is also considered. Since general Langevin equation is often difficult to solve analytically, in many scenarios, it is sufficient as well as useful to solve the corresponding Fokker-Planck equation or master equation in order to obtain the time-evolving distribution of the system states. After outlining the most common random processes in continuous and discrete time such as Bernoulli and Poisson sequences, telegraph signal, and Wiener and Gaussian processes, numerical methods for solving SDEs are discussed. The chapter concludes by presenting a general method for generating random processes that are defined by their probability distribution and the auto-covariance or auto-correlation function (ACF).

### Keywords

- dynamic system
- Fokker-Planck equation
- Kalman filter
- Langevin equation
- master equation
- stochastic differential equation
- random process

## Author Information

#### Pavel Loskot*

- ZJU-UIUC Institute, Haining, Zhejiang, China

*Address all correspondence to: pavelloskot@intl.zju.edu.cn

## 1. Introduction

Random processes have been studied in different forms for decades. They are integral component of describing stochastic systems and their dynamics. They frequently appear in many applications of statistical signal processing, and they are also considered in modeling phenomena in engineering, computer science, statistical physics, social sciences, and more recently also in computational biology and medicine. Therefore, there exists very rich literature, and in many cases, also a good theoretical understanding of their models and properties. In practical settings, the choice of a random process is dictated by the specific application as well as by the tractability of mathematical analysis and feasibility of producing numerical results.

Among different models of random processes, the processes described by stochastic differential (in continuous time) or difference (in discrete time) equations (SDEs) are probably the most common. The SDE models are sufficiently versatile and in many cases offer good theoretical frameworks and efficient numerical algorithms. More importantly, the solutions of these SDEs are Markov processes, which are more prone to analysis and simulations.

This chapter explores different forms of SDEs that are used to define various random processes. It is clear that this is a very large topic with many results, which has been evolving for decades. The aim is to highlight the key ideas while the detailed description can be found in the cited literature. The advanced theoretical concepts were limited to minimum in order to make the discussion accessible for typical graduate students in engineering. The references consist only of the textbooks and monographs on stochastic processes [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]. In particular, the reference [4] is a good introductory textbook. The book [17] is a recommended reading to gain more advanced theoretical foundations. The fundamental concepts in statistical signal processing are provided in [14] and in estimation theory in [11]. The aspects of computer simulations involving random processes are considered in [16].

The topics that are not covered, but which may nevertheless be highly relevant to stochastic processes, include the boundary conditions, absorbing processes, the first passage time and other such metrics, many methods and software tools for numerically solving the SDEs and integro-differential equations, analytical methods for solving SDEs having a specific structure, working with SDEs involving spatial diffusion, defining and generating complex-valued random processes and other. Some examples of more recent research topics in stochastic processes that are not covered in this chapter include their use in machine learning models and algorithms, defining stochastic processes in complex topological spaces such as manifolds, stochastic processes on graphs, multivariate stochastic processes involving mixed distributions, partially defined stochastic processes and so on.

This chapter also assumes that the readers are already familiar with the basic concepts in stochastic processes such as the probability distributions of different orders, stationarity, ergodicity, definition of the ACF, Markovian property, and understand linear auto-regressive modeling of stochastic processes, and why Kalman filter has been introduced. It is an advantage to already have working knowledge of differential equations and the strategies to obtain their solutions.

The rest of this chapter is organized as follows. Section 2 outlines various linear and nonlinear SDE models in continuous and discrete time and also considers the problems of discretization and linearization. Section 3 defines the Langevin, Fokker-Planck and master equations. The common random processes are presented in Section 4. Numerical methods for solving SDEs are discussed in Section 5. Section 6 provides a general method for generating random processes with the defined distribution and the ACF. Section 7 concludes the chapter.

The following mathematical notations are used in this chapter. A scalar, vector and matrix process, respectively, is denoted as $x\left(t\right)$, $\mathit{x}\left(t\right)$, and $\mathit{X}\left(t\right)$, in continuous time, $t\in \mathcal{R}$, and as $x\left[t\right],$$\mathit{x}\left[t\right]$, and $\mathit{X}\left[t\right]$, in discrete time, $t\in \mathbb{Z}$. The first time-derivative is denoted as $\stackrel{\u0307}{x}\left(t\right)=\mathit{dx}\left(t\right)/\mathit{dt}$, whereas the $k$-th derivatives, $k>1$, are denoted as ${x}^{\left(k\right)}\left(t\right)={d}^{k}x\left(t\right)/{\mathit{dt}}^{k}$. Other used symbols: $\mid \cdot \mid $ denotes absolute value, $E\phantom{\rule{0ex}{0ex}}\left[\cdot \right]$ denotes expectation, $\mathit{\text{Prob}}\phantom{\rule{0ex}{0ex}}\left\{\cdot \right\}$ denotes probability, $\delta \left(t\right)$ denotes Dirac delta function, the Kronecker delta function, ${\delta}_{\mathit{ij}}=1$, if $i=j$, and $0$ otherwise, $\mathit{I}$ is an identity matrix, $\mathbf{0}$ is an all-zeros vector or matrix, ${\left(\cdot \right)}^{T}$ denotes a matrix transpose, $\left(\cdot \right)!$ is the factorial, and $\approx $ indicates approximation. Moreover, for simplicity, all signals and values are assumed to be real-valued.

## 2. Modeling random processes as SDEs

Random processes can be conveniently modeled by SDEs. These models can be linear or nonlinear and defined in continuous time or in discrete time. The continuous-time nonlinear SDEs are preferred for developing theoretical foundations of stochastic processes, since they allow using differential (and integral) calculus. The discrete-time linear SDEs are normally used in generating the samples of stochastic processes in computer simulations and in implementing the methods of digital signal processing.

### 2.1 Linear SDE models

In continuous time, a linear random process, $x\left(t\right)$, is defined by the * m*-th order linear non-hom*ogeneous differential equation,

$\sum _{i=0}^{m}{a}_{i}\left(t\right){x}^{\left(i\right)}\left(t\right)=s\left(t\right)$E1

where ${a}_{i}\left(t\right)$ as well as $s\left(t\right)$ are other defined random processes, or ${a}_{i}\left(t\right)$ can be random constants. It is customary to assume that the driving process (input noise), $s\left(t\right)$, of model (1) is generated as a linear combination of higher-order derivatives, i.e.,

$s\left(t\right)=\sum _{i=0}^{M}{b}_{k}\left(t\right){u}^{\left(i\right)}\left(t\right)$E2

where ${b}_{i}\left(t\right)$ and $u\left(t\right)$ are other defined random processes. Typically, ${b}_{i}\left(t\right)$ are deterministic functions or constants, whereas $u\left(t\right)$ is a stationary, zero-mean, unit-variance white noise, i.e., its ACF is $E\phantom{\rule{0ex}{0ex}}\left[u\left(t\right)u\left(t+\tau \right)\right]=\delta \left(\tau \right)$.

In discrete time, a linear random process, $x\left[t\right]$, is defined by the * m*-th order linear non-hom*ogeneous difference equation,

$\sum _{i=0}^{m}{a}_{i}\left[t\right]x\left[t-i\right]=s\left[t\right]$E3

with the driving (process) noise,

$s\left[t\right]=\sum _{i=0}^{M}{b}_{k}\left[t\right]u\left[t-i\right].$E4

Similar assumptions about the processes or functions, ${a}_{i}\left[t\right]$, ${b}_{i}\left[t\right]$, and $u\left[t\right]$, are usually adopted as for continuous-time random processes. Since the descriptions in continuous and discrete time are nearly identical, in the sequel of this section, the expressions are mostly provided assuming continuous-time models.

It is common to express the * mth-order* linear model of a continuous-time random process as the first-order vector linear differential equation. In particular, it is straightforward to show that linear model (3) can be rewritten in the matrix form as

$\underset{\stackrel{\u0307}{\mathit{x}}\left(t\right)}{\underset{\u23df}{\left[\begin{array}{c}\stackrel{\u0307}{x}\left(t\right)\\ \stackrel{\u0307}{x}\left(t-1\right)\\ \vdots \\ \stackrel{\u0307}{x}\left(t-m+2\right)\\ \stackrel{\u0307}{x}\left(t-m+1\right)\end{array}\right]}}=\underset{\mathit{A}\left(t\right)}{\underset{\u23df}{\left[\begin{array}{ccccc}-{a}_{1}\left(t\right)& 1& 0& \cdots & 0\\ -{a}_{2}\left(t\right)& 0& 1& \ddots & 0\\ \vdots & \vdots & \ddots & \ddots & \vdots \\ -{a}_{m-1}\left(t\right)& 0& 0& \ddots & 1\\ -{a}_{m}\left(t\right)& 0& 0& \cdots & 0\end{array}\right]}}\underset{\mathit{x}\left(t\right)}{\underset{\u23df}{\left[\begin{array}{c}x\left(t\right)\\ x\left(t-1\right)\\ \vdots \\ x\left(t-m+2\right)\\ x\left(t-m+1\right)\end{array}\right]}}+\underset{\mathit{b}\left(t\right)}{\underset{\u23df}{\left[\begin{array}{c}{b}_{0}\left(t\right)\\ \vdots \\ {b}_{M}\left(t\right)\\ \vdots \\ 0\end{array}\right]}}u\left(t\right).$E5

In the discrete time, the equivalent first-order vector difference equation is written as

$\mathit{x}\left[t\right]=\mathit{A}\left[t\right]\mathit{x}\left[t-1\right]+\mathit{b}\left[t\right]u\left[t\right]$E6

assuming that the discrete-time indices, $t\in \mathbb{Z}$.

In the literature, model (5) is generalized as ([17], (1.89)),

$\stackrel{\u0307}{\mathit{x}}\left(t\right)=\mathbf{A}\left(t\right)\mathit{x}\left(t\right)+\mathbf{B}\left(t\right)\stackrel{\u0307}{\mathbf{u}}\left(t\right).$E7

Importantly, the derivatives in (7) may not be defined, if the underlying processes are not differentiable everywhere; in such a case, the differential eq. (7) must be formulated with differentials, $\mathrm{d}x\left(t\right)=x\left(t\right)\mathrm{d}t$, and, $\phantom{\rule{0ex}{0ex}}\mathrm{d}u\left(t\right)=u\left(t\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}t$. Then, conditioned on matrix, $\mathbf{A}\left(t\right)$, a solution of the hom*ogeneous equation is the matrix, $\mathbf{X}\left(t,s\right)$, such that $\frac{\partial}{\partial t}\mathbf{X}\left(t,s\right)=\mathbf{A}\left(t\right)\mathbf{X}\left(t,s\right)$ and $\mathbf{X}\left(s,s\right)=\mathbf{I}$ is an identity matrix. Denoting the initial value as $\mathbf{x}\left(0\right)={\mathbf{x}}_{0}$, the overall solution is obtained by the integration, i.e.,

$\mathbf{x}\left(t\right)=\mathbf{X}\left(t,0\right){\mathbf{x}}_{0}+{\int}_{0}^{t}\mathbf{X}\left(t,s\right)\mathbf{B}\left(s\right)\stackrel{\u0307}{\mathbf{u}}\left(s\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}s.$E8

Note that the mean process $\mathrm{E}\left[\mathbf{x}\left(t\right)\right]=\mathbf{X}\left(t,0\right){\mathbf{x}}_{0}$, and the mean of its derivative $\mathrm{E}\left[\stackrel{\u0307}{\mathbf{x}}\left(t\right)\right]=\frac{\phantom{\rule{0ex}{0ex}}\mathrm{d}}{\phantom{\rule{0ex}{0ex}}\mathrm{d}t}\mathrm{E}\left[\mathbf{x}\left(t\right)\right]=\mathbf{A}\left(t\right)\mathrm{E}\left[\mathbf{x}\left(t\right)\right]$. These solutions can be further specialized by conditioning also on the matrix, $\mathbf{B}\left(t\right)$. Thus, if matrices, $\mathbf{A}$ and $\mathbf{B}$, are time-invariant and known, the solution, $\mathbf{X}\left(t,s\right)=exp\left(\mathbf{A}\left(t-s\right)\right)$, can be substituted into Eq. (8).

Given constant matrices, $\mathbf{A}$ and $\mathbf{B}$, the SDE (7) defines different random processes for specific cases of the input process, $\mathbf{u}\left(t\right)$. In particular, provided that $\mathbf{u}\left(t\right)$ is a white Gaussian noise, the random process (8) represents a multi-dimensional Brownian motion, i.e., a stationary Gaussian process with independent increments. The Ornstein-Uhlenbeck random process is defined for $\mathbf{u}\left(t\right)$ being a Wiener process, which is a continuous Gaussian process with independent increments.

There is also an integral representation of stationary random processes in the frequency domain. It is known as the Cramér spectral representation, and it is written as the inverse Fourier transform. Thus, a random process, $x\left(t\right)$, can be generated from the random increments, $\mathrm{d}Z\left(\omega \right)$, as [7],

$x\left(t\right)=\frac{1}{2\pi}{\int}_{-\infty}^{\infty}\phantom{\rule{0ex}{0ex}}{\mathrm{e}}^{\mathit{j\omega t}}X\left(\omega \right)\phantom{\rule{0ex}{0ex}}\mathrm{d}\omega =\frac{1}{2\pi}{\int}_{-\infty}^{\infty}\phantom{\rule{0ex}{0ex}}{e}^{\mathit{j\omega t}}\phantom{\rule{0ex}{0ex}}\mathrm{d}Z\left(\omega \right).$E9

The random increment process, $\mathrm{d}Z\left(\omega \right)$, has zero mean, and its ACF function defined in the frequency domain is also the spectral density, i.e., $\mathrm{E}\left[\mathrm{d}Z\left(\omega \right)\mathrm{d}Z\left(\nu \right)\right]=S\left(\omega \right)\delta \left(\omega -\nu \right)\phantom{\rule{0ex}{0ex}}\mathrm{d}\omega \phantom{\rule{0ex}{0ex}}\mathrm{d}\nu $.

### 2.2 Nonlinear SDE models

An effective way to obtain nonlinear models of random processes is to consider a nonlinear transformation, $g\left(x,t\right)$, of the linear random process, $x\left[t\right]$. Hence, define a random observation process

$y\left[t\right]=g\left(x,t\right)+v\left[t\right]$E10

where $v\left[t\right]$ represents a measurement noise, which is usually assumed to be zero-mean, stationary, and white.

A common strategy for processing nonlinear signals is to linearize the model about a suitable value, ${x}_{0}\left[t\right]$, at time $t$. This strategy is neither optimum nor the most effective, but it is often used in practice. In particular, the first-order Taylor approximation of the process model (10) is written as

${\left.y\left[t\right]=g({x}_{0},t)+\frac{\mathrm{d}g\left(x,t\right)}{\mathrm{d}x}\right|}_{x={x}_{0}\left(t\right)}\left(x\left[t\right]-{x}_{0}\left[t\right]\right)+v\left[t\right].$E11

The value of ${x}_{0}\left[t\right]$ affects whether the linearization error could be neglected with respect to the strength of the measurement noise, $v\left[t\right]$, so it must be updated at every time instant, $t$. The practical choice is to set ${x}_{0}\left[t\right]$ to be the predicted value of $x\left[t\right]$ using the past estimated values, $\widehat{x}\left(t-1\right)$, $\widehat{x}\left(t-2\right)$, $\dots $, or simply letting, ${x}_{0}\left[t\right]=\mathrm{E}\left[x\left[t\right]\right]$, if the variance, $var\left[x\left[t\right]\right]$, is small.

A nonlinear form of SDE model (7) can be written as ([17], eq. (4.1)),

$\stackrel{\u0307}{\mathbf{x}}\left(t\right)=\mathbf{a}\left(\mathbf{x},t\right)+\mathbf{B}\left(\mathbf{x},t\right)\stackrel{\u0307}{\mathbf{u}}\left(t\right)$E12

where the vector function, $\mathbf{a}\left(\mathbf{x},t\right)={\left[{a}_{1}\left(\mathbf{x},t\right),{a}_{2}\left(\mathbf{x},t\right),\dots \right]}^{T}$, and the elements of the matrix, $\mathbf{B}\left(\mathbf{x},t\right),$ are the multivariate processes, ${\left[\mathbf{B}\left(\mathbf{x},t\right)\right]}_{\mathit{ij}}={b}_{\mathit{ij}}\left(\mathbf{x},t\right)$. The corresponding stochastic integral representation is

$\mathbf{x}\left(t\right)=\mathbf{x}\left(0\right)+{\int}_{0}^{t}\mathbf{a}\left(\mathbf{x}\left(s\right),s\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}s+{\int}_{0}^{t}\mathbf{B}\left(\mathbf{x}\left(s\right),s\right)\stackrel{\u0307}{\mathbf{u}}\left(s\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}s.$E13

In general, integrating a random process can be performed assuming Itô, Stratonovich, backward, and other methods including a path integral ([17], Section 4.1). These methods lead to specific numerical algorithms and have different levels of convergence and stability. The choice of the appropriate method is mainly dependent on the ACF of the random process. In this chapter, the SDE models of random processes are presented assuming Itô integration (calculus).

More importantly, it can be shown that the solutions of the integral equation (13) are Markovian processes. It is then not surprising that many examples of Markovian processes can be found in the literature as well as in practical applications. Moreover, if the values of a Markov process are chosen from a discrete, countably finite, or infinite set, such a process is referred to as Markov chain. Some examples of the Markov chain processes include a one-dimensional random walk, a queuing system, inventory modeling, branch processes, and games involving success runs [10].

A scalar Markovian process solving (13) is referred to as a diffusion process, provided that

$\begin{array}{lll}\underset{\mathrm{\Delta}t\to 0}{lim}\frac{1}{\mathrm{\Delta}t}\mathrm{E}\left[x\left(t+\mathrm{\Delta}t\right)-x\left(t\right)|x\left(t\right)\right]& =& a\left(x,t\right)\\ \underset{\mathrm{\Delta}t\to 0}{lim}\frac{1}{\mathrm{\Delta}t}\mathrm{E}\left[{\left(x\left(t+\mathrm{\Delta}t\right)-x\left(t\right)\right)}^{2}|x\left(t\right)\right]& =& {b}^{2}\left(x,t\right)\end{array}$E14

are continuous and deterministic processes, which are referred to as drift and diffusion coefficient, respectively. The diffusion process can be defined similarly in higher dimensions by assuming a drift vector and a diffusion matrix. Moreover, a diffusion process becomes a classical Brownian motion when it has zero drift and a constant diffusion.

### 2.3 Filtering random processes

Consider the random process defined by Eq. (8). Denote $\overline{\mathbf{x}}\left(t\right)=\mathbf{X}\left(t,0\right){\mathbf{x}}_{0},$$\mathbf{h}\left(t,s\right)=\mathbf{X}\left(t,s\right)$, and $\mathbf{B}\left(s\right)\stackrel{\u0307}{\mathbf{u}}\left(s\right)=\left(\mathbf{y}\left(s\right)-\overline{\mathbf{y}}\left(s\right)\right)$, where the means $\overline{\mathbf{x}}\left(t\right)=\mathrm{E}\left[\mathbf{x}\left(t\right)\right]$ and $\overline{\mathbf{y}}\left(t\right)=\mathrm{E}\left[\mathbf{y}\left(t\right)\right]$. With these substitutions, the random process (8) can be rewritten as

$\mathbf{x}\left(t\right)=\overline{\mathbf{x}}\left(t\right)+{\int}_{0}^{t}\mathbf{h}\left(t,s\right)\left(\mathbf{y}\left(s\right)-\overline{\mathbf{y}}\left(s\right)\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}s.$E15

Eq. (15) now represents a random process generated from another random process, $\mathbf{y}\left(t\right)$. More importantly, provided that (cf. (10)),

$\mathbf{y}\left(t\right)=\mathbf{a}\left(\tilde{\mathbf{x}},t\right)+\mathbf{v}\left(t\right)$E16

the random process (15) is a linear estimate of the process, $\tilde{\mathbf{x}}\left(t\right)$, in an additive measurement noise, $\mathbf{v}\left(t\right)$, from the observations, $\mathbf{y}\left(t\right)$, using a linear and generally non-stationary filter with the impulse response, $\mathbf{h}\left(t,s\right)$. For example, the minimum mean square error (MMSE) filter must satisfy the integral equation,

${\int}_{0}^{t}\mathbf{h}\left(t,r\right){\mathbf{C}}_{\mathbf{y}}\left(r,s\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}r={\mathbf{C}}_{\mathbf{x},\mathbf{y}}\left(t,s\right),\phantom{\rule{0ex}{0ex}}\forall s\in \left(0,t\right)$E17

where ${\mathbf{C}}_{\mathbf{y}}\left(t,s\right)$ is the covariance matrix of observations, $\mathbf{y}\left(t\right)$, and ${\mathbf{C}}_{\mathbf{x},\mathbf{y}}\left(t,s\right)$ is the matrix of cross-covariances between processes, $\mathbf{x}\left(t\right)$ and $\mathbf{y}\left(t\right)$. Analogical expressions can be also defined in the discrete time.

In the sequel of this subsection, a general model of random processes is presented that is used to obtain a discrete-time Kalman filter. The actual Kalman filter is derived, for example, in [11]. The Kalman filter is known to be the best linear unbiased estimator (BLUE) of a linear random process under these conditions:

the observations are linear;

the measurement noise is additive, even though possibly non-stationary;

the random process to be estimated can be also non-stationary, but it is generated by linear filtering of a white process (driving) noise;

the process noise, i.e., also the estimated random process and the measurement noise are uncorrelated.

Let assume the model of measurements in the form,

$\begin{array}{lll}\mathbf{y}\left[t\right]& =& \mathbf{D}\left[t\right]\mathbf{x}\left[t\right]+\mathbf{d}\left[t\right]+\mathbf{v}\left[t\right]\\ \mathbf{v}\left[t\right]& =& {\mathbf{A}}_{\mathbf{v}}\left[t\right]\mathbf{v}\left[t-1\right]+{\mathbf{b}}_{\mathbf{v}}\left[t\right]+{\mathbf{q}}_{\mathbf{v}}\left[t\right]\end{array}$E18

where $\mathbf{D}\left[t\right]$, $\mathbf{d}\left[t\right]$, ${\mathbf{A}}_{\mathbf{v}}\left[t\right]$, and ${\mathbf{b}}_{\mathbf{v}}\left[t\right]$ are known (deterministic) functions. The measurement noise, $\mathbf{v}\left[t\right]$, has zero mean, the covariance matrix, ${\mathbf{C}}_{\mathbf{v}}\left[t\right]$, and it is uncorrelated with the estimated random process, $\mathbf{x}\left[t\right]$, i.e., ${\mathbf{q}}_{\mathbf{v}}\left[t\right]$ is a zero-mean white noise, which is uncorrelated with the estimated process noise, $\mathbf{q}\left[t\right]$. The estimated process noise is modeled after (6), i.e.,

$\mathbf{x}\left[t\right]=\mathbf{A}\left[t\right]\mathbf{x}\left[t-1\right]+\mathbf{b}\left[t\right]+\mathbf{q}\left[t\right]$E19

where $\mathbf{A}\left[t\right]$ and $\mathbf{b}\left[t\right]$ are known (deterministic) functions, and $\mathbf{q}\left[t\right]$ is a zero-mean, white noise having the covariance matrix, ${\mathbf{C}}_{\mathbf{q}}\left[t\right]$, and it is uncorrelated with the measurement noise, $\mathbf{v}\left[t\right]$.

The correlations make the measurement noise to be predictable, so it can be added to the estimated process. The extended vector of parameters becomes

$\left[\begin{array}{c}\mathbf{x}\left[t\right]\\ \mathbf{v}\left[t\right]\end{array}\right]=\left[\begin{array}{cc}\mathbf{A}\left[t\right]& \mathbf{0}\\ \mathbf{0}& {\mathbf{A}}_{\mathbf{v}}\left[t\right]\end{array}\right]\left[\begin{array}{c}\mathbf{x}\left[t-1\right]\\ \mathbf{v}\left[t-1\right]\end{array}\right]+\left[\begin{array}{c}\mathbf{b}\left[t\right]\\ {\mathbf{b}}_{\mathbf{v}}\left[t\right]\end{array}\right]+\left[\begin{array}{c}\mathbf{q}\left[t\right]\\ {\mathbf{q}}_{\mathbf{v}}\left[t\right]\end{array}\right]$E20

and the corresponding measurements

$\mathbf{y}\left[t\right]=\left[\mathbf{D}\left[t\right]\phantom{\rule{0ex}{0ex}}\mathbf{I}\right]\left[\begin{array}{c}\mathbf{x}\left[t\right]\\ \mathbf{v}\left[t\right]\end{array}\right]+\mathbf{d}\left[t\right].$E21

The random processes (20) and (21) are then used to obtain the Kalman filter.

### 2.4 Discretization of continuous-time models

Digital signal processing requires representing the continuous-time random processes as discrete sequences of random samples. In addition, it is desirable that these samples are statistically uncorrelated. The procedure to obtain the uncorrelated samples is known as Karhunen-Loéve expansion in the literature. In particular, given a finite set of $K$ orthonormal continuous real-valued functions, ${\varphi}_{k}\left(t\right)$, over a finite time interval, $\mathcal{T}$, such that

${\int}_{\mathcal{T}}{\varphi}_{k}\left(t\right){\varphi}_{l}\left(t\right)\phantom{\rule{0ex}{0ex}}\mathit{dt}=\left(\begin{array}{cc}1,& k=l\\ 0,& k\ne l\end{array}\right.$E22

the task is to decompose the random process, $x\left(t\right)$, into the sum

$x\left(t\right)=\sum _{k=1}^{K}{X}_{k}{\varphi}_{k}\left(t\right)$E23

such that

$\mathrm{E}\left[{X}_{k}{X}_{l}\right]=\left(\begin{array}{cc}{\lambda}_{k},& k=l\\ 0,& k\ne l.\end{array}\right..$E24

Consequently, the mean process and the ACF, respectively, are defined as

$\begin{array}{lll}\overline{x}\left(t\right)& =& \mathrm{E}\left[x\left(t\right)\right]=\sum _{k=1}^{K}\mathrm{E}\left[{X}_{k}\right]{\varphi}_{k}\left(t\right)\\ {C}_{x}\left(t,s\right)& =& \mathrm{E}\left[\left(x\left(t\right)-\overline{x}\left(t\right)\right)\left(x\left(s\right)-\overline{x}\left(s\right)\right)\right]=\sum _{k=1}^{K}{\lambda}_{k}{\varphi}_{k}\left(t\right){\varphi}_{k}\left(s\right).\end{array}$E25

In addition to being orthonormal, the basis functions, ${\varphi}_{k}\left(t\right)$, must satisfy the Fredholm integral equation,

${\int}_{\mathcal{T}}{C}_{x}\left(t,s\right){\varphi}_{k}\left(s\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}s={\lambda}_{k}{\varphi}_{k}\left(t\right),\phantom{\rule{0ex}{0ex}}t\in \mathcal{T}.$E26

The Fredholm integral equations are often encountered in designing statistical signal processing methods. They are difficult to solve analytically, unless the underlying random process is stationary. In practice, it is sufficient to show that the adopted set of orthonormal functions, ${\varphi}_{k}\left(t\right),$ satisfies condition (26) rather than finding the eigenvalues and the eigenfunctions of the ACF.

It should be noted that ordinary Nyquist sampling of random signals may not yield samples that are uncorrelated. In such a case, Karhunen-Loéve expansion can be used to obtain the uncorrelated samples. In such a case, the set of orthonormal functions, ${\varphi}_{k}\left[t\right]$, must be chosen, so that the discrete-time ACF can be computed as

${C}_{x}\left(t,s\right)=\sum _{k=1}^{K}{\lambda}_{k}{\varphi}_{k}\left[t\right]{\varphi}_{k}\left[s\right]$E27

for all the indices, $t$ and $s$, from a finite subset of integers. Unlike in continuous time, the values, ${\lambda}_{k}$, can be computed from the variance matrix, $var\left[\mathbf{X}\right]=\mathrm{E}\left[\left(\mathbf{X}-\overline{\mathbf{X}}\right){\left(\mathbf{X}-\overline{\mathbf{X}}\right)}^{T}\right]$, where $\mathbf{X}={\left[{X}_{1},\dots ,{X}_{K}\right]}^{T}$ is a column vector of samples, and $\overline{\mathbf{X}}=\mathrm{E}\left[\mathbf{X}\right]$. It can be shown that ${\lambda}_{k}$ are the roots of the characteristic polynomial defined as the determinant,

$det\left(var\left[\mathbf{X}\right]-{\lambda}_{k}\mathbf{I}\right)=0\phantom{\rule{0ex}{0ex}}\forall k.$E28

The vectors, ${\varphi}_{k}\left[t\right]$, are then computed, so they lie in the null space of the matrix, $\left(var\left[\mathbf{X}\right]-{\lambda}_{k}\mathbf{I}\right).$

Furthermore, there are situations, for example, in describing the critical dynamics of a system, when the model of a stochastic process must involve both the deterministic and random effects. The random variables represent local, microscopic random fluctuations, whereas the deterministic variables model the collective, macroscopic, slowly varying effects. The corresponding SDEs are so-called Langevin equations. They are generally difficult to solve; however, the underlying probability density evolution can be often obtained by solving the corresponding Fokker-Planck equation (FPE) as will be shown in the next section.

A general Langevin equation in one dimension has the SDE form (12), i.e.,

$\stackrel{\u0307}{x}\left(t\right)=a\left(x,t\right)+b\left(x,t\right)L\left(t\right)$E29

where $L\left(t\right)$ represents the Langevin random force, which is assumed to be a zero-mean white Gaussian noise. If the coefficient, $b\left(x,t\right)$, does not depend on $x$, then the noise term in (29) is assumed to be additive; otherwise, the noise is multiplicative. The multiplicative noise allows describing more complex behaviors including non-equilibrium dynamics.

In $d$-dimensions, the set of general Langevin equations is defined as

${\stackrel{\u0307}{\mathbf{x}}}_{i}\left(t\right)={\mathbf{a}}_{i}\left(\mathbf{x},t\right)+{\mathbf{B}}_{\mathit{ij}}\left(\mathbf{x},t\right){L}_{j}\left(t\right),\phantom{\rule{0ex}{0ex}}i,j=1,2,\dots ,d$E30

where ${L}_{j}\left(t\right)$ are the zero-mean Langevin random processes having the cross-covariances, $\mathrm{E}\left[{L}_{i}\left(t\right){L}_{j}\left(s\right)\right]=2{\delta}_{\mathit{ij}}\delta \left(t-s\right)$.

For example, a one-dimensional zero-mean Gaussian random process with the ACF, $\mathrm{E}\left[x\left(t\right)x\left(s\right)\right]=aexp\left(-a|t-s|\right)$, is described by the Langevin SDE,

$\stackrel{\u0307}{x}\left(t\right)=-\mathit{ax}\left(t\right)+\mathit{aL}\left(t\right)$E31

where $L\left(t\right)$ is a zero-mean white Gaussian noise. This random process is normally referred to as Ornstein-Uhlenbeck process. Note also that, in the limit of very large $a$, the process, $x\left(t\right)$, becomes white.

## 3. Evolution of probability density

By definition, at any given time instant, the value of a random process is a random variable, which is described by a particular probability density function (PDF). It is, therefore, of interest to investigate how such a PDF evolves in time along the realizations of a random process.

Consider a general nonlinear random process described by model (12). As stated previously, the solutions of this SDE are always Markov processes defined by their transition probability density, $p\left({\mathbf{x}}_{t},t|{\mathbf{x}}_{s},s\right)$, where, for notational convenience, ${\mathbf{x}}_{t}\equiv \mathbf{x}\left(t\right)$, and, ${\mathbf{x}}_{s}\equiv \mathbf{x}\left(s\right)$. The evolution of the probability density of the random variable, ${\mathbf{x}}_{t}$, is described by the forward Kolmogorov equation, which is better known as the FPE. It assumes that $t>s$, and the initial condition $p\left(\mathbf{x},s\right)=\delta \left(\mathbf{x}\right)$. On the other hand, the evolution from the final (terminal) state, $p\left(\mathbf{x},s\right)={u}_{s}\left(\mathbf{x}\right)$, is described by a similar partial differential equation, i.e., the backward Kolmogorov equation, assuming $t\le s$.

The FPE can be effectively formulated using the Fokker-Planck operator. Hence, assuming the PDF, $p\left(\mathbf{x},t\right)$, define the operator

${L}_{\mathbf{x}}p=\sum _{i}\frac{\partial}{\partial {\mathbf{x}}_{i}}\left(\sum _{j}\frac{\partial}{\partial {\mathbf{x}}_{j}}{\mathbf{B}}_{\mathit{ij}}\left(\mathbf{x},t\right)p-{\mathbf{a}}_{i}(\mathbf{x},t)p\right)$E32

where the subscripts, $i$ and $j$, denote the components of vectors or the elements of the process matrix, $\mathbf{B}$. The FPE can be then written as

$\frac{\partial p\left({\mathbf{x}}_{t},t|{\mathbf{x}}_{s},s\right)}{\partial t}={L}_{{\mathbf{x}}_{t}}p\left({\mathbf{x}}_{t},t|{\mathbf{x}}_{s},s\right),\phantom{\rule{0ex}{0ex}}t>s.$E33

Note that, in the limit, ${lim}_{t\to s}p\left({\mathbf{x}}_{t},t|{\mathbf{x}}_{s},s\right)=\delta \left({\mathbf{x}}_{t}-{\mathbf{x}}_{s}\right).$

Omitting the dependence on the initial state, ${\mathbf{x}}_{s}$, at time $t=s$, the FPE of a * d*-dimensional diffusion process defined by the SDE model (12) is written as

$\frac{\partial p\left(\mathbf{x},t\right)}{\partial t}=-\sum _{i=1}^{d}\frac{\partial}{\partial {\mathbf{x}}_{i}}\left({\mathbf{a}}_{i}\left(\mathbf{x},t\right)p(\mathbf{x},t)\right)+\sum _{i,j=1}^{d}\frac{{\partial}^{2}}{\partial {\mathbf{x}}_{i}\partial {\mathbf{x}}_{j}}\left({D}_{\mathit{ij}}\left(\mathbf{x},t\right)p(\mathbf{x},t)\right)$E34

where $\mathbf{a}\left(\mathbf{x},t\right)$ is the drift vector, and the diffusion coefficients are computed as ${D}_{\mathit{ij}}=\frac{1}{2}{\sum}_{k=1}^{d}{\mathbf{B}}_{\mathit{ik}}\left(\mathbf{x},t\right){\mathbf{B}}_{\mathit{jk}}\left(\mathbf{x},t\right)$.

The one-dimensional Ornstein-Uhlenbeck process, i.e., $\stackrel{\u0307}{x}\left(t\right)=-\mathit{ax}\left(t\right)+b\stackrel{\u0307}{u}\left(t\right)$, (cf. (31)) is described by the FPE,

$\frac{\partial p\left(x,t\right)}{\partial t}=a\frac{\partial \left(\mathit{xp}\left(x,t\right)\right)}{\partial x}+\frac{{b}^{2}}{2}\frac{{\partial}^{2}p\left(x,t\right)}{\partial {x}^{2}}.$E35

A Brownian motion process in one dimension is a special case of the diffusion having zero drift and the diffusion coefficient equal to $1/2$. In such a case, $\stackrel{\u0307}{\mathbf{x}}\left(t\right)=\stackrel{\u0307}{\mathbf{u}}\left(t\right)$, where $\mathbf{u}\left(t\right)$ is a Wiener process, and the corresponding FPE is

$\frac{\partial p\left(x,t\right)}{\partial t}=\frac{1}{2}\frac{{\partial}^{2}p\left(x,t\right)}{\partial {x}^{2}}.$E36

Assuming the initial condition, $p\left(x,0\right)=\delta \left(x\right),$ the SDE (36) has the solution, $p\left(x,t\right)={\left(2\mathit{\pi t}\right)}^{-1/2}exp\left(-{x}^{2}/\left(2t\right)\right)$, i.e., it is a non-stationary Gaussian process.

In general, the FPE can be solved for special cases, such as when the coefficients of the SDE are time-independent. The stationary solution is obtained by assuming that $\stackrel{\u0307}{\mathbf{x}}\left(t\right)=0$, for some ${t}_{\mathrm{ss}}<t$. The solution of a non-stationary FPE can be obtained by separation or transformation of variables, variational approach, eigenvalue expansion, numerical integration, and other methods ([12], Ch. 3).

Furthermore, the dynamics of many physical and chemical systems are often described by so-called a master equation. The master equation describes the distribution of the discrete states over time. For these systems, it is desirable to model how likely the system is at any given state at different time instances. The switching between the states occurs randomly at random time instances with a certain rate. This can be described by a continuous-time Markov chain.

Specifically, the master equation defines the change of the probability distribution, $\mathbf{P}\left(t\right)$, of the system state at time, $t$. It is the first-order matrix SDE,

$\stackrel{\u0307}{\mathbf{P}}\left(t\right)=\mathbf{A}\left(t\right)\mathbf{P}\left(t\right)$E37

where $\mathbf{A}\left(t\right)$ denotes the transition matrix. Its elements, ${\left[\mathbf{A}\left(t\right)\right]}_{\mathit{ij}}$, $i\ne j$, are non-negative, whereas the diagonal elements are set, so the rows of $\mathbf{A}$ sum to zero.

The matrix, $\mathbf{A}$, is the Laplacian of a directed weighted graph representing the state transitions of a Markov chain with the weights being equal to the rates of the state transitions. In case the elements, ${\left[\mathbf{A}\left(t\right)\right]}_{\mathit{ij}}$, of the transition matrix reflect the probabilities of a hom*ogeneous Markov chain to jump from the state, * i*, to the state,

*, at time instant,*j

*, then by Chapman-Kolmogorov equation,*t

$\mathbf{A}\left(t+s\right)=\mathbf{A}\left(t\right)\mathbf{A}\left(s\right).$E38

In discrete time, it follows that $\mathbf{P}\left(t\right)={\mathbf{P}}^{t}\left(1\right)$, so the master equation can be expressed as

$\frac{\mathrm{d}}{\mathrm{d}t}{\left[\mathbf{P}\right]}_{i}\left(t\right)=\sum _{j\ne i}\left({\left[\mathbf{A}\right]}_{\mathit{ij}}\left(t\right){\left[\mathbf{P}\right]}_{j}\left(t\right)-{\left[\mathbf{A}\right]}_{\mathit{ji}}\left(t\right){\left[\mathbf{P}\right]}_{i}\left(t\right)\right).$E39

Importantly, this indicates that the master equation does not depend on the diagonal elements of the transition matrix, $\mathbf{A}$. At equilibrium, the detailed balance requires that

${\left[\mathbf{A}\right]}_{\mathit{ij}}{\left[\mathbf{P}\right]}_{j}={\left[\mathbf{A}\right]}_{\mathit{ji}}{\left[\mathbf{P}\right]}_{i}\phantom{\rule{0ex}{0ex}}\forall i,j.$E40

The corresponding probability distribution of states, $\mathbf{P}$, is referred to as the equilibrium distribution. The equilibrium describes the time-reversibility of microscopic dynamics, which is different from a steady-state; the latter occurs, when, $\frac{\mathrm{d}}{\mathrm{d}t}{\left[\mathbf{P}\right]}_{i}\left(t\right)=0$, for $t>{t}_{\mathrm{ss}}$, and $\forall i$.

The master equation can be used to model a one-dimensional birth-death process. In particular, if the population size, $x\left(t\right)$, is of interest, it can increase by one to $x\left(t+s\right)=x\left(t\right)+1$, due to a birth event, or decrease by one to $x\left(t+s\right)=x-1$, when a death occurred, while $x\left(t\right)\ge 0,$ for $\forall t$. Assuming that the transitions are Markovian, the time-dependent transition probabilities are defined as

${P}_{\mathit{ij}}\left(\mathrm{\Delta}t\right)=\text{Prob}\left\{x\left(t+\mathrm{\Delta}t\right)=j|x\left(t\right)=i\right\}.$E41

For small $\mathrm{\Delta}t>0$, and the current population size, $i\ge 0$, it can be assumed that at most one birth or death has occurred, so that,

$\begin{array}{lll}{P}_{i,i+1}& =& \mathit{\alpha i}\mathrm{\Delta}t,\phantom{\rule{0ex}{0ex}}i\ge 0\\ {P}_{i,i-1}& =& \mathit{\beta i}\mathrm{\Delta}t,\phantom{\rule{0ex}{0ex}}i\ge 1\\ {P}_{i,i}& =& 1-\left(\alpha +\beta \right)i\mathrm{\Delta}t,\phantom{\rule{0ex}{0ex}}i\ge 1.\end{array}$E42

The coefficients, $\alpha $ and $\beta $, in (42) represent the probabilities that the corresponding events occur per unit of time. Denoting the probability distribution of the population sizes at time $t$ as ${P}_{i}\left(t\right)$, the resulting master equation is written as

$\begin{array}{lll}\frac{\mathrm{d}}{\mathrm{d}t}{P}_{i}\left(t\right)& =& \alpha \left(i-1\right){P}_{i-1}\left(t\right)+\beta \left(i+1\right){P}_{i+1}\left(t\right)-\left(\alpha +\beta \right){\mathit{iP}}_{i}\left(t\right),\phantom{\rule{0ex}{0ex}}i\ge 1\\ \frac{\mathrm{d}}{\mathrm{d}t}{P}_{0}\left(t\right)& =& \beta {P}_{1}\left(t\right).\end{array}$E43

The stationary solution of (43) is obtained assuming $\frac{\mathrm{d}}{\mathrm{d}t}{P}_{i}\left(t\right)=0$, $i\ge 0$, i.e., when ${P}_{0}=1,$ and, ${P}_{i}=0$, $i\ge 1$; in such a case, the population eventually goes extinct. On the other hand, if ${P}_{0}<1$, the population can either become extinct with a certain probability, or it grows without any bounds. More precisely, the derived steady-state probabilities are computed as

$\begin{array}{lll}{P}_{i}& =& {P}_{0}\prod _{j=1}^{i}\frac{\alpha \left(j-1\right)}{\mathit{\beta j}},\phantom{\rule{0ex}{0ex}}i\ge 1\\ {P}_{0}& =& {\left(1+\sum _{i=1}^{\infty}\prod _{j=1}^{i}\frac{\alpha \left(j-1\right)}{\mathit{\beta j}}\right)}^{-1}.\end{array}$E44

## 4. Common random processes

In simulations and modeling of various systems, some random processes are encountered more often than others. The random processes that are preferred in practical applications are those that can be succinctly described, have well-defined properties, and can be readily generated or fitted to measured data. In this section, the most common random processes are outlined starting from the random processes that are defined in discrete time, which are often referred to as random sequences, before introducing the random processes defined in continuous time.

The random sequence, $\left\{x\left[0\right],x\left[1\right],\dots \right\}$, is a martingale, if its increments are independent and have zero mean, i.e., ([18], Ch. 6)

$\mathrm{E}\left[x\left[t\right]\phantom{\rule{0ex}{0ex}}|\phantom{\rule{0ex}{0ex}}x\left[0\right],x\left[1\right],\dots ,x\left[t-1\right]\right]=\mathrm{E}\left[x\left[t\right]-x\left[t-1\right]|x\left[0\right],\dots ,x\left[t-1\right]\right]=0.$E45

More generally, a sequence, ${\left\{y\left[t\right]\right\}}_{t}$, is a martingale with respect to a sequence, ${\left\{x\left[t\right]\right\}}_{t}$, provided that

$\mathrm{E}\left[y\left[t\right]|x\left[0\right],\dots ,x\left[t-1\right]\right]=y\left[t-1\right].$E46

For example, if the samples, $x\left[t\right]$, are drawn independently from the same probability distribution, $p\left(x;\theta \right)$, the likelihood ratio of the estimated parameter, $\widehat{\theta}$, from $T$ observed samples, is

$\mathrm{\Lambda}\left[T\right]=\prod _{t=1}^{T}\frac{p\left(x\left[t\right];\widehat{\theta}\right)}{p\left(x\left[t\right];\theta \right)}.$E47

The mean value of the likelihood ratio conditioned on the observations is

$\mathrm{E}\left[\mathrm{\Lambda}\left[T\right]|x\left[1\right],\dots ,x\left[T-1\right]\right]=\mathrm{\Lambda}\left[T-1\right]$E48

so the sequence of likelihoods is the martingale.

In continuous time, the martingale process is defined as

$\mathrm{E}\left[x\left({t}_{2}\right)|x\left(t\right),t\le {t}_{1}<{t}_{2}\right]=x\left({t}_{1}\right)$E49

or conditioned on another process as

$\mathrm{E}\left[y\left({t}_{2}\right)|x\left(t\right),t\le {t}_{1}<{t}_{2}\right]=y\left({t}_{1}\right).$E50

The Wiener process is an example of the martingale in continuous time.

The conditional probability of samples in a Markov sequence satisfies

$P\left(x\left[t\right]|x\left[1\right],\dots ,x\left[t-1\right]\right)=P\left(x\left[t\right]|x\left[t-1\right]\right).$E51

The higher-order Markov sequences can be also defined; however, they are assumed rarely in practical problems. The Markov sequences attaining a finite set of values are often referred to as Markov chains, and their values are referred to as states.

The Markov chain is irreducible, if any state can be reached from any other state in a finite number of steps, i.e.,

$\forall x\left[{t}_{1}\right],\phantom{\rule{0ex}{0ex}}\exists {t}_{2}>{t}_{1}:\phantom{\rule{0ex}{0ex}}P\left(x\left[{t}_{2}\right]|x\left[{t}_{1}\right]\right)>0.$E52

Assuming Chapman-Kolmogorov equation, the transition probability over $T$ steps is computed recursively from transition probabilities over $\left(T-1\right)$ steps as

$P\left(x\left[T\right]=j|x\left[1\right]=i\right)=\sum _{k}P\left(x\left[T-1\right]=k|x\left[1\right]=i\right)P\left(x\left[T\right]=j|x\left[T-1\right]=k\right).$E53

If the chain is hom*ogeneous, and the one-step transition probabilities are arranged in a matrix, , then the transition probabilities after $k$ steps are computed as ${\mathbf{P}}^{k}$.

The properties of the Markov process in continuous time are identical to the Markov chain, provided that the number of states is countably finite or infinite; such a process is referred to as a continuous-time Markov chain. For example, if the process, $x\left(t\right)$, can be constructed by independent increments, then it is the Markov process.

Bernoulli sequence is formed by binary random variables having the fixed probability of outcomes. The number of successes among a given number of trials is binomially distributed, whereas the number of trials required to achieve a given number of successes has a negative binomial distribution. Bernoulli sequence can be generalized to more than two outcomes; it is then called Bernoulli scheme sequence. These sequences can be used to define other random sequences. Consider, for example, the sum

$x\left[T\right]=\sum _{t=1}^{T}u\left[t\right]$E54

where $u\left[i\right]\in \left\{-1,+1\right\},$ which defines a Bernoulli random walk. It has the mean, $\mathrm{E}\left[x\left[T\right]\right]={\left(1-2p\right)}^{T},$ which can be both increasing or decreasing over time, depending on $p=\text{Prob}\left\{x\left[T\right]=+1\right\}$. The variance, $var\left[x\left[T\right]\right]=4p\left(1-p\right)T$, is always increasing over time. The ACF can be derived as $\mathrm{E}\left[x\left[{T}_{1}\right]x\left[{T}_{2}\right]\right]=4p\left(1-p\right)min\left({T}_{1},{T}_{2}\right)+{\left(1-2p\right)}^{2}{T}_{1}{T}_{2}$. It should be noted that the random walk can be also created assuming a birth-death process discussed above.

The case, $u\left[i\right]\in \left\{0,1\right\}$, in (54) defines a binomial counting sequence with the mean value, $\mathrm{E}\left[x\left[T\right]\right]=\mathit{pT}$, and the variance, $var\left[x\left[T\right]\right]=p\left(1-p\right)T$. The ACF is now obtained as $\mathrm{E}\left[x\left[{T}_{1}\right]x\left[{T}_{2}\right]\right]=p\left(1-p\right)min\left({T}_{1},{T}_{2}\right)+{p}^{2}{T}_{1}{T}_{2}$.

The binomial counting process can be generalized as a continuous-time Poisson counting process having the independent and stationary increments, and the initial value $x\left(0\right)=0$. In the literature, these processes are sometimes referred to as renewal processes [10]. The Poisson counting process has the probability of $k\ge 0$ successes occurring over a time duration, $T>0$, approximated by the Poisson distribution, i.e.,

$\text{Prob}\left\{x\left(T\right)=k\right\}\approx \frac{{\left(\mathit{\lambda T}\right)}^{k}}{k!}exp\left(-\mathit{\lambda T}\right)$E55

where $\lambda $ denotes the average rate of successes per unit time. If $\lambda $ is time-dependent, such a process is called non-hom*ogeneous. The inter-arrival time between two consecutive successes is a random variable, which can be shown to be exponentially distributed with the same parameter, * λ*. The inter-arrival time can be used to obtain the distribution of times until the occurrence of the

*-th success; this is an Erlang random variable. The ACF of the Poisson process can be shown to be $\mathrm{E}\left[x\left({T}_{1}\right)x\left({T}_{2}\right)\right]=\lambda min\left({T}_{1},{T}_{2}\right)+{\lambda}^{2}{T}_{1}{T}_{2}$.*k

Poisson process is an example of a general class of Lévy random processes. Another example of the Lévy random process is a gamma process having the inter-arrival times being gamma distributed. Poisson process can be generalized as Hawkes random process. It is a marked point process, which models the random intensity changes in the arrival times as a self-excitation phenomenon ([2], Ch. 11).

The random telegraph signal, $y\left(t\right)$, can be obtained as a transformation of the Poisson process, $x\left(t\right)$, i.e.,

$y\left(t\right)={\left(-1\right)}^{x\left(t\right)}=\left\{\begin{array}{cc}+1,& x\left(t\right)-\text{even}\\ -1,& x\left(t\right)-\mathrm{odd}.\end{array}\right.$E56

Thus, $y\left(t\right)$ is a continuous-time Markov jump process having only two states. The probability of states can be derived to be

$\begin{array}{lll}\text{Prob}\left\{y\left(t\right)=+1\right\}& =& exp\left(-\mathit{\lambda t}\right)cosh\left(\mathit{\lambda t}\right)\\ \text{Prob}\left\{y\left(t\right)=-1\right\}& =& exp\left(-\mathit{\lambda t}\right)sinh\left(\mathit{\lambda t}\right).\end{array}$E57

Note that the rate constant, $\lambda $, could be assumed to be different in the transitions to $+1$ and $-1$ value, respectively. The probability distribution of $y\left(t\right)$ is

$p\left(y,t\right)=exp\left(-\mathit{\lambda t}\right)\left(cosh\left(\mathit{\lambda t}\right)\delta \left(y-1\right)+sinh\left(\mathit{\lambda t}\right)\delta \left(y+1\right)\right)$E58

with the mean, $\mathrm{E}\left[y\left(t\right)\right]=exp\left(-2\mathit{\lambda t}\right)$, and the variance, $var\left[y\left(t\right)\right]=1-exp\left(-4\mathit{\lambda t}\right)$. Moreover, the telegraph signal can be randomized by assuming the product, $\mathit{Uy}\left(t\right)$, where $U$ are independent and equally likely values, $+1$ or $-1$, generated whenever the process, $y\left(t\right)$, changes its value.

The Wiener process is best known as a model of Brownian motion of particles in a fluid. It is a solution of the second-order partial differential equation

$\frac{\partial}{\partial t}p\left(x,t\right)=\alpha \frac{{\partial}^{2}}{\partial {x}^{2}}p\left(x,t\right),\phantom{\rule{0ex}{0ex}}x\left(0\right)=0$E59

where $\alpha $ denotes the rate of particle collisions. Solving (59), $p\left(x,t\right)$ is the Gaussian distribution with zero mean and the variance equal to $\left(\mathit{\alpha t}\right)$, i.e., this process is non-stationary. The ACF is derived as $\mathrm{E}\left[x\left({t}_{1}\right)x\left({t}_{2}\right)\right]=\alpha min\left({t}_{1},{t}_{2}\right)$. More generally, the Wiener process can be also defined assuming the time increments rather than the absolute times.

A multivariate Gaussian process is probably the most frequently assumed random process. Since these processes have been extensively described and studied in the literature, they are mentioned only very briefly here. The Gaussian process is fully defined by the vector of means and the covariance matrix; the covariance matrix also defines the spatial ACF of this process. The Gaussian distribution has the largest entropy (i.e., it is the worst-case noise) among all continuous distributions. Its widespread occurrence in many practical scenarios and systems is justified by the central limit theorem. It is also the only distribution, which is preserved by linear transformations, and for which uncorrelated variables are equivalent to independent variables. Gaussian processes play an important role in estimating the parameter values. The Gaussian mixture models add another level of flexibility in modeling random processes and in defining the prior and posterior distributions in Bayesian as well as non-Bayesian estimation of parameter values.

More complicated random processes can be built from other random processes. For example, Archimedean survival process is a vector of differences between gamma bridge processes, which are themselves created by scaling the gamma random processes by other random variables ([2], Ch. 10).

## 5. Numerical methods for solving SDEs

The random processes considered in this chapter are defined by the SDEs. Since many practical SDEs cannot be solved analytically, they must be solved numerically by computer simulations. The caveat is that the existing algorithms for solving ordinary differential equations (ODEs) may not be directly applicable or efficient for solving SDEs. The notable exception is SDEs with the driving noise that is generated independently of the random process values; for these processes, the algorithms developed for ODEs can be directly used for SDEs.

In general, the algorithms for solving SDEs should provide fast convergence, good accuracy, and also stability. The accuracy is affected by discretization, whereas the lack of stability may cause a sudden divergence of computations. Therefore, numerical solvers for SDEs require formal mathematical frameworks and derivations in order to satisfy these requirements.

As an example, consider the integral form of a one-dimensional time-hom*ogeneous SDE

$x\left(t\right)=x\left({t}_{0}\right)+{\int}_{{t}_{0}}^{t}a\left(x\left(s\right)\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}s+{\int}_{{t}_{0}}^{t}b\left(x\left(s\right)\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}u\left(s\right).$E60

Using Itô’s formula, the Euler method approximates integral equation (60) as

$x\left(t\right)\approx x\left({t}_{0}\right)+a\left(x\left({t}_{0}\right)\right)\left(t-{t}_{0}\right)+b\left(x\left({t}_{0}\right)\right)\left(u\left(t\right)-u\left({t}_{0}\right)\right).$E61

The more accurate higher-order approximation can be obtained in the form

$x\left(t\right)\approx x\left({t}_{0}\right)+a\left(x\left({t}_{0}\right)\right)\left(t-{t}_{0}\right)+b\left(x\left({t}_{0}\right)\right){\int}_{{t}_{0}}^{t}\mathrm{d}u\left(s\right)+{L}_{1}b\left(x\left({t}_{0}\right)\right){\int}_{{t}_{0}}^{t}{\int}_{{t}_{0}}^{s}\phantom{\rule{0ex}{0ex}}\mathrm{d}u\left({s}^{\prime}\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}u\left(s\right)$E62

where the function operator, ${L}_{1}$, is introduced to denote ${L}_{1}f\left(t\right)=b\frac{\phantom{\rule{0ex}{0ex}}d}{\phantom{\rule{0ex}{0ex}}\mathit{dt}}f\left(t\right)$.

Assuming the time discretization to evaluate the process, $x\left(t\right)$, at $N$ discrete-time instances, $t=i\mathrm{\Delta}t$, $i=0,1,\dots ,N$, so that, $N\mathrm{\Delta}t=T$, and denoting the Wiener process increments as $\mathrm{\Delta}{u}_{i}=u\left(\left(i+1\right)\mathrm{\Delta}t\right)-u\left(i\mathrm{\Delta}t\right)$, the Euler algorithm computes one of the following iterations depending on the assumed approximation, i.e.,

$\begin{array}{lll}{x}_{i+1}& =& {x}_{i}+a\left({x}_{i}\right)\mathrm{\Delta}t+b\left({x}_{i}\right)\mathrm{\Delta}{u}_{i}\\ {x}_{i+1}& =& {x}_{i}+a\left({x}_{i}\right)\mathrm{\Delta}t+b\left({x}_{i}\right)\mathrm{\Delta}{u}_{i}+\frac{1}{2}{L}_{0}b\left({x}_{i}\right)\left(\mathrm{\Delta}{u}_{i}^{2}-\mathrm{\Delta}t\right)\end{array}$E63

where the function operator, ${L}_{0}$, is defined as ${L}_{0}f\left(t\right)=a\frac{\mathrm{d}}{\mathrm{d}t}f\left(t\right)+\frac{1}{2}{b}^{2}\frac{{\mathrm{d}}^{2}}{\mathrm{d}{t}^{2}}f\left(t\right)$. A better stability can be obtained by assuming $a\left({x}_{i+1}\right)$ and $b\left({x}_{i+1}\right)$, or, $\left(a\left({x}_{i}\right)+a\left({x}_{i+1}\right)\right)/2$ and $\left(b\left({x}_{i}\right)+b\left({x}_{i+1}\right)\right)/2$, in (63). However, these latter values do not guarantee the convergence to the correct continuous-time solution even when the time step, $\mathrm{\Delta}t$, goes to zero. The converge problem can be rectified by using $a\left({\overline{x}}_{i}\right)$ and $b\left({\overline{x}}_{i}\right)$ in (63), where the filtered sequence, ${\overline{x}}_{i}=\mathrm{\u03f5}{x}_{i+1}+\left(1-\mathrm{\u03f5}\right){x}_{i}$. For $\mathrm{\u03f5}=1/2$, the algorithm to generate the samples, ${x}_{i}$, of the random process must periodically find the root, ${\overline{x}}_{i}$, of the equation

$2\left({\overline{x}}_{i}-{x}_{i}\right)=a\left({\overline{x}}_{i}\right)\mathrm{\Delta}t+b\left({\overline{x}}_{i}\right)\mathrm{\Delta}{u}_{i}.$E64

The root mean square error as a measure of the algorithm convergence between the approximately computed values, ${x}_{i}$, and the exact values, $x\left(i\mathrm{\Delta}t\right)$, is of the order between $\sqrt{\mathrm{\Delta}t}$ and $\mathrm{\Delta}t$. Thus, the approximation accuracy improves by reducing the discretization step size. There are other algorithms, for example, Milstein algorithm, that were proposed in the literature to improve the accuracy.

There is a notion of communicative and non-communicative noises, depending whether the driving noise (usually a Wiener process with independent Gaussian increments) is additive or multiplicative, i.e., whether the scaling coefficient of the noise is independent of the random process values or not. This is important, for example, when deriving the FPE for a Langevin equation. If the noise is additive, then Itô and Stratonovich interpretation of Langevin equation leads to the same FPE; this is not the case when the noise is multiplicative.

Furthermore, similar discretization methods can be used to generate samples of the vector random processes as explained in ([5], Section 10.5). Subsequently, the vector counterparts of the Euler and Milstein algorithms exist.

In addition to low-order approximations by the Euler and Milstein algorithms, other algorithms relying on higher-order approximations are available. However, the increased accuracy of these algorithms usually comes at the cost of increased computational complexity and possible issues with convergence. In general, a good strategy is to exploit the structure of the specific SDE in order to avoid many of the computational problems. For example, the noise can be quantized to a small number of levels or even be generated as a binary sequence of $\pm 1$ values.

The substantial computational problems arise in simulating spatially resolved partial SDEs, since they have to be discretized not only in time, but also in 2D or even 3D space. This may prevent considering any higher-order approximations and the corresponding algorithms. In the literature, the algorithms for spatial SDEs assume, for example, the fast Fourier transforms, and there is also the interaction picture method inspired by Hamiltonians in quantum mechanics.

### 5.1 Example

Consider the one-dimensional SDE,

$\mathrm{d}x\left(t\right)=\mathit{\mu x}\left(t\right)\mathrm{d}t+\mathit{\sigma x}\left(t\right)\mathrm{d}w\left(t\right),\phantom{\rule{0ex}{0ex}}x\left(0\right)={x}_{0}$E65

with the constant drift, $\mu $, and the constant diffusion, $\sigma $, driven by the Wiener process, $w\left(t\right)$. In this case, the solution can be obtained in the closed form, i.e.,

$x\left(t\right)={x}_{0}\phantom{\rule{0ex}{0ex}}{e}^{\left(\mu -{\sigma}^{2}/2\right)t+\mathit{\sigma w}\left(t\right)}$E66

which can be verified by substituting into (65) and using the Itô calculus.

The exact solution (66) can be approximated by computing the approximate values, $\tilde{x}\left({t}_{i}\right)\equiv {\tilde{x}}_{i}$, of $x\left({t}_{i}\right)$, at chosen discrete-time instances, ${t}_{i}$. Specifically, using the Euler algorithm

${\tilde{x}}_{i}={\tilde{x}}_{i-1}+\mu {\tilde{x}}_{i-1}\mathrm{\Delta}{t}_{i}+\sigma {\tilde{x}}_{i-1}\mathrm{\Delta}{w}_{i},\phantom{\rule{0ex}{0ex}}{\tilde{x}}_{0}={x}_{0}.$E67

The samples of the Wiener process increments are generated, $\mathrm{\Delta}{w}_{i}=\mathcal{N}\left(0,1\right)\sqrt{\mathrm{\Delta}{t}_{i}}$, where the time increments, $\mathrm{\Delta}{t}_{i}={t}_{i}-{t}_{i-1}$, and $\mathcal{N}\left(0,1\right)$ is the zero-mean, unit-variance Gaussian random variable.

The Milstein algorithm approximates the SDE (65) as

${\tilde{x}}_{i}={\tilde{x}}_{i-1}+\mu {\tilde{x}}_{i-1}\mathrm{\Delta}{t}_{i}+\sigma {\tilde{x}}_{i-1}\mathrm{\Delta}{w}_{i}+\frac{\sigma}{2}\left(\mathrm{\Delta}{w}_{i}^{2}-\mathrm{\Delta}{t}_{i}\right),\phantom{\rule{0ex}{0ex}}{\tilde{x}}_{0}={x}_{0}$E68

so it introduces the extra term compared to the approximation (66).

It can be shown that both Euler and Milstein algorithms converge in the mean to the exact solution as the step size, $\mathrm{\Delta}t$, is reduced, i.e.,

$\underset{\mathrm{\Delta}t\to 0}{lim}\mathrm{E}\left[x\left(t\right)-\tilde{x}\left(t\right)\right]=\mathcal{O}\phantom{\rule{0ex}{0ex}}\left({\left(\mathrm{\Delta}t\right)}^{n}\right)\phantom{\rule{0ex}{0ex}}\forall t>0$E69

where $n$ represents the order of converge. For $L$ random realizations of the values, $\tilde{x}\left(t\right)$, the expectation can be evaluated as

$\mathrm{E}\left[x\left(t\right)-\tilde{x}\left(t\right)\right]\approx \mathrm{E}\left[x\left(t\right)\right]-\frac{1}{L}\sum _{l=1}^{L}{\tilde{x}}_{\left(l\right)}\left(t\right).$E70

The order, $n$, is $1/2$, for the Euler algorithm, and $1$, for the Milstein algorithm, respectively. Consequently, in order to reduce the approximation error to one half, the step size, $\mathrm{\Delta}t$, must be reduced to $1/4$, for the Euler algorithm, but only to $1/2$, for the Milstein algorithm.

The approximations (67) and (68) of the exact solution (66) are numerically validated in Figure 1 (top) assuming the initial value, ${x}_{0}=10.0$, the parameters, $\mu =1.0$, and, $\sigma =0.5$, and $N=50$ time steps in the interval $\left(0,1\right)$. The relative approximation error is expressed as

$\mathrm{\u03f5}\left({t}_{i}\right)=\left(100\%\right)\times \left(x\left({t}_{i}\right)-\tilde{x}\left({t}_{i}\right)\right)/x\left({t}_{i}\right).$E71

In addition, Figure 1 (bottom) shows the absolute mean error, $\mid x\left({t}_{i}\right)-\tilde{x}\left({t}_{i}\right)\mid $, as a function of the number of time steps, $N={2}^{k}$, $k=2,3,\dots ,10$, in the interval, $\left(0,1\right)$, for ${t}_{i}={t}_{N}=1.0$. These curves indicate that the absolute mean error decreases as ${\left(1/N\right)}^{1/2}$, for the Euler algorithm, and as $\left(1/N\right)$, for the Milstein algorithm.

## 6. Generating random processes

The previous section discussed how to simulate random processes that are defined by a given SDE. This section considers the problem of generating random processes of given properties, i.e., the problem how to find the corresponding SDE model that can be used to generate samples of the random process.

In general, in most practical scenarios, the random process is typically described by the probability distribution of its samples, which also fully defines all statistical moments at any given time instant, and by the second-order pairwise moments between the samples at any two time instances; these moments are referred to as the ACF or auto-covariance function.

More importantly, the generated random processes are ergodic and stationary, so that the statistical mean is equal to the average value in time, i.e.,

$\mathrm{E}\left[x\left(t\right)\right]=\mathrm{Av}\left[x\left(t\right)\right]=\mathrm{Av}\left[\mathrm{E}\left[x\left(t\right)\right]\right]=\mathrm{E}\left[\mathrm{Av}\left[x\left(t\right)\right]\right]\phantom{\rule{0ex}{0ex}}\forall t.$E72

The non-ergodic processes are rarely encountered in practical applications. The non-stationary process, $x\left(t\right)$, can be generated from a stationary process, $u\left(t\right)$, using the transformation involving deterministic functions, $a\left(t\right)$ and $b\left(t\right)$, as

$x\left(t\right)=a\left(t\right)u\left(t\right)+b\left(t\right).$E73

Note that transformation (73) is nonlinear, unless $b\left(t\right)=0$, for $\forall t$. Moreover, the robust signal processing can be achieved by assuming the values, $\mathrm{Av}\left[\mathrm{E}\left[x\left(t\right)\right]\right]$ or $\mathrm{E}\left[\mathrm{Av}\left[x\left(t\right)\right]\right]$, instead of $\mathrm{Av}\left[x\left(t\right)\right]$ or $\mathrm{E}\left[x\left(t\right)\right]$.

The simplest, but very often considered random process is a correlated (colored) Gaussian noise. In continuous time, the uncorrelated (white) Gaussian process, $u\left(t\right)$, is correlated by a linear, not necessarily stationary filter with the impulse response, $h\left(t,s\right)$, i.e.,

$x\left(t\right)={\int}_{-\infty}^{\infty}h\left(t,s\right)u\left(s\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}s.$E74

The mean and the ACF of the process, $x\left(t\right)$, can be computed as

$\begin{array}{c}\overline{x}=\overline{u}{\int}_{-\infty}^{\infty}h\left(t,s\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}s\\ \mathrm{E}\left[x\left(t\right)x\left(s\right)\right]={\int}_{-\infty}^{\infty}h\left(t,r\right)h\left(s,r\right)\phantom{\rule{0ex}{0ex}}\mathrm{d}r\end{array}$E75

where $\overline{u}=\mathrm{E}\left[u\left(t\right)\right]$, and $\mathrm{E}\left[u\left(t\right)u\left(s\right)\right]=\delta \left(t-s\right)$.

Assuming a vector of mutually uncorrelated Gaussian processes, $\mathbf{u}\left(t\right)={\left[{u}_{1}\left(t\right),{u}_{2}\left(t\right),\dots \right]}^{T}$, where each ${u}_{i}\left(t\right)$ can be individually correlated in time, can be transformed into a vector of mutually correlated processes, $\mathbf{x}\left(t\right)$, by a linear transformation

$\mathbf{x}\left(t\right)=\mathbf{Au}\left(t\right).$E76

Then, the correlation matrix $\mathrm{E}\left[\mathbf{x}\left(t\right){\mathbf{x}}^{T}\left(t\right)\right]=\mathbf{A}\mathrm{E}\left[\mathbf{u}\left(t\right){\mathbf{u}}^{T}\left(t\right)\right]{\mathbf{A}}^{T}$. If the number of processes, $N$, is large, the elements of $\mathbf{A}$ can be efficiently calculated as

$\begin{array}{lll}{A}_{1,1}& =& \sqrt{{C}_{1,1}},\phantom{\rule{0ex}{0ex}}{A}_{k,1}=\frac{{C}_{k,1}}{{A}_{1,1}},\phantom{\rule{0ex}{0ex}}k=2,\dots ,N\\ {A}_{j,j}& =& \sqrt{{C}_{j,j}-{\sum}_{i=1}^{j-1}{A}_{j,j}^{2}},\phantom{\rule{0ex}{0ex}}{A}_{k,j}=\frac{{C}_{k,j}-{\sum}_{i=1}^{j-1}{A}_{k,i}{A}_{j,i}}{{A}_{j,j}},\phantom{\rule{0ex}{0ex}}k=j+1,\dots ,N\\ {A}_{N,N}& =& \sqrt{{C}_{N,N}-{\sum}_{i=1}^{N-1}{A}_{N,i}^{2}}\end{array}$E77

assuming, for simplicity, that all ${u}_{i}\left(t\right)$ have zero mean, and ${C}_{i,j}=\mathrm{E}\left[{x}_{i}\left(t\right){x}_{j}\left(t\right)\right]$ denote the desired pairwise correlations. This procedure can be also used for a discrete-time signal represented by a column vector, $\mathbf{u}$. For very large, $N\gg 1$, the auto-regressive moving average (ARMA) model in continuous [cf. (1)] or discrete time should be used. The initial values of the generated random process can be discarded until the process becomes stationary. Alternatively, a non-stationary ARMA model can be used initially before switching to model (77) in order to speed up the convergence to stationary samples.

The above methods involving linear filtering are well justified when the samples are Gaussian. There are, however, many situations when the random process is specified by a general probability distribution and its ACF. This is not a complete statistical description, so different generation methods can yield different higher-order moments. The most straightforward, however, in practice rather complicated method is to generate uncorrelated samples with a certain distribution, so that linear filtering of these samples yields both the desired distribution as well as the ACF.

A much more tractable method first generates Gaussian samples with given correlations and then uses a memoryless nonlinear transformation to obtain the desired distribution and the ACF. In particular, let the zero-mean Gaussian noise, $u\left(t\right),$ have the ACF, ${C}_{u}\left({t}_{1}-{t}_{2}\right)=\mathrm{E}\left[u\left({t}_{1}\right)u\left({t}_{2}\right)\right]$, and the pairwise PDF, ${p}_{u}\left({u}_{1},{u}_{2};{t}_{1}-{t}_{2}\right)$. The random process, $x\left(t\right)=g\left(u\left(t\right)\right)$, has the one-dimensional and two-dimensional PDF, respectively, [14].

$\begin{array}{lll}{p}_{x}\left(x\right)& =& \frac{{p}_{u}\left({g}^{-1}\left(x\right)\right)}{\left|\stackrel{\u0307}{g}\left({g}^{-1}\left(x\right)\right)\right|}\\ {p}_{x}\left({x}_{1},{x}_{2}\right)& =& {p}_{u}\left({g}^{-1}\left({u}_{1}\right),{g}^{-1}\left({u}_{2}\right)\right)\left|\frac{\mathrm{d}{g}^{-1}\left({u}_{1}\right)}{\phantom{\rule{0ex}{0ex}}\mathrm{d}{u}_{1}}\frac{\phantom{\rule{0ex}{0ex}}\mathrm{d}{g}^{-1}\left({u}_{2}\right)}{\phantom{\rule{0ex}{0ex}}\mathrm{d}{u}_{2}}\right|.\end{array}$E78

The corresponding mean and the ACF, respectively, can be computed as

$\begin{array}{lll}\overline{x}& =& {\int}_{-\infty}^{\infty}g\left(u\right){p}_{u}\left(u\right)\phantom{\rule{0ex}{0ex}}\mathit{du}\\ {C}_{x}\left({t}_{1}-{t}_{2}\right)& =& {\int}_{-\infty}^{\infty}{\int}_{-\infty}^{\infty}g\left({u}_{1}\right)g\left({u}_{2}\right){p}_{u}\left({u}_{1},{u}_{2};{t}_{1},{t}_{2}\right)\phantom{\rule{0ex}{0ex}}{\mathit{du}}_{1}\phantom{\rule{0ex}{0ex}}{\mathit{du}}_{2}.\end{array}$E79

The generation procedure then proceeds in these steps:

find a nonlinearity, $g\left(\cdot \right)$, so that for the target output distribution, the input distribution is Gaussian;

given $g\left(\cdot \right)$, find the correlations (covariances), ${C}_{u}\left({t}_{1}-{t}_{2}\right)$, so that the output process, $x\left(t\right)$, has the desired ACF, ${C}_{x}\left({t}_{1}-{t}_{2}\right)$;

given ${C}_{u}\left({t}_{1}-{t}_{2}\right)$, obtain a linear filter to generate such an ACF from the uncorrelated Gaussian samples.

This procedure is general and can be used to generate different processes with given distribution and ACF. However, in some cases, when the nature of a random process has been motivated by a specific physical or other phenomenon, it can directly indicate how to generate the underlying process. For example, a continuous-time Poisson process can be generated by a random sequence of Dirac delta pulses with exponentially distributed inter-arrival times, which are then filtered with an integrator. It is, therefore, advisable, to check the relevant literature whether this might be the case for a given random process.

### 6.1 Example

Consider the problem of generating a zero-mean stationary binary random sequence, $x\left[t\right]$, of $\pm 1$ values with a given ACF. Such a process can be generated, for example, from a discrete-time Gaussian process, $y\left[t\right]$, using the nonlinearity,

$x\left[t\right]=sign\phantom{\rule{0ex}{0ex}}\left(y\left[t\right]\right)=\left\{\begin{array}{cc}+1,& y\left[t\right]\ge 0\\ -1,& y\left[t\right]<0\end{array}\right..$E80

The ACF, ${C}_{x}\left[\tau \right]$, of the output process, $x\left[t\right]$, can be expressed in terms of the ACF, ${C}_{y}\left[\tau \right]$, of the input process, $y\left[t\right]$, as

${C}_{x}\left[\tau \right]=\frac{2}{\pi}arcsin\phantom{\rule{0ex}{0ex}}\left(\frac{{C}_{y}\left[\tau \right]}{{C}_{y}\left[0\right]}\right).$E81

For example, assume that ${C}_{x}\left[0\right]=1$, ${C}_{x}\left[-1\right]={C}_{x}\left[+1\right]=-1/3$, and ${C}_{x}\left[\tau \right]=0$, otherwise. In such a case, the sequence, $y\left[t\right]$, has a zero mean and the unit variance, and by inverting the ACF (81), the required ACF is ${C}_{y}\left[0\right]=1$, ${C}_{y}\left[-1\right]={C}_{y}\left[+1\right]=-1/2$, and ${C}_{y}\left[\tau \right]=0$, otherwise. The Gaussian process with this ACF can be obtained by filtering the uncorrelated Gaussian samples, $u\left[t\right]$, with a discrete-time linear filter having the impulse response, $h\left[0\right]=1/\sqrt{2}$, $h\left[1\right]=-1/\sqrt{2}$, and $h\left[t\right]=0$, otherwise. Therefore, the process, $x\left[t\right]$, with the desired ACF is generated from the uncorrelated Gaussian samples as

$x\left[t\right]=sign\phantom{\rule{0ex}{0ex}}\left(\frac{u\left[t\right]-u\left[t-1\right]}{\sqrt{2}}\right).$E82

## 7. Conclusions

This chapter discussed random processes that are usually modeled as SDEs. The resulting SDEs can be linear or nonlinear and defined in continuous time or discrete time. The higher-order differential and difference linear equations can be equivalently represented by the first-order vector equations. Brownian motion and Ornstein-Uhlenbeck random processes are the special cases of a linear SDE. There is also Cramér integral representation of random processes in frequency domain. The nonlinear models can be obtained by introducing a nonlinear measurement equation with added measurement noise.

Filtering and estimating random processes often give rise to random processes defined by integral equations. Langevin equation allows incorporating both deterministic and random effects. In practical applications, it is usually easier to solve the corresponding Fokker-Planck equation and obtain the time-evolution of the state distribution. In modeling physical and chemical system dynamics, the FPE is referred to as master equation.

## Abbreviations

auto-correlation function | |

auto-regressive moving average | |

best linear unbiased estimator | |

Fokker-Planck equation | |

minimum mean square error | |

ordinary differential equation | |

probability density function | |

stochastic differential equation |

## References

- 1.
Amir A. Thinking Probabilistically, Stochastic Processes, Disordered Systems, and Their Applications. Cambridge, UK: Cambridge University Press; 2021 - 2.
Bielecki TR, Jakubowski J, Niewȩgłowski M. Structured Dependence between Stochastic Processes. Cambridge, UK: Cambridge University Press; 2020 - 3.
Leon-Garcia A. Probability and Random Processes for Electrical Engineering. 2nd ed. New York, USA: Addison-Wesley; 1994 - 4.
Gardner WA. Introduction to Random Processes With Applications. 2nd ed. New York, USA: McGraw-Hill; 1990 - 5.
Gardiner CW. Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences. 3rd ed. New York, NY: Springer; 2004 - 6.
Grimmett GR, Stirzaker DR. Probability and Random Processes. 3rd ed. Oxford, UK: Oxford University Press; 2001 - 7.
Haykin S. Adaptive Filter Theory. 5th ed. Harlow, UK: Pearson Education Limited; 2014 - 8.
Jazwinski AH. Stochastic Processes and Filtering Theory. New York, USA: Academic Press; 1970 - 9.
van Kampen NG. Stochastic Processes in Physics and Chemistry. 3rd ed. Amsterdam, Netherlands: Elsevier; 2007 - 10.
Karlin S, Taylor HM. A First Course Instochastic Processes. 2nd ed. New York, USA: Academic Press Inc.; 1975 - 11.
Kay SM. Fundamentals of Statistical Signal Processing: Estimation Theory. Vol. I. Upper Saddle River, New Jersey: Prentice Hall; 1993 - 12.
Kwok SF. Langevin and Fokker-Planck Equations and their Generalizations. Singapore: World Scientific Publishing Co. Pte. Ltd.; 2018 - 13.
Miller SL, Childers D. Probability and Random Processes With Applications to Signal Processing and Communications. 2nd ed. Waltham, MA, USA: Academic Press; 2012 - 14.
Papoulis A, Pillai SU. Probability, Random Variables, and Stochastic Processes. 4th ed. New York, NY: McGraw-Hill; 2002 - 15.
Pavliotis GA. Stochastic Processes and Applications. New York, USA: Springer Science+Business Media; 2014 - 16.
Ross SM. Stochastic Processes. 2nd ed. New York, USA: Wiley; 1996 - 17.
Schuss Z. Theory and Applications of Stochastic Processes, An Analytical Approach. New York, NY: Springer; 2010 - 18.
Shynk JJ. Probability, Random Variables and Random Processes. Hoboken, New Jersey: John Wiley & Sons; 2013

Written By

Pavel Loskot

Submitted: 12 February 2024 Reviewed: 20 February 2024 Published: 21 June 2024

© The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.