Self-similar process

Self-similar processes are stochastic processes that exhibit self-similarity: they behave the same when viewed at different degrees of magnification, or at different scales along a dimension (space or time). Self-similar processes can sometimes be described using heavy-tailed distributions, sometimes loosely called long-tailed distributions. Examples include traffic processes such as packet inter-arrival times and burst lengths. Self-similar processes can also exhibit long-range dependence.

Overview

Designing robust and reliable networks and network services has become an increasingly challenging task, and understanding the characteristics of Internet traffic plays an ever more critical role in achieving it. Empirical studies of measured traffic traces have led to the wide recognition of self-similarity in network traffic.^[1]

Self-similar Ethernet traffic exhibits dependencies over a long range of time scales. This contrasts with telephone traffic, whose arrival and departure processes are Poisson.^[2]

In traditional Poisson traffic, short-term fluctuations average out, so a graph of the traffic rate covering a long period of time approaches a constant value.
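The averaging-out behaviour of Poisson traffic can be illustrated numerically. The following is a minimal Python sketch, not taken from the cited sources: it simulates a Poisson arrival process from exponential inter-arrival times, counts arrivals per unit time slot, and measures how the variance of the block-averaged counts shrinks as the block size m grows. The rate of 5 arrivals per slot, the 20,000 slots, the seed and all function names are illustrative assumptions.

```python
import random
import statistics


def poisson_counts(rate, n_slots, seed=0):
    """Count arrivals in each unit-length slot of a simulated Poisson process."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    counts = [0] * n_slots
    t = rng.expovariate(rate)          # time of the first arrival
    while t < n_slots:
        counts[int(t)] += 1            # slot k covers the interval [k, k + 1)
        t += rng.expovariate(rate)     # exponential gap to the next arrival
    return counts


def block_means(counts, m):
    """Average the series over non-overlapping blocks of m slots."""
    n_blocks = len(counts) // m
    return [sum(counts[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


counts = poisson_counts(rate=5.0, n_slots=20000)

# Variance of the averaged series at three time scales (assumed block sizes).
variance_by_scale = {
    m: statistics.pvariance(block_means(counts, m)) for m in (1, 10, 100)
}

for m, variance in variance_by_scale.items():
    print(f"block size {m:>3}: variance of block means = {variance:.4f}")
```

For independent counts the variance of the block means falls roughly in proportion to 1/m, so the averaged series flattens quickly. For a self-similar process with Hurst parameter H > 1/2 the corresponding decay is slower, on the order of m^(2H-2), which is why such traffic still looks bursty at coarse time scales.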

Heavy-tailed distributions have been observed in many natural phenomena, both physical and sociological. Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena, e.g. stock markets, earthquakes, climate, and the weather.^{[citation needed]} Ethernet, WWW, SS7, TCP, FTP, TELNET and VBR video (digitised video of the type that is transmitted over ATM networks) traffic has been found to be self-similar.

Self-similarity in packetised data networks can be caused by the distribution of file sizes, human interactions and/or Ethernet dynamics. Self-similar and long-range dependent characteristics in computer networks present a fundamentally different set of problems to people analysing and/or designing networks, and many of the assumptions upon which earlier systems were built are no longer valid in the presence of self-similarity.^[3]

The Poisson distribution

Before the heavy-tailed distribution is introduced mathematically, the Poisson process with a memoryless waiting-time distribution, used to model (among many things) traditional telephony networks, is briefly reviewed below.

Assuming pure-chance arrivals and pure-chance terminations leads to the following:

The number of call arrivals in a given time has a Poisson distribution, i.e.:

$$P(a) = \frac{\mu^{a}}{a!}\, e^{-\mu},$$

where a is the number of call arrivals in time T, and $\mu$ is the mean number of call arrivals in time T. For this reason, pure-chance traffic is also known as Poisson traffic.

The number of call departures in a given time also has a Poisson distribution, i.e.:

$$P(d) = \frac{\lambda^{d}}{d!}\, e^{-\lambda},$$

where d is the number of call departures in time T and $\lambda$ is the mean number of call departures in time T.
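As a small numerical illustration of the two formulas above, the sketch below evaluates the Poisson probabilities in Python. The mean of 3 arrivals in time T and the name `poisson_pmf` are assumptions chosen for the example; the same function applies to departures by passing the mean number of departures instead.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events when the mean number of events is `mean`."""
    return (mean ** k) / math.factorial(k) * math.exp(-mean)


mean_arrivals = 3.0  # illustrative value for mu (assumed, not from the article)

# P(a) for a = 0, 1, ..., 49; the terms beyond this range are negligible here.
arrival_probabilities = [poisson_pmf(a, mean_arrivals) for a in range(50)]

total_probability = sum(arrival_probabilities)
expected_arrivals = sum(a * p for a, p in enumerate(arrival_probabilities))

print(f"P(0 arrivals) = {arrival_probabilities[0]:.4f}")
print(f"sum of P(a)   = {total_probability:.6f}")
print(f"mean of a     = {expected_arrivals:.6f}")
```

The probabilities sum to one and their mean recovers the parameter, which is a quick sanity check that the formula is being applied with the intended meaning of the mean.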

The intervals, T, between call arrivals and departures are intervals between independent, identically distributed random events. It can be shown that these intervals have a negative exponential distribution, i.e.:

$$P[T \geq t] = e^{-t/h},$$

where h is the mean holding time (MHT).^{[citation needed]}
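The step that "can be shown" may be sketched as follows, under the assumption (not stated explicitly in the text) that events occur at a constant mean rate of 1/h per unit time, so that the number of events in an interval of length t is Poisson with mean t/h. The interval T exceeds t exactly when no event occurs during that interval, so the Poisson formula with zero events gives:

$$P[T \geq t] = P(\text{no event in an interval of length } t) = \frac{(t/h)^{0}}{0!}\, e^{-t/h} = e^{-t/h}.$$

The memoryless property of this waiting-time distribution follows directly: for any s, t ≥ 0,

$$P[T \geq s + t \mid T \geq s] = \frac{e^{-(s+t)/h}}{e^{-s/h}} = e^{-t/h} = P[T \geq t],$$

so the time already waited carries no information about the remaining wait.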

The heavy-tail distribution

A distribution is said to have a heavy tail if

$$\lim_{x \to \infty} e^{\lambda x} \Pr[X > x] = \infty \quad \text{for all } \lambda > 0.$$

One simple example of a heavy-tailed distribution is the Pareto distribution.
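The definition can be checked numerically for the Pareto case. The sketch below is an illustration under assumed parameters, not a proof: it takes a Pareto tail of the form Pr[X > x] = (x_min / x)^alpha for x ≥ x_min, and compares it with an exponential tail e^(-x/mean). To avoid floating-point overflow it works with the logarithm of e^(λx) Pr[X > x]. The values alpha = 1.5, x_min = 1, mean = 1 and λ = 0.01 are arbitrary choices, and only a single λ is tried, whereas the definition requires every λ > 0.

```python
import math


def log_scaled_tail_pareto(x, lam, alpha=1.5, x_min=1.0):
    """log(e^(lam*x) * Pr[X > x]) for a Pareto tail (x_min / x)**alpha, x >= x_min."""
    return lam * x + alpha * (math.log(x_min) - math.log(x))


def log_scaled_tail_exponential(x, lam, mean=1.0):
    """log(e^(lam*x) * Pr[X > x]) for an exponential tail exp(-x / mean)."""
    return lam * x - x / mean


lam = 0.01                      # one small positive value of lambda (assumed)
xs = [1e3, 1e4, 1e5]            # increasingly large arguments

pareto_values = [log_scaled_tail_pareto(x, lam) for x in xs]
exponential_values = [log_scaled_tail_exponential(x, lam) for x in xs]

for x, p, e in zip(xs, pareto_values, exponential_values):
    print(f"x = {x:>8.0f}: Pareto {p:>10.2f}   exponential {e:>12.2f}")
```

For the Pareto tail the scaled quantity grows without bound, because the linear term λx eventually dominates the logarithmic term. For the exponential tail with λ below 1/mean it decreases without bound, so the exponential distribution does not meet the heavy-tail definition.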

Modelling self-similar traffic

Since (unlike traditional telephony traffic) packetised traffic exhibits self-similar or fractal characteristics, conventional traffic models do not apply to networks which carry self-similar traffic.^{[citation needed]}

With the convergence of voice and data, the future multi-service network will be based on packetised traffic, and models which accurately reflect the nature of self-similar traffic will be required to develop, design and dimension future multi-service networks.^{[citation needed]}

Earlier analytic work in Internet studies adopted assumptions such as exponentially distributed packet inter-arrival times, and conclusions reached under such assumptions may be misleading or incorrect in the presence of heavy-tailed distributions.^[2]

Deriving mathematical models which accurately represent long-range dependent traffic is a fertile area of research.

Self-similar stochastic processes modeled by Tweedie distributions

The Tweedie convergence theorem can be used to explain the origin of the variance-to-mean power law, 1/f noise and multifractality, all of which are features associated with self-similar processes.^[4]

Network performance

Network performance degrades gradually with increasing self-similarity. The more self-similar the traffic, the longer the queues: the queue length distribution of self-similar traffic decays more slowly than that of Poisson sources. However, long-range dependence implies nothing about the short-term correlations of the traffic, which affect performance in small buffers. Additionally, aggregating streams of self-similar traffic typically intensifies the self-similarity ("burstiness") rather than smoothing it, compounding the problem.^{[citation needed]}

Self-similar traffic exhibits persistent clustering, which has a negative impact on network performance.

With Poisson traffic (found in conventional telephony networks), clustering occurs in the short term but smooths out over the long term.
With self-similar traffic, the bursty behaviour may itself be bursty, which exacerbates the clustering phenomenon and degrades network performance.

Many aspects of network quality of service depend on coping with traffic peaks that might cause network failures, such as:

Cell/packet loss and queue overflow
Violation of delay bounds, e.g. in video
Worst cases in statistical multiplexing

Poisson processes are well-behaved because they are stateless, and peak loading is not sustained, so queues do not fill. With long-range order, peaks last longer and have greater impact: the equilibrium shifts for a while.^[5]
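The effect of sustained peaks on a queue can be illustrated with a deliberately simple, deterministic sketch. It is not a traffic model: the two hand-made arrival traces below carry the same total load but differ in how clustered it is, and the backlog is tracked with the standard Lindley recursion for a single queue drained at a fixed rate. The traces, the capacity of one unit per slot and the name `queue_lengths` are assumptions made for the example.

```python
def queue_lengths(arrivals, capacity):
    """Backlog after each slot for one queue served at `capacity` units per slot.

    Uses the Lindley recursion Q[n] = max(0, Q[n-1] + arrivals[n] - capacity).
    """
    backlog = 0
    history = []
    for arrived in arrivals:
        backlog = max(0, backlog + arrived - capacity)
        history.append(backlog)
    return history


# Two toy traces with the same total load of six units over six slots.
smooth = [1, 1, 1, 1, 1, 1]   # load spread evenly
bursty = [6, 0, 0, 0, 0, 0]   # the same load arriving in one cluster

smooth_queue = queue_lengths(smooth, capacity=1)
bursty_queue = queue_lengths(bursty, capacity=1)

print("smooth backlog:", smooth_queue)
print("bursty backlog:", bursty_queue)
```

With the evenly spread trace the queue never builds up, while the clustered trace leaves a backlog that takes several slots to drain even though the average load is identical. Longer-lasting peaks, as described above for long-range dependent traffic, extend this effect over longer periods.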

References

[1] Park, Kihong; Willinger, Walter (2000), Self-Similar Network Traffic and Performance Evaluation, New York, NY, USA: John Wiley & Sons, Inc., ISBN 0471319740.
[2] "Appendix: Heavy-tailed Distributions". Cs.bu.edu. 2001-04-12. Retrieved 2012-06-25.
[3] "The Self-Similarity and Long Range Dependence in Networks Web site". Cs.bu.edu. Retrieved 2012-06-25.
[4] Kendal, Wayne S.; Jørgensen, Bent (2011-12-27). "Tweedie convergence: A mathematical basis for Taylor's power law, 1/f noise, and multifractality". Physical Review E. 84 (6). American Physical Society (APS): 066120. Bibcode:2011PhRvE..84f6120K. doi:10.1103/physreve.84.066120. ISSN 1539-3755. PMID 22304168.
[5] "Everything you always wanted to know about Self-Similar Network Traffic and Long-Range Dependency, but were ashamed to ask*". Cs.kent.ac.uk. Retrieved 2012-06-25.

External links

A site offering numerous links to articles written on the effect of self-similar traffic on network performance.
