Properties of the normal distribution. The normal (Gaussian) distribution law. One of the representations of the probability integral

Metals and metal products 28.02.2021

In practice, most random variables, which are affected by a large number of random factors, obey the normal law of probability distribution. Therefore, in various applications of probability theory, this law is of particular importance.

A random variable $X$ obeys the normal probability distribution law if its probability distribution density has the following form

$$f\left(x\right)={{1}\over {\sigma \sqrt{2\pi }}}e^{-{{{\left(x-a\right)}^2}\over {2{\sigma }^2}}}$$

Schematically, the graph of the function $f\left(x\right)$ is shown in the figure and has the name "Gaussian curve". To the right of this graphic is the German 10 Mark banknote, which was in use even before the introduction of the euro. If you look closely, then on this banknote you can see the Gaussian curve and its discoverer, the greatest mathematician Carl Friedrich Gauss.

Let's go back to our density function $f\left(x\right)$ and explain the distribution parameters $a,\ {\sigma }^2$. The parameter $a$ characterizes the center of dispersion of the values of the random variable, that is, it has the meaning of the mathematical expectation. When the parameter $a$ changes and the parameter ${\sigma }^2$ remains unchanged, the graph of the function $f\left(x\right)$ shifts along the abscissa axis, while the density graph itself does not change its shape.

The parameter ${\sigma }^2$ is the variance and characterizes the shape of the density curve $f\left(x\right)$. When the parameter ${\sigma }^2$ changes with the parameter $a$ unchanged, the density graph changes its shape, shrinking or stretching, without shifting along the abscissa.

Probability of a normally distributed random variable falling into a given interval

As is known, the probability that a random variable $X$ falls into the interval $\left(\alpha ;\ \beta \right)$ can be calculated as $P\left(\alpha< X < \beta \right)=\int^{\beta }_{\alpha }{f\left(x\right)dx}$. For a normally distributed random variable $X$ with parameters $a,\ \sigma $, the following formula holds:

$$P\left(\alpha< X < \beta \right)=\Phi \left({{\beta -a}\over {\sigma }}\right)-\Phi \left({{\alpha -a}\over {\sigma }}\right)$$

Here the function $\Phi \left(x\right)={{1}\over {\sqrt{2\pi }}}\int^x_0{e^{-t^2/2}dt}$ is the Laplace function. The values of this function are taken from tables. The following properties of the function $\Phi \left(x\right)$ can be noted.

1. $\Phi \left(-x\right)=-\Phi \left(x\right)$, i.e. the function $\Phi \left(x\right)$ is odd.

2. $\Phi \left(x\right)$ is a monotonically increasing function.

3. $\lim_{x\to +\infty } \Phi \left(x\right)=0.5$, $\lim_{x\to -\infty } \Phi \left(x\right)=-0.5$.

To calculate the values of the $\Phi \left(x\right)$ function, you can also use the $f_x$ function wizard of the Excel package: $\Phi \left(x\right)=NORMDIST\left(x;0;1;1\right)-0.5$. For example, for $x=2$ this gives $\Phi \left(2\right)\approx 0.4772$.
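The same value can be checked outside Excel. The following is a minimal Python sketch, not part of the original calculation: the helper name `laplace_phi` is made up for illustration, and it relies on the identity $\Phi \left(x\right)=0.5\cdot \mathrm{erf}\left(x/\sqrt{2}\right)$, which follows from the definition of the Laplace function given above.

```python
import math


def laplace_phi(x: float) -> float:
    """Laplace function: (1/sqrt(2*pi)) * integral from 0 to x of exp(-t^2/2) dt."""
    # The integral can be expressed through the error function erf.
    return 0.5 * math.erf(x / math.sqrt(2.0))


print(round(laplace_phi(2.0), 4))   # 0.4772
print(round(laplace_phi(-2.0), 4))  # -0.4772, the function is odd
```

Large arguments illustrate the third property: `laplace_phi(10.0)` is indistinguishable from 0.5 in double precision.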

The probability that a normally distributed random variable $X\in N\left(a;\ {\sigma }^2\right)$ falls into an interval symmetric with respect to the expectation $a$ can be calculated by the formula

$$P\left(\left|X-a\right|< \delta \right)=2\Phi \left({{\delta }\over {\sigma }}\right).$$

Three sigma rule. It is almost certain that a normally distributed random variable $X$ will fall into the interval $\left(a-3\sigma ;a+3\sigma \right)$.

Example 1. The random variable $X$ is subject to the normal probability distribution law with parameters $a=2,\ \sigma =3$. Find the probability that $X$ falls into the interval $\left(0.5;\ 1\right)$ and the probability that the inequality $\left|X-a\right|< 0.2$ holds.

Using the formula

$$P\left(\alpha< X < \beta \right)=\Phi \left({{\beta -a}\over {\sigma }}\right)-\Phi \left({{\alpha -a}\over {\sigma }}\right),$$

we find

$$P\left(0.5< X < 1\right)=\Phi \left({{1-2}\over {3}}\right)-\Phi \left({{0.5-2}\over {3}}\right)=\Phi \left(-0.33\right)-\Phi \left(-0.5\right)=\Phi \left(0.5\right)-\Phi \left(0.33\right)=0.191-0.129=0.062.$$

$$P\left(\left|X-a\right|< 0.2\right)=2\Phi \left({{\delta }\over {\sigma }}\right)=2\Phi \left({{0.2}\over {3}}\right)=2\Phi \left(0.07\right)=2\cdot 0.028=0.056.$$
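As a cross-check of Example 1, here is a short Python sketch that uses the standard library class `statistics.NormalDist` instead of a table. The variable names are illustrative. Because the table lookup rounds the arguments to two decimals (0.33 and 0.07), the unrounded results come out slightly lower than the hand calculation, roughly 0.061 and 0.053.

```python
from statistics import NormalDist

# Parameters of Example 1: a = 2, sigma = 3.
X = NormalDist(mu=2.0, sigma=3.0)

# P(0.5 < X < 1) as a difference of cumulative probabilities.
p_interval = X.cdf(1.0) - X.cdf(0.5)

# P(|X - a| < 0.2) is the probability of the symmetric interval (1.8, 2.2).
p_symmetric = X.cdf(2.2) - X.cdf(1.8)

print(f"{p_interval:.4f} {p_symmetric:.4f}")
```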

Example 2 . Let us assume that during the year the price of shares of a certain company is a random variable distributed according to the normal law with a mathematical expectation equal to 50 conventional monetary units and a standard deviation equal to 10. What is the probability that on a randomly chosen day of the period under discussion, the price for the share will be:

a) more than 70 conventional monetary units?

b) below 50 per share?

c) between 45 and 58 conventional monetary units per share?

Let the random variable $X$ be the price of shares of some company. By condition, $X$ is normally distributed with parameters $a=50$ (the mathematical expectation) and $\sigma =10$ (the standard deviation). The probability $P\left(\alpha< X < \beta \right)$ that $X$ falls into the interval $\left(\alpha ,\ \beta \right)$ is found by the formula:

$$P\left(\alpha< X < \beta \right)=\Phi \left({{\beta -a}\over {\sigma }}\right)-\Phi \left({{\alpha -a}\over {\sigma }}\right).$$

$$a)\ P\left(X>70\right)=\Phi \left({{\infty -50}\over {10}}\right)-\Phi \left({{70-50}\over {10}}\right)=0.5-\Phi \left(2\right)=0.5-0.4772=0.0228.$$

$$b)\ P\left(X< 50\right)=\Phi \left({{50-50}\over {10}}\right)-\Phi \left({{-\infty -50}\over {10}}\right)=\Phi \left(0\right)+0.5=0+0.5=0.5.$$

$$c)\ P\left(45< X < 58\right)=\Phi \left({{58-50}\over {10}}\right)-\Phi \left({{45-50}\over {10}}\right)=\Phi \left(0.8\right)-\Phi \left(-0.5\right)=\Phi \left(0.8\right)+\Phi \left(0.5\right)=0.2881+0.1915=0.4796.$$
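The three answers of Example 2 can be reproduced numerically. The sketch below assumes Python's standard `statistics.NormalDist`; the names `price`, `p_a`, `p_b` and `p_c` are illustrative only.

```python
from statistics import NormalDist

# Share price model from Example 2: expectation 50, standard deviation 10.
price = NormalDist(mu=50.0, sigma=10.0)

p_a = 1.0 - price.cdf(70.0)               # a) price above 70
p_b = price.cdf(50.0)                     # b) price below 50
p_c = price.cdf(58.0) - price.cdf(45.0)   # c) price between 45 and 58

print(f"a) {p_a:.4f}  b) {p_b:.4f}  c) {p_c:.4f}")
```

Here `cdf` is the full distribution function, so infinite limits are handled by using 1 or 0 directly rather than the values ±0.5 of the Laplace function.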

The article shows in detail what the normal law of distribution of a random variable is and how to use it in solving practical problems.

Normal distribution in statistics

The law has a history of about 300 years. Its first discoverer was Abraham de Moivre, who came up with an approximation as early as 1733. Many years later, Carl Friedrich Gauss (1809) and Pierre-Simon Laplace (1812) derived the mathematical functions.

Laplace also discovered a remarkable regularity and formulated the central limit theorem (CLT), according to which the sum of a large number of small and independent variables has a normal distribution.

The normal law is not a fixed equation of how one variable depends on another. Only the nature of this dependence is fixed. The specific form of the distribution is specified by special parameters. For example, y = ax + b is the equation of a straight line. However, where exactly it passes and at what slope is determined by the parameters a and b. The same is true of the normal distribution. It is clear that this is a function that describes a high concentration of values near the center, but its exact form is given by special parameters.

The Gaussian normal distribution curve has the following form.

The normal distribution graph resembles a bell, so it is also called the bell curve. The graph has a "hump" in the middle and a sharp decrease in density at the edges. This is the essence of the normal distribution. The probability that a random variable will be near the center is much higher than the probability that it deviates strongly from the middle.

The figure above shows two areas under the Gaussian curve: blue and green. The bases, i.e. the intervals, are equal in both regions. But the heights are noticeably different. The blue area is far from the center and has a significantly lower height than the green one, which is located in the very center of the distribution. Consequently, the areas also differ, that is, the probabilities of falling into the indicated intervals.

The formula for the normal distribution (density) is as follows.

$$f\left(x\right)={{1}\over {\sigma \sqrt{2\pi }}}e^{-{{{\left(x-m\right)}^2}\over {2{\sigma }^2}}}$$

The formula consists of two mathematical constants:

π – the number pi, approximately 3.142;

e – the base of the natural logarithm, approximately 2.718;

two variable parameters that define the shape of a particular curve:

m – the mathematical expectation (other notation may be used in various sources, for example, µ or a);

σ² – the variance;

and the variable x itself, for which the probability density is calculated.

The specific form of the normal distribution depends on two parameters: m and σ². It is briefly denoted N(m, σ²) or N(m, σ). The parameter m (the expectation) determines the center of the distribution, which corresponds to the maximum height of the graph. The variance σ² characterizes the range of variation, that is, the "smearing" of the data.

The mathematical expectation parameter shifts the distribution center to the right or to the left without affecting the very shape of the density curve.

The variance, in turn, determines the sharpness of the curve. When the data have a small spread, all of the mass is concentrated at the center. If the data have a large spread, they are "smeared" over a wide range.

Distribution density has no direct practical application. To calculate the probabilities, you need to integrate the density function.

The probability that a random variable will be less than some value x is given by the normal distribution function:

Using the mathematical properties of any continuous distribution, it is not difficult to calculate any other probabilities, since

P(a ≤ X< b) = Ф(b) – Ф(a)

Standard normal distribution

The normal distribution depends on the mean and variance parameters, which makes its general properties hard to see. It would be convenient to have a reference distribution that does not depend on the scale of the data. Such a distribution exists and is called the standard normal distribution. In fact, it is an ordinary normal distribution, only with mathematical expectation 0 and variance 1, written briefly as N(0, 1).

Any normal distribution can easily be converted into the standard one by normalizing:

$$z={{x-m}\over {\sigma }}$$

where z is the new variable used instead of x;
m is the expected value;
σ is the standard deviation.

For sample data, estimates are taken:

Arithmetic mean and variance of the new variable z are now also equal to 0 and 1, respectively. This is easy to verify with the help of elementary algebraic transformations.
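The claim is also easy to check numerically. Below is a small Python sketch; the sample values are invented purely for illustration, and the sample estimates (arithmetic mean and sample standard deviation with n − 1 in the denominator) stand in for the unknown parameters.

```python
from statistics import mean, stdev

# Hypothetical sample, made up for this illustration.
sample = [12.0, 15.5, 9.8, 11.2, 14.1, 10.4, 13.3]

x_bar = mean(sample)   # estimate of the expected value
s = stdev(sample)      # sample standard deviation (n - 1 denominator)

# Normalization: subtract the mean, divide by the standard deviation.
z_scores = [(x - x_bar) / s for x in sample]

print(round(mean(z_scores), 10), round(stdev(z_scores), 10))
```

Up to floating-point rounding, the mean of the z-scores is 0 and their standard deviation (and hence the variance) is 1, whatever the original sample.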

In the literature this quantity is called the z-score. It is simply the normalized data. A z-score can be compared directly with theoretical probabilities, since its scale matches the standard.

Now let's see what the density of the standard normal distribution looks like (for z-scores). Recall that the Gaussian function has the form:

$$f\left(x\right)={{1}\over {\sigma \sqrt{2\pi }}}e^{-{{{\left(x-m\right)}^2}\over {2{\sigma }^2}}}$$

Substituting z for (x-m)/σ and one for σ, we get the density function of the standard normal distribution:

$$\varphi \left(z\right)={{1}\over {\sqrt{2\pi }}}e^{-z^2/2}$$

Density Graph:

The center, as expected, is at point 0. At the same point the Gaussian function reaches its maximum, which corresponds to the random variable taking its mean value (i.e. x-m=0). The density at this point is 0.3989, which can be calculated even in one's head, because e⁰ = 1 and it remains only to calculate the ratio of 1 to the square root of 2π.

Thus, the graph clearly shows that values with small deviations from the mean occur more often than others, and those very far from the center are much less common. The abscissa is measured in standard deviations, which removes the units of measurement and gives the universal structure of the normal distribution. The Gaussian curve for normalized data clearly demonstrates other properties of the normal distribution, for example that it is symmetrical about the y-axis. The majority of values are concentrated within ±1σ of the arithmetic mean (we are still estimating by eye). Most of the data are within ±2σ. Almost all data are within ±3σ. The last property is commonly known as the three sigma rule for a normal distribution.

The standard normal distribution function allows you to calculate probabilities.

Of course, no one counts by hand. Everything is calculated and placed in special tables, which are at the end of any textbook on statistics.

Normal distribution table

Normal distribution tables are of two types:

- density tables;

- distribution function tables (the integral of the density).

Density tables are rarely used. However, let's see what one looks like. Let's say we need to get the density for z = 1, i.e. the density of the value that is 1 sigma away from the expected value. Below is a portion of the table.

Depending on how the data are organized, we look up the desired value by row and column name. In our example, we take the row 1.0 and the column 0, because there are no hundredths. The desired value is 0.2420 (the leading 0 before 2420 is omitted in the table).

The Gaussian function is symmetrical about the y-axis. That's why φ(z)= φ(-z), i.e. density for 1 is identical to the density for -1 , which is clearly seen in the figure.

In order not to waste paper, tables are printed only for positive values.

In practice, the values of the standard normal distribution function are used more often, that is, the probabilities for different z.

Such tables also contain only positive values. Therefore, in order to find any required probability, one should know the properties of the standard normal distribution.

The function Ф(z) is symmetrical about its value of 0.5 (and not about the y-axis, like the density). Hence the following equality holds:

Ф(-z) = 1 – Ф(z)

This fact is shown in the picture:

The function values Ф(-z) and Ф(z) divide the graph into 3 parts. Moreover, the upper and lower parts are equal (indicated by checkmarks). To complete the probability Ф(z) to 1, just add the missing value Ф(-z). You get the same equation as above.

If you need to find the probability of falling into the interval (0; z), that is, the probability of a deviation from zero in the positive direction by up to a certain number of standard deviations, it is enough to subtract 0.5 from the value of the standard normal distribution function:

P(0 < Z < z) = Ф(z) – 0.5

For clarity, you can look at the figure.

On a Gaussian curve, the same situation looks like the area from the center to the right to z.

Quite often, the analyst is interested in the probability of deviation in both directions from zero. Since the function is symmetrical about the center, the previous formula must be multiplied by 2:

P(–z < Z < z) = 2(Ф(z) – 0.5) = 2Ф(z) – 1

Picture below.

Under the Gaussian curve, this is the central part, bounded by the selected value -z on the left and z on the right.

These properties should be taken into account, because table values ​​rarely correspond to the interval of interest.

To facilitate the task, textbooks usually publish tables for a function of the form:

P(0 < Z < z) = Ф(z) – 0.5

If you need the probability of deviation in both directions from zero, then, as we have just seen, the tabular value for this function is simply multiplied by 2.

Now let's look at specific examples. Below is a table of the standard normal distribution. Let's find the tabular values for three z: 1.64, 1.96 and 3.

How should these numbers be understood? Let's start with z = 1.64, for which the table value is 0.4495. The easiest way to explain the meaning is with a figure.

That is, the probability that a standardized normally distributed random variable falls within the interval from 0 to 1.64 is equal to 0.4495. When solving problems, it is usually necessary to calculate the probability of deviation in both directions, so we multiply 0.4495 by 2 and get approximately 0.9. The corresponding area under the Gaussian curve is shown below.

Thus, 90% of all normally distributed values fall within ±1.64σ of the arithmetic mean. The value z = 1.64 was not chosen by chance: the neighborhood around the arithmetic mean that occupies 90% of the total area is sometimes used to calculate confidence intervals. If the value being checked does not fall into this area, its occurrence is unlikely (only 10%).

To test hypotheses, however, an interval covering 95% of all values is more often used. Half of 0.95 is 0.4750 (see the second highlighted value in the table).

For this probability z = 1.96. That is, 95% of the values lie within almost ±2σ of the mean. Only 5% fall outside these limits.

Another interesting and frequently used table value corresponds to z = 3; in our table it equals 0.4986. Multiplying by 2 gives 0.997. So almost all values lie within ±3σ of the arithmetic mean.
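The three table lookups can be reproduced in code. This is only a sketch: it assumes the table stores the probability of the interval from 0 to z, i.e. the distribution function minus 0.5, and the function name `table_value` is invented for the example.

```python
from statistics import NormalDist

STANDARD = NormalDist(mu=0.0, sigma=1.0)


def table_value(z: float) -> float:
    """Probability that a standard normal variable falls between 0 and z."""
    return STANDARD.cdf(z) - 0.5


for z in (1.64, 1.96, 3.0):
    one_side = table_value(z)
    # Doubling gives the probability of a deviation of at most z in either direction.
    print(f"z={z}: table {one_side:.4f}, two-sided {2 * one_side:.4f}")
```

The last digit for z = 3 may differ from a printed table by one unit, depending on how the table was rounded.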

This is how the 3 sigma rule looks like for a normal distribution on the chart.

With the help of statistical tables, you can get any probability. However, this method is very slow, inconvenient and very outdated. Today everything is done on the computer. Next, we move on to the practice of calculations in Excel.

Normal distribution in Excel

Excel has several functions for calculating probabilities or inverse values (quantiles) of a normal distribution.

NORM.S.DIST function

The NORM.S.DIST function is designed to calculate the density ϕ(z) or the probability Φ(z) from normalized data (z).

=NORM.S.DIST(z, cumulative)

z is the value of the standardized variable

cumulative – if 0, the density ϕ(z) is calculated; if 1, the value of the function Ф(z), i.e. the probability P(Z < z).

Let us calculate the density and the value of the function for various z: -3, -2, -1, 0, 1, 2, 3 (we will enter them in cell A2).

To calculate the density, you need the formula =NORM.S.DIST(A2, 0). In the diagram below, this is the red dot.

To calculate the value of the function, use =NORM.S.DIST(A2, 1). The diagram shows the shaded area under the normal curve.

In practice, it is more often necessary to calculate the probability that a random variable will not go beyond some limits from the mean (in standard deviations corresponding to the variable z), i.e. P(|Z| < z).

Let us determine the probability that a random variable falls within ±1z, ±2z and ±3z of zero. The required formula is 2Ф(z) – 1, in Excel =2*NORM.S.DIST(A2, 1)-1.

The diagram clearly shows the basic properties of the normal distribution, including the three sigma rule. The NORM.S.DIST function is effectively an automatic table of the values of the normal distribution function in Excel.
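For comparison, the same calculation can be sketched in Python. The function `norm_s_dist` below is a hypothetical stand-in written for this example (it is not an Excel or library API); it simply dispatches to the density or the distribution function of `statistics.NormalDist` depending on the flag.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def norm_s_dist(z: float, cumulative: bool) -> float:
    """Density of the standard normal law if cumulative is False, else P(Z < z)."""
    return STD_NORMAL.cdf(z) if cumulative else STD_NORMAL.pdf(z)


def within(z: float) -> float:
    """P(|Z| < z) = 2 * F(z) - 1 for z >= 0."""
    return 2.0 * norm_s_dist(z, True) - 1.0


for z in (1, 2, 3):
    print(z, round(within(z), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The density at zero, `norm_s_dist(0.0, False)`, is about 0.3989, the peak of the standard normal curve.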

There may also be an inverse problem: given the probability P(Z < z), find the standardized value z, which is the quantile of the standard normal distribution.

NORM.S.INV function

NORM.S.INV calculates the inverse of the standard normal distribution function. The syntax consists of one parameter:

=NORM.S.INV(probability)

probability is a probability.

This formula is used as often as the previous one, because the same tables are used to look up not only probabilities but also quantiles.

For example, when calculating confidence intervals, a confidence probability is specified, from which the value z must be calculated.

Considering that the confidence interval consists of an upper and a lower bound and that the normal distribution is symmetrical around zero, it is sufficient to obtain the upper bound (positive deviation). The lower bound is taken with a negative sign. Let us denote the confidence probability as γ (gamma); then the upper limit of the confidence interval is calculated using the following formula.

z = Ф⁻¹((1 + γ) / 2)

Let us calculate in Excel the values of z (which correspond to the deviation from the mean in sigmas) for several probabilities, including those that any statistician knows by heart: 90%, 95%, and 99%. In cell B2, enter the formula =NORM.S.INV((1+A2)/2). By changing the value of the variable (the probability in cell A2), we get different boundaries of the intervals.

The z value for a 95% confidence level is 1.96, which is almost 2 standard deviations. From here it is easy to estimate the possible spread of a normal random variable even in one's head. In general, the 90%, 95%, and 99% confidence levels correspond to intervals of ±1.64σ, ±1.96σ, and ±2.58σ.
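The quantile calculation can also be sketched in Python, assuming the standard library method `NormalDist.inv_cdf` as the inverse distribution function. The helper name `z_for_confidence` is illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_for_confidence(gamma: float) -> float:
    """Upper bound z such that P(-z < Z < z) = gamma, for 0 < gamma < 1."""
    # The central area gamma leaves (1 - gamma) / 2 in each tail,
    # so the upper bound is the quantile of order (1 + gamma) / 2.
    return STD_NORMAL.inv_cdf((1.0 + gamma) / 2.0)


for gamma in (0.90, 0.95, 0.99):
    print(f"{gamma:.2f} -> {z_for_confidence(gamma):.3f}")
# 0.90 -> 1.645
# 0.95 -> 1.960
# 0.99 -> 2.576
```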

In general, the NORM.S.DIST and NORM.S.INV functions allow you to perform any calculation related to the normal distribution. But to make things easier, Excel has a few other functions. For example, to calculate confidence intervals for the mean, you can use CONFIDENCE.NORM. To test the arithmetic mean there is the Z.TEST function.

Consider a couple more useful formulas with examples.

NORM.DIST function

The NORM.DIST function differs from NORM.S.DIST only in that it is used to process data of any scale, not just normalized data. The normal distribution parameters are specified in the syntax.

=NORM.DIST(x, mean, standard_dev, cumulative)

mean is the mathematical expectation, the first parameter of the normal distribution model

standard_dev – the standard deviation, the second parameter of the model

cumulative – if 0, the density is calculated; if 1, the value of the function, i.e. P(X < x)

For example, the density for a value of 15, drawn from a normal population with mean 10 and standard deviation 3, is calculated as follows:

=NORM.DIST(15, 10, 3, 0)

If the last parameter is set to 1, then we get the probability that the normal random variable will be less than 15 for the given distribution parameters. Thus, the probabilities can be calculated directly from the original data.
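For reference, the same two quantities for the value 15 with mean 10 and standard deviation 3 can be obtained with a few lines of Python. The sketch assumes the standard `statistics.NormalDist` class; the variable names are illustrative.

```python
from statistics import NormalDist

# Normal model with mean 10 and standard deviation 3.
model = NormalDist(mu=10.0, sigma=3.0)

density_at_15 = model.pdf(15.0)   # height of the density curve at x = 15
prob_below_15 = model.cdf(15.0)   # P(X < 15)

print(f"{density_at_15:.4f} {prob_below_15:.4f}")
```

The density comes out at roughly 0.033 and the probability at roughly 0.952, since 15 lies about 1.67 standard deviations above the mean.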

NORM.INV function

This is a quantile of the normal distribution, i.e. the value of the inverse function. The syntax is the following.

=NORM.INV(probability, mean, standard_dev)

probability – the probability

mean – the expectation

standard_dev – the standard deviation

The purpose is the same as that of NORM.S.INV, only the function works with data of any scale.

An example is shown in the video at the end of the article.

Modeling the Normal Distribution

Some tasks require the generation of normal random numbers. There is no ready-made function for this. However, Excel has two functions that return random numbers: RANDBETWEEN and RAND. The first produces uniformly distributed random integers within the specified limits. The second generates uniformly distributed random numbers between 0 and 1. To make an artificial sample with any given distribution, you need the RAND function.

Let's say that for an experiment it is necessary to obtain a sample from a normally distributed general population with a mean of 10 and a standard deviation of 3. For one random value, we write the following formula in Excel.

=NORM.INV(RAND(), 10, 3)

Let's fill it down to the required number of cells, and the normal sample is ready.

To model standardized data, you should use NORM.S.INV.

The process of converting uniform numbers to normal numbers can be shown in the following diagram. From the uniform probabilities that are generated by the RAND formula, horizontal lines are drawn to the graph of the normal distribution function. Then, projections onto the horizontal axis are lowered from the points of intersection of the probabilities with the graph.
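The conversion described above is known as the inverse transform method, and it can be sketched in Python as follows. The function name `normal_sample` and the seed are arbitrary choices for the illustration; `random.Random.random` plays the role of RAND and `NormalDist.inv_cdf` the role of the inverse normal function.

```python
import random
from statistics import NormalDist, fmean, stdev


def normal_sample(n: int, mean: float = 10.0, sd: float = 3.0, seed=None) -> list:
    """Draw n normal values by pushing uniform numbers through the inverse CDF."""
    rng = random.Random(seed)
    dist = NormalDist(mu=mean, sigma=sd)
    values = []
    while len(values) < n:
        u = rng.random()        # uniform on [0, 1)
        if u > 0.0:             # inv_cdf is defined only for 0 < u < 1
            values.append(dist.inv_cdf(u))
    return values


sample = normal_sample(20000, seed=1)
print(round(fmean(sample), 2), round(stdev(sample), 2))
```

With a sample this large, the printed mean and standard deviation should land close to 10 and 3, though the exact digits depend on the seed.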

The most famous and frequently used law in probability theory is the normal distribution law, or Gauss law.

The main feature of the normal distribution law is that it is the limiting law for other distribution laws.

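Note that for a normal distribution, the integral function has the form:

$$F\left(x\right)={{1}\over {\sigma \sqrt{2\pi }}}\int^x_{-\infty }{e^{-{{{\left(t-a\right)}^2}\over {2{\sigma }^2}}}dt}.$$

Let us now show that the probabilistic meaning of the parameters $a$ and $\sigma $ is as follows: $a$ is the mathematical expectation and $\sigma $ is the standard deviation (that is, $\sigma =\sqrt{D\left(X\right)}$) of the normal distribution:

a) by the definition of the mathematical expectation of a continuous random variable, and substituting $z={{x-a}\over {\sigma }}$, we have

$$M\left(X\right)=\int^{+\infty }_{-\infty }{x f\left(x\right)dx}={{1}\over {\sqrt{2\pi }}}\int^{+\infty }_{-\infty }{\left(\sigma z+a\right)e^{-z^2/2}dz}=a.$$

Indeed,

$$\int^{+\infty }_{-\infty }{z e^{-z^2/2}dz}=0,$$

since there is an odd function under the integral sign, and the limits of integration are symmetrical with respect to the origin;

$$\int^{+\infty }_{-\infty }{e^{-z^2/2}dz}=\sqrt{2\pi }$$

is the Poisson integral.

So, the mathematical expectation of the normal distribution is equal to the parameter $a$.

b) by the definition of the variance of a continuous random variable and, taking into account that $M\left(X\right)=a$, we can write

$$D\left(X\right)=\int^{+\infty }_{-\infty }{{\left(x-a\right)}^2 f\left(x\right)dx}={{{\sigma }^2}\over {\sqrt{2\pi }}}\int^{+\infty }_{-\infty }{z^2 e^{-z^2/2}dz}.$$

Integrating by parts, setting $u=z$, $dv=z e^{-z^2/2}dz$, we find

$$D\left(X\right)={\sigma }^2.$$

Hence $\sigma \left(X\right)=\sqrt{D\left(X\right)}=\sigma $.

So, the standard deviation of the normal distribution is equal to the parameter $\sigma $.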

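If $a=0$ and $\sigma =1$, the normal distribution is called the normalized (or standard normal) distribution. Then, obviously, the normalized density (differential function) and the normalized integral distribution function are written, respectively, in the form:

$$\varphi \left(x\right)={{1}\over {\sqrt{2\pi }}}e^{-x^2/2},\qquad F_0\left(x\right)={{1}\over {\sqrt{2\pi }}}\int^x_{-\infty }{e^{-t^2/2}dt}.$$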

(The function, as you know, is called the Laplace function (see LECTURE 5) or the probability integral. Both functions, that is, , are tabulated and their values ​​are recorded in the corresponding tables).

Normal distribution properties (normal curve properties):

1. Obviously, the function $f\left(x\right)$ is defined on the entire real line.

2. $f\left(x\right)>0$, that is, the normal curve is located above the $Ox$ axis.

3. $\lim_{\left|x\right|\to \infty } f\left(x\right)=0$, that is, the $Ox$ axis serves as the horizontal asymptote of the graph.

4. The normal curve is symmetrical about the straight line $x=a$ (accordingly, the graph of the function $\varphi \left(x\right)$ is symmetrical about the $Oy$ axis).

Therefore, we can write: .

5. .

6. It is easy to show that the points $x=a-\sigma $ and $x=a+\sigma $ are the inflection points of the normal curve (prove this yourself).

7.It's obvious that

but since , That . Besides , therefore, all odd moments are equal to zero.

For even moments, we can write:

8. .

9. .

10. , Where .

11. For negative values ​​of the random variable: , where .


13. The probability of hitting a random variable on a plot symmetrical about the center of distribution is equal to:

EXAMPLE 3. Show that a normally distributed random variable X deviates from its expectation M(X) by no more than $3\sigma $.

Solution. For a normal distribution: $P\left(\left|X-a\right|< 3\sigma \right)=2\Phi \left(3\right)=2\cdot 0.49865=0.9973$.

In other words, the probability that the absolute value of the deviation will exceed triple the standard deviation is very small, namely 0.0027. This means that this can happen in only 0.27% of cases. Such events, based on the principle of the impossibility of unlikely events, can be considered practically impossible.

So, an event with a probability of 0.9973 can be considered practically certain, that is, a random variable deviates from the mathematical expectation by no more than $3\sigma $.

EXAMPLE 4. Knowing the characteristics of the normal distribution of a random variable X - tensile strength of steel: kg / mm 2 and kg / mm 2, find the probability of obtaining steel with a tensile strength of 31 kg / mm 2 to 35 kg / mm 2.

Solution.

3. Exponential distribution (exponential distribution law)

The exponential distribution is the probability distribution of a continuous random variable X which is described by the differential function (distribution density)

$$f\left(x\right)=\begin{cases}0, & x<0,\\ \lambda e^{-\lambda x}, & x\ge 0,\end{cases}$$

where $\lambda $ is a constant positive value.

The exponential distribution is defined by one parameter $\lambda $. This feature of the exponential distribution indicates its advantage over distributions that depend on a larger number of parameters. Usually, the parameters are unknown and one has to find their estimates (approximate values); of course, it is easier to estimate one parameter than two or three.

It is easy to write the integral function of the exponential distribution:

$$F\left(x\right)=\begin{cases}0, & x<0,\\ 1-e^{-\lambda x}, & x\ge 0.\end{cases}$$

We have defined the exponential distribution using the differential function; it is clear that it can also be defined using the integral function.

Comment: Consider a continuous random variable T, the duration of failure-free operation of a product. Let us denote its possible values by t, $t\ge 0$. The cumulative distribution function $F\left(t\right)=P\left(T<t\right)$ defines the probability of failure of the product over a period of time t. Therefore, the probability of failure-free operation over the same time t, that is, the probability of the opposite event, is equal to

$$R\left(t\right)=P\left(T>t\right)=1-F\left(t\right).$$
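The relation between the failure probability and the probability of failure-free operation can be illustrated with a short Python sketch. It assumes the usual exponential form with a single positive rate parameter, written `lam` in the code; the function names and the numeric rate 0.02 are invented for the example.

```python
import math


def failure_probability(t: float, lam: float) -> float:
    """Integral function of the exponential law: 1 - exp(-lam * t) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return 1.0 - math.exp(-lam * t)


def reliability(t: float, lam: float) -> float:
    """Probability of failure-free operation for time t: the opposite event of failure."""
    return 1.0 - failure_probability(t, lam)


# Hypothetical rate: 0.02 failures per hour, evaluated at t = 100 hours.
print(round(failure_probability(100.0, 0.02), 4), round(reliability(100.0, 0.02), 4))
```

For these assumed numbers the reliability equals exp(-2), about 0.135, so the product fails within 100 hours with probability of about 0.865.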

The normal law plays a particularly important role in probability theory and is most often used in solving practical problems. Its main feature is that it is the limiting law, which is approached by other laws of distribution under very common typical conditions. For example, the sum of a sufficiently large number of independent (or weakly dependent) random variables approximately obeys the normal law, and the approximation is the more accurate, the more random variables are summed.

It has been experimentally proven that measurement errors, deviations in geometric dimensions and position of elements of building structures during their manufacture and installation, variability of the physical and mechanical characteristics of materials and loads acting on building structures are subject to the normal law.

Almost all random variables whose deviation from the mean is caused by a large set of random factors, each of which is individually insignificant, obey the Gaussian distribution (central limit theorem).

The normal distribution is the distribution of a continuous random variable for which the probability density has the form (Fig. 18.1)

Fig. 18.1. Normal distribution law for $a_1< a_2$.

$$f\left(x\right)={{1}\over {\sigma \sqrt{2\pi }}}e^{-{{{\left(x-a\right)}^2}\over {2{\sigma }^2}}}\qquad (18.1)$$

where $a$ and $\sigma $ are the distribution parameters.

The probabilistic characteristics of a random variable distributed according to the normal law are:

Mathematical expectation $M\left(X\right)=a$ (18.2)

Variance $D\left(X\right)={\sigma }^2$ (18.3)

Standard deviation $\sigma \left(X\right)=\sigma $ (18.4)

Asymmetry coefficient A = 0 (18.5)

Excess E = 0. (18.6)

The parameter σ included in the Gaussian distribution is equal to the root-mean-square deviation of the random variable. The value a determines the position of the distribution center (see Fig. 18.1), and the value σ determines the distribution width (Fig. 18.2), i.e. the statistical spread around the mean.

Fig. 18.2. Normal distribution law for σ₁ < σ₂ < σ₃

The probability of falling into a given interval (from x 1 to x 2) for a normal distribution, as in all cases, is determined by the integral of the probability density (18.1), which is not expressed in terms of elementary functions and is represented by a special function, called the Laplace function (integral of probabilities).

One of the representations of the probability integral:

$$\Phi \left(x\right)={{1}\over {\sqrt{2\pi }}}\int^x_0{e^{-t^2/2}dt}\qquad (18.7)$$

Value And called quantile.

It can be seen that Ф(х) is an odd function, i.e. Ф(-х) = -Ф(х) . The values ​​of this function are calculated and presented in the form of tables in the technical and educational literature.


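The distribution function of the normal law (Fig. 18.3) can be expressed in terms of the probability integral:

$$F\left(x\right)=0.5+\Phi \left({{x-a}\over {\sigma }}\right)\qquad (18.9)$$

Fig. 18.3. The distribution function of the normal law.

The probability that a random variable distributed according to the normal law falls into the interval from $x_1$ to $x_2$ is determined by the expression:

$$P\left(x_1< X < x_2\right)=\Phi \left({{x_2-a}\over {\sigma }}\right)-\Phi \left({{x_1-a}\over {\sigma }}\right)$$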

It should be noted that

Ф(0) = 0; Ф(∞) = 0.5; Ф(-∞) = -0.5.

When solving practical problems related to distribution, one often has to consider the probability of falling into an interval that is symmetric with respect to the mathematical expectation, if the length of this interval i.e. if the interval itself has a boundary from to , we have:

When solving practical problems, the boundaries of deviations of random variables are expressed through the standard, the standard deviation, multiplied by a certain factor that determines the boundaries of the area of ​​deviations of a random variable.

Taking and and also using the formula (18.10) and the table F (x) (Appendix No. 1), we obtain

These formulas show that if a random variable has a normal distribution, then the probability of its deviation from its mean value by no more than σ is 68.27%, by no more than 2σ - 95.45%, and by no more than 3σ - 99.73%.

Since the value of 0.9973 is close to unity, it is considered practically impossible for a normally distributed random variable to deviate from the mathematical expectation by more than 3σ. This rule, which is valid only for a normal distribution, is called the three sigma rule. The probability of violating it is P = 1 - 0.9973 = 0.0027. This rule is used when setting the limits of permissible deviations (tolerances) of the geometric characteristics of products and structures.

The normal distribution plays an important role in data analysis.

Sometimes the term Gaussian distribution, in honor of K. Gauss, is used instead of the term normal distribution (older terms, practically not used now: Gauss law, Gauss-Laplace distribution).

Univariate normal distribution

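The normal distribution has the density:

$$f\left(x\right)={{1}\over {\sigma \sqrt{2\pi }}}e^{-{{{\left(x-\mu \right)}^2}\over {2{\sigma }^2}}}$$

In this formula, $\mu $ and $\sigma $ are fixed parameters: $\mu $ is the mean and $\sigma $ is the standard deviation.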

Graphs of density for various parameters are given.

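The characteristic function of the normal distribution with mean $a$ and standard deviation $\sigma$ has the form:

$$\varphi \left(t\right)=e^{iat-\frac{\sigma^2t^2}{2}}$$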

Differentiating the characteristic function and setting t = 0, we obtain moments of any order.
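As a rough numerical illustration of this statement, the sketch below approximates the first two derivatives at $t=0$ by central finite differences and recovers the first two moments. It assumes the standard form of the characteristic function, $\varphi(t)=e^{iat-\sigma^2t^2/2}$, and the standard relation that the $k$-th derivative at zero equals $i^k$ times the $k$-th moment. The function names, the step `h` and the sample parameters are illustrative.

```python
import cmath


def char_fn(t: float, a: float, sigma: float) -> complex:
    """Characteristic function of a normal law with mean a and standard deviation sigma."""
    return cmath.exp(1j * a * t - 0.5 * (sigma * t) ** 2)


def first_two_moments(a: float, sigma: float, h: float = 1e-4) -> tuple[float, float]:
    """Approximate E[X] and E[X^2] from derivatives of the characteristic function at t = 0."""
    plus, zero, minus = char_fn(h, a, sigma), char_fn(0.0, a, sigma), char_fn(-h, a, sigma)
    d1 = (plus - minus) / (2 * h)            # approximates phi'(0)  = i * E[X]
    d2 = (plus - 2 * zero + minus) / h ** 2  # approximates phi''(0) = -E[X^2]
    return (d1 / 1j).real, (-d2).real


m1, m2 = first_two_moments(a=3.0, sigma=2.0)
# m1 is close to the mean 3; m2 is close to a^2 + sigma^2 = 13.
print(round(m1, 4), round(m2, 4))
```

The variance is then recovered as the second moment minus the square of the first.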

The normal density curve is symmetric about $x=a$ and has a single maximum at this point, equal to $\frac{1}{\sigma \sqrt{2\pi }}$.

The standard deviation $\sigma$ varies from 0 to ∞.

The mean $a$ varies from -∞ to +∞.

As $\sigma$ increases, the curve spreads out along the x-axis; as $\sigma$ tends to 0, it contracts around the mean (the parameter $\sigma$ characterizes the spread, or scatter).

When $a$ changes, the curve shifts along the x-axis (see the plots).
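These properties of the density can be seen in a few lines of code. The sketch below assumes the usual normal density with mean $a$ and standard deviation $\sigma$; the function name `normal_pdf` and the chosen parameter values are illustrative.

```python
import math


def normal_pdf(x: float, a: float, sigma: float) -> float:
    """Density of the normal law with mean a and standard deviation sigma."""
    return math.exp(-((x - a) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


# Height of the maximum (at x = a) for several standard deviations:
# a larger sigma gives a lower and wider curve.
for sigma in (0.5, 1.0, 2.0):
    print(sigma, round(normal_pdf(0.0, 0.0, sigma), 4))
# 0.5 0.7979
# 1.0 0.3989
# 2.0 0.1995

# Changing the mean only shifts the curve: the value one unit to the right
# of the mean is the same for a = 0 and a = 3.
print(math.isclose(normal_pdf(1.0, 0.0, 1.0), normal_pdf(4.0, 3.0, 1.0)))  # True
```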

By varying the parameters $a$ and $\sigma$, we obtain various models of random variables that arise in telephony.

Typical applications of the normal law in the analysis of, for example, telecommunications data are signal modeling and the description of noise, interference, errors and traffic.

Plots of univariate normal distribution

Figure 1. Normal distribution density plot: mean is 0, standard deviation is 1

Figure 2. Density plot of the standard normal distribution with areas containing 68% and 95% of all observations

Figure 3. Density plots of normal distributions with zero mean and different standard deviations ($\sigma=0.5$, $\sigma=1$, $\sigma=2$)

Figure 4. Plots of two normal distributions, N(-2,2) and N(3,2).

Note that the center of the distribution shifts when the mean changes.

Note

In the STATISTICA program, the notation N(3,2) denotes a normal (Gaussian) law with mean = 3 and standard deviation = 2.

In the literature, the second parameter is sometimes interpreted as the variance, i.e. the square of the standard deviation.
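The two readings of N(3,2) lead to different distributions, so it is worth checking which convention a given tool uses. The sketch below is illustrative only; it uses Python's `statistics.NormalDist`, whose second parameter is the standard deviation, and the variable names are made up for this example.

```python
import math
from statistics import NormalDist

# N(3,2) read as mean 3 and standard deviation 2.
by_std = NormalDist(mu=3.0, sigma=2.0)

# N(3,2) read as mean 3 and variance 2, so the standard deviation is sqrt(2).
by_var = NormalDist(mu=3.0, sigma=math.sqrt(2.0))

print(by_std.variance, round(by_var.variance, 6))

# The same point x = 5 has different cumulative probabilities under the two readings.
print(round(by_std.cdf(5.0), 4), round(by_var.cdf(5.0), 4))
```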

Calculating percentage points of the normal distribution with the STATISTICA probability calculator

With the STATISTICA probability calculator, various characteristics of distributions can be calculated without resorting to the cumbersome tables used in old books.

Step 1. Launch Analysis / Probability Calculator / Distributions.

In the Distribution section, choose Normal.

Figure 5. Launching the probability distribution calculator

Step 2. Specify the parameters of interest.

For example, suppose we want to calculate the 95% quantile of a normal distribution with a mean of 0 and a standard deviation of 1.

Enter these parameters in the mean and standard deviation fields of the calculator.

Enter the parameter p=0.95.

The "Inverse d.f." (inverse distribution function) checkbox will be selected automatically. Check the "Graph" box.

Click the "Calculate" button in the upper right corner.

Figure 6. Parameter setting

Step 3. The Z field shows the result: the quantile value is 1.64 (see the next window).
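The same quantile can be cross-checked outside the calculator. The sketch below is not part of the STATISTICA workflow; it uses `statistics.NormalDist.inv_cdf` from the Python standard library as an independent check, with the mean, standard deviation and p taken from the example above.

```python
from statistics import NormalDist

# Standard normal law: mean 0, standard deviation 1, as in the example.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# The 95% quantile is the value z for which P(X < z) = 0.95.
z = standard_normal.inv_cdf(0.95)
print(round(z, 2))  # 1.64
print(round(z, 6))  # 1.644854

# Applying the distribution function to z returns the original probability.
print(round(standard_normal.cdf(z), 6))  # 0.95
```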

Figure 7. Viewing the result of the calculator

Figure 8. Plots of the density and distribution functions. Vertical line at x=1.644854

Figure 9. Graphs of the normal distribution function. Vertical dotted lines - x=-1.5, x=-1, x=-0.5, x=0

Figure 10. Graphs of the normal distribution function. Vertical dotted lines - x=0.5, x=1, x=1.5, x=2

Estimation of normal distribution parameters

Normal distribution values can be calculated using an interactive calculator.

Bivariate normal distribution

The univariate normal distribution generalizes naturally to the bivariate (two-dimensional) normal distribution.

For example, if a signal is considered at only one point, a one-dimensional distribution is sufficient; at two points a two-dimensional distribution is needed, at three points a three-dimensional one, and so on.

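The general formula for the density of the bivariate normal distribution is:

$$f\left(x_1,x_2\right)=\frac{1}{2\pi \sigma_1\sigma_2\sqrt{1-\rho^2}}\exp \left\{-\frac{1}{2\left(1-\rho^2\right)}\left[\frac{\left(x_1-a_1\right)^2}{\sigma_1^2}-\frac{2\rho \left(x_1-a_1\right)\left(x_2-a_2\right)}{\sigma_1\sigma_2}+\frac{\left(x_2-a_2\right)^2}{\sigma_2^2}\right]\right\}$$

where $\rho$ is the pairwise correlation between $x_1$ and $x_2$;

$a_1$, $\sigma_1$ are the mean and standard deviation of the variable $x_1$, respectively;

$a_2$, $\sigma_2$ are the mean and standard deviation of the variable $x_2$, respectively.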

If the random variables $X_1$ and $X_2$ are independent, then their correlation coefficient is zero, $\rho = 0$; accordingly, the middle term in the exponent vanishes, and we have:

$$f\left(x_1,x_2\right)=f\left(x_1\right)f\left(x_2\right)$$

For independent quantities, the two-dimensional density decomposes into the product of two one-dimensional densities.
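This factorization is easy to verify numerically. The sketch below assumes the standard bivariate normal density with means $a_1$, $a_2$, standard deviations $\sigma_1$, $\sigma_2$ and correlation $\rho$; the function names and the evaluation point are illustrative.

```python
import math


def normal_pdf(x: float, a: float, sigma: float) -> float:
    """Univariate normal density with mean a and standard deviation sigma."""
    return math.exp(-((x - a) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def bivariate_pdf(x1: float, x2: float, a1: float, s1: float,
                  a2: float, s2: float, rho: float) -> float:
    """Bivariate normal density; rho is the correlation, with |rho| < 1."""
    z1 = (x1 - a1) / s1
    z2 = (x2 - a2) / s2
    # Quadratic form in the exponent; the middle term vanishes when rho = 0.
    q = (z1 * z1 - 2 * rho * z1 * z2 + z2 * z2) / (1 - rho * rho)
    return math.exp(-q / 2) / (2 * math.pi * s1 * s2 * math.sqrt(1 - rho * rho))


x1, x2 = 0.3, -1.2
product = normal_pdf(x1, 0.0, 1.0) * normal_pdf(x2, 0.0, 1.0)

# With zero correlation the joint density equals the product of the marginals.
print(math.isclose(bivariate_pdf(x1, x2, 0.0, 1.0, 0.0, 1.0, rho=0.0), product))  # True

# With non-zero correlation it does not.
print(math.isclose(bivariate_pdf(x1, x2, 0.0, 1.0, 0.0, 1.0, rho=0.5), product))  # False
```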

Bivariate Normal Density Plots

Figure 11. Density plot of a bivariate normal distribution (zero mean vector, identity covariance matrix)

Figure 12. Cross-section of the density plot of the bivariate normal distribution by the plane z=0.05

Figure 13. Density plot of a bivariate normal distribution (zero mean vector, covariance matrix with 1 on the main diagonal and 0.5 off the diagonal)

Figure 14. Cross-section of the density plot of the bivariate normal distribution (zero mean vector, covariance matrix with 1 on the main diagonal and 0.5 off the diagonal) by the plane z=0.05

Figure 15. Density plot of a bivariate normal distribution (zero mean vector, covariance matrix with 1 on the main diagonal and -0.5 off the diagonal)

Figure 16. Cross-section of the density plot of the bivariate normal distribution (zero mean vector, covariance matrix with 1 on the main diagonal and -0.5 off the diagonal) by the plane z=0.05

Figure 17. Cross-sections of the density plots of bivariate normal distributions by the plane z=0.05

For a better understanding of the bivariate normal distribution, try the following problem.

Exercise. Look at the density plot of the bivariate normal distribution. Can it be obtained by rotating the density plot of a univariate normal distribution? In which cases does a deformation also have to be applied?
