Probability distribution law of a discrete two-dimensional random variable. Systems of random variables


Definition 2.7. A two-dimensional random variable is a pair of random variables (X, Y), i.e. a random point on the coordinate plane (Fig. 2.11).

Fig. 2.11.

A two-dimensional random variable is a special case of a multidimensional random variable, or random vector.

Definition 2.8. A random vector is a random function X(t) with a finite set of possible values of the argument t, whose value for each value of t is a random variable.

A two-dimensional random variable is called continuous if its coordinates are continuous, and discrete if its coordinates are discrete.

To specify the distribution law of a two-dimensional random variable means to establish a correspondence between its possible values and the probabilities of these values. By the way the law is specified, random variables are divided into continuous and discrete, although there are universal ways to specify the distribution law of any RV.

Discrete two-dimensional random variable

A discrete two-dimensional random variable is specified using a distribution table (Table 2.1).

Table 2.1

Distribution table (joint distribution) of the RV (X, Y)

The table elements are defined by the formula

p_ij = P(X = x_i, Y = y_j), i = 1, ..., n, j = 1, ..., m.

Properties of the distribution table elements: p_ij ≥ 0; Σ_i Σ_j p_ij = 1.

The distribution over each coordinate is called one-dimensional, or marginal:

p_i^(1) = P(X = x_i) is the marginal distribution of the RV X;

p_j^(2) = P(Y = y_j) is the marginal distribution of the RV Y.

The joint distribution of the RVs X and Y, given by the set of probabilities {p_ij}, i = 1, ..., n, j = 1, ..., m (the distribution table), determines the marginal distributions: p_i^(1) = Σ_j p_ij.

Similarly, for the RV Y: p_j^(2) = Σ_i p_ij.
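The row and column sums can be sketched in code; the joint table below is hypothetical, chosen only so that all entries are non-negative and sum to one:

```python
# Sketch: marginal (one-dimensional) distributions from a joint distribution
# table. The joint probabilities here are made up for illustration.
joint = [
    [0.10, 0.20],  # P(X = x_1, Y = y_1), P(X = x_1, Y = y_2)
    [0.30, 0.40],  # P(X = x_2, Y = y_1), P(X = x_2, Y = y_2)
]

# p_i^(1) = sum over j of p_ij: marginal distribution of X (row sums)
marginal_x = [sum(row) for row in joint]
# p_j^(2) = sum over i of p_ij: marginal distribution of Y (column sums)
marginal_y = [sum(col) for col in zip(*joint)]
total = sum(marginal_x)  # must equal 1 by the normalization property
```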

Problem 2.14. Given:

Continuous two-dimensional random variable

f(x, y)dxdy is the element of probability for a two-dimensional random variable (X, Y): the probability that (X, Y) falls into a rectangle with sides dx, dy as dx, dy → 0:

P(x ≤ X < x + dx, y ≤ Y < y + dy) ≈ f(x, y)dxdy;

f(x, y) is the distribution density of the two-dimensional random variable (X, Y). Specifying f(x, y) gives complete information about the distribution of the two-dimensional random variable.

The marginal distributions are specified as follows: for X, by the distribution density f_1(x) of the RV X; for Y, by the distribution density f_2(y) of the RV Y.

Setting the distribution law of a two-dimensional random variable by the distribution function

A universal way to specify the distribution law for a discrete or continuous two-dimensional random variable is the distribution function F(x, y).

Definition 2.9. The distribution function F(x, y) is the probability of the joint occurrence of the events (X < x) and (Y < y), i.e. F(x_0, y_0) = P(X < x_0, Y < y_0). Geometrically, it is the probability that a random point (X, Y) on the coordinate plane falls into the infinite quadrant with vertex at the point M(x_0, y_0) (the shaded area in Fig. 2.12).

Fig. 2.12. Illustration of the distribution function F(x, y)

Properties of the function F(x, y):

  • 1) 0 ≤ F(x, y) ≤ 1;
  • 2) F(−∞, −∞) = F(x, −∞) = F(−∞, y) = 0; F(∞, ∞) = 1;
  • 3) F(x, y) is non-decreasing in each argument;
  • 4) F(x, y) is continuous from the left in each argument;
  • 5) consistency of the distributions:

F(x, ∞) = F_1(x) is the marginal distribution of X; F(∞, y) = F_2(y) is the marginal distribution of Y.

The connection between f(x, y) and F(x, y): f(x, y) = ∂²F(x, y)/∂x∂y.

Relationship between the joint density and the marginal densities. Given f(x, y), we obtain the marginal distribution densities f_1(x), f_2(y):

f_1(x) = ∫_{−∞}^{∞} f(x, y) dy,  f_2(y) = ∫_{−∞}^{∞} f(x, y) dx.


The case of independent coordinates of a two-dimensional random variable

Definition 2.10. The RVs X and Y are independent if any events associated with each of these RVs are independent. From the definition of independent RVs it follows:

  • 1) p_ij = p_i^(1) p_j^(2);
  • 2) F(x, y) = F_1(x) F_2(y).

It turns out that for independent RVs X and Y it also holds that

3) f(x, y) = f_1(x) f_2(y).

Let us prove that for independent RVs X and Y, 2) ⇔ 3).

Proof. a) Suppose 2) holds, i.e. F(x, y) = F_1(x) F_2(y); at the same time F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(u, v) du dv, whence, differentiating with respect to x and y, 3) follows;

b) now let 3) hold; then

F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f_1(u) f_2(v) du dv = F_1(x) F_2(y),

i.e. 2) holds.

Let us consider some problems.

Problem 2.15. The distribution is given by the following table:

We build marginal distributions:

We get P(X = 3, Y = 4) = 0.17 ≠ P(X = 3)·P(Y = 4) = 0.1485, so the RVs X and Y are dependent.

Distribution function:


Problem 2.16. The distribution is given by the following table:

We get p_11 = 0.2·0.3 = 0.06; p_12 = 0.2·0.7 = 0.14; p_21 = 0.8·0.3 = 0.24; p_22 = 0.8·0.7 = 0.56, so the RVs X and Y are independent.

Problem 2.17. Given f(x, y) = (1/π) exp[−0.5(x² + 2xy + 5y²)]. Find f_1(x) and f_2(y).

Solution

(calculate yourself).

A two-dimensional random variable (X, Y) is a variable whose possible values are pairs of numbers (x, y). The components X and Y, considered simultaneously, form a system of two random variables.

A two-dimensional variable can be interpreted geometrically as a random point M(X; Y) in the xOy plane or as a random vector OM.

A two-dimensional variable is called discrete if its components are discrete.

A two-dimensional variable is called continuous if its components are continuous.

The probability distribution law of a two-dimensional random variable is the correspondence between its possible values and their probabilities.

The distribution law of a discrete two-dimensional random variable can be given: a) in the form of a double-entry table containing possible values ​​and their probabilities; b) analytically, for example, in the form of a distribution function.

The probability distribution function of a two-dimensional random variable is the function F(x, y) defining, for each pair of numbers (x, y), the probability that X takes a value less than x and, at the same time, Y takes a value less than y:

F(x, y) = P(X < x, Y < y).

Geometrically, this equality can be interpreted as follows: F(x, y) is the probability that the random point (X, Y) falls into the infinite quadrant with vertex (x, y) lying to the left of and below this vertex.

Sometimes the term "integral function" is used instead of the term "distribution function".

The distribution function has the following properties:

Property 1. The values ​​of the distribution function satisfy the double inequality

0 ≤ F (x, y) ≤ 1.

Property 2. The distribution function is a non-decreasing function with respect to each argument:

F(x_2, y) ≥ F(x_1, y) if x_2 > x_1,

F(x, y_2) ≥ F(x, y_1) if y_2 > y_1.

Property 3. The limit relations hold:

1) F(−∞, y) = 0,

2) F(x, −∞) = 0,

3) F(−∞, −∞) = 0,

4) F(∞, ∞) = 1.

Property 4. a) For y = ∞, the distribution function of the system becomes the distribution function of the component X:

F(x, ∞) = F 1 (x).

b) For x = ∞ the distribution function of the system becomes the distribution function of the component Y:



F(∞, y) = F 2 (y).

Using the distribution function, one can find the probability of a random point falling into the rectangle x_1 < X < x_2, y_1 < Y < y_2:

P(x_1 < X < x_2, y_1 < Y < y_2) = [F(x_2, y_2) − F(x_1, y_2)] − [F(x_2, y_1) − F(x_1, y_1)].
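The rectangle formula is easy to verify numerically. The F below is an assumed example (a system of two independent components, each with a unit-exponential distribution), not a function from the text:

```python
import math

def F(x, y):
    # Assumed example: F(x, y) = (1 - e^-x)(1 - e^-y) for x, y >= 0,
    # i.e. two independent unit-exponential components.
    return (1 - math.exp(-x)) * (1 - math.exp(-y))

def rect_prob(x1, x2, y1, y2):
    # P(x1 < X < x2, y1 < Y < y2) via the four corner values of F
    return (F(x2, y2) - F(x1, y2)) - (F(x2, y1) - F(x1, y1))

p = rect_prob(0.0, 1.0, 0.0, 2.0)
# For this independent F the same probability factors into marginals:
expected = (1 - math.exp(-1)) * (1 - math.exp(-2))
```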

The density of the joint probability distribution (the two-dimensional probability density) of a continuous two-dimensional random variable is the second mixed derivative of the distribution function:

f(x, y) = ∂²F(x, y)/∂x∂y.

Sometimes, instead of the term "two-dimensional probability density", the term "differential function of the system" is used.

The joint distribution density can be regarded as the limit of the ratio of the probability of a random point falling into a rectangle with sides Δx and Δy to the area of this rectangle as both sides tend to zero; geometrically, it can be interpreted as a surface, called the distribution surface.

Knowing the distribution density, one can find the distribution function by the formula

F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(u, v) du dv.

The probability of a random point (X, Y) falling into a region D is determined by the equality

P((X, Y) ∈ D) = ∬_D f(x, y) dx dy.

A two-dimensional probability density has the following properties:

Property 1. Bivariate probability density is non-negative:

f(x,y) ≥ 0.

Property 2. The double improper integral of the two-dimensional probability density with infinite limits equals one:

∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1.

In particular, if all possible values of (X, Y) belong to a finite region D, then ∬_D f(x, y) dx dy = 1.

226. The probability distribution of a discrete two-dimensional random variable is given:

Find the laws of distribution of components.

228. The distribution function of a two-dimensional random variable is given

Find the probability that a random point (X, Y) falls into the rectangle bounded by the lines x = 0, x = π/4, y = π/6, y = π/3.

229. Find the probability of hitting a random point ( X, Y) into a rectangle bounded by lines x = 1, x = 2, y = 3, y= 5 if the distribution function is known

230. The distribution function of a two-dimensional random variable is given

Find the two-dimensional probability density of the system.

231. Inside the circle x² + y² ≤ R² a two-dimensional probability density is given; outside the circle f(x, y) = 0. Find: a) the constant C; b) the probability that a random point (X, Y) falls into a circle of radius r = 1 centered at the origin, if R = 2.

232. In the first quadrant, the distribution function of a system of two random variables is given: F(x, y) = 1 + 2^(−x−y) − 2^(−x) − 2^(−y). Find: a) the two-dimensional probability density of the system; b) the probability that a random point (X, Y) falls into the triangle with vertices A(1; 3), B(3; 3), C(2; 8).
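Problem 232 lends itself to a numeric sanity check. Reading the garbled statement as F(x, y) = 1 − 2^(−x) − 2^(−y) + 2^(−x−y), a mixed finite difference of F should approximate the density f = ∂²F/∂x∂y, which analytically equals (ln 2)²·2^(−x−y):

```python
import math

def F(x, y):
    # Distribution function of Problem 232 (first quadrant), as read above
    return 1 + 2 ** (-x - y) - 2 ** (-x) - 2 ** (-y)

def f_numeric(x, y, h=1e-4):
    # Mixed second finite difference approximating d^2F/dxdy
    return (F(x + h, y + h) - F(x + h, y) - F(x, y + h) + F(x, y)) / h ** 2

def f_exact(x, y):
    # Analytic density obtained by differentiating F twice
    return math.log(2) ** 2 * 2 ** (-x - y)

err = abs(f_numeric(1.0, 2.0) - f_exact(1.0, 2.0))
```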

8.2. Conditional laws of distribution of probabilities of components
discrete two-dimensional random variable

Let the components X and Y are discrete and have the following possible values, respectively: x 1 , x 2 , …, x n ; y 1 , y 2 , …, y m.

The conditional distribution of the component X given Y = y_j (j keeps the same value for all possible values of X) is the set of conditional probabilities

p(x 1 |y j), p(x 2 |y j), …, p(x n |y j).

The conditional distribution Y is defined similarly.

The conditional probabilities of the components X and Y are calculated, respectively, by the formulas

p(x_i | y_j) = p(x_i, y_j) / p(y_j),  p(y_j | x_i) = p(x_i, y_j) / p(x_i).

To control the calculations, it is advisable to make sure that the sum of the probabilities of the conditional distribution is equal to one.
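A sketch of these formulas for a hypothetical joint table, including the control that the conditional probabilities sum to one:

```python
# Conditional distribution of X given Y = y for a discrete table.
# The joint probabilities below are made up for illustration.
joint = {
    # (x, y): P(X = x, Y = y)
    (6, 10): 0.15, (6, 14): 0.25,
    (9, 10): 0.35, (9, 14): 0.25,
}

def conditional_x(y):
    # p(x | y) = p(x, y) / p(y), where p(y) is the marginal of Y
    p_y = sum(p for (x, yy), p in joint.items() if yy == y)
    return {x: p / p_y for (x, yy), p in joint.items() if yy == y}

cond = conditional_x(10)
# Control: the conditional probabilities must sum to one
assert abs(sum(cond.values()) - 1.0) < 1e-9
```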

233. Given a discrete two-dimensional random variable ( X, Y):

Find: a) conditional distribution law X provided that Y=10; b) conditional distribution law Y provided that X=6.

8.3. Finding Densities and Conditional Distribution Laws
components of a continuous two-dimensional random variable

The distribution density of one of the components equals the improper integral, with infinite limits, of the joint distribution density of the system, where the variable of integration corresponds to the other component:

f_1(x) = ∫_{−∞}^{∞} f(x, y) dy,  f_2(y) = ∫_{−∞}^{∞} f(x, y) dx.

It is assumed here that the possible values ​​of each of the components belong to the entire numerical axis; if the possible values ​​belong to a finite interval, then the corresponding finite numbers are taken as the limits of integration.

The conditional distribution density of the component X at a given value Y = y is the ratio of the joint distribution density of the system to the distribution density of the component Y:

φ(x | y) = f(x, y) / f_2(y).

The conditional distribution density of the component Y is determined similarly:

ψ(y | x) = f(x, y) / f_1(x).

If the conditional distribution densities of random variables X and Y are equal to their unconditional densities, then such quantities are independent.

The distribution of a two-dimensional continuous random variable (X, Y) is called uniform if the density of the joint probability distribution is constant in the region containing all possible values (x, y).

235. The joint distribution density of a continuous two-dimensional random variable (X, Y) is given

Find: a) the distribution density of the components; b) conditional distribution densities of components.

236. Joint distribution density of a continuous two-dimensional random variable ( X, Y)

Find: a) constant factor C; b) distribution density of components; c) conditional distribution densities of components.

237. Continuous two-dimensional random variable ( X, Y) is distributed uniformly inside a rectangle with the center of symmetry at the origin and sides 2a and 2b parallel to the coordinate axes. Find: a) the two-dimensional probability density of the system; b) the distribution density of the components.

238. A continuous two-dimensional random variable (X, Y) is uniformly distributed inside a right triangle with vertices O(0; 0), A(0; 8), B(8; 0). Find: a) the two-dimensional probability density of the system; b) the densities and conditional distribution densities of the components.

8.4. Numerical characteristics of a continuous system
two random variables

Knowing the distribution densities of the components X and Y of a continuous two-dimensional random variable (X, Y), one can find their mathematical expectations and variances:

M(X) = ∫_{−∞}^{∞} x f_1(x) dx,  D(X) = ∫_{−∞}^{∞} [x − M(X)]² f_1(x) dx, and similarly for Y.

Sometimes it is more convenient to use formulas containing the two-dimensional probability density (the double integrals are taken over the range of possible values of the system):

M(X) = ∬ x f(x, y) dx dy,  M(Y) = ∬ y f(x, y) dx dy;

D(X) = ∬ [x − M(X)]² f(x, y) dx dy,  D(Y) = ∬ [y − M(Y)]² f(x, y) dx dy.

The initial moment ν_{k,s} of order k + s of the system (X, Y) is the mathematical expectation of the product X^k Y^s:

ν_{k,s} = M[X^k Y^s].

In particular,

ν_{1,0} = M(X),  ν_{0,1} = M(Y).

The central moment μ_{k,s} of order k + s of the system (X, Y) is the mathematical expectation of the product of the deviations, in the k-th and s-th powers respectively:

μ_{k,s} = M[(X − M(X))^k (Y − M(Y))^s].

In particular,

μ_{1,0} = M[X − M(X)] = 0,  μ_{0,1} = M[Y − M(Y)] = 0;

μ_{2,0} = M[(X − M(X))²] = D(X),  μ_{0,2} = M[(Y − M(Y))²] = D(Y);
The correlation moment μ_xy of the system (X, Y) is the central moment μ_{1,1} of order 1 + 1:

μ_xy = M[(X − M(X)) · (Y − M(Y))].

The correlation coefficient of the quantities X and Y is the ratio of the correlation moment to the product of the standard deviations of these quantities:

r_xy = μ_xy / (σ_x σ_y).

The correlation coefficient is a dimensionless quantity, and |r_xy| ≤ 1. It serves to assess the closeness of the linear relationship between X and Y: the closer |r_xy| is to one, the stronger the relationship; the closer |r_xy| is to zero, the weaker the relationship.

Two random variables are called correlated if their correlation moment is different from zero.

Two random variables are called uncorrelated if their correlation moment equals zero.

Two correlated quantities are necessarily dependent; dependent quantities, however, may be either correlated or uncorrelated. Independence of two quantities implies their uncorrelatedness, but from uncorrelatedness one cannot conclude that the quantities are independent (for normally distributed quantities, independence does follow from their uncorrelatedness).

For continuous quantities X and Y, the correlation moment can be found by the formulas:

μ_xy = ∬ [x − M(X)][y − M(Y)] f(x, y) dx dy = ∬ xy f(x, y) dx dy − M(X) M(Y).
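These formulas can be sanity-checked with a crude midpoint Riemann sum; the density f(x, y) = x + y on the unit square is an assumed example (its exact correlation moment is −1/144):

```python
# Midpoint Riemann sum for the correlation moment of an assumed density
# f(x, y) = x + y on [0, 1] x [0, 1] (this density integrates to 1).
n = 200
h = 1.0 / n
m_x = m_y = m_xy = mass = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        w = (x + y) * h * h   # f(x, y) dx dy
        mass += w             # total probability, should be ~1
        m_x += x * w          # M(X)
        m_y += y * w          # M(Y)
        m_xy += x * y * w     # M(XY)
cov = m_xy - m_x * m_y        # exact value is -1/144 ~ -0.00694
```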

239. The joint distribution density of a continuous two-dimensional random variable (X, Y) is given:

Find: a) mathematical expectations; b) variances of the X and Y components.

240. The joint distribution density of a continuous two-dimensional random variable (X, Y) is given:

Find the mathematical expectations and variances of the components.

241. The joint distribution density of a continuous two-dimensional random variable (X, Y) is f(x, y) = 2 cos x cos y in the square 0 ≤ x ≤ π/4, 0 ≤ y ≤ π/4; outside the square f(x, y) = 0. Find the mathematical expectations of the components.

242. Prove that if the two-dimensional probability density of a system of random variables ( X, Y) can be represented as a product of two functions, one of which depends only on x, and the other - only from y, then the quantities X and Y independent.

243. Prove that if X and Y connected by a linear relationship Y = aX + b, then the absolute value of the correlation coefficient is equal to one.

Solution. By the definition of the correlation coefficient,

r_xy = μ_xy / (σ_x σ_y), where μ_xy = M[(X − M(X)) · (Y − M(Y))]. (*)

Let us find the mathematical expectation of Y:

M(Y) = M(aX + b) = aM(X) + b. (**)

Substituting (**) into (*), after elementary transformations we obtain

μ_xy = aM[(X − M(X))²] = aD(X) = aσ_x².

Given that

Y − M(Y) = (aX + b) − (aM(X) + b) = a(X − M(X)),

we find the variance of Y:

D(Y) = M[(Y − M(Y))²] = a²M[(X − M(X))²] = a²σ_x².

Hence σ_y = |a|σ_x. Therefore, the correlation coefficient

r_xy = aσ_x² / (σ_x · |a|σ_x) = a / |a|.

If a > 0, then r_xy = 1; if a < 0, then r_xy = −1.

Thus, |r_xy| = 1, which was to be proved.
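Problem 243 can also be checked numerically for a particular case; the distribution of X below and the coefficients a, b are arbitrary assumed values:

```python
# Numeric check of Problem 243: for Y = aX + b the correlation
# coefficient equals sign(a). Here a < 0, so r should be -1.
xs = [1.0, 2.0, 4.0]       # assumed values of X
ps = [0.2, 0.5, 0.3]       # assumed probabilities
a, b = -2.0, 5.0           # assumed linear relation Y = aX + b
ys = [a * x + b for x in xs]

mx = sum(p * x for p, x in zip(ps, xs))
my = sum(p * y for p, y in zip(ps, ys))
cov = sum(p * (x - mx) * (y - my) for p, x, y in zip(ps, xs, ys))
sx = sum(p * (x - mx) ** 2 for p, x in zip(ps, xs)) ** 0.5
sy = sum(p * (y - my) ** 2 for p, y in zip(ps, ys)) ** 0.5
r = cov / (sx * sy)        # should equal a / |a| = -1
```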

Let a two-dimensional random variable $(X,Y)$ be given.

Definition 1

The distribution law of a two-dimensional random variable $(X,Y)$ is the set of possible pairs of numbers $(x_i,\ y_j)$ (where $x_i\in X,\ y_j\in Y$) and their probabilities $p_{ij}$.

Most often, the distribution law of a two-dimensional random variable is written in the form of a table (Table 1).

Figure 1. Law of distribution of a two-dimensional random variable.

Let us now recall the addition theorem for the probabilities of mutually exclusive events.

Theorem 1

The probability of the sum of a finite number of mutually exclusive events $A_1$, $A_2$, ..., $A_n$ is calculated by the formula

$$P\left(A_1+A_2+\dots +A_n\right)=P\left(A_1\right)+P\left(A_2\right)+\dots +P\left(A_n\right).$$

Using this formula, one can obtain the distribution laws of each component of a two-dimensional random variable:

$$P\left(X=x_i\right)=\sum_j{p_{ij}},\ \ P\left(Y=y_j\right)=\sum_i{p_{ij}}.$$

It follows that the sum of all probabilities of a two-dimensional system equals one:

$$\sum_i{\sum_j{p_{ij}}}=1.$$

Let us consider in detail (step by step) the problem associated with the concept of the distribution law of a two-dimensional random variable.

Example 1

The distribution law of a two-dimensional random variable is given by the following table:

Figure 2.

Find the laws of distribution of random variables $X,\ Y$, $X+Y$ and check in each case that the total sum of probabilities is equal to one.

  1. Let us first find the distribution of the random variable $X$. The random variable $X$ can take the values ​​$x_1=2,$ $x_2=3$, $x_3=5$. To find the distribution, we will use Theorem 1.

Let us first find the sum of probabilities $x_1$ as follows:

Figure 3

Similarly, we find $P\left(x_2\right)$ and $P\left(x_3\right)$:

\ \

Figure 4

  1. Let us now find the distribution of the random variable $Y$. The random variable $Y$ can take the values $y_1=1,$ $y_2=3$, $y_3=4$. To find the distribution, we will use Theorem 1.

Let us first find the sum of probabilities $y_1$ as follows:

Figure 5

Similarly, we find $P\left(y_2\right)$ and $P\left(y_3\right)$:

\ \

Hence, the distribution law of the quantity $Y$ has the following form:

Figure 6

Let's check the fulfillment of the equality of the total sum of probabilities:

  1. It remains to find the law of distribution of the random variable $X+Y$.

Let's designate it for convenience through $Z$: $Z=X+Y$.

First, let us find which values this quantity can take. To do this, we add the values of $X$ and $Y$ pairwise. We get the following values: 3, 4, 6, 5, 6, 8, 6, 7, 9. Removing the duplicates, we find that the random variable $X+Y$ can take the values $z_1=3,\ z_2=4,\ z_3=5,\ z_4=6,\ z_5=7,\ z_6=8,\ z_7=9$.

First, let us find $P(z_1)$. Since the value $z_1$ is obtained from a single pair, its probability is found as follows:

Figure 7

All probabilities are found similarly, except for $P(z_4)$:

Let us now find $P(z_4)$ as follows:

Figure 8

Hence, the distribution law for $Z$ has the following form:

Figure 9

Let's check the fulfillment of the equality of the total sum of probabilities:
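The merging of equal sums used for $Z = X + Y$ can be sketched as follows; since the example's own table is given only as a figure, the joint probabilities below are hypothetical:

```python
from collections import defaultdict

# Building the law of Z = X + Y from a joint table by summing the
# probabilities of all pairs with the same total. The table is made up.
xs = [2, 3, 5]
ys = [1, 3, 4]
p = [  # p[i][j] = P(X = xs[i], Y = ys[j])
    [0.10, 0.05, 0.10],
    [0.15, 0.10, 0.05],
    [0.20, 0.10, 0.15],
]
law_z = defaultdict(float)
for i, x in enumerate(xs):
    for j, y in enumerate(ys):
        law_z[x + y] += p[i][j]   # merge all pairs with x + y = z
```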

An ordered pair (X, Y) of random variables X and Y is called a two-dimensional random variable, or a random vector in two-dimensional space. A two-dimensional random variable (X, Y) is also called a system of random variables X and Y. The set of all possible values of a discrete random variable together with their probabilities is called the distribution law of this random variable. A discrete two-dimensional random variable (X, Y) is considered given if its distribution law is known:

P(X=x i , Y=y j) = p ij , i=1,2...,n, j=1,2...,m


Example #1. A two-dimensional discrete random variable has a distribution table:

Y/X 1 2 3 4
10 0 0,11 0,12 0,03
20 0 0,13 0,09 0,02
30 0,02 0,11 0,08 0,01
40 0,03 0,11 0,05 q
Find the q value and the correlation coefficient of this random variable.

Solution. We find the value q from the condition Σp ij = 1
Σp ij = 0.02 + 0.03 + 0.11 + … + 0.03 + 0.02 + 0.01 + q = 1
0.91+q = 1. Whence q = 0.09

Using the formula p_i = Σ_j P(x_i, y_j), we find the distribution series of X.

Mathematical expectation M[X]:
M[X] = 1·0.05 + 2·0.46 + 3·0.34 + 4·0.15 = 2.59
Variance D[X] = 1²·0.05 + 2²·0.46 + 3²·0.34 + 4²·0.15 − 2.59² = 0.64
Standard deviation σ(x) = sqrt(D[X]) = sqrt(0.64) = 0.801

Covariance cov(X, Y) = M[XY] − M[X]·M[Y] = 2·10·0.11 + 3·10·0.12 + 4·10·0.03 + 2·20·0.13 + 3·20·0.09 + 4·20·0.02 + 1·30·0.02 + 2·30·0.11 + 3·30·0.08 + 4·30·0.01 + 1·40·0.03 + 2·40·0.11 + 3·40·0.05 + 4·40·0.09 − 2.59·25.2 = −0.068
Correlation coefficient r_xy = cov(X, Y)/(σ(x)·σ(y)) = −0.068/(0.801·11.531) = −0.00736
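Example #1 can be reproduced end to end in a few lines; the table is the one given above, and q is recovered from the normalization condition:

```python
# Reproducing Example 1: joint table of (X, Y) with the unknown entry q
# recovered from the condition that all p_ij sum to 1.
xs = [1, 2, 3, 4]
ys = [10, 20, 30, 40]
p = [  # rows: Y = 10, 20, 30, 40; columns: X = 1, 2, 3, 4
    [0.00, 0.11, 0.12, 0.03],
    [0.00, 0.13, 0.09, 0.02],
    [0.02, 0.11, 0.08, 0.01],
    [0.03, 0.11, 0.05, None],  # q is the unknown entry
]
q = 1 - sum(v for row in p for v in row if v is not None)
p[3][3] = q

px = [sum(p[j][i] for j in range(4)) for i in range(4)]  # marginal law of X
py = [sum(row) for row in p]                             # marginal law of Y
mx = sum(x * w for x, w in zip(xs, px))                  # M[X]
my = sum(y * w for y, w in zip(ys, py))                  # M[Y]
dx = sum(x * x * w for x, w in zip(xs, px)) - mx ** 2    # D[X]
dy = sum(y * y * w for y, w in zip(ys, py)) - my ** 2    # D[Y]
m_xy = sum(xs[i] * ys[j] * p[j][i] for i in range(4) for j in range(4))
cov = m_xy - mx * my
r = cov / (dx ** 0.5 * dy ** 0.5)
```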

Example 2 . The data of statistical processing of information regarding two indicators X and Y are reflected in the correlation table. Required:

  1. write distribution series for X and Y and calculate sample means and sample standard deviations for them;
  2. write conditional distribution series Y/x and calculate conditional averages Y/x;
  3. graphically depict the dependence of the conditional averages Y/x on the values ​​of X;
  4. calculate the sample correlation coefficient Y on X;
  5. write a sample direct regression equation;
  6. represent geometrically the data of the correlation table and build a regression line.
Solution.
X/Y20 30 40 50 60
11 2 0 0 0 0
16 4 6 0 0 0
21 0 3 6 2 0
26 0 0 45 8 4
31 0 0 4 6 7
36 0 0 0 0 3
The events (X = x_i, Y = y_j) form a complete group of events, so the sum of all probabilities p_ij (i = 1, 2, ..., n; j = 1, 2, ..., m) indicated in the table equals 1.
1. Dependence of random variables X and Y.
Find the distribution series X and Y.
Using the formula p_i = Σ_j P(x_i, y_j), we find the distribution series of X and Y.
Mathematical expectation M[Y]:
M[Y] = (20·6 + 30·9 + 40·55 + 50·16 + 60·14)/100 = 42.3
Variance D[Y]:
D[Y] = (20²·6 + 30²·9 + 40²·55 + 50²·16 + 60²·14)/100 − 42.3² = 99.71
Standard deviation σ(y) = sqrt(99.71) ≈ 9.99.

Since P(X=11, Y=20) = 2/100 = 0.02 ≠ P(X=11)·P(Y=20) = 0.02·0.06 = 0.0012, the random variables X and Y are dependent.
2. Conditional distribution law X.
Conditional distribution law X(Y=20).
P(X=11/Y=20) = 2/6 = 0.33
P(X=16/Y=20) = 4/6 = 0.67
P(X=21/Y=20) = 0/6 = 0
P(X=26/Y=20) = 0/6 = 0
P(X=31/Y=20) = 0/6 = 0
P(X=36/Y=20) = 0/6 = 0
Conditional expectation M[X|Y=20] = 11·0.33 + 16·0.67 + 21·0 + 26·0 + 31·0 + 36·0 = 14.33
Conditional variance D[X|Y=20] = 11²·0.33 + 16²·0.67 + 21²·0 + 26²·0 + 31²·0 + 36²·0 − 14.33² = 5.56
Conditional distribution law X(Y=30).
P(X=11/Y=30) = 0/9 = 0
P(X=16/Y=30) = 6/9 = 0.67
P(X=21/Y=30) = 3/9 = 0.33
P(X=26/Y=30) = 0/9 = 0
P(X=31/Y=30) = 0/9 = 0
P(X=36/Y=30) = 0/9 = 0
Conditional expectation M[X|Y=30] = 11·0 + 16·0.67 + 21·0.33 + 26·0 + 31·0 + 36·0 = 17.67
Conditional variance D[X|Y=30] = 11²·0 + 16²·0.67 + 21²·0.33 + 26²·0 + 31²·0 + 36²·0 − 17.67² = 5.56
Conditional distribution law X(Y=40).
P(X=11/Y=40) = 0/55 = 0
P(X=16/Y=40) = 0/55 = 0
P(X=21/Y=40) = 6/55 = 0.11
P(X=26/Y=40) = 45/55 = 0.82
P(X=31/Y=40) = 4/55 = 0.0727
P(X=36/Y=40) = 0/55 = 0
Conditional expectation M[X|Y=40] = 11·0 + 16·0 + 21·0.11 + 26·0.82 + 31·0.0727 + 36·0 = 25.82
Conditional variance D[X|Y=40] = 11²·0 + 16²·0 + 21²·0.11 + 26²·0.82 + 31²·0.0727 + 36²·0 − 25.82² = 4.51
Conditional distribution law X(Y=50).
P(X=11/Y=50) = 0/16 = 0
P(X=16/Y=50) = 0/16 = 0
P(X=21/Y=50) = 2/16 = 0.13
P(X=26/Y=50) = 8/16 = 0.5
P(X=31/Y=50) = 6/16 = 0.38
P(X=36/Y=50) = 0/16 = 0
Conditional expectation M[X|Y=50] = 11·0 + 16·0 + 21·0.13 + 26·0.5 + 31·0.38 + 36·0 = 27.25
Conditional variance D[X|Y=50] = 11²·0 + 16²·0 + 21²·0.13 + 26²·0.5 + 31²·0.38 + 36²·0 − 27.25² = 10.94
Conditional distribution law X(Y=60).
P(X=11/Y=60) = 0/14 = 0
P(X=16/Y=60) = 0/14 = 0
P(X=21/Y=60) = 0/14 = 0
P(X=26/Y=60) = 4/14 = 0.29
P(X=31/Y=60) = 7/14 = 0.5
P(X=36/Y=60) = 3/14 = 0.21
Conditional expectation M[X|Y=60] = 11·0 + 16·0 + 21·0 + 26·0.29 + 31·0.5 + 36·0.21 = 30.64
Conditional variance D[X|Y=60] = 11²·0 + 16²·0 + 21²·0 + 26²·0.29 + 31²·0.5 + 36²·0.21 − 30.64² = 12.37
3. Conditional distribution law Y.
Conditional distribution law Y(X=11).
P(Y=20/X=11) = 2/2 = 1
P(Y=30/X=11) = 0/2 = 0
P(Y=40/X=11) = 0/2 = 0
P(Y=50/X=11) = 0/2 = 0
P(Y=60/X=11) = 0/2 = 0
Conditional expectation M[Y|X=11] = 20·1 + 30·0 + 40·0 + 50·0 + 60·0 = 20
Conditional variance D[Y|X=11] = 20²·1 + 30²·0 + 40²·0 + 50²·0 + 60²·0 − 20² = 0
Conditional distribution law Y(X=16).
P(Y=20/X=16) = 4/10 = 0.4
P(Y=30/X=16) = 6/10 = 0.6
P(Y=40/X=16) = 0/10 = 0
P(Y=50/X=16) = 0/10 = 0
P(Y=60/X=16) = 0/10 = 0
Conditional expectation M[Y|X=16] = 20·0.4 + 30·0.6 + 40·0 + 50·0 + 60·0 = 26
Conditional variance D[Y|X=16] = 20²·0.4 + 30²·0.6 + 40²·0 + 50²·0 + 60²·0 − 26² = 24
Conditional distribution law Y(X=21).
P(Y=20/X=21) = 0/11 = 0
P(Y=30/X=21) = 3/11 = 0.27
P(Y=40/X=21) = 6/11 = 0.55
P(Y=50/X=21) = 2/11 = 0.18
P(Y=60/X=21) = 0/11 = 0
Conditional expectation M[Y|X=21] = 20·0 + 30·0.27 + 40·0.55 + 50·0.18 + 60·0 = 39.09
Conditional variance D[Y|X=21] = 20²·0 + 30²·0.27 + 40²·0.55 + 50²·0.18 + 60²·0 − 39.09² = 44.63
Conditional distribution law Y(X=26).
P(Y=20/X=26) = 0/57 = 0
P(Y=30/X=26) = 0/57 = 0
P(Y=40/X=26) = 45/57 = 0.79
P(Y=50/X=26) = 8/57 = 0.14
P(Y=60/X=26) = 4/57 = 0.0702
Conditional expectation M[Y|X=26] = 20·0 + 30·0 + 40·0.79 + 50·0.14 + 60·0.0702 = 42.81
Conditional variance D[Y|X=26] = 20²·0 + 30²·0 + 40²·0.79 + 50²·0.14 + 60²·0.0702 − 42.81² = 34.23
Conditional distribution law Y(X=31).
P(Y=20/X=31) = 0/17 = 0
P(Y=30/X=31) = 0/17 = 0
P(Y=40/X=31) = 4/17 = 0.24
P(Y=50/X=31) = 6/17 = 0.35
P(Y=60/X=31) = 7/17 = 0.41
Conditional expectation M[Y|X=31] = 20·0 + 30·0 + 40·0.24 + 50·0.35 + 60·0.41 = 51.76
Conditional variance D[Y|X=31] = 20²·0 + 30²·0 + 40²·0.24 + 50²·0.35 + 60²·0.41 − 51.76² = 61.59
Conditional distribution law Y(X=36).
P(Y=20/X=36) = 0/3 = 0
P(Y=30/X=36) = 0/3 = 0
P(Y=40/X=36) = 0/3 = 0
P(Y=50/X=36) = 0/3 = 0
P(Y=60/X=36) = 3/3 = 1
Conditional expectation M[Y|X=36] = 20·0 + 30·0 + 40·0 + 50·0 + 60·1 = 60
Conditional variance D[Y|X=36] = 20²·0 + 30²·0 + 40²·0 + 50²·0 + 60²·1 − 60² = 0
Covariance:
cov(X, Y) = M[XY] − M[X]·M[Y]
cov(X, Y) = (20·11·2 + 20·16·4 + 30·16·6 + 30·21·3 + 40·21·6 + 50·21·2 + 40·26·45 + 50·26·8 + 60·26·4 + 40·31·4 + 50·31·6 + 60·31·7 + 60·36·3)/100 − 25.3·42.3 = 38.11
If the random variables are independent, then their covariance is zero. In our case cov(X,Y) ≠ 0.
Correlation coefficient:
r_xy = cov(X, Y)/(σ_x·σ_y) = 38.11/(9.99·4.9) ≈ 0.78.

The linear regression equation from y to x is:

The linear regression equation from x to y is:

Find the necessary numerical characteristics.
Sample means:
x̄ = (20(2 + 4) + 30(6 + 3) + 40(6 + 45 + 4) + 50(2 + 8 + 6) + 60(4 + 7 + 3))/100 = 42.3
ȳ = (11(2) + 16(4 + 6) + 21(3 + 6 + 2) + 26(45 + 8 + 4) + 31(4 + 6 + 7) + 36(3))/100 = 25.3
Variances:
σ²_x = (20²(2 + 4) + 30²(6 + 3) + 40²(6 + 45 + 4) + 50²(2 + 8 + 6) + 60²(4 + 7 + 3))/100 − 42.3² = 99.71
σ²_y = (11²(2) + 16²(4 + 6) + 21²(3 + 6 + 2) + 26²(45 + 8 + 4) + 31²(4 + 6 + 7) + 36²(3))/100 − 25.3² = 24.01
whence the standard deviations:
σ_x = 9.99 and σ_y = 4.9
and the covariance:
Cov(x, y) = (20·11·2 + 20·16·4 + 30·16·6 + 30·21·3 + 40·21·6 + 50·21·2 + 40·26·45 + 50·26·8 + 60·26·4 + 40·31·4 + 50·31·6 + 60·31·7 + 60·36·3)/100 − 42.3·25.3 = 38.11
Let us determine the correlation coefficient:
r = Cov(x, y)/(σ_x σ_y) = 38.11/(9.99·4.9) ≈ 0.78.

Let's write down the equations of the regression lines y(x):

and calculating, we get:
y_x = 0.38x + 9.14
Let's write down the equations of regression lines x(y):

and calculating, we get:
x_y = 1.59y + 2.15
If we build the points defined by the table and the regression lines, we will see that both lines pass through the point with coordinates (42.3; 25.3) and the points are located close to the regression lines.
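The computations of Example 2 can be cross-checked in code using the frequency table from the statement (the solution's notation is kept: x is the 20..60 indicator, y the 11..36 indicator):

```python
# Regression of y on x from a correlation (frequency) table, as in Example 2.
ys_vals = [11, 16, 21, 26, 31, 36]
xs_vals = [20, 30, 40, 50, 60]
freq = [  # rows: y = 11..36, columns: x = 20..60 (frequencies from the table)
    [2, 0, 0, 0, 0],
    [4, 6, 0, 0, 0],
    [0, 3, 6, 2, 0],
    [0, 0, 45, 8, 4],
    [0, 0, 4, 6, 7],
    [0, 0, 0, 0, 3],
]
n = sum(map(sum, freq))  # sample size, 100
mean_x = sum(x * freq[i][j] for i in range(6) for j, x in enumerate(xs_vals)) / n
mean_y = sum(y * freq[i][j] for i, y in enumerate(ys_vals) for j in range(5)) / n
var_x = sum(x * x * freq[i][j] for i in range(6)
            for j, x in enumerate(xs_vals)) / n - mean_x ** 2
cov = sum(xs_vals[j] * ys_vals[i] * freq[i][j] for i in range(6)
          for j in range(5)) / n - mean_x * mean_y
slope = cov / var_x                    # regression coefficient of y on x
intercept = mean_y - slope * mean_x    # the line passes through (x-bar, y-bar)
```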
Significance of the correlation coefficient.

The observed value of the criterion is
t_obs = r·sqrt(n − 2)/sqrt(1 − r²) = 0.78·sqrt(98)/sqrt(1 − 0.78²) ≈ 12.3.
From Student's table, with significance level α = 0.05 and k = n − m − 1 = 98 degrees of freedom, we find t_crit:
t_crit(n − m − 1; α/2) = t_crit(98; 0.025) = 1.984,
where m = 1 is the number of explanatory variables.
If t_obs > t_crit, the obtained value of the correlation coefficient is recognized as significant (the null hypothesis asserting that the correlation coefficient equals zero is rejected).
Since t_obs > t_crit, we reject the hypothesis that the correlation coefficient equals 0; in other words, the correlation coefficient is statistically significant.

Exercise. The number of hits of pairs of values ​​of random variables X and Y in the corresponding intervals are given in the table. From these data, find the sample correlation coefficient and the sample equations of the straight regression lines Y on X and X on Y .

Example. The probability distribution of a two-dimensional random variable (X, Y) is given by a table. Find the laws of distribution of the component quantities X, Y and the correlation coefficient p(X, Y).

Exercise. A two-dimensional discrete value (X, Y) is given by a distribution law. Find the distribution laws of the X and Y components, covariance and correlation coefficient.

Quite often, when studying random variables, one has to deal with two, three, or even more random variables. For example, the two-dimensional random variable $\left(X,\ Y\right)$ will describe the hit point of the projectile, where the random variables $X,\ Y$ are the abscissa and the ordinate, respectively. The performance of a random student during the session is characterized by an $n$-dimensional random variable $\left(X_1,\ X_2,\ \dots ,\ X_n\right)$, where the random variables are $X_1,\ X_2,\ \dots ,\ X_n $ - these are the grades put down in the grade book in various disciplines.

The set of $n$ random variables $\left(X_1,\ X_2,\ \dots ,\ X_n\right)$ is called a random vector. We restrict ourselves to the case $\left(X,\ Y\right)$.

Let $X$ be a discrete random variable with possible values ​​$x_1,x_2,\ \dots ,\ x_n$, and $Y$ be a discrete random variable with possible values ​​$y_1,y_2,\ \dots ,\ y_n$.

Then a discrete two-dimensional random variable $\left(X,\ Y\right)$ can take the values $\left(x_i,\ y_j\right)$ with probabilities $p_{ij}=P\left(\left(X=x_i\right)\left(Y=y_j\right)\right)=P\left(X=x_i\right)P\left(Y=y_j|X=x_i\right)$. Here $P\left(Y=y_j|X=x_i\right)$ is the conditional probability that the random variable $Y$ takes the value $y_j$ given that the random variable $X$ has taken the value $x_i$.

The probability that the random variable $X$ takes the value $x_i$ is $p_i=\sum_j{p_{ij}}$. The probability that the random variable $Y$ takes the value $y_j$ is $q_j=\sum_i{p_{ij}}$.

$$P\left(X=x_i|Y=y_j\right)=\frac{P\left(\left(X=x_i\right)\left(Y=y_j\right)\right)}{P\left(Y=y_j\right)}=\frac{p_{ij}}{q_j}.$$

$$P\left(Y=y_j|X=x_i\right)=\frac{P\left(\left(X=x_i\right)\left(Y=y_j\right)\right)}{P\left(X=x_i\right)}=\frac{p_{ij}}{p_i}.$$

Example 1 . The distribution of a two-dimensional random variable is given:

$\begin{array}{|c|c|c|}
\hline
X\backslash Y & 2 & 3 \\
\hline
-1 & 0.15 & 0.25 \\
\hline
0 & 0.28 & 0.13 \\
\hline
1 & 0.09 & 0.1 \\
\hline
\end{array}$

Let us define the distribution laws for the random variables $X$ and $Y$. Let us find the conditional distributions of the random variable $X$ under the condition $Y=2$ and the random variable $Y$ under the condition $X=0$.

Let's fill in the following table:

$\begin{array}{|c|c|c|c|c|}
\hline
X\backslash Y & 2 & 3 & p_i & p_{ij}/q_1 \\
\hline
-1 & 0.15 & 0.25 & 0.4 & 0.29 \\
\hline
0 & 0.28 & 0.13 & 0.41 & 0.54 \\
\hline
1 & 0.09 & 0.1 & 0.19 & 0.17 \\
\hline
q_j & 0.52 & 0.48 & 1 & \\
\hline
p_{ij}/p_2 & 0.68 & 0.32 & & \\
\hline
\end{array}$

Let us explain how the table is filled. The values in the first three columns of the first four rows are taken from the problem statement. The entry in the 4th column of the 2nd, 3rd, and 4th rows is the sum of the numbers in the 2nd and 3rd columns of that row; these are the marginal probabilities $p_i$.

The entry in the 5th row of the 2nd (3rd) column is the sum of the numbers in the 2nd, 3rd, and 4th rows of that column; these are the marginal probabilities $q_j$, and their sum is 1. Each number in the 2nd column is divided by $q_1=0.52$, rounded to two decimal places, and written in the 5th column. The numbers in the 2nd and 3rd columns of the 3rd row (the row $X=0$) are divided by $p_2=0.41$, rounded to two decimal places, and written in the last row.

Then the law of distribution of the random variable $X$ has the following form.

$\begin{array}{|c|c|c|c|}
\hline
X & -1 & 0 & 1 \\
\hline
p_i & 0.4 & 0.41 & 0.19 \\
\hline
\end{array}$

The law of distribution of the random variable $Y$.

$\begin{array}{|c|c|c|}
\hline
Y & 2 & 3 \\
\hline
q_j & 0.52 & 0.48 \\
\hline
\end{array}$

The conditional distribution of the random variable $X$ under the condition $Y=2$ has the following form.

$\begin{array}{|c|c|c|c|}
\hline
X & -1 & 0 & 1 \\
\hline
p_{ij}/q_1 & 0.29 & 0.54 & 0.17 \\
\hline
\end{array}$

The conditional distribution of the random variable $Y$ under the condition $X=0$ has the following form.

$\begin{array}{|c|c|c|}
\hline
Y & 2 & 3 \\
\hline
p_{ij}/p_2 & 0.68 & 0.32 \\
\hline
\end{array}$
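As a cross-check, the marginal and conditional values computed above can be reproduced numerically (a sketch; rounding to two decimals, as in the tables):

```python
# Verify the marginal and conditional distributions of Example 1.
p = [[0.15, 0.25],   # X = -1
     [0.28, 0.13],   # X = 0
     [0.09, 0.10]]   # X = 1   (columns: Y = 2, Y = 3)

p_i = [round(sum(row), 2) for row in p]         # marginal distribution of X
q_j = [round(sum(col), 2) for col in zip(*p)]   # marginal distribution of Y

# Conditional distribution of X given Y = 2 (first column, q_1 = 0.52)
x_given_y2 = [round(p[i][0] / q_j[0], 2) for i in range(3)]

# Conditional distribution of Y given X = 0 (second row, p_2 = 0.41)
y_given_x0 = [round(p[1][j] / p_i[1], 2) for j in range(2)]
```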

Example 2. There are six pencils, two of which are red. The pencils are placed into two boxes: two into the first and two into the second. Let $X$ be the number of red pencils in the first box and $Y$ the number of red pencils in the second. Write the distribution law for the system of random variables $(X,\ Y)$.

Let the discrete random variable $X$ be the number of red pencils in the first box, and the discrete random variable $Y$ the number of red pencils in the second box. The possible values of the random variables are $X:0,\ 1,\ 2$ and $Y:0,\ 1,\ 2$. The discrete two-dimensional random variable $\left(X,\ Y\right)$ takes the values $\left(x,\ y\right)$ with probabilities $P=P\left(\left(X=x\right)\left(Y=y\right)\right)=P\left(X=x\right)\cdot P\left(Y=y|X=x\right)$, where $P\left(Y=y|X=x\right)$ is the conditional probability that the random variable $Y$ takes the value $y$ given that the random variable $X$ takes the value $x$. Let us represent the correspondence between the values $\left(x,\ y\right)$ and the probabilities $P\left(\left(X=x\right)\left(Y=y\right)\right)$ in the following table.

$\begin{array}{|c|c|c|c|}
\hline
X\backslash Y & 0 & 1 & 2 \\
\hline
0 & \frac{1}{15} & \frac{4}{15} & \frac{1}{15} \\
\hline
1 & \frac{4}{15} & \frac{4}{15} & 0 \\
\hline
2 & \frac{1}{15} & 0 & 0 \\
\hline
\end{array}$

The rows of the table correspond to the values of $X$ and the columns to the values of $Y$; the probability $P\left(\left(X=x\right)\left(Y=y\right)\right)$ stands at the intersection of the corresponding row and column. We calculate the probabilities using the classical definition of probability and the multiplication theorem for probabilities of dependent events.

$$P\left(\left(X=0\right)\left(Y=0\right)\right)=\frac{C_4^2}{C_6^2}\cdot \frac{C_2^2}{C_4^2}=\frac{6}{15}\cdot \frac{1}{6}=\frac{1}{15};$$

$$P\left(\left(X=0\right)\left(Y=1\right)\right)=\frac{C_4^2}{C_6^2}\cdot \frac{C_2^1\cdot C_2^1}{C_4^2}=\frac{6}{15}\cdot \frac{2\cdot 2}{6}=\frac{4}{15};$$

$$P\left(\left(X=0\right)\left(Y=2\right)\right)=\frac{C_4^2}{C_6^2}\cdot \frac{C_2^2}{C_4^2}=\frac{6}{15}\cdot \frac{1}{6}=\frac{1}{15};$$

$$P\left(\left(X=1\right)\left(Y=0\right)\right)=\frac{C_2^1\cdot C_4^1}{C_6^2}\cdot \frac{C_3^2}{C_4^2}=\frac{2\cdot 4}{15}\cdot \frac{3}{6}=\frac{4}{15};$$

$$P\left(\left(X=1\right)\left(Y=1\right)\right)=\frac{C_2^1\cdot C_4^1}{C_6^2}\cdot \frac{C_1^1\cdot C_3^1}{C_4^2}=\frac{2\cdot 4}{15}\cdot \frac{1\cdot 3}{6}=\frac{4}{15};$$

$$P\left(\left(X=2\right)\left(Y=0\right)\right)=\frac{C_2^2}{C_6^2}\cdot \frac{C_4^2}{C_4^2}=\frac{1}{15}\cdot 1=\frac{1}{15}.$$

Since the events in the distribution law (the resulting table) form a complete group, the sum of the probabilities must equal 1. Let us check this:

$$\sum_{i,\,j} p_{ij}=\frac{1}{15}+\frac{4}{15}+\frac{1}{15}+\frac{4}{15}+\frac{4}{15}+\frac{1}{15}=1.$$
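These combinatorial calculations are easy to automate. A sketch using `math.comb` and exact fractions, under the drawing scheme of the example (2 of the 6 pencils go into the first box, then 2 of the remaining 4 into the second; 2 of the 6 are red):

```python
from math import comb
from fractions import Fraction

def p_joint(x, y):
    """P(X = x, Y = y): x red pencils among the 2 drawn from 6 (2 red, 4 plain),
    then y red pencils among the 2 drawn from the remaining 4."""
    red, plain = 2, 4
    if x > red or 2 - x > plain:
        return Fraction(0)
    p_first = Fraction(comb(red, x) * comb(plain, 2 - x), comb(6, 2))
    red2, plain2 = red - x, plain - (2 - x)   # pencils left for the second draw
    if y > red2 or 2 - y > plain2:
        return Fraction(0)
    p_second = Fraction(comb(red2, y) * comb(plain2, 2 - y), comb(4, 2))
    return p_first * p_second
```

Using `Fraction` keeps the results exact, so the table entries and the completeness check come out as fifteenths rather than rounded floats.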

Distribution function of a two-dimensional random variable

The distribution function of a two-dimensional random variable $\left(X,\ Y\right)$ is the function $F\left(x,\ y\right)$ which, for any real numbers $x$ and $y$, equals the probability of the joint occurrence of the two events $\left\{X< x\right\}$ and $\left\{Y < y\right\}$. Thus, by definition,

$$F\left(x,\ y\right)=P\left\{X< x,\ Y < y\right\}.$$

For a discrete two-dimensional random variable, the distribution function is found by summing all probabilities $p_{ij}$ for which $x_i< x$ and $y_j < y$, that is,

$$F\left(x,\ y\right)=\sum_{x_i< x}\sum_{y_j < y}p_{ij}.$$
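This double sum can be sketched as a short function; below it is applied to the table of Example 1 (values $X\in\{-1,0,1\}$, $Y\in\{2,3\}$):

```python
# Sketch: distribution function of a discrete 2D random variable from its
# table. xs, ys are the possible values; p[i][j] = P(X = xs[i], Y = ys[j]).
def F(x, y, xs, ys, p):
    """F(x, y) = sum of p[i][j] over all xs[i] < x and ys[j] < y (strict)."""
    return sum(p[i][j]
               for i, xi in enumerate(xs) if xi < x
               for j, yj in enumerate(ys) if yj < y)

# The joint table of Example 1
xs, ys = [-1, 0, 1], [2, 3]
p = [[0.15, 0.25], [0.28, 0.13], [0.09, 0.10]]
```

Note the strict inequalities: they are what make $F$ left-continuous, as stated in property 6 below.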

Properties of the distribution function of a two-dimensional random variable.

1. The distribution function $F\left(x,\ y\right)$ is bounded, that is, $0\le F\left(x,\ y\right)\le 1$.

2. $F\left(x,\ y\right)$ is non-decreasing in each of its arguments with the other fixed, i.e. $F\left(x_2,\ y\right)\ge F\left(x_1,\ y\right)$ for $x_2>x_1$ and $F\left(x,\ y_2\right)\ge F\left(x,\ y_1\right)$ for $y_2>y_1$.

3. If at least one of the arguments takes the value $-\infty $, then the distribution function equals zero, i.e. $F\left(-\infty ,\ y\right)=F\left(x,\ -\infty \right)=F\left(-\infty ,\ -\infty \right)=0$.

4. If both arguments take the value $+\infty $, then the distribution function equals $1$, i.e. $F\left(+\infty ,\ +\infty \right)=1$.

5. When exactly one of the arguments takes the value $+\infty $, the distribution function $F\left(x,\ y\right)$ becomes the distribution function of the random variable corresponding to the other argument, i.e. $F\left(x,\ +\infty \right)=F_1\left(x\right)=F_X\left(x\right)$ and $F\left(+\infty ,\ y\right)=F_2\left(y\right)=F_Y\left(y\right)$.

6. $F\left(x,\ y\right)$ is left-continuous in each of its arguments, i.e.

$$\lim_{x\to x_0-0} F\left(x,\ y\right)=F\left(x_0,\ y\right),\qquad \lim_{y\to y_0-0} F\left(x,\ y\right)=F\left(x,\ y_0\right).$$

Example 3. Let a discrete two-dimensional random variable $\left(X,\ Y\right)$ be given by the distribution series.

$\begin{array}{|c|c|c|}
\hline
X\backslash Y & 0 & 1 \\
\hline
0 & \frac{1}{6} & \frac{2}{6} \\
\hline
1 & \frac{2}{6} & \frac{1}{6} \\
\hline
\end{array}$

Then the distribution function:

$F(x,y)=\left\{\begin{matrix}
0, & x\le 0,\ y\le 0, \\
0, & x\le 0,\ 0< y\le 1, \\
0, & x\le 0,\ y>1, \\
0, & 0< x\le 1,\ y\le 0, \\
\frac{1}{6}, & 0< x\le 1,\ 0 < y\le 1, \\
\frac{1}{6}+\frac{2}{6}=\frac{1}{2}, & 0< x\le 1,\ y>1, \\
0, & x>1,\ y\le 0, \\
\frac{1}{6}+\frac{2}{6}=\frac{1}{2}, & x>1,\ 0< y\le 1, \\
\frac{1}{6}+\frac{2}{6}+\frac{2}{6}+\frac{1}{6}=1, & x>1,\ y>1.
\end{matrix}\right.$
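The piecewise values above can be checked against the defining double sum; a small sketch with the table of Example 3:

```python
# Check the distribution function of Example 3 at representative points.
xs, ys = [0, 1], [0, 1]      # possible values of X and Y
p = [[1/6, 2/6],
     [2/6, 1/6]]             # joint probabilities p[i][j]

def F(x, y):
    """F(x, y) = sum of p[i][j] over all xs[i] < x and ys[j] < y."""
    return sum(p[i][j]
               for i, xi in enumerate(xs) if xi < x
               for j, yj in enumerate(ys) if yj < y)
```

Evaluating $F$ at one point from each region reproduces the values $0$, $\frac{1}{6}$, $\frac{1}{2}$, and $1$ of the piecewise formula.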
