Bishop C.M. Pattern Recognition and Machine Learning (2006) (811375), page 86

File No. 811375, Bishop C.M. Pattern Recognition and Machine Learning (2006).pdf, page 86 (StudIzba, 2020-08-25)

Text from the file (page 86)

[Figure caption, beginning missing:] . . . , $x_M$ and a single child $y$, used to illustrate the idea of parameterized conditional distributions for discrete variables.

8. GRAPHICAL MODELS

8.1.4 Linear-Gaussian models

In the previous section, we saw how to construct joint probability distributions over a set of discrete variables by expressing the variables as nodes in a directed acyclic graph. Here we show how a multivariate Gaussian can be expressed as a directed graph corresponding to a linear-Gaussian model over the component variables. This allows us to impose interesting structure on the distribution, with the general Gaussian and the diagonal covariance Gaussian representing opposite extremes. Several widely used techniques are examples of linear-Gaussian models, such as probabilistic principal component analysis, factor analysis, and linear dynamical systems (Roweis and Ghahramani, 1999). We shall make extensive use of the results of this section in later chapters when we consider some of these techniques in detail.

Consider an arbitrary directed acyclic graph over $D$ variables in which node $i$ represents a single continuous random variable $x_i$ having a Gaussian distribution. The mean of this distribution is taken to be a linear combination of the states of its parent nodes $\mathrm{pa}_i$ of node $i$

$$p(x_i \mid \mathrm{pa}_i) = \mathcal{N}\Bigl(x_i \Bigm| \sum_{j \in \mathrm{pa}_i} w_{ij} x_j + b_i,\; v_i\Bigr) \tag{8.11}$$

where $w_{ij}$ and $b_i$ are parameters governing the mean, and $v_i$ is the variance of the conditional distribution for $x_i$.

The log of the joint distribution is then the log of the product of these conditionals over all nodes in the graph and hence takes the form

\begin{align}
\ln p(\mathbf{x}) &= \sum_{i=1}^{D} \ln p(x_i \mid \mathrm{pa}_i) \tag{8.12} \\
&= -\sum_{i=1}^{D} \frac{1}{2 v_i} \Bigl( x_i - \sum_{j \in \mathrm{pa}_i} w_{ij} x_j - b_i \Bigr)^2 + \text{const} \tag{8.13}
\end{align}

where $\mathbf{x} = (x_1, \ldots, x_D)^{\mathrm{T}}$ and 'const' denotes terms independent of $\mathbf{x}$. We see that this is a quadratic function of the components of $\mathbf{x}$, and hence the joint distribution $p(\mathbf{x})$ is a multivariate Gaussian.

We can determine the mean and covariance of the joint distribution recursively as follows.

Each variable $x_i$ has (conditional on the states of its parents) a Gaussian distribution of the form (8.11) and so

$$x_i = \sum_{j \in \mathrm{pa}_i} w_{ij} x_j + b_i + \sqrt{v_i}\,\epsilon_i \tag{8.14}$$

where $\epsilon_i$ is a zero mean, unit variance Gaussian random variable satisfying $\mathbb{E}[\epsilon_i] = 0$ and $\mathbb{E}[\epsilon_i \epsilon_j] = I_{ij}$, where $I_{ij}$ is the $i, j$ element of the identity matrix. Taking the expectation of (8.14), we have

$$\mathbb{E}[x_i] = \sum_{j \in \mathrm{pa}_i} w_{ij}\,\mathbb{E}[x_j] + b_i. \tag{8.15}$$

[Figure 8.14: A directed graph over three Gaussian variables $x_1$, $x_2$, $x_3$, with one missing link.]

Thus we can find the components of $\mathbb{E}[\mathbf{x}] = (\mathbb{E}[x_1], \ldots, \mathbb{E}[x_D])^{\mathrm{T}}$ by starting at the lowest numbered node and working recursively through the graph (here we again assume that the nodes are numbered such that each node has a higher number than its parents). Similarly, we can use (8.14) and (8.15) to obtain the $i, j$ element of the covariance matrix for $p(\mathbf{x})$ in the form of a recursion relation

\begin{align}
\operatorname{cov}[x_i, x_j] &= \mathbb{E}\bigl[(x_i - \mathbb{E}[x_i])(x_j - \mathbb{E}[x_j])\bigr] \notag \\
&= \mathbb{E}\Bigl[(x_i - \mathbb{E}[x_i]) \Bigl\{ \sum_{k \in \mathrm{pa}_j} w_{jk}\,(x_k - \mathbb{E}[x_k]) + \sqrt{v_j}\,\epsilon_j \Bigr\}\Bigr] \notag \\
&= \sum_{k \in \mathrm{pa}_j} w_{jk} \operatorname{cov}[x_i, x_k] + I_{ij} v_j \tag{8.16}
\end{align}

and so the covariance can similarly be evaluated recursively starting from the lowest numbered node.

[Margin notes: Section 2.3; Exercise 8.7]

Let us consider two extreme cases.

First of all, suppose that there are no links in the graph, which therefore comprises $D$ isolated nodes. In this case, there are no parameters $w_{ij}$ and so there are just $D$ parameters $b_i$ and $D$ parameters $v_i$. From the recursion relations (8.15) and (8.16), we see that the mean of $p(\mathbf{x})$ is given by $(b_1, \ldots, b_D)^{\mathrm{T}}$ and the covariance matrix is diagonal of the form $\operatorname{diag}(v_1, \ldots, v_D)$. The joint distribution has a total of $2D$ parameters and represents a set of $D$ independent univariate Gaussian distributions.

Now consider a fully connected graph in which each node has all lower numbered nodes as parents.

The matrix $w_{ij}$ then has $i - 1$ entries on the $i$th row and hence is a lower triangular matrix (with no entries on the leading diagonal). Then the total number of parameters $w_{ij}$ is obtained by taking the number $D^2$ of elements in a $D \times D$ matrix, subtracting $D$ to account for the absence of elements on the leading diagonal, and then dividing by 2 because the matrix has elements only below the diagonal, giving a total of $D(D - 1)/2$. The total number of independent parameters $\{w_{ij}\}$ and $\{v_i\}$ in the covariance matrix is therefore $D(D + 1)/2$ corresponding to a general symmetric covariance matrix.

Graphs having some intermediate level of complexity correspond to joint Gaussian distributions with partially constrained covariance matrices.
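The recursion relations (8.15) and (8.16) and the two extreme cases can be sketched in a few lines of NumPy. This is only an illustrative sketch, not code from the book: the function name `linear_gaussian_moments`, the convention that `W[i, j]` stores $w_{ij}$ (zero where there is no link), and all numerical values are assumptions made for the example.

```python
import numpy as np

def linear_gaussian_moments(W, b, v):
    """Mean and covariance of a linear-Gaussian DAG via (8.15) and (8.16).

    W[i, j] holds w_ij for the link j -> i (zero if there is no link).
    Nodes are assumed numbered so that parents precede children, which
    makes W strictly lower triangular. b holds b_i, v holds v_i.
    """
    W, b, v = np.asarray(W, float), np.asarray(b, float), np.asarray(v, float)
    D = len(b)
    if np.any(np.triu(W)):  # any entry on or above the diagonal
        raise ValueError("W must be strictly lower triangular")
    mean = np.zeros(D)
    cov = np.zeros((D, D))
    for j in range(D):
        # (8.15): E[x_j] = sum_k w_jk E[x_k] + b_j
        mean[j] = W[j, :j] @ mean[:j] + b[j]
        # (8.16) with i < j: cov[x_i, x_j] = sum_k w_jk cov[x_i, x_k]
        for i in range(j):
            cov[i, j] = cov[j, i] = W[j, :j] @ cov[i, :j]
        # (8.16) with i = j: the I_ij v_j term contributes v_j
        cov[j, j] = W[j, :j] @ cov[j, :j] + v[j]
    return mean, cov

# Hypothetical parameter values, chosen only for illustration.
b = np.array([1.0, -2.0, 0.5])
v = np.array([0.3, 1.0, 2.0])

# Extreme case 1: no links, so the mean is b and the covariance is diag(v).
mean_iso, cov_iso = linear_gaussian_moments(np.zeros((3, 3)), b, v)

# Extreme case 2: fully connected, every lower-numbered node is a parent.
W_full = np.array([[0.0, 0.0, 0.0],
                   [0.7, 0.0, 0.0],
                   [-0.4, 1.5, 0.0]])
mean_full, cov_full = linear_gaussian_moments(W_full, b, v)

D = len(b)
n_weights = np.count_nonzero(np.tril(W_full, k=-1))  # D(D-1)/2 weights
n_cov_params = n_weights + D                         # D(D+1)/2 with the v_i
```

For $D = 3$ the fully connected case has 3 weights and 6 covariance parameters in total, matching the count of free entries in a symmetric $3 \times 3$ matrix.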

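Consider for example the graph shown in Figure 8.14, which has a link missing between variables $x_1$ and $x_3$. Using the recursion relations (8.15) and (8.16), we see that the mean and covariance of the joint distribution are given by

$$\boldsymbol{\mu} = \bigl(b_1,\; b_2 + w_{21} b_1,\; b_3 + w_{32} b_2 + w_{32} w_{21} b_1\bigr)^{\mathrm{T}} \tag{8.17}$$

$$\boldsymbol{\Sigma} = \begin{pmatrix}
v_1 & w_{21} v_1 & w_{32} w_{21} v_1 \\
w_{21} v_1 & v_2 + w_{21}^2 v_1 & w_{32}\,(v_2 + w_{21}^2 v_1) \\
w_{32} w_{21} v_1 & w_{32}\,(v_2 + w_{21}^2 v_1) & v_3 + w_{32}^2\,(v_2 + w_{21}^2 v_1)
\end{pmatrix}. \tag{8.18}$$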
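As a numerical cross-check of (8.17) and (8.18), one can stack (8.14) over all nodes into the matrix form $\mathbf{x} = \mathbf{W}\mathbf{x} + \mathbf{b} + \operatorname{diag}(\sqrt{v_i})\,\boldsymbol{\epsilon}$ and solve for $\mathbf{x}$. This route is not the one used in the text, which works by recursion; it is swapped in here as an independent check, and the parameter values below are hypothetical.

```python
import numpy as np

# Hypothetical parameter values for the chain x1 -> x2 -> x3 of Figure 8.14.
w21, w32 = 0.5, -2.0
b1, b2, b3 = 1.0, 0.0, 3.0
v1, v2, v3 = 2.0, 1.0, 0.5

W = np.array([[0.0, 0.0, 0.0],
              [w21, 0.0, 0.0],
              [0.0, w32, 0.0]])  # W[2, 0] = 0 encodes the missing link
b = np.array([b1, b2, b3])
v = np.array([v1, v2, v3])

# x = W x + b + diag(sqrt(v)) eps  =>  x = A (b + diag(sqrt(v)) eps),
# with A = (I - W)^{-1}; eps has zero mean and identity covariance.
A = np.linalg.inv(np.eye(3) - W)
mu = A @ b
Sigma = A @ np.diag(v) @ A.T

# Closed forms (8.17) and (8.18) from the text.
s2 = v2 + w21**2 * v1  # variance of x2
mu_book = np.array([b1, b2 + w21 * b1, b3 + w32 * b2 + w32 * w21 * b1])
Sigma_book = np.array([
    [v1,             w21 * v1, w32 * w21 * v1],
    [w21 * v1,       s2,       w32 * s2],
    [w32 * w21 * v1, w32 * s2, v3 + w32**2 * s2],
])

print(np.allclose(mu, mu_book), np.allclose(Sigma, Sigma_book))
```

Both comparisons agree for these values, since $(\mathbf{I} - \mathbf{W})^{-1} = \mathbf{I} + \mathbf{W} + \mathbf{W}^2$ for a strictly lower triangular $3 \times 3$ matrix, which reproduces the products of weights appearing in (8.17) and (8.18).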

We can readily extend the linear-Gaussian graphical model to the case in which the nodes of the graph represent multivariate Gaussian variables. In this case, we can write the conditional distribution for node $i$ in the form

$$p(\mathbf{x}_i \mid \mathrm{pa}_i) = \mathcal{N}\Bigl(\mathbf{x}_i \Bigm| \sum_{j \in \mathrm{pa}_i} \mathbf{W}_{ij} \mathbf{x}_j + \mathbf{b}_i,\; \boldsymbol{\Sigma}_i\Bigr) \tag{8.19}$$

where now $\mathbf{W}_{ij}$ is a matrix (which is nonsquare if $\mathbf{x}_i$ and $\mathbf{x}_j$ have different dimensionalities). Again it is easy to verify that the joint distribution over all variables is Gaussian.

[Margin note: Section 2.3.6]

Note that we have already encountered a specific example of the linear-Gaussian relationship when we saw that the conjugate prior for the mean $\mu$ of a Gaussian variable $x$ is itself a Gaussian distribution over $\mu$. The joint distribution over $x$ and $\mu$ is therefore Gaussian. This corresponds to a simple two-node graph in which the node representing $\mu$ is the parent of the node representing $x$. The mean of the distribution over $\mu$ is a parameter controlling a prior, and so it can be viewed as a hyperparameter.

Because the value of this hyperparameter may itself be unknown, we can again treat it from a Bayesian perspective by introducing a prior over the hyperparameter, sometimes called a hyperprior, which is again given by a Gaussian distribution. This type of construction can be extended in principle to any level and is an illustration of a hierarchical Bayesian model, of which we shall encounter further examples in later chapters.

8.2. Conditional Independence

An important concept for probability distributions over multiple variables is that of conditional independence (Dawid, 1980). Consider three variables $a$, $b$, and $c$, and suppose that the conditional distribution of $a$, given $b$ and $c$, is such that it does not depend on the value of $b$, so that

$$p(a \mid b, c) = p(a \mid c). \tag{8.20}$$

We say that $a$ is conditionally independent of $b$ given $c$. This can be expressed in a slightly different way if we consider the joint distribution of $a$ and $b$ conditioned on $c$, which we can write in the form

\begin{align}
p(a, b \mid c) &= p(a \mid b, c)\,p(b \mid c) \notag \\
&= p(a \mid c)\,p(b \mid c) \tag{8.21}
\end{align}

where we have used the product rule of probability together with (8.20).

Thus we see that, conditioned on $c$, the joint distribution of $a$ and $b$ factorizes into the product of the marginal distribution of $a$ and the marginal distribution of $b$ (again both conditioned on $c$). This says that the variables $a$ and $b$ are statistically independent, given $c$. Note that our definition of conditional independence will require that (8.20), or equivalently (8.21), must hold for every possible value of $c$, and not just for some values.

[Figure 8.15: The first of three examples of graphs over three variables $a$, $b$, and $c$ used to discuss conditional independence properties of directed graphical models.]
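The requirement that (8.21) hold for every value of $c$ can be illustrated on a small discrete joint table. The sketch below is an illustration only: the helper `is_cond_independent`, the array layout `p_abc[a, b, c]`, and the probability values are all assumptions made for the example, and every value of $c$ is assumed to have nonzero probability.

```python
import numpy as np

def is_cond_independent(p_abc, tol=1e-12):
    """Check (8.21), p(a,b|c) = p(a|c) p(b|c), for every value of c.

    p_abc[a, b, c] is a joint probability table with p(c) > 0 for all c.
    """
    p_c = p_abc.sum(axis=(0, 1))   # p(c)
    p_ab_c = p_abc / p_c           # p(a, b | c), broadcast over the c axis
    p_a_c = p_ab_c.sum(axis=1)     # p(a | c), shape (A, C)
    p_b_c = p_ab_c.sum(axis=0)     # p(b | c), shape (B, C)
    product = p_a_c[:, None, :] * p_b_c[None, :, :]
    return bool(np.allclose(p_ab_c, product, atol=tol))

# A joint built as p(a|c) p(b|c) p(c), so a and b are independent given c.
p_c = np.array([0.4, 0.6])
p_a_given_c = np.array([[0.3, 0.8],
                        [0.7, 0.2]])  # columns index c
p_b_given_c = np.array([[0.6, 0.1],
                        [0.4, 0.9]])
joint_ci = p_a_given_c[:, None, :] * p_b_given_c[None, :, :] * p_c

# Same table, except that for c = 1 the variables a and b are forced equal.
# The factorization still holds for c = 0 but fails for c = 1.
joint_partial = joint_ci.copy()
joint_partial[:, :, 1] = 0.6 * np.array([[0.5, 0.0],
                                         [0.0, 0.5]])

print(is_cond_independent(joint_ci))
print(is_cond_independent(joint_partial))
```

The first table satisfies the factorization for both values of $c$. The second satisfies it only for $c = 0$, so under the definition above $a$ and $b$ are not conditionally independent given $c$.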

We shall sometimes use a shorthand notation for conditional independence (Dawid, 1979) in which

$$a \perp\!\!\!\perp b \mid c \tag{8.22}$$

denotes that $a$ is conditionally independent of $b$ given $c$ and is equivalent to (8.20).

Conditional independence properties play an important role in using probabilistic models for pattern recognition by simplifying both the structure of a model and the computations needed to perform inference and learning under that model.

We shall see examples of this shortly.

If we are given an expression for the joint distribution over a set of variables in terms of a product of conditional distributions (i.e., the mathematical representation underlying a directed graph), then we could in principle test whether any potential conditional independence property holds by repeated application of the sum and product rules of probability. In practice, such an approach would be very time consuming. An important and elegant feature of graphical models is that conditional independence properties of the joint distribution can be read directly from the graph without having to perform any analytical manipulations. The general framework for achieving this is called d-separation, where the 'd' stands for 'directed' (Pearl, 1988). Here we shall motivate the concept of d-separation and give a general statement of the d-separation criterion.

Characteristics

File type: PDF file
Size: 9.37 MB
Material: Bishop C.M. Pattern Recognition and Machine Learning (2006).pdf
Material type: Book
Subject: Machine Learning Methods (ММО)
University: Lomonosov Moscow State University (MSU)
