Bishop, C.M., Pattern Recognition and Machine Learning (2006) — Chapter 8: Graphical Models (excerpt)
Thus, for a graph with K nodes, the joint distribution is given by

$$
p(\mathbf{x}) = \prod_{k=1}^{K} p(x_k \mid \mathrm{pa}_k)
\tag{8.5}
$$

where pa_k denotes the set of parents of x_k, and x = {x1, . . . , xK}. This key equation expresses the factorization properties of the joint distribution for a directed graphical model. Although we have considered each node to correspond to a single variable, we can equally well associate sets of variables and vector-valued variables with the nodes of a graph. It is easy to show (Exercise 8.1) that the representation on the right-hand side of (8.5) is always correctly normalized provided the individual conditional distributions are normalized.

The directed graphs that we are considering are subject to an important restriction, namely that there must be no directed cycles; in other words, there are no closed paths within the graph such that we can move from node to node along links following the direction of the arrows and end up back at the starting node. Such graphs are also called directed acyclic graphs, or DAGs. This is equivalent to the statement (Exercise 8.2) that there exists an ordering of the nodes such that there are no links that go from any node to any lower-numbered node.
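As a concrete illustration of (8.5), the following minimal Python sketch evaluates the factorized joint for a small discrete DAG and checks the normalization property noted above. The three-node chain, its conditional tables, and all names here are illustrative assumptions, not an example from the text.

```python
# A minimal sketch of the factorization (8.5) for a hypothetical three-node
# chain  x1 -> x2 -> x3  with binary variables. The graph and the tables are
# illustrative assumptions.

# Conditional distributions p(x_k | pa_k), indexed by the parent values.
p_x1 = {0: 0.6, 1: 0.4}                                     # p(x1), no parents
p_x2_given_x1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # p(x2 | x1)
p_x3_given_x2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # p(x3 | x2)

def joint(x1: int, x2: int, x3: int) -> float:
    """Evaluate p(x) = p(x1) p(x2|x1) p(x3|x2), an instance of (8.5)."""
    return p_x1[x1] * p_x2_given_x1[x1][x2] * p_x3_given_x2[x2][x3]

# The factorized joint sums to one provided each conditional is normalized.
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(joint(0, 1, 0), total)   # total should be 1.0 (up to rounding)
```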
8.1.1 Example: Polynomial regression

As an illustration of the use of directed graphs to describe probability distributions, we consider the Bayesian polynomial regression model introduced in Section 1.2.6.
Figure 8.3  Directed graphical model representing the joint distribution (8.6) corresponding to the Bayesian polynomial regression model introduced in Section 1.2.6. The graph contains the node w together with the nodes t1, . . . , tN.

The random variables in this model are the vector of polynomial coefficients w and the observed data t = (t1, . . . , tN)^T. In addition, this model contains the input data x = (x1, . . . , xN)^T, the noise variance σ², and the hyperparameter α representing the precision of the Gaussian prior over w, all of which are parameters of the model rather than random variables. Focussing just on the random variables for the moment, we see that the joint distribution is given by the product of the prior p(w) and N conditional distributions p(tn | w) for n = 1, . . . , N, so that
$$
p(\mathbf{t}, \mathbf{w}) = p(\mathbf{w}) \prod_{n=1}^{N} p(t_n \mid \mathbf{w}).
\tag{8.6}
$$

This joint distribution can be represented by a graphical model shown in Figure 8.3.

When we start to deal with more complex models later in the book, we shall find it inconvenient to have to write out multiple nodes of the form t1, . . . , tN explicitly as in Figure 8.3. We therefore introduce a graphical notation that allows such multiple nodes to be expressed more compactly, in which we draw a single representative node tn and then surround this with a box, called a plate, labelled with N indicating that there are N nodes of this kind. Re-writing the graph of Figure 8.3 in this way, we obtain the graph shown in Figure 8.4.

Figure 8.4  An alternative, more compact, representation of the graph shown in Figure 8.3 in which we have introduced a plate (the box labelled N) that represents N nodes, of which only a single example tn is shown explicitly.
We shall sometimes find it helpful to make the parameters of a model, as well as its stochastic variables, explicit. In this case, (8.6) becomes

$$
p(\mathbf{t}, \mathbf{w} \mid \mathbf{x}, \alpha, \sigma^2) = p(\mathbf{w} \mid \alpha) \prod_{n=1}^{N} p(t_n \mid \mathbf{w}, x_n, \sigma^2).
$$

Correspondingly, we can make x and α explicit in the graphical representation. To do this, we shall adopt the convention that random variables will be denoted by open circles, and deterministic parameters will be denoted by smaller solid circles. If we take the graph of Figure 8.4 and include the deterministic parameters, we obtain the graph shown in Figure 8.5.

Figure 8.5  This shows the same model as in Figure 8.4 but with the deterministic parameters shown explicitly by the smaller solid nodes.
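For concreteness, the sketch below draws a single sample from this joint distribution by first sampling w from its Gaussian prior with precision α and then sampling each tn from a Gaussian centred on the value of the polynomial at xn with variance σ². The particular values of α, σ², the polynomial order, and the inputs x are illustrative assumptions, not values from the text.

```python
import numpy as np

# A minimal generative sketch of the factorization above: draw w from the
# Gaussian prior p(w | alpha) and then each t_n from p(t_n | w, x_n, sigma^2),
# using a polynomial of order M as in Section 1.2.6. The settings of alpha,
# sigma2, M, N and the inputs x are illustrative assumptions.
rng = np.random.default_rng(0)
alpha, sigma2, M, N = 2.0, 0.25, 3, 10

x = np.linspace(0.0, 1.0, N)                      # deterministic inputs x_n
Phi = np.vander(x, M + 1, increasing=True)        # polynomial features phi(x_n)

w = rng.normal(0.0, np.sqrt(1.0 / alpha), size=M + 1)   # w ~ N(0, alpha^{-1} I)
t = rng.normal(Phi @ w, np.sqrt(sigma2))                # t_n ~ N(w^T phi(x_n), sigma^2)
print(w, t)
```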
When we apply a graphical model to a problem in machine learning or pattern recognition, we will typically set some of the random variables to specific observed values, for example the variables {tn} from the training set in the case of polynomial curve fitting. In a graphical model, we will denote such observed variables by shading the corresponding nodes. Thus the graph corresponding to Figure 8.5 in which the variables {tn} are observed is shown in Figure 8.6.

Figure 8.6  As in Figure 8.5 but with the nodes {tn} shaded to indicate that the corresponding random variables have been set to their observed (training set) values.
Note that the value of w is not observed, and so w is an example of a latent variable, also known as a hidden variable. Such variables play a crucial role in many probabilistic models and will form the focus of Chapters 9 and 12.

Having observed the values {tn} we can, if desired, evaluate the posterior distribution of the polynomial coefficients w as discussed in Section 1.2.5. For the moment, we note that this involves a straightforward application of Bayes' theorem

$$
p(\mathbf{w} \mid T) \propto p(\mathbf{w}) \prod_{n=1}^{N} p(t_n \mid \mathbf{w})
\tag{8.7}
$$

where again we have omitted the deterministic parameters in order to keep the notation uncluttered.
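Because the prior over w is Gaussian with precision α and each likelihood term is Gaussian with variance σ² (Section 1.2.6), the right-hand side of (8.7) can be evaluated directly, up to normalization, for any candidate w. The sketch below does this in log space; the data, the hyperparameter settings, and the candidate coefficient vectors are illustrative assumptions.

```python
import numpy as np

# Evaluate the unnormalized posterior (8.7) for a candidate w:
# log p(w) + sum_n log p(t_n | w), with a Gaussian prior of precision alpha
# and Gaussian noise of variance sigma^2. Data and candidates are illustrative.
def log_gauss(y, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (y - mean) ** 2 / var)

def unnormalized_log_posterior(w, x, t, alpha, sigma2):
    Phi = np.vander(x, len(w), increasing=True)         # polynomial features
    log_prior = np.sum(log_gauss(w, 0.0, 1.0 / alpha))  # log p(w)
    log_lik = np.sum(log_gauss(t, Phi @ w, sigma2))     # sum_n log p(t_n | w)
    return log_prior + log_lik

x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x)
w_a = np.zeros(4)                                   # a poor candidate
w_b = np.array([0.0, 10.67, -32.0, 21.33])          # roughly fits sin(2*pi*x) on [0, 1]
print(unnormalized_log_posterior(w_a, x, t, alpha=2.0, sigma2=0.1),
      unnormalized_log_posterior(w_b, x, t, alpha=2.0, sigma2=0.1))
```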
In general, model parameters such as w are of little direct interest in themselves, because our ultimate goal is to make predictions for new input values. Suppose we are given a new input value x̂ and we wish to find the corresponding probability distribution for t̂ conditioned on the observed data. The graphical model that describes this problem is shown in Figure 8.7, and the corresponding joint distribution of all of the random variables in this model, conditioned on the deterministic parameters, is then given by

$$
p(\widehat{t}, \mathbf{t}, \mathbf{w} \mid \widehat{x}, \mathbf{x}, \alpha, \sigma^2) = \left[ \prod_{n=1}^{N} p(t_n \mid x_n, \mathbf{w}, \sigma^2) \right] p(\mathbf{w} \mid \alpha)\, p(\widehat{t} \mid \widehat{x}, \mathbf{w}, \sigma^2).
\tag{8.8}
$$
Figure 8.7  The polynomial regression model, corresponding to Figure 8.6, showing also a new input value x̂ together with the corresponding model prediction t̂.

The required predictive distribution for t̂ is then obtained, from the sum rule of probability, by integrating out the model parameters w so that

$$
p(\widehat{t} \mid \widehat{x}, \mathbf{x}, \mathbf{t}, \alpha, \sigma^2) \propto \int p(\widehat{t}, \mathbf{t}, \mathbf{w} \mid \widehat{x}, \mathbf{x}, \alpha, \sigma^2)\, \mathrm{d}\mathbf{w}
$$

where we are implicitly setting the random variables in t to the specific values observed in the data set. The details of this calculation were discussed in Chapter 3.
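For this Gaussian model the integral is available in closed form, following the Bayesian linear-regression results of Chapter 3: the predictive distribution is Gaussian with mean m_N^T φ(x̂) and variance σ² + φ(x̂)^T S_N φ(x̂), where m_N and S_N are the posterior mean and covariance of w. The sketch below assumes those Chapter 3 results; the data and numerical settings are illustrative.

```python
import numpy as np

# Closed-form Gaussian predictive distribution for the polynomial model,
# assuming the Bayesian linear-regression results of Chapter 3 (beta = 1/sigma^2).
# Inputs, targets, and hyperparameter values are illustrative assumptions.
def predictive(x_hat, x, t, alpha, sigma2, M):
    beta = 1.0 / sigma2
    Phi = np.vander(x, M + 1, increasing=True)              # design matrix
    S_N = np.linalg.inv(alpha * np.eye(M + 1) + beta * Phi.T @ Phi)  # posterior covariance
    m_N = beta * S_N @ Phi.T @ t                            # posterior mean
    phi = np.vander(np.atleast_1d(x_hat), M + 1, increasing=True)[0]
    mean = phi @ m_N                                        # predictive mean
    var = sigma2 + phi @ S_N @ phi                          # predictive variance
    return mean, var

x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x)
print(predictive(0.35, x, t, alpha=2.0, sigma2=0.01, M=3))
```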
8.1.2 Generative models

There are many situations in which we wish to draw samples from a given probability distribution. Although we shall devote the whole of Chapter 11 to a detailed discussion of sampling methods, it is instructive to outline here one technique, called ancestral sampling, which is particularly relevant to graphical models. Consider a joint distribution p(x1, . . . , xK) over K variables that factorizes according to (8.5) corresponding to a directed acyclic graph. We shall suppose that the variables have been ordered such that there are no links from any node to any lower-numbered node; in other words, each node has a higher number than any of its parents.
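Such an ordering is a topological ordering of the DAG, and one can always be constructed, for example with Kahn's algorithm as in the sketch below; the example graph, stored as a node-to-parents dictionary, is an illustrative assumption.

```python
from collections import deque

# A minimal sketch of Kahn's algorithm: order the nodes of a DAG so that every
# node appears after all of its parents. The example graph (node -> parents)
# is an illustrative assumption.
parents = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}

def topological_order(parents):
    children = {node: [] for node in parents}
    remaining = {node: len(pars) for node, pars in parents.items()}  # unprocessed parents
    for node, pars in parents.items():
        for p in pars:
            children[p].append(node)
    ready = deque(node for node, count in remaining.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)
    if len(order) != len(parents):
        raise ValueError("graph contains a directed cycle")
    return order

print(topological_order(parents))   # e.g. ['a', 'b', 'c', 'd']
```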
Our goal is to draw a sample x̂1, . . . , x̂K from the joint distribution. To do this, we start with the lowest-numbered node and draw a sample from the distribution p(x1), which we call x̂1. We then work through each of the nodes in order, so that for node n we draw a sample from the conditional distribution p(xn | pan) in which the parent variables have been set to their sampled values. Note that at each stage, these parent values will always be available because they correspond to lower-numbered nodes that have already been sampled. Techniques for sampling from specific distributions will be discussed in detail in Chapter 11. Once we have sampled from the final variable xK, we will have achieved our objective of obtaining a sample from the joint distribution.
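A minimal sketch of this procedure for a small discrete DAG is given below; the graph, its conditional probability tables, and the helper for sampling from a discrete distribution are illustrative assumptions, not an example from the text.

```python
import random

# Ancestral sampling sketch for a small discrete DAG. Nodes are listed in an
# order compatible with the graph (each node after its parents), and each
# conditional p(x_n | pa_n) is a table keyed by the parents' sampled values.
# The graph and the tables are illustrative assumptions.
random.seed(0)

nodes = ["x1", "x2", "x3"]
parents = {"x1": [], "x2": ["x1"], "x3": ["x2"]}
cpts = {
    "x1": {(): {0: 0.6, 1: 0.4}},
    "x2": {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}},
    "x3": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.5, 1: 0.5}},
}

def sample_discrete(dist):
    """Draw a value from a {value: probability} table."""
    values, probs = zip(*dist.items())
    return random.choices(values, weights=probs)[0]

def ancestral_sample():
    sample = {}
    for node in nodes:                                  # visit nodes parents-first
        key = tuple(sample[p] for p in parents[node])   # parents already sampled
        sample[node] = sample_discrete(cpts[node][key])
    return sample

print(ancestral_sample())   # e.g. {'x1': 1, 'x2': 1, 'x3': 0}
```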
To obtain a sample from some marginal distribution corresponding to a subset of the variables, we simply take the sampled values for the required nodes and ignore the sampled values for the remaining nodes. For example, to draw a sample from the distribution p(x2, x4), we simply sample from the full joint distribution and then retain the values x̂2, x̂4 and discard the remaining values {x̂_{j≠2,4}}.

Figure 8.8  A graphical model representing the process by which images of objects are created, in which the identity of an object (a discrete variable) and the position and orientation of that object (continuous variables) have independent prior probabilities. The image (a vector of pixel intensities) has a probability distribution that is dependent on the identity of the object as well as on its position and orientation.

For practical applications of probabilistic models, it will typically be the higher-numbered variables corresponding to terminal nodes of the graph that represent the observations, with lower-numbered nodes corresponding to latent variables.
The primary role of the latent variables is to allow a complicated distribution over the observed variables to be represented in terms of a model constructed from simpler (typically exponential family) conditional distributions.

We can interpret such models as expressing the processes by which the observed data arose. For instance, consider an object recognition task in which each observed data point corresponds to an image (comprising a vector of pixel intensities) of one of the objects. In this case, the latent variables might have an interpretation as the position and orientation of the object. Given a particular observed image, our goal is to find the posterior distribution over objects, in which we integrate over all possible positions and orientations.
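As a schematic illustration of this generative view, ancestral sampling of the graph in Figure 8.8 would draw the object identity, position, and orientation from their independent priors and then draw an image conditioned on all three. Every distribution, dimension, and the toy rendering function in the sketch below is an illustrative assumption rather than a model from the text.

```python
import numpy as np

# Schematic ancestral sampling of the generative model of Figure 8.8: sample
# object identity, position and orientation from independent priors, then an
# image conditioned on all three. All choices below are illustrative.
rng = np.random.default_rng(0)
objects = ["cup", "phone", "book"]

def render(obj, position, orientation, size=8):
    """Toy deterministic mean image: a smooth bump whose width, location and
    tilt stand in for the object identity, position and orientation."""
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = position
    width = 1.0 + objects.index(obj)                  # identity sets the bump width
    tilt = np.cos(orientation) * (xs - cx) + np.sin(orientation) * (ys - cy)
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2 + tilt ** 2) / (2 * width ** 2))

obj = rng.choice(objects)                             # discrete prior over object identity
position = rng.uniform(0, 8, size=2)                  # continuous prior over position
orientation = rng.uniform(0, 2 * np.pi)               # continuous prior over orientation
image = render(obj, position, orientation) + rng.normal(0, 0.05, (8, 8))  # pixel noise
print(obj, position, orientation, image.shape)
```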