Bishop C.M. Pattern Recognition and Machine Learning (2006) — text of page 8 of the PDF

If we had been asked which box had been chosen before being told the identity of the selected item of fruit, then the most complete information we have available is provided by the probability p(B). We call this the prior probability because it is the probability available before we observe the identity of the fruit.

Once we are told that the fruit is an orange, we can then use Bayes' theorem to compute the probability p(B|F), which we shall call the posterior probability because it is the probability obtained after we have observed F. Note that in this example, the prior probability of selecting the red box was 4/10, so that we were more likely to select the blue box than the red one. However, once we have observed that the piece of selected fruit is an orange, we find that the posterior probability of the red box is now 2/3, so that it is now more likely that the box we selected was in fact the red one.
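As a quick numerical check, here is a minimal sketch of this calculation in Python. It assumes the box contents introduced earlier in the chapter (red box: 2 apples and 6 oranges; blue box: 3 apples and 1 orange), which are not restated in this excerpt.

```python
# Bayes' theorem for the fruit-box example: p(B | F = orange).
# Assumed box contents (from earlier in the chapter): red = 2 apples, 6 oranges;
# blue = 3 apples, 1 orange.
prior = {"red": 4 / 10, "blue": 6 / 10}       # p(B)
likelihood = {"red": 6 / 8, "blue": 1 / 4}    # p(F = orange | B)

# Denominator of Bayes' theorem: p(F = orange) = sum_B p(F = orange | B) p(B)
evidence = sum(likelihood[b] * prior[b] for b in prior)

posterior = {b: likelihood[b] * prior[b] / evidence for b in prior}
print(posterior)   # p(red | orange) = 2/3, p(blue | orange) = 1/3
```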

This result accords with our intuition, as the proportion of oranges is much higher in the red box than it is in the blue box, and so the observation that the fruit was an orange provides significant evidence favouring the red box. In fact, the evidence is sufficiently strong that it outweighs the prior and makes it more likely that the red box was chosen rather than the blue one.

Finally, we note that if the joint distribution of two variables factorizes into the product of the marginals, so that p(X, Y) = p(X)p(Y), then X and Y are said to be independent. From the product rule, we see that p(Y|X) = p(Y), and so the conditional distribution of Y given X is indeed independent of the value of X.

For instance, in our boxes of fruit example, if each box contained the same fraction of apples and oranges, then p(F|B) = p(F), so that the probability of selecting, say, an apple is independent of which box is chosen.

1.2.1 Probability densities

As well as considering probabilities defined over discrete sets of events, we also wish to consider probabilities with respect to continuous variables. We shall limit ourselves to a relatively informal discussion. If the probability of a real-valued variable x falling in the interval (x, x + δx) is given by p(x)δx for δx → 0, then p(x) is called the probability density over x. This is illustrated in Figure 1.12.
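A small sketch of what this definition means in practice, assuming for concreteness that p(x) is a standard Gaussian (an illustrative choice, not part of the example in the text): the empirical fraction of samples falling in (x, x + δx) approaches p(x)δx as δx shrinks.

```python
# Defining property of a density: for small dx, the probability of landing
# in (x0, x0 + dx) is roughly p(x0) * dx. Illustrative p(x): standard normal.
import numpy as np

rng = np.random.default_rng(0)
samples = rng.standard_normal(1_000_000)

x0 = 0.5
p_x0 = np.exp(-0.5 * x0**2) / np.sqrt(2 * np.pi)   # p(x0) for the standard normal

for dx in (1.0, 0.1, 0.01):
    frac = np.mean((samples >= x0) & (samples < x0 + dx))   # empirical probability
    print(dx, frac, p_x0 * dx)                               # agreement improves as dx -> 0
```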

The probability that x will lie in an interval (a, b) is then given by

p(x \in (a, b)) = \int_a^b p(x)\,dx.    (1.24)

[Figure 1.12: The concept of probability for discrete variables can be extended to that of a probability density p(x) over a continuous variable x, such that the probability of x lying in the interval (x, x + δx) is given by p(x)δx for δx → 0. The probability density can be expressed as the derivative of a cumulative distribution function P(x).]

Because probabilities are nonnegative, and because the value of x must lie somewhere on the real axis, the probability density p(x) must satisfy the two conditions

p(x) \geqslant 0    (1.25)

\int_{-\infty}^{\infty} p(x)\,dx = 1.    (1.26)

Under a nonlinear change of variable, a probability density transforms differently from a simple function, due to the Jacobian factor.

For instance, if we consider a change of variables x = g(y), then a function f(x) becomes f̃(y) = f(g(y)). Now consider a probability density p_x(x) that corresponds to a density p_y(y) with respect to the new variable y, where the suffices denote the fact that p_x(x) and p_y(y) are different densities. Observations falling in the range (x, x + δx) will, for small values of δx, be transformed into the range (y, y + δy) where p_x(x)δx ≃ p_y(y)δy, and hence

p_y(y) = p_x(x)\left|\frac{dx}{dy}\right| = p_x(g(y))\,|g'(y)|.    (1.27)

One consequence of this property is that the concept of the maximum of a probability density is dependent on the choice of variable (Exercise 1.4).
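The transformation rule (1.27) can be checked numerically. The sketch below assumes an illustrative choice not taken from the book: x drawn from a standard Gaussian with x = g(y) = ln y, so that y = exp(x); the density given by (1.27) still normalises to one and agrees with the empirical distribution of the transformed samples.

```python
# Numerical check of (1.27): p_y(y) = p_x(g(y)) |g'(y)| with x = g(y).
# Illustrative choice: p_x standard normal, g(y) = ln(y), so |g'(y)| = 1/y
# and y = exp(x) should follow the log-normal density.
import numpy as np

def p_x(x):                          # standard normal density
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

def p_y(y):                          # density implied by the Jacobian rule (1.27)
    return p_x(np.log(y)) / y

rng = np.random.default_rng(0)
y_samples = np.exp(rng.standard_normal(1_000_000))   # samples of y = exp(x)

grid = np.linspace(1e-6, 50.0, 500_000)
step = grid[1] - grid[0]
print((p_y(grid) * step).sum())       # transformed density still integrates to ~1

a, b = 0.5, 2.0                       # probability of an interval, two ways
in_ab = (grid > a) & (grid < b)
print(((y_samples > a) & (y_samples < b)).mean(),   # empirical fraction of samples
      (p_y(grid[in_ab]) * step).sum())               # integral of p_y over (a, b)
```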

The probability that x lies in the interval (−∞, z) is given by the cumulative distribution function defined by

P(z) = \int_{-\infty}^{z} p(x)\,dx    (1.28)

which satisfies P'(x) = p(x), as shown in Figure 1.12.

If we have several continuous variables x_1, \ldots, x_D, denoted collectively by the vector x, then we can define a joint probability density p(x) = p(x_1, \ldots, x_D) such that the probability of x falling in an infinitesimal volume δx containing the point x is given by p(x)δx. This multivariate probability density must satisfy

p(x) \geqslant 0    (1.29)

\int p(x)\,dx = 1    (1.30)

in which the integral is taken over the whole of x space. We can also consider joint probability distributions over a combination of discrete and continuous variables.

Note that if x is a discrete variable, then p(x) is sometimes called a probability mass function because it can be regarded as a set of 'probability masses' concentrated at the allowed values of x.

The sum and product rules of probability, as well as Bayes' theorem, apply equally to the case of probability densities, or to combinations of discrete and continuous variables.

For instance, if x and y are two real variables, then the sum and product rules take the form

p(x) = \int p(x, y)\,dy    (1.31)

p(x, y) = p(y|x)\,p(x).    (1.32)

A formal justification of the sum and product rules for continuous variables (Feller, 1966) requires a branch of mathematics called measure theory and lies outside the scope of this book. Its validity can be seen informally, however, by dividing each real variable into intervals of width ∆ and considering the discrete probability distribution over these intervals. Taking the limit ∆ → 0 then turns sums into integrals and gives the desired result.
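The informal ∆ → 0 argument can also be seen numerically. The sketch below uses an assumed joint density (a correlated bivariate Gaussian, purely for illustration) and shows the discretised sum over y-bins of width ∆ approaching the analytic marginal p(x).

```python
# Discretised version of the sum rule (1.31): p(x) = ∫ p(x, y) dy.
# Illustrative joint density: zero-mean bivariate Gaussian with unit variances
# and correlation 0.6, whose x-marginal is a standard normal.
import numpy as np

rho = 0.6

def joint(x, y):                      # bivariate Gaussian density p(x, y)
    norm = 1.0 / (2 * np.pi * np.sqrt(1 - rho**2))
    quad = (x**2 - 2 * rho * x * y + y**2) / (1 - rho**2)
    return norm * np.exp(-0.5 * quad)

def marginal(x):                      # analytic marginal p(x): standard normal
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

x0 = 0.7                              # evaluate the marginal at one point
for delta in (1.0, 0.1, 0.01):
    y = np.arange(-8, 8, delta)       # intervals of width delta
    approx = (joint(x0, y) * delta).sum()   # sum over y-bins -> integral as delta -> 0
    print(delta, approx, marginal(x0))
```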

1.2.2 Expectations and covariances

One of the most important operations involving probabilities is that of finding weighted averages of functions. The average value of some function f(x) under a probability distribution p(x) is called the expectation of f(x) and will be denoted by E[f]. For a discrete distribution, it is given by

E[f] = \sum_x p(x)\,f(x)    (1.33)

so that the average is weighted by the relative probabilities of the different values of x. In the case of continuous variables, expectations are expressed in terms of an integration with respect to the corresponding probability density

E[f] = \int p(x)\,f(x)\,dx.    (1.34)

In either case, if we are given a finite number N of points drawn from the probability distribution or probability density, then the expectation can be approximated as a finite sum over these points

E[f] \simeq \frac{1}{N} \sum_{n=1}^{N} f(x_n).    (1.35)

We shall make extensive use of this result when we discuss sampling methods in Chapter 11. The approximation in (1.35) becomes exact in the limit N → ∞.
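A minimal sketch of the approximation (1.35), under the illustrative assumption that p(x) is a standard Gaussian and f(x) = x², for which the exact expectation is E[x²] = 1.

```python
# Monte Carlo approximation (1.35): E[f] ≈ (1/N) Σ_n f(x_n), with x_n ~ p(x).
# Illustrative choice: p(x) standard normal, f(x) = x^2, so the exact answer is 1.
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: x**2

for N in (10, 1_000, 100_000):
    x = rng.standard_normal(N)        # N points drawn from p(x)
    print(N, f(x).mean())             # approaches 1.0 as N grows
```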

Sometimes we will be considering expectations of functions of several variables, in which case we can use a subscript to indicate which variable is being averaged over, so that for instance

E_x[f(x, y)]    (1.36)

denotes the average of the function f(x, y) with respect to the distribution of x. Note that E_x[f(x, y)] will be a function of y.

We can also consider a conditional expectation with respect to a conditional distribution, so that

E_x[f|y] = \sum_x p(x|y)\,f(x)    (1.37)

with an analogous definition for continuous variables.
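The subscript and conditional notations of (1.36) and (1.37) can be made concrete on a small discrete example; the joint distribution below is invented purely for illustration.

```python
# Sketch of (1.36) and (1.37) for a made-up joint distribution p(x, y)
# over x, y ∈ {0, 1} (illustrative numbers only).
import numpy as np

p_xy = np.array([[0.1, 0.3],     # rows index x, columns index y; entries sum to 1
                 [0.2, 0.4]])
f = np.array([[1.0, 2.0],        # f(x, y) evaluated on the same grid
              [3.0, 4.0]])

# (1.36) E_x[f(x, y)]: average over x using the marginal p(x); result is a function of y.
p_x = p_xy.sum(axis=1)                                     # marginal p(x)
print((p_x[:, None] * f).sum(axis=0))

# (1.37) E_x[f | y]: average over x using the conditional p(x|y), again a function of y.
p_x_given_y = p_xy / p_xy.sum(axis=0, keepdims=True)       # p(x|y) for each column y
print((p_x_given_y * f).sum(axis=0))
```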

The variance of f(x) is defined by

var[f] = E[(f(x) - E[f(x)])^2]    (1.38)

and provides a measure of how much variability there is in f(x) around its mean value E[f(x)] (Exercise 1.5). Expanding out the square, we see that the variance can also be written in terms of the expectations of f(x) and f(x)^2

var[f] = E[f(x)^2] - E[f(x)]^2.    (1.39)

In particular, we can consider the variance of the variable x itself, which is given by

var[x] = E[x^2] - E[x]^2.    (1.40)

For two random variables x and y, the covariance is defined by

cov[x, y] = E_{x,y}[\{x - E[x]\}\{y - E[y]\}] = E_{x,y}[xy] - E[x]E[y]    (1.41)

which expresses the extent to which x and y vary together (Exercise 1.6). If x and y are independent, then their covariance vanishes.

In the case of two vectors of random variables x and y, the covariance is a matrix

cov[x, y] = E_{x,y}[\{x - E[x]\}\{y^T - E[y^T]\}] = E_{x,y}[xy^T] - E[x]E[y^T].    (1.42)

If we consider the covariance of the components of a vector x with each other, then we use a slightly simpler notation cov[x] ≡ cov[x, x].
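The identities (1.39)–(1.41) and the matrix form (1.42) are easy to verify on samples. The data below are illustrative (correlated Gaussian noise, not anything from the book).

```python
# Checking var[f] = E[f^2] - E[f]^2 (1.39) and cov[x, y] = E[xy] - E[x]E[y] (1.41)
# on illustrative correlated samples.
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(100_000)
y = 0.5 * x + rng.standard_normal(100_000)    # y is correlated with x

# Variance, two equivalent forms (1.38) vs (1.40):
print(((x - x.mean())**2).mean(), (x**2).mean() - x.mean()**2)

# Covariance, two equivalent forms (1.41):
print(((x - x.mean()) * (y - y.mean())).mean(), (x * y).mean() - x.mean() * y.mean())

# Covariance matrix of the vector (x, y), as in (1.42) with cov[x] ≡ cov[x, x]:
X = np.stack([x, y])                           # rows are variables, columns are samples
print(np.cov(X, bias=True))                    # entries match the scalar results above
```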

1.2.3 Bayesian probabilities

So far in this chapter, we have viewed probabilities in terms of the frequencies of random, repeatable events. We shall refer to this as the classical or frequentist interpretation of probability. Now we turn to the more general Bayesian view, in which probabilities provide a quantification of uncertainty.

Consider an uncertain event, for example whether the moon was once in its own orbit around the sun, or whether the Arctic ice cap will have disappeared by the end of the century. These are not events that can be repeated numerous times in order to define a notion of probability as we did earlier in the context of boxes of fruit. Nevertheless, we will generally have some idea, for example, of how quickly we think the polar ice is melting.

If we now obtain fresh evidence, for instance from a new Earth observation satellite gathering novel forms of diagnostic information, we may revise our opinion on the rate of ice loss. Our assessment of such matters will affect the actions we take, for instance the extent to which we endeavour to reduce the emission of greenhouse gasses. In such circumstances, we would like to be able to quantify our expression of uncertainty and make precise revisions of uncertainty in the light of new evidence, as well as subsequently to be able to take optimal actions or decisions as a consequence. This can all be achieved through the elegant, and very general, Bayesian interpretation of probability.

The use of probability to represent uncertainty, however, is not an ad-hoc choice, but is inevitable if we are to respect common sense while making rational coherent inferences.

For instance, Cox (1946) showed that if numerical values are used to represent degrees of belief, then a simple set of axioms encoding common sense properties of such beliefs leads uniquely to a set of rules for manipulating degrees of belief that are equivalent to the sum and product rules of probability. This provided the first rigorous proof that probability theory could be regarded as an extension of Boolean logic to situations involving uncertainty (Jaynes, 2003). Numerous other authors have proposed different sets of properties or axioms that such measures of uncertainty should satisfy (Ramsey, 1931; Good, 1950; Savage, 1961; de Finetti, 1970; Lindley, 1982). In each case, the resulting numerical quantities behave precisely according to the rules of probability.

It is therefore natural to refer to these quantities as (Bayesian) probabilities.

In the field of pattern recognition, too, it is helpful to have a more general no-

[Sidebar: Thomas Bayes, 1701–1761. Thomas Bayes was born in Tunbridge Wells and was a clergyman as well as an amateur scientist and a mathematician. He studied logic and theology at Edinburgh University and was elected Fellow of the Royal Society in 1742.]
