Bishop C.M., Pattern Recognition and Machine Learning (2006), excerpt
We can try to approximate this distribution using some parametric distribution q(x|θ), governed by a set of adjustable parameters θ, for example a multivariate Gaussian. One way to determine θ is to minimize the Kullback-Leibler divergence between p(x) and q(x|θ) with respect to θ. We cannot do this directly because we don't know p(x). Suppose, however, that we have observed a finite set of training points $\mathbf{x}_n$, for n = 1, ..., N, drawn from p(x).
Then the expectation with respect to p(x) can be approximated by a finite sum over these points, using (1.35), so that

$$
\mathrm{KL}(p\,\|\,q) \simeq \frac{1}{N}\sum_{n=1}^{N}\bigl\{-\ln q(\mathbf{x}_n|\boldsymbol{\theta}) + \ln p(\mathbf{x}_n)\bigr\}. \tag{1.119}
$$

The second term on the right-hand side of (1.119) is independent of θ, and the first term is the negative log likelihood function for θ under the distribution q(x|θ) evaluated using the training set. Thus we see that minimizing this Kullback-Leibler divergence is equivalent to maximizing the likelihood function.
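As an illustration of this equivalence, the following sketch (not part of the book) fits a univariate Gaussian q(x|µ, σ²) to samples from an unknown p(x) by minimizing the sampled negative log likelihood, which differs from the Monte Carlo estimate of KL(p‖q) in (1.119) only by a term that does not depend on the parameters. The data-generating distribution and the helper name `neg_log_likelihood` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend p(x) is unknown; we only observe samples drawn from it.
samples = rng.normal(loc=1.5, scale=0.8, size=5000)

def neg_log_likelihood(mu, sigma2, x):
    """Negative log likelihood of N(x | mu, sigma2); this is the
    theta-dependent part of the sampled KL divergence in (1.119)."""
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + (x - mu) ** 2 / sigma2)

# Closed-form maximum likelihood estimates for a univariate Gaussian.
mu_ml = samples.mean()
sigma2_ml = samples.var()

# Perturbing the ML estimates increases the negative log likelihood,
# i.e. increases the sampled KL divergence.
nll_ml = neg_log_likelihood(mu_ml, sigma2_ml, samples)
for mu, sigma2 in [(mu_ml + 0.3, sigma2_ml), (mu_ml, 2.0 * sigma2_ml)]:
    assert neg_log_likelihood(mu, sigma2, samples) > nll_ml

print(mu_ml, sigma2_ml)
```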
Now consider the joint distribution between two sets of variables x and y given by p(x, y). If the sets of variables are independent, then their joint distribution will factorize into the product of their marginals p(x, y) = p(x)p(y). If the variables are not independent, we can gain some idea of whether they are 'close' to being independent by considering the Kullback-Leibler divergence between the joint distribution and the product of the marginals, given by

$$
I[\mathbf{x}, \mathbf{y}] \equiv \mathrm{KL}\bigl(p(\mathbf{x}, \mathbf{y})\,\|\,p(\mathbf{x})p(\mathbf{y})\bigr)
= -\iint p(\mathbf{x}, \mathbf{y}) \ln\!\left(\frac{p(\mathbf{x})\,p(\mathbf{y})}{p(\mathbf{x}, \mathbf{y})}\right) \mathrm{d}\mathbf{x}\,\mathrm{d}\mathbf{y} \tag{1.120}
$$

which is called the mutual information between the variables x and y. From the properties of the Kullback-Leibler divergence, we see that $I[\mathbf{x}, \mathbf{y}] \geq 0$, with equality if, and only if, x and y are independent. Using the sum and product rules of probability, we see that the mutual information is related to the conditional entropy through (Exercise 1.41)

$$
I[\mathbf{x}, \mathbf{y}] = H[\mathbf{x}] - H[\mathbf{x}|\mathbf{y}] = H[\mathbf{y}] - H[\mathbf{y}|\mathbf{x}]. \tag{1.121}
$$

Thus we can view the mutual information as the reduction in the uncertainty about x by virtue of being told the value of y (or vice versa). From a Bayesian perspective, we can view p(x) as the prior distribution for x and p(x|y) as the posterior distribution after we have observed new data y. The mutual information therefore represents the reduction in uncertainty about x as a consequence of the new observation y.
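To make the definition concrete, here is a small numerical check (an illustrative addition, not from the book) that evaluates (1.120) and (1.121) for a toy discrete joint distribution; the particular 2x2 table of probabilities is an arbitrary assumption.

```python
import numpy as np

# Toy joint distribution p(x, y) over two binary variables (rows: x, columns: y).
p_xy = np.array([[0.30, 0.10],
                 [0.20, 0.40]])

p_x = p_xy.sum(axis=1)   # marginal p(x)
p_y = p_xy.sum(axis=0)   # marginal p(y)

# Mutual information as KL(p(x,y) || p(x)p(y)), eq. (1.120).
mi_kl = np.sum(p_xy * np.log(p_xy / np.outer(p_x, p_y)))

# Entropies H[x] and H[x|y] for eq. (1.121).
h_x = -np.sum(p_x * np.log(p_x))
p_x_given_y = p_xy / p_y                      # column j holds p(x | y = j)
h_x_given_y = -np.sum(p_xy * np.log(p_x_given_y))

assert np.isclose(mi_kl, h_x - h_x_given_y)   # I[x,y] = H[x] - H[x|y]
print(mi_kl)
```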
Exercises

1.1 (www) Consider the sum-of-squares error function given by (1.2) in which the function y(x, w) is given by the polynomial (1.1). Show that the coefficients w = {w_i} that minimize this error function are given by the solution to the following set of linear equations

$$
\sum_{j=0}^{M} A_{ij} w_j = T_i \tag{1.122}
$$

where

$$
A_{ij} = \sum_{n=1}^{N} (x_n)^{i+j}, \qquad T_i = \sum_{n=1}^{N} (x_n)^{i}\, t_n. \tag{1.123}
$$

Here a suffix i or j denotes the index of a component, whereas $(x)^i$ denotes x raised to the power of i.

1.2 Write down the set of coupled linear equations, analogous to (1.122), satisfied by the coefficients w_i which minimize the regularized sum-of-squares error function given by (1.4).

1.3 Suppose that we have three coloured boxes r (red), b (blue), and g (green). Box r contains 3 apples, 4 oranges, and 3 limes, box b contains 1 apple, 1 orange, and 0 limes, and box g contains 3 apples, 3 oranges, and 4 limes. If a box is chosen at random with probabilities p(r) = 0.2, p(b) = 0.2, p(g) = 0.6, and a piece of fruit is removed from the box (with equal probability of selecting any of the items in the box), then what is the probability of selecting an apple? If we observe that the selected fruit is in fact an orange, what is the probability that it came from the green box?
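For readers who want to sanity-check this sort of sum-rule and Bayes' theorem calculation numerically, the following sketch (not part of the book) encodes the box contents from Exercise 1.3 and evaluates the two requested probabilities with exact arithmetic.

```python
from fractions import Fraction

# Box contents from Exercise 1.3: counts of each fruit per box.
boxes = {
    "r": {"apple": 3, "orange": 4, "lime": 3},
    "b": {"apple": 1, "orange": 1, "lime": 0},
    "g": {"apple": 3, "orange": 3, "lime": 4},
}
p_box = {"r": Fraction(2, 10), "b": Fraction(2, 10), "g": Fraction(6, 10)}

def p_fruit_given_box(fruit, box):
    counts = boxes[box]
    return Fraction(counts[fruit], sum(counts.values()))

# Sum rule: p(apple) = sum over boxes of p(apple | box) p(box).
p_apple = sum(p_fruit_given_box("apple", b) * p_box[b] for b in boxes)

# Bayes' theorem: p(g | orange) = p(orange | g) p(g) / p(orange).
p_orange = sum(p_fruit_given_box("orange", b) * p_box[b] for b in boxes)
p_g_given_orange = p_fruit_given_box("orange", "g") * p_box["g"] / p_orange

print(p_apple, p_g_given_orange)
```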
1.4 (www) Consider a probability density $p_x(x)$ defined over a continuous variable x, and suppose that we make a nonlinear change of variable using x = g(y), so that the density transforms according to (1.27). By differentiating (1.27), show that the location $\hat{y}$ of the maximum of the density in y is not in general related to the location $\hat{x}$ of the maximum of the density over x by the simple functional relation $\hat{x} = g(\hat{y})$ as a consequence of the Jacobian factor. This shows that the maximum of a probability density (in contrast to a simple function) is dependent on the choice of variable. Verify that, in the case of a linear transformation, the location of the maximum transforms in the same way as the variable itself.

1.5 Using the definition (1.38) show that var[f(x)] satisfies (1.39).

1.6 Show that if two variables x and y are independent, then their covariance is zero.

1.7 (www) In this exercise, we prove the normalization condition (1.48) for the univariate Gaussian. To do this, consider the integral

$$
I = \int_{-\infty}^{\infty} \exp\!\left(-\frac{1}{2\sigma^2} x^2\right) \mathrm{d}x \tag{1.124}
$$

which we can evaluate by first writing its square in the form

$$
I^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp\!\left(-\frac{1}{2\sigma^2} x^2 - \frac{1}{2\sigma^2} y^2\right) \mathrm{d}x\,\mathrm{d}y. \tag{1.125}
$$

Now make the transformation from Cartesian coordinates (x, y) to polar coordinates (r, θ) and then substitute u = r². Show that, by performing the integrals over θ and u, and then taking the square root of both sides, we obtain

$$
I = \left(2\pi\sigma^2\right)^{1/2}. \tag{1.126}
$$

Finally, use this result to show that the Gaussian distribution $\mathcal{N}(x|\mu, \sigma^2)$ is normalized.
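As a quick numerical companion to Exercise 1.7 (not part of the book), the sketch below approximates the integral in (1.124) on a wide grid and compares it with $(2\pi\sigma^2)^{1/2}$; the grid limits and spacing are arbitrary choices.

```python
import numpy as np

sigma2 = 1.7                                # arbitrary variance for the check
x = np.linspace(-40.0, 40.0, 400001)
integrand = np.exp(-x ** 2 / (2.0 * sigma2))

# Trapezoidal approximation of the integral in (1.124).
dx = x[1] - x[0]
integral = dx * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

assert np.isclose(integral, np.sqrt(2.0 * np.pi * sigma2))
print(integral, np.sqrt(2.0 * np.pi * sigma2))
```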
1.8 (www) By using a change of variables, verify that the univariate Gaussian distribution given by (1.46) satisfies (1.49). Next, by differentiating both sides of the normalization condition

$$
\int_{-\infty}^{\infty} \mathcal{N}\!\left(x|\mu, \sigma^2\right) \mathrm{d}x = 1 \tag{1.127}
$$

with respect to σ², verify that the Gaussian satisfies (1.50). Finally, show that (1.51) holds.
1.9 (www) Show that the mode (i.e. the maximum) of the Gaussian distribution (1.46) is given by µ. Similarly, show that the mode of the multivariate Gaussian (1.52) is given by µ.

1.10 (www) Suppose that the two variables x and z are statistically independent. Show that the mean and variance of their sum satisfy

$$
\mathbb{E}[x + z] = \mathbb{E}[x] + \mathbb{E}[z] \tag{1.128}
$$

$$
\mathrm{var}[x + z] = \mathrm{var}[x] + \mathrm{var}[z]. \tag{1.129}
$$

1.11 By setting the derivatives of the log likelihood function (1.54) with respect to µ and σ² equal to zero, verify the results (1.55) and (1.56).

1.12 (www) Using the results (1.49) and (1.50), show that

$$
\mathbb{E}[x_n x_m] = \mu^2 + I_{nm}\,\sigma^2 \tag{1.130}
$$

where $x_n$ and $x_m$ denote data points sampled from a Gaussian distribution with mean µ and variance σ², and $I_{nm}$ satisfies $I_{nm} = 1$ if n = m and $I_{nm} = 0$ otherwise. Hence prove the results (1.57) and (1.58).
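The identity in (1.130) is easy to check by simulation; the following sketch (an illustrative addition, not from the book) estimates $\mathbb{E}[x_n x_m]$ for equal and unequal indices from many independently drawn datasets, with arbitrary choices of µ and σ².

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2 = 2.0, 1.5
num_datasets, N = 200000, 5

# Each row is one dataset of N i.i.d. Gaussian samples.
x = rng.normal(mu, np.sqrt(sigma2), size=(num_datasets, N))

e_same = np.mean(x[:, 0] * x[:, 0])   # n = m: expect mu^2 + sigma^2
e_diff = np.mean(x[:, 0] * x[:, 1])   # n != m: expect mu^2

print(e_same, mu ** 2 + sigma2)   # approximately 5.5 vs 5.5
print(e_diff, mu ** 2)            # approximately 4.0 vs 4.0
```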
1.13 Suppose that the variance of a Gaussian is estimated using the result (1.56) but with the maximum likelihood estimate $\mu_{\mathrm{ML}}$ replaced with the true value µ of the mean. Show that this estimator has the property that its expectation is given by the true variance σ².

1.14 Show that an arbitrary square matrix with elements $w_{ij}$ can be written in the form $w_{ij} = w_{ij}^{S} + w_{ij}^{A}$ where $w_{ij}^{S}$ and $w_{ij}^{A}$ are symmetric and anti-symmetric matrices, respectively, satisfying $w_{ij}^{S} = w_{ji}^{S}$ and $w_{ij}^{A} = -w_{ji}^{A}$ for all i and j. Now consider the second order term in a higher order polynomial in D dimensions, given by

$$
\sum_{i=1}^{D}\sum_{j=1}^{D} w_{ij} x_i x_j. \tag{1.131}
$$

Show that

$$
\sum_{i=1}^{D}\sum_{j=1}^{D} w_{ij} x_i x_j = \sum_{i=1}^{D}\sum_{j=1}^{D} w_{ij}^{S} x_i x_j \tag{1.132}
$$

so that the contribution from the anti-symmetric matrix vanishes. We therefore see that, without loss of generality, the matrix of coefficients $w_{ij}$ can be chosen to be symmetric, and so not all of the D² elements of this matrix can be chosen independently. Show that the number of independent parameters in the matrix $w_{ij}^{S}$ is given by D(D + 1)/2.
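The decomposition in Exercise 1.14 is easy to verify numerically. The sketch below (an illustrative addition, not from the book) splits a random matrix into its symmetric and anti-symmetric parts and confirms that only the symmetric part contributes to the quadratic form; the dimension and random seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4
W = rng.normal(size=(D, D))
x = rng.normal(size=D)

W_sym = 0.5 * (W + W.T)        # symmetric part
W_anti = 0.5 * (W - W.T)       # anti-symmetric part

assert np.allclose(W, W_sym + W_anti)
# The anti-symmetric part contributes nothing to the quadratic form (1.132).
assert np.isclose(x @ W @ x, x @ W_sym @ x)
assert np.isclose(x @ W_anti @ x, 0.0)

# Number of independent parameters in the symmetric part: D(D + 1)/2.
print(D * (D + 1) // 2)
```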
1.15 (www) In this exercise and the next, we explore how the number of independent parameters in a polynomial grows with the order M of the polynomial and with the dimensionality D of the input space. We start by writing down the Mth order term for a polynomial in D dimensions in the form

$$
\sum_{i_1=1}^{D}\sum_{i_2=1}^{D}\cdots\sum_{i_M=1}^{D} w_{i_1 i_2 \cdots i_M}\, x_{i_1} x_{i_2} \cdots x_{i_M}. \tag{1.133}
$$

The coefficients $w_{i_1 i_2 \cdots i_M}$ comprise $D^M$ elements, but the number of independent parameters is significantly fewer due to the many interchange symmetries of the factor $x_{i_1} x_{i_2} \cdots x_{i_M}$.
Begin by showing that the redundancy in the coefficients can be removed by rewriting this Mth order term in the form

$$
\sum_{i_1=1}^{D}\sum_{i_2=1}^{i_1}\cdots\sum_{i_M=1}^{i_{M-1}} \widetilde{w}_{i_1 i_2 \cdots i_M}\, x_{i_1} x_{i_2} \cdots x_{i_M}. \tag{1.134}
$$

Note that the precise relationship between the $\widetilde{w}$ coefficients and the $w$ coefficients need not be made explicit. Use this result to show that the number of independent parameters n(D, M), which appear at order M, satisfies the following recursion relation

$$
n(D, M) = \sum_{i=1}^{D} n(i, M - 1). \tag{1.135}
$$

Next use proof by induction to show that the following result holds

$$
\sum_{i=1}^{D} \frac{(i + M - 2)!}{(i - 1)!\,(M - 1)!} = \frac{(D + M - 1)!}{(D - 1)!\,M!} \tag{1.136}
$$

which can be done by first proving the result for D = 1 and arbitrary M by making use of the result 0! = 1, then assuming it is correct for dimension D and verifying that it is correct for dimension D + 1. Finally, use the two previous results, together with proof by induction, to show

$$
n(D, M) = \frac{(D + M - 1)!}{(D - 1)!\,M!}. \tag{1.137}
$$

To do this, first show that the result is true for M = 2, and any value of D ≥ 1, by comparison with the result of Exercise 1.14.
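As a closing illustration (not part of the book), the recursion (1.135) and the closed form (1.137) can be cross-checked in a few lines of code; the base case n(D, 1) = D used below is an assumption corresponding to the D independent first-order terms.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def n_params(D, M):
    """Number of independent coefficients at order M in D dimensions,
    computed via the recursion (1.135) with base case n(D, 1) = D."""
    if M == 1:
        return D
    return sum(n_params(i, M - 1) for i in range(1, D + 1))

# Closed form (1.137): (D + M - 1)! / ((D - 1)! M!) = C(D + M - 1, M).
for D in range(1, 7):
    for M in range(1, 7):
        assert n_params(D, M) == comb(D + M - 1, M)

print(n_params(3, 3))   # 10 independent third-order terms in 3 dimensions
```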