Bishop, C. M., Pattern Recognition and Machine Learning (2006)
We see that, for large D, this fraction tends to 1 even for small values of ϵ. Thus, in spaces of high dimensionality, most of the volume of a sphere is concentrated in a thin shell near the surface! As a further example, of direct relevance to pattern recognition, consider the behaviour of a Gaussian distribution in a high-dimensional space. If we transform from Cartesian to polar coordinates, and then integrate out the directional variables, we obtain an expression for the density p(r) as a function of radius r from the origin. Thus p(r)δr is the probability mass inside a thin shell of thickness δr located at radius r.
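Both statements follow from a short calculation, sketched here with the volume of a sphere of radius r in D dimensions written as V_D(r) = K_D r^D, where K_D is a constant depending only on D, and with σ denoting the standard deviation of the (zero-mean, isotropic) Gaussian. The fraction of the volume of the unit sphere lying between radii 1 − ϵ and 1 is

\frac{V_D(1) - V_D(1-\epsilon)}{V_D(1)} = 1 - (1-\epsilon)^D \;\longrightarrow\; 1 \qquad \text{as } D \to \infty

for any fixed ϵ > 0. Similarly, because the Gaussian density depends on x only through the radius, multiplying it by the surface area of a sphere of radius r, which grows like r^{D−1}, gives

p(r) \propto r^{D-1} \exp\!\left(-\frac{r^2}{2\sigma^2}\right),

and setting the derivative of ln p(r) to zero locates its single maximum at r̂ = σ√(D−1) ≃ σ√D for large D, so that almost all of the probability mass lies in a thin shell around this radius.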
This distribution is plotted, for various values of D, in Figure 1.23, and we see that for large D the probability mass of the Gaussian is concentrated in a thin shell.

The severe difficulty that can arise in spaces of many dimensions is sometimes called the curse of dimensionality (Bellman, 1961). In this book, we shall make extensive use of illustrative examples involving input spaces of one or two dimensions, because this makes it particularly easy to illustrate the techniques graphically. The reader should be warned, however, that not all intuitions developed in spaces of low dimensionality will generalize to spaces of many dimensions.

Figure 1.22  Plot of the fraction of the volume of a sphere lying in the range r = 1 − ϵ to r = 1, for various values of the dimensionality D.

Figure 1.23  Plot of the probability density with respect to radius r of a Gaussian distribution, for various values of the dimensionality D. In a high-dimensional space, most of the probability mass of a Gaussian is located within a thin shell at a specific radius.

Although the curse of dimensionality certainly raises important issues for pattern recognition applications, it does not prevent us from finding effective techniques applicable to high-dimensional spaces. The reasons for this are twofold. First, real data will often be confined to a region of the space having lower effective dimensionality, and in particular the directions over which important variations in the target variables occur may be so confined.
Second, real data will typically exhibit some smoothness properties (at least locally) so that for the most part small changes in the input variables will produce small changes in the target variables, and so we can exploit local interpolation-like techniques to allow us to make predictions of the target variables for new values of the input variables. Successful pattern recognition techniques exploit one or both of these properties. Consider, for example, an application in manufacturing in which images are captured of identical planar objects on a conveyor belt, and in which the goal is to determine their orientation.
Each image is a point in a high-dimensional space whose dimensionality is determined by the number of pixels. Because the objects can occur at different positions within the image and in different orientations, there are three degrees of freedom of variability between images, and a set of images will live on a three-dimensional manifold embedded within the high-dimensional space.
Due to the complex relationships between the object position or orientation and the pixel intensities, this manifold will be highly nonlinear. If the goal is to learn a model that can take an input image and output the orientation of the object irrespective of its position, then there is only one degree of freedom of variability within the manifold that is significant.

1.5. Decision Theory

We have seen in Section 1.2 how probability theory provides us with a consistent mathematical framework for quantifying and manipulating uncertainty.
Here we turn to a discussion of decision theory that, when combined with probability theory, allows us to make optimal decisions in situations involving uncertainty such as those encountered in pattern recognition.

Suppose we have an input vector x together with a corresponding vector t of target variables, and our goal is to predict t given a new value for x.
For regression problems, t will comprise continuous variables, whereas for classification problems t will represent class labels. The joint probability distribution p(x, t) provides a complete summary of the uncertainty associated with these variables. Determination of p(x, t) from a set of training data is an example of inference and is typically a very difficult problem whose solution forms the subject of much of this book. In a practical application, however, we must often make a specific prediction for the value of t, or more generally take a specific action based on our understanding of the values t is likely to take, and this aspect is the subject of decision theory.

Consider, for example, a medical diagnosis problem in which we have taken an X-ray image of a patient, and we wish to determine whether the patient has cancer or not.
In this case, the input vector x is the set of pixel intensities in the image, and the output variable t will represent the presence of cancer, which we denote by the class C1, or the absence of cancer, which we denote by the class C2. We might, for instance, choose t to be a binary variable such that t = 0 corresponds to class C1 and t = 1 corresponds to class C2. We shall see later that this choice of label values is particularly convenient for probabilistic models.
The general inference problem then involves determining the joint distribution p(x, Ck), or equivalently p(x, t), which gives us the most complete probabilistic description of the situation. Although this can be a very useful and informative quantity, in the end we must decide either to give treatment to the patient or not, and we would like this choice to be optimal in some appropriate sense (Duda and Hart, 1973). This is the decision step, and it is the subject of decision theory to tell us how to make optimal decisions given the appropriate probabilities. We shall see that the decision stage is generally very simple, even trivial, once we have solved the inference problem.

Here we give an introduction to the key ideas of decision theory as required for the rest of the book. Further background, as well as more detailed accounts, can be found in Berger (1985) and Bather (2000).

Before giving a more detailed analysis, let us first consider informally how we might expect probabilities to play a role in making decisions. When we obtain the X-ray image x for a new patient, our goal is to decide which of the two classes to assign to the image.
We are interested in the probabilities of the two classes given the image, which are given by p(Ck|x). Using Bayes' theorem, these probabilities can be expressed in the form

p(C_k|x) = \frac{p(x|C_k)\,p(C_k)}{p(x)}.        (1.77)

Note that any of the quantities appearing in Bayes' theorem can be obtained from the joint distribution p(x, Ck) by either marginalizing or conditioning with respect to the appropriate variables. We can now interpret p(Ck) as the prior probability for the class Ck, and p(Ck|x) as the corresponding posterior probability. Thus p(C1) represents the probability that a person has cancer, before we take the X-ray measurement. Similarly, p(C1|x) is the corresponding probability, revised using Bayes' theorem in light of the information contained in the X-ray.
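To see how these quantities interact numerically, here is a small illustrative calculation; the prior and likelihood values below are hypothetical, chosen only for illustration, and are not taken from the text:

    # Illustrative check of Bayes' theorem (1.77) with made-up numbers.
    prior_c1 = 0.01              # assumed p(C1): prior probability of cancer
    prior_c2 = 1.0 - prior_c1    # p(C2)
    lik_c1 = 0.9                 # assumed p(x|C1): likelihood of this image given cancer
    lik_c2 = 0.1                 # assumed p(x|C2): likelihood of this image given no cancer

    evidence = lik_c1 * prior_c1 + lik_c2 * prior_c2   # p(x), by the sum and product rules
    posterior_c1 = lik_c1 * prior_c1 / evidence        # p(C1|x)
    posterior_c2 = lik_c2 * prior_c2 / evidence        # p(C2|x)

    print(posterior_c1, posterior_c2)                  # approximately 0.083 and 0.917

Even though the image is nine times more probable under the cancer class than under the normal class in this made-up example, the small prior keeps the posterior probability of cancer below ten per cent, illustrating that both factors in (1.77) matter.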
If our aim is to minimize the chance of assigning x to the wrong class, then intuitively we would choose the class having the higher posterior probability. We now show that this intuition is correct, and we also discuss more general criteria for making decisions.

1.5.1 Minimizing the misclassification rate

Suppose that our goal is simply to make as few misclassifications as possible. We need a rule that assigns each value of x to one of the available classes. Such a rule will divide the input space into regions Rk called decision regions, one for each class, such that all points in Rk are assigned to class Ck.
The boundaries between decision regions are called decision boundaries or decision surfaces. Note that each decision region need not be contiguous but could comprise some number of disjoint regions. We shall encounter examples of decision boundaries and decision regions in later chapters. In order to find the optimal decision rule, consider first of all the case of two classes, as in the cancer problem for instance. A mistake occurs when an input vector belonging to class C1 is assigned to class C2 or vice versa. The probability of this occurring is given by

p(mistake) = p(x \in R_1, C_2) + p(x \in R_2, C_1)
           = \int_{R_1} p(x, C_2)\,dx + \int_{R_2} p(x, C_1)\,dx.        (1.78)

We are free to choose the decision rule that assigns each point x to one of the two classes.
Clearly to minimize p(mistake) we should arrange that each x is assigned to whichever class has the smaller value of the integrand in (1.78). Thus, if p(x, C1) > p(x, C2) for a given value of x, then we should assign that x to class C1. From the product rule of probability we have p(x, Ck) = p(Ck|x)p(x).
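Written out explicitly, this substitution shows that, for any x with p(x) > 0,

p(x, C_1) > p(x, C_2) \;\Longleftrightarrow\; p(C_1|x)\,p(x) > p(C_2|x)\,p(x) \;\Longleftrightarrow\; p(C_1|x) > p(C_2|x).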
Because the factor p(x) is common to both terms, we can restate this result as saying that the minimum probability of making a mistake is obtained if each value of x is assigned to the class for which the posterior probability p(Ck|x) is largest. This result is illustrated for two classes, and a single input variable x, in Figure 1.24.

Figure 1.24  Schematic illustration of the joint probabilities p(x, Ck) for each of two classes plotted against x, together with the decision boundary x = x̂. Values of x ≥ x̂ are classified as class C2 and hence belong to decision region R2, whereas points x < x̂ are classified as C1 and belong to R1. Errors arise from the blue, green, and red regions, so that for x < x̂ the errors are due to points from class C2 being misclassified as C1 (represented by the sum of the red and green regions), and conversely for points in the region x ≥ x̂ the errors are due to points from class C1 being misclassified as C2 (represented by the blue region). As we vary the location x̂ of the decision boundary, the combined areas of the blue and green regions remain constant, whereas the size of the red region varies. The optimal choice for x̂ is where the curves for p(x, C1) and p(x, C2) cross, corresponding to x̂ = x0, because in this case the red region disappears. This is equivalent to the minimum misclassification rate decision rule, which assigns each value of x to the class having the higher posterior probability p(Ck|x).

For the more general case of K classes, it is slightly easier to maximize the probability of being correct, which is given by

p(correct) = \sum_{k=1}^{K} p(x \in R_k, C_k)
           = \sum_{k=1}^{K} \int_{R_k} p(x, C_k)\,dx        (1.79)

which is maximized when the regions Rk are chosen such that each x is assigned to the class for which p(x, Ck) is largest.
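As a concrete sketch of this decision rule (using made-up class-conditional densities and prior probabilities purely for illustration; none of these numbers come from the text), the following code evaluates the joint densities p(x, Ck) for a two-class, one-dimensional problem on a grid, assigns each grid point to the class with the larger joint (equivalently, posterior) probability, and approximates p(mistake) in (1.78) and p(correct) in (1.79) numerically:

    import numpy as np

    def gauss_pdf(x, mu, sigma):
        # Univariate Gaussian density, used here as a hypothetical class-conditional p(x|Ck).
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

    # Hypothetical two-class problem: priors p(Ck) and Gaussian class-conditionals p(x|Ck).
    priors = np.array([0.6, 0.4])
    means = np.array([0.0, 2.0])
    stds = np.array([1.0, 1.0])

    x = np.linspace(-6.0, 8.0, 2001)   # grid covering essentially all of the probability mass
    dx = x[1] - x[0]
    joint = np.stack([priors[k] * gauss_pdf(x, means[k], stds[k]) for k in range(2)])  # p(x, Ck)

    # Minimum-misclassification-rate rule: assign each x to the class with the largest
    # p(x, Ck), which by the product rule is the class with the largest posterior p(Ck|x).
    decision = np.argmax(joint, axis=0)

    # Numerical approximations of (1.78) and (1.79): integrate the "wrong" density over
    # each decision region for p(mistake), and the "chosen" density for p(correct).
    p_mistake = np.sum(np.where(decision == 0, joint[1], joint[0])) * dx
    p_correct = np.sum(np.where(decision == 0, joint[0], joint[1])) * dx

    print(p_mistake, p_correct)        # the two estimates sum to approximately 1

Shifting the decision boundary away from the point where the two joint densities cross can only increase the resulting estimate of p(mistake), which is precisely the behaviour illustrated in Figure 1.24.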