Bishop C.M. Pattern Recognition and Machine Learning (2006) (811375), page 14


1.5. Decision Theory

Again, using the product rule $p(\mathbf{x}, C_k) = p(C_k \mid \mathbf{x})\,p(\mathbf{x})$, and noting that the factor of $p(\mathbf{x})$ is common to all terms, we see that each $\mathbf{x}$ should be assigned to the class having the largest posterior probability $p(C_k \mid \mathbf{x})$.

Figure 1.25: An example of a loss matrix with elements $L_{kj}$ for the cancer treatment problem. The rows correspond to the true class, whereas the columns correspond to the assignment of class made by our decision criterion.

\[
\begin{array}{l|cc}
 & \text{cancer} & \text{normal} \\ \hline
\text{cancer} & 0 & 1000 \\
\text{normal} & 1 & 0
\end{array}
\]

1.5.2 Minimizing the expected loss

For many applications, our objective will be more complex than simply minimizing the number of misclassifications. Let us consider again the medical diagnosis problem. We note that, if a patient who does not have cancer is incorrectly diagnosed as having cancer, the consequences may be some patient distress plus the need for further investigations.

Conversely, if a patient with cancer is diagnosed as healthy, the result may be premature death due to lack of treatment. Thus the consequences of these two types of mistake can be dramatically different. It would clearly be better to make fewer mistakes of the second kind, even if this was at the expense of making more mistakes of the first kind.

We can formalize such issues through the introduction of a loss function, also called a cost function, which is a single, overall measure of loss incurred in taking any of the available decisions or actions. Our goal is then to minimize the total loss incurred. Note that some authors consider instead a utility function, whose value they aim to maximize. These are equivalent concepts if we take the utility to be simply the negative of the loss, and throughout this text we shall use the loss function convention.

Suppose that, for a new value of $\mathbf{x}$, the true class is $C_k$ and that we assign $\mathbf{x}$ to class $C_j$ (where $j$ may or may not be equal to $k$). In so doing, we incur some level of loss that we denote by $L_{kj}$, which we can view as the $k, j$ element of a loss matrix. For instance, in our cancer example, we might have a loss matrix of the form shown in Figure 1.25. This particular loss matrix says that there is no loss incurred if the correct decision is made, there is a loss of 1 if a healthy patient is diagnosed as having cancer, whereas there is a loss of 1000 if a patient having cancer is diagnosed as healthy.

The optimal solution is the one which minimizes the loss function.

However, the loss function depends on the true class, which is unknown. For a given input vector $\mathbf{x}$, our uncertainty in the true class is expressed through the joint probability distribution $p(\mathbf{x}, C_k)$ and so we seek instead to minimize the average loss, where the average is computed with respect to this distribution, which is given by

\[
\mathbb{E}[L] = \sum_k \sum_j \int_{R_j} L_{kj}\, p(\mathbf{x}, C_k)\, \mathrm{d}\mathbf{x}. \tag{1.80}
\]

Each $\mathbf{x}$ can be assigned independently to one of the decision regions $R_j$. Our goal is to choose the regions $R_j$ in order to minimize the expected loss (1.80), which implies that for each $\mathbf{x}$ we should minimize $\sum_k L_{kj}\, p(\mathbf{x}, C_k)$. As before, we can use the product rule $p(\mathbf{x}, C_k) = p(C_k \mid \mathbf{x})\,p(\mathbf{x})$ to eliminate the common factor of $p(\mathbf{x})$. Thus the decision rule that minimizes the expected loss is the one that assigns each new $\mathbf{x}$ to the class $j$ for which the quantity

\[
\sum_k L_{kj}\, p(C_k \mid \mathbf{x}) \tag{1.81}
\]

is a minimum. This is clearly trivial to do, once we know the posterior class probabilities $p(C_k \mid \mathbf{x})$.

Figure 1.26: Illustration of the reject option. Inputs $x$ such that the larger of the two posterior probabilities is less than or equal to some threshold $\theta$ will be rejected. [Plot of $p(C_1 \mid x)$ and $p(C_2 \mid x)$ against $x$, vertical axis from 0.0 to 1.0, with the threshold $\theta$ and the reject region marked.]

1.5.3 The reject option

Exercise 1.24

We have seen that classification errors arise from the regions of input space where the largest of the posterior probabilities $p(C_k \mid \mathbf{x})$ is significantly less than unity, or equivalently where the joint distributions $p(\mathbf{x}, C_k)$ have comparable values. These are the regions where we are relatively uncertain about class membership. In some applications, it will be appropriate to avoid making decisions on the difficult cases in anticipation of a lower error rate on those examples for which a classification decision is made.

This is known as the reject option. For example, in our hypothetical medical illustration, it may be appropriate to use an automatic system to classify those X-ray images for which there is little doubt as to the correct class, while leaving a human expert to classify the more ambiguous cases. We can achieve this by introducing a threshold $\theta$ and rejecting those inputs $\mathbf{x}$ for which the largest of the posterior probabilities $p(C_k \mid \mathbf{x})$ is less than or equal to $\theta$. This is illustrated for the case of two classes, and a single continuous input variable $x$, in Figure 1.26.
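The two decision rules described so far, the minimum-expected-loss assignment (1.81) and the reject option with threshold θ, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the use of `None` to signal a rejection, and the index convention (0 = cancer, 1 = normal) are assumptions made here, and the posterior probabilities are taken as given rather than learned.

```python
# Loss matrix of Figure 1.25: rows are the true class, columns the assigned class.
# Index convention (0 = cancer, 1 = normal) is an assumption of this sketch.
LOSS = [
    [0.0, 1000.0],  # true class: cancer
    [1.0, 0.0],     # true class: normal
]


def expected_losses(posterior, loss):
    """Return sum_k L_kj * p(C_k|x) for each candidate assignment j, as in (1.81)."""
    n_decisions = len(loss[0])
    return [
        sum(loss[k][j] * posterior[k] for k in range(len(posterior)))
        for j in range(n_decisions)
    ]


def decide_min_expected_loss(posterior, loss):
    """Return the index j whose expected loss is smallest."""
    losses = expected_losses(posterior, loss)
    return min(range(len(losses)), key=losses.__getitem__)


def decide_with_reject(posterior, theta):
    """Return the most probable class, or None if the largest posterior is <= theta."""
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return None if posterior[best] <= theta else best


if __name__ == "__main__":
    posterior = [0.01, 0.99]  # hypothetical p(cancer|x), p(normal|x)
    print(expected_losses(posterior, LOSS))
    print(decide_min_expected_loss(posterior, LOSS))
    print(decide_with_reject([0.6, 0.4], theta=0.7))
```

With the hypothetical posterior (0.01, 0.99), assigning "cancer" has expected loss 1 × 0.99 = 0.99, whereas assigning "normal" has expected loss 1000 × 0.01 = 10, so the rule picks "cancer" even though that class is far less probable. This is the asymmetry the loss matrix is meant to encode.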

Note that setting $\theta = 1$ will ensure that all examples are rejected, whereas if there are $K$ classes then setting $\theta < 1/K$ will ensure that no examples are rejected. Thus the fraction of examples that get rejected is controlled by the value of $\theta$.

We can easily extend the reject criterion to minimize the expected loss, when a loss matrix is given, taking account of the loss incurred when a reject decision is made.

1.5.4 Inference and decision

We have broken the classification problem down into two separate stages, the inference stage in which we use training data to learn a model for $p(C_k \mid \mathbf{x})$, and the subsequent decision stage in which we use these posterior probabilities to make optimal class assignments. An alternative possibility would be to solve both problems together and simply learn a function that maps inputs $\mathbf{x}$ directly into decisions. Such a function is called a discriminant function.

In fact, we can identify three distinct approaches to solving decision problems, all of which have been used in practical applications. These are given, in decreasing order of complexity, by:

(a) First solve the inference problem of determining the class-conditional densities $p(\mathbf{x} \mid C_k)$ for each class $C_k$ individually. Also separately infer the prior class probabilities $p(C_k)$. Then use Bayes' theorem in the form

\[
p(C_k \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid C_k)\, p(C_k)}{p(\mathbf{x})} \tag{1.82}
\]

to find the posterior class probabilities $p(C_k \mid \mathbf{x})$. As usual, the denominator in Bayes' theorem can be found in terms of the quantities appearing in the numerator, because

\[
p(\mathbf{x}) = \sum_k p(\mathbf{x} \mid C_k)\, p(C_k). \tag{1.83}
\]

Equivalently, we can model the joint distribution $p(\mathbf{x}, C_k)$ directly and then normalize to obtain the posterior probabilities. Having found the posterior probabilities, we use decision theory to determine class membership for each new input $\mathbf{x}$. Approaches that explicitly or implicitly model the distribution of inputs as well as outputs are known as generative models, because by sampling from them it is possible to generate synthetic data points in the input space.

(b) First solve the inference problem of determining the posterior class probabilities $p(C_k \mid \mathbf{x})$, and then subsequently use decision theory to assign each new $\mathbf{x}$ to one of the classes. Approaches that model the posterior probabilities directly are called discriminative models.

(c) Find a function $f(\mathbf{x})$, called a discriminant function, which maps each input $\mathbf{x}$ directly onto a class label. For instance, in the case of two-class problems, $f(\cdot)$ might be binary valued and such that $f = 0$ represents class $C_1$ and $f = 1$ represents class $C_2$. In this case, probabilities play no role.

Let us consider the relative merits of these three alternatives. Approach (a) is the most demanding because it involves finding the joint distribution over both $\mathbf{x}$ and $C_k$. For many applications, $\mathbf{x}$ will have high dimensionality, and consequently we may need a large training set in order to be able to determine the class-conditional densities to reasonable accuracy. Note that the class priors $p(C_k)$ can often be estimated simply from the fractions of the training set data points in each of the classes. One advantage of approach (a), however, is that it also allows the marginal density of data $p(\mathbf{x})$ to be determined from (1.83).

This can be useful for detecting new data points that have low probability under the model and for which the predictions may be of low accuracy, which is known as outlier detection or novelty detection (Bishop, 1994; Tarassenko, 1995).

Figure 1.27: Example of the class-conditional densities for two classes having a single input variable $x$ (left plot) together with the corresponding posterior probabilities (right plot). Note that the left-hand mode of the class-conditional density $p(x \mid C_1)$, shown in blue on the left plot, has no effect on the posterior probabilities. The vertical green line in the right plot shows the decision boundary in $x$ that gives the minimum misclassification rate. [Left plot: class densities $p(x \mid C_1)$ and $p(x \mid C_2)$ against $x$. Right plot: posterior probabilities $p(C_1 \mid x)$ and $p(C_2 \mid x)$ against $x$.]

However, if we only wish to make classification decisions, then it can be wasteful of computational resources, and excessively demanding of data, to find the joint distribution $p(\mathbf{x}, C_k)$ when in fact we only really need the posterior probabilities $p(C_k \mid \mathbf{x})$, which can be obtained directly through approach (b).
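To make approach (a) concrete, the following Python sketch evaluates (1.82) and (1.83) for a single scalar input and uses the marginal density to flag low-probability inputs in the spirit of novelty detection. It is a sketch under stated assumptions, not the text's own procedure: the one-dimensional Gaussian class-conditional densities, their parameters, the equal priors, and the novelty threshold are all hypothetical choices made only for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """One-dimensional Gaussian density, used here as a stand-in for p(x|C_k)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def joint_terms(x, class_params, priors):
    """Return p(x|C_k) * p(C_k) for each class k (the numerators of Bayes' theorem)."""
    return [
        gaussian_pdf(x, mean, std) * prior
        for (mean, std), prior in zip(class_params, priors)
    ]


def marginal_density(x, class_params, priors):
    """p(x) as the sum over classes of p(x|C_k) * p(C_k), as in (1.83)."""
    return sum(joint_terms(x, class_params, priors))


def posterior(x, class_params, priors):
    """p(C_k|x) for each class k via Bayes' theorem, as in (1.82)."""
    joint = joint_terms(x, class_params, priors)
    p_x = sum(joint)
    if p_x == 0.0:
        # The densities underflowed; the posterior is numerically undefined here.
        raise ValueError("p(x) underflowed to zero; x is far outside the model")
    return [term / p_x for term in joint]


def is_novel(x, class_params, priors, threshold):
    """Flag x as a possible outlier when p(x) falls below a chosen threshold."""
    return marginal_density(x, class_params, priors) < threshold


# Hypothetical parameters, chosen only for illustration.
CLASS_PARAMS = [(0.0, 1.0), (2.0, 1.0)]  # (mean, std) for p(x|C_1), p(x|C_2)
PRIORS = [0.5, 0.5]

if __name__ == "__main__":
    print(posterior(1.0, CLASS_PARAMS, PRIORS))
    print(marginal_density(1.0, CLASS_PARAMS, PRIORS))
    print(is_novel(10.0, CLASS_PARAMS, PRIORS, threshold=1e-3))
```

A discriminative model in the sense of approach (b) would replace `posterior` with a directly fitted function and would have no counterpart to `marginal_density`, which is why it cannot support this kind of outlier check.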
