The idea of subsumption deletion is that, since the goal of XCS is to evolve an accurate, maximally general representation, it is useless to specialize classifiers that are already accurate. Accordingly, with subsumption deletion, accurate classifiers can produce only more general offspring.
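As a rough illustration of this idea, the following Python sketch tests whether one ternary condition is strictly more general than another and uses that test to discard offspring that merely specialize an accurate parent. The function names, the string representation of conditions, and the boolean accuracy flag are assumptions made for the example; the full subsumption criteria of XCS are not reproduced here.

```python
def is_more_general(general: str, specific: str) -> bool:
    """True if `general` matches every input matched by `specific`
    and contains strictly more # symbols."""
    if len(general) != len(specific):
        return False
    covers = all(g == "#" or g == s for g, s in zip(general, specific))
    return covers and general.count("#") > specific.count("#")


def keep_offspring(parent_cond: str, offspring_cond: str,
                   parent_is_accurate: bool) -> bool:
    """Hypothetical filter: an accurate parent rejects offspring
    that only specialize its own condition."""
    if parent_is_accurate and is_more_general(parent_cond, offspring_cond):
        return False
    return True
```

For example, under these assumptions an accurate parent with condition `1#0#` would reject the offspring `110#`, while an inaccurate parent would keep it.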

Specify was introduced to assist the generalization mechanism of XCS in eliminating overly general classifiers. Specify acts when a significant number of overly general classifiers are in the action set. This condition is detected by comparing the average prediction error of classifiers in the action set, $\epsilon_{[A]}$, with the average prediction error of classifiers in the population, $\epsilon_{[P]}$. If $\epsilon_{[A]}$ is twice $\epsilon_{[P]}$ and the classifiers in [A] have been updated, on average, at least $N_{Sp}$ times, then a classifier is randomly selected from [A] with probability proportional to its prediction error. The selected classifier is used to generate one offspring classifier in which each # symbol is replaced, with probability $P_{Sp}$, by the corresponding digit in the system input. The resulting classifier is then inserted in the population and another is deleted if necessary.
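The procedure just described can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the original implementation: classifiers are represented as dictionaries with hypothetical keys `condition`, `error` and `experience`; the trigger is read as "at least twice"; and copying the parent's remaining fields while resetting the offspring's experience to zero is a choice made only for the example.

```python
import random
from statistics import mean


def should_specify(action_set, population, n_sp):
    """Trigger test: the mean error in [A] is at least twice the mean
    error in [P], and [A] is experienced enough on average."""
    eps_a = mean(cl["error"] for cl in action_set)
    eps_p = mean(cl["error"] for cl in population)
    avg_exp = mean(cl["experience"] for cl in action_set)
    return eps_a > 0 and eps_a >= 2 * eps_p and avg_exp >= n_sp


def specify(action_set, sensor_input, p_sp, rng=random):
    """Pick a parent with probability proportional to its error, then
    replace each # with the matching input digit with probability p_sp."""
    errors = [cl["error"] for cl in action_set]
    # Fall back to a uniform choice if every error is zero (assumption).
    weights = errors if sum(errors) > 0 else None
    parent = rng.choices(action_set, weights=weights, k=1)[0]
    condition = "".join(
        bit if symbol == "#" and rng.random() < p_sp else symbol
        for symbol, bit in zip(parent["condition"], sensor_input)
    )
    # Offspring bookkeeping is an assumption of this sketch.
    return dict(parent, condition=condition, experience=0)
```

Insertion of the offspring into the population and the subsequent deletion step are left out, since they depend on details of XCS not covered in this section.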

3 Design of Experiments

The experiments presented in this paper were conducted in the woods series of environments. These are grid worlds in which each cell can contain a tree (a T symbol), food (an F symbol), or can be empty. An animat placed in the environment must learn to reach food cells. The animat senses the environment through eight sensors, one for each adjacent cell, and can move into any of the adjacent cells. If the destination cell contains a tree, the move does not take place. If the destination cell is blank, the move does take place. Finally, if the cell contains food, the animat moves, eats the food, and receives a constant reward. Each sensor is represented by two bits: 10 indicates the presence of a tree (T); 11 indicates food (F); 00 represents an empty cell. Classifier conditions are 16 bits long (2 bits x 8 cells), while the eight actions are represented with three bits.
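The sensory encoding can be illustrated with a short Python sketch. The two-bit codes are those given above; the use of "." for an empty cell, the function names, and the matching helper are assumptions introduced for the example.

```python
# Two-bit sensor codes from the text; "." for an empty cell is an assumption.
SENSOR_CODES = {"T": "10", "F": "11", ".": "00"}


def encode_neighbours(cells: str) -> str:
    """Encode the eight adjacent cells as a 16-bit input string."""
    if len(cells) != 8:
        raise ValueError("exactly eight adjacent cells are expected")
    return "".join(SENSOR_CODES[c] for c in cells)


def matches(condition: str, sensor_input: str) -> bool:
    """A condition matches if every non-# symbol equals the input bit."""
    return len(condition) == len(sensor_input) and all(
        c == "#" or c == s for c, s in zip(condition, sensor_input)
    )
```

For instance, a neighbourhood written as `T.F.....` is encoded as `1000110000000000`, and a condition that fixes only the first and third sensors while leaving the rest as # symbols matches it.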

Each experiment consists of a number of problems that the animat must solve. For each problem, the animat is randomly placed in a blank cell of the environment; then it moves under the control of the system until it enters a food cell, eats the food, and receives a constant reward. The food immediately re-grows and a new problem begins. We employed the following exploration/exploitation strategy (Wilson, 1995; Wilson, 1996): before a new problem begins, the animat decides with probability 0.5 whether it will solve the problem in exploration or exploitation.

We employed two different exploration strategies: random exploration and biased exploration. In random exploration, the system selects the action randomly among those in the match set. In biased exploration, the system decides with a probability $P_s$ whether to select an action randomly or to choose the action which predicts the highest payoff (a typical value for $P_s$ is 0.5). In exploitation, the animat always selects the action which predicts the highest payoff and the GA does not act. In order to evaluate the final solutions evolved, exploration is turned off in each experiment during the last 1000 problems and the system works in exploitation only. The performance of XCS is computed as the average number of steps to food in the last 50 exploitation problems. Every statistic presented in this paper is averaged over ten experiments.
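The exploration/exploitation scheme can be sketched as follows. The sketch assumes that the payoff predictions for the actions in the match set are available as a dictionary, and it reads the biased-exploration rule as "act randomly with probability `p_s`, otherwise take the best action"; the text leaves the direction of this choice open, so this reading is an assumption, as are all function and parameter names.

```python
import random


def choose_mode(rng, p_explore=0.5):
    """Decide, before a problem starts, whether to explore or exploit."""
    return "explore" if rng.random() < p_explore else "exploit"


def select_action(predictions, mode, rng, biased=False, p_s=0.5):
    """predictions maps each action in the match set to its predicted payoff."""
    best = max(predictions, key=predictions.get)
    if mode == "exploit":
        return best
    # Biased exploration: with probability 1 - p_s take the best action.
    if biased and rng.random() >= p_s:
        return best
    # Random exploration (or the random branch of biased exploration).
    return rng.choice(list(predictions))
```

With `biased=False` the function reduces to pure random exploration, which corresponds to the strategy used in the first work on XCS.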

4 XCS in Maze5 and Maze6

The first results reported in the literature for XCS by Wilson (1995) are limited to two regular and aperiodic environments, Woods1 and Woods2, in which the optimal solution requires only a few steps to reach a food position. This solution can be described by a small number of very general classifiers; roughly speaking, we say that these environments permit many generalizations. These initial experiments were extended by Lanzi (1997) to a more challenging environment, Maze4, in which the optimal solution requires longer sequences of actions to reach the goal, and the environment permits only a few generalizations. The author observed that in difficult sequential problems the system performance can fail dramatically. It was argued that this happens because in particularly difficult situations, characterized by long sequences of actions and only a few admissible generalizations, the generalization mechanism of XCS can be too slow to eliminate overly general classifiers before they proliferate in the population, causing a significant decrease in the system performance (briefly, we say that overly general classifiers corrupt the population) (Lanzi, 1997b). The specify operator was thus introduced in order to help XCS recover from overly general classifiers.

Wilson (1997) suggested that another important factor underlying what was observed in Lanzi (1997) is the amount of random exploration the agent performs. Accordingly, he proposed a different solution in which this amount is reduced by replacing random exploration, employed in the first work on XCS, with biased exploration. Wilson (1997) also suggested that the behavior discussed in Lanzi (1997) may occur when no classifier in the action set is very accurate. When this occurs, the classifier fitness calculation, which estimates the classifier accuracy with respect to the action set, will give them all substantial fitnesses, producing inappropriate results. Specify detects such conditions because it is activated by the error parameter and not by the accuracy. Thus it is able to recover from this type of situation by eliminating the source of inaccuracy in the action set.

We now extend previous results presented in the literature by comparing the two solutions in two new Markovian (i.e., all the states are distinguishable) environments: Maze5 and Maze6 (Figure 1 (a) and Figure 1 (b)). We compare four algorithms for each environment: (i) XCS according to the original definition, that is, without subsumption deletion; (ii) XCS without don't care symbols (#s are introduced neither in the initial population, nor during covering, nor during mutation); (iii) XCS with specify, referred to here as XCSS; (iv) XCS with biased exploration.

Notice that the performances of algorithms (i) and (ii) are two important references. The former indicates what the original system can do when the generalization mechanism is in operation; while the performance of algorithm (ii) defines the potential capabilities of XCS without generalization operating. Before proceeding, we wish to point out that the results presented are not intended to indicate which strategy is best for solving the proposed problems. Our aim is to analyze more general phenomena which can be easily studied in simple environments but can be difficult to examine in more complex environments, where other settings may not work.

4.1 The Maze5 Environment

We apply the four algorithms to Maze5 using a population of 1600 classifiers.(n1) Results for the four algorithms are shown in Figure 2. Curves are averaged over ten runs. As Figure 2 shows, XCS evolves a solution for Maze5 that is not optimal (algorithm (i)). Conversely, when generalization does not act, i.e., no #s are used, the system easily reaches the optimum (algorithm (ii)).

When a mechanism to help XCS recover from overly general classifiers is added to XCS, we observe an improvement: both algorithms (iii) and (iv) converge to high performance. Specifically, XCS with biased exploration (algorithm (iv)) slowly converges to a near optimal policy; however, XCSS (algorithm (iii)) rapidly converges to a fully optimal solution that is also stable. The analysis of single runs shows that sometimes XCS with biased exploration fails to converge to a stable solution, while XCSS always reaches the optimum in a stable way. This phenomenon is more evident in the experiments with XCS (algorithm (i)) where in the majority of the cases the system does not reach a stable solution.

Lanzi (1997) observed that XCSS is stable with respect to the population size. To verify this, we applied XCS with biased exploration and XCSS to Maze5 using only 800 classifiers. The results shown in Figure 3 indicate that, even with a small population size, XCSS still converges to a near optimal solution and remains stable. In contrast, the performance of XCS with biased exploration decreases significantly. The analysis of single runs shows an increase in the number of experiments in which XCS with biased exploration cannot reach a stable solution, leading to a reduction in the overall performance.

4.2 The Maze6 Environment

Maze6 is based on Maze5 but includes a set of obstacles covering a small number of free cells. The two environments are topologically similar; however, the following experiments show that Maze6 is much more difficult for XCS to solve.

In this second experiment, we applied the same four versions of XCS to Maze6. The results shown in Figure 4 confirm the results for Maze5. XCS does not converge to an optimal solution when generalization is required, whereas when no # symbols are employed the system easily reaches optimal performance. Furthermore, there is almost no difference between the performance of XCS with random exploration (i) and XCS with biased exploration (iv). Again, XCS with specify converges to a stable optimum (see Figure 2).

In comparing the performance of XCS in these two environments it is worth noting that, although the two environments are very similar, the performance of XCS in Maze6 is at least five times worse than in Maze5.

These results suggest that when the environment becomes more complex, biased exploration may not guarantee the convergence to a stable solution. Conversely, XCSS evolves a stable near optimal solution for Maze6 even if the population size is reduced to 800 classifiers (see Figure 5).

4.3 The Specify Operator and Biased Exploration

The results presented in this section support the findings previously presented in Lanzi (1997). Specify successfully helps the system recover from situations in which overly general classifiers may corrupt the population before the generalization mechanism of XCS eliminates them. Although biased exploration is adequate in simple environments, such as Maze5, it may become infeasible in more complex environments.

In our opinion, this happens because biased exploration is a global solution to the behavior we discussed, while specify is a local solution. Lanzi (1997) observed that XCS acts in environmental niches and suggested that these should be considered a fundamental element for operators in XCS. Specify follows this principle and directly corrects potentially dangerous situations in the niches where they are detected. Biased exploration, on the other hand, acts on the whole population and must take into account the structure of the entire environment.

5 XCS in Woods14

Cliff and Ross (1994) presented experimental results for ZCS (Wilson, 1994), the system from which XCS was derived. They show that the failure in learning an optimal policy depends on the length of the sequence of actions required to reach food: the longer the sequence is, the more difficult the environment.

Our experiments in Maze5 and Maze6 might seem to confirm the results presented for ZCS. XCS in fact performs better in Maze5, which requires an average of 4.6 steps to reach food, than in Maze6, where the animat takes an average of 5.05 steps to reach food. However, the minor difference between the average number of steps in the two environments seems too small to justify the significant difference in system performance.

We now extend the results presented in the previous section by analyzing the performance of XCS in an environment requiring a long sequence of actions to reach the goal state. For this purpose, we apply three different versions of XCS in the Woods14 environment. Woods14 (Figure 6) is a simple environment, which consists of a linear path of 18 blank cells to a food cell, and has an expected optimal path to food of nine steps.

Initially, we applied XCS with biased exploration and XCS without generalization to Woods14 with a population of 2000 classifiers. General parameters are set as in the previous experiment except for the discount factor γ, which is set to 0.9. The performance of XCS with biased exploration in Woods14 is shown in Figure 7. The performance of XCS when the generalization mechanism does not act is shown in Figure 8. Curves are averaged over ten runs.(n2)
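To give a feeling for the role of the discount factor in a long corridor, the following Python sketch computes the payoff that an optimal policy would predict at a given distance from food. It assumes Q-learning-style propagation, in which the action that reaches food is worth the full reward and each earlier step is discounted once more; the reward of 1000 is a hypothetical value chosen for illustration, as the reward used in the experiments is not stated in this section.

```python
def optimal_payoff(steps_to_food: int, reward: float = 1000.0,
                   gamma: float = 0.9) -> float:
    """Discounted payoff of the optimal action `steps_to_food` steps from food.

    reward=1000.0 is an illustrative assumption; gamma=0.9 is the value
    used for Woods14 in the text.
    """
    if steps_to_food < 1:
        raise ValueError("steps_to_food must be at least 1")
    return reward * gamma ** (steps_to_food - 1)


# Payoff profile along an 18-cell corridor, nearest cell first.
profile = [optimal_payoff(d) for d in range(1, 19)]
```

Under these assumptions the predicted payoff decreases geometrically along the path, so the payoffs of neighbouring cells far from food are much closer to each other in absolute terms than those of cells near food.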

These results show that, even if biased exploration is introduced, XCS does not converge to an optimum in the Woods14 environment. However, when # symbols are not used, XCS easily reaches the optimum. The former result may indicate that the problems encountered with XCS depend on the length of the expected optimal path to food. The latter results shown in Figure 8 also suggest that XCS can solve problems which involve long sequences of actions. This result is extremely important; it shows that XCS is a better model of a classifier system than ZCS, because it is able to build long chains of actions, a task in which ZCS fails (Cliff and Ross, 1994).

In the second experiment, we apply XCSS to Woods14 with 2000 classifiers.(n3) Figure 9 reports the performance of XCSS in Woods14; the curve, averaged over ten runs, shows that XCSS can evolve an optimal solution for Woods14.

Although these results are interesting, they do not explain the causes which underlie the observed behavior. We need to study the generalization mechanism of XCS and Wilson's generalization hypothesis in order to understand XCS's behavior. This is the subject of the next section where we discuss the generalization capabilities of XCS and formulate a hypothesis to explain our results.

6 Generalization with XCS in Animat Problems

6.1 The Generalization Mechanism of XCS

The experimental results discussed in the previous two sections demonstrate that some grid worlds are more difficult for XCS to navigate than others. For example, in the Woods2 environment (see Wilson (1997a)) XCS easily produces optimal solutions; in others, such as Maze5, Maze6 and Woods14, XCS may require special exploration policies and/or special operators.

Here we analyze the generalization mechanism of XCS in order to understand which factors may influence the performance of the system. We start by reconsidering Wilson's generalization hypothesis, which explains the fundamental principles of generalization in XCS as follows:
