Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest (811417), страница 42

Файл №811417 Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest (Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest.pdf) 42 страницаIntroduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest (811417) страница 422020-08-252020-08-25СтудИзба

Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest.pdf

Просмтор этого файла доступен только зарегистрированным пользователям. Но у нас супер быстрая регистрация: достаточно только электронной почты!

Регистрация/авторизация

Текст из файла (страница 42)

Describe a scheme for implementing a direct-address dictionary on a huge array.Each stored object should use O(1) space; the operations SEARCH, INSERT, and DELETEshould take O(1) time each; and the initialization of the data structure should take O(1) time.(Hint: Use an additional stack, whose size is the number of keys actually stored in thedictionary, to help determine whether a given entry in the huge array is valid or not.)11.2 Hash tablesThe difficulty with direct addressing is obvious: if the universe U is large, storing a table T ofsize |U| may be impractical, or even impossible, given the memory available on a typicalcomputer.

Furthermore, the set K of keys actually stored may be so small relative to U thatmost of the space allocated for T would be wasted.When the set K of keys stored in a dictionary is much smaller than the universe U of allpossible keys, a hash table requires much less storage than a direct-address table. Specifically,the storage requirements can be reduced to Θ(|K|) while we maintain the benefit that searchingfor an element in the hash table still requires only O(1) time. The only catch is that this boundis for the average time, whereas for direct addressing it holds for the worst-case time.With direct addressing, an element with key k is stored in slot k.

With hashing, this element isstored in slot h(k); that is, we use a hash function h to compute the slot from the key k. Here hmaps the universe U of keys into the slots of a hash table T[0 m - 1]:h : U → {0, 1, ..., m - 1} .We say that an element with key k hashes to slot h(k); we also say that h(k) is the hash valueof key k. Figure 11.2 illustrates the basic idea. The point of the hash function is to reduce therange of array indices that need to be handled. Instead of |U| values, we need to handle only mvalues.

Storage requirements are correspondingly reduced.Figure 11.2: Using a hash function h to map keys to hash-table slots. keys k2 and k5 map to thesame slot, so they collide.There is one hitch: two keys may hash to the same slot. We call this situation a collision.Fortunately, there are effective techniques for resolving the conflict created by collisions.Of course, the ideal solution would be to avoid collisions altogether. We might try to achievethis goal by choosing a suitable hash function h.

One idea is to make h appear to be "random,"thus avoiding collisions or at least minimizing their number. The very term "to hash," evokingimages of random mixing and chopping, captures the spirit of this approach. (Of course, ahash function h must be deterministic in that a given input k should always produce the sameoutput h(k).) Since |U| > m, however, there must be at least two keys that have the same hashvalue; avoiding collisions altogether is therefore impossible.

Thus, while a well-designed,"random"-looking hash function can minimize the number of collisions, we still need amethod for resolving the collisions that do occur.The remainder of this section presents the simplest collision resolution technique, calledchaining. Section 11.4 introduces an alternative method for resolving collisions, called openaddressing.Collision resolution by chainingIn chaining, we put all the elements that hash to the same slot in a linked list, as shown inFigure 11.3. Slot j contains a pointer to the head of the list of all stored elements that hash to j;if there are no such elements, slot j contains NIL.Figure 11.3: Collision resolution by chaining. Each hash-table slot T[j] contains a linked listof all the keys whose hash value is j.

For example, h(k1) = h(k4) and h(k5) = h(k2) = h(k7).The dictionary operations on a hash table T are easy to implement when collisions areresolved by chaining.CHAINED-HASH-INSERT(T, x)insert x at the head of list T[h(key[x])]CHAINED-HASH-SEARCH(T, k)search for an element with key k in list T[h(k)]CHAINED-HASH-DELETE(T, x)delete x from the list T[h(key[x])]The worst-case running time for insertion is O(1). The insertion procedure is fast in partbecause it assumes that the element x being inserted is not already present in the table; thisassumption can be checked if necessary (at additional cost) by performing a search beforeinsertion. For searching, the worst-case running time is proportional to the length of the list;we shall analyze this operation more closely below.

Deletion of an element x can beaccomplished in O(1) time if the lists are doubly linked. (Note that CHAINED-HASHDELETE takes as input an element x and not its key k, so we don't have to search for x first. Ifthe lists were singly linked, it would not be of great help to take as input the element x ratherthan the key k. We would still have to find x in the list T[h(key[x])], so that the next link of x'spredecessor could be properly set to splice x out.

In this case, deletion and searching wouldhave essentially the same running time.)Analysis of hashing with chainingHow well does hashing with chaining perform? In particular, how long does it take to searchfor an element with a given key?Given a hash table T with m slots that stores n elements, we define the load factor α for T asn/m, that is, the average number of elements stored in a chain. Our analysis will be in terms ofα, which can be less than, equal to, or greater than 1.The worst-case behavior of hashing with chaining is terrible: all n keys hash to the same slot,creating a list of length n. The worst-case time for searching is thus Θ(n) plus the time tocompute the hash function—no better than if we used one linked list for all the elements.Clearly, hash tables are not used for their worst-case performance. (Perfect hashing, describedin Section 11.5, does however provide good worst-case performance when the set of keys isstatic.)The average performance of hashing depends on how well the hash function h distributes theset of keys to be stored among the m slots, on the average.

Section 11.3 discusses these issues,but for now we shall assume that any given element is equally likely to hash into any of the mslots, independently of where any other element has hashed to. We call this the assumption ofsimple uniform hashing.For j = 0, 1, ..., m - 1, let us denote the length of the list T[j] by nj, so that(11.1)and the average value of nj is E[nj] = α = n/m.We assume that the hash value h(k) can be computed in O(1) time, so that the time required tosearch for an element with key k depends linearly on the length nh(k) of the list T[h(k)]. Settingaside the O(1) time required to compute the hash function and to access slot h(k), let usconsider the expected number of elements examined by the search algorithm, that is, thenumber of elements in the list T[h(k)] that are checked to see if their keys are equal to k.

Weshall consider two cases. In the first, the search is unsuccessful: no element in the table haskey k. In the second, the search successfully finds an element with key k.Theorem 11.1In a hash table in which collisions are resolved by chaining, an unsuccessful search takesexpected time Θ(1 + α), under the assumption of simple uniform hashing.Proof Under the assumption of simple uniform hashing, any key k not already stored in thetable is equally likely to hash to any of the m slots. The expected time to searchunsuccessfully for a key k is the expected time to search to the end of list T[h(k)], which hasexpected length E[nh(k)] = α. Thus, the expected number of elements examined in anunsuccessful search is α, and the total time required (including the time for computing h(k)) isΘ(1 + α).The situation for a successful search is slightly different, since each list is not equally likely tobe searched.

Instead, the probability that a list is searched is proportional to the number ofelements it contains. Nonetheless, the expected search time is still Θ(1 + α).Theorem 11.2In a hash table in which collisions are resolved by chaining, a successful search takes timeΘ(1 + α), on the average, under the assumption of simple uniform hashing.Proof We assume that the element being searched for is equally likely to be any of the nelements stored in the table. The number of elements examined during a successful search foran element x is 1 more than the number of elements that appear before x in x's list. Elementsbefore x in the list were all inserted after x was inserted, because new elements are placed atthe front of the list. To find the expected number of elements examined, we take the average,over the n elements x in the table, of 1 plus the expected number of elements added to x's listafter x was added to the list.

Let xi denote the ith element inserted into the table, for i = 1, 2,..., n, and let ki = key[xi]. For keys ki and kj , we define the indicator random variable Xij =I{h(ki) = h(kj)}. Under the assumption of simple uniform hashing, we have Pr{h(ki) = h(kj)} =1/m, and so by Lemma 5.1, E[Xij] = 1/m. Thus, the expected number of elements examined ina successful search isThus, the total time required for a successful search (including the time for computing thehash function) is Θ(2 + α/2 - α/2n) = Θ(1 + α).What does this analysis mean? If the number of hash-table slots is at least proportional to thenumber of elements in the table, we have n = O(m) and, consequently, α = n/m = O(m)/m =O(1).

Thus, searching takes constant time on average. Since insertion takes O(1) worst-casetime and deletion takes O(1) worst-case time when the lists are doubly linked, all dictionaryoperations can be supported in O(1) time on average.Exercises 11.2-1Suppose we use a hash function h to hash n distinct keys into an array T of length m.Assuming simple uniform hashing, what is the expected number of collisions? Moreprecisely, what is the expected cardinality of {{k, l} : k ≠ l and h(k) = h(l)}?Exercises 11.2-2Demonstrate the insertion of the keys 5, 28, 19, 15, 20, 33, 12, 17, 10 into a hash table withcollisions resolved by chaining. Let the table have 9 slots, and let the hash function be h(k) = kmod 9.Exercises 11.2-3Professor Marley hypothesizes that substantial performance gains can be obtained if wemodify the chaining scheme so that each list is kept in sorted order.

Характеристики

Тип файла

PDF-файл

Размер

12,48 Mb

Материал

Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest.pdf

Тип материала

Книга

Предмет

Распределенные алгоритмы

Высшее учебное заведение

МГУ им. Ломоносова

Список файлов книги

introduction-to-algorithms.-2nd-edition-thomas-cormen_-charles-leiserson_-ronad-rivest.pdf.rar

Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest.pdf

Поделитесь ссылкой:

Ставлю 10/10
Все нравится, очень удобный сайт, помогает в учебе. Кроме этого, можно заработать самому, выставляя готовые учебные материалы на продажу здесь. Рейтинги и отзывы на преподавателей очень помогают сориентироваться в начале нового семестра. Спасибо за такую функцию. Ставлю максимальную оценку.

Лучшая платформа для успешной сдачи сессии
Познакомился со СтудИзбой благодаря своему другу, очень нравится интерфейс, количество доступных файлов, цена, в общем, все прекрасно. Даже сам продаю какие-то свои работы.

Студизба ван лав ❤
Очень офигенный сайт для студентов. Много полезных учебных материалов. Пользуюсь студизбой с октября 2021 года. Серьёзных нареканий нет. Хотелось бы, что бы ввели подписочную модель и сделали материалы дешевле 300 рублей в рамках подписки бесплатными.

Отличный сайт
Лично меня всё устраивает - и покупка, и продажа; и цены, и возможность предпросмотра куска файла, и обилие бесплатных файлов (в подборках по авторам, читай, ВУЗам и факультетам). Есть определённые баги, но всё решаемо, да и администраторы реагируют в течение суток.

Маленький отзыв о большом помощнике!
Студизба спасает в те моменты, когда сроки горят, а работ накопилось достаточно. Довольно удобный сайт с простой навигацией и огромным количеством материалов.

Студ. Изба как крупнейший сборник работ для студентов
Тут дофига бывает всего полезного. Печально, что бывают предметы по которым даже одного бесплатного решения нет, но это скорее вопрос к студентам. В остальном всё здорово.

Спасательный островок
Если уже не успеваешь разобраться или застрял на каком-то задание поможет тебе быстро и недорого решить твою проблему.

Всё и так отлично
Всё очень удобно. Особенно круто, что есть система бонусов и можно выводить остатки денег. Очень много качественных бесплатных файлов.

Отзыв о системе "Студизба"
Отличная платформа для распространения работ, востребованных студентами. Хорошо налаженная и качественная работа сайта, огромная база заданий и аудитория.

Отличный помощник
Отличный сайт с кучей полезных файлов, позволяющий найти много методичек / учебников / отзывов о вузах и преподователях.

Отлично помогает студентам в любой момент для решения трудных и незамедлительных задач
Хотелось бы больше конкретной информации о преподавателях. А так в принципе хороший сайт, всегда им пользуюсь и ни разу не было желания прекратить. Хороший сайт для помощи студентам, удобный и приятный интерфейс. Из недостатков можно выделить только отсутствия небольшого количества файлов.

Спасибо за шикарный сайт
Великолепный сайт на котором студент за не большие деньги может найти помощь с дз, проектами курсовыми, лабораторными, а также узнать отзывы на преподавателей и бесплатно скачать пособия.