Programming Java 2 Micro Edition for Symbian OS (2004), page 69
Each time we go down, we push that node onto a stack. When we can go no further, we:

1. Pop a node from the stack.
2. Add the node to the list V.
3. Decrement count.
4. Attempt to take a right branch. If we can, we take the right branch but then continue taking left branches as far as possible. If we cannot, we continue steps 1 to 4 until we can take a right branch, or until we have copied the whole tree to the vector.
5. When count is zero we know we have gone through the whole tree, so we return.

This approach is a little ugly because we are copying the whole of the binary tree to a vector.
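The steps above can be sketched in Java using java.util.Stack. This is a minimal illustration: the Node and TreeCopy names are stand-ins, not the book's actual tree classes, which hold Cell objects.

```java
import java.util.Stack;
import java.util.Vector;

// Hypothetical minimal tree node; the book's real classes differ.
class Node {
    int value;
    Node left, right;
    Node(int value) { this.value = value; }
}

public class TreeCopy {
    // Copy the sorted tree rooted at 'root' into a Vector, in ascending
    // order, using the explicit-stack walk described in the text.
    static Vector toVector(Node root, int count) {
        Vector v = new Vector(count);
        Stack stack = new Stack();
        Node node = root;
        while (count > 0) {
            // Go down to the left as far as possible, pushing each node.
            while (node != null) {
                stack.push(node);
                node = node.left;
            }
            // Pop a node, add it to the list, decrement count.
            node = (Node) stack.pop();
            v.addElement(node);
            count--;
            // Attempt to take the right branch; the outer loop then
            // resumes taking left branches as far as possible.
            node = node.right;
        }
        return v;
    }

    public static void main(String[] args) {
        Node root = new Node(2);
        root.left = new Node(1);
        root.right = new Node(3);
        Vector v = toVector(root, 3);
        System.out.println(((Node) v.elementAt(0)).value); // 1
        System.out.println(((Node) v.elementAt(1)).value); // 2
        System.out.println(((Node) v.elementAt(2)).value); // 3
    }
}
```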
An alternative worth exploring is to take advantage of the fact that the tree is sorted. We would write our own implementation of Enumeration.nextElement() that would use the previous Cell returned by nextElement() as the starting point for a new search. The search would return the next biggest Cell.

7.14.6 Summary

There is a further optimization we can consider.
The use of Cells was driven by the desire to work with standard containers (Hashtable and Vector), which hold objects, not primitive types. However, we are not interested in the cells themselves, but just their positions (a 32-bit integer). This means we could reduce the number of Cell objects created by changing the signatures in the GenerationMap interface to take integer values, rather than cells. We would also have to implement our own enumerator interface to return integers, not objects. The result would be a sorted binary tree implementation that was great for our application, but not much use for anything else. However, the goal of this case study is not to make the LifeTime MIDlet as fast (and as memory efficient) as possible, but rather to encourage good design practice in general and consideration of the wider issues.

Each container has its strengths and weaknesses.
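The integer-position variant suggested above might be sketched as follows. This is illustrative only: the IntEnumeration and GenerationMapInt names, and the 16-bit packing scheme, are assumptions rather than the book's code.

```java
// Illustrative sketch: the book's real GenerationMap takes Cell objects.
interface IntEnumeration {
    boolean hasMoreElements();
    int nextElement();             // a packed 32-bit position, not an Object
}

interface GenerationMapInt {
    void birth(int position);
    void kill(int position);
    boolean isAlive(int position);
    IntEnumeration elements();
}

public class Position {
    // Pack an (x, y) cell position into a single 32-bit int:
    // x in the high 16 bits, y in the low 16 bits.
    public static int pack(int x, int y) { return (x << 16) | (y & 0xffff); }
    public static int x(int packed)      { return packed >> 16; }   // sign-extends
    public static int y(int packed)      { return (short) packed; } // sign-extends

    public static void main(String[] args) {
        int p = Position.pack(-3, 7);
        System.out.println(Position.x(p)); // -3
        System.out.println(Position.y(p)); // 7
    }
}
```

Because pack() and the accessors sign-extend, negative coordinates survive the round trip, at the cost of restricting each coordinate to 16 bits.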
If insertion is the bottleneck, then a Vector would be a good choice; for general ease of use, a Hashtable is probably the best choice. GenerationMap.kill() was only used during editing, so its performance is not critical. If removing objects has to be done quickly, then Vector is a bad choice and Hashtable or the sorted binary tree the best choice.

If we have to draw a conclusion from this exercise, it is the need for better containers on wireless devices if we are to run more complex algorithms. Rolling our own containers is a tricky and error-prone business. The ease with which we can optimize our own container has to be offset against the risk of bugs.

The study has hopefully demonstrated a few of our optimization guidelines:

• the benefit of working to interfaces: the GenerationMap interface allows us to easily try out different implementations
• reasonably clean architecture and straightforward code: in the interests of maintainability, we have avoided the more exotic Game of Life algorithms and not overspecialized the containers
• the use of profiling and heap analysis tools to identify performance and memory hotspots: we have concentrated our efforts on fixing these areas.

7.15 Arithmetic Operations

Currently there is no hardware assistance available for division and modulo arithmetic in the CPUs used by mobile phones.
For an arithmetically intensive application (such as image analysis or speech decoding), see if you can arrange your divisions so that the divisor is a power of two: you can then use the shift right operator for division. Similarly, for modulo arithmetic you can use a masking operation if the modulus is a power of two.

As an example, you might be using an array as a re-circulating buffer, with read and write pointers. The read and write methods will need to wrap their respective pointers when they reach the end of the array. If size is the size of the array, then on a write we would wrap the pointer with this line of code:

writePointer = (writePointer + 1) % size;

Similarly, on a read:

readPointer = (readPointer + 1) % size;

If size is a power of two, e.g.
512, we can replace these lines with something a bit faster:

writePointer = (writePointer + 1) & 0x1ff;

readPointer = (readPointer + 1) & 0x1ff;

We can also use a shift left operator to multiply by a power of two. In LifeTime we arranged the cell pitch (increment) to be a power of two. In fact, it is equal to two to the power of the zoomFactor, where zoomFactor is 0, 1, 2, or 3. We could thus replace:

g.fillRect(xPos * increment, yPos * increment, cellSize, cellSize);

with:

g.fillRect(xPos << zoomFactor, yPos << zoomFactor, cellSize, cellSize);

There was no measurable performance gain in this case because this line of code was not a bottleneck and because all mobile phone CPUs have hardware multipliers.

7.16 Design Patterns

In Section 7.4, we stated that one of the most important rules for optimization was getting the design right.
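The power-of-two replacements from Section 7.15 can be verified with a short self-check. The variable names size, mask and zoomFactor follow the text; the wrapper class is illustrative.

```java
public class PowerOfTwoTricks {
    public static void main(String[] args) {
        int size = 512;          // must be a power of two
        int mask = size - 1;     // 0x1ff
        // For non-negative values, (n % size) == (n & mask) when size is
        // a power of two.
        for (int n = 0; n < 4096; n++) {
            if ((n % size) != (n & mask)) {
                throw new RuntimeException("mismatch at " + n);
            }
        }
        // Shifting left by zoomFactor multiplies by 2^zoomFactor.
        int zoomFactor = 3;      // increment = 8
        System.out.println((5 << zoomFactor) == 5 * 8); // true
    }
}
```

Note that the masking trick holds only for non-negative operands; in Java the % operator can yield negative results for negative dividends, whereas & never does.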
For instance, it should be possible to defer the choice of sorting algorithm until the trade-offs between bubble sort, quick sort, or some other algorithm can be made intelligently on the basis of performance, memory requirements and the size and distribution of the data set to be sorted. However, this requires designing your code such that substituting one sorting algorithm for another is painless.

This section looks at a couple of patterns that can help achieve a better design.

7.16.1 Caching

Caching can produce very significant improvements in performance. The World Wide Web would probably be unusable if your desktop computer did not cache pages. Disk performance relies on read-ahead caching. Virtual memory is a form of caching.
Almost all modern processors use data caches because these can be accessed far more quickly than main memory. Sun's HotSpot compiler technology, e.g. the CLDC HI VM used by Symbian, caches bytecode as optimized native code.

There are a number of issues to consider when designing a cache (see A System of Patterns by Buschmann et al.):

• what to cache
Cache objects which are likely to be reused, are not too big, and are slow to create or to access. A cache will be of most benefit if there is some pattern to the way objects are accessed, e.g. having accessed a web page there is a good chance I shall want to access it again, or having read one sector on a disc there is a good chance the next sector will be wanted.

• how much to cache
The 80:20 rule applies.
Look for the 20% that is used 80% of the time. In practice even a small cache can significantly improve performance. On the other hand, a cache that is a similar size to the data set is wasting memory.

• which objects to delete from the cache
When the cache becomes full, you will have to throw away old items. Strategies include first in-first out, least recently used, least frequently used and random. A random policy works surprisingly well because it is immune to pathological patterns.

• how to maintain integrity between cached data and the data source
This takes some thought, as you will be writing data into the cache as well as reading data from the cache. You can maintain cache integrity using an observer-observable model: read integrity is maintained by making the cache an observer of changes made to the primary data (the cache can also be an observable that is observed by the application), while write integrity is maintained either by using a write-through policy such that data written by the application to the cache is simultaneously written to the primary data source, or by making the primary data an observer of the cache.

[Figure 7.13 Achieving optimum cache size: performance in accesses/second plotted against cache size as a percentage of the data set, showing the cache performance and primary data performance levels.]

Figure 7.13 shows how the optimum cache size depends on the speed of the cache versus the speed of the primary data source, and the size of the primary data set.

The reason for having a cache is that it is faster to access objects in the cache.
In this case, the cache is about five times faster to access than the primary data set. Our actual performance (in accesses per second) will not quite reach the cache performance because we shall have to spend some time looking for the object in the cache. Also, of course, the larger the cache the longer it takes to search, so overall performance might even deteriorate with increasing cache size. The object access pattern implied by this curve suggests a cache size that is 30% of our primary data set. The lighter-colored straight line gives the performance if objects were accessed randomly.

www.javaworld.com/javaworld/jw-07-2001/jw-0720-cache_p.html provides useful ideas on caching.

7.16.2 Caching Results From a Database

We often want to scroll through records obtained from a database.
This might be a remote or local database, or we might be using the PIM APIs (JSR 75) to access our address book.

It is impractical on a constrained device to hold all the data in memory from even a moderate-sized database. Therefore, consider using a cache to hold data already read from the database and predictively load data. The latter can be carried out in a background thread.

For instance, the PIM APIs from JSR 75 access the address book or calendar databases using an Enumeration.
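A fixed-size record cache of the kind this section describes might be sketched as follows. This is illustrative only: the RecordCache class and its API are assumptions, using the random eviction policy mentioned in Section 7.16.1 and only the Hashtable and Vector containers available on CLDC.

```java
import java.util.Hashtable;
import java.util.Random;
import java.util.Vector;

// Illustrative fixed-size cache with random eviction; not from the book.
public class RecordCache {
    private Hashtable map = new Hashtable();
    private Vector keys = new Vector();   // tracks keys for eviction
    private Random random = new Random();
    private int capacity;

    public RecordCache(int capacity) { this.capacity = capacity; }

    public Object get(Object key) { return map.get(key); }

    public void put(Object key, Object value) {
        if (!map.containsKey(key)) {
            if (map.size() >= capacity) {
                // Cache full: evict a randomly chosen entry. Random
                // replacement is cheap and immune to pathological
                // access patterns.
                int victim = (random.nextInt() >>> 1) % keys.size();
                map.remove(keys.elementAt(victim));
                keys.removeElementAt(victim);
            }
            keys.addElement(key);
        }
        map.put(key, value);
    }

    public int size() { return map.size(); }

    public static void main(String[] args) {
        RecordCache cache = new RecordCache(2);
        cache.put(new Integer(1), "record 1");
        cache.put(new Integer(2), "record 2");
        cache.put(new Integer(3), "record 3"); // evicts record 1 or 2
        System.out.println(cache.size()); // 2
    }
}
```

A background thread could call put() to prefetch records ahead of the user's current scroll position, while the UI thread calls get().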
Caching ahead will allow the user to look at an entry then quickly iterate through the next few entries. Keeping a small cache of entries that have already been scrolled through will allow the user to scroll back. If the user scrolls back to the beginning of your cache then you have little choice but to reset the Enumeration and read through the database again (which can also be performed in a background thread).

7.16.3 Early or Lazy Instantiation

The Dice Box created its dice at startup time, which is known as early instantiation. Alternatively, we could have created the dice as needed and added them to a pool, which is known as just in time or lazy instantiation. This would reduce startup time at the cost of increasing the time taken to add more dice the first time round. A third alternative would be to create new dice every time we change their number, but being good programmers, we do not give this option too much consideration.

We talked earlier about creating object pools for things like database connections or server connections; we can either create a pool at startup (early instantiation), or build up the pool to some maximum as needed (lazy instantiation).

7.16.4 Larger-Grained Operations

Setting up and tearing down an operation can take a long time compared to the time the operation spends doing real work.
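The lazy-instantiation pool described in Section 7.16.3 can be sketched as follows. The LazyPool name and its API are our own, and plain Objects stand in for expensive resources such as dice or server connections.

```java
import java.util.Vector;

// Illustrative lazy-instantiation pool: objects are created only on first
// demand, up to a maximum, and recycled thereafter.
public class LazyPool {
    private Vector free = new Vector();
    private int max;
    private int created = 0;

    public LazyPool(int max) { this.max = max; }

    // Hand out a pooled object, creating one lazily if none is free.
    public synchronized Object acquire() {
        if (!free.isEmpty()) {
            Object o = free.lastElement();
            free.removeElementAt(free.size() - 1);
            return o;
        }
        if (created < max) {
            created++;
            return new Object(); // stand-in for an expensive resource
        }
        return null;             // exhausted; a real pool might block or grow
    }

    public synchronized void release(Object o) { free.addElement(o); }

    public synchronized int createdSoFar() { return created; }

    public static void main(String[] args) {
        LazyPool pool = new LazyPool(2);
        Object a = pool.acquire();
        Object b = pool.acquire();
        System.out.println(pool.acquire() == null); // true: maximum reached
        pool.release(a);
        System.out.println(pool.acquire() == a);    // true: recycled, not new
    }
}
```

An early-instantiation variant would simply fill the free Vector with max objects in the constructor, trading startup time for predictable acquire() cost later.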