Monday, December 1, 2008

Update on the Kompimi C Data Structure Library: One of the most useful common collections is the associative array. Today I completed work on two of the three data structures planned to support this abstraction: the hash table and the "dynamic array set", which is simply a dynamic array that uses linear search for lookup, intended for very small collections and as a performance baseline.
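
The dynamic array set really is as simple as it sounds: a growable array searched linearly on lookup. A minimal sketch (the names and element type here are illustrative, not the Kompimi API) might look like this:

    #include <stdlib.h>

    /* Illustrative sketch of a "dynamic array set": a growable array of keys
       searched linearly on lookup. Names and layout are hypothetical, not the
       Kompimi API. */
    typedef struct {
        int   *keys;     /* stored elements */
        size_t count;    /* number of elements in use */
        size_t capacity; /* allocated slots */
    } array_set;

    /* Returns 1 if key is present, using a linear scan. */
    static int array_set_contains(const array_set *s, int key) {
        for (size_t i = 0; i < s->count; i++)
            if (s->keys[i] == key)
                return 1;
        return 0;
    }

    /* Appends key if absent, doubling the backing array when full. */
    static int array_set_insert(array_set *s, int key) {
        if (array_set_contains(s, key))
            return 0;
        if (s->count == s->capacity) {
            size_t new_cap = s->capacity ? s->capacity * 2 : 4;
            int *p = realloc(s->keys, new_cap * sizeof *p);
            if (!p) return -1;
            s->keys = p;
            s->capacity = new_cap;
        }
        s->keys[s->count++] = key;
        return 1;
    }

Growing is just a realloc that copies the existing elements; nothing has to be re-inserted, which is why the array set looks comparatively good in the growing benchmark below.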

Performance results are shown below. The first test measures insertion into a hash table preallocated to be large enough to hold all elements, the second measures lookup in a full hash table, and the third measures insertion while growing the hash table from its smallest size. The results are relatively unsurprising: the naive dynamic array set is almost never worthwhile, even on very small collections, except in the growing test, since it does not need to redo insertions when it grows. Direct chaining performs particularly poorly in the growing test, since it needs to free the memory for the old chains and allocate new ones whenever the table is resized (coalesced hashing may do better here), whereas resizing an open addressing hash table by a factor of 2 has good cache performance (each old bucket maps to only 2 new buckets). Time per operation is constant for a while, but increases as the table becomes too large to fit in the L2 cache. Quadratic probing is probably the best choice across the board.
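
To make the open addressing variants concrete, here is a rough sketch of a quadratic (triangular) probing lookup over a power-of-two table. This is only an illustration of the probing scheme, with made-up names, not the library's implementation:

    #include <stddef.h>

    #define EMPTY 0  /* assume key 0 is reserved as the empty-slot marker */

    /* Quadratic probing lookup in a power-of-two sized open addressing table.
       Probe sequence: h, h+1, h+3, h+6, ... (triangular offsets), which visits
       every slot when the table size is a power of two. Assumes the table
       always keeps at least one EMPTY slot. Illustrative only. */
    static size_t quad_probe_find(const unsigned *table, size_t size,
                                  unsigned key, size_t (*hash)(unsigned)) {
        size_t mask = size - 1;        /* size must be a power of two */
        size_t i = hash(key) & mask;
        size_t step = 0;
        while (table[i] != EMPTY) {
            if (table[i] == key)
                return i;              /* found */
            step++;
            i = (i + step) & mask;     /* triangular probing step */
        }
        return (size_t)-1;             /* not found */
    }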

The slowdown for growing versus inserting into a preallocated table ranges from 1.5x to 3x, and is worst for small collections - this is why the dynamic array set dominates there. There might be an interesting way to speed up resizing.
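
For context, growing an open addressing table means allocating a table twice the size and rehashing every occupied slot into it, roughly along these lines (again a sketch with illustrative names, reusing the EMPTY convention from the probing sketch above):

    #include <stdlib.h>

    #define EMPTY 0   /* key 0 marks an empty slot, as in the sketch above */

    /* Sketch of growing an open addressing table by a factor of two: every
       occupied slot of the old table is rehashed into the new one. With a
       power-of-two table, each old bucket can map to only two new buckets,
       which is what keeps the copy cache friendly. Illustrative only. */
    static int table_grow(unsigned **table_p, size_t *size_p,
                          size_t (*hash)(unsigned)) {
        size_t old_size = *size_p;
        size_t new_size = old_size * 2;
        size_t mask = new_size - 1;
        unsigned *old = *table_p;
        unsigned *new_table = calloc(new_size, sizeof *new_table);
        if (!new_table)
            return -1;
        for (size_t i = 0; i < old_size; i++) {
            if (old[i] == EMPTY)
                continue;
            /* re-insert with the same triangular probing as the lookup */
            size_t j = hash(old[i]) & mask, step = 0;
            while (new_table[j] != EMPTY) {
                step++;
                j = (j + step) & mask;
            }
            new_table[j] = old[i];
        }
        free(old);
        *table_p = new_table;
        *size_p = new_size;
        return 0;
    }

This rehashing pass is the work the growing test pays for; direct chaining additionally has to tear down and rebuild its chain nodes on every resize.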





All content in this document is granted into the public domain. Where this is not legally possible, the copyright owner releases all rights. This notice may be modified or removed.

Thursday, November 20, 2008

C Data Structure Library: list is done

Today I finished my list implementation for the C Data Structure Library, based on unrolled doubly-linked lists with external storage. As a relatively complex data structure, it includes deep invariant checking in debug mode and has a good set of unit tests to stress it. I built a release version for performance testing and compared it against a simple doubly-linked list performing similar operations. Preliminary results suggest the complexity was worthwhile:
  • An insert at the end requires only 29 ns on average, as compared to 110 ns for a standard doubly-linked list.
  • An iteration over a million-element list requires only 3.7 ms, as compared to 9.5 ms for a standard doubly-linked list.
It's a good thing I did the perf test though - a few critical functions had to be converted into macros to get this performance (before, it was slower than doubly-linked lists with similarly inlined calls).
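
For readers unfamiliar with the structure, an unrolled doubly-linked list packs several elements into each node, so a traversal follows far fewer pointers and touches far fewer cache lines. A rough sketch of the node layout (the field names and iteration helper are illustrative, not the library's actual definitions):

    #include <stddef.h>

    #define ELEMENTS_PER_LIST_NODE 25  /* library default, per the post */

    /* Sketch of an unrolled doubly-linked list node with external storage:
       each node holds pointers to up to ELEMENTS_PER_LIST_NODE elements, so
       a traversal visits far fewer nodes than a one-element-per-node list.
       Field names are illustrative, not the Kompimi definitions. */
    typedef struct unrolled_node {
        struct unrolled_node *prev;
        struct unrolled_node *next;
        size_t                count;                         /* elements in use */
        void                 *elems[ELEMENTS_PER_LIST_NODE]; /* external storage */
    } unrolled_node;

    /* Iteration visits count elements per node before following next,
       amortizing the pointer chase across many elements. */
    static void unrolled_for_each(const unrolled_node *head,
                                  void (*visit)(void *elem)) {
        for (const unrolled_node *n = head; n != NULL; n = n->next)
            for (size_t i = 0; i < n->count; i++)
                visit(n->elems[i]);
    }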

After this I varied ELEMENTS_PER_LIST_NODE a bit to examine how it affects the performance of these simple tests. The default value is 25. At the minimum, 2, it required about 75 ns per insert and 7 ms per iteration. Around 18, iteration time drops to 2.9 ms. For particularly large values, like 128, iteration drops to 2.2 ms and insert drops to 24.4 ns, but other benchmarks like insert_after would probably show a loss at these sizes.

Update: Improvements to the perf test framework eliminated some of the overhead of the framework itself, which makes a difference for the very short insertion test. Currently inserts are down to 23.5 ns, compared to 98 ns for the standard doubly-linked list (a 4.1x speedup).
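
For the curious, the overhead in question is the timing loop itself. A generic way to factor it out (just a sketch, not the library's actual perf framework) is to time an empty calibration loop and subtract it:

    #include <time.h>

    /* CPU seconds elapsed so far; adequate for a coarse micro-benchmark. */
    static double elapsed_seconds(void) {
        return (double)clock() / CLOCKS_PER_SEC;
    }

    /* Time `iters` calls of op(ctx), subtract the cost of an empty
       calibration loop, and return nanoseconds per operation. The volatile
       sink keeps the compiler from deleting the calibration loop. */
    static double time_per_op_ns(long iters, void (*op)(void *ctx), void *ctx) {
        volatile long sink = 0;
        double t0 = elapsed_seconds();
        for (long i = 0; i < iters; i++)
            sink += i;                  /* calibration: loop overhead only */
        double overhead = elapsed_seconds() - t0;

        t0 = elapsed_seconds();
        for (long i = 0; i < iters; i++)
            op(ctx);                    /* measured operation */
        double total = elapsed_seconds() - t0;

        return (total - overhead) * 1e9 / (double)iters;
    }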

All content in this document is granted into the public domain. Where this is not legally possible, the copyright owner releases all rights. This notice may be modified or removed.