Priority-Queue Performance Tests

Settings

This section describes performance tests and their results. In the following, g++, msvc++, and local (the build used for generating this documentation) stand for three different builds:

g++

  • CPU speed - cpu MHz : 2660.644
  • Memory - MemTotal: 484412 kB
  • Platform - Linux-2.6.12-9-386-i686-with-debian-testing-unstable
  • Compiler - g++ (GCC) 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9) Copyright (C) 2005 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

msvc++

  • CPU speed - cpu MHz : 2660.554
  • Memory - MemTotal: 484412 kB
  • Platform - Windows XP Pro
  • Compiler - Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86 Copyright (C) Microsoft Corporation 1984-2002. All rights reserved.

local

  • CPU speed - cpu MHz : 2250.000
  • Memory - MemTotal: 2076248 kB
  • Platform - Linux-2.6.16-1.2133_FC5-i686-with-redhat-5-Bordeaux
  • Compiler - g++ (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1) Copyright (C) 2006 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Tests

  1. Priority Queue Text push Timing Test
  2. Priority Queue Text push and pop Timing Test
  3. Priority Queue Random Integer push Timing Test
  4. Priority Queue Random Integer push and pop Timing Test
  5. Priority Queue Text pop Memory Use Test
  6. Priority Queue Text join Timing Test
  7. Priority Queue Text modify Timing Test - I
  8. Priority Queue Text modify Timing Test - II

Observations

Underlying Data Structures Complexity

The following summarizes the complexities of the different underlying data structures in terms of orders of growth. It is interesting to note that this summary implies something about the constants of the operations as well (see Amortized push and pop operations).

std::priority_queue

  • push - Θ(n) worst, Θ(log(n)) amortized
  • pop - Θ(log(n)) worst
  • modify - Θ(n log(n)) worst [std note 1]
  • erase - Θ(n log(n)) [std note 2]
  • join - Θ(n log(n)) [std note 1]

priority_queue with Tag = pairing_heap_tag

  • push - O(1)
  • pop - Θ(n) worst, Θ(log(n)) amortized
  • modify - Θ(n) worst, Θ(log(n)) amortized
  • erase - Θ(n) worst, Θ(log(n)) amortized
  • join - O(1)

priority_queue with Tag = binary_heap_tag

  • push - Θ(n) worst, Θ(log(n)) amortized
  • pop - Θ(n) worst, Θ(log(n)) amortized
  • modify - Θ(n)
  • erase - Θ(n)
  • join - Θ(n)

priority_queue with Tag = binomial_heap_tag

  • push - Θ(log(n)) worst, O(1) amortized
  • pop - Θ(log(n))
  • modify - Θ(log(n))
  • erase - Θ(log(n))
  • join - Θ(log(n))

priority_queue with Tag = rc_binomial_heap_tag

  • push - O(1)
  • pop - Θ(log(n))
  • modify - Θ(log(n))
  • erase - Θ(log(n))
  • join - Θ(log(n))

priority_queue with Tag = thin_heap_tag

  • push - O(1)
  • pop - Θ(n) worst, Θ(log(n)) amortized
  • modify - Θ(log(n)) worst, O(1) amortized or Θ(log(n)) amortized [thin_heap_note]
  • erase - Θ(n) worst, Θ(log(n)) amortized
  • join - Θ(n)

[std note 1] This is not a property of the algorithm, but rather due to the fact that the STL's priority queue implementation does not support iterators (and consequently the ability to access a specific value inside it). If the priority queue is adapting an std::vector, then it is still possible to reduce this to Θ(n) by adapting over the STL's adapter and using the fact that top returns a reference to the first value; if, however, it is adapting an std::deque, then this is impossible.

[std note 2] As with [std note 1], this is not a property of the algorithm, but rather of the STL's implementation. Again, if the priority queue is adapting an std::vector then it is possible to reduce this to Θ(n), but with a very high constant (one must call std::make_heap, which is an expensive linear operation); if the priority queue is adapting an std::deque, then this is impossible.

[thin_heap_note] A thin heap always has Θ(log(n)) worst-case modify time, but the amortized time depends on the nature of the operation: I) if the operation increases the key (in the sense of the priority queue's comparison functor), then the amortized time is O(1); II) if it decreases the key, then the amortized time is the same as the worst-case time. Note that for most algorithms, I) is important and II) is not.
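
The following sketch illustrates the kind of workaround the std notes above describe, under the assumption that the priority queue adapts an std::vector. It derives from std::priority_queue in order to reach the protected container member c (which the standard guarantees), finds the value by a linear scan, overwrites it, and restores the heap invariant with std::make_heap; the class name modifiable_pq and its interface are illustrative only and are not part of any library.

    #include <algorithm>
    #include <functional>
    #include <queue>
    #include <string>
    #include <vector>

    // Illustrative wrapper (modifiable_pq is an assumed name, not a library
    // class): std::priority_queue guarantees protected members 'c' (the
    // underlying container) and 'comp' (the comparison functor), so a derived
    // class can change a value in place and rebuild the heap in linear time.
    template<typename T, typename Cmp = std::less<T> >
    struct modifiable_pq : std::priority_queue<T, std::vector<T>, Cmp>
    {
        // Returns false if old_val is not present; otherwise overwrites it
        // with new_val and restores the heap invariant with std::make_heap,
        // giving the Theta(n)-with-a-high-constant behavior the notes mention.
        bool modify(const T& old_val, const T& new_val)
        {
            typename std::vector<T>::iterator it =
                std::find(this->c.begin(), this->c.end(), old_val);
            if (it == this->c.end())
                return false;
            *it = new_val;
            std::make_heap(this->c.begin(), this->c.end(), this->comp);
            return true;
        }
    };

    int main()
    {
        modifiable_pq<std::string> p;
        p.push("abc");
        p.push("def");
        p.modify("abc", "xyz");   // linear, instead of popping and re-pushing everything
        return p.top() == "xyz" ? 0 : 1;
    }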

Amortized push and pop operations

In many cases, a priority queue is needed primarily for sequences of push and pop operations. All of the underlying data structures have the same amortized logarithmic complexity, but they differ in terms of constants.

The summary above shows that the different data structures are "constrained" in some respects. In general, if an underlying data structure is constrained to a better worst-case bound for some operation than another, then it typically performs worse for that operation in the amortized sense. Thus, for example, a redundant-counter binomial heap (priority_queue with Tag = rc_binomial_heap_tag) is constrained to a better worst-case push bound than a binomial heap (priority_queue with Tag = binomial_heap_tag), and so its amortized push performance is slower in terms of constants.

As the summary shows, the "least constrained" underlying data structures are binary heaps and pairing heaps. Consequently, it is not surprising that they perform best in terms of amortized constants; a short selection sketch follows the list below.

  1. Pairing heaps seem to perform best for non-primitive types (e.g., std::strings), as shown by Priority Queue Text push Timing Test and Priority Queue Text push and pop Timing Test.
  2. Binary heaps seem to perform best for primitive types (e.g., ints), as shown by Priority Queue Random Integer push Timing Test and Priority Queue Random Integer push and pop Timing Test.
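
The following sketch shows how the underlying data structure is selected through the Tag template parameter, along the lines of the two observations above; the typedef names and the loop bound are illustrative assumptions.

    #include <ext/pb_ds/priority_queue.hpp>
    #include <functional>
    #include <string>

    // Note: the namespace is __gnu_pbds in current libstdc++ releases;
    // very old releases spelled it pb_ds.
    // Pairing heap for a non-primitive value type (std::string), binary heap
    // for a primitive value type (int), following the observations above.
    typedef __gnu_pbds::priority_queue<std::string, std::less<std::string>,
                                       __gnu_pbds::pairing_heap_tag> string_pq_t;
    typedef __gnu_pbds::priority_queue<int, std::less<int>,
                                       __gnu_pbds::binary_heap_tag> int_pq_t;

    int main()
    {
        string_pq_t sp;
        sp.push("apple");
        sp.push("banana");
        sp.pop();                    // sequences of push and pop: amortized logarithmic

        int_pq_t ip;
        for (int i = 0; i < 1000; ++i)
            ip.push(i);
        while (!ip.empty())
            ip.pop();
        return 0;
    }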

Graph Algorithms

In some graph algorithms, a decrease-key operation is required [clrs2001]; this operation is identical to modify when a value is increased (in the sense of the priority queue's comparison functor). The summary above and Priority Queue Text modify Timing Test - I show that a thin heap (priority_queue with Tag = thin_heap_tag) outperforms a pairing heap (priority_queue with Tag = pairing_heap_tag), but the other tests show otherwise.

This makes it difficult to decide which implementation to use in this case. Dijkstra's shortest-path algorithm, for example, requires Θ(n) push and pop operations (in the number of vertices), but O(n²) modify operations, which can in practice be Θ(n) as well. It is difficult to find an a priori characterization of graphs in which the actual number of modify operations will dwarf the number of push and pop operations.
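
For illustration only, the following sketch shows a decrease-key step expressed through modify with Tag = thin_heap_tag; the graph itself is elided, and the type and variable names (dist_vertex, handles, the distances) are assumptions. Using std::greater as the comparison functor makes top() the vertex with the smallest tentative distance, so lowering a distance is an increase in the sense of the comparison functor, which is the O(1) amortized modify case noted above.

    #include <ext/pb_ds/priority_queue.hpp>
    #include <functional>
    #include <utility>
    #include <vector>

    // Namespace is __gnu_pbds in current libstdc++ (pb_ds in very old releases).
    typedef std::pair<double, int> dist_vertex;   // (tentative distance, vertex id)
    typedef __gnu_pbds::priority_queue<dist_vertex,
                                       std::greater<dist_vertex>,
                                       __gnu_pbds::thin_heap_tag> dijkstra_pq_t;

    int main()
    {
        const int num_vertices = 4;               // illustrative size
        dijkstra_pq_t pq;
        std::vector<dijkstra_pq_t::point_iterator> handles;

        // push returns a point_iterator that remains valid, which is what
        // makes the later decrease-key (modify) possible.
        for (int v = 0; v < num_vertices; ++v)
            handles.push_back(pq.push(dist_vertex(1.0e9, v)));

        // Relaxing an edge: lower the tentative distance of vertex 2.
        pq.modify(handles[2], dist_vertex(3.5, 2));

        return pq.top().second == 2 ? 0 : 1;      // vertex 2 is now at the top
    }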