As I've been slacking (probably due to the heat wave here), this
is the first real update in quite some time. I asked you to send in
some benchmark results, and indeed I got some nice feedback. In this
first part I will just quickly talk about what has been measured
and give the reported data; the second part will interpret the
data.
What the code measures
The code does the following. First it does a little warm-up,
performing some untimed allocations. This gives the memory system a
chance to prepare for what is coming :)
The program then performs a million vm_allocate and
vm_deallocate operations to factor
out the overhead of allocation and deallocation, since the primary
interest of this benchmark is the page faulting (with the zero
filling). One million operations are used so that the
runtime converts easily into us/operation: a runtime of 10s
means 10us per iteration.
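To make the arithmetic explicit, here is a tiny sketch of that conversion in C. The function name us_per_op is made up for illustration and is not taken from the benchmark source.

```c
#include <assert.h>

/* Hypothetical helper (not from the benchmark itself): turn a total
 * runtime in seconds into microseconds per operation. */
double us_per_op(double total_seconds, long iterations)
{
    assert(iterations > 0);
    /* 1 s = 1e6 us, so with 1e6 iterations the two factors cancel. */
    return total_seconds * 1e6 / (double)iterations;
}
```

With one million iterations the factor of 1e6 cancels, which is why the number of seconds can be read directly as us per operation.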
The first loop just does vm_allocate() and
vm_deallocate(). The second does the same thing,
except that it touches a single page within the allocation. This
triggers a fault, a vm_object allocation, and a zero-fill. Then,
when we call vm_deallocate(), there is additional work
to free the page and the object. The third loop allocates and frees
two pages without touching them, akin to the first loop. Finally,
the fourth loop does the same as the second but touches 2 pages
per allocation, isolating the per-page
costs (fault, zero-fill, and free).
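The four loops can be sketched roughly like this. This is not the benchmark source, just a minimal reconstruction under some assumptions: the names region_alloc, region_free and run_loop are hypothetical, loops one and two are assumed to allocate a single page, and on systems other than Mac OS X the sketch swaps in POSIX mmap/munmap as a stand-in for vm_allocate/vm_deallocate so that it still compiles.

```c
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std=c99 on glibc */
#include <assert.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

#ifdef __APPLE__
#include <mach/mach.h>
/* Mach path: the two calls the text talks about. */
static char *region_alloc(size_t size)
{
    vm_address_t addr = 0;
    kern_return_t kr = vm_allocate(mach_task_self(), &addr, size,
                                   VM_FLAGS_ANYWHERE);
    return kr == KERN_SUCCESS ? (char *)addr : NULL;
}
static void region_free(char *p, size_t size)
{
    vm_deallocate(mach_task_self(), (vm_address_t)p, size);
}
#else
#include <sys/mman.h>
/* Stand-in (assumption): anonymous mmap is the closest portable analogue. */
static char *region_alloc(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (char *)p;
}
static void region_free(char *p, size_t size)
{
    munmap(p, size);
}
#endif

/* Time `iterations` allocate/free cycles of `pages` pages each.
 * If `touch` is nonzero, write one byte to every page so that each
 * page is faulted in (zero-filled) before the region is freed.
 * Returns elapsed seconds, or -1.0 if an allocation fails. */
double run_loop(size_t pages, int touch, long iterations)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t size = pages * page;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iterations; i++) {
        volatile char *p = region_alloc(size);
        if (p == NULL)
            return -1.0;
        if (touch)
            for (size_t off = 0; off < size; off += page)
                p[off] = 1;
        region_free((char *)p, size);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

In this sketch the four loops would be run_loop(1, 0, n), run_loop(1, 1, n), run_loop(2, 0, n) and run_loop(2, 1, n) with n = 1000000, after a short untimed call as warm-up. If one assumes the object cost is the same for one and two pages, then (t4 - t3) - (t2 - t1) leaves roughly the cost of one page (fault, zero-fill, and free).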
The measured results
I had to name one entry Anonymous because
it was submitted with 10.3 and AFAIK that's already problematic
with Apple's touchy-feely lawyers...