> As long as the number of "fetches" from memory is the same...
The issue is that the fetches may (and usually do) take different
amounts of time. That is not much of an issue when doing only a small
number of fetches, but when you start talking about millions of fetches,
it adds up. Some machines will even generate an exception/fault if an
unaligned 32 bit fetch is done.
> It might be for space, but I dont get the speed benefit........
A typical example would be a 32 bit memory fetch from addresses with
different alignments. Depending on the architecture (and I'm not
limiting this to x86; consider VAX, Alpha, Sparc, PPC, etc.) there can
be a penalty for unaligned fetches, i.e. for anything other than the
"normal" alignment. Not too far back, 32 bit alignment was the norm. But
now, it is not unusual to see 64 bit alignment as the performance point.
A way to show this is to allocate a large buffer (several thousand bytes
to minimize data prefetching issues) and time a loop (of say a million
iterations) that randomly accesses buffer entries at differing
alignments. I expect you will see the difference in CPU times.
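A rough sketch of that experiment in C might look like the following.
The names (load32, time_loads, run_alignment_test), the 8192 byte buffer
and the little random number generator are all my own choices for
illustration. It uses memcpy for the fetch because casting a misaligned
pointer is undefined behaviour in C; on machines that tolerate unaligned
access the compiler usually turns this into a plain load, but that is an
assumption about your compiler, not a guarantee. How big the difference
is (if any) depends heavily on the CPU and cache behaviour.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE   8192u     /* "several thousand bytes" (assumed size) */
#define ITERATIONS 1000000u  /* "say a million iterations" */

/* Fetch 32 bits from any byte address without invoking undefined
   behaviour; the compiler decides how to implement the copy. */
static uint32_t load32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Time ITERATIONS pseudo-random 32 bit fetches, each from an address of
   the form (multiple of 4) + misalign. Returns CPU seconds. The running
   sum is stored in *sum so the loads cannot be optimized away. */
static double time_loads(const unsigned char *buf, size_t misalign,
                         uint32_t *sum)
{
    uint32_t acc = 0;
    uint32_t state = 12345u;  /* simple linear congruential generator */
    clock_t start = clock();

    for (unsigned long i = 0; i < ITERATIONS; i++) {
        state = state * 1664525u + 1013904223u;
        /* Skip the last slot so slot * 4 + misalign + 4 stays in bounds. */
        size_t slot = (state >> 8) % (BUF_SIZE / 4 - 1);
        acc += load32(buf + slot * 4 + misalign);
    }

    clock_t end = clock();
    *sum = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Run the timing once for each byte offset 0..3 and print the results.
   Returns 0 on success, -1 if the buffer could not be allocated. */
static int run_alignment_test(void)
{
    unsigned char *buf = malloc(BUF_SIZE);  /* malloc memory is aligned */
    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < BUF_SIZE; i++)
        buf[i] = (unsigned char)i;

    for (size_t misalign = 0; misalign < 4; misalign++) {
        uint32_t sum;
        double secs = time_loads(buf, misalign, &sum);
        printf("offset %lu: %.6f s (checksum %lu)\n",
               (unsigned long)misalign, secs, (unsigned long)sum);
    }
    free(buf);
    return 0;
}
```

Call run_alignment_test() from main and compare the four timings; offset
0 is the aligned case. Raising ITERATIONS makes small differences easier
to see above the timer resolution.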
Most folks never realize this is even happening because the compilers
tend to hide it by generating aligned references. It usually only shows
up when data gets sent over the wire, which is why you might see the
extra bytes.
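To see the extra bytes for yourself, something like this small C sketch
should do. The struct and function names are made up for illustration.
The fields add up to 7 bytes, but the compiler is free to insert padding
so that each member lands on its natural alignment; on many common ABIs
sizeof comes out larger than 7, though the exact layout is up to your
compiler and target.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record: 1 + 4 + 2 = 7 bytes of actual data. */
struct record {
    uint8_t  tag;
    uint32_t value;  /* compiler typically pads before this member */
    uint16_t flags;  /* and after this one, to round up the size */
};

/* Number of padding bytes the compiler added to struct record. */
static size_t padding_bytes(void)
{
    return sizeof(struct record) - 7;
}

/* Print where each member actually lives inside the struct. */
static void show_layout(void)
{
    printf("tag at %lu, value at %lu, flags at %lu\n",
           (unsigned long)offsetof(struct record, tag),
           (unsigned long)offsetof(struct record, value),
           (unsigned long)offsetof(struct record, flags));
    printf("sizeof = %lu, padding = %lu\n",
           (unsigned long)sizeof(struct record),
           (unsigned long)padding_bytes());
}
```

If you write such a struct straight to a socket or file, the padding goes
along with it, which is where the unexpected bytes on the wire come from.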
- Mark