"Razii"
news:epp2049fqpfctcookaqsnouvgjo0uiathh@4ax.com...
> On Sat, 12 Apr 2008 07:39:11 -0700, "Chris Thomasson"
>
>
>>That is if the JVM happens to use counter(s) off a base to implement their
>>allocator.
>
> from IBM site...
>
[...]
I know that the most scalable, most performant memory allocator designs are
able to hit a fast-path into a per-thread heap. No interlocked RMW, no
memory barriers, just plain load/store instructions. Java can use this
design. C/C++ can as well. End of story. The fast-path, and even the first
and second level slow-paths can be lock/wait-free. What I mean by
first/second level "slow-path" can be outlined like this:
Very High-Level Allocation Request Outline
_____________________________________________
1. Try per-thread heap. - (no atomics and/or membars)
2. Try remote gather. - (can be no atomics and/or membars, otherwise an
atomic SWAP is needed).
3. Try per-cpu heap - (atomics and/or membars)
4. Try global heap. - (atomics and/or membars)
5. Ask OS! - (CRAP!)
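As a rough sketch only, the allocation fallback chain above could look something like the following in C++. Everything here is an assumption made for illustration: the names (Block, ThreadHeap, GlobalHeap, alloc_block), the single fixed block size, the mutex guarding the global heap, and std::malloc standing in for the OS. The per-cpu level (step 3) is left out to keep it short.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// Hypothetical fixed-size free block; all names are illustrative.
struct Block { Block* next; };
constexpr std::size_t kBlockSize = 64;

struct ThreadHeap {
    Block* local = nullptr;               // touched only by the owning thread
    std::atomic<Block*> remote{nullptr};  // blocks handed back by other threads
};

struct GlobalHeap {
    std::mutex lock;                      // assumption: a mutex keeps the sketch simple
    Block* head = nullptr;
};

void* alloc_block(ThreadHeap& th, GlobalHeap& gh) {
    // 1. Per-thread heap: plain loads and stores, no atomics or membars.
    if (Block* b = th.local) {
        th.local = b->next;
        return b;
    }

    // 2. Remote gather: a relaxed load checks for work, and a single
    //    atomic SWAP takes the whole remote list when there is some.
    assert(th.local == nullptr);
    if (th.remote.load(std::memory_order_relaxed) != nullptr) {
        Block* list = th.remote.exchange(nullptr, std::memory_order_acquire);
        if (list != nullptr) {
            th.local = list->next;        // keep the rest for later fast-path hits
            return list;
        }
    }

    // 3. Per-cpu heap: omitted in this sketch.

    // 4. Global heap: synchronized access.
    {
        std::lock_guard<std::mutex> guard(gh.lock);
        if (Block* b = gh.head) {
            gh.head = b->next;
            return b;
        }
    }

    // 5. Ask the OS; std::malloc stands in for it here.
    return std::malloc(kBlockSize);
}
```

The point of the layering is that only steps 2 and later ever pay for an atomic operation, and only step 5 leaves the process.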
Very High-Level Deallocation Request Outline
_____________________________________________
1. Determine if request was allocated by calling thread. If so goto step 2,
otherwise goto step 3.
2. free to per-thread heap. Done. - (no atomics and/or membars)
3. free to remote-thread heap. - (can be no atomics and/or membars,
otherwise an atomic CAS is needed).
4. On overflow, free to per-cpu heap. - (atomics and/or membars)
5. On per-cpu heap overflow, free to global heap. - (atomics and/or
membars)
6. On global heap overflow, free to OS! (CRAP!)
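Again as a sketch under stated assumptions, the deallocation steps above might be written like this in C++. The names (Node, Heap, Overflow, free_block), the owner pointer stored in each block, and the tiny kLocalLimit are all hypothetical. The per-cpu, global and OS levels (steps 4 to 6) are collapsed into one mutex-protected Overflow list for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>

struct Heap;

// Hypothetical block header recording which heap allocated the block.
struct Node {
    Node* next;
    Heap* owner;
};

struct Heap {
    Node* local = nullptr;                // owner-only free list
    std::size_t local_count = 0;
    std::atomic<Node*> remote{nullptr};   // lock-free LIFO filled by other threads
};

// Stand-in for the per-cpu/global/OS levels (steps 4 to 6).
struct Overflow {
    std::mutex lock;
    Node* head = nullptr;
    std::size_t count = 0;
};

constexpr std::size_t kLocalLimit = 2;    // deliberately tiny for demonstration

void free_block(Heap& self, Overflow& spill, Node* n) {
    assert(n != nullptr && n->owner != nullptr);

    // 1. Was the block allocated by the calling thread's heap?
    if (n->owner == &self) {
        // 4. On overflow, pass the block to the next level (synchronized).
        if (self.local_count >= kLocalLimit) {
            std::lock_guard<std::mutex> guard(spill.lock);
            n->next = spill.head;
            spill.head = n;
            ++spill.count;
            return;
        }
        // 2. Free to the per-thread heap: plain loads and stores only.
        n->next = self.local;
        self.local = n;
        ++self.local_count;
        return;
    }

    // 3. Free to the remote thread's heap: push with an atomic CAS loop.
    Heap& owner = *n->owner;
    Node* head = owner.remote.load(std::memory_order_relaxed);
    do {
        n->next = head;                   // a failed CAS reloads head for the retry
    } while (!owner.remote.compare_exchange_weak(
        head, n, std::memory_order_release, std::memory_order_relaxed));
}
```

Because the owner only ever takes the whole remote list at once, the push-only CAS loop does not run into the usual ABA trouble of a lock-free stack with pops.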
Anyway Razii, your initial assertion that calls to malloc/new in C/C++ always
hit the OS is misleading, and false. Java, C/C++, .NET, whatever can use
very highly optimized memory allocation algorithms. IMVHO, GC is not all
that relevant here...