A garbage collector for C++

A garbage collector for C++

Post by brittanyfo » Sun, 23 Apr 2006 20:12:03

Hi, all!

Something interesting, Maybe you wanna to know:

http://www.yqcomputer.com/ ++/browse_thread/thread/95007ffdf246d50e/1288c3a825236840?lnk=st&q=lelandguo%40yahoo.com.cn&rnum=2&hl=zh-CN#1288c3a825236840


A garbage collector for C++

Post by David Hopw » Mon, 24 Apr 2006 00:06:26

It's a bit late for April 1st, I'm afraid.

David Hopwood < XXXX@XXXXX.COM >


A garbage collector for C++

Post by brittanyfo » Mon, 24 Apr 2006 11:58:30

I do not think so.
Could you tell me the reason why you think it is a April 1st joke?


A garbage collector for C++

Post by David Hopw » Tue, 25 Apr 2006 00:05:23

If it's not a joke, then it's a remarkable breakthrough. However, it seems
to claim to do several things that are impossible when taken together.
For example:

Support C++ raw pointer, unions, bit-fields and hidden pointers.

By definition, a "hidden pointer" is a pointer that the GC does not
know about.

Even conducting an accurate tracing, the system does not require
any special information from compiler. Application can use any
standard C++ compiler, such as Visual C++ 8 and GCC.

This isn't consistent with being able to support raw pointers and unions.
The way compilers like VC++ and GCC compile code that uses unions, for
instance, means that there simply isn't sufficient information to do
accurate tracing: the indication of which branch of the union is currently
valid is not stored anywhere, and the branches differ in which locations
are pointers.

If there is no concurrently running scavenging action, there is no
write-barrier overhead, no strong memory ordering requirement, no
synchronization overhead.

There would have to at least be a write barrier that detects whether or
not the collector is running. (Well, unless you run completely different
code when the collector is not running, but that wouldn't necessarily be
more efficient.)

The memory usage is very efficient, acyclic garbage is reclaimed
when the last reference to it is removed. [...]
It is a fully accurate tracing garbage collector. [...]
The runtime cost of application threads is far less than a normal
reference counting. [...]

TANSTAAFL. If you want to do tracing *and* immediate reclamation of
garbage when the last reference is removed, then the only known ways of
doing that incur the overheads of both tracing and reference counting.

In general, the list reads as if someone (with a pretty good knowledge of
the field) had imagined what a feature list would look like if all of the
hard problems in memory management were all solved at once, without having
to make any trade-offs.

It could be that I'm wrong about the incompatibility of some of these features,
but in that case, I would expect that there would be a series of theoretical
papers solving individual problems, leading up to a practical implementation
after several years.

Having said that, if you drop the ability to use standard compilers,
accept some overhead for write barriers and to support deterministic
finalization, and allow deviations from standard C++ semantics for some
of the features it inherited from C, then the rest could be possible
(although still difficult). It's not clear that the value of deterministic
finalization is worth the necessary performance cost, in an otherwise
GC'd language, though.

David Hopwood < XXXX@XXXXX.COM >

A garbage collector for C++

Post by Chris Thom » Tue, 25 Apr 2006 04:14:01

Here is how he is apparently allowing the user to pass compile-time
information to the collector:

http://www.yqcomputer.com/ ++.moderated/msg/183d0a06b54bef68?hl=en

He is using smart pointers to keep objects alive. So a node in a shared
collection would have to use this smart pointer for its links. Traversals
would have to utilize his smart pointer logic for every node visited in any
traversal. Also, I don't think his smart pointers play to well with
lock-free algorithms; they be me somewhat limited... Humm...

Here is my take on what he is doing:

http://www.yqcomputer.com/ ++/browse_frm/thread/95007ffdf246d50e/7a5abe762a470113?hl=en#7a5abe762a470113

In general, I don't think his system will scale "too well" wrt large shared
linked data-structures...

A garbage collector for C++

Post by lelandgu » Tue, 25 Apr 2006 10:50:42

avid Hopwood wrote :

That is prior-art definition. Herein a hidden pointer is a reference to
another object while programmers cannot easily identify its location.
For example, suppose there is a binary ActiveX component, programmers
definitely know it will reference some other objects. The "Set" and
"Clear" operations of this reference are controllable by programmers.
Hence, the reference is a "hidden pointer", since you cannot tell where
the pointer is. However, in this system, we can trace through this
native object as part of global reference map. There is no different
from other pure GC objects.

All listed main features are provided in a single embodiment. I have
compiled it and run well under VC and GCC. Raw pointers are treated as
weak reference or dirty reference. It is fast but cannot keep an object
alive. Maybe 50%-80% of pointers, I guess, in an application could use
raw pointers and worked as usual as conventional C++. In this system,
the relationship of an object is not fixed as Java or C#, e.g. the
programmers can dynamic (at run-time) tell the GC system how to
interpret the specific object. There is an example of traverse function
at my reply to Larry:

Correct but not all. The mutators have to do something to detect
whether or not the collector is running. However, this job is much
lighter than normal write barrier, such as SSB, or Card, and even
hardware Write Watch is not necessary. Detecting a collector is just
reading a flag with some tricky mechanism to avoid strong memory
ordering and synchronization overhead. In this system, detecting a
collector does not cause any these heavy overhead.

You do not have to run completely different code. The application code
is the same while library code path may change depending on whether or
not a collector is running. Unlike modern Java or C# implementation,
which always do write barrier work and slow down the whole system, my
implementation only apply write barrier at scavenge time. Software
write barrier always lead to costly "write" operation. Hardware write
barrier is too conservative, and become even worse if future computer
has larger page size. Detecting a collector is just "read" operation,
and no strong memory-ordering requirement in this system. Actually, if
the application does not explicit invoke GC operation and disable
system GC, the mutators run very fast with acyclic garbage
auto-collecting, because I use another technique to reduce most
reference counting overhead. That is why I say, the overhead is far
less than a normal reference counting. It does not related to write
barrier overhead, it relates to reducing reference counting overhead.

As I describe above, the reference counting overhead is significantly
reduced, and the tracing write barrier overhead is left only to
scavenging time. Therefore, the whole cost is not simple sum of normal
reference counting and write barrier.

This is not an imagined feature list. It is a list of a real
implementation. All features are combined together without constraint
to each other.

Could you tell me more detail about to what C/C++ semantics should pay

Mingnan G.