Lock-free algorithms and reference counting

Lock-free algorithms and reference counting

Post by Joe Seig » Sat, 11 Nov 2006 02:31:16


Does anyone know of any write lock-free algorithms where
on a modification it's not known whether an object in a
collection has been removed or not? For example with
lock-free fifo and lifo queue when you remove a node
it's no longer in the queue.

I'm wondering whether refcounting buys you anything.
It would seem to be unnecessary overhead. I'm
thinking of dropping or rather, not putting it in,
support for refcounting in RCU+SMR. I can't think
of a reason why it would be needed.

This is not about refcounting in general or for other
specific uses.


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
 
 
 

Lock-free algorithms and reference counting

Post by Joe Seig » Wed, 15 Nov 2006 04:34:53


Well refcounting support in RCU+SMR is out then.

I tested a proxy collector using RCU+SMR and realized that
you need acquire and release membars since you can't
use dependent load on the proxy since it doesn't point
into the data struture. That sort of slows it down
to where it's only as fast or 10% faster than refcounted
proxy collector.

I also figured out how to make proxy collectors support
write lock-free, not just read lock-free. Except the
only write lock-free algorithms I can think of off
hand are lock-free lifo and fifo queues where non
proxy PDR would be better.

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

 
 
 

Lock-free algorithms and reference counting

Post by Chris Thom » Thu, 16 Nov 2006 09:49:28


My patent application has something called 'distributed proxy reference
counting (DPRC)' which makes use of per-thread counter arrays. The polling
logic takes this information into account before I makes any decisions wrt
calling a deferred objects "status" function pointer. It only checks the
per-thread counters after epoch, therefore reader threads can modify their
counter arrays without membars in the actual proxy critical-region. The DPRC
algorithm describes the "proxy critical-region" in the abstract so that it
can be used with proxy collector logic that is based on vZOOM, RCU+SMR, RCU,
SMR, atomic counting, ect ect...





Our collector algorithms are already work with lock-free stack or queue,
hashed stack/queue ect... Are you referring to a special kind of writer
algorithm?



Read-Only Iteration: lock-free lifo
---------
ref = acquire proxy-ref
traverse lock-free stack ...
release proxy-ref(ref)




Writer Produce :
lock-free lifo push
---------
normal writer lock-free stack push ...



Writer Consume :
lock-free lifo pop
---------
ref = acquire proxy-ref
do writer lock-free stack ...
which implys reads...
to pop an objX...
ref->defer_next_epoch(objX)...
release proxy-ref(ref)...

That's how I would do it.
 
 
 

Lock-free algorithms and reference counting

Post by Chris Thom » Thu, 16 Nov 2006 10:36:53


I think I posted some pseudo-code that described DPRC some more on this
group a while ago...
 
 
 

Lock-free algorithms and reference counting

Post by Chris Thom » Thu, 23 Nov 2006 11:10:46


For lock-free linked-lists you have to identify the nodes removed from it as
logically deleted, or actually deleted. This information is kind of required
for any concurrent operations that either push or remove nodes...





Well, I created by reference counting algorithm so that I could search
through a data-structure and return multiple references to nodes that
matched the search criteria. For instance, I can do searches like this:


// search results
struct my_result_t {
// list of refs to matching nodes
node *front;
};

int my_multi_search(list_t *l, my_result_t *r, // criteria) {
int count = 0;
r->front = 0;

do {
// search, inc count and fill the result
// with references to every node we find.

} while(searching());

return count;
}




Do you know of any other reference counting that can do this without using
any atomic operations or memory barriers? IMHO, very lightweight reference
counting, like DPRC, can be fairly useful when you want to return results
from searches, and other various activities..
 
 
 

Lock-free algorithms and reference counting

Post by Joe Seig » Thu, 23 Nov 2006 12:21:47


Right, so refcounting wouldn't be of any additional benefit since
you already know what node has to be deallocated at some point.

[...]

Read references are a different thing. I can handle them without having to
put in any special logic for RCU+SMR. I just need to document how it's
used. Also maybe put together some PDR based refcount pointers, the don't
increment if zero ones like rcuref has.

It's all on the list of stuff to do at some point.


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
 
 
 

Lock-free algorithms and reference counting

Post by Chris Thom » Fri, 24 Nov 2006 13:37:04


I think I just came up with a way to do efficient 100% lock-free atomic
reference counting without DWCAS! It works on SPARC 32/64! Humm, not sure
what to do with it just yet...

:O
 
 
 

Lock-free algorithms and reference counting

Post by Chris Thom » Mon, 27 Nov 2006 03:09:38


[...]


Can you return an arbitrary number of references to nodes per-search? Did
you create a dynamic hazard pointer implementation? (e.g., adjust number of
hazards at runtime.)

I found a way around extending the SMR algorithm in order to return
plurality of references to plurality of shared data-structures per search...
 
 
 

Lock-free algorithms and reference counting

Post by Joe Seig » Mon, 27 Nov 2006 03:32:11


You can do PDR based refcounting, e.g. w/ RCU (rcuref), or with SMR or whatever.
The stuff I dropped support for was using refcounting to manage the global
references rather than the read references by threads. I don't think it's
needed since you know when the last global reference is dropped anyhow.

You would want some form of refcounting for long term read references by
threads since using large numbers of hazard pointers for this is problematic
not only from the point of managing large numbers of hazard pointers but
from the fact they're not really copyable (they are but only in the same
order the hazard pointer scan is in, which probably won't be convenient
for the application).


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
 
 
 

Lock-free algorithms and reference counting

Post by Chris Thom » Mon, 27 Nov 2006 03:59:08


Using atomic operations right? I got a way to do it without them, or memory
barriers... Its heavily distributed per-thread reference counting.
Per-thread proxy counting... E.g., more than one object can share the same
proxy counter... The PDR scan logic takes this into account by iterating a
generation of deferred objects against the per-thread counters... A very
simple hash on the pointer value gets to the counter. The counter is can
also be in per-object metadata... It allows a thread to grab multiple
references to an arbitrary number of nodes. The references are also
copyable. This is done through an indirection of the proxy counters...

Also, I should mention another form of reference counting I invented here:

http://www.yqcomputer.com/

What do you think of the reference counting algorithm I created for my
memory allocator?



P.S. -- This allocator is one of the fastest and most scaleable memory
allocators out there... It was included in my initial round vZOOM
submission... It outperforms Hoard by a wide margin... It outperforms these
algorithms by a wide margin:

http://www.yqcomputer.com/


http://www.yqcomputer.com/


http://www.yqcomputer.com/


Luckily, the patent application for this is underway... I had to do it. This
memory allocator is $FUXKING blazing fast!







Any thoughts?
 
 
 

Lock-free algorithms and reference counting

Post by Joe Seig » Mon, 27 Nov 2006 04:30:32


The problem with proxy collectors is their granularity. With refcount
based pointers you can keep the graularity to a minimum. And they
would be used for relatively long references so overhead probaby isn't
a big issue.



--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
 
 
 

Lock-free algorithms and reference counting

Post by Chris Thom » Mon, 27 Nov 2006 04:48:05


[...]
[...]


The granularity of my reference counting is comparable with normal reference
counting. I can use the counting to make objects persist outside of a proxy
garbage collected region. Then the granularity of the proxy collector is not
that big of an issue. However, I see your point.





Well, if they are frequently and perhaps persistently searching and
operating on the references returns from the search, then the overhead would
become an issue, no?
 
 
 

Lock-free algorithms and reference counting

Post by Joe Seig » Mon, 27 Nov 2006 04:58:54


How many simultaneous references are going to be held and for how long?
The proxy stuff I usually make scoped so you know the interval over
which the references are held. You can make the interval as long as
you want but you tie up resources longer. If that's not acceptable
then you go to reference counting. I'm keeping it simple with a few well
specified trade-offs for now. If I make it more complicated I don't
have a realistic way of testing it. I'm pushing it on the stuff I
have now.


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
 
 
 

Lock-free algorithms and reference counting

Post by Chris Thom » Mon, 27 Nov 2006 05:22:45


It had a race-condition... Luckily, it was fixable... Humm. This counting
algorithm seems to be working out... Humm...
 
 
 

Lock-free algorithms and reference counting

Post by Chris Thom » Mon, 27 Nov 2006 05:32:34


Maybe 20, 30, 100 nodes meet the criteria for a search... Perhaps I need to
operate on all of them. Maybe I release all but one, and hold it for a
longer period of time. I was thinking of a usage pattern like this contrived
example:


--------------

// local vzoom object
vzoom_t *vzthis = vzoom_self();

for (;;) {

// search return
node_t *result[0x100];
node_t *holds[0x100];
int depth, hdepth = 0;

// so the search in collected region
vzoom_proxy_t *ref = vzthis->acquire_proxy(); {
search(vzthis, &result, &depth);
ref->release();
}

// examine an work on the results
for(int i = 0; i < depth; ++i) {
if (search_op_result(result[i])) {
vzoom_copy(vzthis, &result[i], &holds[hdepth]);
++hdepth;
}
}

// release work
vzoom_proxy_t *ref = vzthis->acquire_proxy(); {
ref->drop_and_release(result[i]), depth);
}

// do something else...

// whatever...

// release work
vzoom_proxy_t *ref = vzthis->acquire_proxy(); {
ref->drop_and_release(holds[hdepth], hdepth);
}

// do something else...

// whatever...
}






Yes. I use lightweight refcounting to get objects to persist outside of the
proxy scope.





Right. It just, what kind of reference counting algorithm? This can all be
done without atomic operations...





I hear ya.

;^)