data flow programming in c# (or any lang with coroutines)?

data flow programming in c# (or any lang with coroutines)?

Post by falco » Wed, 26 Jul 2006 15:17:02


Hi,
I am trying to understand if a language with 'yield' functionality such
as C# (yield==coroutines?) can be used to model the 'future and
promise' functionality of langauges such as Mozart/Oz and Alice ML.

I have a specific scenario in mind. If I have an array or a collection
of values, I can loop over it (or use fold/filter/map in FP) to return
another transformed set of data, aggregate the data according to some
function, etc. What if I have an unbounded list, where values are
continously being added to it and I need to process each new datum as
it comes in? The image I have in my mind is of a stock market
application which needs to react to incoming stock quotes.

In Mozart/Oz, I can apply a function to a static list as well as a
continously updated list without changing too much of the code. (I
think I just need to change how the list itself is defined, not how the
functions are applied to it).

Obvsiouly I can set up a queue which is consumed on one end and
extended on the other, wire up the consumer and the producer to set
some flag or trigger some event so the producer and consumer can
communicate.

However, as with so many other things, I can't help but wonder if
coroutines, delegates, generics, etc. can't be combined to produce a
simpler 'data flow' or 'concurrent' 'pattern'?

Thanks.
 
 
 

data flow programming in c# (or any lang with coroutines)?

Post by Barry Kell » Wed, 26 Jul 2006 17:49:41


C#'s yield is only half a coroutine. You can't have two iterators
calling each other like you can with full coroutines.


C#'s yield works strictly on a pull-basis. Yield return / yield break is
converted by a mechanical transformation into a state machine
implemented by a class that implements IEnumerable and probably also
some IEnumerable<T>.

In order to get the push behaviour you're talking about, you either need
to pull from the other end, as it were, or create a buffer along with a
processing thread so that whatever is pulling from the IEnumerator (that
they got from an IEnumerable created from the yield etc.) gets values
non-lazily.


C#'s iterator feature (aka yield etc.) is quite composable, but the way
to think about the composability is that you make a number of calls to
create a flow graph, and you then create instances of the flow graph.

For example, given an iterator that returns a somewhat infinite stream
of integers:

IEnumerable<int> Ints()
{
int x = 0;
for (;;) yield return x++;
}

And given a filter creator:

IEnumerable<T> Filter<T>(IEnumerable<T> source, Predicate<T> filter)
{
foreach (T item in source) if (filter(item)) yield return item;
}

And given a predicate which checks for even numbers:

bool IsEven(int value)
{
return (value % 2) == 0;
}

... one can then compose a graph:

IEnumerable<int> evens = Filter(Ints(), IsEven);

... but the "graph" as it were isn't actually instantiated until you
call GetEnumerator() on it, or do so indirectly via 'foreach'.

I myself have put together a few experimental libraries that combine
threading, background operations, marshalling to the UI etc. to create
dataflow graphs along these patterns. It can work well enough, as long
as you don't try to get too clever.

It's easy in C# to create the equivalent of "dataflow variables" in the
style of Oz etc., but it's also very easy to get a deadlock due to the
way they mix with stateful imperative programming. Also, OS-scheduled
threads are not cheap.

-- Barry

--
http://www.yqcomputer.com/

 
 
 

data flow programming in c# (or any lang with coroutines)?

Post by falco » Thu, 27 Jul 2006 01:39:09

arry,
Thanks for the great reply. You mentioned logic variables. I read
another blog post which talked about doing logic programming through
C#'s iterators. Is there a reference which discusses these
non-imperative programming paradigms through a mainstream imperative
language such as C# or Java?

I keep reading about these interesting things, I can even start to
understand the importance of concurrent/logic/constraint programming,
but I wish there were more examples of how to bend C#/Java to provide
as much of their functionality as possible (for those of us who can't
use prolog/oz, etc.).



Barry Kelly wrote:

 
 
 

data flow programming in c# (or any lang with coroutines)?

Post by falco » Thu, 27 Jul 2006 01:47:41

By the way, the main problem remains. While this function will provide
an infinite list of numbers:

IEnumerable<int> Ints()
{
int x = 0;
for (;;) yield return x++;

}

I don't see how one can push arbitrary values (received at run-time),
something like the following:

IEnumerable<int> Ints(int someNumber)
{
int x = 0;
for (;;)) yield return someNumber;

}

Obviously the above method will return the same number each time and
another piece of code can repeatedly call Ints to pass different values
to it. Is that what you meant by C# not having full coroutines?
 
 
 

data flow programming in c# (or any lang with coroutines)?

Post by Barry Kell » Thu, 27 Jul 2006 02:09:57


Only in the context of Oz's dataflow variables, as described by
"Concepts, Techniques, and Models of Computer Programming", by van Roy
and Haridi - they effectively just block execution if the variable is
read and the value hasn't been assigned yet.

I recommend that book, by the way, it's good.


Links?


In many ways, it would be counter-productive, because idioms which are
natural and even efficient in a language specifically designed for a
paradigm can look remarkably inelegant and actually be inefficient when
described in many imperative languages. The ideas tend to move up into
other, higher-level abstractions (i.e. not in the language, the lowest
level of all), where their didactic value gets lost in the noise.


I don't know a whole pile of stuff myself about the logic / constraint-
based programming, apart from the combinatorial approaches with
backtracking and constraints to prune the search space. To my eyes, many
of the logic-oriented paradigms are just ways of creating more efficient
yet essentially brute-force solutions. Not that that's bad, of course,
it's often a good use of the computer's power for problems that are hard
or not worth deeply analysing to create custom solutions, reduce amount
of code to maintain etc.

-- Barry

--
http://www.yqcomputer.com/
 
 
 

data flow programming in c# (or any lang with coroutines)?

Post by Barry Kell » Thu, 27 Jul 2006 02:57:41


You can't "push" - I thought I explicitly mentioned that C#'s iterators
were strictly pull-only. To get 'push' behaviour, you have to pull from
the other end, or use a separate thread to process values and add them
to a buffer, and have the iterator block when there's no values. (I'm
just repeating my original post here.)

Basically, you stick a producer / consumer queue in there.


When you use 'yield' it means "save up the state of this 'thread' and
return to the caller". When you next call MoveNext() on the IEnumerable,
the iterator picks up where it left off.

With coroutines, you've got a primitive that means something more like
"save state of this 'thread' and jump to this location", except the
location is held in a variable and the variable's value was created by a
similar "save the state of this 'thread'" or "create new state based on
a call to this function".

In order to get the full coroutine power in an imperative language, each
coroutine needs a separate stack. That, in many ways, makes them similar
to user-mode manually-scheduled threads. To that end, Win32's Fiber
feature can be used to implement coroutines.

Check out http://www.yqcomputer.com/ (C++).

-- Barry

--
http://www.yqcomputer.com/
 
 
 

data flow programming in c# (or any lang with coroutines)?

Post by Barry Kell » Thu, 27 Jul 2006 03:02:34


Also, there's a subtle misconception here: the call to Ints() in the
original code returns an object which *can* *create* a 'dataflow object
graph' of linked IEnumerator objects. Note that C#'s iterator feature
can automatically generate classes which implement both IEnumerator and
IEnumerable (in separate classes), or just IEnumerator. There is a
distinction between IEnumerable and IEnumerator: you use an instance of
IEnumerable to create an instance of IEnumerator. Note that the above
function returns an IEnumerable instance, not an IEnumerator instance.

-- Barry

--
http://www.yqcomputer.com/
 
 
 

data flow programming in c# (or any lang with coroutines)?

Post by falco » Thu, 27 Jul 2006 03:38:17

Sorry, I didn't have a link to this blog entry when I was typing the
message.

Iterators and Nondeterminism
http://www.yqcomputer.com/
LINQ is turning out to be great!
 
 
 

data flow programming in c# (or any lang with coroutines)?

Post by Barry Kell » Thu, 27 Jul 2006 04:13:12


Yes, basically he's talking about using nested foreach's over multiple
iterators to generate combinatorial sequences. He takes advantage of the
composability of iterators to build some primitives similar to Prolog.

That approach is still only useful if your problem is fundamentally
search based, though, unless there's something I'm not aware of.

-- Barry

--
http://www.yqcomputer.com/