Sebastian Millies: Dev-Blog

Lazy tree walking made easy with Kotlin coroutines

2016-10-21T00:59:00.001+02:00

Let's imagine a simple unbalanced binary tree structure, in which an abstract BinaryTree<E> is either a concrete Node labelled with a value attribute of type E and having a left and right subtree, or a concrete Empty unlabelled tree. With trees, one very common requirement is to traverse the nodes in some appropriate order (preorder, inorder, or postorder).

In this post, we are going to consider how to do that lazily. That is, we want to be able to short-circuit the traversal, visiting only as many nodes in the tree as we require to see. Moreover, we'd prefer a single-threaded solution.

It's actually not trivial to do that in Java. With regard to iteration, you have to keep track of a lot state. (Cf. these explanations or look at the source code of java.util.TreeMap) You might scrap single-threadedness and actually have two threads communicating over a blocking queue, but that also entails some complexity, especially with task cancellation on short-circuiting.

And even when turning to stream-based processing instead of iteration, it doesn't quite work out as expected. Here's a proposed solution in a hypothetical BinaryTreeStreamer class:

/**
 * Supplies a postorder stream of the nodes in the given tree.
 */
public static <E> Stream<Node<E>> postorderNodes(BinaryTree<E> t) {
    return t.match(
                empty -> Stream.<Node<E>> empty(),
                node -> concat(Stream.of(node.left, node.right).flatMap(BinaryTreeStreamer::postorderNodes),
                               Stream.of(node)));
}

The corresponding inorder- or preorder-traversals would be similar. The technique for structural pattern-matching with method BinaryTree#match() goes back to Alonzo Church and is explained in more detail on Rúnar Bjarnason's blog. Basically, each subclass of BinaryTree applies the appropriate function to itself, i. e. Empty invokes the first argument of match, and Node the second.

The code above looks quite reasonable, but unfortunately it is broken by the same JDK feature/bug that I mentioned over a year ago in this post. Embedded flatMap just isn't lazy enough, and breaks short-circuiting. Suppose we construct ourselves a tree representing the expression (3 - 1) * (4 / 2 + 5 * 6). I'll use this as an example throughout this article. Then we start streaming, with the aim of finding out whether the expression contains a node for division:

boolean divides = BinaryTreeStreamer.postorderNodes(tree).filter(node -> node.value.equals("/")).findAny().isPresent();

which leads the code to traverse the entire tree down to nodes 5 and 6. And anyway, we are no closer to an iterating solution.

Now in Python, things look quite different. The thing is, Python has coroutines, called generators in Python. Here's how Wikipedia defines coroutines:

Coroutines are computer program components that generalize subroutines for nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations.

In Python you can say "yield" anywhere in a coroutine and the calling coroutine starts up again with the value that was yielded. Coroutines are like functions that return multiple times and keep their state (which would include the values of local variables plus the command pointer) so they can resume from where they yielded. Which means they have multiple entry points as well. So here's a Python solution to our problem, with the defaultdict as the tree implementation, using value, left and right as the dictionary keys. (For a presentation that goes a bit beyond our simple example, e. g. see Matt Bone's page.)

tree = lambda: defaultdict(tree)

def postorder(tree):
    if not tree:
        return
    
    for x in postorder(tree['left']):
        yield x
        
    for x in postorder(tree['right']):
        yield x
        
    yield tree

One thing to note is that we must yield each value from the sub-generators. Without that, although the recursive calls would dutifully yield all required nodes, they would yield them in embedded generators. We must append them one level up. That corresponds to the successive flat-mapping in our Java code. Here's how we can enumerate the first few nodes of our example tree in postorder. I also show a bit of the Python tree encoding.

expr = tree()
expr['value'] = '*'
expr['left']['value'] = '-'
expr['left']['left']['value'] = '3'
expr['left']['right']['value'] = '1'
…
node = postorder(expr)  
print(next(node)['value'])
print(next(node)['value'])
print(next(node)['value'])

Many other languages besides Python have coroutines, or something similar, if not in the language, then at least as a library. Java does not have them, so I started looking for other JVM languages that do. There aren't many. But I found a library for Scala. However, Scala is not a language that Java developers readily embrace. The happier I was to learn that coroutines will be a feature of Kotlin 1.1, which is now in the early access phase.

I had already known of Kotlin. It is fun. It's very like Java, only better in many respects, it is 100% interoperable with Java, and – being developed by JetBrains – has great tool support from IntelliJ IDEA out-of-the-box. It really has a host of nice features. You might want to check out the following articles, which piqued my interest in the language.

Kotlin seems to have gained popularity especially among Android developers, among the reasons being its small footprint and the fact that up to the release of Android Nougat, people had been stuck with Java 6 on Android.

The current milestone is Kotlin 1.1-M04. In Kotlin, unlike Python, yield is not a keyword, but a library function. Kotlin as a language has a more basic notions of suspendable computation. You can read all about it in this informal overview. All that talk about suspending lambdas and coroutine builders and what not may seem somewhat intimidating, but fortunately there are already libraries that build upon standard Kotlin to provide functions that are easy to understand and use.

One such library is kotlinx-coroutines. It contains a function generate that takes a coroutine parameter. Inside that coroutine we can use yield to suspend and return a value, just as in Python. The values are returned as a Kotlin Sequence object. Let me show you my attempt to port the above Python code to Kotlin. I tried to do a faithful translation, almost line by line. That turned out to be pretty straightforward, which I can only explain by guessing that the designers of Kotlin's generate must have been influenced by Python.

fun <E> postorderNodes(t : BinaryTree<E>): Iterable<Node<E>> = generate<Node<E>> {
 when(t) {
  is Empty -> {}
  is Node -> {
   postorderNodes(t.left).forEach { yield(it) }
   postorderNodes(t.right).forEach { yield(it) }
   yield(t)
  }
 }
}.asIterable()

We can seamlessly use Kotlin classes in Java code and vice versa. However, instead of the Kotlin sequence, java.util.Iterable is much nicer to work with on the Java side of things. Fortunately, as shown above, we can simply call asIterable() on the sequence to effect the conversion. So, let BinaryTreeWalker be a Kotlin class that contains the Iterable-returning generator method, and look at some Java code exercising that method:

Iterable<Node<String>> postfix = new BinaryTreeWalker().postorderNodes(expr);
Iterator<Node<String>> postfixIterator = postfix.iterator();

System.out.println(postfixIterator.next().value); 
System.out.println(postfixIterator.next().value); 
System.out.println(postfixIterator.next().value);

For our example tree, this will correctly print the sequence "31-" and visit no further nodes.

Stream-processing is for free, as you can easily obtain a Stream from the Iterable with StreamSupport.stream(postfix.spliterator(), false) That gives you a Java stream based on an iterator over a sequence backed by a Kotlin generator. On that stream, that little snippet looking for a division-node would work as well, only now it would really be lazy.

Seamless integration also means that you are at liberty to migrate any class in a project to Kotlin, while all the rest stays in Java. For example, having written some JUnit tests for the expected behavior of the tree iterator in Java, I could simply keep these tests to verify the Kotlin class, after I had thrown out the Java implementation as insufficient.

You can try this out yourself. The easiest way is to download and install IntelliJ IDEA Community Edition. Then follow the instructions under "How to try it" in the Kotlin blog post. I was able to create a Maven project with dependencies on kotlin-stdlib 1.1-M04 and kotlinx-coroutines 0.2-beta without problems.

Edit 2017-03-01: Today Kotlin 1.1 has been released. The generate method has been moved to the Kotlin standard library under the name of buildSequence. Thus to use it you don't have to depend on kotlinx.coroutines, just import that function from the package kotlin.coroutines.experimental. Here is the release announcement.

In closing, I should mention the Quasar library. I must admit I am not sure of the relation between Quasar and Kotlin coroutines. On the one hand, on the page cited, Quasar claims to "provide high-performance lightweight threads, Go-like channels, Erlang-like actors, and other asynchronous programming tools for Java and Kotlin", on the other hand, this very informative presentation from JVMLS 2016 says that Kotlin coroutines are not based on Quasar, and are in effect a much simpler construct. The distinction here is between stackless and stackful coroutines. However, as the Kotlin blog now says (my emphasis)

suspending functions are only allowed to make tail-calls to other suspending functions. This restriction will be lifted in the future.

this distinction may not be so relevant after all. There seems to be discussion at JetBrains whether to integrate more tightly with Quasar (see this issue). It will be interesting to see how this develops.

Addendum: Just in case you're wondering, no, Kotlin sequences are no lazier than Java streams, the following Kotlin version of the initial Java attempt also traverses the entire tree when trying to find the first division node:

 fun <E> postorderNodes(t: BinaryTree<E>): Sequence<Node<E>> =
            when(t) {
                is Node -> {
                    (sequenceOf(t.left, t.right).flatMap { postorderNodes(it) }
                    + sequenceOf(t))
                }
                else -> emptySequence()
            }

Map Inversion

2016-09-30T12:18:00.000+02:00

Recently, I had occasion to invert a map, i. e. exchange its keys and values. Having read several discussions on Stackoverflow (see here, here, and here), I decided to create my own utility, internally using Java streams. Besides simple maps, I wanted to be able to convert multimaps, both of the Google Guava and "standard Java" kind, the difference being that the entry set of a Guava multimap contains only elements with single values, whereas the entry set of a standard multimap contains an element with a Collection value.

As an example, here's the code for the latter (and most complicated) case:

/**
 * Inverts a map. The map may also be a non-Guava multimap (i. e. the entrySet elements may have collections as values).
 * 
 * @param map the map to inverted
 * @param valueStreamer a function that converts an original map value to a stream. (In case of a multimap, may stream the original
 *            value collection.) Can also perform any other transformation on the original values to make them suitable as keys.
 * @param mapFactory a factory which returns a new empty Map into which the results will be inserted
 * @param collectionFactory a factory for the value type of the inverted map
 * @return the inverted map, where all keys that map to the same value are now mapped from that value
 */
public static <K, V, V1, C extends Collection<K>> Map<V, C> invertMultiMap(Map<K, V1> map, Function<V1, Stream<V>> valueStreamer,
  Supplier<Map<V, C>> mapFactory, Supplier<C> collectionFactory) {
    Map<V, C> inverted = map.entrySet().stream().flatMap(e ->
       valueStreamer.apply(e.getValue()).map(v -> new SimpleEntry<>(v, newCollection(e.getKey(), collectionFactory))))
      .collect(toMap(Entry::getKey, Entry::getValue, (a, b) -> {a.addAll(b);  return a;}, mapFactory));
    return inverted;
}

private static <T, E extends T, C extends Collection<T>> C newCollection(E elem, Supplier<C> s) {
    C collection = s.get();
    collection.add(elem);
    return collection;
}

You can find the complete code, covering some more cases and with some tests, in this Gist. The tests depend on Guava.

Concurrent Recursive Function Memoization

2016-05-03T00:02:00.002+02:00

Recently on concurrency-interest, there has been a discussion triggered by an observation that Heinz Kabutz made about ConcurrentHashMap. The observation being that the following plausible-looking coding is broken:

public static class FibonacciCached {
  private final Map<Integer, BigInteger> cache = new ConcurrentHashMap<>();
  public BigInteger fib(int n) {
    if (n <= 2) return BigInteger.ONE;
    return cache.computeIfAbsent(n, key -> fib(n - 1).add(fib(n - 2)));
  }
}

ConcurrentHashMap livelocks for n ≥ 16. The reason, explained by Doug Lea, is that if the computation attempts to update any mappings, the results of this operation are undefined, but may include IllegalStateExceptions, or in concurrent contexts, deadlock or livelock. This is partially covered in the Map Javadocs: "The mapping function should not modify this map during computation."

Mainly in conversation between Viktor Klang and me, and based on an original idea of Viktor's, another approach was developed that appears workable compared to computeIfAbsent. The approach also harks back to two previous posts in this blog, namely the ones about memoization and trampolining. The idea improves on the memoization scheme by providing reusable, thread-safe components for memoizing recursive functions, and combines that with a trampoline to process the function calls in such a way as to eliminate stack overflows.

I'll present the idea with a few explanatory comments. If you want a step-by-step derivation, you can get that by reading the mail list archives. There are three parts to the code.

First, a general purpose memoizer
.

public static class ConcurrentTrampoliningMemoizer<T, R> {
  private static final Executor TRAMPOLINE = newSingleThreadExecutor(new ThreadFactoryBuilder().setDaemon(true).build());
  private final ConcurrentMap<T, CompletableFuture<R>> memo;

  public ConcurrentTrampoliningMemoizer(ConcurrentMap<T, CompletableFuture<R>> cache) {
    this.memo = cache;
  }

  public Function<T, CompletableFuture<R>> memoize(Function<T, CompletableFuture<R>> f) {
    return t -> {
      CompletableFuture<R> r = memo.get(t);
      if (r == null) {
        final CompletableFuture<R> compute = new CompletableFuture<>();
        r = memo.putIfAbsent(t, compute);
        if (r == null) {
          r = CompletableFuture.supplyAsync(() -> f.apply(t), TRAMPOLINE).thenCompose(Function.identity())
                .thenCompose(x -> {
                   compute.complete(x);
                   return compute;
                });
        }
      }
      return r;
    };
  }
}

Second, a class that uses the memoizer to compute Fibonacci numbers.

public static class Fibonacci {
  private static final CompletableFuture<BigInteger> ONE = completedFuture(BigInteger.ONE);
  private final Function<Integer, CompletableFuture<BigInteger>> fibMem;

  public Fibonacci(ConcurrentMap<Integer, CompletableFuture<BigInteger>> cache) {
    ConcurrentTrampoliningMemoizer<Integer, BigInteger> memoizer = new ConcurrentTrampoliningMemoizer<>(cache);
    fibMem = memoizer.memoize(this::fib);
  }

  public CompletableFuture<BigInteger> fib(int n) {
    if (n <= 2) return ONE;
    return fibMem.apply(n - 1).thenCompose(x ->
           fibMem.apply(n - 2).thenApply(y -> 
             x.add(y)));
  }
}

Third, any number of clients of a Fibonacci instance.

Fibonacci fibCached = new Fibonacci(new ConcurrentHashMap<>());
BigInteger result = fibCached.fib(550_000).join();

As the final usage example shows, the 550.000^th Fibonacci number is about the largest I can get on my box, before all the cached BigInteger values cause an OutOfMemoryError. The Fibonacci instance may be shared among several threads. The whole thing seems to scale OK, as my tests with 2, 4, or 8 threads sharing the same instance indicate.

The formulation of the fib method should be familiar to you from the previous blog entry on memoization, except we're using a different monad here (CompletableFuture instead of StateMonad).

The most interesting thing of course is the memoizer. Here are a few observations:

On a cache miss a container (in the form of a CompletableFuture) is inserted into the cache which will later come to contain the value.
Different threads will always wind up using the same value instances, so the cache is suitable also for values that are supposed to be singletons. Null values are not allowed.
Only the thread that first has a cache miss calls the underlying function.
Concurrent readers/writers won't block each other (computeIfAbsent would do that for instance).
The trampolining happens because we can bounce off the queue in the executor to which supplyAsync submits tasks. The technique has been explained in a previous post. We are just inlining our two utility methods terminate and tailcall, and fibMem is the thunk. I find it interesting to see how we benefit from this even if there is no tail recursion.

I haven't shown a few obvious static imports. The ThreadFactoryBuilder is from Guava.

You might want to pass in an executor to the memoizer from the outside. This would facilitate separate testing of the implementation vs. performance/scalability. However, performance seems to degrade when using anything but a dedicated single thread, so it may not be worth much in practice.

I am sure there are very many ways to improve on the idea, but I'll leave it at that.

Recursive Function Memoization with the State Monad

2016-02-08T21:00:00.000+01:00

The State Monad

What is the State Monad? If you have been following this blog, you already know the answer. In fact, the parser that we have seen in the previous post is one embodiment of it. In general terms, the state monad is just a glorified function that takes a state and computes from that a result value and some new state. Apart from embedding this function, the state monad has a few bells and whistles that help in combining such functions together. In functional programming languages, the state monad is used to simulate external mutable state. The state monad's role is to pass down the state through the calls of the function. The state parameter disappears from the functions actually used.

You have already seen a practical application of this in our parser. In the parser, the state values have been character sequences. We were able to process the input sequence without ever seeing a corresponding parameter in our methods.

Besides parsing, another often-mentioned application of the state monad is recursive function memoization. In this case, the state will consist of pre-computed function values (a memo). We'll see how to memoize a recursive function that computes the Fibonacci numbers without tacking an extra memo argument onto the function (such as a HashMap). It is the state monad that will keep a memo of already computed values between recursive calls.

You can google this stuff. People write about it often. For example, here are two articles that both deal with memoizing Fibonacci numbers, in a fashion similar to but not identical with mine: one in Java by Pierre-Yves Saumont and another in Scala by Tony Morris. But both miss a point I am going to make below, namely how to factor out the memoization logic from the function itself. (Saumont, by the way, is in the process of writing what looks to be an interesting book.)

I'll first show you how memoizing Fibonacci numbers is done, and define a generic memoize method that will enable memoization for any function. Only then will I show you how to derive the underlying state monad implementation through a few little refactorings of our SimpleParser. This way, you won't be inundated with details, having no idea what they lead up to. I'm hoping that the code is clear enough to grasp its intent even without knowing the internals. If it doesn't work for you, try reading this post from the end to the beginning.

And in the last section, I'll give you my evaluation of the whole thing.

Memoizing Fibonacci numbers

I expect you all know from your introduction to algorithms course that the following naive version will blow up in your face:

BigInteger fibNaive(int n) {
    if (n <= 2) {
        return BigInteger.ONE;
    }
    BigInteger x = fibNaive(n - 1);
    BigInteger y = fibNaive(n - 2);
    return x.add(y);
}

One way to make the algorithm feasible is to remember previously computed values.

BigInteger fibMemo(int n, Map<Integer, BigInteger> memo) {
    BigInteger value = memo.get(n);
    if (value != null) {
       return value;
    }
    BigInteger x = fibMemo(n - 1, memo);
    BigInteger y = fibMemo(n - 2, memo);
    BigInteger z = x.add(y);
    memo.put(n, z);
    return z;
}

The above is written in a non-functional style with an explicit extra argument. (I'll name it the explicit version.) You would call it like this, with a memo already containing the first two Fibonacci numbers:

BigInteger fibMemo(int n) {
    if (n <= 2) {
        return BigInteger.ONE;
    }
    Map<Integer, BigInteger> memo = new HashMap<>();
    memo.put(1, BigInteger.ONE);
    memo.put(2, BigInteger.ONE);
    return fibMemo(n, memo);
}

Now wouldn't it be nice if we could get rid of that extra argument and the tedium of looking up and storing values, to derive a memoizing version that is closer to the naive version? And, yes, we can. Simply apply a new function, called memoize, to the original function.

StateMonad<BigInteger, Memo<Integer,BigInteger>> fib(int n) {
    return memoize(this::fib).apply(n-1).bind(x ->              // x = memoize (fib) (n-1)
           memoize(this::fib).apply(n-2).bind(y ->              // y = memoize (fib) (n-2)
           StateMonad.get(_s -> x.add(y))));                    // return x.add(y)
}

Basically, this is only a notational variant of the naive version, a bit difficult to read at first because Java lacks the syntactic sugar other languages have. I have tried to indicate such sugar in the inline comment. Having had time to get used to it, I find it doesn't read too bad, just ignore the clutter of "apply" and "bind", and remember that the assignment is notated at the end of the line instead of the beginning.

Well, we also had to change the return type a bit, it no longer represents a value, but a computation that yields this value (the function represented by the state monad). As you perhaps have guessed, the method bind above is the equivalent of then in our parser: the signature is the same, except the continuation that we put in, and the combined computation that we get out, are not parsers, but a more general StateMonad class that can have a memo as its state. As with the parser, the function inside the monad takes no arguments except the state, all other arguments are supplied beforehand.

My main point is that here we have a very clear separation of concerns: There is one entity memoize, concerned with caching computed values, another entity, the StateMonad, concerned with passing the cached values between function calls, and finally fib itself, that embodies the definition of the Fibonacci numbers and is concerned with making recursive calls and combining their results.This is in contrast to the explicit version and also to both posts quoted above, where these concerns are more intertwined.

For me, this is of the essence of functional programming, and therein lies its beauty: that it is so neat and modular on a small scale.

In order to get a value from the computation that is returned by fib, we need to evaluate it against some suitable initial state. The function that does this is called evalState. (And this you also already know from the parser, where it was called just eval.)

BigInteger fibMonadic(int n) {
    return n <= 2 ? BigInteger.ONE
                  : fib(n).evalState(new Memo<Integer,BigInteger>().insert(1, BigInteger.ONE).insert(2, BigInteger.ONE));
}

The class Memo is intended to be a functional equivalent to HashMap.

Implementing generic memoization

So, what is memoize? The signature of memoize is obvious. It takes a function of the type of fib and returns another function of the same type. We can abstract over the input and result value types Integer and BigInteger of fib and replace them with generic type parameters T and R.

Now think of what memoize has to do. It takes a function as its parameter. It must return a new function that

when given an argument, tries to look up a previously computed value from the memo
if it finds the value, returns a function that will yield this value
otherwise, applies the given function to the given argument, stores that result in the memo, and returns a function will yield this value plus an updated state

As usual with functional maps, the memo shall return an Optional upon lookup, so that we can make the case distinction with Optional's methods map and orElse.Without further ado, here's the code:

static <T,R> Function<T, StateMonad<R, Memo<T,R>>> memoize(Function<T, StateMonad<R, Memo<T,R>>> f) {
    return t -> {   
                  StateMonad<Optional<R>, Memo<T,R>> s = StateMonad.get(m -> m.lookup(t)); // create a computation that would try to find a cached entry
                  return s.bind(v ->                                                       // perform the computation, call the result v and do: 
                              v.map(s::unit)                                               // if value is present, return a computation that will yield the value
                               .orElse(                                                    // otherwise  
                                     f.apply(t).bind(r ->                                  //   compute r = f(t)  
                                     s.mod(m -> m.insert(t, r)).map(_v -> r)               //   apply a function to the memo that stores r in it, set the value of s to r
                              )));                                                         //   return a computation that will yield r
                };
};

The appeal of memoize is that it is completely general and can be applied to any unary function.

What about functions that take multiple arguments? Well, in Java you cannot abstract over functions with arbitrary arity. I suggest to take a suitable class, like the one from the wonderful jOOλ library, that can represent an n-ary tuple, and treat an n-ary function as a unary function of such a tuple. (They should have added tuples to Java, every functional programming person is asking for them.)

Implementing the state monad

The state monad itself is not so interesting. As I have noted above, it should be self-explanatory to readers familiar with the simple parser. Here's how you might change the parser to derive the monad in a few simple steps, mainly consisting of renamings:

Abstract over the type of the state (replace CharSequence with a type variable S)
Rename a few methods to what's customary in the functional world (e. g. Haskell)

eval to evalState
parse to runState
then to bind

Delete many, many1, and orElse. (We don't need orElse in our example, but you might nevertheless decide to keep it. I guess that in some contexts, it might be useful for backtracking.)
Simplify evalState by removing the checking for unused input that was completely specific to parsing
You might also wish to rename a few variables. (I renamed inp for "input" to s for "state", p for "parser" to m for "monad" etc.)

All that remains is to define the new state-manipulating method mod and the static value-extracting utility get, which are used in the definition of memoize. That is very little work, and here is the complete resulting code. (I've left out the overloaded version of bind/then that we don't need for this example. Furthermore, the state monad is usually given a few more convenience methods, which we also don't need and which I'm not going to discuss here.)

@FunctionalInterface
public interface StateMonad<T,S> {
    
    default T evalState(S s) {
        Objects.requireNonNull(s);
        Tuple<T, S> t = runState(s);
        if (t.isEmpty()) {
            throw new IllegalArgumentException("Invalid state: " + s);
        }
        return t.first();
    }

    /** 
     * The type of the functional interface. A state monad is an abstraction over a function:
     * StateMonad<T,S> :: S -> Tuple<T, S> 
     * In other words, a state monad represents a stateful computation that derives a value and 
     * some new state from an input state.
     */
    abstract Tuple<T, S> runState(S s);
    
    // Monadic operations

    default <V> StateMonad<V,S> unit(V v) {
        return inp -> tuple(v, inp);
    }

    default <V> StateMonad<V,S> bind(Function<? super T, StateMonad<V,S>> f) {
        Objects.requireNonNull(f);
        StateMonad<V,S> m = s -> {
            Tuple<T, S> t = runState(s);
            if (t.isEmpty()) {
                return empty();
            }
            return f.apply(t.first()).runState(t.second());
        };
        return m;
    }
    
    default <V> StateMonad<V,S> map(Function<? super T, V> f) {
        Objects.requireNonNull(f);
        Function<V, StateMonad<V,S>> unit = x -> unit(x);
        return bind(unit.compose(f));
    }
       
    // Additional functions
       
    /** Modify the current state with the given function. */
    default StateMonad<T, S> mod(Function<S, S> f) {
        return s -> runState(f.apply(s)); 
    }

    /** Create a computation that extracts a value from the state with the given function. */
    static <V, S> StateMonad<V, S> get(Function<S, V> stateProjector) {
        return s -> tuple(stateProjector.apply(s), s);
    }
}

Of course we could have inlined that static call to the convenience function StateMonad.get in memoize (i. e. used the lambda directly), but that would have made the use of tuples visible in memoize.

Discussion

The trick I've shown you in this post is certainly appealing in its cleverness, and sometimes I quite admire it. But most of the time I feel that it is too clever by half. You've got to think practically. The functional version is (in Java) no better with regard to conciseness or readability than the explicit version. The effort to rewrite the naive definition to make use of memoization is also about the same in each case, because you have to make all those additional syntactic changes in addition to just throwing in a call to memoize. And as for performance, the monadic solution is about 5 times slower on my machine than the explicit one (for 100 ≤ n ≤ 1000).

Then there is the matter of representing state. In OO, state is most naturally represented as a member variable of a class. Here's equivalent coding that fits into this paradigm.(Well, almost equivalent. This solution is not thread-safe. On the other hand one might reuse the Memoizer instance.)

static class Memoizer<T, R> {
  private final Map<T, R> memo;

  public Memoizer(Map<T, R> memo) {
    this.memo = memo;
  }

  public Function<T, R> memoize(Function<T, R> f) {
    return t -> {
      R r = memo.get(t);
      if (r == null) {
        r = f.apply(t);
        memo.put(t, r);
      }
      return r;
    };
  }
}


Memoizer<Integer, BigInteger> m = new Memoizer<>(new HashMap<>());

BigInteger fib(int n) {
  if (n <= 2) return BigInteger.ONE;
  return m.memoize(this::fib).apply(n - 1).add(
         m.memoize(this::fib).apply(n - 2));
}

And of course there is a simple, idiomatic, non-recursive, constant-space solution with mutable variables:

BigInteger fib(int n) {
 if (n <= 2) return BigInteger.ONE;
 
 BigInteger x = BigInteger.ONE;
 BigInteger y = BigInteger.ONE;
 BigInteger z = null;
 for (int i = 3; i <= n; i++) {
  z = x.add(y);
  x = y;
  y = z;
 }
 return z;
}

So be careful when you take over an idea from functional programming. You may be able to retain some of the beauty, but in other respects you won't always win. In fact, sometimes you'll lose. Don't overdo it. Functional thinking may offer you a different - and useful - perspective on a problem, as I hope to have shown with the parser and some other posts. But a functional solution is not guaranteed to be per se more readable, more maintainable, or more efficient than a good old vanilla object-oriented or imperative solution. Often, it will be neither. Java is not a functional programming language!

Addendum

For the sake of completeness, here's the Memo class that I have used for demo purposes. It's a mutable map masquerading as a functional data structure. Very bad. Don't copy it. You may put trace statements in these two methods to see in what order elements are computed and retrieved, and that they are indeed computed only once. The method names are the same as in TotallyLazy's PersistentMap.

class Memo<K, V> extends HashMap<K, V> {

    public Optional<V> lookup(K key) {
        Optional<V> value = Optional.ofNullable(get(key));
        return value;
    }

    public Memo<K, V> insert(K key, V value) {
        put(key, value);
        return this;
    }
}

DSLs with Graham Hutton's Functional Parser

2016-01-10T23:05:00.000+01:00

One of the best introductions to functional programming is Graham Hutton's excellent book Programming in Haskell. Although it is primarily a course on the Haskell programming language, it also provides an introduction to the underpinnings of functional programming in general. It is very readable, aimed at the undergraduate level. I think it should be on the shelf of every student of functional programming.

I had thought of writing a book review, but there's really no need for it. You can find in-depth technical reviews on the book's site, and more: You can download slides for each chapter, watch video lectures, and get the complete Haskell source code of all the examples. That is for free, without even buying the book - a tremendous service.

Instead of a review, I'll show you what I did for an exercise. I ported the functional parser that Hutton presents in chapter 8 of his book to Java. My aim is to demonstrate the following:

Mastering functional idioms will help you to create powerful interfaces supporting a fluent programming style in Java
Of particular use in abstracting from tedious glue code are monadic interfaces, i. e. basically those that have a flatMap method
There is a special way of encapsulating internal state that is sometimes useful

The parser itself is a recursive descent parser of the kind you wouldn't necessarily use for large and complex grammars. (You'd perhaps use a compiler-compiler or parser generator tool like ANTLR or Xtext/Xtend.) However, you might use it to quickly spin-off a little DSL or demo coding. The real point is not the parser, but the approach to application design.

In the following, some of the text is taken verbatim from Hutton's book, some is my own. I shall not bother with quotes and attributions, because I am not claiming any original ideas here. Remember: this is just an exercise in porting Haskell to Java. However, I'll try to be helpful by interspersing a few extra examples and comments.

Nevertheless, this post is going to be a bit tedious, in the way textbooks are. I apologize in advance. If you're feeling impatient, perhaps you might want to scroll down and look at the final example, a little parser for arithmetic expressions. I summarize my experiences and give some advice in the concluding section.

Basic parsers

So what is a parser? It is a function that takes an input character sequence and produces a pair comprising a result value and an output character sequence, the unconsumed rest of the input. Let's abstract over the type of the result value and define a corresponding functional interface:

@FunctionalInterface
public interface SimpleParser<T> {   
    abstract Tuple<T, CharSequence> parse(CharSequence cs);
}

Hutton's parser in fact returns a list, to include the possibility of returning more than one result if the input sequence can be parsed in more than one way. For simplicity, however, he only considers parsers that return at most one result. For that reason, I have opted for the simpler signature. It follows that our parser cannot handle ambiguity. You might want to do the generalization yourself.

I shall assume the existence of a special empty tuple, which has no components, as an analogue to the empty list, which signifies that the input cannot be parsed at all. (Otherwise, the type Tuple is just what you might expect.)

We will now define some basic parsers, and then see how we can arrange them in a hierarchy of ever higher-level derived parsers. First, here's a method that produces a basic parser that always succeeds with the given result value without consuming any input:

default <V> SimpleParser<V> unit(V v) {
    return inp -> tuple(v, inp);
}

In Haskell, this method would be called "return", but of course that's not possible in Java. We'll also provide output, a static analogue to unit, and failure, a way to produce a parser that always fails regardless of the input. In fact, we'll define quite a few static factory methods for creating parsers. These I have collected in a separate utility class SimpleParserPrimitives. The parser interface itself will contain only the (non-static) methods to combine and manipulate parsers. It will turn out that these methods are quite general. You might re-use them, and each time you might well provide a different set of parsing "primitives" for a different parsing purpose. The primitives I will be discussing are geared towards parsing programming languages.

public static <T> SimpleParser<T> output(T v) {
    return inp -> tuple(v, inp); 
}

public static <T> SimpleParser<T> failure() {
    return _inp -> empty();
}

The method empty returns the empty tuple, representing failure. Our final basic parser is item, which fails if the input string is empty, and succeeds with the first character as the result value otherwise.

public static SimpleParser<Character> item() {
    return inp -> inp.length() == 0 ? empty() : tuple(inp.charAt(0), inp.subSequence(1, inp.length()));
}

Sequencing

Suppose we want to parse an arithmetic expression like "5 + 4 * 3". One way to decompose the problem is to create a parser for the numeral "5", and a parser for the rest of the expression "+ 4 * 3" (which we would again decompose into a parsers for the symbol "+" and the expression "4 * 3" and so on.) We then look for a way to combine these parsers into a parser for the whole expression. The combined parser shall execute the parsing steps one after the other. In the process of doing that we need to make decisions based on the result of the previous step. One such decision is if and how to continue parsing the rest of the input or not. If we want to continue, we must pass on the intermediate result to the following step.

Here's the signature of such a method that takes a description of the future parsing steps, and returns a new parser that will execute the current step combined with those future ones:

default <V> SimpleParser<V> then(Function<? super T, SimpleParser<V>> f)

The returned parser fails if the application of the current parser to the input string fails, and otherwise applies the function f to the result value to give a second parser q, which is then applied to the output string to give the final result. Thus, the result value produced by this parser is made directly available for processing in the remaining steps by binding it to a lambda variable in a function describing those remaining steps. For example, the expression

p.then(x -> output(x));

may be read as: Parse the input using the parser p, and then, calling the result "x", apply output() to x.

The method then corresponds to flatMap on Java Stream and Optional, resp. to CompletableFuture.thenCompose(), which are the three "monadic" interfaces built into Java 8. These methods all have the property of chaining computations together (in the case of streams like nested for-loops, as I have discussed in several earlier posts.) Here's the implementation. You can see how each line corresponds to a part of the description we have given above.

default <V> SimpleParser<V> then(Function<? super T, SimpleParser<V>> f) {
    Objects.requireNonNull(f);
    SimpleParser<V> pf = inp -> {
        Tuple<T, CharSequence> t = parse(inp);
        if (t.isEmpty()) {
            return empty();
        }
        return f.apply(t.first()).parse(t.second());
    };
    return pf;
}

Let's look what we can do with the building blocks we have assembled so far. Here's a contrived example. The parser p consumes three characters, discards the second, and returns the distance between the first and the third:

@Test 
public void composition() {
    SimpleParser<Integer> p = 
            item().then( x -> 
            item().then( y ->
            item().then( z ->
            output(z - x)
            )));
    assertEquals(tuple(2,"def"), p.parse("abcdef"));
    assertEquals(empty(), p.parse("ab"));
}

Notice how the formatting goes some way to hide the nesting of the constructs. I find the code better readable that way, because conceptually, we are just executing parsing steps in sequence. Note, too, how we are ignoring the variable y in the last two steps. Let's introduce a variant on then that allows us to dispense with this superfluous variable.

default <V> SimpleParser<V> then(Supplier<SimpleParser<V>> p) {
    Objects.requireNonNull(p);
    return then(_v -> p.get());
}

The above parser can now be rewritten without the unused variable:

    SimpleParser<Integer> p = 
            item().then( x -> 
            item().then( () ->
            item().then( z ->
            output(z - x)
            )));

Note the use of Supplier to enforce laziness: One could not simply overload then to accept a parser instance, because that would imply that the argument parser p were immediately constructed by the caller, while otherwise it is constructed only after the current parser has completed successfully. But at least we do no longer have to litter our code with dummy variables. Haskell has nice syntactic sugar ("do-notation") that allows just leaving out the unused variables, and also hides the nesting of the calls (what I have tried to achieve by code formatting).

Also note how there's no reference to the input that is being parsed in the definition of our little parser. There is no explicit variable of type CharSequence being threaded through the calls, or anything like that. All the mechanics of state handling are neatly encapsulated inside then.

Derived parsers

We can now start combining our basic parsers in meaningful ways to derive more parsing primitives. First, we build upon item to define a parser that accepts a certain character only if it satisfies some predicate. And with the help of the utility methods in java.lang.Character we can go on to define parsers for all kinds of character classes, like digits, lower and upper case letters, alphanumeric characters or whitespace. For example:

/** A parser that returns the next character in the input if it satisfies a predicate. */
public static SimpleParser<Character> sat(Predicate<Character> pred) {
    Objects.requireNonNull(pred);
    return item().then(x -> pred.test(x) ? output(x) : failure());
}

/** A parser that returns the next character in the input if it is the given character. */
public static SimpleParser<Character> character(Character c) {
    return sat(x -> c.equals(x));
}

/** A parser that returns the next character in the input if it is a digit. */
public static SimpleParser<Character> digit() {
    return sat(Character::isDigit);
}

/** A parser that returns the next character in the input if it is a whitespace character. */
public static SimpleParser<Character> whitespace() {
    return sat(Character::isWhitespace);
}

With character we can recursively recognize strings.

public static SimpleParser<String> string(String s) {
    Objects.requireNonNull(s);
    return s.isEmpty() ? output("") : character(s.charAt(0)).then(() -> string(s.substring(1)).then(() -> output(s)));

The base case states that the empty string can always be parsed. The recursive case states that a non-empty string can be parsed by parsing the first character, parsing the remaining characters, and returning the entire string as the result value.

Choice and repetition

There are more ways to combine parsers than simple sequencing. We often need to express alternative ways of parsing one string ("choice"), and we need repetition, where a string is parsed by the repeated application of the same parser.

/** Applies the given parser if this parser fails. */
default SimpleParser<T> orElse(SimpleParser<T> p) {
    Objects.requireNonNull(p);
    return inp -> {
        Tuple<T, CharSequence> out = parse(inp);
        return out.isEmpty() ? p.parse(inp) : out;
    };

/** Applies this parser at least once or not at all. */
default SimpleParser<List<T>> many() {
    return many1().orElse(unit(emptyList()));
}

/** Applies this parser once and then zero or more times. */
default SimpleParser<List<T>> many1() {
    return then(c -> many().then(cs -> unit(cons(c, cs))));
}

(cons is a utility that adds an element at the front of a persistent list. Think of copying the cs to a new list and adding c at index 0.)

Note that many and many1 are mutually recursive. In particular, the definition for many states that a parser can either be applied at least once or not at all, while the definition for many1 states that it can be applied once and then zero or more times. The result of applying a parser repeatedly is a list of the collected results. At some point, we may want to convert this list back to a single value, e. g. we'd perhaps like to convert a sequence of digits to an integer value.

public static SimpleParser<Integer> nat() {
    return digit().many1()
                  .then(cs -> output(ParserUtils.charsToString(cs)))
                  .then(s -> output(Integer.parseInt(s)));
}

This looks a bit clumsy, after all we are only transforming the result value inside a parser without actually consuming any input. It would be more natural if we had a method map to do that. (BTW, Hutton does not include such a method.) Fortunately, map is easy to define in terms of unit and then. In fact, it is axiomatic that this definition should be valid, otherwise we have made a mistake in the definition of our flatMap-analogue then (probably not with unit, since it is so simple).

default <V> SimpleParser<V> map(Function<? super T, V> f) {
    Objects.requireNonNull(f);
    Function<V, SimpleParser<V>> unit = this::unit;
    return then(unit.compose(f));
}

A new parser is "wrapped" around the output of the function f, as opposed to then, which already takes a parser-bearing function and does not wrap it with another parser. The analogy with map and flatMap on Stream is obvious. Our code for parsing a natural number now becomes more readable. And we can now also use method references.

public static SimpleParser<Integer> nat() {
    return digit().many1()
                  .map(ParserUtils::charsToString)
                  .map(Integer::parseInt);
}

Tokenization

In order to handle spacing, we introduce a new parser token that ignores any space before and after applying some given parser to an input string.

/** A parser that collapses as many whitespace characters as possible. */
public static SimpleParser<String> space() {
    return whitespace().many().map(s -> "");
}

/** A parser that ignores any space around a given parser for a token. */ 
public static <T> SimpleParser<T> token(SimpleParser<T> p) {
    Objects.requireNonNull(p);
    return space().then(() -> p.then(v -> space().then(() -> output(v)))); 
}

We may be interested in a diversity of tokens, according to our target language, such as tokens for variable names and other identifiers, reserved words, operators, numbers etc. For example, we can define

/** A parser for a natural number token. */ 
public static SimpleParser<Integer> natural() {
    return token(nat());
} 

/** A parser for a symbol token. */
public static SimpleParser<String> symbol(String sym) {
    Objects.requireNonNull(sym);
    return token(string(sym));
}

This completes our overview of the parsing primitives. Let's see a few applications.

Lists of numbers

The following test defines a parser for a non-empty list of natural numbers that ignores spacing around tokens. This definition states that such a list begins with an opening square bracket and a natural number, followed by zero or more commas and natural numbers, and concludes with a closing square bracket. Note that the parser should only succeed if a complete list in precisely this format is consumed.

@Test
public void tokenization() {
    SimpleParser<List<Integer>> numList = 
                            symbol("[").then(() ->
                            natural().then(n ->
                                symbol(",").then(() -> natural()).many().then(ns ->
                            symbol("]").then(() ->
                            output(cons(n,ns))))));

    assertEquals(tuple(asList(1,2,3),""), numList.parse("[1,2,3]"));
    assertEquals(tuple(asList(11,22,33),""), numList.parse(" [11,  22, 33 ] "));
    assertEquals(tuple(asList(11,22,33),"abc"), numList.parse(" [11,  22, 33 ] abc"));
    assertEquals(empty(), numList.parse("[1,  2,]"));
}

You get the idea. There is one shortcoming: Detecting failure on account of ill-formed or unused input involves examining the output tuples in caller code. This may not always be what we want.

Representing failure

What would one expect to happen when some input is not accepted? In Java one would certainly expect an exception to be thrown. However, that is not so easy here. Of course, we cannot make then or failure throw an exception instead of returning empty(), because that would abort all possible search paths, not just the one we're currently on. Putting exception handling code for regular control flow into orElse I also consider bad. The best thing to do, I suppose, is to add a top-level method to the parser that would examine the parsing result.

default T eval(CharSequence cs) {
    Objects.requireNonNull(cs);
    Tuple<T, CharSequence> t = parse(cs);
    if (t.isEmpty()) {
        throw new IllegalArgumentException("Invalid input: " + cs);
    }
    CharSequence unusedInput = t.second();
    if (unusedInput.length() > 0) {
        throw new IllegalArgumentException("Unused input: " + unusedInput);
    }
    return t.first();
}

This way, we have also successfully hidden the internal use of tuples from callers of the parser, which is a good thing.

Arithmetic expressions

Hutton presents an extended example for parsing arithmetic expressions. The example serves to show how easy it can be to create a little DSL.

Consider a simple form of arithmetic expressions built up from natural numbers using addition, multiplication, and parentheses. We assume that addition and multiplication associate to the right, and that multiplication has higher priority than addition. Here's an unambiguous BNF grammar for such expressions.

expr ::= term (+ expr | nil)
term ::= factor (∗ term | nil)
factor ::= natural | (expr)
natural ::= 0 | 1 | 2 | · · ·

It is now straightforward to translate this grammar into a parser for expressions, by simply rewriting the rules using our parsing primitives. We choose to have the parser itself evaluate the expression to its integer value, rather than returning some form of tree.

SimpleParser<Integer> expr() {
    return term().then(t ->
                    symbol("+").then(() -> expr().then(e -> output(t + e)))
                    .orElse(output(t)));
}
    
SimpleParser<Integer> term() {
    return factor().then(f ->
                      symbol("*").then(() -> term().then(t -> output(f * t)))
                      .orElse(output(f)));
}
    
SimpleParser<Integer> factor() {
    return natural()
           .orElse(
               symbol("(").then(() ->
               expr().then(e ->
               symbol(")").then(() ->
               output(e)))));
}

I like the look of that. Once you get used to the formalism, writing the parser is as easy as writing the grammar in the first place. Let's test the whole thing:

@Test
public void multiplicationTakesPrecedence() {
    assertThat( expr().eval("2+3*5"), is(17) );
}

Laziness

Notice how we have created functions at each step that only get evaluated when we actually use them. Only then is a new parser instantiated. The construction of the parser is interleaved with its execution. As mentioned above, it is important in this context that then always takes a function, if necessary even a no-argument function, that yields a parser, never directly a parser instance. If that were different, we might actually have an infinite loop, e. g. expr() → term() → factor() → expr(), where the last call to expr() would be that in the "orElse"-clause, because Java as a strict language evaluates method arguments immediately.

Recursive lambdas

You may wonder why I have written the expression parser in terms of methods, instead of declaring instance variables for the functions expr, term, and factor.The reason is that lambdas capture local variables by value when they are created, and we would therefore get a compile-time error: "Cannot reference a field before it is defined". The only way to write recursive lambdas is to make everything static and use fully qualified names (see the Lambda FAQ).

Lessons learned

I feel I have learned something from this exercise, especially about the feel of functional programming in Java. A few particular lessons have been:

"Monadic" interfaces are easy to design once you get your head around flatMap and its relatives. This method in its various guises, in Stream, Optional, CompletableFuture or our SimpleParser, is a powerful tool you need to understand.
When you need to add functionality to your lambdas, default methods in interfaces are the way to go. You can instantiate your functions with lambda expressions, and then combine them fluently in a variety of interesting ways.This is also what the JDK does, e. g. the Function interface has a default method compose for functional composition.
Our parser interface has rather many (in fact, eight) default methods. All these default methods are strongly coupled. It is generally accepted that this is bad when designing for inheritance (cf. what Josh Bloch has to say in Effective Java, 2nd ed., items 17 and 18). Perhaps that is a reason to feel a bit queasy about the approach.Which leads me to the next point.
Functional interfaces need not be designed for inheritance, so that in contrast to "ordinary" interfaces, strong coupling is acceptable: I have adopted a practice according to which I do not extend functional interfaces and implement them only with lambdas. (It's nevertheless OK to extend another functional interface in order to specialize a generic type variable or add a few little static utilities. That is what UnaryOperator and BinaryOperator do, and they are the only two of the 57 functional interfaces in the JDK to extend another interface.)
Functions are great for re-usability. Being implemented via interfaces, they are usually public. As a consequence, data structures tend to leak all over the place. (We had to take special care to hide the use of tuples.) I guess that this is part of the functional mind-set: While in the OO-world, we encapsulate and hide as much we can (with reason), this is less important in the functional world.
Don't clutter your interfaces with too many static methods. It is sometimes difficult to decide where a static method should go (e. g. output and failure are so essential, should they have been put into the parser interface?).
Java is really not a functional programming language.Stay with idiomatic Java as far as possible. Some of the things that I found to be missing in Java to make it more "functional" are

persistent collections, tuples (you can pull-in third party libraries)
some important syntactic sugar like do-notation, inconspicuous unused variables (perhaps underscore only) etc. Nested constructs can quickly become difficult to read.
the ability to write non-static recursive lambdas.
a nice syntax for functional application.

Beware of Java's eagerness, prefer lazy idioms.
Don't kill your backtracking with exceptions.
Do not throw checked exceptions from your code. Sooner or later (probably sooner) it's going to be a pain when you want it to work with lambdas, because all the most important standard functional methods in the JDK - like Function#apply() - do not have a throws-clause.

In my IDE (Eclipse Mars) I have been annoyed by two things in particular. I don't know if other IDEs offer a better experience.

Debugging lambdas is a pain. (As an exercise, you might replace equals with == in the character parser, pretend you didn't know that you made this silly mistake, and try to fix it.) Unit testing becomes all the more important. Here are a few negative experiences I have made:

Being restricted to line breakpoints is bad when you're interested only in a part of those nice one-liners.
In order to set breakpoints inside a lambda, you need blocks, again bad for elegance.
When you're inside the lambda, you may see synthetic members named arg$1 etc. without being really able to tell what they are. It helps a bit when you at least show the declared type in the Variables view.
Stepping into a lambda is not really an option, because of all the synthetic stuff in between.

Formatting lambdas is not always easy (I often do it manually, case by case, because I don't always like what Eclipse does automatically).

And do have a look at Hutton's book!

CompletableFuture as a Trampoline for Tail Recursion Elimination

2015-12-05T00:12:00.001+01:00

A function is said to be tail recursive when the result of the recursive call is what we return from the function. In other words, the recursive call is the last thing we do, and in particular, when the recursive call returns, we do not need any of the local variables and parameters of the current stack frame.

In many languages, the compiler will recognize this situation and re-use the current stack frame for the recursive call. This is called tail recursion elimination (or tail call optimization, in the more general case that the final function call is not recursive). But it doesn't happen in Java.

There are standard techniques to implement tail call optimization when the language does not have it. One possibility is to transform the recursive function to a loop manually, which is easy to do, typically easier than transforming a non-tail-recursive function into a tail recursive one. Another such method is called trampolining. A trampoline is simply a loop that performs the function calls iteratively, one after the other. In order to achieve that, the recursive function is rewritten so that it is no longer actually recursive, but instead returns immediately with a higher-order function that will call the original function when evaluated inside the loop. These higher-order wrappers are usually called thunks. The term "trampoline" derives from the visual image of control bouncing up and down between the loop and the function, without ever spiralling downwards as in true recursion.

Information on trampolining is not hard to find. For example, here and here are two nice posts, with illustrations. As for Java, Pierre-Yves Saumont presents a trampoline implementation in this post (without actually mentioning the term).

I have noticed that Java actually contains a built-in class that implements trampolining, namely CompletableFuture. You get the desired behavior by making asynchronous tail calls (Viktor Klang's term). We won't need any custom classes or explicit loops. Let's use the Fibonacci function as an example. Here's a tail recursive formulation of it:

public static BigInteger fibTailRecursive(int n) {
    return n <= 2 ? ONE : fibTailRecursiveAux(n, ZERO, ONE);
}

private static BigInteger fibTailRecursiveAux(int n, BigInteger a, BigInteger b) {
    return n <= 0 ? a : fibTailRecursiveAux(n - 1, b, a.add(b));
}

To get the trampoline, we delay the recursive calls by wrapping them in CompletableFuture. The terminal value will be wrapped in an already completed future. Here's the corresponding thunked version of the above function definition:

public static BigInteger fibCF(int n) {
    return n <= 2 ? ONE : fibCF(n, ZERO, ONE).join();
}

private static CompletableFuture<BigInteger> fibCF(int n, BigInteger a, BigInteger b) {
    return n <= 0 ? terminate(a) : tailcall(() -> fibCF(n - 1, b, a.add(b)));
}

The complete "framework" consists only of the two utility methods terminate and tailcall. Plus we also should provide a dedicated thread to run the async calls in. (Adding more threads, or using the common Fork-Join pool actually slows things down in my environment.)

public static <T> CompletableFuture<T> terminate(T t) {
    return CompletableFuture.completedFuture(t);
}

public static <T> CompletableFuture<T> tailcall(Supplier<CompletableFuture<T>> s) {
    return CompletableFuture.supplyAsync(s, POOL).thenCompose(identity()); 
}

private static final ExecutorService POOL = Executors.newSingleThreadExecutor(new ThreadFactoryBuilder().setDaemon(true).build());

The class ThreadFactoryBuilder is from Guava. Composing with the identity function will unwrap the nested CompletableFuture that comes out of the call to supplyAsync.

Note that it is essential to use supplyAsync. Making synchronous calls, or using an Executor that runs tasks immediately in the caller thread (for example, Executor e = Runnable::run), would lead to a stackoverflow for large inputs.The trampoline loop is realised inside CompletableFuture by taking tasks from the queue associated with the Executor. This feature is not really documented. Although Doug Lea has pointed out to me that there is an implementation comment at the top of CompletableFuture that points in that direction

      * Method postComplete is called upon completion unless the target

      * is guaranteed not to be observable (i.e., not yet returned or

      * linked). Multiple threads can call postComplete, which

      * atomically pops each dependent action, and tries to trigger it

      * via method tryFire, in NESTED mode. Triggering can propagate

      * recursively, so NESTED mode returns its completed dependent (if

      * one exists) for further processing by its caller (see method

      * postFire).

The bad news is performance. I have benchmarked this solution against a manually optimized iterative version and against Saumont's TailCall class. The upshot is that TailCall performs as well as the manually coded loop. Using CompletableFuture is three times as slow. Here's a representative measurement for computing the 5000th Fibonacci number:

# Warmup: 5 iterations, 1 s each
# Measurement: 25 iterations, 1 s each
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op

Benchmark                                   Mode Cnt Score   Error Units
TrampolineBenchmark.measureLoop avgt   25 0.403 ± 0.004 ms/op
TrampolineBenchmark.measureTC    avgt   25 0.415 ± 0.002 ms/op
TrampolineBenchmark.measureCF    avgt   25 1.258 ± 0.009 ms/op

Nevertheless, I like the simplicity of it.

Deterministic and Non-Deterministic Finite State Machines with Cyclops

2015-11-13T18:39:00.000+01:00

I have just recently become aware of the Cyclops library for functional programming with Java 8. This exciting library offers many excellent features, among them

Enhancements and utilities for Streams
Advanced monadic operations
Powerful pattern matching
Interoperability with both TotallyLazy and javaslang

and more. I have only looked at a part of it, and I'm sure there are many useful things to discover. Go look for yourself!

Abstracting over Monads

The feature that drew my attention initially, however, were Cyclops' tools for monadic programming. In particular, they can abstract over the three monads in JDK 8: Stream, Optional and CompletableFuture (and any other monads you may find or create). By using what they call Comprehenders, indeed they can coerce even classes that are technically non-monadic (like List) into a monad. You can find pointers to articles about this on the Cyclops page, for example this readable introduction.

Why did this interest me so much? Well, monads are particularly good at combining computations into a coherent whole, abstracting away some boilerplate code. The algorithm for combining the computations can always be the same (using a method that is usually called flatMap or bind), with the details of what happens being delegated to the particular monad. An instructive example is given by Mike MacHenry in this post. MacHenry models a deterministic finite state machine (DFA) with a transition function that has type Optional, and a nondeterminstic fine state machine (NFA) with a transition function that has type List. He then shows how we can abstract over DFAs and NFAs by implementing a traversal function that works for both kinds of machines. The advantage is interface economy, and consequently more elegance and less maintenance.

MacHenry's post is in the context of Haskell, and I have always wished to be able to do the same thing in Java. But in the JDK, although Optional basically implements the same conceptual API as List (you might think of Optional as a sequence with a cardinality of at most 1, cf. here), there is no actual interface to capture the common functionality. You might even be driven to model a DFA as an NFA and place the constraint that the lists must contain at most one state in the Javadoc. The point is that this would be a decidedly inferior DFA model.

Finally, with Cyclops, we can do what we really want. Cyclops introduces a simple wrapper class, called AnyM, to manage the abstraction.

The technicalities of working with AnyM, however, can also be a bit involved, so we want to hide that from our application logic and provide a common implementation for DFAs and NFAs with an API that exposes only standard JDK classes. I'd like to show you how I did it. The solution requires Cyclops 6.1.1 or later.

Modelling Finite State Machines

A finite state machine is defined by it's inventory of states, an alphabet of permissible input symbols and a transition table that takes the current state, the next input symbol to be consumed and yields the next state (or states, in the case of an NFA). In Java, the transition function can be modelled by

BiFunction<S, Character, M>

where S denotes the type of the machine's states, and M the monadic container used as the result value of the transition function (Optional<S> or List<S> for a DFA or NFA, resp.)

The states and alphabet will be left implicit in the transition table. In addition, for convenience we will allow the transition table to be partial, in the sense that it need only contain the valid transitions. The appropriate empty value that signifies an invalid transition is supplied separately when creating a state machine. Here's the factory method signature:

public static <S, M> StateMachine<S,M> newStateMachine(BiFunction<S, Character, M> partialTransitions, M defaultValue)

The machine itself is represented as a traversal function that takes a state and a sequence of input characters, and returns the resulting state(s) after consuming the entire input, in which case the input is said to have been accepted. (Well, really it is only accepted when you start from a properly identified initial state and end up in a final state, but I'm ignoring this detail here. You can easily build it on top of the implementation given below.)

@FunctionalInterface
public interface StateMachine<S, M> {
    M accept(S state, CharSequence input);
}

The states themselves can be anything. We will assume a simple immutable class State hat has an integer identity attribute with equals and hashCode defined accordingly, and an associated factory method state(int).

Working with Finite State Machines

In proper test-first fashion, let us first consider how we want to be able to use our state machines. Here are a couple of JUnit tests. I will use Guava's two-dimensional array table to provide the transition function.

import static java.util.Arrays.asList;
import static java.util.Collections.emptyList;
import static java.util.Optional.empty;
import static java.util.Optional.of;

@Test
public void demoDFA() {
    ArrayTable<State, Character, Optional<State>> transitiontable = ArrayTable.create(
        asList(state(1), state(2)), asList('a', 'b'));

    transitiontable.put(state(1), 'a', of(state(2)));
    transitiontable.put(state(1), 'b', of(state(1)));
    transitiontable.put(state(2), 'b', of(state(1)));

    StateMachine<State,Optional<State>> dfa = newStateMachine(transitiontable::get, empty());
    Optional<State> finalState = dfa.accept(state(1), "ab");
    assertEquals(state(1), finalState.get());
}

@Test
public void demoNFA() {
    ArrayTable<State, Character, List<State>> transitiontable = ArrayTable.create(
        asList(state(1), state(2), state(3)), asList('a', 'b'));

    transitiontable.put(state(1), 'a', asList(state(2), state(3)));
    transitiontable.put(state(2), 'a', asList(state(2)));
    transitiontable.put(state(3), 'b', asList(state(3)));

    StateMachine<State, List<State>> nfa = newStateMachine(transitiontable::get, emptyList());
    List<State> finalStates = nfa.accept(state(1), "ab");
    assertEquals(asList(state(3)), finalStates);
}

I have not shown any test with inacceptable input. In that case, because the transition function will return either an empty Optional or empty List when it encounters a state out of which there is no path, the result will also be empty. And of course, in case the input is empty, we shall expect the result to be just the initial state (wrapped in its appropriate monad).

Implementing the Finite State Machine

Without further ado, here's the implementation. It's quite concise. Explanatory comments will follow.

public static <S, M> StateMachine<S,M> newStateMachine(BiFunction<S, Character, M> partialTransitions, M defaultValue) {
    Objects.requireNonNull(partialTransitions);
    Objects.requireNonNull(defaultValue);
 
     BiFunction<S, Character, AnyM<S>> totalTransitions = (s, c) -> 
         AnyM.ofMonad(Optional.ofNullable(partialTransitions.apply(s, c)).orElse(defaultValue));

    Function<S, AnyM<S>> unit = s -> AnyM.ofMonad(defaultValue).unit(s);
 
    return (state,input) -> machine(totalTransitions, unit).accept(state, input).unwrap();
}

private static <S> StateMachine<S, AnyM<S>> machine(BiFunction<S, Character, AnyM<S>> transitions, Function<S, AnyM<S>> unit) {
    return (state,input) -> {
                if (input.length() == 0) return unit.apply(state);
                return transitions.apply(state, input.charAt(0))
                           .flatMap(next -> machine(transitions,unit).accept(next, input.subSequence(1, input.length())));
           };
}

First thing in the factory method newStateMachine we extend the given function to a total function, and make it put the function value into Cyclops' AnyM wrapper. (Making the function total will obviate dealing with any null value later.) AnyM.ofMonad takes an Object that is already of a supported type and makes it accessible through the wrapper's monadic interface.

Then we define a function that creates a new instance of the underlying monad when given a state. We need that function to terminate the recursive traversal. Cyclops provides a unit method that we can use. We can supply the given default value to provide the required monad type.

Finally, we return the traversal function. Cyclops' unwrap method will remove the AnyM wrapper around the resulting monad.

The traversal function is recursive, terminating with the current state when the input is exhausted. It uses the transition function to look up any new state (or states) reachable from the current state while consuming the next input character, and for each new state that it finds calls that state next and continues to traverse the graph with next as the current state and consuming the rest of the input.

Cyclops API

Cyclops before release 6.1.1 used to contain a host of conversion functions. It was really difficult to know when to use what method. I am indebted to John McClean, the author of Cyclops, for some friendly help with the correct usage of the Cyclops APIs, having made many wrong choices myself. This has not been an easy library to learn!

Fortunately, with Cyclops 6.1.1 many of these methods have been deprecated (see the release notes), which has improved the situation a lot. Just use the factory methods on class AnyM. Javadoc coverage has also been extended.

Some conversion methods ensure that the wrapped type inside AnyM remains unchanged, some convert to Stream implicitly. The latter approach (although it can be more efficient) may impose some additional burden on the application coding. In our case that burden would be allowing for a different monadic type to be returned from newStateMachine than was put in, and explicitly collecting to a list in the caller. The choice is yours.

Conclusion

With DFAs, flatMap over Optional abstracts away the tedium of looking at the intermediate result and checking whether it's empty and continuing only with a non-empty value. With NFAs, flatMap over List abstracts away the tedium of applying a map and concatenating the results. With Cyclops, we can do both in a uniform manner.

I hope you have found this example as instructive as I did.

Cartesian Products with Kleisli Composition

2015-09-10T13:29:00.000+02:00

Over at the JOOQ blog, they have written about How to use Java 8 Functional Programming to Generate an Alphabetic Sequence. By "alphabetic sequence" they mean the (n-order) Cartesian product of the letters in the alphabet. For example, the third-order Cartesian product of the alphabet would be

AAA, AAB, AAC, ..., ZZZ

The author of that post (it is unsigned, but I believe it was written by Lukas Eder) claims that

the Java 8 Stream API does not offer enough functionality for this task.

I will show that this claim is wrong, and that there is in fact a pure Java 8 solution. This solution brings out the structure of the problem rather nicely. It is general in the sense that it uses a general function combinator well-known in functional programming. It relies only on the Java 8 Streams API without needing any constructs from jOOλ.

The first building block of the solution is provided by Misha in his answer to this Stackoverflow question. Here's Misha's function that will create a stream of combinations of its first argument with the given stream, where the mode of combination can be externally specified as well. (In the following I will not only call this method crossJoin, but also the function that it returns.)

<T, U, R> Function<T, Stream<R>> crossJoin(Supplier<Stream<? extends U>> streamSupplier, BiFunction<? super T, ? super U, ? extends R> combiner) {
    return t -> streamSupplier.get().map(u -> combiner.apply(t, u));
}

Now you can do the nth-order Cartesian product by applying flatMap with crossJoin a fixed number of times. A question that goes unanswered in that Stackoverflow discussion is at the core of the JOOQ post: How deal with the situation when the required number of applications is not fixed, but supplied at runtime as a parameter? That is, how do we implement

List<String> cartesian(int n, Collection<String> coll)

Of course, you might do it recursively. But that would rather count as supporting the claim that Java streams were not sufficient! Fortunately, the implementation becomes easy once we remember that the repeated application of functions is equivalent to a single application of the composition of these functions. In our case, we actually want sequential composition, which is just the flip of regular composition. Looking at the signature

crossJoin :: a -> Stream b

it may be hard to see at first how we might compose functions of this signature sequentially. Actually, however, there is guaranteed to be such a combinator. It is called "Kleisli composition", written ">=>", and has the following signature:

>=> :: (a -> Stream b) -> (b -> Stream c) -> (a -> Stream c)

In fact, >=> has a completely general definition for all so-called "monadic types", of which Stream in Java is one. What this means is that we could even abstract over Stream in the above signature, and implement sequential composition for all those types in the same way. This is easier in some languages (Haskell, Scala) than in others (Java). This article by Yoav Abrahami gives an instructive example of what you can do with Kleisli composition in Scala. It demonstrates the same trick we are about to perform, namely replacing recursion with a form of functional composition.

For the sake of example, instead of generalizing we will simplify somewhat. In our case we are dealing only with streams of strings. Here's the Java implementation of >=> for this special case:

BinaryOperator<Function<String,Stream<String>>> kleisli = (f,g) -> s -> f.apply(s).flatMap(g);

The Stream type, being "monadic", obeys certain axioms, one of which states that >=> is associative. So we may use it in a reduction. The required identity function is s -> Stream.of(s). You can easily convince yourself of that by considering the first-order product, which is of course just the original list, which we'll get by lifting every element to Stream and flat-mapping back down, without invoking crossJoin at all. (Of course, you can also prove it by plugging the identity into the definition of >=>.)

Let's put it all together: The idea is to create a stream of as many crossJoin functions as we need, reduce this stream to a single function in memory by composing them together, and finally apply the entire function chain in one fell swoop.

List<String> cartesian(int n, Collection<String> coll) {
    return coll.stream()
           .flatMap( IntStream.range(1, n).boxed()
                     .map(_any -> crossJoin(coll::stream, String::concat)) // create (n-1) appropriate crossJoin instances
                     .reduce(s -> Stream.of(s), kleisli)                   // compose them sequentially with >=>
                    )                                                      // flatMap the stream with the entire function chain
           .collect(toList());
}

The following call will give you a list of all three-letter alphabetic sequences:

cartesian(3, alphabet)

A variation on the above is when you do not want to multiply a collection n times with itself, but with n other collections (all of the same type). Instead of passing in n, you might pass in the sequence of those other collections, and instead of streaming an integer range you might stream that sequence, creating a crossJoin function for each element, like this: Stream.of(colls).map(c -> crossJoin(c::stream, String::concat))

For good measure, here's how you may construct the list of letters alphabet, if you do not wish to write them down severally with Arrays.asList:

List<String> alphabet = IntStream.rangeClosed('A','Z')
                        .mapToObj(c -> String.valueOf((char) c))
                        .collect(toList());

[Addendum]:

It's important that we pass a Supplier<Stream> to crossJoin, not the stream itself.
You might also be interested in Tagir Valeev's answer to a related question on Stackoverflow.

Recursive Parallel Search with Shallow Backtracking

2015-06-03T11:23:00.001+02:00

The implementation in the previous post had a serious disadvantage: We generated substitutions for all the letters, and then checked if those substitutions constituted a solution. However, in a more efficient approach some partial substitutions could be rejected out-of-hand as not leading to a solution. This is called "shallow backtracking". In this post I show how to examine the letters as they occur from right to left in the operands, and interleave the checking of arithmetic constraints with the generation of substitutions.

This solution combines the advantages of flatMap-based search in Java 8, parallelization, persistent collections, and recursion. This code that I will show is completely general for cryptarithmetic puzzles with two summands, because the constraints are no longer problem-specific: in fact they encode only the rules of long addition plus a general side condition.

The implementation is also a bit more idiomatic (for Java), with less clutter in the method interfaces, because we keep the read-only operands in instance variables instead of passing them around explicitly.

The really nice thing is that this approach is about 25 times faster than the original parallel flatMap solution with exhaustive search.

Here's the code:

import static com.googlecode.totallylazy.collections.PersistentList.constructors.list;
import static com.googlecode.totallylazy.collections.PersistentMap.constructors.map;
import static java.util.stream.Collectors.toList;

import java.util.Collection;
import java.util.Map;
import java.util.stream.Stream;

import com.googlecode.totallylazy.collections.PersistentList;
import com.googlecode.totallylazy.collections.PersistentMap;

public class SendMoreMoneyShallow {

    static final int PRUNED = -1;
    static final char PADDING = ' ';
    static final PersistentList<Integer> DIGITS = list(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
    
    // padded puzzle arguments: op1 + op2 = op3
    final String op1;
    final String op2;
    final String op3;
    
    public static void main(String[] args) {
        SendMoreMoneyShallow puzzle = new SendMoreMoneyShallow(" send", " more", "money");
        Collection<String> solutions = puzzle.solve();
        System.out.println("There are " + solutions.size() + " solutions: " + solutions);
    }

    public SendMoreMoneyShallow(String op1, String op2, String op3) {
        // the arguments come padded with blanks so they all have the same length
        // there is no need to reverse the strings, because we have random access and can process them backwards
        assert op1.length() == op3.length();
        assert op2.length() == op3.length();
        this.op1 = op1;
        this.op2 = op2;
        this.op3 = op3;
    }
    
    public Collection<String> solve() {
        PersistentMap<Character, Integer> subst = map();
        Collection<String> solutions = go(op1.length() - 1, subst, 0).collect(toList());
        return solutions;
    }

    Stream<String> go(int i, PersistentMap<Character, Integer> subst, int carry) {
        // Each level of recursion accomplishes up to three substitutions of a character with a number. The recursion
        // should end when we run out of characters to substitute. At this point, all constraints have already been
        // checked and therefore the substitutions must represent a solution.
        if (i < 0) {
            return solution(subst);
        }
        
        // the state consists of partial substitutions and the carry. Every time we have made a substitution for a column
        // of letters (from right to left), we immediately check constraints.
        Character sx = op1.charAt(i);
        Character sy = op2.charAt(i);
        Character sz = op3.charAt(i);
        return candidates(sx, subst).stream().parallel().flatMap(x -> {
                PersistentMap<Character, Integer> substX = subst.insert(sx,x);
                return candidates(sy, substX).stream().flatMap(y -> {
                    PersistentMap<Character, Integer> substXY = substX.insert(sy,y);
                    return candidates(sz, substXY).stream().flatMap(z ->   {
                        PersistentMap<Character, Integer> substXYZ = substXY.insert(sz, z);
                        // recurse if not pruned, using the tails of the strings, the substitutions we have just made, and
                        // the value for carry that results from checking the arithmetic constraints
                        int nextCarry = prune(substXYZ, carry, x, y, z);
                        return nextCarry == PRUNED ? Stream.empty() : go(i - 1, substXYZ, nextCarry);
                    });});});
    }

    int prune(PersistentMap<Character, Integer> subst, int carry, Integer x, Integer y, Integer z) {
        // neither of the most significant digits may be 0, and we cannot be sure the substitutions have already been made
        if (subst.getOrDefault(mostSignificantLetter(op1), 1) == 0 || subst.getOrDefault(mostSignificantLetter(op2), 1) == 0) {
            return PRUNED;
        }

        // the column sum must be correct
        int zPrime = x + y + carry;
        if (zPrime % 10 != z) {
            return PRUNED;
        }

        // return next carry
        return zPrime / 10;
    }

    PersistentList<Integer> candidates(Character letter, PersistentMap<Character, Integer> subst) {
        if (letter == PADDING) {
            return list(0);
        }
        // if we have a substitution, use that, otherwise consider only those digits that have not yet been assigned
        return subst.containsKey(letter) ? list(subst.get(letter)) : DIGITS.deleteAll(subst.values());
    }

    Stream<String> solution(PersistentMap<Character, Integer> subst) {
        // transform the set of substitutions to a solution (in this case a String because Java has no tuples)
        int a = toNumber(subst, op1.trim());
        int b = toNumber(subst, op2.trim());
        int c = toNumber(subst, op3.trim());
        return Stream.of("(" + a + "," + b + "," + c + ")");
    }

    static final int toNumber(Map<Character, Integer> subst, String word) {
        // return the integer corresponding to the given word according to the substitutions
        assert word.length() > 0;
        return word.chars().map(x -> subst.get((char)x)).reduce((x, y) -> 10 * x + y).getAsInt();
    }

    static char mostSignificantLetter(String op) {
        return op.trim().charAt(0);
    }
}

And here's a representative performance measurement:

# JMH 1.9.1 (released 40 days ago)
# VM invoker: C:\Program Files\Java\jdk1.8.0_25\jre\bin\java.exe
# VM options: -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 25 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: java8.streams.SendMoreMoneyShallowBenchmark.measureShallowBacktrackingPerformance

Benchmark Mode Cnt Score Error Units
measureShallowBacktrackingPerformance avgt 25 6.319 ± 0.082 ms/op

There is room for still more improvement, because the substitution for "z" in each round is in fact determined by the previous substitutions and need not be guessed.

Recursive Parallel Search

2015-05-27T22:26:00.004+02:00

Bartosz Milewski has been discussing monadic programming in C++. In this post he presents a recursive version of the flatMap-based solution of the "Send more money" cryptarithmetic puzzle.

(BTW: Bartosz' writings always make for excellent reading if you're at all interested in functional programming.)

Here's a Java version of the recursive approach. I have already discussed the non-recursive solution in a previous blog entry. The main advantage over the original solution is that the recursive approach is less redundant and much easier to generalize. The main drawback is that on my machine it is about 6 times slower.

Please note how easy it is to parallelize this recursive solution when using persistent collections, in this case the ones from the TotallyLazy framework. Please spend a minute thinking about how you would go about coding this solution using only classes from the JDK. It's not trivial. (I would guess the same is true of the C++ implementation).

So here goes:

import static com.googlecode.totallylazy.collections.PersistentList.constructors.list;
import static com.googlecode.totallylazy.collections.PersistentMap.constructors.map;
import static java.util.stream.Collectors.toList;

import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

import com.googlecode.totallylazy.collections.PersistentList;
import com.googlecode.totallylazy.collections.PersistentMap;

public class SendMoreMoneyRecursive {

    static final PersistentList<Character> LETTERS = list(uniqueChars("sendmoremoney"));
    static final PersistentList<Integer> DIGITS = list(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);

    public static void main(String[] args) {
        SendMoreMoneyRecursive puzzle = new SendMoreMoneyRecursive();
        Collection<String> solutions = puzzle.solve(LETTERS, DIGITS);
        System.out.println("There are " + solutions.size() + " solutions: " + solutions);
    }

    Collection<String> solve(PersistentList<Character> str, PersistentList<Integer> digits) {
        PersistentMap<Character, Integer> subst = map();
        Collection<String> solutions = go(str, digits, subst).collect(toList());
        return solutions;
    }

    Stream<String> go(PersistentList<Character> str, PersistentList<Integer> digits,
            PersistentMap<Character, Integer> subst) {
        // Each level of nesting accomplishes one substitution of a character by a number. 
        // The recursion ends when we run out of characters to substitute.
        if (str.isEmpty()) {
            return prune(subst);
        }
        return digits.stream().parallel().flatMap(n -> go(str.tail(), digits.delete(n), subst.insert(str.head(), n)));
    }

    Stream<String> prune(PersistentMap<Character, Integer> subst) {
        // we know that we never look up a value that’s not in the map
        if (subst.get('s') == 0 || subst.get('m') == 0) {
            return Stream.empty();
        }
        int send = toNumber(subst, "send");
        int more = toNumber(subst, "more");
        int money = toNumber(subst, "money");
        return send + more == money ? Stream.of(solution(send, more, money)) : Stream.empty();
    }

    String solution(int send, int more, int money) {
        return "(" + send + "," + more + "," + money + ")";
    }

    static final int toNumber(Map<Character, Integer> subst, String word) {
        assert word.length() > 0;
        return word.chars().map(x -> subst.get((char)x)).reduce((x, y) -> 10 * x + y).getAsInt();
    }

    static List<Character> uniqueChars(String s) {
        return s.chars().distinct().mapToObj(d -> Character.valueOf((char) d)).collect(toList());
    }
}

Stream#flatMap() may cause short-circuiting of downstream operations to break

2015-05-01T13:42:00.001+02:00

There is a bug report showing how flatMapping to an infinite stream may lead to non-termination of stream processing even in the presence of a short-circuiting terminal operation.

On StackOverflow, there is a discussion (in fact it was this discussion that led to the entry in the bug database) in which participants agree that the behavior is confusing and perhaps unexpected, but do not agree on whether it is actually a bug. There seem to be different ways to read the spec.

Here's an example:

Stream.iterate(0, i->i+1).findFirst()

works as expected, while

Stream.of("").flatMap(x->Stream.iterate(0, i->i+1)).findFirst()

will end up in an infinite loop.

The behavior of flatMap becomes still more surprising when one throws in an intermediate (as opposed to terminal) short-circuiting operation. While the following works as expected, printing out the infinite sequence of integers

Stream.of("").flatMap(x -> Stream.iterate(1, i -> i + 1)).forEach(System.out::println);

the following code prints out only the "1", but still does not terminate:

Stream.of("").flatMap(x -> Stream.iterate(1, i -> i + 1)).limit(1).forEach(System.out::println);

I cannot imagine a reading of the spec in which this were not a bug.

Yield Return in Java (comment on Benji Webber)

2015-04-27T23:11:00.001+02:00

Benji Webber has said that a feature often missed in Java by C# developers is yield return, and considers very complex ways of bringing this into Java for the implementation of iterators and generators. In particular he discusses a threading approach in some detail, which makes it seem really hard to generate and print out the integers from one to five.

In fact, with Java 8, there is no need for any of that, because the required behavior is already built into the lazy execution model of streams. Iterators and generators are part and parcel of the Stream API.

The following code, for example, will print the infinite series of positive numbers:

Stream.iterate(1, x->x+1).forEach(System.out::println);

Throwing in a limit(5) will give you the one-to-five example.

The reason that this works is that each stream element is only generated (lazily) when a downstream method explicitly asks for it. Contrary to appearances, no list of five elements is ever constructed in the snippet

Stream.iterate(1, x->x+1).limit(5).forEach(System.out::println);

To my eyes, the Java 8 code is even nicer to read than the corresponding code in C#.

Easy exhaustive search with Java 8 Streams

2015-04-27T20:15:00.000+02:00

I have just been reading this post by Mark Dominus on Haskell. It discusses how the Haskell list monad can be used to hide some of the glue code involved in doing exhaustive searches. Java 8 Streams, which are somewhat similar to Haskell lists in also being monadic, lend themselves to the same style of coding.

The example used in the post I have quoted is the well-known crypt-arithmetics puzzle in which you are asked to find all possible ways of mapping the letters S, E, N, D, M, O, R, Y to distinct digits 0 through 9 (where we may assume that S is not 0) so that the following comes out true:

    S E N D
  + M O R E
  ---------
  M O N E Y

Here's my Java 8 port of Mark's Haskell example.

public class SendMoreMoney {

    static final List<Integer> DIGITS = unmodifiableList(asList(0,1,2,3,4,5,6,7,8,9));
    
    public static void main(String[] args) {
        List<String> solutions = 
            remove(DIGITS, 0).stream().flatMap( s ->
            remove(DIGITS, s).stream().flatMap( e ->
            remove(DIGITS, s, e).stream().flatMap( n ->
            remove(DIGITS, s, e, n).stream().flatMap( d ->
            remove(DIGITS, s, e, n, d).stream().flatMap( m ->
            remove(DIGITS, s, e, n, d, m).stream().flatMap( o ->
            remove(DIGITS, s, e, n, d, m, o).stream().flatMap( r ->
            remove(DIGITS, s, e, n, d, m, o, r).stream().flatMap( y ->
                { int send = toNumber(s, e, n, d);
                  int more = toNumber(m, o, r, e);
                  int money = toNumber(m, o, n, e, y);
                  return  send + more == money ? Stream.of(solution(send, more, money)) : Stream.empty();
                }
            ))))))))
            .collect(toList());
           
         System.out.println(solutions);
    }

    static String solution(int send, int more, int money) {
        return "(" + send + "," + more + "," + money + ")";
    }
    
    static final int toNumber(Integer... digits) {
        assert digits.length > 0;
        return Stream.of(digits).reduce((x,y) -> 10*x + y).get();
    }
    
    static final List<Integer> remove(List<Integer> xs, Integer... ys) {
        // this naive implementation is O(n^2).
        List<Integer> zs = new ArrayList<>(xs);
        zs.removeAll(asList(ys));
        return zs;
    }
}

The minor optimization of not unncecessarily recomputing "send" and "more" is left out for the sake of readability. The methods remove() - which implements list difference - toNumber(), and solution() have simple implementations. Of these, toNumber() is again a lot like the corresponding Haskell code. Method solution() here returns a String because Java does not have tuples.

Too bad that in Java one must have the nested method calls, but the formatting goes some way to hide this. All in all, I think this is quite nice.

But how fast is it? I did a simple micro-benchmark with JMH 1.9.1 (available from Maven Central) on my laptop computer, which is a quad-core machine with an Intel i7 processor.

Here are the measurement parameters:

# JMH 1.9.1 (released 5 days ago)
# VM invoker: C:\Program Files\Java\jdk1.8.0_25\jre\bin\java.exe
# VM options: -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 25 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op

I measured the flatMap solution against the equivalent formulation with eight nested forEach-loops and an external accumulator variable. The flatMap solution is about half as fast. Here's a representative measurement:

Benchmark                        Mode Cnt    Score   Error Units
measureFlatMapSearchPerformance avgt   25 662.377 ± 3.747 ms/op
measureForLoopSearchPerformance avgt   25 316.105 ± 3.823 ms/op

The nice thing abbout Streams is they are so easily parallelizable. Just throw in a .parallel() in the first line like this:

   remove(DIGITS, 0).stream().parallel().flatMap( s ->

leaving everything else unchanged, and the (parallel) flatMap version becomes twice as fast as the (serial) for-loop version:

Benchmark                        Mode Cnt    Score   Error Units
measureFlatMapSearchPerformance avgt   25 168.278 ± 1.700 ms/op
measureForLoopSearchPerformance avgt   25 315.806 ± 2.878 ms/op

The Y Combinator

2013-09-16T12:16:00.001+02:00

The Y combinator is one of the most beautiful ideas in programming theory. It allows one to define recursive functions from non-recursive, anonymous functions. The "anonymous" is what makes this particularly amazing: We are going to have recursion without having a way for a function to refer to itself by name. The usual example is the factorial function.

The theoretical background is beautifully and thoroughly explained by Mike Vanier in his article The Y Combinator (Slight Return), which I am not going to repeat here. (For a very short, readable intro to the topic, you might also refer to Deron Meranda's blog post on fixpoints.)

Many people seem to enjoy implementing the Y combinator, even in Java. Most implementations seem in some way to go back to this one by Ken Shirriff, which is in Java 7 and almost unreadable due to the use of inner classes. And here is a Java 8 implementation by Arul Dhesiaseelan. Note that the code on that last-mentioned post uses a method name Y, and that the comments on it propose a simplification which uses the this reference. Both are strictly not allowed, one might say that this is cheating. So here is my own, very similar but non-cheating implementation:

 
import java.util.function.Function;

public class YCombinator {

    // First a few abbreviations

    /** Function on T returning T. */
    interface Func<T> extends Function<T,T> {
    }

    /**
     * Higher-order function returning a T function.
     * <br>F: F -> (T -> T)
     * <br>This is the type of the Y combinator subexpressions.
     */
    interface FuncToTFunc<T> extends Function<FuncToTFunc<T>, Func<T>> {
    }

    /**
     * Function from a function on T functions to a T function.
     * <br>((T -> T) -> (T -> T)) -> (T -> T)
     * <br>This is the type of the fixed point operator.
     */
    interface Fix<T> extends Function<Func<Func<T>>, Func<T>>  {
    }

    // And now the real thing

    /**
     * Computes the factorial with the applicative-order Y combinator.
     * <br>Y = λr.(λf.(f f)) λf.(r λx.((f f) x))
     */
    public static int factorial(int n) {
        return  // Y combinator
                ((Fix<Integer>) r ->
                    ((FuncToTFunc<Integer>) (f -> f.apply(f)))
                            .apply(f -> r.apply(x -> f.apply(f).apply(x)))
                )
                .apply(
                        // Recursive function generator
                        f -> x -> (x == 0) ? 1 : x * f.apply(x - 1)
                )
                .apply(
                        // Argument
                        n);
    }
}

This code really computes the factorial function. And it does this using only anonymous functions: no function names or variables anywhere except lamba parameters. (You can plug-in any other recursive function you might want to compute, e. g. Fibonacci, or Ackermann, etc.).

The regrettable thing is having to have those casts. It would be nice to be able to get rid of them.

-- Sebastian

Higher-Order Functions in Java 8 (Part 2) - Laziness

2013-06-14T11:45:00.000+02:00

In this post I'd like to show that the typical elegance that results from the combination of lazy evaluation and higher-order functions in functional programming languages can sometimes also be had in Java. As our example, let's look at generating the first n primes, for some arbitrary n. We'll use the method of trial division, which determines the primality of a number k by testing whether it is divisible by any prime smaller than k.We will see how to use Java lambda expressions to generate an infinite sequence of integers and filter out the non-primes from it.

Of course, we will not really generate an infinite sequence, but rather an unbounded one, and limit ourselves to taking out n elements. This is where lazy evaluation comes in. Being lazy means computing a value only when it is actually needed. (The opposite - immediate evaluation - is called being strict. The article Functional Programming For the Rest of Us, which offers an elementary and informal introduction to functional programming concepts from a Java perspective, contains, among other things, a discussion of laziness and its advantages as well as disadvantages.)

Some functional languages like Haskell are inherently lazy, others like Scala have both kinds of evaluation. Java 8 also falls into this latter category: Traditionally, Java is a strict language: all method parameters are evaluated before a method is called etc. However, the new Streams API is inherently lazy: An object is only created from the stream when there is demand for it, i. e. only when some terminal operation like collect or forEach is called. And lambda expressions allow passing functions as arguments for delayed evaluation.

Let's return to our example. In Scala, we might write the following code, lifted from Louis Botterill's blog:

def primes : Stream[Int] = {  
   var is = Stream from 2  
   def sieve(numbers: Stream[Int]): Stream[Int] = {  
    Stream.cons(  
     numbers.head,  
     sieve(for (x <- numbers.tail if x % numbers.head > 0) yield x))  
   }  
   sieve(is)  
  }

By the way, this algorithm is not the Sieve of Eratosthenes, although it is often so presented. This paper by Melissa O'Neill has all the details. It's an interesting read: O'Neill describes an optimized implementation of the genuine sieve, making use of heap-like data structures in Haskell. I'd be interested to know if someone has ported this to Java 8. We'll stick with the unfaithful sieve (O'Neill 's term) for now.

What's important in the above code is the ability to access the head of the stream and still lazily operate on the rest.Unfortunately, in Java findFirst is also one of those terminal operations on streams, so there is no obvious way to port this bit to Java without creating our own lazy list structure. We'll do that by building on the simple ConsList implementation alluded to in the previous post, and extending it in several ways:

add a factory method that accepts a Function<T,T> to generate the elements in the tail in sequence
add a factory method that accepts a Supplier<FilterableConsList<T>> to create a tail
add an instance method that accepts a Predicate<T> to filter out non-matching elements from a list
make the implementation class lazy for the tail (but strict for the head).

As a further goodie, we can easily implement the collection interface by providing an iterator and make the thing streamable by providing a spliterator of unknown size that is based on the iterator. The coding is lengthy but straightforward, it is shown at the bottom of this post. With it, the driver for generating the first 100 primes becomes:

 LazyConsList<Integer> s = sieve(LazyConsList.newList(2, x -> x + 1));   
 s.stream().limit(100).forEach(x->System.out.print(x + " "));

where the interesting sieve-method is this:

 private LazyConsList<Integer> sieve(FilterableConsList<Integer> s) {  
  Integer p = s.head();  
  return cons(p, () -> sieve(s.tail().filter(x -> x % p != 0)));  
 }

Now I think that looks pretty neat (and very similar to the corresponding Scala code). That's all I wanted to show. However, as Java 8 is still very new, it may not be quite obvious to many readers what's going on here. Let me try to explain.

The text declares very clearly what it takes to output the first 100 primes: Create a list of integers starting at two, sieve it, take the first 100 elements and output them severally, where sieving consists of taking the first element of the list, then taking the rest disregarding any multiples of the first element, and sieving that again.

Procedurally, the picture is very different: some method calls are deferred until the tail of a cons-list is accessed. In particular, there is no infinite loop, because every call to sieve in the expression () -> sieve(...)is delayed until forEach requests another prime for output. Here's the sequence:

forEach causes the stream to get the next element from its underlying iterator
whereupon the iterator will return the first list element and change its own state by retrieving the tail of its list, which will be a list based on a Supplier, namely the one which came out of the previous call to sieve
the iterator's call to tail will in turn lead to strict evaluation of the method arguments to sieve, accessing the tail of the generator-based list and creating a new list from it by adding a new filter
so that when sieve is executed, a new cons-cell with a supplied tail is constructed, the head of which is the first element of the filtered list from step 4 and therefore a prime
which will be the next element returned from the stream when processing continues at step 1.

Contrary to appearances (and to the suggestive wording of the declarative meaning of the program) no list containing 100 primes is ever constructed and then output. Instead, you might say that the processing takes place "vertically", constructing a series of one-element lists and outputting their heads. You can see that clearly in a debugger, or by including some logging when entering and leaving the sieve.

-- Sebastian

Here's the complete code for our lazy list. I'm sure there's much to improve. One thing is that the toString() method that is inherited from the superclass is based on the list's iterator and will not terminate for an infinite list. I'd be grateful for suggestions or error reports (perhaps someone might like to write some unit tests?). First the interface:

 
import java.util.function.Predicate;

/**  
  * Interface for a list that may be filtered.  
  * @param <T> element type  
  * @author Sebastian Millies  
  */  
 public interface FilterableConsList<T> extends ConsList<T> {  
   
    /**  
    * Applies the given filter to this list.  
    *  
    * @param f a filter predicate  
    * @return a list containing the elements of this list for which the filter   
    * predicate returns <code>true</code>)  
    */  
   FilterableConsList<T> filter(Predicate<T> f);  
     
   /**  
    * Returns the tail of this list.  
    *  
    * @return tail  
    * @throws EmptyListException if the list is empty  
    */  
   @Override  
   FilterableConsList<T> tail();  
 }

And then the implementation:

 import java.util.AbstractCollection;  
 import java.util.Iterator;  
 import java.util.Objects;  
 import java.util.Spliterator;  
 import java.util.Spliterators;  
 import java.util.function.Predicate;  
 import java.util.function.Supplier;  
 import java.util.function.UnaryOperator;  
   
 /**  
  * An immutable, filterable list that can be based on a generator function and  
  * is strict for the head and lazy for the tail. The tail can also be computed  
  * on demand by a Supplier instance. Via the Collection interface, this class   
  * is connected to the streams API.  
  *  
  * @param <T> element type  
  * @author Sebastian Millies  
  */  
 public abstract class LazyConsList<T> extends AbstractCollection<T> implements FilterableConsList<T> {  
   
   // ------------- Factory methods  
   
   /**  
    * Create a non-empty list based on a generator function. The generator is  
    * evaluated when the first element in the tail of the list is accessed.  
    * @param <T> element type  
    * @param head the first element  
    * @param next the generator function.  
    * @return a new list with head as its head and the tail generated by next  
    */  
   public static <T> LazyConsList<T> newList(final T head, final UnaryOperator<T> next) {  
     Objects.requireNonNull(next);  
     return new FunctionTail<>(head, next, x->true);  
   }  
   
   /**  
    * Create a list containing the given elements.  
    *  
    * @param <T> element type  
    * @param elems elements  
    * @return a new list  
    */  
   public static <T> LazyConsList<T> newList(T... elems) {  
     LazyConsList<T> list = nil();  
     for (int i = elems.length - 1; i >= 0; i--) {  
       list = cons(elems[i], list);  
     }  
     return list;  
   }  
     
   /**  
    * Add an element at the front of a list.  
    * @param <T> element type  
    * @param elem new element  
    * @param list list to extend  
    * @return a new list with elem as its head and list as its tail  
    */  
   public static <T> LazyConsList<T> cons(T elem, FilterableConsList<T> list) {  
     Objects.requireNonNull(list);  
     return new Cons<>(elem, list, x->true);  
   }  
   
   /**  
    * Add an element at the front of a list that will be supplied by the given supplier.  
    * @param <T> element type  
    * @param elem new element  
    * @param listSupplier list to extend  
    * @return a new list with elem as its head and list as its tail  
    */  
   public static <T> LazyConsList<T> cons(T elem, Supplier<FilterableConsList<T>> listSupplier) {  
     Objects.requireNonNull(listSupplier);  
     return new SuppliedTail<>(elem, listSupplier, x->true);  
   }  
     
   /**  
    * Create an empty list.  
    * @param <T> element type  
    * @return an empty list  
    */  
   public static <T> LazyConsList<T> nil() {  
     return new Nil<>();  
   }  
   
   /**  
    * Utility method that recurses through the specified list and its  
    * tails until it finds the first one with a head matching the filter.  
    * @param <T> element type  
    * @param list list that is filtered  
    * @param filter element filter  
    * @return the specified list or one of its tails  
    */  
   protected static <T> FilterableConsList<T> applyFilter(FilterableConsList<T> list, Predicate<T> filter) {  
     FilterableConsList<T> filtered = list;  
     while (!filter.test(filtered.head())) {  
       filtered = filtered.tail();  
     }  
     return filtered;  
   }  
   
   
   // ------------- Constructor  
   
   protected LazyConsList() {  
   }  
     
   // ------------- Collection interface  
     
   @Override  
   public final Iterator<T> iterator() {  
     return new Iterator<T>() {  
       private ConsList<T> current = LazyConsList.this;  
         
       @Override  
       public boolean hasNext() {  
         return !current.isNil();  
       }  
   
       @Override  
       public T next() {  
         T head = current.head();  
         current = current.tail();  
         return head;  
       }  
     };  
   }  
   
   @Override  
   public final int size() {  
     throw new UnsupportedOperationException("the size of a lazy list cannot be determined");  
   }  
     
   @Override  
   public Spliterator<T> spliterator() {  
     return Spliterators.spliteratorUnknownSize(iterator(), 0);  
   }  
     
   // ------------- concrete subclasses  
     
   /**  
    * A non-empty list.  
    * @param <T> element type  
    */  
   private static class Cons<T> extends LazyConsList<T> {  
   
     private final T head;  
     private final FilterableConsList<T> tail;  
     private final Predicate<T> filter;  
       
     public Cons(T head, FilterableConsList<T> tail, Predicate<T> filter) {  
       assert filter.test(head);  
       this.head = head;  
       this.tail = tail;  
       this.filter = filter;  
     }  
   
     @Override  
     public T head() {  
       return head;  
     }  
   
     @Override  
     public FilterableConsList<T> tail() {  
       return tail;  
     }  
   
      @Override  
     public boolean isNil() {  
       return false;  
     }  
       
     @Override  
     public FilterableConsList<T> filter(Predicate<T> f) {  
       Predicate<T> newFilter = filter.and(f);  
       FilterableConsList<T> filtered = applyFilter(this, newFilter);  
       return new Cons(filtered.head(), filtered.tail(), newFilter);  
     }  
   }  
   
   /**  
    * A non-empty list based on a generator function. The tail is lazily computed  
    * on demand and cached.  
    * @param <T> element type  
    */  
   private static class FunctionTail<T> extends LazyConsList<T> {  
   
     private final T head;  
     private final UnaryOperator<T> next;  
     private final Predicate<T> filter;  
     private FilterableConsList<T> tailCache;  
   
     public FunctionTail(T head, UnaryOperator<T> next, Predicate<T> filter) {  
       assert filter.test(head);  
       this.head = head;  
       this.next = next;  
       this.filter = filter;  
     }  
   
     @Override  
     public T head() {  
       return head;  
     }  
   
     @Override  
     public FilterableConsList<T> tail() {  
       // construct a new lazy list with the first element that passes the filter.  
       // use the generator function to construct the elements.  
       if (tailCache == null) {  
         T nextHead = head;  
         do {  
          nextHead = next.apply(nextHead);  
         } while (!filter.test(nextHead));  
         tailCache = new FunctionTail(nextHead, next, filter);  
       }  
       return tailCache;  
     }  
   
     @Override  
     public boolean isNil() {  
       return false;  
     }  
        
     @Override  
     public FilterableConsList<T> filter(Predicate<T> f) {  
       Predicate<T> newFilter = filter.and(f);  
       FilterableConsList<T> filtered = applyFilter(this, newFilter);  
       return new FunctionTail(filtered.head(), next, newFilter);  
     }  
   }  
   
   /**  
    * A non-empty list based on a supplied computation. The tail is lazily computed  
    * on demand and cached.  
    * @param <T> element type  
   */  
   private static class SuppliedTail<T> extends LazyConsList<T> {  
   
     private final T head;  
     private final Supplier<FilterableConsList<T>> supplier;  
     private final Predicate<T> filter;  
     private FilterableConsList<T> tailCache;  
   
     public SuppliedTail(T head, Supplier<FilterableConsList<T>> supplier, Predicate<T> filter) {  
       assert filter.test(head);  
       this.head = head;  
       this.supplier = supplier;  
       this.filter = filter;  
     }  
   
     @Override  
     public T head() {  
       return head;  
     }  
   
     @Override  
     public FilterableConsList<T> tail() {  
       // construct a new lazy list with the first element that passes the filter.  
       // delegate to the supplied function to create the tail of the current list.
       if (tailCache == null) {  
         tailCache = applyFilter(supplier.get(), filter);  
       }  
       return tailCache;  
     }  
   
     @Override  
     public boolean isNil() {  
       return false;  
     }  
   
     @Override  
     public FilterableConsList<T> filter(Predicate<T> f) {  
       Predicate<T> newFilter = filter.and(f);  
       FilterableConsList<T> filtered = applyFilter(this, newFilter);  
       return new Cons(filtered.head(), filtered.tail(), newFilter);  
     }  
   }  
     
   /**  
    * An empty list. Trying to access components of this class will result in   
    * an EmptyListException at runtime.  
    * @param <T> element type  
    */  
   private static class Nil<T> extends LazyConsList<T> {  
   
     @Override  
     public T head() {  
       throw new EmptyListException();  
     }  
   
     @Override  
     public FilterableConsList<T> tail() {  
       throw new EmptyListException();  
     }  
   
     @Override  
     public boolean isNil() {  
       return true;  
     }  
   
     @Override  
     public FilterableConsList<T> filter(Predicate<T> f) {  
       return this;  
     }  
      
   }  
 }

Higher-Order Functions in Java 8 (Part 1)

2013-06-14T10:55:00.000+02:00

This post will show how the first few examples (pages 4 - 7) from John Hughes' 1984 paper Why Functional Programming Matters ([WHYFP]) may be implemented in Java 8. The paper has been revised several times, here's a link to the 1990 version. It's a useful starting point to get acquainted with functional programming idioms. What's more, we'll reconstruct part of the new Java Collection (or Streams) API, namely the map/reduce operations.

What this post is not:

a general discussion of what's new in Java 8. Here is an excellent overview. In addition, I found this description of multiple inheritance resolution very helpful.
An introduction to functional programming with lambdas in Java.Good places to start with that topic would be the "official" State of the Lambda and Maurice Naftalin's Lambda FAQ or Angelika Langer's Lambda Tutorial.

I'd like to make one very technical point: Often, lambda expressions in Java are presented as syntactic sugar for (anonymous) inner classes. They're not. They are translated to byte code differently and make use of new JVM instructions. If you would like to know more about this, read Brian Goetze's article Translation of Lambda Expressions and consult the Java API docs, perhaps starting with LambdaMetaFactory.

The code examples have been compiled and tested with Java Lambda b88 and the Netbeans IDE Development version. (Build jdk8lambda-1731-on-20130523). They may or may not work with the latest builds, which can be downloaded from

NetBeans IDE (IntelliJ IDEA CE 12.1.3 or higher also has good Java 8 support)
Java 8 JDK

Set up your environment as follows:

Download and unzip NetBeans and Java 8
Run NetBeans with command line parameter --jdkhome <path to jdk8>

Before we can get started, we need to define the recursive list data structure with which we're going to work: a list is something that is either nil (empty) or a cons-cell consisting of a head and a tail which is itself a list. Here's the interface:

 
   public interface ConsList<T> {  
   /**  
    * Returns the first element of this list.   
    *  
    * @return head  
    * @throws EmptyListException if the list is empty  
    */  
   T head();  

   /**  
    * Returns the tail of this list.  
    *  
    * @return tail  
    * @throws EmptyListException if the list is empty  
    */  
   ConsList<T> tail();  

   /**  
    * Tests if this list is nil. Nil is the empty list.  
    *  
    * @return <code>true</code> if the list is empty, otherwise <code>false</code>  
    */  
   boolean isNil();   
 }

The implementation class (here called SimpleConsList) is not shown. It should provide some static factory methods (signatures see below) and a useful toString method. Implementation guidelines can be found here. (An enhanced version, which you can use with only minimal adjustments to some code shown later, is included in the follow-up to this post.)

   
   /**  
    * Create a list containing the given elements.  
    *  
    * @param <T> element type  
    * @param elems elements  
    * @return a new list  
    */ 
   public static <T> SimpleConsList<T> newList(T... elems);  

   /**  
    * Add an element at the front of a list.  
    * @param <T> element type  
    * @param elem new element  
    * @param list list to extend  
    * @return a new list with elem as its head and list as its tail  
    */  
   public static <T> SimpleConsList<T> cons(T elem, ConsList<T> list);  

   /**  
    * Create an empty list.  
    * @param <T> element type  
    * @return an empty list  
    */  
   public static <T> SimpleConsList<T> nil();

Here's the motivating example taken from [WHYFP]: Suppose we wanted to add all the elements of a list of integers. Consider the following code:

 
   public static Integer sum(ConsList<Integer> list) {  
     if (list.isNil()) {  
       return 0;  
     } else {  
       return list.head() + sum(list.tail());  
     }  
   }

What's really Integer specific here? Only the type, the neutral element, and the addition operator. This means that the computation of a sum can be modularized into a general recursive pattern and the specific parts. This recursive pattern is conventionally called foldr. Here's it:

   public static <T, R> R foldr(BiFunction<T, R, R> f, R x, ConsList<T> list) {  
     if (list.isNil()) {  
       return x;  
     }  
     return f.apply(list.head(), foldr(f, x, list.tail()));  
   }

BiFunction is a functional interface from the package java.util.function. It is the type of functions taking two arguments. Now we may print out the sum of a list of integers like this:

  
  ConsList<Integer> ints = SimpleConsList.newList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
  System.out.println("Sum = " + foldr((x, y) -> x + y, 0, ints));

foldr is called a higher-order function, because it takes another function as its argument. (This is sometimes expressed from a programming perspective as having "code as data".)

There is also a closely related way to traverse a list, called foldl, which does the function application from left to right:

   public static <T, R> R foldl(BiFunction<T, R, R> f, R x, ConsList<T> list) {  
     if (list.isNil()) {  
       return x;  
     }  
     return foldl(f, f.apply(list.head(), x), list.tail());  
   }

In fact a variant of foldl has been built into the Java Streams API. It's called reduce, because it may be used to "reduce" all the elements of a list to a single value (e. g. their sum). The naming is unfortunate, in my opinion, because as we shall see it relates only to a special (if typical) case of what can be done with it, but is usual in functional programming. Here's corresponding code using that built-in method on a java.util.List

 List<Integer> intList = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);  
 System.out.println("Sum = " + intList.stream().reduce(0, (x, y) -> x + y));

There is no correspondence to foldr in the Java API. Perhaps because foldl is tail recursive. However, reduce also differs in important ways from foldl: It requires the folding function to be associative (as in our example), and it is in fact not guaranteed to process the stream elements in a left-to-right order.

The following code demonstrates how foldr (and its relative foldl) can be used to write many other functions on lists with no more programming effort. Here's how it works (cf. [WHYFP]):

length increments 0 as many times as there are cons'es
copy cons'es the list elements onto the front of an empty list from right to left
reverse is similar, except it uses foldl to do the cons-ing from left to right. Just for fun, the lambda expression has been replaced with a method reference.
append cons'es the elements of chars1 onto the front of chars2

     ConsList<Integer> ints = SimpleConsList.newList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);  
     ConsList<Boolean> bools = SimpleConsList.newList(true, true, false);  
     ConsList<Character> chars1 = SimpleConsList.newList('a', 'b', 'c');  
     ConsList<Character> chars2 = SimpleConsList.newList('x', 'y', 'z'); 
 
     System.out.println("Product = " + foldr((x, y) -> x * y, 1, ints));  
     System.out.println("One true = " + foldr((x, y) -> x || y, false, bools));  
     System.out.println("All true = " + foldr((x, y) -> x && y, true, bools));  
     System.out.println("Length = " + foldr((Boolean x, Integer y) -> y + 1, 0, bools));  
     System.out.println("Copy = " + foldr((x, y) -> cons(x, y), nil(), chars1));  
     System.out.println("Reverse = " + foldl(SimpleConsList::cons, nil(), chars1));  
     System.out.println("Append = " + foldr((x, y) -> cons(x, y), chars2, chars1));

There's one more higher-order function I'd like to mention, namely map. This function takes a function argument f and applies f to all elements of a list, yielding a new list. For example, we may use it to double every element in a list:

   System.out.println("Double all = " + map(x -> 2 * x, ints));

The map-function is also part of the Java Streams API. Following the derivation in [WHYFP] we may reconstruct it like this:

  // map f = foldr (cons . f ) nil   
  public static <T, R> ConsList<R> map(Function<T, R> f, ConsList<T> list) {  
    Function<R, Function<ConsList<R>, ConsList<R>>> cons = curry((x, y) -> cons(x, y));  
    return foldr((T x, ConsList<R> y) -> cons.compose(f).apply(x).apply(y), nil(), list);  
  }

Here we observe several things: map makes use of functional composition, a standard operator which is built into the interface java.util.function.Function. The expression cons.compose(f) returns a new function that first applies f to its argument and then applies cons to the result of that application. (Download the JDK source code and study how compose is implemented.)

In order to apply functional composition, we must first "curry" the binary function. Currying transforms a single function of n arguments into n functions with a single argument each. Of course, we could just have defined cons in the method above as x -> y -> cons(x, y), but it's perhaps instructive to see a method that curries a BiFunction (side note: unfortunately, I see no general way to curry an n-ary function for arbitrary n in Java, the type system is just not flexible enough):

public static <T, U, R> Function<T, Function<U, R>> curry(BiFunction<T, U, R> f) {  
    return (T t) -> (U u) -> f.apply(t, u);  
}

Note that no actual function invocation takes place, i. e. apply is not invoked until both function arguments have been supplied. After all this effort, although the coding may look intimidating, map is just like our simple copy-list example above, except that the list elements are not just copied, but first transformed through f.

Now we have all the pieces in hand to sum the elements of a matrix, where a matrix (implementation not shown) is represented as a list of lists. The function sumList adds up all the rows, and then the leftmost application of that function adds up the row totals to get the sum of the whole matrix.

 Matrix<Integer> matrix = new Matrix(ints, ints, ints);  
 Function<ConsList<Integer>, Integer> sumList = x -> foldr((a, b) -> a + b, 0, x);  
 System.out.println("Matrix sum = " + sumList.compose((Matrix<Integer> m) -> map(x -> sumList.apply(x), m)).apply(matrix));

In my opinion, this is not too verbose, but they certainly have not turned Java into Scala or Haskell. In Scala the same expression would read something like val summatrix: List[List[Int]] => Int = sum.compose(map(sum))

In fact, by partially evaluating the above expression with respect to functional composition and application to the matrix argument, we can manually derive the following equivalent expression, which is somewhat easier to understand: sumList.apply(map(x -> sumList.apply(x), matrix))

Mark Mahieu has written on the topic of partial evaluation in Java.

In this post you have seen how Java 8 makes it possible to pass functions around not only to apply them to data but also to manipulate them and combine them into other functions. You have also seen how the map-/reduce operations from the Java 8 Streams API may be understood in these terms. This has been textbook stuff. In another post, I'll discuss the more exciting (I hope) topic of how lambda expressions are especially useful when combined with lazy evaluation.

-- Sebastian

Note: If you're curious how to use higher-order functions without language support for functional expressions, there are Java 7 (no lambda) versions of foldr, foldl, and map on Adrian Walker's blog.