SyntaxHighlighter

14 Jun 2013

Higher-Order Functions in Java 8 (Part 1)

This post will show how the first few examples (pages 4 - 7) from John Hughes' 1984 paper Why Functional Programming Matters ([WHYFP]) may be implemented in Java 8. The paper has been revised several times, here's a link to the 1990 version. It's a useful starting point to get acquainted with functional programming idioms. What's more, we'll reconstruct part of the new Java Collection (or Streams) API, namely the map/reduce operations.

What this post is not:
I'd like to make one very technical point: Often, lambda expressions in Java are presented as syntactic sugar for (anonymous) inner classes. They're not. They are translated to byte code differently and make use of new JVM instructions. If you would like to know more about this, read Brian Goetze's article Translation of Lambda Expressions and consult the Java API docs, perhaps starting with LambdaMetaFactory

The code examples have been compiled and tested with Java Lambda b88 and the Netbeans IDE Development version. (Build jdk8lambda-1731-on-20130523). They may or may not work with the latest builds, which can be downloaded from
Set up your environment as follows:
  1. Download and unzip NetBeans and Java 8
  2. Run NetBeans with command line parameter --jdkhome <path to jdk8>
Before we can get started, we need to define the recursive list data structure with which we're going to work: a list is something that is either nil (empty) or a cons-cell consisting of a head and a tail which is itself a list. Here's the interface:
 
   public interface ConsList<T> {  
   /**  
    * Returns the first element of this list.   
    *  
    * @return head  
    * @throws EmptyListException if the list is empty  
    */  
   T head();  

   /**  
    * Returns the tail of this list.  
    *  
    * @return tail  
    * @throws EmptyListException if the list is empty  
    */  
   ConsList<T> tail();  

   /**  
    * Tests if this list is nil. Nil is the empty list.  
    *  
    * @return <code>true</code> if the list is empty, otherwise <code>false</code>  
    */  
   boolean isNil();   
 }  
The implementation class (here called SimpleConsList) is not shown. It should provide some static factory methods (signatures see below) and a useful toString method. Implementation guidelines can be found here. (An enhanced version, which you can use with only minimal adjustments to some code shown later, is included in the follow-up to this post.)
   
   /**  
    * Create a list containing the given elements.  
    *  
    * @param <T> element type  
    * @param elems elements  
    * @return a new list  
    */ 
   public static <T> SimpleConsList<T> newList(T... elems);  

   /**  
    * Add an element at the front of a list.  
    * @param <T> element type  
    * @param elem new element  
    * @param list list to extend  
    * @return a new list with elem as its head and list as its tail  
    */  
   public static <T> SimpleConsList<T> cons(T elem, ConsList<T> list);  

   /**  
    * Create an empty list.  
    * @param <T> element type  
    * @return an empty list  
    */  
   public static <T> SimpleConsList<T> nil();  

Here's the motivating example taken from [WHYFP]: Suppose we wanted to add all the elements of a list of integers. Consider the following code:
 
   public static Integer sum(ConsList<Integer> list) {  
     if (list.isNil()) {  
       return 0;  
     } else {  
       return list.head() + sum(list.tail());  
     }  
   } 

What's really Integer specific here? Only the type, the neutral element, and the addition operator. This means that the computation of a sum can be modularized into a general recursive pattern and the specific parts. This recursive pattern is conventionally called foldr. Here's it:
   public static <T, R> R foldr(BiFunction<T, R, R> f, R x, ConsList<T> list) {  
     if (list.isNil()) {  
       return x;  
     }  
     return f.apply(list.head(), foldr(f, x, list.tail()));  
   }   
BiFunction is a functional interface from the package java.util.function. It is the type of functions taking two arguments. Now we may print out the sum of a list of integers like this:
  
  ConsList<Integer> ints = SimpleConsList.newList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
  System.out.println("Sum = " + foldr((x, y) -> x + y, 0, ints)); 
foldr is called a higher-order function, because it takes another function as its argument. (This is sometimes expressed from a programming perspective as having "code as data".)

There is also a closely related way to traverse a list, called foldl, which does the function application from left to right:
   public static <T, R> R foldl(BiFunction<T, R, R> f, R x, ConsList<T> list) {  
     if (list.isNil()) {  
       return x;  
     }  
     return foldl(f, f.apply(list.head(), x), list.tail());  
   }   

In fact a variant of foldl has been built into the Java Streams API. It's called reduce, because it may be used to "reduce" all the elements of a list to a single value (e. g. their sum). The naming is unfortunate, in my opinion, because as we shall see it relates only  to a special (if typical) case of what can be done with it, but is usual in functional programming. Here's corresponding code using that built-in method on a java.util.List
 List<Integer> intList = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);  
 System.out.println("Sum = " + intList.stream().reduce(0, (x, y) -> x + y));  

There is no correspondence to foldr  in the Java API. Perhaps because foldl is tail recursive. However, reduce also differs in important ways from foldl: It requires the folding function to be associative (as in our example), and it is in fact not guaranteed to process the stream elements in a left-to-right order.

The following code demonstrates how foldr (and its relative foldl) can be used to write many other functions on lists with no more programming effort. Here's how it works (cf. [WHYFP]):
  • length increments 0 as many times as there are cons'es
  • copy  cons'es the list elements onto the front of an empty list from right to left
  • reverse is similar, except it uses foldl to do the cons-ing from left to right. Just for fun, the lambda expression has been replaced with a method reference.
  • append cons'es the elements of chars1 onto the front of chars2 
     ConsList<Integer> ints = SimpleConsList.newList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);  
     ConsList<Boolean> bools = SimpleConsList.newList(true, true, false);  
     ConsList<Character> chars1 = SimpleConsList.newList('a', 'b', 'c');  
     ConsList<Character> chars2 = SimpleConsList.newList('x', 'y', 'z'); 
 
     System.out.println("Product = " + foldr((x, y) -> x * y, 1, ints));  
     System.out.println("One true = " + foldr((x, y) -> x || y, false, bools));  
     System.out.println("All true = " + foldr((x, y) -> x && y, true, bools));  
     System.out.println("Length = " + foldr((Boolean x, Integer y) -> y + 1, 0, bools));  
     System.out.println("Copy = " + foldr((x, y) -> cons(x, y), nil(), chars1));  
     System.out.println("Reverse = " + foldl(SimpleConsList::cons, nil(), chars1));  
     System.out.println("Append = " + foldr((x, y) -> cons(x, y), chars2, chars1));  

There's one more higher-order function I'd like to mention, namely map. This function takes a function argument f and applies f to all elements of a list, yielding a new list. For example, we may use it to double every element in a list:
   System.out.println("Double all = " + map(x -> 2 * x, ints));
The map-function is also part of the Java Streams API. Following the derivation in [WHYFP] we may reconstruct it like this:
  // map f = foldr (cons . f ) nil   
  public static <T, R> ConsList<R> map(Function<T, R> f, ConsList<T> list) {  
    Function<R, Function<ConsList<R>, ConsList<R>>> cons = curry((x, y) -> cons(x, y));  
    return foldr((T x, ConsList<R> y) -> cons.compose(f).apply(x).apply(y), nil(), list);  
  }  
Here we observe several things: map makes use of functional composition, a standard operator which is built into the interface java.util.function.Function. The expression cons.compose(f) returns a new function that first applies f to its argument and then applies cons to the result of that application. (Download the JDK source code and study how compose is implemented.) 

In order to apply functional composition, we must first "curry" the binary function. Currying transforms a single function of n arguments into n functions with a single argument each. Of course, we could just have defined cons in the method above as x -> y -> cons(x, y), but it's perhaps instructive to see a method that curries a BiFunction (side note: unfortunately, I see no general way to curry an n-ary function for arbitrary n in Java, the type system is just not flexible enough):
public static <T, U, R> Function<T, Function<U, R>> curry(BiFunction<T, U, R> f) {  
    return (T t) -> (U u) -> f.apply(t, u);  
}  
Note that no actual function invocation takes place, i. e. apply is not invoked until both function arguments have been supplied. After all this effort, although the coding may look intimidating, map is just like our simple copy-list example above, except that the list elements are not just copied, but first transformed through f.

Now we have all the pieces in hand to sum the elements of a matrix, where a matrix (implementation not shown) is represented as a list of lists. The function sumList adds up all the rows, and then the leftmost application of that function adds up the row totals to get the sum of the whole matrix.
 Matrix<Integer> matrix = new Matrix(ints, ints, ints);  
 Function<ConsList<Integer>, Integer> sumList = x -> foldr((a, b) -> a + b, 0, x);  
 System.out.println("Matrix sum = " + sumList.compose((Matrix<Integer> m) -> map(x -> sumList.apply(x), m)).apply(matrix));  

In my opinion, this is not too verbose, but they certainly have not turned Java into Scala or Haskell. In Scala the same expression would read something like val summatrix: List[List[Int]] => Int = sum.compose(map(sum))

In fact, by partially evaluating the above expression with respect to functional composition and application to the matrix argument, we can manually derive the following equivalent expression, which is somewhat easier to understand: sumList.apply(map(x -> sumList.apply(x), matrix))

Mark Mahieu has written on the topic of partial evaluation in Java.

In this post you have seen how Java 8 makes it possible to pass functions around not only to apply them to data but also to manipulate them and combine them into other functions. You have also seen how the map-/reduce operations from the Java 8 Streams API may be understood in these terms. This has been textbook stuff. In another post, I'll discuss the more exciting (I hope) topic of how lambda expressions are especially useful when combined with lazy evaluation.

-- Sebastian

Note: If you're curious how to use higher-order functions without language support for functional expressions, there are Java 7 (no lambda) versions of foldr, foldl, and map on Adrian Walker's blog.

No comments:

Post a Comment