SyntaxHighlighter

14 Jun 2013

Higher-Order Functions in Java 8 (Part 2) - Laziness

In this post I'd like to show that the typical elegance that results from the combination of lazy evaluation and higher-order functions in functional programming languages can sometimes also be had in Java. As our example, let's look at generating the first n primes, for some arbitrary n. We'll use the method of trial division, which determines the primality of a number k by testing whether it is divisible by any prime smaller than k.We will see how to use Java lambda expressions to generate an infinite sequence of integers and filter out the non-primes from it.

Of course, we will not really generate an infinite sequence, but rather an unbounded one, and limit ourselves to taking out n elements. This is where lazy evaluation comes in. Being lazy means computing a value only when it is actually needed. (The opposite - immediate evaluation - is called being strict. The article Functional Programming For the Rest of Us, which offers an elementary and informal introduction to functional programming concepts from a Java perspective, contains, among other things, a discussion of laziness and its advantages as well as disadvantages.)

Some functional languages like Haskell are inherently lazy, others like Scala have both kinds of evaluation. Java 8 also falls into this latter category: Traditionally, Java is a strict language: all method parameters are evaluated before a method is called etc. However, the new Streams API is inherently lazy: An object is only created from the stream when there is demand for it, i. e. only when some terminal operation like collect or forEach is called. And lambda expressions allow passing functions as arguments for delayed evaluation.

Let's return to our example. In Scala, we might write the following code, lifted from Louis Botterill's blog:

def primes : Stream[Int] = {  
   var is = Stream from 2  
   def sieve(numbers: Stream[Int]): Stream[Int] = {  
    Stream.cons(  
     numbers.head,  
     sieve(for (x <- numbers.tail if x % numbers.head > 0) yield x))  
   }  
   sieve(is)  
  }   

By the way, this algorithm is not the Sieve of Eratosthenes, although it is often so presented. This paper by Melissa O'Neill has all the details. It's an interesting read: O'Neill describes an optimized implementation of the genuine sieve, making use of heap-like data structures in Haskell. I'd be interested to know if someone has ported this to Java 8. We'll stick with the unfaithful sieve (O'Neill 's term) for now.

What's important in the above code is the ability to access the head of the stream and still lazily operate on the rest.Unfortunately, in Java findFirst is also one of those terminal operations on streams, so there is no obvious way to port this bit to Java without creating our own lazy list structure. We'll do that by building on the simple ConsList implementation alluded to in the previous post, and extending it in several ways:
  • add a factory method that accepts a Function<T,T> to generate the elements in the tail  in sequence
  • add a factory method that accepts a Supplier<FilterableConsList<T>> to create a tail
  • add an instance method that accepts a Predicate<T> to filter out non-matching elements from a list
  • make the implementation class lazy for the tail (but strict for the head).
As a further goodie, we can easily implement the collection interface by providing an iterator and make the thing streamable by providing a spliterator of unknown size that is based on the iterator. The coding is lengthy but straightforward, it is shown at the bottom of this post. With it, the driver for generating the first 100 primes becomes:
 LazyConsList<Integer> s = sieve(LazyConsList.newList(2, x -> x + 1));   
 s.stream().limit(100).forEach(x->System.out.print(x + " "));   
where the interesting sieve-method is this:
 private LazyConsList<Integer> sieve(FilterableConsList<Integer> s) {  
  Integer p = s.head();  
  return cons(p, () -> sieve(s.tail().filter(x -> x % p != 0)));  
 }
Now I think that looks pretty neat (and very similar to the corresponding Scala code). That's all I wanted to show. However, as Java 8 is still very new, it may not be quite obvious to many readers what's going on here. Let me try to explain.

The text declares very clearly what it takes to output the first 100 primes: Create a list of integers starting at two, sieve it, take the first 100 elements and output them severally, where sieving consists of taking the first element of the list, then taking the rest disregarding any multiples of the first element, and sieving that again.

Procedurally, the picture is very different: some method calls are deferred until the tail of a cons-list is accessed. In particular, there is no infinite loop, because every call to sieve in the expression () -> sieve(...)is delayed until forEach requests another prime for output. Here's the sequence:
  1. forEach causes the stream to get the next element from its underlying iterator
  2. whereupon the iterator will return the first list element and change its own state by retrieving the tail of its list, which will be a list based on a Supplier, namely the one which came out of the previous call to sieve
  3. the iterator's call to tail will in turn lead to strict evaluation of the method arguments to sieve, accessing the tail of the generator-based list and creating a new list from it by adding a new filter
  4. so that when sieve is executed, a new cons-cell with a supplied tail is constructed, the head of which is the first element of the filtered list from step 4 and therefore a prime
  5. which will be the next element returned from the stream when processing continues at step 1.
Contrary to appearances (and to the suggestive wording of the declarative meaning of the program)  no list containing 100 primes is ever constructed and then output. Instead, you might say that the processing takes place "vertically", constructing a series of one-element lists and outputting their heads. You can see that clearly in a debugger, or by including some logging when entering and leaving the sieve.

-- Sebastian

Here's the complete code for our lazy list. I'm sure there's much to improve. One thing is that the toString() method that is inherited from the superclass is based on the list's iterator and will not terminate for an infinite list. I'd be grateful for suggestions or error reports (perhaps someone might like to write some unit tests?). First the interface:
 
import java.util.function.Predicate;

/**  
  * Interface for a list that may be filtered.  
  * @param <T> element type  
  * @author Sebastian Millies  
  */  
 public interface FilterableConsList<T> extends ConsList<T> {  
   
    /**  
    * Applies the given filter to this list.  
    *  
    * @param f a filter predicate  
    * @return a list containing the elements of this list for which the filter   
    * predicate returns <code>true</code>)  
    */  
   FilterableConsList<T> filter(Predicate<T> f);  
     
   /**  
    * Returns the tail of this list.  
    *  
    * @return tail  
    * @throws EmptyListException if the list is empty  
    */  
   @Override  
   FilterableConsList<T> tail();  
 }
And then the implementation:
 import java.util.AbstractCollection;  
 import java.util.Iterator;  
 import java.util.Objects;  
 import java.util.Spliterator;  
 import java.util.Spliterators;  
 import java.util.function.Predicate;  
 import java.util.function.Supplier;  
 import java.util.function.UnaryOperator;  
   
 /**  
  * An immutable, filterable list that can be based on a generator function and  
  * is strict for the head and lazy for the tail. The tail can also be computed  
  * on demand by a Supplier instance. Via the Collection interface, this class   
  * is connected to the streams API.  
  *  
  * @param <T> element type  
  * @author Sebastian Millies  
  */  
 public abstract class LazyConsList<T> extends AbstractCollection<T> implements FilterableConsList<T> {  
   
   // ------------- Factory methods  
   
   /**  
    * Create a non-empty list based on a generator function. The generator is  
    * evaluated when the first element in the tail of the list is accessed.  
    * @param <T> element type  
    * @param head the first element  
    * @param next the generator function.  
    * @return a new list with head as its head and the tail generated by next  
    */  
   public static <T> LazyConsList<T> newList(final T head, final UnaryOperator<T> next) {  
     Objects.requireNonNull(next);  
     return new FunctionTail<>(head, next, x->true);  
   }  
   
   /**  
    * Create a list containing the given elements.  
    *  
    * @param <T> element type  
    * @param elems elements  
    * @return a new list  
    */  
   public static <T> LazyConsList<T> newList(T... elems) {  
     LazyConsList<T> list = nil();  
     for (int i = elems.length - 1; i >= 0; i--) {  
       list = cons(elems[i], list);  
     }  
     return list;  
   }  
     
   /**  
    * Add an element at the front of a list.  
    * @param <T> element type  
    * @param elem new element  
    * @param list list to extend  
    * @return a new list with elem as its head and list as its tail  
    */  
   public static <T> LazyConsList<T> cons(T elem, FilterableConsList<T> list) {  
     Objects.requireNonNull(list);  
     return new Cons<>(elem, list, x->true);  
   }  
   
   /**  
    * Add an element at the front of a list that will be supplied by the given supplier.  
    * @param <T> element type  
    * @param elem new element  
    * @param listSupplier list to extend  
    * @return a new list with elem as its head and list as its tail  
    */  
   public static <T> LazyConsList<T> cons(T elem, Supplier<FilterableConsList<T>> listSupplier) {  
     Objects.requireNonNull(listSupplier);  
     return new SuppliedTail<>(elem, listSupplier, x->true);  
   }  
     
   /**  
    * Create an empty list.  
    * @param <T> element type  
    * @return an empty list  
    */  
   public static <T> LazyConsList<T> nil() {  
     return new Nil<>();  
   }  
   
   /**  
    * Utility method that recurses through the specified list and its  
    * tails until it finds the first one with a head matching the filter.  
    * @param <T> element type  
    * @param list list that is filtered  
    * @param filter element filter  
    * @return the specified list or one of its tails  
    */  
   protected static <T> FilterableConsList<T> applyFilter(FilterableConsList<T> list, Predicate<T> filter) {  
     FilterableConsList<T> filtered = list;  
     while (!filter.test(filtered.head())) {  
       filtered = filtered.tail();  
     }  
     return filtered;  
   }  
   
   
   // ------------- Constructor  
   
   protected LazyConsList() {  
   }  
     
   // ------------- Collection interface  
     
   @Override  
   public final Iterator<T> iterator() {  
     return new Iterator<T>() {  
       private ConsList<T> current = LazyConsList.this;  
         
       @Override  
       public boolean hasNext() {  
         return !current.isNil();  
       }  
   
       @Override  
       public T next() {  
         T head = current.head();  
         current = current.tail();  
         return head;  
       }  
     };  
   }  
   
   @Override  
   public final int size() {  
     throw new UnsupportedOperationException("the size of a lazy list cannot be determined");  
   }  
     
   @Override  
   public Spliterator<T> spliterator() {  
     return Spliterators.spliteratorUnknownSize(iterator(), 0);  
   }  
     
   // ------------- concrete subclasses  
     
   /**  
    * A non-empty list.  
    * @param <T> element type  
    */  
   private static class Cons<T> extends LazyConsList<T> {  
   
     private final T head;  
     private final FilterableConsList<T> tail;  
     private final Predicate<T> filter;  
       
     public Cons(T head, FilterableConsList<T> tail, Predicate<T> filter) {  
       assert filter.test(head);  
       this.head = head;  
       this.tail = tail;  
       this.filter = filter;  
     }  
   
     @Override  
     public T head() {  
       return head;  
     }  
   
     @Override  
     public FilterableConsList<T> tail() {  
       return tail;  
     }  
   
      @Override  
     public boolean isNil() {  
       return false;  
     }  
       
     @Override  
     public FilterableConsList<T> filter(Predicate<T> f) {  
       Predicate<T> newFilter = filter.and(f);  
       FilterableConsList<T> filtered = applyFilter(this, newFilter);  
       return new Cons(filtered.head(), filtered.tail(), newFilter);  
     }  
   }  
   
   /**  
    * A non-empty list based on a generator function. The tail is lazily computed  
    * on demand and cached.  
    * @param <T> element type  
    */  
   private static class FunctionTail<T> extends LazyConsList<T> {  
   
     private final T head;  
     private final UnaryOperator<T> next;  
     private final Predicate<T> filter;  
     private FilterableConsList<T> tailCache;  
   
     public FunctionTail(T head, UnaryOperator<T> next, Predicate<T> filter) {  
       assert filter.test(head);  
       this.head = head;  
       this.next = next;  
       this.filter = filter;  
     }  
   
     @Override  
     public T head() {  
       return head;  
     }  
   
     @Override  
     public FilterableConsList<T> tail() {  
       // construct a new lazy list with the first element that passes the filter.  
       // use the generator function to construct the elements.  
       if (tailCache == null) {  
         T nextHead = head;  
         do {  
          nextHead = next.apply(nextHead);  
         } while (!filter.test(nextHead));  
         tailCache = new FunctionTail(nextHead, next, filter);  
       }  
       return tailCache;  
     }  
   
     @Override  
     public boolean isNil() {  
       return false;  
     }  
        
     @Override  
     public FilterableConsList<T> filter(Predicate<T> f) {  
       Predicate<T> newFilter = filter.and(f);  
       FilterableConsList<T> filtered = applyFilter(this, newFilter);  
       return new FunctionTail(filtered.head(), next, newFilter);  
     }  
   }  
   
   /**  
    * A non-empty list based on a supplied computation. The tail is lazily computed  
    * on demand and cached.  
    * @param <T> element type  
   */  
   private static class SuppliedTail<T> extends LazyConsList<T> {  
   
     private final T head;  
     private final Supplier<FilterableConsList<T>> supplier;  
     private final Predicate<T> filter;  
     private FilterableConsList<T> tailCache;  
   
     public SuppliedTail(T head, Supplier<FilterableConsList<T>> supplier, Predicate<T> filter) {  
       assert filter.test(head);  
       this.head = head;  
       this.supplier = supplier;  
       this.filter = filter;  
     }  
   
     @Override  
     public T head() {  
       return head;  
     }  
   
     @Override  
     public FilterableConsList<T> tail() {  
       // construct a new lazy list with the first element that passes the filter.  
       // delegate to the supplied function to create the tail of the current list.
       if (tailCache == null) {  
         tailCache = applyFilter(supplier.get(), filter);  
       }  
       return tailCache;  
     }  
   
     @Override  
     public boolean isNil() {  
       return false;  
     }  
   
     @Override  
     public FilterableConsList<T> filter(Predicate<T> f) {  
       Predicate<T> newFilter = filter.and(f);  
       FilterableConsList<T> filtered = applyFilter(this, newFilter);  
       return new Cons(filtered.head(), filtered.tail(), newFilter);  
     }  
   }  
     
   /**  
    * An empty list. Trying to access components of this class will result in   
    * an EmptyListException at runtime.  
    * @param <T> element type  
    */  
   private static class Nil<T> extends LazyConsList<T> {  
   
     @Override  
     public T head() {  
       throw new EmptyListException();  
     }  
   
     @Override  
     public FilterableConsList<T> tail() {  
       throw new EmptyListException();  
     }  
   
     @Override  
     public boolean isNil() {  
       return true;  
     }  
   
     @Override  
     public FilterableConsList<T> filter(Predicate<T> f) {  
       return this;  
     }  
      
   }  
 }  

2 comments:

  1. Here's a post with a step-by-step exposition of a genuine sieve implementation, based on the priority-queue algorithm from O'Neill's paper:

    http://en.it-usenet.org/thread/1150/101027/#post101142

    It doesn't need a lazy list, just regular Java 8 streams.

    -- Sebastian

    ReplyDelete
    Replies
    1. As the above link seems to have expired, I just post the code here, with some comments, but withoout the step-by-step exposition. (Sorry about the lack of formatting, but I can't figure out how that is supported in blog comments. But you should be alright when you copy the following into your favorite Java IDE.)

      import static java.util.stream.Collectors.toCollection;

      import java.util.Collection;
      import java.util.Iterator;
      import java.util.LinkedList;
      import java.util.List;
      import java.util.PriorityQueue;
      import java.util.Queue;
      import java.util.function.Predicate;
      import java.util.function.UnaryOperator;
      import java.util.stream.Stream;

      /**
      * A simple lambda implementation of the sieve of Eratosthenes, with optimizations
      *
      * June 18, 2013
      *
      * @author Brenden Towey, Sebastian Millies
      * @see Melissa O'Neill's excellent paper
      * @see This argument on the Haskell Cafè mailing list
      */
      public class Sieve {

      public static void main(String[] args) {
      System.out.println( oddsOnly(200) );
      }


      /** An infinite sequence of composite numbers represented as multiples of some prime factor. */
      private static class CompositeIterator implements Comparable, Iterator {

      public CompositeIterator(int prime, int factor, UnaryOperator next) {
      this.number = prime * factor;
      this.prime = prime;
      this.factor = factor;
      this.next = next;
      }

      private int number; // the composite number represented by this instance
      private final UnaryOperator next; // the factor increment
      private final int prime; // the prime of which this composite is a multiple
      private int factor; // the remainder

      @Override
      public int compareTo(CompositeIterator other) {
      return Integer.compare(number, other.number);
      }

      @Override
      public boolean hasNext() {
      return true;
      }

      @Override
      public CompositeIterator next() {
      this.factor = next.apply(factor);
      this.number = prime * factor;
      return this;
      }
      }

      private static Collection oddsOnly(int n) {

      UnaryOperator nextInt = x -> x + 2;

      Predicate sieve = new Predicate() {
      Queue composites = new PriorityQueue<>(n);

      @Override
      public boolean test(Integer candidate) {
      boolean prime = composites.isEmpty() || composites.peek().number != candidate;
      if (prime) {
      // cross off the square of this prime
      composites.offer(new CompositeIterator(candidate, candidate, nextInt));
      } else {
      // cross off the next multiple of each prime that is a factor of this composite
      while (!composites.isEmpty() && composites.peek().number == candidate) {
      CompositeIterator cp = composites.poll();
      composites.offer(cp.next());
      }
      }
      return prime;
      }
      };

      List primes = new LinkedList<>();
      primes.add(2);

      return Stream.iterate(3, nextInt)
      .filter(sieve)
      .limit(n - 1)
      .collect(toCollection(() -> primes));
      }
      }

      Delete