Scala Functional Design & Programming - Chapter 5 (5)

2016年10月10日 1:15

だいたい 18 分で読めます

Table of Contents

Top
Chapter 5
Scala's Non-Strict Functions
Stream
Thoughts

I bought "Scala Functional Design & Programming - A Comprehensive Guide to Functional Programming by Scalaz Contributors".
I'll read it bit by bit and summarize each chapter.

Scala関数型デザイン&プログラミング ―Scalazコントリビューターによる関数型徹底ガイド (impress top gear)

Paul Chiusano (著),Rúnar Bjarnason (著),株式会社クイープ (翻訳)

Amazon

This time, I'll cover Chapter 5. It's about lazy evaluation.

Chapter 5

This chapter deals with non-strictness, which is essentially lazy evaluation. This is a very functional programming-like concept and has a significant impact on execution efficiency. I want to delve deeper into it.

Scala's Non-Strict Functions

This is often represented in the form of ↓ in JS.

def hoge(f: () => Int) = ... f()

hoge( () => 21 )

This form is called a thunk and is commonly used. Even in JS libraries, the name "redux-thunk" is used.

However, writing it every time is cumbersome, so in Scala, the initial empty parentheses are omitted, and the parentheses are not needed when calling it either. Convenient!

def hoge(f: => Int) = ... f

When retrieving a value from a sink, the result is not cached, so it's necessary to put it in a lazy constant.

lazy val x = f

This way, x is evaluated when it's first used, and the result is cached.

Stream

After introducing the smart constructor for caching, it's time for the usual practice problems.

Create a toList for Stream.

  def toList: List[A] = {
    this match {
      case Cons(h, t) => h() :: t().toList
      case _ => Nil
    }
  }

This works, but it's not tail-recursive, so it's slow. The answer also included a solution using reverse and tailrec with mutable.ListBuffer.

Create take, drop, and takeWhile for Stream.

  def take(n: Int): Stream[A] =
    this match {
      case Cons(h, t) if n > 0 => cons(h(), t().take(n - 1))
      case _ => Empty
    }

  def drop(n: Int): Stream[A] =
    this match {
      case Cons(h, t) if n > 0 => t().drop(n - 1)
      case _ => this
    }

  def takeWhile(p: A => Boolean): Stream[A] =
    this match {
      case Cons(h, t) if p(h()) => cons(h(), t().takeWhile(p))
      case _ => Empty
    }

In the answer for take, it's pointed out that when n == 1, an unnecessary t() is called, so it's better to branch for n == 1. It's a small but important detail.

Next, the implementation of exists is explained, which utilizes Stream to stop evaluating as soon as it finds a match. foldRight is also implemented in a way that stops evaluating as soon as it finds a match.

Here's another practice problem.

Implement forAll to check if all elements satisfy a condition.

  def forAll(p: A => Boolean): Boolean =
    foldRight(true)((a, b) => p(a) && b)

I was able to write it without thinking about Stream at all. It's truly generalized.

Implement takeWhile using foldRight.

  def takeWhile2(p: A => Boolean): Stream[A] =
    foldRight(Empty: Stream[A])((a, b) => if (p(a)) cons(a, b) else Empty)

Implement headOption using foldRight.

  def headOption: Option[A] = foldRight(None: Option[A])((a, _) => Some(a))

If the list is empty, it returns None, so this implementation is straightforward. The second argument b is not evaluated, so the loop doesn't occur, and Stream is utilized properly.

Implement map, filter, append, and flatMap using foldRight.

I realized that the type specification for Empty can be done in this way, which is convenient.

  def map[B](f: A => B): Stream[B] =
    foldRight(Empty: Stream[B])((a,b) => cons(f(a), b))

  def filter(f: A => Boolean): Stream[A] =
    foldRight(Empty: Stream[A])((a,b) => if(f(a)) cons(a, b) else b)

When I wrote it enthusiastically, I encountered a compile error for append. It's a problem related to covariance and contravariance, which I didn't fully understand, so I had to rely on another site.

It's very easy to understand. I'm not sure if I can use it fluently, but I was able to avoid the variance check for now.

  def append[B>:A](a: => Stream[B]): Stream[B] =
    foldRight(a)((c,d) => cons(c, d))

  def flatMap[B>:A](s: A => Stream[B]): Stream[B] =
    foldRight(Empty: Stream[B])((a,b) => s(a).append(b))

By using these functions to process Stream, only the necessary parts are evaluated, and unnecessary instances are not generated. Even when processing heavy lists, it's very efficient.

st.map(---).filter(---)
st.filter(---).headOption

The order of loop application itself seems to change, so Stream is also called "first-class loop". It's also memory-friendly because the filtered-out elements and newly generated elements are immediately garbage-collected.

Next, I'll cover the basics of Stream and then move on to infinite streams, which are often cited as examples.

Create a function constant that generates an infinite stream of a specified value.

def constant[A](a: A): Stream[A] = cons(a, constant(a))

It's simple, but it's a bit wasteful to create a new cons every time. A better approach is to create a lazy object and reuse it.

  def constant2[A](a: A): Stream[A] = {
    lazy val v: Stream[A] = Cons(() => a, () => v)
    v
  }

Create a function from that generates an infinite stream of n, n+1, n+2, ...

  def from(n: Int): Stream[Int] = cons(n, from(n+1))

Create a function fibs that generates a Fibonacci sequence stream.

  def fibs(): Stream[Int] = {
    def loop(i1: Int, i2: Int): Stream[Int] = cons(i2, loop(i1 + i2, i1))
    loop(1, 0)
  }

It's a common example.

Create a more general stream generation function unfold and use it to implement fibs, from, constant, and ones.

  def u_fibs(): Stream[Int] =
    unfold((0, 1))(a => Some(a._1, (a._2, a._1 + a._2)))
  def u_fibs2(): Stream[Int] =
    unfold((0, 1))(a => {
      a match {
        case (i1, i2) => Some(i1, (i2, i1 + i2))
      }
    })
  def u_fibs3(): Stream[Int] =
    unfold((0, 1)) { case (i1, i2) => Some(i1, (i2, i1 + i2)) }

  def u_from(n: Int): Stream[Int] =
    unfold(n)(s => Some(s, s + 1))

  def u_constant(n: Int): Stream[Int] =
    unfold(n)(s => Some(s, s))

  def u_ones: Stream[Int] =
    unfold(1)(s => Some(s, s))

There are three implementations of fibs, but they're all the same. When separating variables, I can omit match and write it more concisely.

Implement map, take, takeWhile, zipWith, and zipAll using unfold.

  def u_take(n: Int): Stream[A] =
    unfold((this, n)) {
      case (Cons(h, t), i) if i > 0 => Some(h(), (t(), i-1))
      case _ => None
    }
  def u_takeWhile(f: A => Boolean): Stream[A] =
    unfold(this) {
      case Cons(h, t) if f(h()) => Some(h(), t())
      case _ => None
    }
  def zipWith[B, C](l2: Stream[B])(f: (A, B) => C): Stream[C] = {
    unfold((this, l2)) {
      case (Cons(h1, t1), Cons(h2, t2)) => Some(f(h1(), h2()), (t1(), t2()))
      case _ => None
    }
  }
  def zipAll[B](l2: Stream[B]): Stream[(Option[A], Option[B])] = {
    unfold((this, l2)) {
      case (Empty, Cons(h2, t2)) => Some((None, Some(h2())), (Empty, t2()))
      case (Cons(h1, t1), Empty) => Some((Some(h1()), None), (t1(), Empty))
      case (Cons(h1, t1), Cons(h2, t2)) => Some((Some(h1()), Some(h2())), (t1(), t2()))
      case _ => None
    }
  }

According to the answer, zipAll returns (Option[A], Option[B]), so I followed that.

Implement startsWith.

  def startsWith[B](s: Stream[B]): Boolean =
    zipAll(s).takeWhile(_._2.isDefined).forAll({
        case (Some(a), Some(b)) => a == b
        case _ => false
    })

I paired the two streams and took the prefix part, then checked if all elements matched.

Implement tails using unfold.

It seems like I can implement it by gradually reducing the stream. I need to explicitly append an empty stream.

  def tails: Stream[Stream[A]] =
    unfold(this) {
      case Empty => None
      case s => Some(s, s.drop(1))
    }.append(Stream(empty))

Implement scanRight by generalizing tails.

I didn't understand, so I checked the answer. It seems like it's generating a memoized variable and using foldRight. It's strange that unfold is not used here, even though it was used in tails.

Thoughts

I finally finished it. It looks light, but it's actually heavy. I still need to practice to master it, but it seems like the range of applications will expand greatly depending on whether I know it or not. It's essential for efficient processing, and it's said to improve modularity by separating description and evaluation.

There are many examples in the next chapter, and it seems like there's still a lot to learn.