Archive

Tag: scala

A couple of people pointed out that my posting about why Monads are so awesome left a bit too much to the imagination. I tossed in an update that should make it less mysterious what the heck I’m talking about.

Update: In the comments below, Alex is trying to educate me on something… I’m still not entirely sure what he’s on about but it’s been keeping me thinking about this post. I’ve modified it a bit here and there to try and be more correct. Essentially, I think I’ve been attributing too much to Monads: only part of what I like is down to them, and the other part is down to partial functions (or function composition).

Ok, I’m no Monad expert by any stretch of the imagination… seriously. But, I did something today that has me so jazzed about Monads, partial functions and Scala in general, that I just gotta share.

So my task was to take some JSON and convert it to a structure of my own classes, but still really general – i.e. we’re not talking about “real” marshalling here.

So I took a simple approach, and used the Scala Library’s JSON Module. I had to deal with type-casting, defaulting of values, and constraint checking (i.e. field X must be defined). The code to do all of this is so simple and so awesome that it’s just hard to believe. Now, it’s not perfect… I’ll shore it up over the next while as need requires… but it’s passing all of my tests pretty nicely right now.

I want to convert the following JSON:

{
  "vendor"  : "Whoever", // required
  "colour"  : "blue",    // default to "None"
  "widgets" :            // default to Empty list of strings
  [
    {
      "name"    : "One",   // required
      "flavour" : "Lime"   // default to Strawberry
    },
    {
      "name"    : "Two",   // required
      "flavour" : "Banana" // default to Strawberry
    }
  ]
}

to the following concrete Scala:

case class Widget(name: String, flavour: String)
case class Contract(vendor: String, colour: String, widgets: List[Widget])
Contract("Whoever", "blue", List(Widget("One", "Lime"),
                                 Widget("Two", "Banana")))

I used a Pattern Matching example for this sort of thing from an awesome StackOverflow post as a starter and then fleshed it out to meet my requirements. Here it is:

// JSON.parseFull comes from the Scala Library's JSON module
import scala.util.parsing.json.JSON

// Types we need
case class Widget(name: String, flavour: String)
case class Contract(vendor: String, colour: String, widgets: List[Widget])

object Contract {
  // The class caster that came from the StackOverflow page
  // Yeah, this can throw exceptions... I don't care about
  // those for this example
  class ClassCaster[T] {
    def unapply(a: Any): Option[T] = Some(a.asInstanceOf[T])
  }

  // Concrete instances of the casters for pattern matching
  object AsMap extends ClassCaster[Map[String, Any]]
  object AsList extends ClassCaster[List[Any]]
  object AsString extends ClassCaster[String]

  // It can happen
  case class ContractException(message: String) extends Exception(message)

  // Returns an Either that is a hell of a lot more deterministic than
  // throwing exceptions around
  def fromJSON(json: String): Either[ContractException, Contract] = {
    // Helps us get rid of boilerplate
    def error[T](prefix: String): Option[T] =
      throw ContractException("%s must be specified".format(prefix))

    // Catch the exceptions and turn them into Left() instances
    try {
      val parsed = JSON.parseFull(json)
      if (parsed.isEmpty) throw ContractException("Unable to parse JSON.")

      // Here comes the monad / partial function love...
      val contract = for {
        Some(AsMap(map))  <- List(parsed)
        AsString(vendor)  <- map get "vendor" orElse error("vendor")
        AsString(colour)  <- map get "colour" orElse Some("None")
        AsList(widgets)   <- map get "widgets"
      } yield Contract(vendor, colour, for {
          AsMap(widget)     <- widgets
          AsString(name)    <- widget get "name" orElse error("name")
          AsString(flavour) <- widget get "flavour" orElse Some("Strawberry")
        } yield Widget(name, flavour)
      )
      Right(contract.head)
    } catch {
      case e: ContractException =>
        Left(e)
      case e: Exception =>
        Left(ContractException(e.getMessage))
    }
  }
}

It’s very impressive what this code actually handles when you start digging into it. I’m sure you can easily think about what it would be like to code this in Java (or spaghetti monster forbid C++) and know that it’s going to be a lot more verbose and probably a lot less reliable.

UPDATE: One of the key reasons why the above is so cool is this:

<- map get "colour" orElse Some("None")
<- widget get "name" orElse error("name")

By using a generator syntax (<-) and the get function on Map we get to work at a higher abstraction that allows us to chain the decision making process rather than what the imperative approach would demand of us. It’s the monadic and compositional nature of Option that allows us to do this. Consider:

String colour = map.get("colour");
if (colour == null)
  colour = "None";

String name = widget.get("name");
if (name == null)
  error("name");

I’m going to editorialise (I mean, this is my blog after all) and point out the following problems:

  • It’s unclear. Sure, if you’ve seen it a million times (and let’s face it, you have… we all have) you know exactly what this “means” but that’s not because its meaning is clear.
  • It’s not syntactically atomic. Some nasty merge, or some careless programmer can easily insert something between lines 1 and 2 that uses colour.
  • It’s repetitive in a bad way (is there a good way?). That pattern is going to be copy-pasted over and over again. How many times do you see “colour” in there? Yeah… three. So if I copy-paste that and forget to change the third one, you get this:
String flavour = widget.get("flavour");
if (flavour == null)
  colour = "None";

GHACK!

We need to program at a higher level.
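
If you want to poke at the defaulting and required-field idea without dragging in a JSON parser, here’s a minimal sketch over a plain Map[String, Any]. The FieldLookup object and its field and widget helpers are made up for illustration (they aren’t part of the code above), and the sketch reports a missing field as a Left instead of throwing:

```scala
object FieldLookup {
  // Hypothetical helper, not from the original code: look up `key`,
  // fall back to `default` if one is given, and insist on a String.
  def field(m: Map[String, Any],
            key: String,
            default: Option[String] = None): Either[String, String] =
    m.get(key).orElse(default) match {
      case Some(s: String) => Right(s)
      case Some(_)         => Left("%s must be a string".format(key))
      case None            => Left("%s must be specified".format(key))
    }

  // Either is right-biased from Scala 2.12 on, so it chains in a for
  // comprehension the same way Option does.
  def widget(m: Map[String, Any]): Either[String, (String, String)] =
    for {
      name    <- field(m, "name")                        // required
      flavour <- field(m, "flavour", Some("Strawberry")) // defaulted
    } yield (name, flavour)
}
```

Each field is named exactly once, which is the whole point: there’s no second or third “flavour” to forget to change.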

I took a look at Scala Streams tonight (or was that last night? When am I posting this?) and thought I’d share what I learned from Literate Programs and the Scala source code.

Streams

For the uninitiated, a Stream in Scala helps realize one of the fundamental concepts of Functional Programming, that of laziness. In essence, a Stream can be infinite – think of a collection that just goes on and on and on and on and on and on. It would be asinine to construct the entire collection before ever using it as that would only ensure that you never used it. A Stream is evaluated on an as-needed basis and only up to the point that you need it.
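
A quick way to see that for yourself is to take a few elements from a Stream that never ends. This little sketch uses Stream.from from the standard library; the LazyEvens name is just something I picked for the example:

```scala
object LazyEvens {
  // Stream.from(1) is 1, 2, 3, ... without end; map is applied lazily too,
  // so defining this does not loop forever.
  val evens: Stream[Int] = Stream.from(1).map(_ * 2)

  // Only as many elements as we ask for get computed.
  val firstThree: List[Int] = evens.take(3).toList
}
```

(From Scala 2.13 on, Stream is deprecated in favour of LazyList, which serves much the same purpose.)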

The Delicious, Delicious Code…

Let’s illustrate using good ol’ Fibonacci Numbers. We’ll construct a Stream recursively because that’s more fun and it gives more meat to discuss. (This example comes straight from the Literate Programs website but I’m hoping to explain it in a bit more depth, and not get it too wrong in the process)

import scala.math.BigInt
lazy val fibs: Stream[BigInt] = BigInt(0) #::
                                BigInt(1) #::
                                fibs.zip(fibs.tail).map { n => n._1 + n._2 }

(Note: The above must all be on one line or the compiler is going to have a tough time pimping it out)

Sweet and delicious, no?

Dissection

Let’s break it down:

lazy

Strictly speaking, this isn’t needed, but why not be as lazy as you can possibly be? For the truly uninitiated, this ensures that the fibs value is not evaluated until it’s actually used.

val fibs: Stream[BigInt]

“What the hell is this? I thought Scala inferred types!”, I hear you scream. Scala does infer types but we’re defining a recursive value here and Scala needs to understand that the recursive call is a recursive call on a Stream as opposed to a String or some other completely unrelated type.

BigInt(0) #:: BigInt(1) #::

The Fibonacci series starts with 0 and 1 so we shove those in right here. But what’s this #:: stuff? Here’s our first real magical point and even the somewhat initiated may be scratching their heads at this one.

If you look at the Stream api you won’t find the #:: member function anywhere, and indeed you shouldn’t as we’re not working with a Stream right now, but a BigInt.

So with that realization in place, we must also recognize that methods ending in : are right associative. What this means is that the object we’re calling the #:: method on is actually the Stream object that is returned from the map command at the end of the call (we’ll get back to that, just hang on). Remember how we had to declare that fibs was a Stream instead of letting Scala infer it? Well, that’s (one of the reasons) why we need to do that.
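
If the right-associative business is new to you, the same rule is easier to see with plain old List and its :: method. A small sketch (RightAssoc is just a name for the example):

```scala
object RightAssoc {
  // Infix form: reads left to right...
  val sugar: List[Int] = 1 :: 2 :: Nil

  // ...but is really a chain of calls that starts from the right-most operand.
  val explicit: List[Int] = Nil.::(2).::(1)
}
```

Both values are List(1, 2); the method is always invoked on whatever sits to the right of the operator.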

Now we know that the method is being called on the Stream but when we look at the api for Stream, again we don’t see that method call. So where is it?

This method gets pimped on through the implicit def consWrapper[A](stream: => Stream[A]): ConsWrapper[A] method defined on the Stream companion object. Scala will resolve the lack of a #:: method on Stream by finding an implicit conversion to a type that does have that method, and this is the ConsWrapper. The result of that method is to return a new Stream.

Ok, but how’s that of any use? The conversion to ConsWrapper is going to kill us due to the fact that it’s going to call the recursive function… or is it? Well, of course it’s not. If you look at the parameter to the ConsWrapper you’ll see why this doesn’t cause immediate recursion: new ConsWrapper(tl: => Stream[A]). That’s a by-name parameter. This means that it’s not invoked until it’s used.
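
By-name parameters are easy to try out in isolation. Here’s a minimal sketch; Deferred, expensive and the counter are all invented for the example and have nothing to do with the real ConsWrapper beyond using the same => syntax:

```scala
object ByNameDemo {
  var evaluations = 0
  def expensive(): Int = { evaluations += 1; 42 }

  // `value` is by-name: the argument expression is stored, not run.
  class Deferred(value: => Int) {
    def force: Int = value
  }

  val deferred = new Deferred(expensive())
  val before   = evaluations     // still 0: nothing has run yet
  val result   = deferred.force  // now the expression is evaluated
  val after    = evaluations     // 1
}
```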

Wow, that’s a lot of stuff for one bullet point.

fibs.zip(fibs.tail)

Go look at the Stream api again, specifically the zip. I’ll wait.

You didn’t read it, did you? Fine… on your own head be it.

The Stream returned by tail is one element shorter than the Stream itself, and zip stops at the shorter of its two inputs. This means that the zipped result is going to be the size of the Stream returned by tail, not the size of the Stream itself. So, in the initial case (where the Stream is Stream(0, 1, [Stream])), the zipped result is actually Stream((0, 1), [Stream]) because the first element of the tail, 1, is paired with the head, 0.

Stream.zip creates a Cons cell containing (this.head, that.head) as the Cons head and (this.tail zip that.tail) as the Cons tail. That tail is a Stream of (A1, B) passed by name, so we’re not evaluating it now – we’ll evaluate it when it gets called.

map { n => n._1 + n._2 }

Hmmm… I hesitate to say “duh?” but it really does seem appropriate to do so. I mean, if you’ve made it this far, you know what this does.
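
For the skeptics, here are the zip and map steps run on a small, finite Stream so you can see the pairs. This is only a sketch: ZipAndAdd and start are names I made up, and I’ve used a pattern-matching lambda in place of n._1 + n._2 purely for readability:

```scala
object ZipAndAdd {
  val start: Stream[Int] = Stream(0, 1, 1, 2)

  // tail drops the first element, so zip lines each element up with its
  // successor and stops when the shorter Stream (the tail) runs out.
  val pairs: List[(Int, Int)] = start.zip(start.tail).toList

  // Adding each pair gives the elements that follow the first two.
  val sums: List[Int] = start.zip(start.tail).map { case (a, b) => a + b }.toList
}
```

The pairs come out as (0, 1), (1, 1), (1, 2) and the sums as 1, 2, 3, which are exactly the next three Fibonacci numbers.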

Use it

Ok, so what happens when we evaluate it? Let’s get the first 20 numbers:

scala> fibs take 20 foreach println
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
1597
2584
4181
scala>

That looks right to me.

WTF?

Some of you may still be scratching your heads because I wasn’t terribly clear on how this actually works. How is it that each number is evaluated as needed? Well, first let’s prove that it is so. I’m going to modify the code slightly.

import scala.math.BigInt
lazy val fibs: Stream[BigInt] =
    BigInt(0) #::
    BigInt(1) #::
    fibs.zip(fibs.tail).map(n => {
      println("Evaluating: %s -> %s".format(n._1, n._2))
      n._1 + n._2
    })

Now let’s take the first 5:

scala> fibs take 5 foreach println
0
1
Evaluating: 0 -> 1
1
Evaluating: 1 -> 1
2
Evaluating: 1 -> 2
3
scala>

And let’s take them again…

scala> fibs take 5 foreach println
0
1
1
2
3
scala>

And let’s take the first 7…

scala> fibs take 7 foreach println
0
1
1
2
3
Evaluating: 2 -> 3
5
Evaluating: 3 -> 5
8
scala>

Ok, so it works; things are properly evaluated in a lazy manner. But how?

Why…

The culprit behind this magic appears to be the interplay between the Cons class, the StreamIterator class and the LazyCell class:

final class Cons[+A](hd: A, tl: => Stream[A]) extends Stream[A]
// Note that the head is a value of type A and the tail is a by-name Stream
final class StreamIterator[+A](self: Stream[A]) extends Iterator[A] {
  class LazyCell(st: => Stream[A]) {
    // Note the laziness
    lazy val v = st
  }
  private var these = new LazyCell(self)
  def next: A =
    if (isEmpty) Iterator.empty.next
    else {
      // Evaluate the laziness
      val cur = these.v
      // And the concrete value of type A
      val result = cur.head
      // Assign the next lazy cell to be the Stream in the tail
      these = new LazyCell(cur.tail)
      result
    }
}

Couple that together with the recursively defined zip we saw earlier, and you’ve got your lazy Stream of Fibonacci numbers.
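
The by-name tail plus lazy val combination is small enough to rebuild by hand. Here’s a stripped-down sketch of the idea; Cell, naturals and cellsBuilt are names I invented, and this is emphatically not the library’s implementation:

```scala
object MiniStream {
  // Strict head, by-name tail, and a lazy val so the tail is built at most once.
  final class Cell[A](val head: A, tl: => Cell[A]) {
    lazy val tail: Cell[A] = tl

    def take(n: Int): List[A] =
      if (n <= 0) Nil
      else if (n == 1) List(head)
      else head :: tail.take(n - 1)
  }

  // Counts how many cells have actually been constructed.
  var cellsBuilt = 0
  def naturals(n: Int): Cell[Int] = {
    cellsBuilt += 1
    new Cell(n, naturals(n + 1)) // the recursive call is deferred
  }

  val ns         = naturals(0)
  val firstPass  = ns.take(3)
  val builtOnce  = cellsBuilt
  val secondPass = ns.take(3)   // walks the cells that already exist
  val builtTwice = cellsBuilt
}
```

The second take builds no new cells, which is the same thing the missing “Evaluating” lines showed when we took the first 5 Fibonacci numbers a second time.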

I love this stuff…
