
I took a look at Scala Streams tonight (or was that last night? When am I posting this?) and thought I’d share what I learned from Literate Programs and the Scala source code.

Streams

For the uninitiated, a Stream in Scala helps realize one of the fundamental concepts of Functional Programming, that of laziness. In essence, a Stream can be infinite – think of a collection that just goes on and on and on and on and on and on. It would be asinine to construct the entire collection before ever using it, as that would only ensure that you never used it. A Stream is evaluated on an as-needed basis and only up to the point that you need it.

The Delicious, Delicious Code…

Let’s illustrate using good ol’ Fibonacci Numbers. We’ll construct a Stream recursively because that’s more fun and it gives more meat to discuss. (This example comes straight from the Literate Programs website but I’m hoping to explain it in a bit more depth, and not get it too wrong in the process)

import scala.math.BigInt
lazy val fibs: Stream[BigInt] = BigInt(0) #::
                                BigInt(1) #::
                                fibs.zip(fibs.tail).map { n => n._1 + n._2 }

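(Note: If you split this across lines, keep each #:: at the end of its line, as above. A line that starts with #:: leaves the previous line looking like a complete statement, and the compiler is going to have a tough time pimping it out.)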

Sweet and delicious, no?

Dissection

Let’s break it down:

lazy

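Strictly speaking, this isn’t needed when fibs is a member of a class or object (or typed into the REPL), but why not be as lazy as you can possibly be? A local val inside a method does need it, or the compiler complains about a forward reference. For the truly uninitiated, lazy ensures that the fibs value is not evaluated until it’s actually used.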

val fibs: Stream[BigInt]

“What the hell is this? I thought Scala inferred types!”, I hear you scream. Scala does infer types but we’re defining a recursive value here and Scala needs to understand that the recursive call is a recursive call on a Stream as opposed to a String or some other completely unrelated type.

BigInt(0) #:: BigInt(1) #::

The Fibonacci series starts with 0 and 1 so we shove those in right here. But what’s this #:: stuff? Here’s our first real magical point and even the somewhat initiated may be scratching their heads at this one.

If you look at the Stream api you won’t find the #:: member function anywhere, and indeed you shouldn’t as we’re not working with a Stream right now, but a BigInt.

So with that realization in place, we must also recognize that methods ending in : are right associative. What this means is that the object we’re calling the #:: method on is actually the Stream object that is returned from the map command at the end of the call (we’ll get back to that, just hang on). Remember how we had to declare that fibs was a Stream instead of letting Scala infer it? Well, that’s (one of the reasons) why we need to do that.

Now we know that the method is being called on the Stream but when we look at the api for Stream, again we don’t see that method call. So where is it?

This method gets pimped on through the implicit def consWrapper[A](stream: => Stream[A]): ConsWrapper[A] method defined on the Stream companion object. Scala will resolve the lack of a #:: method on Stream by finding an implicit conversion to a type that does have that method, and this is the ConsWrapper. The result of that method is to return a new Stream.

Ok, but how’s that of any use? The conversion to ConsWrapper is going to kill us due to the fact that it’s going to call the recursive function… or is it? Well, of course it’s not. If you look at the parameter to the ConsWrapper you’ll see why this doesn’t cause immediate recursion: new ConsWrapper(tl: => Stream[A]). That’s a by-name parameter. This means that the expression passed in is not evaluated until it’s used.

Wow, that’s a lot of stuff for one bullet point.

fibs.zip(fibs.tail)

Go look at the Stream api again, specifically the zip. I’ll wait.

You didn’t read it, did you? Fine… on your own head, be it.

The size of the Stream returned by tail is one less than the size of the Stream itself. This means that the zipped result is going to be the size of the Stream returned by tail, not the size of the Stream itself. So, in the initial case (where the Stream is Stream(0, 1, [Stream])), the zipped result is actually Stream((0, 1), ([Stream])) because the head of the Stream, 0, is paired with the head of its tail, 1.

Stream.zip creates a Cons cell containing (this.head, that.head) as the Cons head and (this.tail zip that.tail) as the Cons tail but as a Stream of (A1, B) so we’re not evaluating it now – we’ll evaluate it when it gets called.

map { n => n._1 + n._2 }

Hmmm… I hesitate to say “duh?” but it really does seem appropriate to do so. I mean, if you’ve made it this far, you know what this does.

Use it

Ok, so what happens when we evaluate it? Let’s get the first 20 numbers:

scala> fibs take 20 foreach println
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
1597
2584
4181
scala>

That looks right to me.

WTF?

Some of you may still be scratching your heads because I wasn’t terribly clear on how this actually works. How is it that each number is evaluated as needed? Well, first let’s prove that it is so. I’m going to modify the code slightly.

import scala.math.BigInt
lazy val fibs: Stream[BigInt] =
    BigInt(0) #::
    BigInt(1) #::
    fibs.zip(fibs.tail).map(n => {
      println("Evaluating: %s -> %s".format(n._1, n._2))
      n._1 + n._2
    })

Now let’s take the first 5:

scala> fibs take 5 foreach println
0
1
Evaluating: 0 -> 1
1
Evaluating: 1 -> 1
2
Evaluating: 1 -> 2
3
scala>

And let’s take them again…

scala> fibs take 5 foreach println
0
1
1
2
3
scala>

And let’s take the first 7…

scala> fibs take 7 foreach println
0
1
1
2
3
Evaluating: 2 -> 3
5
Evaluating: 3 -> 5
8
scala>

Ok, so it works; things are properly evaluated in a lazy manner. But how?

Why…

The culprit behind this magic appears to be the interplay between the Cons class, the StreamIterator class and the LazyCell class:

final class Cons[+A](hd: A, tl: => Stream[A]) extends Stream[A]
// Note that the head is a value of type A and the tail is a by-name Stream
final class StreamIterator[+A](self: Stream[A]) extends Iterator[A] {
  class LazyCell(st: => Stream[A]) {
    // Note the laziness
    lazy val v = st
  }
  private var these = new LazyCell(self)
  def next: A =
    if (isEmpty) Iterator.empty.next
    else {
      // Evaluate the laziness
      val cur = these.v
      // And the concrete value of type A
      val result = cur.head
      // Assign the next lazy cell to be the Stream in the tail
      these = new LazyCell(cur.tail)
      result
    }
}

Couple that together with the recursively defined zip we saw earlier, and you’ve got your lazy Stream of Fibonacci numbers.

I love this stuff…

Note: This is C++0x stuff and it’s really just an investigation into some of the more obvious parts of it, including the auto keyword and lambda functions, as well as working with higher-kinded types (signified in C++ as template-template parameters).

I’ve been doing a lot of Scala recently and have thus become quite spoiled by its awesomeness. It’s made me look at C++ again with a different eye – a more functional eye. I’ve been doing C++ for so long, that I forget about that functional style of programming, and I also forget just how bad C++ is at functional programming – when you’re a multi-paradigm language, it’s hard to be great at all the paradigms simultaneously.

So, I took a stab at writing a more “functional” map function. C++ provides the transform function template in the algorithm header file but I find it lacking because you have to pre-create the container that will be mapped-to. This is an attempt at creating that inside the map function.
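
For what it’s worth, transform can at least grow the destination for you if you hand it a back_inserter rather than an iterator into a pre-sized container; you still have to declare the container yourself, which is the part a map function would hide. A minimal sketch, assuming a C++11 compiler; to_strings is a made-up helper for illustration only:

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper (not part of the post's code): converts each int to
// its decimal string, appending into a list that starts out empty.
std::list<std::string> to_strings(const std::vector<int>& input)
{
  std::list<std::string> output;                  // no resize() needed
  std::transform(input.begin(), input.end(),
                 std::back_inserter(output),      // push_back for each result
                 [](int i) { return std::to_string(i); });
  return output;
}
```

Because back_inserter calls push_back, the element type doesn’t need a default constructor the way a resize-then-overwrite approach does.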

First, we’ll need some header files:

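#include <iostream>
#include <vector>
#include <list>
#include <string>
#include <memory>      // allocator
#include <functional>
#include <algorithm>
#include <iterator>
#include <boost/lexical_cast.hpp>

// The code below uses unqualified names, so pull them in here
using namespace std;
using boost::lexical_cast;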

And now we can define the function.

template <typename InType,
  template <typename U, typename alloc = allocator<U>>
            class InContainer,
  template <typename V, typename alloc = allocator<V>>
            class OutContainer = InContainer,
  typename OutType = InType>
OutContainer<OutType> mapf(const InContainer<InType>& input,
                           function<OutType(const InType&)> func)
{
  OutContainer<OutType> output;
  output.resize(input.size());
  transform(input.begin(), input.end(), output.begin(), func);
  return output;
}

The idea is that the output container may be a different type of container from the input container and the output type may also be of a different type from the input type.

A really simple usage of this could be:

vector<int> v1 = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
auto v2 = mapf(v1, function<int(const int&)>([](const int& i) { return i + 9; }));

The reason it’s simple is that all it does is just take a vector of ints and map it to a vector of ints. That’s nothing flashy.

It’s more interesting if you do something like this:

auto v3 = mapf<float, vector, list>(
            mapf(
              mapf(v1, function<int(const int&)>([](const int& i) { return i + 9; })),
              function<float(const int&)>([](const int& i) {
                return i * 3.1415;
              })),
            function<string(const float&)>([](const float& i) {
              return string("\"" + lexical_cast<string>(i) + "\"");
            }));

This is the standard functional “chaining” we see with more fluid, immutable data structures. Further, it is actually returning a list<string> instead of a vector<string>. In Scala (you could do something similar in Ruby or many other languages) this would be:

v1.map(_ + 9).map(_ * 3.1415).map(_.toString).toList

Now, I’m not exactly happy with what we’ve got here. It’s not that I can’t pimp the int and float types and thus make it more fluid (although that would be nice). It’s more that the template function specification is hideous. In order to actually use different output container types, you must specify too much in the template parameter list.

auto l1 = mapf<int, vector, list>(v1,
  function<int(const int&)>([](const int& i) { return i + 2; }));
copy(l1.begin(), l1.end(), ostream_iterator<int>(cout, "\n"));

Ideally we’d like to simply specify something like mapf<list>(...) instead. I need to work on this and see if I can come up with something nicer. Perhaps it’s as simple as wrapping it up with another function or a template class… when I get more time.
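
One possible shape for that, as a rough sketch rather than a finished answer: put the output container first as a variadic template-template parameter and let the compiler deduce everything else, including the callable, so the lambda no longer needs wrapping in function<...>. The name map_to is hypothetical, and this assumes a C++11 compiler that accepts template <typename...> class for containers like list and vector:

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical name, not from the post. The output container template comes
// first so it is the only template argument the caller has to spell out.
template <template <typename...> class OutContainer,
          typename InContainer,
          typename Func>
auto map_to(const InContainer& input, Func func)
  -> OutContainer<typename std::decay<decltype(func(*input.begin()))>::type>
{
  // The element type of the result is whatever func returns for one element.
  typedef typename std::decay<decltype(func(*input.begin()))>::type OutType;

  OutContainer<OutType> output;
  std::transform(input.begin(), input.end(), std::back_inserter(output), func);
  return output;
}
```

With that in place, a call such as map_to<std::list>(v1, [](int i) { return i + 2; }) should hand back a std::list<int>. The trade-off is that func is now an unconstrained template parameter, so a bad callable shows up as a template error inside map_to instead of at the call site.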

I’ve put my Vim configuration on GitHub now. Have a gander at http://github.com/derekwyatt/vim-config

The Vim ‘indent’ script for Scala that I’ve been working on is now starting to take some decent shape. If you’re a Scala coder and are working in Vim, I can now actually recommend this script :). It’s actually the least complicated one I’ve written that achieves this level of functionality; I’ve had a few kicks at this particular can and one of them in particular was way complicated and also slow.

Go grab it from Github.

Posted with WordPress for BlackBerry.

I found some time to update the indent file for Scala recently. It’s getting better and more complicated, unfortunately. Please feel free to fork it and make changes/fixes… I hate writing this thing :)

Grab it from my vim-scala repository on github.


So, a little over a month ago, I changed departments where I work – I used to work in the Enterprise Software group and have moved to the System Architecture / Research group. There are some great things that go with this:

  1. I get to work with Vim more than I used to because I now code a lot more than I used to.
  2. I’ve effectively deleted Windows entirely and replaced it with Ubuntu (YAY!)
  3. I’m getting a chance to deeply investigate Scala and Akka, which is a real treat! My research may not go anywhere, but the research itself is incredibly useful, I think – Scala and Akka are absolutely fantastic projects.

Once I figure out the right path to take in order to get some changes into the Scala repository for Vim, I’ll do that, but for the moment I’m just having a great time banging away at the docs and shoving things in my own repository.

What I’ve got at the moment is a 4000+ line reference file (in Vim help format, of course) that I’ve put together to help me better understand Scala, and give me a spot to put some critical information and tips / tricks I find as I go. I’ve also fixed up the indent file a bit and plan to enhance the syntax file when I get a chance too.

Because of Scala’s flexibility and terseness I find it a bit difficult to create code snippets for it, but I may do that in the future as well.

If you’re interested, head over to my vim-scala repository on GitHub and clone it down to your vim configuration.

Well, I got another patch from someone for protodef. It was just getting silly that I didn’t have anywhere to actually put this stuff and I was getting tired of people sending patches :). Now y’all can just fix stuff for me directly… go to it.

Head here for protodef.
Head here for fswitch.

I’ve finally updated FSwitch and ProtoDef after a long time waiting. Three guys gave me patches over many months and I’ve finally put them together into an actual couple of releases.

Thanks to Matt Spear, Timon Kelter and Dmitry Bashkatov. Sorry about the wait fellas.

So there’s been some scuttlebutt on the Twitters recently regarding this “Pathogen” script for Vim and I decided to have a look. In a word? “Sweet”. In a few words? Tim Pope is the absolute man.

This is an extremely simple and elegant script. All it does is manipulate the ‘runtimepath’ but it has a nice focus on allowing you to componentize your Vim extensions into their own, private ‘runtimepath’ tree segments. So what? So what?!? Now you can easily upgrade your extensions by just deleting the old tree, downloading the package and exploding it in place.

This would have saved my ass when xptemplate went through a revision that deleted files, and I didn’t notice. Having unwanted, autoloaded files in place was not a good thing.

And you can also just toss git suppositories straight into this as well – perfect updating.

Check out Tammer Saleh’s post called The Modern Vim Config with Pathogen for a concise description of how to get it into your vimrc.

Rick over at Lococast.Net has some great screencasts up for Vim. I’ve watched a couple of them now, and I’m a happy dood… nice stuff! Go check ‘em out. Go… go now… stop… no, don’t do that, you know what I mean… that thing you were going to do, that dirty, naughty, disgusting thing?? Yeah, that. Don’t. Go watch his screencasts instead. Go here instead.
