Friday, October 30, 2009

99 Beers!

I was perusing Lau Jensen's excellent Best In Class blog. Now, Lau Jensen is a Clojure-person, and that's one of the best languages out there in my opinion. So, when he compares it to other languages, that's never a nice experience to those being compared.

Personally, I think OO and infix methods (dot-style or operator notation) are essential for a well-rounded, popular language, which makes something of an uncrossable chasm between Clojure and me for "serious" stuff.

At any rate, his recent second take at Python got me curious about the Rosetta Code project. First, I got curious about his discussion of the Counting Programming Example problem, and the Python and Clojure implementations shown. I took my own shot at it, going the parallel way like the Clojure version, but I'm unsatisfied with the result. Sure, it is short, but it lacks... sense of cohesion, I guess. In fact, I just changed it from what I submitted to Rosetta. Here:

import java.net.{URL, URLEncoder}

val URLs = Map(
'allTasks -> "http://www.rosettacode.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Programming_Tasks&cmlimit=500&format=xml",
'oneTask -> "http://www.rosettacode.org/w/index.php?title=%s&action=raw"
)
def oneTaskURL(title: String) = URLs('oneTask) format URLEncoder.encode(title.replace(' ', '_'), "UTF-8")

def getPage(url: String) = scala.io.Source.fromURL(new URL(url))(scala.io.Codec.UTF8)

val pattern = "(?i)==\\{\\{header\\|".r
def countPattern(title: String) =
pattern findAllIn (getPage(oneTaskURL(title)) mkString) length

val allTasks = scala.xml.parsing.XhtmlParser(getPage(URLs('allTasks)))
val counts =
for (task <- allTasks \\ "cm" \\ "@title" map (_.text))
yield actors.Futures.future((task, countPattern(task)))

counts map (_.apply) map Function.tupled("%s: %d examples." format (_, _)) foreach println
println("\nTotal: %d examples." format (counts map (_.apply._2) sum))

I won't bother explaining it, because I think it is self-explanatory, if a bit arcane. Scala's partial XPath support is put to good use, and future just works here. In case you aren't familiar with Future, it just detaches a thread to handle some computation you passed, and then, when you call "apply", it joins that thread and get the result. Sort of. If anyone wants to take a go at making this better, I'd like to see the results.

At any rate, while perusing the Rosetta site, I noticed that 99 Bottles of Beer problem allowed for creativity in reaching a solution. Most examples went for conciseness. My own concise Scala looks like this:

99 to 1 by -1 foreach {n =>
println("""|%d bottles of beer on the wall
|%d bottles of beer
|Take one down, pass it around
|%d bottles of beer on the wall\n""".stripMargin format (n, n, n -1))}

However, as I thought about the problem and how to make it interesting, I decided to model it with Actors. Sure, it is an absolute overkill for this problem, but I think the modelling itself shows off Actors pretty nice.

Here it is:

object Song {
import scala.actors._
import scala.actors.Actor._

abstract class Beverage { def name = this.toString.toLowerCase }
case object Beer extends Beverage

object Wall {
private var contents: List[Beverage] = Nil

def count(what: Beverage) = contents count (_ == what)
def isEmpty = contents isEmpty
def stock(n: Int, what: Beverage) = contents :::= List.fill(n)(what)
def get(what: Beverage) {
def takeOneFrom(contents: List[Beverage]): List[Beverage] = contents match {
case `what` :: rest => rest
case other :: rest => other :: takeOneFrom(rest)
case Nil => println("Sorry, we are out of "+what.name); Nil
}
contents = takeOneFrom(contents)
}
}

sealed abstract class Messages
case class SingSong(what: Beverage) extends Messages
case class HowManyMore(what: Beverage) extends Messages
case class HowManyNow(what: Beverage) extends Messages
case class ThereAreStill(n: Int, what: Beverage) extends Messages
case class ThereAreNow(n: Int, what: Beverage) extends Messages
case class Gimme(what: Beverage) extends Messages
case class HereIs(what: Beverage) extends Messages
case object ClosingTime extends Messages

def plural(count: Int, noun: String, nouns: String) = if (count == 1) noun else nouns
def countIt(n: Int, what: Beverage) = "%d %s of %s" format (n, plural(n, "bottle", "bottles"), what.name)

object Waitress extends Actor {
def tellThem(what: String) = println("%s on the wall" format what)
def act = loop {
react {
case HowManyMore(it) =>
val total = Wall count it
tellThem(countIt(total, it))
reply (ThereAreStill(total, it))
case Gimme(it) =>
print("Take one down, ")
Wall get it
reply (HereIs(it))
case HowManyNow(it) =>
val total = Wall count it
tellThem(countIt(total, it))
if (Wall isEmpty) {
reply (ClosingTime) // You don't have to go home, but you can't stay here
exit
} else
reply (ThereAreNow(total, it))
case _ =>
println("You wish, honey!")
}
}
}

object Patrons extends Actor {
def act = loop {
react {
case SingSong(what: Beverage) =>
Waitress ! HowManyMore(what)
case ThereAreStill(n, it) =>
println(countIt(n, it))
Waitress ! Gimme(it)
case HereIs(it) =>
println("pass it around")
Waitress ! HowManyNow(it)
case ThereAreNow(n, it) =>
println()
Waitress ! HowManyMore(it)
case ClosingTime =>
exit
case _ =>
println("Say what???")
}
}
}

def Sing99Beers = {
Wall stock (99, Beer)
Waitress.start
Patrons.start

Patrons ! SingSong(Beer)
}
}

Now, this code has a structure to it. To begin with, there are two concerns. First, there is the "wall", where the beverages are kept, and the beverages themselves. The beverages are modeled as case objects derived from a sealed class, which is an alternative to enumerations, as I have discussed in my Matrix series.

I suppose the wall ought to be synchronized, but I'm working with an assumption here. Only one agent can get things from the wall, and the wall can only be restocked while that agent is not working. It would be nice to have this enforced by the compiler, and a paper from the EPFL guys gives me hope of being able to in the future.

The second concern is the interaction between the patrons and the waitress. Again, it is further divided into two: the messages they exchange between themselves, and the actors themselves and their behavior.

Defining the messages as a subclass from an abstract class doesn't really give me anything. The Actors in the standard library are untyped, so they can receive anything as message. There is one typed Actor library for Scala, though, which is a model much more to my liking. If used, I'd not have to provide for handling of unknown messages.

The song gets sang as the patrons ask how many beers there are, the waitress answers, the patrons order one beer, get it, and then ask again how many beers there are. No actor ever waits for the answer. Instead, they finish their present reaction, and wait for another message -- which might be the answer they were waiting for, or not. What the actor do is solely defined by what message they received.

To get the song started, the wall is first stocked, patrons and waitress started, and, then, the patrons are asked to begin the song. I'm pretty sure bar owners everywhere hope it would be this easy to get people drinking until the stock is sold-out. :-)

I hope you enjoyed this post half as much as I enjoyed writing that code! If you are feeling the need to write some code right now, try modifying the program so that, when the stock of whatever is being consumed is finished, the patrons ask what else there is, and start a new song on a new beverage.

Or, better yet, why don't you find a fun problem on Rosetta Code, for which no solution exists in a language you like, and submit a solution to it? It's as easy as editting a wiki article, so go for it!

Wednesday, October 28, 2009

Strict Ranges?

Note: This article was created before Scala 2.8.0 was out, and is mostly related to a problem that is no longer present as of that version. The code in here is supposed to be run on Scala 2.7.x unless otherwise noted, and the output of Scala 2.8.0 usually refers to some work-in-progress version of it, not the final version.

A common mistake newcomers to Scala usually make is to treat Scala ranges as strict. It's easy to see why they make such mistake, because the very expression "strict", though relatively easy to find in the Scala API documentation, is unfamiliar to them.

So, before getting into the problem, I'll make a brief detour. In functional languages, there is a concept of "strict functions" and "non-strict functions". Informally, a "strict function" is one which always evaluates its argument. For programmers from a non-functional background, that's most of the functions they ever saw, but not all. For instance, the trinary operator, "test ? f1 : f2", does not evaluate all its arguments, f1 and f2, even though it always evaluate at least one of them.

Now, to see where that can lead to trouble, let's first see an example of non-strictness in action, with Scala 2.7:

scala> object X {
| val x = -10 to 10 map (2 / _)
| }
defined module X

scala> object Y {
| val y = List.range(-10, 10) map (2 / _)
| }
defined module Y

scala> Y.y(0)
java.lang.ArithmeticException: / by zero
at Y$$anonfun$1.apply(<console>:5)
at Y$$anonfun$1.apply(<console>:5)
at scala.List.map(List.scala:812)
at Y$.<init>(<console>:5)
at Y$.<clinit>(<console>)
at .<init>(<console>:6)
at .<clinit>(<console>)
at RequestResult$.<init>(<console>:3)
at RequestResult$.<clinit>(<console>)
at RequestResult$result(<console>)
at sun.reflect.Nativ...
scala> X.x(0)
res2: Int = 0

We see an error happens when the map on the list is applied, but not when the map on the range is applied. That's because the function "map" for Range (the result of "x to y" or "x until y" expressions) is non-strict. I can get the value for any part of that Range, except the eleventh element, without throwing an exception.

But let say we have a function expensiveComputation, which takes quite a while to process. What happens, then, if I do this:

scala> import scala.actors.Futures._
import scala.actors.Futures._

scala> val m = 1 to 1000 map (i => future(expensiveComputation(i)))
m: RandomAccessSeq.Projection[scala.actors.Future[Int]] = RangeM(<function>, <function>, <function>,
<function>, <function>, <function>, <function>, <function>, <function>, <function>, <function>, <fu
nction>, <function>, <function>, <function>, <function>, <function>, <function>, <function>, <functi
on>, <function>, <function>, <function>, <function>, <function>, <function>, <function>...

If you haven't seen it before, the "future" function uses threads to compute the function you passed to it. Your program can go on to do other stuff, while all that work is being done concurrently.

Except, of course, that's not what's happening. No computation is being done, because Range's non-strict map hasn't evaluated its arguments. In fact, each time I call "apply" to get the result of the computation, the function I passed to map (the future) will be computed again! For example:

scala> def expensiveComputation(n: Int) = {
|   println("Starting expensive computation")
|   val x = n * 2
|   println("Finished expensive computation")
|   x
| }
expensiveComputation: (Int)Int

scala> val m = 1 to 10 map (i => future(expensiveComputation(i)))
(output from "m.toString" skipped)

scala> m(0).apply
Starting expensive computation
Finished expensive computation
res12: Int = 2

scala> m(0).apply
Starting expensive computation
Finished expensive computation
res13: Int = 2

To see where that can be a problem to beginners, take a look at the code in this question at Stack Overflow:

def ttest() = {
val threads =
for (i <- 1 to 5)
yield new Thread() {
override def run() {
println("going to sleep")
Thread.sleep(1000)
println("awake now")
}
}

threads.foreach(t => t.start())
threads.foreach(t => t.join())
println("all done")
}

Can you imagine what happens?

Well, gladly, Scala 2.8 should have a strict Range, so newcomers will be spared the trouble and lost debugging time. Here's what this will look like with Scala 2.8:

scala> val m = 1 to 10 map (i => future(expensiveComputation(i)))
Starting expensive computation
Finished expensive computation
Starting expensive computation
Starting expensive computation
Finished expensive computation
Starting expensive computation
Finished expensive computation
Starting expensive computation
Finished expensive computation
Starting expensive computation
Finished expensive computation
Starting expensive computation
Finished expensive computation
Starting expensive computation
Finished expensive computation
Starting expensive computation
m: scala.collection.immutable.Vector[scala.actors.Future[Int]] = NewVectorImpl(<function0>, <function0>, <function0>, <f
unction0>, <function0>, <function0>, <function0>, <function0>, <function0>, <function0>)
Finished expensive computation
Starting expensive computation
Finished expensive computation

scala> Finished expensive computation


scala> m(0).apply
res11: Int = 2

scala> m(0).apply
res12: Int = 2

Also, if you try out the first code snippet on Scala 2.8, "x.x(0)" will also throw an exception. Now, if we want the previous behavior, we can just call the "view" method. It will work not only on Range, but on pretty much all standard collections, as it is defined on the class IterableLike.

scala> val m = (1 to 10 view) map (i => future(expensiveComputation(i)))
m: scala.collection.SeqView[scala.actors.Future[Int],Seq[_]] = SeqViewM(...)

scala> m.toList
Starting expensive computation
Finished expensive computation
Starting expensive computation
Finished expensive computation
Starting expensive computation
Finished expensive computation
Starting expensive computation
Starting expensive computation
Starting expensive computation
Starting expensive computation
Finished expensive computation
Starting expensive computation
Finished expensive computation
Finished expensive computation
Finished expensive computation
Finished expensive computation
Starting expensive computation
Starting expensive computation
Finished expensive computation
res5: List[scala.actors.Future[Int]] = List(, , , , , 
, , , , )
Finished expensive computation

scala> m(0).apply
Starting expensive computation
Finished expensive computation
res6: Int = 2

scala> m(0).apply
Starting expensive computation
Finished expensive computation
res7: Int = 2

And, for the first example:

scala> object X {
| val x = (-10 to 10 view) map (2 / _)
| }
defined module X

scala> object Y {
| val y = List.range(-10, 10).view map (2 / _)
| }
defined module Y

scala> Y.y(0)
res0: Int = 0

scala> X.x(0)
res1: Int = 0

And, speaking of Scala 2.8 changes, there is one that is also likely to reduce confusion among newcomers, and is related to what we spoke of. It's related to how guards on for-comprehensions work. For instance, take a look at this code:

scala> def find(n: Int, l: List[Int]) = {
| var found = false
| for (el <- l; if !found) found = el == n
| found
| }
find: (Int,List[Int])Boolean

scala> find(5, List.range(1, 10))
res14: Boolean = false

This code doesn't work as expected because of how for-comprehensions are translated:

l.filter(el => !found).foreach(el => found = el == n)

Since "filter" for List is a strict method, this just doesn't work, as "filter" evaluates before "foreach". However, Range's "filter" on Scala 2.7 is non-strict. Does it work? Yes:

scala> def findR(n: Int, l: Range) = {
| var found = false
| for (el <- l; if !found) found = el == n
| found
| }
findR: (Int,Range)Boolean

scala> findR(5, 1 to 10)
res17: Boolean = true

But, back to the first example, as of Scala 2.8 the for-comprehension code will translate like this:

l.withFilter(el => !found).foreach(el => found = el == n)

This new function, "withFilter", will be non-strict for standard Scala collections. There's no way, however, to ensure that third-party collections will preserve the non-strictness. Hopefully, people will act in a sensible manner! :-) At any rate, the "find" function, as written, will work on Scala 2.8.

I hope this helps people working with Scala 2.7, and hope even more for Scala 2.8 to arrive soon! As a parting thought, though, I'd like to briefly speak of "<-", seen in for-comprehensions. I have seen people calling it many things. Martin Odersky's (et al) Programming in Scala book, however, calls it "in". So "for (el <- l)" ought to read "for el in l". So, if you weren't sure what to call it, now you know! Happy coding!