Tuesday, January 29, 2008

Cute Techie Kids!


I don't really have any excuse for posting this photo of Leo (19 mos. old) sleeping with a bottle dangling from his lips. You wanna make something of it?

I Couldn't Be Happier!

From slate.com:

A study suggests extreme happiness may be bad for you. Findings: 1) "The highest levels of income, education and political participation were reported not by the most satisfied individuals, but by moderately satisfied individuals." 2) Extremely happy people "earned significantly less money" and earned lower school grades than moderately happy people. 3) They "may not live as long," either. Theories: 1) Happiness makes you complacent and kills your drive. 2) It makes you slow to adapt. 3) It makes you too optimistic and insufficiently vigilant about your health. 4) It may overstimulate your cardiovascular system. Researchers' conclusions: 1) "Happiness may need to be moderated for success." 2) "Extremely high levels of happiness might not be a desirable goal." Human Nature's conclusions: 1) Success may need to be moderated for happiness. 2) Extremely high levels of success might not be a desirable goal.

Sunday, January 20, 2008

Huffman Coding in F#

#light // I.e. lightweight syntax

namespace Nassar
// An F# module is a "static class." All the declarations
// that follow are static members of this class.

module Huffman

type Incidence = HashMultiMap<char, int>
type CharSet = Set<char>

let addToCount (d : Incidence) (c : char) =
// For some reason, the "do" is necessary here.
do match d.TryFind c with
Some(n) -> d.Replace(c, n + 1)
_ -> d.Add(c, 1)
d // Return the dictionary.

let count (s : string) =
// I only just noticed that System.String implements
// IEnumerable. The F# compiler correctly inferred from
// the type of Incidence that s has to be a sequence of
// char.

s |> Seq.fold addToCount (Incidence.Create())

type HuffmanNode =
Leaf of HuffmanLeaf
Tree of HuffmanTree
and HuffmanLeaf = {Count: int;
Char: char;}
and HuffmanTree = {Count: int;
Chars: CharSet;
Left: HuffmanNode;
Right: HuffmanNode;}

let get_count node =
match node with
Leaf l -> l.Count
Tree t -> t.Count

let has_char node c =
match node with
Leaf l -> l.Char = c
Tree t -> t.Chars.Contains c

let get_set node =
match node with
Leaf leaf -> CharSet.Singleton leaf.Char
Tree tree -> tree.Chars

let merge_nodes left right =
Tree { Count = get_count left + get_count right;
Chars = (get_set left) + (get_set right);
Left = left;
Right = right }

let node_priority l r =
get_count l - get_count r

let rec reduce_list l =
// What I should really do is put all the new leaf nodes
// into a priority queue, and pull off two at a time as
// long as the length of the queue is > 1. Then
// make_tree wouldn't have to convert the sequence
// it creates from the dictionary into a list.

match l with
l::r::tail -> (merge_nodes l r)::tail
|> List.sort node_priority
// ...and recurse.
|> reduce_list
[tree] -> tree
[] -> failwith "Can't operate on empty list!"

let rec encode h s =
// Encode a character by branching to the leaf node
// with that character.

let rec encode' node c =
match node with
Leaf l -> []
// TODO Add an accumulator to make this tail-recursive,
// rather than prepending the new element.

Tree t -> if has_char t.Left c then 0::(encode' t.Left c) else 1::(encode' t.Right c)
s > Seq.map (fun c -> encode' h c) > Seq.concat

// Decode the int list l using the tree h.
let rec decode h l =
let rec decode' node l =
match node, l with
// If we've arrived at a leaf, this must be the character
// we're looking for. Add it to the sequence, and resume
// from the top of the tree.
// TODO Add an accumulator to make this tail-recursive.

Leaf leaf, _ -> leaf.Char :: (decode h l)
// A 0 in the stream means a left branch during
// the encoding, and v.v.

Tree tree, n::tail -> if n = 0 then decode' tree.Left tail else decode' tree.Right tail
// Something was wrong in the encoding.
_, _ -> failwith "Something has gone very wrong."
match l with
[] -> []
// Start navigating down the tree.
_ -> decode' h l

let make_tree s =
let d = count s
// IDictionary<_, _> implements
// ICollection<KeyValuePair<_,_>>, so enumerate the
// pairs and convert each into a leaf node.

let leaves = d |> Seq.map (fun pair ->
Leaf { Count = pair.Value;
Char = pair.Key }; )
|> Seq.to_list
|> List.sort node_priority
reduce_list leaves

Wednesday, January 16, 2008

Huffman Coding, SICP Exercise 2.69

As a convenience, I defined Incidence as System.Collections.Generic.Dictionary. F#'s Map type
is actually a functor, and each insertion or removal returns a new instance. What I'd really like is something like the FindOrAdd() method available in the C5 dictionary implementations, but I can live without it. Anyway, I intend to fold a sequence of characters into a dictionary, for which purpose I'll first define the function that will do the folding:

> let count (d : Incidence) (c : char) =
- let n = ref 0
- // I need a "ref" variable to serve as an "out" parameter.
- if d.TryGetValue(c, n) then
- // Assign to a mutable property...
- d.[c] <- !n + 1
- else
- d.Add(c, 1)
- d;;
val count : Incidence -> char -> Incidence

I'm getting good at this:

> let d = "Tony Nassar, Software Engineer" |> Seq.fold count (new Incidence());;

val d : Incidence

> d;;
val it : Incidence
= dict
[('T', 1); ('o', 2); ('n', 3); ('y', 1); (' ', 3); ('N', 1); ('a', 3);
('s', 2); ('r', 3); (',', 1); ('S', 1); ('f', 1); ('t', 1); ('w', 1);
('e', 3); ('E', 1); ('g', 1); ('i', 1)]
>
I did not know until today that System.String implements IEnumerable! And, fortunately, F#'s type inference lets me pipe that sequence of characters into my count() function. There are two steps that I need to take now: I need to create a sequence sorted by frequency (actually, a mapping from frequency to character, where the frequencies don't have to be unique) and turn it into a tree. This took me a while. Unlike C#, F# only allows mutually recursive type definitions if they're linked with "and." Here I'm declaring HuffmanNode as a "discriminated union," where Leaf and Tree are implemented as tags (you can verify this by using Reflector to examine the compiled DLL), and a leaf consists of the character to be encoded and its frequency (the latter is necessary during the construction of the tree), and a tree (or any non-leaf node) is...well, it's kinda obvious. If this were C, and someone were using a union and a type code, I'd say "Yuck!" But since it's functional programming, and this is a very handy way to build up a tree, it's suddenly cool again.

> type HuffmanNode =
- | Leaf of HuffmanLeaf
- | Tree of HuffmanTree
- and HuffmanLeaf = { Count : int; Char : char }
- and HuffmanTree = { Count : int; Chars : CharSet; Left : HuffmanNode; Right : HuffmanNode };;

type HuffmanNode =
| Leaf of HuffmanLeaf
| Tree of HuffmanTree
and HuffmanLeaf = {Count: int;
Char: char;}
and HuffmanTree = {Count: int;
Chars: CharSet;
Left: HuffmanNode;
Right: HuffmanNode;}

>

Tuesday, January 15, 2008

Eight Queens: I couldn't let it go!

If you're not used to functional programming, and I'm not, you may respond with irritation at an FPer's "How few lines of code can I use to solve this problem?" showpiece. A good example would be the textbook problem of transposing a matrix (for which the answer is posted on Jomo Fischer's blog). If you really need to transpose a lot of matrices, you're probably going to be working in an environment where such operations are optimized, e.g. Mathematica. In other words, you're not going to do it yourself. So if anyone is irritated with the following code, let me say that I had to ponder this (on and off, not continuously!) for a few months before I arrived at the "right" solution, and I do feel that it's right. However, trying to implement it in F# gave me some valuable insights into how it ought to be done in other languages. Here's my solution:
> let queens size =
- let range = [1..size]
- range |> Seq.fold (fun answers column ->
- let positions = [for row in range -> (column, row)]
- [for p in positions for a in answers when all_safe p a -> a@[p]]) [[]]
- ;;

val queens : int -> (int * int) list list

> queens 5;;
val it : (int * int) list list
= [[(1, 4); (2, 2); (3, 5); (4, 3); (5, 1)];
[(1, 3); (2, 5); (3, 2); (4, 4); (5, 1)];
[(1, 5); (2, 3); (3, 1); (4, 4); (5, 2)];
[(1, 4); (2, 1); (3, 3); (4, 5); (5, 2)];
[(1, 5); (2, 2); (3, 4); (4, 1); (5, 3)];
[(1, 1); (2, 4); (3, 2); (4, 5); (5, 3)];
[(1, 2); (2, 5); (3, 3); (4, 1); (5, 4)];
[(1, 1); (2, 3); (3, 5); (4, 2); (5, 4)];
[(1, 3); (2, 1); (3, 4); (4, 2); (5, 5)];
[(1, 2); (2, 4); (3, 1); (4, 3); (5, 5)]]
> queens 4;;
val it : (int * int) list list
= [[(1, 3); (2, 1); (3, 4); (4, 2)]; [(1, 2); (2, 4); (3, 1); (4, 3)]]

First, I did not use a for loop. Not because that would just be too imperative, but because I realized that the range (e.g. from 1..8, for an 8 x 8 chessboard) was actually an (immutable) object that I needed to use in several ways. To generate all the positions on a chessboard, I would then use a "sequence comprehension" such as { for column_number in range for row_number in range -> (column_number, row_number) }. I've begun to see this as more natural than nested for loops. I'm not doing with side effects in a loop; I'm asking for a single object (a collection of board positions) to be created for me. So the comprehension, a declarative construct, is more natural: "Here's what I want; give it to me. I'm not interested in putting it together piece by piece; that's your job." In comparison, the idiom 1, create a list, 2, go through several loops, appending on each innermost iteration, 3, finally do something with the collection (which is probably mutable, even though it really shouldn't be), obscures the intention, and is error-prone.

Then I iterate over that range (now treating it as the sequence of column numbers), and for every column I take the sequence of answers so far (i.e., each such answer is a sequence of one position each from the columns I've seen so far) and extend each one with each position in the current column (i.e. the current element of the range) that's safe. Below, I defined the functions safe() and all_safe(). I can easily generate board positions for the current column by means of another comprehension (let positions = etc.). Then I take the Cartesian product of the answers so far and this new column, filter all these new candidates through all_safe, and pass this new collection on to the next iteration, via fold(). For obvious reasons, Seq.fold starts with the collection of known 0-length answers, which is of course [[]].

If I were (re)doing it in Scheme, I'd rip off Kent Dybvig's comprehension macro, first off. I'd also rip off an implementation of fold, if I didn't already have one. I'd generate a range 1..n just as I do above. Then, on every iteration (I believe it is an iteration, if I'm folding from the left) I create the Cartesian product of the answers so far and the current column (which I just generated), (filter) these pairs through (all-safe), and continue. Something like that.

Eight Queens, and an End

Once again I was led astray by having done too much imperative programming. I hate nesting loops to create collections, but I didn't go all the way. If the columns and rows of the chessboard are represented as a range—the same range—then all the positions are produced by taking the Cartesian product of that range with itself. Then I can iterate across the columns, and, for each one, take the Cartesian product of the rows in that column and the answers so far, and fold the new answers into a collection that was empty to start with. Ah, fold, delight of all functional programmers. To wit:

> let queens size =
- // the row numbers *and* the column numbers
- let range = [1..size]
- let queens' columns =
- columns |> Seq.fold (fun answers column ->
- let positions = [for row in range -> (column, row)]
- [for p in positions for a in answers when all_safe p a -> a@[p]])
- [[]]
- queens' range;;

val queens : int -> (int * int) list list

> queens 4;;
val it : (int * int) list list
= [[(1, 3); (2, 1); (3, 4); (4, 2)]; [(1, 2); (2, 4); (3, 1); (4, 3)]]
>