Red-black trees

May 7th, 2011 by James Chapman

I explained that a red-black tree is really just a 2-3-4 tree in disguise. I then reviewed Chris Okasaki’s simple, short and elegant treatment of red-black trees (Red-Black Tress in a Functional Setting) which I think is a great example of functional programming and rightly a “functional pearl”. This is interesting stuff I think and not as widely known as it should be.

Further reading:
The correspondence between 2-3-4 trees and red-black trees is made sharper by considering Sedgwick’s Left-Leaning Red-Black Trees. See here for a formalisation in Agda.

Professor Layton and the Projective Plane

December 7th, 2010 by Christian Sattler

Last Friday, Thorsten came up with a puzzle he encountered while playing “Professor Layton and the Unwound Future” with his son. The statement of the puzzle is as follows:

Puzzle #123 – Cat’s-Eye View:

Goal: Determine the maximum number of stones that can be placed in the [4x4] grid allowing for no four stones to form a square or rectangle horizontally or vertically anywhere on the grid.

He guessed the answer to be 9, though lacking a proof.

The discussion quickly went on to generalizations to arbitrary nxn grids. For the first few n, solutions with 1, 3, 6, 9 stones, respectively, were quickly found. The maximality of these can easily be verified: For example, in the case of a grid size of 4, assume there was a solution with 10 stones. Obviously, no single row can be completely filled with stones, so by the pigeon hole principle, there would have to be two rows with 3 stones each. Again by the pigeon hole principle, these two rows must have stones in at least two common columns, yielding a contradicting rectangle.

Venanzio proposed how to construct, from a valid placement of k stones on an nxn grid, a solution for an (n+1)x(n+1) grid with k+3 stones: Assuming, without loss of generality, that the upper right corner is not inhabited by any stone, add a column to the left and a row to the bottom, placing stones in the three newly created corners. Using this construction, we get solutions with 12, 15, and 18 stones for the 5×5, 6×6, and 7×7 grid, respectively.

But are these solutions maximal in the number of stones? As it turned out, no. There is an intriguing connection of the puzzle to finite projective planes we can use to construct better (and, as it turns out, optimal) solutions. For starters, consider the Fano plane:

The Fano plane is a highly symmetric combinatorial object consisting of 7 points, marked in blue, connected by 7 lines, marked in red. Each point lies on 3 lines and each line connects 3 points. Tabulating the line/point incidence relation in a table, with the rows corresponding to lines and the columns to points, and the numbering as indicated in the above picture, yields the following 7×7 grid:

This gives us a solution with 21 stones, in contrast to the 18 stones predicted above.

This construction generalizes to arbitrary projective planes: In general, a projective plane consists of a set of points and a set of lines, sets of points, such that for every two points, there is exactly one line containing these points, and for every two lines, their intersection consists of exactly one point, together with a certain non-degeneration condition. It can be shown that for finite projective planes, there always exists a natural number n such that the number of points as well as the number of lines is n^2 + n + 1, every line contains exactly n + 1 points, and every point is contained in exactly n + 1 lines.

Tabulating the line/point incidence relation as we did for the Fano plane, which is in fact the smallest projective plane possible, choosing an arbitrary numbering in the process, we arrive at a placement of (n + 1)(n^2 + n + 1) stones on an (n^2 + n + 1) \times (n^2 + n + 1) grid. This configuration is valid for the following reason: Four stones forming a rectangle in the grid would correspond to two different points being contained in two different lines, violating the uniqueness conditions of the projective plane.

It is worth noting that although identifying lines with rows and points with columns was an arbitrary decision, identifying lines with columns and points with rows does not yield “new” solutions: There is complete symmetry of duality in the definition of projective planes in regard to the notions of points and lines, meaning these two notions can be interchanged without violating the projective plane conditions.

For what values of n is there a projective plane? For any prime power n = p^k, there is a finite field over p^k elements. Over this field, we can build the affine plane with lines the one-dimensional affine subspaces. Taking the projective closure, we have to add an “infinite” intersection point for any class of parallel lines, of which there are p^k + 1. Finally, we add a line “at infinity” connecting the newly created points. Whether there are projective planes for n not a prime power is an open question.

What remains to be shown is that this construction always yields an optimal solution. For this, consider an arbitrary valid placement of s stones on an mxm grid. For 1 \leq i \leq m, let k_i denote the number of stones in the i-th row. Note that s = \sum_{i=1}^m k_i. Each row i gives rise to subsets \{u, v\} of \{1, \ldots, m\} of size 2 having stones in columns u and v. But by the conditions of the puzzle, for any subset \{u, v\} of \{1, \ldots, m\} of size 2, there can be at most one row containing stones in both columns u and v. Since there are only of such size 2 subsets in total, we get the inequality \sum_{i=1}^m \binom{k_i}{2} \leq \binom{m}{2}, or \sum_{i=1}^m k_i (k_i – 1) \leq m (m – 1). Since f(x) := x(x – 1) is a convex function, we can apply Jensen’s inequality:

Multipliying by 4m and adding m^2, we get 4s(s – m) + m^2 \leq 4m^2(m – 1) + m^2, or (2s – m)^2 \leq m^2(4m – 3), that is, 2s – m \leq m \sqrt{4m – 3}, or

If m is of the form m = n^2 + n + 1, this simplifies to

which coincides with the number of stones we can place using a projective plane construction, proving it to be optimal.

What is more, we can conversely show that each solution to the puzzle on grid size m for which this bound turns into an equality, which in fact can only happen for m of the form n^2 + n + 1, gives rise to a possibly degenerate projective plane on m points, which is non-degenerate for n \geq 2 (the proof is left to the reader). Classes of optimal solutions under row and column renumbering then correspond to isomorphism classes of projective planes. Since on 7 points there is only one projective plane up to isomorphism, we can conclude that our solution for the 7×7 grid is in fact unique (up to row and column renumbering).

Size-indexed logical relations

November 30th, 2010 by Nils Anders Danielsson

Previously we have discussed using the partiality monad to define operational semantics for partial languages. Last Friday I talked about how I have used sized types (in MiniAgda) to define a compatible program inequality relation on top of such a semantics. The definition I ended up with turned out to be quite similar to step-indexed logical relations.

Interested readers can have a look at the (very verbose) MiniAgda code.

Modular datatypes and folding for functions

November 5th, 2010 by Laurence E. Day

Today I presented part of Wouter Swierstra’s “Data types à la carte” (thanks for the correction, French guy!) paper (JFP 2008). To this end, I showed a solution to the expression problem by way of modularising datatypes with coproducts of type constructors and constructing initial algebras for generic functor folds as a supported operation of a type-class, as well as a method for defining ‘smart constructors’. By way of explanation, I defined a function isomorphic to the traditional evaluate function on integers and addition. Discussion then (predictably!) turned towards implementing these ideas in Agda – interest seemed quite high.

data Fix f = In (f (Fix f))
data (f :+: g) e = Inl (f e) | Inr (g e)

fold :: Functor f => (f a -> a) -> Fix f -> a
fold f (In t) = f (fmap (fold f) t)

data Val e = Val Int
data Add e = Add e e
type Expr = Fix (Val :+: Add)

For the class constraint on the fold operation to be satisfied, we need the both the type constructor coproduct and all components to be functors themselves.

instance (Functor f, Functor g) => Functor (f :+: g) where
fmap f (Inl x) = Inl (fmap f x)
fmap g (Inr y) = Inr (fmap g y)

instance Functor Val where
fmap f (Val n) = Val n

instance Functor Add where
fmap f (Add x y) = Add (f x) (f y)

We declare our evaluator algebra as a supported operation of a typeclass defined for an arbitrary functor.

class (Functor f) => Eval f where
  evalAlg :: f Int -> Int

instance (Eval f, Eval g) => Eval (f :+: g) where
  evalAlg (Inl x)  = evalAlg x
  evalAlg (Inr y)  = evalAlg y

instance Eval Val where
  evalAlg (Val n) = n

instance Eval Add where
  evalAlg (Add x y) = x + y

eval               :: (Eval f) => Fix f -> Int
eval               = fold evalAlg

In order to define the smart constructors for manipulating data in this format, we need to define a subtyping relation with (unfortunately) a pair of overlapping instances.

class (Functor sub, Functor sup) => sub :<: sup where
  inj :: sub a -> sup a

instance (Functor f) => (:<:) f f where
  inj = id

instance (Functor f,
          Functor g) => (:<:) f (f :+: g) where
  inj = Inl

instance (Functor f,
          Functor g,
          Functor h,
          (:<:) f g) => (:<:) f (h :+: g) where
  inj = Inr . inj

We're now in a position where we can easily define both a carrier function into a supertype (inject) and smart constructors for our language.

inject      :: (g :<: f) => g (Fix f) -> Fix f
inject      = In . inj

val         :: ((:<:) Val f) => Int -> Fix f
val n       = inject (Val n)

add         :: ((:<:) Add f) => Fix f -> Fix f -> Fix f
n `add` m   = inject (Add n m)

Now we can very easily define instances of the Expr datatype and evaluate them just as we would expect to if our datatype was defined in the standard way:

example :: Expr
example = val 18 `add` val 24

> eval example
> 42

I wish to use the work presented here as part of a framework upon which to construct a compiler fully modular in its source/target languages and the computations it supports.

Compatibility of equalities based on operational semantics

October 29th, 2010 by Nils Anders Danielsson

Today we discussed the kind of issues one can run into when proving that a program equivalence, defined on top of an operational semantics, is a congruence.

Using reflection to improve automatization

May 13th, 2010 by gallais

Last week I talked about a solver for propositional logic that uses reflection. This work is the opportunity to present how one shall develop a solver using reflection.

Reflection

The purpose of reflection is being able to manipulate terms of the language inside the language itself. It allows you to design certified solvers whereas the use of a MetaLanguage (Ltac for example) doesn’t guarantee anything.

Since AIM XI, the latest version of Agda has a couple of new features. One of them is the possibility for the user to have access to the current goal [1]. From now, you can use :

  • A datatype Term that represents the terms in Agda
  • A command quoteGoal t in e which has the typing rule: e[t := `T] : T ⊢ quote t in e : T
  • A command quote which gives you the internal representation of an identifier

A solver will be designed in three steps. Let’s say that the type MyType will represent the set of goals that you want to deal with and that MyTerm will be the representation of the inhabitants of MyType. We need to:

  • Add a proper quoting function taking a Term and outputing a MyType element (preferably a non provable one if the Term has not a good shape)
  • Design the solver taking a MyType term and outputing a MyTerm element
  • Give the semantics of our datatypes and prove the soundness of our solver

A solver for propositional logic

Proving a formula of propositional logic is the same (thanks to Curry-Howard’s isomorphism) as finding a lambda term which is an inhabitant of the corresponding type. Our work is based on the (said to be “structural” but not in Agda’s sense) deduction rules presented in a paper by Roy Dyckhoff and Sara Negri [2].

Implementation

The “MyType” datatype:

data Type : ℕ → Set where
atom : ∀ {n} → Fin n → Type n
⊥ : ∀ {n} → Type n
_∩_ _⊃_ _∪_ : ∀ {m} → Type m → Type m → Type m

The “MyTerm” datatype is more verbose but pretty straight-forward so I won’t include it here. It contains all the basic constructors for this simply-typed lambda caculus with sum and product types (var, lam, app, inj1, inj2, case, proj1, proj2, and).

The only tricky thing is the lift function that lifts all the free variables of a given term because it has to deal with modifications of the environment when going under a lambda.

Interface

The issue of partiality (the formula is maybe not provable) is solved by using dependent types: the solver requires un argument that will either have the type ⊤ if the proposition is provable (the placeholder tt is then inferred by agda) or have the same type as the goal if it is not provable.

Example of the use of the solver:


Ex : ∀ {A B C D : Set} → ((A → B) × (C → A)) → (A ⊎ C) → B × (((A → D) ⊎ D) → D)
Ex {A} {B} {C} {D} = quoteGoal t in solve 4 t (A ∷ B ∷ C ∷ D ∷ []) _

References

[1] See Agda/test/succeed/Reflection.agda and Agda/doc/release-notes/2-2-8.txt

[2] Roy Dyckhoff and Sara Negri, Admissibility of Structural Rules for Contraction-Free Systems of Intuitionistic Logic, http://www.jstor.org/stable/2695061

You can get the source code on the following darcs repository: darcs get http://patch-tag.com/r/gallais/agda

Solving first order stream equations

May 10th, 2010 by fyb

I presented a solver for first order stream equations. Those are simple equations of the form:

 x = 2 :: head (tail x) :: head (3 :: head x :: x) :: tail (4 :: x)

The solver is parametric in the domain type. It turns out that unless the domain is empty, those equations always have solutions. Given a term representing the right hand side of an equation, it builds a template which is a finite representation of its solutions. A template is instantiated by providing an environment in which the under-constrained elements of the stream get a value.

There are two steps: first reduce the term to a normal form by eliminating redexes:

  • head (d :: s) reduces to d
  • tail (d :: s) reduces to s

Then, solve the problem for normal forms. An equation whose right hand side is in normal form looks like:

 x = d_1 :: ... :: d_n :: tail^j x

where the d_i are either constants from the domain or equal to some element x_k of the stream x.

We get n equations: x_i = d_i. We get also a condition on the rest of the stream: if j = n, then there is no constraint on the remaining elements of the solutions (after the n-th), otherwise, there will be a loop of length abs(n-j).

Solving the n conditions boils to extracting the strongly connected components of a graph whose vertices are the x_1 ... x_n and the edges are the dependencies between them. The problem is actually simpler than the general one because each vertex is target of at most one edge.

I implemented such an algorithm in Agda but chickened out from writing the proofs in Agda that the algorithm produced a template whose instances are exactly the solutions of the given equation. I wrote the proofs in English instead.

Bag Equality via a Proof-Relevant Membership Relation

March 16th, 2010 by Nils Anders Danielsson

Last week I talked about a definition of bag equality—equality up to permutation—which I have used recently. The definition works for arbitrary containers, but for simplicity I will focus on lists.

The predicate transformer Any can be defined as follows (in Agda):

  data Any {A : Set} (P : A → Set) : List A → Set where
    here  : ∀ {x xs} → P x      → Any P (x ∷ xs)
    there : ∀ {x xs} → Any P xs → Any P (x ∷ xs)

Any P xs means that any element in xs satisfies P. Using Any and an equality relation it is easy to define the membership relation:

  _∈_ : {A : Set} → A → List A → Set
  x ∈ xs = Any (_≡_ x) xs

Finally bag equality can be defined:

  _≈_ : {A : Set} → List A → List A → Set
  xs ≈ ys = ∀ x → x ∈ xs ⇿ x ∈ ys

Here A ⇿ B is the set of invertible functions between A and B. Two lists xs and ys are equal when viewed as bags if, for every element x, the two sets x ∈ xs and x ∈ ys are equipotent; note that if x occurs a certain number of times in xs, then there will be a corresponding number of distinct elements in x ∈ xs.

The definition above has some nice properties:

  • If _⇿_ is replaced by equivalence (coinhabitance), then we get set equality instead of bag equality. This uniformity can be exploited; one can for instance prove that map preserves bag and set equality using a single, uniform proof.
  • The definition generalises to other containers, including types with infinite values (like streams).
  • The use of Any can give some convenient wiggle room in proofs.

Let me illustrate the last point by proving that bind distributes from the left over append:

  xs >>= (λ x → f x ++ g x) ≈ (xs >>= f) ++ (xs >>= g)

This property does not hold for ordinary list equality (one way to see this is to let xs be true ∷ false ∷ [] and f and g both be return). I will make use of the following properties:

  (∀ x → P₁ x ⇿ P₂ x) → xs₁ ≈ xs₂ → Any P₁ xs₁ ⇿ Any P₂ xs₂
  A ⇿ B → C ⇿ D → (A ⊎ C) ⇿ (B ⊎ D)

  Any P (xs ++ ys)          ⇿  Any P xs ⊎ Any P ys
  Any (λ x → P x ⊎ Q x) xs  ⇿  Any P xs ⊎ Any Q xs
  Any P (map f xs)          ⇿  Any (P ∘ f) xs
  Any P (concat xss)        ⇿  Any (Any P) xss

From the last two properties we get a derived property for bind:

  Any P (xs >>= f)           ⇿
  Any P (concat (map f xs))  ⇿
  Any (Any P) (map f xs)     ⇿
  Any (Any P ∘ f) xs

Using these properties we can prove left distributivity. The proof is a simple calculation:

  y ∈ (xs >>= λ x → f x ++ g x)                    ⇿
  Any (λ x → y ∈ f x ++ g x) xs                    ⇿
  Any (λ x → y ∈ f x ⊎ y ∈ g x) xs                 ⇿
  Any (λ x → y ∈ f x) xs ⊎ Any (λ x → y ∈ g x) xs  ⇿
  y ∈ (xs >>= f) ⊎ y ∈ (xs >>= g)                  ⇿
  y ∈ (xs >>= f) ++ (xs >>= g)

The “wiggle room” I mentioned above is visible in the calculation: the proof does not refer solely to _∈_ y, i.e. Any (_≡_ y), but also to Any P for other predicates P.

The definitions of Any and bag and set equality are available in Agda’s standard library (Data.List.Any), along with the proof of left distributivity (Data.List.Any.BagAndSetEquality).

Causal Semantics of (A)FRP

March 8th, 2010 by Neil Sculthorpe

Having being inspired by Conal Elliott’s blog post about the inaccuracy of the (Arrowised) FRP semantical model, I was trying to find as simple a model as possible that is inherently causal.

The basic concept in FRP is of time-varying values called signals (or behaviours):

Signal : Set -> Set
Signal A = Time -> A

Where Time is the non-negative real numbers.

In Arrowised FRP there are then Signal Functions, conceptually functions between signals:

SF : Set -> Set -> Set
SF A B = Signal A -> Signal B

Signal functions are nicely modular, and AFRP programs are constructed by composing signal functions together.

As an example of a signal function, consider “delay”, which delays an input signal by a specified amount of time:

delay : Time -> SF A A
delay d = \ s t -> s (t - d)

The problem with this semantics of SF, is that it isn’t inherently causal. One could define a “future” signal function that is similar to delay, but produces the future input as the current output.

future : Time -> SF A A
future d = \ s t -> s (t + d)

In FRP we want to always be able to produce the current output from the current and past inputs, so this is bad.

Consequently, a causality constraint is usually placed on signal functions. All primitive signal functions must be causal, and all primitive combinators must preserve causality.

Causality can be defined:

Causal : SF A B -> Set
Causal sf = (s1 s2 : Signal A) -> (t : Time) -> ((t' : Time) -> (t' <= t) -> (s1 t' == s2 t')) -> (sf s1 t == sf s2 t)

So a causal signal function is a product of the signal function and the causality property:

CausalSF A B = Sigma (SF A B) Causal

I was hoping to find a model of AFRP that is inherently causal, and so doesn’t require the addition of a causality property on top of it. Starting from the definition of SF:

SF A B = Signal A -> Signal B
{expanding}
SF A B = (Time -> A) -> (Time -> B)
{flipping arguments}
SF A B = Time -> (Time -> A) -> B
{constraining the input signal}
SF A B = (t : Time) -> ((t' : Time) -> (t' <= t) -> A) -> B

We’re now constraining the input signal. We are now only required to know the input signal up to the point in time at which we are sampling the output.

As far as I can tell, this is now inherently causal, and it seems to work out okay for defining primitives. For example, composition is:

_>>>_ : SF A B -> SF B C -> SF A C
sf1 >>> sf2 = \ t sa -> sf2 t (\ t' ltB -> sf1 t' (\ t'' ltA -> sa t'' (transitive ltA ltB)))

However, so far we’ve only considered continuous signals. FRP also contains discrete signals, most notably events. Events only occur at certain points in time, and the event signal is undefined elsewhere. We require that only a finite number of events occur within any finite interval. This can be modelled as a (co)list of time-delta/value pairs:

EventSignal : Set -> Set
EventSignal A = List (Dt , A)

where Dt is the type of positive Time, representing the time since the previous event occurrence.

We can incorporate this into the original model by parametrising signals with a signal descriptor that contains this time-domain information (rather than just a Set):

data SigDesc : Set1 where
C : Set -> SigDesc
E : Set -> SigDesc

Signal : SigDesc -> Set
Signal (C A) = Time -> A
Signal (E A) = List (Dt , A)

SF : SigDesc -> SigDesc -> Set
SF x y = Signal x -> Signal y

However, it doesn’t work with the causal version, as we don’t have the same Signal types. Once we’ve put the time constraint on, we really have a signal history. That is, a signal indexed by some finish time:

SignalHistory : SigDesc -> Time -> Set
SignalHistory (C A) t = (t' : Time) -> (t' <= t) -> A
SignalHistory (E A) t = EventList A t

where an EventList is just a list of time-delta/value pairs with an added constraint that the sum of all the time deltas does not exceed the index. (If anyone can suggest a better datatype for this then please do so; this is just the first one that came to mind.)

data EventList (A : Set) : Time -> Set where
[] : {t : Time} -> EventList A t
cons : {t t' : Time} -> (d : Dt) -> A -> EventList A t'
-> (t' + d) <= t -> EventList A t

Returning to our signal function definition, we can now define signal functions for either kind of input signal, but only continuous output signals:

SF : SigDesc -> SigDesc -> Set
SF x (C B) = (t : Time) -> SignalHistory x t -> B
SF x (E B) = ?

And that’s where I got stuck stuck.

I did consider:

SF : SigDesc -> SigDesc -> Set
SF x y = (t : Time) -> SignalHistory x t -> SignalHistory y t

but this isn’t properly causal. It is to some degree (at any given point in time, the output signal history does not depend upon future input), but there’s nothing preventing the past from being rewritten later (because at each point in time a new signal history is produced, which may or may not be consistent with previous histories).

The codensity monad

February 12th, 2010 by Graham

Today I gave a introduction to the codensity monad aimed at the “functional programmer in the street”, rather than the “category theorist in the cupboard”. The emphasis was on understanding how one can arrive at the codensity monad in a natural manner starting from the simple and familiar example of improving the performance of list reverse.