summaryrefslogtreecommitdiff
path: root/tfm.tex
diff options
context:
space:
mode:
Diffstat (limited to 'tfm.tex')
-rw-r--r--tfm.tex416
1 files changed, 396 insertions, 20 deletions
diff --git a/tfm.tex b/tfm.tex
index 946ffa5..0a0ce0a 100644
--- a/tfm.tex
+++ b/tfm.tex
@@ -646,6 +646,7 @@ context)
\subsection{Extensions}
+\label{sec:extensions}
Until there we have seen the ``basic'', or ``pure'' Lambda Calculus, where the
only data primitive is the function. Being turing complete, it is all that is
@@ -767,7 +768,9 @@ To begin our study of the implementation of abstract machines, we will start
with the Krivine Machine. It is a relatively simple and well-known model that
will help us see the steps that we need to take in order to implement a real
abstract machine. We will be using the OCaml language from now on (not only in
-this section but also in the next ones).
+this section but also in the next ones), and while snippets of code will be
+presented to illustrate the concepts, the full code is available in the
+annex \ref{sec:kriv-mach-source} for reference.
The Krivine Machine \cite{Krivine07} is an implementation of the weak-head
call-by-name reduction strategy for pure lambda terms. What that means is that:
@@ -800,21 +803,25 @@ module Lambda = struct
| App (m1, m2) -> "(" ^ show m1 ^ " " ^ show m2 ^ ")"
| Abs (x, m) -> "(λ" ^ show_symbol x ^ "." ^ show m ^ ")"
end
- \end{minted}
+ \end{minted}
\end{code}
(From now on, the auxiliar (e.g., pretty-printing) functions will be ommited for
brevity.)
-Now we have a representation of the pure Lambda Calculus. As the Krivine Machine
-usually works with expressions in \textbf{de Bruijn} notation\footnote{ The de
+The module Lambda encodes the pure Lambda Calculus with its main type <<t>>
+\footnote{Naming the main type of a module <<t>> is an idiom when using modules
+ in OCaml}. Variables are just symbols (strings tagged with an integer so that
+they're unique), Applications are pairs of terms and Abstractions are pairs of
+symbols (the binding variable) and terms (the body). As the Krivine Machine
+usually works with expressions in \textbf{de Bruijn} notation\footnote{The de
Bruijn's notation gets rid of variable names by replacing them by the number
of ``lambdas'' between it and the lambda that is binding the variable. For
example, $(\lambda x . (\lambda y . (x \: y)))$ will be written in de Bruijn
notation as $(\lambda . (\lambda . (1 \: 0)))$}, we'll need to write an
-algorithm to do the conversion. To be sure we do not accidentally build terms
-mixing the two notatios, we'll create another module, \textbf{DBILambda}, with a
-different data type to represent them:
+algorithm to do the conversion of variables. To be sure we do not accidentally
+build terms mixing the two notatios, we'll create another module,
+\textbf{DBILambda}, with a different data type to represent them:
\begin{code}
\begin{minted}[fontsize=\scriptsize]{ocaml}
@@ -835,40 +842,408 @@ module DBILambda = struct
| Lambda.Abs (x, m) -> Abs (x, of_lambda (x :: dbis) m)
in of_lambda []
end
- \end{minted}
+ \end{minted}
\end{code}
-The new variables areThe function <<of_lambda>> accepts traditional lambda terms (Lambda.t) and
-returns its representation as a term using de Bruijn notation (DBILambda.t). Now
-we are ready to implement the actual reduction.
+The new variables, of type <<dbi_symbol>>, store the de Bruijn number together
+with the previous value of the symbol, to help debugging and
+pretty-printing. The function <<of_lambda>> accepts traditional lambda terms
+(Lambda.t) and returns its representation as a term using de Bruijn notation
+(DBILambda.t). Now we are ready to implement the actual reduction.
-The Krivine Machine has a state consisting on the code it is evaluating
-(\textbf{C}), an environmentThe typical description of the Krivine Machine \cite{Douence07} is given by the
+The Krivine Machine has a state $(C, S, E)$ consisting on the \textbf{code} it
+is evaluating ($C$), an auxiliar \textbf{stack} ($S$) and the current
+\textbf{environment} ($E$). The code is just a Lambda expression, the stack is
+where the machine will store \textbf{closures} consisting of non-evaluated code
+together with its environment at the time the closure was created, and an
+environment associates variables (de Bruijn indices) to closures.
+
+The typical description of the Krivine Machine \cite{Douence07} is given by the
following set of rules:
+\newpage
\begin{center}
\line(1,0){250}
\end{center}
\begin{table}[H]
\centering
\begin{tabular}{lllll}
-$(M N, S, E)$ & \rightsquigarrow & $(M, (S,(N,E)), E)$ \\ \\
-$(\lambda M, (S,N), E)$ & \rightsquigarrow & $(M, S, (E,N))$ \\ \\
-$(i+1, S, (E,N))$ & \rightsquigarrow & $(i, S, E)$ \\ \\
-$(0, S, (E_1,(M,E_2)))$ & \rightsquigarrow & $(M, S, E_2)$
+$(M N, S, E)$ & \rightsquigarrow & $(M, (N,E) :: S, E)$ \\ \\
+$(\lambda M, N :: S, E)$ & \rightsquigarrow & $(M, S, N :: E)$ \\ \\
+$(i+1, S, N :: E)$ & \rightsquigarrow & $(i, S, E)$ \\ \\
+$(0, S, (M,E_1) :: E_2)$ & \rightsquigarrow & $(M, S, E_1)$
\end{tabular}
\end{table}
\begin{center}
\line(1,0){250}
\end{center}
-\section{With power} %%
+In the previous diagram, $S$ and $E$ are both described as \textbf{lists}, with
+the syntax (H :: T) meaning ``list whose head is H and tail is T''. The stack
+$S$ is a list of closures that implements the push/pop operations over its
+head. The environment is a list whose $i$-th position stores the variable with
+de Bruijn index $i$.
+
+\begin{itemize}
+\item In the first rule, to evaluate an application $M N$, the machine builds a
+ closure from the argument $N$ and the current environment $E$, pushes it into
+ the stack, and keeps on reducing the function $M$.
+\item To reduce an abstraction $\lambda M$, the top of the stack is moved to the
+ environment and proceeds with the reduction of the body $M$. What this means
+ is that the last closure in the stack (the function argument) is now going to
+ be the variable in position (de Bruijn index) 0.
+\item The last two rules rearch through the environment to find the closure
+ corresponding to the current variable. Once it is found, the closure's code
+ $M$ and environment $E_1$ replace the current ones.
+\end{itemize}
+
+From this specification we can write a third module, \textbf{KM}, to define the
+data structures (state, closure, stack) and implement the reduction rules of the
+Krivine Machine:
+
+\newpage
+\begin{code}
+ \begin{minted}[fontsize=\scriptsize]{ocaml}
+module KM = struct
+ open DBILambda
+
+ type st_elm = Clos of DBILambda.t * stack
+ and stack = st_elm list
+
+ type state = DBILambda.t * stack * stack
+
+ let reduce m =
+ let rec reduce (st : state) : DBILambda.t =
+ match st with
+ (* Pure lambda calculus *)
+ | (Var (0, _), s, Clos (m, e') :: e) -> reduce (m, s, e')
+ | (Var (n, x), s, _ :: e) -> reduce (Var (n-1, x), s, e)
+ | (App (m1, m2), s, e) -> reduce (m1, Clos (m2, e) :: s, e)
+ | (Abs (_, m), c :: s, e) -> reduce (m, s, c :: e)
+ (* Termination checks *)
+ | (m, [], []) -> m
+ | (_, _, _) -> m in
+ reduce (m, [], [])
+end
+ \end{minted}
+\end{code}
+
+At this point, we have a working implementation of the Krivine Machine and can
+execute some tests to see that everything works as expected. We'll write some
+example $\lambda$-terms:
+\begin{code}
+ \begin{minted}[fontsize=\scriptsize]{ocaml}
+let m1 = (* (λx.((λy.y) x)) *)
+ let x = symbol "x" in
+ let y = symbol "y" in
+ Abs (x, App (Abs (y, Var y), Var x))
+
+let m2 = (* (((λx.(λy.(y x))) (λz.z)) (λy.y)) *)
+ let x = symbol "x" in
+ let y = symbol "y" in
+ let z = symbol "z" in
+ App (App (Abs (x, Abs (y, App (Var y, Var x))), Abs (z, Var z)), Abs (y, Var y))
+ \end{minted}
+\end{code}
+And lastly, a helper function that accepts a $\lambda$-term, translates it to de
+Bruijn notation and reduces it, outputting the results:
-%\section{Compiled} %%
+\begin{code}
+ \begin{minted}[fontsize=\scriptsize]{ocaml}
+let dbi_and_red m =
+ let dbi_m = DBILambda.of_lambda m in
+ print_endline ("# Lambda term:\n " ^ DBILambda.show dbi_m);
+ let red_m = KM.reduce dbi_m in
+ print_endline ("# Reduced term:\n " ^ DBILambda.show red_m);
+ print_endline "-----------------------------------------------\n"
+ \end{minted}
+\end{code}
+\newpage
+The results:
+\begin{code}
+\begin{verbatim}
+ocaml> List.iter dbi_and_red [m1; m2];;
+
+# Lambda term:
+ (λx.((λy.y) x))
+# Reduced term:
+ (λx.((λy.y) x))
+----------------------------------------------------------
+
+# Lambda term:
+ (((λx.(λy.(y x))) (λz.z)) (λy.y))
+# Reduced term:
+ (λz.z)
+----------------------------------------------------------
+\end{verbatim}
+\end{code}
+
+As we can see, the first term is not reduced because it is already in weak head
+normal form (abstraction). The second term reduces as we would expect. As a
+sidenote, we can easily tweak the DBILambda module to show the real de Bruijn
+variables:
+
+\begin{code}
+\begin{verbatim}
+ocaml> List.iter dbi_and_red [m1; m2];;
+
+# Lambda term:
+ (λ.((λ.0) 0))
+# Reduced term:
+ (λ.((λ.0) 0))
+----------------------------------------------------------
+
+# Lambda term:
+ (((λ.(λ.(0 1))) (λ.0)) (λ.0))
+# Reduced term:
+ (λ.0)
+----------------------------------------------------------
+\end{verbatim}
+\end{code}
+
+
+
+\section{Extending the machine} %%
+
+Now that we have a working Krivine Machine over basic Lambda Terms, we want to
+extend it to work with the extensions we saw in section \ref{sec:extensions}:
+\textbf{case expressions} and \textbf{fixpoints}.
+
+
+\subsection{Case expressions}
+
+%% TODO: maybe later
+% \begin{center}
+% \line(1,0){250}
+% \end{center}
+% \begin{table}[H]
+% \centering
+% \begin{tabular}{lllll}
+% $(Case(M, BS), S, E)$ & \rightsquigarrow & $(M, CaseCont(BS,E) :: S, E)$ \\ \\
+% $(Constr(X, MS), CaseCont(X, AS) :: S, E)$ & \rightsquigarrow & $(M, CaseCont(BS,E)
+% :: S, E)$ \\ \\
+% $(\lambda M, N :: S, E)$ & \rightsquigarrow & $(M, S, N :: E)$ \\ \\
+% $(0, S, (M,E_1) :: E_2)$ & \rightsquigarrow & $(M, S, E_1)$
+% \end{tabular}
+% \end{table}
+% \begin{center}
+% \line(1,0){250}
+% \end{center}
+
+First, we will need to extend our definition of the Lambda Calculus to support
+constructors and case expressions, both in Lambda and DBILambda:
+
+\begin{code}
+ \begin{minted}[fontsize=\scriptsize]{ocaml}
+module Lambda = struct
+ type t = Var of symbol | App of t * t | Abs of symbol * t
+ (* constructors / case expressions *)
+ | Constr of t constr
+ | Case of t * (symbol constr * t) list
+ and 'a constr = symbol * 'a list
+
+ (* ... *)
+end
+
+module DBILambda = struct
+ type t = Var of dbi_symbol | App of t * t | Abs of symbol * t
+ (* constructors / case expressions *)
+ | Constr of t constr
+ | Case of t * (symbol constr * t) list
+ and 'a constr = symbol * 'a list
+
+ let of_lambda =
+ let rec of_lambda dbis = function
+ (* ... *)
+ | Lambda.Constr (x, ms) -> Constr (x, List.map (of_lambda dbis) ms)
+ | Lambda.Case (m, bs) -> Case (of_lambda dbis m,
+ List.map (trans_br dbis) bs)
+ in of_lambda []
+
+ (* ... *)
+end
+ \end{minted}
+\end{code}
+
+The basic approach to implement the reduction will be the same as when reducing
+applications in $\lambda$-terms: when encountering a case expression, create a
+custom closure ``CaseCont'' containing the branches and push it into the
+stack. When a constructor is encountered, if there is a CaseCont closure in the
+stack, the machine will iterate over the branches in the closure until it finds
+one whose symbol matches with the constructor, and evaluate the body:
+
+\newpage
+\begin{code}
+ \begin{minted}[fontsize=\scriptsize]{ocaml}
+module KM = struct
+ (* ... *)
+ type st_elm = Clos of DBILambda.t * stack
+ (* Specific closure for case expressions *)
+ | CaseCont of (symbol DBILambda.constr * DBILambda.t) list * stack
+
+ let reduce m =
+ let rec reduce (st : state) : DBILambda.t =
+ match st with
+ (* ... *)
+ (* Case expressions (+ constructors) *)
+ | (Case (m, bs), s, e) -> reduce (m, CaseCont (bs, e) :: s, e)
+ | (Constr (x, ms), CaseCont (((x', args), m) :: bs, e') :: s, e)
+ when x == x' && List.length ms == List.length args ->
+ reduce (List.fold_left (fun m x -> Abs (x, m)) m args,
+ map_rev (fun m -> Clos (m, e)) ms @ s, e')
+ | (Constr (x, ms), CaseCont (_ :: bs, e') :: s, e) ->
+ reduce (Constr (x, ms), CaseCont (bs, e') :: s, e)
+ | (Constr (x, ms), s, e) ->
+ Constr (x, List.map (fun m -> reduce (m, s, e)) ms)
+ reduce (m, [], [])
+end
+ \end{minted}
+\end{code}
+
+Note that te last rule reduces the terms inside a constructor even when it is
+not being pattern matched. It can be deleted to more closely resemble a
+weak-head normal form behavior.
+
+With this new functionality we can encode Peano arithmetic by using the
+constructors <<z>> and <<s>>, and try some examples of pattern matching:
+
+\begin{code}
+ \begin{minted}[fontsize=\scriptsize]{ocaml}
+(* Peano arithmetic helpers *)
+let z = symbol "z"
+let s = symbol "s"
+
+let rec peano_add n x =
+ if n == 0 then x else peano_add (n-1) (Constr (s, [x]))
+
+let peano_of_int ?(base=Constr (z, [])) n = peano_add n base
+ \end{minted}
+\end{code}
+
+\newpage
+\begin{code}
+\begin{verbatim}
+# Lambda term:
+ ((λc.(case c of (Some(x) → x)
+ (None() → c)))
+ Some(s(z())))
+# Reduced term:
+ s(z())
+----------------------------------------------------------
+
+# Lambda term:
+ ((λc.(case c of (triple(x,y,z) → y)))
+ triple(s(z()), s(s(z())), s(s(s(z())))))
+# Reduced term:
+ s(s(z()))
+----------------------------------------------------------
+\end{verbatim}
+\end{code}
+
+
+\subsection{Fixpoints}
+
+The second extension we are going to implement in our Krivine Machine is the
+ability to support fixpoints, that is, limited recursion. To prevent infinite
+reductions, every fixpoint must be defined over a constructor, in a way that
+TODO.
+
+Again, we extend the data structures of the Lambda Calculus. As the extension to
+DBILambda follows trivially, we will omit it here:
+
+\begin{code}
+ \begin{minted}[fontsize=\scriptsize]{ocaml}
+module Lambda = struct
+ type t = Var of symbol | App of t * t | Abs of symbol * t
+ | Constr of t constr
+ | Case of t * (symbol constr * t) list
+ (* fixpoints *)
+ | Fix of symbol * t
+ (* ... *)
+end
+ \end{minted}
+\end{code}
+
+The ``Fix'' constructor is parameterized by a symbol and another lambda
+term. The symbol is the name of the fixpoint, and will expand to itself when
+referenced inside the term. Let's see an example:
+
+\newpage
+\begin{code}
+\begin{verbatim}
+fix(λf.λc. case c of (s(x) → s(s(f x)))
+ (z → z))
+ s(s(s(z)))
+\end{verbatim}
+\end{code}
+
+Here, the name of the fixpoint is <<f>>, and is an implicitly-passed argument
+that references to the term itself (<<fix(...)>>). So, in this case, <<f>> will be
+bound to <<fix($\lambda$f. $\lambda$c. case c of ... )>> and <<c>> will be bound
+to <<s(s(s(z)))>> initially.
+
+The reduction itself is simpler than the previous case:
+
+\begin{code}
+ \begin{minted}[fontsize=\scriptsize]{ocaml}
+module KM = struct
+ (* ... *)
+ type st_elm = Clos of DBILambda.t * stack
+ | CaseCont of (symbol DBILambda.constr * DBILambda.t) list * stack
+ (* Specific closure for fixpoints *)
+ | FixClos of symbol * DBILambda.t * stack
+
+ let reduce m =
+ let rec reduce (st : state) : DBILambda.t =
+ match st with
+ (* ... *)
+ (* Fixpoints *)
+ | (Var (0, _), s, FixClos (f, m, e') :: e) ->
+ reduce (m, s, FixClos (f, m, e') :: e')
+ | (Fix (x, m), s, e) -> reduce (m, s, FixClos (x, m, e) :: e)
+ reduce (m, [], [])
+end
+ \end{minted}
+\end{code}
+
+Again, here we create a new closure when reducing the Fix constructor. Then,
+whenever some variable refers to a fixpoint closure, the result is its body,
+with the side effect of keeping the closure it in the environment (so that it is
+automatically bound to the first argument <<f>> of itself).
+
+\newpage
+The fixpoint tests:
+
+\begin{code}
+\begin{verbatim}
+# Lambda term:
+ (fix(λf.(λc.(case c of (s(x) → s(s((f x))))
+ (z() → z()))))
+ s(s(s(z()))))
+# Reduced term:
+ s(s(s(s(s(s(z()))))))
+----------------------------------------------------------
+
+# Lambda term:
+ ((fix(λf.(λg.(λc.(case c of (s(x) → (g ((f g) x)))
+ (z() → z())))))
+ (λy.s(s(s(y)))))
+ s(s(s(z()))))
+# Reduced term:
+ s(s(s(s(s(s(s(s(s(z())))))))))
+----------------------------------------------------------
+\end{verbatim}
+\end{code}
+
+The first test is the example we have previously seen; the fixpoint receives a
+number in Peano notation and multiplies it by two by iterating over its
+structure. The second one generalizes it by accepting a function <<g>> that is
+applied once for each iteration over the number: in this case, <<g>> adds 3 to
+its argument, so the whole term is a ``by 3'' multiplier.
\chapter{ZAM} %%%%%%%%%%
@@ -930,10 +1305,11 @@ $(0, S, (E_1,(M,E_2)))$ & \rightsquigarrow & $(M, S, E_2)$
\section{Krivine Machine source code} %%
+\label{sec:kriv-mach-source}
\inputminted[linenos=true,fontsize=\scriptsize]{ocaml}{code/krivine.ml}
-\newpage
+%\newpage
\section{ZAM source code} %%
%\inputminted[linenos=true,fontsize=\scriptsize]{ocaml}{code/zam.ml}