diff options
Diffstat (limited to 'tfm.tex')
-rw-r--r-- | tfm.tex | 416 |
1 files changed, 396 insertions, 20 deletions
@@ -646,6 +646,7 @@ context) \subsection{Extensions} +\label{sec:extensions} Until there we have seen the ``basic'', or ``pure'' Lambda Calculus, where the only data primitive is the function. Being turing complete, it is all that is @@ -767,7 +768,9 @@ To begin our study of the implementation of abstract machines, we will start with the Krivine Machine. It is a relatively simple and well-known model that will help us see the steps that we need to take in order to implement a real abstract machine. We will be using the OCaml language from now on (not only in -this section but also in the next ones). +this section but also in the next ones), and while snippets of code will be +presented to illustrate the concepts, the full code is available in the +annex \ref{sec:kriv-mach-source} for reference. The Krivine Machine \cite{Krivine07} is an implementation of the weak-head call-by-name reduction strategy for pure lambda terms. What that means is that: @@ -800,21 +803,25 @@ module Lambda = struct | App (m1, m2) -> "(" ^ show m1 ^ " " ^ show m2 ^ ")" | Abs (x, m) -> "(λ" ^ show_symbol x ^ "." ^ show m ^ ")" end - \end{minted} + \end{minted} \end{code} (From now on, the auxiliar (e.g., pretty-printing) functions will be ommited for brevity.) -Now we have a representation of the pure Lambda Calculus. As the Krivine Machine -usually works with expressions in \textbf{de Bruijn} notation\footnote{ The de +The module Lambda encodes the pure Lambda Calculus with its main type <<t>> +\footnote{Naming the main type of a module <<t>> is an idiom when using modules + in OCaml}. Variables are just symbols (strings tagged with an integer so that +they're unique), Applications are pairs of terms and Abstractions are pairs of +symbols (the binding variable) and terms (the body). As the Krivine Machine +usually works with expressions in \textbf{de Bruijn} notation\footnote{The de Bruijn's notation gets rid of variable names by replacing them by the number of ``lambdas'' between it and the lambda that is binding the variable. For example, $(\lambda x . (\lambda y . (x \: y)))$ will be written in de Bruijn notation as $(\lambda . (\lambda . (1 \: 0)))$}, we'll need to write an -algorithm to do the conversion. To be sure we do not accidentally build terms -mixing the two notatios, we'll create another module, \textbf{DBILambda}, with a -different data type to represent them: +algorithm to do the conversion of variables. To be sure we do not accidentally +build terms mixing the two notatios, we'll create another module, +\textbf{DBILambda}, with a different data type to represent them: \begin{code} \begin{minted}[fontsize=\scriptsize]{ocaml} @@ -835,40 +842,408 @@ module DBILambda = struct | Lambda.Abs (x, m) -> Abs (x, of_lambda (x :: dbis) m) in of_lambda [] end - \end{minted} + \end{minted} \end{code} -The new variables areThe function <<of_lambda>> accepts traditional lambda terms (Lambda.t) and -returns its representation as a term using de Bruijn notation (DBILambda.t). Now -we are ready to implement the actual reduction. +The new variables, of type <<dbi_symbol>>, store the de Bruijn number together +with the previous value of the symbol, to help debugging and +pretty-printing. The function <<of_lambda>> accepts traditional lambda terms +(Lambda.t) and returns its representation as a term using de Bruijn notation +(DBILambda.t). Now we are ready to implement the actual reduction. -The Krivine Machine has a state consisting on the code it is evaluating -(\textbf{C}), an environmentThe typical description of the Krivine Machine \cite{Douence07} is given by the +The Krivine Machine has a state $(C, S, E)$ consisting on the \textbf{code} it +is evaluating ($C$), an auxiliar \textbf{stack} ($S$) and the current +\textbf{environment} ($E$). The code is just a Lambda expression, the stack is +where the machine will store \textbf{closures} consisting of non-evaluated code +together with its environment at the time the closure was created, and an +environment associates variables (de Bruijn indices) to closures. + +The typical description of the Krivine Machine \cite{Douence07} is given by the following set of rules: +\newpage \begin{center} \line(1,0){250} \end{center} \begin{table}[H] \centering \begin{tabular}{lllll} -$(M N, S, E)$ & \rightsquigarrow & $(M, (S,(N,E)), E)$ \\ \\ -$(\lambda M, (S,N), E)$ & \rightsquigarrow & $(M, S, (E,N))$ \\ \\ -$(i+1, S, (E,N))$ & \rightsquigarrow & $(i, S, E)$ \\ \\ -$(0, S, (E_1,(M,E_2)))$ & \rightsquigarrow & $(M, S, E_2)$ +$(M N, S, E)$ & \rightsquigarrow & $(M, (N,E) :: S, E)$ \\ \\ +$(\lambda M, N :: S, E)$ & \rightsquigarrow & $(M, S, N :: E)$ \\ \\ +$(i+1, S, N :: E)$ & \rightsquigarrow & $(i, S, E)$ \\ \\ +$(0, S, (M,E_1) :: E_2)$ & \rightsquigarrow & $(M, S, E_1)$ \end{tabular} \end{table} \begin{center} \line(1,0){250} \end{center} -\section{With power} %% +In the previous diagram, $S$ and $E$ are both described as \textbf{lists}, with +the syntax (H :: T) meaning ``list whose head is H and tail is T''. The stack +$S$ is a list of closures that implements the push/pop operations over its +head. The environment is a list whose $i$-th position stores the variable with +de Bruijn index $i$. + +\begin{itemize} +\item In the first rule, to evaluate an application $M N$, the machine builds a + closure from the argument $N$ and the current environment $E$, pushes it into + the stack, and keeps on reducing the function $M$. +\item To reduce an abstraction $\lambda M$, the top of the stack is moved to the + environment and proceeds with the reduction of the body $M$. What this means + is that the last closure in the stack (the function argument) is now going to + be the variable in position (de Bruijn index) 0. +\item The last two rules rearch through the environment to find the closure + corresponding to the current variable. Once it is found, the closure's code + $M$ and environment $E_1$ replace the current ones. +\end{itemize} + +From this specification we can write a third module, \textbf{KM}, to define the +data structures (state, closure, stack) and implement the reduction rules of the +Krivine Machine: + +\newpage +\begin{code} + \begin{minted}[fontsize=\scriptsize]{ocaml} +module KM = struct + open DBILambda + + type st_elm = Clos of DBILambda.t * stack + and stack = st_elm list + + type state = DBILambda.t * stack * stack + + let reduce m = + let rec reduce (st : state) : DBILambda.t = + match st with + (* Pure lambda calculus *) + | (Var (0, _), s, Clos (m, e') :: e) -> reduce (m, s, e') + | (Var (n, x), s, _ :: e) -> reduce (Var (n-1, x), s, e) + | (App (m1, m2), s, e) -> reduce (m1, Clos (m2, e) :: s, e) + | (Abs (_, m), c :: s, e) -> reduce (m, s, c :: e) + (* Termination checks *) + | (m, [], []) -> m + | (_, _, _) -> m in + reduce (m, [], []) +end + \end{minted} +\end{code} + +At this point, we have a working implementation of the Krivine Machine and can +execute some tests to see that everything works as expected. We'll write some +example $\lambda$-terms: +\begin{code} + \begin{minted}[fontsize=\scriptsize]{ocaml} +let m1 = (* (λx.((λy.y) x)) *) + let x = symbol "x" in + let y = symbol "y" in + Abs (x, App (Abs (y, Var y), Var x)) + +let m2 = (* (((λx.(λy.(y x))) (λz.z)) (λy.y)) *) + let x = symbol "x" in + let y = symbol "y" in + let z = symbol "z" in + App (App (Abs (x, Abs (y, App (Var y, Var x))), Abs (z, Var z)), Abs (y, Var y)) + \end{minted} +\end{code} +And lastly, a helper function that accepts a $\lambda$-term, translates it to de +Bruijn notation and reduces it, outputting the results: -%\section{Compiled} %% +\begin{code} + \begin{minted}[fontsize=\scriptsize]{ocaml} +let dbi_and_red m = + let dbi_m = DBILambda.of_lambda m in + print_endline ("# Lambda term:\n " ^ DBILambda.show dbi_m); + let red_m = KM.reduce dbi_m in + print_endline ("# Reduced term:\n " ^ DBILambda.show red_m); + print_endline "-----------------------------------------------\n" + \end{minted} +\end{code} +\newpage +The results: +\begin{code} +\begin{verbatim} +ocaml> List.iter dbi_and_red [m1; m2];; + +# Lambda term: + (λx.((λy.y) x)) +# Reduced term: + (λx.((λy.y) x)) +---------------------------------------------------------- + +# Lambda term: + (((λx.(λy.(y x))) (λz.z)) (λy.y)) +# Reduced term: + (λz.z) +---------------------------------------------------------- +\end{verbatim} +\end{code} + +As we can see, the first term is not reduced because it is already in weak head +normal form (abstraction). The second term reduces as we would expect. As a +sidenote, we can easily tweak the DBILambda module to show the real de Bruijn +variables: + +\begin{code} +\begin{verbatim} +ocaml> List.iter dbi_and_red [m1; m2];; + +# Lambda term: + (λ.((λ.0) 0)) +# Reduced term: + (λ.((λ.0) 0)) +---------------------------------------------------------- + +# Lambda term: + (((λ.(λ.(0 1))) (λ.0)) (λ.0)) +# Reduced term: + (λ.0) +---------------------------------------------------------- +\end{verbatim} +\end{code} + + + +\section{Extending the machine} %% + +Now that we have a working Krivine Machine over basic Lambda Terms, we want to +extend it to work with the extensions we saw in section \ref{sec:extensions}: +\textbf{case expressions} and \textbf{fixpoints}. + + +\subsection{Case expressions} + +%% TODO: maybe later +% \begin{center} +% \line(1,0){250} +% \end{center} +% \begin{table}[H] +% \centering +% \begin{tabular}{lllll} +% $(Case(M, BS), S, E)$ & \rightsquigarrow & $(M, CaseCont(BS,E) :: S, E)$ \\ \\ +% $(Constr(X, MS), CaseCont(X, AS) :: S, E)$ & \rightsquigarrow & $(M, CaseCont(BS,E) +% :: S, E)$ \\ \\ +% $(\lambda M, N :: S, E)$ & \rightsquigarrow & $(M, S, N :: E)$ \\ \\ +% $(0, S, (M,E_1) :: E_2)$ & \rightsquigarrow & $(M, S, E_1)$ +% \end{tabular} +% \end{table} +% \begin{center} +% \line(1,0){250} +% \end{center} + +First, we will need to extend our definition of the Lambda Calculus to support +constructors and case expressions, both in Lambda and DBILambda: + +\begin{code} + \begin{minted}[fontsize=\scriptsize]{ocaml} +module Lambda = struct + type t = Var of symbol | App of t * t | Abs of symbol * t + (* constructors / case expressions *) + | Constr of t constr + | Case of t * (symbol constr * t) list + and 'a constr = symbol * 'a list + + (* ... *) +end + +module DBILambda = struct + type t = Var of dbi_symbol | App of t * t | Abs of symbol * t + (* constructors / case expressions *) + | Constr of t constr + | Case of t * (symbol constr * t) list + and 'a constr = symbol * 'a list + + let of_lambda = + let rec of_lambda dbis = function + (* ... *) + | Lambda.Constr (x, ms) -> Constr (x, List.map (of_lambda dbis) ms) + | Lambda.Case (m, bs) -> Case (of_lambda dbis m, + List.map (trans_br dbis) bs) + in of_lambda [] + + (* ... *) +end + \end{minted} +\end{code} + +The basic approach to implement the reduction will be the same as when reducing +applications in $\lambda$-terms: when encountering a case expression, create a +custom closure ``CaseCont'' containing the branches and push it into the +stack. When a constructor is encountered, if there is a CaseCont closure in the +stack, the machine will iterate over the branches in the closure until it finds +one whose symbol matches with the constructor, and evaluate the body: + +\newpage +\begin{code} + \begin{minted}[fontsize=\scriptsize]{ocaml} +module KM = struct + (* ... *) + type st_elm = Clos of DBILambda.t * stack + (* Specific closure for case expressions *) + | CaseCont of (symbol DBILambda.constr * DBILambda.t) list * stack + + let reduce m = + let rec reduce (st : state) : DBILambda.t = + match st with + (* ... *) + (* Case expressions (+ constructors) *) + | (Case (m, bs), s, e) -> reduce (m, CaseCont (bs, e) :: s, e) + | (Constr (x, ms), CaseCont (((x', args), m) :: bs, e') :: s, e) + when x == x' && List.length ms == List.length args -> + reduce (List.fold_left (fun m x -> Abs (x, m)) m args, + map_rev (fun m -> Clos (m, e)) ms @ s, e') + | (Constr (x, ms), CaseCont (_ :: bs, e') :: s, e) -> + reduce (Constr (x, ms), CaseCont (bs, e') :: s, e) + | (Constr (x, ms), s, e) -> + Constr (x, List.map (fun m -> reduce (m, s, e)) ms) + reduce (m, [], []) +end + \end{minted} +\end{code} + +Note that te last rule reduces the terms inside a constructor even when it is +not being pattern matched. It can be deleted to more closely resemble a +weak-head normal form behavior. + +With this new functionality we can encode Peano arithmetic by using the +constructors <<z>> and <<s>>, and try some examples of pattern matching: + +\begin{code} + \begin{minted}[fontsize=\scriptsize]{ocaml} +(* Peano arithmetic helpers *) +let z = symbol "z" +let s = symbol "s" + +let rec peano_add n x = + if n == 0 then x else peano_add (n-1) (Constr (s, [x])) + +let peano_of_int ?(base=Constr (z, [])) n = peano_add n base + \end{minted} +\end{code} + +\newpage +\begin{code} +\begin{verbatim} +# Lambda term: + ((λc.(case c of (Some(x) → x) + (None() → c))) + Some(s(z()))) +# Reduced term: + s(z()) +---------------------------------------------------------- + +# Lambda term: + ((λc.(case c of (triple(x,y,z) → y))) + triple(s(z()), s(s(z())), s(s(s(z()))))) +# Reduced term: + s(s(z())) +---------------------------------------------------------- +\end{verbatim} +\end{code} + + +\subsection{Fixpoints} + +The second extension we are going to implement in our Krivine Machine is the +ability to support fixpoints, that is, limited recursion. To prevent infinite +reductions, every fixpoint must be defined over a constructor, in a way that +TODO. + +Again, we extend the data structures of the Lambda Calculus. As the extension to +DBILambda follows trivially, we will omit it here: + +\begin{code} + \begin{minted}[fontsize=\scriptsize]{ocaml} +module Lambda = struct + type t = Var of symbol | App of t * t | Abs of symbol * t + | Constr of t constr + | Case of t * (symbol constr * t) list + (* fixpoints *) + | Fix of symbol * t + (* ... *) +end + \end{minted} +\end{code} + +The ``Fix'' constructor is parameterized by a symbol and another lambda +term. The symbol is the name of the fixpoint, and will expand to itself when +referenced inside the term. Let's see an example: + +\newpage +\begin{code} +\begin{verbatim} +fix(λf.λc. case c of (s(x) → s(s(f x))) + (z → z)) + s(s(s(z))) +\end{verbatim} +\end{code} + +Here, the name of the fixpoint is <<f>>, and is an implicitly-passed argument +that references to the term itself (<<fix(...)>>). So, in this case, <<f>> will be +bound to <<fix($\lambda$f. $\lambda$c. case c of ... )>> and <<c>> will be bound +to <<s(s(s(z)))>> initially. + +The reduction itself is simpler than the previous case: + +\begin{code} + \begin{minted}[fontsize=\scriptsize]{ocaml} +module KM = struct + (* ... *) + type st_elm = Clos of DBILambda.t * stack + | CaseCont of (symbol DBILambda.constr * DBILambda.t) list * stack + (* Specific closure for fixpoints *) + | FixClos of symbol * DBILambda.t * stack + + let reduce m = + let rec reduce (st : state) : DBILambda.t = + match st with + (* ... *) + (* Fixpoints *) + | (Var (0, _), s, FixClos (f, m, e') :: e) -> + reduce (m, s, FixClos (f, m, e') :: e') + | (Fix (x, m), s, e) -> reduce (m, s, FixClos (x, m, e) :: e) + reduce (m, [], []) +end + \end{minted} +\end{code} + +Again, here we create a new closure when reducing the Fix constructor. Then, +whenever some variable refers to a fixpoint closure, the result is its body, +with the side effect of keeping the closure it in the environment (so that it is +automatically bound to the first argument <<f>> of itself). + +\newpage +The fixpoint tests: + +\begin{code} +\begin{verbatim} +# Lambda term: + (fix(λf.(λc.(case c of (s(x) → s(s((f x)))) + (z() → z())))) + s(s(s(z())))) +# Reduced term: + s(s(s(s(s(s(z())))))) +---------------------------------------------------------- + +# Lambda term: + ((fix(λf.(λg.(λc.(case c of (s(x) → (g ((f g) x))) + (z() → z()))))) + (λy.s(s(s(y))))) + s(s(s(z())))) +# Reduced term: + s(s(s(s(s(s(s(s(s(z()))))))))) +---------------------------------------------------------- +\end{verbatim} +\end{code} + +The first test is the example we have previously seen; the fixpoint receives a +number in Peano notation and multiplies it by two by iterating over its +structure. The second one generalizes it by accepting a function <<g>> that is +applied once for each iteration over the number: in this case, <<g>> adds 3 to +its argument, so the whole term is a ``by 3'' multiplier. \chapter{ZAM} %%%%%%%%%% @@ -930,10 +1305,11 @@ $(0, S, (E_1,(M,E_2)))$ & \rightsquigarrow & $(M, S, E_2)$ \section{Krivine Machine source code} %% +\label{sec:kriv-mach-source} \inputminted[linenos=true,fontsize=\scriptsize]{ocaml}{code/krivine.ml} -\newpage +%\newpage \section{ZAM source code} %% %\inputminted[linenos=true,fontsize=\scriptsize]{ocaml}{code/zam.ml} |