diff options
author | Guillermo Ramos | 2015-07-19 22:44:35 +0200 |
---|---|---|
committer | Guillermo Ramos | 2015-07-19 22:45:57 +0200 |
commit | 740b7982238ffe8c2eaae79fd153117fd258ff04 (patch) | |
tree | 59deae95e4af2d91d8bef02ba07b63afc7e459bc /tfm.tex | |
parent | 5e4f88057da5b416624b7d0cb82a42ac1ea8981a (diff) | |
download | tfm-740b7982238ffe8c2eaae79fd153117fd258ff04.tar.gz |
19/06/15, sent for revision
Diffstat (limited to 'tfm.tex')
-rw-r--r-- | tfm.tex | 427 |
1 files changed, 387 insertions, 40 deletions
@@ -38,6 +38,14 @@ \pagenumbering{roman} +\tikzstyle{block} = [rectangle, draw, fill=blue!20, +text width=5em, text centered, rounded corners, minimum height=4em] +\tikzstyle{invisbl} = [text width=5em, text centered, minimum height=4em] +\tikzstyle{line} = [draw, -latex'] +\tikzstyle{cloud} = [draw, ellipse,fill=blue!10, node distance=3cm, +minimum height=2em] + + \emptypage \chapter*{Resumen} @@ -51,8 +59,7 @@ recientemente en el Instituto IMDEA Software en respuesta a la creciente necesidad de disponer de herramientas fiables de verificación formal de criptografía. -(TODO: En este documento se explicará cripto y reescritura de términos para bla -bla) +(TODO: crypto, reescritura de términos, máquinas abstractas, mejoras a EasyCrypt) \chapter*{Abstract} @@ -65,8 +72,7 @@ these tools is EasyCrypt, developed recently at IMDEA Software Institute in response to the increasing need of reliable formal verification tools for cryptography. -(TODO: In this document we will see crypto and term rewriting theory in order to -understand EasyCrypt and implement bla bla bla) +(TODO: crypto, term rewriting, abstract machines, EasyCrypt impovements) \emptypage \tableofcontents @@ -100,49 +106,123 @@ storage. Building strong security systems is not an easy task, because there are lots of parts that must be studied in order to assure the system as a whole behaves as intended. + + + +\section{Cryptography} + One of the most fundamental tools used to build security computer systems is -\textbf{cryptography}. Due to its heavy mathematical roots, cryptography today -is a mature science that, when correctly implemented, can provide strong -security guarantees to the systems using it. In fact, it is usually the case -that cryptography is one of the \textit{most secure} parts of the system, and a -lot of other concerns should be taken care of, such as underlying operating -systems, security policies, or the human factor. +\textbf{cryptography}. As a relatively low-level layer in the security stack, it +is often the cornerstone over which all the system relies in order to keep being +safe. Due to its heavy mathematical roots, cryptography today is a mature +science that, when correctly implemented, can provide strong security guarantees +to the systems using it. At this point, one could be tempted of just ``using strong, NIST-approved -cryptography'' and focusing on the security of other parts of the system. The -reality is that correctly implementing cryptography is a pretty difficult task +cryptography'' and focusing on the security of other parts of the system. The +problem is that correctly implementing cryptography is a pretty difficult task on its own, mainly because there is not a one-size-fits-all construction that covers all security requirements. Every cryptographic primitive has its own security assumptions and guarantees, so one must be exceptionally cautious when -combining them in order to build larger systems. For example, a given -cryptographic construction could be well suited for a concrete scenario and -totally useless in some others. In turn, this situation can produce a false -sense of security, potentially worse that not having any security at all. +combining them in order to build larger systems. A given cryptographic +construction is usually well suited for some kind of scenarios, and offers +little to no security otherwise. In turn, this can produce a false sense of +security, potentially worse that not having any security at all. + + + + +\section{Formal Methods} %% In order to have the best guarantee that some cryptographic construction meets its security requirements, we can use use formal methods to prove that the -requirements follow from the assumptions (scenario). +requirements follow from the assumptions (scenario). (TODO: some more explaining.) -(TODO: maybe a sub-section on sequences of games, Hoare logic, etc.) +While mathematical proofs greatly enhance the confidence we have in that a given +cryptosystem behaves as expected, with the recent increase in complexity it has +become more and more difficult to write and verify the proofs by hand, to the +point of being practically non-viable. In the recent years there has been an +increasing effort in having computers help us write and verify this proofs. + +There are various methods and tools for doing this, but one of the most +versatile and powerful are the \textbf{proof assistants}, which are tools +designed to help users develop formal proofs interactively. A proof assistant +usually follows the rules of one or more \textbf{logics} to derive theorems from +previous facts, and the user helps it by giving ``hints'' on how to +proceed. This is in contrast to some other theorem provers that use little or no +help from the user, making them easier to use but fundamentally more limited. +Coq\footnote{\url{http://coq.inria.fr/}} and +Isabelle\footnote{\url{http://isabelle.in.tum.de/}} are examples of widely used +proof assistants. + +(TODO: more explaining, screenshots?) + +One downside of proof assistants is that they require a considerable amount of +knowledge from the user, making them difficult to use for people that is not +somewhat fluent with theoretical computer science and logic. This is a +significant obstacle to the application of this technologies to other scientific +fields that could benefit from adopting the formal methods approach to +verification. + + +\section{EasyCrypt} +\label{sec:easycrypt} +EasyCrypt \cite{BartheGHB11} is a toolset conceived to help cryptographers +construct and verify cryptographic proofs. It is an open source +project\footnote{\url{https://www.easycrypt.info}} being developed +currently at IMDEA Software Institute and Inria. It is the evolution of the +previous CertiCrypt system \cite{BartheGB09}. +EasyCrypt's works as an interpreter of its own \textbf{programming language}, in +which the programmer can express all that's needed in order to develop the +proofs. At every step of the evaluation, EasyCrypt can output some information +regarding the state of the system so that external tools can parse and show it +to the user. Together with the fact that the evaluation steps can be reversed, +this forms the basis of the interactivity of the EasyCrypt system: the user can +evaluate the program step by step, and if needed, undo it and re-evaluate in the +fly. -\section{Problem} %% +\begin{figure}[H] + \centering + \includegraphics[width=1\textwidth]{img/easycrypt.png} + \caption{EasyCrypt} + \label{fig:easycrypt} +\end{figure} -(TODO: Need of Computer-assisted proofs) +The preferred way of working with EasyCrypt is using the +\textbf{Emacs}\footnote{\url{http://www.gnu.org/software/emacs/}} text editor, +with the \textbf{Proof + General}\footnote{\url{http://proofgeneral.inf.ed.ac.uk/}} interface to proof +assistants (figure \ref{fig:easycrypt}). This interface shows both the source +code and the EasyCrypt output at the point of evaluation (the already evaluated +code is displayed in a different color), and offers standard key combinations +for forward/backward code stepping. +As we'll see later (section \ref{sec:verif-easycrypt}), EasyCrypt has different +sub-languages for working with different things, e.g., representing games, +developing the proofs, etc. One of them is specially relevant in this thesis: +the \textbf{expression language}. It is the language EasyCrypt uses to define +typed values, like quantified formulas, arithmetic expressions, functions, +function application and such. (TODO: briefly explain what it is used for and +what can be improved) -\subsection{EasyCrypt} -(TODO: What is EasyCrypt (Coq-like, screens, etc)) + + +\section{Term Rewriting} + +(TODO: explain the need to improve the EasyCrypt's one) \section{Contributions} %% +(TODO:) + \begin{itemize} -\item Reference implementations of various rewriting engines -\item Improvement of EasyCrypt's one +\item Study and implement some rewriting engines +\item Improve the EasyCrypt's one \end{itemize} @@ -156,7 +236,7 @@ requirements follow from the assumptions (scenario). \chapter{CRYPTOGRAPHY} %%%%%%%%%% -(TODO: Intro to crypto) +(TODO: brief introduction to crypto: encryption (signatures?) etc.) @@ -170,7 +250,8 @@ requirements follow from the assumptions (scenario). Here we will introduce some of the most fundamental concepts in asymmetric cryptography, as they will be useful to understand the next sections on -sequences of games (TODO: ref). +EasyCrypt's proof system and sequences of games (section +\ref{sec:sequences-games}). \textbf{Asymmetric cryptography} (also called \textbf{Public Key cryptography}), refers to cryptographic algorithms that make use of two different keys, $pk$ @@ -189,37 +270,257 @@ $$\Dec(sk,C) = M$$ That is, a message ($M$) can be encrypted using a public key to obtain a ciphertext ($C$). + + +\section{Proofs by reduction} %% +\label{sec:proofs-reduction} + +(TODO: explain: Shannon + perfect security size problems, need to rely on strong +primitives -RSA, DDH...- and use them in reduction proofs) + +\begin{figure}[ht] + \centering + \begin{tikzpicture}[node distance = 6cm, auto] + % Nodes + \node [block, fill=blue!40] (primitive) {Primitive}; + \node [block, below of=primitive, fill=blue!40] (scheme) {Scheme}; + \node [block, right of=primitive, fill=red!60] (attack1) {Attack}; + \node [block, right of=scheme, fill=red!60] (attack2) {Attack}; + % Edges + \path [line,dashed] (attack1) -- (primitive); + \path [line,dashed] (attack2) -- (scheme); + \path [line] + (primitive) edge node[left] {\textbf{Construction}} (scheme) + (attack2) edge node[right] {\textbf{Reduction}} (attack1); + \end{tikzpicture} + \caption{Proofs by reduction} + \label{fig:proofs} +\end{figure} + + + + \section{Sequences of games} %% +\label{sec:sequences-games} -While mathematical proofs greatly enhance the confidence we have in that a given -cryptosystem behaves as expected, with the recent increase in complexity it has -become more and more difficult to write and verify the proofs by hand, to the -point of being practically non-viable. In 2004 \cite{Shoup04}, Victor Shoup -introduced the concept of \textbf{sequences of games} as a method of taming the -complexity of cryptography related proofs. A game is like a program written in a -well-defined, probabilistic programming language, and a sequence of games is the -result of applying transformations over the initial one. +In 2004 \cite{Shoup04}, Victor Shoup introduced the concept of \textbf{sequences + of games} as a method of taming the complexity of cryptography related +proofs. A \textbf{game} is like a program written in a well-defined, +probabilistic programming language, and a sequence of games is the result of +applying transformations over the initial one. Every game represents the +interaction between a \textbf{challenger} and an \textbf{adversary}, with the +last one being usually encoded as a function (probabilistic program). In the +end, we will want the sequence of games to form a proof by reduction (see +section \ref{sec:proofs-reduction}), where the transition of games proves that +our system can be reduced, under certain conditions, to some well-known +cryptographic primitive (TODO: is this correct?). + +(TODO: explain transitions and events (adversary winning)) + +In order to see a practical example of how sequences of games work, let us +define the following game: + +\newpage +\begin{game}[caption=IND-CPA game (from \cite{BartheGB09}), + label={lst:indcpa}]{} + $(pk,sk) \leftarrow \KG();$ + $(m_0,m_1) \leftarrow A_1(pk);$ + $b \overset{\$}{\leftarrow} \{0,1\};$ + $c \leftarrow \Enc(pk,m_b);\\$ + $\tilde{b} \leftarrow A_2(c)$ +\end{game} + +The Game \ref{lst:indcpa} can be used to define the IND-CPA property of public +key encryption schemes. (TODO: explain IND-CPA) + +In the game, $\KG$ and $\Enc$ are the key generation and encryption functions +provided by the encryption algorithm, respectively, and $A_1$ is the encoding of +the adversary. + +(TODO) \section{Verification: EasyCrypt} %% +\label{sec:verif-easycrypt} + +EasyCrypt has different languages to perform different tasks: \subsection{Specification languages} +This are the languages EasyCrypt uses to declare and define types, functions, +axioms, games, oracles, adversaries and other entities involved in the proofs. + \subsubsection{Expressions} -\subsubsection{Probabilistic expressions} +The main specification language of EasyCrypt is the expression language, in +which \textbf{types} are defined together with \textbf{operators} that can be +applied to them (or be constant). EasyCrypt follows the usual semicolon +notation\cite{Church40} to denote the typing relationship: <<$a : T$>> means +``$a$ has type $T$''. EasyCrypt has a type system supporting parametric +polymorphism: \rawec{int list} represents a list of integers. + +The operators are functions over types, defined with the keyword \rawec{op} +(e.g., \\ \rawec{op even : nat -> bool}). An operator can be applied to some +argument by putting them separated by a space: \rawec{even 4}. Operators can be +abstract, i.e., defined without any actual implementation; with semantics given +by the definition of axioms and lemmas that describe its observational +behavior. Operators are also \textit{curried}, so they support multiple +arguments by returning new functions that consume the next one. For example, +$f : (A \times B\times C) \rightarrow D$ would be encoded as +$f : A \rightarrow (B \rightarrow (C \rightarrow D))$, or, by associativity, +$f : A \rightarrow B \rightarrow C \rightarrow D$. + +In this example (from the current EasyCrypt library) we can see the how actual +types and operators are defined in the EasyCrypt's expression language: + +\begin{easycrypt}[caption=Lists (expression language), label={lst:exprlang}]{} +type 'a list = [ + | "[]" + | (::) of 'a & 'a list ]. + +op hd: 'a list -> 'a. +axiom hd_cons (x:'a) xs: hd (x::xs) = x. + +op map (f:'a -> 'b) (xs:'a list) = + with xs = "[]" => [] + with xs = (::) x xs => (f x)::(map f xs). +\end{easycrypt} + +The first line defines the \rawec{list} type as a sum type with two constructors +(cases): the \textit{empty list} and the \textit{construction} of a new list +from an existing one and an element that will appear at its head position. The +rest of the code defines operators working with lists. + +The next line abstractly defines the operator \rawec{hd}, together with its +type. The axiom following it partially specifies the behavior of the \rawec{hd} when +applied to some list: if the list has the form \rawec{x::xs} (element \rawec{x} +followed by \rawec{xs}), the return value is \rawec{x}. The other case (empty +list) is left unspecified. + +The last line defines the \rawec{map} operator directly, using pattern +matching. This operator receives a function and a list, and returns the list +consisting of the results of evaluating the function over each element of the +list, preserving its order. + +\paragraph{Probabilistic expressions} + +Additionally, EasyCrypt defines some standard types and operators to work with +probabilistic expressions. The type \rawec{'a distr} represents discrete +sub-distributions over types. The operator \rawec{mu} represents the probability +of some event in a sub-distribution: + +\begin{easycrypt}[]{} +op mu : 'a distr -> ('a -> bool) -> real +\end{easycrypt} + +For example, the uniform distribution over booleans is defined in the +EasyCrypt's standard library as follows: + +\begin{easycrypt}[caption=Uniform distribution over bool, + label={lst:booldistr}]{} +op dbool : bool distr. +axiom mu_def : forall (p : bool -> bool), + mu dbool p = + (1/2) * charfun p true + + (1/2) * charfun p false. +\end{easycrypt} \subsubsection{pWhile language} +Expression languages are usually not adequate to define games and other data +structures as cryptographic schemes and oracles, due to the stateful nature of +sequential algorithms. That's why EasyCrypt uses a different language +called \textbf{pWhile} \cite{BartheGB12} (probabilistic while) to define them: + +\begin{bnf}[caption=pWhile language, label={lst:pwhile}]{} + $\mathcal{C} ::=$ skip + | $\mathcal{V \leftarrow E}$ + | $\mathcal{V \overset{\$}{\leftarrow} DE}$ + | if $\mathcal{E}$ then $\mathcal{C}$ else $\mathcal{C}$ + | while $\mathcal{E}$ do $\mathcal{C}$ + | $\mathcal{V} \leftarrow \mathcal{P}(\mathcal{E}, ..., \mathcal{E})$ + | $\mathcal{C}$; $\mathcal{C}$ +\end{bnf} + \subsection{Proof languages} +These are the languages used to write and prove theorems: + \subsubsection{Judgments} +Whenever there is some statement that we want to prove, it must be written as a +judgment in some \textbf{logic}. Apart from the first order logic expressions, +EasyCrypt supports judgments in some logics derived from Hoare logic: + +\begin{itemize} +\item Hoare Logic (\textit{HL}). These judgments have the following shape: + + $$c : P \Longrightarrow Q$$ + + where $P$ and $Q$ are assertions (predicates) and $c$ is a statement or + program. $P$ is the \textbf{precondition} and $Q$ is the + \textbf{postcondition}. The validity of this kind of Hoare judgment implies + that if $P$ holds before the execution of $c$ and it terminates, then $Q$ must + also hold. + +\item Probabilistic Hoare Logic (\textit{pHL}). This is the logic resulting from + assigning some probability to the validity of the previously seen Hoare + judgments. The probability can be a number or an upper/lower bound: + + $$[c : P \Longrightarrow Q] \leq \delta$$ + $$[c : P \Longrightarrow Q] = \delta$$ + $$[c : P \Longrightarrow Q] \geq \delta$$ + +\item Probabilistic Relational Hoare Logic (\textit{pRHL}). These have the + following shape: + + $$c_1 \sim c_2 : \Psi \Longrightarrow \Phi$$ + + In this case, the pre and postconditions are not standalone predicates, but + \textbf{relationships} between the memories of the two programs $c_1$ and + $c_2$. This judgment means that if the precondition $\Psi$ holds before the + execution of $c_1$ and $c_2$, the postcondition $\Phi$ will also hold after + finishing its execution. + + This logic is the most complete and useful when developing game-based + reduction proofs, because it allows to encode each game transition as a + judgment. Twe two games are $c_1$ and $c_2$ respectively and the + pre/postconditions refer to the probability of the adversary winning the + games. + +\end{itemize} + \subsubsection{Tactics} +If the judgment is declared as an axiom, it is taken as a fact and does not need +to be proven. Lemmas, however, will make EasyCrypt enter in ``proof mode'', +where it stops reading declarations, takes the current judgment as a goal and +and starts accepting \textbf{tactics} until the current goal is trivially +true. Tactics are indications on what rules EasyCrypt must apply to transform +the current goal. + +This is a simplified example of proof from the EasyCrypt's library: + +\begin{easycrypt}[caption=Tactics usage, label={lst:tactics}]{} +lemma cons_hd_tl : + forall (xs:'a list), + xs <> [] => (hd xs)::(tl xs) = xs. +proof. + intros xs. + elim / list_ind xs. + simplify. + intros x xs' IH h {h}. + rewrite hd_cons. + rewrite tl_cons. + reflexivity. +qed. +\end{easycrypt} + +(TODO: briefly explain proof) + @@ -233,6 +534,8 @@ result of applying transformations over the initial one. \section{Lambda Calculus} %% +\cite{Church36} + \subsection{Extensions} @@ -255,19 +558,32 @@ http://adam.chlipala.net/cpdt/html/Equality.html \section{Evaluation Strategies} %% +\cite{Sestoft02} + \section{Abstract Machines} %% +(TODO: basic intro, motivation, function application implementations +(push-enter, eval-apply)) + -\subsection{Krivine Machine} +\subsection{Krivine Machine} %%%%%%%%%% \cite{Krivine07} +\cite{Douence07} + + +\subsection{ZAM: Standard version} + +\cite{Leroy90} -\subsection{ZAM} +\subsection{ZAM: Alternative version} + +\cite{LeroyG02} + -\cite{GregoireLeroy02} @@ -275,14 +591,21 @@ http://adam.chlipala.net/cpdt/html/Equality.html + \chapter{KRIVINE MACHINE} %%%%%%%%%% -- Outside EasyCrypt: weak symbolic with fixpts and cases -- Inside EasyCrypt: bla bla +\cite{Krivine07} +\cite{Douence07} + + +\section{Not compiled} %% +\section{Compiled} %% + + \chapter{ZAM} %%%%%%%%%% @@ -290,6 +613,8 @@ http://adam.chlipala.net/cpdt/html/Equality.html + + \chapter{REDUCTION IN EASYCRYPT} %%%%%%%%%% @@ -298,6 +623,14 @@ http://adam.chlipala.net/cpdt/html/Equality.html +\section{Data types} + + + +\section{Implementation} + + + \part{EPILOGUE} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -307,11 +640,25 @@ http://adam.chlipala.net/cpdt/html/Equality.html \chapter{CONCLUSIONS} %%%%%%%%%% +\begin{itemize} +\item Brief wrapping of importance of verified crypto today +\item Survey of available reduction machineries, differences between them +\item Improvements to EasyCrypt: modularity, extensibility, efficiency (?) +\item Problems encountered during the implementation (debugging!!! specially the + compiled one, progressive replacing of the EasyCrypt's one due to state) +\end{itemize} + \chapter{FUTURE WORK} %%%%%%%%%% +\begin{itemize} +\item Compile code to abstract machine opcodes to improve efficiency +\item Tests +\item Make users able to define their own rules (expand the engine in ``userland'') +\end{itemize} + |