summaryrefslogtreecommitdiff
path: root/tfm.tex
diff options
context:
space:
mode:
authorGuillermo Ramos2015-07-19 22:44:35 +0200
committerGuillermo Ramos2015-07-19 22:45:57 +0200
commit740b7982238ffe8c2eaae79fd153117fd258ff04 (patch)
tree59deae95e4af2d91d8bef02ba07b63afc7e459bc /tfm.tex
parent5e4f88057da5b416624b7d0cb82a42ac1ea8981a (diff)
downloadtfm-740b7982238ffe8c2eaae79fd153117fd258ff04.tar.gz
19/06/15, sent for revision
Diffstat (limited to 'tfm.tex')
-rw-r--r--tfm.tex427
1 files changed, 387 insertions, 40 deletions
diff --git a/tfm.tex b/tfm.tex
index 3dc078a..d363857 100644
--- a/tfm.tex
+++ b/tfm.tex
@@ -38,6 +38,14 @@
\pagenumbering{roman}
+\tikzstyle{block} = [rectangle, draw, fill=blue!20,
+text width=5em, text centered, rounded corners, minimum height=4em]
+\tikzstyle{invisbl} = [text width=5em, text centered, minimum height=4em]
+\tikzstyle{line} = [draw, -latex']
+\tikzstyle{cloud} = [draw, ellipse,fill=blue!10, node distance=3cm,
+minimum height=2em]
+
+
\emptypage
\chapter*{Resumen}
@@ -51,8 +59,7 @@ recientemente en el Instituto IMDEA Software en respuesta a la creciente
necesidad de disponer de herramientas fiables de verificación formal de
criptografía.
-(TODO: En este documento se explicará cripto y reescritura de términos para bla
-bla)
+(TODO: crypto, reescritura de términos, máquinas abstractas, mejoras a EasyCrypt)
\chapter*{Abstract}
@@ -65,8 +72,7 @@ these tools is EasyCrypt, developed recently at IMDEA Software Institute in
response to the increasing need of reliable formal verification tools for
cryptography.
-(TODO: In this document we will see crypto and term rewriting theory in order to
-understand EasyCrypt and implement bla bla bla)
+(TODO: crypto, term rewriting, abstract machines, EasyCrypt impovements)
\emptypage
\tableofcontents
@@ -100,49 +106,123 @@ storage. Building strong security systems is not an easy task, because there are
lots of parts that must be studied in order to assure the system as a whole
behaves as intended.
+
+
+
+\section{Cryptography}
+
One of the most fundamental tools used to build security computer systems is
-\textbf{cryptography}. Due to its heavy mathematical roots, cryptography today
-is a mature science that, when correctly implemented, can provide strong
-security guarantees to the systems using it. In fact, it is usually the case
-that cryptography is one of the \textit{most secure} parts of the system, and a
-lot of other concerns should be taken care of, such as underlying operating
-systems, security policies, or the human factor.
+\textbf{cryptography}. As a relatively low-level layer in the security stack, it
+is often the cornerstone over which all the system relies in order to keep being
+safe. Due to its heavy mathematical roots, cryptography today is a mature
+science that, when correctly implemented, can provide strong security guarantees
+to the systems using it.
At this point, one could be tempted of just ``using strong, NIST-approved
-cryptography'' and focusing on the security of other parts of the system. The
-reality is that correctly implementing cryptography is a pretty difficult task
+cryptography'' and focusing on the security of other parts of the system. The
+problem is that correctly implementing cryptography is a pretty difficult task
on its own, mainly because there is not a one-size-fits-all construction that
covers all security requirements. Every cryptographic primitive has its own
security assumptions and guarantees, so one must be exceptionally cautious when
-combining them in order to build larger systems. For example, a given
-cryptographic construction could be well suited for a concrete scenario and
-totally useless in some others. In turn, this situation can produce a false
-sense of security, potentially worse that not having any security at all.
+combining them in order to build larger systems. A given cryptographic
+construction is usually well suited for some kind of scenarios, and offers
+little to no security otherwise. In turn, this can produce a false sense of
+security, potentially worse that not having any security at all.
+
+
+
+
+\section{Formal Methods} %%
In order to have the best guarantee that some cryptographic construction meets
its security requirements, we can use use formal methods to prove that the
-requirements follow from the assumptions (scenario).
+requirements follow from the assumptions (scenario). (TODO: some more explaining.)
-(TODO: maybe a sub-section on sequences of games, Hoare logic, etc.)
+While mathematical proofs greatly enhance the confidence we have in that a given
+cryptosystem behaves as expected, with the recent increase in complexity it has
+become more and more difficult to write and verify the proofs by hand, to the
+point of being practically non-viable. In the recent years there has been an
+increasing effort in having computers help us write and verify this proofs.
+
+There are various methods and tools for doing this, but one of the most
+versatile and powerful are the \textbf{proof assistants}, which are tools
+designed to help users develop formal proofs interactively. A proof assistant
+usually follows the rules of one or more \textbf{logics} to derive theorems from
+previous facts, and the user helps it by giving ``hints'' on how to
+proceed. This is in contrast to some other theorem provers that use little or no
+help from the user, making them easier to use but fundamentally more limited.
+Coq\footnote{\url{http://coq.inria.fr/}} and
+Isabelle\footnote{\url{http://isabelle.in.tum.de/}} are examples of widely used
+proof assistants.
+
+(TODO: more explaining, screenshots?)
+
+One downside of proof assistants is that they require a considerable amount of
+knowledge from the user, making them difficult to use for people that is not
+somewhat fluent with theoretical computer science and logic. This is a
+significant obstacle to the application of this technologies to other scientific
+fields that could benefit from adopting the formal methods approach to
+verification.
+
+
+\section{EasyCrypt}
+\label{sec:easycrypt}
+EasyCrypt \cite{BartheGHB11} is a toolset conceived to help cryptographers
+construct and verify cryptographic proofs. It is an open source
+project\footnote{\url{https://www.easycrypt.info}} being developed
+currently at IMDEA Software Institute and Inria. It is the evolution of the
+previous CertiCrypt system \cite{BartheGB09}.
+EasyCrypt's works as an interpreter of its own \textbf{programming language}, in
+which the programmer can express all that's needed in order to develop the
+proofs. At every step of the evaluation, EasyCrypt can output some information
+regarding the state of the system so that external tools can parse and show it
+to the user. Together with the fact that the evaluation steps can be reversed,
+this forms the basis of the interactivity of the EasyCrypt system: the user can
+evaluate the program step by step, and if needed, undo it and re-evaluate in the
+fly.
-\section{Problem} %%
+\begin{figure}[H]
+ \centering
+ \includegraphics[width=1\textwidth]{img/easycrypt.png}
+ \caption{EasyCrypt}
+ \label{fig:easycrypt}
+\end{figure}
-(TODO: Need of Computer-assisted proofs)
+The preferred way of working with EasyCrypt is using the
+\textbf{Emacs}\footnote{\url{http://www.gnu.org/software/emacs/}} text editor,
+with the \textbf{Proof
+ General}\footnote{\url{http://proofgeneral.inf.ed.ac.uk/}} interface to proof
+assistants (figure \ref{fig:easycrypt}). This interface shows both the source
+code and the EasyCrypt output at the point of evaluation (the already evaluated
+code is displayed in a different color), and offers standard key combinations
+for forward/backward code stepping.
+As we'll see later (section \ref{sec:verif-easycrypt}), EasyCrypt has different
+sub-languages for working with different things, e.g., representing games,
+developing the proofs, etc. One of them is specially relevant in this thesis:
+the \textbf{expression language}. It is the language EasyCrypt uses to define
+typed values, like quantified formulas, arithmetic expressions, functions,
+function application and such. (TODO: briefly explain what it is used for and
+what can be improved)
-\subsection{EasyCrypt}
-(TODO: What is EasyCrypt (Coq-like, screens, etc))
+
+
+\section{Term Rewriting}
+
+(TODO: explain the need to improve the EasyCrypt's one)
\section{Contributions} %%
+(TODO:)
+
\begin{itemize}
-\item Reference implementations of various rewriting engines
-\item Improvement of EasyCrypt's one
+\item Study and implement some rewriting engines
+\item Improve the EasyCrypt's one
\end{itemize}
@@ -156,7 +236,7 @@ requirements follow from the assumptions (scenario).
\chapter{CRYPTOGRAPHY} %%%%%%%%%%
-(TODO: Intro to crypto)
+(TODO: brief introduction to crypto: encryption (signatures?) etc.)
@@ -170,7 +250,8 @@ requirements follow from the assumptions (scenario).
Here we will introduce some of the most fundamental concepts in asymmetric
cryptography, as they will be useful to understand the next sections on
-sequences of games (TODO: ref).
+EasyCrypt's proof system and sequences of games (section
+\ref{sec:sequences-games}).
\textbf{Asymmetric cryptography} (also called \textbf{Public Key cryptography}),
refers to cryptographic algorithms that make use of two different keys, $pk$
@@ -189,37 +270,257 @@ $$\Dec(sk,C) = M$$
That is, a message ($M$) can be encrypted using a public key to obtain a
ciphertext ($C$).
+
+
+\section{Proofs by reduction} %%
+\label{sec:proofs-reduction}
+
+(TODO: explain: Shannon + perfect security size problems, need to rely on strong
+primitives -RSA, DDH...- and use them in reduction proofs)
+
+\begin{figure}[ht]
+ \centering
+ \begin{tikzpicture}[node distance = 6cm, auto]
+ % Nodes
+ \node [block, fill=blue!40] (primitive) {Primitive};
+ \node [block, below of=primitive, fill=blue!40] (scheme) {Scheme};
+ \node [block, right of=primitive, fill=red!60] (attack1) {Attack};
+ \node [block, right of=scheme, fill=red!60] (attack2) {Attack};
+ % Edges
+ \path [line,dashed] (attack1) -- (primitive);
+ \path [line,dashed] (attack2) -- (scheme);
+ \path [line]
+ (primitive) edge node[left] {\textbf{Construction}} (scheme)
+ (attack2) edge node[right] {\textbf{Reduction}} (attack1);
+ \end{tikzpicture}
+ \caption{Proofs by reduction}
+ \label{fig:proofs}
+\end{figure}
+
+
+
+
\section{Sequences of games} %%
+\label{sec:sequences-games}
-While mathematical proofs greatly enhance the confidence we have in that a given
-cryptosystem behaves as expected, with the recent increase in complexity it has
-become more and more difficult to write and verify the proofs by hand, to the
-point of being practically non-viable. In 2004 \cite{Shoup04}, Victor Shoup
-introduced the concept of \textbf{sequences of games} as a method of taming the
-complexity of cryptography related proofs. A game is like a program written in a
-well-defined, probabilistic programming language, and a sequence of games is the
-result of applying transformations over the initial one.
+In 2004 \cite{Shoup04}, Victor Shoup introduced the concept of \textbf{sequences
+ of games} as a method of taming the complexity of cryptography related
+proofs. A \textbf{game} is like a program written in a well-defined,
+probabilistic programming language, and a sequence of games is the result of
+applying transformations over the initial one. Every game represents the
+interaction between a \textbf{challenger} and an \textbf{adversary}, with the
+last one being usually encoded as a function (probabilistic program). In the
+end, we will want the sequence of games to form a proof by reduction (see
+section \ref{sec:proofs-reduction}), where the transition of games proves that
+our system can be reduced, under certain conditions, to some well-known
+cryptographic primitive (TODO: is this correct?).
+
+(TODO: explain transitions and events (adversary winning))
+
+In order to see a practical example of how sequences of games work, let us
+define the following game:
+
+\newpage
+\begin{game}[caption=IND-CPA game (from \cite{BartheGB09}),
+ label={lst:indcpa}]{}
+ $(pk,sk) \leftarrow \KG();$
+ $(m_0,m_1) \leftarrow A_1(pk);$
+ $b \overset{\$}{\leftarrow} \{0,1\};$
+ $c \leftarrow \Enc(pk,m_b);\\$
+ $\tilde{b} \leftarrow A_2(c)$
+\end{game}
+
+The Game \ref{lst:indcpa} can be used to define the IND-CPA property of public
+key encryption schemes. (TODO: explain IND-CPA)
+
+In the game, $\KG$ and $\Enc$ are the key generation and encryption functions
+provided by the encryption algorithm, respectively, and $A_1$ is the encoding of
+the adversary.
+
+(TODO)
\section{Verification: EasyCrypt} %%
+\label{sec:verif-easycrypt}
+
+EasyCrypt has different languages to perform different tasks:
\subsection{Specification languages}
+This are the languages EasyCrypt uses to declare and define types, functions,
+axioms, games, oracles, adversaries and other entities involved in the proofs.
+
\subsubsection{Expressions}
-\subsubsection{Probabilistic expressions}
+The main specification language of EasyCrypt is the expression language, in
+which \textbf{types} are defined together with \textbf{operators} that can be
+applied to them (or be constant). EasyCrypt follows the usual semicolon
+notation\cite{Church40} to denote the typing relationship: <<$a : T$>> means
+``$a$ has type $T$''. EasyCrypt has a type system supporting parametric
+polymorphism: \rawec{int list} represents a list of integers.
+
+The operators are functions over types, defined with the keyword \rawec{op}
+(e.g., \\ \rawec{op even : nat -> bool}). An operator can be applied to some
+argument by putting them separated by a space: \rawec{even 4}. Operators can be
+abstract, i.e., defined without any actual implementation; with semantics given
+by the definition of axioms and lemmas that describe its observational
+behavior. Operators are also \textit{curried}, so they support multiple
+arguments by returning new functions that consume the next one. For example,
+$f : (A \times B\times C) \rightarrow D$ would be encoded as
+$f : A \rightarrow (B \rightarrow (C \rightarrow D))$, or, by associativity,
+$f : A \rightarrow B \rightarrow C \rightarrow D$.
+
+In this example (from the current EasyCrypt library) we can see the how actual
+types and operators are defined in the EasyCrypt's expression language:
+
+\begin{easycrypt}[caption=Lists (expression language), label={lst:exprlang}]{}
+type 'a list = [
+ | "[]"
+ | (::) of 'a & 'a list ].
+
+op hd: 'a list -> 'a.
+axiom hd_cons (x:'a) xs: hd (x::xs) = x.
+
+op map (f:'a -> 'b) (xs:'a list) =
+ with xs = "[]" => []
+ with xs = (::) x xs => (f x)::(map f xs).
+\end{easycrypt}
+
+The first line defines the \rawec{list} type as a sum type with two constructors
+(cases): the \textit{empty list} and the \textit{construction} of a new list
+from an existing one and an element that will appear at its head position. The
+rest of the code defines operators working with lists.
+
+The next line abstractly defines the operator \rawec{hd}, together with its
+type. The axiom following it partially specifies the behavior of the \rawec{hd} when
+applied to some list: if the list has the form \rawec{x::xs} (element \rawec{x}
+followed by \rawec{xs}), the return value is \rawec{x}. The other case (empty
+list) is left unspecified.
+
+The last line defines the \rawec{map} operator directly, using pattern
+matching. This operator receives a function and a list, and returns the list
+consisting of the results of evaluating the function over each element of the
+list, preserving its order.
+
+\paragraph{Probabilistic expressions}
+
+Additionally, EasyCrypt defines some standard types and operators to work with
+probabilistic expressions. The type \rawec{'a distr} represents discrete
+sub-distributions over types. The operator \rawec{mu} represents the probability
+of some event in a sub-distribution:
+
+\begin{easycrypt}[]{}
+op mu : 'a distr -> ('a -> bool) -> real
+\end{easycrypt}
+
+For example, the uniform distribution over booleans is defined in the
+EasyCrypt's standard library as follows:
+
+\begin{easycrypt}[caption=Uniform distribution over bool,
+ label={lst:booldistr}]{}
+op dbool : bool distr.
+axiom mu_def : forall (p : bool -> bool),
+ mu dbool p =
+ (1/2) * charfun p true +
+ (1/2) * charfun p false.
+\end{easycrypt}
\subsubsection{pWhile language}
+Expression languages are usually not adequate to define games and other data
+structures as cryptographic schemes and oracles, due to the stateful nature of
+sequential algorithms. That's why EasyCrypt uses a different language
+called \textbf{pWhile} \cite{BartheGB12} (probabilistic while) to define them:
+
+\begin{bnf}[caption=pWhile language, label={lst:pwhile}]{}
+ $\mathcal{C} ::=$ skip
+ | $\mathcal{V \leftarrow E}$
+ | $\mathcal{V \overset{\$}{\leftarrow} DE}$
+ | if $\mathcal{E}$ then $\mathcal{C}$ else $\mathcal{C}$
+ | while $\mathcal{E}$ do $\mathcal{C}$
+ | $\mathcal{V} \leftarrow \mathcal{P}(\mathcal{E}, ..., \mathcal{E})$
+ | $\mathcal{C}$; $\mathcal{C}$
+\end{bnf}
+
\subsection{Proof languages}
+These are the languages used to write and prove theorems:
+
\subsubsection{Judgments}
+Whenever there is some statement that we want to prove, it must be written as a
+judgment in some \textbf{logic}. Apart from the first order logic expressions,
+EasyCrypt supports judgments in some logics derived from Hoare logic:
+
+\begin{itemize}
+\item Hoare Logic (\textit{HL}). These judgments have the following shape:
+
+ $$c : P \Longrightarrow Q$$
+
+ where $P$ and $Q$ are assertions (predicates) and $c$ is a statement or
+ program. $P$ is the \textbf{precondition} and $Q$ is the
+ \textbf{postcondition}. The validity of this kind of Hoare judgment implies
+ that if $P$ holds before the execution of $c$ and it terminates, then $Q$ must
+ also hold.
+
+\item Probabilistic Hoare Logic (\textit{pHL}). This is the logic resulting from
+ assigning some probability to the validity of the previously seen Hoare
+ judgments. The probability can be a number or an upper/lower bound:
+
+ $$[c : P \Longrightarrow Q] \leq \delta$$
+ $$[c : P \Longrightarrow Q] = \delta$$
+ $$[c : P \Longrightarrow Q] \geq \delta$$
+
+\item Probabilistic Relational Hoare Logic (\textit{pRHL}). These have the
+ following shape:
+
+ $$c_1 \sim c_2 : \Psi \Longrightarrow \Phi$$
+
+ In this case, the pre and postconditions are not standalone predicates, but
+ \textbf{relationships} between the memories of the two programs $c_1$ and
+ $c_2$. This judgment means that if the precondition $\Psi$ holds before the
+ execution of $c_1$ and $c_2$, the postcondition $\Phi$ will also hold after
+ finishing its execution.
+
+ This logic is the most complete and useful when developing game-based
+ reduction proofs, because it allows to encode each game transition as a
+ judgment. Twe two games are $c_1$ and $c_2$ respectively and the
+ pre/postconditions refer to the probability of the adversary winning the
+ games.
+
+\end{itemize}
+
\subsubsection{Tactics}
+If the judgment is declared as an axiom, it is taken as a fact and does not need
+to be proven. Lemmas, however, will make EasyCrypt enter in ``proof mode'',
+where it stops reading declarations, takes the current judgment as a goal and
+and starts accepting \textbf{tactics} until the current goal is trivially
+true. Tactics are indications on what rules EasyCrypt must apply to transform
+the current goal.
+
+This is a simplified example of proof from the EasyCrypt's library:
+
+\begin{easycrypt}[caption=Tactics usage, label={lst:tactics}]{}
+lemma cons_hd_tl :
+ forall (xs:'a list),
+ xs <> [] => (hd xs)::(tl xs) = xs.
+proof.
+ intros xs.
+ elim / list_ind xs.
+ simplify.
+ intros x xs' IH h {h}.
+ rewrite hd_cons.
+ rewrite tl_cons.
+ reflexivity.
+qed.
+\end{easycrypt}
+
+(TODO: briefly explain proof)
+
@@ -233,6 +534,8 @@ result of applying transformations over the initial one.
\section{Lambda Calculus} %%
+\cite{Church36}
+
\subsection{Extensions}
@@ -255,19 +558,32 @@ http://adam.chlipala.net/cpdt/html/Equality.html
\section{Evaluation Strategies} %%
+\cite{Sestoft02}
+
\section{Abstract Machines} %%
+(TODO: basic intro, motivation, function application implementations
+(push-enter, eval-apply))
+
-\subsection{Krivine Machine}
+\subsection{Krivine Machine} %%%%%%%%%%
\cite{Krivine07}
+\cite{Douence07}
+
+
+\subsection{ZAM: Standard version}
+
+\cite{Leroy90}
-\subsection{ZAM}
+\subsection{ZAM: Alternative version}
+
+\cite{LeroyG02}
+
-\cite{GregoireLeroy02}
@@ -275,14 +591,21 @@ http://adam.chlipala.net/cpdt/html/Equality.html
+
\chapter{KRIVINE MACHINE} %%%%%%%%%%
-- Outside EasyCrypt: weak symbolic with fixpts and cases
-- Inside EasyCrypt: bla bla
+\cite{Krivine07}
+\cite{Douence07}
+
+
+\section{Not compiled} %%
+\section{Compiled} %%
+
+
\chapter{ZAM} %%%%%%%%%%
@@ -290,6 +613,8 @@ http://adam.chlipala.net/cpdt/html/Equality.html
+
+
\chapter{REDUCTION IN EASYCRYPT} %%%%%%%%%%
@@ -298,6 +623,14 @@ http://adam.chlipala.net/cpdt/html/Equality.html
+\section{Data types}
+
+
+
+\section{Implementation}
+
+
+
\part{EPILOGUE} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -307,11 +640,25 @@ http://adam.chlipala.net/cpdt/html/Equality.html
\chapter{CONCLUSIONS} %%%%%%%%%%
+\begin{itemize}
+\item Brief wrapping of importance of verified crypto today
+\item Survey of available reduction machineries, differences between them
+\item Improvements to EasyCrypt: modularity, extensibility, efficiency (?)
+\item Problems encountered during the implementation (debugging!!! specially the
+ compiled one, progressive replacing of the EasyCrypt's one due to state)
+\end{itemize}
+
\chapter{FUTURE WORK} %%%%%%%%%%
+\begin{itemize}
+\item Compile code to abstract machine opcodes to improve efficiency
+\item Tests
+\item Make users able to define their own rules (expand the engine in ``userland'')
+\end{itemize}
+