diff --git a/year4/semester2/CT420/notes/CT420.pdf b/year4/semester2/CT420/notes/CT420.pdf index 7dde6fc6..fd22ab66 100644 Binary files a/year4/semester2/CT420/notes/CT420.pdf and b/year4/semester2/CT420/notes/CT420.pdf differ diff --git a/year4/semester2/CT420/notes/CT420.tex b/year4/semester2/CT420/notes/CT420.tex index 9cf52f30..afc8d14e 100644 --- a/year4/semester2/CT420/notes/CT420.tex +++ b/year4/semester2/CT420/notes/CT420.tex @@ -29,6 +29,7 @@ % \newcommand{\secref}[1]{\textbf{§~\nameref{#1}}} \newcommand{\secref}[1]{\textbf{§\ref{#1}~\nameref{#1}}} +\usepackage{mathtools} \usepackage{changepage} % adjust margins on the fly \usepackage{minted} @@ -163,7 +164,104 @@ The principles of ground-based navigation systems is as follows: \item This allows the receiver to deduce the distance to each of the stations, providing a fix. \end{enumerate} +NEED TO FINISH +\section{Time Synchronisation in Distributed Systems} +A \textbf{distributed system (DS)} is a type of networked system wherein multiple computers (nodes) work together to perform a task. +Such systems may or may not be connected to the Internet. +Time \& synchronisation are important issues here: think of error logs in distributed systems -- how can error events recorded in different computers be correlated with each other if there is no common time base? +The problem is that GNSS-based time synchronisation may or may not be available, as GPS signals are absorbed or weakened by building structures. +There is no other time reference such systems can rely on because in such a distributed system there are just a series of imperfect computer clocks. +\\\\ +In distributed systems, all the different nodes are supposed to have the same notion of time, but quartz oscillators oscillate at slightly different frequencies. +Hence, clocks tick at different rates (called \textit{clock skew}), resulting in an increasing gap in perceived time. 
+The difference between two clocks at a given point in time is called \textit{clock offset}. +The \textbf{clock synchronisation problem} aims to minimise the clock skew and subsequently the offset between two or more clocks. +A clock can show a positive or negative offset with regard to a reference clock (e.g., UTC), and will need to be resynchronised periodically. +One cannot just set the clock to the ``correct'' time: jumps, particularly backwards, can confuse software and operating systems. +Instead, we aim for gradual compensation by correcting the skew: if a clock runs too fast, make it run slower until correct, and if a clock runs too slow, make it run faster until correct. +\\\\ +Synchronisation can take place in different forms: +\begin{itemize} + \item Based on \textbf{physical} clocks: either absolute to each other by synchronising to an accurate time source (e.g., UTC), or absolute to each other by synchronising to a locally agreed time (i.e., no link to a global time reference), where the term \textit{absolute} means that the differences in timestamps are proper time intervals. + + \item Based on \textbf{logical} clocks (i.e., clocks are more like counters): timestamps may be ordered but with no notion of measurable time intervals. +\end{itemize} + +In either case, the DS endpoints synchronise using a shared network. +For physical clock synchronisation, network latencies must be considered as packets travel from a sending node to a receiving node. +In a \textbf{perfect network}, messages \textit{always} arrive, with a propagation delay of \textit{exactly} $d$; +the sender sends time $T$ in a message, the receiver sets its clock to $T + d$, and synchronisation is exact. +\\\\ +In a \textbf{deterministic network}, messages arrive with a propagation delay $0 < d \leq D$; +the sender sends time $T$ in a message, the receiver sets its clock to $T + \frac{D}{2}$, and therefore the synchronisation error is at most $\frac{D}{2}$.
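+\\\\
+As a brief check of the $\frac{D}{2}$ bound (a sketch using only the symbols above, with $\varepsilon$ introduced here to denote the receiver's error): the true time on arrival is $T + d$, while the receiver sets its clock to $T + \frac{D}{2}$, so
+\[
+    \varepsilon = \left| \left( T + \frac{D}{2} \right) - (T + d) \right| = \left| \frac{D}{2} - d \right| \leq \frac{D}{2} \quad \text{since } 0 < d \leq D.
+\]
+For instance, assuming $D = 10$\,ms, a message that actually took $d = 2$\,ms leaves the receiver 3\,ms ahead, and the worst case $d = D$ leaves it 5\,ms behind.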
+\textbf{Deterministic communication} is the ability of a network to guarantee that a message will be transmitted in a specified, predictable period of time. + +\subsection{Synchronisation in the Real World} +Most off-the-shelf networks are \textit{asynchronous}, that is, data is transmitted intermittently on a best-effort basis. +They are designed for flexibility, not determinism, and as a result, propagation delays are arbitrary and sometimes even asymmetric (i.e., upstream \& downstream latencies are different). +Therefore, synchronisation algorithms are needed to accommodate these limitations. + +\subsubsection{Cristian's Algorithm} +\textbf{Cristian's algorithm} attempts to compensate for symmetric network delays: +\begin{enumerate} + \item The client remembers the local time $T_0$ just before sending a request. + \item The server receives the request, determines $T_S$, and sends it as a reply. + \item When the client receives the reply, it notes the local arrival time $T_1$. + \item The correct time is then approximately $T_S + \frac{T_1 - T_0}{2}$. +\end{enumerate} + +The algorithm assumes symmetric network latency. +If the server is synced to UTC, all clients will follow UTC. +Limitations of Cristian's algorithm include: +\begin{itemize} + \item Assumes a symmetric network latency; + \item Assumes that timestamps can be taken as the packet hits the wire / arrives at the client; + \item Assumes that $T_S$ is right in the middle of the server process; for example, consider the server process being pre-empted just before it sends the response back to the client, which will corrupt the synchronisation of the client. +\end{itemize} + +\subsubsection{Berkeley Algorithm} +In the \textbf{Berkeley algorithm}, there is no accurate time server: instead, a set of client clocks is synchronised to their average time.
+The assumption is that offsets / skews of all clocks follow some symmetric distribution (e.g., a normal distribution) with some clocks going faster and others slower, and therefore a mean value close to 0. +\begin{enumerate} + \item One node is designated to be the \textbf{master node} $M$. + \item The master node periodically queries all other clients for their local time. + \item Each client returns a timestamp or its clock offset to the master. + \item Cristian's algorithm is used to determine and compensate for RTTs, which can be different for each client. + \item Using these, the master computes the average time (ignoring outliers), calculates the difference to all timestamps it has received, and sends an adjustment to each client. + Again, each computer gradually adjusts its local clock. +\end{enumerate} + +Client clocks are adjusted to run faster or slower, to be synced to an overall agreed system time. +The client network is an intranet, i.e., an isolated system. +Therefore, the Berkeley algorithm is an \textbf{internal clock synchronisation algorithm}. +The Berkeley algorithm was implemented in the TEMPO time synchronisation protocol, which was part of the Berkeley UNIX 4.3BSD system. + +\subsection{Logical Clocks} +\textbf{Logical clocks} are another concept linked to internal clock synchronisation. +Logical clocks only care about their internal consistency, but not about absolute (UTC) time; +consequently, they do not need clock synchronisation and take into account the order in which events occur rather than the time at which they occurred. +In practice, if clients or processes only care that event $a$ happens before event $b$, but don't care about the exact time difference, they can make use of a logical clock.
+\\\\ +We can define the \textbf{happens-before relation} $a \rightarrow b$: +\begin{itemize} + \item If events $a$ and $b$ are within the same process, then $a \rightarrow b$ if $a$ occurs with an earlier local timestamp: \textbf{process order}. + \item If $a$ is the event of a message being sent by one process, and $b$ is the event of the message being received by another process, then $a \rightarrow b$: \textbf{causal order}. + \item We also have \textbf{transitivity:} if $a \rightarrow b$ and $b \rightarrow c$, then $a \rightarrow c$. +\end{itemize} +Note that this only provides a \textit{partial order}: +if two events $a$ and $b$ happen in different processes that do not exchange messages (not even indirectly), then neither $a \rightarrow b$ nor $b \rightarrow a$ is true. +In this situation, we say that $a$ and $b$ are \textbf{concurrent} and write $a \sim b$, i.e., nothing can be said about when the events happened or which event happened first. +\\\\ +Happens-before can be implemented using the \textbf{Lamport scheme:} +\begin{enumerate} + \item Each process $P_i$ has a logical clock $L_i$, where $L_i$ can be simply an integer variable initialised to 0. + \item $L_i$ is incremented on every local event $e$; we write $L_i(e)$ or $L(e)$ as the timestamp of $e$. + \item When $P_i$ sends a message, it increments $L_i$ and copies its value into the packet. + \item When $P_i$ receives a message from $P_k$, it extracts $L_k$, sets $L_i := \max(L_i, L_k)$, and then increments $L_i$. +\end{enumerate} + +This guarantees that if $a \rightarrow b$, then $L_i(a) < L_k(b)$, but nothing else: in particular, $L_i(a) < L_k(b)$ does not imply $a \rightarrow b$.
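+\\\\
+As an illustrative trace of the Lamport scheme (a hypothetical two-process run; the event names $e_1, \dots, e_4$ are assumptions made here for the example), let $P_1$ and $P_2$ both start with $L_1 = L_2 = 0$:
+\begin{enumerate}
+    \item $P_1$ performs a local event $e_1$: $L_1 = 1$, so $L(e_1) = 1$.
+    \item $P_1$ sends a message to $P_2$ (event $e_2$): $L_1 = 2$, so $L(e_2) = 2$ and the message carries the value 2.
+    \item Independently, $P_2$ performs a local event $e_3$: $L_2 = 1$, so $L(e_3) = 1$.
+    \item $P_2$ receives the message (event $e_4$): $L_2 = \max(1, 2) + 1 = 3$, so $L(e_4) = 3$.
+\end{enumerate}
+
+Here $e_1 \rightarrow e_2 \rightarrow e_4$, and the timestamps increase accordingly ($1 < 2 < 3$).
+By contrast, $e_3 \sim e_2$, yet $L(e_3) = 1 < 2 = L(e_2)$: a smaller timestamp on its own does not establish happens-before.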