diff --git a/year4/semester1/CT4100: Information Retrieval/notes/CT4100-Notes.pdf b/year4/semester1/CT4100: Information Retrieval/notes/CT4100-Notes.pdf
index 9ab97484..fc733b10 100644
Binary files a/year4/semester1/CT4100: Information Retrieval/notes/CT4100-Notes.pdf and b/year4/semester1/CT4100: Information Retrieval/notes/CT4100-Notes.pdf differ
diff --git a/year4/semester1/CT4100: Information Retrieval/notes/CT4100-Notes.tex b/year4/semester1/CT4100: Information Retrieval/notes/CT4100-Notes.tex
index f7a57ba1..3bf983e5 100644
--- a/year4/semester1/CT4100: Information Retrieval/notes/CT4100-Notes.tex	
+++ b/year4/semester1/CT4100: Information Retrieval/notes/CT4100-Notes.tex	
@@ -1505,6 +1505,14 @@ Each pair is also either ``true'' (correct) or ``false'' (incorrect), i.e., the
 
 \subsubsection{How Many Clusters?}
 The number of clusters $k$ is given in many applications.
+For example, there may be an external constraint on $k$: for the scatter-gather algorithm, it was hard to show more than 10--20 clusters on a monitor in the 1990s.
+\\\\
+Even if there is no external constraint, there is no single ``right'' number of clusters that can be determined empirically.
+One approach is to define an optimisation criterion, and find the $k$ for which the optimum is reached.
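+We cannot use RSS or the average squared distance from the centroid as the criterion on its own: the optimal value of both never increases as $k$ grows and reaches its minimum of 0 at $k = N$, where every document forms its own cluster.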
+The \textbf{elbow method} plots the residual sum of squares against the number of clusters and picks the $k$ at the ``elbow'' of the curve, where the residual sum of squares stops decreasing rapidly.
+
+
 
 \section{Query Estimation}
 \textbf{Query difficulty estimation} is used to attempt to estimate the quality of search results for a query from a given collection of documents in the absence of user relevance feedback.
@@ -1866,7 +1874,6 @@ Other issues in web search include:
     \item   Augmenting link analysis algorithms to deal with such manipulation.
 \end{itemize}
 
-\section{Exam Notes}