[CT4100]: Tweak Week 4 lecture notes
Binary file not shown.
@@ -641,7 +641,7 @@ The term independence assumption is also usually adopted, i.e., that the occurre
However, it is unlikely that 30 occurrences of a term in a document truly carries thirty times the significance of a single occurrence of that term.
A common modification is to use the logarithm of the term frequency:
\begin{align*}
-\text{If } \textit{tf}_{i,d} > 0:& \quad w_{i,d} = 1 + \log(\textit{tf}_{i,d})\\
+\text{If } \textit{tf}_{i,d} > 0 \text{:}& \quad w_{i,d} = 1 + \log(\textit{tf}_{i,d})\\
\text{Otherwise:}& \quad w_{i,d} = 0
\end{align*}
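
For reviewers who want to sanity-check the weighting in this hunk, the following is a minimal Python sketch of the log-scaled term frequency. The function name `log_tf_weight` and the `base` parameter are hypothetical; the notes do not state which logarithm base is intended, so base 10 is assumed here purely for illustration.

```python
import math


def log_tf_weight(tf: int, base: float = 10.0) -> float:
    """Return the log-scaled weight w_{i,d} for a raw term frequency tf_{i,d}.

    Assumption: base-10 logarithm by default; the lecture notes leave the
    base unspecified, so pass a different base if another one is intended.
    """
    if tf < 0:
        raise ValueError("term frequency cannot be negative")
    if tf > 0:
        # 1 + log(tf): a single occurrence gets weight 1, and further
        # occurrences add progressively less.
        return 1.0 + math.log(tf, base)
    # The term does not occur in the document.
    return 0.0


if __name__ == "__main__":
    for tf in (0, 1, 30):
        print(tf, round(log_tf_weight(tf), 3))
```

Under the base-10 assumption, 30 occurrences receive a weight of roughly 2.5 rather than 30, which matches the motivation given in the surrounding text.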