[CT4100]: Add exam notes
year4/semester1/CT4100: Information Retrieval/exam (new file, 107 lines)
@@ -0,0 +1,107 @@
Week 12 - Exam hints part 1

Information retrieval
Boolean model
Vector model

-> Weighting schemes
Tf-idf
BM25 / Pivot Normalisation
Take stuff into account - local evidence (tf), global evidence (idf), document length normalisation

No need to memorise the equations - explain the main components then you're good (formulas sketched below for reference)
Looking at it from an axiomatic approach

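For reference - a sketch only, exact parameterisations vary by textbook - the usual shapes of these schemes, writing tf_{t,d} for term frequency, df_t for document frequency, N for collection size, |d| for document length and avdl for average document length:

    \text{tf-idf}: \quad w_{t,d} = tf_{t,d} \cdot \log\frac{N}{df_t}

    \text{pivoted length normalisation factor}: \quad (1 - s) + s \cdot \frac{|d|}{avdl}

    \text{BM25}: \quad \sum_{t \in q} \log\frac{N - df_t + 0.5}{df_t + 0.5} \cdot \frac{(k_1 + 1)\, tf_{t,d}}{k_1\bigl((1 - b) + b\,\frac{|d|}{avdl}\bigr) + tf_{t,d}}

Local evidence is the tf factor, global evidence is the idf factor, and the (1 - b) + b|d|/avdl (pivot) part is the document length normalisation - those are the main components worth being able to explain.
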
other models
Neural networks
extended Boolean (p-norm sketch below)

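A reminder of the extended Boolean (p-norm) idea, sketched for a two-term query with document term weights w_1, w_2 in [0, 1]:

    sim(d, t_1 \vee t_2) = \left(\frac{w_1^p + w_2^p}{2}\right)^{1/p} \qquad sim(d, t_1 \wedge t_2) = 1 - \left(\frac{(1 - w_1)^p + (1 - w_2)^p}{2}\right)^{1/p}

p = 1 behaves like the vector model; p -> infinity recovers strict Boolean matching.
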
Evaluation
Evaluating different weighting schemes
precision, recall
precision-recall graph
MAP (average-precision sketch below)
user-based ones - coverage & novelty (only when we have users we can study or discuss)
purity etc.

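A minimal sketch (hypothetical ranking and relevance judgements, not from the lectures) of how precision, recall and average precision - the per-query quantity that MAP averages over queries - are computed:

    def precision_recall_at_k(ranking, relevant, k):
        """Precision and recall over the top-k retrieved documents."""
        hits = sum(1 for doc in ranking[:k] if doc in relevant)
        return hits / k, hits / len(relevant)

    def average_precision(ranking, relevant):
        """Mean of the precision values at each rank where a relevant doc appears."""
        hits, total = 0, 0.0
        for rank, doc in enumerate(ranking, start=1):
            if doc in relevant:
                hits += 1
                total += hits / rank
        return total / len(relevant)

    # Hypothetical ranking and judgements; MAP = mean of average_precision over queries.
    ranking = ["d3", "d1", "d7", "d2", "d5"]
    relevant = {"d1", "d2", "d9"}
    print(precision_recall_at_k(ranking, relevant, k=3))   # (0.33..., 0.33...)
    print(average_precision(ranking, relevant))            # (1/2 + 2/4) / 3 = 0.33...
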
Learning
Looked at evolutionary computation
Looked at clustering

give us ways to generate new weighting schemes
k-means (sketch below)

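A minimal k-means sketch (hypothetical toy 2-D points standing in for document vectors; a real IR use would cluster tf-idf vectors):

    import random

    def kmeans(vectors, k, iters=20):
        """Plain k-means: assign each vector to its nearest centroid, then re-average."""
        centroids = random.sample(vectors, k)
        for _ in range(iters):
            clusters = [[] for _ in range(k)]
            for v in vectors:
                nearest = min(range(k),
                              key=lambda i: sum((a - b) ** 2 for a, b in zip(v, centroids[i])))
                clusters[nearest].append(v)
            centroids = [[sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
                         for i, c in enumerate(clusters)]
        return clusters, centroids

    # Hypothetical 2-D points standing in for document vectors.
    points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.1), (4.8, 5.3)]
    clusters, centroids = kmeans(points, k=2)

Purity (from the evaluation list above) then measures how well each resulting cluster matches a single known class.
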
Collaborative filtering (sketch below)
interested in evaluating it with coverage and accuracy

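A sketch of user-based collaborative filtering under simple assumptions (hypothetical ratings dictionary, cosine similarity between users, prediction as a similarity-weighted average):

    from math import sqrt

    def cosine(u, v):
        """Cosine similarity between two users' rating dicts (item -> rating)."""
        common = set(u) & set(v)
        if not common:
            return 0.0
        dot = sum(u[i] * v[i] for i in common)
        norm_u = sqrt(sum(x * x for x in u.values()))
        norm_v = sqrt(sum(x * x for x in v.values()))
        return dot / (norm_u * norm_v)

    def predict(ratings, user, item):
        """Similarity-weighted average of other users' ratings for the item."""
        pairs = [(cosine(ratings[user], r), r[item])
                 for other, r in ratings.items() if other != user and item in r]
        denom = sum(abs(sim) for sim, _ in pairs)
        return sum(sim * rating for sim, rating in pairs) / denom if denom else None

    # Hypothetical user-item ratings.
    ratings = {"alice": {"m1": 5, "m2": 3},
               "bob":   {"m1": 4, "m2": 2, "m3": 4},
               "carol": {"m1": 1, "m3": 2}}
    print(predict(ratings, "alice", "m3"))

Accuracy is then how close predictions are to held-out ratings; coverage is the fraction of user-item pairs for which a prediction can be made at all.
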
Web search
HITS algorithm
PageRank (sketch below)
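
A sketch of PageRank by power iteration (hypothetical three-page link graph; d is the usual damping factor):

    def pagerank(links, d=0.85, iters=50):
        """links: page -> list of pages it links to. Returns an approximate score per page."""
        pages = list(links)
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(iters):
            new = {p: (1 - d) / n for p in pages}
            for p, outs in links.items():
                if not outs:                       # dangling page: share its rank evenly
                    for q in pages:
                        new[q] += d * rank[p] / n
                else:
                    for q in outs:
                        new[q] += d * rank[p] / len(outs)
            rank = new
        return rank

    # Hypothetical three-page web: a -> b, b -> a and c, c -> a.
    print(pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]}))

HITS similarly iterates two scores per page on the query's neighbourhood graph: authority(p) sums the hub scores of pages linking to p, and hub(p) sums the authority scores of pages p links to.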

Query expansion

Query difficulty prediction
how can we tell if a certain query is easy or hard? (simple predictor sketched below)

pre-retrieval:
query features
features of the collection

post-retrieval:
answer set
or even just the ranking of documents when they come back

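One very simple pre-retrieval predictor, as a sketch with hypothetical collection statistics - queries made of rare, high-idf terms tend to be easier to answer precisely than queries of common terms:

    from math import log

    def avg_idf(query_terms, df, n_docs):
        """Average idf of the query terms - a crude pre-retrieval difficulty signal."""
        idfs = [log(n_docs / df.get(t, 1)) for t in query_terms]
        return sum(idfs) / len(idfs)

    # Hypothetical document frequencies in a 10,000-document collection.
    df = {"information": 6000, "retrieval": 800, "okapi": 15}
    print(avg_idf(["information", "retrieval"], df, 10_000))  # lower -> likely harder
    print(avg_idf(["okapi", "retrieval"], df, 10_000))        # higher -> likely easier

Post-retrieval predictors instead look at the answer set itself, e.g. how the scores of the top-ranked documents are distributed.
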
Preprocessing & indexing (index sketch below)
so the system runs efficiently

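A sketch of the core indexing structure over a hypothetical toy corpus (lowercasing and a tiny stop-word list stand in for the usual preprocessing):

    from collections import defaultdict

    STOPWORDS = {"the", "a", "of", "is"}

    def build_inverted_index(docs):
        """Map each term to the set of doc ids containing it, after trivial preprocessing."""
        index = defaultdict(set)
        for doc_id, text in docs.items():
            for token in text.lower().split():
                if token not in STOPWORDS:
                    index[token].add(doc_id)
        return index

    # Hypothetical two-document corpus.
    docs = {1: "The vector space model", 2: "The Boolean model of retrieval"}
    print(build_inverted_index(docs)["model"])   # {1, 2}
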
PAPER OVERVIEW
4 questions, do any 3
3 parts to every question

Q1
Weighting schemes
stuff on the axiomatic approach (the only formally correct way to look at weighting schemes)
tf-idf -> the issues with it and how they are overcome
vector space model (and other approaches)

Q2
Relevance feedback
know when to do it
what features to use
explicit feedback
similar to the assignment - query expansion (but different, obviously)

Q3
Alternative models to vector space or Boolean
how we evaluate a system here
anything to do with applying learning algorithms

Q4
similarity - overlaps with assignment questions
recommendation approaches
clustering approaches

Q1. Weighting Schemes
- tf-idf -> dealing with issues with some of the stuff in the vector space model
- axiomatic approach (example constraint below)
  - can determine if a weighting scheme is good or not
  - can also be used to constrain a search

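A sketch of what an axiomatic constraint looks like (the standard term-frequency constraint, stated informally): for a single-term query q = {t}, if two documents have the same length and d_1 contains t more often than d_2, a sensible scheme should score d_1 higher:

    |d_1| = |d_2| \;\wedge\; tf_{t,d_1} > tf_{t,d_2} \;\Rightarrow\; score(q, d_1) > score(q, d_2)

A candidate scheme that violates constraints like this can be rejected without running retrieval experiments, and the constraints also prune the space when searching for (e.g. evolving) new weighting schemes.
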
Q2. Relevance feedback
- often want to know when we can use it and what features to use
- blind feedback modifies the query to increase precision, but sometimes it can damage precision
- want to do query feedback when it is a hard query
- what features should exist if we want feedback
- explicit feedback: when the user says a doc is relevant or not / gives a score in some range
- query expansion (e.g. Rocchio - sketched below)

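The classic query-modification formula for this is Rocchio (a sketch; alpha, beta, gamma are tuning weights, D_r and D_n the sets of judged relevant and non-relevant document vectors):

    \vec{q}_{new} = \alpha\,\vec{q} + \frac{\beta}{|D_r|}\sum_{\vec{d}\in D_r}\vec{d} - \frac{\gamma}{|D_n|}\sum_{\vec{d}\in D_n}\vec{d}

In blind (pseudo-relevance) feedback the top-k retrieved documents simply stand in for D_r, which is exactly where the risk of damaging precision on hard queries comes from.
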
Q3. Alternatives to Vector Space
- evaluation - precision, recall, precision-recall graphs
- apply learning

Q4
- similarity - borrow ideas from more than one component of the course (cosine sketch below)
- clustering approaches
- recommendation approaches
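
For the similarity part, the measure most of the above leans on is cosine similarity between weighted term vectors:

    \cos(\vec{q}, \vec{d}) = \frac{\sum_t w_{t,q}\, w_{t,d}}{\sqrt{\sum_t w_{t,q}^2}\;\sqrt{\sum_t w_{t,d}^2}}

The same measure reappears across the course components: query-document scoring in the vector model, document-document similarity in clustering, and user-user similarity in collaborative filtering.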