Week 12 - Exam hints part 1

Information retrieval
- Boolean model
- Vector model -> weighting schemes
  - Tf-idf
  - BM25 / pivot normalisation
  - Take stuff into account: local evidence, normalisation, global (idf), document length normalisation
  - No need to memorise the equations - explain the main components and you're good
  - Looking at it from an axiomatic approach
- Other models
  - Neural networks
  - Extended Boolean

Evaluation
- Evaluating different weighting schemes
- Precision
- Recall
- Precision-recall graph
- MAP
- User ones - coverage & novelty (only when we have users that we can study or discuss)
- Purity etc.

Learning
- Looked at evolutionary computation
- Looked at clustering
- Give us ways to generate new weighting schemes
- K-means

Collaborative filtering
- Interested in evaluating it with coverage and accuracy

Web search
- HITS algorithm
- PageRank

Query expansion

Query difficulty prediction
- How can we tell if a certain query is easy or hard
- Pre-retrieval: query features, features of collection
- Post-retrieval: answer set, or even just ranking of documents when they come back

Preprocessing & indexing - system efficiency

PAPER OVERVIEW
- 4 questions, do any 3
- 3 parts to every question

Q1. Weighting schemes
- Stuff on axiomatic approach (only formally correct way to look at weighting schemes)
- tf-idf -> its issues and overcoming those
- Vector space model (and other approaches)

Q2. Relevance feedback
- Know when to do it
- What features to use
- Explicit feedback
- Similar to assignment - query expansion (but different obviously lmao)

Q3. Alternative model to vector space or Boolean
- How we evaluate a system here
- Anything to do with applying learning algorithms

Q4. Similarity - overlaps with assignment questions
- Recommendation approaches
- Clustering approaches

Q1. Weighting schemes
- tf-idf -> dealing with issues with some of the stuff in vector space model, tf-idf
- Axiomatic approach - can determine if a weighting scheme is good or not
- Can also be used to constrain a search

Q2. Relevance feedback
- Often want to know when we can use it and features to use
- broken, blind feedback, query; can increase precision, sometimes can damage precision
- Want to do query feedback when it is a hard query
- What features should exist if we want feedback
- When user says a doc is relevant or not / gives a score in some range
- Query expansion

Q3. Alternatives to vector space
- Evaluation - precision, recall, precision-recall graphs
- Apply learning

Q4. Similarity
- Borrow ideas from more than one component in the course
- Clustering approaches
- Recommendation approaches
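
For Q1, the main components of a tf-idf style weighting scheme (local evidence, global idf, document length normalisation) can be sketched roughly as below. This is only one of many variants: raw term frequency, idf = log(N / df) and cosine (unit length) normalisation are assumptions made for illustration, and the function names `tf_idf_vectors` and `rank` are made up here, not from the course.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Global evidence: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}

    vectors = []
    for doc in docs:
        tf = Counter(doc)  # local evidence: raw term frequency
        weights = {term: freq * idf[term] for term, freq in tf.items()}
        # Document length normalisation (assumed here: cosine / unit length).
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            weights = {term: w / norm for term, w in weights.items()}
        vectors.append(weights)
    return vectors


def rank(query_tokens, vectors):
    """Score each document by summing its weights for the query terms
    (dot product with a binary query vector); return indices, best first."""
    scores = [sum(vec.get(t, 0.0) for t in set(query_tokens)) for vec in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)


docs = [["cat", "sat", "mat"], ["cat", "cat", "dog"], ["dog", "barks"]]
vectors = tf_idf_vectors(docs)
ranking = rank(["cat"], vectors)
```

In this toy collection the second document mentions "cat" twice and is short, so it should come out ahead of the first; a term that occurred in every document would get idf 0 and contribute nothing.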
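
For the evaluation part of Q3, precision, recall and MAP for a ranked answer set can be sketched as follows. The function names are made up for illustration, and average precision is divided by the total number of relevant documents, which is a common convention but an assumption here.

```python
def precision_recall(ranked, relevant):
    """Precision and recall of a returned list against the relevant set."""
    relevant = set(relevant)
    hits = sum(1 for doc in ranked if doc in relevant)
    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant doc appears,
    divided by the total number of relevant docs (assumed convention)."""
    relevant = set(relevant)
    hits = 0
    total = 0.0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k  # precision at rank k
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """runs: list of (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d5"}
p, r = precision_recall(ranked, relevant)
ap = average_precision(ranked, relevant)
```

Computing precision and recall at each cut-off of the ranking, rather than once for the whole list, gives the points for a precision-recall graph.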