[CT4100]: Add exam notes
year4/semester1/CT4100: Information Retrieval/exam
Week 12 - Exam hints part 1
Information retrieval
Boolean model
Vector model

-> Weighting schemes
Tf-idf
BM25 / Pivot Normalisation
Take into account: local evidence (tf), normalisation, global evidence (idf), document length normalisation

No need to memorise the equations - explaining the main components is enough
Looking at it from an axiomatic approach
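
The components listed above (local evidence, global evidence, length normalisation) can be pointed out in a BM25-style score. A minimal Python sketch, assuming the common Okapi BM25 form with k1 = 1.2 and b = 0.75 (typical textbook defaults, not values taken from the lectures); the function name bm25_score and the toy documents are made up for illustration:

```python
import math
from collections import Counter

def bm25_score(query, doc, docs, k1=1.2, b=0.75):
    """Score one tokenised document against a query, BM25-style."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    tf = Counter(doc)
    score = 0.0
    for term in query:
        df = sum(1 for d in docs if term in d)  # document frequency
        if df == 0 or tf[term] == 0:
            continue
        # Global evidence: idf (the +1 inside the log keeps it positive).
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        # Document length normalisation: longer than average is penalised.
        length_norm = 1 - b + b * len(doc) / avg_len
        # Local evidence: term frequency, saturating as tf grows.
        local = tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        score += idf * local
    return score

# Hypothetical toy collection, tokenised by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat around the garden all day".split(),
    "stock markets fell sharply".split(),
]
query = "cat mat".split()
scores = [bm25_score(query, d, docs) for d in docs]
```

The short document matching both query terms should come out on top, the long document matching one term second, and the unrelated document should score zero.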
other models
Neural networks
extended boolean


Evaluation
Evaluating different weighting schemes
precision recall
precision recall graph
MAP (mean average precision)
user ones - coverage & novelty (only when we have users that we can study or discuss)
purity etc.
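
The system-side measures above can be sketched in a few lines. This is a minimal illustration using the usual definitions of precision, recall and (mean) average precision; the document ids and relevance judgements are invented:

```python
def precision_recall(ranked, relevant):
    """Precision and recall of a retrieved list against a set of relevant ids."""
    hits = sum(1 for d in ranked if d in relevant)
    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def average_precision(ranked, relevant):
    """Sum of precision at each rank holding a relevant doc, over all relevant docs."""
    hits, total = 0, 0.0
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP: average precision averaged over (ranking, relevant set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical ranking and judgements for one query.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}
p, r = precision_recall(ranked, relevant)
ap = average_precision(ranked, relevant)
map_score = mean_average_precision([(ranked, relevant), (["d1"], {"d1"})])
```

Computing precision and recall on each prefix of the ranking (top 1, top 2, ...) gives the points for a precision-recall graph.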

Learning
Looked at evolutionary computation
Looked at clustering

give us ways to generate new weighting schemes
K-means
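
A rough sketch of K-means to recall the two steps (assign, then update). Assumptions: points are plain tuples (in IR they would be document vectors), squared Euclidean distance is used, and the first k points seed the centroids so the run is repeatable (real implementations usually pick seeds at random):

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on tuples of floats; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D data with two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
```

On this toy data the two seeds both start in the same group, and the update step still pulls the second centroid over to the far group within a couple of iterations.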

Collaborative filtering
interested in evaluating it with coverage and accuracy
Web search
HITS algorithm
PageRank
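
For PageRank, a minimal power-iteration sketch. Assumptions: a damping factor of 0.85 (the commonly quoted value), a tiny made-up link graph where every link target is also a key, and dangling pages spreading their rank evenly. HITS differs in keeping two scores per page (hub and authority) and is not shown here:

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets the random-jump share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its damped rank equally among its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical four-page web: C is linked to by A, B and D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
rank = pagerank(links)
```

Page C, with the most inlinks, ends up with the highest rank; page D, which nothing links to, keeps only the random-jump share.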


Query expansion

Query difficulty prediction
how can we tell if a certain query is easy or hard
pre-retrieval
query features
features of collection

post-retrieval
answer set
or even just ranking of documents when they come back
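
As an example of a pre-retrieval feature that combines the query with collection statistics, here is a sketch of the average idf of the query terms. This is only one possible predictor, chosen for illustration; the function name, the add-one smoothing and the toy collection are assumptions:

```python
import math

def average_idf(query_terms, docs):
    """Pre-retrieval predictor: mean idf of the query terms over the collection."""
    n_docs = len(docs)
    idfs = []
    for term in query_terms:
        df = sum(1 for d in docs if term in d)
        # Add-one smoothing so unseen terms do not divide by zero.
        idfs.append(math.log((n_docs + 1) / (df + 1)))
    return sum(idfs) / len(idfs) if idfs else 0.0

# Hypothetical collection, each document reduced to its set of terms.
docs = [
    {"information", "retrieval", "system"},
    {"information", "theory"},
    {"retrieval", "of", "information"},
    {"genetic", "programming"},
]
specific = average_idf(["genetic", "programming"], docs)
vague = average_idf(["information"], docs)
```

The intuition is that a query made of rare, specific terms (high average idf) tends to be easier than one made of very common terms, though this is a heuristic rather than a guarantee.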


Preprocessing & indexing
system efficiency
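
A small sketch of why indexing helps efficiency: with an inverted index, a query only touches the postings of its own terms rather than scanning every document. The preprocessing here (lower-casing, stripping punctuation, a four-word stopword list) is deliberately simplistic and the names are made up:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency} after simple preprocessing."""
    stopwords = {"the", "a", "of", "and"}  # tiny illustrative list
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            token = token.strip(".,;:!?")
            if not token or token in stopwords:
                continue
            index[token][doc_id] = index[token].get(doc_id, 0) + 1
    return index

# Hypothetical two-document collection.
docs = {
    1: "The retrieval of information.",
    2: "Information theory and information retrieval",
}
index = build_inverted_index(docs)
```

Looking up index["information"] gives the documents containing the term along with its frequency in each, which is what a tf-based weighting scheme needs.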

PAPER OVERVIEW
4 questions, do any 3
3 parts to every question

Q1
Weighting schemes
stuff on axiomatic approach (only formally correct way to look at weighting schemes)
tf-idf -> its issues and how to overcome them
vector space model (and other approaches)

Q2
Relevance feedback
know when to do it
what features to use
explicit feedback
similar to assignment - query expansion (but different obviously lmao)

Q3
Alternative model to vector space or boolean
How we evaluate a system here
anything to do with applying learning algorithms

Q4
similarity - overlaps with assignment questions
recommendation approaches
clustering approaches


Q1. Weighting Schemes
- tf-idf -> dealing with some of the issues in the vector space model and in tf-idf itself
- axiomatic approach
- can determine if a weighting scheme is good or not
- can also be used to constrain a search
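
To make "determine if a weighting scheme is good or not" concrete, here is a sketch that tests two term-frequency properties numerically: the weight should rise with tf, and each extra occurrence should add less than the last. These are in the spirit of the term-frequency constraints from the axiomatic literature, not a full statement of them, and the helper names are made up:

```python
import math

def raw_tf(tf):
    return float(tf)

def log_tf(tf):
    return 1 + math.log(tf) if tf > 0 else 0.0

def increases_with_tf(weight, max_tf=20):
    """More occurrences of a query term should always raise the weight."""
    return all(weight(t + 1) > weight(t) for t in range(max_tf))

def has_diminishing_returns(weight, max_tf=20):
    """Each extra occurrence should add less than the previous one did."""
    gains = [weight(t + 1) - weight(t) for t in range(1, max_tf)]
    return all(later < earlier for earlier, later in zip(gains, gains[1:]))

# Check both candidate tf components against both properties.
results = {
    name: (increases_with_tf(fn), has_diminishing_returns(fn))
    for name, fn in [("raw_tf", raw_tf), ("log_tf", log_tf)]
}
```

Raw tf passes the first check but fails the second (every occurrence adds the same amount), while log tf passes both. A numeric check over a range of tf values like this is only evidence, not a proof that a scheme satisfies a constraint.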


Q2. Relevance feedback
- often want to know when we can use it and features to use
- broken, blind feedback, query, increase precision, sometimes can damage precision
- want to do query feedback when it is a hard query
- what features should exist if we want feedback
- when the user says a doc is relevant or not / gives a score in some range
- query expansion
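
One standard way to turn explicit feedback into an expanded query is a Rocchio-style update. The notes do not name a specific method, so this is swapped in as an illustration only; the alpha, beta and gamma values are common textbook defaults and the term weights are invented:

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style update on sparse {term: weight} vectors."""
    terms = set(query)
    for doc in relevant + non_relevant:
        terms.update(doc)
    new_query = {}
    for term in terms:
        # Mean weight of the term in the judged relevant / non-relevant docs.
        pos = sum(d.get(term, 0.0) for d in relevant) / len(relevant) if relevant else 0.0
        neg = sum(d.get(term, 0.0) for d in non_relevant) / len(non_relevant) if non_relevant else 0.0
        weight = alpha * query.get(term, 0.0) + beta * pos - gamma * neg
        if weight > 0:  # terms with non-positive weight are dropped
            new_query[term] = weight
    return new_query

# Hypothetical query and user judgements.
query = {"jaguar": 1.0}
relevant = [{"jaguar": 0.8, "cat": 0.6}, {"jaguar": 0.5, "wildlife": 0.7}]
non_relevant = [{"jaguar": 0.9, "car": 0.8}]
expanded = rocchio(query, relevant, non_relevant)
```

The expanded query picks up "cat" and "wildlife" from the relevant documents and leaves out "car". With blind feedback the "relevant" set is just the top-ranked documents, which is why it can damage precision when the initial ranking is poor.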

Q3. Alternatives to Vector Space
- evaluation - precision, recall, precision recall graphs
- apply learning


Q4
- similarity - borrow ideas from more than one component in the course
- clustering approaches
- recommendation approaches
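
Similarity is the piece shared by retrieval, clustering and recommendation. A minimal cosine similarity sketch over sparse vectors, shown here on made-up user ratings (the same function would work on document term weights):

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse {key: weight} vectors."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical users as vectors of item ratings (recommendation view).
alice = {"item1": 5, "item2": 3, "item3": 4}
bob = {"item1": 4, "item2": 2, "item3": 5}
carol = {"item4": 5, "item5": 1}
sim_ab = cosine(alice, bob)
sim_ac = cosine(alice, carol)
```

Alice and Bob rate the same items similarly, so their similarity is close to 1; Alice and Carol share no items, so theirs is 0. In a user-based recommender, the most similar users would then be used to predict missing ratings.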