Information Retrieval

Course ID
CEID_NE5597
Department
Division of Computer Software
Level
Undergraduate
Professor
MAKRIS CHRISTOS
Semester
Winter
ECTS
5
  • Introductory notions (user modeling, document logical representation, retrieval process).
  • Performance evaluation metrics (recall, precision, average precision, R-precision, precision histograms, NDCG metric, harmonic mean, user-oriented metrics).
  • Information retrieval modeling.
  • Set-oriented models (boolean models, fuzzy set model, extended boolean model), algebraic models (vector space models, latent semantic indexing model, topic models), probabilistic models (classical and language models).
  • Web information retrieval and its peculiarities.
  • Web search engines (crawler, indexer). The HITS algorithm (Hyperlink-Induced Topic Search). The Google search engine (the PageRank metric). The SALSA algorithm and variants for web search.
  • Machine Learning Techniques in Information Retrieval (Learning to Rank, Language Models, vector representations of words (word embeddings such as word2vec with its CBOW and skip-gram variants), LSTM, Transformers, BERT, GPT).
  • Indexing structures (inverted files, signature files, bitmaps).
  • Storage Techniques in Distributed Information Retrieval (MapReduce, Apache Spark).
  • Full indexing structures in main memory (suffix trees, suffix arrays, directed acyclic word graphs (DAWG) for strings) and in secondary memory (supra-suffix array, prefix B-tree, string B-tree).
  • Compression algorithms for text and for indexing structures.
  • Text Mining.