INF350E/INF384H/CS395T: Information Retrieval (and Web Search)
The University of Texas at Austin

Fall 2018

   

THIS COURSE IS CROSS-LISTED; IF ONE SECTION IS FULL, PLEASE ENROLL IN THE OTHER. All students will receive the same credit toward graduation requirements regardless of which section they enroll in.

ON THE WAITLIST? I will do my best to ensure that any graduate student who wants to be in the class can enroll. Show up the first day of class and I will probably be able to get you in.

Notes for Computer Science (CS) students:

Instructor: Matt Lease (regarding subject matter expertise, see publications)
Day and Time: Thursdays 3-5:50pm (note change in originally scheduled class time)
Location: UTA 1.208 (at the iSchool)
Unique IDs: 27608 (INF350E) · 27700 (INF384H) · 51870 (CS395T)

Forthcoming...
Syllabus
Course Schedule (weekly readings & assignments)

Previous offerings: Fall 2016 (deep learning focus) · Fall 2014 (likely most similar to 2018 offering) · Fall 2013 · Fall 2012 · Fall 2011 · Fall 2010 · Spring 2010

Research seminar: This class focuses on reading, analysis, and discussion of recently published research articles in Information Retrieval (and Web Search). If you are not interested in reading, analyzing, and discussing research articles, this class will not be a good fit for you.

Mixed Graduate/Undergraduate course. This is primarily a graduate-level class, but a small number of upper-level undergraduates are being allowed to enroll via the INF350E listing. Expectations will be for graduate-level work.

Prerequisites: There are no prerequisites, but many readings will employ mathematics, such as statistics and probability, as well as computer algorithms and programming. It is okay if you do not fully understand technical details as long as you make an effort to get what you can out of each reading and class discussion. Course projects typically involve programming, but students without programming expertise can instead write a literature review paper. All sufficiently motivated students are invited to attend. This course typically attracts significant student participation across a wide variety of disciplines: information science, computer science, linguistics, electrical engineering, and design studies. Course activities are intended to serve the needs of both (1) those studying to work professionally on search engines or conduct research in IR, and (2) non-specialists interested in gaining broader exposure to and understanding of IR methods and systems.

Textbook: none required, all readings online

Want to publish original research?

In every previous offering of the course, several of the best, most innovative course projects have been extended beyond the semester until the work was in publishable form. If you have a great idea and are willing to work hard to get it published, the course project provides a great opportunity to refine the idea and get started developing the project with regular feedback and advising from the instructor. Examples of previous course projects which led to published papers include:

Full papers

  • Kezban Dilek Onal, Ye Zhang, Ismail Sengor Altingovde, Md Mustafizur Rahman, Pinar Karagoz, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert, Edward Banner, Vivek Khetan, Tyler McDonnell, An Thanh Nguyen, Dan Xu, Byron C. Wallace, Maarten de Rijke, and Matthew Lease. Neural Information Retrieval: At the End of the Early Years. Information Retrieval, 2018.
  • Yinglong Zhang, Jin Zhang, Matthew Lease, and Jacek Gwizdka. Multidimensional Relevance Modeling via Psychometrics and Crowdsourcing. In Proceedings of the 37th international ACM SIGIR conference on Research and Development in Information Retrieval, pages 435-444, 2014.
  • Ripon Saha, Matthew Lease, Sarfraz Khurshid, and Dewayne Perry. Improving Bug Localization using Structured Information Retrieval. In Proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2013.
  • Aibo Tian and Matthew Lease. Active Learning to Maximize Accuracy vs. Effort in Interactive Information Retrieval. In Proceedings of the 34th international ACM SIGIR conference on Research and Development in Information Retrieval, pages 145-154, 2011.
  • Yongyi Zhou, Ramona Broussard, and Matthew Lease. Mobile options for online public access catalogs. In Proceedings of the iConference, pages 598-605. ACM, 2011.

Short papers

  • Xi Zheng, Akanksha Bansal, and Matthew Lease. Bullseye: Structured Passage Retrieval and Document Highlighting for Scholarly Search. In The Thirteenth Asia-Pacific Conference on Conceptual Modelling (APCCM), held as part of the Australasian Computer Science Week (ACSW) Multiconference, 2017. 4 pages.
  • Shilpa Shukla, Matthew Lease, and Ambuj Tewari. Parallelizing ListNet Training using Spark. In Proceedings of the 35th international ACM SIGIR conference on Research and Development in Information Retrieval, 2012.
  • Lu Guo and Matthew Lease. Personalizing Local Search with Twitter. In Workshop on Enriching Information Retrieval (ENIR) at the 34th Annual ACM SIGIR Conference, 2011.
  • Abhimanu Kumar and Matthew Lease. Learning to Rank From a Noisy Crowd. In Proceedings of the 34th Annual ACM SIGIR Conference, 2011.
  • Abhimanu Kumar and Matthew Lease. Modeling Annotator Accuracies for Supervised Learning. In Proceedings of the Workshop on Crowdsourcing for Search and Data Mining (CSDM) at the Fourth ACM International Conference on Web Search and Data Mining (WSDM), pages 19-22, Hong Kong, China, February 2011.
  • Elben Shira and Matthew Lease. Expert Search on Code Repositories. Technical Report TR-11-42, Department of Computer Science, University of Texas at Austin, December 2011.
  • Ramona Broussard, Yongyi Zhou, and Matthew Lease. Mobile Phone Search for Library Catalogs. In Proceedings of the 73rd Annual Meeting of the American Society for Information Science and Technology (ASIS&T), 2010.
  • Ramona Broussard, Yongyi Zhou, and Matthew Lease. University of Texas Mobile Library Search. In Proceedings of the 73rd Annual Meeting of the American Society for Information Science and Technology (ASIS&T), 2010.

Whether or not you submit your paper for peer review, your final paper can be posted online as a citable technical report.

Looking for a funded Research Assistant (RA) position? I typically do not offer RA positions until a student has taken a course with me and demonstrated their abilities and drive to succeed. While the availability of an RA position depends on available funding, I am often looking for new RAs to help me build the next generation of search engines.


Overview

In an Information Age promising instant access to seemingly limitless digital information, search has become the dominant paradigm for enabling information access.

Creating an effective search engine, however, requires addressing many important challenges:

  • Characterizing the nature of search relevance (both topical and user-oriented)
  • Defining operational models for ranking based on relevance
  • Developing search algorithms which are accurate, scalable, and efficient
  • Designing interfaces, interaction mechanisms, and graphical visualizations providing an engaging and user-friendly search experience (human-computer interaction and visualization)
  • Understanding trade-offs between system-oriented and user-oriented methods for search evaluation

Information Retrieval (IR) studies both human information needs and the systems built to meet those needs. As such, IR has lain squarely at the intersection of Information Science and Computer Science since its inception. IR studies methods for capturing, representing, storing, organizing, and retrieving unstructured or loosely structured digital information, as well as designing interface, interaction, and visualization methods for creating an effective and compelling search experience. While digital information was once restricted to electronic documents, today's landscape of digital content is incredibly rich and diverse, including Web pages, news articles, books, transcribed speech, email, blogs (and micro-blogs), images, and video. The rise of the Web as a massive, global repository and distribution network has earned Web search engines and other Web technologies particular importance in organizing and finding information today.

The course will culminate in an end-of-semester paper and presentation of the final course project to disseminate and showcase the work.

Meta-ideas for course projects:

  • Literature Review: write a survey on state-of-the-art practice in a specialized area of IR (review and synthesis of published scientific literature)
  • Algorithm: implement and evaluate a new search algorithm
  • Analysis: present a novel analysis of one or more existing IR systems
  • User-centered evaluation: evaluate IR system effectiveness via user-oriented qualitative and/or quantitative methods (e.g. interactive IR, task-completion accuracy and/or times, usability issues, affective perceptions, etc.)
  • Human-computer interaction: design a new search interface, implement/mock-up, and evaluate
  • Visualization: design a new graphical visualization method for conveying search results or managing information overload
  • Crowdsourced evaluation: explore crowdsourcing methods for informing or evaluating search engines
  • Mobile IR: develop a mobile IR application using our pool of Google Android phones
  • Develop an interesting IR application using our pool of GoogleTV devices


Related courses: UT Austin
  • Undergraduate: CS 371R: Information Retrieval and Web Search
  • Graduate: EE380L: Data mining

Related courses: other universities

Reference Textbooks

Other References