Concepts of Information Retrieval (and Web Search) -- Fall 2012
THIS COURSE IS CROSS-LISTED; IF ONE SECTION IS FULL, PLEASE ENROLL IN ONE OF THE OTHER SECTIONS. Computer Science students will receive the
same credit toward CS requirements regardless of which section they enroll in. If the CS listing is full, enroll in another listing and
email firstname.lastname@example.org to request the CS major credit.
Instructor: Matt Lease
Day and Time: Fridays 1-4pm
Location: UTA 1.208 (at the iSchool)
Unique IDs: 28555
Course Blogs (Subscribe to an aggregated blog using Google Reader or a similar RSS reader)
Previous offerings: Fall 2011 · Fall 2010
Prerequisites: No prior knowledge of IR or programming expertise is required; all interested and
motivated students are invited to attend. This course typically attracts significant student
participation across a wide variety of disciplines: information science, computer science, linguistics, electrical engineering, and design studies. Course activities are
intended to serve the needs of both (1) those studying to work professionally on search engines or conduct research in IR, and (2) non-specialists interested in gaining
broader exposure to and understanding of IR methods and systems.
Textbook: none required; all readings are online
Graduate-level course: undergraduate seniors may enroll only with instructor permission.
In an Information Age promising instant access to seemingly limitless digital information, search has become a ubiquitous paradigm for enabling information access.
However, creating an effective search engine requires meeting a variety of important practical challenges:
- Characterizing the nature of search relevance (both topical and user-oriented)
- Defining operational models for ranking based on relevance
- Developing search algorithms which are accurate, scalable, and efficient
- Designing interfaces, interaction mechanisms, and graphical visualizations providing an engaging and user-friendly search experience (human-computer interaction and visualization)
- Understanding trade-offs between system-oriented and user-oriented methods for search evaluation
Information Retrieval (IR) studies both human information needs and the systems built to meet those
needs. As such, IR has lain squarely at the intersection of Information Science and Computer Science since its inception. IR studies methods for capturing,
representing, storing, organizing, and retrieving unstructured or loosely structured digital information, as well as designing interface, interaction, and visualization
methods for creating an effective and compelling search experience. While digital information was once restricted to electronic documents, today's landscape of digital content is
incredibly rich and diverse, including Web pages, news articles, books, transcribed speech, email, blogs (and micro-blogs), images, and video. The rise of the Web as a
massive, global repository and distribution network has earned Web search engines and other Web technologies particular importance in organizing and finding information.
The course will culminate in an end-of-semester paper and presentation of the
final course project to disseminate and showcase the work.
Course objectives:
- Provide broad exposure to the field of IR via weekly readings and in-class discussion of IR topics
- Develop expertise and first-hand experience in a particular specialized topic of IR via a course project conducted individually or in small groups (depth)
- Develop professional skills essential for both research and non-research activities:
  - Efficient and critical reading of published scientific literature
  - Making effective presentations: in-depth and elevator pitch
  - Verbally communicating scientific knowledge across disciplines (and potentially to the public at-large)
  - Performing online literature search for prior scholarly work
  - Organizing and managing a project
  - Scientific writing for scholarly publication
Coursework:
- Read several published IR papers each week and write a short critique of the week's readings to (1) summarize key ideas, (2) identify main contributions, and (3) discuss strengths/weaknesses
- Once or twice a semester, present an assigned reading to the class and moderate discussion
- Develop a major course project, individually or in small groups, with the goal of publishing the work. The instructor will advise the work and expect students to engage in independent exploration of concepts and execution of tasks. Meta-ideas:
  - Term paper: write a survey on state-of-the-art practice in a specialized area of IR (review and synthesis of published scientific literature)
  - Algorithm: implement and evaluate a new search algorithm
  - Analysis: present a novel analysis of one or more existing IR systems
  - User-centered evaluation: evaluate IR system effectiveness via user-oriented qualitative and/or quantitative methods (e.g. interactive IR, task-completion accuracy and/or times, usability issues, affective perceptions)
  - Human-computer interaction: design a new search interface, implement/mock-up, and evaluate
  - Visualization: design a new graphical visualization method for conveying search results or managing information overload
  - Crowdsourced evaluation: explore crowdsourcing methods for informing or evaluating search engines
  - Mobile IR: develop a mobile IR application using our pool of Google Android phones
Want to publish original research? In every previous offering of the course, several of the best,
most innovative course projects have been extended beyond the semester until the work was in publishable form.
If you have a great idea and are willing to work hard to get it published, the course project provides a great
opportunity to refine the idea and get started developing the project with regular feedback and advising from
the instructor. Examples of previous course projects which led to publications:
- Ramona Broussard, Yongyi Zhou, and Matthew Lease. Mobile Phone Search for Library Catalogs. In Proceedings of
the 73rd Annual Meeting of the American Society for Information Science and Technology (ASIS&T), 2010.
- Ramona Broussard, Yongyi Zhou, and Matthew Lease. University of Texas Mobile Library Search. In
Proceedings of the 73rd Annual Meeting of the American Society for Information Science and Technology (ASIS&T), 2010.
- Lu Guo and Matthew Lease. Personalizing Local Search with Twitter. In Workshop on Enriching Information
Retrieval (ENIR) at the 34th Annual ACM SIGIR Conference, 2011.
- Adriana Kovashka and Matthew Lease. Human and Machine Detection of Stylistic Similarity in Art. In
Proceedings of the 1st Annual Conference on the Future of Distributed Work (CrowdConf), San Francisco,
- Abhimanu Kumar and Matthew Lease. Learning to Rank From a Noisy Crowd. In Proceedings of the 34th Annual
ACM SIGIR Conference, 2011.
- Abhimanu Kumar and Matthew Lease. Modeling Annotator Accuracies for Supervised Learning. In Proceedings
of the Workshop on Crowdsourcing for Search and Data Mining (CSDM) at the Fourth ACM International Conference on
Web Search and Data Mining (WSDM), pages 19-22, Hong Kong, China, February 2011.
- Elben Shira and Matthew Lease. Expert Search on Code Repositories. Technical Report TR-11-42, Department
of Computer Science, University of Texas at Austin, December 2011.
- Shilpa Shukla, Matthew Lease, and Ambuj Tewari. Parallelizing ListNet Training using Spark. In Proceedings of the 35th international ACM SIGIR conference on Research and Development in Information Retrieval, 2012.
- Aibo Tian and Matthew Lease. Active Learning to Maximize Accuracy vs. Effort in Interactive Information
Retrieval. In Proceedings of the 34th international ACM SIGIR conference on Research and Development in
Information Retrieval, pages 145-154, 2011.
- Yongyi Zhou, Ramona Broussard, and Matthew Lease. Mobile options for online public access catalogs. In
Proceedings of the iConference, pages 598-605. ACM, 2011.
Related courses: UT Austin
- Undergraduate: CS 371R: Information Retrieval and Web Search
- Graduate: EE380L: Data mining
Related courses: other universities
A new version of the Baeza-Yates book is forthcoming.