Session 1

Session 1: 10.30 am - 12.30 pm
Session chair: Xavier Anguera Miro (Telefónica)

10.30 - 10.55 am
OPENING REMARKS AND INTRODUCTION - The Need for Music Information Retrieval with User-Centered and Multimodal Strategies

Cynthia Liem (Delft University of Technology); Meinard Müller (Saarland University & MPI Informatik); Douglas Eck (Google Inc.); George Tzanetakis (University of Victoria); Alan Hanjalic (Delft University of Technology)

Music is a widely enjoyed content type, existing in many multifaceted representations. With the digital information age, a lot of digitized music information has theoretically become available at the user's fingertips. However, the abundance of information is too large-scale and too diverse to annotate, oversee and present in a consistent and human manner, motivating the development of automated Music Information Retrieval (Music-IR) techniques.

In this paper, we encourage considering music content beyond a monomodal audio signal and argue that Music-IR approaches with multimodal and user-centered strategies are necessary to serve real-life usage patterns and to maintain and improve the accessibility of digital music data. After discussing relevant existing work in these directions, we show that the field of Music-IR faces challenges similar to those of neighboring fields, and thus suggest opportunities for collaboration and mutual inspiration.

10.55 - 11.20 am
Affective Content Analysis of Music Video Clips
Ashkan Yazdani (École Polytechnique Fédérale de Lausanne); Krista Kappeler (École Polytechnique Fédérale de Lausanne); Touradj Ebrahimi (École Polytechnique Fédérale de Lausanne)

Nowadays, the amount of multimedia content is increasing explosively, and it is often challenging to find content that is appealing or that matches a user's current mood or affective state. To achieve this goal, an efficient indexing technique should be developed to annotate multimedia content so that these annotations can be used in a retrieval process with an appropriate query. One approach to such indexing is to determine the affect (type and intensity) that can be induced in a user while consuming multimedia. In this paper, affective content analysis of music video clips is performed to determine the emotion they can induce in people. To this end, a subjective test was developed in which 32 participants watched different music video clips and assessed their induced emotions. These self-assessments were used as ground truth, and the results of classification using audio, visual and audiovisual features extracted from the music video clips are presented and compared.

11.20 - 11.45 am
A Tempo-Sensitive Music Search Engine With Multimodal Inputs
Yu Yi (National University of Singapore); Yinsheng Zhou (National University of Singapore); Ye Wang (National University of Singapore)

This paper presents TMSE: a novel Tempo-sensitive Music Search Engine with multimodal inputs for wellness and therapeutic applications. TMSE integrates six different interaction modes (Query-by-Number, Query-by-Sliding, Query-by-Example, Query-by-Tapping, Query-by-Clapping, and Query-by-Walking) into a single interface to narrow the intention gap when a user searches for music by tempo. Our preliminary evaluation results indicate that the multimodal inputs of TMSE enable users to formulate tempo-related queries more easily than with existing music search engines.

11.45 am - 12.30 pm
KEYNOTE - Audiovisual Archive Exploitation in the Networked Information Society
Roeland Ordelman (Netherlands Institute for Sound and Vision & University of Twente)

Safeguarding the massive body of audiovisual content, including rich music collections, in audiovisual archives and enabling access for various types of user groups is a prerequisite for unlocking the social-economic value of these collections. Data quantities and the need for specific content descriptors, however, force archives to re-evaluate their annotation strategies and access models, and to incorporate technology in the archival workflow. It is argued that this can only be done successfully if user requirements are studied well and new approaches are introduced in a well-balanced manner, fitting in with traditional archival perspectives, and if the archivist is brought into the technology loop by means of education and by deploying hybrid workflows for technology-aided annotation.

