Scalable Understanding of Multilingual Media

Media monitoring enables the global news media to be viewed in terms of emerging trends, people in the news, and the evolution of storylines. The massive growth in the number of broadcast and Internet media channels means that current approaches can no longer cope with the scale of the problem. The aim of SUMMA is to significantly improve media monitoring by creating a platform that automates the analysis of media streams across many languages, aggregates and distils the content, automatically creates rich knowledge bases, and provides visualisations to cope with this deluge of data. SUMMA has six objectives: (1) development of a scalable and extensible media monitoring platform; (2) development of richer, high-quality tools for analysts and journalists; (3) extensible automated knowledge base construction; (4) multilingual and cross-lingual capabilities; (5) a sustainable, maintainable platform and services; (6) dissemination and communication of project results to stakeholders and user groups. Achieving these objectives will require advancing the state of the art in a number of technologies: multilingual stream processing, including speech recognition, machine translation, and story identification; entity and relation extraction; natural language understanding, including deep semantic parsing, summarisation, and sentiment detection; and rich visualisations based on multiple views and dealing with many data streams. The project will focus on three use cases: (1) external media monitoring: intelligent tools to address the dramatically increased scale of the global news monitoring problem; (2) internal media monitoring: managing content creation in several languages efficiently by ensuring that content created in one language is reusable in all other languages; (3) data journalism. The outputs of the project will be field-tested at partners BBC and DW, and the platform will be further validated through innovation intensives such as the BBC NewsHack.
University of Edinburgh
British Broadcasting Corporation, Deutsche Welle, Idiap Research Institute, LETA, Priberam Informatica S.A., Qatar Computing Research Institute, University College London
H2020
Feb 01, 2016
Jan 31, 2019