This document is a proposal for work on neural machine translation (MT) and automatic speech recognition (ASR) technology by the Idiap Research Institute. The six tasks of the proposed project are:
A. Data identification. Review and list the available written or spoken corpora of Swiss languages (specified in Section 2) including their licenses of use. Gather the corpora and explore possibilities for collecting additional data from social media and possibly from Swisscom units. Both monolingual and multilingual parallel corpora are of interest. Pilot experiments for collecting more data will be considered depending on time.
B. Install, configure and train machine translation systems: namely, neural MT (NMT) following the state of the art in the field (e.g. Edinburgh and Montréal groups), and phrase-based statistical MT (PBSMT, Moses) for comparison purposes. The systems will be based on third-party software, generally “open source”, and might involve data preparation and configuration scripts written by Idiap. Explore the use of Idiap’s GPUs to speed up NMT training.
C. Benchmark MT systems: run and test the MT systems, indicate the best configurations that are found, compare scores with the state of the art, and also between NMT and PBSMT. The benchmarking will consider translation quality and coverage, but also speed and use of computing resources.
D. Create Swiss German ASR: follow a logical progression from standard German through Swiss-accented German to Swiss dialect. The ASR system will initially be based on Kaldi, although the machine learning parts will be harmonized with those in the other parts of the project.
E. Investigate incrementally adaptive ASR: This is a research thread aimed at designing neural methods that can adapt incrementally to incoming data, as is the situation with dialect data at Swisscom.
F. Document the tasks above: data sources, system installation/configuration/training, and benchmarking results. The documentation will include tutorials that will allow Swisscom teams to reproduce the systems and build new systems upon them to suit specific internal needs.
The research plan below presents the tasks and expected outcomes in some detail, for a total of six months. The financial support should cover the full-time salary of two postdocs for the six months.
Below, we first present the linguistic context and motivate the choice of languages for the project. Then we provide a more detailed description of the tasks. We also present the past work of Idiap’s NLP and Speech groups. The document ends with considerations on the collaboration procedures between Swisscom and Idiap for data sharing, related projects, intellectual property rights, and the workforce needed.