"An Odysseus internship affords undergraduate students from around the world with an interest in Computer Science the opportunity to undertake exciting and fun research in a distributed yet cooperative environment."
Most Machine Translation (MT) systems presuppose textual input. However, with advances in speech recognition and speech synthesis, Spoken Language Translation (SLT) systems are beginning to emerge with impressive results. This collaborative three-year project is concerned with developing a series of prototypes which integrate speech recognition and speech synthesis engines developed at UCD with two different machine translation systems developed at DCU. This will be done on an incremental basis for three different language pairs (one language pair per internship year). Both the speech recognition and the speech synthesis systems will assume a limited domain, decided year-on-year by the supervisors, so that problems associated with unrestricted vocabularies and varied speaker types will not arise. An HMM-based phoneme/word recognition model will underlie the speech recognition interface, and speech output will be generated by a state-of-the-art unit-selection synthesizer. The recognition and synthesis engines will be coupled with DCU's Example-Based MT (EBMT) system and with a Statistical MT (SMT) system constructed at DCU from freely available tools, including:
* Giza++, to extract the word-level correspondences;
* the SRI language modelling toolkit;
* the Pharaoh phrase-based SMT decoder.
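The coupling described above (recognition, then translation, then synthesis) can be pictured as a three-stage pipeline in which the MT component is interchangeable. The following is a minimal sketch only: the class `SLTPipeline`, the stage signatures and the toy stand-in functions are all hypothetical, since the actual interfaces of the UCD and DCU engines are not described here.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; the real engines' interfaces are assumptions.
Recognizer = Callable[[bytes], str]    # audio -> source-language text
Translator = Callable[[str], str]      # source-language text -> target-language text
Synthesizer = Callable[[str], bytes]   # target-language text -> audio


@dataclass
class SLTPipeline:
    """Chains recognition, translation and synthesis into one call."""
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def run(self, audio: bytes) -> bytes:
        source_text = self.recognize(audio)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)


# Toy stand-ins so the sketch runs; they do no real speech or MT processing.
def toy_recognizer(audio: bytes) -> str:
    # Pretend the "audio" is already UTF-8 text.
    return audio.decode("utf-8")


TOY_LEXICON = {"hello": "bonjour", "world": "monde"}


def toy_translator(text: str) -> str:
    # Word-for-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_LEXICON.get(word, word) for word in text.split())


def toy_synthesizer(text: str) -> bytes:
    # Pretend the text bytes are the synthesized waveform.
    return text.encode("utf-8")


pipeline = SLTPipeline(toy_recognizer, toy_translator, toy_synthesizer)
result = pipeline.run(b"hello world")
```

Because only the `translate` stage differs between the two configurations, an EBMT-backed and an SMT-backed translator could in principle be swapped in while the recognition and synthesis stages stay fixed, which is what makes a like-for-like comparison of the two MT systems possible.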
The project will involve four students (two at UCD and two at DCU) working on the speech recognition engine, the speech synthesis engine, the EBMT system and the SMT system respectively. Student 1 at UCD will build dedicated speech recognition engines for a limited domain of the two languages of the pair using the models which have been developed by the UCD team. Student 2, also at UCD, will build unit-selection speech synthesisers for the two languages of the pair using the techniques developed by the UCD group. Student 3 at DCU will adapt the DCU EBMT system to the language pairs and domains of the project, while Student 4 at DCU will do the same for the SMT system, including optimisation according to a number of distinct parameters.
Each year the internship will run as follows: after an initial joint meeting, the first four weeks of the internship will be devoted to introducing individual students to the relevant knowledge sources and techniques for their specific task. Each student will work closely with PhD and post-doctoral researchers at their local institution, gaining hands-on experience. During this period, joint meetings will take place on a weekly basis. Weeks 5 to 9 will be devoted to collaborative work, in which the students will work jointly as a group to build the prototype for that particular language pair. Part of this period will also be devoted to augmenting existing corpora with annotations relevant to this particular task. A thorough evaluation of the prototype, using standard evaluation metrics for both speech technology and machine translation, will be undertaken during the final three weeks of the internship.
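The text does not name the evaluation metrics to be used. As an illustration only, word error rate (WER) is one widely used measure for speech recognition output: the word-level edit distance between a reference transcript and the recogniser's hypothesis, divided by the number of reference words. The sketch below assumes whitespace tokenisation and equal cost for substitutions, insertions and deletions; the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref_words)


# One substitution ("sat" -> "sit") and one deletion ("the") over six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

In a limited-domain setting such as the one assumed here, a score of this kind could be computed per test utterance and averaged, alongside whatever MT metrics the supervisors select.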
Relevance of Project to the Host Laboratories:
It can be anticipated that in the very near future, most input to the computer will be via speech, as opposed to text. The DCU state-of-the-art EBMT system currently deals with textual input only, and this project can be seen as a catalyst for the long-term goal of migrating our system to handle spoken language input. Most recent successes in the area of SLT have been achieved using SMT systems. We have experience of such approaches, reported in a number of recent papers which combine our leading EBMT system with state-of-the-art SMT systems in a set of novel hybrid MT systems. We wish to compare how effective our EBMT system is at processing spoken language input against a leading SMT system. The project also provides the framework for evaluating the UCD speech recognition and speech synthesis engines in the context of spoken language translation. Furthermore, as a joint project between two of the major partners in two of the core technologies, this proposal will strengthen the CSET in speech and language technology submitted recently to SFI.
Supervisors:
Dr. Andy Way (NCLT, DCU, 2 students) and Dr. Julie Berndsen (Computer Science and Informatics, UCD, 2 students)
Keywords:
Spoken Language Translation, Machine Translation, Speech Recognition, Speech Synthesis, Example-Based Machine Translation, Statistical Machine Translation