Kenneth Heafield (University of Edinburgh) “Faster Neural Machine Translation”

When:

May 6, 2019 @ 12:00 pm – 1:15 pm

Where:

Hackerman Hall 320
3400 N. Charles Street
Baltimore
MD 21218

Cost:

Free

Abstract

The Marian toolkit dominated a shared task on translation speed run by the Workshop on Neural Machine Translation. Speed came from improvements at many levels: model complexity, teacher-student compression, and efficient kernels. Compressing the model is particularly important because memory bandwidth is the limiting factor on GPUs with tensor cores and on CPUs. I wrote 8-bit integer multiplication in AVX512 intrinsics, which reduced translation latency by 2.7x, and we are now looking at 4 bits. Much of the work on systems for ML addresses vision tasks; large parameter skew and variable-size input make sequential models difficult and interesting.
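
For readers unfamiliar with the idea, the following is a minimal sketch of what 8-bit integer multiplication involves, written in plain Python rather than AVX512 intrinsics. It is not the Marian implementation: the function names `quantize` and `int_dot`, the single shared scale per vector, and the sample values are all assumptions made for illustration. Floats are mapped to small signed integers, products are accumulated as integers, and the result is rescaled at the end.

```python
def quantize(values, bits=8):
    """Map floats to signed integers using one shared scale (illustrative scheme)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def int_dot(qa, scale_a, qb, scale_b):
    """Dot product accumulated in integers, then rescaled back to a float."""
    acc = sum(x * y for x, y in zip(qa, qb))
    return acc * scale_a * scale_b


# Hypothetical sample data, not taken from any real model.
weights = [0.12, -0.5, 0.33, 0.9]
activations = [1.0, 0.25, -0.75, 0.5]

qw, sw = quantize(weights)
qa, sa = quantize(activations)

approx = int_dot(qw, sw, qa, sa)
exact = sum(w * a for w, a in zip(weights, activations))
print(f"exact={exact:.4f} approx={approx:.4f}")
```

Each 8-bit value occupies a quarter of the storage of a 32-bit float, so fewer bytes have to move through memory per multiplication, at the cost of a small rounding error. Passing `bits=4` to `quantize` shows how the error grows as precision drops further.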

Biography

Kenneth Heafield is a Lecturer (which translates to en-US as Assistant Professor) leading a machine translation group at the University of Edinburgh. He works on efficient neural networks, low-resource translation, mining petabytes for translations, and occasionally grammatical error correction.

Center for Language and Speech Processing