Artificial Intelligence · 2020-02-01

Google introduces open-source text-generation method, “LaserTagger” – AI


To date, sequence-to-sequence (seq2seq) models have been the “tool of choice” in machine translation and for various other text-generation tasks, including summarization, sentence fusion & grammatical error correction.

But the use of seq2seq models for text generation has some drawbacks depending on the use case, such as producing outputs that are not supported by the input text (known as hallucination) & requiring large amounts of training data to reach good performance. What’s more, such models are inherently slow at inference time, since they typically generate the output word-by-word.

Now, in a paper, “Encode, Tag, Realize: High-Precision Text Editing,” the Google AI research team has put forth an open-source method for text generation designed specifically to address these shortcomings. Called LaserTagger, it is speedy & precise, they say. Instead of generating the output text from scratch, LaserTagger produces output by tagging input words with predicted edit operations, which are then applied to the input in a separate realization step. This, Google claims, is a less error-prone way of tackling text generation, and one that can be handled by a model architecture that is easier to train & faster to execute.
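To make the tag-then-realize idea concrete, here is a minimal sketch in Python. It is an illustration of the general technique, not Google’s implementation: the tag vocabulary (“KEEP”, “DELETE”, optionally combined with an added phrase as in “DELETE|and”) and the `realize` function are simplified assumptions. In LaserTagger, a model predicts one such tag per input word; the realization step below then deterministically applies them.

```python
def realize(tokens, tags):
    """Apply one edit tag per source token to produce the output text.

    Tags are "KEEP" or "DELETE", optionally suffixed with "|phrase",
    meaning: insert `phrase` before handling the token.
    """
    out = []
    for token, tag in zip(tokens, tags):
        base, _, phrase = tag.partition("|")
        if phrase:               # insert the added phrase first
            out.append(phrase)
        if base == "KEEP":       # copy the source token through
            out.append(token)
        # "DELETE" contributes nothing beyond an optional added phrase
    return " ".join(out)

# Sentence fusion example: merge two sentences into one.
src = "Turing was born in 1912 . Turing died in 1954 .".split()
tags = ["KEEP"] * 5 + ["DELETE|and", "DELETE"] + ["KEEP"] * 4
print(realize(src, tags))
# → Turing was born in 1912 and died in 1954 .
```

Because the output is mostly copied from the input, the model only has to learn a small vocabulary of edit tags rather than generate every word, which is what makes the approach both data-efficient and hard to “hallucinate” with.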

LaserTagger offers many advantages when applied at large scale, such as improving the formulation of voice answers in some services by making responses shorter & less repetitive. Also, the high inference speed allows the model to be plugged into an existing technology stack without adding any noticeable latency on the user side, while the improved data efficiency enables the collection of training data for many languages, thus benefiting users from different language backgrounds, said Google.

via: Google AI blog
