Artificial Intelligence · 2020-02-11

MIT’s TextFooler can trick NLP models

Of late, fake news, online racist rants & incitements to violence or sexual misconduct, especially on large, busy social media sites, have forced networks like Facebook, Twitter & even Google to build algorithms that detect the sentiment of sentences in order to control & eliminate such content.

Natural Language Processing (NLP) algorithms are used to train Machine Learning (ML) models to ‘understand’ the meaning of sentences the way a human would. BERT (Bidirectional Encoder Representations from Transformers), developed by Google, is one such model.
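For context, a BERT-style sentiment classifier of the kind such platforms rely on can be exercised in a few lines. The snippet below is a generic sketch, not something referenced in the article, using the Hugging Face transformers library; it downloads a default fine-tuned model on first use.

```python
# Generic sketch (not from the article): sentence-level sentiment with a
# BERT-style model via the Hugging Face "transformers" library.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default fine-tuned model

print(classifier("The support team resolved my issue quickly."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]
```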

Rolled out to Google’s Search Engine in October 2019, BERT is set to transform search for the better, say SEO experts. It is equipped to extract the meaning of a sentence in the manner of human understanding.

Now, a team of scientists from MIT believes it has developed an algorithm that generates adversarial text capable of fooling some of the most well-developed ML & NLP models currently in use, including Google’s BERT.

Scientists Di Jin, Zhijing Jin, Joey Tianyi Zhou & Peter Szolovits, from MIT, the University of Hong Kong & A*STAR in Singapore, have developed TextFooler, an adversarial text attack model, which they’ve released as an open-source project on GitHub in an effort to highlight the flaws in the major NLP models currently in use.

By analyzing text, TextFooler goes through a process of replacing & juggling words in a sentence while maintaining the original meaning & grammatical syntax. It attacks NLP models by converting specific parts of a given sentence to ‘fool’ them into making wrong predictions. It targets 2 principal tasks, text classification & textual entailment, and its word substitutions aim to flip the classification or invalidate the entailment judgment of the target NLP models.
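To make the mechanics concrete, here is a minimal sketch, with an invented stand-in classifier rather than a real NLP model, of the kind of word-importance scoring such an attack can start from: each word is removed in turn, and the drop in the model’s confidence tells the attacker which words matter most. None of the names below come from the TextFooler source.

```python
# Minimal sketch (assumptions, not the TextFooler source): rank words by how
# much deleting each one lowers a model's confidence in its prediction.
import math

POSITIVE = {"great", "excellent", "enjoyable"}  # toy sentiment lexicon


def classify(text):
    """Stand-in for a real NLP model: P(positive) from a keyword count."""
    score = sum(word in POSITIVE for word in text.lower().split())
    return 1 / (1 + math.exp(-score))


def word_importance(sentence):
    """Score each word by the confidence drop caused by deleting it."""
    words = sentence.split()
    base = classify(sentence)
    scores = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        scores.append((round(base - classify(reduced), 3), word))
    # The most influential words are the best targets for substitution.
    return sorted(scores, reverse=True)


print(word_importance("The film was great and the cast was excellent"))
# [(0.15, 'great'), (0.15, 'excellent'), (0.0, 'was'), ...]
```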

Their method hinges on effectively replacing words with synonyms, which are ‘weighted’ by their algorithm so that the best replacements are chosen, while still retaining the correct meaning, grammar & natural flow of the sentence.
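A hedged sketch of that substitution step, again with toy stand-ins for the model and the synonym search rather than the released TextFooler code, might look like the following: candidate synonyms are ranked by how far they push the classifier’s confidence, and swaps continue until the prediction flips.

```python
# Illustrative sketch only (not the released TextFooler code): greedily swap
# words for synonyms until the toy model's prediction flips.
import math

POSITIVE = {"great", "excellent", "enjoyable"}  # toy sentiment lexicon
SYNONYMS = {  # toy lookup standing in for an embedding-based synonym search
    "great": ["fine", "decent"],
    "excellent": ["passable", "acceptable"],
}


def classify(text):
    """Stand-in for a real NLP model: P(positive) from a keyword count."""
    score = sum(word in POSITIVE for word in text.lower().split())
    return 1 / (1 + math.exp(-score))


def attack(sentence):
    """Swap words for weighted synonyms to push the model toward the wrong label."""
    words = sentence.split()
    original_label = classify(sentence) > 0.5
    # In the real attack, words are visited in importance order (see the ranking
    # sketch above); here they are simply visited left to right.
    for i, word in enumerate(words):
        candidates = SYNONYMS.get(word.lower(), [])
        # Rank candidate synonyms by how far they push the confidence down.
        ranked = sorted(candidates,
                        key=lambda s: classify(" ".join(words[:i] + [s] + words[i + 1:])))
        if ranked:
            words[i] = ranked[0] if original_label else ranked[-1]
        if (classify(" ".join(words)) > 0.5) != original_label:
            break  # the model's prediction has flipped
    return " ".join(words)


print(attack("The film was great and the cast was excellent"))
# "The film was fine and the cast was passable"
```

The full pipeline described in the paper also checks sentence-level semantic similarity and part-of-speech agreement before accepting a swap, which is what keeps the adversarial sentence readable.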

Now, the team behind TextFooler suggests that their system should be used to ‘train’ the current ML & NLP models to improve their sentiment detection, with the ultimate aim of achieving better control over malicious text attacks aimed at fooling models like BERT. The scientists believe that their breakthrough can be employed to combat the spread of fake news & improve techniques in the war against adversarial text attacks.
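A hedged sketch of that adversarial-training idea, reusing the toy attack() function from the sketch above as a stand-in for TextFooler itself: each training sentence is paired with an adversarial rewrite that keeps its original label, and the enlarged dataset is then fed back into ordinary model training.

```python
# Sketch of adversarial data augmentation (assumptions only): pair each labelled
# example with an attacked variant that keeps the original label, then retrain.

def augment_with_adversarial_examples(dataset, attack):
    """dataset: list of (sentence, label) pairs; attack: e.g. the toy attack() above."""
    augmented = list(dataset)
    for sentence, label in dataset:
        adversarial = attack(sentence)
        if adversarial != sentence:  # only keep genuinely perturbed text
            augmented.append((adversarial, label))
    return augmented

# The enlarged dataset would then be used to retrain the classifier so it learns
# to ignore meaning-preserving word swaps.
```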

Image by Gerd Altmann from Pixabay 

