Baidu has a new trick for teaching AI the meaning of language

Earlier this month, a Chinese tech giant quietly dethroned Microsoft and Google in an ongoing competition in AI. The company was Baidu, China’s closest equivalent to Google, and the competition was the General Language Understanding Evaluation, otherwise known as GLUE.

GLUE is a widely accepted benchmark for how well an AI system understands human language. It consists of nine different tests for things like picking out the names of people and organizations in a sentence and figuring out what a pronoun like “it” refers to when there are multiple potential antecedents. A language model that scores highly on GLUE, therefore, can handle diverse reading comprehension tasks. Out of a full score of 100, the average person scores around 87 points. Baidu is now the first team to surpass 90 with its model, ERNIE.

The public leaderboard for GLUE is constantly changing, and another team will likely top Baidu soon. But what’s notable about Baidu’s achievement is that it illustrates how AI research benefits from a diversity of contributors. Baidu’s researchers had to develop a technique specifically for the Chinese language to build ERNIE. It just so happens, however, that the same technique makes it better at understanding English as well.

What is Baidu ERNIE?

ERNIE 1.0 (Enhanced Representation through Knowledge Integration) was introduced by a Baidu research team in April 2019.

ERNIE 2.0, which debuted in July 2019, is a continual pretraining framework that incrementally builds pretraining tasks and learns them through continual multi-task learning.

ERNIE’s predecessor

To appreciate ERNIE, consider the model it was inspired by: Google’s BERT. (Yes, they’re both named after the Sesame Street characters.)

Before BERT (“Bidirectional Encoder Representations from Transformers”) was created in late 2018, natural-language models weren’t that great. They were good at predicting the next word in a sentence—thus well suited for applications like Autocomplete—but they couldn’t sustain a single train of thought over even a small passage. This was because they didn’t comprehend meaning, such as what the word “it” might refer to.

But BERT changed that. Previous models learned to predict and interpret the meaning of a word by considering only the context that appeared before or after it—never both at the same time. They were, in other words, unidirectional.

BERT, by contrast, considers the context before and after a word all at once, making it bidirectional. It does this using a technique known as “masking.” In a given passage of text, BERT randomly hides 15% of the words and then tries to predict them from the remaining ones. This allows it to make more accurate predictions because it has twice as many cues to work from. In the sentence “The man went to the ___ to buy milk,” for example, both the beginning and the end of the sentence give hints at the missing word. The ___ is a place you can go and a place you can buy milk.
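To make the masking idea concrete, here is a minimal sketch in Python. It is an illustration rather than BERT's actual code: the `mask_tokens` helper and the `[MASK]` placeholder string are assumptions made for this example, and it splits on whitespace instead of using BERT's real tokenizer. The real training recipe also has extra details that are left out here, such as sometimes swapping a chosen word for a random one.

```python
import random

MASK = "[MASK]"  # placeholder string, chosen for this illustration


def mask_tokens(tokens, mask_rate=0.15, seed=None):
    """Hide roughly `mask_rate` of the tokens.

    Returns the masked token list and a dict mapping each hidden
    position to the original token the model would have to predict.
    """
    rng = random.Random(seed)
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = rng.sample(range(len(tokens)), n_to_mask)
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        masked[pos] = MASK
    return masked, targets


sentence = "The man went to the store to buy milk".split()
masked, targets = mask_tokens(sentence, seed=0)
print(" ".join(masked))  # the sentence with one word hidden
print(targets)           # the hidden position and its original word
```

The model sees only `masked` and is scored on how well it recovers `targets`, so words on both sides of each gap are available as cues.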

The use of masking is one of the core innovations behind dramatic improvements in natural-language tasks and is part of the reason why models like OpenAI’s infamous GPT-2 can write extremely convincing prose without deviating from a central thesis.

From English to Chinese and back again

When Baidu researchers began developing their own language model, they wanted to build on the masking technique. But they realized they needed to tweak it to accommodate the Chinese language.

In English, the word serves as the semantic unit, meaning a word pulled completely out of context still contains meaning. The same cannot be said for characters in Chinese. While certain characters do have inherent meaning, like fire (火, huǒ), water (水, shuǐ), or wood (木, mù), most do not until they are strung together with others. The character 灵 (líng), for example, can either mean clever (机灵, jīlíng) or soul (灵魂, línghún), depending on its match. And the characters in a proper noun like Boston (波士顿, bōshìdùn) or the US (美国, měiguó) do not mean the same thing once split apart.

So the researchers trained ERNIE on a new version of masking that hides strings of characters rather than single ones. They also trained it to distinguish between meaningful and random strings so it could mask the right character combinations accordingly. As a result, ERNIE has a greater grasp of how words encode information in Chinese and is much more accurate at predicting the missing pieces. This proves useful for applications like translation and information retrieval from a text document.
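The difference from character-by-character masking can be sketched as follows. This is a simplified illustration, not Baidu's implementation: the small `lexicon` set stands in for whatever ERNIE uses to tell meaningful strings from random ones, and the `segment` and `mask_units` helpers are hypothetical names invented for this example.

```python
import random

MASK = "[MASK]"  # placeholder string, chosen for this illustration


def segment(text, lexicon):
    """Greedy longest-match split into known phrases or single characters."""
    max_len = max((len(w) for w in lexicon), default=1)
    units, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                units.append(piece)
                i += size
                break
    return units


def mask_units(units, mask_rate=0.15, seed=None):
    """Hide whole units, so multi-character names are never split."""
    rng = random.Random(seed)
    n_to_mask = max(1, round(len(units) * mask_rate))
    chosen = set(rng.sample(range(len(units)), n_to_mask))
    out = []
    for idx, unit in enumerate(units):
        if idx in chosen:
            out.extend([MASK] * len(unit))  # one mask per hidden character
        else:
            out.extend(unit)                # keep the characters as they are
    return out


# Toy lexicon (an assumption): phrases taken from the examples in this article.
lexicon = {"波士顿", "美国", "灵魂", "机灵"}
text = "波士顿在美国"  # "Boston is in the US"
units = segment(text, lexicon)
masked = mask_units(units, seed=1)
print(units)
print(masked)
```

Because 波士顿 is treated as one unit, it is either hidden in full or left intact, which forces the model to predict the whole name from its context rather than guessing one character from its neighbors.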

The researchers very quickly discovered that this approach actually works better for English, too. Though not as often as Chinese, English similarly has strings of words that express a meaning different from the sum of their parts. Proper nouns like “Harry Potter” and expressions like “chip off the old block” cannot be meaningfully parsed by separating them into individual words.

ERNIE thus learns more robust predictions based on meaning rather than statistical word usage patterns.

A diversity of ideas

The latest version of ERNIE uses several other training techniques as well. It considers the ordering of sentences and the distances between them, for example, to understand the logical progression of a paragraph. Most important, however, it uses a method called continual training that allows it to train on new data and new tasks without forgetting those it learned before. This allows it to get better and better at performing a broad range of tasks over time with minimal human interference.
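One simple way to picture training on new tasks without forgetting old ones is a schedule in which each new task joins the pool and is trained together with everything already in it. The sketch below only shows that scheduling idea. The `continual_schedule` function and the task names are hypothetical, and how ERNIE actually divides training time among tasks is not described here.

```python
def continual_schedule(tasks):
    """Yield, stage by stage, the tasks that are trained together.

    Each stage adds one new task and keeps every earlier task in the
    mix, so earlier skills keep being rehearsed.
    """
    active = []
    for task in tasks:
        active.append(task)
        yield list(active)


# Hypothetical task names, loosely based on the techniques mentioned above.
tasks = ["phrase_masking", "sentence_ordering", "sentence_distance"]
stages = list(continual_schedule(tasks))
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: train jointly on {stage}")
```

The contrast is with training on each task in isolation, one after another, where nothing prevents a later task from overwriting what an earlier one taught.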

Baidu actively uses ERNIE to give users more applicable search results, remove duplicate stories in its news feed, and improve its AI assistant Xiao Du’s ability to accurately respond to requests. It has also described ERNIE’s latest architecture in a paper that will be presented at the Association for the Advancement of Artificial Intelligence conference next year. The same way their team built on Google’s work with BERT, the researchers hope others will also benefit from their work with ERNIE.

“When we first started this work, we were thinking specifically about certain characteristics of the Chinese language. But we quickly discovered that it was applicable beyond that.”

Hao Tian, Chief Architect of Baidu Research



Xender-like app development

Xender – Share Music&Video Status Saver Transfer

Xender – best sharing app fulfilling all your transfer needs

☆ Share Music, Share Video & Share Photo, Share MV, Share It, Share Me, Share File
☆ Transfer all types of files (app, music, PDF, Word, Excel, ZIP, folder...) anywhere, at any time
☆ Absolutely no mobile data usage
☆ 200 times the transfer speed of Bluetooth: Top WiFi File Transfer Master!
☆ Cross-platform support: Android, iOS, Tizen, Windows, PC/Mac
☆ No need for a USB connection or additional PC software
☆ The choice of 500 million+ users
☆ Over 200 million files successfully transferred daily
☆ Play all music and videos right after receiving them
☆ New feature [toMP3]: convert video to audio
☆ Social Media Downloader: save videos from WhatsApp, Facebook and Instagram
☆ Game Center: hundreds of casual games available without install or download

Features of the Xender App That Can Be Cloned

1. Use as a file manager.
2. Cross-platform file handling.
3. An app extractor, an additional feature that can increase the time users spend in the app.
4. Prompts suggesting that the user update the app, which are a great help to the user.
5. Group sharing, as in Xender.
6. Phone cloning.
7. Support for regional languages. This feature matters most for marketing, because it increases the user rate.

Support and stay connected

Facebook: https://facebook.com/XenderApp
Twitter: https://twitter.com/XenderApp
Google+: https://plus.google.com/+AnMobi
Tumblr: https://www.tumblr.com/dashboard

Business books to read in 2021: top picks

The COVID pandemic has changed habits and reassigned priorities for all of us. There is, however, a segment of people whose natural way to treat a crisis is to use the opportunities it offers: the entrepreneurs.

With an unprecedented economic stimulus coming from the U.S. Federal Reserve, the idea of starting a new business (or investing in an existing one) has become even more popular than before. This, however, requires a certain level of preparation: you definitely don’t want to learn from your own mistakes by losing money. Thankfully, business and entrepreneurship is a field with tons of useful and inspiring books to learn from.

But how do you navigate through those thousands of books and authors? Let’s take a look at the most in-demand business ebooks as of 2021.

Finance and Investments

Strongly recommended is “The Intelligent Investor” by Benjamin Graham, an eminent economist and investor of the 20th century. He has created a guide that has inspired millions of people around the world for 70 years. Since its first publication in 1949, the book has become a veritable bible of the stock market. The modern edition is supplemented by comments by financial journalist Jason Zweig, who draws parallels between Graham’s examples and modern realities, and also provides a deeper understanding of how to adapt the author’s philosophy to everyday life.

Information Technology

The ITIL (Information Technology Infrastructure Library) is a framework for standardizing the selection, planning, provision, maintenance and the entire life cycle of IT services in a company. The goal is to improve efficiency and achieve predictable service delivery. 

The ITIL books cover key concepts of service management, the four dimensions of service management, the ITIL service value system, and ITIL management practices.

Entrepreneurship

This is an extremely wide field of study, spanning both motivational, inspiring books and those designed to improve hard skills. Among the former, “Money: Master the Game” by Tony Robbins holds first place on our wish list.

Another must-read is “Think and Grow Rich” by Napoleon Hill. This bestseller was first published in 1937 and has been reprinted over 40 times. In his book, the author has collected the stories of famous millionaires of his time who stubbornly went towards their goals.

What is The Full Form of PhD?

The full form of PhD is a Doctor of Philosophy or a Doctorate in Philosophy.

This is a postgraduate academic degree awarded to individuals who have studied and researched their field extensively. It is awarded by an accredited university and requires completing a research dissertation.

What Is The Meaning Of PhD?

PhD is a doctoral degree that takes the form of a research degree. It generally requires a substantial amount of independent, original research and a formal defence of that research.

The word “Doctor” in the abbreviation PhD comes from the Latin “doctor,” which means “teacher.” In other words, PhD holders are scholars who have done extensive research, as demonstrated by their dissertation, and are qualified to teach it.

What Is The Abbreviation Of Doctorate of Philosophy?

PhD is the abbreviation for Doctorate of Philosophy, a postgraduate degree that requires three to five years of study and research.

Which is higher, MS or PhD?

A person can be awarded a Master of Science degree (MS) after completing course work and passing a series of written and oral exams. Master’s degrees typically take one to three years to complete.

A PhD is a doctoral degree, which is the highest academic award in most countries. To earn a PhD in the US, students usually have to complete at least four years of study beyond an undergraduate degree.

Doctoral degrees rank higher than master’s degrees because they require more advanced subject knowledge and original research on top of coursework.
