In June, Naver, the Seongnam, South Korea-based company that operates the eponymous search engine, announced that it had trained one of the largest AI language models of its kind, called HyperCLOVA. Naver claimed that the system learned 6,500 times more Korean data than OpenAI’s GPT-3 and contained 204 billion parameters, the parts of the machine learning model learned from historical training data. (GPT-3 has 175 billion parameters.)
HyperCLOVA was considered a notable achievement because of the scale of the model and because it fits in with the trend of generative model “diffusion,” with multiple players developing GPT-3-style models, such as Huawei’s PanGu-Alpha (stylized PanGu-α). The benefits of large language models, including the ability to generate human-like text for marketing and customer support purposes, were previously limited to English because companies did not have the resources to train these models in other languages.
In the months following the development of HyperCLOVA, Naver began using it to customize search results on the Naver platform, Naver managing director Nako Sung told VentureBeat in an interview. It will also soon be available as a private beta through HyperCLOVA Studio, a codeless tool that will allow developers to access the model for text generation and classification tasks.
“Originally used to correct typos in search queries on Naver Search, [HyperCLOVA] now enables many new features on our e-commerce platform, Naver Shopping, such as summarizing multiple consumer reviews in one line, recommending and selecting products based on user purchasing preferences, or generating trendy marketing phrases for featured shopping collections,” Sung said. “We also launched CLOVA CareCall, a… conversational agent for seniors living alone. The service is based on HyperCLOVA’s natural conversation generation capabilities, which allow it to have human-like conversations.”
Large language models
Training HyperCLOVA, which can understand English and Japanese in addition to Korean, required large-scale datacenter infrastructure, according to Sung. Naver operated a server cluster consisting of 140 Nvidia SuperPod A100 DGX nodes, which the company says can deliver up to 700 petaflops of computing power.
It took months to train HyperCLOVA on 2TB of Korean text data, much of which came from user-generated content on Naver’s platforms. For example, one source was Knowledge iN, a Korean-language, Quora-like community where users can ask questions on topics and receive answers from experts. Another was the public posts of people who use free web hosting services provided by Naver.
Sung says this differentiates HyperCLOVA from previous large language models such as GPT-3, which have a limited ability to understand the nuances of languages other than English. He claims that by making the model rely on “the collective intelligence of Korean culture and society,” it can better serve Korean users and, at the same time, reduce Naver’s dependence on other AI services that are less focused on Asia-Pacific.
In a recent issue of his Import AI newsletter, former OpenAI policy director Jack Clark asserted that because generative models ultimately reflect and amplify the data on which they are trained, different nations care a lot about how their own culture is represented in these models. “[HyperCLOVA] is part of a general trend of different nations asserting their own AI capability [and] capacity via advanced models like GPT-3,” he continued. “[We’ll] wait for more technical details to see if [it’s] really comparable to GPT-3.”
Some experts have argued that because companies developing influential AI systems are primarily located in the United States, China, and the EU, a disproportionate share of the economic benefits will accrue to these regions, which could exacerbate inequality. In an analysis of the publications at two major machine learning conferences, NeurIPS 2020 and ICML 2020, none of the top 10 countries in terms of publication index were located in Latin America, Africa, or Southeast Asia. In addition, a recent report from the Center for Security and Emerging Technology at Georgetown University found that while 42 of the 62 major AI labs are located outside of the United States, 68% of the workforce is located in the United States.
“These large amounts of collective intelligence continually enrich and strengthen HyperCLOVA,” said Sung. “The most well-known hyper-scale language model is GPT-3, and it is trained primarily on English data, with Korean data making up only 0.016% of its total input… [C]onsidering the impact of large-scale AI on industries and economies in the near future, we are convinced that building Korean-language-based AI is very important for Korea’s AI sovereignty.”
Challenges in model development
Among others, prominent AI researcher Timnit Gebru has questioned the wisdom of building large language models, examining who benefits from them and who is harmed. It is well established that models can amplify biases in the data on which they are trained, and the environmental effects of model training have been raised as a serious concern.
To address issues related to bias, Sung says Naver is in talks with “outside experts,” including researchers from the AI Policy Initiative at Seoul National University, and plans to form an advisory committee on the ethics of AI in Korea this year. The company also released a benchmark, Korean Language Understanding Evaluation (KLUE), to assess the natural language comprehension abilities of Korean language models, including HyperCLOVA.
“We recognize that while AI can make our lives easier, like all other technologies in use today, it is not foolproof,” he added. “While striving for the convenience of the service we provide, Naver will also strive to explain our AI service in a way that users can easily understand at their request or when necessary… We will ensure safety at all stages of designing and testing our services, including after service deployment, to avoid a situation where AI as an everyday tool threatens life or causes physical harm to people.”
Real-world applications
Currently, Naver says HyperCLOVA is leveraged for various Naver services, including Naver Smart Stores, the company’s e-commerce marketplace, where it “fixes” product names by generating “more attractive” names compared with the original search-engine-optimized SKUs. In another e-commerce use case, Naver applies HyperCLOVA to create product recommendation systems tailored to individual buyers’ preferences.
“While HyperCLOVA does not specifically learn from user purchase logs, we have found that it is able to recommend products on our marketplace to some extent. So we refined this feature and introduced it as one of our e-commerce features. Unlike existing recommendation algorithms, this model shows the ‘generalized’ ability to perform well on cold items, cold users, and cold services,” Sung said. “Recommending a certain gift to someone is not an appropriate problem for traditional machine learning to solve. This is because there is no information on the recipient of the gift … [But] with HyperCLOVA, we were able to make this experience possible.”
HyperCLOVA is also powering an AI-based calling service for seniors living alone, which Naver says it plans to refine to provide more personalized conversations in the future. Beyond that, Naver says it’s developing a multilingual version of HyperCLOVA that can understand two or more languages at the same time and an API that will allow developers to build apps and services on the model.
The pandemic has accelerated the digital transformation of the world, causing businesses to become more dependent on software to streamline their processes. As a result, the demand for natural language technology is now higher than ever, especially in business. According to a 2021 survey by John Snow Labs and Gradient Flow, 60% of tech leaders said their natural language processing budgets increased by at least 10% from 2020, while a third – 33% – said their spending increased by more than 30%.
The global NLP market is expected to climb in value to $35.1 billion by 2026.
“The most interesting thing about HyperCLOVA is that its ease of use is not limited to AI experts, such as engineers and researchers, but it has also been used by service planners and business managers within our organization. Most winners [in a recent HyperCLOVA hackathon] came from non-AI developer roles, which I think proves that HyperCLOVA’s codeless AI platform will equip everyone with AI capabilities, dramatically accelerating the speed of AI transformation and changing its scope in the future,” Sung said.