Large language models (LLMs) (1) are deep learning AI models that serve as the core of generative AI services such as ChatGPT (2). The four organizations aim to improve the environment for creating LLMs that can be widely used by academia and companies, and to contribute to improving the research capabilities of AI in Japan.
Background
While many anticipate that LLMs and generative AI will play a fundamental role in the research and development of technologies for security, the economy, and society overall, advancing and refining these models will require high-performance computing resources that can efficiently process large amounts of data.
Participating organizations: Tokyo Tech, Tohoku University, Fujitsu, and RIKEN
Implementation period
From
Roles of each organization and company
The technology used in this initiative will allow the organizations to efficiently perform large-scale language model training on the large-scale parallel computing environment of the supercomputer Fugaku. The roles of each organization and company are as follows:
Fujitsu: Acceleration of LLMs
RIKEN: Distributed parallelization and communication acceleration of LLMs
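The "distributed parallelization" role above can be illustrated with a minimal, hypothetical sketch of data-parallel training: each worker computes gradients on its own shard of the data, then an all-reduce averages the gradients so every worker applies the same update. The functions and toy data below are illustrative assumptions, not the project's actual code.

```python
# Hypothetical sketch of data-parallel training with gradient all-reduce.

def local_gradient(weights, shard):
    # Toy gradient of a least-squares loss 0.5*(w.x - y)^2 on one data shard.
    grad = [0.0] * len(weights)
    for x, y in shard:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += err * xi
    return [g / len(shard) for g in grad]

def allreduce_mean(grads_per_worker):
    # Stand-in for an MPI-style all-reduce: element-wise mean across workers.
    n = len(grads_per_worker)
    return [sum(g[i] for g in grads_per_worker) / n
            for i in range(len(grads_per_worker[0]))]

def train_step(weights, shards, lr=0.1):
    grads = [local_gradient(weights, s) for s in shards]  # parallel in reality
    mean_grad = allreduce_mean(grads)                     # communication step
    return [w - lr * g for w, g in zip(weights, mean_grad)]

# Two workers, each holding its own shard of (x, y) pairs for target w = 2.0.
shards = [[([1.0], 2.0), ([2.0], 4.0)], [([3.0], 6.0), ([4.0], 8.0)]]
w = [0.0]
for _ in range(200):
    w = train_step(w, shards)
print(round(w[0], 3))  # converges toward 2.0
```

In real LLM training the "gradients" are tensors with billions of elements, so the efficiency of the communication step is exactly what the parallelization and communication-acceleration roles above target.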
Future Plans
To support Japanese researchers and engineers in developing LLMs in the future, the four organizations plan to publish the research results obtained through this initiative, within the scope of Fugaku use defined by Japanese policy, on GitHub (3) and Hugging Face (4) in fiscal 2024. It is also anticipated that many researchers and engineers will participate in improving the basic model and in new applied research, creating efficient methods that lead to the next generation of innovative research and business results.
The four organizations will additionally consider collaborations with
Comment from
"The collaboration will integrate parallelization and acceleration of large-scale language models using the supercomputer "Fugaku" by Tokyo Tech and RIKEN, Fujitsu's development of high-performance computing infrastructure software for Fugaku and performance tuning of AI models, and
Comment from
"We aim to build a large-scale language model that is open-source, available for commercial use, and primarily based on Japanese data, with transparency in its training data. By enabling traceability of the learning data, we anticipate that this will facilitate research robust enough to scientifically verify issues related to the black box problem, bias, misinformation, and so-called "hallucination" phenomena common to AI. Leveraging the insights gained from deep learning from Japanese natural language processing developed at
Comment from
"We are excited for the chance to leverage the powerful parallel computing resources of the supercomputer Fugaku to supercharge research into AI and advance the research and development of LLMs. Going forward, we aim to incorporate the fruits of this research into Fujitsu's new AI platform, codenamed "Kozuchi," to deliver paradigm-shifting applications that contribute to the realization of a sustainable society."
Comment from
"The A64FX (5) CPU is equipped with an AI acceleration function known as SVE (Scalable Vector Extension). However, software development and optimization are essential to maximize its capabilities and to utilize it for AI applications. We feel that this joint research will play an important role in bringing together experts in LLMs and computer science in
Project name
Distributed Training of Large Language Models on Fugaku (Project Number: hp230254)
(1) Large Language Models:
Neural networks with hundreds of millions to billions of parameters that have been pre-trained on large amounts of data. GPT in language processing and ViT in image processing are well-known recent examples of such large-scale models.
(2) ChatGPT:
A large-scale language model for natural language processing developed by OpenAI that supports tasks such as interactive systems and automatic sentence generation with high accuracy.
(3) GitHub:
A platform used to publish open-source software around the world.
(4) Hugging Face:
A platform used to publish AI models and datasets around the world.
(5) A64FX:
An Arm-based CPU developed by Fujitsu and installed in the supercomputer Fugaku.
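To make the parameter scales mentioned in note (1) concrete, the following rough back-of-envelope count for a GPT-style transformer may help. The formula and layer shapes are simplifying assumptions (biases, layer norms, and positional embeddings are omitted), not a description of any specific model.

```python
# Rough parameter count for a GPT-style transformer (simplified sketch).

def transformer_params(n_layers, d_model, vocab_size, d_ff=None):
    d_ff = d_ff or 4 * d_model                # common feed-forward widening
    attn = 4 * d_model * d_model              # Q, K, V, and output projections
    ffn = 2 * d_model * d_ff                  # two feed-forward matrices
    embed = vocab_size * d_model              # token embedding matrix
    return n_layers * (attn + ffn) + embed

# An illustrative shape (12 layers, d_model=768, 50k vocabulary):
print(f"{transformer_params(12, 768, 50_000):,}")  # → 123,334,656
```

Even this modest configuration lands above a hundred million parameters; scaling the depth and width by an order of magnitude quickly reaches the billions referred to in note (1), which is why supercomputer-class resources such as Fugaku are needed for training.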
About
Tokyo Tech stands at the forefront of research and higher education as the leading university for science and technology in Japan.
About
About Fujitsu
Fujitsu's purpose is to make the world more sustainable by building trust in society through innovation. As the digital transformation partner of choice for customers in over 100 countries, our 124,000 employees work to resolve some of the greatest challenges facing humanity. Our range of services and solutions draw on five key technologies: Computing, Networks, AI, Data & Security, and Converging Technologies, which we bring together to deliver sustainability transformation.
About RIKEN
As the leadership center of high-performance computing, the
Press Contacts:
Public Relations Division,
E-mail: media@jim.titech.ac.jp
Tel: +81-3-5734-2975
E-mail: koho@is.tohoku.ac.jp
Tel: +81-22-795-4529
Public and Investor Relations Division
Inquiries
RIKEN
Research Promotion Office,
Public Relations Division, RIKEN
Tel: +81-50-3495-0247
E-mail: ex-press@ml.riken.jp
Copyright 2023 JCN Newswire. All rights reserved.