TAL Education Group

Xueersi partners with Google and others to launch a global large language model math competition, with 120 teams competing

@Open Innovation Platform | 2024-03-02 01:32:40

The "AI for Education" workshop at the 2024 AAAI Conference was held in Vancouver, Canada, on February 26 and 27, 2024. It was initiated by the National New Generation Artificial Intelligence Open Innovation Platform for Smart Education, TAL (Xueersi), Google, Princeton University, Jinan University, and other leading technology companies and research institutions worldwide. Themed "Building Bridges of Innovation and Responsibility," the two-day workshop explored innovation and ethical responsibility in applying artificial intelligence, especially generative AI, to education.

During the conference, the results of the AAAI 2024 Global Large Model Mathematical Problem Solving Competition were officially announced. This is the world's first competition focusing on the mathematical capabilities of large models, and it attracted over 120 teams from various countries and regions. After more than four months of intense competition, eight teams were named winners: CPDP-ICST, cogbase, MathEducators, CTYUN-AI, zuiii, shengkai, loveisp, and Mathematical Problem Solving and Reasoning.

Focusing on generative AI, the conference explored both innovation and responsibility. Over the past year, the hottest topic in the field has been generative AI, represented by large language models (LLMs). As the global boom in LLM development continues, their innovative applications and ethical implications have become key concerns. In response, experts and scholars, including researchers from the National New Generation AI Open Innovation Platform for Smart Education, TAL, Google, Princeton University, and Jinan University, launched the "AI for Education" workshop at AAAI 2024.

During the two-day meeting, participants shared their views through papers, live talks, posters, and a global mathematical reasoning competition, discussing in depth the impact of generative AI on education, its future, and its challenges. On the impact of large model technology in education, some experts argued that educators should actively embrace large models rather than resist them: prohibiting students from using large models to complete assignments is as futile as banning the use of the internet would have been 20 years ago, so educators should redesign assignments instead of spending effort on catching cheaters. Other experts said that large models offer a glimpse of the future of education, in which each student has a tutor that understands their needs and knows how to engage them effectively.

On the challenges of hallucination and evaluation when large models are used in education, some experts proposed an automatic method for generating test cases based on iterative refinement, using an LLM and the compiler from symphony. Tested on the Code Workout dataset, the method was shown to generate test cases that accurately measure students' knowledge levels.

Participants also discussed in depth the standards that responsible AI should meet in educational contexts and the ethical requirements that should be set. These include ensuring fairness, accountability, explainability, and transparency in important educational decision-making scenarios such as admissions, warning systems, and grading. Methodological contributions to responsible AI in education, and their impacts, span areas including but not limited to generative models, predictive models, causal inference, reinforcement learning, and data collection. Some attendees added that, as AI and particularly generative AI has a growing impact on education, regulations and processes are needed to ensure educational equity.

The Global Large Model Mathematical Problem Solving Competition concluded, with 120 teams competing. To enhance the mathematical and scientific reasoning capabilities of large language models, TAL (Xueersi), together with Google, Jinan University, and other well-known technology companies and universities, launched the AAAI 2024 Global Large Model Mathematical Problem Solving Competition in October 2023, building on the National New Generation Artificial Intelligence Open Innovation Platform for Smart Education.

During the conference, the organizers officially announced the results of the AAAI 2024 Global Large Model Mathematical Problem Solving Competition. As the world's first competition focusing on the mathematical capabilities of large models, it required participants to use large models to generate reasoning steps and answers for given mathematical problems, and it attracted over 120 teams from various countries and regions. After more than four months of intense competition, eight teams were named winners: CPDP-ICST, cogbase, MathEducators, CTYUN-AI, zuiii, shengkai, loveisp, and Mathematical Problem Solving and Reasoning.

The competition was divided into two stages. The first stage was the public leaderboard phase, in which the organizers randomly selected 30% of the data from the given dataset for participants to tune and debug their large models. The second stage was the private leaderboard phase, in which participants had to use the models optimized in the first stage to solve the remaining 70% of the dataset. The organizers ranked participants by the accuracy of their models' output answers, measured against the correct answers. The second-stage scores were used as the final competition results.
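The two-stage scoring described above can be sketched roughly as follows. This is a minimal illustration, not the organizers' actual evaluation code: the function names, the fixed random seed, and the use of exact string matching to decide whether an answer is correct are all assumptions made for the example.

```python
import random


def split_public_private(problem_ids, public_fraction=0.3, seed=0):
    """Randomly split problem IDs into a public and a private subset.

    The 30/70 split comes from the article; the seed and rounding rule
    are assumptions for this sketch.
    """
    ids = list(problem_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    cut = round(len(ids) * public_fraction)
    return ids[:cut], ids[cut:]


def accuracy(predictions, answers, ids):
    """Fraction of the given problems answered correctly.

    Assumption: an answer counts as correct only on an exact match;
    a missing prediction counts as wrong.
    """
    if not ids:
        return 0.0
    correct = sum(1 for i in ids if predictions.get(i) == answers[i])
    return correct / len(ids)


def rank_teams(submissions, answers, ids):
    """Return (team, accuracy) pairs sorted from highest to lowest accuracy."""
    scores = {
        team: accuracy(preds, answers, ids)
        for team, preds in submissions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under these assumptions, the final ranking would be obtained by calling `rank_teams` with the private subset returned by `split_public_private`, while the public subset would only feed the first-stage leaderboard.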

To better explore the mathematical capabilities of large models in different languages, the organizers set up two tracks, one in Chinese and one in English. TAL (Xueersi) provided the competition's Chinese and English datasets, TAL-SAQ7K-CN and TAL-SAQ6K-EN, which contain real problems from elementary and middle school mathematics competitions in China and abroad. Because the use of third-party large models could affect the results, the organizers grouped the results according to whether third-party models were used and selected the top three teams in each category based on final scores. In the end, out of more than 120 participating teams, CPDP-ICST, cogbase, MathEducators, CTYUN-AI, zuiii, shengkai, loveisp, and Mathematical Problem Solving and Reasoning were the winners. CPDP-ICST, cogbase, and MathEducators were the top three teams in both the Chinese and English tracks.
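The category-based selection described above can be illustrated with a short sketch. It assumes each result is a record with hypothetical `team`, `track`, `third_party`, and `score` fields, and that categories are formed per track and per third-party usage; the article does not specify the organizers' actual data format or tie-breaking rules.

```python
from collections import defaultdict


def top_teams_by_category(results, k=3):
    """Group results by (track, third_party) and keep the top k teams by score.

    `results` is a list of dicts with the hypothetical keys
    "team", "track", "third_party", and "score".
    """
    groups = defaultdict(list)
    for row in results:
        groups[(row["track"], row["third_party"])].append(row)

    winners = {}
    for key, rows in groups.items():
        # Sort each category from highest to lowest final score.
        ranked = sorted(rows, key=lambda row: row["score"], reverse=True)
        winners[key] = [row["team"] for row in ranked[:k]]
    return winners
```

Keeping the two groups separate means a team that relies on a third-party model is compared only with other teams that do the same, which is the concern the organizers' categorization addresses.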

Mathematics has long been regarded as a litmus test for artificial intelligence, and large language models still face significant challenges in mathematical reasoning tasks. According to a representative from the National New Generation AI Open Innovation Platform for Smart Education, education is one of the earliest application scenarios for large models. Breakthroughs in their mathematical capabilities could bring enduring, even transformative, changes, enabling broader access to high-quality educational resources and making large-scale personalized education a reality. By supporting the launch of this global competition on the mathematical problem-solving abilities of large models, the platform aims to drive technological innovation and extend the benefits of progress to more people.

The National New Generation Artificial Intelligence Open Innovation Platform for Smart Education was approved by the Ministry of Science and Technology in 2019 and is built by Beijing Century TAL Education Technology Co., Ltd. Rooted in the education industry and serving the entire country, the platform provides full-scenario, full-process, and full-cycle support in technology, solutions, and industrialized services for educational institutions, educational technology companies, educators, and AI developers. Its goal is to promote the intelligent upgrading of the education industry and to build a diversified new ecosystem of smart education characterized by "symbiosis," "mutual growth," and "co-creation."

Original text from: https://news.ikanchai.com/2024/0229/577866.shtml
