Key Dataset Updates for This Month
Original Math Competition Problems
A collection of 100 original competition problems written to CMO/AIME/IMO standards. All are free-response problems that require multi-step reasoning, covering core areas such as algebra, geometry, and number theory.
Code Logic Problem Database
A question bank covering algorithm design, data structures, and dynamic programming. It contains 630 high-quality code logic problems across multiple programming languages.
3D Game Video Data
This dataset contains 50,000 high-quality 3D game scene videos of different types, covering diverse scenarios. Each video includes 4 standardized action segments, each paired with a corresponding action record that details information such as direction, perspective, position, speed, and acceleration.
Code Structured Data
Drawing on question banks from more than 20 domestic and international OJ (online judge) platforms, this dataset contains over 14,000 high-quality problems. Each problem comes with a detailed answer and problem-solving approach, and the dataset includes 118,000 sets of test cases in total.
Plant Multimodal Database
A professional-grade multimodal library of plant ecology images with more than 10,000 entries, built for high-end landscape design, ecological research, computer vision, and AIGC synthesis.
All Datasets

Question Bank

Chinese Question Bank Dataset
English Question Bank Dataset
Algorithm Code Dataset
Code Structured Data
Code Logic Problem Database
Original Math Competition Problems

Text

English Language Materials
Chinese Language Materials
Vertical Domain
Less Common Languages

Image

Screenshot Dataset
Image Annotation

Video

High-Quality Video (Film and Television)
3D Game Video
Short Drama Video
Vehicle Data

Audio

Cross-language Pronunciation Corpus
Chinese Audio
Speech Interaction and Command Data
Multimodal Speech Data
Biometric and Behavioral Data

Multimodal

Image-Text Description Dataset
Image-Text Reasoning Q&A
Multimodal Audio
Video Q&A
In-Cabin Collection Dataset
Industrial Multimodal Dataset
Human Body Multimodal Dataset
Plant Multimodal Database