10/28/2025

New “Japanese Children’s Conversational Speech Corpus” Joins Qlean Dataset Lineup

Visual Bank Inc. (Minato-ku, Tokyo; CEO: Saneyuki Nagai) offers the AI training data solution “Qlean Dataset,” developed through its subsidiary Amana Images Inc. Designed to support both research and commercial AI development, Qlean Dataset offers a diverse range of original datasets, collectively called “AI Data Recipes,” that enable flexible, scalable, and rights-cleared data sourcing.

The new addition, “Japanese Children’s Conversational Speech Corpus,” further expands the AI Data Recipe lineup with data specifically tailored for speech recognition, language development, and educational AI applications.

About “AI Data Recipe” in Qlean Dataset

“AI Data Recipes” are original, commercially usable datasets provided under Qlean Dataset.
They can be flexibly combined according to project goals, accuracy requirements, and delivery schedules, and are available both with and without annotation.

The lineup continues to expand through partnerships with organizations such as Chiba Lotte Marines and Toyo Keizai Inc., along with newly recorded materials and international collaborations. This structure greatly reduces the burden of data preparation in AI development while accelerating project execution.

Overview of “Japanese Children’s Conversational Speech Corpus”

Use Cases of the Dataset

1. Improving ASR for Child Speech
This corpus captures natural daily conversations among Japanese-speaking children, including pronunciation variations and age-specific phonetic traits—ideal for training Automatic Speech Recognition (ASR) models or voice assistants targeting young users.

2. Research on Educational and Developmental AI
The dataset enables quantitative analysis of linguistic comprehension and response tendencies by age, supporting models for educational AI, reading assistants, and developmental support AI.

3. Conversational AI and Educational Robots
By leveraging children’s natural tempo and intonation, developers can build Japanese dialogue AI and educational robots that deliver more engaging and child-friendly conversational experiences.

4. Emotion and Empathy AI Training
Containing laughter, pitch variations, and pauses unique to children’s emotional expression, the corpus is suitable for training emotion recognition or empathetic response AI systems—useful in educational and household AI environments.

5. Japanese Speech and Multimodal LLM Training
Rich in child-specific grammar, vocabulary, and conversational patterns, this dataset can be used for tuning Japanese dialogue models and speech-based LLMs.

6. Linguistic and Sociolinguistic Research
For academic use, the corpus provides a valuable foundation for studying vocabulary diversity, grammar development, and conversational patterns in child language acquisition.

Features of Qlean Dataset

  • Research and commercial use supported:
    All data subjects have provided explicit consent for data collection and AI use, ensuring compliance with global privacy standards.

  • Speed and ROI through modular “AI Data Recipe”:
    The unique structure of AI Data Recipe enables fast, cost-efficient data acquisition and integration.

  • Custom datasets available:
    Qlean Dataset can create tailored datasets to meet specific requirements, leveraging its full data production and annotation capabilities.

Contact form: https://qleandataset.visual-bank.co.jp/en/contact
Service site: https://qleandataset.visual-bank.co.jp/en/

About Visual Bank Inc.

Visual Bank Inc. is a next-generation data infrastructure company committed to “unleashing the potential of all data.”
The company operates THE PEN, an AI-powered assistance tool for manga artists, and wholly owns Amana Images Inc., which provides the AI training data service Qlean Dataset.

Visual Bank has been recognized in national R&D programs and continues to advance initiatives toward real-world AI implementation.

CEO: Saneyuki Nagai
Address: C-Cube Minami Aoyama Bldg. 6F, 7-1-7 Minami Aoyama, Minato-ku, Tokyo 107-0062
Corporate website: https://visual-bank.co.jp/en/
Amana Images overview: https://qleandataset.visual-bank.co.jp/en/company-overview
