Qlean Dataset Launches the Japanese Two-Speaker Fashion & Beauty Dialogue Speech Corpus

12/22/2025

Visual Bank Inc. (Minato-ku, Tokyo; CEO: Saneyuki Nagai, hereinafter “Visual Bank”) has announced the release of the Japanese Two-Speaker Fashion & Beauty Dialogue Speech Corpus as part of its AI training data solution, Qlean Dataset, operated through its subsidiary Amana Images Inc.

This dataset consists of Japanese dialogue speech recordings in which two speakers, men and women in their 20s to 50s, hold conversations centered on fashion and beauty topics.
It is offered as part of Qlean Dataset’s machine learning dataset lineup, AI Data Recipe, and is designed for research and development in speech-based AI, including automatic speech recognition (ASR) and dialogue understanding.

The recorded conversations cover concrete themes such as makeup, outfit coordination, item selection, and fashion trends. Speakers exchange opinions through impressions, advice, and personal experiences.
Rather than relying on scripts, the dialogues proceed at a natural conversational pace, closely reflecting real-world spoken interactions.

The dataset captures speaker turn-taking, interactive responses, and natural topic transitions between two participants. Recorded under conditions similar to everyday conversation, it is well suited for evaluating speech recognition accuracy and contextual understanding performance in practical usage scenarios.
In addition to AI development for user-facing dialogue systems in the fashion and beauty domain, this corpus can be applied broadly across research and industrial environments focused on conversational speech AI.

Overview of the “Japanese Two-Speaker Fashion & Beauty Dialogue Speech Corpus”

Overview	A Japanese dialogue speech corpus featuring two speakers discussing fashion and beauty topics.
Data Type	Audio
Speaker Attributes	Male and female speakers in their 20s to 50s
File Format	MP3 / WAV
Total Duration	Approximately 50 hours (individual recordings range from approximately 5 to 60 minutes)
Sampling Rate	44.1 kHz
Covered Dialogue Scenarios	・Conversations between two speakers discussing fashion, beauty, style, and trends
	・Dialogues addressing concrete topics such as makeup, outfit coordination, and item selection
	・Naturally paced conversations without reliance on scripts
	・Exchanges involving shared impressions, advice, and personal experiences
	・Dialogues spanning a wide range of themes within the fashion and beauty domain
Sample Details	https://qleandataset.visual-bank.co.jp/en/lineup/pn-034
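
As an illustration of how the specifications above might be checked after delivery, the following is a minimal sketch in Python. It assumes the WAV files are uncompressed PCM and are stored somewhere under a single local folder; the folder layout and the function names `describe_wav` and `summarize_corpus` are hypothetical and are not part of the Qlean Dataset product. MP3 files are not covered, since the Python standard library cannot read them.

```python
import wave
from pathlib import Path

# Sampling rate listed in the overview table (44.1 kHz).
EXPECTED_RATE_HZ = 44_100


def describe_wav(path):
    """Return (sample_rate_hz, duration_seconds) for one PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return rate, duration


def summarize_corpus(root):
    """Scan root recursively for .wav files.

    Returns the total duration in hours and the names of files whose
    sampling rate differs from EXPECTED_RATE_HZ.
    """
    total_seconds = 0.0
    off_spec = []
    for path in sorted(Path(root).rglob("*.wav")):
        rate, duration = describe_wav(path)
        total_seconds += duration
        if rate != EXPECTED_RATE_HZ:
            off_spec.append(path.name)
    return total_seconds / 3600, off_spec
```

Calling `summarize_corpus("path/to/corpus")` on a local copy would return a total that can be compared with the approximately 50 hours listed above, along with any files that do not match the listed sampling rate.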

Use Case Examples for the Japanese Two-Speaker Fashion & Beauty Dialogue Speech Corpus

【Research Applications】

Analysis of Speaker Turn-Taking and Response Structures
In ASR and dialogue understanding research, the dataset can be used to evaluate and validate models that analyze speaker alternation and interactive response patterns in two-speaker dialogue speech.

Domain-Specific NLP Research Using Dialogue Corpora
Because it contains vocabulary and expressions specific to the fashion and beauty domain, the dataset supports linguistic feature analysis and domain adaptation studies in NLP research.

【Industrial Applications】

Training Data for Conversational AI Systems
The dataset can be used as training data for speech recognition and dialogue understanding models in voice-based AI assistants and chatbots for fashion- and beauty-related services.

Dialogue Understanding Evaluation for Customer Support AI
Featuring natural conversations that include product recommendations and advice, the corpus is suitable for validating dialogue understanding accuracy and response design in customer support and service-oriented AI systems.

About Qlean Dataset

Qlean Dataset is a commercial-use-ready AI training data solution provided by Amana Images Inc., a subsidiary of Visual Bank Inc.
It supports a wide range of data types, including images, videos, audio, 3D assets, and text, enabling both research and commercial AI development in a legally safe environment.
Through collaborations with data partners such as Chiba Lotte Marines Co., Ltd. and Toyo Keizai Inc., Qlean Dataset continues to expand its specialized, industry-focused lineup known as the “AI Data Recipe.”
By reducing the operational burden of data collection and preparation, Qlean Dataset helps organizations establish AI development environments that are both legally compliant and risk-free.

▶ Qlean Dataset: https://qleandataset.visual-bank.co.jp/en
▶ AI Data Recipe: https://qleandataset.visual-bank.co.jp/en/lineup

Key Features of Qlean Dataset

Existing datasets deliverable within one business day
Custom data collection and recording services available

▶ Contact: https://qleandataset.visual-bank.co.jp/en/contact

About Visual Bank Inc.

Visual Bank Inc. is a Tokyo-based startup building next-generation data infrastructure to enhance AI development capabilities under the mission “Unlocking Data Accessibility.”
The company operates THE PEN, an AI-assisted creative tool for manga artists, and the Qlean Dataset service.
Its subsidiaries include Amana Images Inc., one of Japan’s largest stock photo providers; Qlean Dataset, which leads research and development in AI data; and THE PEN Inc., which provides an AI-assisted creative tool for manga artists.

CEO: Saneyuki Nagai
Address: 6F, C-Cube Minami Aoyama Building, 7-1-7 Minami-Aoyama, Minato-ku, Tokyo 107-0062
Corporate Site: https://visual-bank.co.jp/en
Amana Images: https://qleandataset.visual-bank.co.jp/en/company-overview