5/19/2026

Qlean Dataset Launches Japanese Regional Dialect Speech Dataset for ASR, LLM, and TTS Development

Visual Bank, Inc. (Minato, Tokyo; CEO: Masayuki Nagai), through its subsidiary amanaimages Inc., has released the Japanese Regional Dialect Conversational Speech Dataset under its AI training data solution Qlean Dataset.

■  What Is a Japanese Dialect Speech Dataset? 

A Japanese dialect speech dataset is a speech corpus covering region-specific phonetics, accents, and vocabulary that are absent from standard Japanese (hyojungo) corpora. It is used as machine learning data for ASR robustness benchmarking, LLM dialect comprehension, and region-specific TTS development. This dataset is part of the AI Data Recipe lineup, and custom dialect recordings are available on request.

■  Dataset Specifications 

The dataset contains spontaneous two-speaker conversational audio recorded by male and female native speakers of the Kansai (Osaka-ben) and Hiroshima dialects. It captures naturalistic intonation, sentence-final particles, and turn-taking patterns that reflect real-world spoken Japanese.

Data Type: Audio (two-speaker dialogue)
Speakers: Native Japanese speakers by region (gender-labeled)
Format: mp3 / wav
Sample Rate / Bit Depth: 44.1 kHz / 48 kHz, 16-bit / 24-bit
Dialects: Kansai (Osaka-ben), Hiroshima dialect, and more
License: Commercially licensed

→ Sample data & full details: https://qleandataset.visual-bank.co.jp/en/lineup/ds-098
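
As an illustration of how the listed audio specifications might be checked on delivery, the following is a minimal sketch using only Python's standard-library wave module. The function names check_wav and make_silent_wav are hypothetical, the check covers uncompressed PCM wav files only (not mp3), and nothing here reflects an official Qlean Dataset tool.

```python
import io
import wave

# Values taken from the specifications listed above.
ALLOWED_SAMPLE_RATES_HZ = {44_100, 48_000}
ALLOWED_SAMPLE_WIDTHS_BYTES = {2, 3}  # 16-bit and 24-bit PCM


def check_wav(source):
    """Return a list of problems found in a WAV file (empty list if none).

    `source` is a path or a binary file-like object, as accepted by wave.open.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        if rate not in ALLOWED_SAMPLE_RATES_HZ:
            problems.append(f"unexpected sample rate: {rate} Hz")
        if width not in ALLOWED_SAMPLE_WIDTHS_BYTES:
            problems.append(f"unexpected bit depth: {width * 8}-bit")
    return problems


def make_silent_wav(rate_hz, width_bytes, n_frames=100):
    """Build a tiny mono WAV in memory so the check can be tried without real data."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width_bytes)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * width_bytes * n_frames)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # A synthetic 16 kHz file is flagged; a 48 kHz / 16-bit file passes.
    print(check_wav(make_silent_wav(16_000, 2)))
    print(check_wav(make_silent_wav(48_000, 2)))
```

In practice check_wav would be pointed at the delivered wav files; the in-memory helper exists only so the sketch runs without any dataset files.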

 

■ FAQ 

Q: How can this dataset be used for ASR development? 
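A: Use it to measure word error rate (WER) on Kansai and Hiroshima dialect audio for ASR models whose Japanese training data is predominantly standard Japanese, such as Whisper or models built with the ESPnet toolkit, and so quantify the robustness gap relative to standard-Japanese speech. It also serves as fine-tuning data for dialect adaptation via LoRA or full fine-tuning.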

Q: How does this dataset support LLM development?
A: Dialogue transcripts capture dialect-specific particles, including sentence-final particles, and the accompanying audio captures intonation patterns. Together they can be used as training or evaluation data for dialect-to-standard-Japanese style transfer, context-dependent semantic interpretation, and discourse structure analysis.

Q: Can this data be used for TTS fine-tuning?
A: Yes. The naturalistic prosody of Kansai and Hiroshima dialects makes this corpus well-suited for fine-tuning models such as VITS or StyleTTS to generate region-specific speech for local service agents, guide robots, or dialogue characters.

Q: Is custom dialect recording available beyond Kansai and Hiroshima?
A: Yes. Qlean Dataset supports custom data collection for additional regional dialects, specific age groups, or targeted conversational scenarios based on your development requirements.
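
The ASR answer above describes measuring error rates on dialect audio and comparing them with standard-Japanese results. A minimal sketch of that calculation follows, in plain Python with no ASR toolkit involved. The function names are hypothetical, the transcripts are assumed to be already normalized, and wer additionally assumes the text has been segmented into space-separated words, which raw Japanese text is not. Character error rate (CER) avoids that segmentation step.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    previous = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        current = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion of a reference item
                current[j - 1] + 1,      # insertion of a hypothesis item
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def error_rate(pairs, tokenize):
    """Corpus-level error rate: total edits divided by total reference tokens."""
    edits = 0
    ref_tokens = 0
    for reference, hypothesis in pairs:
        ref = tokenize(reference)
        hyp = tokenize(hypothesis)
        edits += edit_distance(ref, hyp)
        ref_tokens += len(ref)
    if ref_tokens == 0:
        raise ValueError("reference text is empty")
    return edits / ref_tokens


def cer(pairs):
    """Character error rate; spaces are ignored."""
    return error_rate(pairs, lambda text: list(text.replace(" ", "")))


def wer(pairs):
    """Word error rate; assumes space-separated words (pre-segmented text)."""
    return error_rate(pairs, str.split)


def robustness_gap(standard_pairs, dialect_pairs, metric=cer):
    """Dialect error rate minus standard-Japanese error rate for one model."""
    return metric(dialect_pairs) - metric(standard_pairs)
```

Each element of pairs is a (reference transcript, model output) tuple. Running robustness_gap before and after adaptation on the same held-out dialect set gives a simple before/after comparison.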

 

■ Use Cases 

  • ASR Robustness Benchmarking — Kansai & Hiroshima Dialect Audio

    Evaluate how well standard-Japanese ASR models handle regional speech variants using WER and CER metrics. Quantify the dialect performance gap before and after domain adaptation.

  • Dialect Adaptation Fine-Tuning

    Use as few-shot or LoRA fine-tuning data to adapt ASR models to regional Japanese. Experiment with standard/dialect corpus mixing ratios to optimize generalization without catastrophic forgetting.

  • LLM Dialect Understanding & Style Transfer

    Train and evaluate LLMs on dialect-to-standard-Japanese conversion, sentiment analysis of dialect text, and discourse structure tasks using authentic conversational transcripts.

  • Region-Specific TTS — Kansai & Hiroshima Dialect Voice Synthesis

    Fine-tune VITS, StyleTTS, or similar architectures on naturalistic dialect prosody to build voice synthesis engines for regional service applications or conversational AI characters.

  • Domain-Adapted STT for Contact Centers

    Build custom language models and lexicons for environments where regional Japanese is prevalent. Combine with custom vocabulary features in Google STT or Amazon Transcribe for region-optimized speech-to-text pipelines.
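
The Dialect Adaptation Fine-Tuning use case above mentions experimenting with standard/dialect corpus mixing ratios. One way such an experiment might be set up is sketched below. The function mix_corpora and its parameters are hypothetical, and the sketch only builds the mixed training list. Whether a given ratio avoids catastrophic forgetting still has to be measured on held-out standard and dialect test sets.

```python
import random


def mix_corpora(standard, dialect, dialect_ratio, total, seed=0):
    """Sample `total` training items with roughly `dialect_ratio` dialect share.

    Sampling is with replacement, so a small dialect corpus can be upsampled.
    """
    if not 0.0 <= dialect_ratio <= 1.0:
        raise ValueError("dialect_ratio must be between 0 and 1")
    n_dialect = round(total * dialect_ratio)
    n_standard = total - n_dialect
    if (n_dialect and not dialect) or (n_standard and not standard):
        raise ValueError("cannot sample from an empty corpus")
    rng = random.Random(seed)  # fixed seed keeps each experiment reproducible
    dialect_part = rng.choices(dialect, k=n_dialect) if n_dialect else []
    standard_part = rng.choices(standard, k=n_standard) if n_standard else []
    mixed = dialect_part + standard_part
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Placeholder utterance IDs; real entries would be audio/transcript records.
    standard = [("std", i) for i in range(50)]
    dialect = [("dia", i) for i in range(5)]
    for ratio in (0.1, 0.3, 0.5):
        mixed = mix_corpora(standard, dialect, ratio, total=100)
        share = sum(1 for tag, _ in mixed if tag == "dia") / len(mixed)
        print(f"requested {ratio:.1f}, got {share:.2f}")
```

Sweeping dialect_ratio and fine-tuning once per setting gives a simple grid over which dialect and standard-Japanese error rates can be compared.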

About Qlean Dataset

Qlean Dataset is a commercially licensed AI training data solution provided by amanaimages Inc., a wholly owned subsidiary of Visual Bank. All datasets are rights-cleared for commercial use, giving AI developers a legally secure environment to source and deploy high-quality training data.
The platform covers audio, image, video, 3D, and text modalities — serving foundation model developers and applied AI teams alike. Through partnerships with domestic and international data holders, broadcasters, newspapers, and newswire agencies, Qlean Dataset continuously expands its AI Data Recipe lineup of industry-specific, trend-driven datasets. Existing datasets ship within 2 business days; custom recording and data collection are also available on request.

URL: https://qleandataset.visual-bank.co.jp/en
URL: https://qleandataset.visual-bank.co.jp/en/products/japanese-language-corpora

About Visual Bank Inc.

Visual Bank Group is a technology company developing data infrastructure and AI solutions that support advanced AI development. The company operates THE PEN, an AI tool for manga creators, and its subsidiary, amanaimages Inc., provides commercial digital content and AI training data solutions, including Qlean Dataset. Visual Bank is also a selected participant in GENIAC, a Japanese government initiative supporting the advancement of next-generation AI technologies.

CEO: Masayuki Nagai
Website: https://visual-bank.co.jp/en

    © amanaimages inc.