DefinedData is an online catalog of off-the-shelf AI datasets, available in multiple languages, dialects, domains, and pricing options. These pre-collected datasets, annotated and validated by a global crowd, can be used to train baseline models or evaluate and improve current models. It comprises:
Monologue speech training data
Dialogue speech gold sets
Scripted and spontaneous speech
10 different languages
5 different industry domains
A balanced, wide range of demographics
Specialized grammars
Language Coverage
English, French, Spanish (Castilian), Italian, Portuguese, German, Japanese, Dutch (Flemish)