OEM Resources

PDPL

We offer a product line of PDPL pronunciation databases (lexicons) for Polish, intended for developers of Polish speech processing systems. They are the largest available collections of high-quality Polish pronunciation, morphosyntactic, and rough rank information. The feature chart below compares the databases.

Please contact us at info@polfonetika.com for more detailed information.

Name | PDPL-1 | PDPL-2 | PDPL-3
Lexemes | 10,000 | 75,000 | 125,000
Common wordforms | 102,000 | 4,200,000 | 6,800,000
Wordform selection | the most frequent | all | all
Proper nouns | 5,000 | 200,000 | 200,000
Proper noun types | various | first/last names, towns, streets, places, companies | first/last names, towns, streets, places, companies
Total wordforms | 102,000 | 4,400,000 | 7,000,000
Lemma | no | yes | yes
Morphosyntactics | no | yes | yes
Rank information | no | clustered | clustered
Format | CMU/SPHINX | own/XML | own/XML
Phonesets | SAMPA-86(39), SAMPA-37 | SAMPA-86(39), SAMPA-37 | SAMPA-86(39), SAMPA-37
Regional pronunciations | Cracow-Poznan, Warsaw | Cracow-Poznan, Warsaw | Cracow-Poznan, Warsaw
Pronunciation contexts | as found in corpora | isolated | isolated
Phonetization type | automatic | automatic/verified | automatic/verified
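
For developers evaluating the CMU/SPHINX format listed for PDPL-1, the following is a minimal sketch of reading a dictionary in the usual CMU/Sphinx text layout: one entry per line, a wordform followed by space-separated phone symbols, with alternative pronunciations marked as `word(2)`, `word(3)`, and so on. The sample entries and the function name `parse_sphinx_dict` are invented for illustration and are not taken from any PDPL database; the exact layout of the delivered files should be confirmed with the vendor.

```python
import re
from collections import defaultdict

# Invented sample in CMU/Sphinx dictionary layout (assumption, not PDPL data).
# The phone symbols are placeholders, not verified transcriptions.
SAMPLE = """\
;;; comment lines start with three semicolons
dom d O m
kot k O t
kot(2) k o t
"""

# Splits "kot(2)" into the wordform "kot" and the variant index "2".
_VARIANT = re.compile(r"^(?P<word>.+?)(?:\((?P<n>\d+)\))?$")


def parse_sphinx_dict(text):
    """Map each wordform to a list of pronunciations (lists of phones)."""
    lexicon = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        head, *phones = line.split()
        word = _VARIANT.match(head).group("word")
        lexicon[word].append(phones)
    return dict(lexicon)


lexicon = parse_sphinx_dict(SAMPLE)
for word, variants in lexicon.items():
    print(word, [" ".join(v) for v in variants])
```

Grouping variants under one key keeps regional or alternative pronunciations of a wordform together, which is convenient when a recognizer or synthesizer has to choose among them.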

PLPL

PLPL pronunciation databases are intended for publishers of bilingual dictionaries and of Polish language learning resources. Each entry consists of a base wordform and one or more manually verified transcriptions. The feature chart below gives more detail on the available PLPL databases.

Name | PLPL-2
Lexemes | 78,000
Common wordforms | 72,000
Wordform selection | base form only
Proper nouns | 6,000
Proper noun types | various
Total wordforms | 78,000
Lemma | yes
Morphosyntactics | no
Rank information | no
Format | own/XML
Phonesets | SAMPA-86(39), SAMPA-37
Regional pronunciations | Warsaw
Pronunciation contexts | isolated
Phonetization type | verified

PXPL

PXPL pronunciation databases combine the features of the PDPL and PLPL series with additional data: syllable boundaries, lexical accent marks, and detailed rank information.
