Unicode & IPA Characters

The following table lists resources that enable corpus compilers and transcribers to understand or produce corpus materials in formats that can be exchanged universally, at least as far as character encoding is concerned.

ScriptSource

IPA fonts and fonts for many languages from the Summer Institute of Linguistics. See also the fonts page at the LINGUIST LIST site and the Glasgow University site.

Alan Wood’s Unicode Resources

Unicode and Multilingual Support in HTML, Fonts, Web Browsers and Other Applications

Font set for Japanese, Chinese and other characters (Konjaku-Mojikyo)

The "Konjaku-Mojikyo" font set includes about 20,000 Chinese characters defined by Unicode (ISO 10646) and about 50,000 Chinese characters collected in the "Dai Kanwa Dictionary" by Professor Morohashi. It also covers oracle bone inscriptions, Siddham (Sanskrit) characters, Japanese kana, Chu Nom (used in medieval Vietnam), Shui script (used by a Chinese ethnic minority group), Tangut (Xixia) script, various symbols and so forth.

IPAKLICK

A browser-based tool (a JavaScript keyboard) that makes it easy to insert strings of IPA symbols (Unicode) into a text via the clipboard. The site also links to free Unicode fonts that include the IPA symbols and to 'superlinguistic' names for consonants, in which the coarticulatory and perceptual effects of consonants on vowels are exploited.

IPA-SAM phonetic fonts

Free TrueType® fonts for Windows. With them installed, you can display phonetic symbols on the screen and print them out in any size. The IPA-SAM character set includes all the symbols of the International Phonetic Alphabet as currently recognized by the IPA. There are three typefaces: Doulos (similar to Times), Sophia (sans serif) and Manuscript (similar to Courier, monospaced). All are available in regular, bold, italic, and bold italic.

(Phonetic) Transcription Editor

As the name says, mainly an editor for creating phonetic transcriptions, which allows output to be saved to a UTF-8 encoded text file or (double-spaced) HTML page, suitable for submission of assignments.
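To illustrate what saving a transcription as UTF-8 involves, here is a minimal Python sketch. It is not taken from the Transcription Editor itself. The sample transcription and the file name are assumptions chosen for illustration, and a temporary folder is used only to keep the example self-contained.

```python
import os
import tempfile

# Sample IPA transcription (an illustrative assumption, not editor output).
transcription = "ˈfəʊnətɪk trænˈskrɪpʃn"

# Hypothetical file name inside a throwaway temporary folder.
path = os.path.join(tempfile.mkdtemp(), "transcription.txt")

# Writing with an explicit encoding avoids depending on the platform default.
with open(path, "w", encoding="utf-8") as handle:
    handle.write(transcription)

# Reading back with the same encoding should restore the text unchanged.
with open(path, encoding="utf-8") as handle:
    restored = handle.read()

print(restored == transcription)

# Most IPA symbols take more than one byte in UTF-8, so the byte count
# is larger than the character count.
print(len(transcription), "characters,", len(transcription.encode("utf-8")), "bytes")
```

Naming the encoding explicitly on both write and read is the main point here. Relying on a platform default is a common way for IPA symbols to become corrupted when files move between systems.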

The program also provides a (very basic) option for grapheme-to-phoneme conversion. This option has some serious limitations, however, as it ‘knows’ nothing about strong and weak syllables or features of connected speech.
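The limitation can be pictured with a toy word-by-word lookup. This is only a sketch and makes no claim about how the Transcription Editor actually performs its conversion. The lexicon, its British English citation forms and the function name `naive_g2p` are all hypothetical.

```python
# Hypothetical citation-form lexicon (illustrative only, British English).
LEXICON = {
    "i": "aɪ",
    "want": "wɒnt",
    "to": "tuː",
    "go": "ɡəʊ",
}


def naive_g2p(sentence):
    """Look up each word's citation form; flag unknown words with '?'."""
    words = sentence.lower().split()
    return " ".join(LEXICON.get(word, "?" + word) for word in words)


print(naive_g2p("I want to go"))
```

Because every word is looked up in isolation, "to" always receives its strong form here, whereas in connected speech it would typically be weak. A lookup of this kind has no information about stress or neighbouring words from which to make that choice.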

The characters that appear on the toolbars can also be edited, provided the user has write-access to the installation folder.

Unicode web site

Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. Unicode enables a single software product or a single web site to be targeted across multiple platforms, languages and countries without re-engineering. It allows data to be transported through many different systems without corruption.
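As a small illustration of the "unique number for every character" idea, the following Python sketch prints the code point and official Unicode name of each character in a short IPA string. The helper name `describe` and the sample string are assumptions made for this example. `ord` and `unicodedata.name` are part of the Python standard library.

```python
import unicodedata


def describe(text):
    """Return (character, 'U+XXXX' code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# IPA transcription of the word "she".
for ch, code_point, name in describe("ʃiː"):
    print(ch, code_point, name)

# Prints:
# ʃ U+0283 LATIN SMALL LETTER ESH
# i U+0069 LATIN SMALL LETTER I
# ː U+02D0 MODIFIER LETTER TRIANGULAR COLON
```

The same numbers identify these characters on any platform and in any program, which is what allows a transcription to be exchanged without corruption as long as the encoding is preserved.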