Re: A request for text samples in Native Languages

Alice Harman (aharman@teleport.com)
Mon, 16 Oct 1995 12:55:01 -0700


nramshaw@pine.shu.ac.uk (Neil Ramshaw) writes:

> I'm trying to build a system that, given a piece of text, will try to
> identify the language the text is written in...
>
> If anyone could help me out, and supply me with a couple of pages of text in
> any language (including diacritics, special characters, etc.), or if you
> know where I can find sources of text ... , I'd be most grateful.

Niix maicqui,
In our languages (Sahaptin, Wasco & Paiute) we use the barred-l, the
t-barred-l, an underlined x, and an accent mark. We can get everything
except the accent mark on our ASCII keyboard. The accent works only
when it's over a vowel, but some of our words have an accent over an m
or an s.
Good luck with your project.

Alice Harman
aharman@teleport.com