> I'm trying to build a system that, given a piece of text, will try to
> identify the language the text is written in...
>
> If anyone could help me out, and supply me with a couple of pages of text in
> any language (including diacritics, special characters, etc.), or if you
> know where I can find sources of text ... , I'd be most grateful.
Niix maicqui,
In our languages (Sahaptin, Wasco & Paiute) we use the barred-l, the
t-barred-l, an underlined x, and an accent mark. We can get everything
except the accent mark on our ASCII keyboard. The accent works only
when it is over a vowel, but some of our words have an accent over an
m or an s.
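
For anyone building the language identifier, here is a minimal sketch of one way accented consonants and an underlined x could be handled, assuming the text is available as Unicode. The message above does not say how these letters are encoded, so the code points chosen here (U+0301 for the accent, U+0331 for the underline) and the helper names with_acute and clusters are illustrative assumptions only.

```python
import unicodedata

# Assumed code points (not taken from the original message):
COMBINING_ACUTE = "\u0301"         # COMBINING ACUTE ACCENT
COMBINING_MACRON_BELOW = "\u0331"  # one possible way to encode an underline


def with_acute(base: str) -> str:
    """Append a combining acute accent to `base` and NFC-normalize the result."""
    return unicodedata.normalize("NFC", base + COMBINING_ACUTE)


def clusters(text: str) -> list[str]:
    """Split text into base characters, each with its trailing combining marks."""
    out: list[str] = []
    for ch in text:
        if out and unicodedata.combining(ch):
            out[-1] += ch  # attach the mark to the preceding base character
        else:
            out.append(ch)
    return out


if __name__ == "__main__":
    # Show the code point names for an accented m, an accented s,
    # and an x carrying the assumed underline mark.
    for sample in (with_acute("m"), with_acute("s"), "x" + COMBINING_MACRON_BELOW):
        print([unicodedata.name(ch) for ch in sample])
```

Counting letters per cluster rather than per code point would keep an underlined x or an accented m as a single symbol in any character-frequency statistics, whichever encoding the source text happens to use.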
Good luck with your project.
Alice Harman
aharman@teleport.com