New AlchemyAPI Release: Identify content in 97 languages!

Posted by: admin on August 6th, 2009

Another AlchemyAPI release is upon us, providing a significant update to our “automatic language identification” capability, new performance and character encoding enhancements, and more!

AlchemyAPI utilizes statistical and lexical techniques to automatically determine the language of any content it processes.  Our natural language processing core leverages this information to provide increased accuracy when categorizing text, extracting keywords and named entities, and so on.
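To give a feel for what a lexical technique can look like, here is a minimal sketch that scores text against small lists of common function words. It is not AlchemyAPI's implementation: the word lists, the `STOPWORDS` table, and the `detect_language` function are assumptions made up for this illustration, and a production detector would combine much larger lexicons with statistical models.

```python
import re
from typing import Optional

# Tiny, hand-picked function-word lists (illustrative assumption only;
# not the lexicons used by any real service).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "it"},
    "fr": {"le", "la", "et", "est", "de", "les", "un", "une", "que"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
    "es": {"el", "la", "y", "es", "de", "los", "un", "una", "que"},
}


def detect_language(text: str) -> Optional[str]:
    """Return the code whose word list matches the most tokens, or None."""
    # Lowercase and split into word tokens (Unicode-aware in Python 3).
    tokens = re.findall(r"\w+", text.lower())
    # Count how many tokens appear in each language's word list.
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No matching words at all means we cannot make a guess.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(detect_language("The cat is in the garden"))
    print(detect_language("Le chat est sur la table"))
```

Even this toy version can make a guess from a handful of words, because function words are frequent and fairly distinctive. Words shared between languages (such as "la" and "de" in the French and Spanish lists) are the reason a real system needs more evidence than a short word list.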

Our new automatic language identification capability represents a huge leap in functionality: AlchemyAPI can now identify content written in 97 different languages!  This includes nearly all of the world’s major languages, as well as less widely spoken and regional languages such as Cherokee, Dakota, and Macedonian.

AlchemyAPI’s language detector is extremely robust, capable of generating a match from only a few words of text.  It operates at a high level of precision, and is extremely fast.  We detect more languages, with better accuracy, than any other language detection service in the industry today.

An interactive demo of our language detector is available here.

Our language detection API now also returns additional data: ISO-639 language codes, links to language information on Ethnologue and Wikipedia, the number of native speakers for each language, and more!
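As a rough sketch of how a client might hold the fields listed above, the record below groups them into one typed structure. The class name, field names, and the `parse_language_info` helper are hypothetical and do not reflect the actual response format, which this post does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageInfo:
    """Hypothetical container for per-language metadata (names assumed)."""
    name: str
    iso639_1: Optional[str]   # two-letter code; not every language has one
    iso639_3: str             # three-letter code
    ethnologue_url: str
    wikipedia_url: str
    native_speakers: Optional[int] = None


def parse_language_info(payload: dict) -> LanguageInfo:
    """Build a LanguageInfo from a dict with the assumed keys."""
    raw_speakers = payload.get("native_speakers")
    return LanguageInfo(
        name=payload["name"],
        iso639_1=payload.get("iso639_1"),
        iso639_3=payload["iso639_3"],
        ethnologue_url=payload["ethnologue_url"],
        wikipedia_url=payload["wikipedia_url"],
        # Counts may arrive as strings; normalise to int when present.
        native_speakers=int(raw_speakers) if raw_speakers is not None else None,
    )


if __name__ == "__main__":
    # Cherokee has the ISO 639-3 code "chr" and no two-letter ISO 639-1 code.
    # The speaker count is deliberately left unset rather than guessed.
    info = parse_language_info({
        "name": "Cherokee",
        "iso639_3": "chr",
        "ethnologue_url": "https://www.ethnologue.com/language/chr",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Cherokee_language",
    })
    print(info)
```

Keeping the two-letter code optional matters for a detector that covers 97 languages, since many less widely spoken languages only have three-letter codes.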

If you are currently doing language / information-extraction research, are a linguist, or are working with a language that we do not currently support, we’d love to hear from you.

Entry Filed under: AlchemyAPI, Company, NLP, Releases

