Support for languages other than English

The first version of Anagram Genius was written in 1988, and since then the software has been used to discover tens of thousands of fantastic new English-language anagrams.

However, with the release of version 9 in 2003, anagrams in other languages are now partially supported, perhaps opening the door to similar numbers of discoveries in hundreds of new tongues!

The support is only partial at the moment, though. There are two main constraints:

  1. The language must be writable with the Latin alphabet (i.e. "A" to "Z") or a subset of it.
  2. Each word must be allocated an approximate English part of speech, and English grammar rules will be applied.

Constraint 1 probably rules out Greek, Arabic or Russian, but at a stretch a language which uses a different alphabet could be used provided it has 26 or fewer distinct letters. This can be achieved by mapping each letter of the foreign alphabet onto a letter of the Latin alphabet and spelling all of the language's words using the mapped letters.

This technique is probably impractical for completely different alphabets, but it may allow some languages with similar alphabets to be used. For example, some eastern European alphabets are largely Latin but have accented letters which are considered completely different letters from their non-accented versions. These accented letters need to be preserved during anagramming, but they may be mappable onto letters of the Latin alphabet which are not used (or are very rarely used) in the language.
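As a rough illustration of this mapping idea, here is a small Python sketch. It is not part of Anagram Genius, and the function names are hypothetical. It assumes Slovene as the example language, whose alphabet includes č, š and ž but does not use q, w or x, so those three spare Latin letters can stand in for the accented ones. The particular pairing chosen here is arbitrary.

```python
import unicodedata

# Illustrative assumption: Slovene does not use q, w or x, so these
# spare Latin letters stand in for its three accented letters.
TO_LATIN = {"č": "q", "š": "w", "ž": "x"}
FROM_LATIN = {latin: accented for accented, latin in TO_LATIN.items()}


def encode_word(word: str) -> str:
    """Spell a word using only the letters a-z (for the lexicon file)."""
    # NFC makes sure each accented letter is a single character.
    word = unicodedata.normalize("NFC", word).lower()
    return "".join(TO_LATIN.get(ch, ch) for ch in word)


def decode_word(word: str) -> str:
    """Turn a mapped word from an anagram back into its real spelling."""
    return "".join(FROM_LATIN.get(ch, ch) for ch in word.lower())


print(encode_word("žaba"))    # xaba
print(decode_word("qewnja"))  # češnja
```

Under these assumptions, every word in the lexicon and the subject text itself would be encoded before anagramming, and the anagrams found would be decoded afterwards. Decoding is only safe because the stand-in letters never occur in genuine words of the language.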

In some languages, such as French, accents are considered punctuation, and it is perfectly acceptable to anagram an e acute (é), say, into an e grave (è) or an e with no accent at all. Constraint 1 therefore poses no problem for French, as the words can simply be listed in the lexicon without accents.
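For a language like French, an existing word list could be prepared for the lexicon by stripping the accents first. The Python sketch below shows one way this might be done using the standard unicodedata module. The function name strip_accents is a hypothetical helper, not something provided by Anagram Genius.

```python
import unicodedata


def strip_accents(word: str) -> str:
    """Return the word with accents removed, e.g. for a lexicon entry."""
    # NFD splits an accented letter into a base letter followed by
    # one or more combining marks; the marks are then dropped.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("élève"))   # eleve
print(strip_accents("garçon"))  # garcon
```

Note that this approach only removes accents. Ligatures such as œ are not decomposed by this normalization and would need to be replaced separately (for example with "oe") before the word is listed.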

Constraint 2 will result in anagrams being produced with poor grammar and word ordering in the new language. However, the word ordering can be changed at the Weed Stage, and it is likely that many gems will still be present. Early versions of Anagram Genius did no word ordering at all and still regularly produced good anagrams.

The key to implementing a new language is to create a lexicon for it in the form of a custom dictionary. How to create and format a custom dictionary is described in the help topic Custom Dictionaries. All the fields except the part of speech field are filled in exactly as described there. The part of speech field needs to be improvised to get as close to English grammar as possible: for each word, select the English part of speech that best approximates the word's use in a sentence.

It is important that the custom dictionary containing the words of the new language has a name starting with "lex", e.g. "lexfrench.txt". This tells Anagram Genius that the words in this custom dictionary should be scored as if they came from its main lexicon, and not given a particularly high score as they would be if it were a normal custom dictionary. It also means that "real" custom dictionaries can be created in the new language and, if selected, their words will score better than the ones in the lexicon.

Once the lexicon has been created, creating anagrams in the new language is simply a matter of entering the subject text, selecting the lexicon in the list of custom dictionaries, and selecting the "Use custom dictionaries only" flag. This flag stops English words in the main lexicon from being found and used in anagram generation. The result is anagrams in the new language!

This process is new to version 9, and support for other languages will probably improve in the future. However, there has already been some success using these techniques to create German anagrams. If you have developed a lexicon for another language, please get in touch by emailing us at genius2000@genius2000.com

If you believe you can market a non-English version of Anagram Genius, we also definitely want to hear from you!