obfuscated for machine processing, but is still intended to be readable by humans?
You should consider phonetic metrics and the corresponding algorithms. Character-based models may also be able to resolve this.
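To make the phonetic idea concrete, here is a minimal sketch of a simplified Soundex-style key in Python. It is only one example of a phonetic metric, not something named in this thread; the function name `soundex_like` and the example words are made up for illustration, and the letter groups follow the classic Soundex table for English.

```python
# Simplified Soundex-style key: words that sound alike get the same code.
# Illustrative sketch only; real systems may prefer Metaphone or similar.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex_like(word):
    """Return a 4-character phonetic key (first letter + up to 3 digits)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        # Add a digit only when it differs from the previous coded letter.
        if digit and digit != prev:
            key += digit
        # 'h' and 'w' do not break a run of identical codes; vowels do.
        if ch not in "hw":
            prev = digit
    return (key + "000")[:4]


if __name__ == "__main__":
    print(soundex_like("smith"), soundex_like("smyth"))
    print(soundex_like("Robert"), soundex_like("Rupert"))
```

Two spellings can then be treated as candidates for the same word when their keys match. Note that this kind of key keeps the first letter, so it will not catch substitutions at the start of a word.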
I think you should use a lot of rules (such as removing repeated non-letter symbols and mapping look-alike characters like O == 0, and so on). If you want a machine learning solution to the problem, you can use any kind of word embedding (like fastText or word2vec): you can find similar words with such an embedding model.
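The rule-based part could look roughly like the sketch below. The look-alike table, the function name `normalize`, and the choice to collapse runs of three or more identical letters are all assumptions for illustration; a real rule set would need to be tuned on your own data.

```python
import re

# Hypothetical look-alike table (digits and symbols that imitate letters).
LOOKALIKES = str.maketrans({
    "0": "o", "1": "l", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def normalize(token):
    """Undo simple obfuscation in a single token (not a whole sentence)."""
    text = token.lower().translate(LOOKALIKES)
    # Drop every non-letter symbol, e.g. the dots in "f.r.e.e".
    text = re.sub(r"[^a-z]+", "", text)
    # Collapse runs of 3 or more identical letters down to 2.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    return text


if __name__ == "__main__":
    for raw in ("F.r.3.3", "M0N--EY", "h3lllllo"):
        print(raw, "->", normalize(raw))
```

Because the second rule removes everything that is not a letter, this has to be applied token by token, after splitting on whitespace. The normalized token can then be looked up in a dictionary or passed to an embedding model to find similar words.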