Similarity

How does it work?

This page uses different algorithms to calculate the "distance" or "similarity" between two texts.

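We use the Levenshtein distance, which is the minimum number of single-character insertions, deletions, and substitutions needed to turn one text into the other. It is useful for finding typos or similar phrasings with minor variations. (Wikipedia)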
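
To make the idea concrete, here is a minimal Python sketch of the Levenshtein distance using the classic dynamic-programming approach. It is an illustration only, not necessarily the code this page runs; the function name `levenshtein` is an assumption chosen for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b (illustrative sketch)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the first i-1 characters
    # of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,      # delete char_a
                    current[j - 1] + 1,   # insert char_b
                    previous[j - 1] + substitution_cost,
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Two identical texts give a distance of 0, and every extra typo or changed character adds to the count.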

We also use an n-gram overlap algorithm, which breaks each text into overlapping sequences of characters (n-grams) of a fixed length n. The similarity score is then determined by the number of n-grams the two texts have in common. This method is helpful for identifying texts that share similar phrases or word order.
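
The n-gram approach can be sketched in Python as follows. The page does not say which n-gram length or normalization it uses, so this example assumes character trigrams (n = 3) and a Jaccard-style ratio (shared n-grams divided by all distinct n-grams); the names `char_ngrams` and `ngram_similarity` are hypothetical.

```python
def char_ngrams(text: str, n: int = 3) -> set:
    """Return the set of overlapping character n-grams in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Return a score between 0 (no shared n-grams) and 1 (same n-grams).

    Assumption: Jaccard ratio over sets of n-grams; the page may
    normalize differently.
    """
    grams_a = char_ngrams(a, n)
    grams_b = char_ngrams(b, n)
    if not grams_a and not grams_b:
        # Both texts are shorter than n, so fall back to exact comparison.
        return 1.0 if a == b else 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    print(ngram_similarity("the quick brown fox", "the quick brown dog"))
```

With this normalization, texts with the same set of n-grams score 1 and texts with no n-gram in common score 0.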

How to read the results?

N-gram score: the best result approaches 0, which means the two texts share almost no n-grams. A score near 1 means the two texts are very similar.

Levenshtein score: the best result is a large number, which means many edits are needed to turn one text into the other. A score near 0 means the two texts are very similar, and a score of 0 means they are identical.