MOSS (Stanford)
Tokenizes each source file, drops comments and whitespace, hashes overlapping k-grams of the token stream, selects a subset of those hashes as fingerprints with the winnowing algorithm, and reports pairs of files that share many fingerprints. Output: for each pair, the percentage of each file that overlaps the other, plus a side-by-side view with the matched regions highlighted.
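
The fingerprinting step can be sketched in a few lines of Python. This is an illustration of winnowing in general, not MOSS's actual code: the character-level normalization (standing in for a real tokenizer), the SHA-1 based hash, the values of k and the window size, and the function names normalize, kgram_hashes, winnow and similarity are all assumptions made for the example. The score computed here is a simple fingerprint-overlap ratio and is not claimed to match the percentage MOSS reports.

```python
import hashlib
import re


def normalize(source: str) -> str:
    """Crude stand-in for tokenization: drop '#' comments and whitespace, lowercase."""
    no_comments = re.sub(r"#[^\n]*", "", source)
    return re.sub(r"\s+", "", no_comments).lower()


def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every overlapping k-character substring (k-gram) of the text."""
    return [
        int.from_bytes(hashlib.sha1(text[i:i + k].encode()).digest()[:4], "big")
        for i in range(len(text) - k + 1)
    ]


def winnow(hashes: list[int], window: int) -> set[tuple[int, int]]:
    """Select the minimum hash of each window (rightmost on ties).

    Returns a set of (hash, position) fingerprints. Consecutive windows
    usually share their minimum, so far fewer fingerprints than hashes are kept.
    """
    fingerprints = set()
    for start in range(len(hashes) - window + 1):
        chunk = hashes[start:start + window]
        smallest = min(chunk)
        # Offset of the rightmost occurrence of the minimum in this window.
        offset = window - 1 - chunk[::-1].index(smallest)
        fingerprints.add((smallest, start + offset))
    return fingerprints


def similarity(a: str, b: str, k: int = 5, window: int = 4) -> float:
    """Illustrative score: shared fingerprint hashes over the smaller fingerprint set."""
    fa = {h for h, _ in winnow(kgram_hashes(normalize(a), k), window)}
    fb = {h for h, _ in winnow(kgram_hashes(normalize(b), k), window)}
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / min(len(fa), len(fb))
```

Because comments and whitespace are removed before hashing, similarity("def add(x, y):\n    return x + y  # sum\n", "def add(x,y):\n\n  return x+y\n") treats the two snippets as identical. The positions stored with each fingerprint are what would let a tool map shared fingerprints back to matching regions in the files.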