:mod:`filter_dataset`
=====================

.. py:module:: filter_dataset

Module Contents
---------------

.. function:: remove_non_typos(dataset:str, filtered_dataset:str)

   Remove identifier pairs that are not typos from the dataset.

   1. Remove examples where the token splits of the wrong and the correct identifiers are equal (they differ only in non-alphabetic characters or casing).
   2. Remove examples where the wrong and the correct identifiers are equal at the lemma level.

   :param dataset: Path to the dataset.
   :param filtered_dataset: Path to save the filtered dataset to.
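
The two filtering rules of ``remove_non_typos`` can be illustrated with a minimal sketch. It is not the module's implementation: ``split_tokens``, ``toy_lemma`` and ``is_typo`` are hypothetical helpers, the splitting regex is an assumption, and the lemmatizer is a deliberately crude placeholder that would need to be replaced by a real one. The dataset file format is not described here, so the sketch works on in-memory pairs only.

.. code-block:: python

   import re

   # Assumed splitting rule: non-letters act as separators, and camelCase
   # boundaries start a new token. Digits and underscores are dropped.
   _TOKEN_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


   def split_tokens(identifier):
       """Split an identifier into lowercase alphabetic tokens (hypothetical helper)."""
       return [token.lower() for token in _TOKEN_RE.findall(identifier)]


   def toy_lemma(token):
       """Placeholder lemmatizer: only strips a trailing plural "s".

       This is an assumption for illustration; swap in a real lemmatizer.
       """
       return token[:-1] if len(token) > 3 and token.endswith("s") else token


   def is_typo(wrong, correct, lemmatize=toy_lemma):
       """Return True if the pair survives both filtering rules."""
       wrong_tokens = split_tokens(wrong)
       correct_tokens = split_tokens(correct)
       # Rule 1: equal token splits mean the identifiers differ only in
       # non-alphabetic characters or casing.
       if wrong_tokens == correct_tokens:
           return False
       # Rule 2: equal lemma sequences mean the identifiers differ only
       # in word form.
       wrong_lemmas = [lemmatize(t) for t in wrong_tokens]
       correct_lemmas = [lemmatize(t) for t in correct_tokens]
       if wrong_lemmas == correct_lemmas:
           return False
       return True


   pairs = [
       ("get_user_name", "getUserName"),  # dropped by rule 1
       ("itemList", "itemsList"),         # dropped by rule 2
       ("recieveData", "receiveData"),    # kept: a genuine typo
   ]
   kept = [pair for pair in pairs if is_typo(*pair)]

Passing the lemmatizer as a parameter keeps the sketch independent of any particular NLP library.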