lookout.style.typos.metrics

Module Contents

class lookout.style.typos.metrics.ScoreMode

Bases: enum.Enum

Modes for calculating typo correction scores.

ScoreMode.detection: The typo is detected correctly: a token is corrected if and only if it is typo-ed.

ScoreMode.correction: Correctly spelled tokens should not be corrected. For typo-ed tokens, the right correction should be among the first k suggestions.

ScoreMode.on_typoed: Same as correction, but only the truly typo-ed tokens are taken into account.

detection = detection
correction = correction
on_typoed = on_typoed
lookout.style.typos.metrics.get_scores(data:pandas.DataFrame, suggestions:Dict[int, List[Candidate]], mode:ScoreMode=ScoreMode.correction, k:int=1)

Calculate the scores of a solution to the specified typo correction problem.

A token is considered corrected when the first suggestion doesn’t match the token. Supports three problems:

ScoreMode.detection: The typo is detected correctly: a token is corrected if and only if it is typo-ed.

ScoreMode.correction: Correctly spelled tokens should not be corrected. For typo-ed tokens, the right correction should be among the first k suggestions.

ScoreMode.on_typoed: Same as correction, but only the truly typo-ed tokens are taken into account.

:param data: DataFrame which is indexed by Columns.Id and has columns Columns.Token and Columns.CorrectToken.
:param suggestions: {id : [(candidate, correct_prob)]}, candidates are sorted by correct_prob in descending order.
:param mode: One of ScoreMode.detection, ScoreMode.correction, ScoreMode.on_typoed.
:param k: Number of the first suggested corrections to check. Used in modes ScoreMode.correction, ScoreMode.on_typoed.
:return: Scores of the suggestions.
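The rules above can be illustrated with a minimal, library-independent sketch. It is not the implementation of get_scores: it uses plain dicts instead of a DataFrame, the names Mode, is_corrected, is_right and fraction_right are hypothetical, and it reports only a plain fraction of tokens handled correctly, whereas the exact contents of the scores returned by get_scores are not specified here. It also assumes a token with no suggestions counts as not corrected, and that a token counts as typo-ed when it differs from its correct token.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Mode(Enum):
    """Hypothetical stand-in for ScoreMode."""
    detection = "detection"
    correction = "correction"
    on_typoed = "on_typoed"


# (candidate, correct_prob), sorted by correct_prob in descending order.
Candidate = Tuple[str, float]


def is_corrected(token: str, candidates: List[Candidate]) -> bool:
    # A token is considered corrected when the first suggestion differs from it.
    # Assumption: a token without any suggestions is not corrected.
    return bool(candidates) and candidates[0][0] != token


def is_right(token: str, correct_token: str, candidates: List[Candidate],
             mode: Mode, k: int = 1) -> bool:
    typoed = token != correct_token
    corrected = is_corrected(token, candidates)
    if mode is Mode.detection:
        # Right when the token is corrected if and only if it is typo-ed.
        return corrected == typoed
    if not typoed:
        # Correctly spelled tokens should not be corrected.
        return not corrected
    # Typo-ed tokens need the right correction among the first k suggestions.
    return correct_token in [candidate for candidate, _ in candidates[:k]]


def fraction_right(rows: Dict[int, Tuple[str, str]],
                   suggestions: Dict[int, List[Candidate]],
                   mode: Mode, k: int = 1) -> float:
    # rows maps id -> (token, correct_token).
    # In on_typoed mode only the truly typo-ed tokens are taken into account.
    ids = [i for i, (token, correct) in rows.items()
           if mode is not Mode.on_typoed or token != correct]
    if not ids:
        return 0.0
    hits = sum(is_right(rows[i][0], rows[i][1], suggestions.get(i, []), mode, k)
               for i in ids)
    return hits / len(ids)


# Toy data, made up for illustration only.
rows = {0: ("get", "get"), 1: ("tokne", "token"), 2: ("vlaue", "value")}
suggestions = {
    0: [("get", 0.9)],
    1: [("tone", 0.6), ("token", 0.4)],
    2: [("vlaue", 0.7), ("value", 0.3)],
}

for mode in Mode:
    for k in (1, 2):
        print(mode.name, k, fraction_right(rows, suggestions, mode, k))
```

On this toy data, detection gets two of three tokens right, because the token with id 2 is typo-ed but its first suggestion leaves it unchanged. Raising k from 1 to 2 lifts the correction and on_typoed fractions, since both right corrections sit in second place.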

lookout.style.typos.metrics.generate_report(data:pandas.DataFrame, suggestions:Dict[int, List[Candidate]])

Print scores for suggestions in an easily readable way.
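The layout produced by generate_report is not described here. As a rough idea of what an easily readable score report could look like, the sketch below renders a nested dict of scores as an aligned plain-text table. The function format_report, the input shape {row_name: {metric_name: value}} and the metric names are assumptions made for illustration, and the numbers are placeholders rather than real results.

```python
from typing import Dict


def format_report(scores: Dict[str, Dict[str, float]]) -> str:
    """Render {row_name: {metric_name: value}} as an aligned text table.

    Hypothetical helper, not part of lookout.style.typos.metrics.
    """
    # Collect every metric name that appears in any row, in a stable order.
    metrics = sorted({metric for row in scores.values() for metric in row})
    name_width = max([len(name) for name in scores] + [len("mode")])
    header = "mode".ljust(name_width) + "".join(m.rjust(12) for m in metrics)
    lines = [header, "-" * len(header)]
    for name, row in scores.items():
        # Missing metrics are shown as "n/a" instead of raising KeyError.
        cells = "".join(
            ("%.3f" % row[m]).rjust(12) if m in row else "n/a".rjust(12)
            for m in metrics
        )
        lines.append(name.ljust(name_width) + cells)
    return "\n".join(lines)


# Placeholder values, for layout demonstration only.
print(format_report({
    "detection": {"accuracy": 0.5},
    "correction": {"accuracy": 0.25, "recall": 1.0},
}))
```

Returning the report as a string and leaving the printing to the caller keeps the formatting easy to test.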