lookout.style.typos.research.dev_utils
Module Contents
-
lookout.style.typos.research.dev_utils.extract_embeddings_from_fasttext(fasttext: FastText, tokens: Iterable[str])
Convert the embeddings from FastText to a dense matrix.
Parameters: - fasttext – trained embeddings.
- tokens – tokens whose embeddings are extracted; they form axis Y of the returned matrix.
Returns: matrix with the extracted embeddings.
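As a rough illustration of the token-to-row mapping described above, here is a minimal sketch. It does not use the FastText class at all: `lookup` is a hypothetical stand-in for whatever call returns a token's vector, `extract_dense_matrix` is an illustrative name, and reading "axis Y" as the row axis is an assumption.

```python
from typing import Callable, Iterable, List, Sequence


def extract_dense_matrix(
    lookup: Callable[[str], Sequence[float]],  # hypothetical stand-in for the trained model
    tokens: Iterable[str],
) -> List[List[float]]:
    """Build one row per token, in the order the tokens are given."""
    return [list(lookup(token)) for token in tokens]


# Toy vectors in place of trained embeddings (made-up values).
toy_vectors = {"foo": (0.1, 0.2), "bar": (0.3, 0.4)}
matrix = extract_dense_matrix(toy_vectors.__getitem__, ["bar", "foo"])
```

Under that assumption, row i of the result holds the embedding of the i-th token, so the row order follows `tokens` rather than the model's vocabulary order.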
-
lookout.style.typos.research.dev_utils.rand_bool(true_prob)
Returns True with probability true_prob.
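One common way to get this behaviour is to compare a uniform draw against the threshold. The sketch below is not necessarily how rand_bool is implemented; `rand_bool_sketch` and its explicit `rng` parameter are illustrative names, added so the example is reproducible.

```python
import random


def rand_bool_sketch(true_prob: float, rng: random.Random) -> bool:
    """Return True with probability true_prob (illustrative, not the library code)."""
    # rng.random() is uniform on [0, 1), so the comparison holds with probability true_prob.
    return rng.random() < true_prob


rng = random.Random(0)  # fixed seed keeps the example reproducible
draws = [rand_bool_sketch(0.3, rng) for _ in range(10_000)]
share_true = sum(draws) / len(draws)  # should land close to 0.3
```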
-
lookout.style.typos.research.dev_utils.detection_score(typos, suggestions)
Calculates the score of a solution to the typo detection problem.
- typos: DataFrame which is indexed by “id” and has columns “typo”, “corrupted”.
- suggestions: {id : [(candidate, correct_prob)]}, candidates are sorted by correct_prob in descending order.
-
lookout.style.typos.research.dev_utils.first_k_set(corrections, k)
-
lookout.style.typos.research.dev_utils.score_at_k(typos, suggestions, k)
Calculates the score of a solution to the typo correction problem. The suggestions for a typo are considered correct if the right correction is among the first k candidates.
- typos: DataFrame which is indexed by “id” and has columns “typo”, “corrupted”.
- suggestions: {id : [(candidate, correct_prob)]}, candidates inside one suggestions list are sorted by correct_prob in descending order.
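The "right one among the first k" rule can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the ground truth is passed as a hypothetical `truth` mapping from id to the correct token (the documented DataFrame columns do not say where it comes from), and `first_k_candidates` and `hits_at_k` are illustrative names.

```python
from typing import Dict, List, Set, Tuple

Suggestions = Dict[int, List[Tuple[str, float]]]


def first_k_candidates(candidates: List[Tuple[str, float]], k: int) -> Set[str]:
    """Tokens of the k most probable candidates (the list is already sorted)."""
    return {token for token, _ in candidates[:k]}


def hits_at_k(truth: Dict[int, str], suggestions: Suggestions, k: int) -> Dict[int, bool]:
    """For each id, tell whether the correct token is among the first k candidates."""
    return {
        id_: truth[id_] in first_k_candidates(candidates, k)
        for id_, candidates in suggestions.items()
    }


# Hypothetical ground truth and suggestions, sorted by descending correct_prob.
truth = {0: "value", 1: "index"}
suggestions = {
    0: [("value", 0.9), ("valve", 0.1)],
    1: [("indent", 0.6), ("index", 0.3), ("inbox", 0.1)],
}
hits_at_1 = hits_at_k(truth, suggestions, 1)
hits_at_2 = hits_at_k(truth, suggestions, 2)
```

With k = 1 only id 0 counts as corrected; raising k to 2 also accepts id 1, because "index" is its second candidate.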
-
lookout.style.typos.research.dev_utils.correction_score(typos, corrections)
Equal to score_at_k(typos, corrections, 1).
-
lookout.style.typos.research.dev_utils.accuracy(score)
-
lookout.style.typos.research.dev_utils.precision(score)
-
lookout.style.typos.research.dev_utils.recall(score)
-
lookout.style.typos.research.dev_utils.f1(score)
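The structure of score is not documented here. As a hedged sketch, assuming it can be reduced to confusion counts (true/false positives and negatives), the four metrics follow the standard definitions below. The `Counts` layout, its key names, and the `_sketch` function names are assumptions made for illustration.

```python
from typing import Dict

Counts = Dict[str, int]  # assumed keys: "tp", "fp", "tn", "fn"


def accuracy_sketch(c: Counts) -> float:
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    return (c["tp"] + c["tn"]) / total if total else 0.0


def precision_sketch(c: Counts) -> float:
    flagged = c["tp"] + c["fp"]  # everything reported as positive
    return c["tp"] / flagged if flagged else 0.0


def recall_sketch(c: Counts) -> float:
    actual = c["tp"] + c["fn"]  # everything that really is positive
    return c["tp"] / actual if actual else 0.0


def f1_sketch(c: Counts) -> float:
    p, r = precision_sketch(c), recall_sketch(c)
    return 2 * p * r / (p + r) if p + r else 0.0  # harmonic mean of precision and recall


counts = {"tp": 8, "fp": 2, "tn": 85, "fn": 5}  # made-up numbers
```

The zero-denominator guards return 0.0 by convention; other conventions (raising an error, returning NaN) are equally defensible.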
-
lookout.style.typos.research.dev_utils.print_score_metrics(score, file=None)
-
lookout.style.typos.research.dev_utils.print_suggestion_results(typos, suggestions, file=None)