:mod:`lookout.style.typos.ranking`
==================================

.. py:module:: lookout.style.typos.ranking

.. autoapi-nested-parse::

   Ranking of typo correction candidates using gradient boosted trees (GBT).


Module Contents
---------------


.. py:class:: CandidatesRanker(config:Optional[Mapping[str, Any]]=None, **kwargs)

   Bases: :class:`modelforge.Model`

   
   Rank typo correction candidates based on the given features. An XGBoost classifier is used.


   .. attribute:: _log
      

   .. attribute:: NAME
      :annotation: = candidates_ranks 

      
   .. attribute:: VENDOR
      :annotation: = source{d} 

      
   .. attribute:: DESCRIPTION
      :annotation: = Model that ranks candidates according to their probability to fix the typo. 

      
   .. attribute:: LICENSE
      

   .. method:: set_config(self, config:Optional[Mapping[str, Any]]=None)

      
      Update ranking configuration.

      :param config: Ranking configuration. Supported options:

                     - ``train_rounds``: Number of training rounds (int).
                     - ``early_stopping``: Early stopping parameter (int).
                     - ``boost_param``: Boosting parameters (dict).
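
      As a rough illustration of the three options above, the sketch below overlays a partial configuration on a set of defaults. The helper ``merge_config``, the default values and the key-by-key merge of ``boost_param`` are assumptions made for this example only; they are not taken from the library. ``max_depth`` and ``eta`` are ordinary XGBoost booster parameters.

      ```python
      from typing import Any, Mapping, Optional

      # Hypothetical defaults, chosen only for illustration; not the library's values.
      DEFAULT_CONFIG = {
          "train_rounds": 100,
          "early_stopping": 10,
          "boost_param": {"max_depth": 6, "eta": 0.3},
      }


      def merge_config(config: Optional[Mapping[str, Any]] = None) -> dict:
          """Overlay user options on the defaults; boost_param is merged key by key."""
          merged = {
              "train_rounds": DEFAULT_CONFIG["train_rounds"],
              "early_stopping": DEFAULT_CONFIG["early_stopping"],
              "boost_param": dict(DEFAULT_CONFIG["boost_param"]),  # copy, do not alias
          }
          for key, value in (config or {}).items():
              if key not in merged:
                  # Assumption: unknown options are rejected rather than ignored.
                  raise KeyError("Unknown ranking option: %s" % key)
              if key == "boost_param":
                  merged["boost_param"].update(value)
              else:
                  merged[key] = value
          return merged


      config = merge_config({"train_rounds": 500, "boost_param": {"max_depth": 8}})
      ```

      Options that are not passed keep their default values, so a caller only needs to list what changes.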

      
   .. method:: fit(self, identifiers:pandas.Series, candidates:pandas.DataFrame, features:numpy.ndarray, val_part:float=0.1)

      
      Train booster on the given data.

      :param identifiers: Series containing the right corrections, indexed in correspondence with the typos from which the candidates were generated.
      :param candidates: DataFrame containing information about candidates for correction. Columns are [Columns.Id, Columns.Token, Columns.Candidate].
      :param features: Matrix of features for candidates.
      :param val_part: Part of data used for validation.
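
      The inputs described above suggest how training labels could be derived: a candidate is a positive example when it equals the right correction of its typo. The sketch below shows that idea with plain Python containers in place of pandas objects. The functions ``create_labels`` and ``split_typo_ids`` are hypothetical, and splitting the validation part by typo rather than by row is an assumption; the source does not state how ``val_part`` is applied.

      ```python
      from typing import Dict, List, Tuple

      # Each candidate row mirrors the documented columns: (id, token, candidate).
      CandidateRow = Tuple[int, str, str]


      def create_labels(identifiers: Dict[int, str],
                        candidates: List[CandidateRow]) -> List[int]:
          """Label a candidate 1 when it equals the right correction of its typo, else 0."""
          return [int(candidate == identifiers[typo_id])
                  for typo_id, _, candidate in candidates]


      def split_typo_ids(typo_ids: List[int],
                         val_part: float = 0.1) -> Tuple[List[int], List[int]]:
          """Hold out the last val_part share of typos for validation.

          Assumption: splitting by typo keeps all candidates of one typo together.
          """
          n_val = int(len(typo_ids) * val_part)
          cut = len(typo_ids) - n_val
          return typo_ids[:cut], typo_ids[cut:]


      identifiers = {0: "get", 1: "value"}
      candidates = [
          (0, "gte", "get"),
          (0, "gte", "gate"),
          (1, "vlaue", "value"),
          (1, "vlaue", "valve"),
      ]
      labels = create_labels(identifiers, candidates)
      train_ids, val_ids = split_typo_ids(list(range(10)), val_part=0.2)
      ```

      The labels line up with the rows of the feature matrix, so row ``i`` of ``features`` would be paired with ``labels[i]`` during training.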

      
   .. method:: rank(self, candidates:pandas.DataFrame, features:numpy.ndarray, n_candidates:int=3, return_all:bool=True)

      
      Assign a correctness probability to each of the candidates.

      :param candidates: DataFrame containing information about candidates for correction.
      :param features: Matrix of features for candidates.
      :param n_candidates: Number of most probable candidates to return for each typo.
      :param return_all: False to return corrections only for typos corrected in the first candidate.
      :return: Dictionary ``{id: [(candidate, correctness_proba), ...]}``; candidates are sorted by correctness probability in descending order.
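
      The shape of the return value can be illustrated with a small sketch that groups scored candidates by typo id, sorts them by probability and keeps the top ``n_candidates``. The function ``rank_candidates`` is hypothetical and takes precomputed probabilities instead of calling a booster. It also assumes that a typo counts as "corrected in the first candidate" when the best candidate differs from the original token, which the source does not spell out.

      ```python
      from collections import defaultdict
      from typing import Dict, List, Sequence, Tuple

      CandidateRow = Tuple[int, str, str]  # (id, token, candidate)


      def rank_candidates(candidates: Sequence[CandidateRow],
                          probabilities: Sequence[float],
                          n_candidates: int = 3,
                          return_all: bool = True) -> Dict[int, List[Tuple[str, float]]]:
          """Group candidates by typo id and keep the most probable ones."""
          grouped = defaultdict(list)
          tokens = {}
          for (typo_id, token, candidate), proba in zip(candidates, probabilities):
              grouped[typo_id].append((candidate, proba))
              tokens[typo_id] = token

          result = {}
          for typo_id, scored in grouped.items():
              # Most probable candidate first.
              scored.sort(key=lambda pair: pair[1], reverse=True)
              # Assumption: "corrected" means the best candidate differs from the token.
              if return_all or scored[0][0] != tokens[typo_id]:
                  result[typo_id] = scored[:n_candidates]
          return result


      rows = [
          (0, "gte", "get"),
          (0, "gte", "gte"),
          (0, "gte", "gate"),
          (1, "value", "value"),
          (1, "value", "valve"),
      ]
      probas = [0.9, 0.05, 0.3, 0.8, 0.1]
      ranked = rank_candidates(rows, probas, n_candidates=2)
      changed_only = rank_candidates(rows, probas, n_candidates=2, return_all=False)
      ```

      Under these assumptions ``ranked`` contains both ids, while ``changed_only`` keeps only id ``0``, because the best candidate for id ``1`` is the original token itself.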

      
   .. method:: dump(self)

      
      Describe the model for introspection.

      
   .. method:: __eq__(self, other:'CandidatesRanker')

      
   .. staticmethod:: _create_labels(identifiers:pandas.Series, candidates:pandas.DataFrame)

      
   .. method:: _generate_tree(self)

      
   .. method:: _load_tree(self, tree:dict)