:mod:`lookout.style.format.benchmarks.quality_report`
=====================================================

.. py:module:: lookout.style.format.benchmarks.quality_report

.. autoapi-nested-parse::

   Measure quality on several top repositories.


Module Contents
---------------


.. data:: FLOAT_PRECISION
   :annotation: = .3f 
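
The annotation is an ordinary Python format specification. As a minimal illustration, assuming the constant is handed to `format` when report values are rendered (the helper `render` below is hypothetical and not part of this module):

```python
FLOAT_PRECISION = ".3f"  # value taken from the annotation above


def render(value: float) -> str:
    # Hypothetical helper: format a number with three digits after the point.
    return format(value, FLOAT_PRECISION)


print(render(0.98765))
```

With this specification every value is rendered with exactly three decimal places, which keeps report columns aligned.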

   
.. function:: get_repo_name(url:str)

   
   Extract the repository name from a URL.

   :param url: URL for repository.
   :return: name of repository.
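
The exact extraction rules are not documented here. One plausible sketch, assuming the name is the last path component of the URL with an optional `.git` suffix removed (`repo_name_from_url` is a hypothetical stand-in, and the URL is a placeholder):

```python
from urllib.parse import urlparse


def repo_name_from_url(url: str) -> str:
    """Hypothetical stand-in for get_repo_name; the real rules may differ."""
    # Drop a trailing slash, then keep only the last path component.
    path = urlparse(url).path.rstrip("/")
    name = path.rsplit("/", 1)[-1]
    # Assumption: a ".git" suffix is not part of the repository name.
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name


print(repo_name_from_url("https://example.com/org/project.git"))
```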

   
.. function:: ensure_repo(repository:str, storage_dir:str)

   
   Clone the repository if it is a URL and return the repository path.

   :param repository: Repository URL or directory in the file system.
   :param storage_dir: Clone the repository to this directory if it is a URL.
   :return: Repository path.
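
The clone-or-reuse behaviour described above can be sketched as follows. This is an illustration under assumptions, not the module's code: the URL test via `urlparse`, the target directory layout, and the injected `clone` callable are all hypothetical.

```python
import os
from typing import Callable
from urllib.parse import urlparse


def ensure_repo_sketch(repository: str, storage_dir: str,
                       clone: Callable[[str, str], None]) -> str:
    """Hypothetical sketch; `clone(url, target_dir)` performs the actual clone."""
    # Assumption: anything with one of these schemes counts as a URL.
    if urlparse(repository).scheme in ("http", "https", "git", "ssh"):
        name = repository.rstrip("/").rsplit("/", 1)[-1]
        target = os.path.join(storage_dir, name)
        # Clone only when the target directory is not there yet.
        if not os.path.exists(target):
            clone(repository, target)
        return target
    # A local directory is returned unchanged.
    return repository
```

Passing the clone step in as a callable keeps the sketch independent of any particular Git library.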

   
.. py:exception:: RestartReport

   Bases: :class:`ValueError`

   
   Exception raised if report collection should be restarted.


.. function:: measure_quality(repository:str, from_commit:str, to_commit:str, context:AnalyzerContextManager, config:dict, bblfsh:Optional[str], vnodes_expected_number:Optional[int], restarts:int=3)

   
   Generate `QualityReport` for a repository. Returns empty reports on failure.

   :param repository: URL of repository.
   :param from_commit: Hash of the base commit.
   :param to_commit: Hash of the head commit.
   :param context: LookoutSDK instance to query analyzer.
   :param config: config for FormatAnalyzer.
   :param bblfsh: Babelfish server address to use. Specify None to use the default value.
   :param vnodes_expected_number: Expected number of vnodes, if known. Report collection is restarted if the number of extracted vnodes does not match.
   :param restarts: Number of restarts if the number of extracted vnodes does not match.
   :return: Dictionary with all QualityReport reports.
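
The interplay of `vnodes_expected_number`, `restarts` and `RestartReport` can be pictured as a bounded retry loop. The sketch below is hypothetical: `collect` stands for whatever produces the reports (here it only returns a vnode count), and treating `restarts` as the total number of attempts is an assumption.

```python
from typing import Callable, Optional


class RestartReport(ValueError):
    """Mirrors the exception documented above."""


def collect_with_restarts(collect: Callable[[], int],
                          expected: Optional[int],
                          restarts: int = 3) -> Optional[int]:
    """Hypothetical retry loop; `collect` returns the number of extracted vnodes."""
    for _ in range(restarts):
        try:
            got = collect()
            # Restart only when an expected number is known and does not match.
            if expected is not None and got != expected:
                raise RestartReport("expected %d vnodes, got %d" % (expected, got))
            return got
        except RestartReport:
            continue
    # All attempts failed; stands in for the "empty reports" outcome.
    return None
```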

   
.. function:: calc_weighted_avg(arr:Union[Sequence[Sequence], numpy.ndarray], col:int, weight_col:int=5)

   
   Calculate average value in `col` weighted by column `weight_col`.
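
A weighted average over table rows is the sum of value times weight divided by the sum of weights. A minimal pure-Python sketch of that formula (the function name is hypothetical, and the real implementation may rely on NumPy instead):

```python
from typing import Sequence


def weighted_avg_sketch(arr: Sequence[Sequence], col: int, weight_col: int = 5) -> float:
    """Hypothetical equivalent: sum(value * weight) / sum(weight) over all rows."""
    total_weight = sum(row[weight_col] for row in arr)
    weighted_sum = sum(row[col] * row[weight_col] for row in arr)
    return weighted_sum / total_weight


# Made-up rows: column 0 holds a score, column 1 holds the weight (e.g. support).
rows = [
    [0.5, 10],
    [1.0, 30],
]
print(weighted_avg_sketch(rows, col=0, weight_col=1))  # (0.5*10 + 1.0*30) / 40
```

Rows with larger weights pull the result toward their value, whereas a plain average would treat both rows equally.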

   
.. function:: calc_avg(arr:Union[Sequence[Sequence], numpy.ndarray], col:int)

   
   Calculate average value in `col`.

   
.. data:: Metrics
   

.. data:: __doc__

   Metrics for the quality report. By default, metrics are calculated on the subset of samples
   where predictions were made. The `full_` prefix means that the metric was calculated on all
   available samples; without `full_`, the metric was calculated only on samples for which the
   model produced a prediction. `ppcr` stands for predicted positive condition rate and is the
   ratio of samples where the model was able to predict.
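
To make the distinction concrete, here is a hedged sketch with made-up labels. It uses plain accuracy as a stand-in metric, marks a missing prediction with `None`, and counts unpredicted samples as errors in the "full" variant; all three choices are assumptions, not the module's definitions.

```python
def ppcr(predictions):
    """Share of samples for which a prediction exists (None means no prediction)."""
    predicted = sum(p is not None for p in predictions)
    return predicted / len(predictions)


def accuracy_pair(y_true, y_pred):
    """Return (accuracy on predicted samples only, accuracy on all samples)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred) if p is not None)
    predicted = sum(p is not None for p in y_pred)
    return correct / predicted, correct / len(y_true)


y_true = ["a", "b", "a", "b"]
y_pred = ["a", None, "b", "b"]  # the model abstains on the second sample
print(ppcr(y_pred), accuracy_pair(y_true, y_pred))
```

The value on the predicted subset is never lower than the "full" value here, because abstentions shrink only the first denominator.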
 

.. function:: _get_metrics(report:str)

   
   Extract the avg / total precision, recall, F1 score, and support from the report.

   
.. function:: _get_model_summary(report:str)

   
   Extract the model summary: number of rules and average length.

   
.. function:: _get_json_data(report:str)

   
.. function:: handle_input_arg(input_arg:str, log:Optional[logging.Logger]=None)

   
   Process input argument and return an iterator over input data.

   :param input_arg: File to process, or `-` to read data from stdin.
   :param log: Optional logger used to report on input handling.
   :return: An iterator over input files.
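
A common shape for this kind of helper is shown below. It is a sketch under assumptions: with `-` each non-empty stdin line is treated as one input path, otherwise the argument itself is the only item. The `stdin` parameter is hypothetical and exists only to make the sketch easy to try out.

```python
import sys
from typing import Iterable, Iterator, Optional


def iter_input_sketch(input_arg: str,
                      stdin: Optional[Iterable[str]] = None) -> Iterator[str]:
    """Hypothetical sketch: yield input paths from the argument or from stdin."""
    if input_arg == "-":
        stream = sys.stdin if stdin is None else stdin
        for line in stream:
            line = line.strip()
            # Skip blank lines so that trailing newlines do not yield empty paths.
            if line:
                yield line
    else:
        yield input_arg
```

Because the function is a generator, nothing is read from stdin until the caller starts iterating.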

   
.. function:: _generate_report_summary(reports:Iterable[Mapping[str, str]], report_name:str)

   
.. function:: generate_quality_report(input:str, output:str, force:bool, bblfsh:str, config:dict, database:Optional[str]=None, fs:Optional[str]=None)

   
   Generate quality report for the given data. Entry point for command line interface.

   :param input: CSV file listing the repositories to report on. Must contain `url`, `to` and `from` columns.
   :param output: Directory where to save results.
   :param force: If True, overwrite results stored in the output directory. If False, reuse the stored results.
   :param bblfsh: bblfsh address to use.
   :param config: config for FormatAnalyzer.
   :param database: Path to the sqlite3 database that stores the models. A temporary file is used if not set.
   :param fs: Model repository file system root. A temporary directory is used if not set.
   :return:
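
The input CSV is described only by its column names. A minimal sketch of reading such a file with the standard library, assuming a header row with `url`, `to` and `from` (the reader function, the URL and the commit values are placeholders):

```python
import csv
import io


def read_repositories(csv_text: str):
    """Hypothetical reader: return (url, from_commit, to_commit) tuples."""
    required = {"url", "to", "from"}
    reader = csv.DictReader(io.StringIO(csv_text))
    # Fail early when a required column is absent from the header.
    missing = required - set(reader.fieldnames or [])
    if missing:
        raise ValueError("missing columns: %s" % ", ".join(sorted(missing)))
    return [(row["url"], row["from"], row["to"]) for row in reader]


example = "url,to,from\nhttps://example.com/org/project,bbbb222,aaaa111\n"
print(read_repositories(example))
```

Each tuple lines up with the `repository`, `from_commit` and `to_commit` parameters of `measure_quality`.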