:mod:`lookout.style.format.benchmarks.general_report`
=====================================================

.. py:module:: lookout.style.format.benchmarks.general_report

.. autoapi-nested-parse::

   Facilities to report the quality of a given model on a given dataset.


Module Contents
---------------

.. function:: generate_quality_report(language:str, report:Mapping[str, Any], ptr:ReferencePointer, vnodes:Sequence[VirtualNode], max_files:int, name:str)

   Generate report: classification report, confusion matrix, files with most errors.


.. function:: generate_model_report(model:FormatModel, analyze_config:Dict[str, Any], languages:Optional[Union[str, Iterable[str]]]=None)

   Generate a report about the model: description of each rule, min/max support, min/max confidence.

   :param model: Trained format model.
   :param analyze_config: Config that is used at the analysis stage. It is needed to calculate the real number of enabled rules.
   :param languages: Languages for which the report should be created. You can specify one language as a string, several as a list of strings, or None for all languages in the model.
   :return: Report in str format.


.. py:class:: FakeStub(files:Iterable[File], changes:Iterable[Change])

   Data service stub mock which returns the list of bound files and changes.

   .. method:: GetFiles(self, *args, **kwargs)

      Return the list of File-s.


   .. method:: GetChanges(self, _)

      Return the list of Change-s.



.. py:class:: FakeDataService(bblfsh_client:bblfsh.BblfshClient, files:Iterable[File], changes:Iterable[Change])

   Data service mock which returns the list of bound files and changes through FakeStub.

   .. method:: get_bblfsh(self)

      Return the Babelfish gRPC stub.


   .. method:: get_data(self)

      Return the FakeStub to pretend that the server is running.


   .. method:: check_bblfsh_driver_versions(self, versions:Iterable[str])

      Do not care about the versions here.



.. function:: analyze_files(analyzer_type:Type[FormatAnalyzer], config:dict, model_path:str, language:str, bblfsh_addr:str, input_pattern:str, log:logging.Logger)

   Run the model, record the fixes for each file and return them.


.. function:: print_reports(input_pattern:str, bblfsh_addr:str, language:str, model_path:str, config:Union[str, dict]='{}')

   Print reports for a given model on a given dataset.


.. py:class:: FormatAnalyzerSpy

   Bases: :class:`lookout.style.format.analyzer.FormatAnalyzer`

   The class which runs the FormatAnalyzer and returns the found fixes.

   .. attribute:: Changes

   .. method:: run(self, ptr_from:ReferencePointer, data_service_head:DataService, data_service_base:Optional[DataService]=None)

      Run `generate_file_fixes` for all files in ptr_from revision.

      :param ptr_from: Git repository state pointer to the base revision.
      :param data_service_head: Connection to the Lookout data retrieval service to get the new files.
      :param data_service_base: Connection to the Lookout data retrieval service to get the initial files. If it is None, we assume the empty contents.
      :return: Generator of fixes for each file.


   .. classmethod:: train(cls, ptr:ReferencePointer, config:Mapping[str, Any], data_service:DataService, **data)

      Train a model given the files available or load the existing model.

      If config["model"] is set to a path in the file system, the model is loaded from that path; otherwise a model is trained in the regular way.

      :param ptr: Git repository state pointer.
      :param config: Configuration dict.
      :param data: Contains "files" - the list of files in the pointed state.
      :param data_service: Connection to the Lookout data retrieval service.
      :return: FormatModel containing the learned rules, per language.



.. py:class:: ReportAnalyzer

   Bases: :class:`lookout.style.format.benchmarks.general_report.FormatAnalyzerSpy`

   Base class for different kinds of reports.

   * analyze - generate a report for all files. If you want only an aggregated report, set the aggregate flag to True in the analyze config.
   * train - train or load the model.

   Child classes are required to implement 2 methods:

   * generate_report
   * generate_model_report (optional - by default it returns an empty string)

   .. attribute:: default_config

   .. classmethod:: get_report_names(cls)

      Get all available report names.

      :return: List of report names.


   .. method:: generate_reports(self, fixes:Iterable[FileFix])

      General function to generate reports.

      :param fixes: List of fixes per file or for all files if config["aggregate"] is True.
      :return: Dictionary with report names as keys and report strings as values.


   .. method:: analyze(self, ptr_from:ReferencePointer, ptr_to:ReferencePointer, data_service:DataService, **data)

      Analyze ptr_from revision and generate reports for all files in it.

      If you want to get an aggregated report, set the aggregate flag to True in the analyze config.

      :param ptr_from: Git repository state pointer to the base revision.
      :param ptr_to: Git repository state pointer to the head revision. Not used.
      :param data_service: Connection to the Lookout data retrieval service.
      :param data: Contains "files" - the list of changes in the pointed state.
      :return: List of comments.



.. py:class:: QualityReportAnalyzer

   Bases: :class:`lookout.style.format.benchmarks.general_report.ReportAnalyzer`

   Generate basic quality reports for the model.

   * analyze - generate a report for all files. If you want only an aggregated report, set the aggregate flag to True in the analyze config.
   * train - train or load the model.

   It is possible to run this analyzer independently and query it with lookout-sdk.
   To use a pretrained model, specify it in the config, for example:
   `--config-json='{"style.format.analyzer.FormatAnalyzer": {"model": "/saved/model.asdf"}}'`
   Otherwise the model will be trained with `FormatAnalyzer.train()`.

   Usage examples:

   1) Launch analyzer: `analyzer run lookout.style.format.quality_report_analyzer -c config.yml`

   2) Query analyzer

   2.1) Get one quality report per file for pretrained model /saved/model.asdf:

   ```
   lookout-sdk review ipv4://localhost:2000 --git-dir /git/dir/ --from REV1 --to REV2 --config-json='{"style.format.analyzer.FormatAnalyzer": {"model": "/saved/model.asdf"}}'
   ```

   2.2) Get an aggregated quality report for all files without a pretrained model:

   ```
   lookout-sdk review ipv4://localhost:2000 --git-dir /git/dir/ --from REV1 --to REV2 --config-json='{"style.format.analyzer.FormatAnalyzer": {"aggregate": true}}'
   ```

   .. attribute:: version
      :annotation: = 1

   .. attribute:: description
      :annotation: = Source code formatting quality report generator: whitespace, new lines, quotes, etc.

   .. attribute:: default_config

   .. classmethod:: get_report_names(cls)

      Get all available report names.

      :return: Tuple with report names.


   .. method:: generate_reports(self, fixes:Iterable[FileFix])

      Generate model train and test reports.

      The model report is generated only if config["aggregate"] is True.

      :param fixes: List of fixes per file or for all files if config["aggregate"] is True.
      :return: Ordered dictionary with report names as keys and report strings as values.


   .. method:: generate_model_report(self)

      Generate report about the trained model.

      :return: Report.


   .. method:: generate_train_report(self, fixes:Iterable[FileFix])

      Generate train report: classification report, confusion matrix, files with most errors.

      :return: Report.


   .. method:: generate_test_report(self)

      Generate report on the test dataset.

      :return: Report.
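
The ``--config-json`` values shown in the ``QualityReportAnalyzer`` usage examples can be produced programmatically instead of being typed by hand. The following is a minimal sketch using only the Python standard library. The helper names ``build_config_json`` and ``build_review_command`` are hypothetical and are not part of ``lookout`` or ``lookout-sdk``; the config key, the ``model`` and ``aggregate`` options, and the command-line flags are copied from the examples in the class description.

```python
import json
import shlex

# Config key copied from the QualityReportAnalyzer usage examples.
ANALYZER_KEY = "style.format.analyzer.FormatAnalyzer"


def build_config_json(model_path=None, aggregate=False):
    """Return a JSON string for --config-json (hypothetical helper)."""
    options = {}
    if model_path is not None:
        # Load a pretrained model instead of training a new one.
        options["model"] = model_path
    if aggregate:
        # Ask for one aggregated report instead of one report per file.
        options["aggregate"] = True
    return json.dumps({ANALYZER_KEY: options})


def build_review_command(git_dir, rev_from, rev_to, config_json,
                         address="ipv4://localhost:2000"):
    """Assemble the `lookout-sdk review` command line (hypothetical helper)."""
    args = [
        "lookout-sdk", "review", address,
        "--git-dir", git_dir,
        "--from", rev_from,
        "--to", rev_to,
        "--config-json=" + config_json,
    ]
    # shlex.quote protects the JSON payload from shell interpretation.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    config = build_config_json(model_path="/saved/model.asdf")
    print(build_review_command("/git/dir/", "REV1", "REV2", config))
```

The sketch only builds the command string and does not run it. Quoting the whole ``--config-json=...`` argument is equivalent, for the shell, to quoting only the JSON part as in the examples above.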