lookout.style.format.benchmarks.general_report

Facilities to report the quality of a given model on a given dataset.

Module Contents

lookout.style.format.benchmarks.general_report.generate_quality_report(language:str, report:Mapping[str, Any], ptr:ReferencePointer, vnodes:Sequence[VirtualNode], max_files:int, name:str)

Generate report: classification report, confusion matrix, files with most errors.
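As a rough illustration of two of the pieces named above (the confusion counts and the files with most errors), the following sketch tallies them from plain tuples. It is not the library's implementation: the `summarize_errors` helper, the record layout, and the label names are all hypothetical, and only `max_files` mirrors a parameter from the signature.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical record layout: (filename, true label, predicted label).
Record = Tuple[str, str, str]


def summarize_errors(records: Iterable[Record], max_files: int):
    """Count (true, predicted) pairs and rank files by number of mispredictions."""
    confusion = Counter()  # (true label, predicted label) -> count
    errors = Counter()     # filename -> number of wrong predictions
    for filename, y_true, y_pred in records:
        confusion[(y_true, y_pred)] += 1
        if y_true != y_pred:
            errors[filename] += 1
    # Keep only the max_files files with the most errors.
    worst: List[Tuple[str, int]] = errors.most_common(max_files)
    return confusion, worst


records = [
    ("a.js", "space", "space"),
    ("a.js", "tab", "space"),
    ("a.js", "newline", "space"),
    ("b.js", "tab", "space"),
]
confusion, worst = summarize_errors(records, max_files=1)
```
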

lookout.style.format.benchmarks.general_report.generate_model_report(model:FormatModel, analyze_config:Dict[str, Any], languages:Optional[Union[str, Iterable[str]]]=None)

Generate report about model - description for each rule, min/max support, min/max confidence.

Parameters:
  • model – trained format model.
  • analyze_config – config that is used at the analysis stage. It is needed to calculate the real number of enabled rules.
  • languages – Languages for which the report should be created. Specify one language as a string, several as a list of strings, or None for all languages in the model.
Returns:

report in str format.
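The three accepted forms of the languages argument could be resolved along the following lines. This is a minimal sketch, not code from the module: `normalize_languages` and its `available` argument (the languages present in the model) are hypothetical names introduced for illustration.

```python
from typing import Iterable, List, Optional, Union


def normalize_languages(
    languages: Optional[Union[str, Iterable[str]]],
    available: Iterable[str],
) -> List[str]:
    """Hypothetical helper: resolve the `languages` argument to a concrete list."""
    if languages is None:
        # None selects every language present in the model.
        return sorted(available)
    if isinstance(languages, str):
        # A single language passed as a plain string.
        return [languages]
    # Several languages passed as any iterable of strings.
    return list(languages)


in_model = {"python", "javascript"}
all_langs = normalize_languages(None, in_model)
one_lang = normalize_languages("javascript", in_model)
some_langs = normalize_languages(("javascript", "python"), in_model)
```

Checking for str before iterating matters because a string is itself iterable; without that branch, "javascript" would be split into single characters.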

class lookout.style.format.benchmarks.general_report.FakeStub(files:Iterable[File], changes:Iterable[Change])

Data service stub mock which returns the list of bound files and changes.

GetFiles(self, *args, **kwargs)

Return the list of File-s.

GetChanges(self, _)

Return the list of Change-s.

class lookout.style.format.benchmarks.general_report.FakeDataService(bblfsh_client:bblfsh.BblfshClient, files:Iterable[File], changes:Iterable[Change])

Data service mock which returns the list of bound files and changes through FakeStub.

get_bblfsh(self)

Return the Babelfish gRPC stub.

get_data(self)

Return the FakeStub to pretend that the server is running.

check_bblfsh_driver_versions(self, versions:Iterable[str])

Do not care about the versions here.
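The mock pattern described by FakeStub and FakeDataService can be sketched without any Lookout or Babelfish dependency. The classes below are stand-ins written for illustration only (`StubSketch` and `DataServiceSketch` are hypothetical names); in particular, returning the client object unchanged from `get_bblfsh` is an assumption, since the page only says that it returns the Babelfish gRPC stub.

```python
from typing import Any, Iterable, List


class StubSketch:
    """Stand-in for FakeStub: serves the files and changes it was bound to."""

    def __init__(self, files: Iterable[Any], changes: Iterable[Any]) -> None:
        self._files = list(files)
        self._changes = list(changes)

    def GetFiles(self, *args: Any, **kwargs: Any) -> List[Any]:
        # Arguments are ignored; the bound list is always returned.
        return self._files

    def GetChanges(self, _: Any) -> List[Any]:
        return self._changes


class DataServiceSketch:
    """Stand-in for FakeDataService: hands out the stub instead of a live server."""

    def __init__(self, bblfsh_client: Any, files: Iterable[Any],
                 changes: Iterable[Any]) -> None:
        self._bblfsh = bblfsh_client
        self._stub = StubSketch(files, changes)

    def get_bblfsh(self) -> Any:
        # Assumption: the real class derives a gRPC stub from the client.
        return self._bblfsh

    def get_data(self) -> StubSketch:
        return self._stub

    def check_bblfsh_driver_versions(self, versions: Iterable[str]) -> None:
        # Versions are deliberately not checked in the mock.
        return None


client = object()
service = DataServiceSketch(client, files=["a.js", "b.js"], changes=[])
```
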

lookout.style.format.benchmarks.general_report.analyze_files(analyzer_type:Type[FormatAnalyzer], config:dict, model_path:str, language:str, bblfsh_addr:str, input_pattern:str, log:logging.Logger)

Run the model, record the fixes for each file and return them.

lookout.style.format.benchmarks.general_report.print_reports(input_pattern:str, bblfsh_addr:str, language:str, model_path:str, config:Union[str, dict]='{}')

Print reports for a given model on a given dataset.

class lookout.style.format.benchmarks.general_report.FormatAnalyzerSpy

Bases: lookout.style.format.analyzer.FormatAnalyzer

The class which runs the FormatAnalyzer and returns the found fixes.

Changes
run(self, ptr_from:ReferencePointer, data_service_head:DataService, data_service_base:Optional[DataService]=None)

Run generate_file_fixes for all files in ptr_from revision.

Parameters:
  • ptr_from – Git repository state pointer to the base revision.
  • data_service_head – Connection to the Lookout data retrieval service to get the new files.
  • data_service_base – Connection to the Lookout data retrieval service to get the initial files. If it is None, we assume the empty contents.
Returns:

Generator of fixes for each file.

classmethod train(cls, ptr:ReferencePointer, config:Mapping[str, Any], data_service:DataService, **data)

Train a model given the files available or load the existing model.

If config["model"] is set to a path in the file system, the model is loaded from that path; otherwise a model is trained in the regular way.

Parameters:
  • ptr – Git repository state pointer.
  • config – Configuration dict.
  • data – Contains "files" - the list of files in the pointed state.
  • data_service – Connection to the Lookout data retrieval service.
Returns:

FormatModel containing the learned rules, per language.
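The load-or-train decision described for train can be sketched as a small branch on the config. This is a simplified illustration, not the method's source: `load_or_train` and its `load` and `train` callables are hypothetical, and treating an empty string as "no model given" is an assumption.

```python
from typing import Any, Callable, Mapping


def load_or_train(
    config: Mapping[str, Any],
    load: Callable[[str], Any],
    train: Callable[[], Any],
) -> Any:
    """Return a loaded model if config["model"] is set, else a freshly trained one."""
    model_path = config.get("model")
    if model_path:
        # A path was given: skip training and load the saved model.
        return load(model_path)
    # No path: fall back to regular training.
    return train()


loaded = load_or_train(
    {"model": "/saved/model.asdf"},
    load=lambda path: "loaded from " + path,
    train=lambda: "trained",
)
trained = load_or_train({}, load=lambda path: "loaded", train=lambda: "trained")
```
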

class lookout.style.format.benchmarks.general_report.ReportAnalyzer

Bases: lookout.style.format.benchmarks.general_report.FormatAnalyzerSpy

Base class for different kinds of reports.

  • analyze - generate a report for all files. If you want only the aggregated report, set the aggregate flag to True in the analyze config.
  • train - train or load the model.

Child classes are required to implement 2 methods:
  • generate_report
  • generate_model_report (optional - by default it returns an empty string)
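The subclassing contract can be illustrated with a stand-in base class, so that the example runs without Lookout installed. Everything here is a sketch under assumptions: `ReportAnalyzerSketch` and `CountingReport` are hypothetical, and the required method is spelled generate_reports to match the method signature documented for this class.

```python
from typing import Any, Dict, Iterable, Tuple


class ReportAnalyzerSketch:
    """Stand-in for ReportAnalyzer, reduced to the two overridable methods."""

    def generate_reports(self, fixes: Iterable[Any]) -> Dict[str, str]:
        # Child classes must provide this.
        raise NotImplementedError

    def generate_model_report(self) -> str:
        # Optional for child classes: the default is an empty report.
        return ""


class CountingReport(ReportAnalyzerSketch):
    """Hypothetical child class that reports how many files have fixes."""

    @classmethod
    def get_report_names(cls) -> Tuple[str, ...]:
        return ("count",)

    def generate_reports(self, fixes: Iterable[Any]) -> Dict[str, str]:
        # Report names are the keys, report strings are the values.
        return {"count": "files with fixes: %d" % len(list(fixes))}


analyzer = CountingReport()
reports = analyzer.generate_reports(["a.js", "b.js"])
```
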

default_config
classmethod get_report_names(cls)

Get all available report names.

Returns: List of report names.
generate_reports(self, fixes:Iterable[FileFix])

General function to generate reports.

Parameters: fixes – List of fixes per file or for all files if config["aggregate"] is True.
Returns: Dictionary with report names as keys and report string as values.
analyze(self, ptr_from:ReferencePointer, ptr_to:ReferencePointer, data_service:DataService, **data)

Analyze ptr_from revision and generate reports for all files in it.

If you want to get an aggregated report set aggregate flag to True in analyze config.

Parameters:
  • ptr_from – Git repository state pointer to the base revision.
  • ptr_to – Git repository state pointer to the head revision. Not used.
  • data_service – Connection to the Lookout data retrieval service.
  • data – Contains "files" - the list of changes in the pointed state.
Returns:

List of comments.

class lookout.style.format.benchmarks.general_report.QualityReportAnalyzer

Bases: lookout.style.format.benchmarks.general_report.ReportAnalyzer

Generate basic quality reports for the model.

  • analyze - generate a report for all files. If you want only the aggregated report, set the aggregate flag to True in the analyze config.
  • train - train or load the model.

It is possible to run this analyzer independently and query it with lookout-sdk. To use a pretrained model, specify it in the config, for example: --config-json='{"style.format.analyzer.FormatAnalyzer": {"model": "/saved/model.asdf"}}'. Otherwise the model will be trained with FormatAnalyzer.train().

Usage examples:

1) Launch the analyzer:

     analyzer run lookout.style.format.quality_report_analyzer -c config.yml

2) Query the analyzer:

   2.1) Get one quality report per file for the pretrained model /saved/model.asdf:

     lookout-sdk review ipv4://localhost:2000 --git-dir /git/dir/ --from REV1 --to REV2 \
         --config-json='{"style.format.analyzer.FormatAnalyzer": {"model": "/saved/model.asdf"}}'

   2.2) Get an aggregated quality report for all files without a pretrained model:

     lookout-sdk review ipv4://localhost:2000 --git-dir /git/dir/ --from REV1 --to REV2 \
         --config-json='{"style.format.analyzer.FormatAnalyzer": {"aggregate": true}}'
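When the query is scripted, the --config-json value is easier to get right if it is serialized rather than typed by hand. The sketch below assembles the same lookout-sdk review command from Python; `build_review_command` is a hypothetical helper, and the address, git directory, and revision names are the placeholders from the usage examples, not verified defaults.

```python
import json
import shlex
from typing import Any, Mapping


def build_review_command(
    from_rev: str,
    to_rev: str,
    analyzer_config: Mapping[str, Any],
    git_dir: str = "/git/dir/",
    address: str = "ipv4://localhost:2000",
) -> str:
    """Hypothetical helper: build the query command as one shell-safe string."""
    # Nest the options under the analyzer name, as in the usage examples.
    config_json = json.dumps(
        {"style.format.analyzer.FormatAnalyzer": dict(analyzer_config)}
    )
    args = [
        "lookout-sdk", "review", address,
        "--git-dir", git_dir,
        "--from", from_rev,
        "--to", to_rev,
        "--config-json=" + config_json,
    ]
    # shlex.quote protects the JSON quotes, braces, and spaces from the shell.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_review_command("REV1", "REV2", {"aggregate": True})
parts = shlex.split(command)
```

json.dumps writes the Python value True as the JSON literal true, which is the spelling the aggregate example uses.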

version = 1
description = Source code formatting quality report generator: whitespace, new lines, quotes, etc.
default_config
classmethod get_report_names(cls)

Get all available report names.

Returns: Tuple with report names.
generate_reports(self, fixes:Iterable[FileFix])

Generate model train and test reports.

The model report is generated only if config["aggregate"] is True.

Parameters: fixes – List of fixes per file or for all files if config["aggregate"] is True.
Returns: Ordered dictionary with report names as keys and report string as values.
generate_model_report(self)

Generate report about the trained model.

Returns: Report.
generate_train_report(self, fixes:Iterable[FileFix])

Generate train report: classification report, confusion matrix, files with most errors.

Returns: Report.
generate_test_report(self)

Generate report on the test dataset.

Returns: Report.