lookout.style.format.benchmarks.general_report
Facilities to report the quality of a given model on a given dataset.
Module Contents
-
lookout.style.format.benchmarks.general_report.
generate_quality_report
(language:str, report:Mapping[str, Any], ptr:ReferencePointer, vnodes:Sequence[VirtualNode], max_files:int, name:str)¶ Generate report: classification report, confusion matrix, files with most errors.
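As a rough illustration of two of the report parts named above (confusion matrix, files with most errors), the following sketch uses only the standard library. The `(file name, true label, predicted label)` record layout and both function names are hypothetical; this is not the module's implementation.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical record layout: (file name, true label, predicted label).
Record = Tuple[str, str, str]


def confusion_counts(records: Iterable[Record]) -> Counter:
    """Count how often each (true, predicted) label pair occurs."""
    return Counter((y_true, y_pred) for _, y_true, y_pred in records)


def files_with_most_errors(records: Iterable[Record],
                           max_files: int) -> List[Tuple[str, int]]:
    """Return up to max_files (file name, error count) pairs, worst first."""
    errors = Counter(name for name, y_true, y_pred in records
                     if y_true != y_pred)
    return errors.most_common(max_files)


records = [
    ("a.js", "space", "space"),
    ("a.js", "newline", "space"),
    ("b.js", "quote", "space"),
    ("b.js", "newline", "space"),
    ("c.js", "space", "space"),
]
matrix = confusion_counts(records)
worst = files_with_most_errors(records, max_files=1)
```

Here `max_files` plays the same role as the parameter of that name in the signature: it caps how many of the worst files are listed.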
-
lookout.style.format.benchmarks.general_report.
generate_model_report
(model:FormatModel, analyze_config:Dict[str, Any], languages:Optional[Union[str, Iterable[str]]]=None)¶ Generate a report about the model: description of each rule, min/max support, min/max confidence.
Parameters: - model – trained format model.
- analyze_config – config that is used at the analysis stage. It is needed to calculate the real number of enabled rules.
- languages – languages for which the report should be created. Pass a single language as a string, several languages as a list of strings, or None for all languages in the model.
Returns: report in str format.
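The three accepted forms of `languages` can be resolved to a plain list along these lines. This is a minimal sketch: `select_languages` and its `available` argument are hypothetical names and not part of the module.

```python
from typing import Iterable, List, Optional, Union


def select_languages(languages: Optional[Union[str, Iterable[str]]],
                     available: Iterable[str]) -> List[str]:
    """Resolve a str / iterable / None argument to a concrete list."""
    if languages is None:
        # None means "every language the model knows about".
        return list(available)
    if isinstance(languages, str):
        # A bare string is one language, not an iterable of characters.
        return [languages]
    return list(languages)
```

The `isinstance(languages, str)` check has to come before the generic iterable case, because a string is itself iterable.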
-
class
lookout.style.format.benchmarks.general_report.
FakeStub
(files:Iterable[File], changes:Iterable[Change])¶ Data service stub mock which returns the list of bound files and changes.
-
GetFiles
(self, *args, **kwargs)¶ Return the list of File-s.
-
GetChanges
(self, _)¶ Return the list of Change-s.
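A stub mock of this shape can be sketched as follows. `StubSketch` is a hypothetical stand-in written for illustration, with plain strings in place of File and Change objects; it is not the FakeStub source.

```python
from typing import Any, Iterable, List


class StubSketch:
    """Hypothetical stand-in for a data service stub bound to fixed data."""

    def __init__(self, files: Iterable[Any], changes: Iterable[Any]):
        # Materialize once so repeated calls return the same items.
        self._files = list(files)
        self._changes = list(changes)

    def GetFiles(self, *args: Any, **kwargs: Any) -> List[Any]:
        # Request arguments are accepted and ignored.
        return self._files

    def GetChanges(self, _: Any) -> List[Any]:
        # The request object is ignored as well.
        return self._changes


stub = StubSketch(files=["file-1", "file-2"], changes=["change-1"])
```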
-
-
class
lookout.style.format.benchmarks.general_report.
FakeDataService
(bblfsh_client:bblfsh.BblfshClient, files:Iterable[File], changes:Iterable[Change])¶ Data service mock which returns the list of bound files and changes through FakeStub.
-
get_bblfsh
(self)¶ Return the Babelfish gRPC stub.
-
get_data
(self)¶ Return the FakeStub to pretend that the server is running.
-
check_bblfsh_driver_versions
(self, versions:Iterable[str])¶ Do not care about the versions here.
-
-
lookout.style.format.benchmarks.general_report.
analyze_files
(analyzer_type:Type[FormatAnalyzer], config:dict, model_path:str, language:str, bblfsh_addr:str, input_pattern:str, log:logging.Logger)¶ Run the model, record the fixes for each file and return them.
-
lookout.style.format.benchmarks.general_report.
print_reports
(input_pattern:str, bblfsh_addr:str, language:str, model_path:str, config:Union[str, dict]='{}')¶ Print reports for a given model on a given dataset.
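Since `config` may be either a string or a dict, a caller-side normalization could look like the sketch below. It assumes that string configs are JSON documents, which the `'{}'` default suggests but the reference does not state; `load_config` is a hypothetical helper.

```python
import json
from typing import Any, Dict, Union


def load_config(config: Union[str, Dict[str, Any]] = "{}") -> Dict[str, Any]:
    """Accept either a JSON string or an already-parsed dict."""
    if isinstance(config, str):
        # Assumption: string configs are JSON documents.
        config = json.loads(config)
    if not isinstance(config, dict):
        raise TypeError(
            "config must decode to a dict, got %s" % type(config).__name__)
    return config
```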
-
class
lookout.style.format.benchmarks.general_report.
FormatAnalyzerSpy
¶ Bases:
lookout.style.format.analyzer.FormatAnalyzer
The class which runs the FormatAnalyzer and returns the found fixes.
-
Changes
-
run
(self, ptr_from:ReferencePointer, data_service_head:DataService, data_service_base:Optional[DataService]=None)¶ Run generate_file_fixes for all files in ptr_from revision.
Parameters: - ptr_from – Git repository state pointer to the base revision.
- data_service_head – Connection to the Lookout data retrieval service to get the new files.
- data_service_base – Connection to the Lookout data retrieval service to get the initial files. If it is None, we assume the empty contents.
Returns: Generator of fixes for each file.
-
classmethod
train
(cls, ptr:ReferencePointer, config:Mapping[str, Any], data_service:DataService, **data)¶ Train a model given the files available or load the existing model.
If config[“model”] is set to a path in the file system, the model is loaded from that path; otherwise a model is trained in the regular way.
Parameters: - ptr – Git repository state pointer.
- config – Configuration dict.
- data – Contains “files” - the list of files in the pointed state.
- data_service – Connection to the Lookout data retrieval service.
Returns: FormatModel containing the learned rules, per language.
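The load-or-train decision described for `train` can be sketched as below. The `load` and `train` callables are hypothetical placeholders for whatever actually loads or trains a model; only the `"model"` config key comes from the text above.

```python
from typing import Any, Callable, Mapping


def load_or_train(config: Mapping[str, Any],
                  load: Callable[[str], Any],
                  train: Callable[[], Any]) -> Any:
    """Load the model from config["model"] when set, otherwise train one."""
    model_path = config.get("model")
    if model_path:
        return load(model_path)
    return train()
```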
-
-
class
lookout.style.format.benchmarks.general_report.
ReportAnalyzer
¶ Bases:
lookout.style.format.benchmarks.general_report.FormatAnalyzerSpy
Base class for different kinds of reports.
- analyze - generate a report for all files. If you want only an aggregated report, set the aggregate flag to True in the analyze config.
- train - train or load the model.
Child classes are expected to implement two methods:
- generate_report
- generate_model_report (optional; by default it returns an empty string)
-
default_config
-
classmethod
get_report_names
(cls)¶ Get all available report names.
Returns: List of report names.
-
generate_reports
(self, fixes:Iterable[FileFix])¶ General function to generate reports.
Parameters: - fixes – List of fixes per file or for all files if config[“aggregate”] is True.
Returns: Dictionary with report names as keys and report strings as values.
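The contract between the base class and its children can be pictured with the sketch below. `ReportBase` and `CountingReport` are hypothetical classes written for illustration; they only mirror the method names listed in this reference and make no claim about the real implementation.

```python
from typing import Any, Dict, Iterable, Tuple


class ReportBase:
    """Hypothetical base: subclasses supply the report generators."""

    @classmethod
    def get_report_names(cls) -> Tuple[str, ...]:
        raise NotImplementedError

    def generate_model_report(self) -> str:
        # Optional hook: an empty string means "no model report".
        return ""

    def generate_reports(self, fixes: Iterable[Any]) -> Dict[str, str]:
        raise NotImplementedError


class CountingReport(ReportBase):
    """Toy subclass that reports how many fixes it received."""

    @classmethod
    def get_report_names(cls) -> Tuple[str, ...]:
        return ("count",)

    def generate_reports(self, fixes: Iterable[Any]) -> Dict[str, str]:
        return {"count": "fixes: %d" % len(list(fixes))}


reports = CountingReport().generate_reports(["fix-1", "fix-2"])
```

The keys of the returned dictionary match the names from `get_report_names`, which lets a caller look up each report by name.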
-
analyze
(self, ptr_from:ReferencePointer, ptr_to:ReferencePointer, data_service:DataService, **data)¶ Analyze ptr_from revision and generate reports for all files in it.
To get an aggregated report, set the aggregate flag to True in the analyze config.
Parameters: - ptr_from – Git repository state pointer to the base revision.
- ptr_to – Git repository state pointer to the head revision. Not used.
- data_service – Connection to the Lookout data retrieval service.
- data – Contains “files” - the list of changes in the pointed state.
Returns: List of comments.
-
class
lookout.style.format.benchmarks.general_report.
QualityReportAnalyzer
¶ Bases:
lookout.style.format.benchmarks.general_report.ReportAnalyzer
Generate basic quality reports for the model.
- analyze - generate a report for all files. If you want only an aggregated report, set the aggregate flag to True in the analyze config.
- train - train or load the model.
It is possible to run this analyzer independently and query it with lookout-sdk. To use a pretrained model, specify it in the config, for example: ` --config-json='{"style.format.analyzer.FormatAnalyzer": {"model": "/saved/model.asdf"}}' ` Otherwise the model will be trained with FormatAnalyzer.train().
Usage examples:
1) Launch the analyzer: ` analyzer run lookout.style.format.quality_report_analyzer -c config.yml `
2) Query the analyzer:
2.1) Get one quality report per file for the pretrained model /saved/model.asdf:
` lookout-sdk review ipv4://localhost:2000 --git-dir /git/dir/ --from REV1 --to REV2 --config-json='{"style.format.analyzer.FormatAnalyzer": {"model": "/saved/model.asdf"}}' `
2.2) Get an aggregated quality report for all files without a pretrained model:
` lookout-sdk review ipv4://localhost:2000 --git-dir /git/dir/ --from REV1 --to REV2 --config-json='{"style.format.analyzer.FormatAnalyzer": {"aggregate": true}}' `
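Hand-written JSON inside shell quotes is easy to get wrong. One way to build the `--config-json` argument is sketched below, using only the standard library; the analyzer name and model path are copied from the examples above, and the quoting assumes a POSIX shell.

```python
import json
import shlex

# Analyzer name and model path copied from the usage examples.
config = {
    "style.format.analyzer.FormatAnalyzer": {"model": "/saved/model.asdf"},
}

# json.dumps guarantees balanced braces and double quotes;
# shlex.quote wraps the result so a POSIX shell passes it as one argument.
config_arg = "--config-json=" + shlex.quote(json.dumps(config))
print(config_arg)
```

The printed argument can be appended to a `lookout-sdk review` command line as in examples 2.1 and 2.2.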
-
version
= 1¶
-
description
= Source code formatting quality report generator: whitespace, new lines, quotes, etc.¶
-
default_config
-
classmethod
get_report_names
(cls)¶ Get all available report names.
Returns: Tuple with report names.
-
generate_reports
(self, fixes:Iterable[FileFix])¶ Generate model train and test reports.
The model report is generated only if config[“aggregate”] is True.
Parameters: - fixes – List of fixes per file or for all files if config[“aggregate”] is True.
Returns: Ordered dictionary with report names as keys and report strings as values.
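The conditional inclusion of the model report can be sketched as follows. The report names and their order are assumptions made for this illustration, and `collect_reports` with its three callables is hypothetical; only the `"aggregate"` key comes from the description above.

```python
from collections import OrderedDict
from typing import Any, Callable, Mapping


def collect_reports(config: Mapping[str, Any],
                    model_report: Callable[[], str],
                    train_report: Callable[[], str],
                    test_report: Callable[[], str]) -> OrderedDict:
    """Assemble reports in a fixed order; model report only if aggregated."""
    reports = OrderedDict()
    if config.get("aggregate", False):
        # Assumed name and position of the model report.
        reports["model_report"] = model_report()
    reports["train_report"] = train_report()
    reports["test_report"] = test_report()
    return reports
```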
-
generate_model_report
(self)¶ Generate report about the trained model.
Returns: report.
-
generate_train_report
(self, fixes:Iterable[FileFix])¶ Generate train report: classification report, confusion matrix, files with most errors.
Returns: report.
-
generate_test_report
(self)¶ Generate report on the test dataset.
Returns: Report.