prepare_dataset

Filter and prepare the dataset for evaluation. It should be run on a dataset prepared by typos_preprocessing.ipynb.

Module Contents

prepare_dataset.Changes
prepare_dataset.COLUMNS = ['identifier', 'correct_id', 'filename', 'line', 'commit', 'repository']
prepare_dataset.NEW_COLUMNS
prepare_dataset.COL2IND
prepare_dataset.NEW_COL2IND
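A minimal sketch of how `COL2IND` could be derived from `COLUMNS` (and, analogously, `NEW_COL2IND` from `NEW_COLUMNS`) as a name-to-index mapping; the actual module may build it differently.

```python
# Assumed construction: map each column name to its position in COLUMNS.
COLUMNS = ['identifier', 'correct_id', 'filename', 'line', 'commit', 'repository']
COL2IND = {name: i for i, name in enumerate(COLUMNS)}
# COL2IND['filename'] → 2
```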
class prepare_dataset.IdentifierFileCommitRanger(*, filename:str, repository:str, identifier:str, commit:str, directory:Optional[str]=None)

Find the first commit in which the identifier was added to the file.

_log
_run_cmd(self, cmd, step, cwd=None, env=None)
_clone(self)
_checkout(self)
_blame(self, filename=None)
static _validate_date(text)
_get_full_hash(self, short_hash)
_get_diff(self)
_to_changes(self, line)
_pipeline(self)
__call__(self)
static _find_deleted_file(text, filename=None)
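The `_blame` step above presumably scans `git blame` output for the identifier. A self-contained sketch of that technique, assuming the conventional blame format where each line starts with a commit hash (the helper name and sample lines are illustrative, not from the real module):

```python
import re

def first_commit_for_identifier(blame_lines, identifier):
    """Return the commit hash of the first blame line containing the
    identifier as a whole word, or None if it never appears."""
    pattern = re.compile(r'\b' + re.escape(identifier) + r'\b')
    for line in blame_lines:
        # `git blame` prefixes each source line with the commit hash.
        commit_hash, _, content = line.partition(' ')
        if pattern.search(content):
            return commit_hash
    return None

blame = [
    'a1b2c3d (Alice 2020-01-01) def foo():',
    'e4f5a6b (Bob   2020-02-02)     total = compute(x)',
]
first_commit_for_identifier(blame, 'total')  # → 'e4f5a6b'
```

The real class chains `_clone`, `_checkout`, `_blame`, and `_get_diff` to run this lookup against an actual repository checkout.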
prepare_dataset._parallel_comp(args)
prepare_dataset.pipeline(input_csv, output_csv, n_cores=1, cache='/tmp')

Find the hash of the first commit in which the identifier appears in the file.

Parameters:
  • input_csv – Path to input csv.
  • output_csv – Path to store result csv.
  • n_cores – How many cores to use.
  • cache – Cache location. If empty, caching is disabled.
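A minimal sketch of the read/map/write shape such a pipeline typically has, assuming the per-row work is farmed out to `n_cores` workers (the placeholder `find_first_commit` stands in for the real `IdentifierFileCommitRanger` lookup; a thread pool is used here for illustration, while the real module likely uses processes via `_parallel_comp`):

```python
import csv
from multiprocessing.dummy import Pool  # thread pool, for illustration only

def find_first_commit(row):
    # Placeholder for the real per-identifier git lookup
    # (IdentifierFileCommitRanger in prepare_dataset).
    return row + ['<first-commit-hash>']

def pipeline(input_csv, output_csv, n_cores=1):
    """Sketch: read rows, process them on n_cores workers,
    write the enriched rows to output_csv."""
    with open(input_csv, newline='') as f:
        rows = list(csv.reader(f))
    with Pool(n_cores) as pool:
        results = pool.map(find_first_commit, rows)
    with open(output_csv, 'w', newline='') as f:
        csv.writer(f).writerows(results)
```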
prepare_dataset.parse_args()
prepare_dataset.args
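A plausible reconstruction of `parse_args` with `argparse`; the flag names are assumptions inferred from `pipeline()`'s parameters, not confirmed by the source.

```python
import argparse

def parse_args(argv=None):
    # Flag names below are assumed, mirroring pipeline()'s parameters.
    parser = argparse.ArgumentParser(
        description='Filter and prepare dataset for evaluation.')
    parser.add_argument('--input-csv', required=True, help='Path to input csv.')
    parser.add_argument('--output-csv', required=True, help='Path to store result csv.')
    parser.add_argument('--n-cores', type=int, default=1, help='How many cores to use.')
    parser.add_argument('--cache', default='/tmp',
                        help='Cache location. If empty, caching is disabled.')
    return parser.parse_args(argv)

args = parse_args(['--input-csv', 'in.csv', '--output-csv', 'out.csv'])
# args.n_cores → 1
```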