lookout.style.format.uast_stability_checker

Module to check if predictions change the UAST structure of the processed files.

Module Contents

class lookout.style.format.uast_stability_checker.UASTStabilityChecker(feature_extractor:FeatureExtractor, debug:bool=False)

Check if predictions change the UAST structure of the processed files.

See check() or _check_file() for more info.

_log
_check_return_type
_parse_code(self, vnode:VirtualNode, parent:bblfsh.Node, content:str, stub:'bblfsh.aliases.ProtocolServiceStub', node_parents:Mapping[int, bblfsh.Node], path:str)

Find a parent node that Babelfish can parse and parse it.

Walks up the parents of the current virtual node until it finds one that Babelfish can parse, and returns the parsed UAST. Returns None if it reaches the root without finding a parsable parent.

The cache will be used to avoid recomputations for parents that have already been considered.

Parameters:
  • vnode – Virtual node that is modified. Used to check that the correct parent is retrieved.
  • parent – First node to try to parse. The search moves up the tree if parsing fails.
  • content – Content of the file.
  • stub – Babelfish GRPC service stub.
  • node_parents – Parents mapping of the input UASTs.
  • path – Path of the file being parsed.
Returns:

Tuple of the parsed UAST and its starting and ending offsets. None if no parent could be parsed, i.e. Babelfish failed to parse even the whole file.
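
The parent walk described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual code: SpanNode, parse_closest_parent, try_parse and the cache argument are hypothetical names, the Babelfish call is replaced by a plain callable, and the parents mapping is assumed to be keyed by id(node).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional, Tuple


@dataclass
class SpanNode:
    """Hypothetical stand-in for bblfsh.Node; only the covered offsets are modeled."""
    start: int
    end: int


ParseResult = Tuple[object, int, int]


def parse_closest_parent(
    parent: SpanNode,
    content: str,
    node_parents: Mapping[int, SpanNode],
    try_parse: Callable[[str], Optional[object]],
    cache: Dict[int, Optional[ParseResult]],
) -> Optional[ParseResult]:
    """Walk up from `parent` until `try_parse` accepts the covered source span."""
    current: Optional[SpanNode] = parent
    while current is not None:
        key = id(current)  # assumption: the parents mapping is keyed by id(node)
        if key not in cache:
            parsed = try_parse(content[current.start:current.end])
            # Failures are cached too, so a parent is never parsed twice.
            cache[key] = None if parsed is None else (parsed, current.start, current.end)
        if cache[key] is not None:
            return cache[key]
        # Not parsable on its own: move to the enclosing node, if there is one.
        current = node_parents.get(key)
    return None
```

Sharing one cache dictionary across calls means that sibling virtual nodes with the same unparsable parent trigger a single parse attempt for it.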

_check_file(self, y:numpy.ndarray, y_pred:numpy.ndarray, vnodes_y:Sequence[VirtualNode], vnodes:Sequence[VirtualNode], file:File, stub:'bblfsh.aliases.ProtocolServiceStub', vnode_parents:Mapping[int, bblfsh.Node], node_parents:Mapping[int, bblfsh.Node], rule_winners:numpy.ndarray, grouped_quote_predictions:QuotedNodeTripleMapping)
check(self, y:numpy.ndarray, y_pred:numpy.ndarray, vnodes_y:Sequence[VirtualNode], vnodes:Sequence[VirtualNode], files:Sequence[File], stub:'bblfsh.aliases.ProtocolServiceStub', vnode_parents:Mapping[int, bblfsh.Node], node_parents:Mapping[int, bblfsh.Node], rule_winners:numpy.ndarray, grouped_quote_predictions:QuotedNodeTripleMapping)

Filter out the model’s predictions that modify the UAST in any way other than changing Node positions.

Parameters:
  • y – Numpy 1-dimensional array of labels.
  • y_pred – Numpy 1-dimensional array of labels predicted by the model.
  • vnodes_y – Sequence of the labeled VirtualNode-s corresponding to labeled samples.
  • vnodes – Sequence of all the VirtualNode-s corresponding to the input.
  • files – File or Sequence of File-s with content, uast and path.
  • stub – Babelfish GRPC service stub.
  • vnode_parents – VirtualNode-s’ parents mapping as the LCA of the closest left and right babelfish nodes.
  • node_parents – Parents mapping of the input UASTs.
  • rule_winners – Numpy array with the indexes of the winning rules for each sample.
  • grouped_quote_predictions – Quotes predictions (handled differently from the rest).
Returns:

List of prediction indices that are considered valid, i.e. that do not break the UAST.
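
To show the shape of the returned value, here is a minimal sketch of such a filter. It is not the library's implementation: filter_safe_predictions and keeps_uast are hypothetical names, the per-sample UAST comparison is abstracted into a callable, and treating predictions that leave the label unchanged as trivially safe is an assumption.

```python
from typing import Callable, List, Sequence


def filter_safe_predictions(
    y: Sequence[int],
    y_pred: Sequence[int],
    keeps_uast: Callable[[int], bool],
) -> List[int]:
    """Return the indices of the predictions that can be applied safely."""
    safe = []
    for i, (label, predicted) in enumerate(zip(y, y_pred)):
        # Assumption: an unchanged label cannot alter the UAST, so it is kept
        # without running the (expensive) structural check.
        if label == predicted or keeps_uast(i):
            safe.append(i)
    return safe
```

The sketch accepts any integer sequences, so 1-dimensional Numpy arrays work as well. The resulting index list can then be used to select the matching entries of y_pred, vnodes_y and rule_winners.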

static check_uasts_equivalent(uast1:bblfsh.Node, uast2:bblfsh.Node)

Check whether two UAST nodes are identical with respect to the roles, internal_type and token of every node in their subtrees.

Parameters:
  • uast1 – The bblfsh.Node of the first UAST to compare.
  • uast2 – The bblfsh.Node of the second UAST to compare.
Returns:

True if the 2 input UASTs are identical and False otherwise.
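
A comparison of this kind could look like the sketch below. It is only an illustration: ToyUast is a hypothetical stand-in for bblfsh.Node that exposes the same roles, internal_type, token and children attributes, and the sketch assumes that children are compared pairwise in their stored order, which may differ from what the actual method does.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyUast:
    """Hypothetical stand-in for bblfsh.Node."""
    internal_type: str
    roles: List[int] = field(default_factory=list)
    token: str = ""
    children: List["ToyUast"] = field(default_factory=list)
    start_offset: int = 0  # position information, deliberately ignored below


def uasts_equivalent(uast1: ToyUast, uast2: ToyUast) -> bool:
    """Compare two trees on roles, internal_type and token, ignoring positions."""
    stack = [(uast1, uast2)]
    while stack:
        a, b = stack.pop()
        if (
            a.internal_type != b.internal_type
            or list(a.roles) != list(b.roles)
            or a.token != b.token
            or len(a.children) != len(b.children)
        ):
            return False
        # Assumption: children are matched pairwise in their stored order.
        stack.extend(zip(a.children, b.children))
    return True
```

An explicit stack is used instead of recursion so that deep trees do not hit Python's recursion limit. Because start_offset is never read, two trees that differ only in node positions compare as equivalent.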