lookout.style.format.annotations.annotated_data

Module Contents

exception lookout.style.format.annotations.annotated_data.NoAnnotation

Bases: Exception

Raised by AnnotationManager methods if there is no Annotation found.

See documentation about AnnotationManager.find_overlapping_annotation() or AnnotationManager.find_covering_annotation() for more information.

class lookout.style.format.annotations.annotated_data.AnnotationsSpan(start, stop, *args, **kwargs)

Bases: dict

Annotations collection for a specific span (or range).

Dictionary-like object.

start
stop
span
class lookout.style.format.annotations.annotated_data.AnnotationManager(sequence:str)

Manager of Annotation objects for a text, e.g. source code.

All the methods to work with annotated data should be placed in this class. Candidates can be found here: https://uima.apache.org/d/uimafit-current/api/org/apache/uima/fit/util/JCasUtil.html

sequence
__len__(self)

Return the size of the underlying sequence.

__getitem__(self, item:Union[int, slice, Tuple[int, int]])

Get the underlying sequence item or sequence slice for the specified range.

Parameters: item – index, slice or (start, stop) tuple.
Returns: The corresponding part of the sequence.
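
The three accepted index forms can be illustrated with a small stand-in class. This is a minimal sketch, not the library's implementation: `SequenceView` is a hypothetical name, and treating the (start, stop) tuple as a half-open range is an assumption based on the interval convention used elsewhere in this module.

```python
from typing import Tuple, Union


class SequenceView:
    """Hypothetical stand-in that mimics the documented indexing forms."""

    def __init__(self, sequence: str):
        self.sequence = sequence

    def __len__(self) -> int:
        # Size of the underlying sequence.
        return len(self.sequence)

    def __getitem__(self, item: Union[int, slice, Tuple[int, int]]) -> str:
        # Assumption: a (start, stop) tuple means the half-open range [start, stop).
        if isinstance(item, tuple):
            start, stop = item
            return self.sequence[start:stop]
        # Plain int and slice objects are passed straight to the string.
        return self.sequence[item]


view = SequenceView("x = 1")
first_char = view[0]        # integer index
first_slice = view[0:1]     # slice
last_by_span = view[4, 5]   # (start, stop) tuple
```

Accepting a tuple lets a caller pass an annotation's span directly, without unpacking it into a slice first.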
count(self, annotation_type:Type[Annotation])

Count the number of annotations of a specific type.

add(self, *annotations)

Add several annotations.

_add(self, annotation:Annotation)

Add an annotation. Annotations of the same type may not overlap.

get(self, annotation_type:Type[Annotation], span:Optional[Tuple[int, int]]=None)

Return an annotation for the given span and type.

Looking for an exact (type and span) match only.

Parameters:
  • annotation_type – Annotation type to get.
  • span – Annotation span (range) to get. If span is not specified, the annotation that covers the whole content (the global annotation) is returned.
Returns:

Requested Annotation.

iter_by_type_nested(self, annotation_type:Type[Annotation], *covered_by, start_offset:Optional[int]=None)

Iterate over annotations of the specified type which are covered by some other annotation types.

Iteration goes over annotation_type objects. Annotations of the types listed in covered_by are added to the resulting AnnotationsSpan object. The span of each additional annotation must fully cover the span of the annotation_type annotation.

For example, suppose that you have line and token annotations, and each line contains several tokens. If you iterate through line and pass token as the covered_by annotation, you get only the line annotation inside AnnotationsSpan. This happens because you can annotate a token with a line but not a line with a token: a token is covered by exactly one line, but not vice versa.

So,

    manager.iter_by_type_nested(line, token)  # gives you only line annotation as output,
                                              # token annotations not found
    manager.iter_by_type_nested(token, line)  # gives you token and line annotations
                                              # because it is possible to find only one
                                              # covering line annotation.

covered_by can’t be empty. If you need to iterate over a single annotation type, call AnnotationManager.iter_by_type() instead.

Parameters:
  • annotation_type – Type of annotation to iterate through.
  • covered_by – Additional annotations that should be added to the main one if they cover its span.
  • start_offset – Start to iterate from a specific offset. Can be used as a keyword argument only.
Returns:

Iterator over annotations of the requested type.
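
The asymmetry between the two argument orders can be reproduced with plain (start, stop) tuples. The following is a minimal sketch under stated assumptions, not the library's code: `covering_span` and `iter_nested` are hypothetical helpers, and the `"main"` key is an illustrative stand-in for the annotation type being iterated.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Span = Tuple[int, int]


def covering_span(span: Span, candidates: List[Span]) -> Optional[Span]:
    """Return the first candidate that fully covers `span`, or None."""
    start, stop = span
    for c_start, c_stop in candidates:
        if c_start <= start and stop <= c_stop:
            return (c_start, c_stop)
    return None


def iter_nested(main: List[Span],
                covered_by: Dict[str, List[Span]]) -> Iterator[Dict[str, Span]]:
    """Yield each main span together with the spans that cover it."""
    for span in main:
        found = {"main": span}
        for name, candidates in covered_by.items():
            cover = covering_span(span, candidates)
            if cover is not None:
                found[name] = cover
        yield found


# Text "a = 1\nb = 2\n": two lines, three one-character tokens per line.
lines = [(0, 6), (6, 12)]
tokens = [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11)]

by_token = list(iter_nested(tokens, {"line": lines}))  # each token finds its line
by_line = list(iter_nested(lines, {"token": tokens}))  # no token covers a whole line
```

Iterating over tokens attaches the covering line to every item, while iterating over lines yields items that contain the line alone.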

iter_by_type(self, annotation_type:Type[Annotation], *, start_offset:Optional[int]=None)

Iterate over annotations of the specified type.

If you need to iterate through several annotation types at once, use AnnotationManager.iter_by_type_nested() instead.

Parameters:
  • annotation_type – Type of annotation to iterate through.
  • start_offset – Start to iterate from a specific offset. Can be used as a keyword argument only.
Returns:

Iterator over annotations of the requested type.

_find_annotations(self, annotation_type:Type[Annotation], start:int, stop:int, inspect:Callable, action:str)
find_overlapping_annotation(self, annotation_type:Type[Annotation], start:int, stop:int)

Find an annotation of the given type that intersects the interval [start, stop).

Parameters:
  • annotation_type – Annotation type to look for.
  • start – Start of the search interval.
  • stop – End of the search interval. Stop point itself is excluded.
Raises:

NoAnnotation – There is no such annotation that overlaps with the given interval.

Returns:

Annotation of the requested type.

find_covering_annotation(self, annotation_type:Type[Annotation], start:int, stop:int)

Find an annotation of the given type that fully covers the interval [start, stop).

Parameters:
  • annotation_type – Annotation type to look for.
  • start – Start of the search interval.
  • stop – End of the search interval. Stop point itself is excluded.
Raises:

NoAnnotation – There is no such annotation that fully covers the given interval.

Returns:

Annotation of the requested type.
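
The difference between "intersects" (find_overlapping_annotation) and "fully covers" (find_covering_annotation) can be shown on non-empty half-open intervals. This is a minimal sketch with hypothetical helper names `overlaps` and `covers`; it assumes non-empty intervals only and does not model how the manager stores or searches annotations.

```python
from typing import Tuple

Span = Tuple[int, int]


def overlaps(span: Span, start: int, stop: int) -> bool:
    """True if the non-empty `span` shares at least one point with [start, stop)."""
    a_start, a_stop = span
    return max(a_start, start) < min(a_stop, stop)


def covers(span: Span, start: int, stop: int) -> bool:
    """True if `span` contains every point of [start, stop)."""
    a_start, a_stop = span
    return a_start <= start and stop <= a_stop


line = (0, 10)  # hypothetical annotation span [0, 10)

partly_inside = (overlaps(line, 8, 12), covers(line, 8, 12))   # crosses the right edge
fully_inside = (overlaps(line, 2, 5), covers(line, 2, 5))      # lies within the span
adjacent = (overlaps(line, 10, 12), covers(line, 10, 12))      # starts where the span stops
```

For non-empty intervals, every covering annotation is also an overlapping one, but not the other way around.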

classmethod _check_spans_overlap(cls, start1:int, stop1:int, start2:int, stop2:int)

Check if two spans have at least 1 common point.

Span 1 is [start1, stop1). stop1 itself is excluded. Span 2 is [start2, stop2). stop2 itself is excluded.

Everywhere in the following examples x < y < z. Corner cases explained:

1. [x, y) and [y, z) have no overlap because y is excluded from the 1st interval.
2. 0-intervals:
   2.1. [y, y) and [y, y) are overlapping because it is the same interval.
   2.2. [y, y) and [y, z) have no overlap.
   2.3. [x, y) and [y, y) have no overlap.
   2.4. [x, z) and [y, y) are overlapping because [x, z) fully covers y point.

Although overlapping rules are defined for 0-intervals, it is unsafe to rely on them. If you need an additional annotation for a 0-interval annotation, link one annotation to the other instead. See TokenAnnotation for an example.

Parameters:
  • start1 – Start offset of the first span.
  • stop1 – Stop offset of the first span.
  • start2 – Start offset of the second span.
  • stop2 – Stop offset of the second span.
Returns:

True if two spans overlap, otherwise False.
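
The corner cases listed above can be written down as a standalone function. This is a sketch that reproduces the documented rules and is not necessarily how the library implements the check; `spans_overlap` is a hypothetical name.

```python
def spans_overlap(start1: int, stop1: int, start2: int, stop2: int) -> bool:
    """Overlap check for half-open spans [start1, stop1) and [start2, stop2)."""
    empty1 = start1 == stop1
    empty2 = start2 == stop2
    if empty1 and empty2:
        # Case 2.1: two 0-intervals overlap only if they are the same point.
        return start1 == start2
    if empty1:
        # Cases 2.2-2.4: the point must lie strictly inside the other span.
        return start2 < start1 < stop2
    if empty2:
        return start1 < start2 < stop1
    # Regular case, including case 1: the intersection must be non-empty.
    return max(start1, start2) < min(stop1, stop2)


x, y, z = 0, 1, 2  # any values with x < y < z

case_1 = spans_overlap(x, y, y, z)
case_2_1 = spans_overlap(y, y, y, y)
case_2_2 = spans_overlap(y, y, y, z)
case_2_3 = spans_overlap(x, y, y, y)
case_2_4 = spans_overlap(x, z, y, y)
```

Handling the 0-interval branches first keeps the regular case to a single comparison.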

classmethod from_file(cls, file:UnicodeFile)

Create AnnotationManager instance from UnicodeFile.

Parameters: file – file.content will be used as data to be annotated with file.path, file.language and file.uast.
Returns: new AnnotationManager instance.