# API REFERENCES

## Pecha
* [Pecha.from_path()](#pechafrom_path)
* [Pecha.create()](#pechacreate)
* [Pecha.base_path()](#pechabase_path)
* [Pecha.layer_path()](#pechalayer_path)
* [Pecha.metadata_path()](#pechametadata_path)
* [Pecha.get_base()](#pechaget_base)
* [Pecha.set_base()](#pechaset_base)
* [Pecha.add_layer()](#pechaadd_layer)
* [Pecha.add_annotation()](#pechaadd_annotation)
* [Pecha.set_metadata()](#pechaset_metadata)
* [Pecha.get_layers()](#pechaget_layers)
* [Pecha.get_segmentation_layer_path()](#pechaget_segmentation_layer_path)
* [Pecha.get_first_layer_path()](#pechaget_first_layer_path)
* [Pecha.get_layer_by_ann_type()](#pechaget_layer_by_ann_type)
* [Pecha.get_layer_by_filename()](#pechaget_layer_by_filename)
* [Pecha.publish()](#pechapublish)
* [Pecha.merge_pecha()](#pechamerge_pecha)

## DocxRootParser
* [DocxRootParser.parse()](#docxrootparserparse)
* [DocxRootParser.extract_anns()](#docxrootparserextract_anns)
* [DocxRootParser.extract_segmentation_anns()](#docxrootparserextract_segmentation_anns)
* [DocxRootParser.extract_alignment_anns()](#docxrootparserextract_alignment_anns)

## DocxSimpleCommentaryParser
* [DocxSimpleCommentaryParser.parse()](#docxsimplecommentaryparserparse)
* [DocxSimpleCommentaryParser.extract_anns()](#docxsimplecommentaryparserextract_anns)
* [DocxSimpleCommentaryParser.extract_segmentation_anns()](#docxsimplecommentaryparserextract_segmentation_anns)
* [DocxSimpleCommentaryParser.extract_alignment_anns()](#docxsimplecommentaryparserextract_alignment_anns)

## DocxAnnotationParser
* [DocxAnnotationParser.add_annotation()](#docxannotationparseradd_annotation)

## DocxAnnotationUpdate
* [DocxAnnotationUpdate.extract_layer_name()](#docxannotationupdateextract_layer_name)
* [DocxAnnotationUpdate.extract_layer_id()](#docxannotationupdateextract_layer_id)
* [DocxAnnotationUpdate.extract_layer_enum()](#docxannotationupdateextract_layer_enum)
* [DocxAnnotationUpdate.update_annotation()](#docxannotationupdateupdate_annotation)

## TranslationAlignmentTransfer
* [TranslationAlignmentTransfer.is_empty()](#translationalignmenttransferis_empty)
* [TranslationAlignmentTransfer.get_segmentation_ann_path()](#translationalignmenttransferget_segmentation_ann_path)
* [TranslationAlignmentTransfer.map_layer_to_layer()](#translationalignmenttransfermap_layer_to_layer)
* [TranslationAlignmentTransfer.get_root_pechas_mapping()](#translationalignmenttransferget_root_pechas_mapping)
* [TranslationAlignmentTransfer.get_translation_pechas_mapping()](#translationalignmenttransferget_translation_pechas_mapping)
* [TranslationAlignmentTransfer.mapping_to_text_list()](#translationalignmenttransfermapping_to_text_list)
* [TranslationAlignmentTransfer.get_serialized_translation_alignment()](#translationalignmenttransferget_serialized_translation_alignment)
* [TranslationAlignmentTransfer.get_serialized_translation_segmentation()](#translationalignmenttransferget_serialized_translation_segmentation)

## CommentaryAlignmentTransfer
* [CommentaryAlignmentTransfer.get_first_valid_root_idx()](#commentaryalignmenttransferget_first_valid_root_idx)
* [CommentaryAlignmentTransfer.is_valid_ann()](#commentaryalignmenttransferis_valid_ann)
* [CommentaryAlignmentTransfer.get_segmentation_ann_path()](#commentaryalignmenttransferget_segmentation_ann_path)
* [CommentaryAlignmentTransfer.index_annotations_by_root()](#commentaryalignmenttransferindex_annotations_by_root)
* [CommentaryAlignmentTransfer.map_layer_to_layer()](#commentaryalignmenttransfermap_layer_to_layer)
* [CommentaryAlignmentTransfer.get_root_pechas_mapping()](#commentaryalignmenttransferget_root_pechas_mapping)
* [CommentaryAlignmentTransfer.get_commentary_pechas_mapping()](#commentaryalignmenttransferget_commentary_pechas_mapping)
* [CommentaryAlignmentTransfer.get_serialized_commentary()](#commentaryalignmenttransferget_serialized_commentary)
* [CommentaryAlignmentTransfer.get_serialized_commentary_segmentation()](#commentaryalignmenttransferget_serialized_commentary_segmentation)
* [CommentaryAlignmentTransfer.format_serialized_commentary()](#commentaryalignmenttransferformat_serialized_commentary)
* [CommentaryAlignmentTransfer.process_commentary_ann()](#commentaryalignmenttransferprocess_commentary_ann)

### `Pecha.from_path() -> Pecha`

Loads a Pecha instance from a local path.

- **Parameters:**
  - `pecha_path` (Path): Path to the Pecha directory
- **Returns:** `Pecha` instance
- **Example:**

```python
from pathlib import Path
from openpecha.pecha import Pecha

pecha = Pecha.from_path(Path("/path/to/pecha"))
```

### `Pecha.create() -> Pecha`

Creates a new Pecha instance in the specified output directory.

- **Parameters:**
  - `output_path` (Path): Directory where the Pecha should be created
  - `pecha_id` (str, optional): Custom Pecha ID. If not provided, a new ID will be generated
- **Returns:** `Pecha` instance
- **Example:**

```python
from pathlib import Path
from openpecha.pecha import Pecha

pecha = Pecha.create(Path("./output"))
```

### `Pecha.base_path() -> Path`

Returns the path to the base directory which contains all the base files. If the directory does not exist, it is created.

- **Returns:** Path object pointing to the base directory
- **Example:**

```python
base_dir = pecha.base_path
print(base_dir)  # /path/to/pecha/base
```

### `Pecha.layer_path() -> Path`

Returns the path to the layers directory which contains all the annotation files. If the directory does not exist, it is created.

- **Returns:** Path object pointing to the layers directory
- **Example:**

```python
layer_dir = pecha.layer_path
print(layer_dir)  # /path/to/pecha/layers
```

### `Pecha.metadata_path() -> Path`

Returns the path to the metadata file.
- **Returns:** Path object pointing to the metadata file
- **Example:**

```python
metadata_file = pecha.metadata_path
print(metadata_file)  # /path/to/pecha/metadata.json
```

### `Pecha.get_base() -> str`

Gets the content of a base file by its name.

- **Parameters:**
  - `base_name` (str): Name of the base file
- **Returns:** str containing the base text content
- **Example:**

```python
base_text = pecha.get_base("base1")
```

### `Pecha.set_base() -> str`

Sets the content of a base file.

- **Parameters:**
  - `content` (str): Text content to write to the base file
  - `base_name` (str, optional): Name for the base file. If not provided, a new ID will be generated
- **Returns:** str containing the base name
- **Example:**

```python
base_name = pecha.set_base("This is the text content", "base1")
```

### `Pecha.add_layer() -> Tuple[AnnotationStore, Path]`

Adds a new annotation layer for a given base.

- **Parameters:**
  - `base_name` (str): Name of the base file to associate with this layer
  - `layer_type` (AnnotationType): Type of annotation layer (must be included in AnnotationType enum)
- **Returns:** Tuple of (AnnotationStore, Path) containing:
  - AnnotationStore: The created annotation store
  - Path: Path to the layer file
- **Example:**

```python
from openpecha.pecha.layer import AnnotationType

# Add a segmentation layer
layer, layer_path = pecha.add_layer("base1", AnnotationType.SEGMENTATION)

# Add a chapter layer
layer, layer_path = pecha.add_layer("base1", AnnotationType.CHAPTER)
```

- **Note:** The layer file will be created with a name format of `{layer_type}-{random_id}.json` in the layers directory under the base name folder.

### `Pecha.add_annotation() -> AnnotationStore`

Adds an annotation to an existing annotation layer (Annotation Store).
- **Parameters:**
  - `ann_store` (AnnotationStore): The annotation store/layer to add the annotation to
  - `annotation` (BaseAnnotation): The annotation object to add (e.g., SegmentationAnnotation, CitationAnnotation)
  - `layer_type` (AnnotationType): The type of annotation (must match the layer type)
- **Returns:** AnnotationStore with the added annotation
- **Example:**

```python
from openpecha.pecha.annotations import Span, SegmentationAnnotation
from openpecha.pecha.layer import AnnotationType

# Create a segmentation annotation
ann = SegmentationAnnotation(span=Span(start=0, end=10), index=1)

# Add the annotation to the layer
layer = pecha.add_annotation(layer, ann, AnnotationType.SEGMENTATION)

# Save the layer after adding annotations
layer.save()
```

- **Note:**
  - The annotation's span must be valid for the base text
  - The layer_type must match the type of annotation being added
  - The layer must be saved after adding annotations to persist the changes

### `Pecha.set_metadata() -> PechaMetaData`

Updates the Pecha's metadata with new values while preserving existing metadata fields if not overridden.

- **Parameters:**
  - `pecha_metadata` (Dict): Dictionary containing metadata fields to update. Can include:
    - `title` (Dict[str, str] | str): Title in different languages or single language
    - `author` (List[str] | Dict[str, str] | str): Author(s) information
    - `language` (str): Language code (e.g., 'bo', 'en')
    - `parser` (str): Name of the parser used
    - `initial_creation_type` (str): How the Pecha was created
    - `source_metadata` (Dict): Additional source information
    - `copyright` (Dict): Copyright information
    - `licence` (str): License type
- **Returns:** Updated PechaMetaData object
- **Example:**

```python
# Update metadata with new values
pecha.set_metadata({
    "title": {"en": "New Title", "bo": "གསར་བཅོས་ཁ་བྱང་།"},
    "author": ["Author 1", "Author 2"],
    "language": "bo",
    "source_metadata": {
        "id": "source123",
        "publisher": "Publisher Name"
    }
})

# Update specific fields while preserving others
pecha.set_metadata({
    "title": {"en": "Updated Title"},
    "copyright": {
        "year": "2024",
        "holder": "Copyright Holder"
    }
})
```

- **Note:**
  - Existing metadata fields not included in the update dictionary will be preserved
  - The parser and initial_creation_type fields will be preserved from existing metadata if not specified
  - The metadata is automatically saved to the metadata.json file
  - Invalid metadata will raise a ValueError

### `Pecha.get_layers() -> Generator[Tuple[str, AnnotationStore], None, None]`

Returns all layers from the Pecha associated with the given base.

- **Parameters:**
  - `base_name` (str): Name of the base file
  - `from_cache` (bool, optional): Whether to load from cache. Defaults to False
- **Returns:** Generator yielding tuples of (layer_name, AnnotationStore)
- **Example:**

```python
for layer_name, layer_store in pecha.get_layers("base1"):
    print(layer_name, layer_store)
```

### `Pecha.get_segmentation_layer_path() -> str`

Gets the path to the first segmentation layer file.
- **Returns:** str containing the relative path to the segmentation layer file
- **Example:**

```python
layer_path = pecha.get_segmentation_layer_path()
```

### `Pecha.get_first_layer_path() -> str`

Gets the path to the first layer file.

- **Returns:** str containing the relative path to the first layer file
- **Example:**

```python
layer_path = pecha.get_first_layer_path()
```

### `Pecha.get_layer_by_ann_type() -> Union[Tuple[AnnotationStore, Path], Tuple[List[AnnotationStore], List[Path]]]`

Gets layers by annotation type.

- **Parameters:**
  - `base_name` (str): Name of the base file
  - `layer_type` (AnnotationType): Type of annotation to retrieve
- **Returns:** Tuple of (AnnotationStore or list of AnnotationStore, Path or list of Path)
- **Example:**

```python
from openpecha.pecha.layer import AnnotationType

layer, layer_path = pecha.get_layer_by_ann_type("base1", AnnotationType.SEGMENTATION)
```

### `Pecha.get_layer_by_filename() -> Optional[AnnotationStore]`

Gets a layer by its filename.

- **Parameters:**
  - `base_name` (str): Name of the base file
  - `filename` (str): Name of the layer file
- **Returns:** AnnotationStore or None if not found
- **Example:**

```python
layer = pecha.get_layer_by_filename("base1", "segmentation-1234.json")
```

### `Pecha.publish() -> None`

Publishes the Pecha to GitHub and optionally creates a release with assets.

- **Parameters:**
  - `asset_path` (Path, optional): Path to the asset directory
  - `asset_name` (str, optional): Name for the asset. Defaults to "source_data"
  - `branch` (str, optional): Branch to publish to. Defaults to "main"
  - `is_private` (bool, optional): Whether the repository should be private. Defaults to False
- **Example:**

```python
from pathlib import Path

pecha.publish(
    asset_path=Path("./assets"),
    asset_name="source_data",
    branch="main",
    is_private=False
)
```

### `Pecha.merge_pecha() -> None`

Merges the layers of a source pecha into the current pecha.
- **Parameters:**
  - `source_pecha` (Pecha): The source Pecha instance
  - `source_base_name` (str): The base name of the source pecha
  - `target_base_name` (str): The base name of the target (current) pecha
- **Example:**

```python
pecha.merge_pecha(source_pecha, "source_base", "target_base")
```

### `DocxRootParser.parse() -> Tuple[Pecha, annotation_path]`

Parses a DOCX file and creates a Pecha object with annotations.

- **Parameters:**
  - `input` (str | Path): Path to the DOCX file to be parsed
  - `annotation_type` (AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
  - `metadata` (Dict): Dictionary containing metadata for the Pecha
  - `output_path` (Path, optional): Directory where the Pecha should be created. Defaults to PECHAS_PATH
  - `pecha_id` (str | None, optional): Custom Pecha ID. If not provided, a new ID will be generated
- **Returns:** Tuple containing:
  - Pecha: The created Pecha instance
  - annotation_path: Path to the created annotation layer file
- **Example:**

```python
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.root import DocxRootParser

parser = DocxRootParser()
pecha, layer_path = parser.parse(
    input="path/to/file.docx",
    annotation_type=AnnotationType.SEGMENTATION,
    metadata={"title": "Sample Title"},
    output_path=Path("./output")
)
```

### `DocxRootParser.extract_anns() -> Tuple[List[BaseAnnotation], str]`

Extracts text and annotations from a DOCX file.
- **Parameters:**
  - `docx_file` (Path): Path to the DOCX file
  - `annotation_type` (AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
- **Returns:** Tuple containing:
  - List[BaseAnnotation]: List of extracted annotations
  - str: The extracted base text
- **Example:**

```python
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.root import DocxRootParser

parser = DocxRootParser()
anns, base = parser.extract_anns(
    Path("path/to/file.docx"),
    AnnotationType.SEGMENTATION
)
```

### `DocxRootParser.extract_segmentation_anns() -> Tuple[List[SegmentationAnnotation], str]`

Extracts segmentation annotations from numbered text.

- **Parameters:**
  - `numbered_text` (Dict[str, str]): Dictionary mapping segment numbers to text content
- **Returns:** Tuple containing:
  - List[SegmentationAnnotation]: List of segmentation annotations
  - str: The concatenated base text
- **Example:**

```python
from openpecha.pecha.parsers.docx.root import DocxRootParser

parser = DocxRootParser()
numbered_text = {
    "1": "First segment",
    "2": "Second segment"
}
anns, base = parser.extract_segmentation_anns(numbered_text)
```

### `DocxRootParser.extract_alignment_anns() -> Tuple[List[AlignmentAnnotation], str]`

Extracts alignment annotations from numbered text.

- **Parameters:**
  - `numbered_text` (Dict[str, str]): Dictionary mapping segment numbers to text content
- **Returns:** Tuple containing:
  - List[AlignmentAnnotation]: List of alignment annotations
  - str: The concatenated base text
- **Example:**

```python
from openpecha.pecha.parsers.docx.root import DocxRootParser

parser = DocxRootParser()
numbered_text = {
    "1": "First segment",
    "2": "Second segment"
}
anns, base = parser.extract_alignment_anns(numbered_text)
```

### `DocxSimpleCommentaryParser.parse() -> Tuple[Pecha, annotation_path]`

Parses a DOCX file and creates a commentary Pecha object with annotations.
- **Parameters:**
  - `input` (str | Path): Path to the DOCX file to be parsed
  - `annotation_type` (AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
  - `metadata` (Dict[str, Any]): Dictionary containing metadata for the Pecha
  - `output_path` (Path, optional): Directory where the Pecha should be created. Defaults to PECHAS_PATH
  - `pecha_id` (str | None, optional): Custom Pecha ID. If not provided, a new ID will be generated
- **Returns:** Tuple containing:
  - Pecha: The created Pecha instance
  - annotation_path: Path to the created annotation layer file
- **Example:**

```python
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

parser = DocxSimpleCommentaryParser()
pecha, layer_path = parser.parse(
    input="path/to/commentary.docx",
    annotation_type=AnnotationType.ALIGNMENT,
    metadata={"title": "Commentary Title", "commentary_of": "P0001"},
    output_path=Path("./output")
)
```

### `DocxSimpleCommentaryParser.extract_anns() -> Tuple[List[BaseAnnotation], str]`

Extracts text and annotations from a commentary DOCX file.

- **Parameters:**
  - `docx_file` (Path): Path to the DOCX file
  - `annotation_type` (AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
- **Returns:** Tuple containing:
  - List[BaseAnnotation]: List of extracted annotations
  - str: The extracted base text
- **Example:**

```python
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

parser = DocxSimpleCommentaryParser()
anns, base = parser.extract_anns(
    Path("path/to/commentary.docx"),
    AnnotationType.ALIGNMENT
)
```

### `DocxSimpleCommentaryParser.extract_segmentation_anns() -> Tuple[List[SegmentationAnnotation], str]`

Extracts segmentation annotations from numbered commentary text.
- **Parameters:**
  - `numbered_text` (Dict[str, str]): Dictionary mapping segment numbers to text content
- **Returns:** Tuple containing:
  - List[SegmentationAnnotation]: List of segmentation annotations
  - str: The concatenated base text
- **Example:**

```python
from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

parser = DocxSimpleCommentaryParser()
numbered_text = {
    "1": "First commentary segment",
    "2": "Second commentary segment"
}
anns, base = parser.extract_segmentation_anns(numbered_text)
```

### `DocxSimpleCommentaryParser.extract_alignment_anns() -> Tuple[List[AlignmentAnnotation], str]`

Extracts alignment annotations from numbered commentary text, handling root text references.

- **Parameters:**
  - `numbered_text` (Dict[str, str]): Dictionary mapping segment numbers to text content
- **Returns:** Tuple containing:
  - List[AlignmentAnnotation]: List of alignment annotations with root text references
  - str: The concatenated base text
- **Example:**

```python
from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

parser = DocxSimpleCommentaryParser()
numbered_text = {
    "1": "1-2 First commentary segment",
    "2": "3-4 Second commentary segment"
}
anns, base = parser.extract_alignment_anns(numbered_text)
```

- **Note:** The commentary text can include root text references in the format "1-2 Commentary text" where "1-2" refers to the root text segments being commented on.

### `DocxAnnotationParser.add_annotation() -> Tuple[Pecha, annotation_path]`

Adds annotations to an existing Pecha from a DOCX file.
- **Parameters:**
  - `pecha` (Pecha): The Pecha instance to add annotations to
  - `type` (AnnotationType | str): Type of annotation to extract (ALIGNMENT, SEGMENTATION, or FOOTNOTE)
  - `docx_file` (Path): Path to the DOCX file containing annotations
  - `metadatas` (List[Any]): List of metadata objects to determine if the Pecha is root-related
- **Returns:** Tuple containing:
  - Pecha: The updated Pecha instance
  - annotation_path: Path to the created annotation layer file
- **Example:**

```python
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.annotation import DocxAnnotationParser

parser = DocxAnnotationParser()
pecha, layer_path = parser.add_annotation(
    pecha=existing_pecha,
    type=AnnotationType.FOOTNOTE,
    docx_file=Path("path/to/annotations.docx"),
    metadatas=[metadata]
)
```

- **Note:**
  - The parser supports three types of annotations: ALIGNMENT, SEGMENTATION, and FOOTNOTE
  - For FOOTNOTE annotations, it uses DocxFootnoteParser
  - For root-related Pechas, it uses DocxRootParser
  - For other cases, it uses DocxSimpleCommentaryParser
  - The coordinates of annotations are automatically updated to match the base text

### `DocxAnnotationUpdate.extract_layer_name() -> str`

Extracts the layer name from a layer path.

- **Parameters:**
  - `layer_path` (str): Path to the layer file
- **Returns:** str containing the layer name (filename without extension)
- **Example:**

```python
from openpecha.pecha.parsers.docx.update import DocxAnnotationUpdate

updater = DocxAnnotationUpdate()
layer_name = updater.extract_layer_name("path/to/segmentation-1234.json")
print(layer_name)  # "segmentation-1234"
```

### `DocxAnnotationUpdate.extract_layer_id() -> str`

Extracts the layer ID from a layer path.
- **Parameters:**
  - `layer_path` (str): Path to the layer file
- **Returns:** str containing the layer ID (last part of the filename after the hyphen)
- **Example:**

```python
from openpecha.pecha.parsers.docx.update import DocxAnnotationUpdate

updater = DocxAnnotationUpdate()
layer_id = updater.extract_layer_id("path/to/segmentation-1234.json")
print(layer_id)  # "1234"
```

### `DocxAnnotationUpdate.extract_layer_enum() -> AnnotationType`

Extracts the annotation type from a layer path.

- **Parameters:**
  - `layer_path` (str): Path to the layer file
- **Returns:** AnnotationType enum value corresponding to the layer type
- **Example:**

```python
from openpecha.pecha.parsers.docx.update import DocxAnnotationUpdate

updater = DocxAnnotationUpdate()
layer_type = updater.extract_layer_enum("path/to/segmentation-1234.json")
print(layer_type)  # AnnotationType.SEGMENTATION
```

### `DocxAnnotationUpdate.update_annotation() -> Pecha`

Updates annotations in an existing Pecha from a DOCX file while preserving the layer ID.

- **Parameters:**
  - `pecha` (Pecha): The Pecha instance to update annotations in
  - `annotation_path` (str): Path to the existing annotation layer file
  - `docx_file` (Path): Path to the DOCX file containing new annotations
  - `metadatas` (List[Any]): List of metadata objects to determine if the Pecha is root-related
- **Returns:** Updated Pecha instance
- **Example:**

```python
from pathlib import Path
from openpecha.pecha.parsers.docx.update import DocxAnnotationUpdate

updater = DocxAnnotationUpdate()
updated_pecha = updater.update_annotation(
    pecha=existing_pecha,
    annotation_path="path/to/segmentation-1234.json",
    docx_file=Path("path/to/updated_annotations.docx"),
    metadatas=[metadata]
)
```

- **Note:**
  - The method preserves the original layer ID when updating annotations
  - It automatically determines the annotation type from the existing layer path
  - Uses DocxAnnotationParser internally to handle the actual annotation update

### `TranslationAlignmentTransfer.is_empty() -> bool`

Checks if a text string is empty (contains only whitespace and newlines).
- **Parameters:**
  - `text` (str): The text to check
- **Returns:** bool indicating if the text is empty
- **Example:**

```python
transfer = TranslationAlignmentTransfer()
is_empty = transfer.is_empty(" \n ")  # True
is_empty = transfer.is_empty("Some text")  # False
```

### `TranslationAlignmentTransfer.get_segmentation_ann_path() -> Path`

Gets the path to the first segmentation layer JSON file in a Pecha.

- **Parameters:**
  - `pecha` (Pecha): The Pecha instance to search in
- **Returns:** Path object pointing to the segmentation layer file
- **Example:**

```python
transfer = TranslationAlignmentTransfer()
seg_path = transfer.get_segmentation_ann_path(pecha)
```

### `TranslationAlignmentTransfer.map_layer_to_layer() -> Dict[int, List[int]]`

Maps annotations from source layer to target layer based on span overlap or containment.

- **Parameters:**
  - `src_layer` (AnnotationStore): Source annotation layer
  - `tgt_layer` (AnnotationStore): Target annotation layer
- **Returns:** Dictionary mapping source indices to lists of target indices
- **Example:**

```python
transfer = TranslationAlignmentTransfer()
mapping = transfer.map_layer_to_layer(source_layer, target_layer)
```

- **Note:**
  - Maps based on span overlap or containment
  - Excludes edge overlaps
  - Returns a sorted dictionary

### `TranslationAlignmentTransfer.get_root_pechas_mapping() -> Dict[int, List[int]]`

Gets mapping from a Pecha's alignment layer to its segmentation layer.

- **Parameters:**
  - `pecha` (Pecha): The Pecha instance
  - `alignment_id` (str): ID of the alignment layer
- **Returns:** Dictionary mapping alignment indices to segmentation indices
- **Example:**

```python
transfer = TranslationAlignmentTransfer()
mapping = transfer.get_root_pechas_mapping(pecha, "alignment-1234.json")
```

### `TranslationAlignmentTransfer.get_translation_pechas_mapping() -> Dict[int, List]`

Gets mapping from segmentation to alignment layer in a translation Pecha.
- **Parameters:**
  - `pecha` (Pecha): The translation Pecha instance
  - `alignment_id` (str): ID of the alignment layer
  - `segmentation_id` (str): ID of the segmentation layer
- **Returns:** Dictionary mapping segmentation indices to alignment indices
- **Example:**

```python
transfer = TranslationAlignmentTransfer()
mapping = transfer.get_translation_pechas_mapping(
    pecha,
    "alignment-1234.json",
    "segmentation-5678.json"
)
```

### `TranslationAlignmentTransfer.mapping_to_text_list() -> List[str]`

Flattens a mapping from translation to root text into a list of texts.

- **Parameters:**
  - `mapping` (Dict[int, List[str]]): Mapping of indices to text lists
- **Returns:** List of texts, with empty strings for missing indices
- **Example:**

```python
transfer = TranslationAlignmentTransfer()
texts = transfer.mapping_to_text_list({1: ["text1"], 3: ["text2"]})
# ["text1", "", "text2"]
```

### `TranslationAlignmentTransfer.get_serialized_translation_alignment() -> List[str]`

Serializes root translation alignment text mapped to root segmentation text.

- **Parameters:**
  - `root_pecha` (Pecha): The root Pecha instance
  - `root_alignment_id` (str): ID of the root alignment layer
  - `root_translation_pecha` (Pecha): The translation Pecha instance
  - `translation_alignment_id` (str): ID of the translation alignment layer
- **Returns:** List of texts aligned with root segmentation
- **Example:**

```python
transfer = TranslationAlignmentTransfer()
texts = transfer.get_serialized_translation_alignment(
    root_pecha,
    "alignment-1234.json",
    translation_pecha,
    "alignment-5678.json"
)
```

### `TranslationAlignmentTransfer.get_serialized_translation_segmentation() -> List[str]`

Serializes root translation segmentation text mapped to root segmentation text.
- **Parameters:**
  - `root_pecha` (Pecha): The root Pecha instance
  - `root_alignment_id` (str): ID of the root alignment layer
  - `translation_pecha` (Pecha): The translation Pecha instance
  - `translation_alignment_id` (str): ID of the translation alignment layer
  - `translation_segmentation_id` (str): ID of the translation segmentation layer
- **Returns:** List of texts aligned with root segmentation
- **Example:**

```python
transfer = TranslationAlignmentTransfer()
texts = transfer.get_serialized_translation_segmentation(
    root_pecha,
    "alignment-1234.json",
    translation_pecha,
    "alignment-5678.json",
    "segmentation-9012.json"
)
```

### `CommentaryAlignmentTransfer.get_first_valid_root_idx() -> int | None`

Gets the first valid root index from an annotation's alignment index.

- **Parameters:**
  - `ann` (dict): The annotation dictionary containing alignment_index
- **Returns:** First valid root index or None if no valid indices found
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
idx = transfer.get_first_valid_root_idx({"alignment_index": "1,2-4"})  # 1
```

### `CommentaryAlignmentTransfer.is_valid_ann() -> bool`

Checks if an annotation is valid (exists and has non-empty text).

- **Parameters:**
  - `anns` (Dict[int, Dict[str, Any]]): Dictionary of annotations
  - `idx` (int): Index to check
- **Returns:** bool indicating if the annotation is valid
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
is_valid = transfer.is_valid_ann(annotations, 1)
```

### `CommentaryAlignmentTransfer.get_segmentation_ann_path() -> Path`

Gets the path to the first segmentation layer JSON file in a Pecha.
- **Parameters:**
  - `pecha` (Pecha): The Pecha instance to search in
- **Returns:** Path object pointing to the segmentation layer file
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
seg_path = transfer.get_segmentation_ann_path(pecha)
```

### `CommentaryAlignmentTransfer.index_annotations_by_root() -> Dict[int, Dict[str, Any]]`

Indexes annotations by their root index.

- **Parameters:**
  - `anns` (List[Dict[str, Any]]): List of annotation dictionaries
- **Returns:** Dictionary mapping root indices to annotation dictionaries
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
indexed_anns = transfer.index_annotations_by_root(annotations)
```

### `CommentaryAlignmentTransfer.map_layer_to_layer() -> Dict[int, List[int]]`

Maps annotations from source layer to target layer based on span overlap or containment.

- **Parameters:**
  - `src_layer` (AnnotationStore): Source annotation layer
  - `tgt_layer` (AnnotationStore): Target annotation layer
- **Returns:** Dictionary mapping source indices to lists of target indices
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
mapping = transfer.map_layer_to_layer(source_layer, target_layer)
```

- **Note:**
  - Maps based on span overlap or containment
  - Excludes edge overlaps
  - Returns a sorted dictionary
  - Handles complex alignment indices (e.g., "1,2-4")

### `CommentaryAlignmentTransfer.get_root_pechas_mapping() -> Dict[int, List[int]]`

Gets mapping from a Pecha's alignment layer to its segmentation layer.
- **Parameters:**
  - `pecha` (Pecha): The Pecha instance
  - `alignment_id` (str): ID of the alignment layer
- **Returns:** Dictionary mapping alignment indices to segmentation indices
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
mapping = transfer.get_root_pechas_mapping(pecha, "alignment-1234.json")
```

### `CommentaryAlignmentTransfer.get_commentary_pechas_mapping() -> Dict[int, List[int]]`

Gets mapping from commentary Pecha's segmentation layer to alignment layer.

- **Parameters:**
  - `pecha` (Pecha): The commentary Pecha instance
  - `alignment_id` (str): ID of the alignment layer
  - `segmentation_id` (str): ID of the segmentation layer
- **Returns:** Dictionary mapping segmentation indices to alignment indices
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
mapping = transfer.get_commentary_pechas_mapping(
    pecha,
    "alignment-1234.json",
    "segmentation-5678.json"
)
```

### `CommentaryAlignmentTransfer.get_serialized_commentary() -> List[str]`

Serializes commentary annotations with root/segmentation mapping and formatting.

- **Parameters:**
  - `root_pecha` (Pecha): The root Pecha instance
  - `root_alignment_id` (str): ID of the root alignment layer
  - `commentary_pecha` (Pecha): The commentary Pecha instance
  - `commentary_alignment_id` (str): ID of the commentary alignment layer
- **Returns:** List of formatted commentary texts
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
texts = transfer.get_serialized_commentary(
    root_pecha,
    "alignment-1234.json",
    commentary_pecha,
    "alignment-5678.json"
)
```

### `CommentaryAlignmentTransfer.get_serialized_commentary_segmentation() -> List[str]`

Serializes commentary segmentation annotations with root/segmentation mapping and formatting.
- **Parameters:**
  - `root_pecha` (Pecha): The root Pecha instance
  - `root_alignment_id` (str): ID of the root alignment layer
  - `commentary_pecha` (Pecha): The commentary Pecha instance
  - `commentary_alignment_id` (str): ID of the commentary alignment layer
  - `commentary_segmentation_id` (str): ID of the commentary segmentation layer
- **Returns:** List of formatted commentary texts
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
texts = transfer.get_serialized_commentary_segmentation(
    root_pecha,
    "alignment-1234.json",
    commentary_pecha,
    "alignment-5678.json",
    "segmentation-9012.json"
)
```

### `CommentaryAlignmentTransfer.format_serialized_commentary() -> str`

Formats a commentary text with chapter and segment information.

- **Parameters:**
  - `chapter_num` (int): Chapter number
  - `seg_idx` (int): Segment index
  - `text` (str): Commentary text
- **Returns:** Formatted string in the format `<chapter_num><seg_idx>text`
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
formatted = transfer.format_serialized_commentary(1, 2, "Commentary text")
# "<1><2>Commentary text"
```

### `CommentaryAlignmentTransfer.process_commentary_ann() -> str | None`

Processes a single commentary annotation and returns the serialized string.

- **Parameters:**
  - `ann` (dict): The commentary annotation to process
  - `root_anns` (dict): Dictionary of root annotations
  - `root_map` (dict): Mapping from root alignment to segmentation
  - `root_segmentation_anns` (dict): Dictionary of root segmentation annotations
- **Returns:** Formatted commentary string or None if not valid
- **Example:**

```python
transfer = CommentaryAlignmentTransfer()
result = transfer.process_commentary_ann(
    commentary_ann,
    root_anns,
    root_map,
    root_segmentation_anns
)
```
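
The alignment-index and formatting conventions used by the `CommentaryAlignmentTransfer` helpers can be illustrated with a standalone sketch. It is not the library's implementation: `parse_alignment_index`, `first_root_idx`, and `format_commentary` are hypothetical names, and the code assumes only what the examples in this reference show, namely that an alignment index such as `"1,2-4"` lists root segments as comma-separated numbers or inclusive ranges, and that serialized commentary carries the chapter and segment numbers in angle brackets before the text. Any validity checks the real methods perform are omitted.

```python
from typing import List, Optional


def parse_alignment_index(alignment_index: str) -> List[int]:
    """Expand an index string such as "1,2-4" into [1, 2, 3, 4] (hypothetical helper)."""
    indices: List[int] = []
    for part in alignment_index.split(","):
        part = part.strip()
        if not part:
            continue  # skip empty pieces, e.g. from an empty string
        if "-" in part:
            # Assumption: "a-b" denotes the inclusive range a..b
            start, end = part.split("-", 1)
            indices.extend(range(int(start), int(end) + 1))
        else:
            indices.append(int(part))
    return indices


def first_root_idx(alignment_index: str) -> Optional[int]:
    """Return the first root index named by the string, or None if there is none."""
    indices = parse_alignment_index(alignment_index)
    return indices[0] if indices else None


def format_commentary(chapter_num: int, seg_idx: int, text: str) -> str:
    """Prefix the text with chapter and segment tags (hypothetical helper)."""
    return f"<{chapter_num}><{seg_idx}>{text}"


print(parse_alignment_index("1,2-4"))              # [1, 2, 3, 4]
print(first_root_idx("1,2-4"))                     # 1
print(format_commentary(1, 2, "Commentary text"))  # <1><2>Commentary text
```

Keeping the parsing and formatting steps as separate pure functions makes each one easy to test without constructing a Pecha.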