API REFERENCES
Pecha
DocxRootParser
DocxSimpleCommentaryParser
DocxAnnotationParser
DocxAnnotationUpdate
TranslationAlignmentTransfer
CommentaryAlignmentTransfer
Pecha.from_path() -> Pecha
Loads a Pecha instance from a local path.
Parameters:
pecha_path(Path): Path to the Pecha directory
Returns:
Pecha instance
Example:
    from pathlib import Path
    from openpecha.pecha import Pecha

    pecha = Pecha.from_path(Path("/path/to/pecha"))
Pecha.create() -> Pecha
Creates a new Pecha instance in the specified output directory.
Parameters:
output_path(Path): Directory where the Pecha should be created
pecha_id(str, optional): Custom Pecha ID. If not provided, a new ID will be generated
Returns:
Pecha instance
Example:
    from pathlib import Path
    from openpecha.pecha import Pecha

    pecha = Pecha.create(Path("./output"))
Pecha.base_path() -> Path
Returns the path to the base directory which contains all the base files. If the directory does not exist, it is created.
Returns: Path object pointing to the base directory
Example:
    base_dir = pecha.base_path
    print(base_dir)  # /path/to/pecha/base
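The create-on-access behavior described above can be pictured with a minimal standard-library sketch. The class `PechaDirs` is hypothetical and is not openpecha code; the sketch only assumes that the base directory sits at `<pecha>/base`, as the example output suggests.

```python
import tempfile
from pathlib import Path


class PechaDirs:
    """Hypothetical stand-in for a Pecha's directory layout (not openpecha code)."""

    def __init__(self, pecha_path: Path) -> None:
        self.pecha_path = pecha_path

    @property
    def base_path(self) -> Path:
        # Create the directory on access so callers can write into it directly.
        path = self.pecha_path / "base"
        path.mkdir(parents=True, exist_ok=True)
        return path


with tempfile.TemporaryDirectory() as tmp:
    dirs = PechaDirs(Path(tmp) / "P0001")
    base_dir = dirs.base_path
    created = base_dir.is_dir()
```

Using `exist_ok=True` keeps repeated accesses harmless, which is one plausible way to get the "created if missing" guarantee.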
Pecha.layer_path() -> Path
Returns the path to the layers directory which contains all the annotation files. If the directory does not exist, it is created.
Returns: Path object pointing to the layers directory
Example:
    layer_dir = pecha.layer_path
    print(layer_dir)  # /path/to/pecha/layers
Pecha.metadata_path() -> Path
Returns the path to the metadata file.
Returns: Path object pointing to the metadata file
Example:
    metadata_file = pecha.metadata_path
    print(metadata_file)  # /path/to/pecha/metadata.json
Pecha.get_base() -> str
Gets the content of a base file by its name.
Parameters:
base_name(str): Name of the base file
Returns: str containing the base text content
Example:
base_text = pecha.get_base("base1")
Pecha.set_base() -> str
Sets the content of a base file.
Parameters:
content(str): Text content to write to the base file
base_name(str, optional): Name for the base file. If not provided, a new ID will be generated
Returns: str containing the base name
Example:
base_name = pecha.set_base("This is the text content", "base1")
Pecha.add_layer() -> Tuple[AnnotationStore, Path]
Adds a new annotation layer for a given base.
Parameters:
base_name(str): Name of the base file to associate with this layer
layer_type(AnnotationType): Type of annotation layer (must be included in AnnotationType enum)
Returns: Tuple of (AnnotationStore, Path) containing:
AnnotationStore: The created annotation store
Path: Path to the layer file
Example:
    from openpecha.pecha.layer import AnnotationType

    # Add a segmentation layer
    layer, layer_path = pecha.add_layer("base1", AnnotationType.SEGMENTATION)

    # Add a chapter layer
    layer, layer_path = pecha.add_layer("base1", AnnotationType.CHAPTER)
Note: The layer file will be created with a name format of {layer_type}-{random_id}.json in the layers directory under the base name folder.
Pecha.add_annotation() -> AnnotationStore
Adds an annotation to an existing annotation layer (Annotation Store).
Parameters:
ann_store(AnnotationStore): The annotation store/layer to add the annotation to
annotation(BaseAnnotation): The annotation object to add (e.g., SegmentationAnnotation, CitationAnnotation)
layer_type(AnnotationType): The type of annotation (must match the layer type)
Returns: AnnotationStore with the added annotation
Example:
    from openpecha.pecha.annotations import Span, SegmentationAnnotation
    from openpecha.pecha.layer import AnnotationType

    # Create a segmentation annotation
    ann = SegmentationAnnotation(span=Span(start=0, end=10), index=1)

    # Add the annotation to the layer
    layer = pecha.add_annotation(layer, ann, AnnotationType.SEGMENTATION)

    # Save the layer after adding annotations
    layer.save()
Note:
The annotation’s span must be valid for the base text
The layer_type must match the type of annotation being added
The layer must be saved after adding annotations to persist the changes
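The first point in the note, that a span must be valid for the base text, can be checked before an annotation is added. The sketch below is an illustration only: `CharSpan` and `is_valid_span` are hypothetical names, and the rule it applies (a non-empty, end-exclusive range inside the text) is an assumption, not openpecha's documented validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharSpan:
    """Hypothetical stand-in for a character span (not openpecha's Span)."""

    start: int
    end: int


def is_valid_span(span: CharSpan, base_text: str) -> bool:
    # Assumed rule: non-empty range [start, end) that lies inside the base text.
    return 0 <= span.start < span.end <= len(base_text)


base_text = "This is the text content"
ok = is_valid_span(CharSpan(start=0, end=10), base_text)
too_long = is_valid_span(CharSpan(start=0, end=len(base_text) + 1), base_text)
empty = is_valid_span(CharSpan(start=5, end=5), base_text)
```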
Pecha.set_metadata() -> PechaMetaData
Updates the Pecha’s metadata with new values while preserving existing metadata fields if not overridden.
Parameters:
pecha_metadata(Dict): Dictionary containing metadata fields to update. Can include:
  title(Dict[str, str] | str): Title in different languages or single language
  author(List[str] | Dict[str, str] | str): Author(s) information
  language(str): Language code (e.g., ‘bo’, ‘en’)
  parser(str): Name of the parser used
  initial_creation_type(str): How the Pecha was created
  source_metadata(Dict): Additional source information
  copyright(Dict): Copyright information
  licence(str): License type
Returns: Updated PechaMetaData object
Example:
    # Update metadata with new values
    pecha.set_metadata({
        "title": {"en": "New Title", "bo": "གསར་བཅོས་ཁ་བྱང་།"},
        "author": ["Author 1", "Author 2"],
        "language": "bo",
        "source_metadata": {
            "id": "source123",
            "publisher": "Publisher Name"
        }
    })

    # Update specific fields while preserving others
    pecha.set_metadata({
        "title": {"en": "Updated Title"},
        "copyright": {
            "year": "2024",
            "holder": "Copyright Holder"
        }
    })
Note:
Existing metadata fields not included in the update dictionary will be preserved
The parser and initial_creation_type fields will be preserved from existing metadata if not specified
The metadata is automatically saved to the metadata.json file
Invalid metadata will raise a ValueError
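The "preserve unless overridden" rule can be sketched with plain dictionaries. This is not the library's implementation: `merge_metadata` is a hypothetical helper, and it assumes a shallow merge in which a supplied key replaces the whole existing value. Whether openpecha merges nested dictionaries such as `title` key by key is not stated here.

```python
from typing import Any, Dict


def merge_metadata(existing: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which keys from `update` override `existing`."""
    if not isinstance(update, dict):
        # Mirrors the documented behavior that invalid input raises ValueError.
        raise ValueError("metadata update must be a dictionary")
    # Shallow merge (assumption): untouched keys are carried over unchanged.
    return {**existing, **update}


existing = {"title": {"en": "Old Title"}, "language": "bo", "parser": "DocxRootParser"}
merged = merge_metadata(existing, {"title": {"en": "Updated Title"}})
```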
Pecha.get_layers() -> Generator[Tuple[str, AnnotationStore]]
Returns all layers from the Pecha associated with the given base.
Parameters:
base_name(str): Name of the base file
from_cache(bool, optional): Whether to load from cache. Defaults to False
Returns: Generator yielding tuples of (layer_name, AnnotationStore)
Example:
    for layer_name, layer_store in pecha.get_layers("base1"):
        print(layer_name, layer_store)
Pecha.get_segmentation_layer_path() -> str
Gets the path to the first segmentation layer file.
Returns: str containing the relative path to the segmentation layer file
Example:
layer_path = pecha.get_segmentation_layer_path()
Pecha.get_first_layer_path() -> str
Gets the path to the first layer file.
Returns: str containing the relative path to the first layer file
Example:
layer_path = pecha.get_first_layer_path()
Pecha.get_layer_by_ann_type() -> Union[Tuple[AnnotationStore, Path], Tuple[List[AnnotationStore], List[Path]]]
Gets layers by annotation type.
Parameters:
base_name(str): Name of the base file
layer_type(AnnotationType): Type of annotation to retrieve
Returns: Tuple of (AnnotationStore or list of AnnotationStore, Path or list of Path)
Example:
layer, layer_path = pecha.get_layer_by_ann_type("base1", AnnotationType.SEGMENTATION)
Pecha.get_layer_by_filename() -> Optional[AnnotationStore]
Gets a layer by its filename.
Parameters:
base_name(str): Name of the base file
filename(str): Name of the layer file
Returns: AnnotationStore or None if not found
Example:
layer = pecha.get_layer_by_filename("base1", "segmentation-1234.json")
Pecha.publish() -> None
Publishes the Pecha to GitHub and optionally creates a release with assets.
Parameters:
asset_path(Path, optional): Path to the asset directory
asset_name(str, optional): Name for the asset. Defaults to “source_data”
branch(str, optional): Branch to publish to. Defaults to “main”
is_private(bool, optional): Whether the repository should be private. Defaults to False
Example:
    pecha.publish(
        asset_path=Path("./assets"),
        asset_name="source_data",
        branch="main",
        is_private=False
    )
Pecha.merge_pecha() -> None
Merges the layers of a source pecha into the current pecha.
Parameters:
source_pecha(Pecha): The source Pecha instance
source_base_name(str): The base name of the source pecha
target_base_name(str): The base name of the target (current) pecha
Example:
pecha.merge_pecha(source_pecha, "source_base", "target_base")
DocxRootParser.parse() -> Tuple[Pecha, annotation_path]
Parses a DOCX file and creates a Pecha object with annotations.
Parameters:
input(str | Path): Path to the DOCX file to be parsed
annotation_type(AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
metadata(Dict): Dictionary containing metadata for the Pecha
output_path(Path, optional): Directory where the Pecha should be created. Defaults to PECHAS_PATH
pecha_id(str | None, optional): Custom Pecha ID. If not provided, a new ID will be generated
Returns: Tuple containing:
Pecha: The created Pecha instance
annotation_path: Path to the created annotation layer file
Example:
    from pathlib import Path
    from openpecha.pecha.layer import AnnotationType
    from openpecha.pecha.parsers.docx.root import DocxRootParser

    parser = DocxRootParser()
    pecha, layer_path = parser.parse(
        input="path/to/file.docx",
        annotation_type=AnnotationType.SEGMENTATION,
        metadata={"title": "Sample Title"},
        output_path=Path("./output")
    )
DocxRootParser.extract_anns() -> Tuple[List[BaseAnnotation], str]
Extracts text and annotations from a DOCX file.
Parameters:
docx_file(Path): Path to the DOCX file
annotation_type(AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
Returns: Tuple containing:
List[BaseAnnotation]: List of extracted annotations
str: The extracted base text
Example:
    from pathlib import Path
    from openpecha.pecha.layer import AnnotationType
    from openpecha.pecha.parsers.docx.root import DocxRootParser

    parser = DocxRootParser()
    anns, base = parser.extract_anns(
        Path("path/to/file.docx"),
        AnnotationType.SEGMENTATION
    )
DocxRootParser.extract_segmentation_anns() -> Tuple[List[SegmentationAnnotation], str]
Extracts segmentation annotations from numbered text.
Parameters:
numbered_text(Dict[str, str]): Dictionary mapping segment numbers to text content
Returns: Tuple containing:
List[SegmentationAnnotation]: List of segmentation annotations
str: The concatenated base text
Example:
    from openpecha.pecha.parsers.docx.root import DocxRootParser

    parser = DocxRootParser()
    numbered_text = {
        "1": "First segment",
        "2": "Second segment"
    }
    anns, base = parser.extract_segmentation_anns(numbered_text)
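To make the relationship between the numbered text, the annotation spans, and the concatenated base text concrete, here is a minimal sketch. It is not the parser's code: `segment_numbered_text` is a hypothetical helper, annotations are shown as plain dictionaries, and joining segments with a newline is an assumption.

```python
from typing import Dict, List, Tuple


def segment_numbered_text(numbered_text: Dict[str, str]) -> Tuple[List[dict], str]:
    """Build one span per numbered segment and return (annotations, base text)."""
    anns: List[dict] = []
    parts: List[str] = []
    offset = 0
    for number, text in numbered_text.items():
        # Each span covers exactly the segment's characters in the base text.
        anns.append({"index": int(number), "start": offset, "end": offset + len(text)})
        parts.append(text)
        offset += len(text) + 1  # +1 for the newline separator (assumed)
    return anns, "\n".join(parts)


numbered_text = {"1": "First segment", "2": "Second segment"}
anns, base = segment_numbered_text(numbered_text)
```

Slicing the base text with any returned span gives back the original segment, which is the property a segmentation layer relies on.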
DocxRootParser.extract_alignment_anns() -> Tuple[List[AlignmentAnnotation], str]
Extracts alignment annotations from numbered text.
Parameters:
numbered_text(Dict[str, str]): Dictionary mapping segment numbers to text content
Returns: Tuple containing:
List[AlignmentAnnotation]: List of alignment annotations
str: The concatenated base text
Example:
    from openpecha.pecha.parsers.docx.root import DocxRootParser

    parser = DocxRootParser()
    numbered_text = {
        "1": "First segment",
        "2": "Second segment"
    }
    anns, base = parser.extract_alignment_anns(numbered_text)
DocxSimpleCommentaryParser.parse() -> Tuple[Pecha, annotation_path]
Parses a DOCX file and creates a commentary Pecha object with annotations.
Parameters:
input(str | Path): Path to the DOCX file to be parsed
annotation_type(AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
metadata(Dict[str, Any]): Dictionary containing metadata for the Pecha
output_path(Path, optional): Directory where the Pecha should be created. Defaults to PECHAS_PATH
pecha_id(str | None, optional): Custom Pecha ID. If not provided, a new ID will be generated
Returns: Tuple containing:
Pecha: The created Pecha instance
annotation_path: Path to the created annotation layer file
Example:
    from pathlib import Path
    from openpecha.pecha.layer import AnnotationType
    from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

    parser = DocxSimpleCommentaryParser()
    pecha, layer_path = parser.parse(
        input="path/to/commentary.docx",
        annotation_type=AnnotationType.ALIGNMENT,
        metadata={"title": "Commentary Title", "commentary_of": "P0001"},
        output_path=Path("./output")
    )
DocxSimpleCommentaryParser.extract_anns() -> Tuple[List[BaseAnnotation], str]
Extracts text and annotations from a commentary DOCX file.
Parameters:
docx_file(Path): Path to the DOCX file
annotation_type(AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
Returns: Tuple containing:
List[BaseAnnotation]: List of extracted annotations
str: The extracted base text
Example:
    from pathlib import Path
    from openpecha.pecha.layer import AnnotationType
    from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

    parser = DocxSimpleCommentaryParser()
    anns, base = parser.extract_anns(
        Path("path/to/commentary.docx"),
        AnnotationType.ALIGNMENT
    )
DocxSimpleCommentaryParser.extract_segmentation_anns() -> Tuple[List[SegmentationAnnotation], str]
Extracts segmentation annotations from numbered commentary text.
Parameters:
numbered_text(Dict[str, str]): Dictionary mapping segment numbers to text content
Returns: Tuple containing:
List[SegmentationAnnotation]: List of segmentation annotations
str: The concatenated base text
Example:
    from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

    parser = DocxSimpleCommentaryParser()
    numbered_text = {
        "1": "First commentary segment",
        "2": "Second commentary segment"
    }
    anns, base = parser.extract_segmentation_anns(numbered_text)
DocxSimpleCommentaryParser.extract_alignment_anns() -> Tuple[List[AlignmentAnnotation], str]
Extracts alignment annotations from numbered commentary text, handling root text references.
Parameters:
numbered_text(Dict[str, str]): Dictionary mapping segment numbers to text content
Returns: Tuple containing:
List[AlignmentAnnotation]: List of alignment annotations with root text references
str: The concatenated base text
Example:
    from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

    parser = DocxSimpleCommentaryParser()
    numbered_text = {
        "1": "1-2 First commentary segment",
        "2": "3-4 Second commentary segment"
    }
    anns, base = parser.extract_alignment_anns(numbered_text)
Note: The commentary text can include root text references in the format “1-2 Commentary text” where “1-2” refers to the root text segments being commented on.
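One way to separate a leading root reference from the commentary text is a regular expression. The sketch below is an assumption about the input shape, not the parser's actual logic: `split_root_reference` and `ROOT_REF` are hypothetical names, and the pattern accepts single numbers, ranges such as "1-2", and comma-separated combinations such as "1,2-4".

```python
import re
from typing import Optional, Tuple

# Leading reference (e.g. "3", "1-2", "1,2-4"), whitespace, then the commentary text.
ROOT_REF = re.compile(r"^(\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*)\s+(.*)$", re.DOTALL)


def split_root_reference(segment: str) -> Tuple[Optional[str], str]:
    """Return (root_reference, commentary_text); the reference is None if absent."""
    match = ROOT_REF.match(segment)
    if match is None:
        return None, segment
    return match.group(1), match.group(2)


with_ref = split_root_reference("1-2 First commentary segment")
without_ref = split_root_reference("No reference here")
```

A segment that simply begins with a number would also be read as a reference under this pattern, so real input may need a stricter convention.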
DocxAnnotationParser.add_annotation() -> Tuple[Pecha, annotation_path]
Adds annotations to an existing Pecha from a DOCX file.
Parameters:
pecha(Pecha): The Pecha instance to add annotations to
type(AnnotationType | str): Type of annotation to extract (ALIGNMENT, SEGMENTATION, or FOOTNOTE)
docx_file(Path): Path to the DOCX file containing annotations
metadatas(List[Any]): List of metadata objects to determine if the Pecha is root-related
Returns: Tuple containing:
Pecha: The updated Pecha instance
annotation_path: Path to the created annotation layer file
Example:
    from pathlib import Path
    from openpecha.pecha.layer import AnnotationType
    from openpecha.pecha.parsers.docx.annotation import DocxAnnotationParser

    parser = DocxAnnotationParser()
    pecha, layer_path = parser.add_annotation(
        pecha=existing_pecha,
        type=AnnotationType.FOOTNOTE,
        docx_file=Path("path/to/annotations.docx"),
        metadatas=[metadata]
    )
Note:
The parser supports three types of annotations: ALIGNMENT, SEGMENTATION, and FOOTNOTE
For FOOTNOTE annotations, it uses DocxFootnoteParser
For root-related Pechas, it uses DocxRootParser
For other cases, it uses DocxSimpleCommentaryParser
The coordinates of annotations are automatically updated to match the base text
DocxAnnotationUpdate.extract_layer_name() -> str
Extracts the layer name from a layer path.
Parameters:
layer_path(str): Path to the layer file
Returns: str containing the layer name (filename without extension)
Example:
    updater = DocxAnnotationUpdate()
    layer_name = updater.extract_layer_name("path/to/segmentation-1234.json")
    print(layer_name)  # "segmentation-1234"
DocxAnnotationUpdate.extract_layer_id() -> str
Extracts the layer ID from a layer path.
Parameters:
layer_path(str): Path to the layer file
Returns: str containing the layer ID (last part of the filename after the hyphen)
Example:
    updater = DocxAnnotationUpdate()
    layer_id = updater.extract_layer_id("path/to/segmentation-1234.json")
    print(layer_id)  # "1234"
DocxAnnotationUpdate.extract_layer_enum() -> AnnotationType
Extracts the annotation type from a layer path.
Parameters:
layer_path(str): Path to the layer file
Returns: AnnotationType enum value corresponding to the layer type
Example:
    updater = DocxAnnotationUpdate()
    layer_type = updater.extract_layer_enum("path/to/segmentation-1234.json")
    print(layer_type)  # AnnotationType.SEGMENTATION
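The three extractors above all read the same `{layer_type}-{id}.json` filename. A standard-library sketch of that decomposition follows; the function names and the `LayerType` enum are hypothetical stand-ins (the enum values are assumed to equal the filename prefix), not the openpecha classes.

```python
from enum import Enum
from pathlib import Path


class LayerType(Enum):
    """Hypothetical stand-in for AnnotationType; values are assumed."""

    SEGMENTATION = "segmentation"
    ALIGNMENT = "alignment"


def layer_name(layer_path: str) -> str:
    # Filename without its extension, e.g. "segmentation-1234".
    return Path(layer_path).stem


def layer_id(layer_path: str) -> str:
    # Everything after the last hyphen of the layer name.
    return layer_name(layer_path).rsplit("-", 1)[-1]


def layer_type(layer_path: str) -> LayerType:
    # Everything before the last hyphen, looked up by enum value.
    return LayerType(layer_name(layer_path).rsplit("-", 1)[0])


path = "path/to/segmentation-1234.json"
parts = (layer_name(path), layer_id(path), layer_type(path))
```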
DocxAnnotationUpdate.update_annotation() -> Pecha
Updates annotations in an existing Pecha from a DOCX file while preserving the layer ID.
Parameters:
pecha(Pecha): The Pecha instance to update annotations in
annotation_path(str): Path to the existing annotation layer file
docx_file(Path): Path to the DOCX file containing new annotations
metadatas(List[Any]): List of metadata objects to determine if the Pecha is root-related
Returns: Updated Pecha instance
Example:
    from pathlib import Path
    from openpecha.pecha.parsers.docx.update import DocxAnnotationUpdate

    updater = DocxAnnotationUpdate()
    updated_pecha = updater.update_annotation(
        pecha=existing_pecha,
        annotation_path="path/to/segmentation-1234.json",
        docx_file=Path("path/to/updated_annotations.docx"),
        metadatas=[metadata]
    )
Note:
The method preserves the original layer ID when updating annotations
It automatically determines the annotation type from the existing layer path
Uses DocxAnnotationParser internally to handle the actual annotation update
TranslationAlignmentTransfer.is_empty() -> bool
Checks if a text string is empty (contains only whitespace and newlines).
Parameters:
text(str): The text to check
Returns: bool indicating if the text is empty
Example:
    transfer = TranslationAlignmentTransfer()
    is_empty = transfer.is_empty(" \n ")  # True
    is_empty = transfer.is_empty("Some text")  # False
TranslationAlignmentTransfer.get_segmentation_ann_path() -> Path
Gets the path to the first segmentation layer JSON file in a Pecha.
Parameters:
pecha(Pecha): The Pecha instance to search in
Returns: Path object pointing to the segmentation layer file
Example:
    transfer = TranslationAlignmentTransfer()
    seg_path = transfer.get_segmentation_ann_path(pecha)
TranslationAlignmentTransfer.map_layer_to_layer() -> Dict[int, List[int]]
Maps annotations from source layer to target layer based on span overlap or containment.
Parameters:
src_layer(AnnotationStore): Source annotation layer
tgt_layer(AnnotationStore): Target annotation layer
Returns: Dictionary mapping source indices to lists of target indices
Example:
    transfer = TranslationAlignmentTransfer()
    mapping = transfer.map_layer_to_layer(source_layer, target_layer)
Note:
Maps based on span overlap or containment
Excludes edge overlaps
Returns a sorted dictionary
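The mapping rule in this note can be sketched on plain integer spans. This is an illustration under assumptions, not the library's algorithm: spans are taken to be end-exclusive `(start, end)` pairs keyed by annotation index, "edge overlap" is read as two spans that only touch at a boundary, and `map_spans` is a hypothetical name.

```python
from typing import Dict, List, Tuple

CharRange = Tuple[int, int]  # (start, end), end exclusive (assumed)


def map_spans(src: Dict[int, CharRange], tgt: Dict[int, CharRange]) -> Dict[int, List[int]]:
    """Map each source index to the target indices whose spans overlap it."""
    mapping: Dict[int, List[int]] = {}
    for src_idx, (s_start, s_end) in src.items():
        mapping[src_idx] = sorted(
            tgt_idx
            for tgt_idx, (t_start, t_end) in tgt.items()
            # Strict inequalities cover overlap and containment but reject
            # spans that merely touch at an edge.
            if s_start < t_end and t_start < s_end
        )
    # Return the entries ordered by source index.
    return dict(sorted(mapping.items()))


source = {2: (10, 25), 1: (0, 10)}
target = {1: (0, 5), 2: (5, 12), 3: (12, 25)}
mapping = map_spans(source, target)
```

Here target 2 straddles the boundary at offset 10, so it appears under both source segments, while target 1 and target 3 each map to one.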
TranslationAlignmentTransfer.get_root_pechas_mapping() -> Dict[int, List[int]]
Gets mapping from a Pecha’s alignment layer to its segmentation layer.
Parameters:
pecha(Pecha): The Pecha instance
alignment_id(str): ID of the alignment layer
Returns: Dictionary mapping alignment indices to segmentation indices
Example:
    transfer = TranslationAlignmentTransfer()
    mapping = transfer.get_root_pechas_mapping(pecha, "alignment-1234.json")
TranslationAlignmentTransfer.get_translation_pechas_mapping() -> Dict[int, List]
Gets mapping from segmentation to alignment layer in a translation Pecha.
Parameters:
pecha(Pecha): The translation Pecha instance
alignment_id(str): ID of the alignment layer
segmentation_id(str): ID of the segmentation layer
Returns: Dictionary mapping segmentation indices to alignment indices
Example:
    transfer = TranslationAlignmentTransfer()
    mapping = transfer.get_translation_pechas_mapping(
        pecha,
        "alignment-1234.json",
        "segmentation-5678.json"
    )
TranslationAlignmentTransfer.mapping_to_text_list() -> List[str]
Flattens a mapping from translation to root text into a list of texts.
Parameters:
mapping(Dict[int, List[str]]): Mapping of indices to text lists
Returns: List of texts, with empty strings for missing indices
Example:
    transfer = TranslationAlignmentTransfer()
    texts = transfer.mapping_to_text_list({1: ["text1"], 3: ["text2"]})
    # ["text1", "", "text2"]
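The gap-filling behavior in the example can be reproduced with a short sketch. `flatten_mapping` is a hypothetical helper, not the library method; it assumes indices start at 1 and that several texts under one index are concatenated without a separator, neither of which is stated in this reference.

```python
from typing import Dict, List


def flatten_mapping(mapping: Dict[int, List[str]]) -> List[str]:
    """Return one string per index from 1 to the highest index in `mapping`."""
    if not mapping:
        return []
    # Missing indices yield "" because joining an empty list gives an empty string.
    return ["".join(mapping.get(idx, [])) for idx in range(1, max(mapping) + 1)]


texts = flatten_mapping({1: ["text1"], 3: ["text2"]})
```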
TranslationAlignmentTransfer.get_serialized_translation_alignment() -> List[str]
Serializes root translation alignment text mapped to root segmentation text.
Parameters:
root_pecha(Pecha): The root Pecha instance
root_alignment_id(str): ID of the root alignment layer
root_translation_pecha(Pecha): The translation Pecha instance
translation_alignment_id(str): ID of the translation alignment layer
Returns: List of texts aligned with root segmentation
Example:
    transfer = TranslationAlignmentTransfer()
    texts = transfer.get_serialized_translation_alignment(
        root_pecha,
        "alignment-1234.json",
        translation_pecha,
        "alignment-5678.json"
    )
TranslationAlignmentTransfer.get_serialized_translation_segmentation() -> List[str]
Serializes root translation segmentation text mapped to root segmentation text.
Parameters:
root_pecha(Pecha): The root Pecha instance
root_alignment_id(str): ID of the root alignment layer
translation_pecha(Pecha): The translation Pecha instance
translation_alignment_id(str): ID of the translation alignment layer
translation_segmentation_id(str): ID of the translation segmentation layer
Returns: List of texts aligned with root segmentation
Example:
    transfer = TranslationAlignmentTransfer()
    texts = transfer.get_serialized_translation_segmentation(
        root_pecha,
        "alignment-1234.json",
        translation_pecha,
        "alignment-5678.json",
        "segmentation-9012.json"
    )
CommentaryAlignmentTransfer.get_first_valid_root_idx() -> int | None
Gets the first valid root index from an annotation’s alignment index.
Parameters:
ann(dict): The annotation dictionary containing alignment_index
Returns: First valid root index or None if no valid indices found
Example:
    transfer = CommentaryAlignmentTransfer()
    idx = transfer.get_first_valid_root_idx({"alignment_index": "1,2-4"})  # 1
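An alignment index such as "1,2-4" combines single indices and ranges. The sketch below shows one way to pull out the first numeric index; `first_root_index` is a hypothetical helper, and treating "valid" as "the first part that starts with a number" is an assumption about what the library checks.

```python
from typing import Optional


def first_root_index(alignment_index: str) -> Optional[int]:
    """Return the first numeric index in a string like "1,2-4", or None."""
    for part in alignment_index.split(","):
        # For a range such as "2-4", only the start of the range matters here.
        head = part.split("-", 1)[0].strip()
        if head.isdigit():
            return int(head)
    return None


idx = first_root_index("1,2-4")
```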
CommentaryAlignmentTransfer.is_valid_ann() -> bool
Checks if an annotation is valid (exists and has non-empty text).
Parameters:
anns(Dict[int, Dict[str, Any]]): Dictionary of annotations
idx(int): Index to check
Returns: bool indicating if the annotation is valid
Example:
    transfer = CommentaryAlignmentTransfer()
    is_valid = transfer.is_valid_ann(annotations, 1)
CommentaryAlignmentTransfer.get_segmentation_ann_path() -> Path
Gets the path to the first segmentation layer JSON file in a Pecha.
Parameters:
pecha(Pecha): The Pecha instance to search in
Returns: Path object pointing to the segmentation layer file
Example:
    transfer = CommentaryAlignmentTransfer()
    seg_path = transfer.get_segmentation_ann_path(pecha)
CommentaryAlignmentTransfer.index_annotations_by_root() -> Dict[int, Dict[str, Any]]
Indexes annotations by their root index.
Parameters:
anns(List[Dict[str, Any]]): List of annotation dictionaries
Returns: Dictionary mapping root indices to annotation dictionaries
Example:
    transfer = CommentaryAlignmentTransfer()
    indexed_anns = transfer.index_annotations_by_root(annotations)
CommentaryAlignmentTransfer.map_layer_to_layer() -> Dict[int, List[int]]
Maps annotations from source layer to target layer based on span overlap or containment.
Parameters:
src_layer(AnnotationStore): Source annotation layer
tgt_layer(AnnotationStore): Target annotation layer
Returns: Dictionary mapping source indices to lists of target indices
Example:
    transfer = CommentaryAlignmentTransfer()
    mapping = transfer.map_layer_to_layer(source_layer, target_layer)
Note:
Maps based on span overlap or containment
Excludes edge overlaps
Returns a sorted dictionary
Handles complex alignment indices (e.g., “1,2-4”)
CommentaryAlignmentTransfer.get_root_pechas_mapping() -> Dict[int, List[int]]
Gets mapping from a Pecha’s alignment layer to its segmentation layer.
Parameters:
pecha(Pecha): The Pecha instance
alignment_id(str): ID of the alignment layer
Returns: Dictionary mapping alignment indices to segmentation indices
Example:
    transfer = CommentaryAlignmentTransfer()
    mapping = transfer.get_root_pechas_mapping(pecha, "alignment-1234.json")
CommentaryAlignmentTransfer.get_commentary_pechas_mapping() -> Dict[int, List[int]]
Gets mapping from commentary Pecha’s segmentation layer to alignment layer.
Parameters:
pecha(Pecha): The commentary Pecha instance
alignment_id(str): ID of the alignment layer
segmentation_id(str): ID of the segmentation layer
Returns: Dictionary mapping segmentation indices to alignment indices
Example:
    transfer = CommentaryAlignmentTransfer()
    mapping = transfer.get_commentary_pechas_mapping(
        pecha,
        "alignment-1234.json",
        "segmentation-5678.json"
    )
CommentaryAlignmentTransfer.get_serialized_commentary() -> List[str]
Serializes commentary annotations with root/segmentation mapping and formatting.
Parameters:
root_pecha(Pecha): The root Pecha instance
root_alignment_id(str): ID of the root alignment layer
commentary_pecha(Pecha): The commentary Pecha instance
commentary_alignment_id(str): ID of the commentary alignment layer
Returns: List of formatted commentary texts
Example:
    transfer = CommentaryAlignmentTransfer()
    texts = transfer.get_serialized_commentary(
        root_pecha,
        "alignment-1234.json",
        commentary_pecha,
        "alignment-5678.json"
    )
CommentaryAlignmentTransfer.get_serialized_commentary_segmentation() -> List[str]
Serializes commentary segmentation annotations with root/segmentation mapping and formatting.
Parameters:
root_pecha(Pecha): The root Pecha instance
root_alignment_id(str): ID of the root alignment layer
commentary_pecha(Pecha): The commentary Pecha instance
commentary_alignment_id(str): ID of the commentary alignment layer
commentary_segmentation_id(str): ID of the commentary segmentation layer
Returns: List of formatted commentary texts
Example:
    transfer = CommentaryAlignmentTransfer()
    texts = transfer.get_serialized_commentary_segmentation(
        root_pecha,
        "alignment-1234.json",
        commentary_pecha,
        "alignment-5678.json",
        "segmentation-9012.json"
    )
CommentaryAlignmentTransfer.format_serialized_commentary() -> str
Formats a commentary text with chapter and segment information.
Parameters:
chapter_num(int): Chapter number
seg_idx(int): Segment index
text(str): Commentary text
Returns: Formatted string in the format “<chapter_num><seg_idx>text”
Example:
    transfer = CommentaryAlignmentTransfer()
    formatted = transfer.format_serialized_commentary(1, 2, "Commentary text")
    # "<1><2>Commentary text"
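The output shown in the example, with chapter number and segment index each wrapped in angle brackets ahead of the text, can be reproduced with an f-string. `format_commentary` is a hypothetical helper written to match that example output; it is not taken from the library source.

```python
def format_commentary(chapter_num: int, seg_idx: int, text: str) -> str:
    """Prefix the commentary text with <chapter_num><seg_idx> tags."""
    return f"<{chapter_num}><{seg_idx}>{text}"


formatted = format_commentary(1, 2, "Commentary text")
```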
CommentaryAlignmentTransfer.process_commentary_ann() -> str | None
Processes a single commentary annotation and returns the serialized string.
Parameters:
ann(dict): The commentary annotation to process
root_anns(dict): Dictionary of root annotations
root_map(dict): Mapping from root alignment to segmentation
root_segmentation_anns(dict): Dictionary of root segmentation annotations
Returns: Formatted commentary string or None if not valid
Example:
    transfer = CommentaryAlignmentTransfer()
    result = transfer.process_commentary_ann(
        commentary_ann,
        root_anns,
        root_map,
        root_segmentation_anns
    )