Filter

Antibugs

pytexmd.filter.antibugs.raw_remove_comments(input: str) → str

Removes LaTeX comments (everything from a % to the end of the line) from a raw string.

Parameters:: input (str) – The input string to process.
Returns:: The string with comments removed.
Return type:: str

Example

>>> raw_remove_comments("Hello % comment\nWorld")
'Hello \nWorld'
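
The behavior shown in the example can be approximated with a single regular expression. This is a minimal sketch, not pytexmd's implementation: the helper name `strip_latex_comments` is hypothetical, and leaving an escaped `\%` untouched is an assumption.

```python
import re

def strip_latex_comments(text: str) -> str:
    """Drop everything from an unescaped % to the end of its line (hypothetical helper)."""
    # The lookbehind skips a percent sign preceded by a backslash;
    # the newline itself is kept, so line structure is preserved.
    return re.sub(r"(?<!\\)%[^\n]*", "", text)

print(repr(strip_latex_comments("Hello % comment\nWorld")))  # 'Hello \nWorld'
```

Keeping the newline means the number of lines in the output matches the input, which makes later error positions easier to relate to the source.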

pytexmd.filter.antibugs.no_more_html_bugs(input: str) → str

Fixes HTML bugs by adding spaces around ‘<’ and ‘>’ characters.

Parameters:: input (str) – The input string.
Returns:: The processed string with spaces around ‘<’ and ‘>’.
Return type:: str

Example

>>> no_more_html_bugs("<div>")
' < div > '

pytexmd.filter.antibugs.no_more_dolar_bugs_begin(input: str) → str

Replaces escaped dollar signs (\$) with a placeholder.

Parameters:: input (str) – The input string.
Returns:: The string with ‘\$’ replaced by ‘BACKSLASHDOLLAR’.
Return type:: str

Example

>>> no_more_dolar_bugs_begin(r"Price is \$5")
'Price is BACKSLASHDOLLAR5'

pytexmd.filter.antibugs.no_more_dolar_bugs_end(input: str) → str

Restores dollar signs by replacing the placeholder with ‘$’.

Parameters:: input (str) – The input string.
Returns:: The string with ‘BACKSLASHDOLLAR’ replaced by ‘$’.
Return type:: str

Example

>>> no_more_dolar_bugs_end("Price is BACKSLASHDOLLAR5")
'Price is $5'
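
The two dollar functions form a round trip: escaped dollars are hidden before parsing and restored afterwards, presumably so that code splitting on `$` only sees math delimiters. The sketch below illustrates that idea with plain `str.replace`; the helper names `protect_dollars` and `restore_dollars` are hypothetical, and only the placeholder text comes from the descriptions above.

```python
PLACEHOLDER = "BACKSLASHDOLLAR"  # placeholder text taken from the descriptions above

def protect_dollars(text: str) -> str:
    """Hide escaped dollars (backslash + $) behind the placeholder (hypothetical helper)."""
    return text.replace("\\$", PLACEHOLDER)

def restore_dollars(text: str) -> str:
    """Turn the placeholder back into a plain dollar sign (hypothetical helper)."""
    return text.replace(PLACEHOLDER, "$")

protected = protect_dollars(r"Price is \$5, where $x$ is math")
print(protected)                   # Price is BACKSLASHDOLLAR5, where $x$ is math
print(restore_dollars(protected))  # Price is $5, where $x$ is math
```

After protection, the only `$` characters left in the string are the two math delimiters.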

pytexmd.filter.antibugs.no_more_textup_bugs_begin(input: str) → str

Removes ‘\textup’ from the input string.

Parameters:: input (str) – The input string.
Returns:: The string with ‘\textup’ removed.
Return type:: str

Example

>>> no_more_textup_bugs_begin(r"This is \textup{important}")
'This is {important}'

pytexmd.filter.antibugs.remove_empty_at_begin(input: str) → str

Removes leading spaces and newlines from the input string.

Parameters:: input (str) – The input string.
Returns:: The string with leading spaces and newlines removed.
Return type:: str

Example

>>> remove_empty_at_begin("   \nHello")
'Hello'

pytexmd.filter.antibugs.only_two_breaks(input: str) → str

Ensures that there are at most two consecutive line breaks in the input string.

Parameters:: input (str) – The input string.
Returns:: The processed string with at most two consecutive line breaks.
Return type:: str

Example

>>> only_two_breaks("a<br><br><br>b")
'a<br><br>b'

pytexmd.filter.antibugs.no_more_bugs_begin(input: str) → str

Applies a series of bug fixes to the input string at the beginning of processing.

Parameters:: input (str) – The input string.
Returns:: The processed string after applying bug fixes.
Return type:: str

Example

>>> no_more_bugs_begin(r"Some \$text <div> \textup{here}")
'Some BACKSLASHDOLLARtext  < div >  {here}'

pytexmd.filter.antibugs.no_more_bugs_end(input: str) → str

Applies a series of bug fixes to the input string at the end of processing.

Parameters:: input (str) – The input string.
Returns:: The processed string after applying bug fixes.
Return type:: str

Example

>>> no_more_bugs_end("Some BACKSLASHDOLLARtext")
'Some $text'

Core

Core filter classes and utilities for pytexmd.

This module provides the main classes and functions for parsing and processing LaTeX content, including tree elements, searchers, and helpers for Markdown/MyST conversion.

class pytexmd.filter.core.Element(modifiable_content: str, parent: Element | None)

Bases: object

Base class for LaTeX tree elements.

children

Child elements.

Type:: List[Element] | None

_modifiable_content

Content to be processed.

Type:: str

parent

Parent element.

Type:: Element | None

Example

>>> elem = Element("some content", None)
>>> elem._modifiable_content
'some content'

hasattr(string: str) → bool

Check if the element has a given attribute.

Parameters:: string (str) – Attribute name.
Returns:: True if attribute exists, False otherwise.
Return type:: bool

Example

>>> class Dummy(Element): pass
>>> d = Dummy("x", None)
>>> d.hasattr("_modifiable_content")
True

search_attribute_holder(string: str) → Element | None

Find the nearest element that has the given attribute, starting with this element and walking up through its ancestors.

Parameters:: string (str) – Attribute name.
Returns:: Element holding the attribute, or None.
Return type:: Optional[Element]

Example

>>> class Dummy(Element): pass
>>> d = Dummy("x", None)
>>> d.search_attribute_holder("_modifiable_content") is d
True

all_childs() → List[Element]

Recursively collect all child elements.

Returns:: List of all child elements including self.
Return type:: List[Element]

Example

>>> e = Element("abc", None)
>>> e.all_childs()[0] is e
True

search_on_func(function: Callable[[Element], bool]) → Element | None

Search this element and its ancestors using a predicate function.

Parameters:: function (Callable[[Element], bool]) – Predicate function.
Returns:: First matching element or None.
Return type:: Optional[Element]

Example

>>> e = Element("abc", None)
>>> e.search_on_func(lambda x: True) is e
True

search_class(searcher: type) → Element | None

Search this element and its ancestors for a specific class type.

Parameters:: searcher (type) – Class type to search for.
Returns:: First matching element or None.
Return type:: Optional[Element]

Example

>>> class Dummy(Element): pass
>>> d = Dummy("x", None)
>>> d.search_class(Dummy) is d
True

search_up_on_func(function: Callable[[Element], bool]) → Element | None

Search upwards in the tree for an element matching a predicate.

Parameters:: function (Callable[[Element], bool]) – Predicate function.
Returns:: First matching element or None.
Return type:: Optional[Element]

Example

>>> class Dummy(Element): pass
>>> d = Dummy("x", None)
>>> d.search_up_on_func(lambda x: True) is d
True
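
The upward searches described above all follow the same pattern: test the current element, then follow parent links until a match is found or the root is passed. The sketch below shows that pattern on a stand-in class. `Node` and `search_up` are hypothetical names used for illustration, not part of pytexmd.

```python
from typing import Callable, Optional

class Node:
    """Hypothetical stand-in for a tree element with a parent link."""

    def __init__(self, content: str, parent: Optional["Node"] = None):
        self.content = content
        self.parent = parent

    def search_up(self, predicate: Callable[["Node"], bool]) -> Optional["Node"]:
        # Test this node first, then follow parent links until the root is passed.
        node: Optional["Node"] = self
        while node is not None:
            if predicate(node):
                return node
            node = node.parent
        return None

root = Node("document")
root.section_number = 3  # attribute held only by the root
leaf = Node("text", Node("section", root))

holder = leaf.search_up(lambda n: hasattr(n, "section_number"))
print(holder.content)  # document
```

An attribute lookup and a class lookup then become two predicates passed to the same loop, for example `lambda n: hasattr(n, name)` and `lambda n: isinstance(n, cls)`.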

expand(all_classes: List[Element]) → None

Expand the element tree by processing children.

Parameters:: all_classes (List[Element]) – List of element classes.

Example

>>> class Dummy(Element): pass
>>> e = Element("abc", None)
>>> e.expand([Dummy])

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

Example

>>> class Dummy(Element):
...     def to_string(self): return "dummy"
>>> Dummy("abc", None).to_string()
'dummy'

class pytexmd.filter.core.Document(modifiable_content: str, parent: Element)

Bases: StructureMaker

Element representing a LaTeX document.

static position(string: str) → int

static split_and_create(string: str, parent: Element) → Tuple[str, Document, str]

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

get_structures() → List[SectionStructure]

class pytexmd.filter.core.Undefined(modifiable_content: str, parent: Element)

Bases: StructureMaker

Element for undefined LaTeX content.

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

get_structures() → List[SectionStructure]

class pytexmd.filter.core.RawText(string: str, parent: Element)

Bases: Element

Element for raw text content.

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

class pytexmd.filter.core.JunkSearcher(junk_name: str, save_split: bool = True)

Bases: Searcher

Searcher for junk LaTeX commands.

position(string: str) → int

Find position of construct in string.

Parameters:: string (str) – Input string.
Returns:: Position index, or -1 if not found.
Return type:: int

split_and_create(string: str, parent: Element) → Tuple[str, Undefined, str]

Split string and create element for construct.

Parameters:

string (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, created element, post-content.

Return type:

Tuple[str, Element, str]

class pytexmd.filter.core.ReplaceSearcher(junk_name: str, replacement: str, save_split: bool = True)

Bases: Searcher

Searcher for replacing LaTeX commands.

position(string: str) → int

Find position of construct in string.

Parameters:: string (str) – Input string.
Returns:: Position index, or -1 if not found.
Return type:: int

split_and_create(string: str, parent: Element) → Tuple[str, Undefined, str]

Split string and create element for construct.

Parameters:

string (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, created element, post-content.

Return type:

Tuple[str, Element, str]

class pytexmd.filter.core.GuardianSearcher(name: str, save_split: bool = True)

Bases: Searcher

Searcher for guarding LaTeX commands.

position(string: str) → int

Find position of construct in string.

Parameters:: string (str) – Input string.
Returns:: Position index, or -1 if not found.
Return type:: int

split_and_create(string: str, parent: Element) → Tuple[str, RawText, str]

Split string and create element for construct.

Parameters:

string (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, created element, post-content.

Return type:

Tuple[str, Element, str]

class pytexmd.filter.core.OneArgumentJunkSearcher(command_name: str, begin_brace: str = '{', end_brace: str = '}')

Bases: Searcher

Searcher for junk commands with one argument.

position(string: str) → int

Find position of construct in string.

Parameters:: string (str) – Input string.
Returns:: Position index, or -1 if not found.
Return type:: int

split_and_create(string: str, parent: Element) → Tuple[str, Undefined, str]

Split string and create element for construct.

Parameters:

string (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, created element, post-content.

Return type:

Tuple[str, Element, str]

class pytexmd.filter.core.OneArgumentCommandSearcher(command_name: str, begin: str, end: str)

Bases: Searcher

Searcher for commands with one argument.

position(string: str) → int

Find position of construct in string.

Parameters:: string (str) – Input string.
Returns:: Position index, or -1 if not found.
Return type:: int

split_and_create(string: str, parent: Element) → Tuple[str, Undefined, str]

Split string and create element for construct.

Parameters:

string (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, created element, post-content.

Return type:

Tuple[str, Element, str]

pytexmd.filter.core.find_nearest_classes(string: str, all_classes: List[Element]) → List[Element]

Find nearest matching element classes in a string.

Parameters:

string (str) – Input string.
all_classes (List[Element]) – List of element classes.

Returns:

List of nearest matching classes.

Return type:

List[Element]

Example

>>> class Dummy:
...     @staticmethod
...     def position(s): return s.find("x")
>>> find_nearest_classes("abcxdef", [Dummy]) == [Dummy]
True
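
One way to read "nearest" is: ask every candidate for its position in the string, ignore those reporting -1, and keep whichever match earliest. The sketch below implements that reading. The function name `nearest_candidates`, the two toy classes, and the choice to return all candidates tied for the smallest position are assumptions, not the library's confirmed behavior.

```python
from typing import List, Sequence

def nearest_candidates(string: str, candidates: Sequence) -> List:
    """Return the candidates whose position() is smallest and not -1 (sketch)."""
    hits = [(c.position(string), c) for c in candidates]
    hits = [(pos, c) for pos, c in hits if pos >= 0]  # -1 means "not found"
    if not hits:
        return []
    best = min(pos for pos, _ in hits)
    return [c for pos, c in hits if pos == best]

class FindsX:
    @staticmethod
    def position(s: str) -> int:
        return s.find("x")

class FindsD:
    @staticmethod
    def position(s: str) -> int:
        return s.find("d")

# "x" is at index 3 and "d" at index 4, so only FindsX is nearest.
print(nearest_candidates("abcxdef", [FindsD, FindsX]) == [FindsX])  # True
```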

pytexmd.filter.core.has_value_equal(instance: Element, attribute_name: str, value) → bool

Check if an element’s attribute equals a value.

Parameters:

instance (Element) – Element instance.
attribute_name (str) – Attribute name.
value – Value to compare.

Returns:

True if attribute equals value, False otherwise.

Return type:

bool

Example

>>> class Dummy(Element): pass
>>> d = Dummy("x", None)
>>> d.section_number = 5
>>> has_value_equal(d, "section_number", 5)
True

pytexmd.filter.core.get_number_within_equation(string: str) → str

Extract equation numbering context from LaTeX string.

Parameters:: string (str) – LaTeX string.
Returns:: Numbering context or “document”.
Return type:: str

Example

>>> get_number_within_equation(r"abc\numberwithin{equation}{section}")
'section'
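
The lookup can be pictured as a regular-expression search for `\numberwithin{equation}{...}` with a fallback of "document" when the command is absent. This is a sketch under that assumption. The helper name `number_within` is hypothetical and the actual implementation may parse the string differently.

```python
import re

# Capture the second argument of \numberwithin{equation}{...}.
_NUMBERWITHIN = re.compile(r"\\numberwithin\{equation\}\{([^}]*)\}")

def number_within(latex: str) -> str:
    """Return the equation numbering context, or "document" if none is declared (sketch)."""
    match = _NUMBERWITHIN.search(latex)
    return match.group(1) if match else "document"

print(number_within(r"abc\numberwithin{equation}{section}"))  # section
print(number_within(r"no numbering command here"))            # document
```

The input is written as a raw string because `"\n"` inside a normal string literal would be read as a newline rather than the start of `\numberwithin`.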

class pytexmd.filter.core.Searcher

Bases: object

Base class for searchers to find LaTeX constructs.

Example

>>> class DummySearcher(Searcher):
...     def position(self, s): return s.find("x")
...     def split_and_create(self, s, p): return "", Element("x", p), ""
>>> ds = DummySearcher()
>>> ds.position("abcxdef")
3

position(string: str) → int

Find position of construct in string.

Parameters:: string (str) – Input string.
Returns:: Position index, or -1 if not found.
Return type:: int

split_and_create(string: str, parent: Element) → Tuple[str, Element, str]

Split string and create element for construct.

Parameters:

string (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, created element, post-content.

Return type:

Tuple[str, Element, str]
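
The two-method contract (`position` returns an index or -1, `split_and_create` returns pre-content, a created element, and post-content) is enough to drive a simple left-to-right scan. The sketch below shows such a loop with stand-in objects. `LiteralSearcher`, `tokenize`, and the tuple used in place of an Element are hypothetical and only illustrate the contract.

```python
from typing import List, Tuple

class LiteralSearcher:
    """Hypothetical searcher that looks for one literal marker."""

    def __init__(self, literal: str):
        self.literal = literal

    def position(self, string: str) -> int:
        return string.find(self.literal)  # -1 when the marker is absent

    def split_and_create(self, string: str, parent) -> Tuple[str, tuple, str]:
        pre, _, post = string.partition(self.literal)
        return pre, ("match", self.literal), post

def tokenize(string: str, searchers: List[LiteralSearcher]) -> List[tuple]:
    """Repeatedly let the earliest-matching searcher split the remaining text."""
    out: List[tuple] = []
    while string:
        found = [(s.position(string), s) for s in searchers]
        found = [(pos, s) for pos, s in found if pos >= 0]
        if not found:
            out.append(("text", string))
            break
        _, searcher = min(found, key=lambda item: item[0])
        pre, element, string = searcher.split_and_create(string, None)
        if pre:
            out.append(("text", pre))
        out.append(element)
    return out

tokens = tokenize(r"a\foo b\bar c", [LiteralSearcher(r"\foo"), LiteralSearcher(r"\bar")])
print(tokens)
```

Because each searcher only reports a position and performs its own split, new constructs can be supported by adding a searcher without changing the loop.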

class pytexmd.filter.core.BeginEndSearcher(command_name: str, element_type: type, save_split: bool = True)

Bases: Searcher

Searcher for LaTeX environments with \begin and \end.

name

Environment name.

Type:: str

save_split

Whether to save the split command.

Type:: bool

Example

>>> searcher = BeginEndSearcher("itemize", Element)
>>> searcher.name
'itemize'

position(string: str) → int

Find position of construct in string.

Parameters:: string (str) – Input string.
Returns:: Position index, or -1 if not found.
Return type:: int

split_and_create(string: str, parent: Element) → Tuple[str, Element, str]

Split string and create element for construct.

Parameters:

string (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, created element, post-content.

Return type:

Tuple[str, Element, str]

class pytexmd.filter.core.SectionLikeSearcher(command_name: str)

Bases: Searcher

Searcher for section-like LaTeX commands.

name

Command name.

Type:: str

save_split

Whether to save the split command.

Type:: bool

position(string: str) → int

Find position of construct in string.

Parameters:: string (str) – Input string.
Returns:: Position index, or -1 if not found.
Return type:: int

split_and_create(input: str, parent: Element) → Tuple[str, Element, str]

Split string and create element for construct.

Parameters:

input (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, created element, post-content.

Return type:

Tuple[str, Element, str]

class pytexmd.filter.core.SectionLike(modifiable_content: str, parent, command_name: str, name: str)

Bases: StructureMaker

Element for section-like LaTeX commands.

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

get_content() → str

get_structures() → List[SectionStructure]

pytexmd.filter.core.label_call(org: str, label_type: LabelType, rename: str = '') → str

pytexmd.filter.core.ref_call(org: str) → str

class pytexmd.filter.core.LabelType(*values)

Bases: Enum

REF = 'ref'

NUMREF = 'numref'

SECTION_LIKE = 'section_like'

DOC = 'doc'

EQ = 'eq'

PRF_REF = 'prf:ref'

ENUMERATION_ITEM = 'enumeration_item'

class pytexmd.filter.core.BackMatter(save_split: bool = True)

Bases: Searcher

Searcher for back matter LaTeX commands.

position(string: str) → int

Find position of construct in string.

Parameters:: string (str) – Input string.
Returns:: Position index, or -1 if not found.
Return type:: int

split_and_create(string: str, parent: Element) → Tuple[str, Undefined, str]

Split string and create element for construct.

Parameters:

string (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, created element, post-content.

Return type:

Tuple[str, Element, str]

Enumitem

class pytexmd.filter.enumitem.Itemize(modifiable_content: str, parent: Element)

Bases: Element

Represents a LaTeX itemize environment.

Example

>>> itemize = Itemize("content", None)
>>> isinstance(itemize.to_string(), str)
True

current_index = 0

to_string() → str

Converts the itemize to a formatted string.

Returns:: The formatted itemize string.
Return type:: str

Example

>>> itemize = Itemize("abc", None)
>>> isinstance(itemize.to_string(), str)
True

static position(string: str) → int

Finds the position of ‘\begin{itemize}’ in the string.

Parameters:: string (str) – The input string.
Returns:: The position index.
Return type:: int

Example

>>> Itemize.position(r"\begin{itemize}abc")
0

static split_and_create(string: str, parent: Element) → tuple

Splits the string on itemize environment and creates an Itemize.

Parameters:

string (str) – The input string.
parent (Element) – The parent element.

Returns:

(pre, Itemize, post)

Return type:

tuple

Example

>>> pre, itemize, post = Itemize.split_and_create(r"\begin{itemize}abc\end{itemize}", None)
>>> isinstance(itemize, Itemize)
True

class pytexmd.filter.enumitem.ItemizeItem(modifiable_content: str, parent: Element, enum_item: str = '*')

Bases: Element

Represents an item in a LaTeX itemize environment.

Example

>>> item = ItemizeItem("First item", None)
>>> print(item.to_string())
•  First item

label_name() → str

Returns the label of the item.

Returns:: The label.
Return type:: str

Example

>>> item = ItemizeItem("abc", None)
>>> item.label_name()
'•'

to_string() → str

Converts the item to a formatted string.

Returns:: The formatted item string.
Return type:: str

Example

>>> item = ItemizeItem("abc", None)
>>> isinstance(item.to_string(), str)
True

static position(string: str) → int

Finds the position of ‘\item’ in the string.

Parameters:: string (str) – The input string.
Returns:: The position index.
Return type:: int

Example

>>> ItemizeItem.position(r"\item abc")
0

static split_and_create(string: str, parent: Element) → tuple

Splits the string on ‘\item’ and creates an ItemizeItem.

Parameters:

string (str) – The input string.
parent (Element) – The parent element.

Returns:

(pre, ItemizeItem, post)

Return type:

tuple

Example

>>> pre, item, post = ItemizeItem.split_and_create(r"\item abc", None)
>>> isinstance(item, ItemizeItem)
True

class pytexmd.filter.enumitem.Enumeration(modifiable_content: str, parent: Element, start, label_part)

Bases: Element

Represents a LaTeX enumerate environment.

Example

>>> enum = Enumeration("content", None, enum_style_arabic, "(", ")")
>>> isinstance(enum.to_string(), str)
True

generate_enum_item() → str

to_string() → str

Converts the enumerate to a formatted string.

Returns:: The formatted enumerate string.
Return type:: str

Example

>>> enum = Enumeration("abc", None, enum_style_arabic, "(", ")")
>>> isinstance(enum.to_string(), str)
True

static position(string: str) → int

Finds the position of ‘\begin{enumerate}’ in the string.

Parameters:: string (str) – The input string.
Returns:: The position index.
Return type:: int

Example

>>> Enumeration.position(r"\begin{enumerate}abc")
0

static split_and_create(string: str, parent: Element) → tuple

Splits the string on enumerate environment and creates an Enumeration.

Parameters:

string (str) – The input string.
parent (Element) – The parent element.

Returns:

(pre, Enumeration, post)

Return type:

tuple

Example

>>> pre, enum, post = Enumeration.split_and_create(r"\begin{enumerate}abc\end{enumerate}", None)
>>> isinstance(enum, Enumeration)
True

class pytexmd.filter.enumitem.EnumerationItem(modifiable_content: str, parent: Element, enum_item: str = None)

Bases: Element

Represents an item in a LaTeX enumerate environment.

Example

>>> enum_item = EnumerationItem("First", None)
>>> isinstance(enum_item.to_string(), str)
True

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

static position(string: str) → int

Finds the position of ‘\item’ in the string.

Parameters:: string (str) – The input string.
Returns:: The position index.
Return type:: int

Example

>>> EnumerationItem.position(r"\item abc")
0

static split_and_create(string: str, parent: Element) → tuple

Splits the string on ‘\item’ and creates an EnumerationItem.

Parameters:

string (str) – The input string.
parent (Element) – The parent element.

Returns:

(pre, EnumerationItem, post)

Return type:

tuple

Example

>>> pre, enum_item, post = EnumerationItem.split_and_create(r"\item abc", None)
>>> isinstance(enum_item, EnumerationItem)
True

Equations

Equation filter classes and utilities for pytexmd.

This module provides classes and functions for parsing and processing LaTeX equations, environments, and math for Markdown/MyST conversion.

pytexmd.filter.equations.apply_latex_protection(string: Element) → Element

Expands and protects LaTeX environments and commands in the given element.

Parameters:: string (Element) – The element to process.
Returns:: The processed element.
Return type:: Element

class pytexmd.filter.equations.InlineLatex(modifiable_content: str, parent: Element)

Bases: Element

Represents inline LaTeX math ($…$).

Example

>>> inline = InlineLatex("x^2", None)
>>> isinstance(inline.to_string(), str)
True

static position(string: str) → int

static split_and_create(string: str, parent: Element) → Tuple[str, InlineLatex, str]

Split string and create InlineLatex element.

Parameters:

string (str) – Input string.
parent (Element) – Parent element.

Returns:

Pre-content, InlineLatex, post-content.

Return type:

Tuple[str, InlineLatex, str]

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

class pytexmd.filter.equations.LatexText(modifiable_content: str, parent: Element)

Bases: Element

Represents a LaTeX text command.

Example

>>> text = LatexText("hello", None)
>>> isinstance(text.to_string(), str)
True

static position(string: str) → int

static split_and_create(string: str, parent: Element) → Tuple[str, LatexText, str]

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

class pytexmd.filter.equations.Cases(modifiable_content: str, parent: Element)

Bases: Element

Represents a LaTeX cases environment.

Example

>>> cases = Cases(r"x & y \\ z & w", None)
>>> isinstance(cases.to_string(), str)
True

static position(string: str) → int

static split_and_create(string: str, parent: Element) → Tuple[str, Cases, str]

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

class pytexmd.filter.equations.DoubleDolarLatex(modifiable_content: str, parent: Element)

Bases: Element

Represents display math ($$…$$).

Example

>>> dbl = DoubleDolarLatex("x^2", None)
>>> isinstance(dbl, DoubleDolarLatex)
True

prio_elem = True

add_label(label: str)

to_string() → str

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

static position(string: str) → int

static split_and_create(string: str, parent: Element) → Tuple[str, Undefined, str]

pytexmd.filter.equations.get_all_filters() → list

Returns all equation-related filter classes/searchers.

Returns:: List of filter classes/searchers.
Return type:: list

Example

>>> filters = get_all_filters()
>>> isinstance(filters, list)
True

File Maker

pytexmd.filter.file_maker.string_to_tree(string: str) → Document

Converts a string to a document tree structure.

Parameters:: string (str) – The input string to process.
Returns:: The processed document tree.
Return type:: Document

Example

```python
latex = r"""\section{Intro}\begin{equation}E=mc^2\end{equation}"""
doc = string_to_tree(latex)
print(doc.to_string())
```

pytexmd.filter.file_maker.process_string(output_folder: str, string: str, depth=2, output_suffix: str = '.md', verify=True)

Processes a LaTeX string and writes the document to hierarchical MyST files.

This function converts LaTeX to a document tree, then splits it into multiple files based on section hierarchy with automatic content verification.

Parameters:

output_folder (str) – The output folder path.
string (str) – The input LaTeX string.
depth (int, optional) – Splitting depth (0=no split, 1=chapter, 2=section, etc.). Defaults to 2.
output_suffix (str, optional) – The file suffix. Defaults to “.md”.
verify (bool, optional) – Verify content integrity after parsing. Defaults to True.

Returns:

Root structure with child_files tracking for all sections

Return type:

dict

Example

```python
# Process a LaTeX string and split into hierarchical files
latex = r"""\chapter{Intro}\section{Background}\subsection{Details}"""
structure = process_string("output", latex, depth=2)
# Creates: output/intro.md with toctree to output/background.md
```

pytexmd.filter.file_maker.element_to_file_whole(element: SectionLike, output_folder: str, file_name: str, output_suffix: str = '.md')

Writes the whole element to a file.

Parameters:

element (SectionLike) – The element to write.
output_folder (str) – The output folder path.
file_name (str) – The file name.
output_suffix (str, optional) – The file suffix. Defaults to “.md”.

Returns:

None

Example

```python
# Save the entire document as 'output/index.md'
doc = string_to_tree(r"\section{Intro}")
element_to_file_whole(doc, "output", "index")
```

pytexmd.filter.file_maker.element_to_file_only_begin(element: SectionLike, output_folder: str, file_name: str, file_names: List[str], output_suffix: str = '.md')

Writes only the beginning part of the element to a file, with a toctree.

Parameters:

element (SectionLike) – The element to write.
output_folder (str) – The output folder path.
file_name (str) – The file name.
file_names (List[str]) – The file names to list in the toctree.
output_suffix (str, optional) – The file suffix. Defaults to “.md”.

Returns:

None

Example

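```python
# Save only the introduction and generate a toctree for subsections
doc = string_to_tree(r"\section{Intro}\section{Background}")
# file_names lists the toctree entries; the names used here are illustrative
element_to_file_only_begin(doc, "output", "index", ["intro", "background"])
```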

pytexmd.filter.file_maker.split_document_to_files(document_md, output_folder, depth=2, output_suffix='.md', verify=True)

Main function to split document tree into hierarchical MyST files.

Each section file will know its child files through the structure.

Parameters:

document_md – Document tree object (from string_to_tree)
output_folder (str) – Output directory path
depth (int) – Splitting depth (0=no split, 1=chapter, 2=section, etc.)
output_suffix (str) – File extension
verify (bool) – Verify content integrity after parsing

Returns:

Root structure with child_files tracking for all sections

Return type:

dict

Example

```python
# Convert and split a document
doc = string_to_tree(latex_string)
structure = split_document_to_files(doc, "./output", depth=2, verify=True)
# Each section in structure has 'child_files' list
```

pytexmd.filter.file_maker.split_by_sections(content_string, max_depth=2)

Split document string into hierarchical sections based on MyST comment markers.

Parameters:

content_string (str) – The full document string with MyST markers
max_depth (int) – Maximum depth for splitting (0=part, 1=chapter, 2=section, etc.)

Returns:

Hierarchical structure of sections with content and children tracking

Return type:

dict

pytexmd.filter.file_maker.verify_content_integrity(original_content, structure)

Verify that the split structure contains all original content.

Parameters:

original_content (str) – Original document string
structure (dict) – Parsed section structure

Returns:

(is_valid, message, stats)

Return type:

tuple

pytexmd.filter.file_maker.string_to_filename(name)

Convert section name to valid filename.

Parameters:: name (str) – Section name to convert
Returns:: Sanitized filename
Return type:: str
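
The exact sanitizing rules are not documented here. As an illustration of what turning a section name into a filename typically involves, the sketch below lowercases the name and collapses every run of other characters into an underscore. The helper name `slugify`, the underscore separator, and the "untitled" fallback are all assumptions.

```python
import re

def slugify(name: str) -> str:
    """Turn a section title into a conservative filename stem (hypothetical helper)."""
    # Lowercase, then replace every run of non-alphanumeric characters with "_".
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return slug or "untitled"  # avoid returning an empty filename

print(slugify("Intro"))                  # intro
print(slugify("Background & Details!"))  # background_details
```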

Notworking Preprocessor

pytexmd.filter.notworking_preprocessor.do_commands(string: str) → str

pytexmd.filter.notworking_preprocessor.do_newenvironment(string: str) → str

Processes all LaTeX newenvironment definitions and applies them.

Parameters:: string (str) – The input string.
Returns:: The string with environments expanded.
Return type:: str

Example

>>> s = r"\newenvironment{foo}[2]{<b>#1 #2>}{</b>} \begin{foo}{a}{b}content\end{foo}"
>>> do_newenvironment(s)
'<b>a b>content</b>'

Preprocessor

pytexmd.filter.preprocessor.do_commands(string: str) → str

Processes all LaTeX newcommand definitions and applies them.

Parameters:: string (str) – The input string.
Returns:: The string with commands expanded.
Return type:: str

Example

>>> s = r"\newcommand{\foo}[2]{#1+#2} \foo{a}{b}"
>>> do_commands(s)
'a+b'
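
Macro expansion of this kind amounts to matching each call together with its brace arguments and substituting them for `#1`, `#2`, and so on in the body. The sketch below handles one already-parsed definition with non-nested arguments. The function `expand_simple_command` is hypothetical, and parsing the `\newcommand` line itself is left out.

```python
import re

def expand_simple_command(text: str, name: str, n_args: int, body: str) -> str:
    """Expand calls of \\name{...}{...} using body with #1..#n placeholders (sketch).

    Assumes arguments contain no nested braces.
    """
    pattern = re.compile(re.escape("\\" + name) + r"\{([^{}]*)\}" * n_args)

    def substitute(match: "re.Match") -> str:
        result = body
        # Fill in placeholders from the highest index down to the lowest.
        for i in range(n_args, 0, -1):
            result = result.replace(f"#{i}", match.group(i))
        return result

    return pattern.sub(substitute, text)

print(expand_simple_command(r"\foo{a}{b} and \foo{x}{y}", "foo", 2, "#1+#2"))
# a+b and x+y
```

Nested braces in arguments would need a brace-matching scan instead of the `[^{}]*` pattern used here.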

pytexmd.filter.preprocessor.do_newenvironment(string: str) → str

Processes all LaTeX newenvironment definitions and applies them.

Parameters:: string (str) – The input string.
Returns:: The string with environments expanded.
Return type:: str

Example

>>> s = r"\newenvironment{foo}[2]{<b>#1 #2>}{</b>} \begin{foo}{a}{b}content\end{foo}"
>>> do_newenvironment(s)
'<b>a b>content</b>'

Splitting

pytexmd.filter.splitting.get_all_allchars_no_abc() → str

Returns a string of non-alphabetic ASCII characters.

Returns:: String containing non-alphabetic ASCII characters.
Return type:: str

Example

>>> chars = get_all_allchars_no_abc()
>>> isinstance(chars, str)
True

pytexmd.filter.splitting.save_command_split(string: str, split_on: str) → List[str]

Splits a string on a given substring, preserving certain patterns.

Parameters:

string (str) – The input string to split.
split_on (str) – The substring to split on.

Returns:

List of split string segments.

Return type:

List[str]

Raises:

ValueError – If input types are incorrect.

Example

>>> parts = save_command_split("foo$bar$baz", "$")
>>> parts
['foo', 'bar', 'baz']

pytexmd.filter.splitting.first_char_brace(string: str, begin_brace: str = '{') → bool

Checks if the first non-whitespace character of a string is a given brace.

Parameters:

string (str) – The input string.
begin_brace (str, optional) – The brace character to check. Defaults to “{”.

Returns:

True if the first non-whitespace character is the brace, False otherwise.

Return type:

bool

Raises:

ValueError – If input types are incorrect.

Example

>>> is_brace = first_char_brace(" {foo}")
>>> is_brace
True

pytexmd.filter.splitting.split_on_first_brace(string: str, begin_brace='{', end_brace='}', error_replacement='brace_error') → Tuple[str, str]

Splits a string on the first matching pair of braces.

Parameters:

string (str) – The input string.
begin_brace (str, optional) – The opening brace. Defaults to “{”.
end_brace (str, optional) – The closing brace. Defaults to “}”.
error_replacement (str, optional) – Replacement string if brace not found. Defaults to “brace_error”.

Returns:

Content inside braces, and the remaining string.

Return type:

Tuple[str, str]

Raises:

ValueError – If input types are incorrect.

Example

>>> inside, rest = split_on_first_brace("{foo}bar")
>>> inside
'foo'
>>> rest
'bar'
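
Matching the first pair of braces usually means tracking nesting depth. The sketch below illustrates that idea under two assumptions that the documentation does not confirm: braces are single characters, and a missing or unbalanced brace yields `error_replacement` together with the unchanged input. `split_on_first_brace_sketch` is a hypothetical name.

```python
from typing import Tuple


def split_on_first_brace_sketch(
    string: str,
    begin_brace: str = "{",
    end_brace: str = "}",
    error_replacement: str = "brace_error",
) -> Tuple[str, str]:
    """Return (content inside the first balanced brace pair, remaining string)."""
    stripped = string.lstrip()
    if not stripped.startswith(begin_brace):
        # Assumed fallback: no opening brace at the start.
        return error_replacement, string
    depth = 0
    for i, ch in enumerate(stripped):
        if ch == begin_brace:
            depth += 1
        elif ch == end_brace:
            depth -= 1
            if depth == 0:
                # i is the index of the brace that closes the first opening brace.
                return stripped[1:i], stripped[i + 1:]
    # Assumed fallback: the opening brace is never closed.
    return error_replacement, string


print(split_on_first_brace_sketch("{a{b}c}d"))  # ('a{b}c', 'd')
```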

pytexmd.filter.splitting.split_rename(string: str) → Tuple[str, str] | None[source]¶

Splits the input string into a name and the remaining string if the first character is a ‘[’.

Parameters:: string (str) – The input string.
Returns:: A tuple containing the name and the remaining string, or None if the first character is not ‘[’.
Return type:: Optional[Tuple[str, str]]
Raises:: ValueError – If input is not a string.

Example

>>> name, rest = split_rename("[foo]bar")
>>> name
'foo'
>>> rest
'bar'
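
Because the return type is Optional, unpacking the result directly fails when the input does not start with ‘[’. The sketch below mimics the documented contract with a hypothetical `split_rename_sketch` and shows the None check; returning None for an unclosed bracket is an assumption.

```python
from typing import Optional, Tuple


def split_rename_sketch(string: str) -> Optional[Tuple[str, str]]:
    """Return (name, rest) for input like '[name]rest', otherwise None."""
    if not isinstance(string, str):
        raise ValueError("string must be str")
    if not string.startswith("["):
        return None
    name, sep, rest = string[1:].partition("]")
    if not sep:
        # Assumption: an unclosed bracket is treated like a missing one.
        return None
    return name, rest


result = split_rename_sketch("bar")
if result is None:
    print("no optional name")  # printed for this input
else:
    name, rest = result
```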

pytexmd.filter.splitting.split_on_next(string: str, split_on: str, save_split: bool = True) → Tuple[str, str][source]¶

Splits a string on the next occurrence of a substring.

Parameters:

string (str) – The input string.
split_on (str) – The substring to split on.
save_split (bool, optional) – Whether to use save_command_split. Defaults to True.

Returns:

The part before and after the split.

Return type:

Tuple[str, str]

Raises:

ValueError – If input types are incorrect.

Example

>>> before, after = split_on_next("foo$bar$baz", "$")
>>> before
'foo'
>>> after
'bar$baz'
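
For plain substrings, the behavior in the example corresponds to Python's `str.partition`. The following is only an approximation: it ignores `save_split`, and returning an empty second part when the delimiter is absent is an assumption. `split_on_next_sketch` is a hypothetical name.

```python
from typing import Tuple


def split_on_next_sketch(string: str, split_on: str) -> Tuple[str, str]:
    """Split at the first occurrence of split_on and drop the delimiter."""
    if not isinstance(string, str) or not isinstance(split_on, str):
        raise ValueError("string and split_on must be str")
    before, _sep, after = string.partition(split_on)
    return before, after


print(split_on_next_sketch("foo$bar$baz", "$"))  # ('foo', 'bar$baz')
```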

pytexmd.filter.splitting.begin_end_split(string: str, begin_name: str, end_name: str, save_split: bool = False) → Tuple[str, str, str][source]¶

Splits a string into three parts: before, between, and after given begin and end substrings.

Parameters:

string (str) – The input string.
begin_name (str) – The substring marking the beginning.
end_name (str) – The substring marking the end.
save_split (bool, optional) – Whether to use save_command_split. Defaults to False.

Returns:

The parts before, between, and after the delimiters.

Return type:

Tuple[str, str, str]

Raises:

ValueError – If input types are incorrect.

Example

>>> pre, mid, post = begin_end_split(r"a\begin{env}b\end{env}c", r"\begin{env}", r"\end{env}")
>>> pre
'a'
>>> mid
'b'
>>> post
'c'
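
A three-way split of this kind can be sketched with two `str.partition` calls. The version below is a simplification that pairs the first begin marker with the first end marker after it, so it does not handle nested environments; whether the library does is not stated here. `begin_end_split_sketch` is a hypothetical name.

```python
from typing import Tuple


def begin_end_split_sketch(string: str, begin_name: str, end_name: str) -> Tuple[str, str, str]:
    """Return the text before, between, and after the first begin/end pair."""
    pre, _b, tail = string.partition(begin_name)
    mid, _e, post = tail.partition(end_name)
    return pre, mid, post


# Raw strings keep the backslashes literal; in a normal literal "\b" is a backspace.
parts = begin_end_split_sketch(r"a\begin{env}b\end{env}c", r"\begin{env}", r"\end{env}")
print(parts)  # ('a', 'b', 'c')
```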

pytexmd.filter.splitting.position_of(string: str, begin_name: str, save_split: bool = True) → int[source]¶

Finds the position of a substring in a string.

Parameters:

string (str) – The input string.
begin_name (str) – The substring to find.
save_split (bool, optional) – Whether to use save_command_split. Defaults to True.

Returns:

The position index, or -1 if not found.

Return type:

int

Raises:

ValueError – If input types are incorrect.

Example

>>> pos = position_of("foo$bar", "$")
>>> pos
3
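
The documented return convention (index, or -1 if not found) is the same as that of Python's `str.find`. A minimal stand-in that ignores `save_split` looks like this; `position_of_sketch` is a hypothetical name.

```python
def position_of_sketch(string: str, begin_name: str) -> int:
    """Return the index of the first occurrence of begin_name, or -1."""
    if not isinstance(string, str) or not isinstance(begin_name, str):
        raise ValueError("string and begin_name must be str")
    return string.find(begin_name)


print(position_of_sketch("foo$bar", "$"))  # 3
print(position_of_sketch("foo", "$"))      # -1
```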

Text¶

Section and theorem filter classes and utilities for pytexmd.

This module provides classes and functions for parsing and processing LaTeX sections, theorems, references, and formatting for Markdown/MyST conversion.

class pytexmd.filter.text.Ref(modifiable_content: str, parent: Element, label_ref: str)[source]¶

Bases: Element

Element for LaTeX ref reference.

Example

>>> ref = Ref("content", None, "mylabel")
>>> isinstance(ref, Ref)
True

static position(input: str) → int[source]¶

static split_and_create(input: str, parent: Element) → Tuple[str, Ref, str][source]¶

to_string() → str[source]¶

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

Example

>>> class Dummy(Element):
...     def to_string(self): return "dummy"
>>> Dummy("abc", None).to_string()
'dummy'

class pytexmd.filter.text.EqRef(modifiable_content: str, parent: Element, label_ref: str)[source]¶

Bases: Element

Element for LaTeX eqref reference.

Example

>>> eqref = EqRef("content", None, "eq1")
>>> isinstance(eqref, EqRef)
True

static position(input: str) → int[source]¶

static split_and_create(input: str, parent: Element) → Tuple[str, Ref, str][source]¶

to_string() → str[source]¶

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

Example

>>> class Dummy(Element):
...     def to_string(self): return "dummy"
>>> Dummy("abc", None).to_string()
'dummy'

class pytexmd.filter.text.Proof(modifiable_content: str, parent: Element)[source]¶

Bases: Element

Element for LaTeX proof environment.

Example

>>> proof = Proof("content", None)
>>> isinstance(proof, Proof)
True

static position(input: str) → int[source]¶

static split_and_create(input: str, parent: Element) → Tuple[str, Proof, str][source]¶

to_string() → str[source]¶

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

Example

>>> class Dummy(Element):
...     def to_string(self): return "dummy"
>>> Dummy("abc", None).to_string()
'dummy'

class pytexmd.filter.text.Textbf(modifiable_content: str, parent: Element)[source]¶

Bases: Element

Element for LaTeX textbf command.

Example

>>> bold = Textbf("content", None)
>>> isinstance(bold, Textbf)
True

static position(input: str) → int[source]¶

static split_and_create(input: str, parent: Element) → Tuple[str, Textbf, str][source]¶

to_string() → str[source]¶

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

Example

>>> class Dummy(Element):
...     def to_string(self): return "dummy"
>>> Dummy("abc", None).to_string()
'dummy'
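
The element classes in this module share three methods: `position`, `split_and_create`, and `to_string`. The sketch below shows how such a class could fit together for `\textbf`. Everything in it is an assumption made for illustration: `ElementSketch` stands in for the real `Element` base class, braces are assumed not to nest, and emitting `**...**` is a guess at the Markdown output.

```python
from typing import Optional, Tuple


class ElementSketch:
    """Minimal stand-in for the Element base class (assumed shape)."""

    def __init__(self, modifiable_content: str, parent: Optional["ElementSketch"]):
        self.modifiable_content = modifiable_content
        self.parent = parent


class TextbfSketch(ElementSketch):
    OPENING = r"\textbf{"

    @staticmethod
    def position(text: str) -> int:
        # Index of the first \textbf{ in the text, or -1 if there is none.
        return text.find(TextbfSketch.OPENING)

    @staticmethod
    def split_and_create(text: str, parent: Optional[ElementSketch]) -> Tuple[str, "TextbfSketch", str]:
        # Text before the command, the new element, and the text after the closing brace.
        pre, _open, tail = text.partition(TextbfSketch.OPENING)
        content, _close, post = tail.partition("}")
        return pre, TextbfSketch(content, parent), post

    def to_string(self) -> str:
        return f"**{self.modifiable_content}**"


pre, bold, post = TextbfSketch.split_and_create(r"a \textbf{b} c", None)
print(pre + bold.to_string() + post)  # a **b** c
```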

class pytexmd.filter.text.Emph(modifiable_content: str, parent: Element)[source]¶

Bases: Element

Element for LaTeX emph command.

Example

>>> emph = Emph("content", None)
>>> isinstance(emph, Emph)
True

static position(input: str) → int[source]¶

static split_and_create(input: str, parent: Element) → Tuple[str, Emph, str][source]¶

to_string() → str[source]¶

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

Example

>>> class Dummy(Element):
...     def to_string(self): return "dummy"
>>> Dummy("abc", None).to_string()
'dummy'

class pytexmd.filter.text.Cite(modifiable_content: str, parent: Element, citations: list[str], rename: str)[source]¶

Bases: Element

Element for LaTeX cite command.

Example

>>> cite = Cite("content", None, ["ref1", "ref2"], "")
>>> isinstance(cite, Cite)
True

static position(input: str) → int[source]¶

static split_and_create(input: str, parent: Element) → Tuple[str, Cite, str][source]¶

to_string() → str[source]¶

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

Example

>>> class Dummy(Element):
...     def to_string(self): return "dummy"
>>> Dummy("abc", None).to_string()
'dummy'

pytexmd.filter.text.get_all_filters() → list[source]¶

Returns all section-related filter classes/searchers.

Returns:: List of filter classes/searchers.
Return type:: list

Example

>>> filters = get_all_filters()
>>> isinstance(filters, list)
True

pytexmd.filter.text.get_number_within_equation(input: str) → str[source]¶

Extract equation numbering context from LaTeX string.

Parameters:: input (str) – LaTeX string.
Returns:: Numbering context or “document”.
Return type:: str

Example

>>> get_number_within_equation(r"abc\numberwithin{equation}{section}")
'section'
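
One way to picture this lookup is a regular-expression search for `\numberwithin{equation}{...}` with "document" as the fallback. The sketch below is an assumption about the approach, not the library's implementation; `numbering_context` is a hypothetical name.

```python
import re

# Matches \numberwithin{equation}{<context>} and captures <context>.
_NUMBERWITHIN = re.compile(r"\\numberwithin\s*\{equation\}\s*\{([^}]*)\}")


def numbering_context(latex: str) -> str:
    """Return the equation numbering context, or 'document' if none is declared."""
    match = _NUMBERWITHIN.search(latex)
    return match.group(1) if match else "document"


print(numbering_context(r"abc\numberwithin{equation}{section}"))  # section
print(numbering_context("no declaration here"))                   # document
```

Pass LaTeX as a raw string: in a normal literal the `\n` of `\numberwithin` becomes a newline, and the pattern no longer matches.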

pytexmd.filter.text.get_theoremSearchers(input: str) → list[source]¶

Extract theorem searchers from LaTeX preamble.

Parameters:: input (str) – LaTeX preamble string.
Returns:: List of TheoremSearcher instances.
Return type:: list

Example

>>> result = get_theoremSearchers(r"\newtheorem{theorem}{Theorem}")
>>> isinstance(result, list)
True
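
To give a feel for what is extracted from the preamble, the sketch below collects `\newtheorem` declarations as (environment, title) pairs. The real function returns TheoremSearcher instances, whose constructor is not documented here, so plain tuples are used instead; `find_newtheorems` is a hypothetical name.

```python
import re
from typing import List, Tuple

# \newtheorem{env}{Title}, with an optional star and an optional [counter] in between.
_NEWTHEOREM = re.compile(r"\\newtheorem\*?\{([^}]*)\}(?:\[[^\]]*\])?\{([^}]*)\}")


def find_newtheorems(preamble: str) -> List[Tuple[str, str]]:
    """Return (environment name, printed title) for each \\newtheorem declaration."""
    return _NEWTHEOREM.findall(preamble)


preamble = r"\newtheorem{theorem}{Theorem}\newtheorem{lemma}[theorem]{Lemma}"
print(find_newtheorems(preamble))  # [('theorem', 'Theorem'), ('lemma', 'Lemma')]
```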

class pytexmd.filter.text.Textit(modifiable_content: str, parent: Element)[source]¶

Bases: Element

Element for LaTeX textit command.

Example

>>> textit = Textit("content", None)
>>> isinstance(textit, Textit)
True

static position(input: str) → int[source]¶

static split_and_create(input: str, parent: Element) → Tuple[str, Emph, str][source]¶

to_string() → str[source]¶

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

Example

>>> class Dummy(Element):
...     def to_string(self): return "dummy"
>>> Dummy("abc", None).to_string()
'dummy'

class pytexmd.filter.text.ProofLabel(modifiable_content: str, parent: Element, label_ref: str)[source]¶

Bases: Element

Element for MyST label.

Parameters:

modifiable_content (str) – Content to process.
parent (Element) – Parent element.
label_ref (str) – Label reference.

Example

>>> label = ProofLabel("content", None, "mylabel")
>>> isinstance(label, ProofLabel)
True

static position(string: str) → int[source]¶

static split_and_create(string: str, parent: Element) → Tuple[str, ProofLabel, str][source]¶

to_string() → str[source]¶

Output Markdown/MyST string.

Returns:: Markdown/MyST representation.
Return type:: str

Example

>>> class Dummy(Element):
...     def to_string(self): return "dummy"
>>> Dummy("abc", None).to_string()
'dummy'

Module Contents¶

pytexmd.filter.string_to_tree(string: str) → Document[source]¶

Converts a string to a document tree structure.

Parameters:: string (str) – The input string to process.
Returns:: The processed document tree.
Return type:: Document

Example

```python
latex = r"""\section{Intro}\begin{equation}E=mc^2\end{equation}"""
doc = string_to_tree(latex)
print(doc.to_string())
```

pytexmd.filter.process_string(output_folder: str, string: str, depth=2, output_suffix: str = '.md', verify=True)[source]¶

Processes a LaTeX string and writes the document to hierarchical MyST files.

This function converts LaTeX to a document tree, then splits it into multiple files based on section hierarchy with automatic content verification.

Parameters:

output_folder (str) – The output folder path.
string (str) – The input LaTeX string.
depth (int, optional) – Splitting depth (0=no split, 1=chapter, 2=section, etc.). Defaults to 2.
output_suffix (str, optional) – The file suffix. Defaults to “.md”.
verify (bool, optional) – Verify content integrity after parsing. Defaults to True.

Returns:

Root structure with child_files tracking for all sections

Return type:

dict

Example

pytexmd.filter.element_to_file_whole(element: SectionLike, output_folder: str, file_name: str, output_suffix: str = '.md')[source]¶

Writes the whole element to a file.

Parameters:

element (SectionLike) – The element to write.
output_folder (str) – The output folder path.
file_name (str) – The file name.
output_suffix (str, optional) – The file suffix. Defaults to “.md”.

Returns:

None

Example

```python
# Save the entire document as 'output/index.md'
doc = string_to_tree(r"\section{Intro}")
element_to_file_whole(doc, "output", "index")
```

pytexmd.filter.element_to_file_only_begin(element: SectionLike, output_folder: str, file_name: str, file_names: List[str], output_suffix: str = '.md')[source]¶

Writes only the beginning part of the element to a file, with a toctree.

Parameters:

element (SectionLike) – The element to write.
output_folder (str) – The output folder path.
file_name (str) – The file name.
file_names (List[str]) – The file names to list in the toctree.
output_suffix (str, optional) – The file suffix. Defaults to “.md”.

Returns:

None

Example

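```python
# Save only the introduction and generate a toctree for subsections
doc = string_to_tree(r"\section{Intro}\section{Background}")
# file_names is required by the signature; these child file names are illustrative
element_to_file_only_begin(doc, "output", "index", ["intro", "background"])
```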

pytexmd.filter.split_document_to_files(document_md, output_folder, depth=2, output_suffix='.md', verify=True)[source]¶

Main function to split document tree into hierarchical MyST files.

Each section file will know its child files through the structure.

Parameters:

document_md – Document tree object (from string_to_tree)
output_folder (str) – Output directory path
depth (int) – Splitting depth (0=no split, 1=chapter, 2=section, etc.)
output_suffix (str) – File extension
verify (bool) – Verify content integrity after parsing

Returns:

Root structure with child_files tracking for all sections

Return type:

dict

Example

pytexmd.filter.split_by_sections(content_string, max_depth=2)[source]¶

Split document string into hierarchical sections based on MyST comment markers.

Parameters:

content_string (str) – The full document string with MyST markers
max_depth (int) – Maximum depth for splitting (0=part, 1=chapter, 2=section, etc.)

Returns:

Hierarchical structure of sections with content and children tracking

Return type:

dict
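
The marker format is not documented here, so the following sketch invents one (`% level=<n> title=<text>`, using MyST's `%` comment syntax) purely to illustrate how a flat string can be turned into a nested structure with a stack. The dictionary keys and `split_by_markers` are hypothetical as well.

```python
import re
from typing import Dict

# Hypothetical marker format; the markers pytexmd actually emits may differ.
MARKER = re.compile(r"^% level=(\d+) title=(.*)$")


def split_by_markers(content: str, max_depth: int = 2) -> Dict:
    """Build a tree of {'title', 'depth', 'content', 'children'} nodes."""
    root = {"title": "root", "depth": -1, "content": "", "children": []}
    stack = [root]
    for line in content.splitlines(keepends=True):
        match = MARKER.match(line.rstrip("\n"))
        if match and int(match.group(1)) <= max_depth:
            depth = int(match.group(1))
            # Pop until the top of the stack is shallower than the new section.
            while stack[-1]["depth"] >= depth:
                stack.pop()
            node = {"title": match.group(2), "depth": depth, "content": "", "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        else:
            # Ordinary lines, and markers deeper than max_depth, stay with the current node.
            stack[-1]["content"] += line
    return root


doc = "intro\n% level=1 title=A\na text\n% level=2 title=A1\na1 text\n% level=1 title=B\nb text\n"
tree = split_by_markers(doc)
print([child["title"] for child in tree["children"]])  # ['A', 'B']
```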

pytexmd.filter.verify_content_integrity(original_content, structure)[source]¶

Verify that the split structure contains all original content.

Parameters:

original_content (str) – Original document string
structure (dict) – Parsed section structure

Returns:

(is_valid, message, stats)

Return type:

tuple
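
A check of this kind can be pictured as reassembling the text from the structure and comparing it with the original. The sketch below assumes a hypothetical node layout (`content` plus `children`, in document order) and invented stats keys; the real structure and statistics may differ.

```python
from typing import Dict, Tuple


def collect_content(node: Dict) -> str:
    """Concatenate a node's content with that of its children, in document order."""
    return node["content"] + "".join(collect_content(child) for child in node["children"])


def verify_sketch(original: str, structure: Dict) -> Tuple[bool, str, Dict[str, int]]:
    """Return (is_valid, message, stats) for a reassembly check."""
    rebuilt = collect_content(structure)
    stats = {"original_chars": len(original), "rebuilt_chars": len(rebuilt)}
    is_valid = rebuilt == original
    message = "content preserved" if is_valid else "content mismatch"
    return is_valid, message, stats


structure = {
    "content": "intro\n",
    "children": [{"content": "# A\ntext\n", "children": []}],
}
print(verify_sketch("intro\n# A\ntext\n", structure))
# (True, 'content preserved', {'original_chars': 15, 'rebuilt_chars': 15})
```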

pytexmd.filter.string_to_filename(name)[source]¶

Convert section name to valid filename.

Parameters:: name (str) – Section name to convert
Returns:: Sanitized filename
Return type:: str
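
The sanitizing rules are not spelled out. A typical approach, shown here as an assumption rather than the library's actual behavior, replaces every run of characters other than ASCII letters and digits with an underscore and lowercases the result. `filename_sketch` and the "untitled" fallback are hypothetical.

```python
import re


def filename_sketch(name: str) -> str:
    """Turn a section title into a conservative, lowercase file name."""
    # Collapse each run of characters outside A-Z, a-z, 0-9 into one underscore.
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_").lower()
    # Assumed fallback for titles that contain no usable characters.
    return cleaned or "untitled"


print(filename_sketch("Intro: A/B Testing?"))  # intro_a_b_testing
```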