.gptcontext

Additional context is provided below.

Preferences for python code:
- adhere to common style conventions, e.g. PEP8
- keep lines under 80 characters long

Markdown2confluence publishes a folder of markdown files to Confluence, mirroring the file and folder structure of the markdown files as the page structure and ignoring any non-markdown files.

Required behavior:
All pages managed by markdown2confluence contain $CONFLUENCE_PAGE_TITLE_SUFFIX in their title, e.g. '(autogenerated)'. New pages are created with this suffix, and on subsequent runs any pages with the suffix (or label, TBD) are overwritten or deleted.
Depending on how Confluence labels work, it might be best to use labels instead. If using labels, refuse to delete any page that does not have the page title suffix.
Any markdown file that contains full or relative links to local media files should be published as a page with the media attached. Relative links to local media are resolved from the location of the markdown file. Full-path links are resolved from $MARKDOWN_FOLDER.
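
A minimal sketch of the link resolution rule, to make the intended behavior concrete. The function name `resolve_media_path` is hypothetical, and treating a leading "/" as the marker of a full-path link is an assumption, not something decided above.

```python
import os
from pathlib import Path


def resolve_media_path(link: str, markdown_file: Path,
                       markdown_folder: Path) -> Path:
    """Map a local media link to a filesystem path.

    Assumption: a link starting with "/" is a full-path link rooted
    at markdown_folder; any other link is relative to the directory
    of the markdown file that contains it.
    """
    if link.startswith("/"):
        candidate = markdown_folder / link.lstrip("/")
    else:
        candidate = markdown_file.parent / link
    # Collapse "." and ".." segments without touching the filesystem.
    return Path(os.path.normpath(candidate))
```

Whether links that escape $MARKDOWN_FOLDER via ".." should be rejected is left open here.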


Currently I am working on:
- The Publisher class in publisher.py contains the old code for now; I am
  moving functionality to the other classes.
- Change from directly using requests to using the confluence client from
  atlassian
- Use labels instead of only relying on the suffix (previously called search
  pattern)


file structure:
markdown2confluence/
├── CODEOWNERS
├── Dockerfile
├── docs
│   └── usage.md
├── LICENCE
├── markdown2confluence
│   ├── converter.py
│   ├── __init__.py
│   ├── main.py
│   ├── confluence.py
│   ├── config.py
│   ├── file_manager.py
│   └── publisher.py
├── README.md
├── requirements.txt
├── setup.py
└── tests
    ├── e2e
    │   ├── __init__.py
    │   └── test_e2e.py
    ├── integration
    │   ├── __init__.py
    │   └── test_integration.py
    └── unit
        ├── __init__.py
        ├── test_file_manager.py
        ├── test_confluence.py
        └── test_publisher.py

arguments:
--confluence-url
--confluence-username
--confluence-password
--confluence-space-id
--confluence-parent-page-id
--markdown-folder
--confluence-page-title-suffix
--confluence-page-label
--confluence-ignorefile

and corresponding environment variables:
CONFLUENCE_URL
CONFLUENCE_USERNAME
CONFLUENCE_PASSWORD
CONFLUENCE_SPACE_ID
CONFLUENCE_PARENT_PAGE_ID
MARKDOWN_FOLDER
CONFLUENCE_PAGE_TITLE_SUFFIX
CONFLUENCE_PAGE_LABEL
CONFLUENCE_IGNOREFILE
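
One possible way to wire the arguments to their environment variables, as a sketch only. The helper name `build_parser` is hypothetical, and letting a command-line value override the environment variable is an assumption about the desired precedence.

```python
import argparse
import os

FLAGS = [
    "--confluence-url",
    "--confluence-username",
    "--confluence-password",
    "--confluence-space-id",
    "--confluence-parent-page-id",
    "--markdown-folder",
    "--confluence-page-title-suffix",
    "--confluence-page-label",
    "--confluence-ignorefile",
]


def build_parser(env=None) -> argparse.ArgumentParser:
    """Build a parser whose defaults come from environment variables.

    Each flag maps to its variable by dropping the leading dashes,
    upper-casing, and replacing "-" with "_".
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="markdown2confluence")
    for flag in FLAGS:
        env_var = flag.lstrip("-").replace("-", "_").upper()
        # Assumed precedence: an explicit argument wins over the env.
        parser.add_argument(flag, default=env.get(env_var))
    return parser
```

Passing `env` explicitly keeps the function easy to unit test without patching `os.environ`.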


### Architecture Summary

#### Components and Their Key Interfaces

1. **ConfluenceClient**

Responsible for direct interactions with the Confluence API, handling operations like page creation, updates, deletion, and labeling with retries and backoff for robustness.

```python
class ConfluenceClient:
    def __init__(self, confluence_config: dict):
        """Initialize with API configuration."""

    # space_key is placed before the optional parameters: a parameter
    # without a default may not follow one with a default.
    def create_or_update_page(
        self,
        title: str,
        html: str,
        space_key: str,
        parent_id=None,
        labels=None,
    ) -> dict:
        """Create or update a Confluence page, applying labels."""

    def delete_page(self, page_id: str) -> dict:
        """Delete a Confluence page by ID."""
```
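
The retry and backoff behavior mentioned above could look roughly like the following. This is a generic sketch, not the existing implementation: the decorator name `retry` and its parameters are hypothetical, and which exceptions are worth retrying depends on the client library.

```python
import time
from functools import wraps


def retry(attempts=3, base_delay=1.0, exceptions=(Exception,),
          sleep=time.sleep):
    """Retry the wrapped call with exponential backoff.

    The delay doubles after every failed attempt; the exception from
    the final attempt is re-raised unchanged.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts - 1:
                        raise
                    sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator
```

The `sleep` parameter exists so unit tests can record delays instead of actually waiting.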

2. **Publisher**

Orchestrates the conversion of Markdown to HTML and the subsequent publishing to Confluence, respecting the original directory structure and managing page relationships.

```python
class Publisher:
    def __init__(
        self,
        confluence_client: ConfluenceClient,
        source_directory: str,
        space_key: str,
    ):
        """Set up with a client, source directory, and space key."""

    def publish(self):
        """Main method to start the publishing process."""

    def traverse_directory(self, directory: str, parent_id=None):
        """Recursively convert and upload Markdown files."""
```
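
To illustrate how the directory structure might map to a page hierarchy, here is a small sketch that only plans the pages and does not touch Confluence. The function name `plan_pages` is hypothetical, and turning every directory into a parent page (even one without Markdown files) is a simplifying assumption.

```python
import os
from typing import Iterator, Optional, Tuple


def plan_pages(root: str) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (path, parent_path) pairs relative to root.

    Directories become parent pages and Markdown files become their
    children; files without a .md extension are skipped. A parent of
    None means the page sits directly under the configured parent.
    """
    for current, dirs, files in os.walk(root):
        dirs.sort()  # visit subdirectories in a stable order
        rel_dir = os.path.relpath(current, root)
        parent = None if rel_dir == "." else rel_dir
        if parent is not None:
            yield parent, os.path.dirname(parent) or None
        for name in sorted(files):
            if not name.lower().endswith(".md"):
                continue
            path = name if parent is None else os.path.join(parent, name)
            yield path, parent
```

Because `os.walk` is top-down, every parent is yielded before its children, which matches the order in which pages would need to be created.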

3. **FileManager** (unchanged, conceptual)

Handles file reading and potentially logging or other file output, and possibly traversing the file system.

```python
class FileManager:
    def read_file(self, path: str) -> str:
        """Read the content of a file."""
```

### Workflow Overview with Snippets

- The process starts with `Publisher`, which is initialized with necessary configurations and an instance of `ConfluenceClient`.
  
```python
publisher = Publisher(
    confluence_client=ConfluenceClient(confluence_config),
    source_directory="path/to/markdown",
    space_key="SPACEKEY",
)
publisher.publish()
```

- `Publisher.publish()` begins the process, invoking `traverse_directory()` to walk through the directory structure, processing each Markdown file by converting it to HTML.

- For each processed file, `Publisher` uses `ConfluenceClient.create_or_update_page()` to either create a new page or update an existing one in Confluence, applying a predefined label to mark the page as managed by `markdown2confluence`.

- If a page needs to be deleted or labeled, `Publisher` uses other methods of `ConfluenceClient` such as `delete_page()` and possibly `add_labels_to_page()`, so that the Confluence space stays synchronized with the source content.
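
Since deletion is the risky part of the workflow, a guard along these lines could run before any overwrite or delete. It is a sketch of the "refuse to delete pages without the title suffix" rule from the required behavior; the name `is_managed` is hypothetical, and requiring both the label and the suffix is one reading of that rule.

```python
def is_managed(title: str, labels, suffix: str, label: str) -> bool:
    """Return True only if a page is safe to overwrite or delete.

    The page must carry the managed label AND end with the title
    suffix, so a page that was only labeled by hand is left alone.
    An empty suffix never matches, since it would match every title.
    """
    if not suffix:
        return False
    return label in labels and title.rstrip().endswith(suffix)
```

For example, with the suffix '(autogenerated)', a page titled "Usage" is skipped even if someone added the managed label to it manually.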

### Conclusion

This architecture splits the work into three components with narrow interfaces: `ConfluenceClient` for Confluence API access, `Publisher` for orchestrating conversion and publishing, and `FileManager` for file access. Keeping these responsibilities separate is intended to make the tool easier to maintain and extend.