GPT Exporter — Architecture Overview

This document describes the internal structure and logic of GPT Exporter, including the Python application, the export engine, and the offline web viewer.
Level: Advanced.

GPT Exporter is intentionally lightweight and simple so beginners can understand and extend it easily.
The architecture is prepared for future introduction of licensing, feature tiers, and optional limits without breaking core logic.

------------------------------------------------------------

1. High-Level Architecture

Desktop App (Python, PySide6 GUI)
→ Starts a local HTTP server (ThreadingHTTPServer)
→ Serves HTML/CSS/JS viewer
→ Viewer loads chat JSON files
→ All data comes from chats/first/ next to the executable

------------------------------------------------------------

2. Components

2.1 Desktop GUI (PySide6)

File: app/main.py
Package: app (contains __init__.py)
Features:
- two tabs: Open, Export
- ZIP picker dialog
- export log widget
- tray icon
- browser launcher
- isolated export thread
- startup of local server

Notable functions:
- _init_tab_open
- _init_tab_export
- _do_export
- _place_window
- _make_tray
- start_server
- open_when_ready

The GUI is thin; heavy lifting is delegated to export.py and the local server.

------------------------------------------------------------

2.2 Local HTTP Server

Implemented in app/main.py using ThreadingHTTPServer.

Responsibilities:
- serve static files from the web/ folder
- serve chat data from chats/first/
- route API endpoints:
  /api/list  — returns list of chat files
  /api/lib   — returns a specific file (legacy support)

Important class: ExporterHandler
Overrides translate_path, do_GET, log_message.

The server binds only to 127.0.0.1, so it is not reachable from other machines on the network.
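
A minimal sketch of how a loopback-only server of this kind might be wired up with the standard library. The names SketchHandler and start_sketch_server are hypothetical, and the JSON array returned by /api/list is an assumed response shape. The real ExporterHandler also overrides translate_path, which is omitted here.

```python
import json
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


class SketchHandler(SimpleHTTPRequestHandler):
    """Illustrative stand-in for ExporterHandler; not the project's code."""

    chats_dir = Path("chats/first")  # assumed location relative to the cwd

    def do_GET(self):
        # Route the API endpoint first; everything else is a static file.
        if self.path.split("?", 1)[0] == "/api/list":
            names = []
            if self.chats_dir.is_dir():
                names = sorted(p.name for p in self.chats_dir.glob("*.json"))
            body = json.dumps(names).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        super().do_GET()

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def start_sketch_server(web_dir="web", port=0):
    """Serve web_dir on 127.0.0.1 in a daemon thread; port=0 picks a free port."""
    handler = partial(SketchHandler, directory=web_dir)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Binding to ("127.0.0.1", port) rather than ("", port) is what keeps the listener off external interfaces.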

------------------------------------------------------------

2.3 Export Engine (export.py)

Responsible for converting ChatGPT export archives into per-chat text-only JSON files (no attachments or images in the current Nova version).

Process:
1. Extract ZIP into a temporary folder
2. Detect available conversation sources inside the unpacked archive
3. Prefer newer archive structure based on `export_manifest.json`
4. Load conversation data from:
   - sharded files such as `conversations-000.json`, `conversations-001.json`, and similar, when present
   - legacy `conversations.json` as a fallback
5. For each chat:
   - reconstruct message chain using mapping tree
   - follow parent pointers
   - collect text content parts
   - produce clean message list
6. Save each chat as a separate JSON file containing only title, filename, and text messages
7. Delete temporary folder
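
Step 5 can be sketched as follows. This is an illustration, not the project's extract_messages: it assumes each conversation carries a `mapping` dict of nodes (each with `message` and `parent`) plus a `current_node` id, and that text lives in `message.content.parts`. The function name, the ROLE_LABELS table, and the handling of non-text parts are all assumptions.

```python
# Assumed mapping from archive roles to the labels used in output files.
ROLE_LABELS = {"user": "user", "assistant": "ChatGPT"}


def extract_messages_sketch(conversation):
    """Rebuild the visible message chain of one conversation."""
    mapping = conversation.get("mapping") or {}
    node_id = conversation.get("current_node")

    # Walk from the last node up to the root by following parent pointers.
    chain = []
    seen = set()  # guards against a malformed, cyclic mapping
    while node_id and node_id in mapping and node_id not in seen:
        seen.add(node_id)
        node = mapping[node_id]
        chain.append(node)
        node_id = node.get("parent")
    chain.reverse()  # root first, newest message last

    messages = []
    for node in chain:
        msg = node.get("message") or {}
        role = (msg.get("author") or {}).get("role")
        parts = (msg.get("content") or {}).get("parts") or []
        # Keep only plain-text parts; other part types are skipped.
        text = "\n".join(p for p in parts if isinstance(p, str)).strip()
        if role in ROLE_LABELS and text:
            messages.append({"role": ROLE_LABELS[role], "text": text})
    return messages
```

Walking upward from the current node means that abandoned branches (for example, regenerated answers) are left out of the result.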

Key functions:
- ensure_dirs
- load_json
- clean_filename
- extract_messages (core logic)

Compatibility:
- older archives with single `conversations.json`
- newer archives with `export_manifest.json`
- newer archives with sharded conversation files
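
The shard-first, legacy-fallback order might look like the sketch below. It assumes every source file holds a JSON list of conversations and does not read `export_manifest.json`, since the manifest's schema is not described here. The function name is hypothetical.

```python
import json
from pathlib import Path


def load_conversations_sketch(root):
    """Collect conversations from an unpacked archive directory."""
    root = Path(root)
    # Prefer sharded files; sorting keeps them in shard order (000, 001, ...).
    shards = sorted(root.rglob("conversations-*.json"))
    # Fall back to the single legacy file only when no shards exist.
    sources = shards or sorted(root.rglob("conversations.json"))

    conversations = []
    for path in sources:
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        if isinstance(data, list):  # assumed top-level shape
            conversations.extend(data)
    return conversations
```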

Output files look like:
{
  "title": "...",
  "filename": "...",
  "messages": [
      { "role": "user", "text": "..." },
      { "role": "ChatGPT", "text": "..." }
  ]
}
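
A file in this shape could be written as in the sketch below. Both helper names are hypothetical, and the sanitizing rules and the `<index>.<name>.json` naming pattern are assumptions modeled on the example filenames in section 4. The project's own clean_filename may behave differently.

```python
import json
import re
from pathlib import Path


def clean_filename_sketch(title, fallback="untitled"):
    """Reduce a chat title to a conservative, filesystem-safe slug."""
    name = re.sub(r"[^\w\- ]+", "", title).strip()  # drop punctuation
    name = re.sub(r"\s+", "-", name).lower()        # spaces become hyphens
    return name[:80] or fallback


def save_chat_sketch(out_dir, index, title, messages):
    """Write one chat as a standalone JSON file and return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    filename = f"{index}.{clean_filename_sketch(title)}.json"
    payload = {"title": title, "filename": filename, "messages": messages}
    path = out_dir / filename
    path.write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```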

------------------------------------------------------------

3. Web Viewer

Files:
- gpt-exporter.html
- styles.css
- scripts.js
- favicon.ico

Runs in the browser but loads everything locally.
No online content.

Viewer features:
- deep-linking for every message
- scroll tracking
- automatic hash updates
- message rendering
- code formatting
- message copy buttons
- sidebar with sorted chat list

------------------------------------------------------------

3.1 Deep Linking Logic

The viewer uses IntersectionObserver to detect the “top visible” message.
It updates the URL hash automatically:

v=chat
f=<filename>
m=<message_index>

When the page is reloaded, the viewer reads the hash and scrolls to the exact message.
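
The viewer does this in scripts.js. The round trip is illustrated here in Python to match the rest of the codebase. The sketch assumes the three parameters are joined with `&` in query-string style, which this document does not state. Both function names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def build_hash_sketch(filename, message_index):
    """Compose a deep-link fragment for one message of one chat."""
    return "#" + urlencode({"v": "chat", "f": filename, "m": message_index})


def parse_hash_sketch(fragment):
    """Return (filename, message_index), or None if this is not a chat link."""
    params = parse_qs(fragment.lstrip("#"))
    if params.get("v", [""])[0] != "chat" or "f" not in params:
        return None
    try:
        index = int(params.get("m", ["0"])[0])
    except ValueError:
        index = 0  # fall back to the first message on a malformed index
    return params["f"][0], max(index, 0)
```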

------------------------------------------------------------

3.2 Scroll Tracking

scripts.js implements scroll-based message tracking for deep links.
Scroll updates are efficient and optimized.

------------------------------------------------------------

3.3 Chat Rendering Process

1. fetchChatFile loads file contents
2. loadChat renders JSON chat files produced by export.py
3. createMessageElement builds DOM nodes
4. formatText applies inline code, fenced blocks, bold, italic, and special JSON edit blocks

Renderer supports:
- JSON chats produced by export.py

Note:
- the sidebar filename matcher can still recognize legacy `.txt` filenames if such files are present in `chats/first/`
- actual chat rendering in v0.1-nova expects JSON chat files

------------------------------------------------------------

4. Data Model

GPT Exporter uses simple filesystem-based storage next to the executable:

chats/
  first/
    1.example.json
    2.other-chat.json
    img/
    lib/

.gpt-exporter/
  temp/
  export-stage/
  export-backup/

Each chat is independent.
No database is used.
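
Because storage is just a folder, listing the available chats is a directory scan. The sketch below orders files by their leading number, as the example names above suggest. That ordering rule and the function name are assumptions, not the project's confirmed behavior.

```python
from pathlib import Path


def list_chats_sketch(chats_dir):
    """Return chat filenames, numbered files first in numeric order."""

    def sort_key(path):
        prefix = path.name.split(".", 1)[0]
        if prefix.isascii() and prefix.isdigit():
            return (0, int(prefix), path.name)
        return (1, 0, path.name)  # unnumbered files go last, alphabetically

    # Only regular *.json files count as chats; img/ and lib/ are skipped.
    files = [p for p in Path(chats_dir).glob("*.json") if p.is_file()]
    return [p.name for p in sorted(files, key=sort_key)]
```

Sorting on the parsed integer keeps `10.x.json` after `2.b.json`, which a plain string sort would not.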

------------------------------------------------------------

5. Application Lifecycle

Start App
↓
Create GUI
↓
Start local HTTP server
↓
User selects ZIP
↓
export.py processes chats
↓
Chats saved to chats/first
↓
User clicks "Open local viewer"
↓
Browser viewer loads
↓
User navigates chats locally

------------------------------------------------------------

6. Security Model

- No external network traffic for user data processing
- Local viewer is served only on 127.0.0.1
- All data stored locally
- No telemetry or analytics by default
- No cloud usage
- Offline by design

------------------------------------------------------------

7. Extending the Project

The architecture was intentionally built to be easy to extend.

Possible extensions:
- search inside a chat
- filtering or tagging
- exporting to TXT, HTML, Markdown
- dark mode
- improved sidebar sorting
- additional metadata extraction

------------------------------------------------------------

8. Summary

GPT Exporter uses a clean, transparent, beginner-friendly architecture:
Python GUI → Local Server → Web Viewer → JSON Files.
No cloud, no database, and only one external Python dependency in the current version: PySide6.
The current exporter supports both legacy and newer official ChatGPT archive structures while keeping the same user-facing workflow and output format.
The design is simple enough to understand after reading this file.
The architecture is designed to support future extensions such as licensing and feature tiers.

------------------------------------------------------------