exc-to-pdf

Excel to PDF converter optimized for Google NotebookLM analysis.

🎯 Overview

exc-to-pdf is a Python tool that converts Excel files (.xlsx) into PDF documents specifically optimized for AI analysis with Google NotebookLM. The tool preserves all data, maintains structure, and creates navigation-friendly PDFs that AI systems can effectively analyze.

Key Features

🚀 Quick Start

Installation

pip install exc-to-pdf

From Source

# Clone the repository
git clone https://github.com/fulvian/exc-to-pdf.git
cd exc-to-pdf

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in development mode
pip install -e .

Basic Usage

# Convert Excel to PDF
exc-to-pdf input.xlsx output.pdf

# With options
exc-to-pdf input.xlsx output.pdf --bookmarks --preserve-formatting

# Python module alternative
python -m exc_to_pdf input.xlsx output.pdf
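
The same command can be driven from a Python script. The sketch below is only an illustration: it assumes the `python -m exc_to_pdf` entry point and the two flags shown above, and the helper names `build_command` and `convert` are hypothetical, not part of the package.

```python
import subprocess
import sys


def build_command(input_path, output_path, bookmarks=False, preserve_formatting=False):
    """Assemble the argument list for the module entry point shown above."""
    command = [sys.executable, "-m", "exc_to_pdf", str(input_path), str(output_path)]
    if bookmarks:
        command.append("--bookmarks")
    if preserve_formatting:
        command.append("--preserve-formatting")
    return command


def convert(input_path, output_path, **options):
    """Run the converter; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(input_path, output_path, **options), check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.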

Python API

from exc_to_pdf.excel_processor import ExcelProcessor
from exc_to_pdf.pdf_generator import PDFGenerator

# Process Excel file
processor = ExcelProcessor("input.xlsx")
sheets_data = processor.extract_all_sheets()

# Generate PDF
generator = PDFGenerator()
generator.create_pdf(sheets_data, "output.pdf")
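
To convert a whole folder of workbooks, the two calls above can be wrapped in a loop. This is a minimal sketch, assuming `ExcelProcessor` and `PDFGenerator` behave exactly as in the snippet above; `plan_outputs` and `convert_directory` are hypothetical helper names.

```python
from pathlib import Path


def plan_outputs(source_dir, target_dir):
    """Pair every .xlsx file in source_dir with a .pdf path in target_dir."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    return [
        (xlsx, target_dir / (xlsx.stem + ".pdf"))
        for xlsx in sorted(source_dir.glob("*.xlsx"))
    ]


def convert_directory(source_dir, target_dir):
    """Convert each planned pair using the API calls shown above."""
    # Imported lazily so plan_outputs stays usable without the package installed.
    from exc_to_pdf.excel_processor import ExcelProcessor
    from exc_to_pdf.pdf_generator import PDFGenerator

    Path(target_dir).mkdir(parents=True, exist_ok=True)
    for xlsx, pdf in plan_outputs(source_dir, target_dir):
        sheets_data = ExcelProcessor(str(xlsx)).extract_all_sheets()
        # A fresh generator per file; whether one instance can be reused is not documented here.
        PDFGenerator().create_pdf(sheets_data, str(pdf))
```

Keeping the path planning separate from the conversion makes it easy to preview which files would be written before running anything.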

📋 Requirements

Core Dependencies

🏗️ Project Structure

exc-to-pdf/
├── src/                    # Source code
│   ├── excel_processor.py  # Excel reading logic
│   ├── pdf_generator.py    # PDF generation
│   ├── table_detector.py   # Table identification
│   └── main.py             # CLI interface
├── tests/                  # Test suite
│   ├── unit/               # Unit tests
│   ├── integration/        # Integration tests
│   └── fixtures/           # Test data
├── docs/                   # Documentation
│   ├── idee_fondanti/      # Foundational documents
│   └── api/                # API documentation
├── scripts/                # Utility scripts
└── requirements.txt        # Dependencies

🔄 Development Workflow

This project follows the DevStream 7-Step Workflow:

  1. DISCUSS - Requirements analysis and planning
  2. ANALYZE - Technical analysis and research
  3. RESEARCH - Context7 and best practices research
  4. PLAN - Implementation planning
  5. APPROVE - Architecture validation
  6. IMPLEMENT - Code development
  7. VERIFY - Testing and validation

Current Development Phase

Phase: P1 - Project Foundation ✅
Next: P2 - Excel Processing Engine

See docs/idee_fondanti/piano_fondante_exc-to-pdf.md for the complete development plan.

🧪 Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=src

# Run specific test file
pytest tests/unit/test_excel_processor.py

📊 Architecture

Data Flow

Excel File → openpyxl parsing → pandas processing → reportlab rendering → PDF Output

Key Components

  1. ExcelProcessor: Reads and parses Excel files
  2. TableDetector: Identifies table structures
  3. PDFGenerator: Creates structured PDF output
  4. BookmarkManager: Adds navigation elements
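
As an illustration of what table identification can involve, the sketch below splits a sheet into blocks of consecutive non-empty rows, treating blank rows as separators. This heuristic is an assumption made for the example, not a description of how TableDetector works, and `detect_tables` is a hypothetical name.

```python
def detect_tables(rows):
    """Return (start, end) row index pairs, end exclusive, for each non-empty block."""

    def is_blank(row):
        return all(cell is None or str(cell).strip() == "" for cell in row)

    blocks = []
    start = None
    for index, row in enumerate(rows):
        if is_blank(row):
            # A blank row closes the block that is currently open, if any.
            if start is not None:
                blocks.append((start, index))
                start = None
        elif start is None:
            start = index
    if start is not None:
        blocks.append((start, len(rows)))
    return blocks


sheet = [("name", "qty"), ("widget", 3), (None, None), ("region",), ("north",)]
print(detect_tables(sheet))  # [(0, 2), (3, 5)]
```

Each pair can then be used to slice the original rows into separate tables.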

🎯 Google NotebookLM Optimization

The PDF output is specifically designed for AI analysis:

📝 Development Status

🤝 Contributing

  1. Follow the DevStream workflow
  2. Maintain 95%+ test coverage
  3. Use type hints and docstrings
  4. Pass code review validation

📄 License

MIT License - see LICENSE file for details.


Built with ❤️ using DevStream framework