Status: Completed

PDF Bot

A Discord bot that automatically converts PDF files to images for easy viewing

Python · Discord.py · pdf2image · Pillow · asyncio

Overview

A Discord bot that automatically converts uploaded PDF files into images, making PDFs easily viewable directly within Discord. When a user uploads a PDF, the bot creates a dedicated thread and posts each page as an image.

Features

  • Automatic Detection: Monitors channels for PDF uploads and processes them automatically
  • Multi-File Support: Handles multiple PDFs uploaded in a single message
  • Threaded Output: Creates organized threads to keep converted images together
  • Parallel Processing: Converts multiple PDFs simultaneously for faster results
  • Rate Limiting: Per-guild semaphores cap the number of concurrent conversions, so one busy server cannot overload the system
  • Image Optimization: Automatically resizes and compresses images for Discord's file limits
  • Batch Uploads: Posts images in batches of 10 (Discord's per-message attachment limit), labeled with page numbers
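
As a rough illustration of the detection step, the sketch below filters a message's attachments down to the PDFs worth processing. The Attachment dataclass is a hypothetical stand-in that only mirrors the filename, content_type, and size fields of an uploaded file; is_pdf and select_pdfs are illustrative names, and the bot's actual checks may differ.

```python
from dataclasses import dataclass
from typing import List, Optional

# 25 MB, matching the configured maximum file size.
MAX_FILE_SIZE = 25 * 1024 * 1024


@dataclass
class Attachment:
    """Hypothetical stand-in for an uploaded file."""
    filename: str
    content_type: Optional[str]
    size: int  # bytes


def is_pdf(att: Attachment) -> bool:
    """Treat a file as a PDF if its MIME type or its extension says so."""
    if att.content_type:
        mime = att.content_type.split(";")[0].strip().lower()
        if mime == "application/pdf":
            return True
    return att.filename.lower().endswith(".pdf")


def select_pdfs(attachments: List[Attachment]) -> List[Attachment]:
    """Keep only PDFs that fit under the size limit, in upload order."""
    return [a for a in attachments if is_pdf(a) and a.size <= MAX_FILE_SIZE]
```

Checking both the MIME type and the extension is a defensive choice here, since the content type can be missing on some uploads.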

How It Works

  1. User uploads one or more PDF files to any channel
  2. Bot detects the PDF attachments and creates a thread
  3. PDFs are downloaded and converted to images in parallel
  4. Images are optimized for Discord (max 2048px, compressed PNG)
  5. Pages are posted to the thread in batches with page numbers
  6. Temporary files are cleaned up after processing
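
Step 5 can be sketched as a small generator that groups page images into batches of 10 and builds a page-number label for each batch. The function name batch_pages and the label wording are assumptions made for illustration, not the bot's actual output format.

```python
from typing import Iterator, List, Tuple

# Discord accepts at most 10 attachments per message.
BATCH_SIZE = 10


def batch_pages(
    page_files: List[str], batch_size: int = BATCH_SIZE
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (label, files) pairs, one pair per message to post."""
    total = len(page_files)
    for start in range(0, total, batch_size):
        chunk = page_files[start:start + batch_size]
        first, last = start + 1, start + len(chunk)
        # A batch holding a single page gets a singular label.
        label = f"Page {first}" if first == last else f"Pages {first}-{last}"
        yield f"{label} of {total}", chunk
```

Each yielded pair maps onto one message in the thread: the label as the message text and the chunk as its attachments.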

Commands

  • pdf!help - Display usage information and bot features
  • pdf!status - Show bot statistics and current processing status
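
For illustration, the pdf! prefix could be recognized with a small parser like the one below. This is only a sketch: parse_command is a hypothetical helper, and a discord.py bot would more typically rely on the library's commands extension with a command prefix instead of parsing message text by hand.

```python
from typing import Optional

PREFIX = "pdf!"


def parse_command(content: str) -> Optional[str]:
    """Return the lowercase command name, or None if this is not a command."""
    text = content.strip()
    if not text.lower().startswith(PREFIX):
        return None
    parts = text[len(PREFIX):].split()
    # A bare "pdf!" with nothing after it is not a command.
    return parts[0].lower() if parts else None
```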

Technologies Used

  • Runtime: Python 3.10+
  • Discord API: discord.py for bot functionality
  • PDF Processing: pdf2image with Poppler backend
  • Image Processing: Pillow for optimization and compression
  • Async HTTP: aiohttp for efficient file downloads
  • Concurrency: asyncio with ThreadPoolExecutor for non-blocking PDF conversion

Technical Highlights

Concurrent Processing

Uses asyncio.gather() with per-guild semaphores to process multiple PDFs in parallel while preventing resource exhaustion. Each guild gets its own semaphore, so a busy server does not consume other guilds' conversion slots.
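
A minimal sketch of this pattern, assuming a limit of 3 concurrent conversions per guild as listed under Configuration. GuildLimiter, convert_pdf, and process_upload are hypothetical names, and asyncio.sleep stands in for the real conversion work.

```python
import asyncio
from typing import Dict, List

MAX_CONCURRENT_PER_GUILD = 3


class GuildLimiter:
    """Hands out one semaphore per guild ID, created on first use."""

    def __init__(self, limit: int = MAX_CONCURRENT_PER_GUILD) -> None:
        self._limit = limit
        self._semaphores: Dict[int, asyncio.Semaphore] = {}

    def for_guild(self, guild_id: int) -> asyncio.Semaphore:
        if guild_id not in self._semaphores:
            self._semaphores[guild_id] = asyncio.Semaphore(self._limit)
        return self._semaphores[guild_id]


async def convert_pdf(name: str) -> str:
    """Placeholder for the real conversion."""
    await asyncio.sleep(0.01)
    return f"{name}: converted"


async def process_upload(
    limiter: GuildLimiter, guild_id: int, pdf_names: List[str]
) -> List[str]:
    """Convert all PDFs from one message, at most `limit` at a time."""
    sem = limiter.for_guild(guild_id)

    async def guarded(name: str) -> str:
        async with sem:  # waits here while the guild is at its limit
            return await convert_pdf(name)

    # gather() returns results in the same order as its arguments.
    return await asyncio.gather(*(guarded(n) for n in pdf_names))
```

Because every task is started by gather() but must acquire the guild's semaphore before converting, a message with many PDFs queues behind its own guild's limit rather than behind another guild's.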

Image Optimization Pipeline

  1. Convert PDF pages at 200 DPI for quality/size balance
  2. Resize images exceeding 2048px while maintaining aspect ratio
  3. Apply additional compression if file size exceeds 8MB
  4. Save as optimized PNG with compression level 6
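
The arithmetic behind steps 1 and 2 can be sketched without any imaging library. The helpers page_pixels and fit_within are hypothetical names; in the bot, the resulting size would presumably be handed to Pillow for the actual resize.

```python
from typing import Tuple

CONVERSION_DPI = 200
MAX_DIMENSION = 2048


def page_pixels(
    width_in: float, height_in: float, dpi: int = CONVERSION_DPI
) -> Tuple[int, int]:
    """Pixel size of a page rendered at the given DPI (inches * dpi)."""
    return round(width_in * dpi), round(height_in * dpi)


def fit_within(
    width: int, height: int, max_dim: int = MAX_DIMENSION
) -> Tuple[int, int]:
    """Scale (width, height) so the longest side is at most max_dim."""
    longest = max(width, height)
    if longest <= max_dim:
        return width, height  # already small enough, leave untouched
    scale = max_dim / longest
    # Both sides use the same factor, which preserves the aspect ratio.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a US Letter page (8.5 x 11 in) rendered at 200 DPI is 1700 x 2200 px, which exceeds the 2048 px cap and would be scaled down to 1583 x 2048 px.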

Resource Management

  • ThreadPoolExecutor runs the blocking PDF conversion off the event loop
  • Automatic cleanup of temporary files after processing
  • Reusable aiohttp session for efficient downloads
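
The first two points can be combined in one sketch: a blocking conversion runs in a worker thread through run_in_executor, and a temporary directory removes all intermediate files when the block exits. convert_blocking is a placeholder that writes an empty file where the real bot would call pdf2image, and convert_with_cleanup is a hypothetical name.

```python
import asyncio
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List

WORKER_THREADS = 2
executor = ThreadPoolExecutor(max_workers=WORKER_THREADS)


def convert_blocking(pdf_path: Path, out_dir: Path) -> List[Path]:
    """Placeholder for the blocking conversion; writes one fake page."""
    page = out_dir / f"{pdf_path.stem}_page_1.png"
    page.write_bytes(b"")
    return [page]


async def convert_with_cleanup(pdf_bytes: bytes, name: str) -> List[str]:
    """Convert in a worker thread; temp files vanish when the block exits."""
    loop = asyncio.get_running_loop()
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        pdf_path = tmp_dir / name
        pdf_path.write_bytes(pdf_bytes)
        # The event loop stays free to handle other messages while this runs.
        pages = await loop.run_in_executor(
            executor, convert_blocking, pdf_path, tmp_dir
        )
        # Uploading would happen here, before the directory is deleted.
        return [p.name for p in pages]
```

Using the context manager means cleanup also happens when conversion raises, which avoids leaking files after a failed upload.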

Configuration

Key configurable settings:

  • Max file size: 25MB (Discord's free tier limit)
  • Conversion DPI: 200
  • Max image dimension: 2048px
  • Max concurrent conversions per guild: 3
  • Worker threads: 2
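
One way to keep these settings together is a frozen dataclass holding the defaults listed above. The class and field names are assumptions for illustration; the bot may store its configuration differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BotConfig:
    """Hypothetical container for the settings listed above."""
    max_file_size_mb: int = 25
    conversion_dpi: int = 200
    max_image_dimension: int = 2048
    max_concurrent_per_guild: int = 3
    worker_threads: int = 2

    @property
    def max_file_size_bytes(self) -> int:
        """Size limit in bytes, for comparing against attachment sizes."""
        return self.max_file_size_mb * 1024 * 1024
```

A deployment with less memory could then override a single value, for example BotConfig(conversion_dpi=150), and leave the rest at their defaults.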

Requirements

  • Python 3.10+
  • Poppler (for pdf2image)
  • Discord bot token with message content intent