Welcome to Text Extractor, a powerful Python-based desktop application built with PyQt6, designed to extract and consolidate text from multiple files into a single output file. This tool is ideal for students, researchers, developers, and anyone needing to efficiently manage and merge text content from various sources, with advanced features like duplicate removal, customizable output formatting, and a modern, themeable interface.
- Features
- Supported File Formats
- Installation
- Usage
- Screenshots
- Releases
- Support
- Contributing
- Security
- License
- Dependencies
- File Selection and Management: Easily select, add, sort, and clear multiple files for text extraction.
- Duplicate Removal: Use hash-based detection to eliminate duplicate files, ensuring a clean output.
- Customizable Output: Add line numbers, timestamps, and file names to the consolidated output file.
- Encoding Support: Choose from multiple encoding options (UTF-8, ASCII, Latin-1, UTF-16) for file reading and writing.
- File Preview: Preview file contents in the app before processing, with customizable preview length.
- Search Functionality: Search within files to filter the file list based on content.
- File Information: View detailed file metadata, including size and last modified date.
- Themeable Interface: Switch between dark and light themes for a personalized user experience.
- Progress Monitoring: Track extraction progress with a progress bar and status updates.
- Automatic Updates: Check for new versions on startup with optional notifications.
- Cross-Platform UI: Built with PyQt6 for a modern, intuitive interface compatible with Windows, macOS, and Linux.
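The hash-based duplicate removal mentioned above can be illustrated with a short sketch. This is not the application's actual code: the function names and the choice of SHA-256 are assumptions made for the example, and the only idea taken from the feature list is that files with identical content hashes are treated as duplicates.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    hasher = hashlib.sha256()  # assumption: the app may use a different hash
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def remove_duplicates(paths: list[Path]) -> list[Path]:
    """Keep only the first file seen for each distinct content hash."""
    seen: set[str] = set()
    unique: list[Path] = []
    for path in paths:
        digest = file_digest(path)
        if digest not in seen:
            seen.add(digest)
            unique.append(path)
    return unique
```

Reading in chunks keeps memory use flat for large files, and the original selection order is preserved because only later copies are dropped.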
Text Extractor supports the following readable file formats:
- Text Files: .txt
- Java Files: .java
- Python Files: .py
- Markdown Files: .md
- HTML Files: .html
- CSV Files: .csv
- XML Files: .xml
- JSON Files: .json
- Rich Text Format: .rtf
- Microsoft Word Documents: .docx
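As a rough illustration of how a selection could be limited to these formats, the sketch below filters paths by extension. The constant and function names are hypothetical, and the snippet only checks suffixes; it does not show how the application parses formats such as .rtf or .docx.

```python
from pathlib import Path

# Extensions copied from the list above; the names below are hypothetical.
SUPPORTED_EXTENSIONS = {
    ".txt", ".java", ".py", ".md", ".html",
    ".csv", ".xml", ".json", ".rtf", ".docx",
}


def filter_supported(paths):
    """Return only the paths whose extension is in the supported set."""
    return [
        Path(p) for p in paths
        if Path(p).suffix.lower() in SUPPORTED_EXTENSIONS
    ]
```

The comparison is case-insensitive, so a file named NOTES.TXT is accepted as well.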
Text Extractor is packaged using Briefcase, making it easy to run or distribute as a native application across platforms. You can either build from source or use pre-compiled binaries where available.
- Ensure you have Python 3.9+ installed on your system (Windows, macOS, or Linux).
- Clone this repository:
  git clone https://github.com/VoxDroid/Text-Extractor.git
  cd Text-Extractor
- Install Briefcase and dependencies:
  pip install briefcase
  pip install -r requirements.txt
- Initialize the Briefcase project (if not already set up):
  briefcase create
- Build the application:
  - Windows: briefcase build windows
  - macOS: briefcase build macos
  - Linux: briefcase build linux
- Run the application:
  - Windows: briefcase run windows
  - macOS: briefcase run macos
  - Linux: briefcase run linux
- Windows: Download the latest .exe (portable) or .msi (installer), tagged with [W] for Windows, from the Releases section. Run the MSI installer or use the portable version for no-setup runs.
- macOS: Download the latest universal .dmg (x86_64 and Apple Silicon), tagged with [M] for macOS, from the Releases section. Open the DMG, drag the app to Applications, and launch it.
- Linux: Download the latest .rpm (for Fedora/Red Hat), .deb (for Debian/Ubuntu), or .pkg.tar.zst (for Arch/Pacman), tagged with [L] for Linux, from the Releases section. Run the installer and launch the app.
Upon launching, you’ll see the main interface featuring three tabs: Extractor, Settings, and Help. The in-app Help tab contains a comprehensive user manual.
- Launch the application and explore the Extractor tab to begin selecting files.
- Configure settings such as encoding, duplicate removal, and output formatting in the Settings tab.
- Refer to the Help tab for detailed instructions and examples.
- Purpose (Extractor tab): Select and process files for text extraction.
- How to Use:
  - Click "Select Files" to choose files or "Add Files" to append more files.
  - Use "Sort" to organize the file list or "Clear" to reset it.
  - Click a file to preview its contents (up to the specified preview length).
  - Use "Search in Files" to filter files by content.
  - Click "File Info" to view metadata for all selected files.
  - Specify an output filename and click "Save As" to consolidate the text into a single file.
  - Monitor progress with the progress bar and cancel if needed.
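To make the consolidation step more concrete, here is a minimal sketch of merging several files into one output with optional file names, line numbers, and a timestamp. The function name, parameter names, and the exact header and prefix formats are assumptions for illustration; the application's real output layout may differ.

```python
from datetime import datetime
from pathlib import Path


def consolidate(paths, output_path, *, line_numbers=False, timestamp=False,
                file_names=True, encoding="utf-8"):
    """Write the text of each input file into a single output file."""
    with open(output_path, "w", encoding=encoding) as out:
        if timestamp:
            # Hypothetical header format; one timestamp for the whole run.
            stamp = datetime.now().isoformat(timespec="seconds")
            out.write(f"Generated: {stamp}\n")
        for path in map(Path, paths):
            if file_names:
                out.write(f"--- {path.name} ---\n")
            text = path.read_text(encoding=encoding)
            # Line numbers restart at 1 for every input file.
            for number, line in enumerate(text.splitlines(), start=1):
                prefix = f"{number}: " if line_numbers else ""
                out.write(f"{prefix}{line}\n")
```

A call such as `consolidate(["a.txt", "b.txt"], "merged.txt", line_numbers=True)` would write each file under its own name header, numbering the lines of each file separately.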
- Purpose (Settings tab): Customize the application’s behavior.
- How to Use:
  - Enable or disable options like duplicate removal, line numbers, and timestamps.
  - Select the desired encoding (e.g., UTF-8, ASCII).
  - Set the preview length for file previews.
  - Switch between dark and light themes.
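The encoding and preview-length settings can be pictured with a small sketch like the one below. The mapping, function name, and default length are hypothetical; the codec names themselves are standard Python codec identifiers for the four encodings the README lists.

```python
# Hypothetical mapping from the labels named in this README to Python codecs.
ENCODINGS = {
    "UTF-8": "utf-8",
    "ASCII": "ascii",
    "Latin-1": "latin-1",
    "UTF-16": "utf-16",
}


def preview(path, encoding_label="UTF-8", max_chars=500):
    """Return at most max_chars characters of a file for display."""
    codec = ENCODINGS[encoding_label]
    # errors="replace" is an assumption: undecodable bytes are shown as a
    # replacement character instead of raising an exception.
    with open(path, "r", encoding=codec, errors="replace") as handle:
        return handle.read(max_chars)
```

Because the file is opened in text mode, `max_chars` counts decoded characters rather than bytes, so a multi-byte character is never cut in half.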
- Purpose (Help tab): Access the embedded user manual.
- How to Use: Navigate to the Help tab to read detailed guides, view example output, and find support information.
Here are previews of the main tabs in Text Extractor:
- Windows: Pre-compiled .exe available in the Releases section.
- macOS: Pre-compiled universal .dmg (x86_64 and Apple Silicon) available in the Releases section.
- Linux: Pre-compiled .rpm (for Fedora/Red Hat), .deb (for Debian/Ubuntu), or .pkg.tar.zst (for Arch/Pacman) available in the Releases section.
- Check release notes for details on new features, bug fixes, and version updates.
- Building from source with Briefcase remains the primary method and supports all platforms with proper setup.
For ways to get help, report issues, or support the project’s development, please see the Support page.
Text Extractor is open-source, and contributions are encouraged! Please read our Contributing Guidelines, Code of Conduct, and Security Policy before submitting issues or pull requests. Use the appropriate issue templates for reporting bugs, suggesting features, or other contributions, and the Pull Request template for code submissions.
If you discover a security vulnerability, please follow our Security Policy by emailing izeno.contact@gmail.com or using the Security Report issue template for non-sensitive issues.
This project is licensed under the MIT License. Use, modify, and distribute it freely per the license terms.
To build from source, install the following Python packages:
- PyQt6 (for the GUI)
- requests (for HTTP requests)
- packaging (for version parsing)
- qtawesome (for icons)
- briefcase (for packaging the app)
Create a requirements.txt file with these dependencies and run pip install -r requirements.txt.
Developed by VoxDroid
GitHub | Ko-fi