by MDGrey33
535 days ago
Hi everyone!
I’m excited to announce PyVisionAI, an evolution of the project formerly known as Content Extractor with Vision LLM. Now installable with pip or Poetry, it’s a Python library and CLI tool that extracts text and images from documents and describes images using Vision Language Models.
Key Features
Dual Functionality: Use as a CLI tool or integrate as a Python library.
File Extraction: Process PDF, DOCX, and PPTX files to extract text and images.
Image Descriptions: Generate descriptions using local models (Ollama's llama3.2-vision) or cloud models (OpenAI GPT-4 Vision).
Markdown Output: Save results in neatly formatted Markdown files.
Quick Start
Install via pip:
    pip install pyvisionai
Extract content from a file:
    file-extract -t pdf -s path/to/file.pdf -o output_dir
Describe an image:
    describe-image -i path/to/image.jpg
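If you want to drive the CLI from a script, here is a minimal sketch that builds the two commands shown above and runs them with `subprocess`. It only uses the flags from the Quick Start (`-t`, `-s`, `-o`, `-i`). The helper names (`build_extract_command`, `build_describe_command`, `run`) are made up for illustration and are not part of PyVisionAI, and passing `docx` or `pptx` as the `-t` value is an assumption based on the supported formats listed under Key Features.

```python
import subprocess
from pathlib import Path

# File types named in the post. Using the extension as the -t value for
# docx/pptx is an assumption; the post only shows "-t pdf".
SUPPORTED_TYPES = {"pdf", "docx", "pptx"}


def build_extract_command(source: str, output_dir: str) -> list[str]:
    """Build the file-extract command line for one document (hypothetical helper)."""
    file_type = Path(source).suffix.lower().lstrip(".")
    if file_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported file type: {file_type!r}")
    return ["file-extract", "-t", file_type, "-s", source, "-o", output_dir]


def build_describe_command(image: str) -> list[str]:
    """Build the describe-image command line for one image (hypothetical helper)."""
    return ["describe-image", "-i", image]


def run(command: list[str]) -> int:
    """Run a command and return its exit code. Requires pyvisionai to be installed."""
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Print the commands instead of running them, so this works without the tool.
    print(build_extract_command("path/to/file.pdf", "output_dir"))
    print(build_describe_command("path/to/image.jpg"))
```

Building the argument list separately from running it keeps the commands easy to inspect, and passing a list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.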
Repo & Contribution
GitHub: https://github.com/MDGrey33/pyvisionai
Whether you’re working with complex documents or image-rich data, PyVisionAI simplifies the process. Try it out and share your feedback—I’d love to hear your thoughts!