Show HN: Phi-3-MLX – Language and Vision Models for Apple Silicon



3 points by JosefAlbers 740 days ago

Phi-3-MLX is an open-source framework that runs the latest Phi-3 models on Apple Silicon using MLX. It supports both the Phi-3-Mini-128K language model (updated July 2, 2024) and the Phi-3-Vision multimodal model, covering text-only tasks as well as tasks that combine images and text.

Key features:

1. Apple Silicon Optimization: Leverages MLX for efficient execution on Apple hardware.

2. Flexible Model Usage:
   - Phi-3-Mini-128K for language tasks
   - Phi-3-Vision for multimodal capabilities
   - Seamless switching between language-only and multimodal tasks

3. Advanced Generation Techniques:
   - Batched generation for multiple prompts
   - Constrained (beam search) decoding for structured outputs

4. Customization Options:
   - Model and cache quantization for resource optimization
   - (Q)LoRA fine-tuning for task-specific adaptation

5. Versatile Agent System:
   - Multi-turn conversations
   - Code generation and execution
   - External API integration (e.g., image generation, text-to-speech)

6. Extensible Toolchains:
   - In-context learning
   - Retrieval Augmented Generation (RAG)
   - Multi-agent interactions
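
To give a rough idea of what "constrained (beam search) decoding" in item 3 means, here is a minimal toy sketch in plain Python. It is not Phi-3-MLX's implementation and does not use its API: `constrained_beam_search`, `toy_logprobs`, and `yes_no_constraint` are hypothetical names, and the "model" is a hard-coded lookup table standing in for a real language model. The idea shown is that only tokens permitted by a constraint are ever expanded, while several partial outputs (beams) are kept so that a weaker first token can still win on total score.

```python
import math
from typing import Callable, Dict, List, Tuple

Token = str
Beam = Tuple[float, List[Token]]  # (cumulative log-probability, tokens so far)


def constrained_beam_search(
    next_logprobs: Callable[[List[Token]], Dict[Token, float]],
    allowed: Callable[[List[Token]], List[Token]],
    beam_width: int = 2,
    max_steps: int = 4,
    eos: Token = "<eos>",
) -> List[Token]:
    """Return the best-scoring sequence that only uses allowed tokens."""
    beams: List[Beam] = [(0.0, [])]
    for _ in range(max_steps):
        candidates: List[Beam] = []
        for score, tokens in beams:
            if tokens and tokens[-1] == eos:
                candidates.append((score, tokens))  # finished beams carry over
                continue
            logprobs = next_logprobs(tokens)
            for tok in allowed(tokens):  # the constraint: skip everything else
                if tok in logprobs:
                    candidates.append((score + logprobs[tok], tokens + [tok]))
        if not candidates:
            break  # nothing satisfies the constraint; keep the previous beams
        # Keep only the beam_width highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda b: b[0], reverse=True)[:beam_width]
        if all(t and t[-1] == eos for _, t in beams):
            break
    return beams[0][1]


def toy_logprobs(prefix: List[Token]) -> Dict[Token, float]:
    """Hypothetical stand-in for a language model's next-token distribution."""
    if not prefix:
        probs = {"Sure": 0.25, "yes": 0.40, "no": 0.35}
    elif prefix[-1] == "yes":
        probs = {"<eos>": 0.5, "but": 0.5}
    elif prefix[-1] == "no":
        probs = {"<eos>": 0.9, "but": 0.1}
    else:
        probs = {"<eos>": 1.0}
    return {tok: math.log(p) for tok, p in probs.items()}


def yes_no_constraint(prefix: List[Token]) -> List[Token]:
    """Hypothetical output format: exactly one of yes/no, then end of sequence."""
    return ["yes", "no"] if not prefix else ["<eos>"]


print(constrained_beam_search(toy_logprobs, yes_no_constraint, beam_width=2))
# ['no', '<eos>']
```

With `beam_width=1` the same call degenerates to greedy decoding and returns `['yes', '<eos>']`, because "yes" looks better at the first step even though "no" followed by `<eos>` has the higher total probability (0.315 vs. 0.2).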

The framework's flexibility unlocks new potential for AI development on Apple Silicon. Some unique aspects include:

- Easy switching between language-only and multimodal tasks
- Custom toolchains for specialized workflows
- Integration with external APIs for extended functionality

Phi-3-MLX aims to provide a user-friendly interface for a wide range of AI tasks, from text generation to visual question answering and beyond.

GitHub: https://github.com/JosefAlbers/Phi-3-Vision-MLX
Documentation: https://josefalbers.github.io/Phi-3-Vision-MLX/

I would love to hear your thoughts on potential applications for this framework and any suggestions for additional features or integrations.