Best Open Source Multi-Modal Libraries
A curated list of the most popular GitHub repositories tagged "multi-modal". Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.
#1 OpenBMB/MiniCPM-o
A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone
#2 OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
#3 activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
#4 modelscope/modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
#5 enricoros/big-AGI
AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. Includes AI personas, AGI functions, world-class Beam multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, and much more. Deploy on-prem or in the cloud.
#6 zai-org/CogVLM
A state-of-the-art open visual language model | Multimodal pre-trained model