llava-vl.github.io - LLaVA

Description: Visual Instruction Tuning

multimodal chatbot (10)

Example domain paragraphs

Instruction tuning large language models (LLMs) using machine-generated instruction-following data has improved zero-shot capabilities on new tasks in the language domain, but the idea is less explored in the multimodal field. Multimodal Instruct Data. We present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data. LLaVA Model. We introduce LLaVA (Large Language-and-Vision Assistant), an end-to-end trained large multimodal model that connects a vision encoder and an LLM for general-purpose visual and language understanding.
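To make the data-generation idea concrete, here is a minimal sketch (not the authors' pipeline) of how a language-only model can be asked to produce image-grounded instruction-following conversations from a purely symbolic description of an image (captions plus bounding boxes). The helper `query_language_model` is a hypothetical stand-in for whatever LLM API is used.

```python
# Sketch: serialize captions and object boxes into text so a language-only model
# can write multimodal instruction-following data without seeing pixels.
from typing import List, Tuple

def build_prompt(captions: List[str],
                 boxes: List[Tuple[str, Tuple[float, float, float, float]]]) -> str:
    """Turn an image's symbolic representation into a plain-text prompt."""
    caption_text = "\n".join(f"- {c}" for c in captions)
    box_text = "\n".join(f"- {name}: {coords}" for name, coords in boxes)
    return (
        "You are given a description of an image.\n"
        f"Captions:\n{caption_text}\n"
        f"Objects (name: x1, y1, x2, y2, normalized):\n{box_text}\n"
        "Write a conversation between a user asking about the image "
        "and an assistant answering as if it can see the image."
    )

def query_language_model(prompt: str) -> str:
    # Hypothetical LLM call, e.g. GPT-4 through an API client of your choice.
    raise NotImplementedError

if __name__ == "__main__":
    prompt = build_prompt(
        captions=["A man is grilling food at an outdoor barbecue."],
        boxes=[("person", (0.12, 0.05, 0.55, 0.98)),
               ("grill", (0.40, 0.50, 0.95, 0.99))],
    )
    print(prompt)  # pass to query_language_model(prompt) to obtain one training sample
```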

For each subset, we visualize the root noun-verb pairs for the instruction and response. For each chart, please click the link to the interactive page to check out the noun-verb pairs whose frequency is higher than the given number.
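As a rough illustration of how such root verb-noun pairs could be extracted and counted (this is illustrative, not the authors' script), the sketch below uses spaCy's dependency parse to take each instruction's root verb and its direct object, then keeps only pairs whose frequency exceeds a threshold. It assumes spaCy and the en_core_web_sm model are installed.

```python
# Sketch: count (root verb, direct object) pairs across a list of instructions.
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")

def root_verb_noun(text: str):
    """Return (root verb lemma, direct-object lemma) for the first sentence, if any."""
    doc = nlp(text)
    for sent in doc.sents:
        root = sent.root
        if root.pos_ != "VERB":
            return None
        dobj = next((c for c in root.children if c.dep_ in ("dobj", "obj")), None)
        return (root.lemma_, dobj.lemma_) if dobj is not None else None
    return None

instructions = [
    "Describe the image in detail.",
    "Describe the image briefly.",
    "Explain the scene shown in the picture.",
]
pairs = Counter(p for p in map(root_verb_noun, instructions) if p is not None)
threshold = 1  # keep pairs whose frequency is higher than this number
print({pair: n for pair, n in pairs.items() if n > threshold})
```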

LLaVA connects the pre-trained CLIP ViT-L/14 visual encoder and the large language model Vicuna using a simple projection matrix. We consider a two-stage instruction-tuning procedure: Stage 1: Pre-training for Feature Alignment. Only the projection matrix is updated, based on a subset of CC3M. Stage 2: Fine-tuning End-to-End. Both the projection matrix and the LLM are updated for two different use scenarios: Visual Chat: LLaVA is fine-tuned on our generated multimodal instruction-following data for daily user-oriented applications. Science QA: LLaVA is fine-tuned on this multimodal reasoning dataset for the science domain.
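A minimal PyTorch sketch of the connector idea, under simplifying assumptions: a frozen vision encoder yields patch features, a single linear projection maps them into the LLM's token-embedding space, and the two training stages differ only in which parameters receive gradients. The dimensions and the toy Transformer standing in for the LLM are illustrative, not the released implementation.

```python
import torch
import torch.nn as nn

class LlavaLikeConnector(nn.Module):
    """A 'simple projection matrix': one linear layer from visual to LLM token space."""
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 512):
        super().__init__()
        self.projection = nn.Linear(vision_dim, llm_dim)

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim), e.g. from CLIP ViT-L/14
        return self.projection(patch_features)  # (batch, num_patches, llm_dim)

def set_trainable(projection: nn.Module, llm: nn.Module, stage: int) -> None:
    """Stage 1: update only the projection. Stage 2: update projection and LLM."""
    for p in projection.parameters():
        p.requires_grad = True
    for p in llm.parameters():
        p.requires_grad = (stage == 2)

# Toy LLM stand-in just to show the freezing pattern (real model: Vicuna).
connector = LlavaLikeConnector(vision_dim=1024, llm_dim=512)
toy_llm = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True), num_layers=1
)
set_trainable(connector.projection, toy_llm, stage=1)  # feature-alignment stage
visual_tokens = connector(torch.randn(2, 256, 1024))   # to prepend to text embeddings
print(visual_tokens.shape)  # torch.Size([2, 256, 512])
```

Switching `stage=1` to `stage=2` unfreezes the LLM for end-to-end fine-tuning while keeping the same projection layer in the loop.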
