FoodSAM: Any Food Segmentation

Remarkably, this pioneering framework stands as the first-ever work to achieve instance, panoptic, and promptable segmentation on food images.

Language Models are Few-Shot Learners

By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do.

A Survey on Multimodal Large Language Models

The Multimodal Large Language Model (MLLM) has recently become a rising research hotspot, using powerful Large Language Models (LLMs) as a brain to perform multimodal tasks.

Large Multimodal Models: Notes on CVPR 2023 Tutorial

This tutorial note summarizes the presentation on "Large Multimodal Models: Towards Building and Surpassing Multimodal GPT-4", a part of the CVPR 2023 tutorial on "Recent Advances in Vision Foundation Models".