LANGUAGE MODELS ARE FEW-SHOT LEARNERS

By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do.
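To make the few-shot setting concrete, below is a minimal sketch of in-context prompting: the task is specified to the model purely through a handful of demonstrations, with no gradient updates. The English-to-French demonstrations follow the style of the GPT-3 paper; the final generate call is a hypothetical placeholder for any autoregressive language-model API, not a specific library.

# Minimal few-shot prompting sketch: the task is defined by in-context examples only.
few_shot_examples = [
    ("sea otter", "loutre de mer"),
    ("peppermint", "menthe poivree"),
    ("plush giraffe", "girafe en peluche"),
]

def build_prompt(examples, query):
    """Format English->French demonstrations followed by the new query."""
    lines = ["Translate English to French:"]
    for en, fr in examples:
        lines.append(f"{en} => {fr}")
    lines.append(f"{query} =>")           # the model is expected to complete this line
    return "\n".join(lines)

prompt = build_prompt(few_shot_examples, "cheese")
print(prompt)
# completion = generate(prompt)           # placeholder LM call (assumed API, not a real library)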

A SURVEY ON MULTIMODAL LARGE LANGUAGE MODELS

Multimodal Large Language Models (MLLMs) have recently become a rising research hotspot; they use powerful Large Language Models (LLMs) as a brain to perform multimodal tasks.
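The sketch below illustrates the typical recipe behind this "LLM as a brain" design: a vision encoder produces image features, a small projector maps them into the LLM's token-embedding space, and the LLM consumes the resulting visual tokens alongside ordinary text tokens. It is an illustrative assumption of the common architecture, not any specific model from the survey; all module sizes are toy values.

# Toy MLLM sketch (PyTorch): vision encoder -> projector -> LLM over mixed tokens.
import torch
import torch.nn as nn

class ToyMLLM(nn.Module):
    def __init__(self, vision_dim=256, llm_dim=512, vocab_size=1000):
        super().__init__()
        self.vision_encoder = nn.Linear(vision_dim, vision_dim)   # stand-in for a pretrained ViT
        self.projector = nn.Linear(vision_dim, llm_dim)           # maps image features into LLM token space
        self.text_embed = nn.Embedding(vocab_size, llm_dim)
        layer = nn.TransformerEncoderLayer(d_model=llm_dim, nhead=8, batch_first=True)
        self.llm = nn.TransformerEncoder(layer, num_layers=2)     # stand-in for the LLM "brain"
        self.lm_head = nn.Linear(llm_dim, vocab_size)

    def forward(self, image_patches, text_ids):
        visual_tokens = self.projector(self.vision_encoder(image_patches))  # (B, P, llm_dim)
        text_tokens = self.text_embed(text_ids)                             # (B, T, llm_dim)
        sequence = torch.cat([visual_tokens, text_tokens], dim=1)           # image tokens prepended to text
        return self.lm_head(self.llm(sequence))                             # next-token logits

model = ToyMLLM()
logits = model(torch.randn(1, 16, 256), torch.randint(0, 1000, (1, 8)))
print(logits.shape)  # torch.Size([1, 24, 1000])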

LARGE MULTIMODAL MODELS: NOTES ON CVPR 2023 TUTORIAL

This tutorial note summarizes the presentation on "Large Multimodal Models: Towards Building and Surpassing Multimodal GPT-4", a part of the CVPR 2023 tutorial on "Recent Advances in Vision Foundation Models".

DUAL AGGREGATION TRANSFORMER FOR IMAGE SUPER-RESOLUTION

Based on the above idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR. Our DAT aggregates features across spatial and channel dimensions, in an inter-block and intra-block dual manner.
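A simplified sketch of the inter-block side of this idea follows: successive blocks alternate between spatial self-attention (tokens are spatial positions) and channel self-attention (tokens are feature channels). It omits DAT's intra-block interaction modules and exact attention designs; shapes, depths, and channel counts are illustrative assumptions.

# Sketch of alternating spatial/channel attention blocks (inter-block dual aggregation).
import torch
import torch.nn as nn

class SpatialAttentionBlock(nn.Module):
    """Self-attention over spatial positions (H*W tokens of dimension C)."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                        # x: (B, H*W, C)
        h = self.norm(x)
        return x + self.attn(h, h, h)[0]

class ChannelAttentionBlock(nn.Module):
    """Self-attention where each feature channel is one token (C x C attention map)."""
    def __init__(self, channels):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x):                        # x: (B, H*W, C)
        h = self.norm(x).transpose(1, 2)         # (B, C, H*W): channels become tokens
        attn = torch.softmax(self.scale * h @ h.transpose(1, 2) / h.shape[-1] ** 0.5, dim=-1)
        return x + (attn @ h).transpose(1, 2)    # back to (B, H*W, C)

class DualAggregationSketch(nn.Module):
    """Alternate spatial and channel attention across successive blocks."""
    def __init__(self, channels=64, depth=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            SpatialAttentionBlock(channels) if i % 2 == 0 else ChannelAttentionBlock(channels)
            for i in range(depth)
        )

    def forward(self, x):                        # x: (B, H*W, C) flattened feature map
        for blk in self.blocks:
            x = blk(x)
        return x

feats = torch.randn(2, 32 * 32, 64)              # toy 32x32 feature map with 64 channels
print(DualAggregationSketch()(feats).shape)      # torch.Size([2, 1024, 64])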