AGENTBENCH: EVALUATING LLMS AS AGENTS

LARGE LANGUAGE MODELS (LLMS) ARE BECOMING INCREASINGLY SMART AND AUTONOMOUS, TARGETING REAL-WORLD PRAGMATIC MISSIONS BEYOND TRADITIONAL NLP TASKS.

UNIVTG: TOWARDS UNIFIED VIDEO-LANGUAGE TEMPORAL GROUNDING

MOST METHODS IN THIS DIRECTION DEVELOP TASK-SPECIFIC MODELS THAT ARE TRAINED WITH TYPE-SPECIFIC LABELS, SUCH AS MOMENT RETRIEVAL (TIME INTERVAL) AND HIGHLIGHT DETECTION (WORTHINESS CURVE), WHICH LIMITS THEIR ABILITIES TO GENERALIZE TO VARIOUS VTG TASKS AND LABELS.