OKAPI: INSTRUCTION-TUNED LARGE LANGUAGE MODELS IN MULTIPLE LANGUAGES WITH REINFORCEMENT LEARNING FROM HUMAN FEEDBACK
OKAPI INTRODUCES INSTRUCTION AND RESPONSE-RANKED DATA IN 26 DIVERSE LANGUAGES TO FACILITATE EXPERIMENTS AND FUTURE RESEARCH ON MULTILINGUAL LLMS.