Oscar: Object-Semantics Aligned Pre-training for Vision-and-Language Tasks

Introduction

This repository contains the source code necessary to reproduce the results presented in the paper Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks. We propose a new cross-modal pre-training method, Oscar (Object-Semantics Aligned Pre-training). It uses object tags detected in images as anchor points to significantly ease the learning of image-text alignments. We pre-train Oscar on a public corpus of 6.5 million text-image pairs and fine-tune it on downstream tasks, setting new state-of-the-art results on six well-established vision-language understanding and generation tasks. For more on this project, see the Microsoft Research Blog post.

Original Repository

https://github.com/microsoft/Oscar