Leading the AI industry forward with open, collaborative efforts
The challenges of harmful content affect the entire tech industry and society at large. That’s why we open-source our technology to make it available for others to use. We believe being open and collaborative with the AI community will spur research and development, create new ways of detecting and preventing harmful content, and help keep people safe.
Here are some of the technologies we’ve open-sourced in recent years, including two industry competitions we led:
XLM-R
XLM-R is a machine learning model that is pretrained on many languages at once, so it can be fine-tuned for a task in one language and then applied to other languages without additional labeled training data. With people posting content in more than 160 languages on Meta technologies, XLM-R lets us use one model for many languages instead of one model per language. This helps us identify hate speech and other violating content across a wide range of languages and launch products in multiple languages at once. We open-sourced our models and code so the research community can improve the performance of their own multilingual models. Goal: To give people the best experience on our platforms, regardless of the language they speak.
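To make the cross-lingual transfer idea concrete, here is a minimal sketch (not Meta’s production pipeline) using the publicly released xlm-roberta-base checkpoint via the Hugging Face transformers library. It shows a single pretrained model encoding sentences from different languages into the same vector space; the example sentences and the mean-pooling step are illustrative assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# One pretrained multilingual checkpoint handles every language below.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")
model.eval()

texts = [
    "This movie was fantastic.",    # English
    "Ce film était fantastique.",   # French
    "Dieser Film war fantastisch.", # German
]

with torch.no_grad():
    for text in texts:
        inputs = tokenizer(text, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
        # Mean-pool into one sentence vector. A classifier head
        # fine-tuned on labeled data in a single language consumes
        # vectors like this one for text in any of the languages.
        sentence_vec = hidden.mean(dim=1)
        print(f"{text!r} -> vector of shape {tuple(sentence_vec.shape)}")
```

In practice, a task head would be fine-tuned on labeled examples in one high-resource language and then evaluated directly on the others, which is the transfer setup described above.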
Linformer
Linformer is an efficient transformer architecture whose self-attention cost grows linearly with input length rather than quadratically, which makes it practical to analyze billions of pieces of content on Facebook and Instagram in regions around the world. Linformer helps detect hate speech and content that incites violence. We published our research and open-sourced the Linformer code so other researchers and engineers can improve their own models. Goal: To create a new AI model that learns from text, images and speech and efficiently detects hate speech, human trafficking, bullying and other forms of harmful content.
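The key trick in the Linformer paper is projecting the length-n key and value sequences down to a fixed length k before attention, reducing the cost from O(n²) to O(n·k). Below is a minimal single-head PyTorch sketch of that idea, not the open-sourced implementation; the E and F projection names follow the paper, while the dimensions and initialization are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LinformerSelfAttention(nn.Module):
    """Single-head Linformer-style attention (illustrative sketch)."""

    def __init__(self, dim, seq_len, k=64):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        # Learned low-rank projections over the *sequence* dimension:
        # they compress n positions down to k before attention.
        self.E = nn.Parameter(torch.randn(k, seq_len) / seq_len ** 0.5)
        self.F = nn.Parameter(torch.randn(k, seq_len) / seq_len ** 0.5)
        self.scale = dim ** -0.5

    def forward(self, x):  # x: (batch, seq_len, dim)
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
        k = torch.einsum("kn,bnd->bkd", self.E, k)  # (batch, k, dim)
        v = torch.einsum("kn,bnd->bkd", self.F, v)  # (batch, k, dim)
        # Attention matrix is now (n x k) instead of (n x n).
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        return attn @ v  # (batch, seq_len, dim)

attn = LinformerSelfAttention(dim=64, seq_len=1024, k=128)
out = attn(torch.randn(2, 1024, 64))
print(out.shape)  # torch.Size([2, 1024, 64])
```

Because the attention matrix shrinks from n x n to n x k, long posts and comment threads can be processed with far less memory and compute, which is what makes platform-scale analysis feasible.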
Deepfakes Detection Challenge
We created a competition with Microsoft, the Partnership on AI, and academics from several universities to advance technology that detects when AI has been used to alter a video in order to mislead viewers. Our contribution to the Deepfakes Detection Challenge was commissioning a realistic data set of deepfake videos, which the industry lacked. Goal: To spur the industry to create new ways of detecting and preventing media manipulated with AI from being used to mislead people.
Hateful Memes Challenge
We created a competition with Getty Images and DrivenData to accelerate research on detecting hate speech that combines images and text. Our contribution to the Hateful Memes Challenge was creating a unique data set of more than 10,000 examples so researchers could easily use it in their work. Goal: To spur the industry to create new approaches and methods for detecting multimodal hate speech.
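Multimodal hate speech is hard to catch because a meme’s image and caption can each look benign in isolation yet be hateful together, so a model has to reason over both at once. The sketch below shows one simple way to do that, a late-fusion classifier over precomputed image and text embeddings; this is an illustrative baseline pattern, not the challenge’s winning method, and the encoder dimensions are assumptions.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Classify (image, text) pairs by fusing their embeddings."""

    def __init__(self, image_dim=512, text_dim=768, hidden_dim=256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(image_dim + text_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2),  # logits: [benign, hateful]
        )

    def forward(self, image_emb, text_emb):
        # Concatenating both modalities lets the classifier pick up
        # interactions that neither modality reveals on its own.
        return self.head(torch.cat([image_emb, text_emb], dim=-1))

# Hypothetical embeddings from separate, pretrained image and text
# encoders (dimensions chosen arbitrarily for the sketch).
model = LateFusionClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 768))
print(logits.shape)  # torch.Size([4, 2])
```

The data set was deliberately built so that unimodal models struggle on it, pushing researchers beyond simple baselines like this one toward richer multimodal approaches.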