In physics, however, university labs run joint experiments on the LHC. Big AI experiments are typically carried out on hardware that is owned and controlled by companies. But even that is changing, says Pineau. For example, a group called Compute Canada is putting together computing clusters to let universities run large AI experiments. Some companies, including Facebook, also give universities limited access to their hardware. “It’s not completely there,” she says. “But some doors are opening.”
Haibe-Kains is less convinced. When he asked the Google Health team to share the code for its cancer-screening AI, he was told that it needed more testing. The team repeats this justification in a formal reply to Haibe-Kains’s criticisms, also published in Nature: “We intend to subject our software to extensive testing before its use in a clinical environment, working alongside patients, providers and regulators to ensure efficacy and safety.” The researchers also said they did not have permission to share all the medical data they were using.
It’s not good enough, says Haibe-Kains: “If they want to build a product out of it, then I completely understand they won’t disclose all the information.” But he thinks that if you publish in a scientific journal or conference, you have a duty to release code that others can run. Sometimes that might mean sharing a version that is trained on less data or uses less expensive hardware. It might give worse results, but people will be able to tinker with it. “The boundaries between building a product versus doing research are getting fuzzier by the minute,” says Haibe-Kains. “I think as a field we are going to lose.”
Research habits die hard
If companies are going to be criticized for publishing, why do it at all? There’s a degree of public relations, of course. But the main reason is that the best corporate labs are filled with researchers from universities. To some extent the culture at places like Facebook AI Research, DeepMind, and OpenAI is shaped by traditional academic habits. Tech companies also win by participating in the wider research community. All big AI projects at private labs are built on layers and layers of public research. And few AI researchers haven’t made use of open-source machine-learning tools like Facebook’s PyTorch or Google’s TensorFlow.
As more research is done in house at giant tech companies, certain trade-offs between the competing demands of business and research will become inevitable. The question is how researchers navigate them. Haibe-Kains would like to see journals like Nature split what they publish into separate streams: reproducible studies on one hand and tech showcases on the other.
But Pineau is more optimistic. “I would not be working at Facebook if it did not have an open approach to research,” she says.
Other large corporate labs stress their commitment to transparency too. “Scientific work requires scrutiny and replication by others in the field,” says Kavukcuoglu. “This is a critical part of our approach to research at DeepMind.”