Thousands of books have been used without permission to develop AI systems. We’ve put together some joint guidance with the Society of Authors (SoA) so you know what to do if your works are among them, and what action you can take.
In September, The Atlantic published a series of articles about Books3 – a vast, unlawfully obtained collection of books that has been used without permission to develop some of the most well-known artificial intelligence systems.
You may already have checked to see if your books are listed on it (if you haven't, you can search the Books3 database here) and want to know what you can do about it. We know that many WGGB and SoA members, and UK publishers, are affected. Finding their books listed there has left many authors feeling angry, frustrated and powerless. In short, our view is that this is piracy on an industrial scale.
This joint guidance will help you understand the issue, give you an overview of what we and other trade unions and industry partners are doing about it, and share a few practical steps you can take now.
What is Books3?
Books3 is a collection of 183,000 books, downloaded from pirate sources. We know it has been used by the developers of several large language models, including Meta, Bloomberg and EleutherAI, to develop their systems.
OpenAI, the developers of ChatGPT, will not reveal what works they have used to ‘train’ their systems. We hope that ongoing lawsuits in the US will uncover further information, but we believe it is likely that OpenAI also used books obtained from unlawful sources.
What are the WGGB and SoA doing about it?
We are in close contact with the Publishers Association and the Association of Authors' Agents, who share deep concerns about Books3 and other issues around the development of artificial intelligence systems. We are also working with the British Copyright Council and the Intellectual Property Office to ensure they understand the damage that inaction will do to creative careers, while we consider the possibility of legal action here in the UK.
This is just part of the work we have been doing to lobby industry and the UK Government on artificial intelligence over the past 12 months. These technologies pose many risks to creative careers – from the way they are developed to the competition human creators already face from them.
Meanwhile, the Authors Guild, our trade union counterpart in the US, has launched a class action lawsuit against OpenAI, Meta and Google. Although we are not directly involved in this action, we fully support it – a favourable outcome for the named authors would serve as an important test case more widely, so we are watching its progress closely.
What can authors do?
If you find your books in the Books3 dataset, or if you know that any AI system has detailed information about your work:
- Tell your publisher (and agent if you have one) – like authors, publishers are working to understand the scale of the issue. We are working with publishers to plan an industry-wide response.
- Send a letter to AI companies telling them that they do not have the right to use your books. Doing this via the Authors Guild website will help show solidarity with the legal action they are taking.
- Contact your respective union (WGGB or SoA) for further advice, to tell us about your experiences, or to get involved in our campaign work.
You can read our policy position statement on AI here.