Overview
ABE Lab is a Creative Europe-funded project operated by three partners: the European Digital Reading Laboratory (EDRLab), the Fondazione Libri Italiani Accessibili (LIA) and the National Library of the Netherlands (KB).
In the context of the European Accessibility Act, publishers are producing more and more accessible ebooks. However, many titles produced in the past are still on sale; this is what we call the backlist. Although the cost of keeping ebooks on the market is often low compared with that of printed books (which need physical storage and reprinting), it is important to understand that ebook titles can also disappear from the market if revenues no longer cover costs.
The main objective of the ABE Lab project is to provide European publishers with guidelines for speeding up the remediation of the ebooks they currently have on sale, that is, converting them into accessible versions. A complete description of the project can be found on the ABE Lab website.
To reach this objective, we first needed to collect data on the backlist to establish its volume and composition. This document presents insights into the number of ebooks actually on the European market and how this number breaks down by category and format.
This document provides key numbers and segmentation. A short explanation of our methodology can be found at the end. Data sources and a glossary are provided in the appendices.
3.5 million ebooks in the European backlist
The latest available FEP annual statistics report [1] presents data from 2021. It counts 13.4 million titles in the active catalogue, of which 3 million are ebooks, representing 22% of the titles on the market but only 12% of sales.
Based on the summation of the numbers we received from individual countries in early 2023, we established a backlist of ebooks published in the EU exceeding 3.5 million.
Since ebooks can easily be traded across borders, the number of titles available to consumers is much higher. English-language ebooks in particular are very popular in certain categories (like computer books or scientific publications) and often outnumber local productions.
Since important differences exist between countries, which would lead to different conclusions, we present detailed views per market size (largest, medium and smaller). Titles sold from outside the EU by the big selling platforms are treated separately, as very little data is available from those companies.
Largest markets
France and Germany each have about a million ebooks on their backlist. In France we notice exchanges with Canada (Quebec), and in Germany with Austria and Switzerland.
France: 952.416
Germany: 1.055.369
Medium markets
Italy, Spain, Poland, the Netherlands, Czechia and Sweden each have a backlist of more than 100.000 ebooks. In a country like Belgium, we noticed that many titles are imported from France and the Netherlands. Austria is in the same situation with respect to Germany (and Switzerland).
Italy: 376.097
Spain: 336.757
Poland: 138.415
The Netherlands: 102.000
Czechia: 126.229
Sweden: 107.561
Smaller markets
Denmark, Greece, Romania, Portugal, Hungary, Bulgaria, Finland, Slovakia, Ireland, Croatia, Lithuania, Slovenia, Latvia, Estonia, Cyprus, Luxembourg and Malta have backlists of ebooks published in the country itself that range from a few hundred to 80.000 titles. Added together, the reported numbers for the smaller markets already exceed 300.000.
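As a rough check, the per-market figures reported above can be added up. The sketch below is only an illustration of that summation, not the computation used by the project; the variable and function names are hypothetical, and the smaller markets are represented by the stated lower bound of 300.000 because their individual figures are not listed here.

```python
# Reported backlist sizes per market, taken from the figures listed above.
largest = {"France": 952_416, "Germany": 1_055_369}
medium = {
    "Italy": 376_097,
    "Spain": 336_757,
    "Poland": 138_415,
    "Netherlands": 102_000,
    "Czechia": 126_229,
    "Sweden": 107_561,
}
# Assumption: only a lower bound is given for the smaller markets combined.
SMALLER_MARKETS_LOWER_BOUND = 300_000


def backlist_lower_bound() -> int:
    """Sum the reported figures; the result is a lower bound, not an exact count."""
    return sum(largest.values()) + sum(medium.values()) + SMALLER_MARKETS_LOWER_BOUND


print(f"{backlist_lower_bound():,}")
```

Because the smaller markets are counted at their lower bound, the result sits just under the rounded 3.5 million headline figure and is consistent with it.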
Sales from outside EU
Buying ebooks from non-European retailers is easy since the deliverable is a file that does not pass through border controls. Since the EAA targets European markets, we focused on data on ebook sales by European retailers.
International book selling platforms operate on the European market, but getting accurate data from them is not simple as they operate outside of traditional distribution channels. Below we provide a quick overview of the collections available through those platforms.
Amazon Kindle: Although Amazon does not publish exact numbers of ebooks available for Kindle, some sources estimate that more than 14 million titles are currently available [2]. However, as this number includes a large number of titles without an ISBN, we cannot simply add it to our estimate.
Apple Books: Exact numbers were not available, and availability differs from country to country (due to licensing deals), but Apple is known to offer millions of ebooks and audiobooks.
Google Books: According to some sources, Google Books offers more than 40 million books in 50 languages [3], including 10 million books for free [4]. Here too, the number cannot be compared directly with the backlist estimate as we defined it. Given this magnitude, however, it will be interesting to see how Google will comply with the EAA.
Kobo: Kobo claims to have over 5 million ebooks and audiobooks available for reading directly on its e-readers and apps [5]. In some countries it works with local retailers (like Bol.com in the Netherlands) to provide subscription services that include a lot of local content (in the case of Kobo Plus offered by Bol.com, ‘hundreds of thousands’ of titles).
Different market segmentations
By category
To compare the markets in different EU member states, we needed to split the complete range of ebooks into several categories, such as fiction, biographies, children's books, books on art, and so on. Because different types of publications each have their own accessibility issues, we wanted to make sure that ebook types are well represented in this study. Since many different categorization methods are in use, we chose the Thema [6] categorization scheme. However, not all ebooks have been assigned Thema codes yet, and some mapping from older schemes had to be applied to make estimates.
In general, we found that fiction and other books that are mainly text often have an ebook version. Almost all bestsellers are available as ebooks. For non-fiction, however, market size influences the percentage of books that have a digital version. Since it is more complex to make an attractive ebook version of a non-fiction book, the costs are considerably higher than for text-only books. In many publishing houses the production of complex titles is even outsourced, as it requires skills that not all publishers have in house. In smaller markets we noticed that non-fiction is considerably less represented in the total ebook offering than in the largest markets.
For example, in a large market like Germany non-fiction accounts for about 86% of the ebook offering, whereas in a smaller market like the Netherlands it is considerably less: 57%.
Category | Germany | France | Italy | Netherlands | Spain |
---|---|---|---|---|---|
The Arts… | 2% | 2% | 4% | 1% | N/A |
Language & Linguistics | 3% | N/A | 1% | 1% | N/A |
Biographies | 6% | 11% | N/A | 6% | N/A |
Fiction | 14% | 42% | 38% | 43% | N/A |
Reference & Interdisciplinary | 3% | N/A | 0% | 1% | N/A |
Society & Social Sciences | 11% | 13% | 9% | 6% | N/A |
Economics, Finance, Business Management | 8% | 3% | 4% | 6% | N/A |
Law | 4% | 1% | 3% | 5% | N/A |
Medicine & Nursing | 7% | N/A | 2% | 1% | N/A |
History & Archeology | 3% | 5% | 3% | 4% | N/A |
Mathematics & Science | 7% | 1% | 1% | 1% | N/A |
Philosophy & Religion | 5% | 2% | 6% | 6% | N/A |
Earth Sciences | 1% | 1% | 1% | 0% | N/A |
Sport, Outdoor recreation | N/A | 1% | 1% | 1% | N/A |
Technology, Engineering, Agriculture | 6% | 1% | 1% | 0% | N/A |
Computing & IT | 5% | N/A | 1% | 1% | N/A |
Health & Relationships | 3% | N/A | 7% | 3% | N/A |
Lifestyle, Hobbies | 4% | 4% | 4% | 4% | N/A |
Graphic novels | 1% | 10% | 1% | 0% | N/A |
Children’s books | 3% | 5% | 4% | 9% | N/A |
no label | 4% | 11% | N/A | 2% | N/A |
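The 86% and 57% non-fiction figures quoted before the table appear to correspond to everything outside the Fiction row. The sketch below makes that reading explicit; treating every non-Fiction row (including children's books, graphic novels and unlabelled titles) as non-fiction is an assumption, and the function name is hypothetical.

```python
# Fiction share of the ebook offering per market, in percent (Fiction row of the table).
fiction_share = {"Germany": 14, "France": 42, "Italy": 38, "Netherlands": 43}


def non_fiction_share(market: str) -> int:
    """Assumption: non-fiction is everything that is not in the Fiction row."""
    return 100 - fiction_share[market]


for market in ("Germany", "Netherlands"):
    print(market, non_fiction_share(market))
```

Under this assumption the table reproduces the two figures quoted for Germany and the Netherlands.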
By formats
We obtained detailed data on the digital formats of books on the market for only 5 key markets: France, Germany, Italy, the Netherlands and Spain. We retained only mainstream text formats (EPUB and PDF). The data shows a very disparate situation: the German market has only 3% of EPUB3 files but 60% of PDF files, while France and Italy reach nearly 40% of EPUB3 files versus less than 25% of PDF files.
This disparity does not allow us to extrapolate to the total European market. As a consequence, we did not include any format-related criteria in our wishlist of files to test.
The distribution of formats will have to be studied at national and publisher level in order to refine cost estimates.
Table: % of titles per distribution format in 2022.
Market | % of EPUB2 in 2022 | % of EPUB3 in 2022 | % of PDF in 2022 | % of other formats |
---|---|---|---|---|
France | 12 | 38 | 22 | 18 |
Germany | 34 | 3 | 60 | 3 |
Italy | 36 | 40 | 23 | 1 |
Netherlands | 75 | 10 | 15 | 0 |
Spain | 20 | 15 | 40 | 25 |
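One way to make the disparity between markets concrete is to look at the lowest and highest share of each format across the five markets. The sketch below does this with the values from the table; the data structure and the `spread` helper are illustrative assumptions, not part of the study.

```python
# Share of titles per distribution format in 2022, in percent (from the table above).
formats_2022 = {
    "France": {"EPUB2": 12, "EPUB3": 38, "PDF": 22, "other": 18},
    "Germany": {"EPUB2": 34, "EPUB3": 3, "PDF": 60, "other": 3},
    "Italy": {"EPUB2": 36, "EPUB3": 40, "PDF": 23, "other": 1},
    "Netherlands": {"EPUB2": 75, "EPUB3": 10, "PDF": 15, "other": 0},
    "Spain": {"EPUB2": 20, "EPUB3": 15, "PDF": 40, "other": 25},
}


def spread(fmt: str) -> tuple[int, int]:
    """Return the (lowest, highest) share of one format across the markets."""
    values = [shares[fmt] for shares in formats_2022.values()]
    return min(values), max(values)


for fmt in ("EPUB2", "EPUB3", "PDF"):
    low, high = spread(fmt)
    print(f"{fmt}: {low}% to {high}%")
```

A wide range for every format is what makes a single European average of limited use for cost estimation.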
By years
We captured the evolution of distributed file formats by year since 2012 for the 5 key markets. The same disparity is observed: some countries show a linear evolution, while others seem to produce the same ebook formats in 2022 as in 2012.
The evolution of formats distributed through the years shows a progression of the use of EPUB3 against other formats.
Table: evolution of distribution formats from 2012 to 2022 (change in percentage points of titles, with the 2012 and 2022 shares in brackets).
Market | EPUB2 (2012 / 2022) | EPUB3 (2012 / 2022) | PDF (2012 / 2022) |
---|---|---|---|
France | -11 (from 23% to 12%) | +13 (from 25% to 38%) | -11 (from 33% to 22%) |
Germany | -1 (from 35% to 34%) | +3 (from 0% to 3%) | -2 (from 62% to 60%) |
Italy | -12 (from 48% to 36%) | +34 (from 6% to 40%) | -22 (from 45% to 23%) |
Netherlands | -8 (from 83% to 75%) | +9 (from 1% to 10%) | -1 (from 16% to 15%) |
Spain | insufficient data | insufficient data | insufficient data |
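The changes in the table are differences between the 2022 and 2012 shares, expressed in percentage points. The sketch below recomputes them from the shares given in brackets; the data structure and function name are hypothetical, and Spain is left out because the table reports insufficient data.

```python
# (2012 share, 2022 share) in percent per market and format, from the table above.
shares = {
    "France": {"EPUB2": (23, 12), "EPUB3": (25, 38), "PDF": (33, 22)},
    "Germany": {"EPUB2": (35, 34), "EPUB3": (0, 3), "PDF": (62, 60)},
    "Italy": {"EPUB2": (48, 36), "EPUB3": (6, 40), "PDF": (45, 23)},
    "Netherlands": {"EPUB2": (83, 75), "EPUB3": (1, 10), "PDF": (16, 15)},
}


def change_in_points(market: str, fmt: str) -> int:
    """Percentage-point change between 2012 and 2022 for one market and format."""
    share_2012, share_2022 = shares[market][fmt]
    return share_2022 - share_2012


for market in shares:
    print(market, {fmt: change_in_points(market, fmt) for fmt in shares[market]})
```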
Methodology
We define ebook backlists in Europe as the collection of all ebooks published and/or made available in any of the EU Member States. Since there is no central source that provides this information, many publication registration offices, distributors, aggregators, resellers and national libraries were contacted. Given that each of these organisations handles only a specific subcollection and that these subcollections overlap, we need to be careful when drawing conclusions. Imported titles from outside the EU also make up a large proportion of the backlist, especially ebooks from the UK, US and Canada.
We collected basic data directly from 22 countries and detailed data from 5 countries (France, Germany, Italy, the Netherlands and Spain). The collected data specified the number of titles and their distribution by category and format. We complemented this collection with annual data published by the Federation of European Publishers (FEP) to get an idea of sales.
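The overlap between subcollections mentioned above is why simply adding up the counts from several sources can overstate the total. The sketch below illustrates the effect under the assumption that every title carries an identifier (such as an ISBN) that is shared across sources; the source names and identifiers are made up for illustration, and this is not a description of how the project data was processed.

```python
def naive_total(sources: dict[str, set[str]]) -> int:
    """Add up per-source counts; titles listed by several sources are counted repeatedly."""
    return sum(len(identifiers) for identifiers in sources.values())


def unique_total(sources: dict[str, set[str]]) -> int:
    """Count each identifier once, regardless of how many sources list it."""
    return len(set().union(*sources.values()))


# Hypothetical sample: two sources that both list the title "id-2".
sample = {
    "distributor": {"id-1", "id-2", "id-3"},
    "national_library": {"id-2", "id-4"},
}
print(naive_total(sample), unique_total(sample))
```

This approach only works for titles that carry a shared identifier, which is not the case for every ebook on the large international platforms.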
Counting titles
To get information about the numbers of ebooks that have been published in EU member states, we started by contacting all (EU) ISBN agencies. Adding up all production figures should give a good first estimate for the total.
For a more complete estimate of what is on the EU market, it is not sufficient to know what is published locally; we also have to take into account what is imported from abroad. Especially in certain categories of non-fiction ebooks (like computer books), local production is limited and English-language ebooks are dominant. Some of these titles are distributed via local distributors and sold by local retailers. In addition, online platforms that operate worldwide (Amazon, Kobo, Google, Apple) provide their services directly to EU consumers.
National libraries are also a good source of information about ebooks published in a country. Some national libraries play a role in providing ISBNs to publishers and responded when we requested data from the official ISBN agency. In other cases we contacted the national libraries directly to obtain information on the availability of ebooks. Where publishers have a legal obligation to deposit a copy of their publications at the national library, the library also has an overview.
Classifying titles
Book categories are a way to isolate and regroup books with common characteristics. Those characteristics may explain production choices and technologies used.
We chose the Thema subject category scheme [7] established and maintained by EDItEUR. Thema aims to be the scheme for a global book trade and is the most commonly used classification methodology. Although the use of Thema is growing, it is not the only classification in use, and since it is relatively new, not all ebooks have been assigned Thema codes yet.
For older collections and countries where Thema is not the main standard in use, EDItEUR provides Thema mappings [8].
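To illustrate how titles can be grouped into the categories used in this document, the sketch below takes the first letter of a Thema subject code as the top-level category and computes the share of each category in a list of titles. The pairing of letters with the category labels used in this document, the sample codes and the function names are assumptions made for illustration; this is not the project's tooling, and it does not implement the EDItEUR mappings from older schemes.

```python
from collections import Counter

# Assumption: Thema top-level letters paired with the category labels used in this document.
THEMA_TOP_LEVEL = {
    "A": "The Arts", "C": "Language & Linguistics", "D": "Biographies",
    "F": "Fiction", "G": "Reference & Interdisciplinary",
    "J": "Society & Social Sciences",
    "K": "Economics, Finance, Business Management", "L": "Law",
    "M": "Medicine & Nursing", "N": "History & Archeology",
    "P": "Mathematics & Science", "Q": "Philosophy & Religion",
    "R": "Earth Sciences", "S": "Sport, Outdoor recreation",
    "T": "Technology, Engineering, Agriculture", "U": "Computing & IT",
    "V": "Health & Relationships", "W": "Lifestyle, Hobbies",
    "X": "Graphic novels", "Y": "Children's books",
}


def top_level_category(thema_code: str) -> str:
    """Map a subject code to its top-level category; anything unknown becomes 'no label'."""
    return THEMA_TOP_LEVEL.get(thema_code[:1].upper(), "no label")


def category_shares(thema_codes: list[str]) -> dict[str, float]:
    """Percentage of titles per top-level category."""
    counts = Counter(top_level_category(code) for code in thema_codes)
    return {category: 100 * n / len(thema_codes) for category, n in counts.items()}


# Illustrative sample: three coded titles and one title without a code.
print(category_shares(["FBA", "FF", "YFB", ""]))
```

Titles without a usable code fall into "no label", which mirrors the last row of the category table.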
Another way to classify titles is by production year. As workflows and production tools evolve, this is reflected in the way ebook files are made. To build a representative sample set for our research, we need to understand this development.
Appendices
Appendix 1: data sources
- Dilicom
- Dilve
- MVB GmbH
- Informazioni Editoriali
- Biblioteka Narodowa
- ISBN.NL
- CB Logistics
- Czech National ISBN agency
- National Library of Sweden
- DBC Digital
- Publizon
- Greek ISBN agency
- Romania National Library
- APEL
- National Széchényi Library
- ISBN Bulgaria
- Nielsen Bookdata
- Lithuania National Library
- Slovenian ISBN agency
- National Library of Latvia
- Estonian ISBN agency
- Malta National Book Council
Appendix 2: Glossary
EAA: European Accessibility Act
EU: European Union
EPUB: a distribution and interchange format for digital publications and documents. Provides a means of representing, packaging, and encoding structured and semantically enhanced Web content (including HTML, CSS, SVG and other resources) for distribution in a single-file container. Specification created by the IDPF (International Digital Publishing Forum) and maintained by the W3C (World Wide Web Consortium) since 2017.
EPUB2: Electronic Publication, Version 2. A format for electronic publications with reflowable text in marked up document structure with associated images for illustrations, all in a container format. EPUB 2 was initially standardised in 2007. EPUB 2.0.1 was approved in 2010.
EPUB3: Electronic Publication, Version 3.0 (2011) to Version 3.3 (2023).
FXL: Fixed Layout.
FEP: Federation of European Publishers
ONIX: The ONIX family includes standards for Books, Serials and Licensing Terms & Rights Information. In this document we use ONIX to refer to ONIX for Books. The ONIX for Books Product Information Message is the international standard for representing and communicating book industry product information in electronic form.
THEMA: the subject category scheme for a global book trade.
PDF: Portable Document Format, created by Adobe in 1992 and standardised as an open standard maintained by the International Organization for Standardization (ISO). The ISO standard has specialised variants such as PDF/A for archiving, PDF/E for engineering, PDF/X for printing and PDF/UA for accessibility.
Notes
European Book Publishing Statistics 2021, FEP 2022.
How Many Ebooks Are There In The Kindle Store On Amazon? Just Publishing Advice, 2023.
How the Google Books team moved 90,000 books across a continent. Ari Mariani, 2023.
Thema – the subject category scheme for a global book trade version 1.5, EDItEUR, 2022.
Thema – the subject category scheme for a global book trade version 1.5, EDItEUR, 2022.
Thema mappings, EDItEUR, 2023.