EPUB Specifications and Metadata
Sources: Punto Acuto, “ePub: metadata per arricchire un eBook” (Part 1) and
“ePub: metadata opzionali” (Part 2).
Required metadata: title, language, identifier
Optional metadata: creator, subject, description, publisher, contributor, date, type,
format, source, relation, coverage, rights
Let us now look at each metadata element in detail, focusing on its properties.
REQUIRED METADATA
- title
This element is required: every ePub must contain at least one book title. The EPUB 3
specification allows multiple titles, but in that case the type of each one must be
given with the title-type property (for example main title, subtitle and so on). To
avoid confusion about the hierarchy, it helps to state the display order as well, using
the optional display-seq property. Suppose we have a book titled “L’universo dei
metadata. Guida per arricchire un ePub. Prima edizione”. It can be split into three
titles: “L’universo dei metadata”, “Guida per arricchire un ePub” and “Prima edizione”.
Unfortunately, none of the reading systems we tested can display this three-part
division yet, and iBooks shows the third title in the hierarchy as the main title
(“Prima edizione”, as the screenshot in the original article shows).
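The title split described above can be sketched in code. The following is a minimal illustration, not the original article's example: the ids t1 to t3 and the function name ordered_titles are hypothetical, and the fragment omits the OPF namespace that a real package document would declare. It parses an EPUB 3 style fragment and returns the titles in display-seq order.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical EPUB 3 metadata fragment for the three-part title above.
# In a real package document these elements sit in the OPF namespace.
OPF_METADATA = """\
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title id="t1">L'universo dei metadata</dc:title>
  <meta refines="#t1" property="title-type">main</meta>
  <meta refines="#t1" property="display-seq">1</meta>
  <dc:title id="t2">Guida per arricchire un ePub</dc:title>
  <meta refines="#t2" property="title-type">subtitle</meta>
  <meta refines="#t2" property="display-seq">2</meta>
  <dc:title id="t3">Prima edizione</dc:title>
  <meta refines="#t3" property="title-type">edition</meta>
  <meta refines="#t3" property="display-seq">3</meta>
</metadata>
"""


def ordered_titles(metadata_xml):
    """Return (display-seq, title-type, text) tuples sorted by display-seq."""
    root = ET.fromstring(metadata_xml)
    # Group the refinements by the id they point at,
    # e.g. {"t1": {"title-type": "main", "display-seq": "1"}}.
    refinements = {}
    for meta in root.findall("meta"):
        target = (meta.get("refines") or "").lstrip("#")
        refinements.setdefault(target, {})[meta.get("property")] = meta.text
    titles = []
    for title in root.findall(DC + "title"):
        props = refinements.get(title.get("id"), {})
        seq = int(props.get("display-seq", 0))
        titles.append((seq, props.get("title-type"), title.text))
    return sorted(titles, key=lambda item: item[0])


for seq, kind, text in ordered_titles(OPF_METADATA):
    print(seq, kind, text)
```

A reading system that honours these refinements would presumably show the entry typed main first; as noted above, the readers tested did not.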
- language
This required metadata element specifies the language in which the publication is
written. The language codes must conform to IETF [RFC5646]. Some examples:
it –> Italian
en –> English
fr –> French
ja –> Japanese
de –> German
For a book written in Italian, the metadata code is as follows:
<metadata xmlns:dc="[Link]">
  …
  <dc:language>it</dc:language>
  …
</metadata>
- identifier
Every publication must have a unique identifier, which is specified with this
metadata element (usually the ISBN). EPUB 2 has its own form for this element, which
must be followed for the file to validate.
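As an illustration only, the sketch below checks a package document for the three required elements and verifies that the package's unique-identifier attribute points at a dc:identifier. The sample package, the id BookId, the placeholder ISBN and the function name check_required_metadata are all assumptions made for the example; a real validator such as EPUBCheck does far more.

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"
DC = "{http://purl.org/dc/elements/1.1/}"
REQUIRED = ("title", "language", "identifier")

# Hypothetical EPUB 2 package document; the ISBN is a placeholder.
OPF_PACKAGE = """\
<package xmlns="http://www.idpf.org/2007/opf" version="2.0"
         unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>L'universo dei metadata</dc:title>
    <dc:language>it</dc:language>
    <dc:identifier id="BookId" opf:scheme="ISBN">978-0-00-000000-0</dc:identifier>
  </metadata>
</package>
"""


def check_required_metadata(package_xml):
    """Return a list of problems; an empty list means the basic checks passed."""
    root = ET.fromstring(package_xml)
    metadata = root.find(OPF + "metadata")
    if metadata is None:
        return ["no metadata element"]
    problems = []
    for name in REQUIRED:
        element = metadata.find(DC + name)
        if element is None or not (element.text or "").strip():
            problems.append("missing dc:" + name)
    # The package's unique-identifier attribute must name a dc:identifier id.
    unique_id = root.get("unique-identifier")
    ids = [e.get("id") for e in metadata.findall(DC + "identifier")]
    if unique_id is None or unique_id not in ids:
        problems.append("unique-identifier does not match any dc:identifier")
    return problems


print(check_required_metadata(OPF_PACKAGE))
```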
After the focus on required metadata, we now turn to optional metadata. First, the
list.
Optional metadata: creator, contributor, rights, date, source, publisher, subject, type,
format, description, relation, coverage
This information is not required, so leaving it out causes no problems when validating
the ePub. As noted in the previous article, metadata enrich your eBook with
information, so they add real value, including for content search.
- creator
EPUB 2 version
EPUB 3 version
<dc:creator id="creator">Punto Acuto</dc:creator>
<meta refines="#creator" property="role" scheme="marc:relators" id="role">aut</meta>
</metadata>
When a publication has more than one creator, EPUB 3 metadata allow display-seq
(already seen with title) to set the order in which the names are displayed.
The role codes come from the MARC relators scheme, for example:
aut –> Author
aft –> Author of afterword, colophon
aui –> Author of introduction
bkd –> Book designer
clb –> Collaborator
cov –> Cover designer
ill –> Illustrator
pfr –> Proofreader
red –> Redactor
trl –> Translator
In this way visibility can be given to everyone who contributed to making the book,
creating a sort of closing credits for it. To go further and distinguish primary from
secondary roles in making the book, use the contributor metadata element.
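The "closing credits" idea can be sketched as follows. This is a minimal illustration under stated assumptions: the ids, the two extra names (Example Illustrator, Example Translator) and the function name credit_lines are invented for the example, and the fragment omits the OPF namespace of a real package document. It reads the role and display-seq refinements and lists creators before contributors.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# MARC relator codes from the list above, with English labels.
ROLE_LABELS = {
    "aut": "Author",
    "aft": "Author of afterword, colophon",
    "aui": "Author of introduction",
    "bkd": "Book designer",
    "clb": "Collaborator",
    "cov": "Cover designer",
    "ill": "Illustrator",
    "pfr": "Proofreader",
    "red": "Redactor",
    "trl": "Translator",
}

# Hypothetical EPUB 3 fragment; every name except "Punto Acuto" is invented.
OPF_METADATA = """\
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator id="creator2">Example Illustrator</dc:creator>
  <meta refines="#creator2" property="role" scheme="marc:relators">ill</meta>
  <meta refines="#creator2" property="display-seq">2</meta>
  <dc:creator id="creator1">Punto Acuto</dc:creator>
  <meta refines="#creator1" property="role" scheme="marc:relators">aut</meta>
  <meta refines="#creator1" property="display-seq">1</meta>
  <dc:contributor id="contrib1">Example Translator</dc:contributor>
  <meta refines="#contrib1" property="role" scheme="marc:relators">trl</meta>
</metadata>
"""


def credit_lines(metadata_xml):
    """List creators (by display-seq), then contributors, as 'Name (Role)'."""
    root = ET.fromstring(metadata_xml)
    refinements = {}
    for meta in root.findall("meta"):
        target = (meta.get("refines") or "").lstrip("#")
        refinements.setdefault(target, {})[meta.get("property")] = meta.text
    rows = []
    for rank, tag in enumerate(("creator", "contributor")):
        for person in root.findall(DC + tag):
            props = refinements.get(person.get("id"), {})
            seq = int(props.get("display-seq", 0))
            role = ROLE_LABELS.get(props.get("role"), "Unknown role")
            rows.append((rank, seq, "%s (%s)" % (person.text, role)))
    rows.sort(key=lambda row: (row[0], row[1]))
    return [row[2] for row in rows]


for line in credit_lines(OPF_METADATA):
    print(line)
```

In EPUB 2 the role is normally written as an opf:role attribute directly on dc:creator rather than through a refining meta element, so a parser for that version would read the attribute instead.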
- contributor
In terms of code this element has the same characteristics as creator, but it
identifies those who played a secondary role in making the publication.
- rights
All information about the rights attached to the publication goes here. Copyright
information typically covers the various property rights associated with the
publication, including intellectual property rights.
- date
This element gives the creation date of the ebook. Only one date is allowed; if the
book is later modified, the value in date stays unchanged and the modified property
is used instead.
<dc:date>2012-03-20T[Link]+02:00</dc:date>
<meta property="dcterms:modified">2012-03-27T[Link]+02:00</meta>
The date and time format must conform to the criteria given on the W3C page linked
from the original article.
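A small sketch of producing the modification timestamp follows. The EPUB 3 specification expects dcterms:modified in UTC, in the form CCYY-MM-DDThh:mm:ssZ, so a local time with a +02:00 offset is best converted before writing it out. The function names and the time of day used below are hypothetical; only the dates come from the example above.

```python
from datetime import datetime, timedelta, timezone


def modified_stamp(moment):
    """Format an aware datetime as CCYY-MM-DDThh:mm:ssZ, converted to UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def modified_meta(moment):
    """Build the EPUB 3 meta element carrying the last-modified time."""
    return '<meta property="dcterms:modified">%s</meta>' % modified_stamp(moment)


# Hypothetical local time at UTC+02:00; the time of day is invented.
local = datetime(2012, 3, 27, 10, 30, 0, tzinfo=timezone(timedelta(hours=2)))
print(modified_meta(local))
# -> <meta property="dcterms:modified">2012-03-27T08:30:00Z</meta>
```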
- source
- publisher
As the name suggests, this is where the publishing house goes, or whoever made the
publication available.
<dc:publisher>Edizioni Taldeitali</dc:publisher>
- subject
Here you can enter a few keywords, or a short phrase, describing the subject of the
ebook. There is no fixed list of values to follow and no limit on the number of values.
<dc:subject>guida per realizzare ebook</dc:subject>
- type
This metadata element specifies the nature or genre of the document. For an idea of
some of the available values, see the DCMI Type Vocabulary section of Dublin Core.
- format
Here you can specify the file format or the dimensions of the resource. For more
information, see the MIME Media Types.
- description
This metadata element holds a description of the content of the publication. It can
include, for example, a plot summary or, for a technical publication, a description of
the main contents.
- relation
Here you can give an identifying string for a related resource and its relationship to
the publication. The string should preferably belong to a formal identification
system. The DCMI Usage Board is looking for a formal way to express this intent.
- coverage
This metadata element indicates the spatial and temporal scope of the publication, or
the jurisdiction within which the publication and its contents are relevant: a place
identified by its geographic coordinates, a period of time, or a jurisdiction. Using a
controlled vocabulary such as the Thesaurus of Geographic Names [TGN] is
recommended.
Introduction to EPUB 4 – EDRLab
EPUB 3 was created in 2011, but it has not yet replaced EPUB 2 on most ebook
distribution channels.
The WG charter states that EPUB 4 will be a profile of PWP (Portable Web
Publications), i.e. a specialization of PWP, with some additional features specific to
the publishing industry (if any). EPUB 4 should be the ultimate interchange format for
ebooks and other kinds of publications. It will keep most features of EPUB 3 (if not
all) and will make use of HTML5, CSS 3, JavaScript, media overlays, etc.
With some care and duplication of internal structures, it will be possible for a
publisher to release EPUB files simultaneously compatible with versions 2, 3 and 4
of the format.
The modifications of such internal plumbing will not change much for publishers of
simple ebooks and round-trip transformation between EPUB 3 (or EPUB 2) and
EPUB 4 will be made available by the Readium community.
But EPUB 4 would be of little interest to publishers and users if it were only a
matter of plumbing. EDRLab will therefore push two innovations:
- A solution for Web comics (and manga); an internal EDRLab Working Group was
created in June 2017 to prepare proposals to the W3C for such a concept and
structure; this will include page transitions and much more.
- A solution for audio-books, currently never published using EPUB; an internal
EDRLab Working Group was also created in June 2017 on this subject.
Conclusion
As of June 2017, the Publishing Working Group has just begun its work on these
three specifications. Currently, no representative of the browser vendors has joined
the group. This must be addressed quickly, because issues such as a clean
pagination mechanism (CSS Fragmentation?) and high-quality layout both depend on
the integration of paged content in multiple browsers.
Web standardization should be agile and based on software prototypes. We hope
that the developments already made by the Readium-2 community will foster a rapid
pace of development for Web Publications and EPUB 4 format.
The Importance of EPUB and the Need for EPUB 4
Introduction
EPUB has become a fundamental technology for the global publishing ecosystem. It
is the preferred format for a broad range of types of publications, and it is considered
essential for accessibility. It has also become embedded in systems and workflows,
not just as a distribution file format, but as the basis for content development and
management workflows as well.
It is important to this ecosystem that the specificity, portability, and predictability
provided by EPUB be maintained and advanced as a profile of the more general,
flexible, and accommodating Web Publication format.
As the convergence of EPUB and Web Publications moves forward in the proposed
Publications WG in the W3C, it is critical to the publishing ecosystem that EPUB 3 be
maintained and refined in the meantime (which will be done in the EPUB 3 CG). It is
even more important that the next generation of EPUB, currently referred to as
EPUB 4, retain the specificity, portability, and predictability required by the publishing
ecosystem while benefitting from the improved features and functionality offered by
full alignment with the Open Web Platform as a profile of Web Publications and as a
well-defined type of Portable Web Publication.
EPUB 4 must not be in conflict with Web Publications; it must be a type of Web
Publication that provides the predictability and interoperability that this ecosystem
has come to rely on.
Trade Books
The first and still the most common use of EPUB is for the distribution of ebooks.
Because it has become so widely accepted in this space, it is now possible for trade
book publishers to create a single EPUB file that can be provided to all the retailers
and aggregators for whom they previously had to create separate versions. Although
the biggest recipient, Amazon, still delivers to consumers a proprietary format, the
single EPUB that a trade book publisher sends to the rest of its partners is also the
preferred format to send to Amazon, where it is converted into their proprietary
format.
The ability to send a single EPUB file to multiple recipients in the book supply chain
is an important business requirement for publishers, removing significant friction and
maintenance overhead from production and distribution workflows. That ability is
based on the specificity and consistency provided by the EPUB format, removing
ambiguity and unpredictability as files move between systems.
Although EPUB was used at first mainly for books with relatively simple formats—
fiction and trade nonfiction—it is now used for almost all types of trade books,
including books with complex layouts (e.g., cookbooks, travel guides) and books for
which the graphics and page layout are essential to how the book “works,” such as
many children’s books. As another example, EPUB has become the standard format
for the distribution of e-manga in Japan.
Education
EPUB and the EPUB for Education profile are used not so much for distribution to
the retail supply chain, but as a framework for the content infrastructures and
platforms by which many large educational publishers develop, deliver, and
disseminate their content to the learning management systems (LMSs) and virtual
learning environments (VLEs) used in the classroom.
While these implementations are essentially built on Open Web Technologies, this is
an example of the added value that the EPUB format provides: an enhanced
vocabulary, containing publication- and education-specific terms not available in
HTML or WAI-ARIA; the ability to create a complex publication consisting of many
documents, media, and interactive features as a single well organized entity; and the
ability to extract “chunks” of content (distributable objects) such as tests, quizzes,
exercises, scripted components, etc. and distribute them as valid EPUBs as well.
EPUBs used in education also have stricter accessibility requirements than those of
the web in general, although those requirements are all consistent with WAI, WCAG,
and ARIA.
The ability to create arbitrarily complex, interactive, and media-rich publications as
consistent, coherent, identifiable entities is an important business requirement for
publishers that the EPUB format provides.
EPUB is also not just for book content. IBM, for example, has moved from PDF to
EPUB as the standard format by which its documents are delivered. Japanese
official documents are distributed as EPUBs. The EU Publications Office (EU OP)
has created EPUBs for the extremely diverse set of publications it distributes—
ranging from legal, parliamentary, and judicial documents to instructional and
informational documents from the EU agencies in all countries of the European
Union, in all the EU languages. The EU OP is a strong supporter of the continued
evolution of EPUB and Web Publications because its mission is the wide and free
distribution of content by all means possible throughout the EU. Finally, as an
indication of how ubiquitous EPUB has become for document publishing, Google
Docs now provides automatic export as EPUBs.
The ability to disseminate publications in a form that can adapt to any rendering
environment, online or offline, in any orientation and dimension, and that is well
understood and adopted throughout the world, is an important business requirement
for publishers that EPUB provides.
Scholarly Journals
Because scholarly journals were early to see the benefits of digital distribution, the
use of PDFs for journal articles became the norm years ago. This is a problem today
because PDFs are not reflowable or sufficiently accessible. This situation is about to
change: Atypon, one of the leading hosts of scholarly journal content—40% of the
world’s peer reviewed journal literature is on their Literatum platform—has
announced that its next release, coming later in 2017, will create EPUBs as a
standard output, requiring no changes to submitted content on the part of the
publisher. This will suddenly make it possible for literally millions of journal articles to
be available as EPUBs.
The ability to automatically generate a reflowable file that renders adaptively, online
or offline, in a web-conformant format, from arbitrary source files such as the
NLM/JATS/BITS XML format universally used in scholarly journals, is an important
business requirement that EPUB provides.
Accessibility
The publication of EPUB Accessibility 1.0 in January 2017 was a watershed event in the
publishing ecosystem. This provides the long-needed “baseline specification” for
what is meant by “an accessible publication.” Based on and fully conformant with all
Web accessibility guidelines, EPUB Accessibility provides publication-specific
requirements that will enable the creation of authoritative, referenceable
specifications for use both in legal contexts and in procurement documents,
especially in government and educational contexts. It also provides the basis for
accessibility certification, which is actively being developed by the DAISY
Consortium under a Google Impact Grant. EPUB is now widely preferred as the
format for the distribution of accessible content.
The ability to create accessible publications not in a separate, purpose-built form
based on remediation of standard publication formats, but to make the standard
publication formats, created by standard publishing workflows, natively accessible is
an important business requirement provided by EPUB.
Why EPUB 4?
The convergence of the Web and Publishing, which is the main motivation behind the
recent combination of the IDPF into the W3C, means that future publications will be
able to make use of all the features available on the Web, and can be displayed,
without any specific action, in any Web browser. This evolution is essential for some
of the aforementioned publishing areas, such as educational documents or scholarly
journals and books. It leads to the concept put forward by recent work at W3C and
now planned as a core development for Publishing@W3C: Web Publications, and
its subset, Portable Web Publications.
Web Publications need to be able to use any and all available web technologies,
whether online or offline. The Web Publication format needs to be extremely
accommodating and agnostic. For example, when a Web Publication is packaged, it
must be possible to use any packaging format available on the Web, now or in the
future. And the Web Publication specification needs to align completely with the Web
in general, down to the specifics of “may,” “should,” and “must.”
However, the publishing ecosystem requires specificity, portability, and predictability
that may mean, in some respects, limiting such choices and requiring things that
may not be required by Web Publications in general. For example, while a Web
Publication may be packaged in any valid way, it is useful for the publishing
ecosystem to know that all EPUBs are packaged in a certain way (e.g., as a .zip).
Likewise, the Web does not require WCAG AA conformance; this is only
recommended for web content. EPUB 4, on the other hand, may require WCAG AA
conformance.
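To make the packaging point concrete, here is a minimal sketch of the ZIP-based container used by EPUB 2 and EPUB 3 (OCF): the mimetype entry comes first and is stored uncompressed, and META-INF/container.xml points at the package document. Whether EPUB 4 keeps exactly this layout is not stated in the source; the path OEBPS/content.opf and the function name build_epub_container are assumptions made for the example.

```python
import io
import zipfile

# Standard OCF container file; the package path is a hypothetical choice.
CONTAINER_XML = """\
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub_container(opf_bytes):
    """Return the bytes of a skeletal EPUB archive holding one package file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry must be the first one and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("OEBPS/content.opf", opf_bytes,
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


data = build_epub_container(b"<package/>")
print(zipfile.ZipFile(io.BytesIO(data)).namelist())
```

The result is only a container skeleton; a valid publication also needs a complete package document, content documents and navigation.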
The recognizable and widely implemented EPUB format can, and should, continue to
evolve. But it is important for its identity as a specific type of Web Publication, which
provides the specificity, portability, and predictability required by the publishing
ecosystem, to be maintained in its next, fully Web conformant, generation.