
Ithaka

A Systemwide View of Library Collections

Brian Lavoie, OCLC Research
Roger C. Schonfeld, Ithaka


CNI Spring Task Force Meeting, April 5, 2005


Systemwide View of Library Collections


Print collections have been changing, as the distinction between local and external resources is increasingly blurred by resource sharing
Digitization combined with network technologies creates opportunities for one copy of a resource to be shared across many libraries
These forces will inevitably shift the focus to the resources of the system, rather than individual library collections


Mass Digitization
A great deal of public and private investment in digitization programs, e.g., JSTOR, ARTstor, and of course the mass digitization spearheaded by GooglePrint
Digitization opportunities are unlimited; resources are not

How to determine priorities?
What programs of digitization will be necessary to meet the needs of the scholarly community?


Print Preservation
From a systemwide perspective, what preservation framework makes most sense for print resources?
How have preservation frameworks changed over time?
As retrospective materials become increasingly available in digital form, will new frameworks for print preservation be necessary?


What Are We Going to Do Today?


The kinds of collaborations necessary to begin to take advantage of a systemwide perspective are very hard, from both economic and political standpoints
We will not be proposing any answers!

Instead, we will take advantage of the WorldCat resource, which affords the broadest view of print collections, to build a bridge from a local perspective to the beginnings of a systemwide perspective
Today's presentation focuses on print books


Data Sources
WorldCat: the world's largest and most comprehensive bibliographic database

> 20,000 libraries worldwide have contributed to the development of WorldCat

Copy of WorldCat from January 2005:

~55 million records

Copy of WorldCat holdings file from January 2005:

~950 million holdings


Data Source Limitations

Not all published materials are cataloged in WorldCat
Not all library holdings are represented in WorldCat
Largely reflects North American library collections
So WorldCat does not embody the whole universe of library collections and holdings, but it's a very good approximation!


1. The Systemwide Collection

Size
Age


How Many Books Are Held in the Systemwide Collection?

[Bar chart: WorldCat records remaining after each successive restriction]
Total WorldCat records: 54,831,000
Language-based or manuscript monographs: 45,269,000
Language-based or manuscript monographs, excluding government documents and theses/dissertations: 35,251,000
Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only: 31,923,000
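
The narrowing shown on this chart can also be read as shares of the full database. The short sketch below only restates the four record counts from the chart and divides each by the total; the function name `shares_of_total` is introduced here for illustration and is not part of the presentation.

```python
# Record counts read off the chart (WorldCat snapshot, January 2005).
funnel = [
    ("Total WorldCat records", 54_831_000),
    ("Language-based or manuscript monographs", 45_269_000),
    ("... excluding government documents and theses/dissertations", 35_251_000),
    ("... in print format only", 31_923_000),
]


def shares_of_total(steps):
    """Return (label, count, percent of the first step's count) for each step."""
    total = steps[0][1]
    return [(label, count, round(100 * count / total, 1)) for label, count in steps]


for label, count, pct in shares_of_total(funnel):
    print(f"{count:>12,}  {pct:5.1f}%  {label}")
```

On these figures, the print books analyzed in the rest of the presentation make up a little under three-fifths of all WorldCat records.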


Works and Manifestations


FRBR (Functional Requirements for Bibliographic Records):
Hierarchy of bibliographic entities: Works, Expressions, Manifestations, Items

Work: distinct intellectual or artistic creation
e.g., Macbeth

Manifestation: physical embodiment of an expression of a work
e.g., Macbeth, Folger Shakespeare Library edition, published in paperback by Washington Square Press (2004)

WorldCat records describe FRBR manifestations

Works identified using the OCLC FRBRization algorithm
Converts MARC21 bibliographic databases into FRBR work-sets
http://www.oclc.org/research/software/frbr/
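
To make the work/manifestation distinction concrete, here is a minimal sketch of grouping manifestation-level records into work-sets. It is not the OCLC FRBRization algorithm: the normalized author/title key, the record fields, and the "Hypothetical Press" entries are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def work_key(author, title):
    """Build a crude normalized author/title key (an assumption, not OCLC's rule)."""
    def norm(text):
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return (norm(author), norm(title))


def group_into_work_sets(records):
    """Group manifestation records that share the same work key."""
    work_sets = defaultdict(list)
    for rec in records:
        work_sets[work_key(rec["author"], rec["title"])].append(rec)
    return dict(work_sets)


# Toy manifestation records; only the first reflects an edition named in the slide.
records = [
    {"author": "Shakespeare, William", "title": "Macbeth",
     "publisher": "Washington Square Press", "year": 2004},
    {"author": "Shakespeare, William", "title": "Macbeth.",
     "publisher": "Hypothetical Press", "year": 1990},
    {"author": "Shakespeare, William", "title": "Hamlet",
     "publisher": "Hypothetical Press", "year": 1992},
]

work_sets = group_into_work_sets(records)
print(len(records), "manifestations ->", len(work_sets), "works")
```

A real work-set algorithm has to cope with variant titles, translations, and authority-controlled names, which this key ignores.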


Most Book Works Have Few Manifestations


[Bar chart: print book manifestations and works]
Manifestations: 31,923,000
Works: 26,025,000
Scope: language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only


Print Book Manifestations and Works and Digital Manifestations


[Bar chart: print book manifestations, works, and digital manifestations]
Manifestations: 31,923,000
Works: 26,025,000
Digital Manifestations: 121,689
Scope (as labeled on the chart): language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only
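
The three counts on this chart support two quick ratios: the average number of print manifestations per work, and the size of the digital manifestation count relative to the print one. The sketch below uses only the figures from the chart; the variable names are illustrative.

```python
# Counts taken from the chart.
manifestations = 31_923_000
works = 26_025_000
digital_manifestations = 121_689

per_work = manifestations / works
digital_share = 100 * digital_manifestations / manifestations

print(f"{per_work:.2f} print manifestations per work on average")        # 1.23
print(f"digital count is {digital_share:.2f}% of the print count")       # 0.38%
```

An average this close to one is what the earlier slide title, "Most Book Works Have Few Manifestations", describes.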


How Old Are the Components of the Systemwide Collection?


[Line chart: Cumulative Book Works/Manifestations Over Time. Series: Manifestations, Works. X-axis: year, 1700 to 2000. Y-axis: 0 to 35,000,000]


How Old Are the Components of the Systemwide Collection?


[Line chart: Book Works/Manifestations per Year. Series: Manifestations, Works. X-axis: year, 1700 to 2000. Y-axis: 0 to 800,000]


Age of Works and Manifestations: Relative to 1923 (millions)


[Bar chart, in millions. Y-axis: 0 to 30]
Manifestations: 18% pre-1923, 82% 1923 and after
Works: 17% pre-1923, 83% 1923 and after


2. Individual Collections Cumulate to Form the System

How will digitization bring them together virtually?


Minimal Overlap

[Bar chart: Book Works Held by X or More Libraries (in millions). X-axis: Number of Libraries, from "1 or more" through "10 or more", then "100 or more". Y-axis: 0 to 30]
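
A "held by X or more libraries" figure is a cumulative tally over per-work holding counts. The sketch below shows one way such a tally could be computed; the function name and the toy holding counts are hypothetical and are not WorldCat data.

```python
from collections import Counter


def held_by_at_least(holdings_per_work, thresholds):
    """For each threshold X, count the works held by X or more libraries."""
    counts = Counter(holdings_per_work)  # number of holdings -> number of works
    return {
        t: sum(n for holdings, n in counts.items() if holdings >= t)
        for t in thresholds
    }


# Toy data: number of holding libraries for each of eight works.
toy_holdings = [1, 1, 1, 2, 3, 5, 12, 150]
tally = held_by_at_least(toy_holdings, [1, 2, 5, 10, 100])

for threshold, n_works in tally.items():
    print(f"{threshold:>3} or more: {n_works} works")
```

Because each threshold includes every work counted at the higher thresholds, the series can only fall or stay level as X grows, which is the shape the chart shows.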


Works Held Broadly

[Bar chart: Book Works Held by X or More Libraries (in millions). X-axis: Number of Libraries (10, 50, 100, 200, 300, 400, 500 or more). Y-axis: 0 to 7]


Works Held Broadly

[Bar chart: Book Works Held by X or More Libraries, as Percent of Total Book Works. X-axis: Number of Libraries]
10 or more: 24%
50 or more: 9%
100 or more: 6%
200 or more: 4%
300 or more: 2%
400 or more: 2%
500 or more: 1%


The Virtual System in Practice


GooglePrint digitization initiative
Questions:
How many print books does this initiative potentially impact?
What proportion of the systemwide print book collection does this represent?
Overlap (how much held broadly? how much held uniquely?)

A forthcoming paper from OCLC researchers will offer some perspective on these questions
Hopefully, work like this will help to establish a set of important questions/metrics that need to be addressed when:
Considering digitization initiatives
Considering implications of a changing world of research and learning for collections


3. How Is Rareness Distributed through the System?


Systemwide Holdings of Print Works

More than 5 holdings 33%

1 holding 37%

3-5 holdings 16%

2 holdings 14%


More than 9 million works are held only once

[Bar chart: number of works by number of holdings. X-axis: 1 holding, 2 holdings, 3 holdings, 4 holdings, 5 holdings, 6 to 10 holdings, 11 to 20 holdings, 21-50 holdings, 51-100 holdings, 100+ holdings. Y-axis: 0 to 12,000,000]
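
The claim that more than 9 million works are held only once can be cross-checked against the pie chart on the preceding slide. The sketch below assumes the pie shares apply to the 26,025,000 print book works reported earlier; because the shares are rounded to whole percentages, the results are approximations only.

```python
# Assumption: the pie chart's base is the 26,025,000 print book works.
works = 26_025_000

# Shares from the "Systemwide Holdings of Print Works" pie chart (percent).
shares = {
    "1 holding": 37,
    "2 holdings": 14,
    "3-5 holdings": 16,
    "More than 5 holdings": 33,
}

# Approximate number of works in each slice.
estimates = {label: works * pct // 100 for label, pct in shares.items()}

for label, n_works in estimates.items():
    print(f"{label:<22} ~{n_works:,} works")
```

The single-holding slice comes to roughly 9.6 million works, which agrees with the slide title.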


4. What Systemwide Preservation Frameworks Have Served Us?


The Growth and Peak in Average Holdings Over Time


[Line chart: Average Holdings by age. Series: Manifestations, Works. X-axis: Age in Years, 0 to 200. Y-axis: Average Holdings, 0 to 45]


Steady, Gradual Nineteenth Century Growth in Works Held Many Times


[Chart: number of works by holdings band (2 to 10, 11 to 50, 51 to 100, 101 to 200, 201 to 400, 400 to 1000, 1000+) for each decade from 1801-1810 to 1891-1900. Y-axis: 0 to 200,000]


Rapid Postwar Increase in Works Held Many Times


[Chart: number of works by holdings band (2 to 10, 11 to 50, 51 to 100, 101 to 200, 201 to 400, 400 to 1000, 1000+) for each decade from 1911-1920 to 1991-2000. Y-axis: 0 to 2,500,000]


Of Works with Multiple Holdings, Steady Increase Through the 1960s in the Proportion Held Many Times
[100% stacked chart: share of works in each holdings band (2 to 10, 11 to 50, 51 to 100, 101 to 200, 201 to 400, 400 to 1000, 1000+) for each decade from 1801-1810 to 1991-2000. Y-axis: 0% to 100%]


Summary and Discussion


Summary: Findings
1. Roughly 26 million print title works, represented in 32 million print title manifestations, are held by OCLC member libraries. This should be seen as a minimum in considering the number of printed books over time. Half of the books date from the period since 1977. How can a mass digitization strategy effectively manage the intellectual property ramifications of this finding?

2. Publications are distributed across a wide number of libraries, and any mass digitization strategy that ignores this distributional reality is likely to omit numerous works. How should this finding impact the library system's planning for a massive format migration?


Summary: Findings
3. Rareness is very common within the system. This has been recognized by many librarians but is not always taken into account in policy development. How will any future print preservation strategy address this reality? Can data on rareness help to inform digitization strategies?

4. Redundancy in holdings across the system has changed over time. How has this led our framework for preservation to become more or less secure? What lessons should be drawn as we consider other print preservation strategies, particularly in the era of mass digitization, such as paper repositories? What lessons might there be for digital preservation?


More information
More in-depth article forthcoming
Contact us with comments and questions:
Brian Lavoie: lavoie@oclc.org
Roger C. Schonfeld: rcs@ithaka.org
