The original, annotated version of this metalist appeared in the ARL Bimonthly Report, No. 227, April 2003.
Update details
This version last updated June 30, 2003; links checked June 6, 2003. Please email additions, corrections or comments to Steve Hitchcock.
Overview
Open access eprint archives are where authors of published research papers and papers destined for peer reviewed publication can self-archive the full texts of their work for all to see. Researchers who self-archive want to improve access to papers while preserving the recognised quality control established by journals (Harnad 2001). The engine for growth of these archives is the recognition by researchers and policy-makers that the improved impact achieved through open access, demonstrated by Lawrence (2001), is not only desirable but entirely compatible with peer reviewed publication.
What is the scale of open access eprint archives, and of author self-archiving, currently? Despite the rhetoric there are no quantitative studies. The context for such studies is not just the growing scale of open access archives and the sheer number of archives, but the evolving structure of distributed archives and independent services. Web-based open access archives are not simply collections built for browsing; they also serve as open data sources for powerful, automated independent services such as search, aggregation and impact measurement.
The enabling infrastructure for distributed archives and independent data services was introduced by the Open Archives Initiative (OAI) with its Protocol for Metadata Harvesting (PMH) in January 2001 (Lynch 2001). Tomaiuolo and Packer (2000) provided a checklist of disciplinary 'preprint' archives that, because OAI was then in its infancy, recognised the likely influence of cross-archive services such as search but could not have detected the growth in institutional archives that OAI has subsequently motivated.
So a new checklist is warranted, but a list of open access eprint archives, and examination of their contents, is insufficient as a measure of the challenge. It is important to look at archive service providers too.
Thus, this is not a list of individual open access archives of full-text research papers, but instead lists and comments on other lists of individual archives. This list and its categorisation give a broad overview of the structure, size and progress of full-text open access eprint archives.
This list will be maintained and updated as far as is possible, and is intended to assist further quantitative research on the open access eprint phenomenon for those who want to measure the growth and quality of open access eprint archives.
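For readers approaching such measurement programmatically, the following is a minimal sketch of how an OAI-PMH request might be formed and how a single ListIdentifiers response could be summarised. It is an illustration under stated assumptions, not a description of how any service listed here works: the base URL, the function names and the sample response are hypothetical, and the completeListSize attribute is optional in OAI-PMH 2.0, so many archives may not report it.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical base URL, for illustration only; not a real archive.
BASE_URL = "http://archive.example.org/oai"


def build_request(base_url, verb, params=None):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(params or {})
    return base_url + "?" + urlencode(query)


def summarise_identifiers(xml_text):
    """Count the record headers in one ListIdentifiers response.

    Also reports the optional completeListSize attribute of the
    resumptionToken element, when the archive supplies it.
    """
    root = ET.fromstring(xml_text)
    headers = root.findall("oai:ListIdentifiers/oai:header", OAI_NS)
    token = root.find("oai:ListIdentifiers/oai:resumptionToken", OAI_NS)
    total = None
    if token is not None and token.get("completeListSize") is not None:
        total = int(token.get("completeListSize"))
    return {"headers_in_response": len(headers), "complete_list_size": total}


# Invented sample response (assumption), shaped like an OAI-PMH 2.0 reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-04-04T00:00:00Z</responseDate>
  <request verb="ListIdentifiers" metadataPrefix="oai_dc">http://archive.example.org/oai</request>
  <ListIdentifiers>
    <header>
      <identifier>oai:archive.example.org:1</identifier>
      <datestamp>2003-01-13</datestamp>
    </header>
    <header>
      <identifier>oai:archive.example.org:2</identifier>
      <datestamp>2003-02-21</datestamp>
    </header>
    <resumptionToken completeListSize="140">token-2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_request(BASE_URL, "ListIdentifiers", {"metadataPrefix": "oai_dc"}))
    print(summarise_identifiers(SAMPLE_RESPONSE))
```

A real count would need to follow each resumption token until the archive returns no more, and would need to be dated in the same way as the figures in this list, since archive sizes change daily.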
For a chronological view of the development of open access institutional archives in the wider context of free online scholarship (FOS), including many of the services and archives listed here, see Suber's Timeline of the FOS Movement.
The Budapest Open Access Initiative (BOAI), which supports both open access eprint archives and journals, has reinvigorated the cause and adoption of services providing open access to full-text research papers. While this list covers eprint archives, Bosc et al. offer an overview of new models of scientific communication (in French) that is more in line with the broader BOAI agenda.
References
Bosc, Hélène, Simone Jérôme and Jean-Philippe Schmitt (2003) La communication scientifique revue et corrigée par Internet
Harnad, Stevan (2001) "The Self-Archiving Initiative". Nature, 410: 1024-1025
Lawrence, Steve (2001) "Free Online Availability Substantially Increases a Paper's Impact". Nature Web Debate on e-access, May
Lynch, Clifford A (2001) "Metadata Harvesting and the Open Archives Initiative". ARL Bimonthly Report, No. 217, August
Suber, Peter (2002) Timeline of the Free Online Scholarship Movement
Tomaiuolo, Nicholas G. and Packer, Joan G. (2000) "Preprint Servers: Pushing the Envelope of Electronic Scholarly Publishing". Searcher, Vol. 8, No. 9, October
Structure of the metalist
Where the number of archives given in a source is stated, this is an approximate number intended to give an estimate of size. Since the numbers can change on a daily basis these are dated for reference, either by the last-modified date claimed by the resource when viewed, or the date viewed by the compiler of this list.
Electronic Archives "providing free and unrestricted access to peer reviewed scientific papers and academic publications" http://dmoz.org/Science/Publications/Archives/Free_Access_Online_Archives/
HighWire Press, Earth's Largest Free Full-Text Science Archives (20 archives), list produced to highlight HighWire's Free Online Full-text Articles (see Open access journal archives) as the largest such archive
University of Maryland Libraries, Virtual Technical Reports Center: EPrints, Preprints, & Technical Reports on the Web, "Institutions listed here provide either full-text reports, or searchable extended abstracts of their technical reports". Alphabetical by institution name (last updated March 05, 2003)
University of Virginia Science and Engineering Libraries, Preprint Servers and Databases (33 archives, last modified January 13, 2003), pointers to a variety of electronic pre-print sources in all areas of science and engineering http://viva.lib.virginia.edu/science/guides/s-preprn.htm
Tardis (JISC FAIR project 2002- ), E-print and Related Archives with Subject and Institutional Categories Identified (113 archives, first posted January 2003). Institution, multi-institution, subject and multidisciplinary
Aardvark, Asian Resources for Libraries, Free preprint and full text science archives (115 archives, viewed 20 March 2003)
American Mathematical Society (AMS), Directory of Mathematics Preprint and e-Print Servers http://www.ams.org/global-preprints/
Open Archives Forum, List of Repositories (20 archives, viewed 20 March 2003). No reasons for selection given (OAF is a focus for dissemination of information about European activity related to open archives and, in particular, to the OAI)
OAIster, serving 1,093,169 records from 144 institutions (updated 21 February 2003) http://oaister.umdl.umich.edu/o/oaister/viewcolls.html
Arc, an experimental cross-archive search service, used to investigate issues in harvesting OAI compliant repositories and making them accessible through a unified search interface, List of Existing Archives (140 archives, viewed 4 April 2003)
my.OAI, user customisable search engine covering selected metadata databases from the OAI, see forms-based list of databases in guest search interface (15 archives, viewed 4 April 2003)
Public Knowledge Project, Open Archives Harvester (12 archives, viewed 20 March 2003; listed archives have to request harvesting)
Open Archives Initiative - Repository Explorer, Virginia Tech interface to test archives interactively for compliance with the OAI-PMH, see forms-based predefined archive list in Repository Explorer interface (60 archives, viewed 4 April 2003) http://oai.dlib.vt.edu/cgi-bin/Explorer/oai2.0/testoai
Signal Hill, a European partnership for academic publishing set up by the University Libraries of Utrecht and Delft and Firenze University Press, institutional archives by country (34 archives, viewed 20 March 2003)
Caltech, Collection of Open Digital Archives (CODA), includes more than 10 repositories in production or in development
US Department of Energy (DOE), the Information Bridge, provides the open source to full-text and bibliographic records of DOE research and development reports in physics, chemistry, materials, biology, environmental sciences, energy technologies, engineering, computer and information science, renewable energy, and other topics. Contains full-text documents produced and made available by the DOE National Laboratories and grantees from 1995 forward. Legacy documents are included as they become
EPrints 2 Archives http://software.eprints.org/#ep2 (37 archives)
EPrints 1 Archives http://software.eprints.org/#ep1 (29 archives)
Front for the Mathematics ArXiv, alternative arXiv interface http://front.math.ucdavis.edu/
NASA, Astrophysics Data System (ADS) ArXiv Preprints Query Form http://adsabs.harvard.edu/preprint_service.html
Die Pro-Physik Findemaschine, specialised German search engine, includes arXiv among searchable resources, uses flexible taxonomies to support thematic searching across disciplines http://findemaschine.pro-physik.de/?language=e
NASA ADS Harvard-Smithsonian Center for Astrophysics Preprints (CfA) Preprints Query Form http://adsabs.harvard.edu/cfa/preprints.html
The Stanford Linear Accelerator Center (SLAC), SPIRES HEP literature database contains more than 500,000 high-energy physics related articles including journal papers, preprints, e-prints, technical reports, conference papers and theses, indexed by the SLAC and Deutsches Elektronen Synchrotron (DESY) libraries since 1974 http://www.slac.stanford.edu/spires/hep/
Citebase, citation-ranked search and impact discovery for arXiv (also covers CogPrints and BioMed Central) http://citebase.eprints.org/help/coverage.php
Elsevier, Scirus, "the most comprehensive science-specific search engine on the Internet", covers over 135 million science-related pages, consisting of 120 million Web pages from paid-for sources as well as prominent eprint archives http://www.scirus.com/about/#content
CERN Document Server (CDS), searchable Web interface to over 550,000 bibliographic records, including 220,000 fulltext documents in particle physics and related areas, covers preprints, articles, books, journals, photographs ... http://weblib.cern.ch/
MPRESS, the Mathematics Preprint Search System, a searchable index of preprints from 10 servers, mostly covering geographical servers, but also disciplinary maths servers including Topology Atlas, Algebraic Number Theory Archives and K-theory Preprint Archives, as well as the mathematics part of the arXiv mirror at Augsburg http://mathnet.preprints.org/
US Department of Energy (DOE), PrePRINT Network, searchable gateway to preprint servers that deal with scientific and technical disciplines of concern to DOE: physics, materials, and chemistry, as well as portions of biology, environmental sciences and nuclear medicine. Browse sites at http://www.osti.gov/preprints/ppnbrowse.html
NTRS, NASA Technical Reports Server, search interface for 18 databases http://ntrs.nasa.gov/
http://www.ncstrl.org/
Browse list of participating archives http://www.ncstrl.org:8900/ncstrl/body.html
Networked Digital Library Of Theses And Dissertations (NDLTD), theses rather than eprints, but included here as an example of an archive aiming to present open access to full-text research outputs http://www.ndltd.org/
Open Language Archives Community (OLAC), creating a worldwide virtual library of language resources, 21 participating archives, three service providers including OLAC Aggregator, Swahili Language Resources, and a virtual service provider. Open Language Archives are repositories of language data, documentation and description, including texts, recordings, dictionaries, grammars and field notes, where there is an intent to make the materials openly available, includes any such repository which has an accessible digital component, even if it is just an online catalog or a few digital holdings (use of "open" is inspired by OAI). Less an eprint archive, more a preservation and rescue service for language resources http://www.language-archives.org/index.html
http://repec.org/
The following services provide access to all or part of the RePEc database for browse or search:
RePEc Archives
Current archive providers to RePEc http://ideas.repec.org/archives.html
Participating institutions provide over 1000 RePEc series (many of the top series are journal series or smaller databases). LogEc list of the top 25 RePEc series of the past month http://logec.hhs.se/scripts/seriesstat.pl
Working Papers in Economics
WoPEc, all papers in WoPEc are downloadable but not necessarily free (contains over 80,000 documents in electronic format: 53035 Working Papers, 41895 Journal Articles, last updated 23 March 2003) http://netec.mcc.ac.uk/WoPEc.html
RePEc-modelled archives, not economics
Documents in Information Science (DoIS) is a database of articles and conference proceedings published in electronic format in the area of Library and Information Science, holds about 10042 articles and 3045 conference proceedings, 6928 of them are downloadable (28th February 2003) http://dois.mimas.ac.uk/
A more broadly based database, rclis (Research in Computing, Library and Information Science) is in development http://rclis.org/about.html
http://www.biomedcentral.com/start.asp
PubMed Central (PMC) is the U.S. National Library of Medicine's digital archive of life sciences journal literature (52 participating journals at 20 Feb. 2003) http://pubmedcentral.nih.gov/
HighWire Press Free Online Full-text Articles (list limited to journals published online with the assistance of HighWire Press). At 28 Feb. 2003, 472,871 full-text articles were available free from 1,358,713 total articles http://highwire.stanford.edu/lists/freeart.dtl
Advances in Theoretical and Mathematical Physics is an overlay of the arXiv archives. All papers are archived at LANL and its mirror sites. ATMP maintains only links to the above archive, thus realising one of the first e-journals as an overlay to the global eprint archives http://www.intlpress.com/journals/ATMP/
BBS Prints Interactive Archive of the journal Behavioral and Brain Sciences containing original refereed 'target' papers, open peer commentary and responses (OAI compliant, Eprints.org journal archive) http://www.bbsonline.org/
Psycoloquy, articles and peer commentary in all areas of psychology as well as cognitive science, neuroscience, behavioral biology, artificial intelligence, robotics/vision, linguistics and philosophy (Eprints.org archive) http://psycprints.ecs.soton.ac.uk/
Open access journals per se, without an archive connection, are not included here.
Citeseer (1998- , aka ResearchIndex), developed at NEC Research Institute, NJ, USA, caches openly accessible full-text research papers on computer science found on the Web in Postscript and PDF formats for autonomous citation indexing; it is claimed to index over 500,000 papers. Not yet OAI compliant, but planned to become so http://citeseer.nj.nec.com/cs
eBizSearch (2001- ), administered by the eBusiness Research Center at Pennsylvania State University, based on Citeseer software, autonomously creates citation indexes of e-commerce literature. The search engine crawls Web sites of universities, commercial organizations, research institutes and government departments to retrieve academic articles, working papers, white papers, consulting reports, magazine articles, and published statistics and facts. Not all documents are stored by eBizSearch, which performs a citation analysis of all articles accessed
The International Mathematical Union adopted a resolution (May 2001) encouraging mathematicians to make their work available online: "Open access to the mathematical literature is an important goal. ... Our action will have greatly enlarged the reservoir of freely available primary mathematical material, particularly helping scientists working without adequate library access."
Many journals operate a preprint archive, making electronic copies of papers available before print publication. These are typically not based on author self-archiving nor are they open access, and so are not covered here.