Project Manager: Steve Hitchcock
Lead Institution: Southampton University
Duration of Award: 10/99-09/02
Period of Report: 10/01-12/02
Version history of this report
This version 1.0 DRAFT (internal project use only)
First year report http://opcit.eprints.org/y1report/y1report-final.pdf
Second year report http://opcit.eprints.org/y2report/y2report20.pdf
It has always been clear there is not enough awareness of the importance of open access to published research papers — which means that all users can access the papers free of charge, at any time, anywhere — among the research and academic community. What is needed to increase the provision and use of open access are real tools and services to show that open access works, in forms that are transparently beneficial to authors of research papers. Central to this is the causal connection between research access and research impact: open access increases impact§. In the UK there are signs the next Research Assessment Exercise will use citation analysis*, a way of measuring the impact of published research.
With the completion of the Open Citation (OpCit) Project (http://opcit.eprints.org/), a broadly-based campaign for raising awareness of open access, embracing the Open Archives Initiative (OAI) and the Budapest Open Access Initiative, can now be complemented by software for building open-access eprint archives, GNU EPrints (also known as eprints.org software), and a citation-ranked search engine for open archives, Citebase. Together these tools enable authors to provide open access to their papers, and to measure impact by citation measures as well as usage measures.
The principal partners in the Open Citation Project were Southampton University's IAM Group, the Digital Library Research Group at Cornell University, and arXiv, at the outset of the project based at Los Alamos and now hosted at Cornell.
The method used by the project at Southampton has been to build tools to measure and analyse citations from the 200,000+ papers stored by the arXiv physics archives, the largest eprint archive of its type. These data have been complemented, experimentally, with data on how the archives are used, e.g. which papers are viewed most. Collectively the citation and usage data are stored in Citebase, a citation database which provides a user interface for search and discovery, and a machine interface for analysis of this rich data source by other services.
With the emergence of OAI and the consequent emphasis on institutional archives, it was evident there would be a need for large numbers of local, institution-based archives smaller than arXiv, but which would need to operate on similar principles — low cost, largely automated deposit, offering indexing and dissemination of author-archived content. Software used to build CogPrints, a cognitive sciences archive modelled on arXiv, was rewritten to make it OAI-compliant, and then to make it generic. This became the basis of GNU EPrints, which was further developed within the remit of OpCit to generalise the author and management interfaces for open-access archives.
Of most significance, EPrints builds archives that comply with the OAI Protocol for Metadata Harvesting (PMH). This means that any content deposited within an EPrints-based archive will become visible to users of independent OAI services, such as Citebase, immediately enhancing the chances of discovery. Authors depositing papers in an EPrints archive are not required to have any knowledge of OAI metadata, as it is generated automatically.
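To make the harvesting step concrete, the following is a minimal sketch of how an independent service might request and read OAI-PMH records. It assumes the standard ListRecords verb with the oai_dc (Dublin Core) metadata format. The archive address, the record contents and the helper names (list_records_url, parse_records) are invented for illustration and are not taken from EPrints or Citebase.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and by Dublin Core elements.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an OAI-PMH base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Extract identifier, title and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        identifier = record.findtext(OAI_NS + "header/" + OAI_NS + "identifier")
        title = next((el.text for el in record.iter(DC_NS + "title")), None)
        creators = [el.text for el in record.iter(DC_NS + "creator")]
        records.append(
            {"identifier": identifier, "title": title, "creators": creators}
        )
    return records


# Hand-written sample response (hypothetical archive and paper).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.org:42</identifier>
        <datestamp>2002-10-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example eprint</dc:title>
          <dc:creator>Author, A.</dc:creator>
          <dc:creator>Writer, B.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://archive.example.org/oai"))
print(parse_records(SAMPLE_RESPONSE))
```

The sketch parses a stored response rather than contacting a live archive, so it can be run without network access. A real harvester would also need to handle resumption tokens and error responses.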
What connects papers in open archives to a citation database is a method for automatically extracting metadata and reference lists from the papers. There are many different applications for reference linking. The project at Cornell considered the question "what would be the ideal behavior of a digital object that supported reference linking (both incoming and outgoing)?" Answering this question led to an application programming interface (API) for reference linking.
All three components have been tested, evaluated and demonstrated to be useful by third-party users, and will continue to be developed and integrated within new projects and products beyond the lifetime of the OpCit project.
The activities of the OpCit project were described by Hitchcock et al. (2002a).
Citebase is a citation-ranked search and impact discovery service that measures citations of scholarly research papers that are available on the Web in the larger open access, OAI disciplinary archives - currently arXiv (http://arxiv.org/), CogPrints (http://cogprints.soton.ac.uk/) and BioMed Central (http://www.biomedcentral.com/). Citebase harvests OAI metadata records for papers in these archives, automatically extracting the references from each paper. The association between document records and references is the basis for a classical citation database.
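As a rough illustration of that association (and not a description of how Citebase is actually implemented), a citation database can be modelled as a mapping from each paper to the papers it references, which is then inverted to find the papers citing it. The paper identifiers and the function names below are made up for the example.

```python
from collections import defaultdict


def build_cited_by(references):
    """Invert a paper -> reference list mapping into paper -> citing papers."""
    cited_by = defaultdict(set)
    for paper, refs in references.items():
        for ref in refs:
            cited_by[ref].add(paper)
    return dict(cited_by)


def rank_by_citations(references):
    """Return (paper, citation count) pairs, most cited first."""
    cited_by = build_cited_by(references)
    papers = set(references) | set(cited_by)
    counts = {paper: len(cited_by.get(paper, ())) for paper in papers}
    # Ties are broken by identifier so the ordering is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical records: each key is a paper, each value its reference list.
REFERENCES = {
    "paper:A": ["paper:B", "paper:C"],
    "paper:B": ["paper:C"],
    "paper:D": ["paper:C", "paper:B"],
}

for paper, count in rank_by_citations(REFERENCES):
    print(paper, count)
```

Ranking by citation count is only the simplest of the criteria described in this report; the same inverted index would also support listing the papers that cite a given record.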
The primary means by which users access this database is the Citebase Web interface (http://citebase.eprints.org/) (Figure 1). The user can classify the search query terms (typical of an advanced search interface) based on metadata in the harvested record (title, author, publication, date). In separate interfaces, users can search by archive identifier or by citation. What differentiates Citebase is that it also allows users to select the criterion for ranking results, either by Citebase-processed data (citation impact, author impact) or by terms in the records identified by the search, e.g. date. It is also possible to rank results by the number of 'hits', a measure of the number of downloads and therefore a rough measure of the usage of a paper. This is an experimental feature for analysing the quantitative and temporal relationship between hit (i.e. usage) and citation data, as measures of impact. Hits are currently based on limited data from download frequencies at the UK arXiv mirror at Southampton only.

Figure 1. Citebase search interface showing user-selectable criteria for ranking results
The combination of data from an OAI record for a selected paper with the references from and citations to that paper is also the basis of the Citebase record for the paper. A record can be opened from a search results list. The record contains bibliographic metadata and an abstract for the paper, from the OAI record. This is supplemented with four characteristic services from Citebase:
· Graph of this Article's Citation/Hit History
· All Articles Cited by this Article (Reference List)
· Top 5 Articles Citing this Article (option to view All Articles Citing this Article)
· Top 5 Articles Co-cited with this Article (option to view All Articles Co-Cited with this Article)
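The co-citation service in the list above can be sketched in a few lines. Two papers are co-cited when a third paper lists both in its references. This is a simplified illustration with invented identifiers and a hypothetical co_cited_with helper, not the method Citebase itself uses.

```python
from collections import Counter


def co_cited_with(target, reference_lists, top=5):
    """Count how often other papers share a reference list with the target."""
    counts = Counter()
    for refs in reference_lists.values():
        unique = set(refs)
        if target in unique:
            counts.update(unique - {target})
    # Most frequently co-cited first; ties are broken by identifier.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top]


# Hypothetical citing papers and their reference lists.
REFERENCE_LISTS = {
    "paper:A": ["paper:X", "paper:Y"],
    "paper:B": ["paper:X", "paper:Y", "paper:Z"],
    "paper:C": ["paper:X", "paper:Y"],
    "paper:D": ["paper:Y"],
}

print(co_cited_with("paper:X", REFERENCE_LISTS))
```

Limiting the result with the top parameter mirrors the "Top 5" presentation, with the full list available by raising the limit.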
Another option presented to users from a results list is to open a PDF version of the full paper. This option is also available from the record page for the paper. This version of the paper is enhanced with linked references to other papers identified to be within arXiv, and is produced by OpCit. An earlier evaluation found that arXiv papers are the most appropriate place for reference links because users overwhelmingly use arXiv for accessing full texts of papers, and references contained within papers are used to discover new works (see http://opcit.eprints.org/evaluation/v10/v10evaluation.html).
Prior to a more recent evaluation (section 3.2) Citebase had records for 230,000 papers, indexing 5.6 million references. By discipline, approximately 200,000 of these papers are classified within arXiv physics archives.

Figure 2. Co-citation map of the entire arXiv collection
A first attempt to extend the analysis and presentation of citation relationships was explored with OpCit e-Services (http://opcit.eprints.org/eservices/). Like Citebase, the e-Services framework uses OAI metadata. The case illustrated in Figure 2 uses data from Citebase, for which it then provides advanced services:
· simple visualisations (e.g. number of e-print deposits each year)
· knowledge services (e.g. most significant papers)
· co-citation visualisations (uses the co-citedness of papers as a proximity measure when plotting papers on a graph) (Figure 2)
The approach needs refinement before user interface issues can be tackled. First, the large dataset causes computation to slow significantly. Second, due to erroneous or missing citations, some visualisations may not display convincing or useful patterns.
This was the first detailed investigation of the impact on users of an open access Web citation indexing service. The evaluation, including details of methodology, design and results, has been reported by Hitchcock et al. (2002b).
The following elements of Citebase were the focus of the evaluation:
Given the wide prospective user base, what was evaluated was not just the current implementation of Citebase, but the principle of citation-based navigation and ranking.
The evaluation sought to:
The evaluation used two methods to collect data:
The evaluation was open from June 2002, when the first observational tests took place, to the end of October 2002 when a closure notice was placed on the forms.
Valid submissions to Form 1 were received from 195 evaluators. Although the primary target group were physicists, responses also came from mathematicians, computer scientists, information scientists, cognitive scientists, biologists, health scientists, and others.
The current target user group for Citebase is physicists. The impact being made by OAI should help extend coverage significantly to other disciplines, although because the emphasis of OAI is on promoting institutional archives the impact on disciplines, as measured by services such as Citebase, may take longer to emerge. For this reason there was a need to target this evaluation at prospective users, not just current users. Citebase should be designed for an expanding user base.
Prior to evaluation Citebase had not been announced generally and was little used. The evaluation was first announced to selected discussion lists targeted at colleagues in digital library research, advocates of open access to the scholarly literature, and librarians. The most significant contributor to increased usage was the inclusion of links, on a trial basis, from abstract pages of papers in arXiv to the corresponding Citebase records. Links from arXiv became active on 20th August.
A notable success of the evaluation has been to increase usage of Citebase, in terms of average daily visits, by more than a factor of 10. There is still considerable scope to increase usage of Citebase by arXiv physicists. According to Paul Ginsparg, founder of arXiv: "(Citebase) is a potentially critical component of scholarly information architecture".
Overall, results of the evaluation show there is much scope for improvement but, as exemplified by Citebase, Web-based citation indexing of open access archives is closer to a state of readiness for serious use than had previously been realised.
Within the scope of its primary components, the search interface and services available from a Citebase record, it was found Citebase can be used simply and reliably for resource discovery. The majority of users were able to complete a task involving all the major features of Citebase. More data need to be collected and the process refined before it is as reliable for measuring impact. As part of this process users should be encouraged to use Citebase to compare the evaluative rankings it yields with other forms of ranking.
Citebase is a useful service that compares favourably with other bibliographic services, although it needs to do more to integrate with some of these services if it is to become the primary choice for users.
The linked PDFs are unlikely to be as useful to users as the main features of Citebase. Among physicists, linked PDFs will be little used, but the approach might find wider use in other disciplines where PDF is used more commonly.
One of the most important findings of the evaluation is that Citebase needs to be strengthened in terms of the help and support documentation it offers to users.
The first step must be to examine the results of this evaluation to improve Citebase with a view to establishing it as a service used regularly by arXiv users.
There are wider objectives and aspirations for developing Citebase. Where there are gaps in the open-access literature Citebase will motivate authors to accelerate the rate at which these gaps are filled, especially when it is realised there is a direct correlation, which Citebase will confirm, between open access, increased impact and the outcomes of research assessment exercises.
EPrints is software for building open-access archives aimed at institutions and special-interest communities, and is now used by nearly 60 archives.
In its current incarnation, the name GNU EPrints reflects that it is open source and freely available under the GNU General Public License and conforms to the strict GNU guidelines for free software. The last major release of EPrints, version 2.0, appeared in February 2002, although it has been updated (now on version 2.2.1) to conform with the latest OAI-PMH (also version 2) announced in June. Features of EPrints version 2, described by Gutteridge (2002), include:
· Internationalised metadata stored as Unicode
· Support for multiple archives on one server
· An improved user interface
EPrints is extending its focus on institutional research papers. It is now configurable for adoption as a journal-archive, e.g. Behavioral and Brain Sciences and Psycoloquy, by new open access journals or established journals converting to open access, and will include the facility to manage peer review and peer commentary. It is planned to extend EPrints for structured data handling in, e.g. e-science applications.
The API automatically extracts metadata and reference lists from papers using four principal methods:
Each component produced by these methods can be seen in a typical Citebase record, but this approach is generalisable to other reference linking applications.
A few Java classes were defined to support reference linking in an object oriented way. These methods can be invoked on the surrogate, a special class in the API that encapsulates data regarding a particular online digital object. To use the API, a new surrogate is instantiated, passing it the URL of the online digital object for which information is to be gathered.
The bulk of the analysis within the API program is done by the surrogate constructor. This call downloads the online work, turns it into XHTML, parses the XHTML, and extracts information, such as citations and references. The next call on the API invokes the method that returns the references in the form of an XML document, which is then converted to a string and printed.
It is anticipated that repositories will at some point contain reference linking data, so the API was later extended to support persistent storage of surrogates. Once a surrogate is instantiated, it can be saved to a repository, if desired. Thus one could build a repository of surrogates, which could later be re-instantiated and have the basic API methods invoked on them.
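The pattern described above can be illustrated with a small sketch. The API itself was written in Java and its real class and method names are not given in this report, so everything below is a hypothetical Python stand-in: the Surrogate class, its method names, the injected fetch function and the assumption that references are marked up as list items with class "reference" are all invented for illustration.

```python
import json
import xml.etree.ElementTree as ET


class Surrogate:
    """Hypothetical stand-in for a surrogate; not the Cornell Java class."""

    def __init__(self, url, fetch_xhtml):
        # As in the description, the constructor does the bulk of the analysis:
        # obtain the work as XHTML, parse it, and extract the references.
        self.url = url
        root = ET.fromstring(fetch_xhtml(url))
        # Assumption: each reference is an <li class="reference"> element.
        self.references = [
            "".join(li.itertext()).strip()
            for li in root.iter("li")
            if li.get("class") == "reference"
        ]

    def references_xml(self):
        """Return the reference list as an XML element."""
        root = ET.Element("references", {"source": self.url})
        for text in self.references:
            ET.SubElement(root, "reference").text = text
        return root

    def save(self, repository):
        """Persist the extracted data in a dict-like repository."""
        repository[self.url] = json.dumps(self.references)

    @classmethod
    def load(cls, url, repository):
        """Re-instantiate a saved surrogate without re-analysing the work."""
        surrogate = cls.__new__(cls)
        surrogate.url = url
        surrogate.references = json.loads(repository[url])
        return surrogate


def fake_fetch(url):
    # Stand-in for downloading the work and converting it to XHTML.
    return (
        "<html><body><ol>"
        '<li class="reference">Author, A. (2001) First example paper.</li>'
        '<li class="reference">Writer, B. (2002) Second example paper.</li>'
        "</ol></body></html>"
    )


surrogate = Surrogate("http://example.org/paper.html", fake_fetch)
# Convert the XML document of references to a string and print it.
print(ET.tostring(surrogate.references_xml(), encoding="unicode"))

repository = {}
surrogate.save(repository)
restored = Surrogate.load("http://example.org/paper.html", repository)
print(restored.references)
```

Passing the fetch function in as a parameter keeps the sketch runnable offline. The difficulty noted in this report, that badly written HTML cannot always be converted to XHTML, would surface in the fetch and parse step.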
The API was used to build several applications against online journals (D-Lib Magazine, Journal of Electronic Publishing, ACM Digital Library). With five methods (the original four, plus save) the API was found to be sufficiently usable. The main limitation of the software is that not all HTML pages are equally easy to analyse, e.g. some HTML is badly written and cannot be converted into XHTML and, therefore, cannot be parsed. This is likely to remain a problem on the Web for some time. A more complete description of the reference linking API and its evaluation, including the D-Lib application, can be found in Bergmark and Lagoze (2001).
All three components described above, and a new component, Paracite, a software agent and search interface for parsing and locating raw references on the Web, are usable and will continue to be so beyond the conclusion of OpCit. What is available, the means of access, and plans for maintenance of services, are noted below:
· Citebase is now up-to-date and indexes arXiv fully. Citebase can be searched by users at http://citebase.eprints.org/. A machine interface for data sharing with other services is operational, and Citebase is listed as an OAI 2.0-conforming data provider (http://www.openarchives.org/Register/BrowseSites.pl). Researchers at Old Dominion University have harvested Citebase data as part of their Archon federated digital library on physics (http://archon.cs.odu.edu/), as has the OAI search engine OAIster (http://oaister.umdl.umich.edu/o/oaister/viewcolls.html#c), and arXiv is a possible (re)harvester of Citebase data too. Due to ongoing developments with the data formats, enquiries about the machine interface should be directed to the developer, Tim Brody (tdb01r@ecs.soton.ac.uk). The citation database will continue to be updated and expanded in terms of coverage. Both interfaces to Citebase will continue to be developed and maintained.
· GNU EPrints is available as open source software and is downloadable from http://software.eprints.org/. Machine requirements for running GNU EPrints are other open source components including Linux, Apache Web server, Perl and a MySQL database. GNU EPrints will continue to be developed and maintained.
· The Reference linking API was written in Java and is downloadable from the OpCit project site at Cornell http://www.cs.cornell.edu/cdlrg/Reference%20Linking/. The API is not being developed at present.
· Paracite is still experimental, but can be tried at http://paracite.eprints.org/. There are plans to use the reference linking API within Paracite. As well as providing a user interface, Paracite could mediate between data sources (archives) and linking services (citation databases, OpenURL, etc.). Paracite development is ongoing.
The ideas that have characterised OpCit will be taken forward not just in the products of the project, such as Citebase and GNU EPrints, but in new environments.
The JISC Focus on Access to Institutional Resources (FAIR) programme includes major projects that will extend the use of EPrints-based archives in UK universities through the provision and targeting of new archives and supplementary services:
· SHERPA (Securing a Hybrid Environment for Research Preservation and Access), will build EPrints-based archives at six UK universities, using this experience to report on the implications for management and quality control of the archives.
· E-Prints UK plans to use Citebase software and citation data to enhance its database for discovery of eprint papers available from open access archives hosted at UK universities and colleges.
· TARDIS (Targeting Academic Research for Deposit and dISclosure) will investigate strategies 'to overcome the technical, cultural and academic barriers', which might be found to be restricting the development of institutional eprint archives, by developing a working model of a multidisciplinary institutional archive based on EPrints.
· RoMEO (Rights MEtadata for Open archiving) will canvass users to identify (mis)perceptions about how rights should be formulated and protected for 'give away' works ("texts from which the author does not seek sales revenue"), promoting practical approaches that can be "assigned, disclosed, harvested, and displayed" via the OAI-PMH.
To improve interoperability, scalability and reliability of OAI services, OpCit has worked with a team from Old Dominion University (USA) on infrastructure components such as proxies and caches (Liu et al. 2002). Serious errors in OAI data require an intermediate storage approach: aggregation and caching. Celestial, an OAI aggregator, is software developed to act as a buffer between Citebase and the source repositories so that, e.g. it doesn't overload arXiv. Celestial harvests metadata from OAI-compliant repositories and re-exposes that metadata to other services - in effect an OAI cache. Celestial software is available from http://oai-perl.sourceforge.net/
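The buffering role described for an OAI cache can be sketched as follows. This is a simplified illustration of the harvest-once, re-expose-many-times idea and not the Celestial code: the CachingAggregator class, its methods and the counting harvester are all hypothetical.

```python
class CachingAggregator:
    """Hypothetical OAI-style cache: harvest once, serve many times."""

    def __init__(self, harvest):
        self._harvest = harvest  # callable: repository name -> list of records
        self._cache = {}         # repository name -> cached records

    def refresh(self, repository):
        """Harvest a source repository and replace its cached records."""
        self._cache[repository] = list(self._harvest(repository))

    def list_records(self, repository):
        """Serve records from the cache, harvesting only on first use."""
        if repository not in self._cache:
            self.refresh(repository)
        return list(self._cache[repository])


# Demonstration with a stand-in harvester that counts upstream requests.
upstream_requests = []


def counting_harvest(repository):
    upstream_requests.append(repository)
    return [{"identifier": "oai:" + repository + ":1"}]


aggregator = CachingAggregator(counting_harvest)
aggregator.list_records("source-archive")
aggregator.list_records("source-archive")
print(len(upstream_requests))  # only the first call reached the source
```

Downstream services read from the cache, so the source repository is contacted only when the aggregator chooses to refresh, for example on a schedule.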
EPrints software is undoubtedly the better known product of the OpCit project, and this is reflected in coverage in news and feature sources shown below. It could be argued that Citebase or similar services will ultimately have more impact with users, but EPrints is necessary now and plays a critical role in enabling open-access archives to be filled.
· Colin Steele, E-prints: the future of scholarly communication? InCite, October 2002 http://www.alia.org.au/incite/2002/10/eprints.html
· Konrad Lischka, Der Geist, der aus der Flasche kam, Telepolis magazine, 16th March 2002 (in German) http://www.heise.de/tp/deutsch/special/copy/12031/1.html
Citebase
· Belinda Weaver, Open archives citation tool, InCite, October 2002 http://www.alia.org.au/incite/2002/10/weaver.html
EPrints
· Roy Tennant, Institutional Repositories, Library Journal, 15th September 2002
· Georg C. F. Greve, Brave GNU World - GNU EPrints, Linux Magazin, September 2002 (in German) http://www.linux-magazin.de/Artikel/ausgabe/2002/09/bgw/bgw.html
· Raym Crow, The Case for Institutional Repositories: A SPARC Position Paper, The Scholarly Publishing & Academic Resources Coalition, August 2002 http://www.arl.org/sparc/IR/ir.html
· Kendra Mayfield, College Archives 'Dig' Deeper, Wired News, 3rd August 2002 http://www.wired.com/news/school/0,1383,54229,00.html
· Jeffrey R. Young, 'Superarchives' Could Hold All Scholarly Output, Chronicle of Higher Education, 5th July 2002 http://chronicle.com/free/v48/i43/43a02901.htm
· Anon. The ghost is out of the bottle, Higher Education & Research Opportunities in the UK, 29th March 2002 http://www.hero.ac.uk/inside_he/the_ghost_is_out_of_the_b1365.cfm
· Ivan Noble, Boost for research paper access, BBC Online News, 14th February 2002 http://news.bbc.co.uk/1/hi/sci/tech/1818652.stm
· Ed Sponsler and Eric F. Van de Velde, Eprints.org Software: A Review, SPARC E-News, August-September 2001 http://www.arl.org/sparc/core/index.asp?page=g20#6
· Kendra Mayfield, The Science of E-Publishing, Wired News, 19th October 2000 http://www.wired.com/news/culture/0,1284,39323,00.html
Citebase
GNU Eprints
Paracite
The following papers and reports were produced by the project during the final year of its work from September 2001. All publications covering the project back to 1999 are listed at http://opcit.eprints.org/opcitpapers.shtml
Bergmark, D. and Lagoze, C. (2001) “An Architecture for Automatic Reference Linking”. 5th European Conference on Research and Advanced Technology for Digital Libraries (ECDL), Darmstadt, September http://www.cs.cornell.edu/cdlrg/Reference%20Linking/tr1842.ps
Bergmark, D., Phempoonpanich, P. and Zhao, S. (2001) “Scraping the ACM Digital Library”. SIGIR Forum, Vol. 35, No. 2, Fall http://www.acm.org/sigir/forum/F2001/bergmarkFinal.pdf
Brody, T., Carr, L. and Harnad, S. (2002) “Evidence of Hypertext in the Scholarly Archive”. Proceedings of HT'02, the 13th ACM Conference on Hypertext, University of Maryland, June 2002 http://opcit.eprints.org/ht02-short/archiveht-ht02.pdf
Gutteridge, C. (2002) “GNU EPrints 2 Overview”. Author eprint, Dept. of Electronics and Computer Science, Southampton University, October, and in Proceedings 11th Panhellenic Academic Libraries Conference, Larissa, Greece, November http://eprints.ecs.soton.ac.uk/archive/00006840/
Gutteridge, C. and Harnad, S. (2002) “Applications, Potential Problems and a Suggested Policy for Institutional E-Print Archives”. Author eprint, Dept. of Electronics and Computer Science, Southampton University, September http://eprints.ecs.soton.ac.uk/archive/00006768/
Harnad, S. (2001) “Skyreading and Skywriting for Researchers: A Post-Gutenberg Anomaly and How to Resolve it”. text-e virtual symposium, 14 – 30 November http://text-e.org/conf/index.cfm?ConfText_ID=7
Harnad, S. (2003) “Electronic Preprints and Postprints”. Encyclopedia of Library and Information Science. Marcel Dekker, Inc. http://www.ecs.soton.ac.uk/~harnad/Temp/eprints.htm
Harnad, S. (2003) “Online Archives for Peer-Reviewed Journal Publications”. International Encyclopedia of Library and Information Science. Edited by John Feather and Paul Sturges. Routledge http://www.ecs.soton.ac.uk/~harnad/Temp/archives.htm
Hitchcock, S., Bergmark, D., Brody, T., Gutteridge, C., Carr, L., Hall, W., Lagoze, C. and Harnad, S. (2002a) “Open Citation Linking: The Way Forward”. D-Lib Magazine, Vol. 8, No. 10, October http://www.dlib.org/dlib/october02/hitchcock/10hitchcock.html
Hitchcock, S., Woukeu, W., Brody, T., Carr, L., Hall, W. and Harnad, S. (2002b) “Evaluating Citebase, an open access Web-based citation-ranked search and impact discovery service”. Evaluation report, IAM Dept., University of Southampton http://opcit.eprints.org/evaluation/Citebase-evaluation/evaluation-report.html
Liu, X., Brody, T., et al. (2002) “A Scalable Architecture for Harvest-Based Digital Libraries - The ODU/Southampton Experiments”. D-Lib Magazine, Vol. 8, No. 11, November http://www.dlib.org/dlib/november02/liu/11liu.html First available as arXiv Computer Science cs.DL/0205071, May 2002 http://arxiv.org/abs/cs.DL/0205071
OpCit-related presentations were given at the following meetings during the final year of the project in 2002. The full list of presentations, including presentations from previous years, with a link to keynote presentations by Stevan Harnad, can be found at http://opcit.eprints.org/opcitpapers.shtml
November 6-8 "Academic Libraries of Open and Continuous Access", 11th Pan Hellenic Conference of Academic Libraries, Larissa, Greece
October 17-19 "Gaining independence with e-prints archives and OAI", 2nd Workshop on the Open Archives Initiative (OAI), CERN, Geneva
September 13 “Open Access Journals - will they fly?” ALPSP/OSI round table meeting, London
June 24-25 JISC/NSF Digital Libraries Initiative (DLI) All Projects Meeting, Edinburgh
May 29 “Applications of Metadata”, a one-day conference organised by the BCS Electronic Publishing Specialist Group, London
May 13-14 First Workshop of the Open Archives Forum, Pisa
April 12 “We can't go on like this: the future of journals”, ALPSP International Learned Journals Seminar, London
March 22 The Future of Journal Publishing, Nottingham University
March 4 CURL ePrints workshop, Glasgow
January 24-25 JISC All-Projects Synthesis Meeting, Manchester
The following researchers were involved with the Open Citation Project during the year reported:
Stevan Harnad, Carl Lagoze (Principal Investigators), Wendy Hall (Management Chair), Les Carr (Project Technical Director), Steve Hitchcock (Project Manager), Donna Bergmark (Linking API), Tim Brody (Citebase), Christopher Gutteridge (EPrints), Mike Jewell (Paracite), Zhuoan Jiao, Simon Kampa (e-Services), Arouna Woukeu (evaluation)
Category | Year 1   | Year 2   | Year 3   | Year 4   | Total
CONSUM   | -4278.11 | -3335.49 | -1619.39 | -656.49  | -9889.48
ENTERT   | -161.2   | -636.04  | -220.53  |          | -1017.77
EQUIP    | -13680   | -9611.57 | 1209.11  | -2746.75 | -24829.2
S/W      | -91.2    | 0        | -1020.83 |          | -1112.03
SAL      | -59456.5 | -72068.3 | -47195.4 | -26743.1 | -205463
TRAVEL   | -10769.8 | -8094.37 | -10420.4 | -1165.48 | -30450.1
TOTAL    | -88436.8 | -93745.8 | -59267.5 | -31311.8 | -272762
BUDGET   | 100846   | 90869    | 99861    | 0        | 291576
SURPLUS  | 12409.22 | -2876.78 | 40593.51 | -31311.8 | 18814.11
Missing payments -2500
Year 1: Oct. 1999-Sept. 2000; Year 2: Oct. 2000-Sept. 2001; Year 3: Oct. 2001-Sept. 2002; Year 4: Oct. 2002-Dec. 2002
Underspend in Year 3 was principally due to salaries. The project lost a research assistant, Zhuoan Jiao, at the end of November 2001. Zhuoan was replaced by Tim Brody, who continued to develop Citebase as part of a PhD project, and was paid fees for additional work required by the project.
Year 4 covers an extension to the project to end December 2002 agreed with Rachel Bruce at JISC.
Salaries for Year 4 include a late claim for fees by Tim Brody.
Equipment spending during years 2 and 3 was mainly to expand capacity or replace faulty or damaged equipment.
Equipment spending in Year 4 was to upgrade Citebase to improve service and reliability in anticipation of increased usage due to collaboration with arXiv.
Some of the remaining budget surplus has been identified to fund part-time work into 2003 to complete the project’s publishing and dissemination activities. Versions of the evaluation report will be published, and at least two further papers on different aspects of the project will be published in 2003.
A brief record of progress against the final year work plan given in the previous report
· Evaluation, analysis, dissemination of data mining, user survey, OpCit demonstrators and other OpCit results
o See OpCit Publications 2001-2 above. The Citebase evaluation report will be edited for journal publication.
· Integrate OpCit with arXiv: develop and promote AMF
o Citebase is linked from arXiv on a trial basis. The results of the evaluation indicate there is a basis for permanent linking.
o AMF, an extended OAI-compliant metadata format for sharing rich metadata such as found in Citebase records, is being considered for this purpose along with other formats (see OAI-implementers discussion thread starting at http://www.openarchives.org/pipermail/oai-implementers/2002-June/000518.html)
· Add OpenURL services: links to OpCit linked demonstrator; work with OpenURL resolver services; build an advanced OpenURL generator to turn references in PDF/TeX/LaTeX/HTML papers to OpenURL requests when viewed
o The primary interface to OpCit links is now via Citebase rather than full-text papers in PDF or other formats.
o Experiments have been performed, with partial success, with an OpenURL resolver at VUB (Brussels). Correspondence with Herbert Van de Sompel, principal architect of OpenURL, is ongoing. Author self-archived data tends to be unstructured, and this is a problem for OpenURL. Paracite may offer a solution. The use of OpenURL for transporting Citebase data will continue to be investigated for the ePrints-UK FAIR project.
· Advanced citation analysis: new measures of impact
o OpCit e-Services provide experimental visualisations of: e.g. most significant papers, number of e-print deposits each year, and co-citations (section 3.1.1).
· Implement and test EPrints components for reference checking
o The work has migrated to Paracite, and will be developed further.
· Evaluate OpCit project software
o The evaluation focussed on Citebase, since much of the project’s work on reference linking and citation analysis has converged within this interface.
· Migrate non-OAI archives (e.g. NCSTRL)
o This became unnecessary when Virginia Tech was awarded a grant to move NCSTRL into an OAI-conformant framework using EPrints software (http://128.82.7.99:8900/ncstrl/about_ncstrl.html).
§ Steve Lawrence, Free Online Availability Substantially Increases a Paper's Impact, Nature Web Debate on e-access, May 2001 http://www.nature.com/nature/debates/e-access/Articles/lawrence.html
* Sam Jaffe, Citing UK Science Quality: The next Research Assessment Exercise will probably include citation analysis, The Scientist, Vol. 16, No. 22, Nov. 11, 2002 http://www.the-scientist.com/yr2002/nov/prof1_021111.html