What proportion of preprints are replaced by peer-reviewed reprints?

Written by Tim Brody, last updated August 31 2000 09:09:20.

Using data from the publication rates.

The proportion of papers that have Journal-ref entries is 36.87% (see publication rates). The proportion of papers that have had more than one submission is 21.92% (see question two). If these two properties are independent, the proportion of papers that could have been submitted as preprints and then resubmitted as peer-reviewed reprints is estimated at 0.2192 * 0.3687 = 8.08% (as a proportion of all papers in the archive). The strict upper bound is the smaller of the two figures, 21.92%.
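As a quick check of the arithmetic, the following sketch multiplies the two proportions as fractions and also reports the largest overlap that the two figures allow. The variable names are illustrative; only the two percentages come from the text above, and treating the two properties as independent is an assumption.

```python
# Proportions quoted above, as fractions of all papers in the archive.
p_journal_ref = 0.3687    # papers with a Journal-ref entry
p_multi_submit = 0.2192   # papers with more than one submission

# Expected overlap, ASSUMING the two properties are independent.
independent_estimate = p_journal_ref * p_multi_submit

# The overlap of two groups can never exceed the smaller group.
upper_bound = min(p_journal_ref, p_multi_submit)

print(f"estimate under independence: {100 * independent_estimate:.2f}%")
print(f"largest possible overlap:    {100 * upper_bound:.2f}%")
```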

Using year of publication

The date of publication has a granularity of one year (i.e. only the year of publication is known). It has been stated that the majority of papers take 8 months to be published, counted from the initial preprint being sent to the publisher. If an author submits the preprint at the same time as sending it to the publisher, and then submits the reviewed print at the same time as publication, both submissions can fall within the year of publication, which makes this author indistinguishable from one who submits only at or after publication. This casts doubt on whether the question can be answered from the historical record of abstracts.

From the data available it is possible to say with some certainty that the proportion of authors who replace preprints with reviewed papers is less than 60%: 41.68% of published articles have only had submissions before their year of publication, which leaves at most 100 - 41.68 = 58.32%.

From anecdotal evidence (see The Typical Life of a Scientific Paper, quant-ph/9912113), it is apparent that authors may follow the procedure of:

  1. Submitting the pre-print
  2. Submitting a few fixes (possibly as a result of the review process)
  3. Submitting the final copy, as accepted by the publisher

All of this can happen within a period of one year (although in this particular example the "year" spanned a New Year, so the paper would feature in the following statistics).


In the following counts, any submission made at or after the date of publication is treated as the reviewed work, and any submission made before the date of publication is treated as the un-reviewed work.


Number of papers with submissions both before and at publication:
cat d_publishedcount | ../restrictcol '5/^[^0]/' | ../restrictcol '6/^[^0]/' | wc
= 3265

Number of papers with submissions both before and after publication (excluding papers with submissions before, at and after):
cat d_publishedcount | ../restrictcol '5/^[^0]/' | ../restrictcol '7/^[^0]/' | ../restrictcol '6/^0/' | wc
= 127

Sum = 3392.

If we take the background population of published works as the number of lines in d_publishedcount, 48742 (number of abstracts with matching Journal-ref fields), we can work out the proportion that could be preprints replaced by reviewed reprints:

(3392/48742)*100 = 6.96%
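The two filters and the final percentage can be sketched as follows. This is an illustration under stated assumptions, not the original tooling: it assumes that restrictcol 'N/regex/' keeps lines whose Nth whitespace-separated column matches the regex, and that columns 5, 6 and 7 of d_publishedcount count submissions before, at and after the year of publication. The sample rows are hypothetical.

```python
import re

NONZERO = re.compile(r"^[^0]")  # same pattern as in the restrictcol filters


def is_candidate(fields):
    """True if a row would pass either of the two pipelines above.

    ASSUMPTION: fields[4], fields[5], fields[6] (columns 5, 6, 7) hold the
    counts of submissions before, at and after the year of publication.
    """
    before, at, after = fields[4], fields[5], fields[6]
    if not NONZERO.match(before):
        return False
    if NONZERO.match(at):
        return True                    # before and at publication
    return bool(NONZERO.match(after))  # before and after, but not at


def percentage(part, whole):
    return 100.0 * part / whole


# Hypothetical rows, for illustration only (not real archive data).
rows = [
    "paperA 1999 x x 1 1 0".split(),  # before and at
    "paperB 1999 x x 2 0 1".split(),  # before and after
    "paperC 1999 x x 0 1 0".split(),  # at only
    "paperD 1999 x x 1 0 0".split(),  # before only
]
sample_candidates = sum(is_candidate(r) for r in rows)

# Figures reported in the text.
print(f"{percentage(3265 + 127, 48742):.2f}%")
```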

Using Incremental Update Data between Nov 1999 and Jun 2000

This graph plots, for each number of days since first submission, how many papers had their Journal-ref entry added on that day. It is based on incremental update data from November 1999 to June 2000.

1625 papers that had submissions in the period were initially submitted without a Journal-ref and had one added at a later date. The number of papers with submissions in the period was 23974 (with 1 or more submissions being made to those papers). 5262 papers had 2 or more submissions.

Papers submitted only in the period

grep 'abs' < d_papers | restrictcol '3/^(199911|199912|2000)/' | papers2journaled > d_journaled99

wc < d_journaled99
1092 papers were submitted in the period, and had a Journal-ref added at a later point.

grep 'abs' < d_papers | restrictcol '3/^(199911|199912|2000)/' | countcol -v 2 | countcol -n 1
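Number of submissions (including abstract-only updates) made to papers first submitted in Nov 1999 through Jun 2000. The top row is the number of submissions, the bottom row the number of papers: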

1	2	3	4	5	6	7	8
12742	3079	763	140	37	8	2	2

4031 papers first submitted in the period had 2 or more submissions. Of these, 1092 had a Journal-ref included in their 2nd or later submission. Proportion of papers with more than 1 submission that had a Journal-ref added:
100*(1092/4031) = 27.09%
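The 4031 figure can be recovered from the table of submission counts. A minimal sketch, using only the numbers quoted above (the variable names are illustrative):

```python
# Number of papers (value) by number of submissions (key), from the table.
papers_by_submissions = {1: 12742, 2: 3079, 3: 763, 4: 140,
                         5: 37, 6: 8, 7: 2, 8: 2}

total_papers = sum(papers_by_submissions.values())
multi_submission = sum(n for k, n in papers_by_submissions.items() if k >= 2)

journal_ref_added = 1092  # line count of d_journaled99, as reported above
share = 100 * journal_ref_added / multi_submission

print(total_papers, multi_submission, f"{share:.2f}%")
```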

Number of updates to abstract that also updated the paper (i.e. Journal-ref added along with reviewed print):
restrictcol '4/paper/' < d_journaled99 | wc

100*(47/4031) = 1.17% of papers first submitted between Nov 1999 and Jun 2000 that had 2 or more submissions had a Journal-ref entry added along with an updated paper within the period.

This data needs to cover a more extended time period to identify when the rate of publication is highest (existing data suggests that the typical period between initial submission and publication is 8 months).

Paper updates vs Abstract Updates

This uses the incremental update data from the six-month period to identify which Journal-ref updates were made to the abstract only and which were accompanied by a paper submission.

In the following example record, the mirror received the first deposit on 1999/12/29, and the first deposit that contained a Journal-ref on 2000/06/18. No paper was submitted with the abstract update.

nucl-th/9912061 19991229 20000618 172 abs

  • Paper reference
  • Date of earliest update without Journal-ref
  • Date of earliest update with Journal-ref
  • Days between the above
  • Whether the update with Journal-ref was ABStract or PAPER update
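The record layout above can be read and cross-checked with a few lines of code. This is a sketch only: the function and key names are hypothetical, and it assumes the five fields are separated by whitespace as in the example record.

```python
from datetime import datetime


def parse_record(line):
    """Split one record into the five fields listed above."""
    paper, first, journaled, days, kind = line.split()
    return {
        "paper": paper,
        "first_seen": datetime.strptime(first, "%Y%m%d").date(),
        "journal_ref_seen": datetime.strptime(journaled, "%Y%m%d").date(),
        "days": int(days),
        "kind": kind,  # "abs" or "paper"
    }


record = parse_record("nucl-th/9912061 19991229 20000618 172 abs")

# Recompute the gap from the two dates and compare with the stored field.
gap = (record["journal_ref_seen"] - record["first_seen"]).days
print(gap, record["days"], record["kind"])
```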

countcol 4 < d_journaled:
abs paper
1561 64

100*(64 / (1561+64)) = 3.94% of updates that specify the Journal-ref also include an update to the paper.

For papers submitted in December 1999 and January 2000

2520 papers submitted in Dec 1999.

grep 'abs' < d_papers | restrictcol '3/^(199912)/' | papers2journaled | countcol 4
336 were updated with just Journal-ref, 20 were updated with Journal-ref and a paper.

100*(20/2520) = 0.79% of papers.

2357 papers submitted in Jan 2000.

grep 'abs' < d_papers | restrictcol '3/^(200001)/' | papers2journaled | countcol 4
224 were updated with just Journal-ref, 10 were updated with Journal-ref and a paper.

100*(10/2357) = 0.42% of papers.
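The two monthly figures can be put side by side as follows. The counts are those reported above; the helper name and the extra "any Journal-ref" rate are illustrative additions, not part of the original analysis.

```python
# month: (papers submitted, Journal-ref only, Journal-ref with paper)
months = {
    "Dec 1999": (2520, 336, 20),
    "Jan 2000": (2357, 224, 10),
}


def monthly_rates(submitted, abs_only, with_paper):
    """Return (% with any Journal-ref added, % replaced along with a paper)."""
    any_ref = 100 * (abs_only + with_paper) / submitted
    replaced = 100 * with_paper / submitted
    return any_ref, replaced


results = {month: monthly_rates(*counts) for month, counts in months.items()}

for month, (any_ref, replaced) in results.items():
    print(f"{month}: Journal-ref added {any_ref:.2f}%, with paper {replaced:.2f}%")
```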

Over the entire period, nucl-th had no papers replaced with the reviewed paper (65 papers were updated with a Journal-ref). Over a similar period, hep-ph had 8 papers updated, with 217 Journal-refs added.