
How often are papers changed and updated after initial submission? How extensive are the changes?


Written by Tim Brody, last updated August 31 2000 09:09:20.

Using data from frequency of submissions.


This graph shows how many submissions there have been for papers. We are only analysing papers that have had more than one submission.

This graph is a plot of the number of papers with submissions against the days from the initial submission. This is based on the entire population of papers in the LANL archive. Because this data is based on the abstract's date records, it can give no indication of the nature of re-submissions. Papers with only one submission do not appear.

The following graphs show the cumulative updates, separated into the first, second, third and fourth update.




The following data is based on:
The last six months of updates to the archive
Taking the "diff" between a submission and the previous submission of that article (and recording the number of days between the two submissions).
The number of lines of difference, measured on the abstract if the paper itself was not updated, and on the paper otherwise.

This graph plots the amount of change made to the abstract/document against the time of the update (measured from the previous update).

This graph shows the average amount of change (sum of changes / number of papers) per day since initial submission, excluding days on which 3 or fewer papers were changed. It only includes changes to TeX-only documents (no .tar.gz or .pdf files).

The following graphs are based on performing the following:

  1. If there is no paper for this change, diff the two abstracts:
       diff abstract1 abstract2
  2. Try to find a paper in a gzipped tarball (.tar.gz), unpack the tarball, filter the TeX into text with one word per line, and diff the two results:
       cat *.txt | detex | tr "\ " "\n" | grep -E ".+" > outfile
       diff outfile1 outfile2
  3. Try to find a paper in a gzipped source file (.gz), unpack the source, detex it, and compare it:
       gunzip -c file | detex | grep -E ".+"
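The measuring step above can be sketched as a small shell function. This is only an illustration under stated assumptions: the names words_per_line and count_changed_lines are hypothetical and not part of the original scripts, the detex stage is left out so the sketch runs on plain text, and counting the lines that diff marks with "<" or ">" is one possible reading of "lines of difference" (the original measure is not stated).

```shell
#!/bin/sh
# Hypothetical helper (not one of the original scripts): put one word on
# each line and drop empty lines, as in the tr/grep stage of step 2.
words_per_line() {
    tr ' ' '\n' < "$1" | grep -E '.+'
}

# Hypothetical helper: print the number of lines that diff reports as
# removed ("<") or added (">") between two versions of a text file.
# ASSUMPTION: this is one possible definition of the amount of change.
count_changed_lines() {
    old=$(mktemp) && new=$(mktemp) || return 1
    words_per_line "$1" > "$old"
    words_per_line "$2" > "$new"
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    diff "$old" "$new" | grep -c '^[<>]' || true
    rm -f "$old" "$new"
}
```

Calling count_changed_lines version1 version2 prints a single number. Because each word sits on its own line, a changed word counts once as a removal and once as an addition.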

restrictcol '3/^(199911|199912|2000)/' < d_papers | papers2changes > d_updates6month
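restrictcol appears to be a local script, and its exact behaviour is not documented here. Assuming it keeps only the rows whose given column matches a regular expression, and assuming whitespace-separated columns, a rough stand-in could look like the following. The name restrict_column and the sample layout of d_papers are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for restrictcol (ASSUMPTION: the real script
# keeps rows whose numbered column matches an extended regular expression).
# Usage: restrict_column COLUMN REGEX < input
restrict_column() {
    awk -v col="$1" -v re="$2" '$col ~ re'
}
```

Under these assumptions, restrict_column 3 '^(199911|199912|2000)' < d_papers would keep the rows whose third column starts with 199911, 199912 or 2000.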

This graph shows the average change (total amount of change divided by the number of papers with a change on that day) made to papers, against the number of days since the paper was submitted. Days with fewer than 5 submissions are ignored.
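The averaging behind this graph can be sketched with awk. This is a minimal illustration, not the original script: the function name average_change_per_day and the input layout (one update per line, holding the days since submission and the lines changed) are assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not one of the original scripts).
# Input on stdin: one update per line, "<days since submission> <lines changed>".
# Output: "<day> <average change>" for every day that has at least
# MIN updates (first argument, default 5), sorted by day.
average_change_per_day() {
    awk -v min="${1:-5}" '
        { total[$1] += $2; count[$1]++ }
        END {
            for (day in total)
                if (count[day] >= min)
                    printf "%d %.1f\n", day, total[day] / count[day]
        }
    ' | sort -n
}
```

With the default threshold of 5, days with fewer than 5 updates are dropped. Passing 4 instead would drop days with 3 or fewer updates.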

This graph shows the average changes made to papers, compared to the number of days since the paper was submitted.

This graph shows the average updates made to papers submitted in December 1999 (a total of 175 changes were made). This excludes changes made to the abstract.

restrictcol '2/^hep-ph/' < d_papers | papers2changes > d_updateshep-ph

Changes made to the hep-ph section (1943 changes made) during the period between November 1999 and June 2000. Excludes changes made to abstracts.

restrictcol '2/^math-ph/' < d_papers | papers2changes > d_updateshep-ph

Changes made to the math-ph section (102 changes made) during the period between November 1999 and June 2000.

Updates made over a 7-month period

Here we go again, hopefully correct this time!

Only includes updates in the 7 months of incremental data (a total of 3612 updates to papers).


