Last updated August 31 2000 10:09:20.
The ratio or the difference, or even (P-U)/(P+U)
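The three options can be compared side by side. This is only a sketch: the notes do not define P and U at this point, so treating them as two non-negative counts is an assumption, and the function name `compare` is made up for illustration.

```python
def compare(p, u):
    """Contrast two non-negative counts p and u in three ways.

    What P and U stand for is not stated in the notes; this sketch
    only assumes they are counts that can be compared.
    """
    total = p + u
    return {
        # The ratio is undefined when u is zero, so report None.
        "ratio": p / u if u else None,
        "difference": p - u,
        # The normalised form stays within [-1, 1] for non-negative counts.
        "normalised": (p - u) / total if total else 0.0,
    }


result = compare(30, 10)
print(result)
```

The normalised form may be the easiest to compare across groups of different size, since it does not grow with the totals.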
Divide the papers into hi/med/lo impact using several measures:
The experimental design is then Impact (3 levels) by sector (4 levels)
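A minimal sketch of how the 3 x 4 design could be tabulated. It assumes a single impact measure per paper and splits papers into thirds by rank; the sector labels A to D, the paper ids and the scores are placeholders, since the notes do not list them here.

```python
from collections import Counter


def assign_levels(scores):
    """Map each paper id to 'lo', 'med' or 'hi' by rank thirds.

    scores: dict of paper id -> impact measure (any one of the measures).
    Ties are broken by sort order, which is a simplification.
    """
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    levels = {}
    for rank, paper in enumerate(ranked):
        if rank < n / 3:
            levels[paper] = "lo"
        elif rank < 2 * n / 3:
            levels[paper] = "med"
        else:
            levels[paper] = "hi"
    return levels


# Placeholder data: paper id -> (sector, impact measure).
papers = {
    "p1": ("A", 2), "p2": ("B", 15), "p3": ("C", 7),
    "p4": ("D", 40), "p5": ("A", 1), "p6": ("B", 22),
}

levels = assign_levels({pid: score for pid, (_, score) in papers.items()})

# Count papers in each (impact level, sector) cell of the design.
table = Counter((levels[pid], sector) for pid, (sector, _) in papers.items())
for cell, count in sorted(table.items()):
    print(cell, count)
```

Running the same tabulation once per impact measure would show whether the measures put papers in the same cells.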
Zhouan: Currently working on the tools to produce the citation information. It would also be easier to ascertain the impact factor of a paper than of an author, given the different name formats and the fact that many authors are credited on one paper (should the position of the author be considered?).
Defining High Impact
Author's hit-rate; author's citation-ratio ("impact factor")
Hypothesis: That authors who use the archive will have a higher "impact factor" than those who don't (over period 1991 -> 2000).
Important. Here we can use other forms of analysis: Latent Semantic Analysis (I can contact Tom Landauer about the LSA software for research purposes), Shimon Edelman's similarity metric, shared keywords, co-citation
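Of the options listed, shared keywords and co-citation are the simplest to prototype. The sketch below uses Jaccard overlap between two sets, which is one possible choice and not something the notes specify; the keyword sets are invented for illustration.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections: |A & B| / |A | B|.

    Returns 0.0 when both are empty. For shared keywords, pass each
    paper's keyword set; for a co-citation style measure, pass the
    set of papers that cite each one.
    """
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented keyword sets for two hypothetical papers.
paper_x = {"qcd", "lattice", "quark"}
paper_y = {"qcd", "quark", "higgs"}

similarity = jaccard(paper_x, paper_y)
print(similarity)
```

A score like this could serve as a cheap baseline against which LSA or another similarity metric is judged.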
Produce report on LSA technique.
Contact each of the other mirror sites (compose a letter and send it to me: I could edit and send for you).
LSA and other techniques
How valid is the use of LSA? To make an accurate assessment of the "spread" of an area, a physics dictionary will be needed plus a "core" set of papers that should be in the area. What will this tell us about the archive?
Area Analysis - does this
question? What details are needed for the kinds of updates?
Re-writes of text-body (how big), re-writes of abstract, and front-matter, journal reference insertion
Hypothesis: That high-impact authors will deposit papers that get published/are published. Low-impact authors will submit articles that will never be published. Papers that aren't published - why are they submitted to XXX?
For papers that are not tech-reports/non journal-refed:
ASTRO-PH - does Astro store pre-prints? Are authors using XXX to store just preprints because they can't store them in Astro? Look in Astro/contact authors to find out behaviour.
and at each impact level -- and compare across the years as XXX grew and practice evolved...
and AAS and maybe even ISI
Contact authors who updated with JR, but not paper, why they didn't/whether they made changes.
This is one of many variables you will want to correlate with impact (which can be measured the 4 ways mentioned above): latency (how soon the hits occur); whether journal ref is given; sector; etc.
For hep-ph (the largest area in the archive), during the 7-month period, only 8 papers were replaced and 217 had their abstracts updated. Is there enough data to answer this question? - tdb198
i) Is a paper published?
Meeting 26/7/2000 Ian Hickman
Actions:
Tim: Think up preamble for questionnaire, estimate what people are going to send back.
Tim: Break up age of citations by impact factors.
Ian: What proportion of papers are never hit?
Tim: Proportion of Orange/Red links over time.
Tim/Ian: Qualifying and explaining data.
Meeting 18/7/2000 Stevan Harnad
Meeting 11/7/2000 Les Carr
SCOOT - script to apply spotcite to hepth on arabica /export/2/XXX_PDF
(note these are my notes, so please don't fry me if I get anything wrong!)
(Bits relevant to ePrint usage research:)
LSA: Could Ian research the LSA technique/produce some info on how it works? Harnad: Need to have a "core" set of HEP papers to test against.
Harnad: 4 tests for impact of articles:
Steve Hitchcock: Where do we want to go with impact factor?
Tim: SPIRES doesn't contain publication [journal-ref] entries for all papers (of a sample of 10, 2 had j/r). Harnad: For papers that do not contain journal-refs, need to contact a sample of authors to ascertain what has happened to these papers. Has the article been published in a book/conference etc.? Zhouan: This is classed as published.
Astro-ph section of XXX. Why is it so popular? Low proportion of deposited papers have J-R; how does this relate to ASTRO e-Archive? Contact authors/ASTRO to find out what the deposits in XXX are. Tim: Large number of technical reports in astro.
Concerning updates to papers/journal-ref addition. Steve Hitchcock: Authors who update with J-R? Contact authors to find out whether they changed paper/why they didn't (didn't bother because very little change/because it was published and they want people to look in journal?)
Citation analysis: [from earlier] can Zhouan produce some statistics on citation ratios/can Ian look at Les' code to extract this info? Use ISI to get citation ratios?
Zhouan: Questions over author extraction; how much sharing of names is there?
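One rough way to gauge name sharing is to reduce each author string to surname plus first initial and see how many distinct strings collapse onto the same key. This is a sketch under assumptions: the two input formats handled ("Surname, Given" and "Given Surname") are guesses at what the archive contains, the sample names are invented, and multi-word surnames are not handled.

```python
from collections import defaultdict


def name_key(name):
    """Reduce an author string to 'surname initial', lower-cased.

    Handles 'Surname, Given' and 'Given Surname' only; this is an
    illustrative simplification, not the project's extraction tool.
    """
    name = name.strip()
    if not name:
        return ""
    if "," in name:
        surname, _, given = name.partition(",")
    else:
        parts = name.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    given_parts = given.replace(".", " ").split()
    initial = given_parts[0][0].lower() if given_parts else ""
    return f"{surname.strip().lower()} {initial}".strip()


# Invented author strings in mixed formats.
names = ["Smith, John", "J. Smith", "Jane Smith", "A. Jones"]

groups = defaultdict(set)
for raw in names:
    groups[name_key(raw)].add(raw)

# Keys with more than one raw form are either format variants of one
# author or different authors sharing a name; the key cannot tell which.
shared = {key: forms for key, forms in groups.items() if len(forms) > 1}
print(shared)
```

Here "John" and "Jane" collapse onto the same key, which is exactly the ambiguity that makes per-paper impact easier to establish than per-author impact.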
Date of next tech meeting: 2 Weeks From Now