Systemwide Library and Scholarly Information Advisory Committee
February 11, 1999
Meeting Notes
Members present: Christ (Chair), Chandler, Coleman, Copeland, Henry, King, Lucier, Lynch, McCredie, Hartford for Miller, Pantelia, Peete, Pryatel, Samuelson, Gordon for Sautter, Stead, Varian, Werner, Lawrence (staff)
Guests: Paul Ginsparg, Los Alamos National Laboratory
Members absent: Beck, Kennel, Lynn, Michaels, Sharrow, Tomlinson-Keasey
Objectives for this Meeting
(NOTE: PowerPoint visuals attached)[PDF]
Varian began with an overview of the benefits (fast, inexpensive, inclusive) and drawbacks (uncertain methods for peer review and version control, questions about permanence) of electronic preprint (e-print) archives as a means of scholarly communication, and identified several models for e-print services (centralized, decentralized but moderated, unmoderated). Publication in peer-reviewed journals is a signal of quality in the present system, but a costly signal. Because print publication is expensive, it is important to filter material before publication; in the digital environment, because marginal costs are much lower, filtering post-"publication" is feasible, but shifts the cost of quality control to the broad user community: with more unfiltered material available, the scarce resource is not money, but the attention of readers. Ex post filtering need not be binary as it is in print (publish/reject), but can be relative and based on a variety of measures (server "hit rates," explicit rating or ranking by referees, endorsements, expert recommendations, etc.), and can contribute to improving the quality of published communications through author revisions based on comments and ratings, as long as a form of version control is available. In the digital environment, ratings can be expressed in two dimensions: is it interesting, and is it correct? Digital publishing also enables published communications to be expressed in a greater variety of forms than at present, e.g., abstract, five-page summary (which might be particularly aimed at the non-specialist), full article, and detailed appendix (with data, methodological detail, extended narrative, links to related resources, etc.). Multidimensional ratings can be articulated with multiple forms in a variety of new ways, e.g., interest/importance could be judged by expert raters prior to publication, with assessments of correctness added by readers after posting.
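(NOTE: The two-dimensional rating idea, combined with version control, can be illustrated with a minimal sketch. This is only an illustration under stated assumptions and is not a description of any existing system: the class names, the 1-to-5 scale, and the rule that a rating attaches to the latest revision are all hypothetical.)

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Rating:
    version: int      # which revision of the e-print was rated
    interest: int     # hypothetical 1-5 scale: is it interesting?
    correctness: int  # hypothetical 1-5 scale: is it correct?


@dataclass
class EPrint:
    title: str
    versions: list = field(default_factory=list)  # text of each revision, oldest first
    ratings: list = field(default_factory=list)

    def revise(self, text):
        """Store a new revision and return its 1-based version number."""
        self.versions.append(text)
        return len(self.versions)

    def rate(self, interest, correctness):
        """Record a rating against the latest revision (an assumption of this sketch)."""
        self.ratings.append(Rating(len(self.versions), interest, correctness))

    def scores(self, version=None):
        """Return (mean interest, mean correctness) for one revision, or None if unrated."""
        version = version or len(self.versions)
        rated = [r for r in self.ratings if r.version == version]
        if not rated:
            return None
        return (mean(r.interest for r in rated), mean(r.correctness for r in rated))
```

Because each rating records the revision it refers to, an author who revises in response to comments starts a fresh set of scores without erasing the assessment of the earlier version.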
Other features that could be added to the e-print publication/rating system might include threaded commentaries, e-mail notification of new publications or comments on existing ones, linking to reviews and recommendations, etc. The NSF-sponsored Digital Library project at UCB under Prof. Wilensky’s leadership (one component of the joint InterLib project proposed for the Digital Library 2 competition by Berkeley, Santa Barbara and Stanford with participation by the San Diego Supercomputer Center and the California Digital Library) provides one example, featuring a technology called Multi-Valent Documents that has enabled development of an Electronic Research Notebook that both ties together various elements of a research project in progress and facilitates development, review and analysis of content related to the research.
Among the topics raised in discussion were the economics of scholarly monographs; the importance to authors of ex ante review as a means of validating and improving the paper before publication; the risk and the value to junior faculty of e-print (e-print publication can be risky pre-tenure, but provides a means to establish reputation and build networks of colleagues); the possibility that lower cost for e-print could lead to expanded exposition, adding to cost and the demand for reader attention (in print, editors contained this tendency during the ex ante editorial process); and the possibility of a "UC Journal of Partially-Baked Ideas" (PBIs) as a means to (finally) publish insignificant results.
Ginsparg provided an overview of the economics of conventional scholarly publishing and the characteristics of the LANL Preprint Server, noting that for scientific journals in general, average revenue per article is about $4,000, while for the preprint server, costs are about $3 to $4 per article added to the archive. He noted that the LANL archive addresses many of the issues raised by Varian, including version control. A committee member raised the concern about publishers that will not publish material previously deposited in an e-print archive; Ginsparg cited the American Mathematical Society as an association publisher with an enlightened view on this topic. He presented a schematic view of an e-print publishing model in which a core set of archival servers, containing unfiltered contributions, is surrounded by an "outer ring" of "virtual journals," containing links to the archival copies of articles selected by the journal editors according to their own quality-filtering criteria and methods. Users could search the archives directly and/or make use of the "virtual journals" to identify articles of interest and value. Archival servers could be operated by institutions or other non-profit entities having the common mission of disseminating research results widely at minimal cost, while "virtual journals," as value-added services, could be operated by scholarly and scientific societies or commercial publishers, as well as by individuals with a personal or professional interest in gathering and making available the best research in their area of interest. This model, which could in principle support all of the ex post filtering features discussed by Varian, would permit an article to be linked to multiple virtual journals, adding yet another dimension to quality assessment.
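(NOTE: The archive-plus-"virtual journal" model can be sketched in a few lines of code. This is a rough illustration of the schematic only, not of the LANL server's actual software: every class, function, and journal name below is hypothetical, and the editors' selection criteria are reduced to a simple function over the article title.)

```python
from collections import Counter


class Archive:
    """Core archival server: accepts every contribution unfiltered and assigns an ID."""

    def __init__(self, name):
        self.name = name
        self._papers = {}  # paper ID -> title

    def deposit(self, title):
        paper_id = f"{self.name}/{len(self._papers) + 1:04d}"
        self._papers[paper_id] = title
        return paper_id

    def title(self, paper_id):
        return self._papers[paper_id]

    def search(self, word):
        """Direct search of the archive, bypassing any journal."""
        return [pid for pid, t in self._papers.items() if word.lower() in t.lower()]


class VirtualJournal:
    """Outer-ring service: stores only links to archival copies, never the articles."""

    def __init__(self, name, accepts):
        self.name = name
        self.accepts = accepts  # the editors' own quality filter (hypothetical)
        self.links = []

    def consider(self, archive, paper_id):
        if self.accepts(archive.title(paper_id)):
            self.links.append(paper_id)
            return True
        return False


def selection_counts(journals):
    """Count how many virtual journals link to each article."""
    return Counter(pid for journal in journals for pid in journal.links)


# Hypothetical usage: two journals with different criteria over one archive.
archive = Archive("eprints")
first = archive.deposit("Lattice methods at finite temperature")
second = archive.deposit("A note on string duality")
specialist = VirtualJournal("Hypothetical Lattice Letters", lambda t: "lattice" in t.lower())
generalist = VirtualJournal("Hypothetical General Review", lambda t: True)
for journal in (specialist, generalist):
    for pid in (first, second):
        journal.consider(archive, pid)
counts = selection_counts([specialist, generalist])
```

The count of journals linking to an article is one way to read the "yet another dimension" of quality assessment mentioned above; an article that no journal selects still remains in the archive and can be found by direct search.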
It was pointed out that there would be a natural tendency for multiple editors to link to the same core set of highly influential articles, effectively discounting the contributions of others, and perhaps leading to the neglect and eventual loss of "heretical" works or those whose value only becomes evident years after publication. Others, however, expressed the view that with a very low cost of entry into the "virtual journal" market, it would be easy for individuals and groups to step in to organize and publicize the work that was neglected by the "mainstream" publications. McCredie asked whether there were pieces missing from the current technology infrastructure that would be needed to achieve such a model. In Ginsparg’s view, the key missing element is a low-level standard for the formatting of documents. Existing standards (e.g., TeX, PostScript, PDF, MS-Word) are all deficient in one way or another, and the absence of an acceptable and ubiquitous standard presents a significant barrier to rapid expansion of e-print archives. (NOTE: Varian has subsequently observed that LaTeX appears to be the most effective current standard for scientific and technical publication; MS-Word, XML, even Math-ML are still of poor quality compared with TeX. PDF is satisfactory only if people use it correctly. At some point Math-ML (a variant of XML) will likely replace TeX/LaTeX, but it is not clear when that will occur.)
Ober discussed the CDL/UCLA database of editors of major journals and the December 1998 focus group sessions held at UCLA. Werner contributed that, from her perspective, the consistent message from all the groups of editors was that they lacked knowledge of the intricacies of copyright law and intellectual property issues. The committee discussed the concern that the design and implementation of the focus group project appeared to focus chiefly on the sciences. Lucier reiterated the recommendation of the Library Planning and Action Initiative’s Advisory Task Force that CDL development focus initially on the sciences, and noted that, because the CDL had consulted widely in preparing a detailed collection development plan for sciences pursuant to this recommendation, the identification of major journals in the sciences was more completely developed than in other disciplines. Collection development work in other disciplines is proceeding, however, and Lucier noted that the calls he receives from UC faculty interested in advice and assistance in their digital publishing efforts are increasingly from arts and humanities scholars. Werner noted further that any appearance of bias in the identification of journals and editors can be remediated by issuing calls for self-identification from campus faculty, as was done at UCLA. Ober indicated that the forum methodology could be implemented at other campuses if they wished, and reported that the CDL is considering extending the forum concept to junior faculty and graduate students who are not yet editors.
(NOTE: this item was deferred to the end of the meeting.) Lawrence summarized the CDL’s experience to date with negotiation of licensing terms and conditions with commercial and society publishers, noting that the CDL’s objective in negotiating license terms is to ensure that our digital collections are at least as available, usable and reliable as our print collections. He observed that the licensing regime is very different from the statutory regime that governs acquisition and use of print materials, and that all parties – publishers, libraries, and users – are feeling their way through the complexities of this new environment.
Lucier noted that President Atkinson has expressed strong interest in seeing UC take leadership in digital scholarly communication. To this end, Lucier has been working with a small group of major research institutions and publishing associations, with the assistance of the Council on Library and Information Resources, as well as having discussions with UC faculty who have approached him with regard to their digital publishing projects, plans and interests. As a result of these discussions, Lucier has developed the following working principles or assumptions, regarding which he seeks the Committee’s views:
There was substantial agreement that, from the faculty perspective, costs and budgets are secondary concerns – the key issue is access.
Discussion centered on the following themes:
Christ and Varian felt that it was important to focus first on the development of an unfiltered preprint service, and make every attempt to avoid becoming mired in the peer review issues. While it is important to position this initiative so that UC as an institution is not responsible for assessment, provision for assessment must be made in the design. Look for partnerships with scholarly and professional societies to address this issue. Copeland felt that, to gain adoption in a faculty culture that expects rigorous assessment of scholarly performance on a three-year cycle, there must be some means of assessing and rewarding value from the beginning. Samuelson, however, envisions this as operating in parallel with print, where assessment would continue to take place through traditional means.
King observed that an initiative of this kind could work, but be prepared for success; rapid scale-up requires careful planning and adequate investment. The initiative presents a "bootstrap problem": no one will sign up until there’s critical mass, but critical mass can’t be attained without participants. This suggests that the leaders of this initiative must be prepared to invest significant sums in the preprint service, but we needn’t worry about filtering mechanisms; if the strategy is successful, this area will develop by itself. A large number of small players may be preferable to partnering with one large society. Copeland asked how publishers might view this development, suggesting that the positions of the major publishers in each discipline could influence choice of area to develop.
Based on previous experience in unrelated areas, Coleman suggested development of a list of "readiness criteria" to evaluate whether an institution, discipline, or organization was ready to move forward in this area. Among the criteria suggested by members were:
Varian suggested that another way to identify readiness and interest would be to issue a call to faculty.
Lucier recapped the discussion as follows:
In Michaels’ absence, Lucier outlined the characteristics of the 1999-2000 Governor’s Budget, the University’s strategy for working with the administration on the University’s overall budget plan, and the prospects for funding of the Library Budget Initiative. It was the consensus of the Committee that libraries should be considered part of the University’s core funding needs within a new compact, that any new funding that becomes available should be deployed for the purposes described in the Regents’ Budget, and that the Committee should write to the Provost conveying these views.
Gary S. Lawrence
Academic Initiatives/California Digital Library
3/10/99