Integrating electronic publications in the library - Managing the transition from print to digital

Hans Roes
Tilburg University Library
Spring 1997


Paper presented at the International Summer School on the Digital Library, Tilburg, August 1997.


Abstract

Starting from a rather vague concept of integration as an evolutionary strategy, the emerging world of electronic resources is analyzed in relation to traditional library functions of selection, organisation and provision of information. Although electronic resources pose new problems to the library organisation, these problems are more practical than fundamental in nature. Libraries have adapted to new media throughout history and will continue to do so. In the short term the integration of new media calls for more integration between different departments within the library and between library and computer centre; in the long term new types of jobs might emerge to enable a more proactive approach to patrons' information needs and to filter the information provided to patrons. Catalogues and reference databases are a good tool to integrate the existing printed and emerging electronic information spaces. Products now appearing on the market which give access to electronic (versions of printed) journals cannot always easily be integrated with existing catalogues and reference databases.

Integration and libraries

"integration \In`te*gra"tion\, n. [L. integratio a renewing, restoring: cf. F. int['e]gration.]

  • 1. The act or process of making whole or entire.
  • 2. (Math.) The operation of finding the primitive function which has a given function for its differential coefficient. See Integral.
  • 3. In the theory of evolution: The process by which the manifold is compacted into the relatively simple and permanent. It is supposed to alternate with differentiation as an agent in development."
  • [Webster]

Starting to write this paper on an "integrated desktop", as the Tilburg implementation of the concept of the scholar's workstation is called, it was easy to switch from wordprocessor to webbrowser and look up the definition of integration in the Web-version of Webster's dictionary. A quick copy and paste action, and there appears a nice opening quote. The act of getting the quote is one example of "integrated working". The simplicity of the act illustrates the power of the concept of integration, but does not explain how this integration is achieved. In a sense it is the outcome of a long process, an evolutionary process one might say, although evolution can hardly be called a managed process, and management is the subject of this paper. On the other hand, the metaphor of evolution is a powerful one. The information space is changing, and the information space is the habitat of libraries and librarians. How do we as a species manage to adapt to these changes in our environment and what are our chances for survival ?

Looking at the history of libraries we can reassure ourselves. The information marketplace has changed constantly in history [Feather, 1994], yet libraries have survived all of these changes. New media are constantly being created, leading to a fragmentation of the information space. Libraries are the glue for this fragmented world, simply by organizing the information space, however heterogeneous it might seem or be at times.

Electronic publications, as the latest new media in the information space, however, do pose new and serious problems. The main reason for that is that they in themselves are fragmented. It is not just one new medium, but a whole bunch of media (network, CD ROM, tape) in a lot of different, and often incompatible, formats. Compared with electronic media, moving images, which also have, or better had, a lot of different formats (8, 16, 35 mm celluloid, VHS, Video2000 and Betamax tape, to name just a few) seem simple now. The example also shows the likely evolution for electronic media: standardisation reduces the number of formats libraries have to deal with. But we are still a long way from that and how do we cope in the meantime, in other words, how do we manage the transition ?

That we have only just embarked on the transition can be illustrated if we have a look at Tilburg University Library, by no means a traditional one, you would think. Compared to twenty plus kilometres of shelving the number of electronically available articles and preprints is insignificant though. The picture changes if we look at current journal subscriptions, but still, less than 10 percent is electronically available. We really are at the beginning of this transition.

Users not only demand these new media, they also demand one stop shopping [Jeapes, 1997], in other words, they demand integration. Users don't care about media or formats, they care about information and the ease with which information can be accessed [Nissley, 1993]. Adapting to your users' needs is an evolutionarily successful strategy. Integration is a strategy to manage the transition.

What is it we are trying to integrate ?

"Libraries will have to continue to support a print-based system while simultaneously merging in an electronic system."
[Woodward, 1994]

Referring to Webster's first definition of integration (the act or process of making whole or entire), another metaphor comes to mind, that of a jigsaw puzzle. Piecing a puzzle together is easier if you have a clear picture of what you want to achieve. Translated to the language of strategic management this implies having a vision, a goal. The ultimate goal is of course to better serve library users by making possible new and more efficient ways for them to get their work done [Levy, 1995; Wiederhold, 1995]. This vision is expressed most often in the concept of the scholar's workstation [Cairns]. The scholar's workstation supposes access to information sources; the tools to process the information obtained; the tools to produce new items of information - papers, articles; and, finally, tools to publish these papers and articles. This paper concentrates on integrated access to information sources, or integrated access to the information space, the first basic function of the scholar's workstation. What are the pieces of the puzzle to be found in that space ?

The example of Tilburg University Library shows that for the most part this information space is still made up of printed material, simply put, books and journals. In the electronic realm counterparts of these information types are emerging:

  • electronic versions of printed (scholarly) journals with or without value added, or sometimes a mix of value added and value subtracted (e.g. full text databases, you can search them, but the full text is a poor extract of what the article looked like in print [Grochmal, 1995])
  • electronic only (scholarly) journals, one of the most exciting developments, mostly for free at the moment
  • electronic preprints, in some but certainly not in all disciplines, very important in informal communication but lacking peer review and quality control
  • electronic books, mostly student text books, still in their infancy and libraries seem reluctant here [Perrin, 1997]
  • Internet resources at large, anything not fitting in the categories above, varying from reference works (e.g. the Webster dictionary quoted above), to legal sites, to sites of institutions and Ticer's Summer School papers
  • the above categories imply primary information, but secondary information is also abundantly available in electronic format, in fact this is where the electronic world started for libraries: online catalogues, abstracting and indexing services, but also general reference databases and factual databases

Not only are the types of information to be found in this electronic realm diverse, if we look at the different media in which they come, the pieces of the puzzle become even more scattered: diskettes, CD ROMs, tapes, DAT cassettes, remote access. Yet a third dimension adding to the fragmentation is the different electronic formats in which information is packed: image formats, wordprocessor formats, so called portable formats like postscript and pdf which are not all that portable, the really portable formats ascii and html, spreadsheet formats for factual data, special formats for statistical packages. Interfaces are another dimension that makes the puzzle even more complex since data and interfaces are often sold or licensed together.

Apart from the technical factors two other factors add to the confusion. One is content. The same content is often available on different media. This is obvious for the electronic versions of printed journals: they are available in printed and (hopefully not too many) electronic formats. A frustrating factor here is that there can be slight differences in content covered between different media and formats. A related issue is bulk packaging of information, which often implies a library is buying too much, i.e. information it doesn't need but which is part and parcel of the carrier, and too little, in that certain information which can be relevant is nowhere to be found [Edelman, 1995]. The second complicating factor, apart from the technicalities mentioned, is price. The same, or almost the same, information can be priced very differently with different distributors, and anybody ever involved in negotiating site licenses knows how complex issues can become: what is a campus; which user groups; how many simultaneous users etc.

Leave the world of printed publications and the permutations become overwhelming. How do libraries cope with this amalgam, how are library functions affected by these issues ?

Integration and library functions

"[I]t is important for libraries to remember that the format should not be the issue, but rather the ease with which one is able to access information is of foremost importance."
[Nissley, 1993]

Whatever the type of library we are talking about, they all have three basic functions in common. The first one is to select and acquire items from the information space. The second function is to organize the items selected. The third is to make the items accessible to their patrons: the provision of information.

Traditionally, selection decisions are based on an evaluation of content (quality and scope), and format. With electronic products, two additional criteria are introduced [Davis, 1997]. The first can be labelled as technology options, i.e. does the product fit in the existing library infrastructure and can the information be properly accessed by library patrons. The second factor is licensing issues, since information in an electronic format is usually not bought but licensed. To complicate matters further, technology and licensing options can vary between distributors. The traditional factor, content, is the least affected; in fact collection development policies are only affected as far as format is concerned. Scope is not affected, and the goal of collection development remains an "intellectually cohesive, user friendly 'collection' of information resources" [Demas et al., 1995]. The phrase "build collections / connections" [ibid.] is illustrative as well here. To put it yet another way: the collection development policy is the basis for the integrative process.

Libraries organise the information space in several ways, but the most fundamental method is cataloguing, which discloses information items both formally and by subject. Existing formats for catalog records, like MARC, can easily be adapted to accommodate electronic formats. An experiment with cataloguing Internet Resources by OCLC in the early nineties shows that with the introduction of an extra field (856, Electronic Location and Access), cataloguing of this material is pretty straightforward, although problems remain [Caplan, 1994]. Cataloguing is also preferred to other methods by which Internet Resources are usually disclosed: search engines and subject trees. The use of formal methods like MARC makes it possible to share resources and to present information in several forms, and because of the many access points the MARC format offers it is possible to fine tune queries, reducing noise and enhancing precision and recall. In this way the catalog becomes the main vehicle for integration [Sha, 1995; see also Kajosalo].
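
As a rough illustration of the extra field mentioned above, the sketch below models a catalogue record as a plain Python structure and adds an 856 field carrying a URL in subfield u. The record layout, the helper names, and the example title, author and URL are assumptions made for illustration; they are not taken from the OCLC experiment or from any particular library system, and real MARC records carry more structure than is shown here.

```python
# Hypothetical, simplified model of a MARC-like record: a list of fields,
# each with a tag and a dict of subfields. Indicators, the leader and
# control fields of real MARC records are deliberately left out.

def make_record(title, author):
    """Build a minimal record with an author (100) and a title (245) field."""
    return [
        {"tag": "100", "subfields": {"a": author}},
        {"tag": "245", "subfields": {"a": title}},
    ]


def add_electronic_location(record, url):
    """Append an 856 field with the URL stored in subfield u."""
    record.append({"tag": "856", "subfields": {"u": url}})
    return record


def electronic_locations(record):
    """Return all URLs found in the 856 fields of a record."""
    return [
        field["subfields"]["u"]
        for field in record
        if field["tag"] == "856" and "u" in field["subfields"]
    ]


# Example values are invented for this sketch.
record = make_record("An Example Electronic Journal", "Example, Author")
add_electronic_location(record, "http://www.example.org/journal/")
print(electronic_locations(record))
```

Because the electronic location is just one more field in the same record, a printed and an electronic item can be described, searched and exchanged in the same way; only the 856 field distinguishes them.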

In the print world, after the user has selected and located items in the catalog, she can either go to the library or she can use document delivery services. In the ideal electronic world a simple click should do the trick. Integration would be complete if the information described in the catalog could be readily accessed and downloaded to the user's workstation. Unfortunately, due to the proliferation in formats, in practice there is not yet integrated access. World Wide Web technology promises an improvement though. A Web interface to an OPAC does just the trick.

The conclusion here is that 'plus ça change, plus c'est la même chose'. There is no change in the basic functions of the library. In fact, by stressing the analogies in the way in which print and electronic sources can be treated by a library, the integration is brought about. Experiments with electronic journals and Internet Resources show that these analogies can go deep [McMillan, 1992; Keating, 1993; Demas et al., 1995]. This does not imply that there are no problems with the integration of electronic publications; the important point to notice is that these problems are not fundamental, they are merely practical. But one should not take these practical matters too lightly. One such important matter is the library organisation.

Integration and the library organisation

"In the CD ROM network planning and implementation process, the unique expertise of the automation librarian, the collection development librarian, and the acquisition librarian are truly complementary."
[Davis, 1993]

Davis' remark about the implications of CD ROMs for the library organisation can easily be generalised to other types of electronic resources. Integrating electronic resources into the library affects all organisational units in the library. These units too will have to work in a more integrated way. The selection process of a CD ROM to be networked demands close cooperation with the automation unit to see whether the product can be fitted into the existing library infrastructure. User support comes into the picture when deciding whether the interface of the several products can easily be introduced or whether there is a need for additional training of library personnel and, of course, users. The negotiating of site licenses, with its complicated issues of concurrent use, multiple buildings, accessibility from student dorms and staff home computers, to name just a few matters, requires more expertise than the average acquisitions librarian has. To understand matters better, the acquisitions librarian has to work closely with the automation department to gain an understanding of what exactly is acquired.

The same story can be told for other types of electronic publications. Introducing electronic journals in the library a few years ago required developing your own software since there were no off-the-shelf packages. Nowadays there are packages, but the automation department faces the choice between several options and the question which one best suits the library's particular circumstances. Working together with the computer centre is really the best option here. And that is exactly what we see, the organisation becomes more complex when learning to cope with these new electronic publications. Some libraries even go as far as to develop new functions like electronic acquisitions librarian, internet resources librarian, to name just a few which regularly pop up in the available positions postings on PACS-L.

Looking a bit further ahead it seems inevitable that reference work will also change radically. The first change is that information will seek users rather than vice versa. But, since the amount of information is likely to overwhelm users, quality filtering will also become needed more and more. In a sense the roles of selection and reference librarian might blend together. Also, in the transitional period, it will be important to remind users that there is also important information still in printed form (integration again ...). Electronic information has a tendency to drive out printed information ("if it's not on the Internet, it doesn't exist"). There is a striking analogy here with the introduction of OPACs, when users only asked for works which could be traced through the OPACs. They lost sight of older work, only to be found in card catalogues [Lesk, 1997].

How integration can be achieved

"Visionaries seem to forget that a library is more than a collection of books and journals. Cataloguers created linking devices long before twisted wire connected computers."
[King, 1993]

Library innovation is about more than just information technology. And King [1993], in her very critical appraisal of the promises of the concept of the electronic library, has a point when she stresses the value of traditional concepts such as the library catalogue. For the sake of the argument, let us assume that a catalogue is a broad concept here, a repository of pointers to information items of different types and formats: books, articles, research papers, electronically available or in good old print. This catalogue is the ultimate linking device. It reflects the collection development decisions taken in the past in response to patrons' needs. It shows a customized slice of the information space. Increasingly, the catalogue and other specialized databases will offer links to primary information while at the same time pointing to printed information of importance to the clients of a library. If users at the same time have the possibility of ordering (copies of) printed literature, from their point of view, the integration is complete. In this point of view, it also should not matter whether the primary information pointed to by the catalogue is available locally or remotely, and the same goes for electronic publications: it does not matter whether these reside on a local or remote server. The catalogue as a gateway, maybe we could even call this catalogue a virtual library.
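
The idea of the catalogue as a repository of pointers can be made a little more concrete with a small sketch. It assumes a deliberately simplified entry type in which a printed item carries a shelf mark and an electronic item carries a URL; the class, the function names and the sample entries are all hypothetical and only serve to show that one search can cover both kinds of item.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CatalogueEntry:
    """A pointer to an information item, printed or electronic (hypothetical model)."""
    title: str
    kind: str                          # e.g. "book", "article", "preprint"
    shelf_mark: Optional[str] = None   # set for printed items
    url: Optional[str] = None          # set for electronic items, local or remote


def search(catalogue: List[CatalogueEntry], term: str) -> List[CatalogueEntry]:
    """Return every entry whose title contains the term, regardless of format."""
    term = term.lower()
    return [entry for entry in catalogue if term in entry.title.lower()]


def access_options(entry: CatalogueEntry) -> List[str]:
    """List the ways a patron could obtain the item an entry points to."""
    options = []
    if entry.url:
        options.append("open " + entry.url)
    if entry.shelf_mark:
        options.append("request copy of " + entry.shelf_mark)
    return options


# Sample entries are invented for this sketch.
catalogue = [
    CatalogueEntry("Digital libraries in print", "book", shelf_mark="EX 123"),
    CatalogueEntry("Digital libraries online", "article",
                   url="http://remote.example.org/a1"),
    CatalogueEntry("Card catalogues", "book", shelf_mark="EX 456"),
]

for entry in search(catalogue, "digital"):
    print(entry.title, access_options(entry))
```

The search function never looks at the format of an item, and whether a URL points to a local or a remote server makes no difference to it; the format only matters at the moment of delivery.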

Integration or fragmentation

"Rich frameworks for managing distributed heterogeneous bases of structured information do not yet exist."
[Dempsey, Heijne, 1996]

In reality we are far away from the ideal of a unified catalogue, since present day catalogues don't offer the opportunities sketched in the preceding paragraph. The very technology that could bring the integration about is still developing at such a fast pace that what we see most today is a chaotic universe of information services, fragmented pieces of what should be one.

The technical, economic and legal problems to be solved have mainly to do with standards. Dempsey and Heijne enumerate the following issues yet to be solved: document formats, integrity and authentication, metadata, quality guidelines for the inclusion of objects in discovery systems, integration of Web technology with search standards like Z39.50, and copyright issues. The building blocks are there, but an effective integration has yet to be achieved [Dempsey and Heijne, 1996; see also Lowry, 1995].

Options

"[B]uilding digital libraries will be a costly and lengthy process."
[TULIP, Final Report, 1996]

One of the first major experiments with the integration of full text journals in libraries has been the TULIP project [Chen, 1994; Lowry, 1995; TULIP Final Report, 1996; Willis, 1994]. A major conclusion of the extensive user studies that flanked the experiment is that users demand integration: they want access to all information through one source, preferably their familiar user interface. The experiment also shows that it is not easy to achieve integration, even if an institution is capable of developing systems, which, by the way, was very costly. The example of Michigan shows the problem of integration: users had two access routes. The first, integrated with the library system, only offered the bibliographic description and the option to have an article printed on a central printserver and mailed. The other option was TULIPView: it offered only the TULIP journals, but users could view the articles on screen and could print the articles on local printers.

Since the TULIP project, many new products have entered the market. An evaluation of these products is beyond the scope of this paper; they will only be mentioned briefly here for those wishing to further investigate their properties.

Older products which all lacked integrative possibilities are the CD ROM based products like ADONIS and Business Periodicals on Disc. These are self-contained products with proprietary search interfaces. Similar problems can be found with full text databases to be accessed online: Dialog, Lexis Nexis for example. These online services are most often only available through intermediation by reference staff.

Newer products are RightPages™, developed at AT&T Bell Labs [Hoffman, 1993], essentially a journal browser for only full text journals, with no integration with printed journals. Still newer products developed by publishers are Elsevier's ScienceDirect (http://www.elsevier.com/inca/homepage/news/1997/sciencedirect/) and Blackwell's Electronic Journal Navigator (http://mktdev1.blackwell.co.uk/es/ejn.htm). The integration these products aim for is in offering journals from a wider range of publishers, which was also the distinguishing feature of ADONIS. These approaches have an advantage over isolated and private solutions of individual publishers like IEEE (http://www.jolly.ieee.org/opdemo/) and SIAM (http://www.siam.org/journals/journals.htm). Integration with existing catalogues and reference databases in libraries is absent though.

A similar approach is being followed by subscription agents, the most recent example being SwetsNet (http://www.swetsnet.com/).

Another recent development is the consortium approach where library organisations negotiate on a national level with publishers and develop software to give end users access to full text journals. Examples are BIDS Journals Online (http://www.journalsonline.bids.ac.uk/JournalsOnline) in the UK, OCLC's Electronic Journals Online (http://jake.prod.oclc.org:3050/html/ejo_homepage.htm) in the US, and WebDoc (http://www.pica.nl/docs/en/webdoc/webproj.html) in the Netherlands. The example of WebDoc shows the danger of the isolated approach of building a service for electronic resources exclusively: users find that these services lack critical mass. The project now seems to develop in the direction of integration with PICA's large Online Contents reference database.

All these products do only one thing: they make access to full text electronic versions of printed journals possible. And they all do it in a slightly different way. In a couple of years it will be clear which approach is the most promising. Libraries should be very critical about these products since they are hard to integrate.

In the meantime, there are many other electronic publications which can be integrated easily using what you have locally available: electronic journals, electronic preprints and other types of information on the World Wide Web. Most of this information is freely available; all you have to do is incorporate it in your catalogue. If your catalogue has a Web-interface, all the better.

Conclusions

"Libraries can and will continue to be the best integrated source of information - no matter what format the information comes in and regardless of whether we think of these libraries as virtual libraries."
[McMillan, 1993]

Integration has been introduced here as a broad concept in relation to the mission of the library: to support the primary processes of the parent institution. In the case of a university library these primary processes are education and research. Looking at history one can also state that integration is what libraries are all about. From all the fragments to be found in the information space libraries collect the items most relevant to their users' needs and glue them together in a coherent way. Formats are only a secondary factor in the decision what to collect, organise and provide. Catalogues and reference databases are the main interface for patrons to this collection. Electronic resources call for new forms of cooperation between different departments within the library and between library and computer centre. New products on the market for access to full text electronic journals have limited capabilities for integration with existing catalogues and reference databases.

References

  • Cairns, David, 'Redefining the Scholar's Workstation': A Review of Current Projects in the United States, URL:http://ukoln.bath.ac.uk/papers/bl/scholar/intro.html
  • Caplan, Priscilla L., Controlling E-Journals: The Internet Resources Project, Cataloguing Guidelines, and USMARC, Serials Librarian, 23(3-4), 1994, pp. 103 - 111
  • Chen, Ching-Chih, How TULIP is implemented at MIT: Additional Comments from the Journal Editor, Microcomputers for Information Management, 12(1-2), 1994, pp. 113 - 120
  • Davis, Trisha L., Acquisition of CD-ROM Databases for Local Area Networks, The Journal of Academic Librarianship, 19(2), 1993, pp. 68 - 71
  • Davis, Trisha L., The Evolution of Selection Activities for Electronic Resources, Library Trends, 45(3), Winter 1997, p. 391
  • Demas, S., McDonald, P. and Lawrence, G., The Internet and Collection Development: Mainstreaming Selection of Internet Resources, Library Resources & Technical Services, 39(3), 1995, pp. 275 - 290
  • Dempsey, Lorcan and Heijne, Maria, Scientific Information Supply - Building Networked Information Systems, The Electronic Library, 14(4), 1996, pp. 317 - 332
  • Edelman, Marla, The New World Order: Serials Management of Electronic Resources and Document Delivery, The Serials Librarian, 25(3/4), 1995, p. 261
  • Feather, John, The Information Society. A Study of Continuity and Change, London, 1994
  • Grochmal, H.M., Selecting Electronic Journals, College & Research Libraries News 56(9), 1995, pp. 632 - 633, 654
  • Hoffman, Melia M. et al., The Right-Pages™ Service: An Image-Based Electronic Library, Journal of the American Society for Information Science, 44(8), 1993, pp. 446 - 452
  • Jeapes, Ben, Learning to live with e-journals, The Electronic Library, 15(1), 1997, pp. 27 - 30
  • Kajosalo, Erja, Issues Related to Cataloguing of Internet Resources, paper presented at LIS598 - Applications of Technology in Libraries, University of Alberta, URL:http://www.slis.ualberta.ca/598/erja/rep_html.htm
  • Keating, Lawrence R., II, Reinke, Christa Easton, and Goodman, Judy A., Electronic Journal Subscriptions, Library Acquisitions: Practice & Theory, 17(4), 1993, pp. 455 - 463
  • King, Hannah, Walls Around the Electronic Library, The Electronic Library, 11(3), 1993, pp. 165 - 174
  • Lesk, Michael, Going Digital, Scientific American, March 1997, pp. 40 - 52
  • Levy, David M. and Marshall, Catherine C., Going Digital: A Look at Assumptions Underlying Digital Libraries, Communications of the ACM, 38(4), 1995, pp. 77 - 84
  • Lowry, Charles B., Preparing for the Technological Future: A Journey of Discovery, Library Hi Tech, 13(3), 1995, pp. 39 - 54
  • McMillan, Gail, Technical Processing of Electronic Journals, Library Resources & Technical Services, 36, October 1992, pp. 470 - 477
  • McMillan, Gail, Electronic Journals: Access through Libraries, in: Saunders, Laverna M., In the Virtual Library: Visions and Reality, Westport, Meckler, 1993, pp. 111 - 129
  • Nissley, Meta, Rave New World: Librarians and Electronic Acquisitions, Library Acquisitions: Practice & Theory, 17, 1993, pp. 165 - 173
  • Perrin, Wayne, RESULTS - Survey of Attitudes to Electronic Books, contribution to DIGLIB-L, 22 April 1997, URL:http://www.nlc-bnc.ca/cgi-bin/ifla-lwgate/DIGLIB/archives/diglib.log9704/date/article-38.html
  • Sha, Vianna T., Cataloguing Internet Resources: the Library Approach, The Electronic Library, 13(5), 1995, pp. 467 - 476
  • TULIP Final Report, July 18, 1996 URL:http://www.elsevier.nl/homepage/about/resproj/tulip.shtml
  • Wiederhold, Gio, Digital Libraries, Value and Productivity, Communications of the ACM, 38(4), 1995, pp. 85 - 96
  • Willis, Katherine et al., TULIP - The University Licensing Program: Experiences at the University of Michigan, Serials Review, 20, 1994, pp. 39 - 47
  • Woodward, H., The impact of Electronic Information on Serials Collection Management, IFLA Journal, 20(1), 1994, pp. 35 - 45

© Hans Roes, 1997