Archiving the web: does whole-of-domain archiving = information overload?

Robert Pymm, Jacob Wallis

Research output: Book chapter/Published conference paper > Conference paper > peer-review


This paper aims to generate discussion on the question of very large data stores and their usefulness as research corpora, as exemplified by the whole-of-domain web archiving undertaken by the National Library of Australia (NLA). Is the effort of creating such huge datasets, maintaining them over the long term and building appropriate access pathways providing a valuable resource for current and future researchers? Or is the highly selective approach, exemplified by the NLA's PANDORA archive, potentially more useful? Two basic issues were identified as being of national concern, and searches were undertaken on these terms across the 2007 whole-of-domain archive and in PANDORA. The relevance of the results obtained was then compared with reference to various indexing approaches, searching behaviour and desired outcomes. The conclusions of this study highlight the need for further research in this area.
Original language: English
Title of host publication: 14th ALIA Information Online Conference & Exhibition
Editors: Linden Fairbairn, Kay Harris
Place of Publication: Australia
Publisher: Australian Library and Information Association
Number of pages: 11
Publication status: Published - 2009
Event: ALIA - Information Online Conference and Exhibition Conference - Sydney, NSW, Australia
Duration: 20 Jan 2009 to 22 Jan 2009


