This paper aims to generate discussion on the question of very large data stores and their usefulness as research corpora, as exemplified by the whole-of-domain web archiving undertaken by the National Library of Australia (NLA). Is the effort of creating such huge datasets, maintaining them over the long term and building appropriate access pathways providing a valuable resource for current and future researchers? Or is the highly selective approach, exemplified by the NLA's PANDORA archive, potentially more useful? Two basic issues of national concern were identified, and searches on these terms were undertaken across the 2007 whole-of-domain archive and in PANDORA. The relevance of the results obtained was then compared with reference to various indexing approaches, searching behaviour and desired outcomes. The need for further research into this area is highlighted by the conclusions of this study.
|Title of host publication||14th ALIA Information Online Conference & Exhibition|
|Editors||Linden Fairbairn, Kay Harris|
|Place of Publication||Australia|
|Publisher||Australian Library and Information Association|
|Number of pages||11|
|Publication status||Published - 2009|
|Event||ALIA - Information Online Conference and Exhibition Conference - Sydney, NSW, Australia (20 Jan 2009 → 22 Jan 2009)|
Pymm, R., & Wallis, J. (2009). Archiving the web: does whole-of-domain archiving = information overload? In L. Fairbairn, & K. Harris (Eds.), 14th ALIA Information Online Conference & Exhibition (pp. 1-11). Australian Library and Information Association.