Sigfrid Lundberg's Stuff

Harvesting

2007: What do our users expect from the search field on our top page?
What can be deduced from an analysis of the search terms entered into the search field on the home page of a large modern library? Could we regard this form as the digital library's information desk, and the queries as requests from its patrons? If so, what could we learn about the quality of service? In this note we do just that: we regard the terms entered into the form as manifestations of our patrons' needs and, to find out what those needs are, we have performed a preliminary text mining exercise. We hope that this will help us design a new search facility that better answers the needs of our users.
Web indexing, Harvesting, Library catalogues, Structural web design, 2007
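
A first step in the kind of preliminary text mining mentioned in the 2007 note could be a plain term-frequency count over logged queries. The following is only a minimal sketch: the function name `term_frequencies` and the sample queries are hypothetical and made up for illustration, not code or data from the note.

```python
import re
from collections import Counter

def term_frequencies(queries):
    """Count how often each lower-cased word occurs across all queries."""
    counts = Counter()
    for query in queries:
        # Split on runs of non-word characters and drop empty strings.
        counts.update(w for w in re.split(r"\W+", query.lower()) if w)
    return counts

# Hypothetical sample queries, invented for this illustration.
sample = ["opening hours", "Opening hours library", "renew loan"]
for term, count in term_frequencies(sample).most_common(3):
    print(term, count)
```

Counting single words is the crudest possible analysis; a real study would probably also look at whole queries and at queries that return no hits.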

2003: Implementing OAI-PMH in a library system -- A few words of guidance
The purpose of this text is to put some flesh on the bones regarding what OAI-PMH is, what it can be used for, and how it could be integrated into a library system. It is not a replacement for other documents but should be seen as a few words of guidance for organisations that are planning to implement OAI-PMH or are in the process of doing so. The text was commissioned by https://www.axiell.com/se/
Harvesting, Metadata, Library catalogues, 2003
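
To give a feel for what an OAI-PMH harvester does, here is a small sketch that builds a `ListRecords` request and pulls record identifiers out of a response. The verb, the `metadataPrefix` and `resumptionToken` arguments and the XML namespace come from the OAI-PMH 2.0 protocol; the base URL `https://example.org/oai`, the function names and the sample response are assumptions for illustration and are not taken from the 2003 text.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

def record_identifiers(response_xml):
    """Return the identifiers found in the record headers of a response."""
    root = ET.fromstring(response_xml)
    return [e.text for e in root.findall(".//oai:header/oai:identifier", OAI_NS)]

# Hypothetical, heavily abbreviated response used only for this sketch.
sample_response = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://example.org/oai"))
print(record_identifiers(sample_response))
```

A full harvester would fetch each URL, follow any `resumptionToken` in the response until the list is exhausted, and handle protocol errors; none of that is shown here.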

2001: Technical description of the Studera.Nu search engine: Harvesting policies, maintenance and software used
This is a description of the Studera.NU search engine in its very first version. The search engine carries focused information, all of which is indexed manually by content providers. The indexing information is embedded into web pages using meta tags, and the service is built using a harvesting robot. This article documents the functions of the search engine Studera.NU, its robot, search software etc. It aims at providing necessary background information both for staff involved in system management and for future developers. A second aim is to pinpoint areas for future development of the service.
Web indexing, Harvesting, Metadata, 2001
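
The idea of harvesting indexing information from meta tags can be illustrated with a few lines of Python. This is a sketch only, using the standard library's `html.parser`; the class and function names are hypothetical, and the Dublin Core style tag names in the sample page are an assumption, since the entry above does not say which meta tag vocabulary Studera.NU used.

```python
from html.parser import HTMLParser

class MetaTagParser(HTMLParser):
    """Collect the name/content pairs of all <meta> tags in a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name, content = attributes.get("name"), attributes.get("content")
        if name and content is not None:
            # A name may occur several times, so keep a list per name.
            self.meta.setdefault(name.lower(), []).append(content)

def extract_meta(html):
    """Return a dict mapping lower-cased meta names to lists of contents."""
    parser = MetaTagParser()
    parser.feed(html)
    parser.close()
    return parser.meta

# Hypothetical page; the tag names are assumed for illustration.
page = """<html><head>
<meta name="DC.Title" content="Course catalogue">
<meta name="DC.Subject" content="higher education">
<meta name="DC.Subject" content="admissions">
</head><body></body></html>"""

print(extract_meta(page))
```

A harvesting robot would run something like `extract_meta` on every fetched page and pass the resulting fields on to the search software for indexing.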

1998: A regional distributed WWW search and indexing service - the DESIRE way
In an attempt to implement a regional search engine we have created an open, metadata aware system for distributed, collaborative WWW-indexing. The system has three main components: a harvester (for collecting information), a database (for making the collection searchable), and a user interface (for making the information available). All components can be distributed across networked computers, thus supporting scalability. The system is metadata aware and thus allows searches on several fields including title, document author, and URL. Nordic Web Index (NWI) is an application using this system to create a regional Nordic web-indexing service. NWI is built using five collaborating service points within the Nordic countries. The NWI databases can be used to build additional services. Services today include special metadata databases, multimedia databases, and statistics about the Nordic Web.
Web indexing, Harvesting, Metadata, Z39.50, 1998

2000: Metadata harvesting
This is unfortunately unpublished, but it presents one of the most extensive datasets on the content of HTML-embedded metadata, which I collected while working with the Nordic Web Index.
Web indexing, Harvesting, Metadata, Z39.50, 2000

1998: An architecture for a distributed system for the dissemination of research information
The architecture of a network of services for disseminating research information is described. The network allows distributed maintenance of both data and metadata. Both bibliographic databases and entirely WWW-based, metadata-enriched services can be connected to the network. Metadata production follows the Dublin Core standard. The former type of service is incorporated into the network through interoperability at the search-protocol level, while metadata and full text from the latter are made searchable by a WWW robot. Communication within the network uses standardised Internet protocols such as HTTP and, for information retrieval, Z39.50.
Web indexing, Harvesting, Metadata, Z39.50, 1998

1998: NWI II, An Enhanced Nordic Web index: Final report
The Nordic Web Index (NWI) project is a collaborative effort across the Nordic countries, aiming at providing a free World Wide Web search service to the general public in the countries involved. NWI has been fruitful for several reasons: first and foremost, we are today providing access to databases covering the WWW in four of the Nordic countries and, as of September 1998, five service points in six languages, located in Denmark, Finland, Sweden, Norway and Iceland.
Web indexing, Harvesting, Metadata, Z39.50, 1998

1997: The Kulturarw3 archiving format
This note describes a file format intended for the joint storage of objects retrieved from the World Wide Web and various kinds of meta information on those objects. The information includes the HTTP response header but may also contain information needed for version control etc. within the archive. The Kulturarw3 project proposes an archiving record format based on open standards, namely that objects are stored as HTTP MIME multipart/mixed messages (cf. RFC2068 and RFC1521 respectively).
Digital preservation, Harvesting, 1997
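
The general idea of storing a retrieved object together with its HTTP response header in one multipart/mixed message can be sketched with Python's standard `email` package. This is not the Kulturarw3 format itself: the part layout (header as a text/plain part, object as a text/html part), the function names and the sample data are all assumptions made for illustration.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def make_record(http_header, body_html):
    """Wrap an HTTP response header and the retrieved object in one message."""
    record = MIMEMultipart("mixed")
    # Part 1: the HTTP response header, stored as plain text (an assumption).
    record.attach(MIMEText(http_header, "plain", "utf-8"))
    # Part 2: the retrieved object itself, here an HTML page.
    record.attach(MIMEText(body_html, "html", "utf-8"))
    return record.as_string()

def read_record(text):
    """Return (http_header, body_html) from a record made by make_record."""
    parts = message_from_string(text).get_payload()
    return tuple(p.get_payload(decode=True).decode("utf-8") for p in parts)

# Hypothetical sample data, invented for this sketch.
header = "HTTP/1.1 200 OK\nContent-Type: text/html\n"
body = "<html><body>Hello</body></html>"

record = make_record(header, body)
print(record)
print(read_record(record) == (header, body))
```

Because the record is an ordinary MIME message, any MIME-aware tool can split it into its parts again, which is one practical benefit of basing an archive format on open standards.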

1996: NWI -- A Nordic search service for the World Wide Web
A consortium of Nordic research libraries and network organisations presents Nordic Web Index, a Nordic search service developed in the Nordic region for a Nordic audience. Today an experimental service is publicly available on the net at the URL http://nwi.ub2.lu.se.
Web indexing, Harvesting, 1996

NB

My name is Sigfrid Lundberg. The stuff I publish here may, or may not, be of interest to anyone else.

On this site there is material on photography, music, literature and other stuff I enjoy in life. However, most of it is related to my profession as an Internet programmer and software developer within the area of digital libraries. I have worked in that role at the Royal Danish Library, Copenhagen (Denmark) and, before that, at Lund University Library (Sweden).

The content here does not reflect the views of my employers. They are now all past employers, since I retired 1 May 2023.