Web indexing

Sigfrid Lundberg's Stuff

2015-05-17: On search engines and spiders
The year 1995 was a landmark year for me. That is when I decided to switch careers to the Internet and programming.
Web indexing, 2015

2007: What do our users expect from the search field on our top page?
What can be deduced by analysing the search terms entered into the search field on the home page of a large modern library? Could we regard this form as the digital library's information desk, and the queries as requests from its patrons? If we do that, what could we then learn about the quality of service? In this note we do just that. We regard the terms entered into the form as manifestations of our patrons' needs, and in order to find out what they need, we have performed a preliminary text mining exercise. We hope that this will help us design a new search facility that better answers the needs of our users.
Web indexing, Harvesting, Library catalogues, Structural web design, 2007

2001: Technical description of the Studera.Nu search engine: Harvesting policies, maintenance and software used
This is a description of the Studera.NU search engine in its very first version. The search engine carries focused information, all of which is indexed manually by content providers. The indexing information is embedded into web pages using meta tags, and the service is built using a harvesting robot. This article documents the functions of the search engine Studera.NU, its robot, search software etc. It aims at providing necessary background information both for staff involved in system management and for future developers. A second aim is to pinpoint areas for future development of the service.
Web indexing, Harvesting, Metadata, 2001
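
The entry above mentions indexing information embedded in web pages as meta tags and collected by a harvesting robot. As a rough illustration of that general idea only, here is a minimal sketch using Python's standard library. The field names (DC.title, DC.subject) and the sample page are assumptions made for the example; the listing does not say which tag names or software Studera.NU actually used.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            # A field may repeat (e.g. several subjects), so keep a list.
            self.metadata.setdefault(name.lower(), []).append(content)


def extract_metadata(html_text):
    """Return a dict mapping lower-cased meta names to lists of values."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    return collector.metadata


# Hypothetical page; the field names are illustrative assumptions.
SAMPLE_PAGE = """
<html><head>
<title>Course catalogue</title>
<meta name="DC.title" content="Introduction to Ecology">
<meta name="DC.subject" content="biology">
<meta name="DC.subject" content="ecology">
</head><body></body></html>
"""

if __name__ == "__main__":
    print(extract_metadata(SAMPLE_PAGE))
```

A real harvester would also fetch pages over HTTP, respect robots.txt, and store the extracted fields in a searchable database; those parts are left out of this sketch.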

1998: A regional distributed WWW search and indexing service - the DESIRE way
In an attempt to implement a regional search engine we have created an open, metadata aware system for distributed, collaborative WWW-indexing. The system has three main components: a harvester (for collecting information), a database (for making the collection searchable), and a user interface (for making the information available). All components can be distributed across networked computers, thus supporting scalability. The system is metadata aware and thus allows searches on several fields including title, document author, and URL. Nordic Web Index (NWI) is an application using this system to create a regional Nordic web-indexing service. NWI is built using five collaborating service points within the Nordic countries. The NWI databases can be used to build additional services. Services today include special metadata databases, multimedia databases, and statistics about the Nordic Web.
Web indexing, Harvesting, Metadata, Z39.50, 1998

2000: Metadata harvesting
This is unfortunately unpublished, but it presents one of the most extensive datasets on the content of HTML-embedded metadata, which I collected while working with the Nordic Web Index.
Web indexing, Harvesting, Metadata, Z39.50, 2000

1999: The Noble Art of Being Indexed or the Webmaster's Guide to Harvesting Robots
As the title indicates, this is a note on how to be indexed by search engines. It is obsolete. Please note that it does not contain anything about search engine optimization in the modern evil sense. Instead it pinpoints what harvesting robots did understand, and what they didn't at the turn of the last century.
Web indexing, Metadata, Structural web design, 1999

1998: An architecture for a distributed system for the dissemination of research information
The architecture of a network of services for disseminating research information is described. The network allows distributed maintenance of both data and metadata. Both bibliographic databases and entirely WWW-based, metadata-enriched services can be connected to the network. Metadata production follows the Dublin Core standard. Services of the former type are incorporated into the network through interoperability at the search protocol level, while metadata and full text from the latter are made searchable by a WWW robot. Communication within the network uses standardized Internet protocols such as HTTP and, for information retrieval, Z39.50.
Web indexing, Harvesting, Metadata, Z39.50, 1998

1998: NWI II, An Enhanced Nordic Web index: Final report
The Nordic Web Index (NWI) project is a collaborative effort across the Nordic countries, aiming at providing a free World Wide Web search service to the general public in the countries involved. NWI has been fruitful for several reasons: first and foremost, we are today providing access to databases covering the WWW in four of the Nordic countries and, as of September 1998, five service points in six languages in Denmark, Finland, Sweden, Norway and Iceland.
Web indexing, Harvesting, Metadata, Z39.50, 1998

1997: A proposal for metadata formats for use in the Swedish Enviro-Net
This paper outlines a combination of Dublin Core metadata embedded in HTML metadata tags and the GILS Z39.50 profile used for information retrieval. This combination would solve the distributed indexing, search and retrieval problems within a network of environmental authorities. The Swedish Enviro-Net lived for a few years at the end of the 1990s.
Web indexing, Metadata, Z39.50, 1997

1996: NWI, a Nordic search service for the World Wide Web
A consortium of Nordic research libraries and network organizations presents Nordiskt Webindex (Nordic Web Index), a Nordic search service developed in the Nordic countries for a Nordic audience. Today an experimental service is publicly available on the net at URL
Web indexing, Harvesting, 1996




My name is Sigfrid Lundberg. The stuff I publish here may, or may not, be of interest to anyone else.

On this site there is material on photography, music, literature and other stuff I enjoy in life. However, most of it is related to my profession as an Internet programmer and software developer within the area of digital libraries. I have worked in that role at the Royal Danish Library, Copenhagen (Denmark) and, before that, at Lund University Library (Sweden).

The content here does not reflect the views of my employers. They are now all past employers, since I retired 1 May 2023.