Project Discussion Part VI: Digital Libraries

Hello everyone, and we’re nearing Christmas. Thought I’d get that out there; yes, it’s been a while. In this post, we discuss digital libraries.

It was important to talk about this topic because it is essentially what I was proposing as my project: allowing academics to upload their work to a large database. We’ve already talked about social networks, as I felt they were relevant to this discussion, but at the same time I didn’t want the project to be too much like one. My original plan was to offer people a service that lets them store and share work. That is what a digital library can do.

Q: What is a Digital Library?

Libraries have always been a hub of information for many people. They provide professionals trained to distinguish and verify content, build collections and provide a reference and information service. They still are that hub, but as we’ve moved into the technological age, more and more information is being digitised. Therefore, as with most things, libraries are increasingly becoming digital themselves.

With this information boom, there are now more opportunities to build and share knowledge in electronic formats. The concept of a ‘world library for the blind’ rests on the ability of digital libraries to share and coordinate collection-building resources and to use digital technology to share content. For that to work, digital libraries need to be designed effectively to do this job.

Technology changes the way a library’s content is organised and delivered. Essentially, the library still functions as a place of storage for organised content. Its digitisation is a means of ensuring that its collections are preserved and accessible to all, regardless of disability or affiliation. The digital library acts as the critical point of contact between the information provider and the information consumer (user).

This system is what allows people to navigate their way through the database. Like a physical library, a digital (or electronic) library stores content, only in a digitised format. Digital libraries are closely associated with academic institutions, which use them to store large volumes of work. The term ‘born-digital’ means that the work has always been in a digital format. Many of these works are free for the public to view with few restrictions. Perhaps the real crux is that they allow people not associated with an institution (non-students) to view these works if necessary.

Digital libraries take advantage of the internet as both a source of content and a means of distribution (remember, a ‘broadening of the distribution channels’). The internet has profoundly changed information services for users and libraries. Publishers of content, trade books and magazines, electronic journals and electronic databases offer new opportunities for acquiring, managing and distributing content that is accessible.

Q: What Are the Aims of a Digital Library?

Due to the size of their collections, digital libraries commonly integrate a search system within their database to make it easier for people to find what they’re looking for. They often use the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), through which they expose descriptions of their content to external libraries and search engines. The OAI-PMH is used to ‘collect’ metadata descriptions of the records in archives, so that services can be built using that information from many archives. It’s why you can see works in things like Google Scholar that exist outside the library.
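To make the harvesting idea a bit more concrete, here is a minimal sketch in Python of what a harvester does: it asks an archive for its records, then pulls the metadata out of the XML that comes back. The endpoint URL and the sample response are made up for illustration; only the `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH and Dublin Core specifications. No network request is made here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint; every real archive has its own base URL.
BASE_URL = "https://repository.example.org/oai"

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a harvester would request to list an archive's records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


# Hand-written, heavily shortened response (an assumption, not harvested data).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Thesis</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def parse_records(xml_text):
    """Extract identifier, title and creator from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "creator": rec.findtext(".//dc:creator", namespaces=NS),
        })
    return records


print(list_records_url(BASE_URL))
print(parse_records(SAMPLE_RESPONSE))
```

A service that harvests many archives this way ends up with one pool of titles and authors it can search, without holding the works themselves.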

Generally, there are 5 major functions of a digital library: acquisition, cataloging, retrieval, interpretation and sharing.


Acquisition

This is the process by which consumers select different content to view, borrow, rent, read and buy. This is the e-commerce functionality of the digital library (and a way in which the library gains funds).


Cataloging

This is the management of acquired content, along with accompanying copyrights and permissions from the respective authors/owners. Search engines within the library help order content by various factors, such as author or title.


Retrieval

This is the process of searching content and managing the search results. A common search interface simplifies the search process for the user and at the same time enables the publisher to select the best data format for the task.


Interpretation

This involves viewing information in the context of related resources. It also includes being able to identify relevant connections between sources. Again, a search engine will be able to find topics related to the original search, thereby giving the user additional content.


Sharing

Sharing means having tools and processes for annotating resources, extracting from and citing sources, and collaborating with other users. The end product of research is often shared with others.

Q: Where Do Digital Libraries Stand?

In terms of the internet, digital libraries have a great opportunity to get ‘out there.’ As we have seen countless times, the internet provides the means and channels to get yourself noticed. To get maximum exposure, careful consideration needs to be given to the behaviour of the customer/consumer.

The new, competitive situation forces libraries to see things much more from the perspective of the user. First of all, this is an acknowledgement that, particularly at universities, libraries deal with a range of users who often behave differently. An undergraduate has different information needs from a qualified researcher, and their usage behaviours can vary substantially.

Undergrads have to try much harder to get general information, and tend to lean on internet search engines to do it. I can say from my uni days that searching for something could be quite time-consuming, whereas seasoned researchers usually know where to look for information. The point is that with the range of search engines available, users have a large choice about how they get their information.

One thing that I can say as an ex-student is that the majority of us would use Google as the preferred search engine for information. In fact, I still do. The simple reason why students still use the online catalogue is that, for this type of information, they don’t have an available alternative, as internet search engines usually don’t cover the so-called ‘deep’ or ‘invisible’ web. In any area where students think that they can find information, especially when they are looking for documents and full text, general search engines are even now much more popular than the databases that have been made available through libraries. We students like to make it easy for ourselves, and we let Google do the searching for us because it’s so useful and quick. Just as users like the ease of phrasing and submitting a search query, they also like the flexible and responsive display of result sets. The superior performance and the size of internet search indexes impress them most.

Q: Challenges and Limitations?

Despite their usefulness, digital libraries currently face some challenges.

Coverage of data formats, full text search

Most systems focus solely on the search of metadata (bibliographic fields, keywords, abstracts). The cross-search of full text has only recently been introduced and is often restricted to a very limited range of data formats (primarily “html” and “txt”).
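A small sketch in Python shows why this matters. The two records and both search functions below are invented for illustration; they only assume that a record has some metadata fields plus a body of full text. A term that appears only in the body of a work is invisible to a metadata-only search.

```python
# Hypothetical catalogue entries, made up for this example.
RECORDS = [
    {
        "title": "Metadata Harvesting in Practice",
        "keywords": ["OAI-PMH", "repositories"],
        "abstract": "How archives expose their records to harvesters.",
        "full_text": "Chapter 3 compares harvesting with broadcast search across catalogues.",
    },
    {
        "title": "Search Behaviour of Undergraduates",
        "keywords": ["students", "search engines"],
        "abstract": "A survey of how students look for sources.",
        "full_text": "Most respondents started with a general search engine.",
    },
]


def metadata_search(term, records):
    """Match the term against title, abstract and keywords only."""
    term = term.lower()
    hits = []
    for rec in records:
        fields = [rec["title"], rec["abstract"], *rec["keywords"]]
        if any(term in field.lower() for field in fields):
            hits.append(rec["title"])
    return hits


def full_text_search(term, records):
    """Match the term against the metadata fields and the document body."""
    term = term.lower()
    hits = []
    for rec in records:
        fields = [rec["title"], rec["abstract"], *rec["keywords"], rec["full_text"]]
        if any(term in field.lower() for field in fields):
            hits.append(rec["title"])
    return hits


print(metadata_search("broadcast", RECORDS))   # nothing found in the metadata
print(full_text_search("broadcast", RECORDS))  # found in the body of the first record
```

The phrase ‘broadcast search’ only occurs inside the first document, so a system that indexes metadata alone never surfaces it.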

Coverage of Content types

Digital libraries largely integrate online library catalogues and databases with some full text repositories (e-journals). Freely available academic online content as described above is usually not covered by library portals. If it is selected at all, it is mainly organised as html-link lists or specific databases (subject guides) that record reference metadata about web repositories.

Beyond online catalogues, databases and e-journals, researchers have started to place their pre-prints or post-prints on the websites of faculties and research groups. Comprehensive web servers of scientific congresses include online presentations and papers. Large international pre-print servers, often organised by the scientific community, store thousands and hundreds of thousands of documents. The creation of e-learning objects is also gaining in popularity.

And libraries? They add to the content that is available online. Today we have seen almost 15 years of digitisation activities, starting in the U.S. and spreading from there to other countries. Hundreds, if not thousands, of digital document servers are available today, the majority of them as stand-alone systems. And activities at universities in building institutional repositories have only just started. The long term goal is to store the research and e-learning output of each institution on self-controlled document servers. While the building of these repositories must be welcomed, especially for strategic reasons (e.g. open access to research data, ensuring long term accessibility), the expected number of additional online hosts requires additional efforts on the search side.

Limited scalability / Information Retrieval performance

The majority of the portal systems rely on the metasearch (broadcast search) principle, i.e. a query is translated into the retrieval language of the target repositories (e.g. catalogues, databases) and sent out to selected repositories. The sequentially incoming responses are aggregated and presented in a joint result list.

The problems resulting from this search principle are well-known. Because the target repositories respond sequentially, and because the portal depends on how well each of those repositories performs, scalability is limited and performance decreases as the number of target databases increases.
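Here is a rough sketch in Python of the broadcast search principle and why it scales badly. Everything in it is hypothetical: the repository names, their titles, the latency figures and the `translate` step (a simple lowercase conversion standing in for real query translation). Nothing actually waits; the simulated response times are just added up, on the assumption that each repository is handled one after another.

```python
# Hypothetical target repositories with made-up titles and response times.
REPOSITORIES = [
    {"name": "catalogue", "latency_s": 0.5, "translate": str.lower,
     "titles": ["Digital Libraries", "Library Management"]},
    {"name": "e-journals", "latency_s": 1.5, "translate": str.lower,
     "titles": ["Digital Preservation Quarterly"]},
    {"name": "preprints", "latency_s": 2.0, "translate": str.lower,
     "titles": ["Scaling Digital Repositories", "Protein Folding Notes"]},
]


def broadcast_search(query, repositories):
    """Send one query to every repository in turn and merge the results."""
    merged = []
    total_latency = 0.0
    for repo in repositories:
        # Stand-in for translating the query into the repository's own language.
        native_query = repo["translate"](query)
        hits = [t for t in repo["titles"] if native_query in t.lower()]
        merged.extend((repo["name"], title) for title in hits)
        # Responses arrive one after another, so the waiting time adds up.
        total_latency += repo["latency_s"]
    return merged, total_latency


results, waited = broadcast_search("Digital", REPOSITORIES)
print(results)
print(f"simulated wait: {waited} seconds")
```

In this toy model, every extra repository adds its own response time to the total, and one slow repository holds up the whole result list. That is the scalability limit in miniature.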


