Relevancy Ranking in Discovery Services

Imagining an academic library without a discovery service (be it Ebsco, Summon, Primo…etc) has become practically impossible. Where else could you begin a general keyword search or an article title search? However, for smaller research libraries, discovery services still have that new smell. The small library that I am working at is launching a discovery service very soon and I’ve been helping to work out the kinks. And it’s been verryyy interesting.

Killing Print Circulation?

I was immediately blown away by the amount of “junk” you get in certain searches. You’re usually better off doing a Google search and visiting some Wikipedia pages to learn about your topic, then coming back and plugging in some more specialized terms. The thing is, when small libraries get a discovery service from a library service company, that company usually attaches a bunch of free databases to the service to sweeten the deal. Unfortunately, this can leave you with a lot of useless databases in your discovery service that just crowd out relevant results instead of adding more of them.

Will these books ever get off the shelves again after a Discovery Service is implemented?

But the point of discovery services is to cover a lot of material, and small research libraries cannot afford to turn down free databases. They are thankful for every one (probably all 10 or 15) that they have. I love every single one of the databases we have like a small child.
Our Discovery Service also boosts our print collection in its relevancy ranking… though you would be hard pressed to actually come across any of our print collection in a search. This is because our print collection is so small and its metadata often isn’t great. It is unlikely that anyone will search on a topic that we have a book on. Even if they did, journal articles with more metadata (like an abstract…) would be deemed more relevant and ranked higher, pushing the book a few result pages back and thus out of existence.
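Here is a toy sketch of why the boost doesn't save the book. Everything in it is my own assumption (the field weights, the size of the print boost, the sample records); it is not how our unnamed vendor actually scores things. It only shows that a record matching in three metadata fields can easily outscore a boosted record that matches in one.

```python
# Hypothetical weights: a match in a richer field counts for more.
FIELD_WEIGHTS = {"subjects": 5.0, "title": 3.0, "abstract": 1.0}
LOCAL_PRINT_BOOST = 1.25  # made-up multiplier for books on our shelves


def relevance(record, query):
    """Sum the weights of every field containing the query, then apply the boost."""
    term = query.lower()
    text_score = sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        if term in record.get(field, "").lower()
    )
    boost = LOCAL_PRINT_BOOST if record.get("local_print") else 1.0
    return text_score * boost


# A print book with a bare-bones catalogue record: title only.
book = {"title": "Coral Reefs", "local_print": True}

# A journal article with subject headings and an abstract.
article = {
    "title": "Coral reef bleaching in warming seas",
    "subjects": "coral reefs; climate change",
    "abstract": "We survey coral reef bleaching events over two decades.",
}

for name, record in (("book", book), ("article", article)):
    print(name, relevance(record, "coral reef"))
```

With these made-up numbers the boosted book still lands well below the article, and a real result list has hundreds of articles like that one.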

I can confirm this. I have never come across a book from our catalogue in all my research using the discovery service (which will continue to remain nameless). The only time I have come across one of our books has been when I have been specifically looking for one.
So will implementing our discovery tool kill our book circulation stats? As a small library our book circulation stats are already pretty dead, so I doubt the discovery service will put any extra nails in the coffin. However, it is still very much a possibility.

I read this recent article by Kristin Calvert about implementing a discovery service at a university library. They predicted that the service would boost both e-journal usage and print circulation, while lowering Inter-Library Loan (ILL) requests. Their rationale was that it would provide access to more digital and print content, which would encourage its usage and stop searchers from needing to request more.

However, their prediction was wrong. Electronic journal usage did go up, but print circulation went down and ILLs went…well…fuzzy. The authors seem surprised that print circulation went down, as the other studies in their literature review didn’t see discovery services have that kind of pronounced effect. However, I think the size of their university’s physical collection matters here. This was done at a small university and, like with my library, it is not surprising that the books got lost among the articles. Also, with most university students being increasingly digital, discovery services allow them to choose articles over books, whereas before discovery services the library catalogue was their main portal to library resources. It is not surprising that when your catalogue gets less use, so do your books. But anyways, Calvert has a great discussion of this in her article and it is worth a read, not least so you can find out what happened to their ILL stats. It’s open access right here.

Different Relevancy Rankings

I want to look over how different discovery services rank results. Now that Discovery Tools are the main way students and faculty access library resources, the ranking algorithms behind these tools are incredibly important. Most students don’t go past the first page of results. Convenience and availability are often more important to them than how specific or scholarly a source is. Librarians have a duty to make sure their discovery services provide the best results possible, but of course, what makes a result the “best” result?

I am going to briefly consider the relevancy ranking systems of some different discovery tools. I don’t intend this to be an evaluation of which tool is best. I just want to look at the interesting features of each tool.

Let’s start with Ebscohost. Ebsco treats subject headings as the primary consideration in deciding whether a result is relevant. They also don’t boost full-text items; instead they give full text a very low ranking value, because a lot of reports, news articles, and press releases are full text. These kinds of items are often not what a searcher is looking for and can drown out the content they actually want. Ebsco also boosts newer items while pushing down older ones. That seems fair; searchers are usually looking for newer, relevant resources. Ebsco also boosts results that are peer reviewed and pushes down items that are not. This is a nice way to make sure that students get scholarly sources.
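To make those ingredients concrete, here is a rough sketch of how they could combine into one score. To be clear, this is not Ebsco's algorithm: the weights, the half-life recency decay, and the multiplicative combination are all my own assumptions. I am also reading "full text gets a low ranking value" as "a match found only in the full text counts for very little", which is a guess on my part.

```python
from dataclasses import dataclass

# All numbers below are made up for illustration, not Ebsco's actual values.
FIELD_WEIGHTS = {"subjects": 5.0, "title": 3.0, "abstract": 1.0, "full_text": 0.2}
PEER_REVIEWED_FACTOR = 1.5       # boost for peer-reviewed items
NOT_PEER_REVIEWED_FACTOR = 0.8   # push-down for everything else
HALF_LIFE_YEARS = 10.0           # an item loses half its score every 10 years


@dataclass
class Record:
    title: str
    year: int
    peer_reviewed: bool
    subjects: str = ""
    abstract: str = ""
    full_text: str = ""


def score(record, query, current_year):
    """Field matches first, then scaled by recency and peer-review status."""
    term = query.lower()
    field_score = sum(
        weight
        for name, weight in FIELD_WEIGHTS.items()
        if term in getattr(record, name).lower()
    )
    age = max(0, current_year - record.year)
    recency = 0.5 ** (age / HALF_LIFE_YEARS)
    review = PEER_REVIEWED_FACTOR if record.peer_reviewed else NOT_PEER_REVIEWED_FACTOR
    return field_score * recency * review


def rank(records, query, current_year):
    """Return the records sorted best-first."""
    return sorted(records, key=lambda r: score(r, query, current_year), reverse=True)
```

Under these assumptions, a recent peer-reviewed article with a matching subject heading beats an older one, and both beat a brand new press release that only mentions the term somewhere in its full text.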

Some university libraries actually set the default search on their discovery tool to return only peer-reviewed material. I like this. This is a Research Nudge (I am going to write a blog post about this soon): setting things up in a way that improves the research of the general user while not getting in the way of an experienced one, who can easily turn the feature off.

The discovery service Primo has practically all the features that Ebsco has: subject headings matter most, library books get boosted, full-text items get pushed down…etc. However, it also has its new ScholarRank feature, which does some very interesting things. ScholarRank boosts results that are more scholarly. This isn’t just boosting peer-reviewed material like Ebsco does; it also ranks and boosts content based on citations, attention, and use and access by the scholarly industry. This is really cool. It means the results on your first page are often incredibly important and relevant materials.

But ScholarRank goes further than just this. It also ranks your results based on your own research interests and search history. It considers what discipline you are in (e.g. a PhD astronomy student will get different results for the search “Mercury” than a first-year music student will) and what your past search history is like. This is a lot like the personalized results Google gives you. It allows you to find specialized and relevant sources right away. Watch this pretty hilarious video for a better explanation of this feature.

Now, immediately I have filter bubble worries here. If this feature were on by default but could be turned off, making it a Research Nudge (that term again!), I would be very happy with it.
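Here is a small sketch of that idea, using the “Mercury” example: personalization that is on by default but has an off switch. The discipline profiles, the boost value, and the sample results are all hypothetical, and I have no idea how ScholarRank actually implements any of this.

```python
# Hypothetical discipline profiles; a real system would infer these from
# far richer signals than a hand-written set of subject terms.
DISCIPLINE_SUBJECTS = {
    "astronomy": {"planets", "solar system"},
    "music": {"rock music", "singers"},
}
DISCIPLINE_BOOST = 2.0  # made-up multiplier


def rerank(results, discipline, personalize=True):
    """Sort (title, base_score, subjects) tuples, best first.

    With personalize=False every searcher gets the same order.
    """
    interests = DISCIPLINE_SUBJECTS.get(discipline, set()) if personalize else set()

    def adjusted(result):
        _title, base_score, subjects = result
        # Boost only when the record shares a subject with the searcher's profile.
        return base_score * (DISCIPLINE_BOOST if subjects & interests else 1.0)

    return sorted(results, key=adjusted, reverse=True)


mercury_results = [
    ("Mercury's magnetic field", 1.0, {"planets"}),
    ("Freddie Mercury and Queen", 1.0, {"rock music", "singers"}),
    ("Mercury levels in fish", 1.2, {"toxicology"}),
]

for who in ("astronomy", "music"):
    print(who, "->", rerank(mercury_results, who)[0][0])
```

Passing personalize=False is the off switch: the astronomy student and the music student then see exactly the same list, which is the escape hatch from the filter bubble.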

So ScholarRank is cool.

Ok, both Ebsco and Primo do something interesting/controversial when they boost library books. They aren’t boosting this content because books are important and get under-considered in search results. If that were true they would boost all books, including e-books and books not in the catalogue. Instead they are only boosting books that have a home on the library shelves. They are boosting less relevant material simply in order to boost library circulation stats! That’s kind of a shocking realization.

Librarians have this continual job anxiety/insecurity that things might get so good that they are no longer needed. Part of a librarian’s job is to get rid of their job: teaching everyone how to be an expert searcher and giving them the best technology, so everyone can get all the resources they need without needing to ask. And if people don’t need to ask, why do they need librarians? Is this what this is? Are libraries giving an extra boost to less relevant catalogue items so that libraries still need a physical collection and librarians still get a paycheck? Or is it less nefarious than that and just the general book fetish that librarians have?

Whatever it is, the discovery service Summon doesn’t go in for any of that stuff. It doesn’t boost items just because they are dearer to the librarians’ or the university’s heart. It prides itself that its “results are returned in a single, unbiased, relevance-ranked list”.

So there are some pretty clear differences between Ebsco, Primo, and Summon.

I think that’s all I got tonight. Sorry this post was so long.

Last thought: Libraries being able to boost material from their own institutional repository is pretty damn cool. This means that more of the research students use will come from researchers at their own university. Think of the wonders this could do for building a close scholarly community at a university. Also think of the terrible filter bubble it could create.

About Ryan Regier

Doing Library Stuff. Follow me on twitter at: @ryregier