Is Google Scholar a database killer?
Tue 30 Nov 2010, 4:45 pm
It is interesting to read an article in the library literature that I feel is well-researched and well-written, but then to disagree completely with its conclusion.
That’s how I felt when I read Xiaotian Chen’s article “Google Scholar’s Dramatic Coverage Improvement Five Years after Debut,” which appears in the December Serials Review. (It is not freely available online but can be found at DOI 10.1016/j.serrev.2010.08.002 for those with ScienceDirect subscriptions.) The article demonstrates that Google Scholar is providing 98 to 100 percent coverage of the databases it is allowed to crawl, either because those databases are freely available or because Google has an agreement with the database’s publisher.
I first learned of Chen’s article through Peter Murray’s post to the Library Society of the World. Early in that discussion, John Dupuis called attention to the last line of the article: “The conclusion cannot be clearer: libraries can seriously consider cancelling a large number of subscription-based abstracts and indexes since their unique contents and value are rapidly evaporating.”
It’s possible that I’m missing an important piece of information that would change my mind, but I really don’t think that conclusion is clear at all.
Google Scholar doesn’t provide the full text of anything. So if libraries want readers to be able to get past the citation at JSTOR or other subscription-based databases, we can’t drop those subscriptions.
So the logical databases to drop would be the ones that provide indexing and abstracting, but not full text. But there are two problems I can see with that. First, I doubt that those databases would let Google crawl them, so they wouldn’t be duplicated in the Google Scholar database. Second, and more important, the non-full-text abstracting and indexing databases that I’m familiar with in the humanities and social sciences tend to index a lot of works that are not journal articles. And as Chen says in the article, Google Scholar doesn’t do so well with those citations:
It is always possible that a gap exists between Google Scholar and a database that does not allow Google Scholar to crawl. In the 2005 Neuhaus et al. study, databases such as ABI/INFORM, CINAHL, and Historical Abstracts all had low coverage by Google Scholar. Part of the reason was that these databases include some records that Google Scholar does not or cannot index: non-journal records and some records from journals that have ceased publication. Non-journal records include records of newspapers, magazines, trade journals, book chapters, pamphlets, reports, conference proceedings, theses and dissertations. Ceased journals may not have publicly accessible tables of contents on the Web for Google Scholar to index.
So. If we can’t cancel JSTOR and ScienceDirect and so on because that’s where the full text comes from, and we can’t cancel ABI/INFORM, CINAHL, and Historical Abstracts (and MLA International Bibliography and Philosopher’s Index and ATLAS and so on), what is left to cut? Just the databases that do nothing but index articles that are already held in those full-text archives? I don’t know that we subscribe to anything like that.
So I can’t agree with Chen that the impact of Google Scholar on abstracting and indexing databases “cannot be clearer.” I doubt that Google Scholar is a specialty database killer. It almost certainly is a federated search killer. If a library has already decided that they are interested in sacrificing precise, predictable searching for simple searching and broad results, I’d think they’d be much better off if they foregrounded links to Google Scholar and came up with a coordinated approach to teaching it to students, rather than sinking time into customizing a vendor’s product and money into paying a vendor’s fees.
But Google Scholar as a replacement for subject-specific A&I databases doesn’t make sense to me.

Steve,
Thank you for commenting on my recent Google Scholar article.
Some quick clarifications on your comments: 1. My conclusion is about the reduced value of “subscription-based abstracts and indexes since their unique contents and value are rapidly evaporating.” It is not about the value of full-text journal collections such as JSTOR. 2. Yes, Historical Abstracts is one of these “abstracts and indexes.” It is true that Historical Abstracts and others still have a small amount of unique content, for the reasons I mentioned in my article, but it has reached the point where that unique content may not be worth the total cost.
I have a “sister” article:
The Declining Value of Subscription-Based Abstracting and Indexing Services in the New Knowledge Dissemination Era. Serials Review. 36 (2), 79-85. http://dx.doi.org/10.1016/j.serrev.2010.02.010. This one is about the availability of data posted by publishers, and my GS article is about the GS capability of covering those data.
Comment by Xiaotian Chen — December 7, 2010 @ 10:14 am
Thanks for the comment, Xiaotian.
I think I understand (and don’t disagree with) what you say in your comment–I just have drawn different conclusions about what that means in terms of the value of databases like Historical Abstracts. I suppose in the next few years we’ll see which one of us is closer to the truth.
I enjoyed the article of yours that I commented on, and I’ll take a look at this other article soon.
Comment by steve — December 7, 2010 @ 11:24 am
Steve,
It is possible that both of us are right :-). However, my library has cancelled quite a few expensive indexes over the past few years. Current Contents, Engineering Index, Historical Abstracts, to name only a few. Engineering is one of the key colleges on my campus. When the library consulted engineering faculty about cancelling Engineering Index, we did not receive a single objection. That means our users have moved on.
Comment by Xiaotian Chen — December 7, 2010 @ 12:28 pm
Interesting! This is actually the follow-up study that I would like to see: a qualitative assessment of the effect of Google Scholar so far on subscriptions, instruction, and so on.
Comment by steve — December 7, 2010 @ 12:47 pm
I typically recommend GS to students when their topics are still very general and either not focused or unusually defined.
With more precise research requests, it seems unproductive (to say the least) not to send them to more traditional research databases.
Comment by Leo Robert Klein — December 8, 2010 @ 6:08 pm
Leo, that’s similar to my approach, too. I’ll probably write more about that soon.
Comment by steve — December 9, 2010 @ 9:20 am