Institutional repositories: running the numbers
Thu 1 Nov 2007, 2:45 pm
I have been thinking more about institutional repositories recently, partly because of all the interesting writing Dorothea Salo does on IRs, and partly because I have become involved in the Colorado Alliance Digital Repository project in a small way.
Brian “Ubiquitous Librarian” Mathews is thinking about repositories, too. In his post “WHAT GETS VIEWED? An exploratory study of large IR collections,” he outlines some basic statistics on the most-used items from a handful of repositories, then follows up with a list of good questions like “How many of these hits are from web crawlers or related software?” and “Why does the DSpace interface still look so mid-1990’s?” Gotta love that last one, since I’m on the user interface group for the Alliance Digital Repository.
I thought I’d point out Mathews’ post since it is interesting in its own right, and because he’s interested in finding collaborators. He writes, “If this is your area and you want to work on something together, let me know. I’m devoted to ALA Editions right now, but I’d like to continue this project into 2008.”

The thing that worries me about our digital repository is the problem of ghettoizing (is that the right word? I’m sure there’s a word): the problem of these awesome, huge, carefully created databases of images THAT ARE UNFINDABLE BY GOOGLE IMAGE SEARCH or similar searches. I’m worried that our low-tech, amateurishly slapped-up images on library webpages are going to be easier to find than the images we put a lot more effort into describing and archiving.
Comment by Jessy — November 2, 2007 @ 10:27 am
Jessy, that is a reasonable thing to worry and ask questions about, but there’s no reason why the stuff shouldn’t be findable by Google. If you look at Brian’s post, one of the questions is “is the googlebot artificially inflating the hit count.”
ADR is acting up, but check out this Google Image Search on “‘Fulvous Whistling-Duck’ cornell”; the first hit is from the Cornell IR. Plug any of the titles on Brian’s list into the regular Google search, and the doc in question should be the top hit.
Comment by Steve Lawson — November 2, 2007 @ 11:29 am