Content discovery demystified



Scholarly publishing consultants Tracy Gardner and Simon Inger recently concluded a large-scale study of how researchers navigate the flood of digitized scholarly content. Renew Training, the British company they run, will sell you the complete data set for a mere £1,000 (that’s $1,592), or the same information in a deluxe Excel spreadsheet, outfitted with specially designed analytic features, for £2,500 (a cool $3,981). Anyone whose curiosity is merely idle or penniless must settle for the “survey edition” of the consultants’ own analysis, in PDF, which is free.

As you would expect, it’s more of an advertisement than a report, with graphs that hint at how much data they have, and how many kinds of it, from around the world. Gardner and Inger’s own report, “How Readers Discover Content in Scholarly Journals,” is available in e-book format at a reasonable price – so I sprang for a copy and have culled some of their findings for this week’s column.

The key word here being some, because even the consultants’ non-exhaustive crunching of the numbers is pretty overwhelming. Between May and July of this year, they collected responses from more than 19,000 interview subjects spanning the populated world. The questions covered various situations in which someone might go looking for scholarly articles in a digital format and the considerable range of ways of going about it. Two-thirds of respondents were from academic institutions – with a large majority (three out of four) identifying themselves as researchers.

Roughly two-thirds of the respondents were from North America and Europe, and the interview itself was conducted in English. But enough participants came from the medical, corporate, and government sectors, and from countries in Africa, Oceania, and South America, to make the study something other than a report on Anglo-American academe. In addition, Gardner and Inger conducted a similar survey in 2008 (albeit with a much smaller harvest of data, from around 400 respondents). They also draw on a study they conducted in 2005 as consultants for another group.

The trends, then.
The range and size of digitally published scholarship keep growing, and a number of tools or approaches have developed for accessing material. Researchers rely on university library sites, abstracting and indexing (A&I) services, compilations of links assembled by learned societies or research teams, social networks, and search engines both general (Yahoo) and focused (Google Scholar). You might bookmark a favorite journal, or sign up for an e-mail alert when the table of contents for a new issue is out, or use the journal publisher’s website to find an article.

The survey questions cover three research “behaviors” common across the disciplines: (1) following up a citation, (2) browsing in the core journals in a given field, and (3) looking for articles on a specific subject. As indicated, quite a few ways of carrying out these tasks are now available. Some approaches are better-developed in one field than another. The survey shows that researchers in the life sciences use the National Institutes of Health’s bibliographical database PubMed “almost exclusively,” while the e-mailed table-of-contents (ToC) notifications for chemistry journals are rich enough in information for their readers to find them valuable.

And ease of access to sorting-and-channeling methods varies from one part of the world to the next. A researcher in a poor country is likely to use the search feature on a publisher’s website (bookmarked for just that purpose) for the simple reason that doing so is free – while someone working in a major research library may have access to numerous bibliographical tools so well-integrated into the digital catalog that users barely notice them as such.

North American researchers “are most likely to use an academic search engine or the library web pages if they have a citation,” the report notes, “whilst Europeans are more likely to go [to] the journal’s homepage.” Humanities scholars “rely much more on library web pages and especially aggregated collections of journals” than do researchers in the life sciences.

Comments made by social scientists reveal that they use “a much more varied list of resources” for following up citations, including one respondent who relied on “my husband’s library because mine is so bad.”

When browsing around the journals in their field, researchers in the field of education “are greater users of academic search engines and of web pages maintained by key research groups” than are people working in other areas. “Social scientists appear to use journal aggregations less than those in the humanities for reading the latest articles.” And all of them rank “library web pages and journal aggregations more highly” than do people in medicine and the physical and life sciences. One respondent indicated that it wasn’t really necessary to look through recent issues of journals in mathematics because “nowadays virtually all leading research in math is uploaded to arXiv.”

Specialized bibliographical databases “are still the most popular resource” for someone trying to read up on a particular topic, “and allowing for a margin of error [this preference] shows no significant change over time.” The web pages compiled by scholarly societies and research groups “have both shown a slight upward trend” in that regard, “which may be due to changes in publisher marketing strategies resulting in readers becoming more familiar [with] publisher and society brands.”

The rise of academic search engines is a new factor — and while there are others, such as Microsoft Academic Search, the bar graphs show Google Scholar looming over all competitors like a skyscraper over huts. And that’s not even counting the general-purpose Google search engine, which remains a standard tool for academic researchers.

One interesting point that the authors extract from the comments of participants is that many scholars remain unclear on the difference between a search engine and, say, a specialized bibliographical database. Unfortunately the survey seems not to have included information on respondents’ ages, though it would be interesting to know if that is a factor in recognizing such distinctions.

As I said, the e-book version is reasonably priced, and well within reach of anyone intrigued by this column’s aerial survey. The publishers and information managers who can afford the full-dress, all-the-data version, which will allow comparison between the research preferences of Malaysian physicists and German historians, and so forth, will be able to extract from it information on how better to engineer access to their content by the specific research constituencies using it.

For the rest of us, it’s a reminder of how many methods we have available for gaining access to the labyrinth of digital scholarship – and, perhaps, of how much we take them for granted.

Author Bio:
Scott McLemee is the Intellectual Affairs columnist for Inside Higher Ed. In 2008, he began a three-year term on the board of directors of the National Book Critics Circle. From 1995 until 2001, he was contributing editor for Lingua Franca. Between 2001 and 2005, he covered scholarship in the humanities as senior writer at The Chronicle of Higher Education.