Showing posts with label scholarly communication. Show all posts

Thursday, July 10, 2025

The Festschrift For Cliff Lynch

The festschrift that includes the edited version of the draft we posted back in April entitled Lots Of Cliff Keeps Stuff Safe has been officially published as Networking Networks: A Festschrift in Honor of Clifford Lynch, an open access supplement to portal: Libraries and the Academy 25, no. 3. Joan K. Lippincott writes:
The final CNI membership meeting of Cliff’s tenure, held April 7–8, 2025, in Milwaukee, was to include a surprise presentation of the Festschrift’s table of contents. Though Cliff’s health prevented him from attending in person, he participated virtually and heard readings of excerpts from each contribution. Clifford Lynch passed away shortly after, on April 10, 2025. Authors completed their essays before his passing, and the original text remains unchanged.
Below the fold is a brief snippet of each of the invited contributions and some comments.

Tuesday, April 15, 2025

Cliff Lynch RIP

Last Tuesday Cliff Lynch delivered an abbreviated version of his traditional closing summary and bon voyage to CNI's 2025 Spring Membership Meeting via Zoom from his sick-bed. Last Thursday night he died, still serving as Executive Director. CNI has posted In Memoriam: Clifford Lynch.

Cliff impacted a wide range of areas. The best overview is Mike Ashenfelder's 2013 profile of Cliff Lynch in the Library of Congress' Digital Preservation Pioneer series, which starts:
Clifford Lynch is widely regarded as an oracle in the culture of networked information. Lynch monitors the global information ecosystem for cultural trends and technological developments. He ponders their variables, interdependencies and influencing factors. He confers with colleagues and draws conclusions. Then he reports his observations through lectures, conference presentations and writings. People who know about Lynch pay close attention to what he has to say.

Lynch is a soft-spoken man whose work, for more than thirty years, has had an impact — directly or indirectly — on the computer, information and library science communities.
Below the fold are some additional personal notes on Cliff's contributions.

Thursday, March 6, 2025

The Oligopoly Publishers

Rupak Ghose's The $100 billion Bloomberg for academics and lawyers? is essential reading for anyone interested in academic publishing. He starts by charting the stock prices of RELX, Thomson Reuters, and Wolters Kluwer, pointing out that in the past decade they have increased about ten-fold. He compares these publishers to Bloomberg, the financial news service. They are less profitable, but that's because their customers are less profitable. Follow me below the fold for more on this.

Thursday, February 1, 2024

The Stanford Digital Library Project

The Stanford Digital Library Project stated its goal thus:
The Stanford Integrated Digital Library Project will develop enabling technologies for an integrated “virtual” library to provide an array of new services and uniform access to networked information collections. The Integrated Digital Library will create a shared environment linking everything from personal information collections, to collections of conventional libraries, to large data collections shared by scientists.
Stanford librarians Vicky Reich and Rebecca Wesley provided the "library" input for the research.

Wayback Machine, 11/11/98
In particular, Vicky explained citation indices, the concept behind PageRank, to Larry Page and Sergey Brin. Andy Bechtolsheim was famously instrumental in persuading them to turn their demo of a PageRank search engine into Google, the company. In his fascinating interview in the Computer History Museum's oral history collection, Andy explains why the idea of ranking pages by their inbound links was so important.
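To make the idea of ranking pages by their inbound links concrete, here is a minimal sketch of the textbook PageRank iteration. It is not Google's production algorithm; the damping factor of 0.85, the iteration count, and the four-page toy web are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal shares to the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "c", and "c" links back to "a".
web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(web)
best = max(ranks, key=ranks.get)
```

In this toy web, page "c" ends up with the highest rank because it has the most inbound links, and "a" comes second because its single inbound link is from the highly ranked "c". That is the citation-index intuition: a citation counts for more when it comes from a heavily cited source.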

Below the fold I have taken the liberty of transcribing and cleaning up the relevant section of Andy's stream of consciousness, both because it is important history and because it exactly reflects the Andy I was privileged to know in the early days of Sun Microsystems.

Thursday, May 11, 2023

Flooding The Zone With Shit

Tom Cowap
CC-BY-SA 4.0
Much of the discussion occupying the Web recently has been triggered by the advent of Large Language Models (LLMs). Much of that has been hyping the vast improvements in human productivity they promise, and glossing over the resulting unemployment among the chattering and coding classes. But the smaller negative coverage, while acknowledging the job losses, has concentrated on the risk of "The Singularity", the idea that these AIs will go HAL 9000 on us, and render humanity obsolete[0].

My immediate reaction to the news of ChatGPT was to tell friends "at last, we have solved the Fermi Paradox"[1]. It wasn't that I feared being told "This mission is too important for me to allow you to jeopardize it", but rather that I assumed that civilizations across the galaxy evolved to be able to implement ChatGPT-like systems, which proceeded to irretrievably pollute their information environment, preventing any further progress.

Below the fold I explain why my on-line experience, starting from Usenet in the early 80s, leads me to believe that humanity's existential threat from these AIs comes from Steve Bannon and his ilk flooding the zone with shit[2].

Tuesday, July 26, 2022

The Internet Archive's "Long Tail" Program

In 2018 I helped the Internet Archive get a two-year Mellon Foundation grant aimed at preserving the "long tail" of academic literature from small publishers, which is often at great risk of loss. In 2020 I wrote The Scholarly Record At The Internet Archive explaining the basic idea:
The project takes two opposite but synergistic approaches:
  • Top-Down: Using the bibliographic metadata from sources like CrossRef to ask whether that article is in the Wayback Machine and, if it isn't, trying to get it from the live Web. Then, if a copy exists, adding the metadata to an index.
  • Bottom-up: Asking whether each of the PDFs in the Wayback Machine is an academic article, and if so extracting the bibliographic metadata and adding it to an index.
Below the fold I report on subsequent developments in this project.

Thursday, September 17, 2020

Don't Say We Didn't Warn You

Just over a quarter-century ago, Stanford Libraries' HighWire Press pioneered the switch of academic journal publishing from paper to digital when they put the Journal of Biological Chemistry on-line. Even in those early days of the Web, people understood that Web pages, and links to them, decayed over time. A year later, Brewster Kahle founded the Internet Archive to preserve them for posterity.

One difficulty was that although academic journals contained some of the Web content that was most important to preserve for the future, the Internet Archive could not access them because they were paywalled. Two years later, Vicky Reich and I started the LOCKSS (Lots Of Copies Keep Stuff Safe) program to address this problem. In 2000's Permanent Web Publishing we wrote:
Librarians have a well-founded confidence in their ability to provide their readers with access to material published on paper, even if it is centuries old. Preservation is a by-product of the need to scatter copies around to provide access. Librarians have an equally well-founded skepticism about their ability to do the same for material published in electronic form. Preservation is totally at the whim of the publisher.

A subscription to a paper journal provides the library with an archival copy of the content. Subscribing to a Web journal rents access to the publisher's copy. The publisher may promise "perpetual access", but there is no business model to support the promise. Recent events have demonstrated that major journals may vanish from the Web at a few months notice.

This poses a problem for librarians, who subscribe to these journals in order to provide both current and future readers with access to the material. Current readers need the Web editions. Future readers need paper; there is no other way to be sure the material will survive.
Now, Jeffrey Brainard's Dozens of scientific journals have vanished from the internet, and no one preserved them and Diana Kwon's More than 100 scientific journals have disappeared from the Internet draw attention to this long-standing problem. Below the fold I discuss the paper behind the Science and Nature articles.

Thursday, June 18, 2020

Breaking: Peer Review Is Broken!

The subhead of The Pandemic Claims New Victims: Prestigious Medical Journals by Roni Caryn Rabin reads:
Two major study retractions in one month have left researchers wondering if the peer review process is broken.
Below the fold I explain that the researchers who are only now "wondering if the peer review process is broken" must have been asleep for more than the last decade.

Tuesday, June 16, 2020

Supporting Open Source Software

In the Summer 2020 issue of Usenix's ;login: Dan Geer and George P. Sieniawski have a column entitled Who Will Pay the Piper for Open Source Software Maintenance? (it will be freely available in a year). They make many good points, some of which are relevant to my critique in Informational Capitalism of Prof. Kapczynski's comment that:
open-source software is fully integrated into Google’s Android phones. The volunteer labor of thousands thus helps power Google’s surveillance-capitalist machine.
Below the fold, I discuss "the volunteer labor of thousands".

Thursday, June 4, 2020

"More Is Not Better" Revisited

I have written many times on the topic of scholarly communication since the very first post to this blog thirteen years ago. The Economist's "Graphic Detail" column this week is entitled How to spot dodgy academic journals. It is about the continuing corruption of the system of academic communication, and features this scary graph. It shows:
  • Rapid but roughly linear growth in the number of "reliable" journals launched each year. About three times as many were launched in 2018 as in 1978.
  • Explosive growth since 2010 in the number of "predatory" journals launched each year. In 2018 almost half of all journals launched were predatory.
Below the fold, some commentary.

Thursday, April 23, 2020

Funder Publishing Platforms

After posting Never Let A Crisis Go To Waste earlier this month, I can't resist a shout-out to Elizabeth Gadd's The purpose of publications in a pandemic and beyond:
The virus is reminding us that the purpose of scholarly communication is not to allocate credit for career advancement, and neither is it to keep publishers afloat. Scholarly communication is about, well, scholars communicating with each other, to share insights for the benefit of humanity. And whilst we’ve heard all this before, in a time of crisis we realise afresh that this isn’t just rhetoric, this is reality.
Below the fold, a few comments.

Tuesday, April 7, 2020

Never Let A Crisis Go To Waste

On March 13th, an Elsevier press release entitled Elsevier gives full access to its content on its COVID-19 Information Center for PubMed Central and other public health databases to accelerate fight against coronavirus announced:
From today, Elsevier, a global leader in research publishing and information analytics specializing in science and health, is making all its research and data content on its COVID-19 Information Center available to PubMed Central, the archive of biomedical and lifescience at the US. National Institutes of Health’s National Library of Medicine, and other publicly funded repositories globally, such as the WHO COVID database, for as long as needed while the public health emergency is ongoing. This additional access allows researchers to use artificial intelligence to keep up with the rapidly growing body of literature and identify trends as countries around the world address this global health crisis.
Elsevier and the other oligopoly academic publishers have reacted similarly in earlier virus outbreaks. Prof. John Willinsky pounced on this admission that these companies' normal restrictive access policies based on copyright ownership slow the progress of science, and thus violate the US Constitution's intellectual property clause:
That Congress shall have Power...To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.
Below the fold I provide some details of his proposal.

Tuesday, March 3, 2020

Falling Research Productivity Revisited

Last year, in Falling Research Productivity, I commented on Are Ideas Getting Harder to Find? by Nicholas Bloom et al. Now, The Economist's current issue has a Free Exchange column entitled How to get more innovation bang for the research buck that takes off from the same paper:
In a paper by Nicholas Bloom, Charles Jones and Michael Webb of Stanford University, and John Van Reenen of the Massachusetts Institute of Technology (MIT), the authors note that even as discovery has disappointed, real investment in new ideas has grown by more than 4% per year since the 1930s. Digging into particular targets of research—to increase computer processing power, crop yields and life expectancy—they find that in each case maintaining the pace of innovation takes ever more money and people.
Follow me below the fold for some commentary on a number of the other papers they cite.

Tuesday, February 18, 2020

The Scholarly Record At The Internet Archive

The Internet Archive has been working on a Mellon-funded grant aimed at collecting, preserving and providing persistent access to as much of the open-access academic literature as possible. The motivation is that much of the "long tail" of academic literature comes from smaller publishers whose business model is fragile, and who are at risk of financial failure or takeover by the legacy oligopoly publishers. This is particularly true if their content is open access, since they don't have subscription income. This "long tail" content is thus at risk of loss or vanishing behind a paywall.

The project takes two opposite but synergistic approaches:
  • Top-Down: Using the bibliographic metadata from sources like CrossRef to ask whether that article is in the Wayback Machine and, if it isn't, trying to get it from the live Web. Then, if a copy exists, adding the metadata to an index.
  • Bottom-up: Asking whether each of the PDFs in the Wayback Machine is an academic article, and if so extracting the bibliographic metadata and adding it to an index.
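The two approaches can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the Internet Archive's actual pipeline: the archive and the live Web are modeled as plain dictionaries from URL to text, and `looks_like_article`, the record format, and the example URLs are all hypothetical.

```python
def top_down(records, archive, live_web, index):
    """Start from known bibliographic metadata and look for the document."""
    for record in records:
        url = record["url"]
        if url not in archive and url in live_web:
            # Not yet archived: try to capture it from the live Web.
            archive[url] = live_web[url]
        if url in archive:
            # A copy exists, so the metadata goes into the index.
            index[url] = {"title": record["title"], "found_by": "top-down"}


def looks_like_article(text):
    """Crude hypothetical classifier; a real one would be far more careful."""
    lowered = text.lower()
    return "abstract" in lowered and "references" in lowered


def bottom_up(archive, index):
    """Start from archived documents and extract metadata from likely articles."""
    for url, text in archive.items():
        if url in index or not looks_like_article(text):
            continue
        # Hypothetical metadata extraction: treat the first line as the title.
        title = text.strip().splitlines()[0]
        index[url] = {"title": title, "found_by": "bottom-up"}


# Toy data standing in for the Wayback Machine, the live Web, and CrossRef.
archive = {
    "https://example.org/a.pdf": "Paper A\nAbstract ...\nReferences ...",
    "https://example.org/flyer.pdf": "Conference flyer\nRegister now",
    "https://example.org/c.pdf": "Paper C\nAbstract ...\nReferences ...",
}
live_web = {"https://example.org/b.pdf": "Paper B\nAbstract ...\nReferences ..."}
records = [
    {"url": "https://example.org/a.pdf", "title": "Paper A"},
    {"url": "https://example.org/b.pdf", "title": "Paper B"},
    {"url": "https://example.org/gone.pdf", "title": "Paper Gone"},
]

index = {}
top_down(records, archive, live_web, index)
bottom_up(archive, index)
```

In this toy run the top-down pass indexes Paper A, which was already archived, and Paper B, which it first captures from the live Web. The bottom-up pass then picks up Paper C, which no metadata record mentioned. The flyer is rejected by the classifier, and the record whose document is neither archived nor live stays out of the index. The approaches are synergistic because each finds articles the other would miss.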
Below the fold, a discussion of the progress that has been made so far.

Tuesday, November 12, 2019

Academic Publishers As Parasites

This is just a quick post to draw attention to From symbiont to parasite: the evolution of for-profit science publishing by UCSF's Peter Walter and Dyche Mullins in Molecular Biology of the Cell. It is a comprehensive overview of the way the oligopoly publishers obtained and maintain their rent-extraction from the academic community:
"Scientific journals still disseminate our work, but in the Internet-connected world of the 21st century, this is no longer their critical function. Journals remain relevant almost entirely because they provide a playing field for scientific and professional competition: to claim credit for a discovery, we publish it in a peer-reviewed journal; to get a job in academia or money to run a lab, we present these published papers to universities and funding agencies. Publishing is so embedded in the practice of science that whoever controls the journals controls access to the entire profession."
My only criticisms are a lack of cynicism about the perks publishers distribute:
  • They pay no attention to the role of librarians, who after all actually "negotiate" with the publishers and sign the checks.
  • They write:
    we work for them for free in producing the work, reviewing it, and serving on their editorial boards
    We have spoken with someone who used to manage top journals for a major publisher. His internal margins were north of 90%, and the single biggest expense was the care and feeding of the editorial board.
And they are insufficiently skeptical of claims as to the value that journals add. See my Journals Considered Harmful from 2013.

Despite these quibbles, you should definitely go read the whole paper.

Thursday, October 24, 2019

Future of Open Access

The Future of OA: A large-scale analysis projecting Open Access publication and readership by Heather Piwowar, Jason Priem and Richard Orr is an important study of the availability and use of Open Access papers:
This study analyses the number of papers available as OA over time. The models includes both OA embargo data and the relative growth rates of different OA types over time, based on the OA status of 70 million journal articles published between 1950 and 2019.

The study also looks at article usage data, analyzing the proportion of views to OA articles vs views to articles which are closed access. Signal processing techniques are used to model how these viewership patterns change over time. Viewership data is based on 2.8 million uses of the Unpaywall browser extension in July 2019.
They conclude:
One interesting realization from the modeling we’ve done is that when the proportion of papers that are OA increases, or when the OA lag decreases, the total number of views increase -- the scholarly literature becomes more heavily viewed and thus more valuable to society.
This clearly demonstrates one part of the value that open access adds. Below the fold, some details and commentary.

Thursday, October 17, 2019

Be Careful What You Measure

"Be careful what you measure, because that's what you'll get" is a management platitude dating back at least to V. F. Ridgway's 1956 Dysfunctional Consequences of Performance Measurements:
Quantitative measures of performance are tools, and are undoubtedly useful. But research indicates that indiscriminate use and undue confidence and reliance in them result from insufficient knowledge of the full effects and consequences. ... It seems worth while to review the current scattered knowledge of the dysfunctional consequences resulting from the imposition of a system of performance measurements.
Back in 2013 I wrote Journals Considered Harmful, based on Deep Impact: Unintended consequences of journal rank by Björn Brembs and Marcus Munafò, which documented that the use of Impact Factor to rank journals had caused publishers to game the system, with negative impacts on the integrity of scientific research. Below the fold I look at a recent study showing similar negative impacts on research integrity.

Tuesday, August 20, 2019

A Tribute To Don Waters

Michael Keller has written, in Exploiting the opportunities of the maturing digital age: the first twenty years of the Scholarly Communications Program of the Andrew W. Mellon Foundation, what is effectively a richly deserved tribute to Don Waters as his retirement looms. Below the fold, some commentary and my two cents worth.

Thursday, July 25, 2019

Carl Malamud's Text Mining Project

For many years now it has been obvious that humans can no longer effectively process the enormous volume of academic publishing. The entire system is overloaded, and its signal-to-noise ratio is degrading. Journals are no longer effective gatekeepers, indeed many are simply fraudulent. Peer review is incapable of preventing fraud, gross errors, false authorship, and duplicative papers; reviewers cannot be expected to have read all the relevant literature.

On the other hand, there is now much research showing that computers can be effective at processing this flood of information. Below the fold I look at a couple of recent developments.

Tuesday, May 21, 2019

Ten Hot Topics

The topic of scholarly communication has received short shrift here for the last few years. There has been too much to say about other topics, and developments such as Plan S have been exhaustively discussed elsewhere. But I do want to call attention to an extremely valuable review by Jon Tennant and a host of co-authors entitled Ten Hot Topics around Scholarly Publishing.

The authors pose the ten topics as questions, which allows for a scientific experiment. My hypothesis is that all these questions, while not strictly headlines, will nevertheless obey Betteridge's Law of Headlines, in that the answer will be "No". Below the fold, I try to falsify my hypothesis.