NMF and Textual Criticism

by Brent Niedergall | May 21, 2020 | Textual Criticism

It’s becoming increasingly apparent that the advances of our digital age are making a dramatic impact on the field of New Testament textual criticism. Computers can perform analysis that previously would have been impossible. At the same time, this puts a new requirement on textual critics to be versed in computer science. Joey McCollum is one such practitioner, and his newly published article in Andrews University Seminary Studies introduces a powerful new tool for the work of textual criticism. In short, it effectively solves the problem of text-types. While some have called for the abolition of text-types, these groupings have potential and recognized value if they can be firmly established. For instance, text-types can simplify the textual critic’s task by grouping manuscripts into families that share distinct readings. When you’re attempting to determine the genealogical flow of readings within the manuscript tradition, working with text-types or families is more manageable than working with thousands of individual manuscripts. Text-types can also aid our understanding of transmission history and improve our knowledge of individual manuscripts. This article presents a method that makes classifying manuscripts into text-types a simple and objective task.

In his article, “Biclustering Readings and Manuscripts via Non-negative Matrix Factorization,” Joey tackles the problem of assigning manuscripts to families or text-types (or clusters, as they are called in his paper) based on shared readings. On this point of terminology, Eldon Jay Epp points out: “‘Cluster’ is a positive term, emphasizing close contextual relationships, but avoiding the subtle implications in the term ‘text type’—that is rigid, constant, tightly circumscribed, and definitive” (Eldon Jay Epp, “Textual Clusters: Their Past and Future in New Testament Textual Criticism,” in The Text of the New Testament in Contemporary Research: Essays on the Status Quaestionis, ed. Bart D. Ehrman and Michael W. Holmes, 2nd ed. [Leiden: Brill, 2014], 519–577). With so many variants, it can be challenging to assign a manuscript to a specific family when some of its readings may be typical of one family, while others may be typical of another. In other words, how much do two manuscripts have to share to be considered part of the same cluster? And when you throw in the problem of contamination, the challenge becomes even more difficult. This is where non-negative matrix factorization, or NMF, comes in. The user feeds collation data into the program in the form of a data table, and in a matter of minutes the computer produces two new data tables: one that shows how strongly each reading corresponds to each cluster, and another that shows how strongly each manuscript corresponds to each of those clusters. If contamination is present within a manuscript, this is reflected in the second table, which shows that the manuscript has an affinity to multiple clusters.
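To make the "one table in, two tables out" idea concrete, here is a minimal sketch in plain Python. It is not Joey’s program. The tiny collation table is invented for illustration, and the names `nmf`, `shares`, the readings r1 to r6, and the manuscripts A to E are all hypothetical. It uses the textbook multiplicative-update form of NMF, which may differ from what the article’s software does. Rows are readings, columns are manuscripts, and a 1 means the manuscript has that reading. Manuscript E is deliberately "contaminated" with readings from both groups.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, k, iterations=2000, seed=0, eps=1e-9):
    """Factor V (readings x manuscripts) into non-negative W (readings x k)
    and H (k x manuscripts) using multiplicative updates (squared error)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element by element
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element by element
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    # Rescale each column of W to sum to 1 and let H absorb the scale,
    # so the rows of H are comparable. The product W*H is unchanged.
    for c in range(k):
        s = sum(W[r][c] for r in range(n)) or 1.0
        for r in range(n):
            W[r][c] /= s
        H[c] = [h * s for h in H[c]]
    return W, H

def shares(H, j):
    """Fraction of manuscript j's total weight that falls in each cluster."""
    col = [H[c][j] for c in range(len(H))]
    total = sum(col) or 1.0
    return [x / total for x in col]

# Hypothetical collation table (not real data): 6 readings x 5 manuscripts.
readings = ["r1", "r2", "r3", "r4", "r5", "r6"]
manuscripts = ["A", "B", "C", "D", "E"]
V = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

W, H = nmf(V, k=2)

print("Reading-to-cluster table (W):")
for name, row in zip(readings, W):
    print(name, [round(x, 2) for x in row])

print("Manuscript-to-cluster shares (from H):")
for j, name in enumerate(manuscripts):
    print(name, [round(s, 2) for s in shares(H, j)])
```

In a toy case like this, manuscripts A and B should land almost entirely in one cluster and C and D in the other, while E should split its weight between the two. That split is the kind of signal the second table gives for a contaminated manuscript. The number of clusters (`k=2` here) is something the user has to choose.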

As a test case, McCollum uses NMF on Tommy Wasserman’s complete collation data for Jude to demonstrate that the results correspond to the groupings previously identified by von Soden, Klaus Wachtel, and others. In other words, their intuitive findings agree with Joey’s results, and NMF reaches the same conclusions in a much more rigorous fashion. This is a promising tool, and I’m interested to see how it impacts the field of textual criticism. My only real concern is that because it sounds so complex, it may be ignored. Anyone who reads the article will probably wonder whether this actually is “a simple, automated, and efficient solution.” The simplicity lies in the tool, though, not necessarily in the algorithms under the hood, so concepts such as the cophenetic correlation coefficient may be tough for readers to grasp. But fertile ground for discovery may await those willing to learn and use NMF to uncover clusters within the mass of existing manuscripts.

You can access Joey’s article, “Biclustering Readings and Manuscripts via Non-negative Matrix Factorization,” in the current issue of Andrews University Seminary Studies.   

Those interested can access Joey’s NMF program and documentation for free at https://github.com/jjmccollum/jude-nmf.

Brent Niedergall

Pastor, Grammarian, Runner

Brent Niedergall, MDiv, is Chief Editor at Positive Action for Christ in Whitakers, North Carolina. He’s gone to war in Afghanistan, felled towering trees, and parsed Greek verbs.


