Using computer science to classify potsherds

Archaeologists from Northern Arizona University have created a computer model that can accurately sort potsherds.

Using a form of machine learning known as a Convolutional Neural Network (CNN), Leszek Pawlowicz and Christian Downum started their project by creating a database of over 3,000 digital images of sherds of Tusayan White Ware – a type of pottery that largely dates to between AD 825 and 1300 and is predominantly found in north-eastern Arizona. They then enlisted the help of four ceramic experts to classify each sherd and create a consensus dataset. This dataset was used to train a CNN to classify sherd images by type. To make the model even more accurate, the images were randomly rotated and either enlarged or shrunk after each training cycle. In this way, the model should be able to identify sherds photographed at most orientations and image sizes.
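The random rotation and resizing described above is a form of data augmentation, and a minimal sketch of the idea follows. It is not the researchers' code: the function names, the restriction to quarter-turn rotations, the 0.8 to 1.2 scale range, and the nearest-neighbour resizing are all assumptions chosen to keep the example free of external libraries.

```python
import random


def rotate90(image, turns):
    """Rotate a 2D grid (list of rows) clockwise by `turns` quarter turns."""
    rotated = [list(row) for row in image]
    for _ in range(turns % 4):
        # Reversing the rows and transposing gives one clockwise quarter turn.
        rotated = [list(row) for row in zip(*rotated[::-1])]
    return rotated


def rescale(image, factor):
    """Nearest-neighbour resize by `factor` (>1 enlarges, <1 shrinks)."""
    height, width = len(image), len(image[0])
    new_h = max(1, round(height * factor))
    new_w = max(1, round(width * factor))
    return [
        [
            image[min(height - 1, int(r / factor))][min(width - 1, int(c / factor))]
            for c in range(new_w)
        ]
        for r in range(new_h)
    ]


def augment(image, rng, min_scale=0.8, max_scale=1.2):
    """Return a randomly rotated and rescaled copy of `image`.

    The scale range is an illustrative assumption, not a value from the study.
    """
    turns = rng.randrange(4)
    factor = rng.uniform(min_scale, max_scale)
    return rescale(rotate90(image, turns), factor)


# Hypothetical 10x10 greyscale "sherd image" used only for demonstration.
rng = random.Random(0)
sherd = [[(r * 10 + c) % 256 for c in range(10)] for r in range(10)]
variant = augment(sherd, rng)
```

Calling `augment` again before each training cycle would give the network a slightly different view of the same sherd each time, which is the effect the article describes.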

The researchers next assessed how accurately the computer could identify unclassified sherd images when compared with the answers of the four ceramic experts. They found that the CNN achieved an accuracy on par with, and in some cases even better than, the human classifiers. As this project used a relatively small sample, a larger dataset is expected to yield even better results.
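One way to picture the comparison is to score the model and each expert against a consensus label for every sherd. The sketch below assumes the consensus is a strict majority vote among the four experts; the article does not say how the consensus was actually reached, and the function names and the placeholder labels "A" and "B" are hypothetical.

```python
from collections import Counter


def consensus_label(votes):
    """Return the label chosen by more than half of the voters, else None."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


def accuracy(predicted, reference):
    """Fraction of sherds with a consensus label where `predicted` matches it."""
    pairs = [(p, r) for p, r in zip(predicted, reference) if r is not None]
    if not pairs:
        return 0.0
    return sum(p == r for p, r in pairs) / len(pairs)


# Hypothetical votes from four experts on three sherds (placeholder labels).
expert_votes = [
    ["A", "A", "B", "A"],  # clear majority for "A"
    ["B", "B", "B", "B"],  # unanimous
    ["A", "A", "B", "B"],  # tie: no consensus, so excluded from scoring
]
consensus = [consensus_label(votes) for votes in expert_votes]

model_predictions = ["A", "A", "B"]
model_accuracy = accuracy(model_predictions, consensus)

# Score a single expert the same way, here the third one.
third_expert = [votes[2] for votes in expert_votes]
third_expert_accuracy = accuracy(third_expert, consensus)
```

Scoring humans and model against the same reference is what makes a claim such as "on par with the experts" measurable.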

Photo: Pawlowicz and Downum (2021), Journal of Archaeological Science.

What makes this method particularly useful is that recent advances in CNN technology mean that, instead of just assigning types without justification, the new model can create a heat map (above) showing which characteristics it is using to make the classification. But the model still isn’t perfect, and some sherds may not be accurately identifiable by computer. It could be used to flag the most difficult-to-determine sherds, however, allowing experts to focus on these without having to sort through the whole assemblage.
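Flagging difficult sherds for expert attention could work along the following lines. This is a sketch under stated assumptions rather than the published method: it assumes the classifier outputs one raw score per pottery type, converts those scores to probabilities with a softmax, and treats a low top probability as a sign of a hard case. The function names, the sherd identifiers, and the 0.7 threshold are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def flag_for_review(scores_by_sherd, threshold=0.7):
    """Return (sherd_id, confidence) pairs below `threshold`, least confident first.

    The threshold is an illustrative assumption, not a value from the study.
    """
    flagged = []
    for sherd_id, scores in scores_by_sherd.items():
        confidence = max(softmax(scores))
        if confidence < threshold:
            flagged.append((sherd_id, confidence))
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical raw scores for two sherds across three pottery types.
scores_by_sherd = {
    "sherd_001": [5.0, 0.0, 0.0],  # one type clearly dominates
    "sherd_002": [1.0, 0.9, 0.8],  # scores nearly equal: ambiguous
}
needs_expert = flag_for_review(scores_by_sherd)
```

Here only `sherd_002` would be passed to a specialist, since the model's scores for it barely separate the candidate types.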

Commenting on the project, Pawlowicz said: ‘Now, using digital photographs of pottery, computers can accomplish what used to involve hundreds of hours of tedious, painstaking, and eye-straining work by archaeologists, who physically sorted pieces of broken pottery into groups, in a fraction of the time and with greater consistency.’

This research was recently published in the Journal of Archaeological Science: https://doi.org/10.1016/j.jas.2021.105375.