Digitisation projects have created a wealth of online historical primary sources. These resources vary widely in scope and ambition, and in the sophistication and usability of their user interfaces. But even the most carefully designed resource is likely to frustrate the research needs of many of the historians who use it. Every project is governed by material and conceptual considerations which inevitably shape and constrain the final resource, whereas primary sources are multivalent, capable of answering a wide range of questions depending on research priorities and methods.
In this paper, I discuss the challenges and benefits of re-using already digitised primary source data, using the London Lives Petitions Project as my main case study. London Lives, 1690-1800 (www.londonlives.org) contains possibly the largest digitised collection of local petitions in existence (about 10,000, as it turns out). But they have been difficult to access and use, both because of the size and variety of the document series through which they are dispersed and because of the limitations of the existing data structure. My aim has been to extract those petitions from the London Lives XML data and create a text corpus with associated metadata which will enable in-depth research. I will examine how decisions made in the original digitisation project may help or hinder this task, and outline some of the key tools and techniques used in taking apart and reshaping an existing dataset to create a new resource.
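To give a sense of what extracting petitions from XML into a corpus with metadata might involve, the following is a minimal sketch in Python, not the project's actual code. The element names, the attributes (`type`, `id`, `year`) and the sample text are all invented for illustration and should not be read as the London Lives schema. The function name `extract_petitions` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the element and attribute names are invented
# for illustration and are NOT the actual London Lives markup.
SAMPLE_XML = """
<documents>
  <div type="petition" id="doc1" year="1705">
    <p>To the Worshipful Justices, the humble <hi>Petition</hi> of Mary Smith
    sheweth that she is poor.</p>
  </div>
  <div type="order" id="doc2" year="1705">
    <p>Ordered that the parish officers attend.</p>
  </div>
  <div type="petition" id="doc3" year="1742">
    <p>The humble petition of John Brown.</p>
  </div>
</documents>
"""


def extract_petitions(xml_text):
    """Return one record (metadata plus plain text) per petition element."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for div in root.iter("div"):
        # Assumption: petitions are flagged by a 'type' attribute.
        if div.get("type") != "petition":
            continue
        # Flatten nested markup to plain text and normalise whitespace.
        text = " ".join("".join(div.itertext()).split())
        records.append(
            {
                "id": div.get("id"),
                "year": int(div.get("year")),
                "word_count": len(text.split()),
                "text": text,
            }
        )
    return records


petitions = extract_petitions(SAMPLE_XML)
for record in petitions:
    print(record["id"], record["year"], record["word_count"])
```

In practice the harder part is likely to be identifying which documents are petitions at all, since real source data may not label them as neatly as this invented sample does.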