What is Scrutiny?

Funded by the JISC’s Rapid Innovation Programme, this project is a collaboration between two historians (Prof. Tim Hitchcock and Prof. Robert Shoemaker, the directors of the Old Bailey, Central Criminal Courts and Plebeian Lives projects), the humanities informatics team at the Humanities Research Institute (HRI) and a serious games company, PlayGen Limited. PlayGen will provide additional programming support as part of an ongoing knowledge exchange relationship between the company and the HRI.

The project will develop a Firefox extension called Scrutiny, which will scan web pages selected by individual users and highlight entities that it predicts will interest them. Its primary purpose is to help researchers, both inside and outside higher education (HE), locate potentially relevant information more quickly within large data objects such as journal articles or full-text datasets. This directly addresses the problem of information overload and should improve research productivity as a consequence.

Users will be able to train Scrutiny to identify entities which are relevant to their field of research in two ways: by using pre-defined, subject-specific ‘entity recognition files’, and by refining Scrutiny’s understanding of their personal interests through an iterative process of accepting or discarding the suggestions which Scrutiny presents. Scrutiny will be developed using natural language processing, including ‘named entity recognition’ based on a Bayesian learning methodology. In this instance, an entity could be the name of a person, a place, an artefact, a term or a phrase, depending on the subject of study. For example, the test datasets to be used by this project will focus on eighteenth- and nineteenth-century criminal justice and, as a result, the ‘entity’ identified might be a crime, a verdict or a sentence, or a collection of less well-defined types of behaviour recorded in depositions and criminal evidence.
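
The accept-or-discard training loop described above can be illustrated with a small sketch. This is not Scrutiny's actual code: it is written in Python rather than as a Firefox extension, it assumes a simple naive Bayes classifier over the words surrounding a suggested entity (the project description only says "a Bayesian learning methodology"), and the names `FeedbackModel`, `record` and `probability_relevant` are hypothetical.

```python
import math
from collections import Counter


class FeedbackModel:
    """Hypothetical naive Bayes model trained from accept/discard feedback."""

    LABELS = ("accept", "discard")

    def __init__(self):
        # Per-label counts of context words, and counts of feedback events.
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.label_counts = Counter()

    def record(self, context_words, accepted):
        """Store one piece of user feedback about a suggested entity."""
        label = "accept" if accepted else "discard"
        self.label_counts[label] += 1
        self.word_counts[label].update(w.lower() for w in context_words)

    def probability_relevant(self, context_words):
        """Estimate P(accept | context words) with Laplace smoothing."""
        vocab = set(self.word_counts["accept"]) | set(self.word_counts["discard"])
        total_events = sum(self.label_counts.values())
        log_scores = {}
        for label in self.LABELS:
            # Smoothed prior, so an untrained model returns 0.5.
            log_p = math.log((self.label_counts[label] + 1) / (total_events + 2))
            n_words = sum(self.word_counts[label].values())
            for word in (w.lower() for w in context_words):
                if word not in vocab:
                    continue  # words never seen in feedback carry no evidence
                count = self.word_counts[label][word]
                log_p += math.log((count + 1) / (n_words + len(vocab)))
            log_scores[label] = log_p
        # Normalise in log space to avoid underflow on long contexts.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["accept"] / (weights["accept"] + weights["discard"])


model = FeedbackModel()
model.record(["guilty", "theft", "transported"], accepted=True)
model.record(["weather", "market", "morning"], accepted=False)
likely = model.probability_relevant(["theft"])      # above 0.5
unlikely = model.probability_relevant(["weather"])  # below 0.5
```

Each accepted or discarded suggestion shifts the word statistics, so later suggestions in similar contexts score higher or lower. A pre-defined ‘entity recognition file’ could, under the same assumptions, be treated as a batch of feedback recorded before the user starts.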

Scrutiny will be available to download for free from this website, from our SourceForge account and from Mozilla’s add-on repository. All source code and documentation will be released as open source for further refinement and enhancement by the developer community.