SAPIENTA Web Service

A free-for-non-profit scientific paper extraction engine.

SAPIENTA was developed by Maria Liakata et al. in 2012 and ported to Python 2.7 by James Ravenscroft in 2013 to support his Partridge project. As of January 2021, it has been updated to run on Python 3.x.

Using SAPIENTA

For a small number of papers (1-5), we provide a web UI.
For a medium or large batch (5-500 papers), the easiest way to use our web service is our command line tool.
For very large jobs (500+ papers), the easiest option may be to run SAPIENTA on your own machine via our Docker images.
We also provide a public API; documentation for its usage can be found here.

Please note that this is a volunteer project and our web service API is provided with absolutely NO warranty, SLA or guaranteed uptime.

Supported Files

SAPIENTA supports annotation of scientific papers in PDF format via the University of Manchester's PDFX conversion service. You can also upload any JATS (e.g. documents from PLoS and PubMed Central) or SciXML compatible XML document.

Documents must be 5MB or smaller.
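
Before submitting a batch, it can help to filter out files the service would not accept. The following is a minimal sketch, not part of SAPIENTA itself: the names MAX_BYTES, SUPPORTED_SUFFIXES, is_uploadable and split_batch are hypothetical, "5MB" is assumed to mean 5 * 1024 * 1024 bytes, and file type is guessed from the extension only (it does not check that an XML file is actually JATS or SciXML).

```python
from pathlib import Path

# Assumption: the "5MB" limit is read as 5 MiB; adjust if the service counts differently.
MAX_BYTES = 5 * 1024 * 1024
# Assumption: supported formats are identified by file extension alone.
SUPPORTED_SUFFIXES = {".pdf", ".xml"}


def is_uploadable(path: Path) -> bool:
    """Return True if the file has a supported extension and fits the size limit."""
    return (
        path.is_file()
        and path.suffix.lower() in SUPPORTED_SUFFIXES
        and path.stat().st_size <= MAX_BYTES
    )


def split_batch(paths):
    """Partition paths into (accepted, rejected) lists, preserving input order."""
    accepted, rejected = [], []
    for p in map(Path, paths):
        (accepted if is_uploadable(p) else rejected).append(p)
    return accepted, rejected
```

For example, split_batch(Path("papers").iterdir()) would return the files worth submitting alongside those that are too large or of an unsupported type.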

Help & Support

SAPIENTA is maintained voluntarily and not-for-profit by a very small team. We are unable to provide fast-turnaround support for the service. If you encounter an issue with the web service (e.g. because some component is down), please contact James Ravenscroft via Twitter. If you find a bug in the source code of SAPIENTA or are struggling to get it running locally, open a ticket here. We will endeavour to get things running smoothly as soon as possible but can offer no guarantees.

Citing SAPIENTA

If you use our web service or our software as part of an academic work, please cite the following paper:

@article{liakata2012automatic,
        title={Automatic recognition of conceptualization zones in scientific articles and two life science applications},
        author={Liakata, Maria and Saha, Shyamasree and Dobnik, Simon and Batchelor, Colin and Rebholz-Schuhmann, Dietrich},
        journal={Bioinformatics},
        volume={28},
        number={7},
        pages={991--1000},
        year={2012},
        publisher={Oxford University Press}
}