Implementing CrossRef Text and Data Mining Services

1. Sign up using the contact form, or use it to ask questions: http://www.crossref.org/tdm/contact_form.html

2. Deposit additional metadata for the content you want to enable for Text and Data Mining. This metadata consists of:

NB: You can upload this information to CrossRef via a Resource Only Deposit or by uploading a .csv file containing the URI links and the related DOIs. Note that the .csv upload criteria were updated in January 2015 to allow publishers to deposit different full-text MIME types.

3. Publishers who are concerned about the impact of automated text and data mining harvesters on their site performance may want to implement Standard Rate Limiting Headers. This step is optional. Information is available here: http://tdmsupport.crossref.org/publishers/

If the content you want to enable is Open Access, or text and data mining is permitted via your standard subscription license, these are the only steps you need to take.

--------

Publishers who require researchers to agree to a specific set of Terms and Conditions (T&Cs) before they are allowed to text and data mine content that they otherwise have access to (e.g. through an existing subscription) will need to make use of the click-through service.

1. Publishers can log in to the click-through service using the same credentials that they use for depositing metadata with CrossRef.

2. Once logged in, publishers can upload and manage their T&Cs. Every agreement registered must have:

  • A unique URI which points to a copy of the T&Cs on the publisher’s site
  • A unique name
  • A short, non-legalese description of the T&Cs
  • The full text of the T&Cs, in Markdown format

Publishers can upload T&Cs, save drafts and publish online.

Remember that T&Cs must be published before researchers can preview, accept, or reject them. Note also that once a T&C agreement has been published and accepted by any researcher, it can no longer be edited or deleted. It can, however, be disabled, which prevents new acceptances or rejections of that version of the agreement.

3. Implement the click-through service API. Every researcher who uses the click-through service to accept or reject T&Cs is issued a secret Client API Token. Researchers include this Client API Token in their text and data mining tools when requesting the full text of an article. The publisher can then check with the click-through service API to see whether the researcher associated with the secret token has accepted the T&Cs that apply to the full text they are seeking to download.

To query the Publisher API, each publisher is also assigned a Publisher API Token. Using the combination of these two tokens, the publisher can check which T&Cs have been accepted or rejected with a simple HTTP request. The publisher includes the Publisher API Token in the headers of the request and the Client API Token in the URI itself. For a command that illustrates this, see: http://clickthroughsupport.crossref.org/publishers/.
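As a rough illustration of the request shape described above, here is a minimal Python sketch. It only builds the request and does not send it. The base URL and the header name are placeholders invented for this example, not the real values; the actual endpoint and header are given in the publisher documentation linked above.

```python
import urllib.request
from urllib.parse import quote

# ASSUMPTIONS: both values below are hypothetical placeholders.
# Take the real endpoint and header name from the click-through
# service's publisher documentation.
BASE_URL = "https://clickthrough.example.org/api/licenses"
PUBLISHER_HEADER = "X-Publisher-Token"


def build_license_check(client_token: str, publisher_token: str) -> urllib.request.Request:
    """Build (but do not send) a request asking which T&Cs a researcher accepted.

    The Client API Token goes in the URI; the Publisher API Token goes in
    a request header, as described in the text.
    """
    # Both tokens are secrets, so refuse to build a request for an open channel.
    if not BASE_URL.startswith("https://"):
        raise ValueError("the click-through service API must be called over HTTPS")
    # Percent-encode the token so it is safe to use as a single path segment.
    url = f"{BASE_URL}/{quote(client_token, safe='')}"
    return urllib.request.Request(
        url,
        headers={PUBLISHER_HEADER: publisher_token},
        method="GET",
    )
```

Passing the returned object to urllib.request.urlopen would perform the actual call; the format of the response is not covered here and should be taken from the service documentation.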

Please note two important things about the API:

  • The click-through service API requires the use of HTTPS. This is because both the Publisher API Token and the Client API Token need to be kept secret. They should never be passed on an open channel.
  • Publishers should not check for T&Cs with every researcher request. Doing so will make responses slow and overburden the click-through service API. It also isn’t necessary, because a researcher cannot “reject” T&Cs that they have previously “accepted”. Rather, publishers should consider checking for a particular Client API Token once every X requests or once every Y time span.
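One way to follow the second point is to cache the result of each check per Client API Token and refresh it only after a set number of requests or a set amount of time. The sketch below is one possible approach, not a prescribed implementation. The class name, the default limits, and the check callable (standing in for whatever function actually queries the click-through service API) are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, Tuple


class AcceptanceCache:
    """Remember T&C check results per Client API Token.

    A cached result is reused until it has served `max_requests` requests
    or is older than `max_age_seconds`, whichever comes first.
    """

    def __init__(
        self,
        check: Callable[[str], bool],      # hypothetical: queries the service API
        max_requests: int = 100,           # "X requests" (illustrative default)
        max_age_seconds: float = 3600.0,   # "Y time span" (illustrative default)
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._check = check
        self._max_requests = max_requests
        self._max_age = max_age_seconds
        self._clock = clock
        # token -> (accepted, requests served since last check, time of last check)
        self._entries: Dict[str, Tuple[bool, int, float]] = {}

    def is_accepted(self, client_token: str) -> bool:
        now = self._clock()
        entry = self._entries.get(client_token)
        if entry is not None:
            accepted, served, checked_at = entry
            fresh = served < self._max_requests and now - checked_at < self._max_age
            if fresh:
                # Serve from the cache and count this request.
                self._entries[client_token] = (accepted, served + 1, checked_at)
                return accepted
        # No entry, or the entry is stale: ask the service and reset the counters.
        accepted = self._check(client_token)
        self._entries[client_token] = (accepted, 1, now)
        return accepted
```

With max_requests=3, for example, the first request for a token triggers a check, the next two are answered from the cache, and the fourth triggers a fresh check. Because tokens are secret, a real deployment might prefer to key the cache on a hash of the token rather than the token itself.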