Assessing the similarity of the research goal
Please join our discussion forum for announcements, questions, etc.
Some countries have strict legislation regarding the authorization of animal testing. For instance, some require researchers to comply with the so-called 3R principles, i.e., strategies for the replacement of animal experiments with non-animal approaches, the reduction of the number of animals in individual experiments, and the refinement of experimental procedures to reduce animal suffering. Further, many countries require researchers to carry out a thorough literature search to ensure that no alternative approaches are currently available.
The SMAFIRA project aims to support researchers in finding alternative methods to animal experiments. Recently, we released our SMAFIRA Web tool [1], which allows researchers to perform such a search. The input to the tool is the PubMed identifier (PMID) of a publication, hereafter called the “reference article”, which represents the animal experiment for which they want to find an alternative method. The tool retrieves up to 200 similar articles available in PubMed and presents them as a list of results.
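As a rough illustration only (the actual SMAFIRA retrieval pipeline is described in [1]), PubMed's own list of similar articles for a given PMID can be fetched programmatically via the NCBI E-utilities elink endpoint:

```python
import requests

EUTILS_ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"

def pubmed_similar_articles(pmid: str, limit: int = 200) -> list[str]:
    """Fetch PubMed's precomputed 'similar articles' for a reference PMID."""
    params = {
        "dbfrom": "pubmed",
        "db": "pubmed",
        "id": pmid,
        "linkname": "pubmed_pubmed",  # PubMed's 'similar articles' link set
        "retmode": "json",
    }
    response = requests.get(EUTILS_ELINK, params=params, timeout=30)
    response.raise_for_status()
    links = response.json()["linksets"][0]["linksetdbs"][0]["links"]
    return [str(link) for link in links[:limit]]

# Example: the 20 most similar articles for one of the reference PMIDs below
print(pubmed_similar_articles("37775153")[:20])
```

Note that this is not necessarily the list shown by the SMAFIRA tool; it only illustrates how a comparable candidate list can be obtained from PubMed.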
One of the processing tasks currently carried out for the retrieved articles in SMAFIRA is re-ranking them based on the similarity of their research goals to that of the reference article. There are three possible similarity values: similar, uncertain, or not similar.
We propose a shared task for collaborative data annotation in the scope of the BioNLP workshop. We will release a list of reference articles, grouped according to pre-selected diseases (MeSH terms). For each reference article, participants will validate the top 20 similar articles, either automatically, with any system of their choice, or manually, using the SMAFIRA tool.
The SMAFIRA Web tool is freely accessible and no login is necessary. Instead, annotators can bookmark the URL of their session to resume annotation later, or share it with colleagues for collaborative annotation.
We previously released four case studies [2], which can be used for any purpose, e.g., as examples for learning the guidelines, for evaluation, or for few-shot approaches. We have already used this dataset for the evaluation of various similarity methods [3]. The mapping between the labels in these case studies and the similarity values is the following:
We will release the list of PMIDs, i.e., the reference articles. Participants are free to pick any of the reference articles from any of the topics, annotate the corresponding top 20 articles, and submit annotations for as many reference articles as they wish. The annotation can be carried out manually or automatically, e.g., as sketched below.
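For participants who choose automatic annotation, any method is allowed. The following is only a minimal sketch of one possible approach, assuming a general-purpose sentence-embedding model and arbitrary placeholder thresholds (these are not values endorsed or used by SMAFIRA):

```python
from sentence_transformers import SentenceTransformer, util

# A general-purpose embedding model, used here purely for illustration.
model = SentenceTransformer("all-MiniLM-L6-v2")

def goal_similarity_label(reference_text: str, candidate_text: str,
                          similar_threshold: float = 0.75,
                          uncertain_threshold: float = 0.55) -> str:
    """Map the cosine similarity of two title+abstract texts to the three labels.

    The thresholds are arbitrary placeholders and would need tuning, e.g.,
    on the released case studies [2].
    """
    embeddings = model.encode([reference_text, candidate_text])
    score = float(util.cos_sim(embeddings[0], embeddings[1]))
    if score >= similar_threshold:
        return "similar"
    if score >= uncertain_threshold:
        return "uncertain"
    return "not similar"
```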
We will release two batches of reference articles:
Please follow the procedure below for each reference article:
For each batch, we will release a test file that includes the reference articles and the corresponding similar articles, together with their texts (title and abstract).
Participation is open to individuals or teams. All participants should provide an institutional e-mail address, e.g., from the university, institute, or company in which they work or study. Participants may submit a paper to the shared task track of the BioNLP workshop and attend the event. Further, we plan to publish an overview paper of the shared task in a journal, and participants with valid submissions will be invited to contribute to the paper as co-authors.
All annotations will be made publicly available under an appropriate open-data licence, e.g., CC BY 4.0. All annotations will be attributed to their respective team, i.e., they are not anonymous.
We will compare the participants' annotations using inter-annotator agreement (IAA) metrics, e.g., the kappa coefficient. We will rank the participants (individuals or teams) in terms of their agreement with the others and in terms of the number of annotated reference articles.
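As an illustration, pairwise agreement between two annotators (or systems) that labeled the same candidate articles can be computed with Cohen's kappa, e.g., using scikit-learn:

```python
from sklearn.metrics import cohen_kappa_score

# Labels assigned by two annotators to the same candidate articles, in the
# same order; the values below are made up for illustration.
annotator_a = ["similar", "uncertain", "not similar", "similar", "not similar"]
annotator_b = ["similar", "similar", "not similar", "similar", "uncertain"]

print(f"Cohen's kappa: {cohen_kappa_score(annotator_a, annotator_b):.3f}")
```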
The annotation consists of assessing the similarity between the research goals of two articles, i.e., the reference article and one of the articles from the top 20 list. However, this is a very subjective task, and opinions might vary among annotators. Therefore, we did not try to define strict annotation guidelines, but only set a few rules and give some examples.
The annotation should follow the rules below:
Other suggestions:
More details about the assessment of similarity for the available case studies are described in [2], and we show some examples below. Please note that the details about disease, application, etc. are only shown for a better understanding of the similarity and should not be annotated.
PMID | Reference | Application | Disease | Disease feature |
---|---|---|---|---|
19735549 | - | Model development, Disease mechanism | breast cancer, human ductal carcinoma in situ (DCIS) | tumor initiation, growth, progress |
PMID | Similarity | Application | Disease | Disease feature |
---|---|---|---|---|
28707729 | similar | Disease mechanism | breast cancer, DCIS | promote invasive growth |
20421921 | similar | Disease mechanism | breast cancer, DCIS | progression into invasive |
19920187 | uncertain | Disease mechanism; Model development | breast cancer, DCIS | progression into invasive |
27374087 | uncertain | Disease mechanism | breast cancer, DCIS | progression into invasive |
22777354 | not similar | Disease mechanism | breast cancer, DCIS | regulation of tumor cell differentiation |
24691501 | not similar | Disease mechanism | breast cancer, DCIS | myoepithelial cell layer |
We pre-selected a list of 21 diseases, e.g., “Neoplasms” or “Musculoskeletal Diseases”, and their respective MeSH terms. We performed searches in PubMed for each disease, e.g., (Musculoskeletal Diseases[MeSH Major Topic]) AND (Models, Animal[MeSH Major Topic]), and manually selected five reference articles. A selected reference article must describe a proper animal experiment and must not be a review.
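For illustration, the same kind of PubMed query can be reproduced programmatically via the NCBI E-utilities esearch endpoint (this is only a sketch; the actual selection also involved manual curation):

```python
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

query = "(Musculoskeletal Diseases[MeSH Major Topic]) AND (Models, Animal[MeSH Major Topic])"
params = {"db": "pubmed", "term": query, "retmax": 100, "retmode": "json"}

response = requests.get(ESEARCH, params=params, timeout=30)
response.raise_for_status()
pmids = response.json()["esearchresult"]["idlist"]
print(len(pmids), pmids[:5])
```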
For instance, the article with PMID 37775153 belongs to the topic of “Musculoskeletal Diseases” and studies the effect of L-arginine metabolism on arthritis and inflammation-mediated bone loss. It proposes three methods, including transgenic mice (i.e., an animal experiment) as well as in vitro models.
For the manual annotation, here is the list of the 25 PMIDs, grouped by the five selected topics.
Infections | Neoplasms | Nervous System Diseases | Cardiovascular Diseases | Immune System Diseases |
---|---|---|---|---|
36159784 | 34233949 | 35709748 | 33635944 | 34503569 |
36577999 | 33320838 | 37084732 | 37010266 | 36179018 |
32485164 | 36311701 | 37339207 | 37380648 | 37079985 |
37071015 | 37429473 | 37749256 | 37268711 | 37256935 |
31689515 | 35623658 | 37126714 | 35917178 | 37168850 |
For the automatic annotation, here are the available files:
File | Batch1 | Batch2 |
---|---|---|
Test file | batch1.json | — |
TeamTat files | batch1_teamtat.zip | — |
Details about the files:
We also provide a sample submission file (in JSON), in which all similarity values are set to n/a (not available).
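The authoritative structure is the one in the released test and sample submission files; the sketch below only illustrates the general workflow of loading the test file and filling in the placeholder n/a values, and all field names in it are hypothetical:

```python
import json

# Hypothetical field names: the real structure is defined by batch1.json and
# the released sample submission file.
with open("batch1.json") as f:
    test_data = json.load(f)

submission = []
for reference in test_data:
    for candidate in reference["similar_articles"]:    # hypothetical field name
        submission.append({
            "reference_pmid": reference["pmid"],        # hypothetical field name
            "candidate_pmid": candidate["pmid"],        # hypothetical field name
            "similarity": "n/a",  # replace with: similar / uncertain / not similar
        })

with open("submission.json", "w") as f:
    json.dump(submission, f, indent=2)
```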
Please register your team by sending a message to the contact e-mail below. You should include the following information:
By registering for the shared task, you agree to have your annotations released at the end of the task. Further, you agree that we may use the participants’ data, e.g., names, affiliations, and e-mail addresses, in the scope of the shared task, including future publications in a journal. We will not redistribute this data to others, except for paper publication, and only after confirmation by the participants.
A valid submission to the shared task should consist of the following:
[1] Daniel Butzke et al. SMAFIRA: a literature-based web tool to assist researchers with retrieval of 3R-relevant information. Laboratory Animals, 58(4):369–373, 2024.
[2] Daniel Butzke et al. SMAFIRA-c: A benchmark text corpus for evaluation of approaches to relevance ranking and knowledge discovery in the biomedical domain. Preprint, 2020.
[3] Mariana Neves et al. Is the ranking of PubMed similar articles good enough? An evaluation of text similarity methods for three datasets. In: The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pp. 133–144, Toronto, Canada: Association for Computational Linguistics, July 2023.