ARTEMIS: A new method to identify cancers from repeat elements of the genome once dismissed as junk DNA

 


More than half of the human genome consists of repeat sequences, which vary between individuals and play key roles in regulating genome structure and function. However, short-read alignment (the process of locating the origin of short DNA reads in the genome by comparing them with a reference) cannot reliably place reads that originate from repeats, so these regions went largely unexamined for a long time. Repetitive sequences consist mainly of tandem repeats and retrotransposons. Tandem repeats such as human satellites are found in centromeres, telomeres, and acrocentric chromosomes. Retrotransposons are a broadly diverse family of genome repeats that includes long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), long terminal repeats (LTRs), and other transposable elements.
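The alignment problem described above can be illustrated with a toy example. This is only a sketch: the sequences are invented, matching is exact string search rather than a real aligner, and the helper name find_all is hypothetical. It shows that a read taken from unique sequence has one possible origin, while a read taken from inside a tandem repeat matches several positions equally well.

```python
def find_all(reference: str, read: str) -> list[int]:
    """Return every 0-based start position where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Resume one base later so overlapping matches are also found.
        start = reference.find(read, start + 1)
    return positions


# Invented toy reference: unique flank + three copies of a repeat unit + unique flank.
REPEAT_UNIT = "GATTCA"
reference = "ACGTTGCA" + REPEAT_UNIT * 3 + "TTAGGCCA"

unique_read = "CGTTGC"  # lies entirely in the left unique flank
repeat_read = "GATTCA"  # lies entirely inside the tandem repeat

unique_hits = find_all(reference, unique_read)
repeat_hits = find_all(reference, repeat_read)

print("unique read maps to:", unique_hits)   # a single position
print("repeat read maps to:", repeat_hits)   # one position per repeat copy
```

With only the read sequence to go on, there is no way to tell which of the repeat copies the second read came from, which is why repeat regions were hard to study with short reads.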


The recent completion of the telomere-to-telomere (T2T) genome assembly added around 200 Mb of previously inaccessible, highly repetitive sequence to the reference genome. These data have provided considerable insight into the epigenetic and genomic status of these repeats and a route to understanding their structure, function, and organisation.


In cancer, changes in repeat sequences are thought to contribute to the disease. Global hypomethylation can lift the silencing of transposable elements, and because these elements help regulate gene expression, their reactivation is thought to drive cancer progression through oncogene activation and genomic instability. Identifying these changes at the genomic level is challenging and requires high-precision, high-cost technologies.


To overcome these challenges, researchers at the Johns Hopkins Kimmel Cancer Center have developed a novel approach, published in Science Translational Medicine under the title “Genome-wide repeat landscapes in cancer and cell-free DNA”. The paper describes using machine learning to analyse repeat elements in cancerous tissue as well as in cell-free DNA (cfDNA), the fragments shed by tumour cells into the bloodstream.


The new method is known as ARTEMIS (Analysis of RepeaT EleMents in dISease). In the study, 1200 repeat elements were examined in normal and tumour tissue from 525 patients participating in the Pan-Cancer Analysis of Whole Genomes (PCAWG), and a median of 807 altered elements was found in each tumour. In addition, one-third of these elements had not previously been identified as associated with cancer.


Moreover, ARTEMIS achieved an area under the curve (AUC) of over 0.96 when distinguishing the tumours of the 525 patients from normal tissue, where a perfect score is 1. When the same analysis was applied to a liver cancer cohort of 208 individuals, the AUC was 0.87.
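For readers unfamiliar with AUC, the short sketch below shows one common way to read it: the probability that a randomly chosen tumour sample receives a higher score than a randomly chosen normal sample. The scores are invented for illustration and are not data from the study, and the function is a generic pairwise calculation, not the ARTEMIS model itself.

```python
def auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win. A value of 1.0 means perfect separation;
    0.5 is what random guessing would give on average.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Invented classifier scores (higher = more tumour-like); not from the paper.
tumour_scores = [0.9, 0.8, 0.7, 0.4]
normal_scores = [0.6, 0.3, 0.2, 0.1]

example_auc = auc(tumour_scores, normal_scores)      # one tumour scores below one normal
perfect_auc = auc([0.9, 0.8], [0.2, 0.1])            # every tumour outscores every normal

print(example_auc)
print(perfect_auc)
```

In this toy case a single tumour sample scoring below a single normal sample is enough to pull the AUC under 1, which gives a feel for how close to perfect separation a value such as 0.96 is.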


In conclusion, the researchers say this approach could be a breakthrough if it holds up in clinical trials, both for detecting cancer in people who already have it and for identifying those who may develop it in the future. They also propose that it could be a new frontier in precision medicine and help tailor treatment to the individual.


REFERENCE


Annapragada AV, Niknafs N, White JR, Bruhm DC, Cherry C, Medina JE, et al. Genome-wide repeat landscapes in cancer and cell-free DNA. Science Translational Medicine. 2024 Mar 13;16(738). doi:10.1126/scitranslmed.adj9283 


IMAGE CREDITS


GoodRx, https://images.app.goo.gl/EdN3SpnhmCwaEXNZA

