Fast Read Pre-mapping Filtering for Short Variations Detection in Gene Sequences Using In-Memory Search Technology

Oct 16, 2025 · Chun hsien ho, Wei ting lu, Wen kai wang, Cheng yuan lin, Chih hung kuo, Lih yih chiou
Abstract
To accelerate the time-consuming process of DNA alignment, we propose a pre-mapping filtering method, called Same Token Count (STC), that leverages the high parallelism of In-Memory Search (IMS) technology. The method directly identifies all reads associated with reference positions that contain short variations. STC operates in two stages: offline and processing. In the offline stage, read sequences are encoded into bitvectors based on their k-mers; in the processing stage, IMS matches these bitvectors efficiently to assess sequence similarity. For detecting a single short variation, the method achieves a 150× speedup over traditional CPU-based methods.
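The two stages can be illustrated with a minimal software sketch. This is not the authors' implementation: the abstract does not specify the encoding, so the one-bit-per-k-mer presence vector, the value of `K`, the shared-k-mer score, the threshold, and every function name below are assumptions made for illustration. The per-read loop only emulates in software what IMS hardware would evaluate in parallel.

```python
from itertools import product

K = 3  # assumed k-mer length; the abstract does not state the value used

# Assumed encoding: one bit position per possible k-mer over the DNA alphabet.
KMER_INDEX = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}


def encode(seq):
    """Encode a sequence as an integer bitvector of the k-mers it contains."""
    bits = 0
    for i in range(len(seq) - K + 1):
        idx = KMER_INDEX.get(seq[i:i + K])
        if idx is not None:  # skip k-mers with non-ACGT symbols such as N
            bits |= 1 << idx
    return bits


def shared_token_count(a, b):
    """Count the k-mers present in both bitvectors (popcount of a AND b)."""
    return bin(a & b).count("1")


def filter_reads(reads, reference_window, threshold):
    """Keep the reads that share at least `threshold` k-mers with the window."""
    # Offline stage: encode every read once.
    stored = [encode(r) for r in reads]
    # Processing stage: compare the query against all stored bitvectors.
    query = encode(reference_window)
    return [r for r, bv in zip(reads, stored)
            if shared_token_count(query, bv) >= threshold]


reads = ["ACGTACGT", "TTTTTTTT", "ACGTTCGT"]
candidates = filter_reads(reads, "ACGTACGT", threshold=2)
```

In this sketch, the second read shares no 3-mers with the reference window and is filtered out, while the third read, which differs from the window by one base, still shares enough 3-mers to be kept as a candidate for alignment.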
Publication
2025 IEEE Biomedical Circuits and Systems Conference (BioCAS)