{"id":1245,"date":"2025-10-28T17:30:52","date_gmt":"2025-10-28T16:30:52","guid":{"rendered":"https:\/\/www.hsu-hh.de\/statistik\/?post_type=tribe_events&#038;p=1245"},"modified":"2025-12-03T13:55:05","modified_gmt":"2025-12-03T12:55:05","slug":"simon-schlumbohm-hsu","status":"publish","type":"tribe_events","link":"https:\/\/www.hsu-hh.de\/statistik\/event\/simon-schlumbohm-hsu","title":{"rendered":"Simon Schlumbohm (HSU)"},"content":{"rendered":"<h1>Efficient Algorithms for Improved Information Retention in Integration of Incomplete Omics Datasets<\/h1>\n<p>The acquisition of high-quality data in the biomedical field, particularly in omics studies<br \/>\nsuch as proteomics or transcriptomics, poses a significant challenge due to incomplete<br \/>\nmeasurements during data acquisition or simply small sample sizes. This issue results in<br \/>\ndatasets with low statistical power that are in addition often compromised by missing<br \/>\nvalues, which impede downstream analysis and the accurate interpretation of biological<br \/>\nphenomena.<\/p>\n<p>A common approach to mitigate such limitations is data integration, which combines<br \/>\nmultiple datasets to increase cohort sizes by incorporating data from different studies or<br \/>\nlaboratories. However, this approach introduces new challenges, notably the so-called<br \/>\nbatch effect, which introduces internal biases and obscures biological meaning. Moreover,<br \/>\ninfrequently measured features (e.g., proteins or genes) create additional gaps in the data<br \/>\nduring integration tasks. <\/p>\n<p>As the volume of available biological data continues to expand, there is an increasing<br \/>\nneed for computational methods capable of efficiently processing and analyzing these<br \/>\ngrowing datasets. 
Expected future advancements in data acquisition with regard to<br \/>\nthroughput necessitate the development of computationally efficient and robust algorithms.<br \/>\nIn addition, to ensure accessibility and broad adoption, it is crucial that bioinformatics<br \/>\ntools be user-friendly, allowing researchers with varying levels of technical expertise<br \/>\nto use them effectively. <\/p>\n<p>To this end, an integration and batch effect reduction tool has been developed, called<br \/>\nthe HarmonizR algorithm. This work features various functionality that has been built<br \/>\nto tackle the aforementioned issues. Dataset integration aims for an increase in cohort<br \/>\nsizes and sample amounts, which is facilitated by the inclusion of a new unique removal<br \/>\napproach. It overcomes prior limitations regarding data retention, greatly increasing<br \/>\nHarmonizR\u2019s benefits as a pipeline tool when used prior to data analysis by significantly<br \/>\nexpanding the number of features and data points that can be considered in any given study. This<br \/>\nmay be paired with the added functionality of accounting for user-defined experimental<br \/>\ninformation such as treatment groups (i.e., covariate information) during adjustment,<br \/>\nleading to more robust and high-quality results. Regarding computational efficiency, a<br \/>\nnovel blocking approach exploits the given data structure to prepare the algorithm for<br \/>\ncurrent and future big data challenges without negatively impacting adjustment quality.<br \/>\nFurthermore, the algorithm\u2019s batch effect adjustment capabilities are shown to be effective<br \/>\non various omics types &#8211; with a notable extension towards single-cell count datasets by<br \/>\nemploying further adjustment methodology &#8211; as well as non-biological data in the form of<br \/>\nan attention-deficit\/hyperactivity disorder study. 
<\/p>\n<p>To address remaining challenges, the newly developed BERT algorithm introduces a novel<br \/>\narchitectural approach, offering improvements in information retention and computational<br \/>\nefficiency. A comparative analysis of BERT and HarmonizR explores the advantages<br \/>\nof BERT in terms of feature\/overall data retention and reduced runtimes, providing a<br \/>\nvaluable complement to the existing framework. <\/p>\n<p>Lastly, to enhance accessibility and ease of use, plugins for the popular Perseus software<br \/>\nhave been created and are described, enabling seamless integration of both algorithms<br \/>\ninto established bioinformatics workflows, specifically aiding researchers less familiar with<br \/>\nthe technical aspects of the algorithms presented here and of bioinformatics in general.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Efficient Algorithms for Improved Information Retention in Integration of Incomplete Omics Datasets The acquisition of high-quality data in the biomedical field, particularly in omics studies such as proteomics or transcriptomics, 
[&hellip;]<\/p>\n","protected":false},"author":102,"featured_media":0,"template":"","meta":{"_tribe_events_status":"","_tribe_events_status_reason":"","footnotes":""},"tags":[],"tribe_events_cat":[],"class_list":["post-1245","tribe_events","type-tribe_events","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.hsu-hh.de\/statistik\/wp-json\/wp\/v2\/tribe_events\/1245","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.hsu-hh.de\/statistik\/wp-json\/wp\/v2\/tribe_events"}],"about":[{"href":"https:\/\/www.hsu-hh.de\/statistik\/wp-json\/wp\/v2\/types\/tribe_events"}],"author":[{"embeddable":true,"href":"https:\/\/www.hsu-hh.de\/statistik\/wp-json\/wp\/v2\/users\/102"}],"version-history":[{"count":3,"href":"https:\/\/www.hsu-hh.de\/statistik\/wp-json\/wp\/v2\/tribe_events\/1245\/revisions"}],"predecessor-version":[{"id":1259,"href":"https:\/\/www.hsu-hh.de\/statistik\/wp-json\/wp\/v2\/tribe_events\/1245\/revisions\/1259"}],"wp:attachment":[{"href":"https:\/\/www.hsu-hh.de\/statistik\/wp-json\/wp\/v2\/media?parent=1245"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.hsu-hh.de\/statistik\/wp-json\/wp\/v2\/tags?post=1245"},{"taxonomy":"tribe_events_cat","embeddable":true,"href":"https:\/\/www.hsu-hh.de\/statistik\/wp-json\/wp\/v2\/tribe_events_cat?post=1245"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}