Big data has reached a point where traditional statistical tools are struggling to keep up. In response, a research team featuring statisticians from Cornell University has developed a novel data representation approach, drawing inspiration from quantum mechanics, that filters noise and simplifies massive data sets more effectively than existing techniques.
This breakthrough, published in Scientific Reports, has the potential to transform data-heavy fields such as health care and epigenetics, where conventional methods often fall short due to complexity and noise.
“Physicists have developed powerful mathematical tools rooted in quantum mechanics to represent complex systems. We’re now adapting that structure to make sense of data,” said Martin Wells, co-author of the study and Charles A. Alexander Professor of Statistical Sciences at Cornell’s Ann S. Bowers College of Computing and Information Science and the ILR School.
A core challenge in data science is understanding the complexity of a data set before deeper analysis. One common approach is intrinsic dimension estimation, which seeks to determine the smallest number of variables needed to faithfully describe a data set. However, real-world data sets, which are often noisy and multifaceted, can mislead these techniques and produce inaccurate or inconsistent results.
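To make the idea concrete, here is a minimal illustrative sketch of one conventional intrinsic dimension estimator, based on counting how many principal components are needed to explain most of the variance. This is a standard baseline technique, not the quantum cognition method described in the study, and the function name, threshold, and synthetic data are illustrative assumptions. It also shows how added noise can inflate the estimate, which is the failure mode the researchers set out to address.

```python
# Illustrative sketch only: a conventional PCA-based intrinsic dimension
# estimate (a common baseline, NOT the study's quantum cognition method).
# It counts how many principal components are needed to explain 95% of
# the variance, and shows how measurement noise can inflate the estimate.
import numpy as np
from sklearn.decomposition import PCA

def pca_intrinsic_dimension(X, variance_threshold=0.95):
    """Number of principal components needed to reach the given
    fraction of explained variance (hypothetical helper)."""
    pca = PCA().fit(X)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cumulative, variance_threshold) + 1)

rng = np.random.default_rng(0)

# Synthetic data: 3 latent factors linearly embedded in 50 observed features.
latent = rng.normal(size=(2000, 3))
embedding = rng.normal(size=(3, 50))
X_clean = latent @ embedding
X_noisy = X_clean + 0.5 * rng.normal(size=X_clean.shape)

print(pca_intrinsic_dimension(X_clean))  # close to the true value, 3
print(pca_intrinsic_dimension(X_noisy))  # inflated well above 3 by noise
```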
“In practice, these methods often produce wildly different and incorrect results,” said lead author Luca Candelori, director of research at Qognitive, an artificial intelligence startup. “Applying them to real-world data sets can be extremely difficult.”
To address this, the team introduced an AI-powered version of intrinsic dimension estimation that is more robust and less sensitive to noise. In trials using both real and intentionally noisy synthetic data sets, their model consistently delivered reliable results.
The technique is based on “quantum cognition machine learning,” a new AI approach developed by Qognitive. Unlike traditional models that rely on classical probability theory, this method mimics the way humans process information—using principles from quantum mathematics to create more nuanced and efficient data representations.
“One of the big drivers behind quantum cognition machine learning is to reduce the computational and energy costs of working with today’s massive data sets,” Candelori explained. Traditional AI training methods—such as those used in large language models—are resource-intensive, and the new approach offers a more economical alternative.
Despite being built on quantum mathematical frameworks, the model does not require specialized quantum computers. It can be run on standard laptops, making it accessible and scalable.
“This quantum aspect really changes the game,” Wells said. “It opens up mathematical and statistical tools that weren’t available even a few years ago.”
The study's authors span academia and industry: Cameron Hogan (Cornell), Alexander Abanov (Stony Brook University), Mengjia Xu (New Jersey Institute of Technology), and, from Qognitive, Kharen Musaelian, Jeffrey Berger, Vahagn Kirakosyan, Ryan Samson, James Smith, and Dario Villani.
Together, their work marks a significant step forward in managing today’s data explosion—and offers new ways to uncover insights buried in the complexity.