Researchers in the US have developed a method that reduces the time needed to extract data for drug development.
Researchers from Purdue University in Indiana, USA, have created a system for extracting biomolecular data to support machine learning in drug design.
Creating and developing drugs involves a step in which a computer extracts information from a specific set of data. Scientists in the pharmaceutical industry use biological data to train software and thereby learn how the human body would react to new drugs.
The Purdue University team developed new data extraction software that aims to improve the training of machine learning models by extracting data from the Protein Data Bank (PDB) more effectively.
Speaking about the importance of the PDB for drug development and discovery, Gaurav Chopra, assistant professor of analytical and physical chemistry in the university's College of Science, said: "The problem is that it can take an enormous amount of time to sort through all the accumulated data. Machine learning can help, but you still need a strong framework from which the computer can quickly analyze data to help in the creation of safe and effective drugs."
The platform they created is called Lemon, a fast C++11 library with Python bindings that extracts data from the PDB in minutes. Working through the PDB with traditional mmCIF files takes nearly 4 hours, whereas Lemon performs that task in six minutes on an 8-core processor.
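To put those two figures side by side, the short Python sketch below works out the implied speedup. It is only back-of-the-envelope arithmetic on the numbers quoted above: "nearly 4 hours" is rounded up to 240 minutes, so the result is an upper estimate, and the article does not say what hardware the mmCIF baseline ran on. The function name `speedup` is illustrative and is not part of Lemon.

```python
def speedup(baseline_minutes: float, new_minutes: float) -> float:
    """Return how many times faster the new run is than the baseline."""
    if baseline_minutes <= 0 or new_minutes <= 0:
        raise ValueError("durations must be positive")
    return baseline_minutes / new_minutes


# Figures quoted in the article; "nearly 4 hours" is rounded up (assumption).
BASELINE_MINUTES = 4 * 60  # traditional mmCIF processing
LEMON_MINUTES = 6          # Lemon on an 8-core processor

factor = speedup(BASELINE_MINUTES, LEMON_MINUTES)
print(f"Approximate speedup: {factor:.0f}x")  # prints "Approximate speedup: 40x"
```

Because the gain is far larger than the eight cores alone would explain, most of it presumably comes from how the data is read and processed rather than from parallelism by itself, although the article does not break this down.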
"Experimental structures deposited in PDB have resulted in several advances for structural and computational biology scientific and education communities that help advance drug development and other areas," explained Jonathan Fine, who worked on the team's software development.