A Machine-Learning Approach for the Prediction of Enzymatic Activity of Proteins in Metagenomic Samples
Abstract
In this work, we developed a machine-learning approach that predicts the putative enzymatic function of unknown proteins, based on the PFAM protein domain database and the Enzyme Commission (EC) numbers that describe enzymatic activities. The classifier was trained on well-annotated protein datasets from the UniProt database in order to define the characteristic domains of each enzymatic sub-category within the class of hydrolases. We conclude that a machine-learning procedure based on HMMER3 scores against the PFAM database can accurately predict the enzymatic activity of unknown proteins as part of metagenomic analysis workflows.
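The idea of classifying proteins from their domain scores can be illustrated with a minimal sketch. The abstract does not name the classifier or the feature encoding, so everything below is an assumption made for illustration: each protein is represented as a vector of per-domain bit scores (as might be parsed from HMMER3 output), and a simple nearest-centroid rule with cosine similarity stands in for the actual model. The domain accessions, the scores, the EC sub-class labels, and the helper names (`to_vector`, `train_centroids`, `predict`) are all hypothetical toy values, not data or code from the paper.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical feature space: one dimension per Pfam domain accession.
DOMAINS = ["PF00082", "PF00150", "PF00561"]

# Invented toy training set: (domain -> bit score, EC sub-class label).
TRAIN = [
    ({"PF00082": 210.5}, "3.4"),
    ({"PF00082": 180.2, "PF00561": 12.0}, "3.4"),
    ({"PF00150": 150.0}, "3.2"),
    ({"PF00150": 175.3}, "3.2"),
    ({"PF00561": 95.1}, "3.1"),
    ({"PF00561": 110.4, "PF00150": 8.0}, "3.1"),
]


def to_vector(domain_scores, domains):
    """Turn a {domain: score} mapping into a fixed-length score vector."""
    return [domain_scores.get(d, 0.0) for d in domains]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(examples, domains):
    """Average the score vectors of each label to get one centroid per label."""
    sums = {}
    counts = defaultdict(int)
    for scores, label in examples:
        vec = to_vector(scores, domains)
        current = sums.get(label, [0.0] * len(domains))
        sums[label] = [s + v for s, v in zip(current, vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(domain_scores, centroids, domains):
    """Return the label whose centroid is most similar to the query vector."""
    vec = to_vector(domain_scores, domains)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


CENTROIDS = train_centroids(TRAIN, DOMAINS)

if __name__ == "__main__":
    query = {"PF00082": 140.0, "PF00561": 5.0}
    print(predict(query, CENTROIDS, DOMAINS))
```

In a real workflow the score dictionaries would come from parsing HMMER3 results for each protein rather than being typed by hand, and the feature space would span many more domains than the three used here.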
Domains
Computer Science [cs]
Origin: Files produced by the author(s)