A multi-component classifier for nonalcoholic fatty liver disease (NAFLD) based on genomic, proteomic, and phenomic data domains.

Wood GC, Chu X, Argyropoulos G et al.

Geisinger Obesity Research Institute, Danville, PA, USA.

Scientific reports. Mar 2017.

Non-alcoholic fatty liver disease (NAFLD) represents a spectrum of conditions, including steatohepatitis and fibrosis, that are thought to emanate from hepatic steatosis. Few robust biomarkers or diagnostic tests have been developed for hepatic steatosis in the setting of obesity. We have developed a multi-component classifier for hepatic steatosis composed of phenotypic, genomic, and proteomic variables using data from 576 adults with extreme obesity who underwent bariatric surgery and intra-operative liver biopsy. Using a 443-patient training set, protein biomarker discovery was performed using the highly multiplexed SOMAscan® proteomic assay, a set of 19 clinical variables, and the steatosis-predisposing PNPLA3 rs738409 single nucleotide polymorphism genotype status. The most stable markers were selected using a stability selection algorithm with an L1-regularized logistic regression kernel and were then fitted with logistic regression models to classify steatosis, which were then tested against a 133-sample blinded verification set. The highest area under the ROC curve (AUC) for steatosis obtained from any single domain (PNPLA3 rs738409 genotype, 8 proteins, or 19 phenotypic variables) was 0.913, whereas the final classifier that included variables from all three domains had an AUC of 0.935. These data indicate that multi-domain modeling has better predictive power than comprehensive analysis of variables from a single domain.
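As an illustration of the marker selection step described in the abstract, the following is a minimal sketch of stability selection with an L1-regularized logistic regression kernel. It is not the authors' implementation: the data are simulated, the solver is a simple hand-written proximal gradient routine that depends only on NumPy, and the function names (`fit_l1_logistic`, `selection_frequencies`, `simulate`), the penalty `lam=0.05`, the 50 half-size subsamples, and the 0.8 frequency cutoff are all assumptions chosen for the example.

```python
import numpy as np


def fit_l1_logistic(X, y, lam, n_iter=300):
    """Fit L1-penalized logistic regression by proximal gradient descent."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    # Step size 1/L, where L bounds the curvature of the mean logistic loss.
    X_aug = np.hstack([X, np.ones((n, 1))])
    step = 4.0 * n / np.linalg.norm(X_aug, 2) ** 2
    for _ in range(n_iter):
        z = np.clip(X @ w + b, -30.0, 30.0)
        resid = 1.0 / (1.0 + np.exp(-z)) - y
        w = w - step * (X.T @ resid) / n
        b = b - step * resid.mean()
        # Soft-thresholding sets weak coefficients exactly to zero
        # (the intercept b is left unpenalized).
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)
    return w, b


def selection_frequencies(X, y, lam, n_subsamples=50, seed=0):
    """Fraction of random half-size subsamples in which each feature is kept."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        w, _ = fit_l1_logistic(X[idx], y[idx], lam)
        counts += (w != 0)
    return counts / n_subsamples


def simulate(n=400, p=20, n_informative=3, seed=1):
    """Simulated data (an assumption): only the first features carry signal."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:n_informative] = 2.0
    prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
    y = rng.binomial(1, prob).astype(float)
    return X, y


X, y = simulate()
freq = selection_frequencies(X, y, lam=0.05)
# Features selected in at least 80% of subsamples are treated as stable.
stable = np.flatnonzero(freq >= 0.8)
print("selection frequencies:", np.round(freq, 2))
print("stable feature indices:", stable)
```

In a workflow like the one described, the stable features would then be refit with an ordinary (unpenalized) logistic regression and scored by AUC on a held-out verification set; those steps are omitted here to keep the sketch short.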

