The New England Journal of Statistics in Data Science (2023).
Abstract: Double generalized linear models provide a flexible framework for modeling data by allowing the mean and the dispersion to vary across observations. Common members of the exponential dispersion family, including the Gaussian, Poisson, compound Poisson-gamma (CP-g), gamma, and inverse-Gaussian, are known to admit such models. Their limited use can be attributed to ambiguities in model specification under a large number of covariates and to complications that arise when data display complex spatial dependence.
Journal of the Indian Statistical Association
Abstract: Modeling the dynamics of the spread of the COVID-19 pandemic is a challenging and relevant problem. Established models of epidemic spread, such as compartmental epidemiological models (e.g., Susceptible-Infected-Recovered (SIR) models and their variants), have been discussed extensively in the literature and used to forecast the growth of the pandemic across different hot-spots in the world. The standard formulations of SIR models rely upon summary-level data, which may not fully capture the dynamics of the pandemic growth.
Neuroscience Informatics (2022).
Abstract: Background and Purpose: MRI features of tumor progression and pseudoprogression may be indistinguishable, especially in diffuse gliomas without an enhancing portion. Our aim is to discriminate between these two conditions using radiomics and machine learning algorithms and to compare the results with human observations.
Materials and Methods: Three consecutive MRI studies before a definitive biopsy in 43 diffuse glioma patients (7 pseudoprogression and 36 true progression cases) who underwent treatment were evaluated.
Biometrics (2023).
Abstract: Integrative analyses based on statistically relevant associations between genomics and a wealth of intermediary phenotypes (such as imaging) provide vital insights into their clinical relevance in terms of the disease mechanisms. Estimates of uncertainty in the resulting integrative models are, however, unreliable unless inference accurately accounts for the selection of these associations. In this paper, we develop selection-aware Bayesian methods, which (1) counteract the impact of model selection bias through a “selection-aware posterior” in a flexible class of integrative Bayesian models after promising variables are selected via $\ell_1$-regularized algorithms; (2) strike an inevitable trade-off between the quality of model selection and inferential power when the same data set is used for both selection and uncertainty estimation.
Scientific Reports (2022).
Abstract: Immune checkpoint inhibitors (ICI) with anti-PD-1/PD-L1 agents have improved the survival of patients with metastatic non-small cell lung cancer (mNSCLC). Tumor PD-L1 expression is an imperfect biomarker as it does not capture the complex interactions between constituents of the tumor microenvironment (TME). Using multiplex fluorescent immunohistochemistry (mfIHC), we modeled the TME to study the influence of cellular distribution and engagement on response to ICI in mNSCLC. We performed mfIHC on pretreatment tissue from patients with mNSCLC who received ICI.
Scientific Reports (2022).
Abstract: Spatial pattern modelling concepts are increasingly used to capture disease heterogeneity. Quantification of heterogeneity in the tumor microenvironment is extremely important in pancreatic ductal adenocarcinoma (PDAC), which has been shown to co-occur with other pancreatic diseases and neoplasms with certain attributes that make visual discrimination difficult. In this paper, we propose the GaWRDenMap framework, which utilizes the concepts of geographically weighted regression (GWR) and a density function-based classification model, and apply it to a cohort of multiplex immunofluorescence images from patients with six different pancreatic diseases.
Annals of Applied Statistics (2021).
Abstract: Recent technological advancements have enabled detailed investigation of associations between the molecular architecture and tumor heterogeneity, through multi-source integration of radiological imaging and genomic (radiogenomic) data. In this paper, we integrate and harness radiogenomic data in patients with lower grade gliomas (LGG), a type of brain cancer, in order to develop a regression framework called RADIOHEAD (RADIOgenomic analysis incorporating tumor HEterogeneity in imAging through Densities) to identify radiogenomic associations.
American Journal of Neuroradiology (2021).
Nominated for the 2021 Lucien Levy Best Research Article. (AJNR Blog Announcement)
Abstract: Background and Purpose. T2-FLAIR mismatch sign is a validated imaging sign of IDH-mutant 1p/19q non-codeleted gliomas. It is identified by radiologists through visual inspection of pre-operative MRI scans, and has been shown to identify IDH-mutant 1p/19q non-codeleted gliomas with high positive predictive value. We have developed an approach to quantify the T2-FLAIR mismatch signature, and use it to predict molecular status of lower-grade gliomas (LGG).
Scandinavian Actuarial Journal (2021).
Abstract: In this paper, we propose a statistical modeling framework that advances methods in the actuarial literature for modeling insurance policy premiums. Specifying separate frequency and severity models, accounting for territorial risk, and performing accurate inference are some of the challenges actuaries face while modeling policy premiums. We focus on a methodology that yields parsimonious and interpretable models of policy premiums. Policy premiums are characterized as following a semi-continuous probability distribution, featuring a non-zero probability mass at zero along with a positive continuous support.
Canadian Journal of Statistics (2021).
Abstract: We present a scalable Bayesian modeling approach for identifying brain regions that respond to a certain stimulus and using them to classify subjects. We specifically deal with multi-subject electroencephalography (EEG) data with a binary response distinguishing between alcoholic and control groups. The covariates are matrix-variate, with measurements taken for each subject at different locations across multiple time points. EEG data have a complex structure with both spatial and temporal attributes.