My name is Geoff; I'm a data scientist and physicist. I work in the Research Computing group (RCAC) at Purdue University helping scientists and engineers, both faculty and students, with their scientific computing work, particularly as it relates to high performance computing (HPC).

My background is in astrophysics. I have a bachelor's in physics from Purdue University. I went on to graduate school at the University of Louisville and studied astrophysics, writing a master's thesis on the local interstellar medium (ISM). I continued with a Ph.D. at the University of Notre Dame for another two years, doing research in large-scale survey astronomy before leaving early to pursue a career in data science.

I helped build out the data analytics team at the New York Power Authority, the largest public power utility in the country. I worked on projects in optimization and automation for both business and operations use cases (e.g., optimizing the flow of water at the Niagara Falls power project).

I like to contribute to open source projects and promote open science methods and tools. You can view my research interests, publications, projects, and my blog here. You can also follow me on GitHub and Twitter!


Data Science

Data Science is notoriously difficult to define. In the academic context, it is the common thread of data and computational facility shared among 21st-century scientists in every domain. More generally, as practiced in industry, data science is the intersection of applied computational math and statistics, basic software engineering, and business acumen.


My university training and research background are in astrophysics and large-scale survey astronomy. My research interests are quite varied, including the history and evolution of galaxies as understood through both astronomical surveys involving some of the largest datasets ever collected (LSST) and massive particle simulations. I'm also interested in the search for life using exoplanetary spectroscopy.


A collection of posts, writings, notebooks, examples, and ideas usually related to science and computing.

  • Reproducible and Scalable Workflows with Make

    February 28, 2020
    draft, automation, scientific-computing, and high-performance-computing

    It’s common to find blog posts referencing the venerable Make utility in the pursuit of reproducible workflows in Data Science. What is less common is an actual tutorial covering a real-world scenario in constructing a Makefile.

  • An AutoGUI for Curve Fitting in Python

    June 14, 2018
    python, scientific-computing, and matplotlib

    One of the most basic tasks in science and engineering is fitting a model to some data. Doing so in Python is straightforward using curve_fit from scipy.optimize. In the same way seaborn builds on matplotlib by creating a high-level interface to common statistical graphics, we can expand on the curve fitting process by building a simple, high-level interface for defining and visualizing these sorts of optimization problems.
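As a taste of the curve fitting workflow the second post builds on, here is a minimal sketch using curve_fit from scipy.optimize. The model function and the synthetic data are assumptions for illustration, not material from the post itself:

```python
import numpy as np
from scipy.optimize import curve_fit

# A hypothetical model: exponential decay with amplitude a and rate b.
def model(x, a, b):
    return a * np.exp(-b * x)

# Synthetic data generated from known parameters, plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = model(x, 2.5, 1.3) + 0.05 * rng.normal(size=x.size)

# curve_fit returns the best-fit parameters and their covariance matrix;
# the diagonal of the covariance gives one-sigma parameter uncertainties.
popt, pcov = curve_fit(model, x, y, p0=(1.0, 1.0))
perr = np.sqrt(np.diag(pcov))
```

The higher-level interface described in the post wraps exactly this kind of call, adding interactive definition and visualization on top of it.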

Contact Me