Research and Professional Experience

Sr. Data Scientist – CredReady, Inc.

White Plains, NY

February 2020 – February 2021

  • Working for a small startup, built models of CNA job success.

  • Designed model’s dependent variables based on client interviews.

  • Imputed demographic data missing from data sources to evaluate the algorithm’s fairness.

  • Worked with web development team to incorporate my code into the website backend.

  • Worked with web design team to design forms that gather the data the model requires.

Data Scientist – eScholar LLC

White Plains, NY

June 2018 – September 2020

  • Modeled student success from course data to provide personalized guidance.
  • Improved model ROC-AUC by 10%, making its results useful in practice.
  • Implemented propensity score matching to identify causal course relationships.
  • Designed Docker containers to work effectively with an air-gapped system.
  • Contributed to product meetings on planning and marketing a new product.
  • Wrote code in SQL, Java, Python, and R.

Freelance Data Scientist

New York, NY

May 2016 – February 2020

  • Completed diverse projects for six clients.

  • Worked on long-term projects with IAG Energy and Cleared Careers.

  • Grouped NYC buildings by owners based on addresses scraped from PDFs, as well as dozens of other address sources.

  • Built an algorithm for natural language processing of job postings.
  • Built modular libraries for quick project development and easy maintenance.

  • Deployed scraping programs for long-term use on AWS using Docker, allowing clients to run my software with minimal IT support.

Research Assistant – CUNY-Lehman College

June 2011 – Sept. 2015

  • Designed algorithms in Mathematica and Python to model statistical physics systems such as random ferromagnets and to analyze 10 GB of data.

  • Parallelized the algorithm, reducing processing time by a factor of 15 and making previously intractable problems quickly accessible.

  • Expanded, maintained, and took ownership of code over four years to explore related problems.

  • Collaborated with two other group members on random magnet research.

  • Developed theory and methods for five published papers.

  • Presented random magnet research at physics conferences.

  • Provided key theoretical solution to a paradox that had vexed physicists since the 1980s.

  • Published a paper in Physical Review Letters (PRL), considered the most prestigious physics journal.

Adjunct Lecturer – CUNY-Lehman College

Sept. 2011 – June 2015

  • Mentored high-school and university-level students one-on-one on problems in ferromagnetism.

  • Created learning and research plan for mentees and helped them present their findings.

  • Developed a program to teach high schoolers ferromagnetism and Python.

  • Led physics lab, communicating lab concepts and answering students’ questions.

Programmer – Perfectly Scientific, Inc.

Aug. 2009 – Jan. 2010

  • Working for a small software company, built a C implementation of an algorithm for solving matrix equations as part of a mathematical software suite.

  • Learned Fortran for this project.

Drop-in Center Tutor – Reed College Tutoring Center

2007 – 2009

  • Aided students with physics homework problems; communicated complex ideas to struggling students.

Education

Data Science Intensive – SlideRule

Nov. 2015 – Apr. 2016

  • Wrote programs in R, Python, and PostgreSQL to predict taxi drop-offs in a given area of NYC from demographic and geographic data, using Poisson regression and F-score feature selection on 20 GB of data.

  • Published non-technical blog post on the SlideRule blog summarizing findings.

  • Completed projects using data analytics methods and machine learning, including regression, classification, GLMs, random forest, Spark, logistic regression, K-means clustering, neural networks trained with backpropagation, PCA, and MySQL, among others.

Ph.D., Physics – CUNY Graduate Center

Aug. 2010 – Sept. 2015

B.A., Physics – Reed College

Aug. 2005 – May 2009

  • Devised a theoretical argument and programmed a simulation of classical models of charged particles; the work was published in the American Journal of Physics.

Skills

Programming Languages: Primary: Python, Mathematica, PostgreSQL, Linux+Bash. Limited Knowledge: R.

Data Analysis: Machine learning, regression and classification decision trees, ensemble methods such as gradient boosting, GLMs, partial and ordinary differential equations. Expert in combining analytic math with big data to understand complex systems.

Actuarial Exams: Passed Probability (June 2014) and Financial Mathematics (April 2015) actuarial exams.

Fun Fact: Bicycled across Spain, Peru, Ecuador, Washington, and New England on self-planned trips.