Portfolio

Machine Learning & Policy Analytics
Python Websites
Artificial Intelligence, Predictive Modelling, and Statistics
Spatial Optimization
Financial Modelling
Storymaps
Writing Samples

Machine Learning & Policy Analytics

TrainTracker: Predicting Train Delays for New Jersey Transit & Amtrak

Trains are a timeless mode of transportation that revolutionized the world. This linear and logistic regression model serves as an experimental use case that can be replicated in other transit regions in the United States to support interstate travel and more efficient use of train networks. Our study processes train delays from 2017 to 2019 and predicts both the average length of delay per trip and the probability of a delay occurring.

Calculating Home Prices based on Zillow Data

Developed an Ordinary Least Squares (OLS) linear regression model to predict real estate prices in the Philadelphia area. Our model combines data on homes' internal characteristics with external data, including Philadelphia Open Data and the American Community Survey from the U.S. Census Bureau, to create an accurate and generalizable model of home sale prices in a sample housing market.
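
As a minimal sketch of the OLS idea, the snippet below fits a one-predictor regression in closed form. The square-footage and price figures are made-up toy numbers, not data from the report, and the actual model uses many more predictors.

```python
def fit_simple_ols(x, y):
    """Closed-form OLS for one predictor: returns (intercept, slope)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data: living area (sq ft) and sale price (USD).
living_area = [1000, 1500, 2000, 2500]
sale_price = [200_000, 250_000, 300_000, 350_000]

intercept, slope = fit_simple_ols(living_area, sale_price)
predicted_price = intercept + slope * 1800  # price estimate for an 1,800 sq ft home
```

Here `slope` is the estimated price change per additional square foot, which is the kind of coefficient an OLS report would interpret.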

Report

Predicting Temporal Patterns and Bike Demand in Washington, D.C.

A significant challenge faced by bikeshare systems is the optimal redistribution of bikes within the network. Ensuring the availability of at least one bike and one open slot at every station is crucial, as a station without bikes or bike parking inconveniences riders, compelling them to seek alternative transportation. In this initiative, we focus on developing a predictive model for Capital Bikeshare in the District of Columbia. The model and time-series analysis aim to facilitate bike redistribution by forecasting demand in 15-minute intervals for the upcoming week. Departing from traditional truck-based resupply methods, which are resource-intensive, we propose a novel approach: offering dynamic incentives to riders for redistributing bikes across the network.
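
Forecasting in 15-minute intervals first requires counting trips per station per interval. The sketch below shows one way to do that aggregation step; the station names and timestamps are invented for illustration, and the forecasting model itself is not shown.

```python
from collections import Counter
from datetime import datetime


def floor_to_15_minutes(ts):
    """Round a timestamp down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def demand_by_interval(trips):
    """Count trip starts per (station, interval start).

    trips: iterable of (station_name, start_time) pairs.
    """
    return Counter(
        (station, floor_to_15_minutes(start)) for station, start in trips
    )


# Hypothetical trip records.
trips = [
    ("Station A", datetime(2024, 5, 1, 8, 2)),
    ("Station A", datetime(2024, 5, 1, 8, 14)),
    ("Station A", datetime(2024, 5, 1, 8, 15)),
    ("Station B", datetime(2024, 5, 1, 8, 44, 59)),
]
counts = demand_by_interval(trips)
```

The resulting counts form the per-station time series that a demand model would be trained on.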

Report

Evaluating Transit-Oriented Development in San Francisco & Bay Area

This vector-based quantitative analysis examines San Francisco and the greater Bay Area, which has evolved into a cosmopolitan tech capital that struggles with affordability, sustainability, and traffic congestion. Transit-Oriented Development (TOD) is a nationwide urban planning practice that aims to alleviate these issues, and it has affected every metropolitan area differently. Here, we assess the impact of Bay Area Rapid Transit (BART) and surrounding development from 2009 to 2017.

Report

Allocating Home Repair Subsidies

I developed an algorithm to assist with allocating marketing resources for a home repair tax credit program. Using logistic regression, credit distributors can identify the homeowners most likely to accept the credit and thereby increase the value of their homes and of the surrounding neighborhood.

Report

Measuring Risk Assessment of Crime in Chicago

I developed this Geospatial Risk Assessment to predict future arrest risk in Chicago, Illinois. The goal is to identify communities which are projected to experience a high incidence of arrests and allocate social services to these hot-spots.

Report

Python Websites

Using Yelp Fusion API and Web-scraping to Create High-Resolution Neighborhood Amenity Profiles

Alongside Anna Duan and Timothy Oliver, I explored Philadelphia’s amenity landscape for our Geospatial Data Science in Python class project. We analyzed Yelp API data and Zillow listings to uncover a diverse mix of businesses, public spaces, and activities in Philadelphia neighborhoods. Using k-means cluster analysis, we identified six unique amenity clusters, each grouping neighborhoods with similar amenities.
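
To illustrate how k-means groups neighborhoods with similar amenities, here is a minimal pure-Python sketch of Lloyd's algorithm. The two-feature amenity profiles are hypothetical, seeding the centers with the first k points is a simplification, and the project itself likely relied on a library implementation with six clusters.

```python
from math import dist


def k_means(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points seed the centers (a simplification)."""
    centers = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Move each center to the mean of its members (keep it if it has none).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers


# Hypothetical amenity profiles: (restaurants, parks) per neighborhood.
profiles = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centers = k_means(profiles, k=2)
```

Neighborhoods that share a label have similar amenity counts, and each center summarizes the typical profile of its cluster.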

Quarto Website

Python Demonstrations

All Python projects and demonstrations can be found on this Quarto website! My experience in the language focuses heavily on geospatial analysis, data visualization, and remote sensing.

Website

Artificial Intelligence, Predictive Modelling, and Statistics

Reading Emotion & Sentiment using Artificial Intelligence

Our objective is to assess how people’s viewpoints on specific parks vary and to compile a comprehensive perspective on the parks and recreational facilities in Philadelphia. Examining emotion and gauging sentiment concerning community amenities supports further research and surveying efforts focused on publicly reviewed entities.

Report

Examining Housing Using a Three-Model Approach

We compare three modeling approaches for evaluating the Philadelphia real estate market: Spatial Lag, Spatial Error, and Geographically Weighted Regression.

Report

Predicting Alcohol-Related Car Accidents with Logistic Regression

Ultimately, this investigation sheds light on the critical issue of drunk driving, a national problem claiming lives daily, as reported by the US Department of Transportation. By understanding the predictors associated with drunk driving accidents in Philadelphia, we gain valuable insights into the nature of these incidents and the potential outcomes linked to specific driver behaviors.

Report

Categorizing a City using a Cluster Analysis

To further identify potential spatial autocorrelation in these indicators at the block group level, we will run a k-means cluster analysis to inform us on an optimal number of block group clusters within the city.

Report

Ordinary Least Squares Regression on House Prices

We analyze the correlation between select neighborhood socioeconomic variables and median house value in Philadelphia, Pennsylvania.

Report

Spatial Optimization

Maximum Covering Location Problem

In this study, we use ArcGIS Pro, Power BI, and CPLEX to solve the Maximum Covering Location Problem in theory and in a real-life use case. We examine how a person’s willingness to travel to bus stops determines the coverage of transit stations in a sample town.
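
The covering idea can be sketched on a toy instance by brute force. This is not the CPLEX formulation used in the study: the coordinates, demand weights, and the function name `max_coverage` are all illustrative assumptions, and enumerating every combination only works for very small problems.

```python
from itertools import combinations
from math import dist


def max_coverage(demand, sites, p, radius):
    """Pick p sites that cover the most demand within `radius`.

    demand: list of ((x, y), weight) pairs; sites: list of (x, y) candidates.
    """
    best_sites, best_covered = None, -1
    for chosen in combinations(sites, p):
        # A demand point counts once if any chosen site is within reach.
        covered = sum(
            weight
            for location, weight in demand
            if any(dist(location, s) <= radius for s in chosen)
        )
        if covered > best_covered:
            best_sites, best_covered = chosen, covered
    return list(best_sites), best_covered


# Hypothetical riders along a corridor and three candidate bus stops.
demand = [((0, 0), 10), ((1, 0), 20), ((3, 0), 15), ((6, 0), 30), ((7, 0), 5)]
stops = [(0.5, 0), (3, 0), (6.5, 0)]

short_walk = max_coverage(demand, stops, p=2, radius=1.0)
long_walk = max_coverage(demand, stops, p=2, radius=2.0)
```

In this toy case, raising the travel radius from 1 to 2 changes both the best pair of stops and the demand covered, which mirrors the sensitivity to willingness to travel described above.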

Report

Detecting Redundancy in Transit Stops & Measuring Accessibility vs Coverage

Transportation planners are often challenged to balance accessibility, provided by stop locations, against maintaining or increasing efficiency so that the route stays practical. We use a sample bus line in Charlotte to analyze the sensitivity of the coverage model by varying the willingness of agents to commute to travel nodes. Graphics were created using ArcGIS & Power BI.

Report

Location Allocation

Location-allocation optimization is a method for siting facilities so that all demand is satisfied with the shortest possible path to each site. We measure how the collective distance traveled differs for a specified number of pharmacies in a given area. Graphics were created using ArcGIS & Power BI.
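
A minimal sketch of this idea, framed as a p-median problem and solved by brute force on a toy instance, is shown below. The coordinates and the function name `locate_facilities` are assumptions for illustration, straight-line distance stands in for network distance, and the report's own solver is not shown here.

```python
from itertools import combinations
from math import dist


def locate_facilities(demand_points, candidate_sites, p):
    """Choose p sites minimizing total distance from each demand point
    to its nearest chosen site."""
    best_sites, best_total = None, float("inf")
    for chosen in combinations(candidate_sites, p):
        total = sum(min(dist(d, s) for s in chosen) for d in demand_points)
        if total < best_total:
            best_sites, best_total = chosen, total
    return list(best_sites), best_total


# Hypothetical households and candidate pharmacy sites along one road.
households = [(0, 0), (2, 0), (10, 0), (12, 0)]
candidates = [(1, 0), (6, 0), (11, 0)]

one_pharmacy = locate_facilities(households, candidates, p=1)
two_pharmacies = locate_facilities(households, candidates, p=2)
```

Comparing the totals for different values of `p` shows how much collective travel distance each additional pharmacy saves in this toy setup.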

Report

Dispersion Theory

The p-dispersion problem is a technique to maximize the distance between the closest pair of selected points in a network. The purpose of this optimization technique is to spread objects of the same type as far apart from one another as possible. Graphs were created using Tableau.
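
The objective can be made concrete with a small brute-force sketch: among all ways to pick p points, keep the one whose closest pair is farthest apart. The candidate coordinates and the function name `p_dispersion` are illustrative assumptions, and exhaustive search is only practical for small instances.

```python
from itertools import combinations
from math import dist


def p_dispersion(points, p):
    """Return the p points (p >= 2) whose closest pair is as far apart as possible."""
    best_subset, best_min_dist = None, -1.0
    for subset in combinations(points, p):
        # The quality of a subset is the distance between its closest pair.
        min_dist = min(dist(a, b) for a, b in combinations(subset, 2))
        if min_dist > best_min_dist:
            best_subset, best_min_dist = subset, min_dist
    return list(best_subset), best_min_dist


# Hypothetical candidate locations.
candidates = [(0, 0), (1, 0), (4, 0), (4, 3), (0, 3)]
chosen, separation = p_dispersion(candidates, 3)
```

In this example the best trio is (1, 0), (4, 3), and (0, 3), whose closest pair is about 3.16 units apart; every other trio contains a pair at distance 3 or less.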

Report

Dispersion Location Analysis

In this practice study, we apply the p-dispersion problem to a political analysis of homeowner eligibility for former convicts. Graphics were created using ArcGIS Pro & Tableau.

Report

Maximum Species Coverage Optimization

We look into endangered species protection in concept and how the algorithm works to find parcels that are critical for protection. We then identify how this applies to Mecklenburg County, North Carolina, and the parallel between the abstract function’s behavior and the real-world application. Graphs were made using Power BI & ArcGIS Pro.

Report

Comparing Species Preservation and Park Budgets

Report

Financial Modelling

Storymaps (ArcGIS)

Writing Samples

Recidivism Algorithm, NYC Memo