Parking @ TXST
Research Question
“How does student enrollment growth relate to parking permit demand at Texas State University, and which campus areas are most underserved in terms of parking supply based on the spatial distribution of existing parking spots?”
This question lets you tackle two interrelated issues: forecasting overall parking capacity needs based on enrollment trends and identifying spatial gaps (or “hotspots”) where additional parking could relieve congestion.
Detailed Step-by-Step Plan
Step 1: Data Acquisition and Ingestion
- Enrollment Data
- Source: CSV or Excel file containing historical enrollment figures (years, total enrollment, undergraduates, graduates, etc.) through 2024.
- Action: Upload the file to Google Colab and load it using Pandas (a minimal upload sketch appears after this list).
- Historical Permit Sales Data (2022)
- Source: A CSV/Excel file with permit type, number sold, date (if available), and possibly other features.
- Action: Load this data into a Pandas DataFrame.
- Geocoordinate Data for Parking Spots
- Source: CSV file with columns such as parking spot or lot ID, latitude, longitude, and possibly descriptive location information.
- Action: Load this file into another Pandas DataFrame (or a GeoPandas DataFrame if you plan to leverage spatial libraries).
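Since the data lives in local files, one way to get them into the Colab runtime is the files.upload helper (mounting Google Drive also works); a minimal sketch, assuming the notebook runs in Colab:
from google.colab import files

# Opens a file picker in the notebook; selected files are saved to the working directory
uploaded = files.upload()
print(list(uploaded.keys()))  # confirm the uploaded file names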
Example Code Snippet (Python/Pandas):
import pandas as pd
# Load enrollment data
enrollment_df = pd.read_csv('enrollment_data.csv')
# Load permit sales data for 2022
permit_df = pd.read_csv('permit_sales_2022.csv')
# Load geocoordinate data for parking spots
parking_geo_df = pd.read_csv('parking_geocoordinates.csv')
Step 2: Data Cleaning and Preprocessing
- Enrollment Data Cleaning:
- Ensure the year column is in datetime or integer format.
- Check for missing values and correct or impute as necessary.
- Convert enrollment figures to numeric types.
- Permit Sales Data Cleaning:
- Verify that numeric fields (number of permits sold) are correctly formatted.
- Aggregate data if necessary (e.g., total permits sold in 2022 or by permit type).
- Geospatial Data Preparation:
- Check that latitude and longitude columns are numeric.
- (Optional) Convert your Pandas DataFrame to a GeoPandas DataFrame using the coordinates.
- Remove or flag any records with missing or erroneous coordinates.
Example Code Snippet:
# Convert year column to integer
enrollment_df['Year'] = enrollment_df['Year'].astype(int)
enrollment_df['Enrollment'] = pd.to_numeric(enrollment_df['Enrollment'], errors='coerce')
# Aggregate permit sales data for 2022 if needed
total_permits_2022 = permit_df['NumberSold'].sum()
permit_ratio = total_permits_2022 / enrollment_df.loc[enrollment_df['Year'] == 2022, 'Enrollment'].values[0]
# Prepare geospatial data (assumes WGS84 latitude/longitude coordinates)
import geopandas as gpd
parking_gdf = gpd.GeoDataFrame(
    parking_geo_df,
    geometry=gpd.points_from_xy(parking_geo_df['Longitude'], parking_geo_df['Latitude']),
    crs='EPSG:4326',
)
Step 3: Exploratory Data Analysis (EDA)
- Enrollment Trends:
- Plot enrollment over time (line chart) to observe trends.
- Compute growth rates and assess the trend’s impact on parking needs (a one-line growth-rate sketch follows the map example below).
- Permit Sales Proxy Analysis:
- Calculate the ratio of permits sold to enrollment for 2022. This ratio can serve as a proxy for parking demand.
- Visualize the relationship (e.g., a scatter plot if you have additional cross-sectional data); otherwise, simply report the constant ratio.
- Spatial Visualization:
- Create an interactive map of the parking spots using Folium.
- Visualize clusters of parking spots and note any areas that appear sparse.
- Optionally overlay campus boundaries or locations of major landmarks if available.
Example Code Snippet:
import matplotlib.pyplot as plt
import folium
# Plot enrollment trends
plt.figure(figsize=(10, 6))
plt.plot(enrollment_df['Year'], enrollment_df['Enrollment'], marker='o')
plt.title('Texas State University Enrollment Trends')
plt.xlabel('Year')
plt.ylabel('Enrollment')
plt.show()
# Create an interactive map with Folium
m = folium.Map(location=[parking_gdf['Latitude'].mean(), parking_gdf['Longitude'].mean()], zoom_start=15)
for idx, row in parking_gdf.iterrows():
    folium.CircleMarker(location=[row['Latitude'], row['Longitude']], radius=3, color='blue').add_to(m)
m  # Display the interactive map inline (Colab/Jupyter renders the last expression in a cell)
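For the growth-rate computation mentioned above, a one-line sketch using pandas' pct_change, assuming the cleaned enrollment_df from Step 2 is sorted by year:
# Year-over-year enrollment growth rate (first year is NaN by construction)
enrollment_df = enrollment_df.sort_values('Year')
enrollment_df['GrowthRate'] = enrollment_df['Enrollment'].pct_change()
print(enrollment_df[['Year', 'Enrollment', 'GrowthRate']])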
Step 4: Modeling Overall Parking Demand
- Establish Baseline Ratio:
- Use the 2022 permit sales data to establish the baseline ratio: Permit Ratio = Total Permits Sold (2022) / Enrollment (2022)
- Forecasting Future Demand:
- Using enrollment projections (from your enrollment data), forecast the number of permits required in future years by multiplying the projected enrollment by the baseline ratio.
- Alternatively, if you have multiple years of enrollment and corresponding permit data (even if limited), you could build a simple linear regression model (see the sketch after the constant-ratio example below) where:
- X: Enrollment
- Y: Permit Sales
- If only one year is available, document that the constant ratio assumption is your key modeling assumption.
Example Code Snippet (Forecasting using Constant Ratio):
# Assume the permit ratio from 2022 is constant
enrollment_df['Forecasted_Permits'] = enrollment_df['Enrollment'] * permit_ratio
# Plot forecasted permits alongside enrollment
plt.figure(figsize=(10, 6))
plt.plot(enrollment_df['Year'], enrollment_df['Enrollment'], marker='o', label='Enrollment')
plt.plot(enrollment_df['Year'], enrollment_df['Forecasted_Permits'], marker='x', label='Forecasted Permits')
plt.title('Forecasted Parking Permit Demand Based on Enrollment')
plt.xlabel('Year')
plt.legend()
plt.show()
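If you do obtain several years of paired enrollment and permit-sales figures, the regression alternative could look like the sketch below. The history_df DataFrame and its values are purely illustrative placeholders, not part of the datasets described above.
from sklearn.linear_model import LinearRegression
import pandas as pd

# Illustrative placeholder history (replace with real multi-year figures if available)
history_df = pd.DataFrame({
    'Enrollment':  [36000, 37000, 38000],
    'PermitsSold': [13500, 13900, 14400],
})

# Fit permits sold as a linear function of enrollment
X = history_df[['Enrollment']].to_numpy()
y = history_df['PermitsSold'].to_numpy()
model = LinearRegression().fit(X, y)
print(f"Permits per additional student: {model.coef_[0]:.3f}")

# Forecast permits for the enrollment figures in enrollment_df
enrollment_df['Forecasted_Permits_Regression'] = model.predict(
    enrollment_df[['Enrollment']].to_numpy()
)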
Step 5: Spatial Analysis of Parking Spots
- Clustering Analysis:
- Use clustering algorithms like DBSCAN to identify clusters of parking spots.
- This helps pinpoint where parking is densely available versus where there are gaps.
- Kernel Density Estimation (KDE):
- Apply KDE to the geocoordinates to generate a heat map that visually identifies “hotspots” of parking availability (a density sketch follows the DBSCAN example below).
- Overlay these hotspots on a campus map to assess their proximity to high-demand areas (e.g., academic buildings, dormitories).
Example Code Snippet (Using DBSCAN):
from sklearn.cluster import DBSCAN
import numpy as np
# Extract coordinates as a NumPy array
coords = parking_gdf[['Latitude', 'Longitude']].to_numpy()
# Apply DBSCAN clustering
db = DBSCAN(eps=0.001, min_samples=5).fit(coords)  # eps is in coordinate degrees (~0.001° ≈ 110 m of latitude) and may require tuning
parking_gdf['cluster'] = db.labels_
# Plot clusters
plt.figure(figsize=(8, 6))
plt.scatter(parking_gdf['Longitude'], parking_gdf['Latitude'], c=parking_gdf['cluster'], cmap='viridis')
plt.title('DBSCAN Clustering of Parking Spots')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.colorbar(label='Cluster')
plt.show()
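For the KDE step, one option is scipy.stats.gaussian_kde, sketched below. It assumes the parking_gdf from Step 2 and works directly in latitude/longitude degrees, which is a reasonable approximation at campus scale; Folium's HeatMap plugin is an interactive alternative.
from scipy.stats import gaussian_kde
import numpy as np
import matplotlib.pyplot as plt

# Fit a 2-D kernel density estimate over the parking-spot coordinates
lons = parking_gdf['Longitude'].to_numpy()
lats = parking_gdf['Latitude'].to_numpy()
kde = gaussian_kde(np.vstack([lons, lats]))

# Evaluate the density on a regular grid covering the campus extent
lon_grid, lat_grid = np.meshgrid(
    np.linspace(lons.min(), lons.max(), 200),
    np.linspace(lats.min(), lats.max(), 200),
)
density = kde(np.vstack([lon_grid.ravel(), lat_grid.ravel()])).reshape(lon_grid.shape)

# Heat map of parking density with the raw spots overlaid
plt.figure(figsize=(8, 6))
plt.imshow(density, origin='lower', cmap='hot', aspect='auto',
           extent=[lons.min(), lons.max(), lats.min(), lats.max()])
plt.scatter(lons, lats, s=2, c='cyan')
plt.title('KDE Heat Map of Parking Spot Density')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.colorbar(label='Relative density')
plt.show()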
Step 6: Integration and Synthesis
- Comparing Supply with Forecasted Demand:
- Sum the total number of parking spots available (using your geocoordinate dataset) and compare this against your forecasted permit demand.
- Calculate a “deficit” or “surplus” measure: Deficit = Forecasted Permits − Total Parking Spots (a short sketch, including the ±10% sensitivity check, follows this step’s list).
- Spatial Prioritization:
- Identify regions (from your clustering/KDE analysis) where the density of parking spots is lower than what would be expected given proximity to high-demand campus areas.
- This analysis can inform campus planners about potential locations for additional parking facilities or shuttle services.
- Robustness and Sensitivity:
- Document all assumptions (e.g., constant permit-to-enrollment ratio).
- Run sensitivity analyses: what happens if the ratio increases by 10% or decreases by 10%?
- Validate using cross-institution benchmarks if available.
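A minimal sketch of the supply-versus-demand comparison and the ±10% sensitivity check, assuming the enrollment_df, permit_ratio, and parking_gdf objects from earlier steps and treating each geocoordinate record as one parking spot (if your file lists lots with capacities, sum the capacity column instead):
# Total supply: each row of the geocoordinate file is assumed to represent one spot
total_spots = len(parking_gdf)

# Deficit (positive) or surplus (negative) under the constant-ratio forecast
enrollment_df['Deficit'] = enrollment_df['Forecasted_Permits'] - total_spots

# Sensitivity check: shift the permit ratio by +/-10% and recompute the deficit
for label, factor in [('low', 0.9), ('high', 1.1)]:
    forecast = enrollment_df['Enrollment'] * permit_ratio * factor
    enrollment_df[f'Deficit_{label}'] = forecast - total_spots

print(enrollment_df[['Year', 'Deficit_low', 'Deficit', 'Deficit_high']])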
Step 7: Documentation and Reporting
- Google Colab Notebook:
- Organize your notebook into clearly labeled sections corresponding to the steps above.
- Include detailed markdown cells that explain each step, the assumptions made, and the code functionality.
- Visualization Dashboard:
- Consider using interactive visualization libraries (e.g., Plotly or Folium) to build a dashboard that allows decision-makers to explore the data interactively (a minimal Plotly sketch follows this step’s list).
- Provide visual comparisons of historical trends, forecasted permit demand, and spatial maps of parking clusters.
- Final Report:
- Summarize the methodology, key findings (such as projected deficits and spatial gaps), limitations, and recommendations.
- Discuss how this analysis informs campus planning (e.g., recommendations for additional parking structures or improved shuttle service).
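As a starting point for the dashboard, a minimal sketch using plotly.express is below; it assumes the enrollment_df columns built in Steps 2 and 4 and the cluster labels from Step 5. A fuller dashboard (e.g., Dash or Colab widgets) would build on figures like these.
import plotly.express as px

# Interactive line chart: enrollment vs. forecasted permit demand over time
fig = px.line(
    enrollment_df,
    x='Year',
    y=['Enrollment', 'Forecasted_Permits'],
    markers=True,
    title='Enrollment vs. Forecasted Permit Demand',
)
fig.show()

# Interactive map of parking spots, colored by the DBSCAN cluster label
map_fig = px.scatter_mapbox(
    parking_gdf,
    lat='Latitude',
    lon='Longitude',
    color='cluster',
    zoom=14,
    mapbox_style='open-street-map',
    title='Parking Spots by DBSCAN Cluster',
)
map_fig.show()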
Research Outcomes and Applications
- Primary Outcome:
- A forecast model that links enrollment growth to parking permit demand and identifies specific campus zones that are underserved in terms of parking infrastructure.
- Actionable Recommendations:
- Adjust parking permit allocations or pricing strategies based on enrollment forecasts.
- Recommend construction of new parking facilities or the reallocation of underused spaces.
- Enhance shuttle service in areas with a parking deficit.
- Future Research Directions:
- Extend the model by incorporating more granular data (e.g., real-time occupancy data) as it becomes available.
- Compare TXST’s parking demand dynamics with similar institutions to refine the model.
- Investigate correlations between parking availability and student satisfaction or retention rates.
Recently won Datathon 2025. Todo.