CADS Research Visualization System

Semantic analytics platform for Texas State University's research ecosystem

Completed
📌 Pinned Project
1/20/2025
data-platform
Data Visualization
OpenAlex
Python
Supabase
UMAP
HDBSCAN
Research Data

Semantic analytics pipeline that turns faculty publication data into interactive visual maps for the Texas State research community.

Snapshot

  • Automated ingestion from OpenAlex + Supabase, clustering 2,400+ publications into thematic groups.
  • UMAP projections and HDBSCAN clustering of publication embeddings surface collaboration hotspots and cross-department opportunities.
  • Streamlit dashboard and monitoring suite give CADS leadership live health metrics on ingest jobs.
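
As a rough illustration of the ingestion step: OpenAlex returns abstracts as an inverted index (each word mapped to its positions) in the `abstract_inverted_index` field, so text usually has to be rebuilt before it can be embedded. The sketch below assumes that field is what the pipeline consumes; the function name and the sample record are hypothetical and not taken from the CADS codebase.

```python
def rebuild_abstract(inverted_index):
    """Rebuild plain text from an OpenAlex-style abstract_inverted_index.

    The index maps each word to the list of positions where it occurs.
    Returns an empty string when the work has no abstract.
    """
    if not inverted_index:
        return ""
    # Flatten to (position, word) pairs, then order by position.
    positioned = [
        (position, word)
        for word, positions in inverted_index.items()
        for position in positions
    ]
    return " ".join(word for _, word in sorted(positioned))


# Hypothetical record shaped like an OpenAlex work (illustration only).
sample_work = {
    "title": "Example paper",
    "abstract_inverted_index": {
        "Semantic": [0],
        "maps": [1, 4],
        "of": [2],
        "research": [3],
    },
}

abstract_text = rebuild_abstract(sample_work["abstract_inverted_index"])
print(abstract_text)
```

The rebuilt text would then feed whatever embedding model runs ahead of the UMAP and HDBSCAN stages.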

Architecture

                CADS Research Visualization System
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Data Sources   │    │  Core Pipeline   │    │  Visualization  │
├─────────────────┤    ├──────────────────┤    ├─────────────────┤
│ • OpenAlex API  │───▶│ • Data Loader    │───▶│ • Web Dashboard │
│ • Supabase DB   │    │ • Embeddings     │    │ • Search System │
│ • CADS Faculty  │    │ • UMAP/HDBSCAN   │    │ • Interactive   │
│ • Research Data │    │ • Theme Gen      │    │   Visualizations│
└─────────────────┘    └──────────────────┘    └─────────────────┘

Why It Matters

  • Cuts hours of manual grant scouting by exposing real-time semantic search and profile matching.
  • Provides reproducible analytics: CI runs tests on ingest scripts, and monitoring catches schema drift before faculty demos.
  • NSF CAP award showcases the system as the backbone for expanding AI curriculum across campus.
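
One way to picture the semantic search and profile matching mentioned above is ranking faculty profile vectors by cosine similarity to a query vector. This is a minimal sketch under stated assumptions: the three-dimensional vectors, the profile names, and `rank_profiles` are hypothetical stand-ins, and a production setup would more likely run the similarity query inside the vector store than in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_profiles(query_vec, profiles, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in profiles.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings; real ones would have hundreds of dimensions.
faculty_profiles = {
    "faculty_a": [0.9, 0.1, 0.0],
    "faculty_b": [0.1, 0.9, 0.2],
    "faculty_c": [0.7, 0.3, 0.1],
}
grant_call = [1.0, 0.2, 0.0]  # hypothetical embedding of a grant description

matches = rank_profiles(grant_call, faculty_profiles, top_k=2)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```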

Stack & Operations

  • Python ingestion workers (Poetry + Airflow-ready scripts) writing into Supabase vector tables.
  • Visualization layer served from a hardened visuals/public bundle with CDNs for department-wide access.
  • Documentation set spans pipeline playbooks, troubleshooting guides, and CI/CD runbooks so new CADS hires can onboard in a day.
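
The schema-drift monitoring could take the form of a lightweight check that ingestion workers run on each row before writing it to a table. The sketch below is illustrative only: the column names, the expected types, and `find_schema_drift` are assumptions, not the project's actual schema or monitoring code.

```python
# Hypothetical expected schema for one publication row.
EXPECTED_COLUMNS = {
    "work_id": str,
    "title": str,
    "publication_year": int,
    "embedding": list,
}


def find_schema_drift(row, expected=EXPECTED_COLUMNS):
    """Return a list of human-readable problems; an empty list means no drift."""
    problems = []
    for column, expected_type in expected.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected_type):
            problems.append(
                f"wrong type for {column}: expected {expected_type.__name__}, "
                f"got {type(row[column]).__name__}"
            )
    # Columns the upstream source added that the table does not know about.
    for column in row:
        if column not in expected:
            problems.append(f"unexpected column: {column}")
    return problems


drifted_row = {
    "work_id": "W1",
    "title": "Example paper",
    "publication_year": "2024",  # arrived as a string instead of an int
    "embedding": [0.1, 0.2],
    "doi": "10.0000/example",    # not in the expected schema
}

for problem in find_schema_drift(drifted_row):
    print(problem)
```

A check like this can run in CI against fixture rows and in the monitoring job against live samples, so a changed upstream field surfaces as a failed check rather than a broken dashboard.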