September 2025 Quarterly Meeting Notes and Summary

All notes for the session are here: https://docs.google.com/document/d/1LMz6TBbTUsLnOzK5jkHYNZrDR0aS_d7eyrhX9O6XmO8/edit?tab=t.0#heading=h.jodazq439e6w

Summary:

 

Day 1 (Tuesday, Sept 9) Highlights

  1. Governance & Staff Updates

  • Shelley Knuth (EC Chair) reviewed the agenda and announced new PI roles:

    • PI Support: Shelley

    • PI Metrics: Tom

    • PI Allocations: Stephen

    • PI Operations: Tim

    • RP Chair: Jeremy; new RP Forum Chair (Oct): Eric Adams (Purdue)

  • EC will assign ownership for action items and condense monolithic meeting notes.

 

  2. “What’s Everyone Developing?” Sessions

  • Allocations (Nathan Tolbert)

    • Enforced institutional email and affiliation checks; tied to user enrollment

    • Drafted new AMIE API and workflow automation for high-volume requests

    • Exposed NAIRR allocations data in accounting DB for cross-team use

  • Support (Andrew Pasquale)

    • Q&A bot enhancements with UI integration and XDMoD backend

    • Ticket-analysis tool tagging JIRA issues with AI filters

    • Prototype MCP services and survey dashboards

  • Operations (Winona Snapp-Childs)

    • SSH-key lookup service and improved account registration UX

    • perfSONAR+MMS enhancements

    • Integration dashboard and automated resource-news API

  • Metrics (Greg Dean)

    • XDMoD performance improvements and added OSG/cloud allocation metrics

    • New Category 1 (CloudBank 2, Nexus, AMA27, DeltaAI, SDSC Voyager) and Category 2 (Cosmos, REPACSS, NRP, Kubernetes) integrations

    • Monthly interactive usage reports, automated distribution, Q&A-bot testing

 

  3. ACO Data Inventory

ACO will collect and curate:

  • Registration & attendance (RAC, EAB, SWG)

  • Communications metrics (HPCwire pickups, newsletters, social media)

  • Surveys (EAB, QM, staff, RP, community)

  • Financial & operational reports; tool usage logs (UDO, VIVO, QRA)

  • Publication and engagement tracking; quarterly ecosystem KPIs

 

  4. Policy & Eligibility Changes (Stephen Deems)

  • NSF-driven requirement: only affiliated researchers with institutional email may request allocations

  • Prevent new unaffiliated submissions; require profile/email updates at renewal

  • Audit recent allocations; update AUP and communications; banner notifications
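
The eligibility rule above could be expressed roughly as follows. This is only a sketch under stated assumptions: the `AllocationRequest` fields, the `PUBLIC_EMAIL_DOMAINS` list, and the three outcome strings are hypothetical and are not taken from the Allocations team's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical list of public mail providers that would not count as institutional.
PUBLIC_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}


@dataclass
class AllocationRequest:
    requester_email: str
    affiliation: str          # empty string means no affiliation on the profile
    is_renewal: bool = False


def email_domain(email: str) -> str:
    """Return the lower-cased domain of an address, or '' if it is malformed."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return ""
    return domain.lower()


def check_request(req: AllocationRequest) -> str:
    """Classify a request as 'accept', 'reject', or 'update_profile'."""
    domain = email_domain(req.requester_email)
    has_institutional_email = domain != "" and domain not in PUBLIC_EMAIL_DOMAINS
    has_affiliation = bool(req.affiliation.strip())
    if has_institutional_email and has_affiliation:
        return "accept"
    # Renewals are asked to fix their profile; new unaffiliated submissions are blocked.
    return "update_profile" if req.is_renewal else "reject"
```

A production check would presumably validate against a list of recognized institutions rather than a short blocklist; the blocklist here only keeps the example self-contained.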

 

  5. New ACCESS Resources on the Horizon

  • Category 1 (large-scale CI, $10–20 M, 5 yr prod): CloudBank 2; Nexus; AMA27

  • Category 2 (emerging tech, $5 M, 2 yr prod): REPACSS; Cosmos; SDSC prototype

  • Prototype National Research Platform: distributed testbed across multiple sites

 

  6. Data-Driven Decision-Making (Interactive)

Teams mapped out data collected across:

  • Usage (XDMoD, Pegasus, webinars)

  • User experience (surveys, focus groups)

  • Support interactions (tickets, chatbot logs)

  • Resource integration effort and maintenance effort

  • Security and monitoring logs; network flow data; external eval surveys

 

  7. Resource Recommenders (Nikolay Simakov)

Three ML-based models to guide users to optimal resources:

  1. Demographic: recommends based on user’s field, status, institution

  2. Alternative: suggests under-used systems for better load balancing

  3. Application-based: predicts time-to-solution from workload parameters

Demo showed KNN-driven wait-time estimates and visualizations.
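
For readers unfamiliar with the approach, a k-nearest-neighbours wait-time estimate can be sketched as below. This is not the model shown in the demo: the feature choice (nodes and walltime), the log-scale distance, and the `HISTORY` records are all invented for illustration.

```python
import math

# Hypothetical historical jobs:
# (nodes requested, walltime hours requested, observed queue wait in hours)
HISTORY = [
    (1, 1.0, 0.2),
    (1, 4.0, 0.5),
    (2, 8.0, 1.5),
    (4, 12.0, 3.0),
    (8, 24.0, 9.0),
    (16, 48.0, 20.0),
]


def estimate_wait_hours(nodes, walltime_hours, history=HISTORY, k=3):
    """Average the observed waits of the k historical jobs closest to the request."""
    if k < 1 or not history:
        raise ValueError("need k >= 1 and a non-empty history")

    def distance(job):
        # Compare on a log2 scale so 1 vs 2 nodes counts as much as 8 vs 16.
        return math.hypot(
            math.log2(job[0]) - math.log2(nodes),
            math.log2(job[1]) - math.log2(walltime_hours),
        )

    nearest = sorted(history, key=distance)[:k]
    return sum(job[2] for job in nearest) / len(nearest)
```

Calling `estimate_wait_hours(2, 8.0)` averages the waits of the three most similar past jobs. A real recommender would draw its history from accounting data and would likely use more features, such as queue, time of day, and requested GPUs.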

 

  8. MCP Servers & AI Chat (Andrew Pasquale)

  • Model Context Protocol servers enable AI queries over ACCESS data

  • Demo queries: current GPU wait times, software availability, research-profile generation

  • Planning working group to validate data provenance and trust

 

Day 2 (Wednesday, Sept 10) Highlights

  1. Recap & Data-Presentation Feedback

  • Call for structured data catalog: which team holds which datasets and how to access them

  • Emphasis on combining data discovery with trust-verification processes
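
One possible shape for such a catalog is sketched below, using the four fields named in the action items (type, location, steward, access method) plus the owning team. The class name, the helper function, and the sample entries are hypothetical; steward and access values are left as "TBD" because the notes do not record them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    name: str
    team: str            # which team holds the dataset
    data_type: str
    location: str
    steward: str
    access_method: str   # how other teams can get at the data


# Illustrative entries only; steward and access details are placeholders.
CATALOG = [
    DatasetEntry("Usage metrics", "Metrics", "usage", "XDMoD", "TBD", "TBD"),
    DatasetEntry("Support tickets", "Support", "support interactions", "JIRA", "TBD", "TBD"),
    DatasetEntry("Allocations data", "Allocations", "allocations", "accounting DB", "TBD", "TBD"),
]


def datasets_by_team(catalog):
    """Group dataset names by owning team, answering 'which team holds what'."""
    grouped = defaultdict(list)
    for entry in catalog:
        grouped[entry.team].append(entry.name)
    return dict(grouped)
```

Keeping each entry flat and immutable makes the catalog easy to export to a spreadsheet or a shared table, which may matter more at this stage than any particular storage backend.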

 

  2. 20-Year Trend Analysis (Andrew Pasquale via chat)

  • ACCESS-CI CPU-hour growth:

    • 2005–2010: 2.67 B SU → 2011–2015: 7.22 B SU → 2016–2020: 12.06 B SU → 2021–2025: 12.27 B SU

  • Field-specific evolution: Materials Engineering, Astronomy, Biophysics dominated; AI, fluid physics, nanotech emerging

  • Strategic recommendations for institutions, planners, and policymakers on future resource investments
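
The period totals reported above translate into period-over-period growth rates as in the short calculation below. The figures are copied from the notes; the function name is just for illustration. Two caveats when reading the result: the first window spans six calendar years as written, and the 2021–2025 window was presumably still incomplete at the time of the meeting.

```python
# Period totals in billions of SUs, copied from the notes.
PERIOD_TOTALS = [
    ("2005–2010", 2.67),
    ("2011–2015", 7.22),
    ("2016–2020", 12.06),
    ("2021–2025", 12.27),
]


def period_over_period_growth(totals):
    """Return (period, percent change vs. the previous period) for each later period."""
    return [
        (name, (current - previous) / previous * 100.0)
        for (_, previous), (name, current) in zip(totals, totals[1:])
    ]


for period, pct in period_over_period_growth(PERIOD_TOTALS):
    print(f"{period}: {pct:+.1f}% vs. previous period")
```

Growth slows from roughly 170% to 67% to under 2% across the three transitions, which is the flattening the trend analysis points to.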

 

  3. HPCPerfStats Status & Roadmap (Amit Ruhela)

  • Collects Slurm job, CPU, memory, I/O, networking, energy metrics; integrated into XDMoD

  • Recent upgrades: containerized deployment, roofline plots, 5× faster ingestion, classified job data now dropped

  • Future: broaden RP integrations, explore AI-based performance advice

 

  4. SC & WG Updates

  • Ticketing SC – streamlined support forms; plain-language UI

  • Infrastructure Portfolio Expansion SC – defined “fully integrated,” “affiliated,” “enabled” resources; roadmap for new badges and NSF approvals

  • Web Presence SC – unified registration/profile wireframes; backend feasibility review in September

  • Evaluation SC – survey refinements (short primary survey + opt-in deeper survey, NPS question)

  • Cybersecurity Governance SC – proposed institutional tool-approval survey

  • Communication SC – averaged 2–3 HPCwire stories/month; soliciting community story ideas

 

Key Decisions & Next Steps

  • EC to prioritize a consolidated data catalog and define central storage or infrastructure needs

  • Assign ownership for all action items; distribute SC/WG report templates from Chuck

  • Call for new agenda-planning committee members (Cindy); launch new RAC form (Laura H)

  • Plan demos of chat widget, ticket analysis, and MCP capabilities in future meetings

  • Continue in-person format for interactive sessions and networking


Additional Details

Key Decisions

  • Executive Committee will lead a consolidated data catalog effort, defining what datasets exist, where they’re housed, and how teams access them.

  • Institutional-affiliation requirement adopted: only users with verified organizational email can request new allocations; recent submissions will be audited.

  • Ownership of all outstanding action items to be formally assigned by EC; standardized SC/WG reporting templates will be distributed.

  • In-person meeting format retained for interactive demos and networking.

  • New agenda-planning committee members will be recruited; Research Advisory Committee (RAC) form relaunched this fall.

 

Action Items

  • Upload and maintain presentation slides in the “September 2025 Quarterly Meeting” folder.

  • EC to assign individual owners for every documented action item by next week.

  • Each team to inventory its data assets (type, location, steward, access method) and share in the central catalog.

  • Allocations team to implement institutional-email enforcement logic, prevent unaffiliated submissions, and audit the past two weeks of allocation requests.

  • Support team to schedule demos of the enhanced Q&A chatbot, ticket-analysis tool, and MCP server capabilities.

  • Operations to finalize and launch the Integration Dashboard and roll out the automated resource-news API.

  • Metrics team to automate monthly interactive usage reports and integrate OSG/cloud data into XDMoD.

  • Infrastructure SC to refine “fully integrated,” “affiliated,” and “enabled” resource definitions, prepare badge roadmap, and seek EC/NSF approval.

  • Web Presence SC to vet wireframes with developers, finalize UX language, and plan implementation timelines.

  • Evaluation SC to deploy the shortened primary survey (with NPS question), plus an opt-in detailed follow-up; share results with EC.

  • Cybersecurity Governance SC to draft and circulate an institutional tool-approval survey for feedback.

  • Communications SC to solicit story ideas (user experiences, awards, innovations) and maintain a 2–3 stories/month cadence with HPCwire.