Speakers
Abstract
Machine learning (ML) assets, such as models, datasets, and metadata, are central to modern ML workflows. Despite their explosive growth in practice, these assets are often underutilized because of fragmented documentation, siloed storage, inconsistent licensing, and the lack of unified discovery mechanisms, which makes ML-asset management an urgent challenge. This tutorial offers a comprehensive overview of ML-asset management activities across the asset lifecycle, including curation, discovery, and utilization. We categorize ML assets and the major management issues, survey state-of-the-art techniques, and identify emerging opportunities at each stage. We further highlight system-level challenges related to scalability, lineage, and unified indexing. Through live demonstrations of systems, this tutorial equips both researchers and practitioners with actionable insights and practical tools for advancing ML-asset management in real-world and domain-specific settings.
Resources: Reading List, Demonstrations
Schedule
Part 1: Motivation and Background (00:00 - 00:05)
Part 2: ML-Asset Curation (00:05 - 00:20)
- Metadata and Schema: Support Structured Understanding.
- Repositories and Infrastructure: Backbone of Discovery.
- Licenses: Enable Responsible Reuse.
Part 3: ML-Asset Search and Discovery (00:20 - 00:40)
- Model and Dataset Search.
- Data-driven Model Selection.
- Model-driven Data Discovery.
Part 4: ML-Asset Utilization (00:40 - 01:05)
- Collaboration: Workflow Aggregation and Automation.
- Reproducibility: Benchmarking and Model Provenance.
- Responsibility: Licensing and Ethical Asset Governance.
Part 5: System Challenges and Opportunities (01:05 - 01:20)
- Storage and Scalability.
- Versioning and Lineage.
- Indexing and Searching.