Collecting, Linking, and Assessing Machine Learning Open-Source Software: A Large Scale Collection and Vulnerability Assessment Pipeline

dc.contributor.author: Lazarine, Ben
dc.contributor.author: Pulipaka, Srikar
dc.contributor.author: Samtani, Sagar
dc.contributor.author: Venkataraman, Ramesh
dc.date.accessioned: 2024-12-26T21:04:44Z
dc.date.available: 2024-12-26T21:04:44Z
dc.date.issued: 2025-01-07
dc.description.abstract: In recent years, Artificial Intelligence (AI) has seen rapid advances in performance and impact, disrupting major industries, including finance and healthcare. Machine learning open-source software (MLOSS) platforms such as GitHub and Hugging Face have contributed significantly to this advancement, enabling AI developers to share, reuse, and collaborate on AI development. While these platforms accelerate AI development, the MLOSS assets they host also contain vulnerabilities that can impact applications that leverage them. To map the MLOSS landscape and understand the vulnerabilities contained within MLOSS on platforms such as GitHub and Hugging Face, we have developed an MLOSS Collection Pipeline. Our pipeline has collected 373,634 models from Hugging Face and 39,115 repositories from GitHub and identified 6,751,739 vulnerabilities. The results of our pipeline offer several promising directions for future research, including vulnerability linking analysis and cross-platform vulnerability propagation identification.
dc.format.extent: 8
dc.identifier.doi: https://doi.org/10.24251/HICSS.2025.048
dc.identifier.isbn: 978-0-9981331-8-8
dc.identifier.other: 0562f3cb-3fab-46ac-97f8-260113b5edee
dc.identifier.uri: https://hdl.handle.net/10125/108884
dc.relation.ispartof: Proceedings of the 58th Hawaii International Conference on System Sciences
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Cybersecurity in the Age of Artificial Intelligence, AI for Cybersecurity, and Cybersecurity for AI
dc.subject: ai risk management, artificial intelligence, cybersecurity, open-source software
dc.title: Collecting, Linking, and Assessing Machine Learning Open-Source Software: A Large Scale Collection and Vulnerability Assessment Pipeline
dc.type: Conference Paper
dc.type.dcmi: Text
prism.startingpage: 398

Files

Original bundle

Name: 0041.pdf
Size: 600.21 KB
Format: Adobe Portable Document Format