Collecting, Linking, and Assessing Machine Learning Open-Source Software: A Large Scale Collection and Vulnerability Assessment Pipeline
| dc.contributor.author | Lazarine, Ben | |
| dc.contributor.author | Pulipaka, Srikar | |
| dc.contributor.author | Samtani, Sagar | |
| dc.contributor.author | Venkataraman, Ramesh | |
| dc.date.accessioned | 2024-12-26T21:04:44Z | |
| dc.date.available | 2024-12-26T21:04:44Z | |
| dc.date.issued | 2025-01-07 | |
| dc.description.abstract | In recent years, Artificial Intelligence (AI) has seen rapid advances in performance and impact, disrupting major industries, including finance and healthcare. Machine learning open-source software (MLOSS) platforms such as GitHub and Hugging Face have contributed significantly to this advancement, enabling AI developers to share, reuse, and collaborate on AI development. While these platforms accelerate AI development, the MLOSS assets they host also contain vulnerabilities that can impact applications that leverage them. To map the MLOSS landscape and understand the vulnerabilities contained within MLOSS on platforms such as GitHub and Hugging Face, we have developed an MLOSS Collection Pipeline. Our pipeline has collected 373,634 models from Hugging Face and 39,115 repositories from GitHub and identified 6,751,739 vulnerabilities. The results of our pipeline offer several promising directions for future research, including vulnerability linking analysis and cross-platform vulnerability propagation identification. | |
| dc.format.extent | 8 | |
| dc.identifier.doi | https://doi.org/10.24251/HICSS.2025.048 | |
| dc.identifier.isbn | 978-0-9981331-8-8 | |
| dc.identifier.other | 0562f3cb-3fab-46ac-97f8-260113b5edee | |
| dc.identifier.uri | https://hdl.handle.net/10125/108884 | |
| dc.relation.ispartof | Proceedings of the 58th Hawaii International Conference on System Sciences | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | |
| dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.subject | Cybersecurity in the Age of Artificial Intelligence, AI for Cybersecurity, and Cybersecurity for AI | |
| dc.subject | ai risk management, artificial intelligence, cybersecurity, open-source software | |
| dc.title | Collecting, Linking, and Assessing Machine Learning Open-Source Software: A Large Scale Collection and Vulnerability Assessment Pipeline | |
| dc.type | Conference Paper | |
| dc.type.dcmi | Text | |
| prism.startingpage | 398 |
