A Dataset of Dockerfiles

Dockerfiles are one of the most prevalent kinds of DevOps artifacts used in industry. Despite their prevalence, there is a lack of sophisticated semantics-aware static analysis of Dockerfiles. In this paper, we introduce a dataset of approximately 178,000 unique Dockerfiles collected from GitHub. To enhance the usability of this data, …