1 Introduction

Vulnerabilities in third-party dependencies are a growing concern for software developers. In a 2018 report, over four million vulnerabilities were raised to the attention of developers of over 500 thousand GitHub repositories (GitHub 2018a). The risk of vulnerabilities is not restricted to the direct users of these software artifacts; it also extends to the broader software ecosystems to which they belong. Examples include the ShellShock (Bennett 2014) and Heartbleed (Synopsys 2014) vulnerabilities, which caused widespread damage to broad and diverse software ecosystems made up of direct and indirect adopters. Indeed, the case of Heartbleed emphasized the critical role of OpenSSL in the modern web. Durumeric et al. (2014) show that OpenSSL, i.e., the project where the Heartbleed vulnerability originated, is present on web servers that host (at least) 66% of sites, and that (at least) 24% of the secure sites on the internet were affected by Heartbleed.

The speed at which ecosystems react to vulnerabilities and the availability of fixes to vulnerabilities are of paramount importance. Three lines of prior work support this intuition:

  1. Studies by Howard and Leblanc (2001), Ponta et al. (2018), Nguyen et al. (2016), Munaiah et al. (2017), Hejderup (2015), Pashchenko et al. (2018), and Williams et al. (2018) encourage developers to use security best practices, e.g., project validation and security monitoring, to prevent and detect vulnerabilities in deployed projects.

  2. Studies by Kikas et al. (2017), Decan et al. (2017), and Cox et al. (2015) show that vulnerabilities can cascade transitively through the package dependency network. Moreover, they observe that security issues are more likely to occur in the field due to stale (outdated) dependencies than directly within product code.

  3. Studies by Kula et al. (2018b), Bavota et al. (2015), Bogart et al. (2016), and Ihara et al. (2017) show that developers are slow to update their vulnerable packages, which is occasionally due to management and process factors.

While these prior studies have made important advances, they have tended to focus on (i) a coarse granularity, i.e., releases, considering only the vulnerable dependency and only a single direct client. For example, Decan et al. (2018b) analyzed how and when package releases with vulnerabilities are discovered and fixed with a single direct client, while Decan et al. (2018a) analyzed releases to explore the evolution of technical lag and its impact. Commit-level analysis similar to Li and Paxson (2017) and Piantadosi et al. (2019) is important because it reveals how much development activity (i.e., migration effort) is directed towards fixing vulnerabilities compared to other tasks. Furthermore, there is also a research gap that relates to (ii) the analysis of package vulnerability fixes with respect to all downstream clients, and not just a single direct client.

To bridge these two research gaps, we set out to identify and characterize the release, adoption, and propagation tendencies of vulnerability fixes. We identify and track the release that contains the fix, which is defined as a fixing release. We then characterize the fixing release for each npm JavaScript package in terms of the commits for fixing the vulnerability, which is defined as the package-side fixing release. Based on semantic versioning (Preston-Werner 2009), a package-side fixing release is a package major landing, a package minor landing, or a package patch landing. From a client perspective, we identify the client-side fixing release, which classifies how a client migrates from a vulnerable version to the fixing version as a client major landing, client minor landing, client patch landing, or dependency removal. By comparing the package-side fixing release with the client-side fixing release, we identify lags in the adoption process as clients keep stale dependencies.

Our empirical study comprises two parts. First, we perform a preliminary study of 231 package-side fixing releases of npm packages on GitHub. We find that the fix is rarely released on its own, with up to 85.72% of the bundled commits being unrelated to the fix. Second, we conduct an empirical study of 1,290 package-side fixing releases to analyze their adoption and propagation tendencies throughout a network of 1,553,325 releases of npm packages. We find that quickly releasing fixes does not ensure that clients will adopt them quickly. Indeed, we find that only 21.28% of clients reacted to a package patch landing by performing a client patch landing of their own. Furthermore, we find that factors such as the branch upon which a fix lands and the severity of the vulnerability have a small effect on its propagation trajectory throughout the ecosystem, i.e., the latest lineage and medium severity suffer the most lags. To mitigate propagation lags in an ecosystem, we recommend that developers and researchers (i) develop strategies for making the most efficient update via the release cycle, (ii) develop better awareness mechanisms for quicker planning of an update, and (iii) allocate additional time before updating dependencies.

Our contributions are three-fold. The first contribution is a set of definitions and measures to characterize the vulnerability discovery and fixing process from the perspective of both the vulnerable package, i.e., package-side fixing release, and its client, i.e., client-side fixing release. The second contribution is an empirical study that identifies potential lags in the release, adoption, and propagation of a package-side fixing release.

The third contribution is a detailed replication package, which is available at https://github.com/NAIST-SE/Vulnerability-Fix-Lags-Release-Adoption-Propagation.

1.1 Paper Organization

The remainder of the paper is organized as follows: Section 2 describes key concepts and definitions. Section 3 presents the motivations, approaches, and results of our preliminary study. Section 4 introduces the concepts for modeling and tracking lags in fix adoption and propagation, together with the empirical study that identifies them. We then discuss the implications of our results and the threats to the validity of the study in Section 5. Section 6 surveys the related work. Finally, Section 7 concludes the study.

2 Concepts and Definitions

2.1 Package-side Vulnerability Fixing Process

Figure 1 illustrates the timeline of the package-side vulnerability fixing process of package \(\mathbb {P}\) in the lower part of the figure. We break down this process into two steps:

Fig. 1 The relationship between package-side and client-side regarding vulnerability discovery, fixing, and release process of package \(\mathbb {P}\) and client \(\mathbb {X}\) over time. Red and green releases indicate whether releases are vulnerable or not

Step one: Vulnerability Discovery. Figure 1 shows the vulnerability of package \(\mathbb {P}\) being detected after the release of \(\mathbb {P}_{V1.1.0}\). As reported by Kula et al. (2018b), CVE defines four phases of a vulnerability: (i) threat detection, (ii) CVE assessment, (iii) security advisory, and (iv) patch release. We define the vulnerability discovery as the period from the threat detection up to, but not including, the patch release. It is most likely that the fixing process starts after the CVE assessment, i.e., once a vulnerability has been assigned a CVE number. In addition to a CVE number, a vulnerability report details the affected packages, which can include releases up to an upper-bound of reported versions. In this step, the developers of an affected package are notified via communication channels such as a GitHub issue for GitHub projects.

Step two: Vulnerability Fix and Release. Figure 1 shows the vulnerability of package \(\mathbb {P}\) being fixed and released as \(\mathbb {P}_{V1.1.1}\). We define the vulnerability fix and release as the period where developers spend their efforts to identify and mitigate the vulnerable code. Once the fix is ready, developers merge that fix into the package repository. In most GitHub projects, developers will review changes via a GitHub pull request. The semantic versioning convention is also used to manage the release version number of a package (Preston-Werner 2009). We define the fixing release as the first release that contains a vulnerability fix (\(\mathbb {P}_{V1.1.1}\)). We also define package-side fixing release to describe the fixing release of the vulnerable package and to show how a package bumped the release version number from a vulnerable release to a fixing release. There are three kinds of package-side fixing release based on semantic versioning: (i) package major landing (major number is bumped up), (ii) package minor landing (minor number is bumped up), and (iii) package patch landing (patch number is bumped up). As shown in Fig. 1, the package-side fixing release of package \(\mathbb {P}\) is a package patch landing, i.e., \(\mathbb {P}_{V1.1.0}\) to \(\mathbb {P}_{V1.1.1}\).
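The three kinds of package-side fixing release can be illustrated with a minimal Python sketch. It assumes plain three-part version strings (MAJOR.MINOR.PATCH) with no pre-release tags; the function names `parse_version` and `classify_landing` are hypothetical and are not taken from the replication package.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH'; an optional leading 'v' or 'V' is ignored."""
    major, minor, patch = version.lstrip("vV").split(".")
    return int(major), int(minor), int(patch)


def classify_landing(vulnerable: str, fixing: str) -> str:
    """Name the semantic-version component bumped between two releases."""
    old, new = parse_version(vulnerable), parse_version(fixing)
    if new <= old:
        raise ValueError("the fixing release must be newer than the vulnerable one")
    if new[0] != old[0]:
        return "package major landing"
    if new[1] != old[1]:
        return "package minor landing"
    return "package patch landing"


# Package P in Fig. 1: V1.1.0 (vulnerable) to V1.1.1 (fixing release).
print(classify_landing("1.1.0", "1.1.1"))  # package patch landing
```

Comparing the parsed tuples component by component mirrors the definition above: the first differing component decides the landing type.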

2.2 Client-side Fixing Release

Prior work suggests that lags in adoption could be the result of migration effort (Kula et al. 2018b). Thus, to quantify this effort, we compare the vulnerable release that a client is using, i.e., one listed in the report, against the package-side fixing release, and categorize the change as a client-side fixing release. Once the developers of the vulnerable package have released the fix, they have fulfilled their responsibility; the responsibility for adoption is then left to the client.

Figure 1 shows the timelines of a fixing release and its clients. As illustrated in the figure, client \(\mathbb {X}\) suffers a lag in the adoption of a package-side fixing release by switching dependency branches, i.e., client minor landing. Also, client \(\mathbb {X}\) directly depends on package \(\mathbb {P}\). Client \(\mathbb {X}\) is vulnerable (V2.0.0) due to its dependency (\(\mathbb {P}_{V1.0.1}\)). To mitigate the vulnerability, package \(\mathbb {P}\) creates a new branch, i.e., minor branch, which includes the package-side fixing release (\(\mathbb {P}_{V1.1.1}\)). Client \(\mathbb {X}\) finally adopts a new release of package \(\mathbb {P}\) that is not vulnerable (\(\mathbb {P}_{V1.1.2}\)). We consider that client \(\mathbb {X}\) has a lag, and that its update was not efficient, because it skipped the package-side fixing release (\(\mathbb {P}_{V1.1.1}\)) and instead adopted the next release (\(\mathbb {P}_{V1.1.2}\)). A possible cause of lags is the potential migration effort needed to switch branches, i.e., from \(\mathbb {P}_{V1.0.1}\) to \(\mathbb {P}_{V1.1.2}\). The migration effort for major or minor changes may include breaking changes or issues in the release cycle.

2.3 Motivating Example

Figure 2 shows a practical case of the vulnerability fixing process which affects a library for network communication, i.e., socket.io. Figure 2a and b show step one, where the vulnerability is reported as a GitHub issue,Footnote 1 and summarized in snyk.io.Footnote 2 The vulnerability report contains detailed information regarding the identified problem, its severity, and a proof-of-concept to confirm the threat. In this example, socket.io was affected by a medium severity vulnerability. We also found that the reporter is the same person who created the fix. Figure 2c and d show step two. Figure 2c shows a fix that will be merged into the code base. The fix is submitted in the form of a pull request.Footnote 3 Figure 2d shows that the pull request contains four commits, only one of which actually fixes the vulnerability.Footnote 4 The other three commits were found to be unrelated, e.g., "removing fixes for other bug". Figure 2e shows that the package-side fixing release was available on July 25, 2012.Footnote 5

Fig. 2 Developer artifacts that mitigate a vulnerability (socket.io) on GitHub

Interestingly, there is a lag in the vulnerability fix and release step. This example shows that there is an 89-day period between when the fix was created and when it was released for any client to use. We found that socket.io merged its fix on April 27, 2012; however, it was only released on July 25, 2012, classified as a package patch landing, i.e., V0.9.7.
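The 89-day lag follows from simple date arithmetic on the two dates given above. The short Python sketch below only illustrates how such a release lag can be computed; the variable names are illustrative.

```python
from datetime import date

# Dates taken from the socket.io example in the text.
fix_merged = date(2012, 4, 27)
fix_released = date(2012, 7, 25)

# Lag between merging the fix and shipping the package-side fixing release.
release_lag_days = (fix_released - fix_merged).days
print(release_lag_days)  # 89
```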

3 Package-side Fix Commits and Landing: Preliminaries

Motivated by the example in Section 2.3, we want to reveal how much development activity is directed towards fixing vulnerabilities compared to other tasks. Thus, we conduct a preliminary study to characterize the package-side fixing release at the commit level, including (i) the changes of the package-side fixing release and (ii) the contents of the package-side fixing release. We first highlight the motivation, approach, and analysis to answer our preliminary questions. We then show our data collection and finally provide the results. The following two preliminary questions guide the study:

(PQ1) What is the prevalence of package patch landing?

  • Motivation Our motivation for PQ1 is to analyze the package-side fixing release. Different from Decan et al. (2018b), we manually investigate the version changes in the package-side fixing release itself. Our assumption is that every fix is applied as a package patch landing.

  • Approach The approach to answer PQ1 involves a manual investigation to identify the package-side fixing release, i.e., package major landing, package minor landing, package patch landing. This is done in three steps.

    1. The first step is to extract the fix-related information on GitHub. The extracted information is captured as one of three types: (i) an issue, (ii) a commit, and (iii) a pull request.

    2. The second step is to identify the release that contains a fix. This step involves an investigation of the package history. From the link in the first step, the first author manually tracked the git commit history to identify when the fix was applied.

    3. The final step is to identify the difference between a vulnerable release and a package-side fixing release. We compare a vulnerable release, i.e., listed in the report, against a package-side fixing release to categorize the changes: (i) package major landing, (ii) package minor landing, and (iii) package patch landing.

  • Analysis The analysis to answer PQ1 is the investigation of package-side fixing releases. We use a summary statistic to show the package-side fixing release distribution. Furthermore, an interesting example case from the result is used to confirm and explain our findings.

(PQ2) What portion of the release content is a vulnerability fix?

  • Motivation Extending PQ1, our motivation for PQ2 is to reveal what kinds of contents are bundled within the package-side fixing release. We complement recent studies, but at the commit level (Hejderup 2015; Decan et al. 2018a, 2018b). We would like to evaluate the assumption that commits bundled in a package-side fixing release are mostly related to the fixing commits.

  • Approach The approach to answer PQ2 involves a manual investigation of contents inside the package-side fixing release. This is done in three steps.

    1. The first step is to gather information about a fixing commit. In this case, we tracked the git commit history similar to the second step of PQ1.

    2. The second step is to list the commits in a package-side fixing release. We use GitHub's comparing changes tool to perform this task.Footnote 6

    3. The final step is to identify the type of commits bundled in a package-side fixing release. Similar to PQ1, the first author manually tracked and labeled each commit as either (i) fixing commit or (ii) other commit. To label commits, the first author used the source code, commit messages, and GitHub pull request information. For validation, the other co-authors confirmed the results, i.e., one author found the evidence and another validated it.

  • Analysis The analysis to answer PQ2 is to examine the portion of fixing commits in a package-side fixing release. We show the cumulative frequency distribution to describe the distribution of fixing commits for 231 package-side fixing releases. We use a box plot to show the fixing commit size in terms of lines of code (LoC). Similar to PQ1, we show interesting example cases from the result.
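The portion of fixing commits examined in PQ2 can be sketched as follows. This is a minimal illustration, assuming each commit in a release has already been labeled manually as "fixing" or "other"; the function name `fixing_commit_share` and the sample labels are hypothetical.

```python
from collections import Counter
from typing import Iterable


def fixing_commit_share(labels: Iterable[str]) -> float:
    """Return the percentage of commits in one release labeled 'fixing'."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("a release must contain at least one commit")
    return 100.0 * counts["fixing"] / total


# Hypothetical release: one fixing commit bundled with three unrelated ones.
labels = ["fixing", "other", "other", "other"]
print(f"{fixing_commit_share(labels):.2f}% of commits relate to the fix")
```

Computing this share for every package-side fixing release yields the values whose cumulative frequency distribution the analysis reports.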

3.1 Data Collection

Our dataset contains the vulnerability reports with fix-related information on GitHub. For the vulnerability reports, we crawled the data from snyk.io (Snyk 2015) that were originally disclosed in the CVE and CWE databases. For the fix-related information, we focus on packages from the npm JavaScript ecosystem, which is one of the largest package collections on GitHub (NPM 2010) and has also been the focus of recent studies (Kikas et al. 2017; Abdalkareem et al. 2017; Decan et al. 2017, 2018a, 2018b; Hejderup 2015). At the time of this study, we extracted fix-related information links directly from snyk.io (e.g., GitHub issue, commit, pull request).

As shown in Table 1, we crawled and collected all reports from April 9, 2009 to August 7, 2020, i.e., 2,373 reports in total. To identify the reports with fix-related information, we removed reports that (i) do not have the fixing release or (ii) do not provide any fix-related information. After that, we randomly selected around 237 reports (10% of 2,373) and manually filtered out reports whose vulnerable package does not follow semantic versioning. In the end, the dataset for the PQ1 and PQ2 analysis consists of 231 reports that affect 172 packages.

Table 1 A summary of package-side dataset information for preliminary study

3.2 Results to the Preliminary Study

(PQ1) What is the prevalence of package patch landing?

Table 2 shows evidence that not every fix is applied as a patch. This evidence contradicts our assumption. We find that 64.50% of fixes are a package patch landing. On the other hand, we find that 7.36% and 28.14% of fixes are a package major landing and a package minor landing, respectively. From our result, we suspect that some releases, especially package major landings and package minor landings, might contain content unrelated to the fixing commits.

Table 2 A summary statistic of package-side fixing release distribution in PQ1

The example case is a package major landing of an HTTP server framework, i.e., connect (V2.0.0). This package was vulnerable to a Denial of Service (DoS) attack (Snyk 2017a). Under closer investigation, we manually validated that other fixes were bundled in the package-side fixing release, including API breaking changes (i.e., a removed function).Footnote 7 This fix also took 53 days to be released. We suspect that this may cause a lag in the package-side fixing release, especially if the project has a release cycle.

(PQ2) What portion of the release content is a vulnerability fix?

Figure 3 is evidence that fixes are usually bundled with other kinds of changes. We find that in 91.77% of the 231 fixing releases, up to 14.28% of the commits are related to the fix, which means that the remaining 85.72% of commits were unrelated. Figure 4 shows that the fix itself tends to contain only a few lines of code, i.e., a median of 10 LoC. In contrast, and similar to the commit-level analysis, the package-side fixing release tends to contain a lot of changes, i.e., a median of 219 LoC. These results complement the finding of PQ1 that a package-side fixing release might contain changes unrelated to the fixing commit.

Fig. 3 We find that in 91.77% of the 231 fixing releases, fixing commits account for up to 14.28% of the commits in a package-side fixing release

Fig. 4 LoC of the fixing commits for 231 vulnerabilities. We find that the fixing commits in the package-side fixing release contain only a few lines of code, i.e., a median of 10

We show two examples to investigate the content of a fix and its size. The first example is a package patch landing of a simple publish-subscribe messaging system for the web, i.e., faye (Snyk 2017b). Under closer manual inspection, we find that there is one commit that updates the default value of variables.Footnote 8 However, the package patch landing includes a total of 45 commits that are not related to the fix.Footnote 9 In the second example, we show that the actual fix is only a few lines of code. The npm package, which is the command line interface of a JavaScript package manager (Snyk 2017c), took seven lines of code to fix the vulnerability.Footnote 10

4 Client-side Lags after the Package-side Fixes

The results of our preliminary study characterize the package-side fixing release, where we find that (i) 64.50% of vulnerability fixes are classified as a package patch landing and (ii) up to 85.72% of commits in a release are unrelated to the actual fix. Based on these results, we suspect that potential lags might occur while the package-side fixing release gets adopted by the clients and transitively propagates throughout the dependency network. Hence, we perform an empirical evaluation to explore potential lags in the adoption and propagation of the fix.

4.1 Model and Track Lags

To explore potential lags in both adoption and propagation, we model and track the package-side fixing release and client-side fixing release as illustrated in Fig. 5.

Released and Adopted by Version -

We identify lags in the adoption by analyzing the prevalence of patterns between a package-side fixing release and a client-side fixing release, which is similar to technical lag (Zerouali et al. 2018) and based on semantic versioning. The definition of package-side fixing release was explained in Section 2.1, which describes how the package bumped the release version number. Note that pre-releases or special releases are not considered in this study. We then define a new term called a client-side fixing release. A client-side fixing release describes how clients bumped the version of an adopted package from a vulnerable version to a fixing release. There are four kinds of client-side fixing release: (i) client major landing (major number of an adopted package is bumped up), (ii) client minor landing (minor number of an adopted package is bumped up), (iii) client patch landing (patch number of an adopted package is bumped up), and (iv) dependency removal (adopted package is removed from a client dependency list).

Figure 5a shows an example of the two terms defined above. First, we find that the package-side fixing release for package \(\mathbb {P}\) is classified as a package patch landing. This is because of the difference between the fixing release (\(\mathbb {P}_{V1.1.1}\)) and its vulnerable release (\(\mathbb {P}_{V1.1.0}\)). Furthermore, we find that the client-side fixing release for client \(\mathbb {X}\) is a client minor landing. This is because of the difference between the adopted fixing release (\(\mathbb {P}_{V1.1.2}\)) and its previous vulnerable release (\(\mathbb {P}_{V1.0.1}\)).
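The four kinds of client-side fixing release can be sketched in Python as below. The sketch assumes plain three-part versions and represents a removed dependency as `None`; the function name `client_side_fixing_release` is hypothetical and only illustrates the definition.

```python
from typing import Optional, Tuple


def _parse(version: str) -> Tuple[int, ...]:
    """Split 'MAJOR.MINOR.PATCH' into integers."""
    return tuple(int(part) for part in version.split("."))


def client_side_fixing_release(previous: str, adopted: Optional[str]) -> str:
    """Classify how a client moved off a vulnerable version of a dependency.

    `previous` is the vulnerable version the client depended on; `adopted`
    is the fixing version it moved to, or None if the dependency was dropped.
    """
    if adopted is None:
        return "dependency removal"
    old, new = _parse(previous), _parse(adopted)
    if new[0] != old[0]:
        return "client major landing"
    if new[1] != old[1]:
        return "client minor landing"
    return "client patch landing"


# Client X in Fig. 5a moves from P V1.0.1 to P V1.1.2.
print(client_side_fixing_release("1.0.1", "1.1.2"))  # client minor landing
print(client_side_fixing_release("1.0.1", None))     # dependency removal
```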

Fig. 5 These figures show the terms that are used to model and track the lags

Propagation Influencing Factors -

We define Hop as the transitive dependency distance between a package-side fixing release and any downstream clients that have adopted this fix, i.e., one, two, three, and more than or equal to four hops. As shown in Fig. 5a, client \(\mathbb {X}\) is one hop away from package \(\mathbb {P}\). We consider two different factors to model and track lags in the propagation:

  1. Lineage Freshness: refers to the freshness of the package-side fixing release as inspired by Cox et al. (2015) and Kula et al. (2018a). Figure 5a shows two types of lineage freshness based on the release branches: Latest Lineage (LL), where the client has adopted any package-side fixing release on the latest branch, and Supported Lineage (SL), where the client has adopted any package-side fixing release not on the latest branch. Our assumption is that a package-side fixing release in the latest lineage is adopted faster than a package-side fixing release in a supported lineage, i.e., it suffers fewer lags. Figure 5b shows that three versions of package \(\mathbb {P}\) (V1.0.2, V1.0.3, V1.1.3) are classified as SL.

  2. Vulnerability Severity: refers to the severity of the vulnerability, i.e., H = high, M = medium, L = low, as indicated in the vulnerability report (as shown in Fig. 2a from Section 2). Our assumption is that a package-side fixing release with higher severity is adopted quicker, i.e., it suffers fewer lags.

4.2 Empirical Evaluation

The goal of our empirical study is to investigate lags in the adoption and propagation. We use these two research questions to guide our study:

(RQ1) Is the package-side fixing release consistent with the client-side fixing release?

Our motivation for RQ1 is to understand whether developers are keeping up to date with the package-side fixing releases. We consider package-side and client-side fixing releases to be consistent if the client-side fixing release follows the package-side fixing release, i.e., both have the same landing type. For example, the combination of a client minor landing and a package minor landing is consistent, but the combination of a client major landing and a package patch landing is not. Our key assumption is that an inconsistent combination requires more migration effort than a consistent one, which in turn is likely to create lags.

(RQ2) Do lineage freshness and severity influence lags in the fix propagation?

Our motivation for RQ2 is to identify the existence of lags during propagation. Concretely, we use our defined measures, i.e., propagation influencing factors, to characterize propagation lags. Our assumption is that a package-side fixing release on the latest lineage with high severity should propagate quickly.

Data Collection -

Our data collection consists of (i) vulnerability reports and (ii) the set of cloned npm package and client git repositories. We use the same 2,373 vulnerability reports as in our preliminary study, which were crawled from snyk.io (Snyk 2015). As inspired by Wittern et al. (2016), we cloned and extracted information about npm packages and clients from public GitHub repositories. In this study, we consider only normal dependencies listed in the package.json file to make sure that the packages are used in the production environment. Hence, the other types of dependencies, i.e., (1) devDependencies, (2) peerDependencies, (3) bundledDependencies, and (4) optionalDependencies, are ignored in this study, since they will not be installed in downstream clients in production or cannot be retrieved directly from the npm registry. To perform the lags analysis, we first filter out reports that do not have a fixing release. We then use the package name and its GitHub link from the reports to automatically match cloned repositories.
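Restricting the analysis to normal dependencies amounts to reading only the `dependencies` field of package.json. The following Python sketch illustrates this; the function name `production_dependencies` and the example manifest (including the package names) are hypothetical.

```python
import json
from typing import Dict


def production_dependencies(package_json_text: str) -> Dict[str, str]:
    """Return only the normal 'dependencies' of a package.json document.

    devDependencies, peerDependencies, bundledDependencies, and
    optionalDependencies are deliberately ignored.
    """
    manifest = json.loads(package_json_text)
    return dict(manifest.get("dependencies", {}))


# Hypothetical manifest of a client that depends on a package named package-p.
example = """
{
  "name": "client-x",
  "version": "2.0.0",
  "dependencies": {"package-p": "^1.0.1"},
  "devDependencies": {"some-test-tool": "^8.0.0"}
}
"""
print(production_dependencies(example))
```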

As shown in Table 3, our data collection included 2,373 vulnerability reports that were disclosed from April 9, 2009 to August 7, 2020. There are 1,290 reports with already published fixing releases, which affect 786 different packages. The statistics of vulnerable packages and reports are presented in the table. For package and client repositories, we collected a repository snapshot from GitHub on August 9, 2020 with 152,074 repositories, 611,468 dependencies, and 1,553,325 releases (Table 4).

Table 3 A summary of the data collection used to populate the dataset to answer RQ1 and RQ2
Table 4 A summary of dataset information for the empirical study to answer RQ1 and RQ2

Approach to Answer RQ1 -

The data processing to answer RQ1 involves the package-side fixing release and client-side fixing release extraction. Similar to PQ1, we first identify the package-side fixing release by comparing a vulnerable release and a fixing release. To track the client-side fixing release, we then extract the direct clients’ version history of the vulnerable packages. A client is deemed vulnerable if its lower-bound dependency falls within the reported upper-bound as listed in a vulnerability report.
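The vulnerability check described above can be sketched as follows. This is a simplified illustration, not the study's implementation: it assumes that the dependency range starts with its lower bound (an exact version or a `^`, `~`, or `>=` range) and that the report gives an exclusive upper bound, i.e., the first non-vulnerable version. The names `lower_bound` and `is_vulnerable` are hypothetical.

```python
import re
from typing import Tuple

_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def lower_bound(range_spec: str) -> Tuple[int, ...]:
    """Extract the first MAJOR.MINOR.PATCH in a range such as '^1.0.1'.

    Assumption: the range begins with its lower bound.
    """
    match = _VERSION.search(range_spec)
    if match is None:
        raise ValueError(f"no concrete version found in {range_spec!r}")
    return tuple(int(group) for group in match.groups())


def is_vulnerable(dependency_range: str, affected_below: str) -> bool:
    """True if the client's lower bound lies below the reported upper bound.

    `affected_below` is treated as exclusive, e.g. '1.1.1' for '<1.1.1'.
    """
    return lower_bound(dependency_range) < lower_bound(affected_below)


print(is_vulnerable("^1.0.1", "1.1.1"))  # True
print(is_vulnerable("^1.1.2", "1.1.1"))  # False
```

Comparing integer tuples instead of raw strings avoids ordering errors such as treating 1.10.0 as older than 1.9.0.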

To ensure quality, we additionally filter out packages and clients that did not follow semantic versioning, as shown in Table 5. Our key assumption is to keep packages and clients that follow a semantic version release cycle, i.e., packages and clients should have all the update patterns of major landing, minor landing, and patch landing. As a result, 4,000 packages and clients were filtered out from the dataset. As shown in Table 4, our final dataset for RQ1 consists of 410 vulnerability reports that affect 230 vulnerable packages and 5,417 direct clients.

Table 5 A summary of the number of filtered clients grouped by their update pattern in RQ1. There are 4,000 packages and clients that were excluded in RQ1

The analysis to answer RQ1 is the identification of lags in the adoption. We show the frequency distribution of client-side fixing release for each package-side fixing release. In order to statistically validate our results, we apply Pearson’s chi-squared test (χ2) (Pearson 1900) with the null hypothesis ‘the package-side fixing release and the client-side fixing release are independent’. To show the strength of the association between each package-side fixing release and client-side fixing release combination, we investigate the effect size using Cramér’s V (\(\phi ^{\prime }\)), which is a measure of association between two nominal categories (Cramér 1946). According to Cohen (1988), since the contingency Table 6 has 2 degrees of freedom (df*), effect size is analyzed as follows: (1) \(\phi ^{\prime }\) < 0.07 as Negligible, (2) 0.07 ≤ \(\phi ^{\prime }\) < 0.20 as Small, (3) 0.20 ≤ \(\phi ^{\prime }\) < 0.35 as Medium, or (4) \(\phi ^{\prime }\) ≥ 0.35 as Large. To analyze Cramér’s V, we use the researchpy package.Footnote 11
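For readers unfamiliar with these measures, the chi-squared statistic and Cramér's V can be computed from a contingency table as in the pure-Python sketch below. This is not the researchpy implementation used in the study; it applies the textbook formulas without any continuity or bias correction, assumes every row and column has a non-zero total, and uses a hypothetical example table. The function names are illustrative.

```python
import math
from typing import Sequence

Table = Sequence[Sequence[float]]


def chi_squared(table: Table) -> float:
    """Pearson's chi-squared statistic of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def cramers_v(table: Table) -> float:
    """Cramér's V, with df* = min(rows - 1, columns - 1)."""
    n = sum(sum(row) for row in table)
    df_star = min(len(table) - 1, len(table[0]) - 1)
    return math.sqrt(chi_squared(table) / (n * df_star))


def effect_label(v: float) -> str:
    """Thresholds for df* = 2, as listed in the text."""
    if v < 0.07:
        return "Negligible"
    if v < 0.20:
        return "Small"
    if v < 0.35:
        return "Medium"
    return "Large"


# Hypothetical 4x3 table: rows are client-side, columns package-side landings.
table = [[30, 5, 5], [10, 40, 20], [5, 10, 60], [15, 5, 5]]
v = cramers_v(table)
print(round(v, 3), effect_label(v))
```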

Approach to Answer RQ2 -

The data processing to answer RQ2 involves the extraction of the propagation influencing factors. There are three steps to track downstream clients and classify lineage freshness and severity. First, we build and traverse a dependency tree for each package-side fixing release using a breadth-first search (BFS) approach. The meta-data collected from each downstream client includes: (i) version, (ii) release date, and (iii) dependency list, i.e., exact version and ranged version. We then classify whether or not a client is vulnerable using an approach similar to RQ1. Our method involves removing duplicated clients in the dependency tree, which are caused by the npm tree structure. Second, we classify the lineage freshness of a fixing release by confirming whether it is on the latest branch. Finally, we extract the vulnerability severity from the report. As shown in Table 4, our final dataset for RQ2 consists of 617 vulnerability reports, 344 vulnerable packages with fixing releases, and 416,582 downstream clients.
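The breadth-first traversal with duplicate removal can be sketched as follows. The sketch assumes the dependency network is available as a mapping from each package to the clients that depend on it; the mapping, the node names, and the function name `hops_from_fix` are hypothetical.

```python
from collections import deque
from typing import Dict, List


def hops_from_fix(dependents: Dict[str, List[str]], package: str) -> Dict[str, int]:
    """Breadth-first search from a fixed package to its downstream clients.

    Returns the hop distance of every reachable client. A client reachable
    over several paths is recorded once, at its shortest distance.
    """
    hops: Dict[str, int] = {}
    visited = {package}
    queue = deque([(package, 0)])
    while queue:
        node, distance = queue.popleft()
        for client in dependents.get(node, []):
            if client in visited:
                continue  # duplicated client in the tree; keep first visit
            visited.add(client)
            hops[client] = distance + 1
            queue.append((client, distance + 1))
    return hops


# Hypothetical network: X and Y depend on P, and Z depends on both X and Y.
network = {"P": ["X", "Y"], "X": ["Z"], "Y": ["Z"]}
print(hops_from_fix(network, "P"))
```

Because BFS visits clients in order of distance, the first visit of a client already gives its smallest hop count, which is why later duplicates can be skipped safely.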

The analysis to answer RQ2 is the identification of lags in the propagation. We show a summary statistic of lags in terms of days, i.e., the mean, the median, the standard deviation, and the frequency distribution, with two influencing factors. In order to statistically validate the differences in the results, we apply the Kruskal-Wallis non-parametric statistical test (Kruskal and Wallis 1952). This is a one-tailed test.Footnote 12 We test the null hypothesis that ‘lags in the latest and supported lineages are the same’. We investigate the effect size using Cliff’s δ, which is a non-parametric effect size measure (Romano et al. 2006). Effect sizes are analyzed as follows: (1) |δ| < 0.147 as Negligible, (2) 0.147 ≤ |δ| < 0.33 as Small, (3) 0.33 ≤ |δ| < 0.474 as Medium, or (4) 0.474 ≤ |δ| as Large. To analyze Cliff’s δ, we use the cliffsDelta package.Footnote 13
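Cliff's δ and its interpretation can be illustrated with the short pure-Python sketch below. It is not the cliffsDelta package used in the study; it applies the direct pairwise definition, which is quadratic in the sample sizes, and the lag values are hypothetical.

```python
from typing import Sequence


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Cliff's delta: (#pairs with x > y - #pairs with x < y) / (|xs| * |ys|)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def magnitude(delta: float) -> str:
    """Interpret |delta| with the thresholds listed in the text."""
    size = abs(delta)
    if size < 0.147:
        return "Negligible"
    if size < 0.33:
        return "Small"
    if size < 0.474:
        return "Medium"
    return "Large"


# Hypothetical propagation lags in days for two lineages.
latest_lineage = [10, 20, 30]
supported_lineage = [5, 15, 25]
delta = cliffs_delta(latest_lineage, supported_lineage)
print(round(delta, 3), magnitude(delta))
```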

4.3 Results to the Empirical Study

(RQ1) Is the package-side fixing release consistent with the client-side fixing release?

Our results are summarized into two findings. First, Table 6 shows evidence that most package-side fixing releases are package patch landings. As shown in the first row of the table, we find that 245 out of 410 fixing releases are package patch landings (highlighted in red). We also find that there are 66 package major landings and 99 package minor landings. This finding complements the result of PQ1.

Table 6 A contingency table showing the frequency distribution of the client-side fixing release for each package-side fixing release

Second, Table 6 shows evidence that there is a dependency between the package-side fixing release and client-side fixing release variables. However, there is no consistency across the package-side fixing release and the client-side fixing release. As highlighted in the Client patch landing row of Table 6, we find that only 21.28% of clients adopt a package patch landing as a client patch landing. Instead, clients are more likely to have client minor landings, i.e., 36.84% of clients (highlighted in red). For the case of a package major landing, 53.61% of clients remove their dependencies to avoid the vulnerability (highlighted in yellow). Most of the clients that still adopt the package major landing do so as a client major landing, i.e., around 43.18% of clients. The only consistent case that we find is the package minor landing, for which 50.40% of clients adopt the fix as a client minor landing (highlighted in green).

For the statistical evaluation, we find that there is an association between the package-side fixing release and the client-side fixing release. Table 7 shows that our null hypothesis that ‘the package-side fixing release and the client-side fixing release are independent’ is rejected (i.e., χ2 = 1,484.48, p-value < 0.001). The Cramér’s V effect size (\(\phi^{\prime}\)) is 0.37, which indicates a large level of association.
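
As an illustration of how the effect size relates to the test statistic, the sketch below computes Pearson’s χ2 and Cramér’s V for a contingency table in plain Python. The function names and the toy table are hypothetical. The final lines assume that the test covers the 5,417 clients mentioned in the discussion and that the smaller table dimension is the three package-side landing types; under those assumptions the reported χ2 yields the reported effect size.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and sample size for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


# Toy 2x2 table (illustrative counts only, not the data of Table 6).
toy_v = cramers_v([[10, 20], [20, 10]])

# Assumption: n = 5,417 clients and k = 3 package-side landing types.
reported_v = math.sqrt(1484.48 / (5417 * (3 - 1)))
print(round(toy_v, 3), round(reported_v, 2))
```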

Table 7 A result of statistical test for RQ1

(RQ2) Do lineage freshness and severity influence lags in the fix propagation?

Our results are summarized into two findings. First, Table 8 shows evidence that the lineage freshness influences lags in the propagation. As highlighted in red, we find that LL has more lags than SL in terms of days for every hop, e.g., the median lag for the first hop: 164 days > 89 days.

Table 8 A summary statistic of lags in the propagation (# days) categorized by lineage freshness to show the difference between lags in LL and SL

Second, Table 9 shows evidence that the vulnerability severity influences lags in the propagation. As highlighted in green, we find that high severity fixing releases have the least lag in every hop, e.g., the first hop: 91 days. We also find that medium severity fixes have the most lag, as highlighted in red, e.g., the first hop: 194 days.

Table 9 A summary statistic of lags in the propagation (# days) categorized by vulnerability severity to show the difference of lags between high, medium, and low severity vulnerability fixes
Table 10 A comparison of lags in the propagation between clients that adopt the latest lineage and supported lineage fixing release, i.e., by the median

For the statistical evaluation, we find that the difference between lags in the latest and supported lineages is significant (p-value < 0.001), but the association is negligible to small. Table 10 shows that our null hypothesis that ‘lags in the latest and supported lineages are the same’ is rejected in the following cases: the first hop through more than the fourth hop for medium severity; the second hop and more than the fourth hop for low severity; and the second hop for high severity.


5 Discussion

5.1 Lessons Learned

This section discusses three main implications based on our results in PQ1, PQ2, RQ1, and RQ2. These are presented as lessons learned and could have implications for both practitioners and researchers.

  1. 1.

    Release cycle matters. According to the results of PQ2, fixing commits are small, making up less than 14.28% of the commits in the package-side fixing release. We suspect that repackaging the vulnerability fix is a cause of lags. Hence, developers of vulnerable packages are recommended to release fixes as soon as they have applied them; if not, they should highlight these fixes when bundling them. Additionally, from 10 randomly selected vulnerabilities, we found that discussions between developers, i.e., GitHub issues, commits, and pull requests, did not include an explicit mention of the vulnerability. Since developers bundled the fix with other updates, developers may have been unaware of the fix. In summary, researchers should provide strategies for making the most efficient update via the release cycle. For example, (i) releasing an emergency patch for security fixes that does not introduce any new features, to maintain backward compatibility, and (ii) providing security support for an active version, i.e., a long-term-release version. Furthermore, practitioners can treat security fixes as first-class citizens, so that the vulnerability fix can travel more quickly throughout the ecosystem.

  2. 2.

    Awareness is important. According to PQ1 and RQ1, 64.50% of fixes are package patch landings. However, clients are more likely to have client major landings and client minor landings, i.e., 22.37% and 36.84%, than client patch landings, i.e., 21.28%. Security fixes need to be highlighted in the update note, as a possible reason for the failure to update is that client developers are more interested in the major features that are highlighted in an update. Recently, some open source communities have started to build tools that highlight vulnerability problems in a software ecosystem. GitHub (2017, 2020) introduced a function that uses a bot to notify clients of a new vulnerability in their dependency list. However, GitHub stated that the tool will not be able to catch everything or send the alert notification within a guaranteed time frame. Also, npm (NPM 2018a) introduced a new command, called npm audit, that lists the vulnerability information in the downloaded dependencies of clients and tries to fix them automatically. However, there are some cases in which npm audit does not work. For example, the immediate dependency does not adopt the package-side fixing release from the vulnerable package (NPM 2018b). In these cases, a manual review is required. Thus, client developers have to wait for the propagation of the fixing release, i.e., a lag exists. According to RQ1, 1,389 of 5,417 clients, i.e., 25.64%, decided to remove vulnerable dependencies to mitigate the risk of vulnerabilities. Instead of waiting for the vulnerability fix, clients might remove the vulnerable dependency if they are able to find a similar package as a replacement or do not want to use the vulnerable dependency anymore.

    From the result of PQ2, an explicit package-side fixing release that highlights the vulnerability is needed to speed up the adoption. In summary, researchers and practitioners need to provide developers with more awareness mechanisms to allow quicker planning of the update. The good news is that current initiatives like GitHub security are trending towards this.

  3. 3.

    Migration cost effort. From our first finding of RQ2, the package-side fixing release in the latest lineage suffers more lags than the one in the supported lineage in terms of days. A possible reason is that developers of clients consider a new release in the supported lineage more worthwhile to adopt than one in the latest lineage, regardless of the fix. In terms of security, according to the official documentation of npm (NPM 2016), when a security threat is identified, the following severity policy is put into action: (a) P0: Drop everything and fix!, (b) P1: High severity, schedule work within 7 days, (c) P2: Medium severity, schedule work within 30 days, (d) P3: Low severity, fix within 180 days. Surprisingly, the second finding of RQ2 shows evidence that low severity fixes are adopted more quickly than medium severity fixes. A possible reason for the quicker adoption of low severity fixes could be that the fix is easier to integrate into the application. In summary, researchers and practitioners who are package developers in npm seem to require additional time before updating their dependencies.

5.2 Threats to Validity

Internal Validity - We discuss three internal threats. The first threat is the correctness of the tools and techniques used in this study. We use the listed dependencies and version numbers as defined in the package.json meta-file. The threat is that sometimes some dependencies are not listed or the semantic version is invalid, so we applied a filter to remove clients that do not follow semantic versioning, thus making this threat minimal. The second threat is the tools used to implement our defined terminology (i.e., numpy, scipy, gitpython, and semantic-version). To mitigate this, we carefully confirmed our results by manually validating the results for RQ1 and RQ2, and then also manually validating the results against statistics on the npm website. As for existing tools that suggest the package-side fixing release, like npm audit, we found that this tool was inappropriate to use in this work for two reasons: it requires a package-lock.json file to analyze a repository, which only 2.27% of the repositories in the dataset have, and it assumes the latest information. In contrast, our work analyzes the data available in the historical snapshot. As the correctness of dependency relations depends on getting all dependencies, the final internal threat is the validity of our collected data. In this study, our ecosystem is made up of packages and clients that were affected either directly or transitively by at least a single vulnerability. We also make sure that the packages and their repositories are actually listed on the npm registry. We are confident that the results of RQ2 are not affected by invalid data.

External Validity - The main external threat is the generality of our results to other ecosystems. In this study, we focused solely on the npm JavaScript ecosystem. However, our analysis is applicable to other ecosystems that have similar package management systems, e.g., PyPI for Python, Maven for Java. Immediate future plans include studying the lags in other ecosystems. Another threat is the sample size of the analyzed data. In this study, we analyzed only 1,290 vulnerability reports with package-side fixing releases from 2,373 extracted reports from snyk.io. This small sample might not represent the population. However, we are confident in the data quality and the reduced bias, as the data for PQ1, PQ2, and RQ1 were validated by two authors following strict methods.

Construct Validity - The key threat is that there may be other factors apart from the two factors that we study, i.e., lineage freshness and vulnerability severity. These factors are based on prior studies, i.e., measuring dependency freshness from Cox et al. (2015), exploring the impact of vulnerabilities on transitive dependencies from Kikas et al. (2017), and responding to a vulnerability from NPM (2016). For future work, we would like to investigate other factors.

6 Related Work

Complementary related works are introduced throughout the paper. In this section, we discuss some key related works.

On Updating Dependencies - These studies relate to the migration of libraries to their latest versions. With new libraries and newer versions of existing libraries continuously being released, managing a system’s library dependencies is a concern on its own. As outlined in Raemaekers et al. (2012), Teyton et al. (2012), and Bogart et al. (2016), dependency management includes making cost-benefit decisions related to keeping or updating dependencies on outdated libraries. Additionally, Robbes et al. (2012), Hora et al. (2015), Sawant et al. (2016), Bavota et al. (2015), and Ihara et al. (2017) showed that updates of libraries and their APIs are slow and lagging. Decan et al. (2017) compared dependency evolution and issues across three different ecosystems. Their results showed that these ecosystems faced dependency update issues that cause problems for downstream dependencies; however, there is no perfect solution for these issues. Kula et al. (2018b) found that such update decisions are not only influenced by whether or not security vulnerabilities have been fixed and important features have been improved, but also by the amount of work required to accommodate changes in the API of a newer library version. Decan et al. (2018a) performed an empirical study of technical lag in the npm dependency network and found that packages suffer from lags if the latest releases of their dependencies are not covered by the version ranges in their package.json file. They also found that semantic versioning could be used to reduce the technical lag. Mirhosseini and Parnin (2017) studied pull request notifications for updating dependencies. Their results showed that pull request and badge notifications can reduce lags; however, developers are often overwhelmed by the large number of notifications. Another study by Abdalkareem et al. (2017) focused on trivial packages and found that they are common and popular in the npm ecosystem. They also suggested that developers should be careful about the selection of packages and how to keep them updated. Our work focuses on the lag of the fixing release adoption and propagation with the influencing factors (i.e., lineage freshness and severity). We also expand on prior works by taking transitive clients (i.e., downstream propagation) into consideration. Our work complements the findings of prior work, with the similar goal of encouraging developers to update.

On Malware and Vulnerabilities - These studies relate to security vulnerabilities within a software ecosystem from various aspects. Decan et al. (2018b) explored the impact of vulnerabilities within the npm ecosystem by analyzing the reaction time of developers of both vulnerable packages and their direct dependent packages to fix the vulnerabilities. They also considered the reactions of developers to different levels of vulnerability severity. They found that the vulnerabilities were prevalent and took several months to be fixed. Several studies also explored the impact of vulnerabilities within various ecosystems by analyzing the dependency usage (Kikas et al. 2017; Linares-Vásquez et al. 2017; Hejderup 2015; Lauinger et al. 2017). Some studies tried to characterize vulnerabilities and their fixes in ecosystems other than the npm ecosystem (Li and Paxson 2017; Piantadosi et al. 2019). One study examined the relationship between bugs and vulnerabilities and concluded that the relationship is weak (Munaiah et al. 2017). In order to increase developers’ awareness of security vulnerabilities, some studies tried to create a tool to detect a vulnerability and alert developers when it is disclosed (Cadariu et al. 2015; GitHub 2018b). Another study addressed the over-estimation problem in reporting vulnerable dependencies in open source communities and the impact of vulnerable library usage in the industry (Pashchenko et al. 2018). Additionally, some studies tried to predict the vulnerability of software systems by analyzing the source code (Shin and Williams 2008; Chowdhury and Zulkernine 2011; Alhazmi et al. 2007). Our work takes a look at the package-side fixing release and its fix at the commit-level of npm package vulnerabilities, instead of the release-level in prior studies. We also propose a set of definitions to characterize the package-side fixing release and client-side lags, which covers both direct and transitive clients. Prior works, instead, only analyze the direct clients.

Mining-related Studies - These studies relate to mining techniques for software repositories and software ecosystems. The first step of software repository mining is data collection and extraction. Researchers need to have data sources and know which part of the data they can use in their work. In the case of the npm package repository, we can extract the information of packages from the package.json meta-file (Wittern et al. 2016; Mirhosseini and Parnin 2017). In the case of security vulnerabilities, we can collect data from the Common Weakness Enumeration (CWE) (Mitre Corporation 2018b) and Common Vulnerabilities and Exposures (CVE) (Mitre Corporation 2018a) databases (Linares-Vásquez et al. 2017; Munaiah et al. 2017; Lauinger et al. 2017; Cadariu et al. 2015; Alhazmi et al. 2007; Chowdhury and Zulkernine 2011). To study the issues within the software ecosystem, we also define the traversal of the downstream clients by using the dependency list of clients. These studies introduce some techniques to model the dependency graph (Kikas et al. 2017; Hejderup 2015; Bavota et al. 2015). Our work uses similar mining techniques to extract the dependencies as well as construct the ecosystem. In our work, we manually extract and investigate the commits to understand the contents of the fix.

7 Conclusion

Security vulnerability in third-party dependencies is a growing concern for software developers, as its risk could extend to the entire software ecosystem. To ensure quick adoption and propagation of a fixing release, we conduct an empirical investigation to identify lags that may occur between the vulnerable release and its fixing release, using a case study of the npm JavaScript ecosystem. We found that the package-side fixing release is rarely released on its own, with up to 85.72% of the bundled commits in a package-side fixing release being unrelated to the fix. We then found that a quick package-side fixing release (i.e., a package patch landing) does not always ensure that a client will adopt it more quickly, with only 17.69% of clients matching a package patch landing to a client patch landing. Furthermore, factors such as the lineage freshness and the vulnerability severity have a small effect on its propagation.

In addition to these lags that we identified and characterized, this paper lays the groundwork for future research on how to mitigate these propagation lags in an ecosystem. We suggest that researchers should provide strategies for making the most efficient update via the release cycle. Practitioners also need more awareness to allow quicker planning of the update. Potential future avenues for researchers include (i) a developer survey to better understand the reasons for releasing and adopting fixes, (ii) a performance improvement plan for tools that highlight the fixing release, and (iii) a tool for managing and prioritizing the vulnerability fixing process.