How to extract package.json files from a sourcemap

Sourcemaps often include `node_modules/<pkg>/package.json` as a source file — especially when the bundler does module resolution at compile time. Each one is a structured, ready-to-parse manifest: name, version, dependencies, engines, exports map, side-effects flag. Extracting them is essentially trivial JavaScript, and the resulting dataset is the closest thing to a lockfile you can derive from a deployed site.

By Mapree

6 min read · Node.js, jq, Sourcemap Explorer extension

Background

A `package.json` inside a sourcemap is a peculiar artifact. It's the literal manifest of the bundled version of a library — the same file you'd find in `node_modules/<pkg>/package.json` on the author's machine at build time. For popular packages (React, Vue, Next.js, lodash, the Radix primitives, Tailwind) it includes the `name`, `version`, `main`, `module`, `exports`, `dependencies`, `devDependencies`, `peerDependencies`, `engines`, `sideEffects`, `repository` and (on packages that ship them) `keywords` and `funding` fields. That's more information than you could reasonably extract any other way from a deployed bundle.

Extracting them is mechanical: iterate `sources[]`, filter entries whose path ends in `package.json`, find the matching index, JSON.parse the matching `sourcesContent[]`, push the result into an aggregation. The whole pipeline is 15-20 lines of Node.js or a single `jq` invocation. Doing the same thing across every bundle on a page (a typical Next.js app loads 5-30 chunks each with its own map) requires fanning the extraction out and deduping by `<name>@<version>` at the end. Sourcemap Explorer parallelises this in an offscreen worker pool and serves you the result in the popup, but the underlying primitive is dead simple and writing your own version takes an afternoon if you ever need to integrate it into a custom audit pipeline.

The data is genuinely lockfile-grade. Compare the `package.json` inside the published npm tarball for `react@18.2.0` with the `package.json` you extract from a sourcemap of a site running React 18.2.0 and the two should match byte for byte (modulo deliberate trimming by the publisher). Compare against the tarball rather than the registry's metadata response, which adds registry-side fields such as `dist`. That equivalence is what makes the extraction so useful for downstream work such as security scanning, CVE matching, compatibility-matrix generation and dependency-tree reconstruction: the inputs are the canonical published manifests for the exact versions the site bundled.

Why this matters

`package.json` files tell you not just what version of a library is running, but what it expects of its environment (`engines`), what it depends on (`dependencies`, `peerDependencies`), what shape it exposes (`exports`, `main`, `module`), and what the package author explicitly declared (`sideEffects`, `bin`, `funding`). Aggregating them across a site's sourcemaps gives you a lockfile-quality view of the bundled ecosystem: the same kind of dataset you would assemble by hand from a project's `pnpm-lock.yaml` or `yarn.lock`, except derived from a deployed bundle rather than from source-control access.

For security work the dataset is directly actionable: feed each `name@version` pair into the GitHub Advisory Database, the Snyk Vulnerability DB, or the OSS Index and you get a per-site CVE exposure report in seconds. For compatibility work the `peerDependencies` and `engines` fields tell you which React, Node or browser versions the site needs to keep around. For audit work the `dependencies` graph tells you the second-degree imports — what `react-hot-toast` brings in, what Sentry's browser SDK pulls into the bundle — without having to run a fresh install yourself.

For competitive and partnership work the published manifests reveal a lot of small choices that summary detectors miss. The `repository.url` field tells you where a package is maintained; the `funding` field tells you how its maintainers ask to be funded; the `keywords` field tells you how the package author categorises the package; the `exports` map tells you which API surface the bundler could target. Aggregated across an entire site, those small signals add up to a richer picture than the categorical 'they use React' answer that most other techniques reach.

For education the dataset is unmatched. If you are learning how a real production application is composed, having the bundled manifests of every dependency in front of you, version-aligned and byte-identical to what the developer shipped, is a kind of access that did not exist before sourcemap-aware tooling. You can read the manifests, follow the `dependencies` chain, see how a modern app is layered, and learn the architectural conventions a thousand teams have settled on.

Prerequisites

  • Node.js 18+ for the JavaScript path, or `jq` for the shell-script path. Either one is fine; pick the one your tooling already uses.
  • A sourcemap file, either downloaded with the [download-sourcemaps-from-a-website](/how-to/download-sourcemaps-from-a-website) workflow or pulled directly with `curl` from the URL referenced by the bundle's `//# sourceMappingURL=` comment or the `SourceMap` response header.
  • Optional: Sourcemap Explorer installed, which runs the extraction across every bundle on the page automatically and surfaces the result in the popup's Stack tab.
  • Optional: a CVE database or vulnerability scanner (`npm audit`, `osv-scanner`, Snyk CLI) for the next-step security work that the extracted dataset enables.

Step-by-step

  1. Load the sourcemap

    Read the `.map` file as a string and `JSON.parse` it. The result has a `version` field (always 3 for modern maps), a `sources` array (paths the bundler consumed), usually a `sourcesContent` array (the literal content of each source, in the same order as `sources`; it is optional, so some maps omit it), and a `mappings` field (encoded position information, irrelevant for this workflow). Some maps also have `names`, `file`, `sourceRoot` and other fields. You only need `sources` and `sourcesContent` for `package.json` extraction.

    import fs from 'node:fs';
    const map = JSON.parse(fs.readFileSync('bundle.js.map', 'utf8'));
    console.log('sources:', map.sources.length, 'sourcesContent:', map.sourcesContent?.length ?? 0);
  2. Filter paths ending in package.json

    Loop over `map.sources` and keep entries whose path ends in `/package.json` (or is exactly `package.json`, if the bundler used paths relative to the project root). Remember the index `i`; you'll need it to look up the matching `sourcesContent[i]` in the next step. Be careful with scoped packages: `node_modules/@radix-ui/react-dialog/package.json` is a perfectly valid scoped path and your regex needs to match it. The pattern below anchors only on the final path segment, so it does.

    const pkgIndices = [];
    for (let i = 0; i < map.sources.length; i++) {
      const src = map.sources[i];
      if (/(^|\/)package\.json$/.test(src)) pkgIndices.push(i);
    }
    console.log('package.json sources:', pkgIndices.length);
  3. Parse the matching sourcesContent

    For each kept index, `const pkg = JSON.parse(map.sourcesContent[i])`. Pull `pkg.name`, `pkg.version`, and whatever else you care about — `dependencies`, `peerDependencies`, `engines`, `sideEffects`, `repository`, `funding`. Wrap the parse in a try/catch in case some entries are stripped or malformed (rare but possible).

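    const packages = [];
    for (const i of pkgIndices) {
      const raw = map.sourcesContent?.[i];
      // A null or missing body means the path was kept but the content was stripped.
      if (typeof raw !== 'string') {
        console.warn('no content for', map.sources[i]);
        continue;
      }
      try {
        const pkg = JSON.parse(raw);
        packages.push({
          name: pkg.name,
          version: pkg.version,
          deps: Object.keys(pkg.dependencies ?? {}).length, // number of declared dependencies
          sourcePath: map.sources[i],
        });
      } catch (err) {
        // Malformed or pre-processed entry: log the path and keep going.
        console.warn('skipped', map.sources[i], err.message);
      }
    }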
  4. Build a name → version map

    Aggregate the parsed entries by package name. If two `package.json`s share a name with different versions (common when pnpm/yarn dedupe differs across chunks, or when a monorepo accidentally hoisted two Reacts), keep both: they're both real and the duplication is itself worth flagging. The simplest aggregation is `Map<name, Set<version>>`; sort by name and print at the end.

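    const inventory = new Map();
    for (const p of packages) {
      if (!inventory.has(p.name)) inventory.set(p.name, new Set());
      inventory.get(p.name).add(p.version);
    }
    // Sort explicitly by name; the default sort would stringify each [name, Set] entry.
    const byName = [...inventory].sort(([a], [b]) => String(a).localeCompare(String(b)));
    for (const [name, versions] of byName) {
      console.log(`${name}: ${[...versions].join(', ')}`);
    }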
  5. Aggregate across all chunks

    A single page typically loads 5-30 chunks, each with its own map. The full library inventory is the union of all the `package.json` extractions. Walk every map URL you collected with the [download-sourcemaps-from-a-website](/how-to/download-sourcemaps-from-a-website) workflow, run the extractor on each, and merge the results into a single `Map<name, Set<version>>`. The result is the page-level, lockfile-shaped inventory.
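The merge across chunks can be sketched as follows. This is a minimal illustration, not the extension's implementation: `extractPackages`, `mergeInventories` and the two sample chunk objects are hypothetical names and data, and the sketch assumes each map has already been fetched and passed through `JSON.parse`.

```javascript
// Extract { name, version } pairs from one parsed sourcemap object.
function extractPackages(map) {
  const found = [];
  const sources = map.sources ?? [];
  const contents = map.sourcesContent ?? [];
  for (let i = 0; i < sources.length; i++) {
    if (!/(^|\/)package\.json$/.test(sources[i])) continue;
    if (typeof contents[i] !== 'string') continue; // body stripped from the map
    try {
      const pkg = JSON.parse(contents[i]);
      // Skip stub manifests that carry no name or version.
      if (pkg && typeof pkg.name === 'string' && typeof pkg.version === 'string') {
        found.push({ name: pkg.name, version: pkg.version });
      }
    } catch {
      // Malformed entry: skip it, another chunk may still carry the manifest.
    }
  }
  return found;
}

// Merge the extractions from many maps into one Map<name, Set<version>>.
function mergeInventories(maps) {
  const inventory = new Map();
  for (const map of maps) {
    for (const { name, version } of extractPackages(map)) {
      if (!inventory.has(name)) inventory.set(name, new Set());
      inventory.get(name).add(version);
    }
  }
  return inventory;
}

// Hypothetical sample data standing in for two parsed chunk maps.
const chunkA = {
  sources: ['node_modules/react/package.json', 'src/app.js'],
  sourcesContent: ['{"name":"react","version":"18.2.0"}', 'export {}'],
};
const chunkB = {
  sources: [
    'node_modules/react/package.json',
    'node_modules/@radix-ui/react-dialog/package.json',
  ],
  sourcesContent: [
    '{"name":"react","version":"18.2.0"}',
    '{"name":"@radix-ui/react-dialog","version":"1.0.5"}',
  ],
};
const merged = mergeInventories([chunkA, chunkB]);
```

Because the values are `Set`s, a package that appears in several chunks at the same version collapses to one entry, while genuinely different versions stay side by side.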

  6. Feed the inventory into vulnerability scanning

    Convert the inventory into a synthetic lockfile such as `package-lock.json` (or any format your scanner consumes) and run `npm audit --json`, `osv-scanner --lockfile`, or your scanner of choice. Both tools read resolved versions from a lockfile, so a bare `package.json` with version ranges is not enough. The CVE matches you get back are the per-site security exposure derived purely from the deployed bundle. This is one of the highest-leverage uses of sourcemap-extracted manifests, and it depends entirely on the extraction step.

    Tip: `osv-scanner` and the GitHub Advisory database use the same OSV format and are easy to integrate into a script. The whole pipeline — sourcemap → manifests → CVEs — runs in a few seconds for a typical site and produces an actionable security audit at the end.
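As an alternative to building a synthetic lockfile, a script can query an OSV-format database directly. The sketch below only builds the request body; it assumes the public OSV.dev batch endpoint (`https://api.osv.dev/v1/querybatch`) and its `queries` payload shape, and `toOsvBatch` and the sample inventory are hypothetical names and data introduced here for illustration.

```javascript
// Turn a Map<name, Set<version>> into the body of an OSV batch query.
function toOsvBatch(inventory) {
  const queries = [];
  for (const [name, versions] of inventory) {
    for (const version of versions) {
      queries.push({ package: { name, ecosystem: 'npm' }, version });
    }
  }
  return { queries };
}

// Hypothetical inventory, shaped like the output of step 4.
const sampleInventory = new Map([
  ['react', new Set(['18.2.0'])],
  ['lodash', new Set(['4.17.20', '4.17.21'])],
]);
const body = toOsvBatch(sampleInventory);

// Sending it (not executed here; needs network access and Node.js 18+ for fetch):
// const res = await fetch('https://api.osv.dev/v1/querybatch', {
//   method: 'POST',
//   body: JSON.stringify(body),
// });
// const { results } = await res.json(); // assumed: one result per query, in order
```

Keeping the payload builder separate from the network call makes it easy to test offline and to swap in a different scanner later.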

  7. Use Sourcemap Explorer for the day-to-day case

    The extension does this step automatically on every bundle it finds, surfaces the resulting versions alongside the library entry on the Stack tab, and lets you click into any entry to see the originating `package.json`. For bulk / offline / scripted analysis, the Node.js script above is what you want; for everyday in-browser use, the extension is faster and lower-friction.

Alternative methods

One-liner with jq

`jq -r '.sources | to_entries[] | select(.value | endswith("package.json")) | .key' bundle.js.map | xargs -I{} jq -r '.sourcesContent[{}]' bundle.js.map` is the shell equivalent of the JS pipeline above. It is faster to type, but slower on large maps because `xargs` re-parses the whole map once per match, and harder to extend with follow-up steps such as checking versions against the npm registry.

Use the source-map npm package

The `source-map` npm package (Mozilla's reference implementation) parses maps lazily and exposes the same `sources` and `sourcesContent` arrays via a higher-level API. Useful when you want to combine `package.json` extraction with mapping logic (resolving runtime positions back to original sources). Overkill for this specific workflow but the right tool for adjacent ones.

Reverse-engineer from minified strings

If sourcemaps are not available, you can sometimes still find package names and versions baked into the bundle as comments or string constants — many libraries embed `/*! library v1.2.3 */` markers that survive minification. Tedious and incomplete; the sourcemap path is dramatically better when available.
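A rough sketch of the banner scan, assuming markers shaped like `/*! library v1.2.3 */` or `/** @license React v17.0.2 */`. The function name `findBannerVersions` and the regular expression are illustrative only; real bundles use many banner styles that this pattern will miss.

```javascript
// Scan bundle text for banner comments of the form "/*! name v1.2.3 */".
function findBannerVersions(bundleText) {
  // Comment opener, optional "!", optional "@license", then name and version.
  const banner =
    /\/\*+!?\s*(?:@license\s+)?([@\w][\w.\/-]*)\s+v?(\d+\.\d+\.\d+(?:-[\w.]+)?)/g;
  const hits = [];
  for (const m of bundleText.matchAll(banner)) {
    hits.push({ name: m[1], version: m[2] });
  }
  return hits;
}

// Hypothetical bundle fragment with two banners and one ordinary comment.
const sampleBundle = [
  '/*! library v1.2.3 */',
  '/** @license React v17.0.2 */',
  '/* no version here */',
].join('\n');
const hits = findBannerVersions(sampleBundle);
```

Treat the output as a hint, not an inventory: a banner proves a string was shipped, not that the matching package version is the one actually executing.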

Troubleshooting

Script finds zero `package.json` entries.

Either the bundler stripped them, or the regex is too strict. Widen to `package.json$` (case-insensitive) and log the first 20 `sources[]` entries to see the actual shape. Some webpack configurations write paths like `./node_modules/react/package.json`; some Vite configurations strip the `node_modules/` prefix entirely.
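A small diagnostic along those lines, as a sketch: `inspectSources` is a hypothetical helper name, and the sample map is made up to show a path that the strict, case-sensitive filter would miss.

```javascript
// Summarise sources[] so you can see the real path shapes before tuning the filter.
function inspectSources(map, sampleSize = 20) {
  const sources = map.sources ?? [];
  const loose = /package\.json$/i; // wider than the strict (^|\/)package\.json$ filter
  return {
    total: sources.length,
    sample: sources.slice(0, sampleSize),
    looseMatches: sources.filter((s) => typeof s === 'string' && loose.test(s)),
  };
}

// Hypothetical map whose manifest path uses unexpected casing.
const report = inspectSources({
  sources: [
    'webpack://app/./node_modules/react/Package.json',
    'webpack://app/./src/index.js',
  ],
});
console.log(report.total, report.looseMatches);
```

If `looseMatches` is still empty, the bundler most likely never recorded the manifests in the map, and no regex change will recover them.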

Multiple versions of the same package.

Monorepo or hoisting artifact. Both are real; decide whether to report both, the semver-max, or investigate which chunks use which. Two Reacts in the bundle is both a performance and a correctness problem and is worth flagging in any audit.
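To surface duplicates and a candidate semver-max, something like the sketch below works on the `Map<name, Set<version>>` inventory. Note the loud simplification: `compareCore` (a hypothetical helper) compares only the numeric major.minor.patch and ignores pre-release and build suffixes, so it is not a full semver comparison; use a dedicated semver library if that matters.

```javascript
// Compare two versions by numeric major.minor.patch only (assumption: suffixes ignored).
function compareCore(a, b) {
  const pa = a.split(/[-+]/)[0].split('.').map(Number);
  const pb = b.split(/[-+]/)[0].split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// List every package that appears at more than one version.
function findDuplicates(inventory) {
  const dupes = [];
  for (const [name, versions] of inventory) {
    if (versions.size < 2) continue;
    const sorted = [...versions].sort(compareCore);
    dupes.push({ name, versions: sorted, highest: sorted[sorted.length - 1] });
  }
  return dupes;
}

// Hypothetical inventory with one duplicated package.
const dupes = findDuplicates(new Map([
  ['react', new Set(['18.2.0', '17.0.2'])],
  ['lodash', new Set(['4.17.21'])],
]));
```

Reporting the full sorted list alongside `highest` keeps the audit honest: the reader can see both versions are present rather than only the newest one.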

`sourcesContent[i]` is `null` for a kept index.

The bundler kept the path in `sources[]` but stripped the body from `sourcesContent[]` (a webpack optimization). The package is confirmed present; you just cannot read the version this way.
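Even without the body, the path usually gives you the package name. A sketch, assuming the path contains a `node_modules/` segment; `packageNameFromPath` and the sample map are hypothetical, and the version stays unknown in this case.

```javascript
// Derive a package name from a source path, handling scoped packages.
function packageNameFromPath(sourcePath) {
  const marker = 'node_modules/';
  // Use the last node_modules/ segment so nested and pnpm-style layouts resolve.
  const at = sourcePath.lastIndexOf(marker);
  if (at === -1) return null;
  const parts = sourcePath.slice(at + marker.length).split('/');
  if (parts[0].startsWith('@')) {
    return parts.length >= 2 ? `${parts[0]}/${parts[1]}` : null;
  }
  return parts[0] || null;
}

// Hypothetical map where every manifest body was stripped.
const strippedMap = {
  sources: [
    'webpack://app/./node_modules/react/package.json',
    'node_modules/@radix-ui/react-dialog/package.json',
    'src/index.js',
  ],
  sourcesContent: [null, null, 'export {}'],
};
const nameOnly = strippedMap.sources
  .filter((src, i) => /(^|\/)package\.json$/.test(src) && strippedMap.sourcesContent[i] == null)
  .map(packageNameFromPath);
```

Keep these name-only entries in a separate list from the versioned inventory so that downstream scanners are never fed a guessed version.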

JSON.parse throws on a kept entry.

Rare, but happens when a custom bundler plugin pre-processes the file. Wrap in try/catch and log the offending source path; you can usually inspect it manually.

Some entries have a `version` like `0.0.0-internal`.

Internal monorepo package with a placeholder version. A lookup of that version on the public npm registry will return 404. Filter or flag separately depending on whether you care about internal packages in the inventory.

What to do next

`package.json` data pairs naturally with the [reconstruct-source-code-from-a-sourcemap](/how-to/reconstruct-source-code-from-a-sourcemap) guide: combined, you get a full browsable project tree with a lockfile-shaped manifest of its bundled dependencies. For per-version scanning across a single site without writing a script, [find-the-exact-version-of-an-npm-package-on-a-site](/how-to/find-the-exact-version-of-an-npm-package-on-a-site) is the lighter-weight workflow. For the broader 'what is in the bundle' question, [see-every-javascript-library-a-site-uses](/how-to/see-every-javascript-library-a-site-uses) is the natural next step.

For security workflows the extracted dataset feeds directly into vulnerability scanning. Running `osv-scanner --lockfile` against a synthetic `package-lock.json` is the simplest pipeline; `npm audit` against the same synthetic lockfile is another option; commercial scanners (Snyk, GitHub Advanced Security, Sonatype) all consume similar inputs.

For competitive analysis the dataset feeds into stack-comparison work; the [Wappalyzer alternatives](/alternatives/wappalyzer) page contrasts what fingerprint-based detectors miss against what sourcemap-derived inventory captures, and the [Next.js detection page](/detect/nextjs) shows how the version-level reading folds back into framework-aware analysis.

FAQ

Why would a bundler include `package.json` at all?

Bundlers read `package.json` at resolution time (for `main`, `module`, `exports`, `sideEffects` fields). When sourcemap generation captures the full module graph, those reads end up as `sources[]` entries. Whether the map inlines their content is a bundler configuration choice — most modern bundlers default to inlining because the disk-vs-bandwidth tradeoff favours larger maps with richer source attribution.

Do CSS sourcemaps have package.json files too?

Occasionally. PostCSS, Sass and Lightning CSS pipelines sometimes include their bundled package's `package.json`. Less common than in JS maps but not unheard of. Tailwind v4's CSS sourcemap usually includes `node_modules/tailwindcss/package.json`, which is how Sourcemap Explorer can read the exact Tailwind version from the CSS layer.

Can I trust the `dependencies` field inside an extracted package.json?

Yes — it is the literal manifest the bundler resolved against. Treat it as authoritative for that specific bundled version of the package. The `dependencies` are what the package declared at publish time; the actual transitively-bundled set may differ slightly because of dedupe and hoisting decisions made by the install-time tool (npm/yarn/pnpm).
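One way to see the gap between declared and bundled dependencies is to compare each manifest's `dependencies` keys against the set of package names you actually extracted. The sketch below assumes you kept the full parsed manifests rather than only name and version; `declaredButNotBundled` and the sample manifests are hypothetical.

```javascript
// Map each declared dependency that has no extracted manifest to the packages declaring it.
function declaredButNotBundled(manifests) {
  const bundled = new Set(manifests.map((m) => m.name));
  const missing = new Map();
  for (const m of manifests) {
    for (const dep of Object.keys(m.dependencies ?? {})) {
      if (bundled.has(dep)) continue;
      if (!missing.has(dep)) missing.set(dep, []);
      missing.get(dep).push(m.name);
    }
  }
  return missing;
}

// Hypothetical extracted manifests (illustrative values).
const manifests = [
  {
    name: 'react-dom',
    version: '18.2.0',
    dependencies: { 'loose-envify': '^1.1.0', scheduler: '^0.23.0' },
  },
  {
    name: 'scheduler',
    version: '0.23.0',
    dependencies: { 'loose-envify': '^1.1.0' },
  },
];
const missing = declaredButNotBundled(manifests);
```

An entry in `missing` does not prove the dependency is absent from the bundle. It may have been tree-shaken away, or the bundler may simply not have recorded its `package.json` in the map.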

How does this relate to the Software Bill of Materials (SBOM) that compliance teams ask for?

The extracted inventory is structurally the same kind of data an SBOM contains — package names, versions, sometimes licenses (which are also in `package.json`). Sourcemap-derived inventories are not a substitute for build-time SBOM generation (which has access to dev-dependencies, exact transitive trees, and license-scan results), but they are the closest you can get to SBOM-shape data without source-tree access, which makes them useful for vendor-risk evaluation and third-party audits.
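If a downstream tool wants SBOM-style identifiers, the inventory converts readily to package URLs (purls). This is a sketch under the assumption that the consumer accepts the `pkg:npm/...` form with the scope's leading `@` percent-encoded; `toPurls` is a hypothetical helper, and a real SBOM document needs more fields than this list provides.

```javascript
// Convert a Map<name, Set<version>> into a sorted list of npm package URLs.
function toPurls(inventory) {
  const purls = [];
  for (const [name, versions] of inventory) {
    // Scoped names keep their slash; only the leading "@" is percent-encoded.
    const path = name.startsWith('@') ? name.replace('@', '%40') : name;
    for (const version of versions) {
      purls.push(`pkg:npm/${path}@${encodeURIComponent(version)}`);
    }
  }
  return purls.sort();
}

// Hypothetical inventory with one scoped and one unscoped package.
const purls = toPurls(new Map([
  ['react', new Set(['18.2.0'])],
  ['@radix-ui/react-dialog', new Set(['1.0.5'])],
]));
console.log(purls.join('\n'));
```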

What if the version field is a pre-release tag like `0.0.0-canary-abc123`?

Real value, just an unusual one: the team is shipping a release candidate or a Vercel-style canary build. The npm registry will have the version published if the tag is public; if the tag is private (internal monorepo build), a registry lookup for that version will return 404. Both are legitimate outcomes; flag the canary build in your audit notes if it matters.

Can I extract package.json from a sourcemap that's hosted but not directly downloadable?

If the browser can fetch it, you can. Some sites set CORS headers that block cross-origin map fetches; the workaround is to use the browser DevTools 'Save as' on the map URL once it loads, or to run the extraction inside a browser extension (which is what Sourcemap Explorer does — it has the same fetch privileges as the page).

How big does the extraction scale to?

We routinely extract from gigabyte-sized site dumps inside the offscreen worker pool without choking. The bottleneck is JSON.parse over very large maps (~200 MB+); for those, streaming parsers (`stream-json`, the Node `stream-chain` patterns) help significantly.
