Class: Detector

Defined in: packages/detector/src/detector.ts:183

Orchestrates multi-stage detection of informational external dependencies inside a local repository clone.

Remarks

Detection follows a hybrid pipeline:

  1. Programmatic parsing of README files, documentation, package metadata, and code comments extracts candidate URLs.
  2. An LLM second pass (up to 5 README files, each capped at 5 000 characters) enriches the candidate set with URLs that plain-text parsing missed.
  3. Programmatic heuristics classify each URL's DependencyType and AccessMethod; an LLM fallback is used only when heuristics fail.
  4. Low-confidence entries (below 0.5) are discarded before returning.
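The final filtering step can be pictured with a minimal sketch. The `CandidateDependency` shape and the `dropLowConfidence` helper are hypothetical names introduced here for illustration; only the 0.5 threshold comes from the description above, and the detector's real internal types may differ.

```ts
// Hypothetical entry shape; the real DetectionResult entries may carry more fields.
interface CandidateDependency {
  url: string;
  confidence: number; // assumed to lie in the range 0..1
}

// Threshold stated in the pipeline description (entries below 0.5 are discarded).
const MIN_CONFIDENCE = 0.5;

// Keep only entries whose confidence is at or above the threshold.
function dropLowConfidence(entries: CandidateDependency[]): CandidateDependency[] {
  return entries.filter((entry) => entry.confidence >= MIN_CONFIDENCE);
}

const kept = dropLowConfidence([
  { url: 'https://example.com/spec', confidence: 0.9 },
  { url: 'https://example.com/blog', confidence: 0.3 },
  { url: 'https://example.com/api', confidence: 0.5 },
]);
```

Because "below 0.5" is the stated cut-off, an entry at exactly 0.5 is kept in this sketch.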

The detector does not write to disk or mutate any manifest file — callers are responsible for merging the returned DetectionResult into an existing manifest via @dependabit/manifest.

Use When

Scanning a freshly-cloned or locally-checked-out repository to build an initial manifest, or during CI to detect newly-introduced dependencies from a commit diff.

Avoid When

  • The repository is very large (> 10 000 source files) — source file scanning is hard-capped at 50 files per run.
  • You need deterministic, reproducible output across model versions — LLM classifications are non-deterministic even with temperature: 0.

Pitfalls

  • LLM output format instability: the detector parses raw JSON from the LLM response; a model update that changes the output schema will silently produce zero LLM-sourced results rather than throwing. Pin the model version in DetectorOptions.llmProvider when reproducibility matters.
  • Non-determinism: identical inputs across two runs may produce different dependencies arrays if LLM classification is involved. Never diff two manifests by dependency count alone.
  • Token budget exhaustion: README files longer than 5 000 characters are truncated to that length before being sent to the LLM. URLs that appear only in the cut-off portion will not be discovered by the LLM pass (they may still be found by the programmatic parser).
  • Source file cap: only the first 50 source files returned by the directory traversal are scanned for code-comment references. Repositories with many source files may have incomplete coverage.
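To illustrate the non-determinism pitfall, here is a minimal sketch of comparing two runs by URL rather than by count. The `DependencyEntry` shape (with a `url` field) and the `diffByUrl` helper are assumptions made for this example and are not part of the documented API.

```ts
// Hypothetical entry shape: only a `url` field is assumed.
interface DependencyEntry {
  url: string;
}

// Compare two dependency lists by URL and report what was added and removed.
function diffByUrl(
  before: DependencyEntry[],
  after: DependencyEntry[],
): { added: string[]; removed: string[] } {
  const beforeUrls = new Set(before.map((d) => d.url));
  const afterUrls = new Set(after.map((d) => d.url));
  return {
    added: Array.from(afterUrls).filter((u) => !beforeUrls.has(u)).sort(),
    removed: Array.from(beforeUrls).filter((u) => !afterUrls.has(u)).sort(),
  };
}

// Two runs with the same count but different contents.
const runA = [{ url: 'https://example.com/a' }, { url: 'https://example.com/b' }];
const runB = [{ url: 'https://example.com/a' }, { url: 'https://example.com/c' }];
const delta = diffByUrl(runA, runB);
```

Both runs contain two entries, so a count comparison would report no change, while the URL-level diff shows one addition and one removal.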

Example

```ts
import { Detector, GitHubCopilotProvider } from '@dependabit/detector';

const detector = new Detector({
  repoPath: '/path/to/repo',
  llmProvider: new GitHubCopilotProvider({ model: 'gpt-4o' }),
  repoOwner: 'my-org',
  repoName: 'my-repo',
});

const result = await detector.detectDependencies();
console.log(`Found ${result.dependencies.length} dependencies`);
```

Constructors

Constructor

```ts
new Detector(options): Detector;
```

Defined in: packages/detector/src/detector.ts:189

Parameters

| Parameter | Type |
| --- | --- |
| options | DetectorOptions |

Returns

Detector

Methods

analyzeFiles()

```ts
analyzeFiles(filePaths): Promise<DetectionResult>;
```

Defined in: packages/detector/src/detector.ts:1003

Analyzes a specific list of files for dependencies rather than scanning the entire repository. Prefer this over Detector.detectDependencies when only a handful of files changed (e.g., in a pull-request diff).

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| filePaths | string[] | Absolute or repoPath-relative file paths to analyze. Paths outside the repository root are silently skipped to prevent directory-traversal attacks. |

Returns

Promise<DetectionResult>

Detected dependencies and diagnostic statistics.

Remarks

Unlike detectDependencies, this method does NOT perform an LLM second pass on README files. Classification still falls back to the LLM when programmatic heuristics cannot determine a dependency type.

Use When

Incremental manifest updates after a commit or pull request — pass the list of changed files from the diff parser.

Avoid When

Running a full initial scan — use Detector.detectDependencies instead, which also performs an LLM enrichment pass.

Pitfalls

  • Files outside repoPath are silently skipped without error. Callers relying on path-traversal behaviour will get empty results.
  • File read errors are logged with console.warn and skipped, not thrown; a broken file system will produce partial results without surfacing an error.
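Because out-of-repository paths are skipped silently, callers may want to check their input up front. The following is a minimal sketch of such a containment check using Node's `path` module. The `isInsideRepo` helper is hypothetical and is not claimed to match the detector's internal check.

```ts
import * as path from 'node:path';

// Return true when filePath (absolute or relative to repoPath) resolves to a
// location strictly inside repoPath. Hypothetical helper, for illustration only.
function isInsideRepo(repoPath: string, filePath: string): boolean {
  const root = path.resolve(repoPath);
  // An absolute filePath overrides root; a relative one is resolved against it.
  const target = path.resolve(root, filePath);
  const relative = path.relative(root, target);
  const escapesRoot = relative === '..' || relative.startsWith('..' + path.sep);
  // An empty relative path means the target is the repository root itself.
  return relative !== '' && !escapesRoot && !path.isAbsolute(relative);
}

const candidates = ['src/index.ts', '../etc/passwd', '/etc/passwd', '/repo/docs/README.md'];
const accepted = candidates.filter((p) => isInsideRepo('/repo', p));
```

A caller could log or reject the paths that fail this check before calling analyzeFiles, so that an unexpectedly empty result is easier to diagnose.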

detectDependencies()

```ts
detectDependencies(): Promise<DetectionResult>;
```

Defined in: packages/detector/src/detector.ts:253

Performs a full-repository scan and returns all detected informational dependencies as a DetectionResult.

Returns

Promise<DetectionResult>

Detected dependencies and diagnostic statistics.

Remarks

The scan is bounded: README files are capped at 5 for LLM analysis, source files at 50 for code-comment parsing. Results are non-deterministic when LLM classification is involved.
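To see how the per-README character cap affects LLM coverage, the sketch below lists URLs that occur only past the cut-off. The `urlsBeyondCutoff` helper and its URL regular expression are assumptions for illustration. It also assumes the 5 000-character limit is measured in JavaScript string length, which the documentation does not state.

```ts
// Character cap per README, as stated in the remarks above.
const README_CHAR_LIMIT = 5000;

// Return URLs that never appear in full within the first `limit` characters.
// Hypothetical helper; the URL pattern is a deliberately simple approximation.
function urlsBeyondCutoff(readme: string, limit: number = README_CHAR_LIMIT): string[] {
  const urlPattern = /https?:\/\/[^\s)>\]"']+/g;
  const visible = new Set<string>();
  const all = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = urlPattern.exec(readme)) !== null) {
    all.add(match[0]);
    // A URL counts as visible only if it ends at or before the cut-off.
    if (match.index + match[0].length <= limit) {
      visible.add(match[0]);
    }
  }
  return Array.from(all).filter((url) => !visible.has(url));
}

const longReadme =
  'See https://a.example/docs ' + 'x'.repeat(5000) + ' https://b.example/spec';
const missedByLlm = urlsBeyondCutoff(longReadme);
```

URLs reported here would have to be found by the programmatic parser, since the LLM pass would never see them.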

Throws

If a file-system error occurs during the scan, or if an error from the LLM provider's analyze call escapes the internal try-catch blocks. Individual LLM failures are logged and skipped rather than thrown.

Use When

Building an initial manifest for a repository or running a full refresh on a schedule.

Avoid When

Only a small subset of files changed — prefer Detector.analyzeFiles for incremental updates to avoid unnecessary LLM calls.

Pitfalls

  • Results are not cached between calls; calling detectDependencies twice on the same instance makes duplicate LLM calls.
  • The method does not deduplicate against an existing manifest; callers must use mergeManifests from @dependabit/manifest to avoid duplicate entries.
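Since results are not cached between calls, a caller that needs the same scan in several places could memoize it. The sketch below is one possible approach. `DetectorLike`, `DetectionResultLike`, and `cacheFullScan` are hypothetical names standing in for the real types, and the stub detector exists only to make the example self-contained.

```ts
// Minimal structural stand-ins for the real types (assumptions, not the actual API surface).
interface DetectionResultLike {
  dependencies: { url: string }[];
}
interface DetectorLike {
  detectDependencies(): Promise<DetectionResultLike>;
}

// Wrap a detector so that repeated calls share one in-flight or completed scan.
function cacheFullScan(detector: DetectorLike): () => Promise<DetectionResultLike> {
  let pending: Promise<DetectionResultLike> | undefined;
  return () => {
    if (pending === undefined) {
      pending = detector.detectDependencies().catch((error: unknown) => {
        pending = undefined; // forget failed scans so the next call retries
        throw error;
      });
    }
    return pending;
  };
}

// Stub detector that counts how often a scan is started.
let scanCalls = 0;
const stubDetector: DetectorLike = {
  detectDependencies: async () => {
    scanCalls += 1;
    return { dependencies: [] };
  },
};

const scanOnce = cacheFullScan(stubDetector);
const firstScan = scanOnce();
const secondScan = scanOnce();
```

The cached result goes stale as soon as the repository changes, so a wrapper like this only suits callers that reuse one scan within a single run.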

Released under the MIT License.