Class: Detector
Defined in: packages/detector/src/detector.ts:183
Orchestrates multi-stage detection of informational external dependencies inside a local repository clone.
Remarks
Detection follows a hybrid pipeline:
- Programmatic parsing of README files, documentation, package metadata, and code comments extracts candidate URLs.
- An LLM second-pass (up to 5 README files, capped at 5 000 chars each) enriches the candidate set with URLs that plain text parsing missed.
- Programmatic heuristics classify each URL's DependencyType and AccessMethod; an LLM fallback is used only when heuristics fail.
- Low-confidence entries (below 0.5) are discarded before returning.
The detector does not write to disk or mutate any manifest file — callers are responsible for merging the returned DetectionResult into an existing manifest via @dependabit/manifest.
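The documented bounds and the final confidence filter can be sketched in a few lines. This is a minimal illustration, not the detector's actual code: the `Candidate` shape, the function names, and the constant names are assumptions made for the example. Only the numeric limits (5 README files, 5 000 characters each, 0.5 confidence) come from the remarks above.

```typescript
// Hypothetical shape and names for illustration; only the numeric limits
// are taken from the remarks above.
interface Candidate {
  url: string;
  confidence: number; // expected range: 0 to 1
}

const MAX_README_FILES = 5;
const MAX_README_CHARS = 5000;
const MIN_CONFIDENCE = 0.5;

// Keep at most 5 READMEs and truncate each one to 5 000 characters,
// mirroring the documented bounds on the LLM second pass.
function boundReadmes(readmes: string[]): string[] {
  return readmes
    .slice(0, MAX_README_FILES)
    .map((text) => text.slice(0, MAX_README_CHARS));
}

// Discard entries whose confidence is below 0.5 (0.5 itself is kept).
function dropLowConfidence(candidates: Candidate[]): Candidate[] {
  return candidates.filter((c) => c.confidence >= MIN_CONFIDENCE);
}
```

Keeping both steps as pure functions makes the limits easy to test in isolation, whatever the real pipeline does around them.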
Use When
Scanning a freshly-cloned or locally-checked-out repository to build an initial manifest, or during CI to detect newly-introduced dependencies from a commit diff.
Avoid When
- The repository is very large (> 10 000 source files) — source file scanning is hard-capped at 50 files per run.
- You need deterministic, reproducible output across model versions. LLM classifications are non-deterministic even with temperature: 0.
Pitfalls
- LLM output format instability: the detector parses raw JSON from the LLM response; a model update that changes the output schema will silently produce zero LLM-sourced results rather than throwing. Pin the model version in DetectorOptions.llmProvider when reproducibility matters.
- Non-determinism: identical inputs across two runs may produce different dependencies arrays if LLM classification is involved. Never diff two manifests by dependency count alone.
- Token budget exhaustion: README files longer than 5 000 characters are truncated to that length before being sent to the LLM. URLs that appear only in the cut-off portion will not be discovered by the LLM pass (they may still be found by the programmatic parser).
- Source file cap: only the first 50 source files returned by the directory traversal are scanned for code-comment references. Repositories with many source files may have incomplete coverage.
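To make the non-determinism pitfall concrete, here is a minimal sketch of comparing two runs by URL rather than by count. The `DependencyLike` shape and the `diffByUrl` helper are hypothetical; the real DetectionResult entries may carry more fields and a different identity key.

```typescript
// Hypothetical minimal shape; real dependency entries likely have more fields.
interface DependencyLike {
  url: string;
}

interface UrlDiff {
  added: string[];
  removed: string[];
}

// Compare two runs by the set of URLs they contain, not by array length.
function diffByUrl(before: DependencyLike[], after: DependencyLike[]): UrlDiff {
  const beforeUrls = new Set(before.map((d) => d.url));
  const afterUrls = new Set(after.map((d) => d.url));
  return {
    added: Array.from(afterUrls).filter((u) => !beforeUrls.has(u)).sort(),
    removed: Array.from(beforeUrls).filter((u) => !afterUrls.has(u)).sort(),
  };
}
```

Two runs that each report two dependencies can still differ in which URLs they found; a count comparison would report no change, while this diff surfaces both the added and the removed URL.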
Example
import { Detector, GitHubCopilotProvider } from '@dependabit/detector';

const detector = new Detector({
  repoPath: '/path/to/repo',
  llmProvider: new GitHubCopilotProvider({ model: 'gpt-4o' }),
  repoOwner: 'my-org',
  repoName: 'my-repo',
});
const result = await detector.detectDependencies();
console.log(`Found ${result.dependencies.length} dependencies`);
Constructors
Constructor
new Detector(options): Detector;
Defined in: packages/detector/src/detector.ts:189
Parameters
| Parameter | Type |
|---|---|
| options | DetectorOptions |
Returns
Detector
Methods
analyzeFiles()
analyzeFiles(filePaths): Promise<DetectionResult>;
Defined in: packages/detector/src/detector.ts:1003
Analyzes a specific list of files for dependencies rather than scanning the entire repository. Prefer this over Detector.detectDependencies when only a handful of files changed (e.g., in a pull-request diff).
Parameters
| Parameter | Type | Description |
|---|---|---|
| filePaths | string[] | Absolute or repoPath-relative file paths to analyze. Paths outside the repository root are silently skipped to prevent directory-traversal attacks. |
Returns
Promise<DetectionResult>
Detected dependencies and diagnostic statistics.
Remarks
Unlike detectDependencies, this method does NOT perform an LLM second pass on README files. Classification still falls back to the LLM when programmatic heuristics cannot determine a dependency type.
Use When
Incremental manifest updates after a commit or pull request — pass the list of changed files from the diff parser.
Avoid When
Running a full initial scan — use Detector.detectDependencies instead, which also performs an LLM enrichment pass.
Pitfalls
- Files outside repoPath are silently skipped without error. Callers that pass paths resolving outside the repository root (for example via ../) get no results for those paths.
- File read errors are logged with console.warn and skipped, not thrown; a broken file system will produce partial results without surfacing an error.
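Because out-of-root paths are dropped silently, a caller may want to detect them before calling analyzeFiles. The sketch below shows one common containment check built on Node's `path` module. It is an assumption about how such a check can be written, not the detector's own implementation, and `isInsideRepo` and `partitionPaths` are hypothetical helper names.

```typescript
import * as path from "node:path";

// True when filePath (absolute, or relative to repoPath) resolves to a
// location strictly inside repoPath. The root itself is rejected because
// it is a directory, not a file to analyze.
function isInsideRepo(repoPath: string, filePath: string): boolean {
  const root = path.resolve(repoPath);
  const target = path.resolve(root, filePath);
  const rel = path.relative(root, target);
  if (rel === "" || path.isAbsolute(rel)) return false;
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}

// Split the input so the caller can report paths that would be skipped.
function partitionPaths(
  repoPath: string,
  filePaths: string[],
): { inside: string[]; outside: string[] } {
  const inside: string[] = [];
  const outside: string[] = [];
  for (const p of filePaths) {
    (isInsideRepo(repoPath, p) ? inside : outside).push(p);
  }
  return { inside, outside };
}
```

A caller could log or fail on a non-empty `outside` list and pass only `inside` to analyzeFiles, which turns the silent skip into an explicit signal.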
detectDependencies()
detectDependencies(): Promise<DetectionResult>;
Defined in: packages/detector/src/detector.ts:253
Performs a full-repository scan and returns all detected informational dependencies as a DetectionResult.
Returns
Promise<DetectionResult>
Detected dependencies and diagnostic statistics.
Remarks
The scan is bounded: README files are capped at 5 for LLM analysis, source files at 50 for code-comment parsing. Results are non-deterministic when LLM classification is involved.
Throws
File-system errors bubble up to the caller. Individual LLM failures are caught internally, logged, and skipped; an error from the LLM provider's analyze call propagates only if it escapes those internal try-catch blocks.
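Since some errors can still reach the caller, a scheduled job may prefer to convert a rejection into a value rather than crash. The following is a minimal sketch under that assumption; `DetectorLike`, `ScanOutcome`, and `scanSafely` are hypothetical names, and the structural interface stands in for the real Detector class.

```typescript
// Structural stand-in for any object exposing detectDependencies().
interface DetectorLike<T> {
  detectDependencies(): Promise<T>;
}

type ScanOutcome<T> =
  | { ok: true; result: T }
  | { ok: false; error: Error };

// Run a full scan and report failure as a value instead of a rejection.
async function scanSafely<T>(detector: DetectorLike<T>): Promise<ScanOutcome<T>> {
  try {
    return { ok: true, result: await detector.detectDependencies() };
  } catch (err) {
    // Normalize non-Error throwables so callers can always read .message.
    const error = err instanceof Error ? err : new Error(String(err));
    return { ok: false, error };
  }
}
```

Returning a tagged outcome keeps the decision about retrying, alerting, or skipping the run with the caller.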
Use When
Building an initial manifest for a repository or running a full refresh on a schedule.
Avoid When
Only a small subset of files changed — prefer Detector.analyzeFiles for incremental updates to avoid unnecessary LLM calls.
Pitfalls
- Results are not cached between calls; calling detectDependencies twice on the same instance makes duplicate LLM calls.
- The method does not deduplicate against an existing manifest; callers must use mergeManifests from @dependabit/manifest to avoid duplicate entries.
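One way for a caller to avoid the duplicate LLM calls mentioned in the first pitfall is to memoize the scan promise. This is a hedged sketch, not part of the package: `DetectorLike` and `cacheScan` are hypothetical names, and the cache is only appropriate while the repository contents do not change between calls.

```typescript
// Structural stand-in for any object exposing detectDependencies().
interface DetectorLike<T> {
  detectDependencies(): Promise<T>;
}

// Wrap a detector so repeated scans reuse the first in-flight or completed
// promise. Failed scans are not cached, so the next call retries.
function cacheScan<T>(detector: DetectorLike<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = detector.detectDependencies().catch((err) => {
        cached = undefined; // forget the failure so a later call can retry
        throw err;
      });
    }
    return cached;
  };
}
```

Caching the promise rather than the resolved value also covers concurrent callers: a second call made while the first scan is still running shares the same in-flight work.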