Scanner Details

CodeStax uses industry-leading security scanning engines, each specialized for a different aspect of code security.

SAST Engine

Type: Static Application Security Testing
Languages: 30+, including Python, JavaScript, TypeScript, Java, Go, Ruby, PHP, C#, Kotlin, Swift, Rust, and more

Version pinned: the rules and the engine are both pinned in the scanner container. See Reproducibility for how pinning and the AI triage cache together guarantee that the same scan produces the same score.

The SAST engine performs pattern-based static analysis to find vulnerabilities in your source code.

What It Finds

Injection flaws — SQL injection, XSS, command injection, LDAP injection
Authentication issues — Hardcoded credentials, weak password handling
Cryptographic failures — Weak algorithms, insecure random, missing encryption
Access control — Path traversal, insecure direct object references
Security misconfigurations — Debug mode enabled, verbose error handling
OWASP Top 10 — Full coverage of the most critical web application security risks

Rule Coverage

OWASP Top 10 rules
CWE (Common Weakness Enumeration) mapped rules
Language-specific security patterns

SCA Engine

Type: Software Composition Analysis
Ecosystems: npm, pip, Maven, Gradle, Go modules, Cargo, Composer, RubyGems, NuGet

The SCA engine scans your dependency manifests to find known vulnerabilities (CVEs) in third-party packages. It also runs three additional sub-scanners in the same pass:

License compliance — every package’s license, categorized by risk (HIGH/MEDIUM/LOW). Surfaces copyleft-in-permissive-project + unknown-license situations. Emitted as type: "license" findings.
Secret sub-scanner — defense-in-depth over Gitleaks; catches hardcoded secrets inside dependency source. Emitted as type: "secret" findings with redacted previews.
CVE enrichment — every CVE is cross-referenced with NVD (full advisory text), CISA KEV (actively-exploited flag and remediation deadline), and EPSS (probability of exploitation within the next 30 days, with its percentile rank). All three signals feed into the priority score.

What It Finds

Known CVEs — Vulnerabilities published in the National Vulnerability Database
Outdated packages — Dependencies with newer versions available
License issues — Dependencies with incompatible or copyleft licenses
Transitive dependencies — Vulnerabilities in packages your dependencies depend on

Supported Files

All major package manifest formats are supported.

Secrets Engine

Type: Credential and secret scanning

The Secrets engine scans your repository for accidentally committed secrets.

What It Finds

API keys — Cloud provider keys, payment service tokens, messaging platform credentials, and more
Passwords — Hardcoded passwords in configuration files
Tokens — OAuth tokens, JWTs, personal access tokens
Private keys — SSH keys, TLS private keys
Database URLs — Connection strings with embedded credentials

Smart Filtering

CodeStax applies intelligent filtering to reduce false positives, including placeholder detection, test file exclusion, and example file filtering.
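The three filters named above can be pictured with a minimal sketch. The regular expression, directory names, file suffixes, and function names below are all illustrative assumptions; the actual CodeStax heuristics are not documented on this page.

```python
import re
from pathlib import PurePosixPath

# Hypothetical heuristics for illustration only; not the real CodeStax rules.
PLACEHOLDER_RE = re.compile(
    r"(?i)(example|placeholder|changeme|your[_-]?(api[_-]?)?key|dummy|x{6,}|<[^>]+>)"
)
TEST_DIR_NAMES = {"test", "tests", "__tests__", "spec", "fixtures"}
EXAMPLE_SUFFIXES = (".example", ".sample", ".template")


def is_placeholder(value: str) -> bool:
    """True when the matched value looks like a stand-in, not a real credential."""
    return bool(PLACEHOLDER_RE.search(value))


def is_test_file(path: str) -> bool:
    """True when the file sits under a test directory or is named test_*."""
    parts = PurePosixPath(path).parts
    name = parts[-1] if parts else ""
    return any(p in TEST_DIR_NAMES for p in parts[:-1]) or name.startswith("test_")


def is_example_file(path: str) -> bool:
    """True for files such as .env.example or config.yml.sample."""
    return path.lower().endswith(EXAMPLE_SUFFIXES)


def should_report(path: str, secret_value: str) -> bool:
    """Keep a finding only if none of the three filters matches."""
    return not (
        is_placeholder(secret_value) or is_test_file(path) or is_example_file(path)
    )
```

With these assumed rules, a value containing EXAMPLE, or any match inside tests/ or .env.example, is dropped, while the same random-looking value in src/config.py is reported.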

IaC Engine

Type: Infrastructure-as-Code security

The IaC engine scans your infrastructure configuration files for security misconfigurations.

Supported Formats

Format	File Types
Terraform	`.tf`, `.tfvars`
Kubernetes	YAML manifests, Helm charts
CloudFormation	JSON/YAML templates
Dockerfile	`Dockerfile`
Docker Compose	`docker-compose.yml`
ARM Templates	Azure Resource Manager

What It Finds

Open security groups — Unrestricted inbound/outbound rules
Unencrypted storage — S3 buckets, EBS volumes without encryption
Public access — Resources exposed to the internet
Missing logging — CloudTrail, access logs not enabled
Weak IAM policies — Overly permissive roles and policies

Container Engine

Type: Dockerfile linting and security analysis

What It Finds

Insecure base images — Using latest tag, non-official images
Running as root — Missing USER directive
Package pinning — Unpinned apt-get install commands
Layer optimization — Best practices for Docker layer caching
Security best practices — COPY vs ADD, health checks, signal handling

Dockerfile Linter (Hadolint)

Type: Dockerfile best-practice linting
Runs in: Deep scans only

Scans all Dockerfiles in your repository for best-practice violations, security issues, and configuration problems. Hadolint checks against established Dockerfile conventions and flags common mistakes like running as root, missing health checks, and inefficient layer caching.

What It Finds

Running as root — Missing or incorrect USER directives
Missing health checks — No HEALTHCHECK instruction defined
Inefficient layer caching — Commands that invalidate Docker cache unnecessarily
Unpinned versions — apt-get install without version pinning
Unsafe practices — Using ADD instead of COPY, curl | bash patterns

Output

Each finding includes a Hadolint rule code (e.g., DL3006, SC2086) with a direct link to the rule documentation and remediation guidance. Findings are categorized as IaC issues in the scan results.
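As a rough illustration of how a rule code can be turned into a documentation link, the sketch below converts Hadolint's JSON output (`hadolint -f json`) into finding records. The input keys (code, file, line, message) come from Hadolint's JSON format; the output field names and the "iac" type value are assumptions, not the documented CodeStax schema.

```python
import json
from typing import Dict, List


def rule_doc_url(code: str) -> str:
    """Map a rule code to its public wiki page (DL = Hadolint, SC = ShellCheck)."""
    if code.startswith("DL"):
        return f"https://github.com/hadolint/hadolint/wiki/{code}"
    if code.startswith("SC"):
        return f"https://github.com/koalaman/shellcheck/wiki/{code}"
    raise ValueError(f"unrecognized rule code: {code}")


def to_findings(hadolint_json: str) -> List[Dict]:
    """Convert Hadolint JSON output into finding dicts (hypothetical schema)."""
    findings = []
    for item in json.loads(hadolint_json):
        findings.append(
            {
                "type": "iac",  # assumed label for "categorized as IaC issues"
                "rule": item["code"],
                "file": item["file"],
                "line": item["line"],
                "message": item["message"],
                "doc_url": rule_doc_url(item["code"]),
            }
        )
    return findings
```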

Code Quality Engines

Code quality runs alongside security in every scan. Six tools are combined into a single finding stream that feeds the A–E ratings and the SQALE tech-debt hours (see Quality Ratings).

Cyclomatic Complexity — Lizard

Type: Multi-language cyclomatic complexity
Languages: 20+, including C, C++, Java, C#, JavaScript, TypeScript, Go, Ruby, PHP, Swift, Scala, Rust, Kotlin, Lua, Objective-C, GDScript, Erlang, Solidity, TTCN-3, Fortran, and more

Flags functions where cyclomatic complexity exceeds the configured threshold (default: medium ≥ 15, high ≥ 20, critical ≥ 25). Rank A–F is reported per function (A = 1–5, F = 31+).
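The default thresholds translate into a simple lookup. The sketch below uses only the three default cut-offs stated above; the function name and the choice to return None below the medium threshold are assumptions.

```python
from typing import Optional

# Default thresholds from the text, ordered from most to least severe.
DEFAULT_THRESHOLDS = [("critical", 25), ("high", 20), ("medium", 15)]


def severity_for(ccn: int) -> Optional[str]:
    """Return the severity for a function's cyclomatic complexity, or None."""
    for severity, minimum in DEFAULT_THRESHOLDS:
        if ccn >= minimum:
            return severity
    return None  # below the medium threshold: no finding
```

For example, a function with complexity 22 maps to "high", and one with complexity 14 produces no finding under the defaults.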

Cyclomatic + Maintainability Index — Radon (Python)

Type: Python-specific complexity + maintainability
What: Radon CC (cyclomatic) + Radon MI (maintainability index 0–100, where values below 20 are flagged as high maintainability risk)

Cognitive Complexity — Python

Type: SonarSource cognitive-complexity metric (complements cyclomatic with nesting/control-flow cost)
Language: Python via cognitive_complexity package
Thresholds: medium ≥ 15, high ≥ 25, critical ≥ 35

Cognitive complexity models how humans experience code difficulty: every extra level of nesting adds to the cost of a branch, so deeply nested branches score higher than flat ones. It is a better proxy than cyclomatic complexity for “how readable is this function?”

Dead Code — Vulture (Python)

Type: Dead-code detection for Python
What it finds: Unused imports, unused variables, unused functions, unused classes, unused attributes
Confidence filter: default 80% (configurable per-org)
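To show what the confidence filter does, here is a small sketch that parses Vulture's standard text output (lines ending in "(NN% confidence)") and keeps only results at or above the threshold. The function name and the returned dict shape are assumptions; how CodeStax applies the filter internally is not described here.

```python
import re
from typing import Dict, List

# Matches Vulture's default report line, e.g.
#   app/utils.py:3: unused import 'os' (90% confidence)
LINE_RE = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+): (?P<message>.+) \((?P<confidence>\d+)% confidence\)$"
)


def filter_dead_code(report: str, min_confidence: int = 80) -> List[Dict]:
    """Keep Vulture results whose confidence is at least min_confidence."""
    kept = []
    for raw in report.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # skip anything that is not a result line
        confidence = int(match.group("confidence"))
        if confidence >= min_confidence:
            kept.append(
                {
                    "file": match.group("file"),
                    "line": int(match.group("line")),
                    "message": match.group("message"),
                    "confidence": confidence,
                }
            )
    return kept
```

Vulture can also apply the same cut-off itself through its `--min-confidence` option.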

Dead Code — knip (TS/JS)

Type: TS/JS dead-code detector
What it finds: Unused files, unused exports, unused types, unused enum members, unused class members, duplicate exports
Runs when: repo has package.json

Code Duplication — jscpd

Type: Multi-language copy-paste detector
Thresholds: min 6 lines, 50 tokens (configurable)
Summary: if total duplication exceeds 5%, a repo-level summary finding is emitted (used by the duplication_pct_max quality gate).
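The 5% rule amounts to a single comparison. In the sketch below, the percentage is assumed to be duplicated lines divided by total lines, and the finding's field names are hypothetical.

```python
from typing import Dict, Optional


def duplication_summary(
    duplicated_lines: int, total_lines: int, max_pct: float = 5.0
) -> Optional[Dict]:
    """Return a repo-level summary finding when duplication exceeds max_pct."""
    if total_lines <= 0:
        return None  # nothing scanned, nothing to report
    pct = 100.0 * duplicated_lines / total_lines
    if pct <= max_pct:
        return None  # "exceeds" is read as strictly greater than the threshold
    return {"type": "duplication_summary", "duplication_pct": round(pct, 2)}
```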

Coverage Engine

Type: Test coverage ingestion
Formats: LCOV (lcov.info), Cobertura XML (coverage.xml), JaCoCo XML (jacoco.xml), Clover XML (clover.xml)
Auto-detect paths: coverage/lcov.info, coverage.xml, jacoco.xml, clover.xml, plus common CI-artifact locations

Two paths to get coverage in:

Upload via API — POST a report to /quality/coverage/upload from your CI pipeline. 25 MB cap.
Auto-detect — scanner walks the cloned repo for known coverage-artifact filenames and parses any matches.
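For the upload path, a CI step might look roughly like the sketch below. Only the endpoint path and the 25 MB cap come from this page. The base URL, the bearer-token header, the raw-body upload, and the reading of 25 MB as 25 × 1024 × 1024 bytes are assumptions, so check the API reference before relying on them.

```python
import urllib.request

# Assumption: the 25 MB cap is interpreted as 25 MiB.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def build_coverage_upload(
    base_url: str, token: str, report: bytes, content_type: str = "text/plain"
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a coverage report."""
    if len(report) > MAX_UPLOAD_BYTES:
        raise ValueError("coverage report exceeds the 25 MB upload cap")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/quality/coverage/upload",
        data=report,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # hypothetical auth scheme
            "Content-Type": content_type,
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. Checking the size locally lets an oversized report fail in CI before any network call is made.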

Coverage drives the new_coverage_min quality gate and the “Coverage” rating (A–E), and it surfaces as coverage_gap findings on files below 60% coverage. See Coverage Ingestion and Coverage.
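To illustrate the 60% rule, the sketch below reads an LCOV tracefile and lists the files that fall under the threshold. It relies on the standard LCOV record fields (SF for the source file, LF for lines found, LH for lines hit, end_of_record as the terminator). Treating coverage as line coverage and the shape of the finding are assumptions.

```python
from typing import Dict, List


def coverage_gaps(lcov_text: str, min_pct: float = 60.0) -> List[Dict]:
    """Return coverage_gap findings for files whose line coverage is below min_pct."""
    gaps = []
    path, found, hit = None, 0, 0
    for raw in lcov_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            path, found, hit = line[3:], 0, 0  # start of a new file record
        elif line.startswith("LF:"):
            found = int(line[3:])  # instrumented lines in this file
        elif line.startswith("LH:"):
            hit = int(line[3:])  # lines executed at least once
        elif line == "end_of_record" and path is not None:
            if found > 0:
                pct = 100.0 * hit / found
                if pct < min_pct:
                    gaps.append(
                        {"type": "coverage_gap", "file": path, "coverage_pct": round(pct, 1)}
                    )
            path = None
    return gaps
```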

AI Triage

Every finding is analyzed by an LLM (GPT-4o primary, Gemini fallback, Claude second-pass) to assess exploitability, generate a fix, map the finding to compliance controls, and check cross-file reachability. The triage runs at temperature=0, and results are cached by (rule_id, code_hash, model_version, prompt_version), so a re-scan with identical input returns the identical cached output. See Reproducibility.
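The caching behaviour can be pictured with a short sketch keyed on the same four-part tuple. The choice of SHA-256, the in-memory dict, and the class and function names are assumptions for illustration; the real storage and hashing scheme are not described here.

```python
import hashlib
import json
from typing import Callable, Dict


def cache_key(rule_id: str, code: str, model_version: str, prompt_version: str) -> str:
    """Derive a stable key from (rule_id, code_hash, model_version, prompt_version)."""
    code_hash = hashlib.sha256(code.encode("utf-8")).hexdigest()
    payload = json.dumps([rule_id, code_hash, model_version, prompt_version])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TriageCache:
    """In-memory stand-in for the triage cache (hypothetical)."""

    def __init__(self, triage_fn: Callable[[str, str], Dict]):
        self._triage_fn = triage_fn  # the expensive LLM call
        self._store: Dict[str, Dict] = {}

    def triage(self, rule_id: str, code: str, model_version: str, prompt_version: str) -> Dict:
        key = cache_key(rule_id, code, model_version, prompt_version)
        if key not in self._store:
            # Cache miss: call the model once and remember the result.
            self._store[key] = self._triage_fn(rule_id, code)
        return self._store[key]
```

Because the model and prompt versions are part of the key, changing either one invalidates earlier results instead of silently reusing them.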