
fix: safely record evaluation errors to prevent silent loop aborts #6350

Open
AftAb-25 wants to merge 2 commits into mindersec:main from AftAb-25:fix/6349-engine-loop-abort

@AftAb-25
Contributor

Description

This PR fixes a critical engine bug where a single malformed rule type causes the entire executor loop to abort abruptly, silently skipping all subsequent security evaluations for an entity.

Currently, in internal/engine/executor.go, if evaluateRule encounters an error while initializing the rule engine (GetRuleEngine) or compiling its actions (NewRuleActions), it immediately bubbles that error up to EvalEntityEvent via a bare return. That return breaks out of the profile iteration loops completely, meaning the system never evaluates any other rules and never logs the error status to the database.

Change

I refactored the error handling in evaluateRule during the rule creation phase. Instead of returning the raw error up the stack and dropping the lock, we now catch those configuration/compilation failures, record them with evalParams.SetEvalErr, and explicitly track them via e.createOrUpdateEvalStatus.

This lets the engine record the initialization failure in the DB, so administrators can see that their rule configuration is broken. More importantly, evaluateRule now returns nil, so the engine loop survives and continues evaluating the remainder of the repository's security checks.
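To illustrate the intended control flow, here is a minimal, self-contained sketch. It is not the Minder code: `rule`, `statusStore`, `buildEngine`, and `evalAll` are hypothetical stand-ins, and only the error message prefix is taken from this PR. It assumes the outer loop stops at the first non-nil error, as described above.

```go
package main

import (
	"errors"
	"fmt"
)

// rule is a hypothetical stand-in for a profile rule (not Minder's type).
type rule struct {
	name   string
	broken bool
}

// statusStore is a hypothetical stand-in for the evaluation status table.
type statusStore map[string]string

// buildEngine mimics a fallible setup step such as GetRuleEngine.
func buildEngine(r rule) error {
	if r.broken {
		return errors.New("malformed rule type")
	}
	return nil
}

// evaluateRule records a setup failure as the rule's status and returns nil,
// so the caller's loop is not interrupted by one broken rule.
func evaluateRule(r rule, store statusStore) error {
	if err := buildEngine(r); err != nil {
		store[r.name] = fmt.Sprintf("error: error creating rule type engine: %v", err)
		return nil
	}
	store[r.name] = "success"
	return nil
}

// evalAll stops at the first error returned by evaluateRule.
func evalAll(rules []rule, store statusStore) error {
	for _, r := range rules {
		if err := evaluateRule(r, store); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	store := statusStore{}
	rules := []rule{{name: "a"}, {name: "b", broken: true}, {name: "c"}}
	if err := evalAll(rules, store); err != nil {
		fmt.Println("loop aborted:", err)
	}
	for _, r := range rules {
		fmt.Printf("%s: %s\n", r.name, store[r.name])
	}
}
```

With the broken rule in the middle, the rule after it still gets a status, which is the behaviour this change is aiming for.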

Fixes #6349

Checklist

  • Code compiles cleanly
  • Includes tests for the changes (existing engine execution tests pass)
  • Documentation updated (if applicable)

@AftAb-25 AftAb-25 requested a review from a team as a code owner April 12, 2026 17:40
@coveralls

Coverage Status

coverage: 59.378% (-0.01%) from 59.39% — AftAb-25:fix/6349-engine-loop-abort into mindersec:main

Member

@evankanderson evankanderson left a comment

A failure in GetRuleEngine or NewRuleActions generally would indicate one of the following scenarios:

  1. Database unreachability. (We're kinda screwed in this case; our best option is probably to log an error)
  2. Database corruption of stored data. (e.g. from manual DB poking -- again,
  3. Failed precondition checks in the APIs. (i.e. we've accepted ruletypes that shouldn't have passed validation)

With that said, this makes sense as a backup defense, but I wouldn't characterize this as a "critical" vulnerability, since it requires that we have already corrupted the database state outside what a minder server would accept.

return fmt.Errorf("error creating rule type engine: %w", err)
evalErr := fmt.Errorf("error creating rule type engine: %w", err)
evalParams.SetEvalErr(evalErr)
logEval(ctx, inf, evalParams, "")
Member

Repeating these lines multiple times doesn't seem correct -- particularly since you also don't cover all the exit paths from this function.

Member

(A better option might be to restructure this to a defer pattern.)
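One way to read the defer suggestion is sketched below. This is an assumption about what the reviewer has in mind, not code from the PR: `evalParams`, `recorder`, the `failEngine`/`failActions` flags, and the rule-actions error message are hypothetical. A named return value lets a single deferred function handle every exit path, so the record-and-log lines are written once.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errEngine  = errors.New("bad engine config")
	errActions = errors.New("bad actions config")
)

// evalParams is a hypothetical stand-in for the engine's evaluation parameters.
type evalParams struct {
	evalErr error
}

func (p *evalParams) SetEvalErr(err error) { p.evalErr = err }

// recorder is a hypothetical stand-in for whatever persists evaluation status.
type recorder struct {
	recorded []error
}

func (r *recorder) record(p *evalParams) {
	r.recorded = append(r.recorded, p.evalErr)
}

// evaluateRule has several exit paths. The deferred function runs on all of
// them and inspects the named return value err.
func evaluateRule(p *evalParams, rec *recorder, failEngine, failActions bool) (err error) {
	defer func() {
		if err != nil {
			p.SetEvalErr(err)
			rec.record(p)
			// Assumption: once recorded, the failure should not abort the caller's loop.
			err = nil
		}
	}()

	if failEngine {
		return fmt.Errorf("error creating rule type engine: %w", errEngine)
	}
	if failActions {
		// Hypothetical message; the real wording may differ.
		return fmt.Errorf("error creating rule actions: %w", errActions)
	}
	return nil
}

func main() {
	rec := &recorder{}
	for _, c := range []struct{ engine, actions bool }{{true, false}, {false, true}, {false, false}} {
		p := &evalParams{}
		err := evaluateRule(p, rec, c.engine, c.actions)
		fmt.Printf("returned=%v recorded=%v\n", err, p.evalErr)
	}
	fmt.Println("failures recorded:", len(rec.recorded))
}
```

Whether the deferred function should clear the error at all is a separate question; dropping the `err = nil` line keeps the recording while still propagating the failure.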

logger.Err(err).Msg("error marshalling checkpoint")
var chkpjs json.RawMessage
var err error
if ingestRes := params.GetIngestResult(); ingestRes != nil {
Member

You don't need to check if ingestRes returns nil, because GetCheckpoint is implemented as:

func (r *Ingested) GetCheckpoint() *checkpoints.CheckpointEnvelopeV1 {
	if r == nil {
		return nil
	}

	return r.Checkpoint
}
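A small sketch of why the nil check is redundant, using simplified, hypothetical stand-ins (`Checkpoint`, `Ingested`, `params`, and `checkpointOf` are illustrative, not the real Minder types). In Go, a method with a pointer receiver can be called on a nil pointer, so a getter that checks its receiver can be chained directly.

```go
package main

import "fmt"

// Checkpoint is a simplified stand-in for checkpoints.CheckpointEnvelopeV1.
type Checkpoint struct {
	Version string
}

// Ingested is a simplified stand-in for the ingest result type.
type Ingested struct {
	Checkpoint *Checkpoint
}

// GetCheckpoint is safe to call on a nil *Ingested.
func (r *Ingested) GetCheckpoint() *Checkpoint {
	if r == nil {
		return nil
	}
	return r.Checkpoint
}

// params is a hypothetical holder for an optional ingest result.
type params struct {
	ingested *Ingested
}

func (p *params) GetIngestResult() *Ingested { return p.ingested }

// checkpointOf chains the getters without checking the intermediate result.
func checkpointOf(p *params) *Checkpoint {
	return p.GetIngestResult().GetCheckpoint()
}

func main() {
	empty := &params{}
	full := &params{ingested: &Ingested{Checkpoint: &Checkpoint{Version: "v1"}}}

	fmt.Println(checkpointOf(empty) == nil)
	fmt.Println(checkpointOf(full).Version)
}
```

The caller then only has to handle a nil checkpoint, not a nil ingest result as well.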

Member

@evankanderson evankanderson left a comment

I'm marking this PR as "request changes" (has two comments to be addressed, it's not clear that avoiding a panic due to failed invariants is correct) to help track which outstanding PRs need maintainer action vs contributor action.

If you don't think this PR is necessary anymore, feel free to close it.


Successfully merging this pull request may close these issues.

Bug: Engine silently aborts all profile security evaluations for an entity if a single rule type is misconfigured
