fix: safely record evaluation errors to prevent silent loop aborts #6350
fix: safely record evaluation errors to prevent silent loop aborts #6350AftAb-25 wants to merge 2 commits intomindersec:mainfrom
Conversation
… configs instead of aborting
evankanderson
left a comment
There was a problem hiding this comment.
A failure in GetRuleEngine or NewRuleActions generally would indicate one of the following scenarios:
- Database unreachability. (We're kinda screwed in this case; our best option is probably to log an error)
- Database corruption of stored data. (e.g. from manual DB poking -- again,
- Failed precondition checks in the APIs. (i.e. we've accepted ruletypes that shouldn't have passed validation)
With that said, this makes sense as a backup defense, but I wouldn't characterize this as a "critical" vulnerability, since it requires that we have already corrupted the database state outside what a minder server would accept.
| return fmt.Errorf("error creating rule type engine: %w", err) | ||
| evalErr := fmt.Errorf("error creating rule type engine: %w", err) | ||
| evalParams.SetEvalErr(evalErr) | ||
| logEval(ctx, inf, evalParams, "") |
There was a problem hiding this comment.
Repeating these lines multiple times doesn't seem correct -- particularly since you also don't cover all the exit paths from this function.
There was a problem hiding this comment.
(A better option might be to restructure this to a defer pattern.)
| logger.Err(err).Msg("error marshalling checkpoint") | ||
| var chkpjs json.RawMessage | ||
| var err error | ||
| if ingestRes := params.GetIngestResult(); ingestRes != nil { |
There was a problem hiding this comment.
You don't need to check if ingestRes returns nil, because GetCheckpoint is implemented as:
func (r *Ingested) GetCheckpoint() *checkpoints.CheckpointEnvelopeV1 {
if r == nil {
return nil
}
return r.Checkpoint
}
evankanderson
left a comment
There was a problem hiding this comment.
I'm marking this PR as "request changes" (has two comments to be addressed, it's not clear that avoiding a panic due to failed invariants is correct) to help track which outstanding PRs need maintainer action vs contributor action.
If you don't think this PR is necessary anymore, feel free to close it.
Description
This PR fixes a critical engine bug where a single malformed rule type would cause the entire executor loop to abruptly crash, silently skipping all subsequent security evaluations for an entity.
Currently, in internal/engine/executor.go, if evaluateRule encounters an error while initializing the rule engine (
GetRuleEngine) or compiling its actions (NewRuleActions), it immediately bubbles that error up to EvalEntityEvent via a barereturn. That return breaks out of the profile iteration loops completely, meaning the system never evaluates any other rules and never logs the error status to the database.Change
I refactored the error handling logic in evaluateRule during the rule creation phase. Instead of returning the raw error up the stack and dropping the lock, we now catch those configuration/compilation failures, load them gracefully into
evalParams.SetEvalErr, and explicitly track them viae.createOrUpdateEvalStatus.This allows the engine to record the initialization failure state in the DB so administrators actually know their rule configuration is broken, and more importantly, allows it to return
nil, meaning the engine loop survives and seamlessly continues evaluating the remainder of the repository's security checks.Fixes #6349
Checklist