Audit logs are the lifeblood of security investigations and compliance evidence. NIST SP 800-171 Rev. 2 control 3.3.4 (CMMC 2.0 Level 2 practice AU.L2-3.3.4) requires organizations to alert in the event of an audit logging process failure, which presupposes that audit records are being generated and retained in the first place. This post shows practical, cloud-native ways to implement that alerting across AWS, Azure, and GCP, with concrete examples for small businesses, implementation tips, and runbook ideas.
Key objective and approach
The objective for AU.L2-3.3.4 is straightforward: ensure your environment generates audit records and detect when the pipeline that creates, ships, or stores those records breaks or is tampered with. The most reliable pattern is (1) centralize audit logs, (2) create an observable metric that represents "logs are being produced", and (3) trigger an alert when that metric drops below an acceptable threshold (or when delivery reports show failures). For small businesses this usually means combining built-in cloud services (logging, monitoring, notifications) with a lightweight scheduled check (serverless function) where native metrics are lacking.
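Steps (2) and (3) reduce to a single decision that each cloud-specific pattern below shares: is the newest audit record older than an acceptable threshold? A minimal, cloud-agnostic sketch of that decision (the 15-minute default is an example value, not a prescribed one):

```python
from datetime import datetime, timedelta, timezone

def logs_are_stale(last_record_time, threshold_minutes=15, now=None):
    """Return True when the newest audit record is older than the threshold,
    meaning the logging pipeline should be treated as failed and alerted on."""
    now = now or datetime.now(timezone.utc)
    return (now - last_record_time) > timedelta(minutes=threshold_minutes)
```

For low-traffic systems, widen threshold_minutes rather than living with flapping absence alerts.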
AWS: practical implementation steps
How to detect CloudTrail / audit delivery failures
Recommended architecture: enable a multi-region CloudTrail, send trails to a dedicated, hardened S3 bucket (KMS-encrypted, versioning enabled, restricted access), and optionally to CloudWatch Logs. To detect failures quickly, implement either: (A) a CloudWatch Logs metric filter (if using CloudWatch Logs) that counts CloudTrail entries, with an alarm when the count is < 1 over X minutes; or (B) a scheduled AWS Lambda (5-15 minute cron via EventBridge) that checks the newest S3 object timestamp under the CloudTrail prefix and publishes to SNS if that object is older than your threshold. Example Lambda logic (Python/boto3): call list_objects_v2(prefix='AWSLogs/ACCOUNT_ID/CloudTrail/'), take the maximum LastModified across the returned objects (the listing is ordered by key, not by time, so you must find the newest yourself), compare it to now(), and publish to SNS if the age exceeds 15 minutes. Also monitor CloudTrail configuration changes (an EventBridge rule matching API calls such as StopLogging, DeleteTrail, or UpdateTrail that disable CloudTrail or change its S3 destination) and alert immediately. Enable CloudTrail log file validation and alarm if validation fails. Lock down the S3 bucket with a restrictive bucket policy and MFA delete to reduce tamper risk.
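The Lambda in option (B) might look like the sketch below. The bucket, prefix, and SNS topic are placeholder values you would supply, and this is an illustration of the pattern rather than a hardened implementation:

```python
import os
from datetime import datetime, timedelta, timezone

# Placeholders -- supply your own bucket, CloudTrail prefix, and SNS topic.
BUCKET = os.environ.get("TRAIL_BUCKET", "example-cloudtrail-bucket")
PREFIX = os.environ.get("TRAIL_PREFIX", "AWSLogs/111122223333/CloudTrail/")
TOPIC_ARN = os.environ.get("ALERT_TOPIC_ARN",
                           "arn:aws:sns:us-east-1:111122223333:audit-alerts")
MAX_AGE = timedelta(minutes=15)

def newest_object_age(objects, now):
    """Age of the most recent S3 object. list_objects_v2 returns objects in
    key order, not time order, so take the max of LastModified explicitly."""
    latest = max(obj["LastModified"] for obj in objects)
    return now - latest

def lambda_handler(event, context):
    # boto3 is imported lazily so newest_object_age stays unit-testable
    # without the AWS SDK installed.
    import boto3
    s3 = boto3.client("s3")
    now = datetime.now(timezone.utc)
    # In production, narrow the prefix to today's date path so you don't
    # paginate through months of old log objects on every run.
    paginator = s3.get_paginator("list_objects_v2")
    objects = []
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
        objects.extend(page.get("Contents", []))
    if not objects or newest_object_age(objects, now) > MAX_AGE:
        boto3.client("sns").publish(
            TopicArn=TOPIC_ARN,
            Subject="CloudTrail delivery failure",
            Message=f"No CloudTrail object newer than {MAX_AGE} under "
                    f"s3://{BUCKET}/{PREFIX}",
        )
```

Schedule it with an EventBridge rule (e.g., rate(15 minutes)) and grant the function s3:ListBucket on the trail bucket plus sns:Publish on the topic.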
Azure: practical implementation steps
How to detect Activity Log / Diagnostics ingestion gaps
In Azure, the simplest compliance-friendly pattern is to send Activity Logs and resource Diagnostic Settings to a Log Analytics workspace or a Storage Account, then create a log-based alert on the absence of recent entries. Create a scheduled query alert rule on the Log Analytics query AzureActivity | where TimeGenerated > ago(15m) | summarize cnt = count(), and set the alert condition to trigger when cnt < 1. For resource logs, similar queries against the resource-log tables (e.g., AzureDiagnostics) work. Additionally, create an Activity Log alert on administrative operations that affect logging (e.g., Microsoft.Insights/diagnosticSettings/write and Microsoft.Insights/diagnosticSettings/delete) to detect when diagnostic settings are changed or removed. Tie alerts to Action Groups that notify teams by email/SMS/Teams, and optionally trigger an Automation Runbook to re-enable diagnostics or create a ticket in your ITSM system.
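The same absence condition can also be checked from a script, which is handy when validating the alert rule during periodic tests. This sketch assumes the azure-monitor-query and azure-identity packages; the workspace ID is a placeholder:

```python
from datetime import timedelta

WORKSPACE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
QUERY = "AzureActivity | where TimeGenerated > ago(15m) | summarize cnt = count()"

def activity_log_silent(row_count):
    """The alert rule's condition: fire when the query returns a zero count."""
    return row_count < 1

def check_activity_logs():
    # SDK imports are kept inside the function so activity_log_silent can be
    # unit-tested without the Azure libraries installed.
    from azure.identity import DefaultAzureCredential
    from azure.monitor.query import LogsQueryClient

    client = LogsQueryClient(DefaultAzureCredential())
    response = client.query_workspace(WORKSPACE_ID, QUERY,
                                      timespan=timedelta(minutes=15))
    cnt = response.tables[0].rows[0][0]
    if activity_log_silent(cnt):
        raise RuntimeError("No AzureActivity entries in the last 15 minutes")
```

The native scheduled alert rule remains the compliance control; a script like this is only a convenient way to exercise the same query on demand.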
GCP: practical implementation steps
How to detect Cloud Audit Logs delivery failures
GCP's recommended approach is to create a logs-based counter metric that counts audit log entries, then create an alerting policy for metric absence. Create a logs-based metric with a filter like logName="projects/PROJECT_ID/logs/cloudaudit.googleapis.com%2Factivity" (or a broader audit filter), then in Cloud Monitoring create an alert condition of type "Metric absence" that fires when the metric reports no data points for the last X minutes. Alternatively, run a scheduled Cloud Function that queries the Logging API for recent audit entries and publishes to Pub/Sub (or another notification channel) when none are found. Also monitor sink delivery if you export logs to BigQuery/Cloud Storage/Pub/Sub: Cloud Logging exposes export error metrics such as logging.googleapis.com/exports/error_count, and a non-zero value means the sink is failing to deliver entries, so alert when it exceeds 0.
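A sketch of the scheduled-check variant, assuming the google-cloud-logging package; the project ID is a placeholder and the notification step is left as a comment:

```python
from datetime import datetime, timedelta, timezone

PROJECT_ID = "example-project"  # placeholder

def build_audit_filter(project_id, window, now=None):
    """Logging API filter matching Admin Activity audit entries newer than
    the given window."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - window).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f'logName="projects/{project_id}/logs/'
        f'cloudaudit.googleapis.com%2Factivity" AND timestamp>="{cutoff}"'
    )

def check_audit_heartbeat(window=timedelta(minutes=15)):
    # Imported lazily so build_audit_filter is unit-testable without the SDK.
    from google.cloud import logging as gcp_logging

    client = gcp_logging.Client(project=PROJECT_ID)
    entries = client.list_entries(
        filter_=build_audit_filter(PROJECT_ID, window), max_results=1)
    if not any(True for _ in entries):
        # Publish to a Pub/Sub alert topic, open a ticket, etc.
        raise RuntimeError("No audit log entries in the last 15 minutes")
```

Trigger it from Cloud Scheduler; the logs-based "Metric absence" policy is still the simpler first choice, and this check is a fallback where a custom probe is preferred.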
Real-world small business scenarios and runbooks
Scenario: a 20-person small business uses AWS for production, Azure for dev/test, and GCP for analytics. Practical implementation: enable CloudTrail (multi-region) and an S3 sink in AWS, send Azure Activity Logs to Log Analytics, and create a GCP logs-based metric; then configure each cloud to notify a central Slack channel and create tickets in Jira when alerts fire. Provide a simple runbook: (1) On alert: confirm alert source and scope (region/account/project), (2) verify ingestion endpoint (S3 bucket / Log Analytics workspace / Cloud Logging sink), (3) check configuration changes (CloudTrail stopped, diagnostic setting removed, sink permissions changed), (4) escalate and open an incident if logs cannot be restored within your SLA. Maintain documented evidence of alert tests and incidents to demonstrate compliance during audits.
Compliance tips and best practices
Best practices include centralizing audit logs in a hardened, immutable store (object lock or retention policies), protecting log encryption keys with strict IAM roles, enabling log file validation, and enabling versioning on the log store. Define realistic thresholds to avoid noise (for low-traffic systems, absence-based alerts may need longer windows). Test your alerts quarterly by intentionally pausing log generation in a controlled way to validate detection and remediation. Integrate alerts with ticketing and automated remediation where safe (e.g., a runbook that re-enables diagnostic settings with RBAC approval). Keep documented procedures and evidence of testing to satisfy the compliance auditor.
Risk of not implementing AU.L2-3.3.4
Failure to detect missing or tampered audit logs severely weakens incident detection and forensics: breaches can go unnoticed, insider misuse may be untraceable, and you risk non-compliance findings with contractual, regulatory, or CMMC assessments. Practically, this can mean delayed breach response, larger impact, contractual penalties, or failing a CMMC assessment — all avoidable by implementing the alerting patterns described above.
Summary: meeting AU.L2-3.3.4 requires making audit generation observable and actionable — centralize logs, create a measurable "heartbeat" (metric or scheduled check) per cloud, alert on absence/delivery failures, protect log stores, and codify runbooks and tests. Start small: implement one cloud's detection logic this week (Lambda or Log Analytics query) and expand to a centralized SIEM and documented remediation playbooks over time to meet NIST/CMMC expectations.