Technical Engagements, Documented

Each case study follows a structured consulting format — from client background through architecture, implementation, and measurable outcomes.

Identity & Access Management

Azure Identity & Conditional Access Implementation

📅 Duration: 6 weeks 👥 Scope: 300 users 🏢 Industry: Financial Services

1. Client Background

The client is a mid-sized financial services firm with approximately 300 employees operating across three branch offices and a growing remote workforce. Their IT environment was a hybrid configuration with on-premises Active Directory synced to Azure AD via Azure AD Connect. The organization had recently undergone a compliance audit that flagged multiple identity-related control gaps, including the absence of multi-factor authentication, overly permissive role assignments, and no Conditional Access policies governing sign-in behavior.

The firm operated in a regulated environment subject to SOC 2 Type II and state financial regulatory requirements. Their existing identity posture relied on legacy password-only authentication with no centralized policy enforcement for cloud application access.

2. Problem

The compliance audit identified the following critical findings:

  • No multi-factor authentication was enforced for any user population, including administrators and privileged accounts.
  • Global Administrator roles were permanently assigned to six accounts, three of which belonged to non-IT staff who had inherited permissions during earlier migrations.
  • No Conditional Access policies existed — all authentication requests were evaluated identically regardless of location, device state, or risk level.
  • Guest accounts from prior vendor engagements remained active with access to SharePoint sites containing sensitive financial data.
  • No sign-in risk detection or automated remediation was configured in Azure AD Identity Protection.

The audit required remediation within 90 days. The client needed a partner who could design and implement a Zero Trust identity framework without disrupting daily operations for 300 users across multiple offices.

3. Technical Environment

  • Identity Provider: Azure Active Directory (Entra ID) — P2 licensing
  • Directory Sync: Azure AD Connect v2.x with password hash synchronization
  • On-Premises AD: Windows Server 2019 domain controllers (2 DCs, single forest, single domain)
  • Email & Productivity: Microsoft 365 E3
  • Endpoints: 280 Windows 11 devices managed via Intune, 40 macOS devices unmanaged
  • VPN: Cisco AnyConnect for remote access
  • Line-of-Business Apps: 4 SaaS applications federated via SAML, 2 on-premises apps published via Azure AD Application Proxy

4. Architecture

We designed a layered Conditional Access architecture organized into three policy tiers:

Tier 1 — Baseline Policies (All Users)

  • Require MFA for all cloud applications from untrusted networks
  • Block legacy authentication protocols (IMAP, POP3, SMTP AUTH, ActiveSync with basic auth)
  • Require compliant or Hybrid Azure AD Joined devices for Microsoft 365 access
  • Block sign-ins from countries outside the firm's operating geography

Tier 2 — Privileged Access Policies

  • Require MFA on every sign-in for all directory roles (no trusted location exemptions)
  • Require compliant device AND MFA for Azure portal, Microsoft 365 admin center, and Exchange admin center
  • Session controls: 1-hour sign-in frequency for privileged roles, no persistent browser sessions

Tier 3 — Risk-Based Policies (Identity Protection)

  • Medium and high sign-in risk: require MFA challenge
  • High user risk: force password change with MFA
  • Automated risk remediation via self-service password reset (SSPR) integrated with on-premises writeback
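
The way the three tiers combine can be sketched as a small decision function. This is an illustrative model only, not the Entra ID policy engine: the `SignIn` fields, the control names, and the assumption that the target application is Microsoft 365 are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class SignIn:
    trusted_network: bool
    legacy_protocol: bool
    country_allowed: bool
    privileged_role: bool
    sign_in_risk: str = "none"  # "none", "low", "medium", "high"
    user_risk: str = "none"


def required_controls(s):
    """Return the controls a sign-in must satisfy, or {"block"}."""
    # Tier 1 block rules win over every grant control.
    if s.legacy_protocol or not s.country_allowed:
        return {"block"}
    # Tier 1: compliant or hybrid-joined device (Microsoft 365 assumed).
    controls = {"managed_device"}
    # Tier 1: MFA only when the request comes from an untrusted network.
    if not s.trusted_network:
        controls.add("mfa")
    # Tier 2: privileged roles get MFA everywhere plus a 1-hour session.
    if s.privileged_role:
        controls |= {"mfa", "sign_in_frequency_1h"}
    # Tier 3: risk signals from Identity Protection.
    if s.sign_in_risk in ("medium", "high"):
        controls.add("mfa")
    if s.user_risk == "high":
        controls |= {"mfa", "password_change"}
    return controls
```

Modeling the tiers as additive controls mirrors how multiple matching policies all have to be satisfied, while a single block rule is enough to deny the request.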

RBAC Redesign

We implemented a least-privilege role assignment model using Privileged Identity Management (PIM):

  • Reduced permanent Global Administrator assignments from 6 to 2 (break-glass accounts)
  • All other administrative roles converted to PIM-eligible with 8-hour activation windows
  • Activation requires MFA + justification text
  • Approval workflow for Global Admin and Exchange Admin activations
  • Monthly access reviews configured for all privileged roles
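
As a rough sketch of the activation rules listed above, the checks can be expressed as a single function. The function name, parameters, and error messages are hypothetical; PIM itself enforces these settings through role configuration rather than custom code.

```python
from datetime import datetime, timedelta, timezone

# Roles that need an approver, per the design above.
APPROVAL_REQUIRED = {"Global Administrator", "Exchange Administrator"}
ACTIVATION_WINDOW = timedelta(hours=8)


def request_activation(role, eligible_roles, mfa_passed, justification,
                       approved=False, now=None):
    """Return the activation expiry time, or raise PermissionError."""
    now = now or datetime.now(timezone.utc)
    if role not in eligible_roles:
        raise PermissionError("user is not eligible for this role")
    if not mfa_passed:
        raise PermissionError("MFA is required for activation")
    if not justification.strip():
        raise PermissionError("justification text is required")
    if role in APPROVAL_REQUIRED and not approved:
        raise PermissionError("approval is required for this role")
    # The assignment lapses automatically at the end of the window.
    return now + ACTIVATION_WINDOW
```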

5. Implementation

The implementation was executed over six weeks in four phases:

Phase 1 — MFA Rollout (Week 1–2)

  • Enabled combined security information registration in Azure AD
  • Configured Microsoft Authenticator as the primary MFA method with FIDO2 security keys as backup for executives
  • Deployed MFA registration campaign using Conditional Access in report-only mode to identify users who had not yet registered
  • Provided department-by-department registration support with a 10-day enrollment window
  • Achieved 98% MFA registration coverage before enforcing policies

Phase 2 — Conditional Access Deployment (Week 3–4)

  • Deployed all Tier 1 and Tier 2 policies in report-only mode
  • Analyzed sign-in logs over 5 business days to identify legitimate workflows that would be impacted
  • Identified and remediated 3 legacy applications using basic authentication — migrated to modern auth via app registration updates
  • Defined named locations for office IP ranges and VPN egress points
  • Switched all policies from report-only to enforced after validation
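
Named locations boil down to a CIDR membership test. The sketch below uses Python's standard `ipaddress` module; the location names and address ranges are hypothetical placeholders taken from the RFC 5737 documentation blocks, not the client's real ranges.

```python
import ipaddress

# Hypothetical named locations (documentation address ranges).
NAMED_LOCATIONS = {
    "branch-office": ["192.0.2.0/24"],
    "vpn-egress": ["198.51.100.10/32"],
}


def match_named_location(ip, locations=NAMED_LOCATIONS):
    """Return the first named location containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, cidrs in locations.items():
        if any(addr in ipaddress.ip_network(cidr) for cidr in cidrs):
            return name
    return None
```

A sign-in whose source address returns `None` here would be treated as coming from an untrusted network and would fall under the Tier 1 MFA requirement.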

Phase 3 — PIM & RBAC Remediation (Week 5)

  • Inventoried all Azure AD role assignments using Microsoft Graph PowerShell
  • Removed 4 unnecessary permanent Global Admin assignments
  • Configured PIM for 12 directory roles with role-specific activation settings
  • Created 2 break-glass accounts with monitored sign-in alerts
  • Configured access reviews for all privileged roles on a 30-day cycle

Phase 4 — Identity Protection & Cleanup (Week 6)

  • Enabled Azure AD Identity Protection risk policies (sign-in risk + user risk)
  • Configured SSPR with on-premises password writeback
  • Audited and removed 23 stale guest accounts
  • Configured external collaboration settings to restrict guest invitations to specific admin roles
  • Delivered runbook documentation and conducted a 2-hour knowledge transfer session with the client's IT team
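
A guest audit of this kind can be sketched as a filter over exported account records. The record fields and the 90-day idle threshold are assumptions for illustration; the engagement's actual staleness criteria are not stated here.

```python
from datetime import timedelta


def stale_guests(users, now, max_idle_days=90):
    """Return UPNs of guest accounts idle longer than max_idle_days.

    Guests that have never signed in (last_sign_in is None) count as stale.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return [
        u["upn"]
        for u in users
        if u["user_type"] == "Guest"
        and (u["last_sign_in"] is None or u["last_sign_in"] < cutoff)
    ]
```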

6. Tools Used

Azure Active Directory (Entra ID), Conditional Access, Privileged Identity Management, Azure AD Identity Protection, Microsoft Authenticator, FIDO2 Security Keys, Azure AD Connect, Microsoft Graph PowerShell, Azure AD Application Proxy, Microsoft Intune

7. Outcome

  • MFA Coverage: 100%
  • Permanent Privileged Roles: 14→2
  • Stale Guests Removed: 23
  • Legacy Auth Protocols Active: 0

  • All compliance audit findings related to identity were remediated within the 90-day window.
  • Legacy authentication was fully blocked with zero production impact after the migration to modern auth.
  • Privileged access is now governed by PIM with time-bound activation, MFA, and justification requirements.
  • Risk-based Conditional Access policies automatically remediate compromised credentials without helpdesk intervention.
  • The client passed their subsequent SOC 2 Type II audit with no identity-related findings.

Security Monitoring

SIEM Deployment with Wazuh & Splunk

📅 Duration: 8 weeks 🖥 Scope: 150+ endpoints 🏥 Industry: Healthcare

1. Client Background

The client is a regional healthcare organization operating a network of outpatient clinics and a central administrative office. Their IT environment consisted of approximately 150 endpoints (Windows workstations and servers), 12 Linux servers hosting internal applications and databases, and network infrastructure including Cisco switches and Palo Alto firewalls. The organization was subject to HIPAA compliance requirements and had recently engaged a third-party risk assessor who identified the absence of centralized security monitoring as a critical gap.

Prior to the engagement, the client had no SIEM, no centralized log collection, and no formalized incident detection or response procedures. Security events were reviewed ad hoc — typically only after an incident had already been reported by end users.

2. Problem

  • No centralized log aggregation — logs were stored locally on individual systems with inconsistent retention periods ranging from 7 days to 90 days.
  • No real-time alerting for security events. Failed authentication attempts, privilege escalations, and malware detections were not surfaced to the IT team.
  • No file integrity monitoring (FIM) on servers handling electronic protected health information (ePHI).
  • No correlation between endpoint events, network events, and application logs, which made lateral movement detection impractical.
  • The HIPAA risk assessment specifically cited the lack of audit controls (§164.312(b)) and the inability to produce audit trails for access to ePHI systems.
  • The client's cyber liability insurance provider flagged the absence of SIEM as a policy renewal risk.

3. Technical Environment

  • Endpoints: 120 Windows 10/11 workstations, 18 Windows Server 2019/2022 instances
  • Linux Servers: 12 Ubuntu 22.04 LTS servers (web applications, PostgreSQL databases, SFTP)
  • Network: Cisco Catalyst switches, Palo Alto PA-820 firewalls (2 sites)
  • EHR System: On-premises electronic health record system running on Windows Server with SQL Server backend
  • Authentication: On-premises Active Directory (single domain)
  • Antivirus: Microsoft Defender for Endpoint (partially deployed)
  • Backup: Veeam Backup & Replication

4. Architecture

We designed a dual-platform SIEM architecture using Wazuh for endpoint detection and file integrity monitoring, and Splunk for log aggregation, correlation, and dashboarding. The architecture deliberately separates detection from analysis: Wazuh handles host-level detection on the endpoints, while Splunk handles cross-source correlation, search, and long-term storage.

Wazuh Architecture

  • Wazuh Manager: Single-node deployment on Ubuntu 22.04 (8 vCPU, 16 GB RAM, 500 GB SSD) hosted on-premises in the primary data center
  • Wazuh Agents: Deployed to all 150+ endpoints (Windows and Linux) via GPO (Windows) and Ansible (Linux)
  • Capabilities Enabled: File integrity monitoring, rootkit detection, Security Configuration Assessment (SCA), vulnerability detection, active response
  • Integration: Wazuh alerts forwarded to Splunk via syslog (CEF format) for correlation and long-term storage
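
The idea behind file integrity monitoring can be shown with a hash-based baseline and comparison. This is a conceptual sketch using only the Python standard library; it is not how the Wazuh agent is implemented, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file under root (by relative path) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(baseline, current):
    """Report files added, removed, or modified since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            path for path in baseline.keys() & current.keys()
            if baseline[path] != current[path]
        ),
    }
```

A production FIM agent also tracks ownership, permissions, and timestamps, and reports changes in near real time rather than on demand.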

Splunk Architecture

  • Splunk Enterprise: Single indexer deployment on dedicated hardware (16 vCPU, 32 GB RAM, 2 TB NVMe) with 90-day hot/warm retention and 365-day cold retention
  • Universal Forwarders: Deployed to all Windows servers for Windows Event Log forwarding (Security, System, Application, PowerShell Script Block logs)
  • Network Log Sources: Palo Alto firewalls configured to send syslog (traffic, threat, URL filtering) to Splunk via Heavy Forwarder
  • Splunk Apps Installed: Splunk Security Essentials, Palo Alto Networks App for Splunk, Wazuh App for Splunk

Log Flow

Endpoints (Wazuh Agents) ──→ Wazuh Manager ──→ Syslog (CEF) ──→ Splunk
Windows Servers (UF) ──────────────────────────────────────────→ Splunk
Palo Alto Firewalls ──→ Heavy Forwarder ───────────────────────→ Splunk
Linux Servers (rsyslog) ───────────────────────────────────────→ Splunk
Active Directory (WEF) ──→ Windows Event Collector ──→ UF ────→ Splunk
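
For readers unfamiliar with the CEF format used on the Wazuh-to-Splunk leg, the sketch below builds a CEF line from its seven pipe-delimited header fields plus key=value extensions. The vendor, product, event ID, and extension values in the usage example are hypothetical, and the exact fields Wazuh emits may differ.

```python
def _escape_header(value):
    # In CEF header fields, backslashes and pipes must be escaped.
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def _escape_extension(value):
    # In extension values, backslashes and equals signs must be escaped.
    return str(value).replace("\\", "\\\\").replace("=", "\\=")


def to_cef(vendor, product, version, event_id, name, severity, **extension):
    """Format one event as a CEF:0 line."""
    header = "|".join(
        _escape_header(field)
        for field in (vendor, product, version, event_id, name, severity)
    )
    ext = " ".join(f"{k}={_escape_extension(v)}" for k, v in extension.items())
    return f"CEF:0|{header}|{ext}"


# Hypothetical alert values for illustration.
line = to_cef("Acme", "Sensor", "1.0", "100001", "Example alert", 5,
              src="203.0.113.7", dhost="web01")
```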

5. Implementation

Phase 1 — Infrastructure Build (Week 1–2)

  • Provisioned and hardened the Wazuh manager server (CIS Ubuntu benchmark applied)
  • Installed and configured Splunk Enterprise with defined indexes: wazuh, wineventlog, pan_firewall, linux_syslog
  • Configured TLS encryption for all agent-manager and forwarder-indexer communication
  • Established syslog relay from Wazuh manager to Splunk Heavy Forwarder

Phase 2 — Agent Deployment (Week 3–4)

  • Deployed Wazuh agents to Windows endpoints via Group Policy software installation (MSI package)
  • Deployed Wazuh agents to Linux servers via Ansible playbook with registration key automation
  • Deployed Splunk Universal Forwarders to all Windows servers via GPO
  • Configured Windows Event Forwarding (WEF) to centralize domain controller logs
  • Configured rsyslog on Linux servers to forward auth.log, syslog, and application logs to Splunk

Phase 3 — Detection Engineering (Week 5–6)

  • Configured Wazuh FIM rules for critical paths: EHR application directories, database data files, system binaries, and configuration files
  • Enabled Wazuh SCA policies: CIS Windows Server 2019, CIS Ubuntu 22.04, PCI DSS
  • Built 24 custom Splunk alerts mapped to MITRE ATT&CK techniques:
    • Brute force detection (T1110): 10+ failed logons in 5 minutes from single source
    • Privilege escalation via account manipulation (T1098): non-admin account added to Domain Admins
    • Lateral movement (T1021): RDP sessions from non-standard source hosts
    • Data exfiltration (T1048): outbound data transfers exceeding 500 MB to external IPs
    • PowerShell abuse (T1059.001): encoded command execution or download cradles
  • Configured Wazuh active response to automatically block source IPs after 5 failed SSH attempts on Linux servers
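
The brute force rule above (10 or more failed logons in 5 minutes from a single source) is a sliding-window count. The sketch below shows that logic in plain Python rather than in Splunk's search language; the event shape, a time-sorted list of `(timestamp_seconds, source)` tuples, is an assumption.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60
THRESHOLD = 10


def brute_force_sources(events):
    """Return sources with THRESHOLD failed logons inside any window.

    events must be (timestamp_seconds, source) tuples sorted by timestamp.
    """
    recent = defaultdict(deque)
    flagged = set()
    for ts, source in events:
        window = recent[source]
        window.append(ts)
        # Drop failures that fell out of the 5-minute window.
        while ts - window[0] >= WINDOW_SECONDS:
            window.popleft()
        if len(window) >= THRESHOLD:
            flagged.add(source)
    return flagged
```

Keeping one deque per source bounds memory to the events inside the current window, which is the same trade-off a streaming alert makes.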

Phase 4 — Dashboarding & Tuning (Week 7–8)

  • Built 6 Splunk dashboards: Security Overview, Authentication Activity, Firewall Traffic Analysis, Wazuh Alert Summary, FIM Changes, and Compliance Status
  • Tuned alert thresholds to reduce false positives — iterated through 3 rounds of tuning based on 2 weeks of production data
  • Configured email and webhook alerting for critical and high-severity events
  • Documented 8 incident response playbooks covering the most common alert types
  • Conducted tabletop exercise with the client's IT team using simulated alert scenarios

6. Tools Used

Wazuh 4.x, Splunk Enterprise, Splunk Universal Forwarder, Splunk Heavy Forwarder, Palo Alto Syslog, Windows Event Forwarding, rsyslog, Ansible, Group Policy (GPO), MITRE ATT&CK Framework

7. Outcome

  • Endpoints Monitored: 150+
  • Detection Rules Deployed: 24
  • Log Retention: 365 days
  • Alert-to-Notification Time: < 5 min

  • The client achieved full HIPAA audit control compliance (§164.312(b)) with centralized, tamper-evident log storage and defined retention policies.
  • File integrity monitoring on ePHI systems provided audit trails required for breach notification assessments.
  • Within the first 30 days of production, the SIEM detected and alerted on 3 genuine security events: a compromised vendor VPN credential, a misconfigured service account with interactive logon, and an unauthorized software installation on a clinic workstation.
  • The cyber liability insurance renewal was approved without the SIEM-related exception.
  • The client's IT team was trained to triage alerts and execute playbooks independently within 4 weeks of go-live.

Infrastructure

Active Directory Hardening

📅 Duration: 5 weeks 👥 Scope: 500-user domain 🏭 Industry: Manufacturing

1. Client Background

The client is a mid-sized manufacturing company with approximately 500 employees across a primary headquarters, two production facilities, and a distribution warehouse. Their IT infrastructure was anchored by a single Active Directory forest that had been in continuous operation since Windows Server 2008, having been upgraded in place through Server 2012 R2 and most recently to Server 2016. The domain functional level had been raised to 2016, but many legacy configurations, GPOs, and permission structures remained from earlier versions.

The company had recently engaged a cybersecurity firm for a penetration test as part of a customer compliance requirement. The pentest results revealed multiple critical findings in the Active Directory environment that required immediate remediation.

2. Problem

The penetration test identified the following Active Directory security findings:

  • Weak Password Policies: The Default Domain Policy enforced a minimum password length of 8 characters with no complexity requirements. The password history was set to 3, and the maximum password age was 180 days.
  • Excessive Privileged Access: 34 accounts were members of Domain Admins, including 12 service accounts and 8 accounts belonging to employees who had changed roles or left the company.
  • No Local Administrator Password Management: All workstations shared the same local administrator password, which was set via a GPO preference. The password is stored in SYSVOL encrypted with a publicly documented AES key, so any domain user can recover it.
  • No Audit Logging: Advanced audit policies were not configured. Domain controller security event logs were set to a maximum size of 20 MB with overwrite-as-needed, resulting in less than 48 hours of audit trail.
  • Legacy Protocols: NTLMv1 authentication was permitted, and SMBv1 was enabled on domain controllers.
  • GPO Sprawl: 87 Group Policy Objects existed in the domain, many with conflicting settings, broken WMI filters, or no linked OUs. No GPO documentation existed.
  • Kerberoastable Accounts: 9 service accounts had SPNs registered with weak passwords, making them vulnerable to Kerberoasting attacks.

3. Technical Environment

  • Domain Controllers: 4 DCs running Windows Server 2016 (2 at HQ, 1 per remote site)
  • Domain Functional Level: Windows Server 2016
  • Forest: Single forest, single domain
  • User Accounts: ~500 user accounts, ~45 service accounts
  • Workstations: ~400 Windows 10/11 domain-joined machines
  • Servers: 28 Windows Server instances (file servers, print servers, application servers, SQL servers)
  • Sites and Services: 3 AD sites with inter-site replication
  • DNS: AD-integrated DNS zones
  • DHCP: Windows DHCP with failover between HQ DCs

4. Architecture

We designed a hardening plan organized into five workstreams, each aligned to CIS Microsoft Windows Server 2016 Benchmark recommendations and Microsoft's tiered administration model:

Workstream 1 — Password Policy Overhaul

  • Implement fine-grained password policies (PSOs) for three user tiers:
    • Standard Users: 14-character minimum, complexity enabled, 90-day max age, 24 password history
    • Privileged Accounts: 20-character minimum, 60-day max age, 24 password history
    • Service Accounts: 30-character managed passwords via group Managed Service Accounts (gMSAs)
  • Deploy Azure AD Password Protection on-premises agents to block known-weak and organization-specific banned passwords
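
The tiered length rules and the banned-term check can be sketched as follows. This is a simplified illustration: the banned terms are hypothetical, and Azure AD Password Protection applies its own normalization and scoring rather than a plain substring match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PasswordPolicy:
    min_length: int
    max_age_days: int
    history: int


# Values taken from the two user tiers described above.
POLICIES = {
    "standard": PasswordPolicy(min_length=14, max_age_days=90, history=24),
    "privileged": PasswordPolicy(min_length=20, max_age_days=60, history=24),
}

# Hypothetical organization-specific banned terms.
BANNED_TERMS = {"acme", "widget"}


def check_password(candidate, tier):
    """Return a list of policy problems; an empty list means acceptable."""
    policy = POLICIES[tier]
    problems = []
    if len(candidate) < policy.min_length:
        problems.append(f"shorter than {policy.min_length} characters")
    if any(term in candidate.lower() for term in BANNED_TERMS):
        problems.append("contains a banned term")
    return problems
```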

Workstream 2 — Privileged Access Remediation

  • Audit all privileged group memberships (Domain Admins, Enterprise Admins, Schema Admins, Administrators, Account Operators, Backup Operators)
  • Implement tiered administration: Tier 0 (domain controllers and identity systems), Tier 1 (member servers), Tier 2 (workstations)
  • Create dedicated admin accounts for each tier — no single account with cross-tier access
  • Deploy Protected Users security group for all Tier 0 accounts

Workstream 3 — LAPS Deployment

  • Deploy Microsoft LAPS to all domain-joined workstations and member servers
  • Configure unique, randomly generated local administrator passwords with 30-day rotation
  • Restrict LAPS password read access to designated helpdesk and server admin groups
  • Remove the legacy GPO preference that set a static local admin password

Workstream 4 — Audit Policy & Monitoring

  • Configure Advanced Audit Policy via GPO with settings aligned to Microsoft's recommended baseline:
    • Account Logon: Credential Validation (Success/Failure)
    • Logon/Logoff: Logon, Logoff, Special Logon (Success/Failure)
    • Object Access: File System, Registry, SAM (Failure)
    • Privilege Use: Sensitive Privilege Use (Success/Failure)
    • Account Management: All subcategories (Success/Failure)
    • DS Access: Directory Service Changes (Success)
    • Policy Change: Audit Policy Change, Authentication Policy Change (Success)
  • Increase Security event log to 1 GB with archive-when-full retention
  • Configure Windows Event Forwarding to centralize DC logs to a dedicated collector
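
A back-of-the-envelope estimate shows what the larger Security log buys. It assumes the event rate stays where it was when a 20 MB log held at most 48 hours of history; enabling the audit subcategories above will raise that rate, so the real figure will be lower.

```python
def estimated_coverage_days(old_size_mb, old_coverage_hours, new_size_mb):
    """Scale observed log coverage linearly to a new maximum log size."""
    mb_per_hour = old_size_mb / old_coverage_hours
    return new_size_mb / mb_per_hour / 24


# 20 MB covered at most 48 hours; 1 GB is 1024 MB.
upper_bound_days = estimated_coverage_days(20, 48, 1024)  # about 102 days
```

Because the local log is only an upper bound on retention, forwarding to a central collector remains the durable audit trail.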

Workstream 5 — Protocol Hardening & GPO Cleanup

  • Disable NTLMv1: set LAN Manager authentication level to "Send NTLMv2 response only. Refuse LM & NTLM"
  • Disable SMBv1 on all domain controllers and member servers
  • Enable LDAP signing and LDAP channel binding on domain controllers
  • Audit and rationalize GPO inventory: document, consolidate, and remove orphaned GPOs

5. Implementation

Week 1 — Assessment & Documentation

  • Ran Get-ADGroupMember recursively against all privileged groups and exported membership reports
  • Exported all 87 GPOs to HTML reports using Get-GPOReport and documented their link status, WMI filters, and effective settings
  • Ran Get-ADUser -Filter * -Properties PasswordLastSet,PasswordNeverExpires to identify password hygiene issues
  • Identified all accounts with SPNs using Get-ADUser -Filter {ServicePrincipalName -ne "$null"}
  • Documented all findings in a remediation tracker with risk ratings and implementation dependencies
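
Recursive membership expansion matters because nested groups can hide privileged accounts. The sketch below shows the idea on an in-memory mapping; the group and account names are hypothetical, and the real audit used the Active Directory cmdlets listed above.

```python
def expand_members(group, membership, _seen=None):
    """Return every user reachable from group through nested groups.

    membership maps a group name to its direct members; any member that
    is itself a key in membership is treated as a nested group.
    """
    seen = _seen if _seen is not None else set()
    if group in seen:  # guard against circular nesting
        return set()
    seen.add(group)
    users = set()
    for member in membership.get(group, []):
        if member in membership:
            users |= expand_members(member, membership, seen)
        else:
            users.add(member)
    return users


# Hypothetical directory with a circular nesting between two groups.
directory = {
    "Domain Admins": ["alice", "svc-backup", "IT-Ops"],
    "IT-Ops": ["bob", "Legacy-Admins"],
    "Legacy-Admins": ["carol", "IT-Ops"],
}
effective = expand_members("Domain Admins", directory)
```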

Week 2 — Password Policies & Service Account Migration

  • Created fine-grained password policies (PSOs) for Standard Users, Privileged Accounts, and Service Accounts
  • Migrated 9 Kerberoastable service accounts to gMSAs — updated service configurations on dependent servers
  • Deployed Azure AD Password Protection proxy and DC agents — configured custom banned password list with 45 organization-specific terms
  • Forced password reset for all accounts that had not changed passwords in >90 days (communicated via a staged 2-week rollout per department)

Week 3 — Privileged Access & LAPS

  • Removed 22 accounts from Domain Admins (8 accounts belonging to employees who had changed roles or left, 12 service accounts migrated to gMSAs, 2 helpdesk accounts moved to appropriate delegated groups)
  • Created tiered admin accounts: t0-admin-[name], t1-admin-[name], t2-admin-[name]
  • Deployed LAPS via GPO to all workstation and server OUs — validated password generation on 10% of devices before full rollout
  • Removed the legacy GPO preference with the static local admin password and verified SYSVOL cleanup

Week 4 — Audit Policies & Protocol Hardening

  • Deployed Advanced Audit Policy GPO linked to Domain Controllers OU and member server OUs
  • Configured Windows Event Forwarding: DCs forward Security logs to a dedicated Windows Event Collector server
  • Disabled NTLMv1 in a staged rollout: first in audit-only mode using network logon auditing, then enforced after verifying that no NTLMv1 dependencies remained
  • Disabled SMBv1 on all servers via GPO — confirmed no legacy dependencies via Get-SmbServerConfiguration
  • Enabled LDAP signing (required) and channel binding (supported) on domain controllers

Week 5 — GPO Cleanup & Documentation

  • Consolidated 87 GPOs down to 34 — removed 38 unlinked GPOs, merged 15 overlapping GPOs
  • Documented every remaining GPO with purpose, linked OUs, key settings, and responsible owner
  • Delivered hardening documentation: password policy reference, privileged access model, LAPS operations guide, audit policy baseline, and protocol configuration reference
  • Conducted 3-hour knowledge transfer session with the client's infrastructure team
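
The GPO numbers can be checked with simple arithmetic. The sketch assumes that "merged 15 overlapping GPOs" means the merge removed 15 objects net, which is the reading that reconciles with the final count of 34.

```python
def rationalize(total, unlinked_removed, merged_away):
    """Return (remaining GPO count, percentage reduction rounded)."""
    remaining = total - unlinked_removed - merged_away
    reduction_pct = round(100 * (total - remaining) / total)
    return remaining, reduction_pct


remaining, reduction_pct = rationalize(87, 38, 15)
```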

6. Tools Used

Active Directory, Group Policy (GPO), Fine-Grained Password Policies, Microsoft LAPS, Azure AD Password Protection, gMSA (Group Managed Service Accounts), PowerShell, Windows Event Forwarding, Advanced Audit Policy, CIS Benchmarks

7. Outcome

  • Domain Admin Members: 34→12
  • GPOs Rationalized: 87→34
  • Kerberoastable Accounts: 0
  • LAPS Coverage: 100%

  • All penetration test findings were remediated and verified through a follow-up retest that produced zero critical or high findings in the AD environment.
  • NTLMv1 and SMBv1 were fully eliminated from the domain with no production impact.
  • The tiered administration model limits lateral movement from a compromised workstation toward the domain controllers, because no single account has cross-tier access.
  • LAPS eliminated the shared local administrator password, a weakness that attackers commonly use to move laterally between workstations.
  • Advanced audit policies and centralized log forwarding provide the audit trail required for the client's customer compliance obligations.
  • The client's GPO environment is now documented, maintainable, and reduced in complexity by 61%.

Linux Administration

Linux Server Hardening & Automation

📅 Duration: 4 weeks 🖥 Scope: 40 servers 🚀 Industry: SaaS / Technology

1. Client Background

The client is a B2B SaaS company that operates a customer-facing platform hosted across a fleet of 40 Linux servers. The infrastructure consists of a mix of Ubuntu 22.04 LTS and Red Hat Enterprise Linux 8 servers running across two data centers and an AWS VPC. The server fleet includes web application servers (Nginx), application servers (Node.js and Python), PostgreSQL database servers, Redis cache servers, and internal tooling servers.

The company had scaled rapidly over the previous 18 months, growing from 8 servers to 40. Server provisioning had been performed manually by different engineers without a standardized hardening baseline. The company's SOC 2 Type II audit preparation identified server hardening and log monitoring as areas requiring immediate improvement.

2. Problem

  • No Hardening Baseline: Each server had been configured independently. SSH configurations varied across the fleet — some permitted root login, some allowed password authentication, and port numbers were inconsistent.
  • Inconsistent Firewall Rules: Some servers ran iptables, some ran firewalld, and 6 servers had no host-based firewall configured at all. Rules were not documented.
  • No Centralized Logging: Application logs were stored locally with no forwarding. Auth logs, syslog, and audit logs were not aggregated — making incident investigation across servers impractical.
  • No File Integrity Monitoring: No mechanism existed to detect unauthorized changes to system binaries, configuration files, or application code on production servers.
  • Unpatched Systems: 14 servers had not been patched in over 120 days. No patch management schedule or process existed.
  • No Audit Logging: auditd was installed but not configured on any server. No syscall monitoring was in place for privilege escalation, file access, or process execution events.
  • Excessive Sudo Access: 11 user accounts had unrestricted NOPASSWD: ALL sudo access. No sudo logging was configured.

3. Technical Environment

  • Operating Systems: 24 servers running Ubuntu 22.04 LTS, 16 servers running RHEL 8
  • Cloud/Hosting: 28 servers in AWS (EC2), 12 servers in a colocation data center
  • Server Roles:
    • 8 Nginx reverse proxy / web servers
    • 12 application servers (Node.js, Python/Gunicorn)
    • 6 PostgreSQL database servers (primary + replicas)
    • 4 Redis cache servers
    • 4 internal tooling / CI-CD runners
    • 4 monitoring and log aggregation servers
    • 2 bastion / jump servers
  • Configuration Management: None (all manual provisioning)
  • Monitoring: Prometheus + Grafana for performance metrics (no security monitoring)
  • SSH Access: Key-based for most servers, with 8 servers still permitting password authentication

4. Architecture

We designed a hardening and monitoring architecture with three components: an automated hardening toolkit (Bash), centralized log aggregation (rsyslog + Splunk), and audit monitoring (auditd).

Hardening Automation

We developed a modular Bash hardening toolkit that could be executed on both Ubuntu and RHEL systems. The toolkit was organized into discrete modules, each addressing a specific CIS Benchmark control area:

harden.sh
├── modules/
│   ├── 01-ssh-hardening.sh
│   ├── 02-firewall-config.sh
│   ├── 03-user-access.sh
│   ├── 04-kernel-params.sh
│   ├── 05-filesystem.sh
│   ├── 06-auditd-config.sh
│   ├── 07-logging.sh
│   ├── 08-services.sh
│   └── 09-patch-management.sh
├── configs/
│   ├── sshd_config.hardened
│   ├── audit.rules
│   ├── rsyslog.conf
│   └── iptables.rules.[role]
├── lib/
│   ├── detect-os.sh
│   └── logger.sh
└── reports/
    └── compliance-check.sh

Centralized Logging Architecture

  • All servers forward logs via rsyslog over TLS to a centralized log aggregation server
  • Log aggregation server runs Splunk Universal Forwarder, which forwards the collected logs to a Splunk indexer
  • Log sources: auth.log / secure, syslog, audit.log, sudo.log, application logs
  • Retention: 90 days online, 365 days archived to S3

Auditd Configuration

We deployed a comprehensive auditd ruleset monitoring:

  • All executions of privileged commands (sudo, su, passwd, chsh)
  • Modifications to /etc/passwd, /etc/shadow, /etc/group, /etc/sudoers
  • File permission changes on critical directories
  • Kernel module loading and unloading
  • Network configuration changes
  • All unsuccessful file access attempts (EACCES, EPERM)
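
Part of such a ruleset can be generated rather than hand-written. The sketch below emits file-watch rules for the identity files listed above and syscall rules for denied file access, using standard `audit.rules` syntax. The key names (`identity`, `scope`, `access`) and the syscall list are illustrative choices, not the engagement's actual ruleset.

```python
# Files to watch, mapped to an illustrative search key.
WATCHED_FILES = {
    "/etc/passwd": "identity",
    "/etc/shadow": "identity",
    "/etc/group": "identity",
    "/etc/sudoers": "scope",
}


def watch_rules(files=WATCHED_FILES):
    """-w path, -p wa (writes and attribute changes), -k search key."""
    return [f"-w {path} -p wa -k {key}" for path, key in files.items()]


def denied_access_rules(arch="b64"):
    """Syscall rules that log file opens failing with EACCES or EPERM."""
    syscalls = "open,openat,creat,truncate,ftruncate"
    return [
        f"-a always,exit -F arch={arch} -S {syscalls} -F exit=-{err} -k access"
        for err in ("EACCES", "EPERM")
    ]


audit_rules = "\n".join(watch_rules() + denied_access_rules()) + "\n"
```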

5. Implementation

Week 1 — Assessment & Toolkit Development

  • Ran CIS benchmark assessment on a representative sample (2 Ubuntu, 2 RHEL) — documented baseline compliance scores (Ubuntu: 38%, RHEL: 42%)
  • Inventoried all SSH configurations, firewall rules, user accounts, sudo privileges, and running services across all 40 servers
  • Developed the modular hardening toolkit with OS detection (Ubuntu vs. RHEL) and dry-run capability
  • Tested toolkit against non-production servers in a staging environment

Week 2 — SSH & Access Hardening

  • Deployed standardized SSH configuration across all 40 servers:
    # /etc/ssh/sshd_config — hardened baseline
    PermitRootLogin no
    PasswordAuthentication no
    PubkeyAuthentication yes
    MaxAuthTries 3
    ClientAliveInterval 300
    ClientAliveCountMax 2
    X11Forwarding no
    AllowTcpForwarding no
    Banner /etc/issue.net
    # "Protocol 2" omitted: deprecated, sshd in OpenSSH 7.4 and later supports only protocol 2
    Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com
    MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com
    KexAlgorithms curve25519-sha256,diffie-hellman-group16-sha512
  • Consolidated SSH access through 2 bastion hosts — direct SSH access to production servers removed from security groups / firewall rules
  • Remediated sudo access: replaced 11 NOPASSWD: ALL entries with role-specific command allowlists
  • Enabled sudo logging to /var/log/sudo.log with Defaults logfile directive
  • Removed 7 dormant user accounts that had not logged in within 90 days
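
Checking that a server still matches the SSH baseline can be sketched as a small parser plus a comparison. This is an illustration in Python rather than the toolkit's Bash, and it is deliberately simplified: it ignores `Match` blocks and `Include` files, and it reports a directive as failing when it is absent, even if the sshd default would be acceptable.

```python
# A subset of the hardened baseline shown above (keywords lowercased).
EXPECTED = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
    "pubkeyauthentication": "yes",
    "maxauthtries": "3",
    "x11forwarding": "no",
    "allowtcpforwarding": "no",
}


def parse_sshd_config(text):
    """Return keyword -> value, keeping the first occurrence as sshd does."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings.setdefault(parts[0].lower(), parts[1].strip())
    return settings


def check_sshd_config(text, expected=EXPECTED):
    """Return keyword -> True/False for each expected directive."""
    actual = parse_sshd_config(text)
    return {k: actual.get(k, "").lower() == v for k, v in expected.items()}
```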

Week 3 — Firewall, Kernel & Service Hardening

  • Standardized all servers on iptables with role-specific rulesets (web server, app server, database server, bastion)
  • Default policy: DROP on INPUT and FORWARD, ACCEPT on OUTPUT with explicit allowlists per server role
  • Applied kernel hardening parameters via /etc/sysctl.d/99-hardening.conf:
    # Network hardening
    net.ipv4.ip_forward = 0
    net.ipv4.conf.all.send_redirects = 0
    net.ipv4.conf.all.accept_redirects = 0
    net.ipv4.conf.all.accept_source_route = 0
    net.ipv4.conf.all.log_martians = 1
    net.ipv4.icmp_echo_ignore_broadcasts = 1
    net.ipv4.tcp_syncookies = 1
    
    # Kernel hardening
    kernel.randomize_va_space = 2
    kernel.dmesg_restrict = 1
    kernel.kptr_restrict = 2
    fs.suid_dumpable = 0
  • Disabled 14 unnecessary services across the fleet (cups, avahi-daemon, rpcbind, telnet, etc.)
  • Mounted /tmp, /var/tmp, and /dev/shm with noexec,nosuid,nodev options
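
Role-specific rulesets with a default-deny policy can be generated from a port allowlist. The sketch below produces text in `iptables-restore` format; the role names and port lists are hypothetical, and a real ruleset would also restrict SSH to the bastion hosts' addresses.

```python
# Hypothetical inbound TCP allowlists per server role.
ROLE_PORTS = {
    "web": [80, 443],
    "database": [5432],
    "bastion": [22],
}


def build_ruleset(role):
    """Return an iptables-restore ruleset for the given role."""
    lines = [
        "*filter",
        ":INPUT DROP [0:0]",      # default deny inbound
        ":FORWARD DROP [0:0]",    # servers do not route traffic
        ":OUTPUT ACCEPT [0:0]",   # outbound allowed by default
        "-A INPUT -i lo -j ACCEPT",
        "-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
    ]
    lines += [
        f"-A INPUT -p tcp --dport {port} -j ACCEPT"
        for port in ROLE_PORTS[role]
    ]
    lines.append("COMMIT")
    return "\n".join(lines) + "\n"
```

Generating the files from one table keeps the per-role differences small and reviewable, which is the main benefit of standardizing on a single firewall tool.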

Week 4 — Logging, Auditing & Compliance Validation

  • Deployed and configured auditd on all 40 servers with the standardized ruleset
  • Configured rsyslog TLS forwarding to the centralized log server — verified log receipt from all 40 hosts
  • Applied all pending security patches — 14 servers required kernel updates (scheduled during maintenance window with rolling restarts)
  • Ran post-hardening CIS benchmark assessment: Ubuntu servers scored 91%, RHEL servers scored 89%
  • Documented all configurations in a server hardening baseline document with per-module explanations
  • Created a compliance check script (compliance-check.sh) that validates hardening state and outputs a pass/fail report per control
  • Delivered runbook to client engineering team and conducted a 2-hour walkthrough of the toolkit, logging architecture, and ongoing maintenance procedures

6. Tools Used

Bash, auditd, rsyslog, iptables, OpenSSH, sysctl, CIS Benchmarks, Splunk Universal Forwarder, AWS Security Groups, Prometheus / Grafana

7. Outcome

  • CIS Compliance (Ubuntu): 38%→91%
  • CIS Compliance (RHEL): 42%→89%
  • Servers Hardened: 40/40
  • NOPASSWD Sudo Entries: 0

  • All 40 servers now conform to a documented, auditable hardening baseline aligned to CIS Benchmarks Level 1.
  • Centralized logging provides a unified view of authentication, authorization, and system events across the entire fleet.
  • The compliance check script enables the client's engineering team to validate hardening state on any server in under 60 seconds.
  • SSH access is now enforced exclusively through bastion hosts with key-based authentication, eliminating direct exposure of production servers.
  • The modular Bash toolkit is versioned in Git and can be executed against new servers as part of the provisioning process, which limits configuration drift across the fleet.
  • The client passed their SOC 2 Type II audit with no findings related to server configuration or log management.
