Step 6 : Enterprise Security Architecture
When thinking about security architecture, there are many avenues to go down in terms of adopting a defense-in-depth approach. Given that 40% of breaches are related to insecure development code (Verizon DBIR 2019) being hosted on cloud-based servers, I think it's a good place to start this topic.
SANS analyst David Shackleford tells us in his article entitled The DevSecOps Approach to Securing Your Code and Your Cloud that only 17% of infosec organizations can keep up with continuous or agile development. He promotes seven imperatives for security teams, the top four of which I've included here: 1) embedding security controls and automation in code developed in the cloud; 2) inventory and analysis of reusable code to avoid re-introducing flaws; 3) continuous monitoring of code and results in production; and 4) creation of "triggered" responses that can roll controls back to a known good state if there's a problem. Following more of Dave's good advice on architecture in his YouTube presentation "A Cloud Security Architecture", he advises us to think about cloud security controls and tools that are not designed with one CSP in mind. The reality is that many organizations will not just be an AWS, Azure, or GCP shop; they will likely follow a hybrid model through accident or design. So when designing IAM security policies, for example, consider their application across multiple CSP platforms so as not to get locked in to one provider. An alternate approach is to leverage security broker services, or CASBs, which offer managed security policies across multiple platforms.
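Imperative 1, embedding security checks in the pipeline, can start as simply as a pre-deploy lint over policy documents. Below is a minimal, hypothetical Python sketch that flags wildcard grants in an AWS-style IAM policy; the function name and the checks are illustrative only, and a real pipeline would run something like this (or a dedicated tool) as a CI gate across whichever policy formats your CSPs use.

```python
import json

def find_wildcard_grants(policy_json: str) -> list:
    """Return Allow statements that grant '*' actions or resources."""
    policy = json.loads(policy_json)
    findings = []
    for stmt in policy.get("Statement", []):
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        # IAM allows a bare string or a list in these fields; normalize to lists
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        if stmt.get("Effect") == "Allow" and ("*" in actions or "*" in resources):
            findings.append(stmt)
    return findings

policy = """{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::app-logs/*"},
    {"Effect": "Allow", "Action": "*", "Resource": "*"}
  ]
}"""

print(len(find_wildcard_grants(policy)))  # prints 1: only the over-broad statement
```

Failing the build when this returns any findings is one cheap way to keep the "known good state" imperative honest.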
Of course, with any good security architecture we have to think about designing for availability, or "designing for failure" as it's often referred to in the cloud. At the application level, a few pointers: developers should make no assumptions about the reliability of the underlying infrastructure, so adaptability should be paramount. It also holds true that each application component must be designed across multiple cloud components, and automation tools must be in place to allow applications to respond to infrastructure failures. Specific failure recovery objectives might include recovering from a bad server image deployed across the EC2 environment, or rolling back recently deployed security group definitions which are too permissive. You will also need to consider potential elasticity failures, such as bad load balancing parameters which fail to scale up applications when needed.
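To illustrate the "roll back permissive security groups" scenario, here is a hedged Python sketch of the detection half of such a triggered response. The rule format and the port list are invented for this example; in practice you would read real security group definitions from your CSP's API and revert them via your infrastructure-as-code tooling.

```python
# Hypothetical, simplified ingress rule format: {"port": int, "cidr": str}
SENSITIVE_PORTS = {22, 3389, 3306, 5432}  # SSH, RDP, MySQL, PostgreSQL

def too_permissive(rules: list) -> list:
    """Flag ingress rules exposing sensitive ports to the entire internet."""
    return [r for r in rules
            if r.get("cidr") == "0.0.0.0/0" and r.get("port") in SENSITIVE_PORTS]

rules = [
    {"port": 443, "cidr": "0.0.0.0/0"},  # public HTTPS: acceptable
    {"port": 22, "cidr": "0.0.0.0/0"},   # SSH open to the world: candidate for rollback
]
flagged = too_permissive(rules)
```

A triggered response would then restore the last known-good definition for any flagged group, rather than paging a human first.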
These points only touch on the overall security architecture of a cloud environment. The point here is to get the imperatives right and embed security failure scenarios in every infrastructure and application component in the cloud.
Step 7 : Assess Third Party Risk
The Ponemon Institute's 2018 report estimates that 54% of companies have experienced a breach caused by a third party. Safe to say, supply chain attacks are plaguing the cloud, as was the case with the Verizon breach which involved the exposure of six million customer records. The root cause of this breach was allegedly traced back to a third party called NICE Systems, a provider of customer service analytics. NICE put six months of customer service call logs, which included account and personal information, on a public (vs. private) Amazon S3 storage server. The documented exploits go on, from Magento e-commerce admin credential exploits to IoT devices with insecure passwords and even insecure HVAC systems. The latter was a contributing factor in the Target breach of 2013. In the shadow of GDPR and ramped-up enforcement from DPAs, third-party risk has never been riskier. So what can be done?
Statistically, companies that take the time to evaluate the security and privacy policies of all their suppliers have shown a 20% reduction in breaches [Ponemon Report, 2018]. I've seen good success with companies who embed security provisions in their standard contract agreements / contract wrappers. The terms of doing business with vendors should include requirements to perform self-assessments, allow onsite visits and audits, or purchase cyber insurance. Generally speaking, the greater risks are often linked to smaller suppliers, who may lack the funding or personnel for robust security functions. Of course, given the complexity of some enterprises, which may have hundreds of vendors in the mix, it may be more practicable to look at automated TPRM (Third-Party Risk Management) tools to standardize risk treatment with greater efficacy.
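To show what "standardizing risk treatment" might look like in miniature, here is an illustrative Python sketch that maps vendor questionnaire results to a coarse risk tier. The control names, weights, and thresholds are all assumptions invented for this example, not any standard scoring model; a commercial TPRM tool would use far richer inputs.

```python
def risk_tier(controls: dict) -> str:
    """Map failed vendor controls to a coarse risk tier.
    Control names, weights, and thresholds are illustrative only."""
    weights = {"encryption": 3, "mfa": 3, "patching": 2, "cyber_insurance": 1}
    # Sum the weights of every control the vendor failed (unknown controls weigh 1)
    score = sum(weights.get(name, 1) for name, passed in controls.items() if not passed)
    if score >= 5:
        return "high"
    if score >= 2:
        return "medium"
    return "low"

# A supplier failing core controls lands in the high tier
tier = risk_tier({"encryption": False, "mfa": False, "patching": True})
```

The value is less in the arithmetic than in applying the same rubric to hundreds of vendors instead of ad-hoc judgment per questionnaire.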
Of course, all this TPRM is great, but without a continuous monitoring element, reports and questionnaire results from vendors are outdated as soon as the ink dries on the paper. This means a risk management office function may well be called for, with continuous governance over vendor risk.
Step 8 : Incident Management
In all likelihood, you may have fairly well-developed incident management policies and procedures in place for your on-prem systems. The challenge now is the shifting sands of trust. Once upon a time, DFIR (Digital Forensics and Incident Response) in the cloud was unpalatable to security pros. Concerns expressed by investigators in a recent SANS survey (RSA Conference 2018) included the difficulty of accessing the underlying information needed for forensic examination from a cloud solution, a lack of understanding as to what information from the cloud provider is required for analysis, and hesitancy about multi-tenancy. These concerns, however, have started to change with DFIR cloud platform images which can be spun up quickly to handle ingestion and analysis of large amounts of data. AccessData and several others have ventured heavily into this solutions space over the last two years.
While each of the CSPs offer their own playbooks and best practices, it's important to understand what to look for in your incident response planning. In AWS, the focus should be on four log sources: Config, CloudWatch, CloudTrail, and Lambda. Following the taxonomy of many attacks, attackers first enumerate the environment with general "Get*/List*" API-type commands, for example listing permissions and getting environment resources. This is followed by scanning-type activity under the category Resource Data/Event/Collection, i.e. API Get*/Describe*/List*/Lookup*, then exploitation of resources under the category Creation/Modification/Deletion, i.e. Delete*/Disable*/Remove*. This is all frequently finished up with log tampering and/or deletion by the attackers. When investigating potential breaches, it's good to be prepared, which is why an increasingly common practice in the cloud is to set up an isolation VPC which, as the name suggests, is isolated from the rest of the cloud with its own segmented access to the internet. The additional step of turning on VPC Flow Logs (akin to NetFlow) will capture network traffic. This approach is useful because compromised systems can be placed in the VPC for containment and analysis. I found the SANS DFIR YouTube channel a useful resource to get started on DFIR.
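The attack taxonomy above lends itself to simple triage automation over CloudTrail records. Here is a hypothetical Python sketch that buckets API event names into the rough phases described; the prefix lists and phase labels are illustrative assumptions, and real detections would be far richer than string prefixes.

```python
def classify_phase(event_name: str) -> str:
    """Bucket a CloudTrail event name into the rough attack phases above.
    Prefix lists are illustrative, not a complete detection ruleset."""
    tampering = {"StopLogging", "DeleteTrail", "PutEventSelectors"}
    recon = ("Get", "List", "Describe", "Lookup")
    exploit = ("Create", "Modify", "Delete", "Disable", "Remove", "Put", "Update")
    if event_name in tampering:      # check first: DeleteTrail also starts with "Delete"
        return "log-tampering"
    if event_name.startswith(recon):
        return "recon/collection"
    if event_name.startswith(exploit):
        return "exploitation"
    return "other"
```

Run over a time-ordered stream of events, a recon burst followed by exploitation and then tampering events from the same principal is exactly the shape worth alerting on.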
Of course, it's worth mentioning again that for any incident response to work, logging has to be enabled everywhere, whether it's Azure Log Analytics, Azure Monitor, or AWS CloudTrail / CloudWatch. If it moves, log it!
Step 9 : Encryption
Getting a handle on encryption across your cloud deployment is likely the top technical control implementation next to mastering IAM. In tackling encryption, I see a three-fold approach which looks at the following areas: a) specific threats to data at rest and in transit, and their mitigations; b) general industry framework guidance; and c) CSP architectural guidance.
For a), we can look at many of the OWASP API Security Top 10 risks, such as "broken user authentication", "excessive data exposure" and "security misconfiguration", which all highlight the need for encrypted authentication keys and/or passwords as they relate to cloud objects and APIs, and the need to enable TLS encryption while disabling insecure services such as plain HTTP. These top 10 risks closely match actual exploits which have led to some of the largest industry breaches in history. Stolen AWS root keys, unencrypted S3 buckets, weak encryption schemes, unencrypted databases and emails, expired security certificates, and others are all among the usual suspects in encryption breach-lore.
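To make the "usual suspects" concrete, here is an illustrative Python sketch of the kind of checks a bucket audit might run. The config dict shape is invented for this example; real checks would query the CSP APIs (e.g. S3 default encryption and bucket policy settings) rather than a hand-built dict.

```python
def encryption_findings(bucket: dict) -> list:
    """Return encryption-related findings for a simplified bucket config.
    The dict shape is invented for illustration; real checks query CSP APIs."""
    findings = []
    if not bucket.get("default_encryption"):
        findings.append("no default encryption at rest")
    if not bucket.get("enforce_tls"):
        findings.append("transport not restricted to TLS")
    if bucket.get("public"):
        findings.append("bucket is publicly accessible")
    return findings

# An unencrypted, public bucket: the classic breach recipe
audit = encryption_findings({"default_encryption": False,
                             "enforce_tls": True,
                             "public": True})
```

Scheduled against every storage resource, even a crude audit like this would have caught several of the headline S3 exposures.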
For b), general industry framework guidance, I would first look at the Cloud Security Alliance and their Security Guidance for Critical Areas of Focus in Cloud Computing v4.0. Domain 11, entitled "Data Security and Encryption", is a good primer for encryption principles in the cloud. This primer is a good accompaniment to their Cloud Controls Matrix, which can be used to implement specific controls in your environment. ISO 27017 is another port of call for higher-level governance controls in its "Cryptographic Controls" domain (10). Then, when you're ready to get fully teched out, CIS Benchmarks are available for free download (with registration) to get to the command-line, trench level of enabling encryption services for AWS, Azure, and GCP.
Finally on this topic, c) CSP architectural guidance. AWS has a well-resourced library of artifacts, from their AWS Well-Architected Framework (and tool) to their data security documentation, along with product-specific sections on CloudHSM, KMS, the AWS SDKs, RDS, REST API security, and more. Azure has its own docs section with links to its encryption service offerings (BitLocker, SQL, Cosmos DB, SSE, and more), while its cloud architecture framework is a good place to start at a high level before you delve into the specifics of encryption. Lastly, GCP's documentation enters the encryption service fray with a section on data encryption covering client-side and server-side encryption technologies.
Step 10 : Cloud AI / ML Security
Hyperautomation is becoming a new buzzword in technology, broadly describing a desire to highly automate processes and augment human capabilities. From a security perspective, the need to automate security processes has never been greater, given the growing sophistication of attacks and the increasing array of value-added cloud services from all CSPs. Obvious candidates for AI applications have emerged, including threat intelligence and detection, which Amazon GuardDuty promotes for its logging platforms. Azure promotes ML-enabled adaptive application controls to automatically whitelist applications on Azure VMs based on behavioral analysis. ML-enabled behavioral analytics is now a feature of many vendor offerings, which is borne out when looking at Gartner's Magic Quadrant and various industry analyst reports. AI-powered solutions for privileged access management, malware detection, and other anomaly detection are now frequent mentions in articles on the top 10/20/30 applications of AI in cybersecurity, and should be part of every CSO's security strategy for securing their cloud deployment.
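Behind many "ML-enabled" behavioral analytics features sits something conceptually simple: flagging activity that deviates sharply from a baseline. Here is a toy Python sketch using a z-score over historical API call counts; the threshold is arbitrary and real products use far more sophisticated models, so treat this purely as intuition-building.

```python
import statistics

def is_anomalous(history: list, latest: float, threshold: float = 3.0) -> bool:
    """Flag the latest count if it deviates more than `threshold`
    standard deviations from the historical baseline."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean   # flat baseline: any change is notable
    return abs(latest - mean) / stdev > threshold

api_calls_per_hour = [10, 12, 11, 9, 10, 11]   # baseline for one principal
burst = is_anomalous(api_calls_per_hour, 100)  # sudden enumeration-style burst
```

The same baseline-and-deviation idea, scaled up per principal, per resource, and per API family, is roughly what the vendor behavioral analytics offerings are selling.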
Kudos to those who have made it this far in my article, as the subject matter is a bit heavy. So, to distill it all down to a few takeaways, here goes.
1) Automation and Templatization. The cloud has a bigger attack surface than on-prem deployments. Given that most companies will use multiple CSPs in some fashion, and that the cloud is still newish to security organizations and DevOps, automation has become an imperative. Automation may include ML-enabled automated response to threats provided by products like GuardDuty, or using a continuous integration tool like Jenkins to automate code testing. The end objective is to relieve the pressure on staff trying to close the gap on motivated adversaries.
2) Cloud-Specific Controls. It's time to integrate cloud-specific controls into your existing frameworks. CIS Benchmarks and the Cloud Security Alliance's Cloud Controls Matrix are two good examples. As the cloud is essentially an API superstructure, controls have to be more aligned with the AppSec space.
3) Get the Culture Right. With all the technical controls in the world, your best efforts will stay underwater if the cultural approach to security is not properly addressed. Human error is hands down the number one contributory factor to major breaches in organisations. The questions become: why didn't your engineer enable two-factor authentication for that privileged account? Why was that service published publicly? Why didn't you know the service existed in the first place? Is the executive team passing a top-down message about the importance of security in the organization, and do they fully understand the risks? As I mentioned in step 1, siloed environments are often a security chokehold. Breaking down those silos would be in my top three imperatives.
Stay tuned for future posts by following us on Twitter, LinkedIn and Facebook.