In a recent ISC2 report on cloud security, cyber security professionals worldwide rated three concerns highest: unauthorized access through misuse of employee credentials or improper access controls, insecure interfaces and APIs, and misconfiguration of the cloud platform. It’s easy to see why these concerns rate so highly, considering that the vast majority of breaches fall into these categories. Capital One’s 2019 breach, for example, was traced to a misconfigured web application firewall, while Imperva’s 2019 breach involved an AWS API key stolen from a misconfigured internal compute instance. A lack of multi-factor authentication led to the compromised AWS credentials behind the theft of Uber customer account data in 2016, and the same weakness is traceable in other cloud breaches over the last number of years.

Each cloud service provider has very good reference material and solutions for addressing specific threats to authentication, authorization and encryption, for example AWS IAM and KMS, or Azure IAM and Key Vault, which provide identity management and encryption services across a cloud estate. Third-party solution providers also occupy various quadrants of Gartner’s annual reports with their cloud offerings. All that said, it can be difficult to form a risk-based approach to mitigating the many risks to the cloud, so with these steps I hope to set your compass heading as a decision maker for your cloud infrastructure. It’s important to note that there are many third-party solutions for securing elements of the cloud which are not mentioned here; I focus only on what the three main service providers (AWS, Azure and GCP) offer.

Step 1 : Assess the Human Error Factor in Your Organization

Following the people, process and technology paradigm, and having read many root cause analyses, I see human error come up as the top factor again and again. In the context of the cloud, common errors include a) failure to apply multi-factor authentication to privileged accounts (e.g. root accounts), b) overly permissive rules or ignored alerts on WAFs, and c) failure to document and secure the APIs you have. (See the F5 article on APIs for further reading.)

Image: the drinking bird from The Simpsons pecking at a keyboard, depicting the risk of human error.

Culture Creep:

The question then becomes: why are these things happening, and what can I do about them? In my experience, culture creep often skews your best efforts to address security problems. Common symptoms include siloed departments, the poor practice of not documenting what you have (particularly in a fast-paced agile application development world), the perception of security as an inhibitor within the organization, and more. So what works? Assuming you have a decent risk register, you’re paying attention to the OWASP top 10 cloud risks, and you’re running tools like AWS Trusted Advisor or Microsoft’s Azure Security Center, you’ll have a good view of what to protect against. Using this type of information, my focus would be on strengthening, or migrating to, a full SecDevOps model where security is informed and embedded throughout the development cycle. That means well-documented security NFRs and functional requirements, including an obligation to update documentation along the way, such as when new SOA services are published in the service registry. Traditionally, infrastructure and appliances are fairly well understood by support teams, but the waters get a lot cloudier when explaining things like REST APIs, SAML authentication and SABSA to your security teams. Application security must nonetheless be well understood, considering the importance of the security governance role.

I’ve also seen benefits in investing time in automating daily, weekly and monthly checklists for control checks across teams. Many times, checks on AWS CloudTrail or Azure Sentinel logs, WAF rules, API authentication failures etc. are not actually recorded or automatically tallied in any way. This leads to a lack of accountability and to instances where intrusion attempts may only be casually checked for.
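To make the idea of an automatically tallied check a little more concrete, here is a minimal Python sketch. It assumes the log records have already been exported as dictionaries using CloudTrail-style field names (eventName, errorCode, errorMessage, sourceIPAddress); the sample records, the function names and the threshold of 3 are all made up for illustration, so treat it as a starting point rather than a finished control.

```python
from collections import Counter

# Hypothetical sample records. The field names follow the CloudTrail record
# format; the values are invented for this example.
SAMPLE_EVENTS = [
    {"eventName": "ConsoleLogin", "errorMessage": "Failed authentication", "sourceIPAddress": "203.0.113.7"},
    {"eventName": "ConsoleLogin", "errorMessage": "Failed authentication", "sourceIPAddress": "203.0.113.7"},
    {"eventName": "ConsoleLogin", "errorMessage": "Failed authentication", "sourceIPAddress": "203.0.113.7"},
    {"eventName": "GetObject", "errorCode": "AccessDenied", "sourceIPAddress": "198.51.100.4"},
    {"eventName": "ListBuckets", "sourceIPAddress": "198.51.100.4"},
]


def tally_failures(events):
    """Count events that carry an error, grouped by (event name, source IP)."""
    counts = Counter()
    for event in events:
        if event.get("errorCode") or event.get("errorMessage"):
            key = (event.get("eventName", "unknown"), event.get("sourceIPAddress", "unknown"))
            counts[key] += 1
    return counts


def checklist_lines(events, threshold=3):
    """Turn the tally into checklist lines; the threshold is an assumed value."""
    lines = []
    for (name, source), count in sorted(tally_failures(events).items()):
        status = "REVIEW" if count >= threshold else "ok"
        lines.append(f"{status}: {count} failed {name} call(s) from {source}")
    return lines


if __name__ == "__main__":
    for line in checklist_lines(SAMPLE_EVENTS):
        print(line)
```

Writing the resulting lines to a shared ticket or dashboard on a schedule is what turns a casual check into a recorded one.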

Step 2 : Looking at Cloud Governance Models

Taking a top-down approach to cloud security, a governance model that is aware of concepts such as SaaS and PaaS is a good companion to the older governance models you may be using. For this, I look to the service providers themselves, along with institutions such as OMG, NIST and the Cloud Security Alliance and their frameworks. Microsoft uses a Cloud Adoption Framework while Amazon promotes AWS Management and Governance. OMG has a good one too, while NIST has a variety of cloud guidance in its 500 and 800 numbered series. The important thing to remember is that cloud governance models are more closely aligned to platform-specific role-based authentication, cloud API controls and general controls, all in pursuit of limiting the vulnerabilities in the larger attack surface of the cloud.

Step 3 : Identity and Access Management (IAM) Security

IAM is perhaps the most widely used security feature in cloud services. In the case of AWS, the use cases include fine-grained access controls, multi-factor authentication, AD and SAML 2.0 integration, federated identity capabilities etc. IAM, however, comes with its own vulnerabilities, which have been exploited in real-world breaches. Commonly reported IAM vulnerabilities include a) overly permissive policies which allow data leakage from S3 buckets or Azure Blob storage, b) misconfigured roles which allow creation of EC2 or Azure VM instances, and c) compromised keys or roles which allow snapshotting of RDS database instances or EBS volumes, which can later be redeployed with reset passwords to allow access.
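As an illustration of what "overly permissive" looks like in practice, the Python sketch below scans an IAM-style policy document for Allow statements that use wildcard actions or apply to every resource. The policy JSON structure (Statement, Effect, Action, Resource) is the standard AWS one; the function names, the sample policy and the simple matching rules are my own assumptions, and a real review would also need to consider conditions, NotAction and similar elements.

```python
def _as_list(value):
    """IAM allows a single string or a list here; normalise to a list."""
    return value if isinstance(value, list) else [value]


def find_permissive_statements(policy):
    """Return (statement index, reasons) for Allow statements using wildcards."""
    findings = []
    for index, statement in enumerate(_as_list(policy.get("Statement", []))):
        if statement.get("Effect") != "Allow":
            continue
        reasons = []
        for action in _as_list(statement.get("Action", [])):
            if action == "*" or action.endswith(":*"):
                reasons.append(f"wildcard action {action}")
        if "*" in _as_list(statement.get("Resource", [])):
            reasons.append("applies to every resource")
        if reasons:
            findings.append((index, reasons))
    return findings


# Hypothetical policy used only to exercise the check.
SAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Deny", "Action": "*", "Resource": "*"},
    ],
}

if __name__ == "__main__":
    for index, reasons in find_permissive_statements(SAMPLE_POLICY):
        print(f"Statement {index}: {', '.join(reasons)}")
```

Only the first statement is reported here: the second is scoped to one bucket and one action, and the third is a Deny.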

When looking to create a robust IAM setup, I look to automated tools, best practices and logging as first ports of call. AWS offers the IAM Policy Simulator to check the effective permissions of AWS users, including the effect of service control policies and permissions boundaries. The IAM access advisor APIs are also an effective tool for limiting developer access to only the services they need, based on which services they have actually used in the past.

As with all the vendors (AWS, Azure and GCP), best practices for securing IAM are easily available to consumers in each provider’s documentation. Common themes revolve around access key rotation, enabling MFA as part of privileged access management, root key storage, the principle of least privilege etc.
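Two of those themes, MFA and key rotation, lend themselves to a simple automated check. The Python sketch below reads a CSV shaped like a trimmed-down AWS IAM credential report; the column names match the report format as I understand it, but the sample rows, the 90-day rotation window and the function name are assumptions for illustration only.

```python
import csv
import io
from datetime import datetime, timezone

MAX_KEY_AGE_DAYS = 90  # assumed rotation window; use your own policy value

# Hypothetical extract with a subset of credential report columns.
SAMPLE_REPORT = """user,password_enabled,mfa_active,access_key_1_active,access_key_1_last_rotated
<root_account>,not_supported,false,false,N/A
alice,true,true,true,2020-01-10T09:00:00+00:00
bob,true,false,true,2019-03-01T09:00:00+00:00
"""


def review_credentials(report_csv, now):
    """Return (user, finding) pairs for missing MFA and stale access keys."""
    findings = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        user = row["user"]
        has_console_access = row["password_enabled"] == "true" or user == "<root_account>"
        if has_console_access and row["mfa_active"] != "true":
            findings.append((user, "MFA not enabled"))
        if row["access_key_1_active"] == "true":
            rotated = datetime.fromisoformat(row["access_key_1_last_rotated"])
            if (now - rotated).days > MAX_KEY_AGE_DAYS:
                findings.append((user, f"access key older than {MAX_KEY_AGE_DAYS} days"))
    return findings


if __name__ == "__main__":
    # A fixed date keeps the example repeatable; use datetime.now(timezone.utc) in practice.
    today = datetime(2020, 2, 1, tzinfo=timezone.utc)
    for user, finding in review_credentials(SAMPLE_REPORT, today):
        print(f"{user}: {finding}")
```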

Lastly on this topic, I’ll mention logging, which I rate on a par with having IAM in the first place. In the context of AWS, CloudTrail and CloudWatch log management are perhaps your most important detective and preventative controls when it comes to answering the who-did-what-when questions. In the context of IAM, CloudTrail logs API calls and events such as sign-ins, sessions, role changes and permission changes. CloudTrail also logs all API calls to KMS (discussed in part 2 of this article), so in case of a breach you can analyze the origin of each call. CloudTrail can capture AWS console, CLI and SDK activity. Microsoft offers similar capabilities with its Azure Security Center and Sentinel SIEM/SOAR service offerings.

Step 4 : Data Privacy Focused

I’ve touched on some of the governance frameworks up to now, but the implementation of data privacy in the cloud has to be a design consideration in every facet of cloud architecture given the advent of GDPR. One thing to remember is that cloud communication is essentially all API based, so with respect to data leakage prevention, DDoS attack prevention, ransomware attacks etc., GDPR compliance should be more focused than ever on what services (particularly public ones) are exposed in the cloud. You may also be looking at new platform-specific services such as AWS KMS, AWS Shield, Amazon RDS databases etc., which may now need to be factored into your data privacy program.

For SOA environments, protection of SOAP/XML, UDDI and WSDL components has to be factored into the architecture holistically. Many of the same types of web application attack have existed for years, such as SQL injection, DDoS and DNS poisoning, but with the ever-increasing complexity of environments it’s not hard to imagine a published service leading to read/write access to a customer database and resultant data leakage. Defense against these kinds of vulnerabilities is a key driver for continuous involvement and assessment in the design process by security personnel.

Step 5 : Cloud Policies

Existing policies will of course need to be cloud inclusive if they are not already. Incident response needs to factor in things like lost AWS keys, DDoS attacks in the cloud, S3 bucket compromises etc., all of which will need to be documented. Cloud logging and retention policies (including integration into any existing SIEM appliances or services), secure EC2 server builds, password management etc. will also have to be revisited.

From the security department’s perspective, WAF policy review, along with security testing and remediation policies for application and infrastructure vulnerabilities in the cloud, will need to be factored in.

Step 6 : Enterprise Security Architecture

When thinking about security architecture, there are many avenues to go down in adopting a defense-in-depth approach. Given that 40% of breaches are related to insecure development code (Verizon DBIR 2019) being hosted on cloud-based servers, I think it’s a good place to start this topic.

SANS Analyst David Shackleford tells us in his article, The DevSecOps Approach to Securing Your Code and Your Cloud, that only 17% of infosec organizations can keep up with continuous or agile development. He promotes seven imperatives for security teams, the top four of which I’ve included here: 1) embedding security controls and automation in code developed in the cloud, 2) inventory and analysis of reusable code to avoid re-introducing flaws, 3) continuous monitoring of code and results in production, and 4) creation of “triggered” responses that can roll controls back to a known good state if there’s a problem. Following some of Dave’s other good advice on architecture in his YouTube presentation “A Cloud Security Architecture”, cloud security controls and tools should not be designed with only one CSP in mind. The reality is that many organizations will not be just an AWS, Azure or GCP shop; they will likely follow a hybrid model, through accident or design. So when designing IAM security policies, for example, consider their application across multiple CSP platforms so as not to get locked in to one provider. An alternative approach is to leverage cloud access security brokers (CASBs), which offer managed security policies across multiple platforms.

Of course, with any good security architecture we have to think about designing for availability, or designing for failure as it’s often referred to in the cloud. At the application level, developers should make no assumptions about the reliability of the underlying infrastructure, so adaptability should be paramount. It also holds true that each application component should be deployed across multiple cloud components, and automation tools must be in place to allow applications to respond to infrastructure failures. Specific failure recovery objectives might include recovering from a bad server image deployed across the EC2 environment, or rolling back recently deployed security group definitions which are too permissive. You will also need to consider potential elasticity failures, such as bad load balancing parameters which fail to scale up applications when needed.
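The security group rollback scenario can be sketched in a few lines of Python. The rule dictionaries below mirror the IpPermissions structure returned by the EC2 API (IpProtocol, FromPort, ToPort, IpRanges); the list of sensitive ports, the function names and the sample rules are assumptions of mine, and actually applying the chosen rules back to AWS is deliberately left out.

```python
SENSITIVE_PORTS = {22, 3306, 3389}  # assumed list: SSH, MySQL, RDP


def open_to_world(permission):
    """True if the rule allows traffic from any IPv4 address."""
    return any(r.get("CidrIp") == "0.0.0.0/0" for r in permission.get("IpRanges", []))


def exposed_ports(permissions):
    """Return the sensitive ports reachable from anywhere under these rules."""
    exposed = set()
    for permission in permissions:
        if not open_to_world(permission):
            continue
        if permission.get("IpProtocol") == "-1":  # "-1" means all protocols and ports
            exposed |= SENSITIVE_PORTS
            continue
        low, high = permission.get("FromPort"), permission.get("ToPort")
        if low is None or high is None:
            continue
        exposed |= {port for port in SENSITIVE_PORTS if low <= port <= high}
    return exposed


def choose_rules(deployed, known_good):
    """Fall back to the known good rules if the deployed ones expose anything."""
    exposed = exposed_ports(deployed)
    return (known_good if exposed else deployed), exposed


# Hypothetical rule sets for illustration.
KNOWN_GOOD = [
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "10.0.0.0/8"}]},
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
]
DEPLOYED = [
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
]

if __name__ == "__main__":
    rules, exposed = choose_rules(DEPLOYED, KNOWN_GOOD)
    print("rolling back" if exposed else "keeping deployed rules", sorted(exposed))
```

Keeping the known good definition in version control alongside the application code is one way to make this kind of triggered rollback repeatable.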

These points only touch on the overall security architecture of a cloud environment. The point here is to get the imperatives right and to embed security failure scenarios in every infrastructure and application component in the cloud.

Step 7 : Assess Third Party Risk

The Ponemon Institute’s 2018 report estimates that 54% of companies have experienced a breach caused by a third party. Safe to say, supply chain attacks are plaguing the cloud, as in the case of the Verizon breach, which involved the exposure of six million customer records. The RCA for this breach was allegedly traced back to a third party called Nice Systems, a provider of customer service analytics. Nice put six months of customer service call logs, which included account and personal information, on a public (rather than private) Amazon S3 storage server. The documented exploits go on, from Magento e-commerce admin credential exploits to IoT devices with insecure passwords and even insecure HVAC systems. The latter was a factor in the Target breach of 2013. In the shadow of GDPR and ramped-up enforcement from DPAs, third-party risk has never been riskier. So what can be done?

Statistically, companies that take the time to evaluate the security and privacy policies of all their suppliers have shown a 20% reduction in breaches [Ponemon Report, 2018]. I’ve seen good success with companies who embed security provisions in their standard contract agreements or contract wrappers. The terms of doing business with vendors should include requirements to perform self-assessments, to allow onsite visits and audits, or to purchase cyber insurance. Generally speaking, the greater risks are often linked to smaller suppliers, who may lack the funding or personnel for robust security functions. Of course, given the complexity of some enterprises, which may have hundreds of vendors in the mix, it may be more practicable to look at automated TPRM (Third Party Risk Management) tools to standardize risk treatment with greater efficacy.

Of course, all this TPRM is great, but without a continuous monitoring element, reports and questionnaire results from vendors are outdated as soon as the ink dries on the paper. This means a risk management office function, with continuous governance over vendor risk, may well be called for.

Step 8 : Incident Management

In all likelihood, you have fairly well-developed incident management policies and procedures in place for your on-prem systems. The challenge now is in the shifting sands of trust. Once upon a time, DFIR (Digital Forensics and Incident Response) in the cloud was unpalatable to security pros. Concerns expressed by investigators in a survey by SANS (RSA Conf 2018) included difficulty accessing the underlying information needed for forensic examination from a cloud solution, a lack of understanding as to what information is required from the cloud provider for analysis, and hesitancy about multi-tenancy. These concerns, however, have started to change with DFIR cloud platform images which can be spun up quickly to handle ingestion and analysis of large amounts of data. AccessData and several others have ventured heavily into this solutions space over the last two years.

While each of the CSPs offers its own playbooks and best practices, it’s important to understand what to look for in your incident response planning. In AWS, the focus should be on four sources: Config, CloudWatch, CloudTrail and Lambda. Following the taxonomy of many attacks, attackers first enumerate the environment with general Get*/List* type API commands, for example listing permissions or getting environment resources. This is followed by scanning-type activity under the category Resource Data/Event/Collection (i.e. Get*/Describe*/List*/Lookup* API calls), then exploitation of resources under the category Creation/Modification/Deletion (e.g. Delete*/Disable*/Remove*). This is all frequently finished off with log tampering and/or deletion by the attackers. When investigating potential breaches it’s good to be prepared, which is why an increasingly common practice in the cloud is to set up an isolation VPC which, as the name suggests, is isolated from the rest of the cloud environment with its own segmented access to the internet. Compromised systems can be placed in this VPC for containment and analysis, and the additional step of turning on VPC flow logs (akin to NetFlow) will record the network traffic flows. A useful resource for getting started on DFIR is the SANS DFIR YouTube channel.
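To show how that taxonomy might be applied when triaging logs, here is a small Python sketch that buckets CloudTrail-style event names into the phases described above. The Get*/List*/Describe*/Lookup* and Delete*/Disable*/Remove* patterns come straight from the text; the specific log tampering event names (StopLogging, DeleteTrail, UpdateTrail, DeleteLogGroup), the phase labels and the function names are my own illustrative choices, not an official classification.

```python
import fnmatch

# Checked in order, so log tampering wins over the broader Delete* pattern.
PHASES = [
    ("log tampering", ["StopLogging", "DeleteTrail", "UpdateTrail", "DeleteLogGroup"]),
    ("creation/modification/deletion", ["Delete*", "Disable*", "Remove*"]),
    ("enumeration/collection", ["Get*", "List*", "Describe*", "Lookup*"]),
]


def classify(event_name):
    """Map an API event name to an attack phase, or 'other' if none match."""
    for phase, patterns in PHASES:
        if any(fnmatch.fnmatchcase(event_name, pattern) for pattern in patterns):
            return phase
    return "other"


def timeline(events):
    """Return (time, event name, phase) tuples sorted by event time."""
    rows = [(e["eventTime"], e["eventName"], classify(e["eventName"])) for e in events]
    return sorted(rows)


# Hypothetical events for illustration only.
SAMPLE_EVENTS = [
    {"eventTime": "2020-01-01T10:05:00Z", "eventName": "DescribeInstances"},
    {"eventTime": "2020-01-01T10:00:00Z", "eventName": "ListUsers"},
    {"eventTime": "2020-01-01T10:20:00Z", "eventName": "DisableKey"},
    {"eventTime": "2020-01-01T10:30:00Z", "eventName": "DeleteTrail"},
]

if __name__ == "__main__":
    for when, name, phase in timeline(SAMPLE_EVENTS):
        print(when, name, phase)
```

A run of enumeration calls followed by deletions and a tampering event from the same identity is the kind of sequence worth escalating.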

Of course, it’s worth mentioning again that for any incident response to work, logging has to be enabled everywhere, whether it’s Azure Log Analytics, Azure Monitor or AWS CloudTrail / CloudWatch. If it moves, log it!

Step 9 : Encryption

Getting a handle on encryption across your cloud deployment is likely the top technical control implementation next to mastering IAM. In tackling encryption, I see a three-fold approach which looks at the following areas: a) specific threats and mitigations for data at rest and in transit, b) general industry framework guidance, and c) CSP architectural guidance in this area.

For ‘a)’, we can look at many of OWASP’s top 10 risks, such as “broken user authentication”, “excessive data exposure” and “security misconfiguration”, which all highlight the need for encrypted authentication keys and/or passwords as they relate to cloud objects and APIs, and the need to enable TLS encryption while disabling insecure services such as HTTP. These top 10 risks closely match actual exploits which have led to some of the largest industry breaches in history. Stolen AWS root keys, unencrypted S3 buckets, weak encryption schemes, unencrypted databases and emails, expired security certificates and others are all among the usual suspects in encryption breach-lore.
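As one example of enabling TLS while disabling plain HTTP, the Python sketch below builds an S3 bucket policy that denies any request not made over TLS, using the aws:SecureTransport condition key. The bucket name, the Sid and the function names are placeholders I have chosen, and the sketch only builds and checks the policy document; attaching it to a bucket is left to whatever deployment tooling you use.

```python
import json


def tls_only_bucket_policy(bucket_name):
    """Build a bucket policy that denies requests made without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",  # placeholder statement id
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


def denies_plain_http(policy):
    """Check whether a policy contains a Deny on non-TLS requests."""
    for statement in policy.get("Statement", []):
        condition = statement.get("Condition", {}).get("Bool", {})
        if statement.get("Effect") == "Deny" and condition.get("aws:SecureTransport") == "false":
            return True
    return False


if __name__ == "__main__":
    policy = tls_only_bucket_policy("example-bucket")  # hypothetical bucket name
    print(json.dumps(policy, indent=2))
    print("denies plain HTTP:", denies_plain_http(policy))
```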

For ‘b)’, general industry framework guidance, I would first look at the Cloud Security Alliance and their “Security Guidance for Critical Areas of Focus in Cloud Computing v4.0”. Domain 11, entitled “Data Security and Encryption”, is a good primer on encryption principles in the cloud. It is a good accompaniment to their Cloud Controls Matrix, which can be used to implement specific controls in your environment. ISO 27017 is another port of call for higher-level governance controls in its “Cryptographic Controls” domain (10). Then, when you’re ready to get fully tech’ed out, CIS benchmarks are available for free download (with registration) and get down to the command-line, trench level of enabling encryption services for AWS, Azure and GCP.

Finally on this topic, ‘c)’, CSP architectural guidance. AWS has a well-resourced library of information, ranging from the AWS Well-Architected Framework (and tool) to its data security documentation, along with product-specific sections on CloudHSM, KMS, the AWS SDKs, RDS, REST API security and more. Azure has its own docs section with links to its encryption service offerings (BitLocker, SQL, Cosmos DB, SSE and more), while its cloud architecture framework is a good place to start at a high level before you delve into specifics on encryption. Lastly, GCP has a section on data encryption in the Google Cloud docs, which covers client-side and server-side encryption technologies.

Step 10 : Cloud AI / ML Security

Hyperautomation is becoming a new buzzword in technology, broadly describing a desire to highly automate processes and augment human capabilities. From a security perspective, the need to automate security processes has never been greater, given the growing sophistication of attacks and the increasing array of value-added cloud services from all CSPs. Obvious candidates for AI applications have emerged, including threat intelligence and detection, which Amazon GuardDuty provides by analyzing AWS log sources. Azure promotes ML-enabled adaptive application controls, which automatically whitelist applications on Azure VMs based on behavioral analysis. ML-enabled behavioral analytics is now a feature of many vendor offerings, which is borne out when looking at Gartner’s Magic Quadrant and various industry analyst reports. AI-powered solutions for privileged access management, malware detection and other anomaly detection are now frequent mentions in articles on the top 10/20/30 applications of AI in cybersecurity, and should be part of every CSO’s strategy for securing their cloud deployment.
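To give a feel for what behavioral analysis means at its simplest, here is a Python sketch that flags an observation as anomalous when it sits far outside a user’s historical baseline. This is a plain z-score check, a much simpler statistical stand-in for the ML models that products like GuardDuty use, and not a description of how any vendor actually works; the function name, the threshold of 3 standard deviations and the sample counts are all assumptions.

```python
from statistics import mean, pstdev


def is_anomalous(baseline, observed, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the baseline mean."""
    centre = mean(baseline)
    spread = pstdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any change at all is unusual.
        return observed != centre
    return abs(observed - centre) / spread > threshold


# Hypothetical hourly API call counts for one identity.
HOURLY_CALLS = [10, 12, 11, 9, 10, 11, 12, 10]

if __name__ == "__main__":
    for value in (11, 60):
        print(value, "anomalous" if is_anomalous(HOURLY_CALLS, value) else "normal")
```

Real behavioral analytics accounts for seasonality, peer groups and many more signals, but the underlying question is the same: is this activity unusual for this identity?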

Key Takeaways

Kudos to those who have made it this far in my article, as the subject matter is a bit heavy. So, to distill it all down to a few takeaways, here goes.

1) Automation and Templatization: The cloud has a bigger attack surface than on-prem deployments. Given that most companies will use multiple CSPs in some fashion, and that the cloud is still new-ish to security organizations and DevOps, automation has become an imperative. Automation may include ML-enabled automated response to threats provided by products like GuardDuty, or using a continuous integration tool like Jenkins to automate code testing. The end objective is to relieve the pressure on staff trying to close the gap on motivated adversaries.

2) Cloud Specific Controls: It’s time to integrate cloud-specific controls into your existing frameworks. The CIS benchmarks and the Cloud Security Alliance’s Cloud Controls Matrix are two good examples. As the cloud is essentially an API superstructure, controls have to be more aligned with the AppSec space.

3) Get the Culture Right: With all the technical controls in the world, your best efforts will stay underwater if the cultural approach to security is not properly addressed. Human error is hands down the number one contributory factor to major breaches in organisations. The questions become: why didn’t your engineer enable two-factor authentication for that privileged account? Why was that service publicly published, and why didn’t you know the service existed in the first place? Is the executive team passing a top-down message about the importance of security in the organization, and do they fully understand the risks? As I mentioned in step 1, siloed environments are often a security choke hold. Breaking down those silos would be in my top three imperatives.

