Magdalena Jackiewicz
Editorial Expert
Reviewed by a tech expert
Andrzej Lewandowski
Development Leader

Data access control in Snowflake: key principles and best practices


Data is the spinning wheel of today’s business landscape, powering decision-making and driving innovation. However, as data volumes continue to skyrocket, many companies struggle to manage and control data access effectively. This challenge is particularly acute for those using Snowflake, a leading cloud data platform that comes with its own data access tools and policies.

As you scale your Snowflake deployments, onboard more users and expand the data ecosystem, governing data access can quickly become unmanageable. Who has the ability to read, write, or modify sensitive information? How can we ensure compliance with evolving data privacy regulations? How do we empower cross-functional teams to harness the full value of our data while maintaining its security and privacy? These are some of the typical questions we’ve heard from our clients.

In this blog post, we'll explore proven strategies and best practices for successfully managing data access policies in Snowflake. Whether you're a Chief Data Officer overseeing an enterprise-wide data initiative or an IT leader building a modern data platform, this guide will equip you with the knowledge and tools to scale your Snowflake environment with confidence and control.

From establishing a comprehensive data access framework to leveraging Snowflake's advanced security features, we'll cover the essential elements needed to unlock the full potential of your data while mitigating risk and ensuring regulatory compliance. By the end, you'll have a roadmap for transforming your Snowflake deployment into a secure, scalable, and user-friendly data powerhouse.

What is data access control?

Data access control is about how an organization manages and governs who can access, view, modify, or interact with its data. This is typically achieved through a combination of policies, processes and tools adjusted to the context of the organization.

Elements of data access control

There are several dimensions of data access control:

Managing users and roles

Data access control encompasses defining user accounts and mapping them to specific roles or personas, such as data analysts, data scientists, business users, and IT administrators, and then assigning appropriate permissions and access privileges to each role based on the principle of least privilege.

Implementing granular access controls

Organizations must also implement fine-grained access controls on the data itself: at the database, schema, table, and even column level. This ensures that users can access only the data they need to perform their job responsibilities and nothing beyond it.

Applying data masking and obfuscation

Techniques like data masking, tokenization, or anonymization can (and should!) be applied to protect sensitive data or personally identifiable information (PII) from unauthorized access or exposure. Head to our data glossary for definitions.

Ongoing audit and monitoring

Robust auditing and monitoring capabilities are essential, tracking and logging all user activities, data access, and changes within the data platform, and generating reports and alerts to detect anomalies or potential security breaches.

Integration with the organization’s IAM

Seamless integration with the organization's broader Identity and Access Management (IAM) system, leveraging single sign-on (SSO) and multi-factor authentication, further enhances security.

Compliance with rules and regulations

Finally, ensuring the data access policies and controls align with industry regulations, such as GDPR, HIPAA, and PCI DSS, is crucial for demonstrating the organization's ability to protect sensitive data and maintain data governance.

In short, data access control is a set of organizational rules that stipulate who can access what and for what purpose, while protecting data privacy, ensuring regulatory compliance and tying in with the broader organizational IT infrastructure. In the end, it’s a matter of striking the right balance between data democratization and data security: empowering users to leverage data while mitigating the risk of unauthorized access, data breaches, or compliance violations.

Challenges of data access control

“We keep on amassing data and at this point, we’re losing track of who can access and use it, and in what way” – this is a common problem organizations face on their way to becoming more data-driven.

We’d like to distinguish between the challenges facing businesses that are only planning to adopt Snowflake and those facing businesses that already use it, as vastly different challenges have to be addressed in each case.

Challenge 1: Automating data access management

Automating the provisioning and deprovisioning of access is a critical part of data access management. You need to be able to implement changes quickly and with minimal risk of error as your team grows and access requirements change. Manual handling is error-prone, invites loss of control and privacy breaches, and is extremely time-consuming. Automating it, however, is challenging for the following reasons:

  • Snowflake offers a robust set of granular access control features: the ability to grant and revoke permissions at the database, schema, table, and even column level. Automating the management of these controls so that they ensure the right level of access is a serious and challenging task.
  • Snowflake's data sharing capabilities greatly simplify sharing data with external partners or customers. Automating the management of such relationships, however, brings on additional access control challenges when it comes to coordination and data governance.
  • In the real world, there will also be exceptions and edge cases requiring manual intervention. You need solid processes for handling such exceptions, so that they can override the automated access control policies without disrupting the entire system.
  • Additionally, automating access control requires integrating Snowflake's user management features with other IAM systems (such as Active Directory or SSO providers) on the one hand, and with various SIEM tools on the other.
  • Compliance with various industry regulations and internal policies regarding data access is a must for every business. Consequently, you may have to integrate Snowflake's logging and monitoring capabilities with other SIEM tools, which further complicates automating the process of monitoring and auditing user activities.

Challenge 2: Grasping the organization's data landscape

Setting up the relevant and effective data access management policies in Snowflake requires prior mapping of the data environment, with all its assets and key players, as well as deciding who should be able to access what data and to what ends.

Grasping the entire data landscape is a significant challenge for several reasons:

  • Modern organizations often have complex, distributed data environments, with data stored in various cloud platforms, on-premises systems, and disparate data sources. Keeping track of all the different data assets, their relationships, and their sensitivity levels can be daunting, especially when…
  • … the data landscape changes rapidly, as businesses generate and consume more and more data. This makes it even more challenging to stay up to date and keep access policies aligned with requirements.
  • In many organizations, data management responsibilities are often siloed across different teams, departments, or business units. This contributes to the lack of the general, holistic visibility into the data assets as well as the lack of consistency when it comes to data quality.
  • Businesses often lack data governance, so they struggle with maintaining a clear understanding of their data landscape. They often don’t have thorough visibility into who has access to what data, how it is being used, and where it is stored.
  • Effective data access management in Snowflake requires a clear understanding of the sensitivity and classification of the data. We’ve seen organizations where the lack of a standardized approach to data classification and labeling added more difficulty to applying appropriate access controls.
  • Many organizations have a mix of legacy systems and modern data platforms, such as Snowflake. Integrating these disparate systems and managing access across the entire data landscape is certainly a complex and time-consuming undertaking, often exacerbated by technical debt.

Challenge 3: Mapping out the policies and players

As your business evolves, the number of users, teams, and roles within the Snowflake environment can increase exponentially. You must be able to map out and track all the different users, their roles, and their associated permissions. This may be complicated because of the following:

  • Snowflake provides a highly granular access control model, allowing organizations to grant and revoke permissions at the database, schema, table, and even column level. Managing this level of granularity across a large and dynamic user base can be a significant challenge.
  • In Snowflake, permissions can also be inherited through the hierarchy of databases, schemas, and objects. You have to have a thorough understanding of how permissions will be inherited throughout the data environment as your business continues to grow.
  • You must follow the principles of segregation of duties and least privilege, while still allowing users to perform their necessary tasks. This is a balancing act that requires careful planning and ongoing maintenance.
  • Many organizations have temporary or project-based access requirements, where users need to be granted specific permissions for a limited duration. Managing these temporary access grants and ensuring they are revoked when no longer needed is simply prone to errors.
  • Regulatory requirements and internal security policies often mandate regular audits of user access and permissions in Snowflake. Ensuring a comprehensive view of the current state of user access, grants, and permissions is a great challenge, especially in large data environments.

Common data access control risks

Failing to implement effective data access policies can expose your business to a range of serious risks:

  • Data breaches and unauthorized access: without proper access controls, sensitive data is exposed to potential breaches. We all know what this means in today’s data-driven world: reputational damage, financial losses, and legal liabilities for the organization.
  • Regulatory violations: most industries are subject to strict data privacy and security regulations, such as GDPR, HIPAA, or PCI-DSS. Inadequate data access policies can lead to non-compliance, resulting in hefty fines, legal penalties, and credibility damages.
  • Data misuse: data with poorly controlled access can be manipulated or misused (by your employees or by external parties), leading to inaccuracies, biased decision-making, and potential financial or reputational harm.
  • Slow workflows: poorly managed data access can slow down your business processes, preventing users from swiftly obtaining the data they need to perform their tasks. This can reduce productivity and decrease organizational agility.
  • Lack of proper data governance: well-defined access policies help to ensure the proper management and quality of data assets. Without those in place, you risk working with fragmented and inconsistent data and undermining your data-to-insight efforts.
  • Heightened risk of data loss or theft: sensitive information is always prone to theft or loss – be it through accidental deletion, malicious actions, or unauthorized data transfers. The consequences? Financial losses, operational disruptions and reputational damage.

To mitigate these risks, organizations should implement a comprehensive data access policy as part of their overall data governance framework. This policy should define clear roles, responsibilities, and access controls, ensuring that data is accessible only to authorized individuals and that appropriate security measures are in place to protect against unauthorized access, misuse, and data breaches.

Data access control in Snowflake

Snowflake's access control framework combines aspects of DAC (Discretionary Access Control, where object owners can grant access to the object) and RBAC (Role-Based Access Control, where access is managed through roles).

In the Snowflake model there are four key terms to understand:

  • Securable objects: the entities to which access can be granted. Access is denied unless explicitly allowed.
  • Roles: the entities that can be granted privileges to access objects. Roles can be assigned to users and other roles, creating a role hierarchy.
  • Users: the identities, whether associated with a person or program, recognized by Snowflake.
  • Privileges: define the level of access granted to a role for a securable object.

This combination of DAC and RBAC is designed to provide a high degree of control and flexibility in managing access to securable objects in the system. More specifically:

  • Access to securable objects is granted through privileges assigned to roles. These roles are then assigned to users or other roles, creating a role hierarchy.
  • Privileges are always granted to roles, never directly to users.
  • Granting a role to another role establishes a hierarchy, where the privileges of the granted role are inherited by the higher-level roles.
  • Each securable object has an owner role that can grant access privileges to other roles. So, this is different from a user-based access control model, where rights and privileges are assigned directly to individual users or user groups.
Diagram demonstrating Snowflake's data access control framework.
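
To make the inheritance rule concrete, here is a minimal Python sketch that models a role hierarchy in memory and resolves the privileges a role effectively holds. The role names, privilege strings and the `effective_privileges` helper are hypothetical illustrations; none of them is part of Snowflake.

```python
from typing import Dict, Set

# Hypothetical example data: privileges granted directly to each role.
DIRECT_PRIVILEGES: Dict[str, Set[str]] = {
    "ANALYST": {"SELECT ON SALES.PUBLIC.ORDERS"},
    "ENGINEER": {"INSERT ON SALES.PUBLIC.ORDERS"},
    "DATA_ADMIN": {"CREATE TABLE ON SCHEMA SALES.PUBLIC"},
}

# Hypothetical hierarchy: key = higher-level role, value = roles granted to it.
GRANTED_ROLES: Dict[str, Set[str]] = {
    "ENGINEER": {"ANALYST"},
    "DATA_ADMIN": {"ENGINEER"},
}


def effective_privileges(role: str) -> Set[str]:
    """Return the role's own privileges plus those of every role granted to it."""
    seen: Set[str] = set()
    stack = [role]
    privileges: Set[str] = set()
    while stack:
        current = stack.pop()
        if current in seen:  # guard against accidental cycles in the example data
            continue
        seen.add(current)
        privileges |= DIRECT_PRIVILEGES.get(current, set())
        stack.extend(GRANTED_ROLES.get(current, set()))
    return privileges
```

In this toy data, `effective_privileges("DATA_ADMIN")` collects all three privileges, while `ANALYST` holds only its own SELECT privilege.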

Securable objects

Securable objects reside within a hierarchical structure, with the top-level being the customer organization, followed by databases, schemas, and individual objects like tables, views, etc.

  • Each securable object is owned by a single role, which is typically the role that created the object. The owner role has all privileges on the object by default.
  • Ownership of an object can be transferred to another role.
  • In a regular schema, the owner role can grant or revoke privileges on the object to other roles.
  • In a managed access schema, only the schema owner or a role with the relevant privilege can make grant decisions.
  • The specific SQL actions a user can perform on an object are defined by the privileges granted to the user's active role.
  • The hierarchical structure and ownership model provide a flexible and granular way to control access to securable objects in Snowflake.
Diagram of hierarchical structure and ownership model in Snowflake.

Roles

Roles are the entities to which privileges on securable objects can be granted and revoked. Users are assigned to roles to perform actions required for their business functions.

  • Users can be assigned multiple roles and switch between them in a Snowflake session to perform different actions.
  • Snowflake has a small number of system-defined roles that cannot be dropped or have their privileges revoked.
  • Users can create custom roles to meet specific business and security needs.
  • Roles can be granted to other roles, creating a role hierarchy. Privileges are inherited by roles higher in the hierarchy.

There are different types of roles, which are managed in accordance with the RBAC model:

  • Account roles: grant privileges to any object in the account
  • Database roles: grant privileges limited to a single database
  • Instance roles: grant access to instances of a class
  • Application roles: enable consumer access to objects in a Snowflake Native App

Privileges

Privileges determine who can access and perform operations on specific objects in Snowflake. Each securable object has a set of privileges that can be granted.

  • Privileges are managed using the GRANT and REVOKE commands.
  • In regular (non-managed) schemas, only the object owner or a role with the MANAGE GRANTS privilege can grant or revoke privileges.
  • In managed access schemas, only the schema owner or a role with the MANAGE GRANTS privilege can grant privileges, centralizing privilege management.
  • Roles can be granted to other roles, creating a role hierarchy. Privileges are inherited by roles higher in the hierarchy.

Snowflake privilege hierarchy

There are some limitations with database roles in the role hierarchy:

  • A database role granted to a share cannot have other database roles granted to it.
  • A database role granted to another database role cannot be granted to a share.
  • Account roles cannot be granted to database roles in the hierarchy.

The privilege model, along with the role-based access control, provides granular control over access to securable objects in Snowflake.
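
As an illustration of the GRANT and REVOKE commands mentioned above, the following Python sketch renders the statements a role would typically need to query a single table. The helper names and the example object names are assumptions made for this sketch, and identifiers are interpolated without validation, so it presumes trusted input.

```python
from typing import List


def grant(privilege: str, object_type: str, name: str, role: str) -> str:
    """Render a GRANT statement that gives a privilege on an object to a role."""
    return f"GRANT {privilege} ON {object_type} {name} TO ROLE {role};"


def revoke(privilege: str, object_type: str, name: str, role: str) -> str:
    """Render the matching REVOKE statement."""
    return f"REVOKE {privilege} ON {object_type} {name} FROM ROLE {role};"


def read_access(database: str, schema: str, table: str, role: str) -> List[str]:
    """USAGE on the containing database and schema, then SELECT on the table."""
    return [
        grant("USAGE", "DATABASE", database, role),
        grant("USAGE", "SCHEMA", f"{database}.{schema}", role),
        grant("SELECT", "TABLE", f"{database}.{schema}.{table}", role),
    ]
```

Generating the statements rather than typing them by hand keeps the container-level USAGE grants from being forgotten, which is a common reason a role with SELECT on a table still cannot query it.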

Setting up data access policies in Snowflake

Of course – you need to develop a comprehensive data access policy as part of your data governance strategy. But what does this actually mean? What’s at stake? Let’s take a closer look at each step we’ve identified in this process.

Step 1: Classify all data assets

Begin by preparing a thorough inventory of your data assets, including databases, data lakes, data warehouses, and other data repositories. (We’ve written a thorough comparison of these terms in a separate article.) You need to understand the data sources, data types, data volumes, data owners, and the purpose of the data. Data discovery or data cataloging tools can greatly accelerate these tasks.

Then, classify data assets based on the level of sensitivity (e.g., public, internal, confidential, or highly sensitive), the level of criticality to business operations (e.g., mission-critical, essential, or non-essential), and any applicable regulatory requirements (e.g., personally identifiable information, financial data, or intellectual property). The classification framework needs to take into account the organization's risk management strategies and any regulatory obligations.
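
A classification scheme like this can be captured in a small data structure so that it can later drive access decisions. The sketch below is one possible shape in Python; the `DataAsset` type, the mapping from classification to controls and the example assets are all hypothetical and would need to reflect your own risk and regulatory framework.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Tuple


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    HIGHLY_SENSITIVE = 3


@dataclass(frozen=True)
class DataAsset:
    name: str
    sensitivity: Sensitivity
    criticality: str                    # "mission-critical", "essential" or "non-essential"
    regulations: Tuple[str, ...] = ()   # e.g. ("PII",) or ("financial",)


def required_controls(asset: DataAsset) -> List[str]:
    """Map a classification to controls (an assumed, illustrative policy)."""
    controls = ["role-based grants"]
    if asset.sensitivity >= Sensitivity.CONFIDENTIAL or "PII" in asset.regulations:
        controls.append("column masking")
    if asset.sensitivity == Sensitivity.HIGHLY_SENSITIVE:
        controls.append("quarterly access review")
    return controls
```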

The data catalog will have to be integrated with your data platforms and databases, to enable automation of metadata capture and updates. It’s good to involve data stewards and domain experts to contribute to cataloging the data – they can add context that enhances the overall data usability.

Step 2: Build a thorough understanding of the organization’s data architecture

This step is critical because the architecture of your data platform will determine the access policies. For instance, in a data mesh architecture, data domains are managed by cross-functional teams, so data ownership is decentralized. This requires a more nuanced approach to data access policies, especially in a situation where each data domain has different access requirements.

You also need to understand the data lineage and dependencies across your data ecosystem. With Snowflake’s architecture, changes in one domain may have ripple effects on other parts of the organization, so understanding how data is linked is critical for assessing the impact of access policy changes.

While you’re on this, you can’t forget about principles that are pertinent to data architecture, such as data sovereignty, data ownership, and data stewardship, as these will also influence the access policies and the overall data governance framework. If these terms are new, consult our comprehensive data glossary.

Step 3: Build a thorough understanding of the organization

Start with a thorough investigation of the organizational structure. This is crucial for identifying all the different user roles and their data access requirements. Map out the different user roles within your organization, such as executives, data analysts, data scientists, business users, and IT administrators and make sure you understand the different access needs for every individual.

In practice, you’ll notice that the 3 steps are closely intertwined and you may end up gathering all this data simultaneously, but we wanted to highlight that the following aspects are critical here: grasping all the data assets, understanding the data platform architecture and understanding who has to access what data, what they can do with it and what they can’t.

Step 4: Map out the organization's data access policies

Next, establish clear policies for granting, managing, and revoking access to data resources and map out these relationships. This involves assigning individuals a role and establishing the specific data access requirements for each user role:

  • the type of access they need (read, write, or administrative permissions),
  • the level of granularity (e.g., access to specific datasets, tables, or columns),
  • any temporal or contextual constraints (e.g., access only during certain time periods or for specific business purposes).

The mapping must clearly show data lineage and dependencies so that you can understand how data flows through your ecosystem.

This step is essentially about documenting and visualizing everything you’ve established in the previous steps: describing the access policies as they pertain to all of your data assets and identities, who will own and use them, and in what way.
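
One way to document such a mapping in a machine-readable form is a list of rules that record the role, the object, the type of access, the column scope and any time limit. The Python sketch below is a minimal illustration under those assumptions; `AccessRule`, `RULES` and `is_allowed` are hypothetical names, and anything not covered by a rule is denied.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class AccessRule:
    role: str
    obj: str                            # fully qualified object name
    access: str                         # "read", "write" or "admin"
    columns: Tuple[str, ...] = ()       # empty tuple means all columns
    valid_until: Optional[date] = None  # None means no time limit


# Hypothetical policy mapping for two roles.
RULES = [
    AccessRule("ANALYST", "SALES.PUBLIC.ORDERS", "read", ("ORDER_ID", "AMOUNT")),
    AccessRule("CONTRACTOR", "SALES.PUBLIC.ORDERS", "read", valid_until=date(2024, 6, 30)),
]


def is_allowed(role: str, obj: str, access: str, column: str, on: date) -> bool:
    """Return True only if some rule covers the request; deny by default."""
    for rule in RULES:
        if (rule.role, rule.obj, rule.access) != (role, obj, access):
            continue
        if rule.columns and column not in rule.columns:
            continue
        if rule.valid_until is not None and on > rule.valid_until:
            continue
        return True
    return False
```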

Step 5: Leverage Snowflake's access control features

Snowflake offers robust access control features and you definitely want to take advantage of those. The following will allow you to precisely control who can access your data and what they can do with it, ensuring the security and integrity of your organization's sensitive information.

Data masking

Snowflake's data masking feature enables you to obfuscate sensitive data, such as PII or financial data, helping you to protect it against unauthorized exposure while still allowing users to interact with the data in a meaningful way. Users may be able to manage different data sets without actually seeing the data values, which can be highly useful in multiple scenarios.
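
In Snowflake, masking is expressed as a masking policy that is attached to a column and evaluated at query time. The Python sketch below renders such a policy for a string column; the policy name, role names, mask text and helper functions are assumptions chosen for illustration, and the inputs are interpolated without validation.

```python
from typing import Iterable


def masking_policy_sql(policy: str, allowed_roles: Iterable[str], mask: str = "***MASKED***") -> str:
    """Render a masking policy that reveals the value only to the listed roles."""
    roles = ", ".join(f"'{r}'" for r in allowed_roles)
    return (
        f"CREATE OR REPLACE MASKING POLICY {policy} AS (val STRING) RETURNS STRING ->\n"
        f"  CASE WHEN CURRENT_ROLE() IN ({roles}) THEN val ELSE '{mask}' END;"
    )


def attach_policy_sql(table: str, column: str, policy: str) -> str:
    """Render the statement that applies the policy to one column."""
    return f"ALTER TABLE {table} MODIFY COLUMN {column} SET MASKING POLICY {policy};"
```

With a policy like this, a role outside the allowed list can still join, filter and count on the column, but sees only the mask text instead of the underlying values.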

Grants

Grants are the very foundation of Snowflake's access control system. They let you assign specific privileges to roles, and roles to users, giving you full control over what actions can be performed on database objects. Grants can be applied at the database, schema, or object level, fully supporting granular control over data access.

Future grants

Snowflake's future grants feature allows you to pre-define privileges that are granted automatically on objects of a given type (for example, tables or views) as they are created in a database or schema. This is particularly useful when new objects are added regularly, because the roles that need access receive it without anyone having to issue a separate grant each time.
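
In Snowflake's GRANT syntax, a future grant is written with `ON FUTURE <object type> IN SCHEMA` and applies to objects created in that schema from then on; it does not touch objects that already exist. The Python sketch below therefore pairs it with a grant on all current tables. The helper name and the example schema and role are hypothetical.

```python
from typing import List


def read_grants_for_schema(schema: str, role: str) -> List[str]:
    """Cover both existing tables and tables created later in the schema."""
    return [
        # Existing tables: a one-off grant on everything present right now.
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO ROLE {role};",
        # Future tables: granted automatically when each new table is created.
        f"GRANT SELECT ON FUTURE TABLES IN SCHEMA {schema} TO ROLE {role};",
    ]
```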

Managed schemas

Snowflake's managed access schemas provide a way to centralize grant decisions for database objects. In a managed access schema, object owners can no longer grant privileges on their objects; only the schema owner or a role with the MANAGE GRANTS privilege can, which simplifies the administration of your data platform.
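
A schema becomes a managed access schema either at creation time or by altering an existing schema. The short Python sketch below renders the appropriate statement for each case; the helper name and its `already_exists` flag are assumptions made for this example.

```python
def managed_access_sql(schema: str, already_exists: bool) -> str:
    """Render the statement that puts a schema under managed access."""
    if already_exists:
        # Convert a regular schema in place.
        return f"ALTER SCHEMA {schema} ENABLE MANAGED ACCESS;"
    # Create a new schema with managed access from the start.
    return f"CREATE SCHEMA {schema} WITH MANAGED ACCESS;"
```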

Hierarchical roles

Snowflake's RBAC system allows you to define a hierarchy of roles, with each role inheriting the privileges of the roles granted to it, that is, the roles below it in the hierarchy. This means you can assign users to specific roles and easily update permissions at the role level.

Row-Level Security (RLS)

Snowflake's RLS feature enables you to restrict access to specific rows of data based on user attributes or other dynamic criteria.
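
Row-level security in Snowflake is implemented with row access policies, which return true or false for each row. The Python sketch below renders a policy that lets one role see every row and lets other roles see only the regions listed for them in a mapping table. The mapping table, its `role_name` and `region` columns, and all names passed in are hypothetical.

```python
def row_access_policy_sql(policy: str, mapping_table: str, unrestricted_role: str) -> str:
    """Render a row access policy keyed on a region column."""
    return (
        f"CREATE OR REPLACE ROW ACCESS POLICY {policy} AS (row_region VARCHAR) RETURNS BOOLEAN ->\n"
        f"  CURRENT_ROLE() = '{unrestricted_role}'\n"
        f"  OR EXISTS (SELECT 1 FROM {mapping_table} m\n"
        f"             WHERE m.role_name = CURRENT_ROLE() AND m.region = row_region);"
    )


def attach_row_policy_sql(table: str, policy: str, column: str) -> str:
    """Render the statement that binds the policy to a table column."""
    return f"ALTER TABLE {table} ADD ROW ACCESS POLICY {policy} ON ({column});"
```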

Column-Level Security (CLS)

In addition to row-level security, Snowflake also offers CLS, which allows you to control access to specific columns of data. This is useful when you need to protect sensitive information contained within a specific column.

Step 6: Automate data access management 

You need to have a single, centralized access management system to easily control data access across your entire organization, as it will inevitably evolve. Without it, you will end up wasting a lot of time on manual work, increasing the chances of errors and, thus, the risk of data breaches.

We think that the good old MS Excel works perfectly for automation purposes. We use the spreadsheet as a main dashboard that offers a clear overview of the policy mapping and can be used to automate data access management.

What are the advantages of using Excel for this purpose? Most people are already familiar with it, so it doesn’t require onboarding and can easily be used and consulted by individuals with minimal data experience.

Excel is also easy to integrate with pretty much all data platforms, including data catalogs, data lakes, data warehouses, and other data sources, so it’s great for automating access control management and ensuring seamless user provisioning and de-provisioning.
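
As a rough idea of what spreadsheet-driven automation could look like, the Python sketch below reads a policy sheet exported as CSV, compares it with the grants currently in place and plans the GRANT and REVOKE statements needed to reconcile the two. The column names, the sample data and the way current grants are obtained are all assumptions; this is not a description of our actual tooling.

```python
import csv
import io
from typing import List, Set, Tuple

Grant = Tuple[str, str, str, str]  # (role, privilege, object_type, object_name)


def load_desired(csv_text: str) -> Set[Grant]:
    """Parse the exported sheet into a normalized set of desired grants."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (
            row["role"].strip().upper(),
            row["privilege"].strip().upper(),
            row["object_type"].strip().upper(),
            row["object_name"].strip().upper(),
        )
        for row in reader
    }


def plan_changes(desired: Set[Grant], current: Set[Grant]) -> List[str]:
    """Grant what is missing, revoke what is no longer in the sheet."""
    statements = []
    for role, priv, otype, name in sorted(desired - current):
        statements.append(f"GRANT {priv} ON {otype} {name} TO ROLE {role};")
    for role, priv, otype, name in sorted(current - desired):
        statements.append(f"REVOKE {priv} ON {otype} {name} FROM ROLE {role};")
    return statements


# Hypothetical sheet export and hypothetical snapshot of existing grants.
SHEET = """role,privilege,object_type,object_name
analyst,select,table,sales.public.orders
engineer,insert,table,sales.public.orders
"""
CURRENT = {
    ("ANALYST", "SELECT", "TABLE", "SALES.PUBLIC.ORDERS"),
    ("INTERN", "SELECT", "TABLE", "SALES.PUBLIC.ORDERS"),
}
```

Treating the sheet as the desired state and computing a diff means de-provisioning happens as a side effect of deleting a row, rather than relying on someone remembering to revoke access.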

Step 7: Ensure regular monitoring and review

Once you’ve set up and automated your data access control system, ensure it’s periodically assessed. Your data landscape will certainly change so you need to ensure that the level of access granted to each user is still appropriate based on their current roles and responsibilities.

Regularly analyzing the Snowflake audit logs can help identify any suspicious or unauthorized activities, enabling prompt detection and investigation of potential security incidents. You can also leverage Snowflake's built-in alerting system which allows you to configure custom alerts based on specific criteria, such as failed login attempts, unusual data access patterns, or changes to critical objects.
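
A simple review of this kind can also be scripted. The Python sketch below flags users with repeated failed logins in a batch of login events; the record layout (`user_name`, `is_success` with "YES"/"NO" values), the threshold and the sample events are assumptions for illustration and would need to match however you export your audit data.

```python
from collections import Counter
from typing import Dict, Iterable, List


def users_over_threshold(events: Iterable[Dict[str, str]], threshold: int = 3) -> List[str]:
    """Return users whose failed-login count reaches the threshold, sorted by name."""
    failures = Counter(e["user_name"] for e in events if e["is_success"] == "NO")
    return sorted(user for user, count in failures.items() if count >= threshold)


# Hypothetical sample of exported login events.
events = [
    {"user_name": "ALICE", "is_success": "NO"},
    {"user_name": "ALICE", "is_success": "NO"},
    {"user_name": "ALICE", "is_success": "NO"},
    {"user_name": "BOB", "is_success": "YES"},
    {"user_name": "BOB", "is_success": "NO"},
]
```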

Need support with data access control in Snowflake?

It's important to note that the specific challenges and solutions may vary depending on the size, complexity, and industry-specific requirements of the business. A careful assessment of the organization's data access and governance needs, as well as the available Snowflake features and third-party tools, is crucial for developing an effective automation strategy.

While these challenges can be significant, the biggest hurdles in data access management often lie in ensuring automation and grasping the organizational data landscape. Automating data access management processes can help organizations scale their efforts, reduce the risk of human error, and respond more quickly to changing access requirements. Additionally, gaining a deep understanding of the organization's data assets, user roles, and access patterns is essential for designing efficient and scalable access management strategies.

If you need support with anything outlined above, just drop us a message via this contact form and we’ll get back to you to schedule a free strategy call. 
