Part 1 : Adds RLS and CLS control Policies #2048
Conversation
Proposal in Apache Polaris for the policy spec: https://lists.apache.org/thread/rf2zsgk9qh36z3s63gx6dgtl0s4cwngr
cc @laurentgo
-  PolicyEntity policyEntity, boolean inherited) {
+  private boolean filterApplicablePolicy(PolicyEntity policyEntity) {
     // check the type
     if (policyEntity.getPolicyType().equals(ACCESS_CONTROL)) {
Need a PolicyEntityAugmentor which could conditionally modify the object, let me think about it more.
Force-pushed from decd92b to 85ba526
I'm a bit surprised to see a "ready for review (and merge)" PR for this.
Some preliminary comments :)
// Optional; if present, the policy is applicable to the role
private String principalRole;

// TODO: model them as iceberg transforms
Is there a plan to redo this policy definition after merging or before merging this PR?
We can amend them in another version if a release goes out with this; policies support versioning: https://polaris.apache.org/in-dev/unreleased/policy/
My only concern with not using Iceberg transforms right now is that they are currently limited. Yes, existing column projection can be modeled as an Iceberg identity transform, but if we want to support data masking then transforms would need to contain references to Iceberg UDFs, and that support does not exist yet, hence I refrained. Open to it if we want to model them that way.
SGTM 👍
public class AccessControlPolicyContent implements PolicyContent {

  // Optional; if present, the policy is applicable to the role
  private String principalRole;
How does this policy apply to a role? What is the mechanism? I could not find this in the linked doc 🤔
This is purely to support people who have only role-based access control (providers like MS, AWS, etc.); they can simply store their row filters / projections against the role in the applicable policy.
I go into the details here: https://docs.google.com/document/d/12nhS0GX1U1PqEBKp74bIBZsL9kB5duDlN9diHJAhJsM/edit?tab=t.0#bookmark=id.ij6iuno9gsic
What I mean is: is this role a Polaris Principal role? If we have multiple policies, how do we find the set of applicable policies? (I'm not sure I saw details on this in the doc 😅)
More broadly: is the binding to roles actually part of the policy?
This is not an objection... more of a point to think about.
"If we have multiple policies, how do we find the set of applicable policies? (I'm not sure I saw details on this in the doc 😅)"
This is an existing feature of the policy store: you can retrieve all policies applicable to a given entity. Each policy can correspond to one role.
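For illustration, a minimal self-contained sketch of "retrieve the attached policies, then keep those matching the caller's roles". `RolePolicy` and the matching rule are hypothetical stand-ins for `PolicyEntity` and `AccessControlPolicyContent.principalRole`, not actual Polaris types:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ApplicablePolicyFilter {
    // Hypothetical stand-in for a policy whose content names one principal role.
    public record RolePolicy(String name, String principalRole) {}

    /**
     * Keep a policy when it names no role (applies to everyone) or when the
     * caller was activated with that role.
     */
    public static List<RolePolicy> applicable(List<RolePolicy> attached, Set<String> callerRoles) {
        return attached.stream()
            .filter(p -> p.principalRole() == null || callerRoles.contains(p.principalRole()))
            .collect(Collectors.toList());
    }
}
```

With policies for ANALYST, ADMIN, and "any role" attached to a table, a caller holding only ANALYST would get back two applicable policies.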
// Use a custom deserializer for the list of Iceberg Expressions
@JsonDeserialize(using = IcebergExpressionListDeserializer.class)
@JsonSerialize(using = IcebergExpressionListSerializer.class)
private List<Expression> rowFilters;
How can this work "without context functions" (code comment above)? How will Polaris code interface with these expressions?
They're used to generate a view, right?
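For illustration only: one way an engine could consume row filters is to AND them into the WHERE clause of a secure view. The SQL rendering below is a hypothetical sketch over pre-rendered predicate strings, not how Polaris or Iceberg actually represents `Expression` objects:

```java
import java.util.List;

public class RowFilterViewSketch {
    /**
     * Combine already-rendered filter predicates into a single secure-view
     * definition. Real code would render Iceberg Expression objects instead
     * of taking raw strings.
     */
    public static String secureView(String viewName, String tableName, List<String> rowFilters) {
        String where = rowFilters.isEmpty()
            ? "true"
            : rowFilters.stream()
                .map(f -> "(" + f + ")")
                .reduce((a, b) -> a + " AND " + b)
                .get();
        return "CREATE VIEW " + viewName + " AS SELECT * FROM " + tableName + " WHERE " + where;
    }
}
```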
private List<Expression> rowFilters;

private static final String DEFAULT_POLICY_SCHEMA_VERSION = "2025-02-03";
private static final Set<String> POLICY_SCHEMA_VERSIONS = Set.of(DEFAULT_POLICY_SCHEMA_VERSION);
unused?
@@ -28,7 +28,8 @@ public enum PredefinedPolicyTypes implements PolicyType {
   DATA_COMPACTION(0, "system.data-compaction", true),
   METADATA_COMPACTION(1, "system.metadata-compaction", true),
   ORPHAN_FILE_REMOVAL(2, "system.orphan-file-removal", true),
-  SNAPSHOT_EXPIRY(3, "system.snapshot-expiry", true);
+  SNAPSHOT_EXPIRY(3, "system.snapshot-expiry", true),
+  ACCESS_CONTROL(4, "system.access-control", false);
The term ACCESS_CONTROL is too generic IMHO. How about TABLE_DATA_ACCESS_EXPRESSIONS?
The "expressions" part relates to the fact that this policy uses Iceberg expressions to represent filters.
IIUC we'll use the same policy to also have non-expression-based filtering, but I think that something like TABLE_ACCESS or TABLE_DATA_ACCESS is a good idea.
How can the same policy type support different contents? What is the approach to processing different contents within the same policy type?
Agree. I modeled all 3 policies to be represented in a single policy spec:
- columnProjections - column hiding (only authorized against RBAC)
- column projections (DDM -> to be applied with the UDFs)
- rowFilter expressions (Iceberg expressions which can contain UDF references)

Please let me know if you prefer it otherwise.
I do not have a firm opinion on the specific form of these policies (yet).
However, I'd like to make this system extensible. That is, if we have another kind of policy for row filtering, assigning the generic ACCESS_CONTROL name to the current one will make the new policy kind marginal. This is why I propose making policy type names more specific up front.
public static String replaceContextVariable(
    String content, PolicyType policyType, AuthenticatedPolarisPrincipal authenticatedPrincipal) {
  if (policyType == ACCESS_CONTROL) {
Why force all policies to go through this if? Would it be possible to refactor the code to leverage Java types for invoking type-specific replacements?
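A sketch of the type-dispatch idea being suggested: register one content rewriter per policy type so new types never touch a shared if-check. All names here are hypothetical, not actual Polaris classes:

```java
import java.util.Map;
import java.util.function.UnaryOperator;

public class ContextRewriters {
    // Hypothetical registry; the access-control rewriter substitutes a
    // ${principal} placeholder, all other types pass content through unchanged.
    private static final Map<String, UnaryOperator<String>> BY_TYPE = Map.of(
        "system.access-control",
        content -> content.replace("${principal}", currentPrincipalName()));

    private static String currentPrincipalName() {
        // Stand-in for resolving AuthenticatedPolarisPrincipal server-side.
        return "alice";
    }

    public static String rewrite(String policyTypeName, String content) {
        return BY_TYPE.getOrDefault(policyTypeName, UnaryOperator.identity()).apply(content);
    }
}
```

The registry keeps the dispatch out of PolicyUtil: adding a new policy type means adding one map entry rather than extending a shared conditional.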
-  .setContent(policyEntity.getContent())
+  .setContent(
+      PolicyUtil.replaceContextVariable(
+          policyEntity.getContent(), policyEntity.getPolicyType(), authenticatedPrincipal))
It would be nice to have a more flexible / extensible approach to connecting policies with the security context than a simple reference to AuthenticatedPolarisPrincipal. The Principal may not be the only factor in making access decisions.
// Iceberg expressions without context functions for now.
// Use a custom deserializer for the list of Iceberg Expressions
@JsonDeserialize(using = IcebergExpressionListDeserializer.class)
@JsonSerialize(using = IcebergExpressionListSerializer.class)
This binds the Polaris API to internal serialization code in Iceberg. Iceberg changes in Expression serialization will affect Polaris APIs. I'd like to avoid this dependency.
This seems fine to me given that Expression is also defined in the IRC spec: https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L2162. So the Iceberg side needs to make sure any changes there don't violate the spec. We also already depend on CatalogHandler, RestResponse, TableIdentifier, and other classes/models from Iceberg. We can change to a custom one later if we need customizations.
@@ -98,6 +102,10 @@ public static boolean canAttach(PolicyEntity policy, PolarisEntity targetEntity)
   case ORPHAN_FILE_REMOVAL:
     return BaseMaintenancePolicyValidator.INSTANCE.canAttach(entityType, entitySubType);

+  case ACCESS_CONTROL:
+    // TODO: Add validator for attaching this only to table
What's the reason for deferring this?
policyContent.setRowFilters(evaluatedRowFilterExpressions);
return AccessControlPolicyContent.toString(policyContent);
} catch (Exception e) {
(design) is it the right thing to return the original content if an exception occurs?
import org.apache.iceberg.expressions.Expression;
import org.apache.polaris.core.policy.validator.InvalidPolicyException;

public class AccessControlPolicyContent implements PolicyContent {
(design) It may be seen as a preference, but the overall language community is moving towards immutable objects as data carriers (like Java records); I wonder if this is something we should adopt here as well.
I think we can have a separate discussion in the community about adopting immutable objects in the future. For this PR, being consistent with the existing data structures is fine.
import org.apache.iceberg.expressions.Expression;
import org.apache.iceberg.expressions.ExpressionParser;

public class IcebergExpressionListDeserializer extends JsonDeserializer<List<Expression>> {
(design) I wonder why a serializer/deserializer pair for the list is needed vs. having one for the element type (Expression)?
@@ -30,6 +31,8 @@ private static ObjectMapper configureMapper() {
   mapper.configure(DeserializationFeature.FAIL_ON_MISSING_CREATOR_PROPERTIES, true);
   // Fails if a required field is present but explicitly null, e.g., {"enable": null}
   mapper.configure(DeserializationFeature.FAIL_ON_NULL_CREATOR_PROPERTIES, true);
+  // This will make sure all the Iceberg parsers are loaded.
+  RESTSerializers.registerAll(mapper);
Could that cause conflicts?
Why do we need all the REST specific types?
Namespace parentNamespace = policyEntity.getParentNamespace();

return ApplicablePolicy.builder()
    .setPolicyType(policyEntity.getPolicyType().getName())
    .setInheritable(policyEntity.getPolicyType().isInheritable())
    .setName(policyEntity.getName())
    .setDescription(policyEntity.getDescription())
-   .setContent(policyEntity.getContent())
+   .setContent(
I'm not familiar with PolicyCatalog's exact role, but it seems it is used for CRUD operations on the policies. As such, replacing the content of some policies on the fly may not be the right thing to do, because people would not be able to access the source of truth.
(design) IMHO this seems like a leakage of an access-control mechanism outside the realm of PolarisAuthorizer.
We are only resolving context in get-applicable-policies. It's a key point to resolve identity server-side; the client engine doesn't even need to know about this, as it might not even know the user's identity (it just has a token).
If the concern is knowing the exact policy, one can simply load it directly; we don't resolve context there. Please have a look at this API:
https://github.com/apache/polaris/blob/main/spec/polaris-catalog-service.yaml#L153
So IMHO this concern is addressed.

Regarding "this seems like a leakage of an access control mechanism outside of the realm of PolarisAuthorizer":
I thought about this really hard and even talked to folks internally. So far the PolarisAuthorizer interface just authorizes based on RBAC; having the authorizer return FGAC predicates and projections would require more thorough thought.
I just followed the existing way of retrieving policies from persistence and checking which policies apply to the caller.
If we want to go the route of supporting GetAccessControlPolicy via the Authorizer interface, here is what I had in mind:
public interface FGACAwarePolarisAuthorizer extends PolarisAuthorizer {

  default AuthorizationContext authorizeOrThrowWithFGAC(
      @Nonnull CallContext callContext,
      @Nonnull AuthenticatedPolarisPrincipal authenticatedPrincipal,
      @Nonnull Set<PolarisBaseEntity> activatedEntities,
      @Nonnull PolarisAuthorizableOperation authzOp,
      @Nullable PolarisResolvedPathWrapper target,
      @Nullable PolarisResolvedPathWrapper secondary) {
    authorizeOrThrow(callContext, authenticatedPrincipal, activatedEntities, authzOp, target, secondary);
    // query policies from persistence and give back FGAC policy
    // create AuthZ context
  }

  default AuthorizationContext authorizeOrThrowWithFGAC(
      @Nonnull CallContext callContext,
      @Nonnull AuthenticatedPolarisPrincipal authenticatedPrincipal,
      @Nonnull Set<PolarisBaseEntity> activatedEntities,
      @Nonnull PolarisAuthorizableOperation authzOp,
      @Nullable List<PolarisResolvedPathWrapper> targets,
      @Nullable List<PolarisResolvedPathWrapper> secondaries) {
    authorizeOrThrow(callContext, authenticatedPrincipal, activatedEntities, authzOp, targets, secondaries);
    // query policies from persistence and give back FGAC policy
    // create AuthZ context
  }
}
AuthorizationContext would carry:
- AccessControlPolicyContent
- Map<Object, Object> additionalParams

wdyt?
It seems a good conversation starter, thanks for proposing it. I don't know if it is best to try to extend the APIs (which are very generic, whereas FGAC only applies to a couple of permissions) or to have an extra separate method that you combine with a previous check (so 2 calls to the authorizer).
For AccessControlPolicyContent, would it be like this?
public record AccessControlProcessingInstruction(String type, String expression) {}
public record RowFilter(AccessControlProcessingInstruction filter) {}
public record ColumnTransformation(int id, String name, AccessControlProcessingInstruction transformation) {}
public record AccessControlPolicyContent(String rowFilter, List<ColumnTransformation> columnTransformations) {}
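To make the record proposal concrete, here is a self-contained, compilable sketch that restates those records and builds a sample content object. All names mirror the proposal above; none of these types exist in Polaris, and the mask expression is purely illustrative:

```java
import java.util.List;

public class AccessControlContentDemo {
    // The records proposed in the review, restated so this sketch compiles
    // on its own. Being records, the content is immutable once constructed.
    public record AccessControlProcessingInstruction(String type, String expression) {}
    public record ColumnTransformation(
        int id, String name, AccessControlProcessingInstruction transformation) {}
    public record AccessControlPolicyContent(
        String rowFilter, List<ColumnTransformation> columnTransformations) {}

    public static AccessControlPolicyContent sample() {
        // Hypothetical masking instruction for a credit-card column.
        var mask = new AccessControlProcessingInstruction("mask", "'***' || substr(cc_num, -3)");
        return new AccessControlPolicyContent(
            "region = 'EU'",
            List.of(new ColumnTransformation(4, "cc_num", mask)));
    }
}
```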
  return content;
}

public static boolean filterApplicablePolicy(
(design) this method doesn't have any javadoc, but shouldn't this belong to some dedicated class related to FGAC instead?
(Replying to the later edited PR description)
We have not agreed on a Polaris implementation in an Iceberg community sync. (Note: the Apache policy is: "If it didn't happen on the mailing list, it never happened.")
Left a couple of technical comments on this PR.
I think we should explore all legitimate options to manage FGAC rules in Polaris. Iceberg expressions are one option. Another option is leveraging a standard like SQL:92, SQL:1999, or SQL:2003. Substrait is a mature project with a diverse and active community and an option as well.
My concern about Iceberg expressions is their limitations regarding available functions and expressions. Some examples: COALESCE, LPAD, RPAD, NULLIF, CASE, CAST, INTERVAL (...).
I think we lose a lot of flexibility without these functions. Maybe I haven't looked into it enough, but I cannot figure out a way with Iceberg expressions to mask a credit card number like ***234. But I guess that's what should not go into Iceberg expressions? What's the representation of those Strings?
Row-level filter expressions benefit a lot from CASE, CAST, COALESCE, etc. Not sure whether it's worth it to not have those available to users.
The Iceberg expressions implementation is an interpreter, but that's rather a secondary concern.
All Iceberg query engines I know are based on SQL and can natively handle and optimize SQL functions. Not sure whether it's worth having an (intermediate) representation.
Spark datasets/frames are not SQL, but FGAC with those APIs is a different topic (LogicalPlan).
I think we should start with how row filters and column transformations are exposed to query engines, and then figure out the best way those are managed in Polaris ("top-down approach").
@Override
public List<Expression> deserialize(JsonParser p, DeserializationContext ctxt)
    throws IOException {
  ObjectMapper mapper = (ObjectMapper) p.getCodec();
This cast assumes a Jackson implementation detail and can likely break with Jackson updates.
Therefore please remove this cast.
As ObjectCodec does have a readTree(JsonParser) method, is the cast even necessary?
List<Expression> expressions = new ArrayList<>();
if (node.isArray()) {
  for (JsonNode element : node) {
    // Convert each JSON element back to a string and pass it to ExpressionParser.fromJson
    expressions.add(ExpressionParser.fromJson(mapper.writeValueAsString(element)));
  }
}
Why is this necessary at all?
jsonGenerator.writeStartArray();
if (expressions != null) {
  for (Expression expression : expressions) {
    jsonGenerator.writeString(ExpressionParser.toJson(expression));
Why "string" and not proper JSON?
"{\n"
    + "  \"principalRole\": \"ANALYST\",\n"
    + "  \"extraField\": \"someValue\"\n" // Extra field
What's the "extra field"?
Indeed, the ability to express something like "if X, then allow; else, deny" seems essential to row filters. I don't see any ternary operator in the current Iceberg expressions. @singhpk234 is the intent to add more expressions to Iceberg to enable row filters?
@snazy @adutra thank you for sharing your feedback; I want to walk you through my thought process.

Why Iceberg expressions and not SQL: we did explore dialect-agnostic SQL such as SQL-92, and for most of it we kept coming back to dialect-specific requirements. If it were SQL, engines would want to store their dialect-specific constructs directly in the policy, and hence in persistence, making the policy workable only for the policy definer's dialect. What would the behavior be if I don't understand the dialect? What I see is that we can expand Iceberg expressions to contain UDF references; via UDFs you can model all your dialect-specific logic, but the bottom line is that the policy definition operates only in Iceberg expressions.

Also, heads up: Iceberg expressions are going to be expanded soon due to the constraints work for v4 that Anton is driving (an uber-level idea was discussed in some of the syncs): essentially storing table constraints in Iceberg metadata so they can be retrieved and enforced by the calling engine, with the storage still being Iceberg expressions. So Iceberg expressions are what they plan to use for interoperability.

Note: I checked in the last catalog community sync, and Iceberg expressions with UDFs seem like the right direction. I am not saying we are pinned on it, just presenting what the wider community thinks on this aspect. I understand UDFs are not there yet and will take some time; meanwhile, using Iceberg expressions to store RLS seems like a good step IMHO.

Why not Substrait? If we want Substrait, we can make UDFs return Substrait directly, but in the Iceberg community at least there is no consensus on the IR, as similar discussions have been brought up in the past for Iceberg views. I would request driving that in Iceberg first, and then we can incorporate it into the policy. Also note that engines like Snowflake / Redshift don't support Substrait; unless it is established as a standard IR in Iceberg, IMHO we should not take a dependency on it.

Yes, non-SQL requires a more thorough discussion IMHO; for example, how do we model SQL written in one dialect being parsed by another? I hope I was able to explain my thought process here. I really appreciate you taking a look.
Managing FGAC rules and exposing protection instructions are two different domains. I think we should not restrict our users to the limitations of Iceberg expressions.
Thanks for the PR @singhpk234, super interesting to see RLS and CLS policies for governing Iceberg tables. Do you know if there's a way I can quickly set this up and do a proof of concept locally? I think it would be helpful to document "getting started with policies" in Polaris so folks who are interested can get some hands-on experience with the proposal.
@snazy I understand that, and in my design I want to leverage the existing policy store implementation to manage FGAC rules; what you are calling protection instructions, i.e. the result of policy evaluation, is in my view SQL.
I am a bit confused about your conclusion here: in one place you suggest Iceberg needs to set the standard for the policy evaluation result, but at the same time we are discarding Iceberg expressions, which are first-class citizens of the Iceberg world (portable and used in other languages like iceberg-rust / iceberg-python)? IMHO we should not go into dialect-specific SQL, as it makes evaluation harder, especially given that we already have a path forward with Iceberg expressions and UDFs. Please let me know your thoughts, and let's have more discussions!
@kevinjqliu thank you for the interest in the PR. Unfortunately there is no getting-started guide for policies that I am aware of (let me create a ticket for it); here are some integration tests -
Iceberg expressions, as of today, IMHO cannot be used to represent a bunch of the simple use cases, and none of the more complex ones. AFAIU it is not possible to:
I would love to use a portable standard, but as of today (as of Jan 2022, actually) Iceberg expressions are not yet suitable for this use case. Until we have a way to express even simple, standard use cases, I strongly believe we have to support SQL. Which means that requiring Iceberg expressions, as proposed here, is IMHO not a viable option to satisfy the needs of our users today.
As a side note, I also believe that the "front end" use case (providing protection instructions) should drive the "CRUD" parts, not the other way around. It is completely fine when users rely only on Iceberg expressions, but it would IMHO be quite a mistake to intentionally make it impossible for users to have a much more flexible way.
I also do not think that UDFs are the answer to everything, especially since Iceberg UDFs are neither concrete nor implemented yet. But even if Iceberg UDFs were there, you would still have to deal with data types, casting, and null handling. None of these three aspects is currently covered by Iceberg expressions.
Force-pushed from 85ba526 to 11a087e
Update: we talked offline. It seems like a decent option to have Iceberg expressions; supporting a SQL dialect alongside is an ongoing discussion.
There's no consensus on this particular approach.
Particularly, there's no consensus on:
- whether security policies can be attached to multiple tables/views or just one
- whether security policies shall be attachable to only tables or tables and views
- how Polaris stores security policies
All this requires a broader discussion and should IMHO not be set in stone at this time.
Could you clarify which of the points above are directly related to this PR? Or is this a general objection to any work on FGAC until the above questions have all been resolved? From my impression, this PR just covers defining policies using Iceberg expressions and resolving context variables in said policies. The question of policy attachment is essentially ignored in this PR, and I'm not sure I see anything here that changes the persistence layer or would block changes to it in the future.
Right - this PR directs the approach for how FGAC policies are managed in Polaris. I really prefer to avoid having multiple approaches to the same thing in Polaris.
It is not. This PR uses the existing policy mechanism, which might not be the right way for FGAC.
@@ -98,6 +102,10 @@ public static boolean canAttach(PolicyEntity policy, PolarisEntity targetEntity)
   case ORPHAN_FILE_REMOVAL:
     return BaseMaintenancePolicyValidator.INSTANCE.canAttach(entityType, entitySubType);

+  case ACCESS_CONTROL:
+    // TODO: Add validator for attaching this only to table
+    return true;
We probably need to pass in the callContext, or at least the configuration store, to make this default to false so it can be manually turned on for PoC purposes, until we implement the validator for attaching this policy only to tables.
IMHO, even if the attachment is the concern, this PR still does a good job of focusing on the policy definition itself. As an additional step, I suggest we disable attachments of ACCESS_CONTROL policies by default while still allowing them to be manually enabled for testing/PoC purposes. That way we can ensure no invalid attachment happens before we finalize the attachment requirements.
That's a nice option @HonahX (this is why I added a TODO above for attachment), and it would help address @snazy's feedback. Let me hide it behind a feature flag for E2E testing, and also rebase this PR.
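A minimal sketch of what gating attachment behind a flag could look like; the flag name and the map-based lookup are hypothetical stand-ins, not the actual Polaris `FeatureConfiguration` API:

```java
import java.util.Map;

public class AttachmentGate {
  // Hypothetical flag name; the real one would live in FeatureConfiguration.
  static final String ENABLE_ACCESS_CONTROL_POLICY = "ENABLE_ACCESS_CONTROL_POLICY";

  // Returns whether a policy of the given type may be attached, defaulting
  // ACCESS_CONTROL to "off" unless the feature flag is explicitly enabled.
  static boolean canAttach(String policyType, Map<String, Boolean> featureFlags) {
    if ("ACCESS_CONTROL".equals(policyType)) {
      return featureFlags.getOrDefault(ENABLE_ACCESS_CONTROL_POLICY, false);
    }
    return true;
  }
}
```

With this shape, existing policy types attach as before, while ACCESS_CONTROL stays opt-in until the table-only validator lands.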
Left a few comments, overall looks reasonable to me.
As for the objection that "there's no consensus on this particular approach", that's what the PR is for. If you have specific objections to aspects of this approach, please do raise them here so that they can be addressed. In the absence of such objections, it seems that we have a consensus.
public class AccessControlPolicyContent implements PolicyContent {

  // Optional; if present, the policy is applicable to this role
  private String principalRole;
If we have multiple policies, how do we find the set of applicable policies? (I'm not sure I saw details on this in the doc 😅 )
This is an existing feature of the policy store: you can retrieve all policies applicable to a given entity. Each policy can correspond to one role.
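Putting those two answers together, the role-based narrowing could be sketched like this; the `Policy` record and method names are illustrative, assuming a policy with no role applies to every caller:

```java
import java.util.List;

public class ApplicablePolicies {
  // Illustrative model: each attached policy optionally names one principal role.
  record Policy(String name, String principalRole) {}

  // Keep only the policies whose role (if any) is among the caller's activated roles.
  static List<Policy> forCaller(List<Policy> attached, List<String> activatedRoles) {
    return attached.stream()
        .filter(p -> p.principalRole() == null || activatedRoles.contains(p.principalRole()))
        .toList();
  }
}
```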
// Use a custom deserializer for the list of Iceberg Expressions
@JsonDeserialize(using = IcebergExpressionListDeserializer.class)
@JsonSerialize(using = IcebergExpressionListSerializer.class)
private List<Expression> rowFilters;
They're used to generate a view, right?
@JsonSerialize(using = IcebergExpressionListSerializer.class)
private List<Expression> rowFilters;

private static final String DEFAULT_POLICY_SCHEMA_VERSION = "2025-02-03";
Like for other policies, this is a sort of policy "version"
Clarifying: how FGAC policies are managed deserves a public discussion on the dev mailing list, considering all the UX, implementation, and runtime implications of the very specific approach implemented in this PR.
There has been a dev thread for this since 06/25: [1] https://lists.apache.org/thread/rf2zsgk9qh36z3s63gx6dgtl0s4cwngr
Discussions around how FGAC is exposed are still ongoing and things can still change. We do not have a comprehensive list of requirements, meaning the whole user experience is undefined, including how Iceberg expressions evolve and what UDF support and UDFs themselves will eventually look like. "Secure views", as mentioned in the linked docs and in this PR, are IMHO not the right way: they change the expected behavior, require engines to hide parts of their query plan from users, and prevent people from updating their tables (because those would be represented as views). More concerns around "secure views" were raised in community meetings (Iceberg + Polaris). We also need to define, for Polaris, how all the required pieces play together, including policy evaluation and input from AuthZ sources (OAuth et al.). The users' needs for Polaris are very important. Technically, we need a performant way to expose the right protection instructions, and that cannot even be designed until the Iceberg API/expression and Polaris requirements discussions have settled. I'd prefer a top-down approach, not a bottom-up approach.
Again, could you clarify which of the points above are directly related to this PR? This PR just defines policies and is not related to how protection instructions are exposed (via view text or a structured object as part of loadTable). Nevertheless, Iceberg expressions are part of both the IRC spec [1] and the Iceberg SDK [2]; this just adheres to that. I would prefer to avoid bringing unrelated discussions into the PR. [1] https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L2162
Again, this PR sets the way the policies are persisted pretty much in stone. I think there's an assumption that policies are managed in Polaris only as Iceberg expressions, and I'm convinced that this is not the consensus.
About the PR
This PR adds the policy spec proposed in [OSS] Row and Column Based Access Control: Policy Definitions.
This uses Iceberg expressions to define row-level filters, and uses column projections as the way to project. For now we only deal with plain columns (waiting for UDFs in Apache Iceberg to standardize). Maybe we can use transforms for the projections? Open to it!
Note: in the last Apache Iceberg community sync, all were generally aligned that using Iceberg expressions, extended to support references to Iceberg UDFs, was the right way to go!
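To make the shape concrete, one ACCESS_CONTROL policy content instance could look roughly like this (field names and nesting are illustrative, not the exact serialized form in this PR):

```json
{
  "version": "2025-02-03",
  "principal_role": "analyst",
  "row_filters": [
    { "type": "eq", "term": "region", "value": "emea" }
  ],
  "column_projections": ["id", "region", "amount"]
}
```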
This additionally introduces 2 context variables which are resolved at the catalog end based on the caller:
$current_principal: checks whether the current principal is the policy's principal; if yes, this resolves to true.
$current_principal_role: checks whether the underlying role is one of the activated principal roles in the current caller context.
A policy defined with these context variables has them replaced inline, and the result is evaluated using the Iceberg expression SDK.
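The inline substitution described above can be sketched as follows; this is an illustrative string-level version only (the actual resolution happens catalog-side before the Iceberg expression SDK evaluates the filter, and the method and parameter names here are assumptions):

```java
import java.util.Set;

public class ContextVarResolution {
  // Resolve the two context variables to boolean literals for the current caller.
  static String resolve(String filterExpr, String callerPrincipal, Set<String> activatedRoles,
                        String policyPrincipal, String policyRole) {
    // Replace the longer token first so "$current_principal" does not
    // clobber the "$current_principal_role" token.
    return filterExpr
        .replace("$current_principal_role", String.valueOf(activatedRoles.contains(policyRole)))
        .replace("$current_principal", String.valueOf(callerPrincipal.equals(policyPrincipal)));
  }
}
```

Note the replacement order: since one variable name is a prefix of the other, resolving `$current_principal` first would corrupt `$current_principal_role`.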
So whenever the caller calls /get-applicable-policies, it gets back the row and column policies with the context variables already resolved.
TODO: policy merging
Please see the tests for E2E coverage.
Note: this is for engines that want to get the policies directly (rather than getting the secure view) and are willing to integrate with the Polaris Policy Store directly.
check this combination
https://docs.google.com/document/d/1AJicez7xPhzwKXenGZ19h0hngxrwAg3rSajDV1v0x-s/edit?tab=t.0#bookmark=id.j29shahtycb8