Filtering Data in Syncs

If you have not set up syncing yet, take a look at Starting Syncs first.

If you want to filter the data that is fetched in the syncs, you can use the Sync Filter APIs to provide the relevant filters.

What is a Sync Filter?

A sync filter is essentially an expression tree that is evaluated for each record fetched from the source app.

For example, suppose we are syncing employee data and want to fetch only the employees who belong to either the Technology or the Marketing department.

Take a look at the event data example for employee sync here

An expression tree for the above filter, keeping in mind the event data JSON, could look like this:

The department value is inside the orgStructure model, and nested values are represented with dot (.) notation, so the key that we want to filter on becomes orgStructure.department.

At the top, we have a logical operator or, and the left and right trees evaluate whether the department is equal to Technology or Marketing respectively. The above tree would evaluate to true if the employee's department is either Technology or Marketing, otherwise it would evaluate to false.
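To make the evaluation concrete, here is a minimal sketch in Python of how such a tree could be evaluated against one record. It uses the node shape (data, type, left, right) described in the next section. The evaluate and lookup functions are hypothetical names introduced for illustration, only the eq operator is handled, and this is not the service's actual implementation.

```python
from typing import Any

# The "Technology or Marketing" filter as nested dicts (data/type/left/right).
DEPARTMENT_FILTER = {
    "data": "or",
    "type": "logical_operator",
    "left": {
        "data": "eq",
        "type": "string_operator",
        "left": {"data": "orgStructure.department", "type": "key", "left": None, "right": None},
        "right": {"data": "Technology", "type": "value_string", "left": None, "right": None},
    },
    "right": {
        "data": "eq",
        "type": "string_operator",
        "left": {"data": "orgStructure.department", "type": "key", "left": None, "right": None},
        "right": {"data": "Marketing", "type": "value_string", "left": None, "right": None},
    },
}


def lookup(record: dict, dotted_key: str) -> Any:
    """Resolve a dot-notation key such as 'orgStructure.department' (hypothetical helper)."""
    current: Any = record
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def evaluate(node: dict, record: dict) -> bool:
    """Recursively evaluate a filter tree for one record (illustrative only)."""
    if node["type"] == "logical_operator":
        left = evaluate(node["left"], record)
        right = evaluate(node["right"], record)
        return (left or right) if node["data"] == "or" else (left and right)
    if node["type"] == "string_operator" and node["data"] == "eq":
        # One branch holds the key and the other the value; accept either order.
        branches = (node["left"], node["right"])
        key_node = next(b for b in branches if b["type"] == "key")
        value_node = next(b for b in branches if b["type"] == "value_string")
        return lookup(record, key_node["data"]) == value_node["data"]
    raise ValueError(f"operator not covered by this sketch: {node['data']}")


print(evaluate(DEPARTMENT_FILTER, {"orgStructure": {"department": "Marketing"}}))
print(evaluate(DEPARTMENT_FILTER, {"orgStructure": {"department": "Finance"}}))
```

The first call prints True and the second prints False, matching the description of the tree above.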

How to construct a Sync Filter?

A node in the JSON representation of the Sync Filter essentially has four parts:

| Key | Data Type |
| --- | --- |
| data | String |
| type | Enum |
| left | Filter Node |
| right | Filter Node |

Let's look at each one in detail.

data

The data key holds the operator, key, or value that this node contributes to the evaluation.

type

The possible types are:

| Type | Description | Possible Values |
| --- | --- | --- |
| logical_operator | Logical operators. Must have a left and right branch that are made up of other operators. | or, and |
| string_operator | String operators. Used to filter on keys that are of type String. Must have a left and right branch, in which one is of type key and the other is of type value_string. | eq (Equal to), beginsWith (Begins With), endsWith (Ends With), contains (Contains) |
| date_operator | Date operators. Used to filter on keys in two ways: 1. Date-time based (ISO-8601): must have a left and right branch, in which one is of type key and the other is of type value_date. 2. Rolling-time based: must have a left and right branch, in which one is of type key and the other is of type value_string, e.g. rollingDays, rollingWeeks, rollingMonths, rollingYears. | isGreaterThan, isLessThan, isInBetween, rollingDays, rollingWeeks, rollingMonths, rollingYears |
| key | The node is a key. | |
| value_string | The String value the corresponding key should be compared against. Should be used with string_operator. | |
| value_date | The ISO-8601 Date value the corresponding key should be compared against. Should be used with date_operator. | |
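As an illustration of a date_operator node, the sketch below builds a filter that pairs a key with a value_date. The key name startDate is a hypothetical placeholder, not taken from the data model, and the exact ISO-8601 variants that the API accepts are an assumption; check the model and API documentation for the real key and format.

```python
import json
from datetime import datetime

# A date_operator node: one branch is a key, the other a value_date.
# "startDate" is a hypothetical key used only for this example.
STARTED_AFTER = {
    "data": "isGreaterThan",
    "type": "date_operator",
    "left": {"data": "startDate", "type": "key", "left": None, "right": None},
    "right": {
        "data": "2023-01-01T00:00:00+00:00",
        "type": "value_date",
        "left": None,
        "right": None,
    },
}

# Local sanity check that the value parses as an ISO-8601 timestamp.
# datetime.fromisoformat raises ValueError on strings it cannot parse.
parsed = datetime.fromisoformat(STARTED_AFTER["right"]["data"])

# Python's None is serialized as JSON null for the leaf branches.
print(json.dumps(STARTED_AFTER, indent=2))
```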

Left And Right

These represent the left and right branches of the node. Both can be null if the node is a leaf node.

🚧

Operators must have left and right branches

Operators like logical_operator, date_operator and string_operator must have both a left branch and a right branch to be properly evaluated.

Let's construct the JSON representation of the following simple tree with just three nodes:

Starting from the bottom,

  • Node #2 is the key that we want to compare against. The department value is inside the orgStructure model in the Employee Data Model, and nested values are represented with dot (.) notation, so the key that we want to filter on becomes orgStructure.department. From the model documentation, we also note that the data type of the key is String. The JSON representation of this node would look like this:
    • {
        "data": "orgStructure.department",
        "type": "key",
        "left": null,
        "right": null
      }
      
  • Node #3 is the value that we want to filter on. Previously, we noted that the key orgStructure.department is of type String, hence we will specify value_string as the type here. The JSON representation of this node would look like this:
    • {
        "data": "Technology",
        "type": "value_string",
        "left": null,
        "right": null
      }
      
  • To combine the two nodes we have created so far, we have an operator as Node #1. As with the previous node, since we are doing a String comparison, we specify string_operator as the type, and since we want to check that the key equals the value, we use the eq operator.

Combining all the above nodes, a full JSON representation of the above tree would look like this:

{
  "data": "eq",
  "type": "string_operator",
  "left": {
    "data": "orgStructure.department",
    "type": "key",
    "left": null,
    "right": null
  },
  "right": {
    "data": "Technology",
    "type": "value_string",
    "left": null,
    "right": null
  }
}
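If you build filters in code rather than by hand, small helper functions keep the nesting manageable. The sketch below is one possible way to do that in Python. The helper names (leaf, operator, department_is) are hypothetical and not part of any SDK. It reproduces the tree above and then combines two such trees under or to express the Technology or Marketing example from earlier.

```python
import json


def leaf(data: str, node_type: str) -> dict:
    """Build a leaf node (key or value); leaves have no branches."""
    return {"data": data, "type": node_type, "left": None, "right": None}


def operator(data: str, node_type: str, left: dict, right: dict) -> dict:
    """Build an operator node with both branches set."""
    return {"data": data, "type": node_type, "left": left, "right": right}


def department_is(name: str) -> dict:
    """string_operator 'eq' comparing orgStructure.department with a value."""
    return operator(
        "eq",
        "string_operator",
        leaf("orgStructure.department", "key"),
        leaf(name, "value_string"),
    )


# The three-node tree shown above.
technology_only = department_is("Technology")

# Two such trees joined by a logical_operator.
technology_or_marketing = operator(
    "or",
    "logical_operator",
    department_is("Technology"),
    department_is("Marketing"),
)

# json.dumps turns None into null, giving the JSON form of the filter.
print(json.dumps(technology_or_marketing, indent=2))
```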

📘

Things to keep in mind while constructing a filter

  • Nested keys are represented by dot (.) notation, e.g. orgStructure.department
  • Operators like logical_operator, string_operator and date_operator must have both a left branch and a right branch to be properly evaluated
    • logical_operator must have a left and right branch that are made up of other operators.
    • string_operator must have a left and right branch, in which one is of type key and the other is of type value_string.
    • date_operator must have a left and right branch, in which one is of type key and the other is either value_date (for a date-based filter) or value_string (for a rolling-time-based filter).
  • Use the right operator type to make comparisons. For example, if the key that you want to compare with is of type String, then you must use string_operator with it.
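The rules above can be checked locally before sending a filter. The following is a minimal sketch of such a check, assuming the operator names from the type table. The validate function is a hypothetical helper, and treating the rolling operators as the ones that pair with value_string is our reading of the rules above rather than confirmed API behavior.

```python
from typing import List

OPERATOR_VALUES = {
    "logical_operator": {"or", "and"},
    "string_operator": {"eq", "beginsWith", "endsWith", "contains"},
    "date_operator": {
        "isGreaterThan", "isLessThan", "isInBetween",
        "rollingDays", "rollingWeeks", "rollingMonths", "rollingYears",
    },
}
ROLLING_OPERATORS = {"rollingDays", "rollingWeeks", "rollingMonths", "rollingYears"}
LEAF_TYPES = {"key", "value_string", "value_date"}


def validate(node: dict, path: str = "root") -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    node_type = node.get("type")
    if node_type in LEAF_TYPES:
        return []
    if node_type not in OPERATOR_VALUES:
        return [f"{path}: unknown type {node_type!r}"]

    errors = []
    if node.get("data") not in OPERATOR_VALUES[node_type]:
        errors.append(f"{path}: {node.get('data')!r} is not a {node_type}")
    left, right = node.get("left"), node.get("right")
    if left is None or right is None:
        errors.append(f"{path}: {node_type} needs both a left and a right branch")
        return errors

    child_types = sorted([str(left.get("type")), str(right.get("type"))])
    if node_type == "logical_operator":
        # Both branches must themselves be operators; check them recursively.
        if not all(t in OPERATOR_VALUES for t in child_types):
            errors.append(f"{path}: branches of a logical_operator must be operators")
        errors += validate(left, f"{path}.left") + validate(right, f"{path}.right")
    else:
        # Assumption: rolling date operators pair a key with a value_string.
        if node_type == "string_operator" or node.get("data") in ROLLING_OPERATORS:
            value_type = "value_string"
        else:
            value_type = "value_date"
        if child_types != sorted(["key", value_type]):
            errors.append(f"{path}: expected one key and one {value_type} branch")
    return errors


good = {
    "data": "eq",
    "type": "string_operator",
    "left": {"data": "orgStructure.department", "type": "key", "left": None, "right": None},
    "right": {"data": "Technology", "type": "value_string", "left": None, "right": None},
}
print(validate(good))
print(validate({**good, "right": None}))
```

The first call prints an empty list, and the second reports the missing right branch.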

Now that we have our filter JSON ready, let's add it to the sync.

How to add a filter to my integration?

To add a filter to an integration, you can use the Update Sync Filter API to provide the filter JSON that we have just constructed.

📘

Updating the filter triggers an initial_sync

When you update the filter using the above API, it triggers an initial_sync to baseline the data against the new filter.

You can use the triggerSync parameter to control this behavior.

That's it! You're all set with filtering data in syncs! Once you set a filter, all future delta_syncs for that integration will only track the data points that pass the filter, so you get targeted data from the syncs and can keep better track of your data!