It depends on the database. If we are talking about the binlog in MySQL, for example, you will see the following information:
- binlog filename
- binlog position
- GTID set

You may need this information if you want to move your current checkpoint to a different one.
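To make the three fields concrete, here is a minimal sketch of how such a checkpoint could be held in code. The `BinlogCheckpoint` class and the `is_rewind` helper are hypothetical names used only for illustration, and the file names, positions, and GTID value are made up. It assumes the usual MySQL naming scheme where the binlog file name ends in a numeric suffix that grows with every rotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinlogCheckpoint:
    """Hypothetical container for the three values listed above."""
    filename: str       # e.g. "mysql-bin.000012"
    position: int       # byte offset inside that binlog file
    gtid_set: str = ""  # empty when GTID mode is not in use

    def sort_key(self):
        # Assumption: the suffix after the last dot is the rotation index.
        index = int(self.filename.rsplit(".", 1)[1])
        return (index, self.position)


def is_rewind(current, target):
    """Return True when moving to `target` would replay older events."""
    return target.sort_key() < current.sort_key()


# Illustrative values only.
current = BinlogCheckpoint(
    "mysql-bin.000012", 4521, "3e11fa47-71ca-11e1-9e33-c80aa9429562:1-77"
)
target = BinlogCheckpoint("mysql-bin.000011", 154)

print(is_rewind(current, target))
```

Comparing the file index first and the position second is enough to tell whether a new checkpoint lies before or after the current one, which is useful to know before replaying events.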

Is it the timestamp of the data so far synced to your data warehouse?
Yes, it is.

As you may know, our Data Lake uses the ingestion timestamp as its folder structure. Previously, we took this timestamp from the data itself, which was provided by other teams' services. However, in some cases their services do not update the timestamp at the exact time; they may, for example, buffer the data before inserting it into their database. As a result, there is a chance that the data arrives late in the Data Lake.

We cannot control this behavior across all teams, and other teams may not even be aware of it. We therefore needed a way to set the ingestion timestamp that requires no intervention from other teams. In the end, we decided to use the database's native transaction time as the ingestion timestamp for the Data Lake.
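The difference between the two timestamps can be sketched as follows. This is only an illustration under stated assumptions: the `partition_path` helper, the `table/YYYY/MM/DD/HH` folder layout, the `updated_at` column, and the `transaction_time` field are all hypothetical, and the transaction time is assumed to arrive as a Unix timestamp in seconds.

```python
from datetime import datetime, timezone


def partition_path(table, epoch_seconds):
    """Build an hourly folder path from a Unix timestamp (UTC)."""
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"{table}/{ts:%Y/%m/%d/%H}"


# Hypothetical change event: the service stamped the row at 13:00 UTC,
# but the database only committed the transaction at 14:00 UTC.
event = {
    "table": "orders",
    "row": {"id": 1, "updated_at": 1614949200},
    "transaction_time": 1614952800,
}

# Old approach: folder derived from the timestamp inside the data.
old_path = partition_path(event["table"], event["row"]["updated_at"])

# New approach: folder derived from the native transaction time.
new_path = partition_path(event["table"], event["transaction_time"])

print(old_path)
print(new_path)
```

With the old approach the row belongs to an hour folder that may already have been processed by the time the row is committed, so it shows up late. With the transaction time, the folder always matches the moment the change actually reached the database.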

Data Engineering at Bukalapak, with an interest in becoming a solution architect
