It depends on the database. For the binlog in MySQL, for example, the checkpoint contains the following information:
- binlog filename
- binlog position
- GTID set
You may need this information if you want to move your current checkpoint to a different one.
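As a rough sketch of what such a checkpoint might look like in code, the three fields can be kept together in one small record. The class name `BinlogCheckpoint` and the helper `is_rewind` are hypothetical names made up for this illustration, not part of MySQL or any CDC tool. The ordering assumes the usual MySQL binlog naming, where the file name ends in a numeric index such as `mysql-bin.000003`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BinlogCheckpoint:
    """Hypothetical container for the three checkpoint fields listed above."""
    filename: str                   # binlog filename, e.g. "mysql-bin.000003"
    position: int                   # byte offset inside that binlog file
    gtid_set: Optional[str] = None  # GTID set, if GTIDs are enabled

    def sort_key(self) -> Tuple[int, int]:
        # Assumption: the filename ends with ".<numeric index>".
        index = int(self.filename.rsplit(".", 1)[1])
        return (index, self.position)


def is_rewind(current: BinlogCheckpoint, target: BinlogCheckpoint) -> bool:
    """Return True if moving to `target` would replay already-read events."""
    return target.sort_key() < current.sort_key()


current = BinlogCheckpoint("mysql-bin.000003", 154)
target = BinlogCheckpoint("mysql-bin.000002", 9000)
print(is_rewind(current, target))  # True: target is in an earlier binlog file
```

Comparing by file index first and position second is only meaningful within a single server's binlog sequence; across servers, the GTID set would be the field to compare.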
Is it the timestamp of the data so far synced to your data warehouse?
Yes, it is.
As you may know, the Data Lake uses the ingestion timestamp as its folder structure. Previously, we took this timestamp from the data itself, as provided by other teams' services. However, in some cases those services don't update the timestamp at the exact time of the change, perhaps because they buffer the data before writing it to their database. As a result, there is a chance that the data arrives late in the Data Lake.
We cannot control this behavior across all teams, and other teams may not even be aware of it. Therefore, we needed a way to set the ingestion timestamp that does not require intervention from other teams. In the end, we decided to use the native transaction time itself as the ingestion timestamp for the Data Lake.
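To illustrate the idea, here is a minimal sketch of deriving the Data Lake folder from the transaction time rather than from a timestamp column in the row. The event shape, the field names `transaction_ts` and `updated_at`, and the `YYYY/MM/DD/HH` folder layout are all assumptions made for this example; the actual layout may differ.

```python
from datetime import datetime, timezone


def ingestion_folder(event: dict) -> str:
    """Build the folder path from the transaction time of a change event.

    Assumption: `transaction_ts` holds the commit time in Unix seconds,
    as recorded by the database log, not by the producing service.
    """
    ts = datetime.fromtimestamp(event["transaction_ts"], tz=timezone.utc)
    return ts.strftime("%Y/%m/%d/%H")


event = {
    "transaction_ts": 1609459200,  # 2021-01-01 00:00:00 UTC, from the log
    "updated_at": 1609450000,      # service-provided timestamp, ignored here
    "payload": {"id": 1},
}
print(ingestion_folder(event))  # 2021/01/01/00
```

Because the folder depends only on the log's own commit time, a service that buffers writes or sets `updated_at` late no longer affects where the record lands.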