Understanding CDC (Full Architecture and Detail Breakdown)

Hello,

Thank you for accepting me … iam really new in Epicor (bear with me lol). I am more in networking.

I have some questions about CDC. i read in the forum that it is not a real CDC but a trigger which is activated in some tables.

my question is : are the change queue saved in the table CDC.SubscriberQueue ? we have more than 200.000 rows only for today ? and the agent is runing every 30 minutes ? is it normal ?

what is the goal of the table CDC.Subscriber table? it is an empty table.

where are the change saved ? XML files ? how does it communicate with a spreadsheet server for ex

Thanks

I don’t think it does at all… (unless this is a new feature and they made a spreadsheet server client but I don’t actually understand the usecase…_)

CDC (as of the last time I checked) is only used for Epicor Collaborate triggers / notifications. Collaborate becomes a client which consumes the CDC Entries as needed and uses that to send information / notifications to subscribers of certain records.

You can do the same with your own custom client if you’d like to.

1 Like

thanks, but where are the records saved ? which tables ?

1 Like

The records are saved as an XML bloob /diff in the cdc tables (CaptureLog)

1 Like

OK so the 5 minute internals of CDC Post
CDC - Change Data Capture is real CDC just not sql servers. it was added in the original 10.0 release because we knew we would need it but at that time Microsoft’s CDC was only available to Enterprise level customers and we didn’t feel that was reasonable for our customers.

it was initially only used for ESE - Epicor Social Enteprise. When that was deprecrated in favor of the new Collaborate product the CDC Tables got an actual UI in 10.2.500 and then it became available for general use in 10.2.600.
As of right now CDC is used internally by Collaborate and Enterprise Search. It is also used by at least 2 partner products I am aware of, one being spread sheet server, and is available for customers to use themselves.

Next reply- details

5 Likes

we are in the cloud, i have access to a read-only database.

i can not understand why the capturelog table is empty.

PS: i confirm the spreadsheet server is a partner product … we use it but i didnt understand from where it picks up data

1 Like

CDC Consists of 6 tables with a schema of CDC, a SQL Table trigger on every epicor delivered database table, 2 Task Processes run by the Task Agent and 2 UI’s in the product to configure it.

Tables:
CDC.Subscriber - this are all the external programs that have registered to use CDC against your database
CDC.SubscriberRule - When a subscriber registers they can define a set of rules against data changes - this rules determine what the subscriber cares about and what they want to be told.
CDC.SubscriberQueue - When a database change occurs that meets the requirements of a Subscribers Rule, a record is written to this table to let them know their rule was satisfied. The contents of what is written is controlled by the Rule definition.

CDC.AvailableTables - to avoid tracking too much data this controlled whether or not the CDC triggers on the data table are enabled and firing. you can only have rules for tables that are tracking CDC. By default nothing is tracked for any table unless they are enabled.

CDC.CaptureStatus - in current code this is ignored. it was the ‘soft stop’ for tracking data against a table. But that feature was accidentally lost when we switched to enabling/disabling triggers directly. I would like to bring it back but it is not currently a priority.

Finally we have the CDC.CaptureLog - this is where the raw change data gets written out. For Deletes and Inserts you get the current or last set of values for the row in xml format and what action was taken. For Updates you get a Before and After row state in xml to allow you to look for changes.

next - UI and Processes

4 Likes

All CDC UI elements start with CDC - so just search for that in the menu and follow along.

When you create a subscriber you have to pick a mode: Push, Push-Pull or Pull. Push means kinetic will notify your 3rd party app via a webhook any time a row is written to the CDC.SubscriberQueue for your Subscriber. Pull means that it will write records to the CDC.SubscriberQueue but will not notify you - you will have to ask for them. Do NOT lose your subscriber ID and Secret - you need those to be ablet o get the CDC data back out of the system.
The interesting part of the subscriber details are page size, TTL Read, TTL Unread. Page size is how many queue records you pull at once when you pull. TTL (Time To Live) determines how long kinetic keeps your records in the Queue table one’s you have already received and rows you have not pulled yet can have different TTL.

CDC Table Management is boring it just lets you enable CDC for table - when you do it enabled the DB trigger in the Database.

CDC LOg Processor - this is the most important task that you want running constantly if you have subscribers. We recommend putting it on an hourly schedule with the processor set to run for 59 minutes. that way if the server restarted thats OK you won’t be down for more than 60 minutes before it starts processing again.
This task loads the rule definitions for all Subscribers and then runs a loop reading in all records in CDC.CaptureLog looking for rows that trigger a rule. If any rules are triggered it will write out a record to the CDC.SubscriberQueue table. Roughly every 10 minutes to will go and delete records out of the CDC.CaptureStatus table that have already been processed. This keeps that table from overwhelming the database.

CDC Notification Sender - hopefully the name gives it away. This process monitors the CDC.SubscriberQueue for any subscribers that have enabled ‘push’ and provided a web hook. When a row is added to the Queue for a push subscriber, this is the process that will build a message and then make a rest call to the provided Webhook to inform the 3rd application that a rule has triggered and provide it with whatever details were defined in the rule itself.

next - Wrap up

3 Likes

Thanks Patrick, very very clear.

I just do not understand why when i do a select in the CaptureLog table. it is empty ?

I have thounsands of records in SubscribersRules … and the Agents (CDC LOG and CDC notification) are runung every 30mn

So more than 5 minutes but I had a rare meeting free morning block and caffeine to burn. That is the basics of CDC. It is the outbound half of Epicor Functions. For basic starting integration with Epicor you can:

  1. Use CDC to know when something changed and make rest calls to get the data you need.
  2. Use functions to tell kinetic you want to do something or update a record and let the complex kinetic business rules be handled by the function not by the calling application - keep your Kinetic rules in kinetic where they belong and are understood.

Something like Data Fabric that is out and growing is obviously a much more advanced tool for integration but it also comes with a cost and currently it is for targeted elements. CDC and functions are base functionality available to all customers and is a very generic tool - you have to write all the tooling and logic but you can do whatever you want with it.

hope this helps.

3 Likes

Because we delete rows from CDC.CaptureSTatus as they are processed. If we did not then you would quickly have millions or even hundreds of millions of change rows piled up in the table - our experience shows that it quickly can become the single largest table in your database and not so long after start challenging you for disk space. Want to try this at home? enable CDC but do not run the CDC Log Processor and watch your DB balloon and your free HD space shrink rapidly - especially every time MRP is run :slight_smile:

2 Likes

Well Damn @pferrington! thanks for this. Moving to experts corner! :partying_face::partying_face:

Shark Tank Mark GIF by ABC Network

5 Likes

Amazing ! everything is clear.

for Subscription TTL Unread, what happens if it is configured for 86400 (24 hours) and the subscriber doesnt PULL for more than 24 hours.

will these rows be lost ?

What does TTL READ mean please ?

Time To Live Read - how long, I think it is in seconds but don’t remember right now without looking. A message in the subscriber queue will be kept before it is deleted as part of a cleanup process. TTL Read is for a message the subscriber has already processed. It is adjustable per subscriber.

thank, i think it is not used cause the cleanup in the queue CDC is done only after 24 hours … the spreadsheet synchoonizes every hour