Any tips for training the system?

MikeGross · September 2, 2022, 8:58pm

@JerseyEric - good catch! And great discussion on Classification - thanks for some real usage stats too!

@Banderson - update! @JerseyEric is correct that I totally misspoke: I said User and it’s Vendor. I think it was the older Artsyl program (before Ancora) that was per user. Sorry for the confusion!

IDC training data is template specific, in the sense that it is tied to the unique value chosen for each type of document (the “Vendor OCR” field in our template). We’ve not covered this here, and that might be a huge reason you have poor accuracy.

gpayne · September 2, 2022, 10:55pm

The 9.2 release notes say that training is improved overall and could take two or fewer iterations, and you could imagine that fewer than two could be one.

So if you are not running 9.22, upgrading could help.

Data training process is improved overall. It takes two or less iterations, in most cases. Edge cases may require up to 10 iterations.

https://epicweb.epicor.com/dwn/DownloadFile.aspx?url=11243425E318085077868B08584DBEFF6D1A8091

Banderson · September 8, 2022, 3:34pm

Hey @gpayne , we are running 9.2. So hopefully that’s a good thing.

@JerseyEric , I just had a meeting with the people working with the system, and even after flipping that flag for a week, almost all of the incoming invoices get stuck in class verify.

We only have 1 type of document that we are trying to classify, and the other type is “Attachment” so that the OCR process doesn’t try to read anything from it.

So my question is, is the system supposed to learn/update the classification models as you process normal business? Or do we have to feed in documents to the learning system manually to improve the matching process?

My last question is, if I import a batch to use for classification training, will this add to the classification model? Or will this wipe out the model and replace it with what it learns from the new batch? Ideally, it would be nice to be able to add invoices from vendors whose invoices aren’t recognized, and not have to feed in everything every time.

JerseyEric · September 8, 2022, 6:50pm

I’m following this discussion closely.

For my existing batch types, where each document type / DFD is essentially customer or vendor-specific, we’ve built a Classification model for each batch type using the Classification tool and 40 samples of each document type.

But I think @Banderson is implementing the more common scenario for Epicor IDC – Don’t build any Classification model with the stand-alone Classification tool (the one with Demo still in the name!). Instead, use one document type / DFD for all vendor AP invoices (not including your additional DT for Attachments) and have IDC build and refine its Classification model as IDC processes invoices from your wide range of vendors. @Banderson, is my understanding correct for your AP Automation?

I’m curious, because I may be implementing the same [1 batch type; 1 document type; 1 DFD; don’t build the model from the Classification tool] for a “full-blown” AP Automation in the near future, but relying on my Epicor professional services provider to set things up correctly.

Banderson · September 8, 2022, 7:13pm

That’s really interesting. Yes, we are just trying to get “Invoice” or “Not Invoice” regardless of the vendor. I would imagine a category for each vendor would be easier… But I’m not sure how we would merge the data sources again (at this point all we are doing is renaming PDF files anyway).