So our customer has an AP workflow set up for their invoices - basically they go into IDC, get indexed, output the PDF and CSV datafile, and then the ECM client picks it up from there.
However, we’ve been having a sporadic issue where the ECM client will grab the CSV file and try to import it, but leave the PDF sitting alone in the watched folder. When I put the CSV back in the watched folder to try a manual import, the client says it cannot see any documents to upload. It’s like the PDF is invisible.
This customer puts through hundreds of invoices a week and 99% of them are fine, but then they’ll get one or two hung up in this manner. So I know it’s not a configuration issue.
Support have been less than helpful on this. I’ve had a ticket open for a month and it’s boiled down to “idk weird”. I escalated the other day but I’m still waiting on a response. Has anyone else had this problem?
@Banderson can say for sure but I vaguely remember having an issue like this and it had to do with the PDF file being corrupted or generated in a “funky” way.
To test it out, open the PDF and send it through a PDF printer (or use some other method to convert it to a new PDF), then try importing the new copy.
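Before re-printing, a quick sanity check on the stuck file might narrow things down. Below is a rough Python sketch, not anything from IDC or ECM: the function name `looks_like_complete_pdf` is made up for illustration, and it only checks that the file starts with a `%PDF-` header and has an `%%EOF` marker near the end. A file that fails is probably truncated or damaged; a file that passes can still be "funky" in ways this does not detect.

```python
from pathlib import Path


def looks_like_complete_pdf(path, tail_bytes=1024):
    """Heuristic only: PDF header at the start, %%EOF marker near the end.

    Hypothetical helper for illustration; it does not validate the PDF
    structure and says nothing about what the ECM client will accept.
    """
    data = Path(path).read_bytes()
    has_header = data.startswith(b"%PDF-")
    # A complete PDF normally ends with an %%EOF marker in its last bytes.
    has_eof = b"%%EOF" in data[-tail_bytes:]
    return has_header and has_eof
```

If the stuck PDFs fail this check while the ones that import fine pass it, that would point toward the file being cut off while it was written rather than a problem in the ECM client itself.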
I would restart the EclipseAutomation service daily with a scheduled task as a way to maintain the ECM Client.
ECM Client batch imports typically find the new PDFs that are put in a folder, but when the ECM Client has been running constantly for a week or more, I find it starts getting sloppy with the import of files. I would encourage the customer to restart the server/computer running the ECM Client weekly.
I use XML/PDF from IDC and do not usually have the PDF stay behind if I use the steps above. The older versions of ECM clients did experience this issue more often. If they are on-prem, an upgrade may help in these cases, but I would encourage verification of version changes by looking at the release notes and possible use of a test environment whenever possible.
IDC will add its own data to the PDF, which is why the exported files are larger than they were when imported, even if you compress them as part of the Batch Type configuration. There is an option to export the original file, which can circumvent this and may help resolve the issue if IDC is in fact corrupting the data as it maps things. That said, this works better in later versions, as previous versions would still export both the IDC PDF and the original, which caused other issues.
Your issue reminds me of an issue I faced about 3 years ago with an On Premises DocStar IDC/ECM using PDFs w/CSV files. Occasionally, a few PDFs were not getting imported, but the majority were. Detection was not so easy, because the PDF-to-IDC-to-ECM process has so many points of failure (especially if your PDFs are getting intercepted from e-mail attachments or if ECM is then feeding the data to another application as a transaction).
When I troubleshot it, I could see the unprocessed PDF(s) sitting in the server folder where they should have been awaiting import into ECM. But the corresponding CSV files were missing.
Root Cause: An Anti-Virus lock on the PDFs! Our Windows server admins had deployed a new anti-virus tool that was locking the PDF for a short duration. The same tool was either ignoring CSV files or the lock duration to scan CSVs was much shorter than for PDFs.
Meanwhile, the CSV file was getting picked up for import into ECM. But since the import could not lock (and possibly temporarily rename) the PDF file, the import process was simply deleting the CSV. Subsequently, the PDF never got imported after the anti-virus tool released its hold, because its CSV file had already been deleted from that folder.
To resolve this issue, I had to ask our Windows server admins to stop scanning PDFs in the folder where IDC dropped the PDFs & CSV files for import! They thought it was an odd & unwise request, but they did it!
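Since detection was the hard part, something like the following Python sketch could be run against the import folder to list PDFs that have lost their CSV (and the reverse). It assumes each PDF and its CSV share the same base file name, which may not match how your IDC export is configured, and `find_orphans` is just an illustrative name, not part of any DocStar tooling.

```python
from pathlib import Path


def find_orphans(folder):
    """Return (pdfs_without_csv, csvs_without_pdf) as sorted lists of base names.

    ASSUMPTION: a PDF and its CSV share the same base name, e.g.
    invoice123.pdf and invoice123.csv. Adjust if your export names differ.
    """
    pdfs, csvs = set(), set()
    for entry in Path(folder).iterdir():
        if not entry.is_file():
            continue
        suffix = entry.suffix.lower()
        if suffix == ".pdf":
            pdfs.add(entry.stem.lower())
        elif suffix == ".csv":
            csvs.add(entry.stem.lower())
    return sorted(pdfs - csvs), sorted(csvs - pdfs)
```

A PDF that stays on the first list across several runs is a likely candidate for the situation described above, where the CSV was consumed and deleted while the PDF was still locked.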
The ECM importer will not tolerate the # character if it is at the beginning of a line. This vendor’s VendorOCR began with a #, and since it’s the first field, it’s at the front of line 2 in the CSV. As soon as I removed it, it worked.
Then I set VendorOCR to NonExportable because ECM doesn’t need it anyway.
Everyone keep an eye out for that. It was a sneaky problem and took me ages to realize.
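For anyone who wants to check their own datafiles for this, here is a small Python sketch that reports any line in a CSV beginning with #. The function name `lines_starting_with_hash` and the `utf-8-sig` default encoding are assumptions for illustration; it only looks at the raw first character of each line, so a value wrapped in quotes (a line starting with `"#`) would not be flagged.

```python
def lines_starting_with_hash(csv_path, encoding="utf-8-sig"):
    """Return (line_number, line_text) for every line that begins with '#'.

    Hypothetical helper: checks the raw first character only, so quoted
    values such as "#123" are not reported.
    """
    hits = []
    with open(csv_path, encoding=encoding) as handle:
        for number, line in enumerate(handle, start=1):
            if line.startswith("#"):
                hits.append((number, line.rstrip("\n")))
    return hits
```

Running it over a stuck CSV returns an empty list when no line starts with #; anything it does return tells you which line number to look at.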