Pulling SQL-Like queries from Server for Machine Learning and Data Analytics

I was brought on for data analytics, and I’m trying to learn the underlying structure of our information in epicor. At first, I thought a reasonable approach to get our data was cleaning production detail reports and efficiency logs and clean them for processing with python packages. This is definitely not the move and I need to find ways to query the underlying server for mass data collection.
In your opinion would connecting to the sql server via excel be a good place to start? Where would you start if you wanted to build searches with SQL queries?

NOTE: I have just started this so anything is helpful. I’ll do my best to provide as much information about our epicor layout if needed

I would recommend data replication setup. Where you SQL DB or certain views are replicated to a separate server / db and that can be used to get data out make cubes etc.

I would not recommend that you connect your production db to excel that’s a terrible idea IMO the production DB should be as pristine as possible and excel is a pig.

You can then connect to the replicated db with Power Pivot or the analytics engine of your choice. If you insist on using Excel Id recommend an OData Feed Pull via BAQ’s and the REST Interface.
(Quite a few articles about that in this forum, but the gist is you can pull data directly into Excel via the Epicor REST interface using an OData feed)

4 Likes

These recommendations are priceless. Excel is a mess I couldn’t agree more.

Great idea.

You can also pay Epicor for a replicated DB that you can build analytics around depending on your requirements.

1 Like

Is the ability to clone your data onto a replicated DB surrounded by red tape or relatively straightforward?

I honestly do not know. It’s just a replicated SQL db so I can’t imagine that there is too much red tape. Have you made a decision on your BI platform or is that still being decided?

1 Like

Haven’t decided yet. It isn’t as much about packaging the data into digestible dashboards, as it is taking it and running regressions and decision trees on it.

Peter, I would love to talk to you about this after you get something up and running. Would you keep me in mind?

I am looking to learn more about the subject matter and it would be cool to connect with you to learn about what you end up doing.

2 Likes

How clean is the data? Do you need to edit it to clean it up? Is your system on-prem or in the cloud?

If your data is clean enough and doesn’t need any editing then I guess you dont need to stage it anyway. In that case a replicated Db you can work with will be good. You could connect to it with python, excel or any other tool you have without the worry of impacting production. If you are on prem then spinning up a copy of the database should be straight forward. I assume you don’t need update the minute data to work with and if the data is a month old that is fine.

I did some experimental machine learning on our data. I had to do a fair bit of editing to clean up the data and I didnt want to edit the live data. I used Azure Data Studio to query the data with SQL queries then saved it out as a csv file. Cleaned it up in excel then used ML.Net/C# to create the machine learning model (I am familiar with C# so this option for easiest for me). As i had to edit the data the csv files were a good staging area for that process.

Very keen to hear how you go.

Brett

PS there is the Data Dictionary screen in Epicor that lists all the tables and fields. You can use that to start to get to know the structure of the tables.

1 Like

Said every data scientist everywhere.

:rofl:

1 Like