Does anyone have anything set up to monitor PIDs or processes that are stuck in the sending state? Lately we’ve been seeing more processes on the appservers stuck in sending for more than 5 minutes, and we would not expect a process to stay in sending status that long. What causes a process to get hung up, and what is a reasonable amount of time to wait before manually killing the process? This has been impacting network performance whenever we have a stuck process.
Are you on Progress or SQL?
If Progress:
$ proGetStack <pid>
where <pid> is the process ID of the ABL process. The file generated is protrace.<pid>.
When I have done this, I would usually find out that the stuck process was tied to a BPM.
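If you want to grab stacks for several stuck PIDs at once, a small wrapper like the one below works. This is just a sketch: the PID list is a placeholder for whatever you pulled from the broker, and proGetStack has to be on the PATH of an account allowed to signal those processes.

# Sketch: run proGetStack against a list of suspect PIDs so each one
# dumps its ABL stack to a protrace.<pid> file. PIDs are placeholders.
import subprocess

suspect_pids = [12345, 23456]   # PIDs you spotted stuck in sending

for pid in suspect_pids:
    subprocess.run(["proGetStack", str(pid)], check=True)

print("Look for protrace.<pid> files in the appserver working directory")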
We use the Fathom tool in a browser with tabs open to the server pool and the client connections. You can match the sending PID to the client to find the user, then search the appserver log (if it is in verbose mode) for the PID to see what the user was running. Captures and Postings can take up to 45 minutes for us, so we always confirm the user is not in accounting before we kill them.
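When the log is in verbose mode, a quick script like the one below is enough to pull every line for a PID so you can see what the user was running. Just a sketch; the log path and default PID are placeholders for your own setup.

# Sketch: print every line in a verbose appserver log that mentions a PID.
# The log path and PID below are placeholders for your own environment.
import sys

log_path = r"D:\Epicor\logs\MfgSys.server.log"   # placeholder path
pid = sys.argv[1] if len(sys.argv) > 1 else "12345"

with open(log_path, errors="ignore") as log:
    for line in log:
        if pid in line:
            print(line.rstrip())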
There is an issue in our version 905.702a with dashboards: if you refresh a second time without clicking in the tracker view, it will attempt to deliver all records for the query. I also recently found a BAQ report for a list of jobs that a user was running for all jobs, and it was using a huge amount of memory.
Greg
We are on SQL. Have you done any automation to alert you that a PID was hung?
We do not have our appserver logs set to verbose to find out what was running, because of the large amount of error messages that get written. The hung PIDs are on a different appserver than the task agent server. Even if the Capture or Posting process is taking 45 minutes, we are not seeing the issue on the task agent server. We would expect the process to be handed off to the task agent server in less than 5 minutes so that it can run there.
We have our appserver and task agent server on separate machines, so why would a process get stuck on the appserver in the sending state?
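The kind of automation we are picturing is something that polls the broker and alerts when a PID has been sitting in sending too long. A very rough sketch is below; it assumes the asbman -query output lists each agent PID with its state, and the broker name, parsing, threshold, and alert action are all placeholders.

# Rough sketch: poll the AppServer broker, track how long each agent PID
# has shown a SENDING state, and alert once it passes a threshold.
# Broker name, output parsing, and the alert action are placeholders.
import re
import subprocess
import time
from datetime import datetime, timedelta

BROKER = "MfgSys"                  # placeholder broker name
THRESHOLD = timedelta(minutes=5)   # how long in sending before we alert
first_seen = {}                    # pid -> when we first saw it in SENDING

while True:
    out = subprocess.run(["asbman", "-name", BROKER, "-query"],
                         capture_output=True, text=True).stdout
    now = datetime.now()
    sending = set()
    for line in out.splitlines():
        # assumes agent detail lines start with "<pid>  SENDING ..."
        match = re.match(r"\s*(\d+)\s+SENDING\b", line)
        if match:
            pid = int(match.group(1))
            sending.add(pid)
            first_seen.setdefault(pid, now)
    for pid in list(first_seen):   # forget PIDs that moved out of SENDING
        if pid not in sending:
            del first_seen[pid]
    for pid, since in first_seen.items():
        if now - since > THRESHOLD:
            print(f"ALERT: PID {pid} has been in sending since {since}")
            # send an e-mail or page here, and maybe grab a protrace
    time.sleep(60)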
Nothing runs on the task or process servers; you will see the stuck processes on the main appservers. We have a separate print appserver, so reports run there, which gives me a clue where to start looking.
Our logs are in 40MB files and total about 2GB per week, and they are erased when I restart the appservers on the weekend. I can’t imagine troubleshooting without them.
If you use the client connections and search for the user, you can ask them what they were running. Mine usually have no idea, or won’t tell me that they closed Epicor with Task Manager.