Long-running job

SQL Monitor’s ‘Job duration unusual’ alert is a powerful tool for keeping track of most jobs in your SQL Server environment. It works by comparing the running time of each job instance against the median for that job, and flagging variations in duration that could hint at a performance problem. However, if your environment contains big jobs whose durations vary by several orders of magnitude, it can be tricky to specify which variations are expected and which are unusual. For these cases, you can go beyond the functionality of the built-in alert with a custom metric.

This metric returns the duration of the longest running instance of the named job in the last hour. In the example, the job of interest is called ‘My_long_running_job’, so you’ll need to replace this with the name of the job in your environment that you want to target.

By creating multiple custom metrics from this example – one for each job you’re interested in – you can specify alert thresholds for each job individually. Alert thresholds are absolute rather than relative values, which may help cover a few scenarios that the built-in alert is not suited for.
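Because each metric targets a job by its exact name, it can help to list the SQL Server Agent jobs on an instance before creating the metrics. The query below is a minimal sketch that assumes you have read access to msdb; it only reads the standard msdb.dbo.sysjobs table and is not part of the metric itself.

```sql
-- List SQL Server Agent jobs so the exact name can be copied into the metric query.
SELECT
    j.name,
    j.enabled        -- 1 = job is enabled, 0 = disabled
FROM msdb.dbo.sysjobs AS j
ORDER BY j.name;
```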


Metric definition

Metric Name:
Long-running job: (My_long_running_job)
Note: Replace (My_long_running_job) with the name of the job you're monitoring.
Description:
This metric returns the total duration (in seconds) of the longest running instance of the named job.
Enter the T-SQL query that will collect data:
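SELECT TOP (1)                                  -- keep only the longest instance
      (run_duration / 10000) * 3600 +           -- hours to seconds (run_duration is an integer in HHMMSS form)
      ((run_duration % 10000) / 100) * 60 +     -- minutes to seconds
      (run_duration % 100) AS total_job_duration -- remaining seconds
FROM 
    msdb.dbo.sysjobs j
INNER JOIN
    msdb.dbo.sysjobhistory jh ON j.job_id = jh.job_id
WHERE
      j.name = 'My_long_running_job'            -- exact match; in a LIKE pattern '_' is a wildcard
-- Keep only history rows whose start time (run_date + run_time) falls within the last hour
AND CONVERT(DATETIME, CONVERT(CHAR(8), run_date, 112) + ' ' 
    + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), run_time), 6), 5, 0, ':'), 3, 0, ':'), 121) > DATEADD(HOUR, -1, GETDATE())
ORDER BY total_job_duration DESC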
Note: Replace 'My_long_running_job' with the name of the job you're monitoring.
Select instances to collect from:
Select all
Choose databases to collect from:
Specify databases and type
msdb
Set collection frequency:
Collect data every hour
Use collected or calculated values:
Leave the ‘Use a calculated rate of change between collections’ check box turned off
Metric collection is:
Enabled
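
The metric query relies on two msdb conventions: run_duration is stored as an integer in HHMMSS form, and run_time is an integer in HHMMSS form without leading zeros. As an illustration only, the sketch below applies the same arithmetic to a few hypothetical sample values (they are not real job history), so you can check the conversion without touching msdb.

```sql
-- Hypothetical sample values (not real job history), shaped like the
-- run_time and run_duration columns of msdb.dbo.sysjobhistory.
SELECT
    v.run_duration,
    (v.run_duration / 10000) * 3600            -- hours   -> seconds
    + ((v.run_duration % 10000) / 100) * 60    -- minutes -> seconds
    + (v.run_duration % 100)                   -- leftover seconds
        AS total_job_duration,
    -- Pad run_time to six digits, then insert colons to get hh:mm:ss.
    STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), v.run_time), 6), 5, 0, ':'), 3, 0, ':')
        AS start_time_text
FROM (VALUES
    (   930,    45),   -- 45 s            -> 45
    (143005,   130),   -- 1 min 30 s      -> 90
    (235959, 13542)    -- 1 h 35 min 42 s -> 5742
) AS v(run_time, run_duration);
```

The start_time_text column should read 00:09:30, 14:30:05 and 23:59:59 for the three rows, which is the text the metric query appends to run_date before converting it to a DATETIME.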

Alert definition

Alert name:
Long-running job: (My_long_running_job)
Note: Replace (My_long_running_job) with the name of the job you're alerting on.
Description:
This alert is raised when the longest instance of the named job recorded in the last hour ran for longer than the thresholds specified.
Raise an alert when the metric value goes:
Above the defined thresholds
Default threshold values:
Set the thresholds as follows (values are in seconds, matching the metric):
High: 120
Medium: 60
Low: 30
Raise an alert when the threshold is passed for:
Enter 1 collection
This alert is:
Enabled
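
Because the thresholds are absolute values in seconds, the defaults will rarely suit a genuinely long job. One possible way to choose values is to summarise the job's past durations first. The query below is a sketch under two assumptions: the job is still called 'My_long_running_job' (the placeholder used throughout), and msdb still holds enough history for the summary to be meaningful.

```sql
-- Summarise past durations (in seconds) for one job to help pick absolute thresholds.
-- 'My_long_running_job' is the placeholder job name; replace it with your own.
SELECT
    COUNT(*)             AS runs,
    MIN(d.total_seconds) AS shortest_seconds,
    AVG(d.total_seconds) AS average_seconds,
    MAX(d.total_seconds) AS longest_seconds
FROM msdb.dbo.sysjobs AS j
INNER JOIN msdb.dbo.sysjobhistory AS jh
    ON j.job_id = jh.job_id
CROSS APPLY (
    -- Decode the HHMMSS integer into seconds once, so it can be reused above.
    SELECT (jh.run_duration / 10000) * 3600
         + ((jh.run_duration % 10000) / 100) * 60
         + (jh.run_duration % 100) AS total_seconds
) AS d
WHERE j.name = 'My_long_running_job'
  AND jh.step_id = 0;   -- step_id 0 is the job outcome row, covering the whole job
```

Setting Low, Medium and High somewhere above the typical duration and around the longest acceptable duration is a judgment call for each job; the summary only gives you a starting point.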