Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IndexOptimize job randomly reporting failure even when the log file says the Outcome succeeded #807

Open
gkshitij opened this issue May 28, 2024 · 6 comments

Comments

@gkshitij
Copy link

gkshitij commented May 28, 2024

Description of the issue
IndexOptimize Job randomly reports failures in SQL Agent history even when it runs all the Index maintenance commands successfully. I checked the logs.txt files for a successful run and failed run and all the steps look similar beside the different in statistics. For example, last successful run completed in around 3:46 hours and the recent 'failed' job ran for 3:40 hours. Upon checking the logs in command log table and the .TXT file, the commands seems to be running successfully until the end. One job was reported failed whereas the other was reported successful

SQL Server version and edition
Execute SELECT @@VERSION
SQL Server Standard 2018 14.0.2052.1

Version of the script
Check the header of the stored procedure
Version: 2019-12-31 15:48:33

What command are you executing?
EXECUTE [dbo].[IndexOptimize]
@databases = 'USER_DATABASES',
@FragmentationLow = NULL,
@FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
@FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
@FragmentationLevel1 = 5,
@FragmentationLevel2 = 30,
@UpdateStatistics = 'ALL',
@OnlyModifiedStatistics = 'Y',
@LogToTable = 'Y',
@indexes = 'ALL_INDEXES, -%.%.%.IX_PerfSample, and bunch other specific indexes'

What output are you getting?

This is the end of .txt log file

"ate and time: 2024-05-27 03:58:27 [SQLSTATE 01000]
Command: UPDATE STATISTICS [redacted].[VeeamOD].[Staging] [PK_OD_Staging] [SQLSTATE 01000]
Comment: ObjectType: Table, IndexType: Index, IndexType: Clustered, Incremental: N, RowCount: 235, ModificationCounter: 658 [SQLSTATE 01000]
Outcome: Succeeded [SQLSTATE 01000]
Duration: 00:00:00 [SQLSTATE 01000]
Date and time: 2024-05-27 03:58:27 [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2024-05-27 03:58:27 [SQLSTATE 01000]
[SQLSTATE 01000]"

This is the end of Message in SQL Agent Job history, which I believe is truncated

" Date and time: 2024-05-27 00:18:25 [SQLSTATE 01000] (Message 50000) [SQLSTATE 01000] (Message 50000) Date and time: 2024-05-27 00:18:25 [SQLSTATE 01000] (Message 50000) Command: UPDATE STATISTICS [redacted].[dbo].[DatabaseStatus] [_WA_Sys_00000002_5AEE82B9] [SQLSTATE 01000] (Message 50000) Comment: ObjectType: Table, IndexType: Column, Incremental: N, RowCount: 352, ModificationCounter: 33 [SQLSTATE 01000] (Message 50000) Outcome: Succeeded [SQLSTATE 01000] (Message 50000) Duration: 00:00:00 [SQLSTATE 01000] (Message 50000) Date and time: 2024-05-27 00:18:25 [SQLSTATE 01000] (Message 50000) [SQLSTATE 01000] (Message 50000) Date and time: 2024-05-27 00:18:25 [SQLS... The step failed."

@ampetelin
Copy link

Dear @gkshitij

Have you found a solution? I ran into the same problem, but with a backup script. Everything is successful in the logs, but in the history of the SQL agent [SQLSTATE 01000... The step failed.

@Stiffboard
Copy link

Search for msg in the output file. Or "come: F" as in Outcome: Failed and you will find a deadlock or something similar.

@drstonephd
Copy link

The step did not fail so perhaps it's not the step. Did you check the agent error log? Perhaps the agent is failing to do a final activity like a notification. I've seen similar in the past when the agent could not log to the file, but you are looking at the file so.... Is there a message at the end of the log file that's missing when it does not fail? Is there another process running that can lock your log files? Are the recommended virus scan exclusions for SQL exclusions in place?

There is a subtle difference in "non-error" severity levels in jobs. Severities 0-10 are informational in the script.
However, severities 2-9 cause the job step to fail. I don't think the step is failing, so this would not be the cause. Besides, I doubt there are messages with levels 2-9. It would not be random unless the message was random.

https://chadbaldwin.net/2021/01/15/raiserror-cheatsheet.html

@ampetelin
Copy link

Search for msg in the output file. Or "come: F" as in Outcome: Failed and you will find a deadlock or something similar.

Hmm, you were right. I thought that if an error occurs on some database, then the script is not executed on the following databases and stops.

I did find an error message on the RESTORE VERIFYONLY FROM DISK command:
Command: RESTORE VERIFYONLY FROM DISK = N'\my_path\mydatabase.bak' WITH CHECKSUM [SQLSTATE 01000]
Msg3203, Sev 16, State 1, Line 1 : Read on "\my_path\mydatabase.bak" failed: 13(The data is invalid.) [SQLSTATE 42000]

I will deal with it, thanks for the help

@Stiffboard
Copy link

I did find an error message on the RESTORE VERIFYONLY FROM DISK command: Command: RESTORE VERIFYONLY FROM DISK = N'\my_path\mydatabase.bak' WITH CHECKSUM [SQLSTATE 01000] Msg3203, Sev 16, State 1, Line 1 : Read on "\my_path\mydatabase.bak" failed: 13(The data is invalid.) [SQLSTATE 4200

Well, this is not from the index job, this is output file from the backup job.

@ampetelin
Copy link

I did find an error message on the RESTORE VERIFYONLY FROM DISK command: Command: RESTORE VERIFYONLY FROM DISK = N'\my_path\mydatabase.bak' WITH CHECKSUM [SQLSTATE 01000] Msg3203, Sev 16, State 1, Line 1 : Read on "\my_path\mydatabase.bak" failed: 13(The data is invalid.) [SQLSTATE 4200

Well, this is not from the index job, this is output file from the backup job.

Yep, I initially said that I encountered a similar problem, only with a backup script

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants