Track the wibbly-wobbliness (flakiness) of the integration tests over time, by recording pass/fail per test for every run. You'll find a bunch that are consistently inconsistent. Those tests need more love than the others.
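
A rough sketch of what that tracking could look like, assuming you can export `(test_name, passed)` pairs from your CI history. The function name and the input shape are made up for illustration, not taken from any particular CI tool.

```python
from collections import defaultdict

def flakiness(runs):
    """Return {test_name: failure_rate} for tests with mixed results.

    `runs` is an iterable of (test_name, passed) tuples gathered across
    many CI runs (a hypothetical export format).
    """
    outcomes = defaultdict(list)
    for name, passed in runs:
        outcomes[name].append(passed)

    report = {}
    for name, results in outcomes.items():
        failures = results.count(False)
        # Always-green or always-red tests are not flaky, so skip them.
        if 0 < failures < len(results):
            report[name] = failures / len(results)

    # Highest failure rate first: these need the most love.
    return dict(sorted(report.items(), key=lambda kv: kv[1], reverse=True))
```

Feed it a few weeks of history and the top of the list is where to start looking.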

Those tests probably have race conditions in them. Fix the race conditions.

When fixing race conditions, prefer waiting for an event over simply sleeping. In the worst case, the sleep won't fix anything (because build agents are always slower than you think), and in the best case it's slowing down an already slow test suite.
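
As a minimal sketch of the difference, assuming the thing under test runs in a thread and can be handed a `threading.Event` to signal on (`start_worker` is a hypothetical stand-in for your own startup code):

```python
import threading

def start_worker(ready: threading.Event) -> threading.Thread:
    """Hypothetical worker that signals when its startup is done."""
    def work():
        # ... real startup work would go here ...
        ready.set()  # tell the test we are ready

    thread = threading.Thread(target=work)
    thread.start()
    return thread

def test_worker_becomes_ready():
    ready = threading.Event()
    thread = start_worker(ready)
    # Instead of time.sleep(2) and hoping: block until the event fires.
    # The timeout is only an upper bound, so a fast machine does not
    # pay for it and a slow build agent gets the slack it needs.
    assert ready.wait(timeout=5.0), "worker never signalled readiness"
    thread.join()
```

The test returns as soon as the worker is ready, and a failure points at a missing signal rather than at a sleep that was too short.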

If you can't wait for an event, prefer polling for status (with short intervals) over simply sleeping.
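
A small polling helper might look something like this. The name `wait_until` and the default timings are assumptions, so tune them for your own suite.

```python
import time

def wait_until(condition, timeout=10.0, interval=0.05):
    """Poll `condition` until it returns truthy or `timeout` seconds pass.

    Returns True if the condition was met, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        # Short interval: we notice success quickly without spinning.
        time.sleep(interval)
```

In a test this reads as `assert wait_until(lambda: job_status() == "done")`, where `job_status` is whatever status check your product exposes.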

Consider adding event / status polling endpoints to your product. If you can't make them secure, make sure they are turned off in production. If you can make them secure, document them and tell your users about them. They'll find them useful, too.

For implementing event endpoints, if you're using HTTP, remember that Server-Sent Events (SSE) are a thing. There are client implementations in all popular languages. For one-way server-to-client notifications, you don't need to use websockets.
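
To show how little there is to the wire format, here is a simplified sketch of a parser for a `text/event-stream` body. It handles only the `event` and `data` fields and ignores `id` and `retry`, so treat it as an illustration rather than a complete client.

```python
def parse_sse(lines):
    """Yield {"event": ..., "data": ...} dicts from an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events with no data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value or "message"
        elif field == "data":
            data.append(value)
        # Other fields (id, retry) are ignored in this sketch.
```

A test can feed it the decoded lines of a streaming HTTP response and block until the event it cares about shows up.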
_________________________
-- roger