@mpforce1

Description

Implements an integration with the new Hermes API. As per issue #722, it seems Twitch has finally finished A/B testing the new Hermes API. This PR adds a new WebSocket-based integration with this API. It acts as a drop-in replacement for the older PubSub WebSocket integration.

As part of the implementation some of the handling of PubSub Messages has been abstracted. This was done to avoid duplicating the handling logic in both integrations.

I've added a new setting, use_hermes, to the miner constructor. It defaults to False for now to avoid changing current behaviour for most users. In the future this setting might be removed if Twitch drops the old PubSub API.

I've opened this PR as draft as I'm currently seeing how it acts in the long term. This is because the reconnection logic in the new integration is different and I want to make sure it works as designed. To that end, it'd be really useful if some other community members could try out this branch and see if they run into any issues. In particular, I'd like to know:

  • If anyone runs into any cases where some topics don't end up getting subscribed to or they get dropped at some point later, for example during reconnection.
  • If anyone finds a case where a client should reconnect but no new client is actually created.
  • If anyone finds issues when the pool manages multiple clients, for example if the pool leaves a dangling connection or more clients are created than are actually needed.
  • If I've introduced a regression with the old PubSub integration. This can be tested by setting use_hermes to False or omitting it from the configuration.

I've tested all of this locally but more users testing in more varied setups should help catch any edge cases.

Fixes #722

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Tested on my local setup. I'm currently running a long-running test to check the reconnection logic. However, my test setup only connects to a few streamers.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented on my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (README.md)
  • My changes generate no new warnings
  • Any dependent changes have been updated in requirements.txt

Implements a WebSocket client and pool for the Hermes API.

Abstracts some of the handling for PubSub Messages.

Adds the "use_hermes" Setting to allow the use of the Hermes WebSocket API instead of the PubSub WebSocket API. Defaults to False.

Renamed WebSocket classes and moved them to new websocket module.
@Gamerns10s

Have been using it for around 48+ hours and only encountered "websocket closed None : None error" and "websocket error 104 connection closed by peer " error otherwise all features of the code are working perfectly fine ( excluding betting and drops which are still yet to test)
FYI:- I am on Linux x64 vps

@mpforce1
Author

Have been using it for around 48+ hours and only encountered "websocket closed None : None error" and "websocket error 104 connection closed by peer " error otherwise all features of the code are working perfectly fine ( excluding betting and drops which are still yet to test) FYI:- I am on Linux x64 vps

Hey, thanks for your help!

The websocket closed None : None error message is normal. The Nones there are the status code and reason text. None generally comes through if the WebSocket gets closed locally for any reason; if there's a real problem, something more useful should show up as an error elsewhere in the logs. I might downgrade it to DEBUG since it's not something you should normally have to worry about, although the original WebSocket logs it as INFO so we'll see.

The other message is probably fine. 104s can happen for a number of reasons but if it's reconnecting without issues I wouldn't worry about it.

@Gamerns10s

For the None : None errors, I just looked at another of my miners, which is on PubSub, and noticed that the Twitch servers were actually having some errors: both miners (one on Hermes and one on PubSub) were disconnected at the same time, which means it isn't actually a code issue.

@G0KU0

G0KU0 commented Sep 30, 2025

how to download?

@mpforce1
Author

how to download?

You can clone my branch here. No automated builds for this one since it's not an official release.

@G0KU0

G0KU0 commented Sep 30, 2025

how to download?

You can clone my branch here. No automated builds for this one since it's not an official release.

I have a question: If I upload this to rander.com after I've linked it to my Twitch account, why does it ask me to link it again if I've already linked it?

@mpforce1
Author

I have a question: If I upload this to rander.com after I've linked it to my Twitch account, why does it ask me to link it again if I've already linked it?

Sorry, I have no idea.

@G0KU0

G0KU0 commented Sep 30, 2025

If I modify this part, will it work the same way?
username=os.environ.get('username'),
password=os.environ.get('password'),

@Gamerns10s

Please try to use chatgpt for code modifying questions or just open a discussion or issue in the main repo. Also do read the readme file thoroughly it explains everything.
We are trying to keep conversation related to the actual issue here 🙂

@Gamerns10s

Gamerns10s commented Sep 30, 2025

Update: There might be an issue somewhere related to betting cuz some of the bets are getting logged twice, which means maybe they are being processed twice (there's a chance the streamer actually made 2 different predictions back to back, not sure, but still thought to drop my observations here)
@mpforce1
IMG_20251001_010110

@mpforce1
Author

Update: There might be an issue somewhere related to betting cuz some of the bets are getting logged twice, which means maybe they are being processed twice (there's a chance the streamer actually made 2 different predictions back to back, not sure, but still thought to drop my observations here)

Yeah, that looks like 2 different predictions with the same name happening close together. Might be worth checking the stream to see if it happens again.

@Gamerns10s

I checked the stream and the predictions were actually made, so no issue there. But the disconnections are a lot more frequent compared to pubsub: I disconnected 7-8 times in a 12 hour run, and the shortest gap between them was only about 35-40 mins.

@mpforce1
Author

mpforce1 commented Oct 1, 2025

But the disconnections are a lot more frequent compared to pubsub: I disconnected 7-8 times in a 12 hour run, and the shortest gap between them was only about 35-40 mins.

I've seen an increase in reconnects too, not to the same degree but then I only run with a single client in the pool. I think this is something they've built into the new API to better distribute server load but no way to know for sure. As long as it reconnects without issue there should be no problem.

For that 35-40 min gap, were the disconnections from something like a timeout or were they from a reconnect message from Twitch? I'm asking because it could be you were just experiencing a period of poor connectivity. The latter case can be checked by looking in the DEBUG logs for a reconnect message just prior to the reconnection attempt. Otherwise, it'll be the former case and you should see either the code and reason in the #[client number] - WebSocket closed: [status] - [reason] log or the error in the #[client number] - WebSocket error: [error message] log.

For reference, the reconnect message logs look like:

[timestamp] - DEBUG - TwitchChannelPointsMiner.classes.websocket.hermes.Client - [on_message]: #0 - Received: {"reconnect":{"url":"wss://hermes.twitch.tv/b/v1?..."},"id":"...","type":"reconnect","timestamp":"..."}

where the type is reconnect.
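For anyone scanning their own DEBUG logs, here is a minimal sketch of how such a payload could be recognised once the JSON part of the log line has been cut out. The helper name `extract_reconnect_url` is made up for illustration and is not taken from this branch; the field names are the ones visible in the sample line above.

```python
import json
from typing import Optional


def extract_reconnect_url(raw: str) -> Optional[str]:
    """Return the reconnect URL if `raw` is a reconnect message, else None."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Only messages whose "type" is "reconnect" are of interest here.
    if not isinstance(message, dict) or message.get("type") != "reconnect":
        return None
    reconnect = message.get("reconnect")
    if not isinstance(reconnect, dict):
        return None
    return reconnect.get("url")


# Payload copied from the sample log line, elisions included.
sample = (
    '{"reconnect":{"url":"wss://hermes.twitch.tv/b/v1?..."},'
    '"id":"...","type":"reconnect","timestamp":"..."}'
)
print(extract_reconnect_url(sample))
```

Anything that is not valid JSON, or whose type is something else, simply yields None, so the helper can be run over every Received: payload in a log file.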

@Gamerns10s

I am on the normal logging level, as on debug the whole console gets flooded and it's really hard to observe things, but I have changed the logging to debug only for the websockets module (in the TwitchChannelPointsMiner.py file) and am still waiting for a disconnect to occur.
For that 35-40 min gap, it only logged connecting to hermes server in 30 secs and the None:None error (at INFO logging level)

Also while going through logs rn I found these too, and fyi they didn't occur during or near any reconnection/disconnection
IMG_20251001_152557

@mpforce1
Author

mpforce1 commented Oct 1, 2025

I am on the normal logging level, as on debug the whole console gets flooded and it's really hard to observe things, but I have changed the logging to debug only for the websockets module (in the TwitchChannelPointsMiner.py file) and am still waiting for a disconnect to occur. For that 35-40 min gap, it only logged connecting to hermes server in 30 secs and the None:None error (at INFO logging level)

You can set the file logs to DEBUG while keeping the console logs to INFO, that's how I've got mine set up. You might already have it that way, in which case you can just check the log file.
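As a rough illustration of that split, the sketch below uses only the Python standard library: one logger, a console handler at INFO and a file handler at DEBUG. It is not the miner's own logger configuration; the logger name, file name, and format string are placeholders chosen for the example.

```python
import logging
import os
import tempfile


def build_logger(log_path: str) -> logging.Logger:
    """Create a logger that is quiet on the console but verbose in the file."""
    logger = logging.getLogger("hermes_logging_demo")  # placeholder name
    logger.setLevel(logging.DEBUG)  # let each handler do its own filtering
    logger.propagate = False
    logger.handlers.clear()

    console = logging.StreamHandler()
    console.setLevel(logging.INFO)  # console stays readable

    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setLevel(logging.DEBUG)  # file keeps everything

    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(name)s - %(message)s")
    for handler in (console, file_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


log_path = os.path.join(tempfile.mkdtemp(), "miner-demo.log")
logger = build_logger(log_path)
logger.debug("only in the file")
logger.info("in the file and on the console")
```

The key point is that the logger itself is set to DEBUG and the filtering happens per handler, so the console is not flooded while the file still has the detail needed for debugging.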

Also while going through logs rn I found these too, and fyi they didn't occur during or near any reconnection/disconnection

Looks like an unrelated error, a quick issue search doesn't find that anywhere so if it happens again while using the regular miner release you should open an issue about it.

@Gamerns10s

I don't use that too cuz my vps doesn't have root access 🤦🏻‍♂️😂

@Gamerns10s

AI analysis of what could be happening
It might be useful to spot the issue so thought to share
IMG_20251001_203050

@mpforce1
Author

mpforce1 commented Oct 1, 2025

AI analysis of what could be happening It might be useful to spot the issue so thought to share

I've actually already considered the accurate parts of this, at least, and I'm not sure it's necessary to respect the keepalive timeout. It feels unnecessary to close the connection early when keepalives aren't being sent in a timely manner, since it could just be server congestion. That said, I'll give you that the Twitch web client does seem to implement something like this, so maybe it would be for the best.

The bit about "initiate the keep-alive" cycle is wrong, the server is responsible for sending keepalives. All the client has to do is connect, authenticate, and subscribe. The old PubSub API used a PING/PONG approach to keep the WebSocket connection alive but Hermes does not expect the client to send regular messages.

The more frequent disconnections might just be how this API is going to be. As long as the miner is able to reconnect, does it really matter if it's disconnecting more often? I might try and hide some of the log messages if the client is told by Twitch to reconnect, it's not a case the user should worry about.

@Gamerns10s

Gamerns10s commented Oct 1, 2025

Logs from Debug websocket logging
It's always this error after which it logs that it disconnected
IMG_20251002_000641

@mpforce1
Author

mpforce1 commented Oct 1, 2025

That's a weird one, I've never seen it log the frame as an error before. From what I can tell fin=1 means this is the final frame, opcode=8 means this is a close frame, and data=b'\x10\x04ping pong failed' is the data for the frame. The first 2 bytes are actually the WebSocket status code, calculated as 256 * int(self.data[0]) + int(self.data[1]) in the library we use. 0x10 is 16 and 0x04 is 4, so that's 256 * 16 + 4 = 4100, which is a custom code, so we don't know exactly what it means to Twitch. You can see in the Twitch Web Hermes client that they specifically handle 4100 by (I think) logging it and resetting the socket. I definitely need the full log files for this one since it's so strange.
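The arithmetic can be checked with a small standalone sketch. The function name `parse_close_payload` is hypothetical and is not the library's API; it only mirrors the calculation described in this comment, where the first two bytes of a close frame payload are a big-endian status code and the rest is the reason text.

```python
from typing import Optional, Tuple


def parse_close_payload(data: bytes) -> Tuple[Optional[int], str]:
    """Split a close frame payload into (status code, reason text)."""
    if len(data) < 2:
        # A close frame may carry no payload at all.
        return None, ""
    status = 256 * data[0] + data[1]  # same as int.from_bytes(data[:2], "big")
    reason = data[2:].decode("utf-8", errors="replace")
    return status, reason


print(parse_close_payload(b"\x10\x04ping pong failed"))  # (4100, 'ping pong failed')
```

Codes in the 4000-4999 range are reserved for private use by RFC 6455, which is why the meaning of 4100 is up to Twitch.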

Also, can you say where NoneType: None is coming from?

@Gamerns10s

As of now these are the only logs related to disconnections but I'll switch to debug logging and will get more info about the same

@Gamerns10s

As of now this is all I could gather hope it helps
IMG_20251002_144220
IMG_20251002_144526

@mpforce1
Author

mpforce1 commented Oct 2, 2025

@Gamerns10s I'm sorry but without the full stack trace for the error I can't pinpoint where it's coming from. Is there any chance you could run it on a machine where you have access to the log files? I would also need less=False in the logger settings.

@Gamerns10s

In that case I can run it on termux but it'll take some time to get the logs but I'll surely try to share it asap

@Gamerns10s

@mpforce1 do you have any other social media or email address where I can share you the logs file
I don't wanna share it publically here 😅

@mpforce1
Author

mpforce1 commented Oct 2, 2025

@mpforce1 do you have any other social media or email address where I can share you the logs file I don't wanna share it publically here 😅

No, sorry. The last time this happened the user uploaded them to proton storage and posted a link here. Once I'd confirmed download they deleted the files. You could try something like that.

@Gamerns10s

I think at one point my device lost internet connection you can ignore that disconnection 😅
https://limewire.com/d/GT4VA#YmcVtKLf43

@mpforce1
Author

mpforce1 commented Oct 4, 2025

same :(

Any chance you could post your log file for this? I'd need file_level=logging.DEBUG and less=False turned on to get the most detail.

@OlevO1

OlevO1 commented Oct 4, 2025

here you go

log.log

@mpforce1
Author

mpforce1 commented Oct 4, 2025

@OlevO1 Weirdly, you seem to somehow have my latest commit which better respects the keepalive messages but you don't have the 2 other ones I pushed up. I can tell this because I see it logging #0 - Keepalive timeout but I don't see correctly formatted timestamps, you're sending timestamps like "2025-10-04T14:23:45Z" which is missing milliseconds and is not in UTC. No idea how that could have happened, probably worth pulling a completely fresh copy of the branch.
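For reference, a timestamp that is both in UTC and carries milliseconds can be produced with the standard library as sketched below. The exact format Hermes expects is an assumption here (ISO 8601 with a Z suffix); the comment above only states that milliseconds and UTC are needed. The helper name `utc_timestamp_ms` is made up for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def utc_timestamp_ms(moment: Optional[datetime] = None) -> str:
    """Format a datetime as UTC with millisecond precision and a Z suffix."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    as_utc = moment.astimezone(timezone.utc)  # convert before formatting
    return as_utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


# A local time two hours ahead of UTC is converted, not just relabelled.
local = datetime(2025, 10, 4, 16, 23, 45, 678000, tzinfo=timezone(timedelta(hours=2)))
print(utc_timestamp_ms(local))  # 2025-10-04T14:23:45.678Z
```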

@OlevO1

OlevO1 commented Oct 4, 2025

log2.log

Alright reinstalled everything. It’s better now.
Is it normal that it reconnects every 40-50 mins?

@mpforce1
Author

mpforce1 commented Oct 4, 2025

Is it normal that it reconnects every 40-50 mins?

Seems to be, enough people have confirmed frequent reconnections that it looks like Twitch has built it into the API. I'll do a little more investigation and see if it's something I can do anything about.

@OlevO1

OlevO1 commented Oct 4, 2025

Is it normal that it reconnects every 40-50 mins?

Seems to be, enough people have confirmed frequent reconnections that it looks like Twitch has built it into the API. I'll do a little more investigation and see if it's something I can do anything about.

okay, ty

@OlevO1

OlevO1 commented Oct 4, 2025

log.log

It was on Less=True, but I don’t get what’s wrong again

@mpforce1
Author

mpforce1 commented Oct 5, 2025

@OlevO1 I think I see what's happened here. First it's timed out for this ping pong failed reason. Then it's created a new client and closed the old one. Then it checks for stale connections and, erroneously, considers the new unopened client as stale. It then closes that client, but the closure logic doesn't work for uninitialised clients. This leaves the pool with a dangling client. I'm just testing a fix; I'll ping when it's ready.

…nitialised client is closed and then opened.

Adds a timeout (of 5 minutes) for a client to wait for initialisation.
Adds the client id to log messages, unless less is True.
@mpforce1
Author

mpforce1 commented Oct 5, 2025

@OlevO1 I've pushed up a new version that should fix that dangling client issue. Let me know if it keeps happening.

If you run into that ping pong failed close error again, could you please run a test for me? If so, could you set less=False and go to TwitchChannelPointsMiner.py and change

logging.getLogger("chardet.charsetprober").setLevel(logging.ERROR)
logging.getLogger("requests").setLevel(logging.ERROR)
logging.getLogger("werkzeug").setLevel(logging.ERROR)
logging.getLogger("irc.client").setLevel(logging.ERROR)
logging.getLogger("seleniumwire").setLevel(logging.ERROR)
logging.getLogger("websocket").setLevel(logging.ERROR)

to

import websocket
logging.getLogger("chardet.charsetprober").setLevel(logging.ERROR)
logging.getLogger("requests").setLevel(logging.ERROR)
logging.getLogger("werkzeug").setLevel(logging.ERROR)
logging.getLogger("irc.client").setLevel(logging.ERROR)
logging.getLogger("seleniumwire").setLevel(logging.ERROR)
logging.getLogger("websocket").setLevel(logging.DEBUG)
websocket.enableTrace(True)

This sets the websocket-client log level to DEBUG and enables trace logging so I can see much more detail about the websocket frames. Since I'm not seeing the issue I'd really like to get a look at that to see if the library is failing to send pong frames for some reason.

@mpforce1
Author

mpforce1 commented Oct 5, 2025

hope it’s enough that I ran the miner for this long

I'm sorry but I can't find ping pong failed in that log.

@weermis

weermis commented Oct 5, 2025

I'm using the latest version with the Hermes API implementation (PR #728), but I'm still experiencing an issue where the bot only collects points from one streamer at a time, even when multiple streamers I'm following are live simultaneously. The bot appears to connect to multiple streams based on logs, but points are only accruing for one streamer. This seems to be a limitation with how Twitch recognizes 'active viewing' rather than an issue with the Hermes API implementation itself. Is there any way to modify the script to rotate between streamers more frequently or to somehow enable simultaneous point collection from multiple streams?

@OlevO1

OlevO1 commented Oct 5, 2025

hope it’s enough that I ran the miner for this long

I'm sorry but I can't find ping pong failed in that log.

nvm, works for me now thanks

@mpforce1
Author

mpforce1 commented Oct 5, 2025

I'm using the latest version with the Hermes API implementation (PR #728), but I'm still experiencing an issue where the bot only collects points from one streamer at a time, even when multiple streamers I'm following are live simultaneously. The bot appears to connect to multiple streams based on logs, but points are only accruing for one streamer. This seems to be a limitation with how Twitch recognizes 'active viewing' rather than an issue with the Hermes API implementation itself. Is there any way to modify the script to rotate between streamers more frequently or to somehow enable simultaneous point collection from multiple streams?

It should collect from 2 streamers at once. You might be experiencing the same issue as #722 (comment) where the same channel was filling both of the miner's slots. If you want you can try to apply the fix I posted in that comment. I'm not sure how successful the fix was because I haven't heard back from the user yet.

@weermis

weermis commented Oct 5, 2025

It should collect from 2 streamers at once. You might be experiencing the same issue as #722 (comment) where the same channel was filling both of the miner's slots. If you want you can try to apply the fix I posted in that comment. I'm not sure how successful the fix was because I haven't heard back from the user yet.

Im gonna apply that fix and run script for like 30 minutes and tell u if it worked is that okay?

@weermis

weermis commented Oct 5, 2025

@mpforce1 So i ran it for a while and its working good no problem im also using my modified version of script but fused with your latest commit here and have no issues.

@type2diabetes

I'm not sure how successful the fix was because I haven't heard back from the user yet.

Ah my bad, forgot about that. Been working fine since, not sure if it was that fix or me removing kai from run.py so it only loads him from rest of the followed channels, most likely the fix though. Haven't really noticed any issues other than the reconnects every 5-40 minutes but it's not getting hung up or stopped. Seems pretty random when and how often it happens.

@Gamerns10s

I indeed had an unreleased version of websocket (I didn't even know that lol) but now I've fixed it and here are my logs with the latest changes to this pr
https://www.mediafire.com/file/nm0q6re1d7i58uq/Gamerns10s.20251006-014702.log/file

@mpforce1
Author

mpforce1 commented Oct 6, 2025

I indeed had an unreleased version of websocket (I didn't even know that lol) but now I've fixed it and here are my logs with the latest changes to this pr https://www.mediafire.com/file/nm0q6re1d7i58uq/Gamerns10s.20251006-014702.log/file

Hey, thanks for that. I can see you're having an issue I noticed in my own logs last night. This Cannot open Client, wrong state: 3 error is caused by an issue with the stale connection detection. I've got a change ready that I'm gonna let run for a couple of days to see if that fixes it. I'll ping here once it's done.

@Gamerns10s

Just noticed that for me after the recent changes the disconnections have actually increased like in a 6 hr run I got disconnected 16 times ( 9 times in an hour and most frequent was at an interval of 4 mins only )

@mpforce1
Author

mpforce1 commented Oct 7, 2025

Just noticed that for me after the recent changes the disconnections have actually increased like in a 6 hr run I got disconnected 16 times ( 9 times in an hour and most frequent was at an interval of 4 mins only )

Yeah, that's almost certainly this stale connection detection bug. My new change a) fixes this issue, b) improves resource usage by removing a Thread from each client, and c) moves reconnection logs to DEBUG so end users don't get spammed with information that really isn't relevant to them.

It ran fine for me last night with no stale connections, no errors, and 2 reconnect requests from Twitch resulting in correct client reconnection. I still want to let it run for another day before pushing it up though, especially given the relative instability we've had on here with me pushing up every little change.

… message flow.

 Keepalive timeout is now a message timeout to avoid scenarios where the server doesn't send a keepalive because it's sent a message.
 Fixed issue where newly opened clients could be considered stale if the stale check happens at the right time.
 Removed client thread that checks for pending subscriptions as we only subscribe once at startup.
 Changed INFO logs to DEBUG for hermes, ERROR/WARNING should be all users need.
@mpforce1
Author

mpforce1 commented Oct 8, 2025

I've finished testing the new change, this should fix the client reconnecting all the time. The issue was twofold:

  1. Twitch only sends keepalives if no other messages have been sent in a 10 second interval. However, we were only checking the keepalive message interval when looking for stale connections. Now we check the interval since the last message of any type.
  2. There was an oversight in my stale check logic where I shouldn't have combined some if clauses.
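The first point can be sketched as a tiny tracker that refreshes its timer on every incoming message rather than only on keepalives. This is an illustration under stated assumptions, not the code in this branch: the class name `MessageTimeoutTracker`, the 15 second timeout, and the message type strings are all invented for the example.

```python
import time


class MessageTimeoutTracker:
    """Tracks how long it has been since the last message of any type."""

    def __init__(self, timeout_seconds: float, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_message_at = clock()

    def on_message(self, message_type: str) -> None:
        # Any message resets the timer, not just keepalives, because the
        # server skips keepalives while other traffic is flowing.
        self._last_message_at = self._clock()

    def is_stale(self) -> bool:
        return self._clock() - self._last_message_at > self._timeout


# Usage with a fake clock; the 15 second timeout is an assumed value.
fake_now = [0.0]
tracker = MessageTimeoutTracker(15.0, clock=lambda: fake_now[0])
fake_now[0] = 9.0
tracker.on_message("notification")  # not a keepalive, still counts
fake_now[0] = 20.0
print(tracker.is_stale())  # False: only 11 seconds since the last message
```

Had only keepalives reset the timer, the connection in this usage example would have been flagged as stale at the 20 second mark even though a message arrived at 9 seconds.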

I also had a look at the Twitch web client again and tried to better match their message flow. Now the client starts Unopened, the pool calls client.open which starts the client run Thread. Once connection is established the client goes into an Unwelcomed state. Twitch should then send a welcome message which puts it in an Unauthenticated state, we also send an authenticate request at this point. Twitch will then respond with an authenticateResponse putting us in an Open state, we also subscribe to any pending topics at this point. At any point the connection can be considered stale if it spends too long waiting in Unopened, Unwelcomed, or Unauthenticated states. Previously, we were not waiting properly for the welcome message before authenticating. We were also running a Thread to do the authentication and pending subscriptions rather than responding to Twitch messages. Now we don't need to since it does those steps in response to Twitch messages.
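The flow described above can be summarised as a small state machine. The sketch below is a simplified illustration, not the client in this branch: it has no socket or Thread, the class and method names are invented, outgoing requests are just recorded in a list, the "subscribe" request name is an assumption, and the 300 second wait limit is borrowed from the 5 minute initialisation timeout mentioned in an earlier commit note.

```python
import enum
import time


class State(enum.Enum):
    UNOPENED = "Unopened"
    UNWELCOMED = "Unwelcomed"
    UNAUTHENTICATED = "Unauthenticated"
    OPEN = "Open"


# States in which the client is still waiting on the server.
WAITING_STATES = {State.UNOPENED, State.UNWELCOMED, State.UNAUTHENTICATED}


class HandshakeSketch:
    def __init__(self, pending_topics, wait_timeout=300.0, clock=time.monotonic):
        self.state = State.UNOPENED
        self.pending_topics = list(pending_topics)
        self.sent = []  # stands in for requests written to the socket
        self._wait_timeout = wait_timeout
        self._clock = clock
        self._state_since = clock()

    def _enter(self, state):
        self.state = state
        self._state_since = self._clock()

    def on_connected(self):
        if self.state is State.UNOPENED:
            self._enter(State.UNWELCOMED)

    def on_message(self, message_type):
        # Each step happens in response to a server message, not on a timer.
        if self.state is State.UNWELCOMED and message_type == "welcome":
            self._enter(State.UNAUTHENTICATED)
            self.sent.append(("authenticate", None))
        elif self.state is State.UNAUTHENTICATED and message_type == "authenticateResponse":
            self._enter(State.OPEN)
            while self.pending_topics:
                self.sent.append(("subscribe", self.pending_topics.pop(0)))

    def is_stale(self):
        waited = self._clock() - self._state_since
        return self.state in WAITING_STATES and waited > self._wait_timeout


client = HandshakeSketch(["topic-a", "topic-b"])
client.on_connected()
client.on_message("welcome")
client.on_message("authenticateResponse")
print(client.state, client.sent)
```

Because the timer restarts on every transition, a freshly created client is only considered stale after it has actually waited too long in one of the three waiting states.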

Thanks again to everyone who's participating, you're helping us get closer to a stable implementation. Please test this new version and let me know if you have any issues.

@Gamerns10s

Happy to tell you that the disconnections have decreased a lot but now I encountered a new error 4100 once sadly don't have the log file for it rn
IMG_20251009_210651

Encountered this error also many times but not sure if it was because of websocket or not cuz I am working on irc also rn
Exception raised: file descriptor cannot be a negative integer (-1). Thread is active: True

@mpforce1
Author

mpforce1 commented Oct 9, 2025

I encountered a new error 4100 once sadly don't have the log file for it rn

That's actually the same error as before, the ping pong failed one. It's just correctly coming through in on_close now since you're on the right release of websocket-client. No idea what causes a 4100 close status, but as long as the miner reconnects and continues working I wouldn't worry about it.

Encountered this error also many times but not sure if it was because of websocket or not cuz I am working on irc also rn Exception raised: file descriptor cannot be a negative integer (-1). Thread is active: True

I think I've seen somebody else complain about that one, might be worth doing a search. That message format, Exception raised: [message]. Thread is active: [state], is from the Chat.py file. So yeah, probably an irc thing.

@type2diabetes

Been pretty solid since that last update, no issues on this end.

@Gamerns10s

Ahh just wanted to tell you that the error update client version has become very frequent in long run sessions (24+ hours) and some more logs you should take a look at
IMG_20251012_212056
IMG_20251012_212219
IMG_20251012_212803
IMG_20251012_213156
IMG_20251012_213502
IMG_20251012_213747
IMG_20251012_213848
I just pray it's not again my websocket module error 😅🤦🏻‍♂️

@mpforce1
Author

@Gamerns10s It's hard to tell what's going on because of how you've formatted the logs. To get a better idea I'd need text from the log file or an upload of the log file.

However, from what I can see it looks like you might have a poor connection between you and the Twitch servers. errno 32 and errno 104 could be caused by a timeout, afaik it's not really possible to tell since we'd need the Twitch server logs. The issue with update_client_version happening around the same time could also be the result of a poor connection. All that's doing is requesting the main Twitch site and parsing the page for the client version, the fact it fails suggests just a temporary bad connection. I sometimes see that error and it's usually fine.

I still have no idea what the 4100 ping pong failed issue is about. It almost sounds like the websocket-client library isn't sending pong frames correctly, but you'd have to enable trace logging (as per #728 (comment)), wait for the issue to occur, and send me the log file so I can see if it's receiving/sending them. It's a bit of a hassle since it clogs up the log files and makes it hard to even see the issue happening, but it's the only way to see the ping/pong frames.

@Gamerns10s

Yeah I can totally understand that and will try to share the logs file soon
Also an unrelated question I am trying to make a drop fetching script but on sending gql request to the server I get persistent query error (I think I also needed to get an integrity token in some way and send it in the payload but I can't figure it out you got any idea about that? )

@mpforce1
Author

Also an unrelated question I am trying to make a drop fetching script but on sending gql request to the server I get persistent query error (I think I also needed to get an integrity token in some way and send it in the payload but I can't figure it out you got any idea about that? )

Sorry, I don't know much about the drops system. I'd like to keep this pr more on topic, please start a separate discussion/issue if you want help with something else.


Development

Successfully merging this pull request may close these issues.

Integrate With The Hermes API

6 participants