This is a summary of QoS rules I'm currently using. My goal was to create a good all-around ruleset for gaming, web browsing/streaming and torrents. All of this without having to hunt down and specify port ranges constantly. I'm posting this to see if others would have similar success with it as well.
The original idea probably goes back to orangetek's 512-byte rule, but I wanted to dig deeper to see if it could be improved and to research what kind of data you can expect from games and other sources in general. I would probably also put this forward (at least parts of it) as a suggestion for improving the default QoS rules, which I will be comparing my setup against.
Note that in both directions every class gets an equal share. The goal is not to starve any one of them, but to give a fair chance to each type. In my experience this works out much better overall and you run into fewer weird edge cases. The result is that you can be doing multiple things on the same device without any service being significantly affected (apart from sharing the bandwidth, obviously).
My ingame pings and bufferbloat values remained very good even with the link completely saturated. I also found the router's settings page to remain fairly responsive under load, which in my experience would normally be negatively affected.
I mainly tested with a combination of Gargoyle's QoS settings, Wireshark, ingame statistics (ping, network graphs), DSLReports' bufferbloat test (which unfortunately seems to be in a semi-functional state currently), and by starting the corresponding services to saturate the link (streams, browsing, downloads, torrents etc).
Upload direction


(Note U1 quickly filling up with download-related traffic, while gaming-related packets are kept separate in U2.)
U1: Max packet length: 76 bytes
This is probably the most important change. It separates the smallest packets (mainly ACKs, FIN, SYN) into their own class. Torrents, sites like MEGA and perhaps just downloads in general aggressively max out this class. A max value of 128 would include packets used by games, so when this class is maxed out (which it will be) excessive lag and choking would appear ingame. 76 is not necessarily a magic number, but it seems like a very good middle ground between capturing all the tiny packets and staying just below the size games tend to use (usually starting somewhere above 80 bytes).
U2: Max packet length: 512 bytes
This range is primarily used by games and light browsing. A max value of 128 was not enough, as many of these seem to operate higher. 512 captures the overwhelming majority quite reliably; only a small percentage of packets occasionally exceeds it.
U3: Max packet length: 1270 bytes
This class captures the occasional (maybe 1-5% of total) game-related packets that for some reason have a much bigger payload. False positives in my experience were fairly limited, as larger, non-realtime uploads seem to prefer values closer to the MTU (so around 1500 bytes). So this class basically sits idle most of the time.
Bulk: Default class
Captures everything else, such as large file uploads which are generally not time sensitive.
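To make the upload rules easier to reason about, here is a minimal Python sketch of how a packet would be sorted into the four classes. It is only a model for checking captured lengths against the thresholds, not Gargoyle's actual implementation. It assumes the rules are evaluated top to bottom with the first match winning, and that the max packet length is inclusive; the names UPLOAD_RULES and classify_upload are made up for this example.

```python
# Hypothetical model of the upload rules above (not Gargoyle code).
# Assumption: rules are checked in order and the first match wins.
# Assumption: "max packet length" is an inclusive upper bound.
UPLOAD_RULES = [
    ("U1", 76),    # tiny packets: mainly ACK, SYN, FIN
    ("U2", 512),   # most game traffic and light browsing
    ("U3", 1270),  # occasional large game packets
]


def classify_upload(length: int) -> str:
    """Return the class name for an upload packet of the given length."""
    for name, max_length in UPLOAD_RULES:
        if length <= max_length:
            return name
    return "Bulk"  # default class: everything larger


if __name__ == "__main__":
    for length in (66, 84, 300, 915, 1500):
        print(length, classify_upload(length))
```

Feeding it the game packet lengths listed further down shows which class each would land in.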
Download direction


(Note the simultaneous torrent and HTTP download traffic evenly divided between Bulk and D3. Gaming traffic is separated to D2 and D4. ACKs are in D1.)
D1: Max packet length: 76 bytes
In my experience this class does not seem as critical on the download side as it is on upload (at least with the next class in place). Needs more testing, but I might end up merging this with D2. If someone is using Gargoyle's ACC, I imagine this would be a bit more important, but I have not tested it.
D2: Max packet length: 512 bytes
Gives priority to anything smaller. This mainly includes light browsing, old games and/or low playercounts (<10 players). Since this class does not seem to be at risk of overflowing it's possible it could be merged with D1 and get essentially the same effect.
D3: Source ports: 80, 443
Captures anything bigger that uses the HTTP/HTTPS ports. This is to ensure streaming services like YouTube or Twitch do not end up buffering (for example due to torrents also running) and downloads don't completely starve regular browsing. It also helps somewhat to reduce false positives in the next class.
D4: Max packet length: 1270 bytes
Valve/Source engine-based games don't fragment packets until around this amount. So older games with higher playercounts (even just 10+) or new games in general will end up in this class. Even with lower playercounts, switching between players in spectator mode will often cause a spike high enough to end up here. So unlike on the upload side, this class is definitely a necessity to capture game-related packets and depending on the game, this is where the majority of the traffic will be at all times. The downside is that false positives are much more common here than on the upload side. I have noticed that sometimes half the torrent traffic flows into this class, other times none. In either case pings remained largely unaffected ingame in my experience.
Bulk: Default class
Captures anything else. This primarily means torrents and other unknown services that you most likely don't care about, as gaming and browsing/streaming have already been prioritized (or rather, everything has been given a fair chance).
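The order of the download rules matters, because the port rule sits between two length rules. The following Python sketch models that ordering. As before, it assumes first-match-wins evaluation and inclusive length limits, and classify_download is a made-up name for illustration, not part of Gargoyle.

```python
# Hypothetical model of the download rules above (not Gargoyle code).
# Assumption: rules are checked in order and the first match wins.
# Assumption: "max packet length" is an inclusive upper bound.
HTTP_PORTS = (80, 443)


def classify_download(length: int, src_port: int) -> str:
    """Return the class name for a download packet."""
    if length <= 76:
        return "D1"  # tiny packets: mainly ACK, SYN, FIN
    if length <= 512:
        return "D2"  # light browsing, small game packets
    if src_port in HTTP_PORTS:
        return "D3"  # larger HTTP/HTTPS traffic (streaming, downloads)
    if length <= 1270:
        return "D4"  # larger game packets, plus some false positives
    return "Bulk"    # default class: everything else


if __name__ == "__main__":
    # 27015 and 51413 are just example game/torrent ports.
    samples = [(66, 443), (300, 27015), (1400, 443), (1200, 27015), (1400, 51413)]
    for length, port in samples:
        print(length, port, classify_download(length, port))
```

Note that a 1200-byte packet from port 443 lands in D3 rather than D4, which is how D3 reduces false positives in D4.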
Miscellaneous stuff
Examples of upload packet lengths used by games:
CSS: 84-300 (99% between 82-128, 1% 128-256)
Titanfall 2 4vsAI: 78-915 (85% between 128-256, 3% over 512, 10% under 128)
Titanfall 2 5v5: 111-1270 (90% 128-256, 10% 256-512)
Titanfall 2 6v6+AI: 83-1270 (85% 128-256, 15% 256-512)
Apex Legends: 80-1269 (96% between 128-256)
Some examples of download packet lengths used by games:
CSS: 88-1242 bytes (85% between 512-1200, another game 80% 256-768)
Titanfall 2 4vsAI: 88-1270 (3% 80-128, 65% 128-512, 30% 512-1200)
Titanfall 2 5v5: 83-1270 (20% 128-512, 80% 512-1270)
Titanfall 2 6v6+AI: 83-1270 (10% 0-512, 20% 512-1260, 65% 1260-1270)
Apex Legends: 83-1270 (96% was over 128, 56% 1200-1270)
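If you want to produce similar breakdowns for your own games, something like the following Python sketch could work. It assumes you already have the packet lengths as a plain list of integers (for example exported from a Wireshark capture). The bucket edges and the sample data are made up for illustration and are not my measurements.

```python
from bisect import bisect_left
from collections import Counter

# Assumed bucket edges (inclusive upper bounds), chosen to line up
# with the class limits and ranges discussed in this post.
BUCKET_EDGES = [76, 128, 256, 512, 1270]


def bucket_label(length, edges=BUCKET_EDGES):
    """Return a label such as '129-256' for the bucket containing length."""
    i = bisect_left(edges, length)  # index of the first edge >= length
    if i == len(edges):
        return f">{edges[-1]}"
    low = 0 if i == 0 else edges[i - 1] + 1
    return f"{low}-{edges[i]}"


def bucket_shares(lengths):
    """Return {bucket label: percentage of packets} for the given lengths."""
    counts = Counter(bucket_label(n) for n in lengths)
    total = len(lengths)
    return {label: 100.0 * count / total for label, count in counts.items()}


if __name__ == "__main__":
    # Made-up sample data, not taken from a real capture.
    sample = [60, 90, 130, 200, 250, 300, 1270, 1400]
    for label, share in bucket_shares(sample).items():
        print(f"{label}: {share:.1f}%")
```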
Other:
- Download packet length scales with number of players, but the upload does not (as expected).
- UDP has been used by games in all cases.
- The largest packet size before packets are split can be adjusted with the net_maxroutable command (default is usually 1200) in Source-engine based (Valve) games. Counter-Strike: Source ended up with 1242-byte packets, Apex Legends capped at 1270 with the same setting.
- ACK, SYN, FIN packets were mainly 54, 60 or 66 bytes long, but I have seen them go up to 90.
- You can get a nice graph in Valve games with net_graph 5
- Switching between players will often cause spikes in packet sizes:


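As a side note on the ACK, SYN and FIN sizes in the list above: the common values line up with plain header arithmetic. The sketch below assumes the lengths are Wireshark frame lengths on Ethernet with IPv4, and that the larger sizes come from TCP options (timestamps, SACK blocks). That explanation is my assumption and I have not verified it against the captures. The function and constant names are made up for this example.

```python
# Header sizes in bytes. Assumption: Ethernet + IPv4 + TCP, as seen in
# a Wireshark capture (the 4-byte Ethernet FCS is usually not captured).
ETHERNET_HEADER = 14
IPV4_HEADER = 20         # without IP options
TCP_HEADER = 20          # without TCP options
MIN_FRAME_NO_FCS = 60    # shorter frames are padded on the wire
TIMESTAMP_OPTION = 12    # 10-byte timestamp option + 2 bytes of padding


def sack_option(blocks):
    """Size of a SACK option: 2 NOPs + 2-byte header + 8 bytes per block."""
    return 2 + 2 + 8 * blocks


def tcp_frame_length(option_bytes=0, payload=0):
    """Frame length of a TCP segment before any Ethernet padding."""
    return ETHERNET_HEADER + IPV4_HEADER + TCP_HEADER + option_bytes + payload


def on_wire_length(option_bytes=0, payload=0):
    """Frame length after padding up to the Ethernet minimum."""
    return max(tcp_frame_length(option_bytes, payload), MIN_FRAME_NO_FCS)


if __name__ == "__main__":
    print(tcp_frame_length())                  # bare ACK as sent
    print(on_wire_length())                    # bare ACK as received (padded)
    print(tcp_frame_length(TIMESTAMP_OPTION))  # ACK with timestamps
    print(tcp_frame_length(sack_option(4)))    # ACK with 4 SACK blocks
```

Under these assumptions the four printed values are 54, 60, 66 and 90, which would explain why a limit of 76 catches most but not all of these packets.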
I tested this on a fairly weak router (by today's standards), but the number of rules did not seem to negatively affect throughput in my case (which is also fairly low by default). I'd be curious if removing/merging some rules would show any improvement in throughput for those with very high speed connections (at the cost of less sophisticated QoS rules). U3, D1 and D3 could probably be removed safely without losing a lot of functionality (with the latter not being required if torrents aren't common).