I realize this is a very specific topic and situation, but this solution needs to exist on the internet. In the past I've written about Link Aggregation and what to expect performance-wise, and since then I've run into an issue as our network has expanded. We're using a TP-LINK TL-SG2424 switch for our SAN network (budget, I know), but this applies to many TP-LINK switches that offer LACP.
Here's the setup: we have 2 iSCSI SAN appliances connected to our switch. One of the appliances is connected via 4 network interfaces, which are aggregated using LACP (802.3ad) with Jumbo Frames enabled. This is our clustered shared volume for our virtual machines. This was the first LAG group that we configured using LACP and it was straightforward: enable LACP on the appliance's bonded NICs, then choose the ports on the switch and enable LACP in Active mode.
Once this was completed, the appliance and the switch negotiated the link aggregation and we were off and running.
A few weeks later we migrated our backup appliance to the SAN. The backup appliance is connected via 2 network interfaces, which we also wanted to aggregate with LACP. Following the same steps as in our first setup, we bonded the NICs on the appliance and enabled LACP, then went to the switch and set the two ports to Enabled and Active.
This is where the trouble began. Upon setting the new port group (11+12) to Active/Enabled in LACP, the switch added the ports to the initial port group under LAG 1. Obviously these are different groups, connections, and appliances, so this wreaked havoc for a few minutes until we disabled LACP on the new ports.
After banging around in the switch settings, we emailed TP-LINK support to confirm that multiple LAG groups were possible using LACP. They assured us that multiple groups were supported, but their budget support could not explain how to configure them. Likewise, the manual doesn't describe how to enable multiple groups. Google search failed us as well.
The solution, it turns out, is a column idiotically labeled "Admin Key". On a whim, we changed the value in this column to "2" when enabling LACP on the two ports. This time, the two ports were placed into a new group, LAG 2, and LACP was properly negotiated.
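To make that behavior concrete, here is a minimal Python sketch of the grouping as we observed it: ports that share an Admin Key end up in the same LAG. This is a simplified model, not the switch's actual logic (real LACP also takes the partner's system ID and key into account). The function name, the port numbers 1 to 4 for the first appliance, and the default key of 1 are assumptions for illustration; only ports 11 and 12 and the key value 2 come from our setup.

```python
from collections import defaultdict


def group_ports_by_admin_key(port_keys):
    """Group LACP-enabled ports by admin key (simplified model).

    port_keys maps a port number to the Admin Key set on that port.
    Returns a dict of admin key -> sorted list of ports sharing it.
    """
    groups = defaultdict(list)
    for port, key in port_keys.items():
        groups[key].append(port)
    return {key: sorted(ports) for key, ports in sorted(groups.items())}


# Assumed: the first appliance sits on ports 1-4 and every port uses key 1.
before = {1: 1, 2: 1, 3: 1, 4: 1, 11: 1, 12: 1}

# After changing the Admin Key on ports 11 and 12 to 2.
after = {1: 1, 2: 1, 3: 1, 4: 1, 11: 2, 12: 2}

print(group_ports_by_admin_key(before))  # {1: [1, 2, 3, 4, 11, 12]}
print(group_ports_by_admin_key(after))   # {1: [1, 2, 3, 4], 2: [11, 12]}
```

In the first case all six ports land in one group, which matches the collision we saw. In the second, the backup appliance's ports get their own group.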
The Admin Key column is not mentioned anywhere in the documentation as having an effect on link aggregation grouping, and I hadn't seen anybody mention it online. While the audience of this how-to is fairly limited, someone, somewhere, someday is going to be saved a massive headache by reading it.