Skip to content

Improve udp tunnel stability#406

Open
magnww wants to merge 10 commits into
jpillora:masterfrom
magnww:master
Open

Improve udp tunnel stability#406
magnww wants to merge 10 commits into
jpillora:masterfrom
magnww:master

Conversation

@magnww

@magnww magnww commented Feb 5, 2023

Copy link
Copy Markdown

Comment thread share/tunnel/tunnel.go Outdated
}
}

func (t *Tunnel) getSSHNoWait(ctx context.Context) ssh.Conn {

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't getSSH equivalent when c != nil ?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right.
Remove the wait may also cause high cpu usage, I will revert this change.

Comment thread share/tunnel/tunnel_in_proxy_udp.go Outdated
return u.Errorf("inbound-udpchan: %w", err)
u.Errorf("inbound-udpchan: %w", err)
// wait for a short while for the ssh connection
time.Sleep(time.Duration(100) * time.Millisecond)

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of sleeping, can use channels to know when the ssh connection is ready?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for adding sleep before is that, when the connection is lost, getSSH() will return nil immediately, and the runOutbound() loop will take up a lot of memory and cause the process to be killed.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh the loop spins? I haven’t seen that happen before though that sounds like a bug, where the real fix might be to block with a waitgroup

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After connection lost, activatingConnWait() cannot block, because t.activatingConn.DoneAll() is called.
Removing DoneAll() fixes the issue.

Comment thread share/tunnel/tunnel.go
}

func (t *Tunnel) Close() {
t.activatingConn.DoneAll()

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a lock/unlock + nil check?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

activatingConn is a waitGroup, it is value type and thread safe, so no need for lock and nil checks

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yea that's right, its a custom wait group (which might be a bad idea itself - anyway)

it should block while disconnected/connecting and it should not-block while connected

DoneAll here would stop blocking, but Tunnel Done means closed?

how does this improve UDP? sorry, I think I'm missing something

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the original code, when the connection is lost, DoneAll in the BindSSH method will be called, which causes activatingConnWait() to not work and cannot block.

This leads to high resource usage of runOutbound in udpListener. Calling u.getUDPChan(ctx) will immediately gets nil instead of blocking.

Close is added to correctly release resources in server mode, because DoneAll was removed from BindSSH.

In client mode, there is no need to call DoneAll because the same tunnel is always used.

VHSgunzo added a commit to VHSgunzo/chisel that referenced this pull request Jun 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants