Lately, I had some problems with stalling transfers when using scp, the file transfer tool for the SSH protocol and part of the OpenSSH suite. Here is a report on what happened, how I found the cause and how I applied the solution. Actually, this is a pretty lame story, but I just wanted to write it down anyway.
Problem, Frustration and Workaround
As I was trying to push a file to my desktop PC at home, the transfer would suddenly stop and the connection just sat idle from that moment on. scp(1) even reports this as stalled, and in the end I had to kill the process:
$ scp /tmp/largefile Varuna.Home:/tmp/
largefile 0% 208KB 99.5KB/s - stalled -
I started debugging lots of stuff, since a quick Google search turned up MTU issues as one common source of this problem. To get some more insight, I started several sessions with tcpdump, analyzed the traffic in Wireshark and looked for any hints on why this connection suddenly would not transfer any more data. I tried several different MTU values on the network interfaces of my router at home, as I thought that would be the culprit. In these tests the transfer would continue a bit further, but then it stalled at some other point without any recognizable pattern.
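A common way to check for an MTU problem like the one suspected here is to send pings that must not be fragmented and see which sizes get through. The following is only a minimal sketch of that idea, not the exact commands from these tests. It assumes Linux iputils ping (where `-M do` sets the don't-fragment bit) and an IPv4 path; the function names and the MTU values in the example loop are made up for illustration.

```shell
#!/bin/sh
# Sketch: test whether a path carries IPv4 packets of a given size
# without fragmentation. Assumes Linux iputils ping.

# Print the ICMP echo payload size that fills a packet of MTU $1:
# the 20-byte IPv4 header and the 8-byte ICMP header are subtracted.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Send one non-fragmentable ping sized for MTU $2 to host $1.
# Returns 0 if a reply came back within 2 seconds, non-zero otherwise.
probe_mtu() {
    ping -c 1 -W 2 -M do -s "$(icmp_payload_for_mtu "$2")" "$1" >/dev/null 2>&1
}

# Example (hostname from the scp call above, MTU values picked arbitrarily):
# for mtu in 1500 1492 1400; do
#     if probe_mtu Varuna.Home "$mtu"; then
#         echo "MTU $mtu: ok"
#     else
#         echo "MTU $mtu: no reply"
#     fi
# done
```

If every size up to the interface MTU gets a reply, the MTU is probably not what stalls the transfer, which would have matched the outcome here.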
This was a few weeks ago. In the end, I just gave up. I did not find the problem and thus, no solution.
However, starting the transfer from the target host – pulling the files instead of pushing them – worked flawlessly. So I had found a workaround, and I pushed any further thoughts about this problem aside to avoid wasting more time.
Return of the Problem and its Identification
Today, the same thing happened again! I tried to copy some files from my laptop to my desktop PC, and suddenly the transfer stalled. Since I was using Wi-Fi, the traffic crossed the very router on which I had experimented with different MTU values. Of course, my first thought was that this router really was the cause. Once again I tried different MTU values, fired up tcpdump and analyzed the traffic – to no avail.
Almost giving up again, I tried another Google search on the topic. And magically, I did the right thing this time and appended to the search query that I am using Gentoo Linux on the desktop PC. Now, I got the right hints in the Gentoo forum and in the end found a bug report in their tracker: https://bugs.gentoo.org/show_bug.cgi?id=414401
The whole problem is based on the fact that Gentoo automatically applies the High Performance Networking patchset, called hpn for short. These patches are supposed to improve the network performance of OpenSSH for larger file transfers. This does not seem to work correctly yet with the latest OpenSSH 6.0 – as I experienced myself.
As others had already identified the cause, the solution is simple: disable the hpn patches. Everything works without them. Maybe it does not reach the highest possible performance now, but at least it works reliably.
The easiest way is to just remove the hpn USE flag from the openssh package, rebuild it, then restart sshd:
$ flaggie openssh -hpn
$ emerge -1 openssh
$ /etc/init.d/sshd restart
Problem gone! My scp transfers to my desktop PC now work the same as they used to before I upgraded to OpenSSH 6.0 a few months ago.
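To double-check that the rebuilt binaries really lack the patchset, one option is to look at the version banner. This is a rough sketch under one assumption: that hpn-patched builds tag their version string with an "hpn" marker. The helper name `version_has_hpn` is made up for this example.

```shell
#!/bin/sh
# Sketch: detect an hpn-patched OpenSSH from its version banner.
# Assumption: patched builds include "hpn" (or "HPN") in the version string.

# Succeeds (returns 0) if the string in $1 carries an hpn marker.
version_has_hpn() {
    case "$1" in
        *hpn*|*HPN*) return 0 ;;
        *)           return 1 ;;
    esac
}

# ssh -V prints its version banner on stderr, hence the redirect.
if version_has_hpn "$(ssh -V 2>&1)"; then
    echo "ssh still reports an hpn build"
else
    echo "no hpn marker in the ssh version string"
fi
```

On Gentoo, `emerge -pv openssh` additionally lists the USE flags the package would be built with, so a remaining hpn flag shows up there as well.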
As I said: a lame story. However, I am relieved that this was no issue with my network setup at home and my router is not doing anything wrong. And maybe this post is even helpful to someone else with the same problem.
Update 2012-06-08: The bug report linked above has been marked RESOLVED FIXED and Gentoo is now using a previous version of the hpn patchset.