Dear All,
Maybe this is a bit off-topic here, but I don't know a better place to
find experts who might have the answer to my question.
We have an application consisting of two parts: one captures an
application's OpenGL call stream, and the other receives it and
re-renders it on another machine.
In our current configuration, the capture side runs on a Windows XP x64
machine, connected to 16 Linux receivers over a Gigabit Ethernet
network. All communication goes through TCP channels (two per client:
one for data, one for synchronization).
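To make the setup concrete, here is a rough sketch of the connection
topology on the capture side. The helper name, port numbers, and
addressing scheme are made up for illustration; this is not our actual
code.

#include <winsock2.h>
#include <ws2tcpip.h>
#include <stdio.h>
#include <string.h>
#pragma comment(lib, "ws2_32.lib")

#define NUM_RECEIVERS 16
#define DATA_PORT 5000   /* assumed port numbers, illustration only */
#define SYNC_PORT 5001

/* Open one TCP connection to the given host and port. */
static SOCKET connect_tcp(const char *host, unsigned short port)
{
    struct sockaddr_in addr;
    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == INVALID_SOCKET)
        return INVALID_SOCKET;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = inet_addr(host);
    if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) == SOCKET_ERROR) {
        closesocket(s);
        return INVALID_SOCKET;
    }
    return s;
}

int main(void)
{
    WSADATA wsa;
    SOCKET data[NUM_RECEIVERS], sync[NUM_RECEIVERS];
    char host[32];
    int i;

    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return 1;
    for (i = 0; i < NUM_RECEIVERS; i++) {
        sprintf(host, "192.168.0.%d", 101 + i); /* assumed addressing scheme */
        data[i] = connect_tcp(host, DATA_PORT); /* OpenGL call stream */
        sync[i] = connect_tcp(host, SYNC_PORT); /* frame synchronization */
        if (data[i] == INVALID_SOCKET || sync[i] == INVALID_SOCKET)
            fprintf(stderr, "failed to connect to %s\n", host);
    }
    /* ... capture loop: send GL commands on data[], sync frames on sync[] ... */
    WSACleanup();
    return 0;
}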
This application runs in real time most of the time (achieving more
than 100 frames per second in some cases). But sometimes, when capturing
an application's OpenGL stream, the frame rate is limited to 5 FPS and
stays there for the rest of the run. If I stop and restart the app
without changing anything, it usually runs fast again. Sometimes it is
slow five times in a row; sometimes it runs correctly twenty times in a
row.
If it is fast when the application starts, it remains fast for the
whole run; if it starts slow, it stays slow for the whole run. Whether a
given run will be slow or fast seems completely nondeterministic to me.
I thought it might be a network problem, so I ran Wireshark on the
capture machine and looked at the trace. All I could see is that packets
are sent at 200 ms intervals: a burst of packets goes out rapidly, then
nothing happens for 200 ms, then another burst is sent. No errors, no
warnings in the Expert Info, nothing strange. The sending host simply
waits ~200 ms for some unknown reason.
We have already tried TCP_NODELAY (all our sockets are now created with
this option), but it does not help. We have also tried changing the
network adapters and increasing the buffer sizes; nothing has helped so
far.
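For reference, this is roughly how we set the option on the capture
side (Winsock); the helper name is ours, and the getsockopt read-back is
only a sanity check that the option really took effect. We call this on
both the data and the sync socket of every client right after the
socket is created.

#include <winsock2.h>
#include <ws2tcpip.h>
#include <stdio.h>
#pragma comment(lib, "ws2_32.lib")

/* Enable TCP_NODELAY on an already-created socket, then read the
 * option back to confirm it is actually set. */
static int enable_nodelay(SOCKET s)
{
    BOOL flag = TRUE;
    BOOL check = FALSE;
    int len = sizeof(check);

    if (setsockopt(s, IPPROTO_TCP, TCP_NODELAY,
                   (const char *)&flag, sizeof(flag)) == SOCKET_ERROR) {
        fprintf(stderr, "setsockopt(TCP_NODELAY) failed: %d\n",
                (int)WSAGetLastError());
        return -1;
    }
    if (getsockopt(s, IPPROTO_TCP, TCP_NODELAY,
                   (char *)&check, &len) == SOCKET_ERROR) {
        fprintf(stderr, "getsockopt(TCP_NODELAY) failed: %d\n",
                (int)WSAGetLastError());
        return -1;
    }
    printf("TCP_NODELAY is %s\n", check ? "enabled" : "NOT enabled");
    return 0;
}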
By the way, this never happens when the capture machine is a Linux box
as well.
Does anybody have an idea why this could happen? I'm open to any idea,
however weird, as this is very annoying.
Thanks for your kind help in advance,
Peter Kovacs