gnutls_record_recv() hangs

Tue Aug 16 01:45:20 CEST 2011

Dinh Le writes:

> Hi,
>
> Please straighten out my understanding of the gnutls_record_recv()
> related functions.  Listed below is a portion of my code for connecting
> to an imap server, login, and list all the mail boxes.
>
> The "Connect" and "Login" responses from the server are less than 1024
> bytes while the 'list "" "*"' response is almost 4000 bytes in my case.
> Since I don't know how to check if there's data available to be received,

Right, you don't, and you do not need to know that. You do not need to check  
how much data is available to receive.

Having implemented both an IMAP server and an IMAP client myself, I can  
authoritatively tell you that by parsing the IMAP protocol properly, the  
client or the server always knows when it receives a complete IMAP message  
from its peer. Once you begin receiving an IMAP message from your peer, it  
is the content of the message, and not any byte count that you get from a  
single read attempt (gnutls_record_recv() with GnuTLS, a plain read() for  
unencrypted IMAP), that determines where the message ends, and the next one  
begins. You may very well find that some arbitrary gnutls_record_recv()  
gives you back the end of the current IMAP message you're reading, and the  
start of the next IMAP message from the peer.

> I wrote a loop that continuously calls gnutls_record_recv() if the
> previous message received is 1024 bytes in length.  This is a bad hack,
> but I just wanted to see the code works before reading the manual more
> carefully.
>
> But this hack does not work since the previously received message may
> be less than 1024 bytes long but there are still more data available
> for receiving.  Worse yet, gnutls_record_recv() would hang indefinitely
> if there's no data available to be received.

This approach has several logical flaws. Nothing requires an IMAP server to
always write full records. An IMAP server may decide, for example, to flush
its output buffer after every ten folders while it's producing the response
to your LIST request. This results in records that are smaller than the
full size.

Or, for example, the IMAP server you're talking to might have a disjoint  
TLS/SSL proxy wrapper running on top of it; that is, the server just talks  
non-encrypted IMAP, and a separate TLS/SSL wrapper reads and writes to the  
server, encrypting and decrypting its input and output streams on the fly.  
Perfectly valid approach, and I can tell you that this is exactly how some  
IMAP servers work. Then, if the server is busy for some reason and pauses
for a few milliseconds, the TLS/SSL proxy wrapper sees that the server has
stopped sending. The wrapper is just a dumb wrapper with no explicit domain
knowledge of IMAP, so it takes whatever it has collected in its buffer so
far and sends you a partial record. When the actual IMAP server gets more
CPU and begins producing more output, the TLS/SSL proxy resumes sending you
more records. Perfectly valid, nothing wrong with that, and your IMAP
client code must be prepared to handle this eventuality too. In either
case, you simply cannot use the raw byte count from gnutls_record_recv() to
tell you anything at all, least of all whether you have received a complete,
well-formed IMAP message. This will never work reliably.
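
To illustrate the point, here is a minimal sketch of a line accumulator
that does not care how the incoming bytes were split up. The names
line_buf, line_buf_feed() and line_buf_next() are hypothetical and not part
of GnuTLS; the chunk passed to line_buf_feed() stands in for whatever a
single gnutls_record_recv() (or read()) happened to return. It assumes
plain CRLF-terminated lines and a fixed-size buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical accumulator, not part of GnuTLS: holds bytes until a
 * complete CRLF-terminated line is available. */
struct line_buf {
    char data[4096];
    size_t len;                 /* bytes currently buffered */
};

/* Append a chunk of any size, e.g. whatever one receive call returned.
 * Returns 0 on success, -1 if the buffer would overflow. */
int line_buf_feed(struct line_buf *b, const char *chunk, size_t n)
{
    assert(b != NULL && chunk != NULL);
    if (n > sizeof b->data - b->len)
        return -1;
    memcpy(b->data + b->len, chunk, n);
    b->len += n;
    return 0;
}

/* Remove the first complete line from the buffer and copy it, without
 * its CRLF, into out. Returns 1 if a line was extracted, 0 if no
 * complete line is buffered yet, -1 if out is too small. */
int line_buf_next(struct line_buf *b, char *out, size_t out_size)
{
    size_t i;

    for (i = 0; i + 1 < b->len; i++) {
        if (b->data[i] == '\r' && b->data[i + 1] == '\n') {
            if (i >= out_size)
                return -1;
            memcpy(out, b->data, i);
            out[i] = '\0';
            /* Keep whatever follows the CRLF: it belongs to the
             * next line, even though it arrived in the same chunk. */
            memmove(b->data, b->data + i + 2, b->len - (i + 2));
            b->len -= i + 2;
            return 1;
        }
    }
    return 0;
}
```

Feeding the same bytes in one piece, byte by byte, or split between the CR
and the LF yields exactly the same sequence of lines, which is why the size
of any individual read carries no information.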

Taking it a step further, I do believe that each call to
gnutls_record_recv() currently gets cut off at the end of the current record
GnuTLS has read from the server. That is, if you ask for 10,000 bytes but
the current record that the GnuTLS library has read has only 1,000 bytes
left, GnuTLS will give you just those 1,000 bytes, even though there might
be unread data on the underlying transport that GnuTLS could read as the
next record to give you more data in response to your request.

I believe that's the way the library works now, but it's certainly within  
the realm of possibility for some future version to try to be smart and  
attempt to automatically coalesce multiple records together, so if you asked  
for more bytes than what's available, some future version of GnuTLS might  
attempt to read ahead and see if it gets another complete record that can be  
processed to produce additional data to satisfy the gnutls_record_recv()  
request. So, even if your IMAP server remains the same, a future version of  
GnuTLS might break for you.

The bottom line is that the byte count from gnutls_record_recv() says  
absolutely nothing whatsoever regarding the current IMAP message being read.  
IMAP, or any other structured protocol, does not work this way. This has  
nothing to do with GnuTLS. The same applies to non-encrypted IMAP, just  
substitute read() in place of gnutls_record_recv(), and the same principle  
applies. Trying to parse IMAP (or any other structured protocol) using
counts of bytes individually read is a failing proposition. You might be
able to find a way to get your approach working with the specific behavior
of the IMAP server you are testing against, and the particular manner in
which it reads and writes. But as soon as your IMAP server is updated to a
newer version, or switched for a different IMAP server, or the version of
GnuTLS gets switched out from under you, be prepared to fix your logic
again.

You should treat the input you're getting from gnutls_record_recv() as an  
unstructured byte stream. How many bytes you're getting from a single  
individual gnutls_record_recv() is absolutely and completely irrelevant. You  
must read the data that gnutls_record_recv() gives you, irrespective of how  
many bytes you get each time, and parse the ongoing byte stream into  
discrete IMAP messages, one following each other, then act upon each message  
as you deem necessary. You may very well find that your last call to  
gnutls_record_recv() not only gave you the tail end of the current IMAP  
message you are parsing, but a partial beginning of the next IMAP message  
from the server! IMAP permits the server to send certain kinds of
asynchronous messages at any time. Even though you are using only 1024-byte
reads, some IMAP messages are quite small, and it's perfectly feasible to
get a single 1024-byte read containing two or more small IMAP messages, one
after the other.
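
As a rough sketch of what "let the content decide" can look like: the loop
below keeps reading until it sees the tagged completion line for the
command it sent, and leaves any bytes that follow it in the buffer for the
next message. Everything here is an assumption made for illustration:
recv_fn is a stand-in with roughly the shape of gnutls_record_recv(),
fake_recv() simulates a peer that delivers data in pieces of a chosen size,
and the parsing is deliberately simplified (CRLF-terminated lines only; a
real IMAP parser must also handle literals and continuation requests).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for gnutls_record_recv() or read(): returns the number of
 * bytes stored in buf, 0 at end of stream, negative on error. */
typedef long (*recv_fn)(void *ctx, void *buf, size_t size);

struct stream {
    char buf[8192];
    size_t len;                 /* bytes currently buffered */
};

/* Simulated peer (hypothetical): hands out data at most chunk bytes
 * at a time. */
struct fake_peer {
    const char *data;
    size_t len, pos, chunk;
};

long fake_recv(void *ctx, void *buf, size_t size)
{
    struct fake_peer *p = ctx;
    size_t n = p->len - p->pos;

    if (n > p->chunk) n = p->chunk;
    if (n > size) n = size;
    memcpy(buf, p->data + p->pos, n);
    p->pos += n;
    return (long)n;
}

/* Remove one CRLF-terminated line from the front of the stream.
 * Returns its length (truncated to fit out), or -1 if no complete
 * line is buffered yet. */
long take_line(struct stream *s, char *out, size_t out_size)
{
    size_t i, n;

    for (i = 0; i + 1 < s->len; i++) {
        if (s->buf[i] == '\r' && s->buf[i + 1] == '\n') {
            n = i < out_size - 1 ? i : out_size - 1;
            memcpy(out, s->buf, n);
            out[n] = '\0';
            memmove(s->buf, s->buf + i + 2, s->len - (i + 2));
            s->len -= i + 2;
            return (long)n;
        }
    }
    return -1;
}

/* Read until the line starting with "<tag> " arrives. Counts the
 * other lines seen on the way in *untagged. Returns 0 on completion,
 * -1 on end of stream, error, or an over-long line. */
int read_tagged_response(recv_fn rd, void *ctx, const char *tag,
                         struct stream *s, int *untagged)
{
    char line[1024];
    size_t taglen;
    long n;

    assert(rd != NULL && tag != NULL && s != NULL && untagged != NULL);
    taglen = strlen(tag);
    *untagged = 0;
    for (;;) {
        /* Drain every complete line already buffered first. */
        while (take_line(s, line, sizeof line) >= 0) {
            if (strncmp(line, tag, taglen) == 0 && line[taglen] == ' ')
                return 0;       /* tagged completion: response done */
            (*untagged)++;
        }
        if (s->len == sizeof s->buf)
            return -1;          /* no line ending within the buffer */
        n = rd(ctx, s->buf + s->len, sizeof s->buf - s->len);
        if (n <= 0)
            return -1;
        s->len += (size_t)n;
    }
}
```

Whether the simulated peer delivers the data 1 byte or 1024 bytes at a time,
the function stops at the same tagged line. With large reads, the start of
the following message is simply left in the stream buffer for the next
call.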
