[gnutls-dev] Re: living without global variables

Mon Dec 26 13:10:44 CET 2005

> 
> bryanh at giraffe-data.com (Bryan Henderson) writes:
> 
> >>The entire tree is constant as well.
> >
> > That's great.  I really couldn't tell just what information 
> is in there.
> > What is it, anyway?  Something you can explain quickly?
> 
> It is the ASN.1 schema for various PKCS and PKIX standards, in a
> compiled form for libtasn1.  The variables are constant and read-only
> throughout the lifetime of the GnuTLS library, except for when it is
> deallocated in gnutls_global_deinit().
> 
> Perhaps GnuTLS should deallocate these variables in an `atexit' hook
> instead.  Then it seem all of this problem would go away.  In GNU
> libc, atexit hooks are even called if GnuTLS was dlopen+dlclose'd.
> gnutls_global_deinit() would then only call gc_done().
> 

Use of atexit is not portable. On many platforms atexit routines get
call when program terminate. Using in a DSO can lead calling not
existing code. I did some research in how to handle this but I didn't
find a portable way to do this... If you detect gcc the better solution
is to use __attribute__((destructor)) (not easy to check this on
configure script). 

...
> 
> > But there's still more work to do.  It isn't thread-safe.  Often, a
> > program is modular in that two threads are executing separate code
> > independently.  If both of those threads decide to use the 
> facilities
> > of the gnutls library, they might simultaneously call
> > gnutls_global_init() and the reference count would get screwed up.
> 
> I'm not sure I see the problem here.  GnuTLS should in general be
> thread safe.  Allocating and deallocating these two global read-only
> variables may be the only thing causing problems.  Let's consider if
> both threads call gnutls_global_init() at the same time.  As far as I
> can see, the worst thing that can happen is that there is a race in:
> 
>   if (_gnutls_init++)
>     goto out;
> 
> between checking the variable and incrementing it.  If there is a
> thread context switch between checking the variable and increment it,
> so that both threads read the value 0 from the variable, the worst
> thing that seem to happen is that the global variables are allocated
> twice.  That is a memory leak, but a pretty insignificant one, and one
> that would happen rarely.  There is no serious problem, that cannot
> easily be solved by the caller, as far as I can see.  Nothing else bad
> will happen, as far as I can see.
> 

No, during initialization you set up some linked list so is not a good
idea to initialize them twice at the same time.

> > On the other hand, in what I propose each thread has its own memory
> > and there can't possibly be a conflict.  (The synchronization point
> > moves to malloc(), which is a reliable place for it).
> 
> Alternatively, each thread could do a mutex around the call to
> gnutls_global_init:
> 
>  mutex_get()
>  gnutls_global_init();
>  mutex_release();
> 
> Although I wonder if the complexity is worth it.  The only danger in
> not doing this is that if there is a thread race, you could leak some
> memory.
> 

This do not solve problems if gnutls_global_init is called by two
libraries using two separate mutexes (assuming library A does not know
library B you cannot share same mutex).

> >>Remember that API changes are bad.
> >
> > But API _additions_ are no big deal.  I'm talking about adding three
> > new functions for people to use optionally in place of 
> three existing
> > ones, and people who use the existing ones will see no difference at
> > all.
> 
> Good!
> 

IMHO with some changes is possible to retain also ABI...

> >>I think it would be useful to see exactly what changes you are
> >>thinking of in more detail.
> >
> > I don't want to go so far as coding a patch if you don't 
> buy into the
> > basic idea, because it would be too much work.  But let me 
> try a more
> > detailed explanation.  This is nothing revolutionary -- 
> it's how lots
> > of libraries work in order to cater to threaded, modular, object
> > oriented programs.
> 
> There are a lot of different solutions to this problem, so I wanted to
> know exactly which one you thought of.
> 
> For example, many libraries has chosen to only support one thread
> library, and then use mutexes from that thread library to solve
> problems like this.  GnuTLS could have done that too, and wrapped
> gnutls_global_init and gnutls_global_deinit around thread mutexes.
> But I believe that is too inflexible for GnuTLS, consider this:
> 
> Application A (uses NPTL)
>    Library B (uses pthreads)
>      GnuTLS
>    Library C (uses GNU Pth)
>      GnuTLS
> 
> If GnuTLS used a pthread mutex internally to protect against this, it
> would not work well.
> 
> I believe it is best to write thread safe code.  If that can't be done
> (and I think we may change the GnuTLS code here), declaring that the
> function is thread-unsafe and relying on the caller to protect the
> call around a mutex is the second best bet.  Of course, if there is a
> good argument for another approach, I could change my mind.
> 

An easy way to make GnuTLS thread-safe is use mutex directly from
GnuTLS.

> > A program calls gnutls_create_session_group() instead of
> > gnutls_global_init().  This new function returns a "session 
> group" handle.
> > The program then calls gnutls_session_init_group() instead of
> > gnutls_session_init().  This new function takes that 
> session group handle
> > as an argument.  When it's all done, the program calls
> > gnutls_destroy_session_group(handle) instead of 
> gnutls_global_deinit().
> >
> > The session handle is really a pointer to a descriptor (struct) that
> > contains root pointers for the "global" trees.  So each 
> session group
> > has its own copy of the trees.
> >
> > The session group handle gets added to the session descriptor, and
> > everything in gnutls that references the trees references 
> them via the
> > relevant session group descriptor instead of direct reference to
> > global variables.  Some internal functions need to get an extra
> > parameter so they know which tree to access.
> >
> > So that the existing API keeps working, there is a global session
> > group descriptor that takes the place of the two global 
> tree pointers.
> > gnutls_global_init() does a gnutls_create_session_group() of that
> > descriptor and gnutls_session_init() does a
> > gnutls_session_init_group() with the global handle.
> 
> This could work, but I don't exactly see what the problem is that
> would warrant all those changes.  The only gain I see is that you
> protect against a potential memory leak in gnutls_global_init() that
> will most likely never happen in real code.  For comparison, the
> constant memory leak inside libgcrypt should be more worrying to
> you...
> 

The session idea works however it require extra parameters in many
functions... This has also an advantage. Assuming someone can add
ciphers or other informations to GnuTLS these informations are not used
globally but only limited for each session. Also it fix all DSO problems
cause memory deallocation is demanded to caller, not to GnuTLS. This
also make it possible to demand thread-safety to caller.

...

freddy77