329 lines
16 KiB
Plaintext
329 lines
16 KiB
Plaintext
============================================
|
|
Datagram Transport Layer Security for Python
|
|
============================================
|
|
|
|
PyDTLS brings Datagram Transport Layer Security (DTLS - RFC 6347:
|
|
http://tools.ietf.org/html/rfc6347) to the Python environment. In a
|
|
nutshell, DTLS brings security (encryption, server authentication,
|
|
user authentication, and message authentication) to UDP datagram
|
|
payloads in a manner equivalent to what SSL/TLS does for TCP stream
|
|
content.
|
|
|
|
DTLS is now very easy to use in Python. If you're familiar with the
|
|
ssl module in Python's standard library, you already know how. All it
|
|
takes is passing a datagram/UDP socket to the *wrap_socket* function
|
|
instead of a stream/TCP socket. Here's how one sets up the client side
|
|
of a connection::
|
|
|
|
import ssl
|
|
from socket import socket, AF_INET, SOCK_DGRAM
|
|
from dtls import do_patch
|
|
do_patch()
|
|
sock = ssl.wrap_socket(socket(AF_INET, SOCK_DGRAM))
|
|
sock.connect(('foo.bar.com', 1234))
|
|
sock.send('Hi there')
|
|
|
|
Design Goals
|
|
============
|
|
|
|
The primary design goal of PyDTLS is broad availability. It has therefore
|
|
been built to be widely compatible with the following:
|
|
|
|
* Operating systems: apart from the Python standard library, PyDTLS
|
|
relies on the OpenSSL library only. OpenSSL is widely ported, and
|
|
in fact the Python standard library's *ssl* module also uses it.
|
|
* Python runtime environments: PyDTLS is a package consisting of
|
|
pure Python modules only. It should therefore be portable to many
|
|
interpreters and runtime environments. It interfaces with OpenSSL
|
|
strictly through the standard library's *ctypes* foreign function
|
|
library.
|
|
* The Python standard library: the standard library's *ssl* module is
|
|
Python's de facto interface to SSL/TLS. PyDTLS aims to be compatible
|
|
with the full public interface presented by this module. The ssl
|
|
module ought to behave identically with respect to all of its
|
|
functions and options when used in conjunction with PyDTLS and
|
|
datagram sockets as when used without PyDTLS and stream sockets.
|
|
* Connection-based protocols: as outlined below, layering security
|
|
on top of datagram sockets requires introducing certain
|
|
connection constructs normally absent from datagram
|
|
sockets. These constructs have been added in such a way as to be
|
|
compatible with code that expects to interoperate with
|
|
connection-oriented stream sockets. For example, code that
|
|
expects to go through server-side bind/listen/accept connection
|
|
establishment should be reusable with PyDTLS sockets.
|
|
|
|
Distributions
|
|
=============
|
|
|
|
PyDTLS requires version 1.0.0 or higher of the OpenSSL
|
|
library. Earlier versions are reported not to offer stable DTLS
|
|
support. Since packaged distributions of this version of OpenSSL are
|
|
available for many popular operating systems, OpenSSL-1.0.0 is an
|
|
installation requirement before PyDTLS functionality can be called.
|
|
On Ubuntu 12.04 LTS, for example, the Python interpreter links with
|
|
libcrypto.so.1.0.0 and libssl.so.1.0.0, and so use of PyDTLS requires
|
|
no further installation steps.
|
|
|
|
In comparison, installation of OpenSSL on Microsoft Windows operating
|
|
systems is inconvenient. For this reason, source distributions of
|
|
PyDTLS are available that include OpenSSL dll's for 32-bit and 64-bit
|
|
Windows. For 32-bit Windows, a version built with the MinGW toolchain
|
|
is also available. Its archive includes stripped as well as
|
|
non-stripped dll's. The latter can be debugged with gdb on
|
|
Windows. All dll's have been linked with the Visual Studio 2008
|
|
version of the Microsoft C runtime library, msvcr90.dll, the version
|
|
used by CPython 2.7. Installation of Microsoft redistributable runtime
|
|
packages should therefore not be required on machines with CPython
|
|
2.7. The version of OpenSSL distributed with PyDTLS 0.1.0 is 1.0.1c.
|
|
|
|
The OpenSSL version used by PyDTLS can be determined from the values
|
|
of *sslconnection's* DTLS_OPENSSL_VERSION_NUMBER,
|
|
DTLS_OPENSSL_VERSION, and DTLS_OPENSSL_VERSION_INFO. These variables
|
|
are available through the *ssl* module also if *do_patch* has been
|
|
called (see below). Note that the OpenSSL version used by PyDTLS may
|
|
differ from the one used by the *ssl* module.
|
|
|
|
Interfaces
|
|
==========
|
|
|
|
PyDTLS' top-level package, *dtls*, provides DTLS support through the
|
|
**SSLConnection** class of its *sslconnection*
|
|
module. **SSLConnection** is in-line documented, and
|
|
dtls/test/echo_seq.py demonstrates how to take a simple echo server
|
|
through a listen/accept/echo/shutdown sequence using this class. The
|
|
corresponding client side can look like the snippet at the top of this
|
|
document, followed by a call to the *unwrap* method for shutdown (or a
|
|
call to the **SSLConnection** *shutdown* method, if an instance of
|
|
this class is used for the client side also). Note that the *dtls*
|
|
package does not depend on the standard library's *ssl* module, and
|
|
**SSLConnection** can therefore be used in environments where *ssl* is
|
|
unavailable or incompatible.
|
|
|
|
It is expected that with the *ssl* module being an established, familiar
|
|
interface to TLS, it will be the preferred module through which to
|
|
access DTLS. To do so, one must call the *dtls* package's *do_patch*
|
|
function before passing sockets of type SOCK_DGRAM to either *ssl's*
|
|
*wrap_socket* function, or *ssl's* **SSLSocket** constructor.
|
|
|
|
It should be noted that once *do_patch* is called, *dtls* will raise
|
|
exceptions of type **ssl.SSLError** instead of its default
|
|
**dtls.err.SSLError**. This allows users' error handling code paths to
|
|
remain identical when interfacing with *ssl* across stream and
|
|
datagram sockets.
|
|
|
|
Connection Handling
|
|
===================
|
|
|
|
The DTLS protocol implies a connection as an association between two
|
|
network peers where the overall association state is characterized by the
|
|
handshake status of each peer endpoint (see RFC 6347). The OpenSSL library
|
|
records this handshake status in "SSL" type instances (a.k.a. struct
|
|
ssl_st). Datagrams can be securely sent and received by referring to a
|
|
unique "SSL" instance after handshaking has been completed with this
|
|
instance and its network peer. A connection is implied in that traffic
|
|
may be directed to or received from only that network peer with whose
|
|
"SSL" instance the handshake has been completed. The fact that the
|
|
underlying network protocol, UDP in most cases, is itself connectionless
|
|
is immaterial.
|
|
|
|
Further, in order to prevent denial-of-service attacks on UDP DTLS
|
|
servers, clients must undergo a cookie exchange phase early in the
|
|
handshaking protocol, and before server-side resources are committed to
|
|
a particular client (see section 4.2.1 of RFC 6347). The cookie exchange
|
|
proves to the server that a client can indeed receive IP traffic at
|
|
the source IP address with which its handshake-initiating ClientHello
|
|
datagram is marked.
|
|
|
|
PyDTLS implements this connection establishment through the *connect*
|
|
method on the client side, and the *accept* method on the server side.
|
|
The latter returns a new **dtls.SSLConnection** or **ssl.SSLSocket**
|
|
object (depending on which interface is used, see above), which is
|
|
"connected" to its peer. In addition to the *read* and *write* methods
|
|
(at both interface levels), **SSLSocket's** *send* and *recv* methods
|
|
can be used; use of *sendto* and *recvfrom* on connected sockets is
|
|
prohibited by *ssl*. *accept* returns peer address information, as
|
|
expected. Note that when using the *ssl* interface to *dtls*, *listen*
|
|
must be called before calling *accept*.
|
|
|
|
Demultiplexing
|
|
==============
|
|
|
|
At the network io layer, only datagrams from its connected peer must be
|
|
passed to a **SSLConnection** or **SSLSocket** object (unless the object
|
|
is unconnected on the server-side, in which case in can be in listening
|
|
mode, the initial server-side socket whose role it is to listen for
|
|
incoming client connection requests).
|
|
|
|
The initial server-side listening socket is not useful for performing this
|
|
datagram routing function. This is because it must remain unconnected and
|
|
ready to receive additional connection requests from new, unknown clients.
|
|
|
|
The function of passing incoming datagrams to the proper connection is
|
|
performed by the *dtls.demux* package. **SSLConnection** requests a new
|
|
connection from the demux when a handshake has cleared the cookie
|
|
exchange phase. An efficient implementation of this request is provided
|
|
by the *osnet* module of the demux package: it creates a new socket that
|
|
is bound to the same network interface and port as the listening socket,
|
|
but connected to the peer. UDP stacks such as the one included with Linux
|
|
route incoming datagrams to such a connected socket in preference to an
|
|
unconnected socket bound to the same port.
|
|
|
|
Unfortunately such is not the behavior on Microsoft Windows. Windows
|
|
UDP routes datagrams to whichever currently existing socket bound to
|
|
the particular port the earliest (and whether or not that socket is
|
|
unconnected, or connected to the datagram's peer, or a different
|
|
peer). Other sockets bound to the same port will not receive traffic,
|
|
if and until they become the earliest bound socket because another
|
|
socket is closed.
|
|
|
|
The demux package therefore provides and automatically selects the module
|
|
*router* on Windows platforms. This module also creates a new socket when
|
|
receiving a new connection request; but instead of binding this socket
|
|
to the same port as the listening socket, it binds to a new ephemeral
|
|
port. *router* then forwards datagrams originating from the peer for which
|
|
a connection was requested to the corresponding socket.
|
|
|
|
For efficiency's sake, no forwarding is performed on outgoing traffic.
|
|
Instead, **SSLConnection** directs outgoing traffic from the original
|
|
listening socket, using *sendto*. At the OpenSSL level this requires
|
|
separate read and write datagram BIO's for an "SSL" instance, one in
|
|
"connected" state and one in "peer set" state, respectively, and
|
|
associated with two separate network sockets.
|
|
|
|
From the perspective of a PyDTLS user, this selection of and
|
|
difference between demux implementations should be transparent, with
|
|
the possible exception of performance deviation. This transparency
|
|
does however have some limits: for example, when *router* is in use,
|
|
the *accept* methods can return *None*. This happens when
|
|
**SSLConnection** detects that the demux has forwarded a datagram to a
|
|
known connection instead of initiating a connection to a new peer
|
|
through *accept*. Returning *None* in this case is important whenever
|
|
non-blocking sockets or sockets with timeouts are used, since another
|
|
socket might now be readable as a result of the forwarded
|
|
datagram. *accept* must return so that the application can iterate on
|
|
its asynchronous *select* loop.
|
|
|
|
Shutdown and Unwrapping
|
|
=======================
|
|
|
|
PyDTLS implements the SSL/TLS shutdown protocol as it has been adapted
|
|
for DTLS. **SSLConnection's** *shutdown* and **SSLSocket's** *unwrap*
|
|
invoke this protocol. As is the case with DTLS handshaking in general,
|
|
applications must be prepared to use the *get_timeout* and
|
|
*handle_timeout* methods in addition to re-invoking *shutdown* or
|
|
*unwrap* when sockets become readable and an exception carried
|
|
SSL_ERROR_WANT_READ. (See more on asynchronous IO in the Testing section.)
|
|
|
|
**SSLConnection's** *shutdown* and **SSLSocket's** *unwrap* return a
|
|
(possibly new) socket that can be used for unsecured communication
|
|
with the peer, as set forth by the *ssl* module. The demux
|
|
infrastructure remains in use for this communication until the
|
|
returned socket is cleaned up. Note that when the *router* demux is
|
|
in use, the object returned will be one derived from
|
|
*socket.socket*. This is because the send and recv paths must still be
|
|
directed to two different OS sockets. In addition, the right thing
|
|
happens if secured communication is resumed over such a socket by
|
|
passing it to *ssl.wrap_socket* or the **SSLConnection**
|
|
constructor. If *osnet* is used, an actual *socket.socket* instance is
|
|
returned.
|
|
|
|
Framework Compatibility
|
|
=======================
|
|
|
|
PyDTLS sockets have been tested under the following usage modes:
|
|
|
|
* Using blocking sockets and sockets with timeouts in
|
|
multi-threaded UDP servers
|
|
* Using non-blocking sockets, and in conjunction with the
|
|
asynchronous socket handler, asyncore
|
|
* Using blocking sockets, and in conjunction with the network
|
|
server framework SocketServer - ThreadingTCPServer (this works
|
|
because of PyDTLS's emulation of connection-related calls)
|
|
|
|
Multi-thread Support
|
|
====================
|
|
|
|
Using multiple threads with OpenSSL requires implementing a locking
|
|
callback. PyDTLS does implement this, and therefore multi-threaded
|
|
programming with PyDTLS is safe in any environment. However, being a
|
|
pure Python library, these callbacks do carry some overhead. The *ssl*
|
|
module implements an equivalent locking callback in its C extension
|
|
module. Not requiring interpreter re-entry, this approach can be
|
|
expected to perform better. PyDTLS therefore queries OpenSSL as to
|
|
whether a locking callback is already in place, and does not overwrite
|
|
it if there is. Loading *ssl* can therefore improve performance, even
|
|
when only the *sslconnection* interface is used.
|
|
|
|
Note that loading order does not matter: to obtain the performance
|
|
benefit, *ssl* can be loaded before or after the dtls package. This is
|
|
because *ssl* does not do an equivalent existing locking callback
|
|
check, and will simply overwrite the PyDTLS callback if it has already
|
|
been installed. But *ssl* should not be loaded while *dtls* operation
|
|
is already in progress, when some locks may be in their acquired
|
|
state.
|
|
|
|
Also note that this performance enhancement is available only on
|
|
platforms where PyDTLS loads the same OpenSSL shared object as
|
|
*ssl*. On Ubuntu 12.04, for example, this is the case, but on
|
|
Microsoft Windows it is not.
|
|
|
|
Testing
|
|
=======
|
|
|
|
Almost all of the Python standard library's *ssl* unit tests from the
|
|
module *test_ssl.py* have been ported to *dtls.test.unit.py*. All tests
|
|
have been adjusted to operate with datagram sockets. On Linux, each
|
|
test is executed four times, varying the address family among IPv4 and
|
|
IPv6 and the demux among *osnet* and *router*. On Windows, where
|
|
*osnet* is unavailable, each test is run twice, once with IPv4 and once
|
|
with IPv6.
|
|
|
|
The unit test suite includes tests for each of the above-mentioned
|
|
compatible frameworks. The class **AsyncoreEchoServer** serves as an
|
|
example of how to use non-blocking datagram sockets and implement the
|
|
resulting timeout detection requirements. DTLS in general and OpenSSL
|
|
in particular require being called back when used with non-blocking
|
|
sockets (or sockets with timeout option) after DTLS timeouts expire to
|
|
handle packet loss using re-transmission during a
|
|
handshake. Handshaking may occur during any read or write operation,
|
|
even after an initial handshake completes successfully, in case
|
|
renegotiation is requested by a peer.
|
|
|
|
Running with the -v switch executes all unit tests in verbose mode.
|
|
|
|
dtls/test/test_perf.py implements an interactive performance test
|
|
suite that compares the raw throughput of TCP, UDP, SSL, and DTLS.
|
|
It can be executed locally through the loopback interface, or between
|
|
remote clients and servers. In the latter case, test jobs are sent to
|
|
remote connected clients whenever a suite run is initiated through the
|
|
interactive interface. Run test_perf.py -h for more information.
|
|
|
|
It should be noted that comparing the performance of protocols that
|
|
don't offer congestion control (UDP and DTLS) with those that do (TCP
|
|
and SSL) is a difficult undertaking. Raw throughput even across
|
|
gigabit network links can be expected to suffer without congestion
|
|
control and peers that generate data as fast as possible without
|
|
throttling (as this test does): the link's throughput will drop
|
|
significantly as it enters congestion collapse. Similarly, loopback is
|
|
an imperfect test interface since it rarely drops packets, and never
|
|
duplicates or reorders them (thus negating the relative performance
|
|
benefits of DTLS over SSL). Nevertheless, some useful insights can be
|
|
gained by observing the operation of test_perf.py, including software
|
|
stack behavior in the presence of some amount of packet loss.
|
|
|
|
Logging
|
|
=======
|
|
|
|
The *dtls* package and its sub-packages log various occurrences,
|
|
primarily events that can aid debugging. Especially *router* emits many
|
|
messages when the logging level is set to at least *logging.DEBUG*.
|
|
dtls/test/echo_seq.py activates this logging level during its operation.
|
|
|
|
Currently Supported Platforms
|
|
=============================
|
|
|
|
At the time of initial release, PyDTLS 0.1.0 has been tested on Ubuntu
|
|
12.04.1 LTS 32-bit and 64-bit, as well as Microsoft Windows 7 32-bit
|
|
and 64-bit, using CPython 2.7.3. Patches with additional platform
|
|
ports are welcome.
|