rlm_x99_token [plain text]

This is the combined README for pam_x99_auth, a PAM module, and
rlm_x99_token, a FreeRADIUS module.  This module supports ANSI X9.9
token authentication.  If you have questions not answered in this doc,
contact Frank Cusack, <frank@google.com>.  Please send bug reports
to the same address.

This software is Copyright (c) 2002 Google, Inc.
All Rights Reserved.

See the COPYRIGHT file, included with this distribution, for copyright
and redistribution information.


0.  INTRODUCTION

    This module allows authentication of users who hold ANSI X9.9
    challenge/response tokens.  These tokens are basically handheld
    DES calculators and are sold by various vendors.  Currently known
    vendors are:

	CRYPTOCard (RB-1, KT-1)
	<http://www.cryptocard.com/>
	Full support.

	Secure Computing (SafeWord (various models))
	<http://www.securecomputing.com/index.cfm?sKey=688>
	Async support only.
	Formerly Enigma Logic.

	VASCO (DigiPass (various models))
	<http://www.vasco.com/R3DEngine.asp?reference=03-01.01-01&lang=en>
	Not supported.

	ActivCard (ActivCard One)
	<http://www.activcard.com/>
	Not supported (patented sync mode).

	PassGo (Defender aka SNK, Defender Plus, Defender One)
	<http://www.passgo.com/products/defender/>
	Fully supported.
	This product has a history of acquisition:
	PassGo, Symantec, AXENT, AssureNet Pathways (name change only),
	Digital Pathways.  Also, to add to the fun, PassGo was acquired
	by AXENT, whom Symantec then acquired, and later forked off.  The
	product was originally named "SecureNet Key" aka SNK or SNK/4.

    If you know of other vendors, please email me (frank@google.com)
    and I'll add them to the list above.

    Although ActivCard has full specs, I cannot add support to the public
    codebase because their synchronous algorithm is patented.  Oh, well,
    their loss.  I can, however, add code which you may not distribute.
    You will need to obtain their development kit and sign an NDA.  Async
    mode is not supported for ActivCard because you need the dev. kit (+NDA)
    to extract or program the keys.

    The PassGo Defender One is an OEM version of the ActivCard One and
    so it suffers from the same patent issues.  I believe the Defender Plus
    also uses the same algorithm.

    No support is available for programming tokens.  You will need to
    write this yourself, or use the vendor's programming tools and extract
    the key information in order to use this module.

    I *strongly* discourage the use of "soft tokens" or Palm tokens.  These
    are easily compromisable, since the key is likely to be insufficiently
    protected.

    This module is available for FreeRADIUS and PAM.  FreeRADIUS is
    available at <http://www.freeradius.org/>, the PAM module is available
    at <http://www.fcusack.com/>.


1.  STRONG WARNING SECTION

    ANSI X9.9 has been withdrawn as a standard, due to the weakness of DES.
    An attacker can learn the token's secret by observing two
    challenge/response pairs.  See ANSI document X9 TG-24-1999,
    <http://www.x9.org/TG24_1999.pdf>.

    The obvious fix is to not display the challenge; the attacker will
    not have access to the plaintext.  This is possible since most X9.9
    tokens support a synchronous mode; the only exception I know of is
    the PassGo Defender.  So, in synchronous modes, the challenge presented
    to the user is NOT the challenge used for response calculation.
    Read on for more info.

    The default configuration of this module effectively disables pure
    challenge/response (hereafter: async) mode.


2.  INSTALLATION

    You'll need to have a DES library in order to build and use this module.
    Currently, only openssl is supported.

    You will also need /dev/urandom available.  This is available on all
    Linux, *BSD and Solaris 9.  For Solaris 8, you'll need to install
    patch 112438-01 (sparc) or 112439-01 (x86).  Information for other OS's
    is welcome.

    You'll also need to write a site-specific challenge transform in order
    to use async mode.  You might need async mode to sync the user's token
    with the server initially.  More on this below.

3.  END-USER OPERATION

    "Normally", upon login, users would enter enter the challenge into
    their token and give the server the response.  However, this is unsafe
    given that DES is so weak.  Luckily, most tokens support a synchronous
    mode which lets the user skip the part where they enter the challenge.

    For some tokens, the token displays the synchronous challenge, which
    typically the user would verify is the same as the challenge presented
    by the server.  Then they can safely just press "ENT" and enter the
    response.  This is very easy to use, but we're still stuck with the
    problem that an attacker has observed a plaintext/ciphertext pair.

    So instead of presenting the synchronous challenge, the server ALWAYS
    displays a random challenge.  Instead of verifying that the challenge
    matches the token display, the user should just press "ENT" and enter
    the response.  This might be confusing, you will need to train users.
    Even with training, they will forget.  Be warned!

    For other tokens, the token does not display the synchronous
    challenge--only the response is displayed.  This is a bit easier on
    the user, they won't be confused as to which number to enter for the
    response.  Since the token's challenge display really just serves
    to verify the sync state, and we don't present that information,
    I recommend operating tokens in the no-challenge-display mode if
    possible.

    In addition to the shielding of the plaintext, and ease of use,
    another advantage of sync mode is to support authentication methods
    where a challenge cannot be presented to the user, e.g. when using
    the Microsoft Windows VPN client.
    

4.  SITE-SPECIFIC CHALLENGE TRANSFORM

    Even though the normal mode of operation will be sync mode,
    we want async mode support for (at least) two reasons:

        1) to sync/resync the token, and
        2) because state is not shared across multiple RADIUS/PAM servers.

    Note that only some tokens support "user" sync/resync.  For others,
    an admin must create the initial state and manual intervention is
    required for resync.

    Since pure challenge/response is unsafe, I came up with the concept
    of the "site-specific challenge transform".  For the user, this
    means that instead of entering the challenge as presented to them,
    they enter something based on the challenge.  For example, a simple
    transform would be to enter the challenge backwards; if the server
    presents "123456" the user enters "654321".  This has the effect
    that an observer does not have access to the plaintext.

    This is security through obscurity, and is not really "safe", but
    for an outsider it may present at least some barrier.  Even though
    it presents no advantage in the face of a determined attacker, I
    recommend using it.

    The server logs each time a user authenticates via async mode, so
    I recommend a log scanner which alerts you to this.  You should
    reprogram tokens when the user authenticates via async mode.

    x99_site.c implements the site-specific challenge transform.  The
    default transform is to replace the challenge with the text "DISABLED".
    This effectively disables async mode (the user will not be able to
    enter this into their token).

    DO NOT use the transform suggested above, reversing the challenge.
    That is now exceptionally weak.  An example of a possibly strong
    transform is to have the user enter the square of the challenge.
    The VASCO DigiPass 500 is also a [regular] calculator, so this
    could be a good one if you use that token.  Well, there's no support
    for that token, and now that I've mentioned it, it is another
    exceptionally weak transform, but you get the idea.

    Note that older CRYPTOCard RB-1 tokens support arbitrarily long
    challenge strings.  You should take advantage of this when implementing
    your transform.  You will still have to stay under MAX_CHALLENGE_LEN
    digits.  (This is why MAX_CHALLENGE_LEN is set to 32 even though
    the displayed challenge would generally be much smaller.)

    If you do not believe applying a transform gives any advantage,
    you can just comment out the single line of code there.  This actually
    may have some benefit, since your users don't need to be trained.
    I can guarantee your most annoying user will complain when they can't
    remember what they really are supposed to enter into the token.
    Also, this can be safe if you diligently reprogram tokens when async
    mode has been used.  You might automatically disable a token after two
    async authentications.


5.  CONFIGURATION

    Most of the configuration is documented fairly well in the sample
    x99.conf file (FreeRADIUS) or man page (PAM).  I will only discuss
    a few options here.

    softfail/hardfail:  See x99.conf (FreeRADIUS) or the man page (PAM)
    for how these work.  It is critically important to have these since
    the password (response) space is so small.  Without a delay/lockout,
    it would be trivially easy for an attacker to just try every
    possible password.  With the default softfail setting of 5, an
    attacker could try, at most, ~50 passwords/day.  No indication is
    given to the user that they are in "delay mode" (except that a valid
    password doesn't work).

    ewindow_size:  This is how far out of [event] sync the server can
    get with the token.  The value is how far the user can be ahead of
    the server -- essentially how many times the user can play with the
    token.  You'll want to set this to at least 1 or 2, in case the user
    mistypes the response and the token turns off before he is able to
    try again.  It's 'e'window because I am reserving twindow for
    time synchronous modes.

    ewindow2_size:  This is similar to ewindow_size.  In "delay mode",
    users' responses are normally /not/ checked during the delay period.
    However, if ewindow2_size is non-zero, the response /is/ checked up to
    ewindow2_size events/counts.  If the user gets two consecutive sync
    responses correct within ewindow2_delay seconds, he is authenticated.

	For example, say softfail=1, ewindow_size=2 and ewindow2_size=8.
	The server's state is such that the next 8 responses are
	1, 2, ..., 8.  The user, however, has played with the token and
	the response showing is '3'.

	This is ahead of ewindow_size, so the server refuses them (and
	records the fact the last response was '3').

	The user tries again immediately, using '3' again.  Since the
	user is now in "delay mode" (because softfail is only 1),
	the server would normally refuse them (remember, we said they
	tried again "immediately").  Even if the user weren't in delay
	mode, the server would still refuse them because they are too
	far ahead of the window.

	But since ewindow2_size is non-zero, instead the server looks
	ahead up to 8 responses.  It stops at 3 and since the previous
	response was not '2', it refuses the user, but records that the
	last response was '3'.

	Now the user tries again immediately, this time using the next
	response of '4'.  Again, normally this would be refused since
	the user is in delay mode.  But because ewindow2_size is set,
	the server will check up to 8 responses and since '4' is within
	the window, and the user's previous response ('3') matches the
	previous response in the window, the user is authenticated.

	* If an attacker knows a response that is ahead of the window,
	  they can launch a simple attack by just guessing in pairs:
	  <known_response>, <guess>.  However, they could also simply
	  just wait until the user advances the window so that their
	  known response becomes valid, so this shouldn't be an issue.
	  Nevertheless, you may wish to set hardfail when using
	  ewindow2_size (but keep in mind the trivial DoS with hardfail).

	* Setting ewindow2_size to ANY value increases CPU usage.
	  Without it set, when a user is in softfail the server returns
	  failure without checking response values.  With ewindow2_size
	  set, the server now does check response values; and you
	  normally want ewindow2_size to be somewhat large (compared to
	  ewindow_size), so the number of values checked is GREATER when
	  the softfail-imposed delay mode is in effect.

	* Setting ewindow2_size to LARGE values may increase potential
	  for abuse via DoS.  I  have not performed any server sizing
	  exercises, but I expect that for most installations any modern
	  hardware is fast enough for reasonable values.

    radiusd.conf:  x99_token must be listed in both the authorize and
    authenticate stanzas.  In the authorize code, x99_token will set
    the Auth-Type to x99_token (ie, itself) if the Auth-Type attribute
    isn't already present.  You can use this to selectively authenticate
    users via a token.  Any examples I could give here would be poor,
    and subject to other modules' [changing] operations, so it's probably
    best to direct any questions to <freeradius-users@lists.cistron.nl>.

6.  FILES

    See the sample x99passwd file.  State files are stored in
    /etc/x99sync.d by default.  There is one state file per user.
    The state file contains the information needed for synchronous
    mode; also the number of consecutive failed logins and the last
    time the user authenticated via async mode is stored here.


7.  LOG MESSAGES

    All errors begin with "rlm_x99_token" (FreeRADIUS) or "pam_x99_auth"
    (PAM).  Only errors are logged, there are no "success" log messages.
    You will want to scan for these automatically or periodically.

    "bad state" messages (FreeRADIUS) indicate a problem with the State
    attribute, which the server uses to track challenges (for async mode).
    They are all of the form "bad state for [%s]: <problem>",
    where <problem> is one of:

    length:  The length is not as expected.  Could be an attempted attack,
             but more likely a network blip.
    hmac:    The state is protected by a cryptographic hash which was not
             able to be verified.  This could be because you just HUP'd
             the server.
    expired: The state is older than maxdelay seconds.  If you get a lot
             of these you may wish to increase the value.
    missing: This should never happen and indicates a bug.

    Another message you'll want to lookout for is
    "%d/%d failed/max authentications" which indicates a user that is
    locked out due to exceeding hardfail/softfail failures.  You can reset
    this user by editing the state file (see x99passwd.sample).

    Also, look for "[%s] authenticated in async mode" which indicates
    a user with a sync mode card that used async authentication.  You
    may wish to reprogram these users' cards.


8.  BUGS

    Send bug reports or any other questions to Frank Cusack,
    <frank@google.com>.