


		    FreeType Optimization HOWTO



Introduction
============

  This file describes several ways to improve the performance of the
  FreeType  engine  on  specific  builds.   Each  `trick'  has  some
  drawbacks, be it on code size or portability.

  The  performance  improvement  cannot  be quantified  here  simply
  because it depends significantly on platforms _and_ compilers.



I. Tweaking the configuration file
==================================

  The FreeType configuration file  is named `ft_conf.h' and contains
  the definition  of various macros  that are used to  configure the
  engine at build time.

  Apart from  the Unix configuration  file, which is  generated when
  calling the `configure' script from the template called

    freetype/ft_conf.h.in    ,

  all configuration files are located in

    freetype/lib/arch/<system>/ft_conf.h    ,

  where  <system>  stands  for  your platform.   This  release  also
  provides an `ansi' build, i.e., the directory `lib/arch/ansi' used
  to compile with any ANSI-compliant compiler.

  The configuration macros that  relate to performance are described
  next.


  1. TT_CONFIG_OPTION_INTERPRETER_SWITCH
  --------------------------------------

    If set,  this macro builds  a bytecode interpreter which  uses a
    huge  `switch' statement  to  parse the  bytecode stream  during
    glyph hinting.

    If unset,  the interpreter  uses a big  jump table to  call each
    bytecode's routine.

    This macro is *set* by default.  However, it may be worthwhile on
    some platforms to unset it.

    Note      that      this      macro      is      ignored      if
    TT_CONFIG_OPTION_NO_INTERPRETER is set.
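
    The two  dispatch styles can  be sketched as follows.   This is
    not the  engine's interpreter: the three-opcode  machine and all
    names  (Machine, run_switch,  run_table, OP_*)  are hypothetical
    and only illustrate the trade-off.  The sketch assumes valid
    programs and performs no stack bounds checks.

```c
#include <stddef.h>

/* Hypothetical three-opcode stack machine, for illustration only. */
enum { OP_PUSH1 = 0, OP_ADD = 1, OP_NEG = 2, OP_COUNT = 3 };

typedef struct {
    long stack[16];
    int  top;                 /* number of values on the stack */
} Machine;

/* Variant 1: one big `switch' statement; opcode bodies are inline. */
static long run_switch(const unsigned char *code, size_t len)
{
    Machine m = { {0}, 0 };
    size_t  i;

    for (i = 0; i < len; i++) {
        switch (code[i]) {
        case OP_PUSH1:
            m.stack[m.top++] = 1;
            break;
        case OP_ADD:
            m.top--;
            m.stack[m.top - 1] += m.stack[m.top];
            break;
        case OP_NEG:
            m.stack[m.top - 1] = -m.stack[m.top - 1];
            break;
        default:              /* unknown opcode: ignored here */
            break;
        }
    }
    return m.top > 0 ? m.stack[m.top - 1] : 0;
}

/* Variant 2: a jump table with one routine per opcode. */
typedef void (*OpFunc)(Machine *);

static void op_push1(Machine *m) { m->stack[m->top++] = 1; }
static void op_add(Machine *m)
{
    m->top--;
    m->stack[m->top - 1] += m->stack[m->top];
}
static void op_neg(Machine *m) { m->stack[m->top - 1] = -m->stack[m->top - 1]; }

static const OpFunc op_table[OP_COUNT] = { op_push1, op_add, op_neg };

static long run_table(const unsigned char *code, size_t len)
{
    Machine m = { {0}, 0 };
    size_t  i;

    for (i = 0; i < len; i++)
        if (code[i] < OP_COUNT)
            op_table[code[i]](&m);   /* indirect call per opcode */

    return m.top > 0 ? m.stack[m.top - 1] : 0;
}
```

    Both variants  compute the  same result.   Which one  is faster
    depends on how well the compiler optimizes a large `switch' and
    on the cost of indirect calls on the target processor.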


  2. TT_CONFIG_OPTION_STATIC_INTERPRETER
  --------------------------------------

    If set,  this macro builds  a bytecode interpreter which  uses a
    static variable  to store its  state.  On some  processors, this
    will produce code which is bigger but slightly faster.

    Note  that you  should NOT  DEFINE  this macro  when building  a
    thread-safe version of the engine.

    This macro is *unset* by default.


  3. TT_CONFIG_OPTION_STATIC_RASTER
  ---------------------------------

    If set,  this macro  builds a scan-line  converter which  uses a
    static variable to store  its state.  On some processors, though
    depending on the compiler used,  this will produce code which is
    bigger but moderately faster.

    Note  that you  should NOT  DEFINE  this macro  when building  a
    thread-safe version of the engine.

    This macro is *unset* by  default.  We do not recommend using it
    except for extreme cases where a performance `edge' is needed.
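
    The thread-safety  restriction on  both `static'  options comes
    from the state being shared.   The following sketch is not taken
    from the engine; State, step_reentrant, step_static and g_state
    are hypothetical names that only contrast the two layouts.

```c
/* Hypothetical state record, standing in for interpreter or
   rasterizer state. */
typedef struct {
    long x;
    long y;
} State;

/* Re-entrant layout: every access goes through a pointer, so each
   caller (or thread) can own a separate State. */
static void step_reentrant(State *s, long dx)
{
    s->x += dx;
    s->y += s->x;
}

/* Static layout: the state lives at a fixed address.  The compiler
   may generate direct accesses, but there is exactly one instance,
   so two threads calling step_static() would corrupt each other. */
static State g_state;

static void step_static(long dx)
{
    g_state.x += dx;
    g_state.y += g_state.x;
}
```

    With the re-entrant layout, two  State objects stay independent;
    with the static layout, every caller updates the same g_state.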



II. Replacing some components with optimized versions
=====================================================

  You can also, in order to improve performance, replace one or more
  components  from   the  original  source  files.    Here  are  our
  suggestions.


  1. Use memory-mapped files whenever available
  ---------------------------------------------

    Loading a  glyph from a  TrueType file needs many  random seeks,
    which take a lot of time when using disk-based files.

    Whenever   possible,  use   memory-mappings   to  improve   load
    performance dramatically.  For an example, see the source file
  
      freetype/lib/arch/unix/ttmmap.c

    which uses Unix memory-mapped files.
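
    As a  rough illustration  of the  idea (not  a copy  of ttmmap.c),
    the  following  sketch  maps  a  file read-only  with  the  POSIX
    mmap() call, after which a `seek' is just pointer arithmetic.  The
    names MappedFile,  map_file, unmap_file  and read_u16_be  are
    hypothetical; the big-endian read reflects the byte order used by
    TrueType files.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical handle for a read-only mapping. */
typedef struct {
    const unsigned char *base;
    size_t               size;
} MappedFile;

/* Map the whole file read-only.  Returns 0 on success, -1 on error. */
static int map_file(const char *path, MappedFile *out)
{
    struct stat st;
    void       *p;
    int         fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return -1;

    out->base = (const unsigned char *)p;
    out->size = (size_t)st.st_size;
    return 0;
}

static void unmap_file(MappedFile *mf)
{
    munmap((void *)mf->base, mf->size);
    mf->base = NULL;
    mf->size = 0;
}

/* Read a big-endian 16-bit value at `off'; no seek or read call. */
static int read_u16_be(const MappedFile *mf, size_t off, unsigned *val)
{
    if (mf->size < 2 || off > mf->size - 2)
        return -1;            /* out of range */
    *val = ((unsigned)mf->base[off] << 8) | mf->base[off + 1];
    return 0;
}
```

    Random  accesses into  the mapping  are served  by the  operating
    system's page cache instead of individual seek and read calls.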


  2. Replace the computation routines in `ttcalc.c'
  ---------------------------------------------------

    This file contains many  computation routines that can easily be
    replaced by  inline-assembly, tailored for  a specific processor
    and/or compiler.

    After  heavy  testing,  we  have  found  that  these  functions,
    especially TT_MulDiv(), are the  ones called most often when
    loading glyphs from a font file.

    We do not provide inline-assembly  with this release, as we want
    to  emphasize the  portability  of our  library.  However,  when
    working on a specific project  where the hardware is known to be
    fixed  (like on  an  embedded system),  great performance  gains
    could be achieved by replacing these routines.

    (By the way, the square root  function is not optimal, but it is
    very  seldom  called.   However,  its  accuracy  is  _critical_.
    Replacing it with a fast  but inaccurate algorithm will ruin the
    rendering of glyphs at small sizes.)
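
    Before  resorting  to  assembly,  a compiler  with  a  native  64-bit
    integer type may  already allow a simpler replacement.   The sketch
    below computes  a*b/c with a  64-bit intermediate product.   It is
    not the code in  `ttcalc.c': the name muldiv_sketch, the use of
    <stdint.h>, and  the round-to-nearest behaviour  are assumptions
    made for illustration.  Check the rounding against the original
    routine before substituting anything.

```c
#include <stdint.h>

/* Hypothetical replacement sketch: a * b / c, rounded to nearest,
   using a 64-bit intermediate so that a * b cannot overflow.
   Assumes c != 0 and that the result fits in 32 bits. */
static int32_t muldiv_sketch(int32_t a, int32_t b, int32_t c)
{
    int     sign = 1;
    int64_t ua = a, ub = b, uc = c, q;

    /* Work on magnitudes and restore the sign at the end. */
    if (ua < 0) { ua = -ua; sign = -sign; }
    if (ub < 0) { ub = -ub; sign = -sign; }
    if (uc < 0) { uc = -uc; sign = -sign; }

    q = (ua * ub + uc / 2) / uc;   /* add half the divisor to round */

    return (int32_t)(sign < 0 ? -q : q);
}
```

    For  example,  muldiv_sketch(0x40000000, 4, 8)  needs a  product  of
    2^32, which  does not  fit in  32 bits,  yet the  final result
    0x20000000 does.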



III. Measuring performance improvements
=======================================

  Once you  have chosen some  improvements and rebuilt  the library,
  some quick ways to measure the `new' speed are:

  - Run  the test program  `ftlint' on  a directory  containing many
    TrueType fonts, and measure the time it takes.  On Unix, you can
    use the shell command `time' for this, as in

      % time test/ftlint 10 /ttfonts/*.ttf

    This will  measure the  performance improvement of  the TrueType
    interpreter.

  - Run the test program `fttimer' on a font containing many complex
    glyphs (the  latest available versions of Times  or Arial should
    do it), probably using anti-aliasing, as in:

      % time test/fttimer -g /ttfonts/arial.ttf

  Compare the results of several of these runs for each build.
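
  When a  replaced routine  is to be  timed in  isolation rather
  than through the test programs,  a small harness built on the
  standard  clock()   function  may  be  enough.    The  names
  cpu_seconds_per_run and  sample_work are hypothetical,  and the
  workload is  only a placeholder  for the routine  under test.

```c
#include <time.h>

/* Placeholder workload; replace with the routine under test. */
static volatile long work_sink;
static int           work_calls;

static void sample_work(void)
{
    long i;

    for (i = 0; i < 100000L; i++)
        work_sink += i;
    work_calls++;
}

/* Average CPU time, in seconds, of one call to fn over `runs' calls. */
static double cpu_seconds_per_run(void (*fn)(void), int runs)
{
    clock_t start, end;
    int     i;

    if (runs <= 0)
        return 0.0;

    start = clock();
    for (i = 0; i < runs; i++)
        fn();
    end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC / runs;
}
```

  clock() reports  processor time with limited  resolution, so use
  enough runs for  the total to be well above  one clock tick, and
  compare several measurements for each build.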


--- end of OPTIMIZE ---