                    FreeType Optimization HOWTO


Introduction
============

  This file describes several ways to improve the performance of the
  FreeType engine on specific builds. Each `trick' has some
  drawbacks, be it on code size or portability.

  The performance improvement cannot be quantified here simply
  because it depends significantly on platforms _and_ compilers.


I. Tweaking the configuration file
==================================

  The FreeType configuration file is named `ft_conf.h' and contains
  the definition of various macros that are used to configure the
  engine at build time.

  Apart from the Unix configuration file, which is generated when
  calling the `configure' script from the template called

    freetype/ft_conf.h.in ,

  all configuration files are located in

    freetype/lib/arch/<system>/ft_conf.h ,

  where <system> stands for your platform. This release also
  provides an `ansi' build, i.e., the directory `lib/arch/ansi',
  which can be used to compile with any ANSI-compliant compiler.

  The configuration macros that relate to performance are described
  next.


  1. TT_CONFIG_OPTION_INTERPRETER_SWITCH
  --------------------------------------

    If set, this macro builds a bytecode interpreter which uses a
    huge `switch' statement to parse the bytecode stream during
    glyph hinting.

    If unset, the interpreter uses a big jump table to call each
    bytecode's routine.

    This macro is *set* by default. However, it may be worthwhile on
    some platforms to unset it.

    Note that this macro is ignored if
    TT_CONFIG_OPTION_NO_INTERPRETER is set.


  2. TT_CONFIG_OPTION_STATIC_INTERPRETER
  --------------------------------------

    If set, this macro builds a bytecode interpreter which uses a
    static variable to store its state. On some processors, this
    will produce code which is bigger but slightly faster.

    Note that you should NOT DEFINE this macro when building a
    thread-safe version of the engine.

    This macro is *unset* by default.


  3. TT_CONFIG_OPTION_STATIC_RASTER
  ---------------------------------

    If set, this macro builds a scan-line converter which uses a
    static variable to store its state. On some processors, and
    depending on the compiler used, this will produce code which is
    bigger but moderately faster.

    Note that you should NOT DEFINE this macro when building a
    thread-safe version of the engine.

    This macro is *unset* by default. We do not recommend using it
    except for extreme cases where a performance `edge' is needed.


II. Replacing some components with optimized versions
=====================================================

  To improve performance, you can also replace one or more
  components of the original source files. Here are our
  suggestions.


  1. Use memory-mapped files whenever available
  ---------------------------------------------

    Loading a glyph from a TrueType file needs many random seeks,
    which take a lot of time when using disk-based files.

    Whenever possible, use memory-mappings to improve load
    performance dramatically.

    For an example, see the source file

      freetype/lib/arch/unix/ttmmap.c

    which uses Unix memory-mapped files.


  2. Replace the computation routines in `ttcalc.c'
  -------------------------------------------------

    This file contains many computation routines that can easily be
    replaced by inline-assembly, tailored for a specific processor
    and/or compiler.

    After heavy testing, we have found that these functions,
    especially TT_MulDiv(), are the ones that are called most often
    when loading glyphs from a font file.

    We do not provide inline-assembly with this release, as we want
    to emphasize the portability of our library. However, when
    working on a specific project where the hardware is known to be
    fixed (like on an embedded system), great performance gains
    could be achieved by replacing these routines.

    (By the way, the square root function is not optimal, but it is
    very seldom called. However, its accuracy is _critical_.
    Replacing it with a fast but inaccurate algorithm will ruin the
    rendering of glyphs at small sizes.)


III. Measuring performance improvements
=======================================

  Once you have chosen some improvements and rebuilt the library,
  some quick ways to measure the `new' speed are:

  - Run the test program `ftlint' on a directory containing many
    TrueType fonts, and measure the time it takes. On Unix, you can
    use the shell command `time' for this, as in

      % time test/ftlint 10 /ttfonts/*.ttf

    This will measure the performance improvement of the TrueType
    interpreter.

  - Run the test program `fttimer' on a font containing many complex
    glyphs (the latest available versions of Times or Arial should
    do it), probably using anti-aliasing, as in:

      % time test/fttimer -g /ttfonts/arial.ttf

  Compare the results of several of these runs for each build.


--- end of OPTIMIZE ---
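
  As an illustration of the kind of replacement discussed in
  section II.2, here is a minimal sketch of a multiply-then-divide
  routine that keeps a 64-bit intermediate product. It is NOT the
  code from `ttcalc.c': the name `sketch_muldiv' is hypothetical,
  and the rounding rule (nearest integer, halves away from zero) is
  an assumption that must be checked against the routine you intend
  to replace. It also assumes a compiler that provides <stdint.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a MulDiv-style routine (not taken from
   `ttcalc.c').  Computes a * b / c, rounded to the nearest integer
   with halves rounded away from zero.  The product of two 32-bit
   values always fits in 64 bits, so the intermediate cannot
   overflow.  The caller must ensure that the final result fits in
   32 bits. */
int32_t sketch_muldiv(int32_t a, int32_t b, int32_t c)
{
    int64_t product;
    int64_t divisor;
    int     negative = 0;

    assert(c != 0);                     /* division by zero is a caller error */

    product = (int64_t)a * (int64_t)b;  /* widen before multiplying */
    divisor = (int64_t)c;

    /* Work on magnitudes and remember the sign of the result. */
    if (product < 0) { product = -product; negative = !negative; }
    if (divisor < 0) { divisor = -divisor; negative = !negative; }

    /* Adding half the divisor turns truncation into rounding. */
    product = (product + divisor / 2) / divisor;

    return (int32_t)(negative ? -product : product);
}
```

  For instance, sketch_muldiv(10, 3, 4) evaluates 30/4 = 7.5 and
  returns 8. Whether such a routine is actually faster than the
  portable one depends on the target: on processors without native
  64-bit arithmetic the compiler may emit helper calls, so measure
  each build as described in section III before adopting it.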