Mailing List Archive: 49091 messages

TTF Parser (was: help needed from graphics GURUS ?)

 [1/1] from: greggirwin::mindspring::com at: 11-Sep-2002 10:49


Hi Gang,

Here's a crude, quickly hacked, TTF parser that will extract the English name of the font from the file, add it to a block of font names, and print it out to the console. TTF files are insanely flexible and can contain all kinds of platform and language information. This basic piece was easy enough to get working, and if the font-requester project gets going, it could certainly be enhanced and improved as we figure out what it needs.

--Gregg

P.S. WATCH FOR LINE WRAP!

REBOL [
    Title:   "ttf-parser"
    File:    %ttf-parser.r
    Author:  "Gregg Irwin"
    Email:   [greggirwin--acm--org]
    Purpose: {Parse TrueType Font files.}
]

null-buff: func [
    {Returns a null-filled string buffer of the specified length.}
    len [integer!]
][
    head insert/dup make string! len #"^@" len
]

buff-to-num: func [buf /big-endian] [
    either big-endian [
        to integer! to binary! buf
    ][
        to integer! to binary! head reverse buf
    ]
]

;The following data types are used in the TrueType font file.
;All TrueType fonts use Motorola-style byte ordering (Big Endian):
;BYTE:    8-bit unsigned integer.
;CHAR:    8-bit signed integer.
;USHORT:  16-bit unsigned integer.
;SHORT:   16-bit signed integer.
;ULONG:   32-bit unsigned integer.
;LONG:    32-bit signed integer.
;FIXED:   32-bit signed fixed-point number (16.16)
;FUNIT    Smallest measurable distance in the em space.
;FWORD    16-bit signed integer (SHORT) that describes a quantity in FUnits.
;UFWORD   Unsigned 16-bit integer (USHORT) that describes a quantity in FUnits.
;F2DOT14  16-bit signed fixed number with the low 14 bits of fraction (2.14).

;The TrueType font file begins at byte 0 with the Offset Table.
table-directory: make object! [
    version:        ;Fixed   0x00010000 for version 1.0.
    num-tables:     ;USHORT  Number of tables.
    search-range:   ;USHORT  (Maximum power of 2 <= numTables) x 16.
    entry-selector: ;USHORT  Log2(maximum power of 2 <= numTables).
    range-shift:    ;USHORT  NumTables x 16 - searchRange.
        none
]

;This is followed at byte 12 by the Table Directory entries.
; Entries in the Table Directory must be sorted in ascending order by tag.
table-directory-entry: make object! [
    tag:      ;ULONG  4-byte identifier.
    checkSum: ;ULONG  CheckSum for this table.
    offset:   ;ULONG  Offset from beginning of TrueType font file.
    length:   ;ULONG  Length of this table.
        none
]

;The Table Directory makes it possible for a given font to contain only
;those tables it actually needs. As a result there is no standard value
;for numTables.

comment {
    Tags are the names given to tables in the TrueType font file. At
    present, all tag names consist of four characters, though this need
    not be the case. Names with less than four letters are allowed if
    followed by the necessary trailing spaces. A list of the currently
    defined tags follows.
}

;Required Tables
;Tag    Name
; required-tables: [
;     cmap    "character to glyph mapping"
;     glyf    "glyph data"
;     head    "font header"
;     hhea    "horizontal header"
;     hmtx    "horizontal metrics"
;     loca    "index to location"
;     maxp    "maximum profile"
;     name    "naming table"
;     post    "PostScript information"
;     OS/2    "OS/2 and Windows specific metrics"
; ]

;Optional Tables
;Tag    Name
; optional-tables: [
;     cvt     "Control Value Table"
;     EBDT    "Embedded bitmap data"
;     EBLC    "Embedded bitmap location data"
;     EBSC    "Embedded bitmap scaling data"
;     fpgm    "font program"
;     gasp    "grid-fitting and scan conversion procedure (grayscale)"
;     hdmx    "horizontal device metrics"
;     kern    "kerning"
;     LTSH    "Linear threshold table"
;     prep    "CVT Program"
;     PCLT    "PCL5"
;     VDMX    "Vertical Device Metrics table"
;     vhea    "Vertical Metrics header"
;     vmtx    "Vertical Metrics"
; ]

comment {
    Other tables may be defined for other platforms and for future
    expansion. Note that these tables will not have any effect on the
    scan converter. Tags for these tables must be registered with Apple
    Developer Technical Support. Tag names consisting of all lower case
    letters are reserved for Apple's use. The number 0 is never a valid
    tag name.
}

; name table
name-table: make object! [
    format:      ;USHORT  Format selector (=0).
    num-records: ;USHORT  Number of NameRecords that follow n.
    offset:      ;USHORT  Offset to start of string storage (from start of table).
        none
    records: copy []    ;The NameRecords.
    string-data: none   ;(Variable) Storage for the actual string data.
]

name-record: make object! [
    platform:      ;USHORT  Platform ID.
    encoding-id:   ;USHORT  Platform-specific encoding ID.
    language-id:   ;USHORT  Language ID.
    name-id:       ;USHORT  Name ID.
    string-length: ;USHORT  String length (in bytes).
    string-offset: ;USHORT  String offset from start of storage area (in bytes).
        none
]

; platform-ids: [
;     0 Apple-Unicode ""
;     1 Macintosh     "Script manager code"
;     2 ISO           "ISO encoding"
;     3 Microsoft     "Microsoft encoding"
; ]
;
;
; ?encoding ids are only used with Microsoft platform?
; encoding-ids: [
;     0 "Undefined character set or indexing scheme"
;     1 "UGL character set with Unicode indexing scheme"
; ]
;
; language-ids: [
;     ; lots of stuff here. Not sure I want to tackle it right now.
; ]
;
; name-ids: [
;     0 ;Copyright notice.
;     1 ;Font Family name
;     2 ;Font Subfamily name; for purposes of definition, this is assumed to address style (italic, oblique) and weight (light, bold, black, etc.) only. A font with no particular differences in weight or style (e.g. medium weight, not italic and fsSelection bit 6 set) should have the string Regular stored in this position.
;     3 ;Unique font identifier
;     4 ;Full font name; this should simply be a combination of strings 1 and 2. Exception: if string 2 is "Regular," then use only string 1. This is the font name that Windows will expose to users.
;     5 ;Version string. In n.nn format.
;     6 ;Postscript name for the font.
;     7 ;Trademark; this is used to save any trademark notice/information for this font. Such information should be based on legal advice. This is distinctly separate from the copyright.
; ]

;===============================================
;== Set your files here for testing
font-dir: %//windows/fonts/
files: copy []
foreach file read font-dir [
    if %.ttf = suffix? file [
        append files file
    ]
]
;===============================================

integer-to-version: func [value [integer!]][
    add to integer! divide value 65536 divide (value and 65535) 10
]

; Note: positions passed to AT are 1-based, so each byte offset from the
; TrueType spec is shifted by one below (offset 4 -> position 5, etc.).
get-ttf-num: func [data offset length] [
    buff-to-num/big-endian copy/part at data offset length
]

font-names: copy []

foreach file files [
    tables: copy []
    data: read/binary join font-dir file

    ; Main table directory
    td: make table-directory []
    td/version:        get-ttf-num data 0 4
    td/num-tables:     get-ttf-num data 5 2
    td/search-range:   get-ttf-num data 7 2
    td/entry-selector: get-ttf-num data 9 2
    td/range-shift:    get-ttf-num data 11 2
;    print [
;        "version=" integer-to-version td/version
;        "num-tables=" td/num-tables
;        "search-range=" td/search-range
;        "entry-selector=" td/entry-selector
;        "range-shift=" td/range-shift
;    ]

    ; table directory entries for each table
    repeat i td/num-tables [
        tbl-pos: i - 1 * 16 + 13    ; 16 = struct length, 13 = offset from BOF
        entry: make table-directory-entry [
            tag:      to-word to-string copy/part at data tbl-pos + 0 4
            checksum: get-ttf-num data tbl-pos + 4 4
            offset:   get-ttf-num data tbl-pos + 8 4
            length:   get-ttf-num data tbl-pos + 12 4
        ]
        append tables entry
    ]
    ;foreach table tables [print [table/offset table/length table/tag]]

    ; Do a brute force search for the name table
    name-table-data: none
    foreach table tables [
        if table/tag = 'name [
            name-table-data: copy/part at data table/offset + 1 table/length
            break
        ]
    ]

    if name-table-data [
        ;print length? name-table-data
        name-tbl: make name-table [
            format:      get-ttf-num name-table-data 0 2
            num-records: get-ttf-num name-table-data 3 2
            offset:      get-ttf-num name-table-data 5 2
            ; + 1 because AT is 1-based and offset is a 0-based byte offset
            string-data: to-string copy at name-table-data offset + 1
        ]
        ;probe name-tbl

        repeat i name-tbl/num-records [
            ; 12 = struct length (six USHORT fields; the original post used 24,
            ; which steps over every other record), 7 = offset from table start
            tbl-pos: i - 1 * 12 + 7
            entry: make name-record [
                platform:      get-ttf-num name-table-data tbl-pos + 0 2
                encoding-id:   get-ttf-num name-table-data tbl-pos + 2 2
                language-id:   get-ttf-num name-table-data tbl-pos + 4 2
                name-id:       get-ttf-num name-table-data tbl-pos + 6 2
                string-length: get-ttf-num name-table-data tbl-pos + 8 2
                string-offset: get-ttf-num name-table-data tbl-pos + 10 2
                ; This is my extension for testing
                string-data: to-string copy/part at name-table-data 1 + name-tbl/offset + string-offset string-length
            ]
            append name-tbl/records entry
            ; Use English string for testing here
            ;   name-id 4 = full font name
            ;   language-id 0 = english
            if all [entry/name-id = 4  entry/language-id = 0] [
                append font-names entry/string-data
                print entry/string-data
            ]
        ]
    ]
]

halt
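
The script passes 1-based positions to AT, so every byte offset taken from the TrueType spec has to be shifted by one by hand (offset 4 becomes position 5, and so on). Here is a minimal sketch of one way to avoid that bookkeeping. It is not part of the original post: the helper name read-be and the sample bytes are made up for illustration, and it assumes, as the script itself does, that TO INTEGER! accepts a short big-endian binary value. It uses SKIP, which is 0-based, so the spec's offsets can be used unchanged.

```rebol
REBOL [Title: "read-be sketch"]

; Hypothetical helper (not in the original script): read a big-endian
; number of LENGTH bytes starting at a 0-based byte OFFSET.
read-be: func [data [binary!] offset [integer!] length [integer!]] [
    to integer! copy/part skip data offset length
]

; Made-up 12-byte Offset Table: version 1.0, 10 tables,
; search-range 128, entry-selector 3, range-shift 32.
sample: #{00010000000A008000030020}

print ["version        =" read-be sample 0 4]
print ["num-tables     =" read-be sample 4 2]
print ["search-range   =" read-be sample 6 2]
print ["entry-selector =" read-be sample 8 2]
print ["range-shift    =" read-be sample 10 2]
```

With a helper like this, the first Table Directory entry would start at offset 12 and the first NameRecord at offset 6 within the name table, matching the numbers in the spec rather than the 13 and 7 used with AT.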