Documentation for: application-sizer.r
Created by: sunanda
on: 1-Dec-2008
Last updated by: sunanda on: 7-Dec-2008
Format: html
Downloaded on: 8-May-2024

1. Application sizer
2. Installation
3. Running
3.1. Basic
3.2. Creating a CSV file
3.3. With an override configuration
3.4. /config and /csv
3.5. Preprocessing the files
4. Interpreting the results
5. CSV output file
6. Configuration
7. Limitations -- practical
8. Limitations -- philosophical
8.1. What is a line of code?
8.2. What is an application?
8.3. What is the size of an application?
8.4. Not counted
8.5. Counted when it should not be
9. What does it all mean in practice?
10. Change log
10.1. 0.0.0 1-dec-2008
10.2. 0.0.1 3-dec-2008
10.3. 0.0.2 4-dec-2008
10.4. 0.0.3 4-dec-2008
     Author: Sunanda
    Version: 0.0.3
       Date: 6-dec-2008
 

1. Application sizer

Purpose: to generate some metrics on a REBOL application to give some indication of how large / complex the application is.

These metrics are wide open to interpretation, so I suspect the early versions of this application are more likely to generate discussion than facts!

2. Installation

Copy the script to a folder.

You need do nothing more. You could edit the configuration object to meet your precise circumstances, but it is probably easier to run the script with the /config refinement.

3. Running

3.1. Basic

    do %application-sizer.r
    results: app-sizer/run
    probe results
 

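This runs with the default configuration. Going by the defaults listed under Configuration, it will count all files:
  • in the current folder (%./)
  • whose names end in .r
  • other than application-sizer.r itself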

All these options (and more) are configurable -- see Configuration below.

3.2. Creating a CSV file

    do %application-sizer.r
    results: app-sizer/run/csv
    probe results
 

The default CSV file is called application-sizer.csv.

3.3. With an override configuration

You can change several settings by supplying a config object:

    my-config: make object! [
                  app-name: "website project"
                  app-folders: [%/c/mystuff/ %/c/morestuff/]
                 ]
    do %application-sizer.r
    results: app-sizer/run/config my-config
    probe results
 

In the example above, the config object changes:
  • app-name
  • app-folders

3.4. /config and /csv

    do %application-sizer.r
    results: app-sizer/run/csv/config make object! [csv-file-name: %my-csv-file.txt]
    probe results
 

In the example above, a CSV file with the name my-csv-file.txt will be created (or extended if it already exists).

3.5. Preprocessing the files


    ppf: func [
        folder-name [file!]
        file-name [file!]
        file-contents [string!]
    ][
        if find file-name "-test-" [return none]
        return file-contents
    ]

    do %application-sizer.r
    results: app-sizer/run/preprocess :ppf
    probe results
 

The example uses a preprocessing function to perform manipulations on the script file before it is processed.

The example excludes some files (those whose name contains the string "-test-") by returning none. For files it wants processed, it simply returns the supplied file contents unchanged.

The preprocess function may also return a block of strings. If so, each string will be counted as a separate file. This is a way of expanding one file to many (perhaps you actually hold the sources in a Source Control Facility -- if so, you can use the preprocess function to return them all).

Another use for preprocessing is to expand the source if you are using PREBOL or other include subsystems.
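
As a sketch of the block-returning form described above, the hypothetical function below splits one file into several pieces at a marker line. The name split-ppf and the marker text are assumptions made for illustration; neither comes from application-sizer.r, and how the script treats pieces that lack a REBOL header is not covered here.

```rebol
; Hypothetical preprocess function: split one file into several
; "files" wherever an (assumed) marker line occurs.
split-ppf: func [
    folder-name [file!]
    file-name [file!]
    file-contents [string!]
    /local marker parts pos
][
    marker: ";-- next script --"    ; assumed separator, not a REBOL convention
    parts: copy []
    while [pos: find file-contents marker] [
        ; keep everything before the marker as one piece
        append parts copy/part file-contents pos
        ; carry on just after the marker
        file-contents: skip pos length? marker
    ]
    append parts copy file-contents
    parts    ; a block of strings: each should be counted as a separate file
]

; The sample string holds one marker, so two pieces come back.
probe length? split-ppf %./ %demo.r "a: 1^/;-- next script --^/b: 2"
```

It would be passed in the same way as ppf in the example above, that is, with app-sizer/run/preprocess :split-ppf.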

4. Interpreting the results

Application-sizer returns an object that looks like this:

make object! [
     app-sizer-version: 0.0.2
     run-date: 4-Dec-2008
     app-name: "REBOL Library CGIs"
     app-version: "None"
     folders: 2
     files: 257
     raw-bytes: 2116471.0
     compressed-size: 440048
     raw-lines: 88589.0
     code-lines: 42621.0
     elements: make object! [
         string: [16005 227226.0]
         datatype: [2571 16878.0]
         number: [3609 5413.0]
         refinement: [1320 8672.0]
         function: [10814 64803.0]
         operator: [3257 3859.0]
         native: [15915 70030.0]
         action: [9914 54112.0]
         object: [170 3457.0]
         image: [0 0.0]
         comment: [14981 529475.0]
         body: [57781 614478.0]
         whitespace: [136337 637107.0]
     ]
     element-definitions: [
     "comment" [cmt]
     "datatype" [date! issue! money! pair! time! tuple!]
     "number" [decimal! integer!]
     "refinement" [refinement!]
     "string" [char! email! file! string! tag! url!]
     ]
 ]
 

The results object above, incidentally, comes from running the application against the CGI scripts that run the REBOL.org website, so the numbers are for a real application.
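
Individual values can be read from the results object with ordinary path notation. The sketch below hard-codes a cut-down copy of the sample results so that it runs without application-sizer.r. The names sample and percent, and the choice of ratios, are illustrative assumptions rather than anything the script itself reports.

```rebol
; Cut-down copy of the sample results above (values taken from that listing).
sample: make object! [
    raw-bytes: 2116471.0
    compressed-size: 440048
    raw-lines: 88589.0
    code-lines: 42621.0
    elements: make object! [
        comment: [14981 529475.0]
    ]
]

; Hypothetical helper: PART as a percentage of WHOLE, rounded to a whole number.
percent: func [part [number!] whole [number!]] [
    to-integer (0.5 + (100 * part / whole))
]

print ["compressed size as % of raw bytes:" percent sample/compressed-size sample/raw-bytes]
print ["code lines as % of raw lines:" percent sample/code-lines sample/raw-lines]

; Each element is a block: first item is the count, second is the size.
print ["comments found:" first sample/elements/comment]
print ["comment size:" second sample/elements/comment]
```

For the sample numbers this gives roughly 21% for the compression ratio and 48% for the share of lines that count as code.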

The various counts and numbers are:

folders: count of the number of folders
files: total unique script files counted. Unique script files are identified by a checksum/secure test, so a script is counted only once even if it is present in more than one of your folders
raw-bytes: total length of all the scripts
compressed-size: length of compress all-scripts -- gives an indication of the density
raw-lines: total lines (as measured by length? read/lines)
code-lines: Total lines that are not:
  • blank (only spaces)
  • comments (first non-whitespace character is ;)
  • minimal (sole content is ], [, { or })
  • minimal code plus comment (eg ] ;end of func)
elements: Counts of various elements of the scripts. The elements identified and counted are:
  • string
  • datatype
  • number
  • refinement
  • function
  • operator
  • native
  • action
  • object
  • image
  • comment
  • body
  • whitespace
Each element has a block with two numbers:
  • 1. the count -- how many of them found
  • 2. the size -- total size of them
element-definitions: The definitions used when identifying the above elements. For example, it is a string if it was any of these datatypes:
  • char!
  • email!
  • file!
  • string!
  • tag!
  • url!
If you do not like those definitions, you can change some of them by editing the colors block in the script
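
The files and compressed-size entries above can be illustrated with a minimal sketch. It works on three in-memory strings rather than real files, and the word names (texts, seen, digest, all-scripts) are assumptions for illustration; the real script may assemble its totals differently.

```rebol
; Three in-memory "scripts"; the first and last are identical.
texts: ["print 1" "print 2" "print 1"]

seen: copy []           ; checksums already met
all-scripts: copy ""    ; concatenation of the unique scripts

foreach text texts [
    digest: checksum/secure text        ; secure checksum, returned as binary!
    if not find seen digest [
        append seen digest              ; first time this content is seen
        append all-scripts text
    ]
]

print ["files:" length? seen]
print ["raw-bytes:" length? all-scripts]
print ["compressed-size:" length? compress all-scripts]
```

The duplicate string is skipped, so two files and 14 raw bytes are reported. On input this small the compressed size says little; it becomes meaningful only on real code bases.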

5. CSV output file

If you have one of these (created with the /csv refinement), its columns are the same, and in the same order, as the fields of the results object described above. The only difference is that block elements are named by an -n suffix to the name.

It is not guaranteed that the columns will remain consistent between versions of application-sizer.r. (The version number of application-sizer.r is a column in the data). However, output from version 0.0.4 is compatible with output from 0.0.3.

6. Configuration

The full set of configuration options is:

Option            Default                   Purpose
app-name:         "No name"                 string: the name of your application
app-version:      "None"                    string: the version of your application
app-folders:      [%./]                     file or block of files: folder or folders to search. If more than one folder, supply as a block
exclude-files:    [%application-sizer.r]    block of files: files to ignore
source-suffixes:  [%.r]                     block of files: what suffixes define a script
add-header?:      false                     logic: if the script cannot be loaded, should a REBOL [] header be added to a copy of it?
minimal-chars:    "{}[]"                    string: lines containing only these characters will not be counted as code lines
csv-file-name:    %application-sizer.csv    file: CSV file name, if you are using the /csv refinement
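
As an illustration, an override object touching several of these options might look like the fragment below. The option names come from the list above, but every value is made up for the example (the folders, the extra excluded file, the .cgi suffix, and the added parentheses in minimal-chars), and it has not been checked against a particular version of the script.

```rebol
; Illustrative values only: adjust to your own application.
my-config: make object! [
    app-name: "website project"
    app-version: "1.2"
    app-folders: [%/c/mystuff/ %/c/morestuff/]
    exclude-files: [%application-sizer.r %scratch.r]
    source-suffixes: [%.r %.cgi]
    minimal-chars: "{}[]()"
]

do %application-sizer.r
results: app-sizer/run/config my-config
probe results
```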

7. Limitations -- practical

8. Limitations -- philosophical

8.1. What is a line of code?

All of the following count as 2 lines:
     if a = b [
         c: true
     ]
 
     if a = b
        [
         c: true
     ]
 
     if a = b
         [c: true]
 
But this is 1 line:
     if a = b [c: true]
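
The counts above follow from the code-lines rules given under Interpreting the results. A minimal sketch of such a rule is shown below. The name code-line? is hypothetical, and the real script may well classify lines differently: for example, this sketch treats any ; as the start of a comment, even inside a string.

```rebol
; Hypothetical helper, not taken from application-sizer.r.
; Returns true when LINE would count as a code line under the rules above.
code-line?: func [
    line [string!]
    /local text pos
][
    text: trim copy line                    ; drop leading/trailing whitespace
    if empty? text [return false]           ; blank line
    if #";" = first text [return false]     ; comment line
    if pos: find text #";" [                ; naive: strip a trailing comment
        text: trim copy/part text pos
    ]
    foreach ch text [
        ; anything other than a minimal character or a space makes it code
        if not find "{}[] " ch [return true]
    ]
    false                                   ; only minimal characters remain
]

count: 0
foreach line ["if a = b [" "    c: true" "]"] [
    if code-line? line [count: count + 1]
]
print count
```

Run on the first layout above, this prints 2: the closing ] is minimal and is not counted.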
 

8.2. What is an application?

An application may have JavaScript on the client-side and some C++ or other language on the server-side. Just counting the REBOL code may miss important elements.

8.3. What is the size of an application?

In the REBOL.org CGI scripts above, some things have been included when they should not, and perhaps vice versa:

8.4. Not counted

8.5. Counted when it should not be

REBOL.org makes use of many "library functions" -- i.e. scripts written for a different application, and included in the REBOL.org code base. Examples include skimp.r and color-code.r. These are being counted towards the total application size, when they really should not be.

9. What does it all mean in practice?

I'm hoping a few people will publish their application sizes, and we'll get some discussion going. Watch this space!

10. Change log

Thanks to Dockimbel, Reichart, Robert Munch and Ashley Truter for suggestions that improve the metrics.

10.1. 0.0.0 1-dec-2008

First release.

10.2. 0.0.1 3-dec-2008

10.3. 0.0.2 4-dec-2008

10.4. 0.0.3 4-dec-2008

Last updated: 7-Dec-2008