Author: Sunanda
Version: 0.0.3
Date: 6-dec-2008
Purpose: to generate some metrics on a REBOL application to give some indication of how large / complex the application is.
These metrics are highly open to interpretation, so I suspect the early versions of this application are more likely to generate discussion than facts!
Copy the script to a folder.
You need do nothing more. You could edit the configuration object to meet your precise circumstances, but it is probably easier to run the script with the /config refinement.
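    do %application-sizer.r
    results: app-sizer/run
    probe results

This runs with the default configuration. Going by the defaults listed under Configuration, it counts all files that are in the current folder (%./), have the suffix .r, and are not application-sizer.r itself.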
All these options (and more) are configurable -- see Configuration below.
    do %application-sizer.r
    results: app-sizer/run/csv
    probe results
The default CSV file is called application-sizer.csv.
You can change several settings by supplying a config object:
    my-config: make object! [
        app-name: "website project"
        app-folders: [%/c/mystuff/ %/c/morestuff/]
    ]
    do %application-sizer.r
    results: app-sizer/run/config my-config
    probe results

In the example above, the config object changes the application name (app-name) and the folders to search (app-folders).

    do %application-sizer.r
    results: app-sizer/run/csv/config make object! [csv-file-name: %my-csv-file.txt]
    probe results
In the example above, a CSV file with the name my-csv-file.txt will be created (or extended if it already exists).
    ppf: func [
        folder-name [file!]
        file-name [file!]
        file-contents [string!]
    ][
        if find file-name "-test-" [return none]
        return file-contents
    ]
    do %application-sizer.r
    results: app-sizer/run/preprocess :ppf
    probe results
The example uses a preprocessing function to perform manipulations on the script file before it is processed.
The example excludes some files (those whose name contains the string "-test-") by returning none. For files it wants processed, it simply returns the supplied file contents unchanged.
The preprocess function may also return a block of strings. If so, each string will be counted as a separate file. This is a way of expanding one file to many (perhaps you actually hold the sources in a Source Control Facility -- if so, you can use the preprocess function to return them all).
Another use for preprocessing is to expand the source if you are using PREBOL or other include subsystems.
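The block-returning form described above can be sketched as follows. This is only an illustration under stated assumptions: the %archive/ subfolder and the name history-ppf are hypothetical, and it assumes folder-name ends in a slash and file-name is relative to it, which the description above does not confirm.

```rebol
; Hypothetical preprocess function: returns the current script plus an
; archived copy (if one exists) as a block of strings.
history-ppf: func [
    folder-name [file!]
    file-name [file!]
    file-contents [string!]
    /local versions archived
][
    ; Start with the contents application-sizer passed in
    versions: reduce [file-contents]
    ; ASSUMPTION: older copies live in an %archive/ subfolder
    archived: rejoin [folder-name %archive/ file-name]
    if exists? archived [append versions read archived]
    ; Each string in the returned block is counted as a separate file
    versions
]
```

It would be passed in the same way as ppf above, that is, app-sizer/run/preprocess :history-ppf.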
Application-sizer returns an object that looks like this:

    make object! [
        app-sizer-version: 0.0.2
        run-date: 4-Dec-2008
        app-name: "REBOL Library CGIs"
        app-version: "None"
        folders: 2
        files: 257
        raw-bytes: 2116471.0
        compressed-size: 440048
        raw-lines: 88589.0
        code-lines: 42621.0
        elements: make object! [
            string: [16005 227226.0]
            datatype: [2571 16878.0]
            number: [3609 5413.0]
            refinement: [1320 8672.0]
            function: [10814 64803.0]
            operator: [3257 3859.0]
            native: [15915 70030.0]
            action: [9914 54112.0]
            object: [170 3457.0]
            image: [0 0.0]
            comment: [14981 529475.0]
            body: [57781 614478.0]
            whitespace: [136337 637107.0]
        ]
        element-definitions: [
            "comment" [cmt]
            "datatype" [date! issue! money! pair! time! tuple!]
            "number" [decimal! integer!]
            "refinement" [refinement!]
            "string" [char! email! file! string! tag! url!]
        ]
    ]
The results object above, incidentally, comes from running the application against the cgi scripts that run the REBOL.org website. So the numbers are for a real application.
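Because the result is an ordinary object, its fields can be read with path notation and combined into rough ratios. The sketch below rebuilds only the scalar fields from the sample output so that it runs on its own. In real use, results would come from app-sizer/run. The choice of ratios is illustrative, not something the script itself reports.

```rebol
; Scalar fields copied from the sample results object above
results: make object! [
    files: 257
    raw-bytes: 2116471.0
    compressed-size: 440048
    raw-lines: 88589.0
    code-lines: 42621.0
]

; Average script size in bytes
print ["bytes per file:" results/raw-bytes / results/files]

; A lower value suggests more repetitive (less dense) source
print ["compressed / raw bytes:" results/compressed-size / results/raw-bytes]

; Share of lines that were counted as code
print ["code lines / raw lines:" results/code-lines / results/raw-lines]
```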
The various counts and numbers are:
folders: | count of the number of folders |
files: | total unique script files counted. Unique script files are identified by a checksum/secure test. So a script is only counted once even if it is present in more than one of your folders |
raw-bytes: | total length of all the scripts |
compressed-size: | length of compress all-scripts -- gives an indication of the density |
raw-lines: | total lines (as measured by length? read/lines) |
code-lines: | Total lines that are not:
|
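elements: | Counts of various elements of the scripts. The elements identified and counted are: string, datatype, number, refinement, function, operator, native, action, object, image, comment, body, whitespace |
element-definitions: | The definitions used when identifying the above elements. For example, an item is counted as a string if it is any of these datatypes: char!, email!, file!, string!, tag!, url! |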
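The descriptions of files, raw-bytes and compressed-size above can be illustrated with a small in-memory sketch. It is not taken from application-sizer.r: the variable names are hypothetical, and strings stand in for script files so the example runs without touching the disk. For a real file, raw-lines would be length? read/lines applied to that file.

```rebol
; Three in-memory "scripts"; the first and third are identical copies
scripts: [
    "print 1^/print 2^/"
    "print 3^/"
    "print 1^/print 2^/"
]

seen: copy []          ; checksums of scripts already counted
all-scripts: copy ""   ; concatenation of the unique scripts only

foreach script scripts [
    digest: checksum/secure script
    ; A script whose checksum has been seen before is skipped
    if not find seen digest [
        append seen digest
        append all-scripts script
    ]
]

files: length? seen
raw-bytes: length? all-scripts
compressed-size: length? compress all-scripts

print ["files:" files "raw-bytes:" raw-bytes "compressed-size:" compressed-size]
```

Keeping only the checksums in seen means that duplicate detection does not need to hold every script in memory for comparison.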
If you have a CSV file (created with the /csv refinement):
The columns are the same (and in the same order) as the results object described above. The only difference is that block elements are named by an -n suffix to the name.
It is not guaranteed that the columns will remain consistent between versions of application-sizer.r. (The version number of application-sizer.r is a column in the data). However, output from version 0.0.4 is compatible with output from 0.0.3.
The full set of configuration options is:
Option | Default | Purpose |
app-name: | "No name" | string: the name of your application |
app-version: | "None" | string: the version of your application |
app-folders: | [%./] | file or block of files: folder or folders to search. If more than one folder, supply as a block |
exclude-files: | [%application-sizer.r] | block of files: files to ignore |
source-suffixes: | [%.r] | block of files: what suffixes define a script |
add-header?: | false | logic: if the script cannot be loaded, should a REBOL [] header be added to a copy of it? |
minimal-chars: | "{}[]" | string: lines containing only these characters will not be counted as code lines |
csv-file-name: | %application-sizer.csv | file: csv file name -- if you are using the /csv refinement |
if a = b [ c: true ]
if a = b [ c: true ]
if a = b [c: true]
But this is 1 line:
if a = b [c: true]
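The effect of minimal-chars on the code-line count can be sketched as follows. This is a guess at the idea, not the script's actual logic: code-line? and count-code-lines are hypothetical helpers, and the sketch ignores comments, which the real script handles separately.

```rebol
minimal-chars: "{}[]"   ; the documented default

; Hypothetical helper: true if the line holds anything besides
; minimal-chars, spaces and tabs
code-line?: func [line [string!] /local rest] [
    rest: trim/with copy line join minimal-chars " ^-"
    not empty? rest
]

; Hypothetical helper: count the lines that qualify as code lines
count-code-lines: func [lines [block!] /local n] [
    n: 0
    foreach line lines [if code-line? line [n: n + 1]]
    n
]

; The same statement laid out over three lines, then on one line
print count-code-lines ["if a = b [" "    c: true" "]"]
print count-code-lines ["if a = b [c: true]"]
```

Under this sketch the three-line layout counts as 2 code lines, because the line holding only the closing bracket is skipped, and the one-line layout counts as 1.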
An application may have JavaScript on the client side and some C++ or other language on the server side. Just counting the REBOL code may miss important elements.
In the REBOL.org CGI scripts above, some things have been included when they should not, and perhaps vice versa:
REBOL.org makes use of many "library functions", i.e. scripts written for a different application and included in the REBOL.org code base. Examples include skimp.r and color-code.r. These are being counted towards the total application size when they really should not be.
I'm hoping a few people will publish their application sizes, and we'll get some discussion going. Watch this space!
Thanks to Dockimbel, Reichart, Robert Munch and Ashley Truter for suggestions that improve the metrics.
First release.