Documentation for: application-sizer.r
Created by: sunanda on: 1-Dec-2008
Last updated by: sunanda on: 7-Dec-2008
Format: text/editable
Downloaded on: 28-Mar-2024

[numbering-on
[contents

[asis
Author: Sunanda
Version: 0.0.3
Date: 6-dec-2008
asis]

[h2 Application sizer
[p **Purpose:** to generate some metrics on a REBOL application that give some indication of how large and complex the application is.
[p These metrics are highly open to interpretation, so I suspect the early versions of this application are more likely to generate discussion than facts!

[h2 Installation
[p Copy the script to a folder.
[p You need do nothing more. You **could** edit the **configuration** object to meet your precise circumstances, but it is probably easier to run the script with the **/config** refinement.

[h2 Running

[h3 Basic
[asis
do %application-sizer.r
results: app-sizer/run
probe results
asis]
[p This runs with the **default configuration**. It counts all files:
[li in the current folder
[li that have a **.r** suffix
[li and have a **REBOL [...]** header
[li It will not count any file called **application-sizer.r**
list]
[p All these options (and more) are configurable -- see **Configuration** below.

[h3 Creating a CSV file
[asis
do %application-sizer.r
results: app-sizer/run/csv
probe results
asis]
[p The default CSV file is called **application-sizer.csv**.
[li If it does not exist, it is created with a header row and one data row
[li If it does exist, a new data row is added to the end. So you can gather stats on the same application on different days, or on different applications, into the same spreadsheet
list]

[h3 With an override configuration
[p You can change several settings by supplying a **config** object:
[asis
my-config: make object! 
[
    app-name: "website project"
    app-folders: [%/c/mystuff/ %/c/morestuff/]
]
do %application-sizer.r
results: app-sizer/run/config my-config
probe results
asis]
[p In the example above, the config object changes:
[li the default application name (useful if you are using the **/csv** refinement to gather stats on several applications); and
[li the folders searched for scripts
list]

[h3 /config and /csv
[asis
do %application-sizer.r
results: app-sizer/run/csv/config make object! [csv-file-name: %my-csv-file.txt]
probe results
asis]
[p In the example above, a CSV file with the name **my-csv-file.txt** will be created (or extended if it already exists).

[h3 Preprocessing the files
[asis
ppf: func [
    folder-name [file!]
    file-name [file!]
    file-contents [string!]
][
    if find file-name "-test-" [return none]
    return file-contents
]
do %application-sizer.r
results: app-sizer/run/preprocess :ppf
probe results
asis]
[p The example uses a **preprocessing** function to perform manipulations on the script file before it is processed.
[p The example **excludes** some files (those whose name contains the string "-test-") by returning **none**. For files it wants processed, it simply returns the supplied file contents.
[p The **preprocess function** may also return a **block** of **strings**. If so, each string will be counted as a separate file. This is a way of expanding one file to many (perhaps you actually hold the sources in a **Source Control Facility** -- if so, you can use the **preprocess function** to return them all).
[p Another use for preprocessing is to expand the source if you are using **PREBOL** or other **include** subsystems.

[h2 Interpreting the results
[p Application-sizer returns an **object** that looks like this:
[asis
make object! [
    app-sizer-version: 0.0.2
    run-date: 4-Dec-2008
    app-name: "REBOL Library CGIs"
    app-version: "None"
    folders: 2
    files: 257
    raw-bytes: 2116471.0
    compressed-size: 440048
    raw-lines: 88589.0
    code-lines: 42621.0
    elements: make object! 
    [
        string: [16005 227226.0]
        datatype: [2571 16878.0]
        number: [3609 5413.0]
        refinement: [1320 8672.0]
        function: [10814 64803.0]
        operator: [3257 3859.0]
        native: [15915 70030.0]
        action: [9914 54112.0]
        object: [170 3457.0]
        image: [0 0.0]
        comment: [14981 529475.0]
        body: [57781 614478.0]
        whitespace: [136337 637107.0]
    ]
    element-definitions: [
        "comment" [cmt]
        "datatype" [date! issue! money! pair! time! tuple!]
        "number" [decimal! integer!]
        "refinement" [refinement!]
        "string" [char! email! file! string! tag! url!]
    ]
]
asis]
[p The results object above, incidentally, comes from running the application against the **cgi scripts** that run the REBOL.org website. So the numbers are for a real application.
[p The various counts and numbers are:
[table/style/width:80%
[row
[cell/class/lskey1 folders:
[cell/class/lsdata1 number of folders
[row
[cell/class/lskey2 files:
[cell/class/lsdata2 total unique **script** files counted. Unique script files are identified by a **checksum/secure** test, so a script is only counted once even if it is present in more than one of your folders
[row
[cell/class/*-3 raw-bytes:
[cell/class/*-3 total length of all the scripts
[row
[cell/class/*-3 compressed-size:
[cell/class/*-3 length of **compress all-scripts** -- gives an indication of the density
[row
[cell/class/*-3 raw-lines:
[cell/class/*-3 total lines (as measured by **length? read/lines**)
[row
[cell/class/*-3 code-lines:
[cell/class/*-3 Total lines that are not:
[li blank (only spaces)
[li comments (first non-whitespace character is **;**)
[li minimal (sole content is **]**, **[**, **{** or **}**)
[li minimal code plus comment (eg **] ;end of func**)
list]
[row
[cell/class/*-3 elements:
[cell/class/*-3 Counts of various **elements** of the scripts. 
The elements identified and counted are:
[li string
[li datatype
[li number
[li refinement
[li function
[li operator
[li native
[li action
[li object
[li image
[li comment
[li body
[li whitespace
list]
Each element has a block with two numbers:
[li 1. the **count** -- how many of them were found
[li 2. the **size** -- the total size of them
list]
[row
[cell/class/*-3 element-definitions:
[cell/class/*-3 The definitions used when identifying the above elements. For example, an element is a **string** if it is any of these datatypes:
[li char!
[li email!
[li file!
[li string!
[li tag!
[li url!
list]
If you do not like those definitions, you can change some of them by editing the **colors** block in the script
table]

[h2 CSV output file
[p If you create one of these with the **/csv** refinement:
[li it is a **comma delimited** text file (not tab delimited)
[li a row is **appended** each time you run the application (run date is a column in the data)
[li a header row is added when the file is created
[li you can specify your own CSV file by an entry in the **configuration**.
list]
[p The columns are the same (and in the same order) as the **results object** described above. The only difference is that columns derived from block elements are named with an **-n** suffix.
[p It is not guaranteed that the columns will remain consistent between versions of **application-sizer.r**. (The version number of **application-sizer.r** is a column in the data). However, output from version 0.0.4 is compatible with output from 0.0.3.
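[p A minimal sketch of reading the CSV file back into REBOL, for example to check what has been gathered so far. It is not part of **application-sizer.r**. It assumes the default file name, REBOL 2 (where **parse/all** with a string of delimiters splits a line into fields), and that the first two columns are the sizer version and the run date, as the column order described above implies. The words **rows**, **headers** and **fields** are only illustrative:
[asis
; Assumption: the file was created by an earlier app-sizer/run/csv
rows: read/lines %application-sizer.csv

; The first row is the header row
headers: parse/all first rows ","
print ["Columns:" length? headers]

; Every later row is one run of the sizer
foreach row next rows [
    fields: parse/all row ","
    ; Assumed order: sizer version first, run date second
    print [first fields second fields]
]
asis]
[p Because the column layout may change between versions, it is safer to look columns up by their name in the header row than to rely on fixed positions beyond these first few.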

[h2 Configuration
[p The full set of configuration options is:
[table
[row
[cell/class/lsh Option
[cell/class/lsh Default
[cell/class/lsh Purpose
[row
[cell/class/lskey1 app-name:
[cell/class/lsdata1 "No name"
[cell/class/lsdata1 **string**: The name of your application
[row
[cell/class/lskey2 app-version:
[cell/class/lsdata2 "None"
[cell/class/lsdata2 **string**: The version of your application
[row
[cell/class/*-5 app-folders:
[cell/class/*-5 [%./]
[cell/class/*-5 **file** or **block of files**: folder or folders to search. If more than one folder, supply as a block
[row
[cell/class/*-5 exclude-files:
[cell/class/*-5 [%application-sizer.r]
[cell/class/*-5 **block of files**: files to ignore
[row
[cell/class/*-5 source-suffixes:
[cell/class/*-5 [%.r]
[cell/class/*-5 **block of files**: What suffixes define a script
[row
[cell/class/*-5 add-header?:
[cell/class/*-5 false
[cell/class/*-5 **logic**: If the script cannot be loaded, should a **REBOL []** header be added to a copy of it?
[row
[cell/class/*-5 minimal-chars:
[cell/class/*-5 "{}[]"
[cell/class/*-5 **string**: lines containing **only** these characters will not be counted as code lines
[row
[cell/class/*-5 csv-file-name:
[cell/class/*-5 %application-sizer.csv
[cell/class/*-5 **file**: csv file name -- if you are using the **/csv** refinement
table]

[h2 Limitations -- practical
[li application-sizer is an **R2** product, not an **R3** one:
[list
[li it uses **load/header** not **transcode** to interpret code
[li each load/header may add words to **system/words**. In R2 this block has a finite size, so a large application may overflow it
list]
[li conflicts in the **needs:** header may stop you sizing an entire application in one go. For example, in a client/server application, some scripts may have a need for **View** while others need **Command**
list]
[li **comment [....]** blocks are counted as code. 
Comments in the counts refer only to partial lines that start with "**;**"
list]

[h2 Limitations -- philosophical

[h3 What is a line of code?
[p All of the following count as **2 lines**:
[asis
if a = b [
    c: true
]
asis]
[asis
if a = b
[
    c: true
]
asis]
[asis
if a = b
[c: true]
asis]
[p But this is **1 line**:
[asis
if a = b [c: true]
asis]

[h3 What is an application?
[p An application may have **JavaScript** on the client-side and some **C++** or other language on the server-side. Just counting the REBOL code may miss important elements.

[h3 What is the size of an application?
[p In the REBOL.org CGI scripts above, some things have been included when they should not, and perhaps vice versa:

[h3 Not counted
[li many HTML templates. If they had been coded in some sort of REBOL dialect, they would be counted
[li some images. These are hardly crucial for REBOL.org, but could be a major part of the application in other cases.
list]

[h3 Counted when it should not be
[p REBOL.org makes use of many "library functions" -- ie scripts written for a different application, and included in the REBOL.org code base. Examples include **skimp.r** and **color-code.r**. These are being counted towards the total application size, when they really should not be.

[h2 What does it all mean in practice?
[p I'm hoping a few people will publish their application sizes, and we'll get some discussion going. Watch this space!

[h2 Change log
[p Thanks to Dockimbel, Reichart, Robert Munch and Ashley Truter for suggestions that improve the metrics.

[h3 0.0.0 1-dec-2008
[p First release.

[h3 0.0.1 3-dec-2008
[li added **version** to results object
[li non-code lines counted differently.
[li initial value of **folders** to search is the **current folder**
[li added **exclude-list** for files that are not to be included
[li added **operator**, **action**, **native**, **object**, and **image** to the datatypes counted. (previously, these were all counted as **function**). 
list]

[h3 0.0.2 4-dec-2008
[li changed **Size counts** to decimal
[li added **/csv** refinement
[li added **/config** refinement
list]

[h3 0.0.3 4-dec-2008
[li added **/preprocess** refinement
[li changed **app-folders** in config: may be a single file, or a block of files
list]
[date