World: r3wp
[Dialects] Questions about how to create dialects
btiffin 19-Sep-2006 [120x2] | Hey, thanks for the interest Volker. And Gregg thanks for the comments. But I'm still kinda stuck on an editor dialect that could handle random text in a parse block. Even the nifty fed above will break on "p $10,000,000" as rebol can't quite form a money value out of $10,000,000. So the basic question exists. Is there a way to block parse random potentially unloadable text? |
Sorry Volker, missed the lineno arg to the p command. Nice. | |
Anton 20-Sep-2006 [122] | Brian, I think you would have to write your own load-parser that can, first of all, match brackets [] {} "", including escaping strings, "^"", load well-formed rebol values, etc., as well as handle non-matching brackets and ill-formed rebol values. This is a tough job. The essence of the problem is that if the file is not fully loadable, then you can't be sure that *any* part of it that you might be looking at is a properly formed value. For example, if there is one extra unmatched } at the end of the file, does that imply that the whole lot was supposed to be a string? Would that mean that all the prior text shouldn't be treated as individual values then? |
Gregg 21-Sep-2006 [123] | My preference is to use block parsing whenever possible, and trap errors so you can warn if something isn't valid. Block parsing is just so much more powerful than string parsing, it's hard to give it up. Of course, there are improvements I would like to see, so more "normal" text can load successfully; things like 50%, or 33MPH. Maybe R3 and custom datatypes will offer something in the area of expanded lexical forms. In any case, we should identify the most important things that don't load today and see if RT can do something about them. :-) |
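The "trap errors so you can warn" approach Gregg describes can be sketched roughly like this. It is a minimal sketch assuming REBOL 2; `safe-load` is a hypothetical helper name, not something posted in the thread. ATTEMPT returns none when the enclosed code raises an error, so the caller can warn instead of stopping.

```rebol
; safe-load is a hypothetical helper, not part of REBOL itself.
; LOAD/ALL always returns a block; ATTEMPT turns a load error into none.
safe-load: func [text [string!]] [
    attempt [load/all text]
]

; "p ]" has an unmatched bracket, so LOAD raises a syntax error on it.
foreach line ["p 10" "p ]"] [
    either block? data: safe-load line [
        print ["loaded:" mold data]
    ][
        print ["cannot load:" mold line]
    ]
]
```

By btiffin's account above, a line such as "p $10,000,000" would take the same "cannot load" branch.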
Maxim 21-Sep-2006 [124] | funny, in my experience, I find it easier in many cases to do a hybrid model. one where you load the string into some block you can then more easily parse. There are many kinds of real-world data which are not easily loadable in REBOL and in cases where you must make a dialect over some outside data... blocks are rarely usable. |
Gregg 21-Sep-2006 [125] | Agreed. What I often do is "parse" the data, using standard string manipulation, not really thinking of it as a dialect, and then write a dialect against the loaded data. |
Maxim 21-Sep-2006 [126x2] | exactly. |
the hybrid model is much easier to "Get things done" | |
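One way to picture the hybrid model: a string-parse pass splits the text into tokens and loads what it can, then a block-parse pass runs over the result. This is only a sketch assuming REBOL 2 (PARSE/ALL); `tolerant-load`, `whitespace` and `non-space` are hypothetical names, and unloadable tokens are simply kept as strings.

```rebol
whitespace: charset " ^-^/"
non-space: complement whitespace

; Hypothetical helper: tokenize with string parsing, LOAD each token,
; and fall back to the raw string when LOAD raises an error.
tolerant-load: func [text [string!] /local out token value] [
    out: copy []
    parse/all text [
        any [
            some whitespace
            | copy token some non-space (
                either error? try [value: load token] [
                    append out token          ; not loadable: keep the raw text
                ][
                    append/only out value     ; loadable: keep the REBOL value
                ]
            )
        ]
    ]
    out
]

; Second pass: ordinary block parsing over the tolerant result.
cmd: tolerant-load "p 10 ]"
probe cmd
probe parse cmd ['p integer! string!]
```

Here the stray "]" cannot be loaded on its own, so it stays a string and the block rule can still match it with string!.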
btiffin 21-Sep-2006 [128x2] | So the pros don't seem to mind a string parse followed by a block parse? |
What would be the implications of LOAD converting unmatchable types to string! behind the scene? Or perhaps a new unstring! type to keep other dialecting more concise? | |
Maxim 21-Sep-2006 [130x5] | that's how I handle XML for example, a modified version of xml2rebxml.r . I convert the tags into native rebol blocks, then handle the blocks within the application which expects the data... this way, you can more easily re-use code. |
I have recently discovered the Issue! datatype. | |
and use it for many in-between values. | |
it's like a power word, which accepts anything but a space. | |
(almost anything) | |
btiffin 21-Sep-2006 [135] | Ahh, just add a #. Nice. |
Maxim 21-Sep-2006 [136x3] | using string as a temporary data holder is ok too, as long as you can be sure the dataset will not get mixed up in a temporary value or actual string data. either by its structure or content. |
funny... I had been using rebol for 9 years and didn't know about that datatype... yet it's extremely useful | |
you just have to be careful not to mix up other # using REBOL notation like char! (#"^/" ) or the serialized data format (#[none] ) | |
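A small sketch of the issue! trick and the look-alike notations Maxim warns about, assuming REBOL 2. The `speed` keyword and the `raw` word are made up for illustration.

```rebol
; Values that all start with # but are different datatypes
probe type? #33MPH        ; issue!
probe type? #"a"          ; char!
probe type? #{FF}         ; binary!
probe type? #[none]       ; none! (serialized form)

; Prefixing # lets text like 33MPH sit in a block as an issue!,
; so block parsing can pick it up by datatype.
data: [speed #33MPH]
if parse data ['speed set raw issue!] [
    print ["raw text:" to string! raw]
]
```

The dialect can then convert the issue's text to whatever unit or number it needs.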
btiffin 21-Sep-2006 [139] | I can't say I've been 'using' Rebol for long, but I've been playing for quite a while now. I discover something new every time I open up the system. It's too cool how RT has put something as wide and deep as the ocean into a cup, a cup with blinking lights no less. |
Maxim 21-Sep-2006 [140x2] | hehe even if the cup has a few cracks ;-) |
I feel it's the most productive language out there. not in how powerful it CAN get but in how productive it IS from the very first time you use it. | |
btiffin 21-Sep-2006 [142] | Yep, that's what turned me all evangelical. |
Gregg 22-Sep-2006 [143] | Issues can actually contain spaces, but they don't parse or mold that way. i.e. the datatype can hold them, but the lexical form doesn't allow it. Meaning you can get bitten, but do tricky things. :-)
>> a: #This issue has spaces in it
** Script Error: issue has no value
** Near: issue has spaces in it
>> a: to issue! "This issue has spaces in it"
== #This
>> probe a
#This
== #This
>> b: to string! a
== "This issue has spaces in it" |
Maxim 22-Sep-2006 [144x2] | well, that holds for words to btw ;-) |
to = too | |
Ladislav 22-Sep-2006 [146] | just an alternative form:
>> form #[issue! "aaaaa aaa"]
== "aaaaa aaa" |
Jerry 31-Oct-2006 [147x2] | Is the dialect concept original in REBOL? Or is it from another language? Does any other language have this concept too? |
What makes a good dialect? Does anyone have any rules to share with us? | |
Graham 31-Oct-2006 [149] | DSLs have been around for a long time. |
Gregg 31-Oct-2006 [150x2] | A "true" dialect in REBOL follows REBOL lexical form--i.e. you use block parsing--which is what would be called an embedded DSL in other languages. The concept is often associated with Lisp and its descendants. REBOL takes it further, and makes it easier (IMHO). |
What makes a good dialect? That's a hard question to answer. What makes a good GPL (General Purpose Language)? There is no formula I know of, but I would say it should be:
* Focused. *Domain* specific is the key. If you don't know the domain, it will be hard to get it right.
* Well thought out and refined. Don't just take the first pass and call it good. Like a writer, think about the words you choose and how they're put together.
* Small. Think about how the language will grow, but don't try to put too much in it. | |
Jerry 31-Oct-2006 [152] | Thanks, Gregg. It's very helpful. DSL stands for Domain-Specific Language, right? http://en.wikipedia.org/wiki/Domain-specific_language |
Gabriele 31-Oct-2006 [153] | yes |
Geomol 31-Oct-2006 [154x2] | As mentioned, you can parse in two different ways in REBOL: string parsing and block parsing. Recently (after using REBOL for years!!! Yes, you always keep discovering new things in REBOL.), I have started to think about the two different ways of parsing before I make a dialect. Which way you choose is rather crucial when creating a dialect. String parsing is good for dialects where you allow the user to type almost anything ... where you give lots of freedom. Block parsing is good when you want the rules to be more narrow ... when you want the user to think in terms of works and symbols. Most recently I made the math dialect for NicomDoc. I chose string parsing, giving lots of freedom. The dialect ended up specifying presentation more than semantics. The dialect is good for producing the formulas just like you want them visualized. If (when?) I make a math dialect where I put weight on the semantics (the meaning of the mathematical symbols), I would choose block parsing. |
*terms of works and symbols* = terms of words and symbols | |
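To make the two parsing modes concrete, here is the same tiny command handled both ways. A minimal sketch assuming REBOL 2; the `move` command and the `amount` word are invented for illustration.

```rebol
digit: charset "0123456789"

; String parsing: rules match characters, and the captured result is text.
probe parse/all "move 10" ["move" " " copy amount some digit]
probe amount    ; "10"

; Block parsing: rules match already-loaded values by word and datatype.
probe parse [move 10] ['move set amount integer!]
probe amount    ; 10
```

The string rule has to spell out spacing and digits itself, while the block rule gets an integer! for free but only works on input that LOAD accepts.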
xavier 13-Jan-2007 [156] | . |
Chris 10-Jun-2007 [157x4] | The next requirement for 'Filtered Import' <http://www.rebol.org/cgi-bin/cgiwrap/rebol/documentation.r?script=filtered-import.r> is support of depth: |
import [
    name ["Chris" "RG"]
    address [
        street "19th Terrace"
        town "Birmingham"
        zip 35205
    ]
][
    name: block! [string! length is more-than 2 string!]
    address: block! [
        street: string!
        apt: opt string!
        town: string!
        zip: issue! [5 digit opt "-" 4 digit] else "Must have a valid US zip code"
    ]
]
== [
    name ["Chris" "RG"]
    address [
        street "19th Terrace"
        apt none
        town "Birmingham"
        zip #35205
    ]
] | |
I'm not quite sure how to pan this out. Also, the 'name rule doesn't have any set words, it is operating on an unnamed series. I think I want this type of rule to match the content. In that if [string! string!] does not exactly describe the content, 'name throws a bad-format error. | |
But this target is achievable, there are some clear patterns. And it means that 'Filtered Import' can process more complex Rebol data (though not objects), basically JSON-class data. | |
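The nested-validation idea can be approximated with plain block parsing and INTO. This is not how Filtered Import is implemented; it is a sketch assuming REBOL 2, and `record-rule`, `address-rule`, `zip-rule`, `zip-value` and `valid-record?` are all hypothetical names.

```rebol
digit: charset "0123456789"
zip-rule: [5 digit opt ["-" 4 digit]]

; Named values: each keyword is followed by a value of the expected type.
address-rule: [
    'street string!
    opt ['apt string!]
    'town string!
    'zip set zip-value [integer! | issue! | string!]
    end
]

; INTO descends into a nested block; the unnamed 'name series
; must be exactly two strings.
record-rule: [
    'name into [2 string! end]
    'address into address-rule
]

valid-record?: func [data [block!]] [
    either all [
        parse data record-rule                ; structure and datatypes
        parse/all form zip-value zip-rule     ; zip format, checked as text
    ] [true] [false]
]

probe valid-record? [
    name ["Chris" "RG"]
    address [street "19th Terrace" town "Birmingham" zip 35205]
]
probe valid-record? [
    name ["Chris"]
    address [street "19th Terrace" town "Birmingham" zip 35205]
]
```

The first record passes and the second fails on the 'name block. Reporting which block failed, as discussed in the thread, would need extra bookkeeping that this sketch leaves out.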
Gregg 11-Jun-2007 [161] | Nice Chris. If you can nest named and unnamed value blocks, what you say seems logical, that the parent block is given as the error location. Why do you use literal bitset values, and have the human-friendly format of charsets just as a comment in the code? |
Chris 11-Jun-2007 [162] | You mean 'digit vs 'chars-n? I've been using the latter for some time, mainly for consistency. I'm going to migrate to more common names where there is a precedent. |
Gregg 12-Jun-2007 [163] | I mean you have:
comment {[
    chars-n: charset [#"0" - #"9"] ; numeric
    ....
But then the code actually uses:
chars-n: #[bitset! 64#{AAAAAAAA/wMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=}]
Why have the second one, the #[bitset!] syntax, at all? |
Chris 12-Jun-2007 [164x2] | I assumed the literal was faster to start... |
Albeit, I'm not a master at measuring such things... | |
Gregg 12-Jun-2007 [166] | It will definitely be faster (about 8x here), but either one is so fast, that the difference is insignificant, unless you're doing it 10,000 times, then you'll save .2 seconds or so. |
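A rough way to check this kind of claim yourself, assuming REBOL 2. `time-it`, `digits-literal` and `digits-built` are hypothetical names, and the timing will vary by machine, so no figure is claimed here.

```rebol
; The serialized literal is built once, when the source is loaded.
digits-literal: #[bitset! 64#{AAAAAAAA/wMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=}]

; CHARSET builds the same kind of bitset at run time.
digits-built: charset [#"0" - #"9"]

; Both should answer membership tests for digits the same way.
probe find digits-literal #"5"
probe find digits-built #"5"

; Hypothetical helper: run CODE n times and return the elapsed time.
time-it: func [n [integer!] code [block!] /local start] [
    start: now/precise
    loop n code
    difference now/precise start
]

print ["10,000 charset calls took:" time-it 10000 [charset [#"0" - #"9"]]]
```

Keeping the readable CHARSET form costs only that construction time, paid once per definition.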
Geomol 23-Jun-2007 [167x2] | Gregg wrote (in group Rebol vs Scheme): I would *love* to see mini-primers on language design for Lisp, Forth, Logo, etc. in REBOL. I've taken the first step for a BASIC dialect:
do http://www.fys.ku.dk/~niclasen/rebol/basic.r
It only knows a few commands so far: auto list new old
And these statements: end goto print rem run
And these functions: cos sin |
Example of use:
BASIC
>auto
10 print "Hello World!"
20 0
>run
Hello World!
>list
10 print "Hello World!"
> | |
Gregg 24-Jun-2007 [169] | Very cool John. Now, let me throw another thought into the mix, just for fun. If you were to write a language interpreter long ago, you would do it in a low level language like ASM or, later, C. In those languages you didn't have high level constructs like we have in REBOL. Certain languages have very specific models; consider Lisp and Forth, each has a few core definitions and the rest of the language is built on those. Lisp has lists, Forth has blocks, etc. With REBOL, we can do things in many ways.
1) Leverage all REBOL has to offer. For example, how hard would it be to write a simple Lisp system if you (basically) use blocks for lists and supply a few standard Lisp functions? Is eval'ing a Lisp paren/list different than DOing a REBOL block?
2) Write lower level code, simulating how you would have to write a language using something like C or ASM. You could go as far as writing a simple virtual machine with its own set of ops.
3) Write dialects that are designed for building specific kinds of languages, showing the core concepts of languages, where they're similar, and where they differ; tools for teaching language design.
I think all of those approaches have something to offer. |
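As a taste of approach 1, a prefix-notation evaluator over nested REBOL blocks fits in a few lines. This is only a toy sketch, not a Lisp: `lisp-eval` is a hypothetical name, it knows just + and *, and it treats anything that is not a block as a literal value.

```rebol
lisp-eval: func [expr /local op args total] [
    if not block? expr [return expr]        ; atoms evaluate to themselves
    op: first expr                          ; operator word, e.g. + or *
    args: copy []
    expr: next expr
    while [not tail? expr] [
        append args lisp-eval first expr    ; evaluate operands recursively
        expr: next expr
    ]
    switch op [
        + [
            total: 0
            while [not tail? args] [total: total + first args  args: next args]
            total
        ]
        * [
            total: 1
            while [not tail? args] [total: total * first args  args: next args]
            total
        ]
    ]
]

probe lisp-eval [+ 1 [* 2 3]]    ; 7
```

The difference from DOing the block is that DO would apply REBOL's own evaluation order to `+ 1 [* 2 3]`, whereas here the block is treated purely as data and the evaluator decides what each form means.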