[REBOL] Re: Fast way to remove all non-numerical chars from a string
From: kpeters:otaksoft at: 24-Sep-2007 10:25
Wow - this seemingly "little" question really sparked some responses!
I like it when that happens because it really shows off the brilliance of
Rebol and the people mastering it.
All solutions will go into my library collection since they all shine in
their own way and I can learn from all of them - so I thank you all.
As you likely have guessed, I asked because I need to re-format phone numbers.
The vast majority of these will arrive formatted by various people according to what
they consider proper formatting - sometimes quite creative and riddled with typos as well.
At any time, I have to be prepared for the occasional complete junk string.
The numbers may reside in MySQL tables or in text files with one phone record (number
& address) per line. Each of these tables or text files will be processed exactly once
(as far as the standardizing goes) - speed is important but an extra handful of seconds
per file (containing between 500,000 and 1,000,000 numbers) won't hurt anybody.
The phone numbers are stored with a max of 15 characters each prior to processing - these
will be overwritten with a standardized phone number string if they contain a valid number
and be emptied otherwise.
For now, all phone numbers hail from North America - so valid lengths are
a) 7 digits - local number
b) 10 digits - area code included
c) 11 digits - leading 1 in front of area code
Here's the function logic I intend to use:
1) Lose all non-numerical characters from ph#-string
2) If length not in (7,10,11) return empty string because phone# is invalid
3) If length = 11 and first char = 1 then chop off first char // now only 2 possibilities
4) If length = 10 then
   frame the three leftmost digits with a pair of parentheses
   insert a '1' in front
5) Insert hyphen before fourth character from the end of string
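
To pin down the five steps above, here is a minimal sketch, written in Python purely for
illustration rather than Rebol. The function name standardize_phone is made up for this
example. Rejecting 11-digit strings that do not start with 1 is an assumption on my part,
since step 3 does not say what should happen to them.

```python
import re


def standardize_phone(raw: str) -> str:
    """Return a standardized North American number, or "" if invalid."""
    # Step 1: drop every character that is not an ASCII digit.
    digits = re.sub(r"[^0-9]", "", raw)

    # Step 2: only 7, 10 or 11 digits can be a valid number.
    if len(digits) not in (7, 10, 11):
        return ""

    # Step 3: strip the leading 1 from an 11-digit number.
    if len(digits) == 11:
        if digits[0] != "1":
            # Assumption: 11 digits without a leading 1 count as junk.
            return ""
        digits = digits[1:]

    # Steps 4 and 5 for a 10-digit number: parenthesize the area code,
    # put a 1 in front, and hyphenate before the last four digits.
    if len(digits) == 10:
        return f"1({digits[:3]}){digits[3:6]}-{digits[6:]}"

    # Step 5 for a 7-digit local number.
    return f"{digits[:3]}-{digits[3:]}"


if __name__ == "__main__":
    for sample in ["(604) 555-1234", "1-604-555-1234", "555 1234", "junk"]:
        print(repr(sample), "->", repr(standardize_phone(sample)))
```

The longest result this produces is 14 characters, so it still fits the 15-character
field mentioned above.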
Does this sound like a good strategy or are there other, maybe radically different (but
ways to do this?