OT: Competition - Is this solvable
[1/6] from: robert::muench::robertmuench::de at: 24-Nov-2007 12:59
Hi all, yesterday I had a situation where I had to hunt a difference
between some numbers and I thought if this problem can be solved in
general. Here is the situation:
You have one sum: 41695,83
And a set of numbers: 6594,14 + 981,75 + 8747,39 + 1457,90 + 15114,65
The sum you have doesn't match the sum of the single numbers.
Now the question is: What (minimal) typos in the set of numbers leads to
the sum you have?
I find this an interesting question. If it's possible to derive a set of
possible typos these sets could be ranked by some propability. And than it
would be possible to check differences in data sets back to typos.
I hope it's clear what I mean. So, has anyone any idea if this is
solvable? Robert
[2/6] from: ale870::gmail at: 24-Nov-2007 16:02
I didn't tried, but I think you could use the same algorithm used to convert
a decimal valu into binary.
Basically, you have a number to be converted (example 122).
These are the minimal needed steps:
binary number weight are (8 bits): 128 64 32 16 8 4 2 1
(Your numbers are (from upper to lowest): 15114,65 // 8747,39 // 6594,14 //
1457,90 // 981,75
Apply the standard conversion algorithm to get the bit to be set to "1".
Maybe you can do the same with your problem...
On Nov 24, 2007 12:59 PM, Robert M. Münch <robert.muench-robertmuench.de>
wrote:
> Hi all, yesterday I had a situation where I had to hunt a difference
> between some numbers and I thought if this problem can be solved in
<<quoted lines omitted: 12>>
> To unsubscribe from the list, just send an email to
> lists at rebol.com with unsubscribe as the subject.
--
//Alessandro
http://sguish.wordpress.com
http://laccio.wordpress.com
[3/6] from: Tom::Conlin::gmail::com at: 24-Nov-2007 11:28
Hi Robert
Perhaps not surprisingly I see this as an analog of of
a common problem in bioinformatics, where you want to
know if stretches of DNA are related.
this is a dynamic programming problem with penalties for edits
the solution with the smallest edit distance is your goal.
The penalties for particular edits can be empirically determined
if you have a large set that has been solved or you could come up with a
with a scoring matrix wit the penalty for one symbol to become another
(or a gap/insertion).
basic local alignment search tool
BLAST , "smith waterman algorithm"
are some search terms that come to mind
Robert M. M=FCnch wrote:
[4/6] from: dhsunanda::gmail::com at: 23-Nov-2007 16:57
Robert:
> I hope it's clear what I mean. So, has anyone any idea if this is
> solvable? Robert
Interesting....I think so!
No time to write the code (so my hand-waving solution is
untested), but I'd say:
You are clearly 8800 short of the desired target:
t: 41695.83
s: 6594,14 + 981,75 + 8747,39 + 1457,90 + 15114,65 + 8800
== 41695.83
So, for *minimal* typos, you need to change an 0-->8 or an 1-->9
in the hundreds column and thousands column of one or two of the
given numbers. That will affect the minimum digits: ie just two.
There are several ways to do that, easily eyeballable, eg:
s: 6594,14 + 981,75 + 8747,39 + 9457,90 + 15914,65 + 0000
s: 6594,14 + 8981,75 + 8747,39 + 1457,90 + 15914,65 + 0000
One instance changes two 1's to 9s. The other adds a leading 8 to
one of the givens.
If that had not worked, it gets more interesting: at which point,
I'm off out :-) But, following Tom's comment, perhaps take a look
at simetrics.r in the REBOL.org Script library -- it has
algorithms that can help rate how close strings are to each other.
Sunanda
[5/6] from: robert:muench:robertmuench at: 25-Nov-2007 18:25
On Fri, 23 Nov 2007 17:57:37 +0100, Sunanda <dhsunanda-gmail.com> wrote:
> No time to write the code (so my hand-waving solution is
> untested), but I'd say:
>
> You are clearly 8800 short of the desired target:
Hi, that's right.
> t: 41695.83
> s: 6594,14 + 981,75 + 8747,39 + 1457,90 + 15114,65 + 8800
<<quoted lines omitted: 5>>
> s: 6594,14 + 981,75 + 8747,39 + 9457,90 + 15914,65 + 0000
> s: 6594,14 + 8981,75 + 8747,39 + 1457,90 + 15914,65 + 0000
Here is the typo that caused the error:
6594,14+9781,75+8747,39+1457,9+15114,65
ans = 41695,83
9781,75 - 981,75
ans = 8800
It was an additional 7 inserted into the 981,75 value.
> One instance changes two 1's to 9s. The other adds a leading 8 to
> one of the givens.
I think one approach might be to calculate the offset and than look at
each number step by step to see what needs to be changed to get the offset
as difference. The assumption in this case is, that there is only one typo.
> If that had not worked, it gets more interesting: at which point,
> I'm off out :-)
come on ;-)
> But, following Tom's comment, perhaps take a look
> at simetrics.r in the REBOL.org Script library -- it has
> algorithms that can help rate how close strings are to each other.
I thought about this too, but this requires a reference string/sequence
which is not available in our case. Robert
[6/6] from: Tom::Conlin::gmail::com at: 25-Nov-2007 12:27
Robert M. M=FCnch wrote:
> On Fri, 23 Nov 2007 17:57:37 +0100, Sunanda <dhsunanda-gmail.com> wrote:
>> No time to write the code (so my hand-waving solution is
<<quoted lines omitted: 34>>
> I thought about this too, but this requires a reference string/sequence
> which is not available in our case. Robert
in biology things not exactly the same are compared all the time
say DNA and protein. this requires translation according to biological
rules just as your missing reference strings require a rearrangement
according to arithmetic rules.
looking for a single mutation/ single error in one of the givens
given + delta = reference
where delta is derived from the 'sum' which is your overall reference
for this problem.
sum: 41695.83
val: [6594.14 981.75 8747.39 1457.90 15114.65]
valsum: 0
forall val [valsum: valsum + first val]
print valsum
print delta: sum - valsum
forall val[
candidate: round/to (delta + first val) .01
edit: difference form first val form candidate
probe reduce[first val candidate edit]
]
using a better discriminator than difference will get you a better edit
but the idea is the same.
if it is not a single edit then it grows quickly but the process is the
same. distribute the delta over all possible combinations and look for
cheap edits.
Notes
- Quoted lines have been omitted from some messages.
View the message alone to see the lines that have been omitted