Mailing List Archive: 49091 messages
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

Multiuser database

 [1/23] from: fergus4:bellatlantic at: 19-Mar-2006 19:26


I'm making a small "Phonebook" type data base program that I want to access from several computers on a local net and maybe remotely. I was originally going to use a text file to store the data but I'm thinking that there is a possibility that it might be accessed by two different people at the same time. Without going with a MySQL database what methods can I use to insure that simultaneous access will not screw up the data base file. Is this a thing for which "rebol services" would be useful?

 [2/23] from: fergus4:bellatlantic at: 19-Mar-2006 19:41


I'm making a small "Phonebook" type data base program that I want to access from several computers on a local net and maybe remotely. I was originally going to use a text file to store the data but I'm thinking that there is a possibility that it might be accessed by two different people at the same time. Without going with a MySQL database what methods can I use to insure that simultaneous access will not screw up the data base file. Is this a thing for

 [3/23] from: greggirwin:mindspring at: 19-Mar-2006 18:13


Hi Alan, A> I'm making a small "Phonebook" type data base program that I want to access A> from several computers on a local net and maybe remotely. I was originally A> going to use a text file to store the data but I'm thinking that there is a A> possibility that it might be accessed by two different people at the same A> time. Without going with a MySQL database what methods can I use to insure A> that simultaneous access will not screw up the data base file. There is no definitive solution for this, aside from using a real database; there are various techniques you can use though. 1) Create a "lock" file when you access the database. The first thing you do is check for its existence, create it if not there, and delete it when you're done. You can't access the DB unless you're the one who created the lock file. A spin on this approach, depending on your needs and desires, is to store actual lock data in the file, so you can lock specific records, etc. Of course, this can just push the same problem over to the lock file rather than the DB. 2) Use the WinAPI (or other lib access) to "lock" part or all of the file. If you can't acquire the lock, you can't do the work. 3) Check if the DB file exists, if so, rename it (so others won't find it). When you're done, rename it back. Doesn't work if you want to auto-create the file of course. 4) Use the file system as your DB, storing one record per file. Works well for small-scale apps. From there, you can build what you want as far as extra protection, locking, versioning, etc. Bazillions of systems use this approach these days. None of these are perfect solutions. A> Is this a thing for which "rebol services" would be useful? Yes and no. If you write a DB server that serializes requests, handles lock contention, and all the rest of the stuff a real database does, it would be very robust. At some point, a real DB makes sense. There are lighter-weight DBs than MySQL as well. What is the risk, what is the worst that can happen, etc. Don't build a solution, or a file format, that is easily corrupted; and make sure it can be rebuilt without too much difficulty; that's the main thing to keep in mind if you're building your own system. -- Gregg

 [4/23] from: compkarori::gmail::com at: 20-Mar-2006 13:13


Using LNS will obviate the need for file locking ( there's a script in the library to help with that ), but won't prevent someone elses' edits overwriting yours. You would have to have LNS functions to check timestamps etc to prevent that, so it would work. On 3/20/06, Alan <fergus4-bellatlantic.net> wrote:
> I'm making a small "Phonebook" type data base program that I want to > access
<<quoted lines omitted: 8>>
> To unsubscribe from the list, just send an email to > lists at rebol.com with unsubscribe as the subject.
-- Graham Chiu http://www.compkarori.com/emr/

 [5/23] from: SunandaDH:aol at: 20-Mar-2006 6:19


Gregg:
> 1) Create a "lock" file when you access the database. The first thing > you do is check for its existence, create it if not there, and > delete it when you're done.
Though that gums things up if the task fails and does not delete the lock file -- all future tasks are blocked waiting for the lock file to be deleted by a dead process. One way around that is to write the lock file with an expiry time -- say 2 seconds in the future. Tasks can then overwrite the lock file if the timestamp has expired. Of course: * Still delete it at the end (otherwise you could only have 30 updates a minute) * Extend it by a second if you've taken over a second of elapse time (otherwise other tasks may overlap yours if you take more than 2 seconds to complete. Getting ownership of the lock file can be tricky if you are running Windows as it does not, by default, always lock files -- so two or more tasks could overwrite it; and then they all think they have the all clear to run. There are ways to fix that -- let me know if you want to know them. Basically, you either have a database or you need to build the equivalent access controls for a multi-user environment. How clever your controls need to be depends in part on how often you are happy to lose data. If once every 10,000 updates is fine, then you can deploy a simpler set of controls than if you need to be safe for 1,000,000 on average. Sunanda.

 [6/23] from: compkarori:g:mail at: 20-Mar-2006 23:38


Ahh .. yes, that has happened to me .. a lock file that sticks around :( On 3/20/06, SunandaDH-aol.com <SunandaDH-aol.com> wrote:
> Gregg: > > 1) Create a "lock" file when you access the database. The first thing
<<quoted lines omitted: 4>>
> deleted by a > dead process.
-- Graham Chiu http://www.compkarori.com/emr/

 [7/23] from: petr::krenzelok::trz::cz at: 20-Mar-2006 13:03


Gregg Irwin wrote:
> Hi Alan, > A> I'm making a small "Phonebook" type data base program that I want to access
<<quoted lines omitted: 13>>
> can just push the same problem over to the lock file rather than > the DB.
I once was in a need of creating some locking. I did some small functions for it, and it worked good. The trick was to create some file and let it open. When other app tries to look at lock, it tries to delete it. And that is the trick which makes your app one level more robust. Because - if you don't do it that way, your app creating e.g simple db.lock file crashes, then you are left with db.lock on your harddrive and the app will not work till someone actually does not delete it manually :-) The trouble is, that my linux friend claims, that this method is not safe under linux. I don't know, I haven't tried, but I am curious. He told me, that even if I have file opened, other process may succeed in deleting it, which seems a bit strange to me :-)
> 2) Use the WinAPI (or other lib access) to "lock" part or all of the > file. If you can't acquire the lock, you can't do the work.
<<quoted lines omitted: 6>>
> systems use this approach these days. > None of these are perfect solutions.
As for simple db solutions, we have RebDB - that one is really nice small rebol based db. The trouble is, it works in memory only, and it does not solve your locking situation. Otoh you can create table called lock and check for particular record. To prevent dead locks, it would be good to register time of lock creating and regarding being unlocked if e.g. 20 min passes and the lock is not removed. But then app would have to recheck periodically, if it would need more than 20 min lock, to renew it :-) As for db which solves locking on file level, works really fast, is easy to maintain, the solution is to use sqlite, to which several drivers are available on rebol.org Sqlite is one file, no config, ultra fast on-disk database - plays well with rebol .... -pekr-

 [8/23] from: greggirwin::mindspring::com at: 20-Mar-2006 8:06


Sunanda et al, Sac> Gregg:
>> 1) Create a "lock" file when you access the database. The first thing >> you do is check for its existence, create it if not there, and >> delete it when you're done.
Sac> Though that gums things up if the task fails and does not delete the lock Sac> file -- all future tasks are blocked waiting for the lock file to be deleted by a Sac> dead process. Yup, thanks for making that clear, since I didn't go into too much detail in my post. What is that saying about "an exercise for the reader...". :) -- Gregg

 [9/23] from: gabriele::colellachiara::com at: 20-Mar-2006 20:34


Hi Petr, On Monday, March 20, 2006, 1:03:20 PM, you wrote: PK> The trouble is, that my linux friend claims, that this method is not PK> safe under linux. I don't know, I haven't tried, but I am curious. He PK> told me, that even if I have file opened, other process may succeed in PK> deleting it, which seems a bit strange to me :-) Indeed, you can unlink open files on Unix; and that's not too strange, as unlinking a file does not mean destroying it. It just isn't referenced by the directory structure anymore. The space on disk is actually feed only when there are no more references to it and no processes have an handle to it. Regards, Gabriele. -- Gabriele Santilli <gabriele-rebol.com> --- http://www.rebol.com/ Colella Chiara software division --- http://www.colellachiara.com/

 [10/23] from: petr::krenzelok::trz::cz at: 20-Mar-2006 23:38


Gabriele Santilli wrote:
>Hi Petr, >On Monday, March 20, 2006, 1:03:20 PM, you wrote:
<<quoted lines omitted: 9>>
>Regards, > Gabriele.
OK, I don't understand what you mean. So please, when I open %some-file.lock, can another process delete it or not? -pekr-

 [11/23] from: kpeters-vu:ware at: 20-Mar-2006 16:12


>> > OK, I don't understand what you mean. So please, when I open %some- > file.lock, can another process delete it or not? > > -pekr-
Pekr ~ yes, easily - as long as that process has sufficient rights, of course. Try this: Open two shell windows on a Unix box: In shell 1 create a file with your favorite editor, save it and keep it open. In shell 2 delete this file and follow this with an ls command - you won't see the file anymore. If you save it again in shell 1 (and it should tell you that it's writing a new file!) and do another ls in shell 2 it will be there again. So semaphores are the better way... Regards, Kai

 [12/23] from: gabriele:colellachiara at: 21-Mar-2006 10:09


Hi Petr, On Monday, March 20, 2006, 11:38:21 PM, you wrote: PK> OK, I don't understand what you mean. So please, when I open PK> %some-file.lock, can another process delete it or not? Yes. Regards, Gabriele. -- Gabriele Santilli <gabriele-rebol.com> --- http://www.rebol.com/ Colella Chiara software division --- http://www.colellachiara.com/

 [13/23] from: petr:krenzelok:trz:cz at: 21-Mar-2006 10:17


Thanks for explanation, Kai ....
> So semaphores are the better way... >
Are you talking about semaphores as some kind of os functionality? Because, hand made semaphores was what I was talking about .... But how to make sure process does not crash and you don't end-up with dead-lock sitting on your hd? So if we can't depend upon file-locking when opened under Unix (which does seem just plain crazy and wrong to me, because if I have a lock, how is that other process can delete what I requested to have locked?!), the only technique I can come up with right now is to use some kind of time-stamp and regard .lock file being dead if the app does not renew it in certain time period. But not sure it is a good method, because you would make your app dependant upon timers .... -pekr-

 [14/23] from: SunandaDH:aol at: 21-Mar-2006 5:14


Pekr:
> the only technique I can come up with right now is to > use some kind of time-stamp and regard .lock file being dead if the app > does not renew it in certain time period.
And even that isn't guaranteed to work: Two (or more) processes could be spinning waiting for the timestamp in the lock file to expire. They then all write the new lock file. They all think they have the lock, but only one truly does. You can reduce that risk by: -- read the lockfile -- wait for timestamp in the lock file to expire -- wait a further random fraction of a second -- read the lock file again: restart if it has an unexpired timestamp (cos someone beat you to it) -- write the lock file with your timestamp -- wait a random fraction of a second -- reread the lock file again: restart if it does not have your timestamp (because someone came in just after you and grabbed it) But even then on a very busy machine (which is when all this matters) two or more processes could set exactly the timestamp, so they could not tell who owns the lock file from the expiry timestamp alone. To get around that, you need a guaranteed unique id (GUID). Write that to the lock file along with the timestamp. Then, when you reread the lock file, a changed GUID tells you you do not own the lock file Which simply opens a further problem of how you get a GUID. IP address is not sufficient (a caller may have multiple requests on the go). Something like: guid: 0 attempt [guid: 1+ load %guid-file.txt] write %guid-file.txt guid would work, except you've now pushed the problem back to ensuring serialised access to %guid-file.txt. And for that, you need a lock file. Moral: attempting to use a non-locking file system to make an inter-process lock file is not easy. It can be done, but you need to accept a level of risk as the locking mechanism is never going to be foolproof. Sunanda.

 [15/23] from: petr:krenzelok:trz:cz at: 21-Mar-2006 12:17


SunandaDH-aol.com wrote:
> Pekr: >> the only technique I can come up with right now is to
<<quoted lines omitted: 15>>
> -- reread the lock file again: restart if it does not have your timestamp > (because someone came in just after you and grabbed it)
uh, that all sounds crazy and because of what? Because unix can't guarantee you file-lock if you particularly ask for it? What a weak system ....
> But even then on a very busy machine (which is when all this matters) two or > more processes could set exactly the timestamp, so they could not tell who
<<quoted lines omitted: 4>>
> Which simply opens a further problem of how you get a GUID. IP address is not > sufficient (a caller may have multiple requests on the go).
what about checksum/secure? Should provide you enough of randomizer .... but two same time values will give you the same result ... maybe you could use checksum/secure/key and randomly generate the key or so :-) -pekr-

 [16/23] from: Paavo::Nevalainen::saunalahti::fi at: 21-Mar-2006 14:44


I have been quite happy with semaphores. A typical cgi app needs an expiration mechanism, since it is very hard to guarantee any sensible logic for user behaviour. Here is a unit test for a semaphore-handler I use. It has to use some until -- loop -- try - disarm construct to make it working sensibly. File based dbs have never crashed... (I found a semaphore example from some public source, don't remember where it was). test: has [ max-period-locked semaphore-handler ][ max-period-locked: 0:0:30 semaphore-handler: make-semaphore-handler %locks/ max-period-locked print semaphore-handler/lock 0 ; --> true print semaphore-handler/lock/meek 0 ; ---> false print semaphore-handler/release 0 ; --> true ] Above, a non-meek lock attempt simply waits until it succeeds. The locks are used to change the state of objects, for example obj/state == 'free --> obj/state: 'edited . So others can check the obj and see if it is available. Summary: rather non-programmer guy can survive well with file dbs.

 [17/23] from: SunandaDH:aol at: 21-Mar-2006 10:32


Paavo:
> I have been quite happy with semaphores.
It'd be worth checking the source of the semaphore handler you are using. it could be doing something incredibly clever and will work safely on all REBOL platforms -- if so I'd like to know. Or it could be doing something that works safely on *your* operating system -- such as creating a lock file under an opsys that handles exclusive locks sensibly. Or maybe you've just been lucky so far. The sort of cases Pekr and I have been mentioning are probably not a problem for a lightly loaded system. You may never hit a problem even with a faulty locking mechanism. Pekr:
> what about checksum/secure? Should provide you enough of randomizer ....
random/secure 100'000'000'000 Gives you a highly probable GUID, especially if concatenated to the process start time, the time you request the GUID, and the IP address. Assuming random/secure is not generated from the process start time. Probably not good enough for safety critical work, but should work in most office critical systems. Of course, if REBOL had a semaphore system that worked cross-platform, it'd be so much simpler for all of us. Surely the Async changes must deliver such a toy? Sunanda.

 [18/23] from: gabriele:colellachiara at: 21-Mar-2006 16:37


Hi Petr, On Tuesday, March 21, 2006, 12:17:25 PM, you wrote: PK> uh, that all sounds crazy and because of what? Because unix can't PK> guarantee you file-lock if you particularly ask for it? What a weak PK> system .... Unix gives you a lock if you ask about a lock. But if you don't, then it doesn't - it's that simple. Locking is not a trivial issue, and that's why OS' provide semaphores and things like that natively. The best solution is always to have a single-threaded process providing serialized access to the given resource. When you can't do that, you are forced to deal with troubles. A good solution that hasn't been mentioned so far is to use a TCP listen port as the lock. Two processes are (or at least should not) never allowed to listen to the same port. Regards, Gabriele. -- Gabriele Santilli <gabriele-rebol.com> --- http://www.rebol.com/ Colella Chiara software division --- http://www.colellachiara.com/

 [19/23] from: petr::krenzelok::trz::cz at: 21-Mar-2006 19:38


Gabriele Santilli wrote:
>Hi Petr, >On Tuesday, March 21, 2006, 12:17:25 PM, you wrote:
<<quoted lines omitted: 3>>
>Unix gives you a lock if you ask about a lock. But if you don't, >then it doesn't - it's that simple.
OK, so why opening file under linux does not lock it automatically? I would simply vote for such mechanism being available to rebol, if possible and cross platform ... open/lock
>Locking is not a trivial issue, and that's why OS' provide >semaphores and things like that natively. >
If natively, it could be handled inside rebol then ....
>The best solution is always to have a single-threaded process >providing serialized access to the given resource. When you can't >do that, you are forced to deal with troubles. >
how is that? we have semaphores, no? :-) Well, Windows gives me lock for opened file, so no troubles here. Maybe Windows is weaker here, because in contrast to unix, it actually probably can't open the file handle and not lock it, as in rebol I found something like open/new %my-lock.lock as sufficient ....
>A good solution that hasn't been mentioned so far is to use a TCP >listen port as the lock. Two processes are (or at least should >not) never allowed to listen to the same port. >
;-) And I used this check just few hours ago, to check if some service in OS is already running! Never thought about using it as a check for locking ;-) I feel really dumb not to think of it for all those years :-) -pekr-

 [20/23] from: gabriele::colellachiara::com at: 22-Mar-2006 12:34


Hi Petr, On Tuesday, March 21, 2006, 7:38:42 PM, you wrote: PK> OK, so why opening file under linux does not lock it automatically? I PK> would simply vote for such mechanism being available to rebol, if PK> possible and cross platform ... open/lock [...] PK> If natively, it could be handled inside rebol then .... It probably should, but it's likely to be a pain to do in a multiplatform way (i.e. if we want the exact same behavior in all platforms)... not that I have studied the issue deeply, though.
>>The best solution is always to have a single-threaded process >>providing serialized access to the given resource. When you can't >>do that, you are forced to deal with troubles.
PK> how is that? we have semaphores, no? :-) Yep, but the above is much simpler. :) Regards, Gabriele. -- Gabriele Santilli <gabriele-rebol.com> --- http://www.rebol.com/ Colella Chiara software division --- http://www.colellachiara.com/

 [21/23] from: SunandaDH:aol at: 22-Mar-2006 16:15


Gabriele:
> A good solution that hasn't been mentioned so far is to use a TCP > listen port as the lock. Two processes are (or at least should > not) never allowed to listen to the same port.
That's true on a single machine. But I think (if I read it right) the original poster wanted to lock a resource on an application that is running (in effect) asynchronously across a number of machines. Gabriele:
> It probably should, but it's likely to be a pain to do in a > multiplatform way (i.e. if we want the exact same behavior in all > platforms)... not that I have studied the issue deeply, though.
I strongly suspect you are right -- the surface commonality would hide a series of opsys-specific locking/semaphoring mechanisms; and then REBOL may be reduced to implementing only the most common features. I can see why it isn't at the top of Carl's to do list. On the other hand, Ada had a rendezvous mechanism (so, in effect semaphores) and async built into the language from day 1. So it must be possible if you have enough Government money. Gabriele:
> The best solution is always to have a single-threaded process > providing serialized access to the given resource. When you can't > do that, you are forced to deal with troubles.
Absolutely. Let me describe a slightly OT example I had to design a couple of years back. It's a humungous legacy application now sprawled across multiple mainframes with hunks chopped off and running on various PC-class machines. And all of a sudden we needed to lock a resource at the application level. Providing a mainframe-wide lock was trivial -- ENQ and DEQ (enqueue and dequeue) have been doing that since before Neal Armstrong bounced around on the moon. A PC-network-wide lock as almost as simple as the PC-side applications used a common SQL database: so we just added a lock table in a simpler manner than the lock files described in earlier posts in this thread. Then some things come for free: like an uncommitted read lets you "peek" at the lock setting without blocking yourself. But combining those two make a single application-wide lock that was foolproof and resilient (so if one side died the other side wasn't blocked) -- well, that was fun. Sunanda.

 [22/23] from: gabriele:colellachiara at: 23-Mar-2006 12:36


Hi SunandaDH, On Wednesday, March 22, 2006, 10:15:58 PM, you wrote: Sac> That's true on a single machine. But I think (if I read it right) the Sac> original poster wanted to lock a resource on an Sac> application that is running (in Sac> effect) asynchronously across a number of machines. File locking on network file system has always been known to be a big problem, so I'd rule this case out anyway. You need a process on the server machine to handle serialization; and if you need n processes on the server machine (i.e. CGI) you can still use the TCP trick. Regards, Gabriele. -- Gabriele Santilli <gabriele-rebol.com> --- http://www.rebol.com/ Colella Chiara software division --- http://www.colellachiara.com/

 [23/23] from: SunandaDH:aol at: 25-Mar-2006 17:18


> Paavo: > I have been quite happy with semaphores.
Sunanda:
> It'd be worth checking the source of the semaphore handler you are using.
Paavo sent me the source of the functions he'd been using. They'd come originally from Brett's obtain-semaphore and release-semaphore: http://www.codeconscious.com/rebol/rebol-net.html Brett's come up with a neater idea than trying to open a file for update to obtain an exclusive lock. He's flipping a file between two names. That should result in an atomic operation on pretty much all platforms, so could be used as a fairly safe locking mechanism when all you have in common is access to a shared folder. It has problems as noted earlier in this thread -- if the task that applies the lock dies, then all future tasks are blocked. It also requires some sort of single-threaded start-up to create the file in the unlocked state; and if that start-up process is accidentally run again, it could accidentally release a lock. Those problems are fixable at the cost of a greatly complicated lock protocol. As Gabriele has said, this stuff isn't easy. Or at least, it isn't easy if you are trying to do it from outside of a privileged kernel state. Sunanda.

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted