WarpDoctor - General Meeting for 2004-07-18
12:09:48 <WalterOS2> This meeting of WarpDoctor is now in session.
<WalterOS2> The agenda is at http://www.warpdoctor.org/agendas/warpdoc_2004-07-18.html for those who have not read it.
<WalterOS2> :-)
<Kris> Walter: what exactly is the browsers problem?
<wdl> Been logging since just before noon.
<WalterOS2> With some versions of Mozilla-based browsers, the plugin apparently causes the browser to crash.
<Kris> Walter: apart from ns4*, there isn't much else, is there?
<WalterOS2> Why don't I let Doug explain it--he's the one with first hand knowledge.
<WalterOS2> Kris: That's what worries me.
<Doug> Right now the plug-in doesn't work with Mozilla v 1.6, and browsers based on that code, e.g. IWB v 2.0.3
<Doug> More generally speaking - there are currently certain brands and versions of browsers the plug-in won't work with.
<Doug> And in the future - there will probably be certain brands and versions of browsers that are incompatible with the plugin
<Kris> Doug, by that do you mean "versions" of Mozilla?
<Doug> e.g. plug-in works with Mozilla v 1.5 and v 1.7 but not v 1.6
<Doug> Works with IBM Web Browser (IWB) v 2 and (somewhat) v 2.0.2 but not v 2.0.3
<Kris> IBM web 2.03 is mozilla 1.6, so that's quite logical for once.
<Doug> Actually I am beginning to believe that the IBM browsers are more different from the associated Mozilla version than I first thought.
<WalterOS2> Doug, can you explain the symptoms?
<Doug> Depends on the version and browser. For Mozilla v 1.6 there are two problems:
<Doug> 1) the browser "hangs" because two different threads are waiting for the same memory area/semaphore.
<WalterOS2> Doug: FWIW, I think there are some significant differences as well.
<MikeG> I should probably keep quiet - but did you ask Mike Kaply?
<Doug> 2) the browser just plain crashes in other circumstances
<Kris> Doug: are we sure it is related to the WD plugin?
<Doug> I have made some changes to the semaphore logic to partially get around the hang, but the result is that occasionally now other versions of the browser crash.
<Doug> Which NEVER happened before.
<Doug> Sorry - now my IRC is crashing... sigh
<NielsJ> Doug: Are you using IRC in Mozilla 1.7?
<Kris> Doug: mozilla (1.7) crashed here in the beginning all the time. Sorted out why, you'll be surprised.
<Doug> Yes???
<Doug> No - I am using ezIRC
<Gord> Is it possible for the plugin to determine what version of browser it is running in?
<Doug> But I have been doing other programming stuff that may have affected it.
<NielsJ> Doug: I just installed 1.7 half an hour ago, and it crashed as soon as I hit the IRC link.
<Doug> Yes - and right now we have a number of different logic paths taken depending on the brand and version of the browser.
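[Editor's sketch, not part of the log: one way to picture the version-dependent logic paths Doug mentions. The real plug-in is C code and its actual detection method is not shown here. The user-agent format, the `rv:` token parsing, and the names `MOZILLA_PLUGIN_OK`, `gecko_version` and `plugin_supported` are assumptions for illustration. The table encodes only the Mozilla results reported in this meeting.]

```python
import re

# Results reported in this meeting for Mozilla itself:
# v 1.5 and v 1.7 work with the plug-in, v 1.6 does not.
MOZILLA_PLUGIN_OK = {"1.5": True, "1.6": False, "1.7": True}


def gecko_version(user_agent):
    """Return the 'major.minor' version from an 'rv:' token, or None if absent."""
    match = re.search(r"rv:(\d+\.\d+)", user_agent)
    return match.group(1) if match else None


def plugin_supported(user_agent):
    """True or False for a known version, None when the version is unknown."""
    return MOZILLA_PLUGIN_OK.get(gecko_version(user_agent))


# Hypothetical user-agent string, made up for this example.
example_ua = "Mozilla/5.0 (OS/2; U; Warp 4.5; en-US; rv:1.6) Gecko/20040117"
print(plugin_supported(example_ua))
```

Returning None for unknown versions keeps "never tested" separate from "known to fail", which matters when new browser releases appear.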
<Kris> Niels: do you have Uniaud as audio?
<Doug> No - I don't think I have any audio.
<NielsJ> Kris: No, Crystal.
<Gord> - and the problem persists?
<Kris> Niels: disable sound in the chatzilla setup, there are four entries.
<Kris> in the mozilla setup of course.
<NielsJ> Kris, Thanks - I will attempt that later.
<WalterOS2> Doug: I know it would probably entail too much extra work to be practical, but do you think my second idea (B.a.a.ii) would solve the problem?
<Doug> Baaii ?
<Doug> oh _ i get it - let me look
<wdl> Doug: You had a problem with Moz 1.6, and now disallow it. Some chance a relative of its problem remains, propagated into Moz 1.7???
<Kris> Walter: it would solve the problem if WD was the real cause of the crashes. I'm not convinced WD is.
<Doug> No - my experience with Moz v 1.7 so far has been very good.
<Doug> Let me respond to walter here for a second.
<Doug> Agenda says: Rewrite plugin in C++ and make it reentrant and serially reusable. This would mean giving each browser and user a separate data area. (Last Resort!)
<Doug> The plug-in is written as C code that will compile with either a C compiler or a C++ compiler
<Doug> It is now reentrant AND has private data areas for each instance.
<WalterOS2> Oops. I thought it was written in Rexx. !
<Doug> The applications that are run by the plugin are in Rexx. Those applications are what you interface with.
<Doug> The plug-in itself (npwrpdoc.dll) is C code.
<WalterOS2> Got it.
<Doug> The way the (any) plug-in works is:
<Doug> When the browser encounters an <EMBED> </EMBED> tag in an HTML page, the browser loads the plug-in DLL associated with that tag.
<Doug> And calls some initialization functions in the plug-in DLL.
<Doug> Since multiple pages can be loaded at once, or the <EMBED> tag can appear multiple times on the same page, the plug-in code has to be highly reentrant
<WalterOS2> Makes sense.
<Doug> Each time an EMBED tag is encountered the browser starts an "instance" of the plug-in. Meaning that the very first time it loads the DLL, the rest of the times the DLL is already loaded.
<Doug> Each instance has to be coded to be separate from other instances. However --- there are certain circumstances when they need to share data.
<Doug> For instance (pun intended): we load the security configuration file when the plug-in DLL is first loaded. But each new instance just references the data from the security file
<Doug> - no point in loading the same file over and over again.
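[Editor's sketch, not part of the log: the "load once, share read-only, keep everything else per instance" pattern Doug describes, shown in Python rather than the plug-in's C. `_read_security_file`, `get_shared_config`, `PluginInstance` and the configuration contents are hypothetical names and values for illustration only.]

```python
import threading

_shared_config = None          # loaded once, then only read by instances
_load_lock = threading.Lock()  # protects the one-time load
load_calls = []                # only here to show how often the file is read


def _read_security_file():
    """Hypothetical stand-in for reading the security configuration file."""
    load_calls.append(1)
    return {"allow_upload": True}


def get_shared_config():
    """Load the shared configuration on first use and reuse it afterwards."""
    global _shared_config
    with _load_lock:
        if _shared_config is None:
            _shared_config = _read_security_file()
    return _shared_config


class PluginInstance:
    """One object per EMBED tag; private data lives on the object itself."""

    def __init__(self, name):
        self.name = name
        self.private_data = {}             # never shared between instances
        self.config = get_shared_config()  # reference to the shared data


first = PluginInstance("first")
second = PluginInstance("second")
print(first.config is second.config, len(load_calls))
```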
<Doug> In order to keep the browser from crashing - memory areas (variables) that are referenced by more than one thread (instance) must be protected so that two threads don't look at or change the variable at the same time.
<Doug> The semaphore is the device that does that.
<Doug> Before a thread starts to access the variable - it requests a semaphore.
<Doug> If another thread already has the variable, the first thread doesn't get the semaphore until it is released.
<Doug> Make sense so far?
<WalterOS2> I think so.
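[Editor's sketch, not part of the log: the request/release discipline Doug describes, using a Python `threading.Lock` in the role of the semaphore. The function name `run` and the counter are illustrative assumptions; the plug-in itself uses OS/2 semaphores from C.]

```python
import threading


def run(threads=4, count=1000):
    """Increment one shared counter from several threads, guarded by a lock."""
    shared = {"total": 0}     # stands in for a variable shared by all instances
    guard = threading.Lock()  # plays the role of the semaphore

    def add_many():
        for _ in range(count):
            with guard:               # request; waits while another thread holds it
                shared["total"] += 1  # protected access
            # the guard is released on leaving the 'with' block

    workers = [threading.Thread(target=add_many) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return shared["total"]


print(run())
```

Because every update happens while the guard is held, no increment is lost and the total always equals threads times count.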
<Doug> What appears to be happening with Moz v 1.6 is that when a second thread that wants to access a variable requests the semaphore, it shuts down everything until the timeout value is reached.
<Doug> What should happen is that when the second thread requests the semaphore and doesn't get it, the CPU is switched to process another thread.
<Doug> If that was happening the other thread that reserved the semaphore could finish and release it before the timeout value was hit by the thread that is waiting.
<Doug> And this is what happens in earlier and later versions of Mozilla.
<Doug> In version 1.6 for some reason - it appears that when a thread hits a semaphore "wait", the cpu is never switched to any other thread - hence the thread that is holding the semaphore can never release it.
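[Editor's sketch, not part of the log: a timed semaphore wait in Python, illustrating the two outcomes Doug contrasts. `update_with_timeout` is a hypothetical name. If the holder releases in time the waiter proceeds. If the holder never gets to run and release, the waiter can only give up when the timeout expires.]

```python
import threading


def update_with_timeout(guard, timeout_s):
    """Try to obtain the guard within timeout_s seconds; report whether it worked."""
    if not guard.acquire(timeout=timeout_s):
        return False  # the holder did not release in time
    try:
        return True   # the shared data would be read or changed here
    finally:
        guard.release()


guard = threading.Lock()

guard.acquire()                          # someone else holds the semaphore...
print(update_with_timeout(guard, 0.05))  # ...and never releases: the wait times out

guard.release()                          # the holder finishes and releases
print(update_with_timeout(guard, 0.05))  # now the waiter gets in
```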
<WalterOS2> That's not good.
<Doug> My (partial) solution is: to reduce the amount of time (lines of code) that is guarded by the semaphore, so that the chances of a wait are reduced.
<Doug> This appears to mostly work - however it leaves the memory area unprotected for a small amount of time.
<Doug> And if two threads happen to access that area - the browser crashes.
<WalterOS2> IMO, and with no offence to you, those are half-solutions that will also sooner or later fail. :-(
<Doug> So with the latest version - I am now seeing an occasional crash of the browser that I never saw before making the semaphore change.
<NielsJ> I seem to recall, that the plugin was not a must for access to the new WD site, but it added functionality and made life easier for the user. Is that correct?
<Doug> right now - the plug-in is required to put data into the site - but not to view data already in the site.
<Doug> There are some "features" of putting data into the site that I don't think can be replicated without using the plug-in.
<NielsJ> Is that requirement related to security issues?
<WalterOS2> If you are right, then maybe we should take a look at my first suggestion which is: Make a contribution to Mozilla on condition they support the WD plugin.
<Doug> No - it is related to: file uploading and user interface issues that are very difficult to do without the plug-in.
<WalterOS2> Doug: I should have said, If you're analysis is correct..........
<WalterOS2> you're==your. :-)
<Doug> Yes - that is a big if. It is entirely possible at this stage that there is some other problem that I haven't found yet.
<Doug> It certainly would not be the first time.
<WalterOS2> Doug discussed with me privately that when he posted bug reports on "Bugzilla", his name would be removed. :-(
<KenKrchnr> I'm finally here :-)
<WalterOS2> Hi Ken
<WalterOS2> :-)
<Doug> If contributing a reasonable amount of money to Mozilla would prevent them from releasing new versions of Mozilla if they didn't work with the plug-in, it would probably be worth it.
<WalterOS2> Here are three things to consider:
<KenKrchnr> But that (contribution) would not fix the buggers in the field already
<Doug> No it would not.
<NielsJ> What interest does Mozilla.org have in making the WD plugin work?
<Doug> My experience with the Mozilla team is that they have very different priorities from the priorities of a web site developer.
<WalterOS2> Mozilla really wants financial support;
<WalterOS2> Some VOICE Board Members are anxious to see WD released and working. See yesterday's log.
<Doug> Mozilla releases browsers that they know don't work in certain areas.
<Doug> People start using those browsers.
<Doug> Web site administrators/developers are faced with: not using that feature that doesn't work in certain versions, or having a site that doesn't work with certain versions of the browser
<WalterOS2> So I think VOICE would be willing to spend a reasonable amount of money to this end.
<WalterOS2> Hi again Mark
<Doug> which is basically the choice we find ourselves with.
<MADodel> I wish GTIRC would tell me when I'm not connected.
<WalterOS2> Are there other suggestions as to how to resolve these browser problems?
<NielsJ> I still don't see the reasons Mozilla should support a plugin specific to one particular web-site. Can we make a case for the plugin being useful for other sites?
<Doug> I frankly don't think Mozilla wants to spend much time supporting any plug-ins.
<Gord> Do they want to spend any time at all fixing an 'old' version?
<Doug> I think they would prefer web site developers to use different approaches, e.g. Java.
<Doug> They won't fix an old version. And in fact - it has already been fixed. I don't see the hang problem with version 1.7
<WalterOS2> Actually, it seems to me they don't want to spend time with OS/2 at all.
<Gord> Even if Mozilla fixed v 1.6 how would that affect the IBM browser?
<WalterOS2> Doug, which version of IWB is based on 1.7?
<WalterOS2> Gord: IWB is based (closely) on Mozilla.
<Doug> As far as I know - IBM hasn't put out a version beyond v 2.0.3 yet which is the problem here.
<NielsJ> How difficult would it be to convert the plugin to a java-applet?
<Doug> It would be impossible.
<WalterOS2> Which is fine with me, because J applets are usually hideously slow.
<Doug> The main reason for the plug-in is access to the user's hard drive - used for file uploading.
<Gord> but IWB 2.0.3 wouldn't be fixed if Moz 1.6 were fixed, necessarily?
<Kris> Walter: IBMweb 2.03 is mozilla 1.6. There is no counterpart for mozilla 1.7
<Doug> Java applets cannot get access to the user's hard drive without special permissions and security certificates
<WalterOS2> I don't want to go there!!!
<WalterOS2> :-)
<Doug> The problem right now is that the current version of IWB doesn't work because it just happens to be based on the version of Mozilla that doesn't work
<WalterOS2> Doug: How well did you say your fix (for IWB 2.0.2) works? I forget.
<Doug> It will (I think) eventually be fixed, because IBM will eventually put out a new version that is based on a different version of Mozilla that will (probably) work.
<WalterOS2> Personally, it doesn't matter to me if I use 2.0.3 or 2.0.2
<NielsJ> At the university I am working at here in Denmark, I regularly upload files from my local HD to our CampusNet. That system has worked in the versions of Mozilla and IWB I have tried so far.
<Doug> Walter - the message I sent you last night says that v 2.0.2 appears to work ok. I have since found some issues that I didn't see last night.
<WalterOS2> But you aren't uploading using a browser, are you?
<WalterOS2> Doug: I guess that's why I was uncertain. :-)
<NielsJ> I am uploading using the browser.
<WalterOS2> How do you upload using a browser?
<Doug> My experience is that uploading binary files with Netscape and Mozilla is not reliable. Meaning that sometimes the file is scrambled when it arrives at the server.
<NielsJ> I could attempt to contact the people doing the development and see if they are willing to share their knowledge with us.
<MikeG> CGI ?
<Doug> Please do!
<WalterOS2> OK, great!
<Doug> Yes - there is a part in the HTTP (or related) specification that calls out a method for uploading files.
<Doug> It is called multipart MIME.
<NielsJ> I will. There is simply a button called upload file, which opens a window which allows me to browse my local HD.
<WalterOS2> Of course, forms are a tried and true method, but you can't upload very large quantities of data that way.
<Doug> Yes - there is also an HTML tag for opening a file dialog.
<Doug> You have to write a CGI program to break apart the multipart form and extract the data from the form
<Doug> I did that about 1 1/2 years ago. But when I tested it - uploading PDF files - about 20% of the time the PDF file was unreadable at the server
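[Editor's sketch, not part of the log: what "breaking apart the multipart form" can look like on the server side, here using Python's standard `email` package rather than whatever Doug's CGI program used. `extract_uploads`, the boundary string, the field name and the file contents are all assumptions for illustration.]

```python
from email.parser import BytesParser
from email.policy import default


def extract_uploads(content_type, body):
    """Return {field name: (filename, raw bytes)} from a multipart/form-data body."""
    # The email parser expects headers first, so prepend the request's Content-Type.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = BytesParser(policy=default).parsebytes(raw)
    uploads = {}
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        uploads[name] = (part.get_filename(), part.get_payload(decode=True))
    return uploads


# Hypothetical request body, built by hand for the example.
payload = b"%PDF-1.4\n\x00\x01\xfe\xff\r\nmore bytes"
body = (
    b"--XyZ123\r\n"
    b'Content-Disposition: form-data; name="doc"; filename="a.pdf"\r\n'
    b"Content-Type: application/octet-stream\r\n"
    b"\r\n" + payload + b"\r\n"
    b"--XyZ123--\r\n"
)
uploads = extract_uploads("multipart/form-data; boundary=XyZ123", body)
print(uploads["doc"][0], uploads["doc"][1] == payload)
```

Comparing the extracted bytes with what was sent is the quickest way to tell whether corruption happens in the parsing step or earlier, in the browser or in transit.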
<WalterOS2> I know, it can get messy. :-(
<Doug> The Netscape documentation also says that binary uploads from Windows browsers are not supported.
<Doug> Whether that means that Mozilla now is "fixed" or supports it on Windows/OS/2 is another story.
<WalterOS2> Doug: How can you determine for sure whether or not the problem is in the plugin or in the browsers?
<KenKrchnr> Yes, uploads from the browser are easy to set and retrieve, it's just that you may or may not have the proper data when you get it ;-)
<Doug> You mean our problem with the hang?
<WalterOS2> That's one. However, there are others as well aren't there?
<WalterOS2> I think it's necessary to isolate the problem (if possible) and find out whose ballpark it is.
<Doug> Yes - with Mozilla v 1.6 there is also an unrelated crash problem. That crash problem may or may not appear in IWB even though it is based on the same Mozilla version - because of the way it is compiled.
<WalterOS2> Please explain the bit about compilation.
<Doug> My guess is that you won't get ANY support from Mozilla to fix a previous version. You MAY get support from IBM to fix their current version.
<WalterOS2> I'm not really concerned about previous versions.
<Doug> After you get past the hang part on Mozilla v 1.6, it crashes at another location - when I call a function.
<WalterOS2> If we get them to solve the problem for future versions, that's really enough.
<Doug> It is a very simple function call. I suspect - although I do not know for certain - that it is related to how much stack space the browser has allocated.
<Doug> This is changed by the switches/parameters used when compiling.
<NielsJ> Doug: I just got a copy of the upload.asp.html, which manage the upload. Do you want a copy of it?
<Doug> So if that is the problem - it is very possible that it might not exist in IWB simply because they are using different parameters when compiling.
<WalterOS2> Doug: Too bad there isn't a dynamic stack.
<Doug> NielsJ - yes do you know my email?
<NielsJ> Doug: Yes.
<Doug> Please send it.
<Doug> I started the whole plug-in thing originally because I thought there was no other reliable alternative in OS/2 for uploading files.
<WalterOS2> Doug: How can you find out for sure whether the crash is
<WalterOS2> related to how much stack space the browser has allocated?
<Doug> (BTW - the binary upload works fine on Unix based browsers for some reason.)
<Doug> I probably never can know that for sure - with the information I have.
<WalterOS2> What additional information do you need?
<Doug> What I can do is try my V 1.6 fixes on IWB v 2.0.3 and see if they work.
<Doug> I would really have to know what compiler options they are using. But the real way would be to compile it with the stack space bumped way up and see if it runs.
<Doug> I obviously don't have access to IBM's source code.
<WalterOS2> :-(
<Doug> I THINK however, that they convert the Mozilla GCC/EMX code to IBM VAC and use the VAC compiler to produce IWB.
<Doug> At least that is what I remember from the Austin Warpstock presentation.
<WalterOS2> ugh.
<Doug> So it is very possible that there are bugs in one version (mozilla) that do not appear in the other version. And vice versa
<MikeG> I have a question - Is there any description of what is actually being done by the plugin? It might make it easier to come up with additional ideas.
<Doug> simply because of the difference between compilers.
<Doug> There is a very high level overview of the plug-in and related Rexx applications as used by the WD site on the plug-in page.
<WalterOS2> MikeG: Doug wrote the plugin. I'm not sure what you mean.
<MikeG> I thought after v1.4 Kaply was using only Innotek GCC.
<Doug> www2.warpdoctor.org - go to Site Help - Plugin - How it works
<Doug> He might have switched over. I don't know one way or the other.
<Kris> Mike: quite right. So I'm afraid the lessons of Austin 2003 are outdated already.
<MikeG> Looks like you are using Apache 2.0.48 ==> could PHP be an option?
<MikeG> I believe there is rexx == php for OS/2
<Doug> PHP might be used as a replacement for a portion of the CGI stuff. But the CGI stuff is pretty unrelated to the plug-in.
<Doug> Although the Rexx applications running IN the plug-in make CGI calls to retrieve data from the database.
<WalterOS2> Doug: I just went to www2.warpdoctor.org, logged in, and got the error message: "Error sending cookie message" Error - Invalid instance name. (I'm currently running IWB 2.0.2)
<WalterOS2> Would downloading your latest plugin fix that?
<WalterOS2> BTW, I've gotten that before, at least once.
<Doug> Walter - this is one of the problems I ran into later last night. Clear the cache - shut down the browser - start the browser - clear the cache again - then try.
<NielsJ> I just did a search of PHP file upload scripts - There appear to be many available, e.g. at hotscripts.com.
<Doug> I uploaded a version last night that will help IWB v 2.0.2 - but it degrades Mozilla v 1.7
<Doug> What I have found so far is: that what works in PHP, and Perl, and CGI in Unix doesn't necessarily work in Windows or OS/2. Because of differences in the browsers.
<Doug> Consider this: WarpIN users have had problems for years now with the WarpIN archives becoming corrupt (occasionally) when downloaded.
<Doug> Even when following the correct procedures for downloading binary files.
<WalterOS2> Doug: It still failed, but with a cache error message instead of the cookie error message. :-(
<Doug> when using HTTP. And that should never happen.
<Doug> You have to clear the cache, shut down the browser, start the browser and clear the cache again, before going to the site.
<Doug> BTW - that routine is necessary for all Mozilla based browsers whenever the browser terminates abnormally.
<Doug> They seem to have some cache problem(s)
<WalterOS2> Doug: I think I've run into that WarpIN problem several times; WarpIN now leaves a sort of bad taste in my mouth. :-(
<Doug> We get around the WarpIN download problem by not using HTTP. We download ALL binary files on WD using FTP.
<WalterOS2> Doug: I already did that once.
<Doug> BLOBS however ARE downloaded via HTTP.
<WalterOS2> Clear the cache routine bit, I mean.
<Doug> Walter - OK - looks like that fix doesn't work.
<WalterOS2> :-(
<WalterOS2> Doug: would it help if all of us here tried WD during the next week or so, and reported any error messages to you?
<WalterOS2> Or would that inundate you?
<MikeG> Right - wpi files were getting downloaded *not as binary files* which (I think) could be corrected by setting a mime type for wpi.
<Doug> Actually - if you use WD and get an error message - you can report the bug in WD - assuming it is working well enough for that to happen.
<WalterOS2> I don't usually get that far. :-( Most of my errors so far occur just after I login.
<Doug> In that case you will have to send me the bug report.
<WalterOS2> OK
<Doug> This just occurred to me - but I am wondering if HTTP transport is guaranteed reliable - like FTP is.
<Doug> Meaning: FTP will retransmit a packet that is corrupt. I don't know if HTTP does that or not.
<Doug> Does anyone here know if HTTP has the CRC built in to check the integrity of the packets?
<KenKrchnr> HTTP does have data check
<Doug> OK
<KenKrchnr> The problem can occur in that it does not guarantee all packets are delivered
<WalterOS2> I've had that happen more often than I care to remember. :-(
<Doug> If that is true - then I think it means that binary file transfer cannot be guaranteed to be 100% reliable with HTTP
<Doug> Your mileage will vary, depending on how reliable your connection is - and other circumstances
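[Editor's sketch, not part of the log: an end-to-end integrity check that works the same way whatever protocol carried the file. It assumes the site publishes a CRC-32 value alongside each binary file, which is not something the log says WD does. `crc32_of` and `transfer_ok` are hypothetical names.]

```python
import zlib


def crc32_of(data):
    """CRC-32 of a byte string as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def transfer_ok(received, expected_crc):
    """True when the received bytes match the checksum published by the sender."""
    return crc32_of(received) == expected_crc


original = b"WarpIN archive bytes \x00\x01\x02"
published = crc32_of(original)  # would be computed once, on the server

corrupted = bytes([original[0] ^ 1]) + original[1:]  # one flipped bit
print(transfer_ok(original, published), transfer_ok(corrupted, published))
```

A check like this will not repair a bad download, but it tells the user to fetch the file again instead of installing a corrupt archive.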
<Doug> Right now on WD - all downloads of binary files that are stored on disk use FTP. The link is transferred to an FTP link that gets the file.
<WalterOS2> Doug: I try to stay in the habit of d/l'ing large files using FTP.
<Doug> BLOB data is sent via HTTP.
<Doug> All uploads of text or binary data happen via FTP.
<Doug> So we should be very reliable for data integrity.
<WalterOS2> Another advantage of most ftp clients is the ability to retransmit a file from where you left off. :-)
<Doug> That advantage also works in most newer browsers when they are downloading a file using an FTP URL. Which is way we store larger binary files on disk rather than the database.
<Doug> way= why
<Doug> The cache setting on the browser has to be set larger than the file you are downloading for restart to work in most newer browsers on FTP urls
<WalterOS2> For very large files, that can be a problem.
<Doug> yes - I used to set my browser cache at 10MB.
<Doug> Anyway - I will spend some more time looking at IWB v 2.0.2 and v 2.0.3
<Doug> I am still occasionally running into problems with the plug-in/site, but I am fixing those as I go on www2.warpdoctor.org
<Doug> For instance - today I discovered that dropping a 280K HTML file on the screen hangs the browser.
<Doug> But mostly - except for the browser version issues - the site runs pretty well.
<WalterOS2> That sort of thing is normal; that's why I asked whether it would help you make it stable quicker if all of us made an effort to use WD and reported problems.
<Doug> Please do. We want to find all the bugs before the "normal" users do.
<WalterOS2> Agreed.
<WalterOS2> What kind of information should we report?
<Doug> which folder - type of information you were trying to input - what happened
<wdl> Doug/all: How much of this log should be 'bleeped'? That is, to thwart future hacker interest?
<Doug> thinking....
<Doug> I think we are OK.
<NielsJ> I have a concern about the link section of the new WD site.
<Doug> yes?
<wdl> Gentlemen: "1-minute warning..."
<NielsJ> I just randomly selected a link and clicked on www.sybase.com - which gives you a 404 error.
<wdl> Oops... "1-minute" = "10-munute".....
<WalterOS2> wdl: It's ibkt 4:51 ET.
<Doug> Yes - the problem is that I entered the URLs without the beginning HTTP://
<WalterOS2> It's only 4:51 ET
<wdl> "munute" = "minute". Aargh.
<WalterOS2> wdl: I think your aphasia is catching. VBG
<Doug> All URLs without the beginning HTTP don't work. I have since put a check in the data entry screen to ensure that doesn't happen - but I haven't looked at the existing URLs yet.
<WalterOS2> I'm doing it too.
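[Editor's sketch, not part of the log: the kind of data-entry check Doug describes, which could also be run over the URLs already stored. `ensure_scheme` is a hypothetical name, and defaulting to `http` is an assumption. The actual check on the WD site is not shown in this log.]

```python
def ensure_scheme(url, default_scheme="http"):
    """Prepend default_scheme:// when the URL has no scheme of its own."""
    cleaned = url.strip()
    if "://" in cleaned:
        return cleaned  # already has a scheme (http, ftp, ...); leave it alone
    return f"{default_scheme}://{cleaned}"


print(ensure_scheme("www.sybase.com"))
```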
<wdl> Walter: You don't want it...
<wdl> Walter: Aphasia, that is. ;(
<WalterOS2> NielsJ: with some browsers, e.g. IWB, you can set an option in Navigator | Smart Browsing to force a www in front of the URL if the URL fails.
<WalterOS2> Is there anything else someone wants to bring up?
<WalterOS2> someone=anyone
<Gord> Doug: Will you send me the icons to work on this week?
<Doug> Yes - I am very sorry - I kept forgetting.
<Doug> I have a box on my desk to send to Bill also.
<WalterOS2> Anyone want to move for adjournment?
<Doug> I move we get outta here
<WalterOS2> Please use official terminology. :-)
<Doug> I would please like to move that we adjourn this meeting
<Kris> If Ken is still around, I still would like to have a word with him.
<WalterOS2> We don't want to upset Bill's logging program.
<WalterOS2> The room is always open.
<KenKrchnr> I'm here, but my connection keeps going to sleep
<Kris> Ken: I have a channel open to you, don't you see it?
<WalterOS2> Is there an official motion for adjournment?
<wdl> Doug made one, properly
<KenKrchnr> No, I don't. It may have gotten lost in a sleep fit.
<Doug> I even used please!
<WalterOS2> Sorry--I missed it.
<WalterOS2> All in favour of adjourning, please type Aye.
<wdl> Aye
<Gord> aye
<Kris> Yes.
<KenKrchnr> Aye
<NielsJ> Aye
<Doug> aye
<WalterOS2> All opposed, please type Nay
14:01:58 <WalterOS2> Motion is carried; meeting is adjourned.