[Cialug] Interesting problem [OT?]

Tony Bibbs tony at tonybibbs.com
Fri Nov 14 18:06:44 CST 2008


A lot of the UI test tools automate doing things in the browser.  I was
suggesting that Selenium, or something like it (Bad Boy + JMeter), might be
able to do what you want.
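
For the renaming approach Matthew describes in the quoted message below, a
rough sketch in Python might look like the following. It is only an
illustration under some assumptions: the function names (local_name,
rewrite_html, mirror_page) and the glyph_<code>.png naming scheme are made up
for the example, the images may not actually be PNGs, and the img tags are
assumed to carry absolute URLs like the one Stuart quotes. The pattern also
tolerates stray whitespace inside the src attribute.

```python
import re
from pathlib import Path
from urllib.request import urlretrieve

# Matches src="...glyph.php?code=NNN", allowing whitespace (including a
# newline) between the quotes and the URL.
GLYPH_SRC = re.compile(
    r'src="\s*(?P<url>[^"]*glyph\.php\?code=(?P<code>\d+))\s*"'
)


def local_name(code):
    """Hypothetical naming scheme; the real image type is an assumption."""
    return f"glyph_{code}.png"


def rewrite_html(html):
    """Point glyph.php links at local files.

    Returns the rewritten HTML and a dict mapping each original URL to the
    local filename it should be saved under.
    """
    downloads = {}

    def repl(match):
        name = local_name(match.group("code"))
        downloads[match.group("url")] = name
        return f'src="{name}"'

    return GLYPH_SRC.sub(repl, html), downloads


def mirror_page(page):
    """Rewrite one saved HTML file in place and fetch the glyphs it needs."""
    page = Path(page)
    html, downloads = rewrite_html(page.read_text(encoding="utf-8"))
    for url, name in downloads.items():
        target = page.parent / name
        if not target.exists():  # skip glyphs fetched for an earlier page
            urlretrieve(url, str(target))
    page.write_text(html, encoding="utf-8")
```

Calling mirror_page() on each file in the HTML dump would then leave every
page pointing at a local copy of its glyph images. It is worth trying on a
copy of the dump first, since the files are rewritten in place.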

--Tony

On Fri, Nov 14, 2008 at 5:56 PM, Stuart Thiessen <thiessenstuart at aol.com> wrote:

> The img tag shows (for example):
> <img src="http://www.signbank.org/swis/glyph.php?code=45568">
> The glyph.php script returns the appropriate image file for that code
> number. I have the dump of the HTML, but it is these <img> tags that are
> blocking me from having a complete offline copy. That was why I was also
> thinking of trying to automate some kind of PDF dump of each page.
>
> Tony, I did look at the Selenium website, but I am not sure how it will
> help with this task. Could you explain?
>
> Thanks,
>
> Stuart
>
> On Nov 14, 2008, at 17:36 , Matthew Nuzum wrote:
>
>> On Fri, Nov 14, 2008 at 5:20 PM, Stuart Thiessen <thiessenstuart at aol.com>
>> wrote:
>>
>>> Anyway, my technical challenge is that the organization developing this
>>> writing system has published a PHP database of the symbols at:
>>> http://www.signbank.org/swis/data.php?subset=&bs_code=*
>>>
>>> I need to get an offline dump of each of the basesymbol child pages
>>> listed on that page. I can't do a simple download of the page as HTML
>>> because the image file showing the symbol is actually a link to a script
>>> that finds the right symbol and plugs it in, so when I use programs like
>>> wget, a broken link for the symbol image appears when I try to look at
>>> it offline.
>>>
>>
>> Does wget download the symbols and give them a funny name like
>> glyph.php?code=368.html or glyph.php?code=368.png?
>>
>> I don't see anything sneaky on that page that would prevent you from
>> using an automated tool; my guess is that the names are getting
>> mangled.  If so, view the source of your downloaded page and see what
>> the filename is that it's expecting and how it differs from what was
>> actually generated. If they don't differ and the problem is really
>> that the name is illegal for the filesystem then you may be able to
>> just use a script to rename the images and the paths to the images in
>> the html files.
>>
>> I've run into this problem before and just tried a different
>> downloader program. It's been a while since I've used one so I can't
>> think of one to suggest at the moment.
>>
>> --
>> Matthew Nuzum
>> newz2000 on freenode
>> _______________________________________________
>> Cialug mailing list
>> Cialug at cialug.org
>> http://cialug.org/mailman/listinfo/cialug
>>
>