[Cialug] Interesting problem [OT?]

Matthew Nuzum newz at bearfruit.org
Fri Nov 14 17:36:54 CST 2008


On Fri, Nov 14, 2008 at 5:20 PM, Stuart Thiessen <thiessenstuart at aol.com> wrote:
> Anyway, my technical challenge is that the organization developing this
> writing system has published a PHP database of the symbols at:
> http://www.signbank.org/swis/data.php?subset=&bs_code=*
>
> I need to get an offline dump of each of the basesymbol child pages listed on
> that page. I can't do a simple download of the page as HTML because the
> image file showing the symbol is actually a link to a script that finds the
> right symbol and plugs it in, so when I use programs like wget, a broken
> link for the symbol image appears when I try to look at it offline.

Does wget download the symbols and give them a funny name like
glyph.php?code=368.html or glyph.php?code=368.png?

I don't see anything sneaky on that page that would prevent you from
using an automated tool, so my guess is that the names are getting
mangled. If so, view the source of your downloaded page, see what
filename it expects, and compare that with the name wget actually
generated. If they don't differ and the problem is really that the
name is illegal for the filesystem, then you may be able to just use
a script to rename the images and update the paths to the images in
the HTML files.
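
A minimal sketch of that check-then-rename idea in Python, untested
against the real signbank.org download. The flat directory layout (pages
and images in one folder), the set of characters treated as unsafe, and
the names missing_images and rename_and_relink are all assumptions made
for illustration:

```python
import html
import re
from html.parser import HTMLParser
from pathlib import Path

# Characters assumed to be troublesome in file names; adjust for your filesystem.
UNSAFE = re.compile(r'[?&=*:<>|"]')


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.sources.extend(v for k, v in attrs if k == "src" and v)


def missing_images(root):
    """Return (page, src) pairs whose src does not exist next to the page."""
    root = Path(root)
    missing = []
    for page in sorted(root.glob("*.htm*")):
        collector = ImgSrcCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        missing += [(page.name, s) for s in collector.sources
                    if not (root / s).exists()]
    return missing


def rename_and_relink(root):
    """Rename files with unsafe characters and patch references in the pages."""
    root = Path(root)
    renames = {}
    for path in sorted(root.iterdir()):
        new = UNSAFE.sub("_", path.name)
        # Skip directories, already-safe names, and names that would collide.
        if path.is_file() and new != path.name and not (root / new).exists():
            path.rename(root / new)
            renames[path.name] = new
    for page in sorted(root.glob("*.htm*")):
        data = page.read_bytes()  # work on bytes so the page encoding is untouched
        updated = data
        for old in sorted(renames, key=len, reverse=True):
            new = renames[old].encode()
            # Pages may reference the name with & written as &amp;.
            updated = updated.replace(html.escape(old).encode(), new)
            updated = updated.replace(old.encode(), new)
        if updated != data:
            page.write_bytes(updated)
    return renames
```

Calling missing_images("mirror") on a hypothetical download directory
shows which names the pages expect but cannot find. If the only problem
is the awkward characters, rename_and_relink("mirror") renames the files
and patches the pages in place, so run it on a copy.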

I've run into this problem before and just tried a different
downloader program. It's been a while since I've used one so I can't
think of one to suggest at the moment.

-- 
Matthew Nuzum
newz2000 on freenode

