[ciapug] Extract files used on a page

ciapug@cialug.org ciapug@cialug.org
Tue, 22 Feb 2005 18:39:03 -0600 (CST)


Thanks! Just changed the A HREF in the example to IMG SRC. Works great for
my use.
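
For anyone finding this in the archive, here is a minimal sketch of what that change might look like. It is a guess at the shape of the edit, not the exact code I ran: the name pc_image_extractor and the sample HTML are made up for illustration. It assumes you only want the SRC value, since IMG has no closing tag or link text to capture. Like the original, it is a regex rather than a real HTML parser, so unusual markup can slip past it.

```php
<?php
// Hypothetical IMG SRC variant of the quoted pc_link_extractor().
// The function name is an assumption made for this sketch.
function pc_image_extractor($s) {
    $a = array();
    // Same pattern shape as the A HREF version, minus the closing tag
    // and link text: only the SRC value is captured.
    if (preg_match_all('/<IMG\s+[^>]*?SRC=[\"\']?([^\"\' >]*)[\"\']?[^>]*>/i',
                       $s, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $a[] = $match[1];
        }
    }
    return $a;
}

// Made-up sample input covering quoted, single-quoted, and mixed-case tags.
$html = '<p><img src="logo.png" alt="Logo"> <IMG ALT=x SRC=\'photos/a.jpg\'></p>';
print_r(pc_image_extractor($html));
```

Each returned path can then be passed to whatever file-size function you already have.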


Jon

> Ok. This page has code showing how to extract links from pages. It
> checks their status, but you can change it to get file sizes instead.
>
> http://www.webreference.com/programming/php/cookbook/chap11/2/3.html
>
> This bit seems the most useful:
>
> function pc_link_extractor($s) {
>     $a = array();
>     if (preg_match_all('/<A\s+.*?HREF=[\"\']?([^\"\' >]*)[\"\']?[^>]*>(.*?)<\/A>/i',
>                        $s,$matches,PREG_SET_ORDER)) {
>         foreach($matches as $match) {
>             array_push($a,array($match[1],$match[2]));
>         }
>     }
>     return $a;
> }
>
> Darcy
>
>
> jcbailey@code0.net wrote:
>
>>I already have a function to get the file size. I need something that will
>>parse HTML and get all the files it's linking to (CSS, JS, images, etc.).
>>
>>
>>Jon
>>
>
>