[Cialug] Efficiently removing the beginning of a file

Thomas Kula kula at tproa.net
Mon May 21 12:48:19 CDT 2007


On Mon, May 21, 2007 at 12:42:34PM -0500, Nathan Stien wrote:
> 
> I don't see how a Perl script (or anything else) would do this
> significantly differently from dd.  At the end of the day, you still
> need to copy most of the blocks in the file to another file, and this
> will still cost you at best O(n) time.  And dd can be slightly faster
> in practice because it's a tight little C program and doesn't require
> loading a Perl interpreter.  But like I said, this should be an I/O
> bound process, so the slowness of Perl shouldn't matter as much as your
> drive's throughput and seek time.

To skip 3mumble-thousand bytes at the beginning of the file, you have
to tell dd that its block size is 1 byte, since its skip options count
in multiples of the block size. The side effect is that dd then does
everything in 1-byte chunks: read a byte, write a byte, read a byte,
and so on. With something else, you could simply open the file, seek
to where you want to start reading, read in a meg or so of stuff,
write out a meg or so of stuff, and repeat until you hit the end.

The overhead of doing I/O in one-byte chunks, rather than kilobyte or
megabyte chunks, is what is making dd so slow: every single byte costs
its own read call and its own write call.
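
As a rough illustration of the "seek, then copy in big chunks" idea,
here is a minimal sketch in Python (not the Perl script discussed in
the thread). The function name copy_without_prefix and the 1 MiB chunk
size are assumptions made up for this example, not anything from the
original discussion.

```python
CHUNK_SIZE = 1024 * 1024  # 1 MiB per read/write; an assumed, tunable value


def copy_without_prefix(src_path, dst_path, skip_bytes, chunk_size=CHUNK_SIZE):
    """Copy src_path to dst_path, leaving out the first skip_bytes bytes.

    Returns the number of bytes written to dst_path.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Jump past the unwanted prefix in one step instead of reading it.
        src.seek(skip_bytes)
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            dst.write(chunk)
            written += len(chunk)
    return written
```

Something like copy_without_prefix("big.dat", "trimmed.dat", 3000)
(file names hypothetical) would then issue roughly one read and one
write per megabyte rather than one per byte. It still copies the rest
of the file, so it remains O(n), just with far fewer system calls.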


-- 
Thomas L. Kula | kula at tproa.net | http://kula.tproa.net/
Mathom House at Ypsi-Edge, The People's Republic of Ames

