[Cialug] Efficiently removing the beginning of a file

Nathan Stien nathanism at gmail.com
Mon May 21 11:44:09 CDT 2007


On 5/21/07, Daniel A. Ramaley <daniel.ramaley at drake.edu> wrote:
> I have a ~70 MB file. The first 3635 bytes need to be removed. What is
> the most efficient way to do that? I did this, knowing it would work
> but would be slow:
>     $ dd if=inputfile of=outputfile ibs=1 obs=1M skip=3635
> It did indeed work. But it took 274 seconds (and pegged the CPU the
> entire time), whereas simply copying the file with cp only takes 2
> seconds. Since what i want to do is not *that* different an action from
> just copying the file (at least in terms of the minimum disk operations
> that would be required), it seems to me that there should be a way to
> do it that only takes ~2 seconds. What are some other command line ways
> to do this that would be more efficient?

I think the block size issue is telling.  As an analogy, consider
moving a mountain by carrying a grain of sand at a time vs. a dump
truck.

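You were pegging your CPU because with ibs=1 dd gets almost nothing
done per loop: every read() system call fetches a single byte, so a
~70 MB file costs roughly 70 million trips into the kernel.  Normally
when copying between two disk files, dd is I/O bound rather than CPU
bound.  It should spend most of its time blocked, waiting for the
kernel to hand it the next block and reschedule it.  With a one-byte
input block the next byte is almost always already sitting in the
kernel's cache, so dd is nearly always runnable and burns its time on
per-call overhead instead of moving data.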
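
As for faster ways to do it, here is a rough sketch of two approaches
that avoid one-byte reads, so I'd expect either to run in about the
time of a plain cp (I haven't timed them on your file).  The file
names and the tiny generated sample input are placeholders I made up
so the commands can be tried as-is; substitute your real ~70 MB file.

```shell
# Hypothetical stand-in for the real file: a 3635-byte header of
# zeros followed by a short payload.
dd if=/dev/zero of=inputfile bs=3635 count=1 2>/dev/null
printf 'payload starts here\n' >> inputfile

# Option 1: tail numbers bytes from 1, so "+3636" means "start
# output at byte 3636", which drops the first 3635 bytes.
tail -c +3636 inputfile > outputfile.tail

# Option 2: set dd's block size to the header size and skip one
# block.  Each read now asks for 3635 bytes instead of 1.
dd if=inputfile of=outputfile.dd bs=3635 skip=1 2>/dev/null

# The two results should be byte-for-byte identical.
cmp outputfile.tail outputfile.dd && echo "outputs match"
```

Option 2 still uses a fairly small block, but it cuts the number of
reads by a factor of 3635 compared with ibs=1, which should be enough
to put dd back in I/O-bound territory.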

-- 
Nathan P. Stien
Consulting Engineer / Software Developer
Embedded Systems Electronics and Software
http://linkedin.com/in/nathanstien
Mobile: 309.241.2581

