Perl: MD5 directory full of filenames

  • joebert
  • Fart Bubbles
  • Genius
  • Posts: 13502
  • Loc: Florida

Post 3+ Months Ago

I'd like to know how I could get a list of files in a directory, and rename all of them to be the MD5 checksums of their contents.

So instead of
Code:
/dir/one.txt
/dir/two.txt


I would have
Code:
/dir/1234567890abcdef1234567890abcdef.txt
/dir/abcdef1234567890abcdef1234567890.txt

Moderator Remark: Split from http://www.ozzu.com/perl-tutorials/tutorial-suggestions-t90821.html
  • alex89
  • Bronze Member
  • Posts: 239
  • Loc: Western Australia

Post 3+ Months Ago

May I ask why? I don't think it would be too hard. I would do it in PHP, because that's what I'm more familiar with, but I'll include Perl too.

You'd need a function to convert a string into the hash, and a loop to go through all the files in the directory: read, run the function, rename, next.

Function: You can get all the information from Wikipedia (all in nice pseudocode). Is this the bit you want someone to concentrate on?

Read: PHP:
Code:
file_get_contents()
or Perl:
Code:
open FILE, "<", "filename.txt" or die $!;


Rename: PHP:
Code:
rename("/dir/one.txt", "/dir/1234567890abcdef1234567890abcdef.txt");
or Perl:
Code:
rename("/dir/one.txt", "/dir/1234567890abcdef1234567890abcdef.txt") or die $!;


Loop (just Perl, because I'm sure there's a reason you want to use Perl):
Code:
# Three ways to list a directory with glob; the last two also match dotfiles.
@files = </home/user/dir/*>;
@all_files = </home/user/dir/* /home/user/dir/.*>;
@all_files = glob "/home/user/dir/* /home/user/dir/.*";

# Or walk the directory with opendir/readdir.
opendir DIRH, "/home/user/dir" or die "couldn't open: $!";
foreach (sort readdir DIRH) {
    print "one file in the directory is $_\n";
}
closedir DIRH;

Source for directory loop
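
One detail the readdir approach leaves open: readdir returns bare names (including . and ..), not full paths. Here is a minimal sketch of one way to get only the plain files with their full paths. The helper name plain_files is made up for illustration and is not from the linked source.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full paths of the plain files in $dir.
sub plain_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "couldn't open $dir: $!";
    my @names = readdir($dh);
    closedir($dh);
    # Build the full path first, then keep only plain files with -f.
    # That also drops '.', '..' and any subdirectories.
    return grep { -f $_ } map { File::Spec->catfile($dir, $_) } sort @names;
}

# Usage: perl list.pl /home/user/dir
if (@ARGV) {
    print "$_\n" for plain_files($ARGV[0]);
}
```

Building the path before the -f test matters, because -f on a bare name would be checked against the current working directory rather than $dir.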
  • alex89
  • Bronze Member
  • Posts: 239
  • Loc: Western Australia

Post 3+ Months Ago

Just stumbled upon some PHP functions that would make this a lot simpler (not sure if these exist in Perl):

# md5_file — Calculates the md5 hash of a given file
# md5 — Calculate the md5 hash of a string

(This is slightly off topic, should these be split and put somewhere?)
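
On the Perl side, the core Digest::MD5 module seems to cover both cases. Its md5_hex function hashes a string, and the object interface's addfile method hashes an open filehandle. The sketch below wraps the latter in a helper named md5_file purely to mirror the PHP name; that helper is an illustration, not something the module provides.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper mirroring PHP's md5_file(): hex MD5 of a file's contents.
sub md5_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    binmode($fh);    # hash the raw bytes, with no newline translation
    my $hex = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $hex;
}

# md5_hex() is the string counterpart, like PHP's md5().
print md5_hex('hello world'), "\n";    # prints 5eb63bbbe01eeed093cb22bb8f5acdc3
```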
  • joebert
  • Fart Bubbles
  • Genius
  • Posts: 13502
  • Loc: Florida

Post 3+ Months Ago

Well, I've got a system set up on a website that keeps out duplicate files and simplifies the filename screening process on visitor-submitted files by using MD5 checksums of the file contents for the names of files.

Occasionally, I acquire batches of new files from sources other than visitor uploads, and these batches are ready to be transferred to the site with the exception of their filenames.

I've already written something in PHP that I have patched into my Desktop via Nautilus Actions, I just want to know how to do it in Perl. :)

Code:
<?php
 
$argv0 = $argv[0];
unset($argv[0]);
 
foreach($argv as $path)
{
    if(is_file($path))
    {
        $md5 = md5_file($path);
        $parts = pathinfo($path);
        $newpath = $parts['dirname'] . '/' . $md5 . (!empty($parts['extension']) ? ".{$parts['extension']}" : '');
        rename($path, $newpath);
    }
}
 
?>
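
For comparison, here is a rough Perl port of the PHP script above. It is a sketch under a few assumptions: it uses the core Digest::MD5 and File::Basename modules, the helper name rename_to_md5 is made up for illustration, and the extension is taken to be everything from the last dot onward, which is roughly what pathinfo() reports.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5;
use File::Basename qw(fileparse);

# Hypothetical helper: rename $path to <md5 of contents><.ext> in the same
# directory and return the new path (nothing if $path is not a plain file).
sub rename_to_md5 {
    my ($path) = @_;
    return unless -f $path;

    open(my $fh, '<', $path) or die "can't open $path: $!";
    binmode($fh);    # hash the raw bytes
    my $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);

    # fileparse returns the directory with its trailing separator, and the
    # suffix pattern splits off the last ".ext" if there is one.
    my ($name, $dir, $ext) = fileparse($path, qr/\.[^.]+/);
    my $newpath = $dir . $md5 . $ext;
    rename($path, $newpath) or die "can't rename $path: $!";
    return $newpath;
}

# Usage: perl md5rename.pl /dir/one.txt /dir/two.txt
rename_to_md5($_) for @ARGV;
```

As with the PHP version, two files with identical contents end up under the same name. On Unix-like systems rename replaces an existing target, so the second file overwrites the first.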
  • cbeckham
  • Born
  • Posts: 1

Post 3+ Months Ago

How about just doing it at a Unix command prompt in one line...

Code:
md5 /path/to/files* | cut -d " " -f 2,4 | perl -p -e 's/\((.*)\/([^\/.]+)([.].*)\) ([a-f0-9]+)/"$1\/$2$3" "$1\/$4$3"/' | xargs -L 1 /bin/mv


Works well for me on FreeBSD. Other systems may not have xargs or cut, and you may need to alter the path to mv. You'll also need Perl installed.
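
To see what the substitution in that pipeline does, it can be tried on a single line in isolation. The sketch below assumes the BSD md5 output format "MD5 (path) = hash", so after the cut step each line looks like "(path) hash". The helper name to_mv_args is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the one-liner's substitution to one line of the
# form "(<path>) <hash>", i.e. fields 2 and 4 of the assumed md5 output.
sub to_mv_args {
    my ($line) = @_;
    # $1 = directory, $2 = base name, $3 = extension with dot, $4 = hash
    $line =~ s/\((.*)\/([^\/.]+)([.].*)\) ([a-f0-9]+)/"$1\/$2$3" "$1\/$4$3"/;
    return $line;
}

print to_mv_args('(/path/to/files/one.txt) d41d8cd98f00b204e9800998ecf8427e'), "\n";
# prints "/path/to/files/one.txt" "/path/to/files/d41d8cd98f00b204e9800998ecf8427e.txt"
```

The result is a quoted old path and new path, which xargs -L 1 then hands to mv one pair at a time. A name with no dot in it does not match the pattern, so that line passes through unchanged. Paths containing spaces would already be split by the cut step.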
  • kc0tma
  • o|||||||o
  • Web Master
  • Posts: 3318
  • Loc: Trout Creek, MT

Post 3+ Months Ago

That regex and command piping is pretty complicated; I like alex89's post.

© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.