PHP MP3 ID3

  • wphilipw
  • Student
  • Posts: 87
  • Loc: PA, US

Post 3+ Months Ago

I have been working on an MP3 ID3 tag reader. For now it only reads ID3v2, which is OK, but I am having a problem with the APIC tag (the embedded image tag): I am unable to get an accurate size for it, and I can't figure out where my error is. The code works great for the other tags in the MP3, just not APIC; it stops about halfway through on some images and about 90% of the way through on others. I don't really get it, and I know that multiple minds are better than one, so would someone be kind enough to look through this and tell me where my mistake is? Thanks!
Code:
function shift7_8($num) {
$num_1 = bindec(str_pad(substr($num, 0, 4), 8, "0", STR_PAD_LEFT));
$num_2 = bindec(substr($num, 4, 8));
$num_3 = bindec(substr($num, 12, 8));
$num_4 = bindec(substr($num, 20, 8));
$num = $num_1 . $num_2 . $num_3 . $num_4;
unset($num_1, $num_2, $num_3, $num_4);
return $num;
}

function id3_read($file) {
// BEGIN ID3 READ

// OPEN MP3 FILE
$mp3 = fopen($file, "r");

// ID3 HEADER READ

// READ THE ALWAYS-PRESENT "ID3" IDENTIFIER
$id3['base'] = fread($mp3, 3);
// GET THE HEX ID3 VERSION AND CONVERT IT
$id3['version'] = "";
for ($x = 0; $x < 2; $x++) {
$id3['version'] .= str_pad(dechex(ord(fread($mp3, 1))), 2, "0", STR_PAD_LEFT);
}
$id3['version'] = str_split($id3['version']);
if ($id3['version'][0] == "0") { $id3['version'][0] = ""; }
if ($id3['version'][2] == "0") { $id3['version'][2] = ""; }
$id3['version'] = "2." . $id3['version'][0] . $id3['version'][1] . "." . $id3['version'][2] . $id3['version'][3];
// GET THE ID3 BINARY FLAGS - STANDARDS STORED IN FIRST 4 BITS
for ($x = 0; $x < 1; $x++) {
$id3['flags'] = str_pad(decbin(ord(fread($mp3, 1))), 8, "0", STR_PAD_LEFT);
}
$id3['flags'] = str_split($id3['flags']);
// GET THE 4 ID3 SIZE BYTES AND CONVERT THEM TO 8 BIT NUMBERS.
$id3['size'] = "";
for ($x = 0; $x < 4; $x++) {
$id3['size'] .= str_pad(decbin(ord(fread($mp3, 1))), 8, "0", STR_PAD_LEFT);
}
$id3['size'] = shift7_8($id3['size']);

// IF THERE IS AN EXTENDED HEADER, SKIP IT
if ($id3['flags'][1] == 1) {
// GET THE 4 ID3 EXT HEADER SIZE BYTES AND CONVERT THEM TO 8 BIT NUMBERS.
$id3_exthead_size = "";
for ($x = 0; $x < 4; $x++) {
$id3_exthead_size .= str_pad(decbin(ord(fread($mp3, 1))), 7, "0", STR_PAD_LEFT);
}
$id3_exthead_size = shift7_8($id3_exthead_size);
// SKIP THE EXT HEADER
$tmp = fread($mp3, ($id3_exthead_size - 4));
unset($tmp, $id3_exthead_size);
}

// READ FRAME HEADERS AND LOAD FRAMES INTO ARRAY
$v = 0;
$r = 0;
while ($v < $id3['size']) {
// GET FRAME ID
$id3['frames'][$r]['id'] = "";
for ($x = 0; $x < 4; $x++) {
$id3['frames'][$r]['id'] .= chr(hexdec(str_pad(dechex(ord(fread($mp3, 1))), 2, "0", STR_PAD_LEFT)));
}
if (!preg_match('/[A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9]/', $id3['frames'][$r]['id']) || $id3['frames'][$r]['id'] == "") {
unset ($id3['frames'][$r]);
break;
}
// GET THE 4 FRAME SIZE BYTES AND CONVERT THEM TO 8 BIT NUMBERS.
$id3['frames'][$r]['size'] = "";
for ($x = 0; $x < 4; $x++) {
$id3['frames'][$r]['size'] .= str_pad(decbin(ord(fread($mp3, 1))), 7, "0", STR_PAD_LEFT);
}
$id3['frames'][$r]['size'] = shift7_8($id3['frames'][$r]['size']);
echo $id3['frames'][$r]['size'];
// GET THE 2 FRAME FLAG BYTES
$id3['frames'][$r]['flags'] = "";
for ($x = 0; $x < 2; $x++) {
$id3['frames'][$r]['flags'] .= str_pad(dechex(ord(fread($mp3, 1))), 2, "0", STR_PAD_LEFT);
}
// SKIP A BYTE - (TEXT ENCODING)...
$temp = fread($mp3, 1);
unset($temp);
// IF NOT APIC, THEN GET THE FRAME CONTENT
if ($id3['frames'][$r]['id'] != "APIC") {
$id3['frames'][$r]['content'] = fread($mp3, ($id3['frames'][$r]['size'] - 1));
} else {
$tmp = fread($mp3, 1);
while (str_pad(dechex(ord($tmp)), 2, "0", STR_PAD_LEFT) != "00") {
$id3['frames'][$r]['mime'] .= $tmp;
$id3['frames'][$r]['size'] = $id3['frames'][$r]['size'] - 1;
$tmp = fread($mp3, 1);
}
echo $id3['frames'][$r]['size'];
$id3['frames'][$r]['picture_type'] = fread($mp3, 1);
$id3['frames'][$r]['size'] = $id3['frames'][$r]['size'] - 2;
$tmp = fread($mp3, 1);
while (str_pad(dechex(ord($tmp)), 2, "0", STR_PAD_LEFT) != "00") {
$id3['frames'][$r]['description'] .= $tmp;
$id3['frames'][$r]['size'] = $id3['frames'][$r]['size'] - 1;
$tmp = fread($mp3, 1);
}
unset ($tmp);
$id3['frames'][$r]['size'] = $id3['frames'][$r]['size'] - 1;
$id3['frames'][$r]['content'] = fread($mp3, ($id3['frames'][$r]['size']));
}
// INCREMENT FRAME NUMBER AND START OVER
$v = $v + 10 + $id3['frames'][$r]['size'];
$r++;
}
// END ID3 READ

// CLOSE MP3 FILE & RETURN VALUES
fclose($mp3);
return $id3;
}

thanks again to any and all that take a look at this.
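
For readers following along, here is a minimal sketch of how the body of an APIC frame could be split into its fields once the whole frame body has been read with a single fread of the declared size. It assumes the ID3v2.3/2.4 APIC layout (text encoding byte, null-terminated MIME type, picture type byte, description terminated according to the encoding, then the picture data). The function name `parse_apic_body` and the returned array keys are hypothetical, not part of the code posted above.

```php
<?php
// Hypothetical helper (not from the original post): splits the body of an
// APIC frame, i.e. the bytes that follow the 10-byte frame header.
function parse_apic_body(string $body): array
{
    $len = strlen($body);
    if ($len < 4) {
        throw new RuntimeException('APIC frame body is too short');
    }

    // Byte 0: text encoding used by the description field.
    $encoding = ord($body[0]);

    // MIME type: ISO-8859-1 text terminated by a single 0x00 byte.
    $mimeEnd = strpos($body, "\x00", 1);
    if ($mimeEnd === false || $mimeEnd + 1 >= $len) {
        throw new RuntimeException('APIC MIME type is not terminated');
    }
    $mime = substr($body, 1, $mimeEnd - 1);

    // One byte of picture type (3 means "front cover").
    $pictureType = ord($body[$mimeEnd + 1]);

    // Description: 0x00 ends it for encodings 0 and 3 (single-byte text),
    // 0x00 0x00 on a 2-byte boundary ends it for encodings 1 and 2 (UTF-16).
    $pos = $mimeEnd + 2;
    if ($encoding === 1 || $encoding === 2) {
        $descEnd = $pos;
        while ($descEnd + 1 < $len && substr($body, $descEnd, 2) !== "\x00\x00") {
            $descEnd += 2;
        }
        if ($descEnd + 1 >= $len) {
            throw new RuntimeException('APIC description is not terminated');
        }
        $termLen = 2;
    } else {
        $descEnd = strpos($body, "\x00", $pos);
        if ($descEnd === false) {
            throw new RuntimeException('APIC description is not terminated');
        }
        $termLen = 1;
    }

    return [
        'encoding'     => $encoding,
        'mime'         => $mime,
        'picture_type' => $pictureType,
        'description'  => substr($body, $pos, $descEnd - $pos), // raw bytes
        'data'         => (string) substr($body, $descEnd + $termLen),
    ];
}
```

In a reader like the one posted above, this would stand in for the byte-by-byte loops that shrink `size` while reading. Leaving `size` untouched also keeps the `$v` bookkeeping in step with the bytes actually consumed. That is a suggestion only, not a confirmed cause of the original problem.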
  • wphilipw
  • Student
  • Posts: 87
  • Loc: PA, US

Post 3+ Months Ago

btw, solved!
  • phreak2day
  • Born
  • Posts: 1

Post 3+ Months Ago

Hi there! I have the exact same problem and I've been slamming my head against the wall for over 2 months and I didn't find a solution. Could you tell me, if you remember, what you did to solve the problem? I keep getting stuck reading the APIC frame, want to avoid it completely but I can't seem to skip it during the reading process. This program that I am writing is for an exam that I have and that I need to finish in the next few days ASAP so I would be VERY thankful for any help you can give.



EDIT:

Solved! At least it seems to be. :P
OK, so for anyone else who might have this problem in the future (how to skip reading the APIC frame), this is what I was doing and how I solved it. ;)

What I was doing:
The official site (id3 <dot> org), which is AWFUL, says that "The ID3v2 tag size is encoded with four bytes where the most significant bit (bit 7) is set to zero in every byte, making a total of 28 bits. The zeroed bits are ignored...". I took that logic and applied it when reading the tag size and all the frame sizes. In code, I was doing this:
<C# snippet>
int sizeOfTheTag=<the size which is read>;
int readerPosition;
byte[] tagByteArray=new byte[sizeOfTheTag];
//...
//when the APIC frame ID is found (this was in a function but it all comes down to this):
frameSize = ((tagByteArray[readerPosition] << 21) | (tagByteArray[readerPosition + 1] << 14) | (tagByteArray[readerPosition + 2] << 7) | (tagByteArray[readerPosition + 3]));
readerPosition+=frameSize; //(try) skipping the APIC frame
//fail!

How I solved it:
I spent HOURS AND HOURS trying to find where the problem was, so I started reading an mp3 file in Hex view with Notepad++ and comparing its frame positions with the sizes that I got by reading them the way described above. Then I switched the view to binary and tried to do a manual conversion of the APIC frame size. Doing it the way described above (like the official site states) looks like this:
//an example size is read in binary form like this:
00000000|00000011|00111001|01000011
//then we ignore every first bit and get this:
_0000000|_0000011|_0111001|_1000011
//and then all is "pressed together" and we get:
____0000|00000000|11011100|11000011
//when converted to a decimal number we get:
56515
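
The manual conversion above can be reproduced in a few lines. This is a minimal PHP sketch (PHP being the language of the original question). `syncsafe_to_int` is a hypothetical helper name, and the `& 0x7F` masking is an addition that drops bit 7 of each byte, as the quoted text describes.

```php
<?php
// Hypothetical helper: decodes a 4-byte "syncsafe" integer, where only the
// low 7 bits of each byte carry data (28 useful bits in total).
function syncsafe_to_int(string $bytes): int
{
    if (strlen($bytes) !== 4) {
        throw new InvalidArgumentException('expected exactly 4 bytes');
    }
    // unpack('C4', ...) yields the four bytes as unsigned integers.
    $b = array_values(unpack('C4', $bytes));

    return (($b[0] & 0x7F) << 21)
         | (($b[1] & 0x7F) << 14)
         | (($b[2] & 0x7F) << 7)
         |  ($b[3] & 0x7F);
}

// The example bytes from the post: 00000000 00000011 00111001 01000011
echo syncsafe_to_int("\x00\x03\x39\x43"), "\n"; // prints 56515
```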

That, of course, was too small, and the reader position kept falling into the APIC frame. So I tried to get the frame size by NOT ignoring the first bit, like this:
//the same example size is read in binary form like this:
00000000|00000011|00111001|01000011
//when we DON'T ignore every first bit we get this:
00000000|00000011|00111001|01000011
//and when all is "pressed together" and we get:
00000000|00000011|00111001|01000011
//and then when converted to a decimal number we get:
211267

That's a BIG difference! And now I can easily read the size of the APIC frame and ignore it, for now. Maybe later I'll try and read/change it. :) So, the new code for reading the size is:
<C# snippet>
frameSize = ((tagByteArray[readerPosition] << 24) | (tagByteArray[readerPosition + 1] << 16) | (tagByteArray[readerPosition + 2] << 8) | (tagByteArray[readerPosition + 3]));
All else stayed the same.
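
A likely explanation for the difference, going by the ID3v2 specifications: the tag header size is a 28-bit syncsafe integer, but in ID3v2.3 the frame sizes are plain 32-bit big-endian integers. ID3v2.4 changed frame sizes to syncsafe as well. A reader can therefore pick the decoding from the major version byte in the tag header. The PHP sketch below assumes that rule; `frame_size` and `$majorVersion` are hypothetical names, and ID3v2.2 (which uses 3-byte sizes) is not handled.

```php
<?php
// Hypothetical helper: decodes the 4 size bytes of a frame header.
// $majorVersion is the major version byte from the tag header (3 or 4).
function frame_size(string $bytes, int $majorVersion): int
{
    if (strlen($bytes) !== 4) {
        throw new InvalidArgumentException('expected exactly 4 size bytes');
    }

    if ($majorVersion === 3) {
        // ID3v2.3: plain 32-bit big-endian integer, all 8 bits of each byte count.
        return unpack('N', $bytes)[1];
    }

    if ($majorVersion === 4) {
        // ID3v2.4: syncsafe integer, only the low 7 bits of each byte count.
        $b = array_values(unpack('C4', $bytes));
        return (($b[0] & 0x7F) << 21)
             | (($b[1] & 0x7F) << 14)
             | (($b[2] & 0x7F) << 7)
             |  ($b[3] & 0x7F);
    }

    throw new InvalidArgumentException('unsupported ID3v2 major version');
}

// Same example bytes as in the post, read both ways.
echo frame_size("\x00\x03\x39\x43", 3), "\n"; // prints 211267
echo frame_size("\x00\x03\x39\x43", 4), "\n"; // prints 56515
```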

I hope this will help everyone else who gets this problem so that you don't have to waste as much time as I have. ;) The only thing I don't know is why this hasn't affected the reading of all the other frame sizes. But who cares when it works, right? ;)
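
On the open question of why the other frames were unaffected: when the first three size bytes are zero, the 7-bit shift and the 8-bit shift both return just the last byte, so the two readings agree for small frames. Text frames such as title or artist are typically that small, while an embedded image is not. The PHP sketch below checks this by brute force. `read_shift7`, `read_shift8` and `first_mismatch` are hypothetical names that mirror the two C# formulas above.

```php
<?php
// The 7-bit shift from the first snippet, applied to four raw bytes.
function read_shift7(string $bytes): int
{
    $b = array_values(unpack('C4', $bytes));
    return ($b[0] << 21) | ($b[1] << 14) | ($b[2] << 7) | $b[3];
}

// The 8-bit shift from the second snippet (a plain big-endian integer).
function read_shift8(string $bytes): int
{
    return unpack('N', $bytes)[1];
}

// Returns the smallest size below $limit for which the two readings
// disagree, or null if they agree everywhere in that range.
function first_mismatch(int $limit): ?int
{
    for ($n = 0; $n < $limit; $n++) {
        $bytes = pack('N', $n); // $n stored as 4 big-endian bytes
        if (read_shift7($bytes) !== read_shift8($bytes)) {
            return $n;
        }
    }
    return null;
}

echo first_mismatch(100000), "\n"; // prints 256
```

So with the formula exactly as posted, any frame shorter than 256 bytes is read correctly either way, and the error only shows up on larger frames.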



If I stumble upon any problems I'll add a comment. ;)

Post Information

  • Total Posts in this topic: 3 posts

© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.