oracular (3) Audio::Scan.3pm.gz

Provided by: libaudio-scan-perl_1.01-2build4_amd64 bug

NAME

       Audio::Scan - Fast C metadata and tag reader for all common audio file formats

SYNOPSIS

           use Audio::Scan;

           my $data = Audio::Scan->scan('/path/to/file.mp3');

           # Just file info
           my $info = Audio::Scan->scan_info('/path/to/file.mp3');

           # Just tags
           my $tags = Audio::Scan->scan_tags('/path/to/file.mp3');

           # Scan without reading (possibly large) artwork into memory.
           # Instead of binary artwork data, the size of the artwork will be returned instead.
           {
               local $ENV{AUDIO_SCAN_NO_ARTWORK} = 1;
               my $data = Audio::Scan->scan('/path/to/file.mp3');
           }

           # Scan a filehandle
           open my $fh, '<', 'my.mp3';
           my $data = Audio::Scan->scan_fh( mp3 => $fh );
           close $fh;

           # Scan and compute an audio MD5 checksum
           my $data = Audio::Scan->scan( '/path/to/file.mp3', { md5_size => 100 * 1024 } );
           my $md5 = $data->{info}->{audio_md5};

DESCRIPTION

       Audio::Scan is a C-based scanner for audio file metadata and tag information. It currently
       supports MP3, MP4, Ogg Vorbis, FLAC, ASF, WAV, AIFF, Musepack, Monkey's Audio, and
       WavPack.

       See below for specific details about each file format.

METHODS

   scan( $path, [ \%OPTIONS ] )
       Scans $path for both metadata and tag information.  The type of scan performed is
       determined by the file's extension.  Supported extensions are:

           MP3:  mp3, mp2
           MP4:  mp4, m4a, m4b, m4p, m4v, m4r, k3g, skm, 3gp, 3g2, mov
           AAC (ADTS): aac
           Ogg:  ogg, oga
           FLAC: flc, flac, fla
           ASF:  wma, wmv, asf
           Musepack:  mpc, mpp, mp+
           Monkey's Audio:  ape, apl
           WAV: wav
           AIFF: aiff, aif
           WavPack: wv

       This method returns a hashref containing two other hashrefs: info and tags.  The contents
       of the info and tag hashes vary depending on file format, see below for details.

       An optional hashref may be provided with the following values:

           md5_size => $audio_bytes_to_checksum

       An MD5 will be computed of the first N audio bytes. Any tags in the file are automatically
       skipped, so this is a useful way of determining if a file's audio content is the same even
       if tags may have been changed.  The hex MD5 value is returned in the $info->{audio_md5}
       key.  This option will reduce performance, so choose a small enough size that works for
       you, you should probably avoid using more than 64K for example.

       For FLAC files that already contain an MD5 checksum, this value will be used instead of
       calculating a new one.

           md5_offset => $offset

       Begin computing the audio_md5 value starting at $offset.  If this value is not specified,
       $offset defaults to a point in the middle of the file.

   scan_info( $path, [ \%OPTIONS ] )
       If you only need file metadata and don't care about tags, you can use this method.

   scan_tags( $path, [ \%OPTIONS ] )
       If you only need the tags and don't care about the metadata, use this method.

   scan_fh( $type => $fh, [ \%OPTIONS ] )
       Scans a filehandle. $type is the type of file to scan as, i.e. "mp3" or "ogg".  Note that
       FLAC does not support reading from a filehandle.

   find_frame( $path, $timestamp_in_ms )
       Returns the byte offset to the first audio frame starting from the given timestamp (in
       milliseconds).

       MP3, Ogg, FLAC, ASF, MP4
           The byte offset to the data packet containing this timestamp will be returned. For
           file formats that don't provide timestamp information such as MP3, the best estimate
           for the location of the timestamp will be returned.  This will be more accurate if the
           file has a Xing header or is CBR for example.

       WAV, AIFF, Musepack, Monkey's Audio, WavPack
           Not yet supported by find_frame.

   find_frame_return_info( $mp4_path, $timestamp_in_ms )
       The header of an MP4 file contains various metadata that refers to the structure of the
       audio data, making seeking more difficult to perform. This method will return the usual
       $info hash with 2 additional keys:

           seek_offset - The seek offset in bytes
           seek_header - A rewritten MP4 header that can be prepended to the audio data
                         found at seek_offset to construct a valid bitstream. Specifically,
                         the following boxes are rewritten: stts, stsc, stsz, stco

       For example, to seek 30 seconds into a file and write out a new MP4 file seeked to this
       point:

           my $info = Audio::Scan->find_frame_return_info( $file, 30000 );

           open my $f, '<', $file;
           sysseek $f, $info->{seek_offset}, 1;

           open my $fh, '>', 'seeked.m4a';
           print $fh $info->{seek_header};

           while ( sysread( $f, my $buf, 65536 ) ) {
               print $fh $buf;
           }

           close $f;
           close $fh;

   find_frame_fh( $type => $fh, $offset )
       Same as "find_frame", but with a filehandle.

   find_frame_fh_return_info( $type => $fh, $offset )
       Same as "find_frame_return_info", but with a filehandle.

   has_flac()
       Deprecated.  Always returns 1 now that FLAC is always enabled.

   is_supported( $path )
       Returns 1 if the given path can be scanned by Audio::Scan, or 0 if not.

   get_types()
       Returns an array of strings of the file types supported by Audio::Scan.

   extensions_for( $type )
       Returns an array of strings of the file extensions that are considered to be the file type
       $type.

   type_for( $extension )
       Returns file type for a given extension. Returns undef for unsupported extensions.

SKIPPING ARTWORK

       To save memory while reading tags, you can opt to skip potentially large embedded artwork.
       To do this, set the environment variable AUDIO_SCAN_NO_ARTWORK:

           local $ENV{AUDIO_SCAN_NO_ARTWORK} = 1;
           my $tags = Audio::Scan->scan_tags($file);

       This will return the length of the embedded artwork instead of the actual image data.  In
       some cases it will also return a byte offset to the image data, which can be used to
       extract the image using more efficient means.  Note that the offset is not always returned
       so if you want to use this data make sure to check for offset.  If offset is not present,
       the only way to get the image data is to perform a normal tag scan without the environment
       variable set.

       One limitation that currently exists is that memory for embedded images is still allocated
       for ASF and Ogg Vorbis files.

       This information is returned in different ways depending on the format:

       ID3 (MP3, AAC, WAV, AIFF):

           $tags->{APIC}->[3]: image length
           $tags->{APIC}->[4]: image offset (unless APIC would need unsynchronization)

       MP4:

           $tags->{COVR}: image length
           $tags->{COVR_offset}: image offset (always available)

       Ogg Vorbis:

           $tags->{ALLPICTURES}->[0]->{image_data}: image length
           Image offset is not supported with Vorbis because the data is always base64-encoded.

       FLAC:

           $tags->{ALLPICTURES}->[0]->{image_data}: image length
           $tags->{ALLPICTURES}->[0]->{offset}: image offset (always available)

       ASF:

           $tags->{'WM/Picture'}->{image}: image length
           $tags->{'WM/Picture'}->{offset}: image offset (always available)

       APE, Musepack, WavPack, MP3 with APEv2:

           $tags->{'COVER ART (FRONT)'}: image length
           $tags->{'COVER ART (FRONT)_offset'}: image offset (always available)

MP3

   INFO
       The following metadata about a file may be returned:

           id3_version (i.e. "ID3v2.4.0")
           id3_was_unsynced (if a v2.2/v2.3 file needed whole-tag unsynchronization)
           song_length_ms (duration in milliseconds)
           layer (i.e. 3)
           stereo
           samples_per_frame
           padding
           audio_size (size of all audio frames)
           audio_offset (byte offset to first audio frame)
           bitrate (in bps, determined using Xing/LAME/VBRI if possible, or average in the worst case)
           samplerate (in kHz)
           vbr (1 if file is VBR)
           dlna_profile (if file is compliant)

           If a Xing header is found:
           xing_frames
           xing_bytes
           xing_quality

           If a VBRI header is found:
           vbri_delay
           vbri_frames
           vbri_bytes
           vbri_quality

           If a LAME header is found:
           lame_encoder_version
           lame_tag_revision
           lame_vbr_method
           lame_lowpass
           lame_replay_gain_radio
           lame_replay_gain_audiophile
           lame_encoder_delay
           lame_encoder_padding
           lame_noise_shaping
           lame_stereo_mode
           lame_unwise_settings
           lame_source_freq
           lame_surround
           lame_preset

   TAGS
       Raw tags are returned as found.  This means older tags such as ID3v1 and ID3v2.2/v2.3 are
       converted to ID3v2.4 tag names.  Multiple instances of a tag in a file will be returned as
       arrays.  Complex tags such as APIC and COMM are returned as arrays.  All tag fields are
       converted to upper-case.  All text is converted to UTF-8.

       Sample tag data:

           tags => {
                 ALBUMARTISTSORT => "Solar Fields",
                 APIC => [ "image/jpeg", 3, "", <binary data snipped> ],
                 CATALOGNUMBER => "INRE 017",
                 COMM => ["eng", "", "Amazon.com Song ID: 202981429"],
                 "MUSICBRAINZ ALBUM ARTIST ID" => "a2af1f31-c9eb-4fff-990c-c4f547a11b75",
                 "MUSICBRAINZ ALBUM ID" => "282143c9-6191-474d-a31a-1117b8c88cc0",
                 "MUSICBRAINZ ALBUM RELEASE COUNTRY" => "FR",
                 "MUSICBRAINZ ALBUM STATUS" => "official",
                 "MUSICBRAINZ ALBUM TYPE" => "album",
                 "MUSICBRAINZ ARTIST ID" => "a2af1f31-c9eb-4fff-990c-c4f547a11b75",
                 "REPLAYGAIN_ALBUM_GAIN" => "-2.96 dB",
                 "REPLAYGAIN_ALBUM_PEAK" => "1.045736",
                 "REPLAYGAIN_TRACK_GAIN" => "+3.60 dB",
                 "REPLAYGAIN_TRACK_PEAK" => "0.892606",
                 TALB => "Leaving Home",
                 TCOM => "Magnus Birgersson",
                 TCON => "Ambient",
                 TCOP => "2005 ULTIMAE RECORDS",
                 TDRC => "2004-10",
                 TIT2 => "Home",
                 TPE1 => "Solar Fields",
                 TPE2 => "Solar Fields",
                 TPOS => "1/1",
                 TPUB => "Ultimae Records",
                 TRCK => "1/11",
                 TSOP => "Solar Fields",
                 UFID => [
                       "http://musicbrainz.org",
                       "1084278a-2254-4613-a03c-9fed7a8937ca",
                 ],
           },

MP4

   INFO
       The following metadata about a file may be returned:

           audio_offset (byte offset to start of mdat)
           audio_size
           compatible_brands
           file_size
           leading_mdat (if file has mdat before moov)
           major_brand
           minor_version
           song_length_ms
           timescale
           dlna_profile (if file is compliant)
           tracks (array of tracks in the file)
               Each track may contain:

               audio_type
               avg_bitrate
               bits_per_sample
               channels
               duration
               encoding
               handler_name
               handler_type
               id
               max_bitrate
               samplerate

   TAGS
       Tags are returned in a hash with all keys converted to upper-case.  Keys starting with
       0xA9 (copyright symbol) will have this character stripped out.  Sample tag data:

           tags => {
              AART              => "Album Artist",
              ALB               => "Album",
              ART               => "Artist",
              CMT               => "Comments",
              COVR              => <binary data snipped>,
              CPIL              => 1,
              DAY               => 2009,
              DESC              => "Video Description",
              DISK              => "1/2",
              "ENCODING PARAMS" => "vers\0\0\0\1acbf\0\0\0\2brat\0\1w\0cdcv\0\1\6\5",
              GNRE              => "Jazz",
              GRP               => "Grouping",
              ITUNNORM          => " 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000",
              ITUNSMPB          => " 00000000 00000840 000001E4 00000000000001DC 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000",
              LYR               => "Lyrics",
              NAM               => "Name",
              PGAP              => 1,
              SOAA              => "Sort Album Artist",
              SOAL              => "Sort Album",
              SOAR              => "Sort Artist",
              SOCO              => "Sort Composer",
              SONM              => "Sort Name",
              SOSN              => "Sort Show",
              TMPO              => 120,
              TOO               => "iTunes 8.1.1, QuickTime 7.6",
              TRKN              => "1/10",
              TVEN              => "Episode ID",
              TVES              => 12,
              TVSH              => "Show",
              TVSN              => 12,
              WRT               => "Composer",
           },

AAC (ADTS)

   INFO
       The following metadata about a file is returned:

           audio_offset
           audio_size
           bitrate (in bps)
           channels
           file_size
           profile (Main, LC, or SSR)
           samplerate (in kHz)
           song_length_ms (duration in milliseconds)
           dlna_profile (if file is compliant)

OGG VORBIS

   INFO
       The following metadata about a file is returned:

           version
           channels
           stereo
           samplerate (in kHz)
           bitrate_average (in bps)
           bitrate_upper
           bitrate_nominal
           bitrate_lower
           blocksize_0
           blocksize_1
           audio_offset (byte offset to audio)
           audio_size
           song_length_ms (duration in milliseconds)

   TAGS
       Raw Vorbis comments are returned.  All comment keys are capitalized.

FLAC

   INFO
       The following metadata about a file is returned:

           channels
           samplerate (in kHz)
           bitrate (in bps)
           file_size
           audio_offset (byte offset to first audio frame)
           audio_size
           song_length_ms (duration in milliseconds)
           bits_per_sample
           frames
           minimum_blocksize
           maximum_blocksize
           minimum_framesize
           maximum_framesize
           audio_md5
           total_samples

   TAGS
       Raw FLAC comments are returned.  All comment keys are capitalized.  Some data returned is
       special:

       APPLICATION

           Each application block is returned in the APPLICATION tag keyed by application ID.

       CUESHEET_BLOCK

           The CUESHEET_BLOCK tag is an array containing each line of the cue sheet.

       ALLPICTURES

           Embedded pictures are returned in an ALLPICTURES array.  Each picture has the following metadata:

               mime_type
               description
               width
               height
               depth
               color_index
               image_data
               picture_type

ASF (Windows Media Audio/Video)

   INFO
       The following metadata about a file may be returned.  Reading the ASF spec is encouraged
       if you want to find out more about any of these values.

           audio_offset (byte offset to first data packet)
           audio_size
           broadcast (boolean, whether the file is a live broadcast or not)
           codec_list (array of information about codecs used in the file)
           creation_date (UNIX timestamp when file was created)
           data_packets
           drm_key
           drm_license_url
           drm_protection_type
           drm_data
           file_id (unique file ID)
           file_size
           index_blocks
           index_entry_interval (in milliseconds)
           index_offsets (byte offsets for each second of audio, per stream. Useful for seeking)
           index_specifiers (indicates which stream a given index_offset points to)
           language_list (array of languages referenced by the file's metadata)
           lossless (boolean)
           max_bitrate
           max_packet_size
           min_packet_size
           mutex_list (mutually exclusive stream information)
           play_duration_ms
           preroll
           script_commands
           script_types
           seekable (boolean, whether the file is seekable or not)
           send_duration_ms
           song_length_ms (the actual length of the audio, in milliseconds)
           dlna_profile (if file is compliant)

       STREAMS

       The streams array contains metadata related to an individul stream within the file.  The
       following metadata may be returned:

           DeviceConformanceTemplate
           IsVBR
           alt_bitrate
           alt_buffer_fullness
           alt_buffer_size
           avg_bitrate (most accurate bitrate for this stream)
           avg_bytes_per_sec (audio only)
           bitrate
           bits_per_sample (audio only)
           block_alignment (audio only)
           bpp (video only)
           buffer_fullness
           buffer_size
           channels (audio only)
           codec_id (audio only)
           compression_id (video only)
           encode_options
           encrypted (boolean)
           error_correction_type
           flag_seekable (boolean)
           height (video only)
           index_type
           language_index (offset into language_list array)
           max_object_size
           samplerate (in kHz) (audio only)
           samples_per_block
           stream_number
           stream_type
           super_block_align
           time_offset
           width (video only)

   TAGS
       Raw tags are returned.  Tags that occur more than once are returned as arrays.  In
       contrast to the other formats, tag keys are NOT capitalized. There is one special key:

       WM/Picture

       Pictures are returned as a hash with the following keys:

           image_type (numeric type, same as ID3v2 APIC)
           mime_type
           description
           image

WAV

   INFO
       The following metadata about a file may be returned.

           audio_offset
           audio_size
           bitrate (in bps)
           bits_per_sample
           block_align
           channels
           dlna_profile (if file is compliant)
           file_size
           format (WAV format code, 1 == PCM)
           id3_version (if an ID3v2 tag is found)
           samplerate (in kHz)
           song_length_ms

   TAGS
       WAV files can contain several different types of tags.  "Native" WAV tags found in a LIST
       block may include these and others:

           IARL - Archival Location
           IART - Artist
           ICMS - Commissioned
           ICMT - Comment
           ICOP - Copyright
           ICRD - Creation Date
           ICRP - Cropped
           IENG - Engineer
           IGNR - Genre
           IKEY - Keywords
           IMED - Medium
           INAM - Name (Title)
           IPRD - Product (Album)
           ISBJ - Subject
           ISFT - Software
           ISRC - Source
           ISRF - Source Form
           TORG - Label
           LOCA - Location
           TVER - Version
           TURL - URL
           TLEN - Length
           ITCH - Technician
           TRCK - Track
           ITRK - Track

       ID3v2 tags can also be embedded within WAV files.  These are returned exactly as for MP3
       files.

AIFF

   INFO
       The following metadata about a file may be returned.

           audio_offset
           audio_size
           bitrate (in bps)
           bits_per_sample
           block_align
           channels
           compression_name (if AIFC)
           compression_type (if AIFC)
           dlna_profile (if file is compliant)
           file_size
           id3_version (if an ID3v2 tag is found)
           samplerate (in kHz)
           song_length_ms

   TAGS
       ID3v2 tags can be embedded within AIFF files.  These are returned exactly as for MP3
       files.

MONKEY'S AUDIO (APE)

   INFO
       The following metadata about a file may be returned.

           audio_offset
           audio_size
           bitrate (in bps)
           channels
           compression
           file_size
           samplerate (in kHz)
           song_length_ms
           version

   TAGS
       APEv2 tags are returned as a hash of key/value pairs.

MUSEPACK

   INFO
       The following metadata about a file may be returned.

           audio_offset
           audio_size
           bitrate (in bps)
           channels
           encoder
           file_size
           profile
           samplerate (in kHz)
           song_length_ms

   TAGS
       Musepack uses APEv2 tags.  They are returned as a hash of key/value pairs.

WAVPACK

   INFO
       The following metadata about a file may be returned.

           audio_offset
           audio_size
           bitrate (in bps)
           bits_per_sample
           channels
           encoder_version
           file_size
           hybrid (1 if file is lossy) (v4 only)
           lossless (1 if file is lossless) (v4 only)
           samplerate
           song_length_ms
           total_samples

   TAGS
       WavPack uses APEv2 tags.  They are returned as a hash of key/value pairs.

DSF

   INFO
       The following metadata about a file may be returned.

           audio_offset
           audio_size
           bits_per_sample
           channels
           song_length_ms
           samplerate
           block_size_per_channel

   TAGS
       ID3v2 tags can be embedded within DSF files.  These are returned exactly as for MP3 files.

DSDIFF (DFF)

   INFO
       The following metadata about a file may be returned.

           audio_offset
           audio_size
           bits_per_sample
           channels
           song_length_ms
           samplerate
           tag_diti_title
           tag_diar_artist

   TAGS
       No separate tags are supported by the DSDIFF format.

THANKS

       Logitech & Slim Devices, for letting us release so much of our code to the world.  Long
       live Squeezebox!

       Kimmo Taskinen, Adrian Smith, Clive Messer, and Jurgen Kramer for DSF/DSDIFF support and
       various other fixes.

       Some code from the Rockbox project was very helpful in implementing ASF and MP4 seeking.

       Some of the file format parsing code was derived from the mt-daapd project, and adapted by
       Netgear.  It has been heavily rewritten to fix bugs and add more features.

       The source to the original Netgear C scanner for SqueezeCenter is located at
       <http://svn.slimdevices.com/repos/slim/7.3/trunk/platforms/readynas/contrib/scanner>

       The audio MD5 feature uses an MD5 implementation by L. Peter Deutsch, <ghost@aladdin.com>.

SEE ALSO

       ASF Spec <http://www.microsoft.com/windows/windowsmedia/forpros/format/asfspec.aspx>

       MP4 Info:
       <http://standards.iso.org/ittf/PubliclyAvailableStandards/c051533_ISO_IEC_14496-12_2008.zip>
       <http://www.geocities.com/xhelmboyx/quicktime/formats/mp4-layout.txt>

AUTHORS

       Andy Grundman, <andy@hybridized.org>

       Dan Sully, <daniel@cpan.org>

       Copyright (C) 2010-2011 Logitech, Inc.

       This program is free software; you can redistribute it and/or modify it under the terms of
       the GNU General Public License as published by the Free Software Foundation; either
       version 2 of the License, or (at your option) any later version.