Metadata Extraction Tool - Introduction PDF

The Metadata Extraction Tool was developed by the National Library of New Zealand to automatically extract preservation metadata from a variety of file formats like PDFs, images, sound files, and Microsoft Office documents. It outputs the metadata in a standard XML format for use in preservation activities. The tool supports over a dozen file formats and can extract technical metadata as well as metadata embedded in files. It has both a graphical user interface and a command line interface to allow for batch processing or individual file processing. The open source tool is written in Java and XML and its code can be extended by developers.

Uploaded by

Freddie P

10/24/2017 Metadata Extraction Tool - Introduction

 
Metadata Extraction Tool

Introduction
 
The Metadata Extraction Tool was developed by the National Library of New Zealand to programmatically extract preservation metadata
from a range of file formats, such as PDF documents, image files, sound files, Microsoft Office documents, and many others.

The tool was initially developed in 2003 and released as open source software in 2007. The current version can be downloaded from the
SourceForge download page.

Purpose of the Metadata Extraction Tool
The Tool builds on the Library's work on digital preservation and its logical preservation metadata schema. It is designed to:

automatically extract preservation-related metadata from digital files
output that metadata in a standard format (XML) for use in preservation activities.
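
As a rough illustration of the second goal, the sketch below writes a flat set of extracted values out as XML. It uses only Python's standard library; the function name and the element names (`file`, `property`) are assumptions made for this example and are not the tool's actual output schema.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(metadata):
    """Serialise a flat metadata mapping as an XML string.

    The <file> and <property> element names are hypothetical; they only
    illustrate the idea of emitting extracted values as XML.
    """
    root = ET.Element("file")
    # Sort the keys so the output is stable from run to run.
    for name, value in sorted(metadata.items()):
        prop = ET.SubElement(root, "property", name=name)
        prop.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(metadata_to_xml({"filename": "report.pdf", "size": "1024"}))
```

Because the values pass through an XML library rather than string concatenation, characters such as `&` in a filename are escaped automatically.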

The Tool was designed for preservation processes and activities, but can also be used for other tasks, such as the extraction of metadata for
resource discovery.

Supported File Formats
 
The Metadata Extraction Tool includes a number of 'adapters' that extract metadata from specific file types. Adapters are currently provided
for:

Images: BMP, GIF, JPEG and TIFF.
Office documents: MS Word (version 2, 6), WordPerfect, OpenOffice (version 1), MS Works, MS Excel, MS PowerPoint, and PDF.
Audio and Video: WAV, MP3 (normal and with ID3 tags), BWF, FLAC.
Markup languages: HTML and XML.
Internet files: ARC.

If a file type is unknown, the tool applies a generic adapter, which extracts data that the host system 'knows' about any given file (such as
size, filename, and date created).
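
The fallback behaviour described above can be sketched as a small lookup with a default. This is a minimal illustration in Python, not the tool's own code: the registry, the function names, and the choice to select adapters by file extension are all assumptions made for the example.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def generic_adapter(path):
    """Collect only what the file system itself reports about a file."""
    info = os.stat(path)
    return {
        "filename": Path(path).name,
        "size_bytes": info.st_size,
        # st_mtime is the last-modified time. A true creation time is
        # platform-dependent, so this sketch does not try to read one.
        "modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


def image_adapter(path):
    """Stand-in for a format-specific adapter (hypothetical)."""
    metadata = generic_adapter(path)
    metadata["type"] = "image"
    return metadata


# Hypothetical registry mapping file extensions to adapters.
ADAPTERS = {
    ".bmp": image_adapter,
    ".gif": image_adapter,
    ".jpg": image_adapter,
    ".tiff": image_adapter,
}


def extract(path):
    """Use a registered adapter if one matches, else the generic one."""
    adapter = ADAPTERS.get(Path(path).suffix.lower(), generic_adapter)
    return adapter(path)
```

Calling `extract` on a file with an unregistered extension returns only the file-system values, which mirrors the generic adapter's role as a safety net for unknown types.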

Capabilities
 
The tool has both a Microsoft Windows interface and a UNIX command line interface. This allows files to be processed automatically in
batches or individually, as required.

The application opens all files as read-only, ensuring the integrity of original files. The tool only reads header information, so the extraction
process is quick.
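
To show why header-only reading is cheap, the following sketch opens a file in read-only binary mode and inspects just its first 16 bytes to recognise a few of the formats listed earlier. The byte signatures are the widely documented ones for each format; the function name and the 16-byte limit are assumptions for this example, and the source does not describe how the tool itself identifies files.

```python
# Leading byte signatures for a few common formats.
SIGNATURES = [
    (b"BM", "BMP"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),   # little-endian TIFF
    (b"MM\x00*", "TIFF"),   # big-endian TIFF
    (b"%PDF-", "PDF"),
    (b"fLaC", "FLAC"),
]

HEADER_BYTES = 16  # enough for every signature checked here


def sniff_format(path):
    """Return a format name based on the file's first bytes, or None."""
    # "rb" opens the file read-only, so the original cannot be altered.
    with open(path, "rb") as handle:
        header = handle.read(HEADER_BYTES)
    # WAV is a RIFF container with the form type "WAVE" at offset 8.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    return None
```

The cost of `sniff_format` does not grow with file size, since at most 16 bytes are read regardless of how large the file is.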

Open Source Development
 
The Tool is written in Java and XML and is distributed under the Apache License, Version 2.0.

Developers may be interested in extending key components of the Metadata Extraction Tool, for example by extending existing
adapters, developing new ones to process other file types, or creating new XSLT files to generate different XML output formats.

Please refer to the Developers Guide for more information on these components.

http://meta-extractor.sourceforge.net/
