Summary: Learn about the binary file formats that are used in current and previous Microsoft Office products, including how to use them, their basic structures, and key concepts for interacting with them
programmatically.
Applies to: Excel 2010 | Office 2007 | Office 2010 | Office client | Open XML | PowerPoint 2010 | SharePoint Server 2010 | VBA | Word 2010
In this article
What Are Binary File Formats?
What Versions of Microsoft Office Use Binary File Format Files?
Viewing Content in Microsoft Office Binary File Format–Based Files
Conclusion
Additional Resources
Published: February 2011
This article is the first in a series of articles that introduce the binary file formats used by Microsoft Office products. This first article provides an overview of how to work with Microsoft Office binary file formats in general,
and explains some of the shared structural traits and key concepts that the different formats have in common. The other articles in the series provide more detail about the individual file formats. These articles are
designed to be used in conjunction with the Microsoft Office File Format Documents available on MSDN.
This article series deals with only the four core Microsoft Office products: Microsoft Word, Microsoft PowerPoint, Microsoft Excel, and Microsoft Outlook.
All of the file data in Microsoft Office binary file formats exists in one or more streams. Each stream contains data structures to store metadata, such as user and system information and file properties, formatting
information, text content, and media content. These data structures are expressed as groups of hexadecimal numbers that the host program interprets and presents through its user interface.
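As a minimal sketch of what "groups of hexadecimal numbers" means in practice, the following Python snippet shows the same six bytes once as hexadecimal text and once as integers. The byte values and the field layout (a 16-bit value followed by a 32-bit value, both little-endian) are illustrative assumptions and are not taken from any Office file format specification.

```python
import struct
from typing import Tuple

# Hypothetical bytes as they might appear somewhere in a stream (illustrative only).
raw = bytes.fromhex("34 12 10 00 00 00")

def hex_view(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hexadecimal numbers."""
    return " ".join(f"{b:02X}" for b in data)

def read_fields(data: bytes) -> Tuple[int, int]:
    """Interpret the first six bytes as a 16-bit and a 32-bit unsigned
    little-endian integer (an assumed layout, for illustration)."""
    return struct.unpack("<HI", data[:6])

print(hex_view(raw))          # the raw view a hex editor would show
ident, value = read_fields(raw)
print(hex(ident), value)      # the same bytes interpreted as numbers
```

A host program does the same kind of interpretation, but uses the field layouts defined in the specification for the relevant format.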
The organization of data structures within a stream varies. The most common unit of data is a record. A record typically contains metadata about the file in the form of fields and flags, including one
or more offset values that indicate the locations of other relevant records or data. Text is stored as numeric values that represent ANSI or Unicode characters. Images can be stored as pointers to external files or as
embedded images in their own binary file formats, such as .gif, .jpeg, or .png, within the file. More active content, such as PowerPoint slide transitions, is marked with the information that is needed for interpretation,
such as the transition properties, and is then rendered by the host program.
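The idea of records that carry fields, flags, offsets, and encoded text can be sketched as follows. The record layout here (type, flags, offset of the next record, text length, then the text bytes) and the `FLAG_UNICODE` bit are hypothetical, invented for illustration; the real layouts are defined in the individual file format specifications.

```python
import struct
from typing import List, Tuple

# Hypothetical header: type (uint16), flags (uint16),
# next_offset (uint32), text_length in bytes (uint16); little-endian.
HEADER = struct.Struct("<HHIH")
FLAG_UNICODE = 0x0001  # assumed flag: text is UTF-16LE; otherwise ANSI (cp1252)

def make_record(rec_type: int, text: str, unicode_text: bool, next_offset: int) -> bytes:
    """Build one record in the hypothetical layout."""
    payload = text.encode("utf-16-le" if unicode_text else "cp1252")
    flags = FLAG_UNICODE if unicode_text else 0
    return HEADER.pack(rec_type, flags, next_offset, len(payload)) + payload

def walk_records(stream: bytes, start: int = 0) -> List[Tuple[int, str]]:
    """Follow next_offset links from start; an offset of 0 ends the chain."""
    results = []
    offset = start
    while True:
        rec_type, flags, next_offset, text_len = HEADER.unpack_from(stream, offset)
        body = stream[offset + HEADER.size: offset + HEADER.size + text_len]
        encoding = "utf-16-le" if flags & FLAG_UNICODE else "cp1252"
        results.append((rec_type, body.decode(encoding)))
        if next_offset == 0:
            break
        offset = next_offset
    return results

# Two records: the first stores ANSI text and points at the second,
# which stores Unicode text and ends the chain.
first_len = HEADER.size + len("Hello".encode("cp1252"))
stream = (make_record(1, "Hello", False, next_offset=first_len)
          + make_record(2, "Olá", True, next_offset=0))
print(walk_records(stream))
```

The sketch trusts the offsets it reads. A reader for real files would also need to validate offsets and lengths against the stream size before following them.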
The file formats used by Microsoft Word, Microsoft PowerPoint, Microsoft Excel, and Microsoft Outlook are all comprehensively documented in the MSDN library under Microsoft Office File Format
Documents. From there, you can open the full specification for each file format, either directly on the MSDN site or as a .pdf file.
Note
The recommended way to perform most programming tasks in Microsoft Office is to use the Office Primary Interop Assemblies, a set of .NET classes that provide a complete object model for working with
Microsoft Office. This article series deals only with advanced scenarios, such as those in which Microsoft Office is not installed.
MS-DOC: Microsoft Word 97, Microsoft Word 2000, Microsoft Word 2002, Microsoft Office Word 2003
MS-PPT: Microsoft PowerPoint 97, Microsoft PowerPoint 2000, Microsoft PowerPoint 2002, Microsoft Office PowerPoint 2003
MS-PST: Microsoft Outlook 2000, Microsoft Outlook 2002, Microsoft Office Outlook 2003, Microsoft Office Outlook 2007, Microsoft Outlook 2010
MS-XLS: Microsoft Excel 97, Microsoft Excel 2000, Microsoft Excel 2002, Microsoft Office Excel 2003
Understanding Office Binary File Formats https://msdn.microsoft.com/en-us/library/office/gg615407(d=printer,v=...
© 2017 Microsoft