Retrieve docProps from a .docx file

In SharePoint Online you can assign metadata to documents. However, Office Open XML files can also have document properties. In this article I am showing you how you can retrieve those via Power Automate.

docproperties_example

Inspiration

This question from DanielStrid:

I’m trying to get the “File Version” property set for a file in Windows Explorer in a Flow. We have a version handling for documents and would like to have it as Metadata and process files differently in a Power Automate flow depending on version. I can not find the other data, for example “Authors” either.

Power Users Community thread: Get File Property ‘Version Number’ for file in Sharepoint (NOT Sharepoint version).

Zip Files

It is probably useful to know that Office Open XML files like .docx are .zip files with XML and other data files, along with a specification of the relationships between them. One of those folders is docProps which contains the core.xml with the properties. Our goal is to retrieve one or all of those properties from the core.xml.

Properties method approach

I managed to find two approaches to retrieve the document properties. One is a bit (over)complicated and the other one is a bit simpler.

The first approach is the Unzip approach. In this approach we are going to rename the .docx as a zip, extract and retrieve the core.xml from the docProps folder. With some substring functions you are able to find your properties.

docprops_odfb

The second approach is the Properties method approach. We can use the Properties method from the REST API in a Send an HTTP request action. It is more straightforward and you can even achieve this with one action. So, it would be my suggested approach.

simpler_way

Unzip approach

Like mentioned earlier in this blog, the Properties method approach is probably simpler and easier. However, that does not mean we can have some fun and trying the unzip approach with Power Automate :P.

In addition to that this setup might be useful for other scenarios as well. That is why I still wanted to share it.

unzip_approach

1. Add a Manually trigger a flow trigger action.

2. Add a Get file content using path action (from the SharePoint Connector).
This step is to retrieve the content of the file located in a SharePoint Online location.

getfilecontentusingpath

3. Add a Create file action. Select a location and a filename you want to use. Use the body of the previous action.
This step is to create a new temporary file in OneDrive for Business location.

createfile_odfb

4. Add a Move or rename a file using path action. Use the Path field of the previous action.
This step is to rename the temporary file to a .zip file.

renameafileusingpath

5. Add a Extract archive to folder action. Use the Path field of the previous action.
This step is to unzip the .zip file.

extractarchivetofolder

6. Add a Get file content using path action (from the OneDrive for Business Connector).
This step is retrieve the core.xml content.

getfilecontentusingpath_odfb

7. Add a Compose action. Add the two expressions of the code snippet below in the expression editor.
This step uses split and substring functions to retrieve the correct property values. It also might be possible with xpath, but I have not tried that.

compose_substringfunction

That should be it for the setup.

Happy testing!

You may also like...

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.