Scripting Office with PowerShell: Finding and Replacing Text in Word Documents

In my last post, I introduced you to the Office Visual Basic for Applications (VBA) objects and showed how to use them from PowerShell to automate creating an Outlook rule. In this post, let's look at another task that is pretty tedious in the GUI applications but really simple with PowerShell: replacing text in every Word document in a directory.

First, some context. Last year, Cage Data became a fully remote company, and if you've ever had to change your business address, you'll know that it can be a royal pain to update the address everywhere. One of the places we had to update was our collection of marketing documents and sales templates. We weren't changing towns, so only the street address needed to be updated. I can't imagine the time it would've taken to go through all of these documents by hand, but with PowerShell, I was able to update all references to our address in about 15 minutes.

So let's get started. Just like with Outlook, we need to load in the VBA COM object for Word. We also need to find all of the Word documents in whatever directory we're targeting.

$Word = New-Object -ComObject Word.Application

# Search for all Word document types (.doc, .docx, .docm, .dot, .dotx, etc.)
$WordFiles = Get-ChildItem -Recurse | ? Name -like "*.do[ct]*"

As a note, you'll notice a neat little trick I used with the -like comparison in the last line of that snippet. When using -like, pretty much everyone knows you can use asterisks as a wildcard to match anything, but you can also use square brackets (just like in regular expressions) to specify a set of characters to match, in this case either a c or a t. No separator is needed inside the brackets; a comma placed there is treated as one more character to match. If you want full regex support, you can use the -match operator.
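To see the difference between the two operators, here is a small sketch that runs both against a handful of made-up file names (the names are hypothetical, as is the regex, which is just one way to be stricter about the extension). No Word installation is needed for this part.

```powershell
# Hypothetical file names, used only to illustrate the two operators
$Names = 'report.doc', 'letter.docx', 'template.dotx', 'notes.txt', 'archive.doc.bak'

# -like: wildcard pattern; [ct] matches exactly one character, either c or t.
# The trailing * also lets 'archive.doc.bak' through.
$LikeMatches = @($Names | Where-Object { $_ -like '*.do[ct]*' })

# -match: full regular expression; anchoring with $ rejects 'archive.doc.bak'
$RegexMatches = @($Names | Where-Object { $_ -match '\.do[ct][xm]?$' })

"-like  matched: $($LikeMatches -join ', ')"
"-match matched: $($RegexMatches -join ', ')"
```

The wildcard version is shorter, but because it ends in an asterisk it will also pick up things like backup copies whose names merely contain .doc. The anchored regex is stricter if that matters for your directory.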

Before we actually start making changes to files, let's set up what we're going to do in each one:

$FindText = "10 Main Street" # <= Find this text
$ReplaceText = "22 Industrial Park Road" # <= Replace it with this text

$MatchCase = $false
$MatchWholeWorld = $true
$MatchWildcards = $false
$MatchSoundsLike = $false
$MatchAllWordForms = $false
$Forward = $false
$Wrap = 1 # wdFindContinue
$Format = $false
$Replace = 2 # wdReplaceAll

These variables are going to be used on every find and replace operation, so we might as well specify them outside of the loop we're about to write. Each variable matches an argument to the Find.Execute method that we're about to use, and the arguments are passed by position, so their order matters. You can find a description of each argument on the Microsoft docs. The Wrap and Replace variables correspond to enumerations, and their values can be found on the WdFindWrap and WdReplace docs: a Wrap of 1 is wdFindContinue and a Replace of 2 is wdReplaceAll.
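If the bare numbers bother you, one option is to keep the enumeration values in hashtables and refer to them by name. This is only a sketch of that idea; the hashtable variables $WdFindWrap and $WdReplace are names I made up for the example, while the member names and numbers are the ones listed in the WdFindWrap and WdReplace docs.

```powershell
# Values from the WdFindWrap enumeration
$WdFindWrap = @{
    wdFindStop     = 0   # stop when the start or end of the search range is reached
    wdFindContinue = 1   # keep going past the start or end of the search range
    wdFindAsk      = 2   # ask the user whether to search the rest of the document
}

# Values from the WdReplace enumeration
$WdReplace = @{
    wdReplaceNone = 0    # find only, replace nothing
    wdReplaceOne  = 1    # replace the first occurrence
    wdReplaceAll  = 2    # replace every occurrence
}

# Same values as before, but self-describing
$Wrap    = $WdFindWrap.wdFindContinue
$Replace = $WdReplace.wdReplaceAll
```

The rest of the script doesn't change, since $Wrap and $Replace still hold 1 and 2.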

With our variables set up, we now need to take the following steps for each document:

  1. Open the document
  2. Find and replace the text
  3. Save the file
  4. Close the document

Let's see what that looks like in code:

foreach($WordFile in $WordFiles) {
    # Open the document
    $Document = $Word.Documents.Open($WordFile.FullName)
    
    # Find and replace the text using the variables we just set up
    $Document.Content.Find.Execute($FindText, $MatchCase, $MatchWholeWorld, $MatchWildcards, $MatchSoundsLike, $MatchAllWordForms, $Forward, $Wrap, $Format, $ReplaceText, $Replace)
    
    # Save and close the document in one call
    $Document.Close(-1) # -1 is wdSaveChanges; see https://docs.microsoft.com/en-us/office/vba/api/word.wdsaveoptions
}

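Whenever you work with Office VBA objects that open files, always pass the fully qualified path of the file. Word runs as its own process, so a relative path is not resolved against your current PowerShell location. That's why the loop passes $WordFile.FullName rather than just the file name.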
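Get-ChildItem already hands us full paths through the FullName property, but if you ever start from a relative path typed by hand, you'll need to expand it yourself first. Here is a minimal sketch of one way to do that; the file name is hypothetical, and the file doesn't have to exist for the path to be expanded.

```powershell
# Hypothetical relative path, as you might type it at the prompt
$RelativePath = 'Proposal.docx'

# Expand it against the current PowerShell location (not the process working
# directory, which can be different)
$FullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($RelativePath)

"Would open: $FullPath"
```

The resulting $FullPath is what you would hand to $Word.Documents.Open.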

After we make all of our updates to documents, there's only one more step: quit the Word process that we've opened for all of this work:

$Word.Quit()
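One thing to watch out for: if the script throws partway through the loop, Quit never runs and a hidden WINWORD.EXE process can be left behind. A hedged sketch of how you might guard against that is to wrap the work in try/finally. This assumes Word is installed; the ReleaseComObject call is an extra cleanup step that isn't part of the original script.

```powershell
$Word = New-Object -ComObject Word.Application

try {
    # Keep the Word window hidden while the script works
    $Word.Visible = $false

    # ... open, find and replace, and close each document here ...
}
finally {
    # Runs even if something above throws, so Word is not left running
    $Word.Quit()

    # Release the COM reference held by PowerShell (optional extra cleanup)
    [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($Word)
}
```

For a one-off run at the prompt this is probably overkill, but it's worth having if the script is going to run unattended.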

Putting it all together

$Word = New-Object -ComObject Word.Application

# Search for all Word document types (.doc, .docx, .docm, .dot, .dotx, etc.)
$WordFiles = Get-ChildItem -Recurse | ? Name -like "*.do[ct]*"

$FindText = "10 Main Street" # <= Find this text
$ReplaceText = "22 Industrial Park Road" # <= Replace it with this text

$MatchCase = $false
$MatchWholeWorld = $true
$MatchWildcards = $false
$MatchSoundsLike = $false
$MatchAllWordForms = $false
$Forward = $false
$Wrap = 1 # wdFindContinue
$Format = $false
$Replace = 2 # wdReplaceAll

foreach($WordFile in $WordFiles) {
    # Open the document
    $Document = $Word.Documents.Open($WordFile.FullName)
    
    # Find and replace the text using the variables we just set up
    $Document.Content.Find.Execute($FindText, $MatchCase, $MatchWholeWorld, $MatchWildcards, $MatchSoundsLike, $MatchAllWordForms, $Forward, $Wrap, $Format, $ReplaceText, $Replace)
    
    # Save and close the document in one call
    $Document.Close(-1) # -1 is wdSaveChanges; see https://docs.microsoft.com/en-us/office/vba/api/word.wdsaveoptions
}

$Word.Quit()