GScripts, Gmail: Regex

For my work I receive a few entirely text based emails for invoices, bank drafts, and some that are just informational that I commonly translate into either spreadsheets or into a Document.

I have found that using GScripts accompanied with Regular Expressions has greatly helped me automate this translation process for a number of different projects. I have used it for Reconciling Bank drafts, creating searchable files with a whole months worth of invoices in a text document, and evaluating manual entries for misplaced decimals.

Finding multiple spaces and reducing them down to one lays the foundation for separating columns.

Using ^ for beginning of sentence and $ for end of sentence as well as \n\r depending on how the body of the document needs to be parsed are great for breaking the email into rows.

Once I have those basics set I can start looking for lines or parts of lines that can be removed entirely and use an expression to match those especially if it’s multiple full lines with string.replace(/^(this|that)$/,"") in one expression to make the entire line blank.

Partial line removals each get their own Regex

I then replace all commas with nothing, all spaces with commas, and split each row to create columns.
string.replace( /,/g ,"").replace(" ",",")

At this point I may need to look at the length of each row and as I iterate through each row, throwing out blank rows, I will also add additional columns to clean up the layout before it’s placed into Sheets.

A specific example involved pulling a table from emails. I was able to string.replace(/(<td>|<tr>|,|<table border=1>)/igm,"") to get rid of code that I wasn’t going to use for splitting

I was then able to make my rows and columns in separate steps

string.replace(/(<\/tr>|\))/igm,",").split(",") which results in multiple rows
string[i].replace(/(\<\/td>|))/igm,",").split(",") which iterates through the resulting rows and makes columns

I did have an unusually sized row at the bottom so to make my columns uniform in this case I did a comparison of the length of the most recent row to the first row then cutoff the remaining part of the array using splice

if (string[k].length < string[0].length) {
string.splice(k,string.length-k)
break
}

Leave a comment

Design a site like this with WordPress.com
Get started