Skip to main content

When All Else Fails, Use the GUI

In this blog post, I show you how I automated entering credit card transactions in QuickBooks using the PyAutoGUI package for Python.

Automating routine tasks is a great way to save yourself a lot of time. Whenever I spend a significant amount of time doing a partiuclar task on a computer, I ask myself if there is a way to automate it. Generally, as long as the input to the task is located on the computer, the answer is yes. The kinds of tasks that evade automation are those that have some kind of physical medium as an imput. While there are technologies that could help automate those tasks, it is much more complicated and difficult to do so.

For tasks that can be automated, my favorite language for doing so is Python. Python has a wonderfully readable, expressive syntax, and there are an astoundingly large number of libraries and packages available that make the job of automating jobs much easier.

One of my favorite simple examples of how Python saved me a lot of time is when I was working on a project converting music clips from one audio format to another. I am sure there are many ways to do this quickly and efficiently with some software tools, but I don't find myself working with audio files a lot, so I don't have any good programs for it. I found a command line tool that could get the job done though. So, the first and second time I did this, I copied a command onto the command line, modified the file name and ran the command. It took me over an hour each time to convert all of the files. The third time I attempted it, I got smarter. With a few lines of Python, I wrote a script that converted all of the files at once. It took me a few minutes to get the script right and several seconds for it to run.

import os
import subprocess

os.mkdir('output_files')
for file in os.listdir(os.getcwd()):
    if file.endswith('.wav'):
        subprocess.call(['afconvert', '-f', 'caff', '-d', 'LEI16', file, 'output_files/' + file.replace('wav', 'caf')])

Sometimes tasks are not so straightforward though. Sometimes you need to interact with a certain software program, and that program doesn't provide a good programmatic interface for you to do so.

I do some bookkeeping part time, which involves a lot of tedious, repetitive tasks. One such task I often perform is entering credit card payments into the accounting program QuickBooks. QuickBooks uses a format called IIF that allows you to import transactions. This is the only way to import transactions into QuickBooks desktop software. IIF is very poorly documented and no longer actively supported, and after spending many hours messing with it, I finally figured out that it simply cannot handle some very common types of transactions, including the ones I wanted to input.

I could have given up at this point, but I was determined to make my life easier later by making it difficult now, i.e. by continuing to work to automate entering these transactions. Well, when the software you use doesn't provide any good interfaces for an automation script, you still have recourse. Your script can interact with the program the same way you do, meaning your script can use the GUI (Graphical User Interface).

A great tool for simulating mouse clicks and keyboard interaction is PyAutoGUI <Link To PyAutoGUI> here. I don't know about you, but writing scripts that automate tasks evokes so many things that I love about programming. Writing scripts to automate tasks via the GUI takes the magic to a while new level; it is really awe-inspiring to see the computer take control of your mouse and keyboard to do your work for you!

That said, I will caution you to use this as a last resort in general. GUI automation is very difficult to get correct and very easy to wreak havoc with if something is wrong. I once sent a print job to the company printer to print several hundered copies of a source code file. Quite a few pages had been printed before someone guessed a mistake had been made and put a stop to it. The script I was running was not intended to print anything, but that's what can happen when you enter Ctrl-P in the wrong context, so reader beware!

I will assume you know how to get Python running and install a Python package. If not, there are many great resources out there to help you do this, including Sweigart's Automate the Boring Stuff with Python (mentioned in more detail below). One of the most useful things you can do before getting started is write a small script to give you the position of your cursor. This code is taken, little modified if at all, from Sweigart's book:

from __future__ import print_function
import sys
import pyautogui

print('Press Ctrl-C to quit')
try:
    while True:
        # Get and print the mouse coordinates.
        (x, y) = pyautogui.position()
        positionStr = 'X: ' + str(x).rjust(4) + ' Y: ' + str(y).rjust(4)
        print(positionStr, end='')
        print('\b'*len(positionStr), end='')
        sys.stdout.flush()
except KeyboardInterrupt:
    print('\nDone.')

Running this script will help you tremendously when attempting to write scripts to automate via GUI interaction because mcuh of what you do depends on the position of the cursor.

Before you run any pyautogui commands, I recommend setting an appropriate pause time. This is the time that pyautogui will wait between executing commands. Pausing between commands is very important because the GUI will not keep up with commands that are executed instantaneously, and you will find your script interacting with the wrong window if you don't slow it down a little.

Unfortunately, there is no exact science to this. It is highly dependent on your system and even more so on the particular application you are interacting with. You will very likely need to experiment to get it right. The QuickBooks GUI proved to be fairly responsive in my case, so I used a value of 0.1 seconds for my script.

pyautogui.PAUSE=0.1

The first thing my script does is make sure QuickBooks is in focus. Running the rest of the commands without this step will do who knows what, as the commands will go to other applications. When running GUI scripts, context is critical.

self._QUICK_BOOKS_SIDE_MENU_LOC = (26, 30)
pyautogui.click(*self._QUICK_BOOKS_SIDE_MENU_LOC)

The * operator is just a way of instructing Python to pass the tuple elements as arguments to the method. In this case, it is equivalent to the line.

pyautogui.click(26, 30)

Next, we need to open the correct window within QuickBooks to enter the payment information.

self._CUSTOMERS_MENU_LOC = (402, 12)
self._RECEIVE_PAYMENTS_MENU_CMD_LOC = (433, 267)
pyautogui.click(*self._CUSTOMERS_MENU_LOC)
pyautogui.click(*self._RECEIVE_PAYMENTS_MENU_CMD_LOC)

Once the Receive Payments window is open, the customer name field is in focus. We then start by entering information in that field. We make use of tabbing from one field to another, just as we probably would if we were entering the information manually. Here is the code for entering the basic payment information.

pyautogui.typewrite(payment_info.customer_name)
pyautogui.typewrite(['tab'])
pyautogui.typewrite(payment_info.date)
pyautogui.typewrite(['tab'])
pyautogui.typewrite(payment_info.amount)
pyautogui.typewrite(['tab'])
pyautogui.typewrite(payment_info.payment_method)
pyautogui.typewrite(['tab'])
pyautogui.typewrite(payment_info.trans_id)
pyautogui.typewrite(['tab'])

The typewrite() method will by default type the contents of any string you pass to it. However, you can also use special keywords in a string to simulate pressing certain keys such as the tab key. These must be passed to the typewrite() method in a list.

After we enter the basic payment information, we need to apply the payment to an invoice. The easiest way I found to do this in QuickBooks is to click the Auto Apply button.

self._AUTO_APPLY_BUTTON_LOC = (856, 873)
pyautogui.click(*self._AUTO_APPLY_BUTTON_LOC)

Of course, when receiving credit card payments, you'll never get the full amount. The payment processor is going to take its cut, so we need to apply credit card transaction fees. My version of QuickBooks has a Discount Info button to accomplish this.

self._DISCOUNT_INFO_BUTTON_LOC = (856, 846)
pyautogui.click(*self._DISCOUNT_INFO_BUTTON_LOC)
pyautogui.typewrite(fee)
pyautogui.typewrite(['enter'])

Here we pass the keyword "enter" to the typewrite() command to simulate pressing the enter key. At this point, we have to come back and change the original payment amount to be the net payment amount, i.e. the payment minus the fee. The reason we didn't just enter this value to begin with is because entering the full payment amount allows the QuickBooks Auto Apply feature to find the correct invoice.

We click directly on the field and then use the Ctrl-k shortcut to clear it. The keyDown() and keyUp() methods in combination with the press() method are very useful for executing multi-key shortcut commands.

self._PAYMENT_AMOUNT_LOC = (390, 237)
pyautogui.click(*self._PAYMENT_AMOUNT_LOC)
pyautogui.keyDown('ctrl')
pyautogui.press('k')
pyautogui.keyUp('ctrl')
pyautogui.typewrite(net_amount)

We then record the transaction by clicking the Save button.

self._SAVE_BUTTON_LOC = (1060, 975)
pyautogui.click(*self._SAVE_BUTTON_LOC)

Finally, we close the window. This puts the application back in the state it was when we started. This allows the entire process to be repeated several times without the beginning of the script finding itself in a different state on subsequent runs, which is likely to break it.

pyautogui.keyDown('command')
pyautogui.press('w')
pyautogui.keyUp('command')

The PyAutoGUI library has many other really awesome features. This example is just a tiny portion of some of its core functionality. For example, it has the capacity to take different actions or verify the position of the cursor depending on the color of the screen at the cursor's location.

I want to mention Al Sweigart here because I owe much of my knowledge of automation to him. If you haven't read any of his books, please check them out. Particularly, if you are interested in learning more about automating everyday tasks, read Automate the Boring Stuff with Python. It's an excellent book and is very approachable for people with little to no programming experience.

Here is the complete script, which includes some code to retrieve the data from files. It's written very specifically for the system on which it is intended to run, so it almost certainly won't produce the intended result on your system, but it makes for a good illustration of what you can do and how to do it. Thanks for reading.

Comments

Popular posts from this blog

Yet Another Post About Text Editors

Here's a debate that will never end as long as people are writing code: which text editor is superior? While there is one that I regularly use, I really think the answer to this question is none of them. They all have their strengths and limitations. It often comes personal preference, convenience and the nature of the tasks you typically need to accomplish. What is less often discussed is why we should even bother with using a text editor at all. What is a Text Editor? I'm going to back up a bit though. Let's first ask: what is a text editor? A text editor is a program with the primary function of editing and manipulating characters on a screen and writing them to and reading them from files. While text editors are very commonly used in software development, they are not restricted to that purpose. Anyone who has any textual content to write or edit can potentially benefit from using a text editor. Text editors generally don't provide much in the way of visual form

Books That Have Influenced Me and Why

A mantra that I often repeat to myself is, "Don't abandon the behaviors and habits that made you successful." I believe this trap is actually much easier to fall into than most people realize. You can sometimes observe it in the context of a professional sporting event such as American football. One team might dominate the game, playing exceptionally well for the first three quarters. Then, as they sit with a comfortable lead, you see a shift in their strategy. They start to play more conservatively, running the ball more often than they had. Their defense shifts to a "prevent" formation, designed to emphasize stopping any big plays by the other team while putting less pressure on the short game. The leading team often looks awkward in this mode. They have switched their perspective from that of pursuing victory to that of avoiding defeat. They have stopped executing in the way that gained them the lead in the first place. I have seen more than one game ult