Skip to main content

When All Else Fails, Use the GUI

In this blog post, I show you how I automated entering credit card transactions in QuickBooks using the PyAutoGUI package for Python.

Automating routine tasks is a great way to save yourself a lot of time. Whenever I spend a significant amount of time doing a partiuclar task on a computer, I ask myself if there is a way to automate it. Generally, as long as the input to the task is located on the computer, the answer is yes. The kinds of tasks that evade automation are those that have some kind of physical medium as an imput. While there are technologies that could help automate those tasks, it is much more complicated and difficult to do so.

For tasks that can be automated, my favorite language for doing so is Python. Python has a wonderfully readable, expressive syntax, and there are an astoundingly large number of libraries and packages available that make the job of automating jobs much easier.

One of my favorite simple examples of how Python saved me a lot of time is when I was working on a project converting music clips from one audio format to another. I am sure there are many ways to do this quickly and efficiently with some software tools, but I don't find myself working with audio files a lot, so I don't have any good programs for it. I found a command line tool that could get the job done though. So, the first and second time I did this, I copied a command onto the command line, modified the file name and ran the command. It took me over an hour each time to convert all of the files. The third time I attempted it, I got smarter. With a few lines of Python, I wrote a script that converted all of the files at once. It took me a few minutes to get the script right and several seconds for it to run.

import os
import subprocess

os.mkdir('output_files')
for file in os.listdir(os.getcwd()):
    if file.endswith('.wav'):
        subprocess.call(['afconvert', '-f', 'caff', '-d', 'LEI16', file, 'output_files/' + file.replace('wav', 'caf')])

Sometimes tasks are not so straightforward though. Sometimes you need to interact with a certain software program, and that program doesn't provide a good programmatic interface for you to do so.

I do some bookkeeping part time, which involves a lot of tedious, repetitive tasks. One such task I often perform is entering credit card payments into the accounting program QuickBooks. QuickBooks uses a format called IIF that allows you to import transactions. This is the only way to import transactions into QuickBooks desktop software. IIF is very poorly documented and no longer actively supported, and after spending many hours messing with it, I finally figured out that it simply cannot handle some very common types of transactions, including the ones I wanted to input.

I could have given up at this point, but I was determined to make my life easier later by making it difficult now, i.e. by continuing to work to automate entering these transactions. Well, when the software you use doesn't provide any good interfaces for an automation script, you still have recourse. Your script can interact with the program the same way you do, meaning your script can use the GUI (Graphical User Interface).

A great tool for simulating mouse clicks and keyboard interaction is PyAutoGUI <Link To PyAutoGUI> here. I don't know about you, but writing scripts that automate tasks evokes so many things that I love about programming. Writing scripts to automate tasks via the GUI takes the magic to a while new level; it is really awe-inspiring to see the computer take control of your mouse and keyboard to do your work for you!

That said, I will caution you to use this as a last resort in general. GUI automation is very difficult to get correct and very easy to wreak havoc with if something is wrong. I once sent a print job to the company printer to print several hundered copies of a source code file. Quite a few pages had been printed before someone guessed a mistake had been made and put a stop to it. The script I was running was not intended to print anything, but that's what can happen when you enter Ctrl-P in the wrong context, so reader beware!

I will assume you know how to get Python running and install a Python package. If not, there are many great resources out there to help you do this, including Sweigart's Automate the Boring Stuff with Python (mentioned in more detail below). One of the most useful things you can do before getting started is write a small script to give you the position of your cursor. This code is taken, little modified if at all, from Sweigart's book:

from __future__ import print_function
import sys
import pyautogui

print('Press Ctrl-C to quit')
try:
    while True:
        # Get and print the mouse coordinates.
        (x, y) = pyautogui.position()
        positionStr = 'X: ' + str(x).rjust(4) + ' Y: ' + str(y).rjust(4)
        print(positionStr, end='')
        print('\b'*len(positionStr), end='')
        sys.stdout.flush()
except KeyboardInterrupt:
    print('\nDone.')

Running this script will help you tremendously when attempting to write scripts to automate via GUI interaction because mcuh of what you do depends on the position of the cursor.

Before you run any pyautogui commands, I recommend setting an appropriate pause time. This is the time that pyautogui will wait between executing commands. Pausing between commands is very important because the GUI will not keep up with commands that are executed instantaneously, and you will find your script interacting with the wrong window if you don't slow it down a little.

Unfortunately, there is no exact science to this. It is highly dependent on your system and even more so on the particular application you are interacting with. You will very likely need to experiment to get it right. The QuickBooks GUI proved to be fairly responsive in my case, so I used a value of 0.1 seconds for my script.

pyautogui.PAUSE=0.1

The first thing my script does is make sure QuickBooks is in focus. Running the rest of the commands without this step will do who knows what, as the commands will go to other applications. When running GUI scripts, context is critical.

self._QUICK_BOOKS_SIDE_MENU_LOC = (26, 30)
pyautogui.click(*self._QUICK_BOOKS_SIDE_MENU_LOC)

The * operator is just a way of instructing Python to pass the tuple elements as arguments to the method. In this case, it is equivalent to the line.

pyautogui.click(26, 30)

Next, we need to open the correct window within QuickBooks to enter the payment information.

self._CUSTOMERS_MENU_LOC = (402, 12)
self._RECEIVE_PAYMENTS_MENU_CMD_LOC = (433, 267)
pyautogui.click(*self._CUSTOMERS_MENU_LOC)
pyautogui.click(*self._RECEIVE_PAYMENTS_MENU_CMD_LOC)

Once the Receive Payments window is open, the customer name field is in focus. We then start by entering information in that field. We make use of tabbing from one field to another, just as we probably would if we were entering the information manually. Here is the code for entering the basic payment information.

pyautogui.typewrite(payment_info.customer_name)
pyautogui.typewrite(['tab'])
pyautogui.typewrite(payment_info.date)
pyautogui.typewrite(['tab'])
pyautogui.typewrite(payment_info.amount)
pyautogui.typewrite(['tab'])
pyautogui.typewrite(payment_info.payment_method)
pyautogui.typewrite(['tab'])
pyautogui.typewrite(payment_info.trans_id)
pyautogui.typewrite(['tab'])

The typewrite() method will by default type the contents of any string you pass to it. However, you can also use special keywords in a string to simulate pressing certain keys such as the tab key. These must be passed to the typewrite() method in a list.

After we enter the basic payment information, we need to apply the payment to an invoice. The easiest way I found to do this in QuickBooks is to click the Auto Apply button.

self._AUTO_APPLY_BUTTON_LOC = (856, 873)
pyautogui.click(*self._AUTO_APPLY_BUTTON_LOC)

Of course, when receiving credit card payments, you'll never get the full amount. The payment processor is going to take its cut, so we need to apply credit card transaction fees. My version of QuickBooks has a Discount Info button to accomplish this.

self._DISCOUNT_INFO_BUTTON_LOC = (856, 846)
pyautogui.click(*self._DISCOUNT_INFO_BUTTON_LOC)
pyautogui.typewrite(fee)
pyautogui.typewrite(['enter'])

Here we pass the keyword "enter" to the typewrite() command to simulate pressing the enter key. At this point, we have to come back and change the original payment amount to be the net payment amount, i.e. the payment minus the fee. The reason we didn't just enter this value to begin with is because entering the full payment amount allows the QuickBooks Auto Apply feature to find the correct invoice.

We click directly on the field and then use the Ctrl-k shortcut to clear it. The keyDown() and keyUp() methods in combination with the press() method are very useful for executing multi-key shortcut commands.

self._PAYMENT_AMOUNT_LOC = (390, 237)
pyautogui.click(*self._PAYMENT_AMOUNT_LOC)
pyautogui.keyDown('ctrl')
pyautogui.press('k')
pyautogui.keyUp('ctrl')
pyautogui.typewrite(net_amount)

We then record the transaction by clicking the Save button.

self._SAVE_BUTTON_LOC = (1060, 975)
pyautogui.click(*self._SAVE_BUTTON_LOC)

Finally, we close the window. This puts the application back in the state it was when we started. This allows the entire process to be repeated several times without the beginning of the script finding itself in a different state on subsequent runs, which is likely to break it.

pyautogui.keyDown('command')
pyautogui.press('w')
pyautogui.keyUp('command')

The PyAutoGUI library has many other really awesome features. This example is just a tiny portion of some of its core functionality. For example, it has the capacity to take different actions or verify the position of the cursor depending on the color of the screen at the cursor's location.

I want to mention Al Sweigart here because I owe much of my knowledge of automation to him. If you haven't read any of his books, please check them out. Particularly, if you are interested in learning more about automating everyday tasks, read Automate the Boring Stuff with Python. It's an excellent book and is very approachable for people with little to no programming experience.

Here is the complete script, which includes some code to retrieve the data from files. It's written very specifically for the system on which it is intended to run, so it almost certainly won't produce the intended result on your system, but it makes for a good illustration of what you can do and how to do it. Thanks for reading.

Comments

Popular posts from this blog

Don't Build it Yourself

This post discusses the whether to buy or build software components needed for your application. The title should give you a clue as to which side of the discussion I land on. Someone Else's Code One of the behaviors that I've noticed is significantly more often exhibited by junior developers versus more experienced ones is the tendency to build everything from scratch. While I can appreciate the enthusiasm for wanting to create something better than has been done before, this is often simply a waste of time. For many of the software tasks you will want to accomplish, someone will have made a library of framework that does 95% of what you want out of the box, lets you configure or extend it to do another 4.9%, and you come to realize that 0.1% remaining wasn't all that important anyway. In fact, with the proliferation of open source software, many of these software packages are 100% customizable because you have access to all of the source code and can change it at will...

Some Thoughts on Software Estimation

"All programmers are optimists." Fred Brooks's adage is as true today as it was when he first penned the words. The truth is that people are naturally very poor estimators. We all have a tendency to believe everything will go more smoothly than it probably will. Things will work out just as we planned, assuming we bothered to plan at all. I encounter the reality of this often in my work. The first deadline passes, and we assure ourselves that we can catch up to meet the next one. The next deadline passes, and we finally come to terms with the reality that we're not going to make it. Then, we push the schedule back a week, when if we really reflected on the situation, we would realize we were probably a couple of months out from actually completing the project. The product gets later and later one day at a time. Lots of yelling and finger-pointing ensues. Management demands better estimates, but rejects any that don't fit with their view of when we "sh...

Notions of Debugging

Bugs can be really evasive and produce some surprising, even "impossible" behaviors. Part of a software developer's job is to relentlessly hunt them down and destroy them. However, often at least half the battle is finding them to begin with. The best debuggers I have seen rely a lot on experience, both experience with the application and experience with debugging in general. The former is situation-specific, and there many not be any good ways to advance in that regard other than practice. However, we can extract some information pertaining to good debugging practices in general. Dig for Clues Using All Available Resources The first step in debugging an issue is ideally to reproduce the problem. This isn't always possible, but if you want any significant level of confidence that you have fixed a bug, you need to observe the broken behavior and understand how to trigger it from a user's perspective. Sometimes we're lucky, and a tester has taken the time to...