This is a short post, as I already wrote this up pretty nicely when I originally did it. This recipe was inspired by (okay, stolen from) one that James wrote, but I made it both in Python and cross-platform.

You can read the original original post here.

This recipe is something I have written about a couple of times as well, and is in fact what I was at GLSEC last year to talk about. This script is used as part of the LOUD technique for testing i18n and l10n within an application: every localizable string is upper-cased and wrapped in ^ and $ so that hard-coded or truncated strings stand out on screen. This one happens to be for a Java 2 resource bundle, though it could easily be modified for any language or format (though some are easier than others).

This script, like many that manipulate files, follows a pattern which is useful to keep in mind.

• open file
• deal with only one line at a time
• do the interesting manipulation one chunk at a time
• save the modified file
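
As a rough sketch of that pattern on a real file (the script in this post works on a string instead), something like the following would do. The function name transform_file, the file names and the sample data are all made up for illustration.

```python
import os
import tempfile


def transform_file(src_path, dst_path, transform):
    """Copy src_path to dst_path, passing every line through transform().

    src_path and dst_path must be different files.
    """
    # open file (and the file we will save into)
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        # deal with only one line at a time
        for line in src:
            # do the interesting manipulation, then save the modified line
            dst.write(transform(line))


# Usage: upper-case every line of a throwaway file (hypothetical data).
work_dir = tempfile.mkdtemp()
src_path = os.path.join(work_dir, "in.txt")
dst_path = os.path.join(work_dir, "out.txt")
with open(src_path, "w") as f:
    f.write("ok\ncancel\n")

transform_file(src_path, dst_path, lambda line: line.upper())

with open(dst_path) as f:
    result = f.read()
```

Passing the manipulation in as a function keeps the open / loop / save plumbing the same from one recipe to the next.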

Again, the script is Python.

my_class = """public class MyResources extends ListResourceBundle {
    public Object[][] getContents() {
        return contents;
    }
    static final Object[][] contents = {
        // LOCALIZE THIS
        {"OkKey", "OK"},
        {"CancelKey", "Cancel"},
        // END OF MATERIAL TO LOCALIZE
    };
}
"""

# normally you would open a file and iterate over its contents
# but this is an example...
my_class_lines = my_class.split("\n")

# process each line
for line_num in range(len(my_class_lines)):
    # sure, a regex is likely the way to do this, but...
    key_sep = '", "'
    key_sep_at = my_class_lines[line_num].find(key_sep)
    end_point = '"}'
    end_point_at = my_class_lines[line_num].rfind(end_point)
    # find() and rfind() return -1 when nothing was found
    if key_sep_at != -1 and end_point_at != -1:
        # break the line into chunks
        beginning = my_class_lines[line_num][0:key_sep_at + len(key_sep)]
        middle = my_class_lines[line_num][len(beginning):end_point_at]
        end = my_class_lines[line_num][end_point_at:]
        # LOUD the middle
        middle = "^%s$" % middle.upper()
        # put it back into our copy of the class in memory
        my_class_lines[line_num] = "%s%s%s" % (beginning, middle, end)

# write the file out to disk... okay, to the screen in this case
print("\n".join(my_class_lines))
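
Since the comment in the script admits a regex is likely the way to do this, here is a minimal sketch of what that might look like. The pattern and the function name loud are my own assumptions, not part of the original recipe. It assumes one {"Key", "Value"} pair per line, lines passed in without their trailing newline, and values that contain no escaped quotes.

```python
import re

# Group 1: everything up to and including the value's opening quote.
# Group 2: the value itself.
# Group 3: the closing quote, the closing brace and whatever follows.
PAIR = re.compile(r'^(\s*\{"[^"]*",\s*")([^"]*)("\}.*)$')


def loud(line):
    """Return the line with its bundle value upper-cased and wrapped in ^...$."""
    match = PAIR.match(line)
    if match is None:
        # not a key/value line, so leave it alone
        return line
    beginning, middle, end = match.groups()
    return "%s^%s$%s" % (beginning, middle.upper(), end)
```

Lines that do not look like a key/value pair, such as the comments and the class declaration, come back untouched.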

The first script recipe is one I have written about before (here and here). It is a glass-box technique for determining where you might want to test, how good your developers are at recording information, how complete features really are and how well the code adheres to style. What? You don’t have the code? I’ll wait while you go get it. No, I don’t care that you don’t know how to program. Now is as good a time as any to learn. Anyways, let’s look at the reasons for this script in a bit more detail.

• test inspiration – Any time you see comments, variables or methods with words like ‘hack’ or ‘broken’ or ‘kludge’, there is a better-than-none chance that you can find a bug in the surrounding code. Maybe by reading it, or by thinking about the limitations of the hack. At the very least the hack should be fixed to not be a hack, so log it for redesign / reimplementation (if it has not already been logged).
• information capture – Modern IDEs let developers create little notes to themselves with a single click of a button. These notes typically take the form of comments starting with FIXME or TODO. In my world view, anytime there is a FIXME or a TODO left in the code there should be a corresponding item in the bug tracker. Why the duplicate capture of information? The bug tracker is the central means of keeping track of your task backlog and communicating it across the organization. Having information locked away where only the geeks can get it is asking for trouble and unpleasantness at the inevitable ‘project slip’ meeting. Also, developers (like testers) can get busy / bored / distracted and forget that they left a TODO over in some file over there last week for themselves.
• completeness – Technically, if I am doing the dishes or laundry and I have something left ‘TODO’ then I am not done. Similarly, if a feature is marked as done but a TODO was included in the code commits then the feature is really not done. Or at the very least, some questions need to be asked about its completeness.
• style – It is a good idea to write your code under the assumption that at some point someone is going to look at your code, and that someone is not anyone you can currently think of. For example, do you think the Netscape kids thought their code would be open sourced? Not a chance. If they had, they would not have spent months cleaning up the code for viewing. Had their test team been monitoring the code for socially unacceptable words then that process would have been a lot faster.

I have been using variations of this script since around 2005 and it seems that the market is starting to catch up with the idea. For example, Rails comes with a rake task (notes) which will find the TODOs, FIXMEs and OPTIMIZEs in the codebase. The problem with this is that it only finds those 3 values, which means that we don’t get too much value from it. And of course it only works on Rails code, and a lot of companies use a variety of languages depending on the project. (Or likely should.)

Here is the script (it is in Python).

# this script is free to use, modify and distribute and comes with no warranties etc...

import os

def do_search(starting_path, res):
    # walk through our source looking for interesting files
    for root, dirs, files in os.walk(starting_path):
        for f in files:
            if is_interesting(f):
                # since it is, we now want to process it
                process(os.path.join(root, f), res)
    print_results(res)

def is_interesting(f):
    # set which type of file we care about
    interesting = [".java", ".cpp", ".html", ".rb"]

    # check if the extension of our current file is interesting
    # (lowered, so that Foo.JAVA counts as well)
    return os.path.splitext(f)[1].lower() in interesting

def process(to_process, res):
    # make a list of things we are looking for (all lowercase)
    notes = ["todo", "to-do", "fixme", "fix-me"]
    # make a counter
    line_num = 0
    # open our file in "read" mode; 'with' closes it for us when we are done
    with open(to_process, "r") as r_f:
        # read our file one line at a time
        for line in r_f:
            # increment our counter first so that the first line is line 1
            line_num += 1
            # circle through each of the things we are looking for
            for note in notes:
                # check if our line contains a developer note
                # note we are lower()ing the line to avoid issues of upper vs lower case
                # note also the find() function; if the thing it is looking for is not
                #   found, it returns -1 else it returns the index
                if line.lower().find(note) != -1:
                    # initialize our results to have a key of the file name we are on
                    if to_process not in res:
                        # each value will be a list
                        res[to_process] = []
                    res[to_process].append({"line_num": line_num, "line_value": line.rstrip()})
                    # one match is enough; do not record the same line twice
                    break

def print_results(res):
    # check if there were any developer notes found
    if len(res) > 0:
        # asking for a dictionary's keys gives you a list, so we can loop through it
        # remember, we used the file name as the key
        for f in res.keys():
            # the %s syntax says "put a string here", %d is the same for a number
            print("File %s has %d developer notes. They are:" % (f, len(res[f])))
            # our value for the key here is a list, so again, we can loop through it
            # (see, for loops are way too handy)
            for note in res[f]:
                # embed a tab for cleanliness
                print("\tLine %d: %s" % (note["line_num"], note["line_value"]))
    else:
        print("No developer notes found.")

# set our base for our source code
# (a raw string, r"...", so that \t is not read as a tab)
source_base = r"path\to\your\code"

# create a dictionary which will hold our results
results = {}

# go!
do_search(source_base, results)

The places you modify on this script to make it your own are:

• ‘interesting’ – put the extensions of the files you are interested in finding things in. I have it checking Java, HTML, C++ and Ruby in this example
• ‘notes’ – these are the strings you want to look for; just keep adding to the list as you discover new breadcrumbs left by the developers. One trick mentioned in the comments is to make these all lowercase. This is because we can skip the case-sensitivity by making the criteria lowercase and lowering the line we are checking.
• ‘source_base’ – is the path to the top of the source tree being checked
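
To make the lowercase trick concrete, here is a small standalone sketch that works on a list of lines rather than a file. The function name find_notes, the extra breadcrumbs and the sample lines are hypothetical, chosen only to show the matching.

```python
def find_notes(lines, notes):
    """Return (line_number, text) for each line containing any of the notes."""
    # lower the criteria once, so they can be passed in using any case
    lowered = [note.lower() for note in notes]
    found = []
    for line_num, line in enumerate(lines, 1):
        # lower the line as well, so that TODO, Todo and todo all match
        lower_line = line.lower()
        if any(note in lower_line for note in lowered):
            found.append((line_num, line.strip()))
    return found


# Hypothetical extra breadcrumbs added to the defaults from the script.
my_notes = ["todo", "to-do", "fixme", "fix-me", "hack", "kludge", "broken"]
sample = [
    "int total = 0;",
    "// TODO: handle negative values",
    "total += price; // Hack until the rounding bug is fixed",
]
matches = find_notes(sample, my_notes)
```

One thing to keep in mind with plain substring matching is that a short breadcrumb like ‘hack’ will also flag a word such as ‘hackathon’, so expect the occasional false positive as the list grows.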

And yes, this could be rewritten to use threads or something to improve its efficiency. But I have yet to be thwarted by the lack of performance of the script. So what if it takes 10 minutes to run, just as long as I get the information I am looking for?

I’m doing a 3.5h tutorial on Scripting Recipes for Testers next week at GLSEC and am starting to collect all my notes and thoughts together to not make a complete fool of myself. I know there are a number of people who read this and consider themselves scripters, so a question.

What do you think of this format?

1. Why testers should learn to script, scripting is not hard, etc.
2. Introduction to Recipe (why it is useful and the concepts it helps / assists)
3. Presentation of the script (walk through the logic, show customizations, etc.)
4. Repeat steps 2 and 3 for 6 or 7 different recipes (though I suspect I am going to have to have a number in the hat as well)
5. Some hands-on time to work with participants on their own scripting problems

Depending on the make-up of the class, the content could be too simplistic or too complicated, but I guess that is part of the “fun” of speaking. Last year at GLSEC I did the developer tutorial and not the QA one so I’m not sure of the expected demographic.

But any thoughts on the format itself? I think that if I were attending a similar tutorial, it would be useful to me. Especially since all the scripts shown will be available to attendees.

I’ll be posting the recipes over the next couple days.
