Posted on October 30, 2008 in GLSEC2008 by adam | 1 Comment »

Another script I have previously posted about parses one of our Apache log files to determine the actual browser share on our site. Sure, you can find the aggregate numbers online from a pan-Internet perspective, but when choosing test data for our site I really only care about the numbers as they relate to our site. This is another important piece of the scripting puzzle: make it work for your own needs first. Only then should you make it generic. And only if you have time.

I’m sure there are more scalable and/or efficient ways of gathering and displaying the data (dynamically adding keys to the hash, for example) but the script is both unrefactored and one of the first ones I wrote in Ruby, so there are a lot of newbie-isms in there as well.

The script itself is in Ruby (as mentioned above) and is shown in the midst of a larger discussion on the definition of Quality.
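For anyone who wants the flavour of the idea without chasing the Ruby, here is a rough Python stand-in. It is not the original script: it assumes the log is in Apache's "combined" format (user agent as the last quoted field), the sample lines are made up, and the names `browser_for` and `browser_share` are hypothetical. It also happens to show the "dynamically adding keys" approach, since `Counter` creates a key the first time a browser is seen.

```python
import re
from collections import Counter

# Made-up sample lines in Apache "combined" log format; a real run would
# read the lines of your own access log instead.
SAMPLE_LOG = [
    '10.0.0.1 - - [30/Oct/2008:10:00:00 -0400] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3"',
    '10.0.0.2 - - [30/Oct/2008:10:00:01 -0400] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"',
    '10.0.0.3 - - [30/Oct/2008:10:00:02 -0400] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092510 Firefox/3.0.3"',
    '10.0.0.4 - - [30/Oct/2008:10:00:03 -0400] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_5; en-us) AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1"',
]

# The user agent is the last double-quoted field on a combined-format line.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')

# Checked in order; Chrome comes before Safari because Chrome's user agent
# also contains the word "Safari".
BROWSERS = [
    ("MSIE", "Internet Explorer"),
    ("Firefox", "Firefox"),
    ("Chrome", "Chrome"),
    ("Safari", "Safari"),
    ("Opera", "Opera"),
]


def browser_for(line):
    """Return a browser name for one log line."""
    match = USER_AGENT.search(line)
    if match is None:
        return "Unparsed"
    agent = match.group(1)
    for marker, name in BROWSERS:
        if marker in agent:
            return name
    return "Other"


def browser_share(lines):
    """Return a dict of browser name -> percentage of the given lines."""
    counts = Counter(browser_for(line) for line in lines)
    total = sum(counts.values())
    return dict((name, 100.0 * n / total) for name, n in counts.items())


for name, share in sorted(browser_share(SAMPLE_LOG).items()):
    print("%-18s %5.1f%%" % (name, share))
```

The substring matching is deliberately crude. It is good enough to answer "what should we test with first?" for your own site, which is the whole point.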

This is a short post as I have already written this up pretty nicely when I originally did it. This recipe was inspired by (okay, stolen from) one that James wrote, but I made mine both in Python and cross-platform.

You can read the original original post here.

This recipe is something I have written about a couple of times as well, and it is in fact what I was at GLSEC last year to talk about. This script is used as part of the LOUD technique for testing i18n and l10n within an application. This one happens to be for a Java 2 resource bundle, though it could easily be modified for any language or format (though some are easier than others).

This script, like many that manipulate files, follows a pattern which is useful to keep in mind.

• open file
• deal with only one line at a time
• do the interesting manipulation one chunk at a time
• save the modified file
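Here is a minimal sketch of that pattern against a real file on disk (the script further down works on an in-memory string instead). Two things in it are my assumptions rather than part of the recipe: it uses a regex for the chunking, which the script below deliberately avoids, and the names `loud`, `loud_file`, `in_path` and `out_path` are made up for illustration.

```python
import re

# Matches a {"Key", "Value"} pair: group 1 is everything up to the value,
# group 2 is the value itself, group 3 is the closing quote and brace.
PAIR = re.compile(r'(\{"[^"]*",\s*")([^"]*)("\})')


def loud(line):
    """Upper-case the value of a key/value pair and wrap it in ^...$."""
    return PAIR.sub(
        lambda m: m.group(1) + "^" + m.group(2).upper() + "$" + m.group(3),
        line,
    )


def loud_file(in_path, out_path):
    # open file
    with open(in_path) as source:
        lines = source.readlines()
    # deal with only one line at a time, manipulating one chunk at a time
    changed = [loud(line) for line in lines]
    # save the modified file
    with open(out_path, "w") as target:
        target.writelines(changed)


print(loud('{"OkKey", "OK"},'))
```

Writing to a separate `out_path` rather than over the original is a choice of mine; it keeps the untouched bundle around for comparison.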

Again, the script is Python.

import os

my_class = """ public class MyResources extends ListResourceBundle {
public Object[][] getContents() {
return contents;
}
static final Object[][] contents = {
// LOCALIZE THIS
{"OkKey", "OK"},
{"CancelKey", "Cancel"},
// END OF MATERIAL TO LOCALIZE
};
}
"""

# normally you would open a file and iterate over its contents
# but this is an example...
my_class_lines = my_class.split("\n")

# process each line
for line_num in range(len(my_class_lines)):
    # sure, a regex is likely the way to do this, but...
    key_sep = '", "'
    key_sep_at = my_class_lines[line_num].find(key_sep)
    end_point = '"}'
    end_point_at = my_class_lines[line_num].rfind(end_point)
    if key_sep_at != -1:
        # break the line into chunks
        beginning = my_class_lines[line_num][0:key_sep_at + len(key_sep)]
        middle = my_class_lines[line_num][len(beginning):end_point_at]
        end = my_class_lines[line_num][end_point_at:]
        # LOUD the middle
        middle = "^%s$" % middle.upper()
        # put it back into our copy of the class in memory
        my_class_lines[line_num] = "%s%s%s" % (beginning, middle, end)

# write the file out to disk... okay, to the screen in this case
print("\n".join(my_class_lines))

Posted on October 29, 2008 in GLSEC2008 by adam | No Comments »

The first script recipe is one I have written about before (here and here). It is a glass-box technique for determining where you might want to test, how good your developers are at recording information, how complete features really are, and how well the code adheres to style.

What? You don’t have the code? I’ll wait while you go get it. No, I don’t care that you don’t know how to program. Now is as good a time as any to learn.

Anyways, let’s look at the reasons for this script in a bit more detail.

• test inspiration – Any time you see comments or variables or methods with words like ‘hack’ or ‘broken’ or ‘kludge’, there is a better than none chance that you can find a bug in the surrounding code. Maybe by reading it, or by thinking about the limitations of the hack. At the very least the hack should be fixed to not be a hack, so log it for redesign / reimplementation (if it has not already been logged).
• information capture – Modern IDEs let developers create little notes to themselves with a single click of a button. These notes typically take the form of comments starting with FIXME or TODO. In my world view, anytime there is a FIXME or a TODO left in the code there should be a corresponding item in the bug tracker. Why the duplicate capture of information? The bug tracker is the central means of keeping track of your task backlog and communicating it across the organization. Having information locked away where only the geeks can get it is asking for trouble and unpleasantness at the inevitable ‘project slip’ meeting. Also, developers (like testers) can get busy / bored / distracted and forget that they left a TODO for themselves in some file over there last week.
• completeness – Technically, if I am doing the dishes or laundry and I have something left ‘TODO’ then I am not done. Similarly, if a feature is marked as done but a TODO was included in the code commits then the feature is really not done. Or at the very least it needs to have some questions asked about its completeness.
• style – It is a good idea to write your code under the assumption that at some point someone is going to look at it, and that someone is not anyone you can currently think of. For example, do you think the Netscape kids thought their code would be open sourced? Not a chance. If they had, they would not have spent months cleaning up the code for viewing. Had their test team been monitoring the code for socially unacceptable words then that process would have been a lot faster.

I have been using variations of this script since around 2005 and it seems that the market is starting to catch up with the idea. For example, Rails comes with a rake task (notes) which will find the TODOs, FIXMEs and OPTIMIZEs in the codebase. The problem with this is that it only finds those 3 values, which means that we don’t get too much value from it. And of course it only works on Rails code, and a lot of companies use a variety of languages depending on the project. (Or likely should.)

Here is the script (it is in Python)

# this script is free to use, modify and distribute and comes with no warranties etc...
# - adam_goucher@hotmail.com

import os, os.path

def do_search(starting_path, res):
    # walk through our source looking for interesting files
    for root, dirs, files in os.walk(starting_path):
        for f in files:
            if is_interesting(f) == "yes":
                # since it is, we now want to process it
                process(os.path.join(root, f), res)
    print_results(res)

def is_interesting(f):
    # set which type of file we care about
    interesting = [".java", ".cpp", ".html", ".rb"]
    # check if the extension of our current file is interesting
    if os.path.splitext(f)[1] in interesting:
        return "yes"
    else:
        return "no"

def process(to_process, res):
    # make a list of things we are looking for (all lowercase)
    notes = ["todo", "to-do", "fixme", "fix-me"]
    # open our file in "read" mode
    r_f = open(to_process, "r")
    # make a counter (starting at 1 so it matches what your editor shows)
    line_num = 1
    # read our file one line at a time
    for line in r_f.readlines():
        # circle through each of the things we are looking for
        for note in notes:
            # check if our line contains a developer note
            # note we are lower()ing the line to avoid issues of upper vs lower case
            # note also the find() function; if the thing it is looking for is not
            # found, it returns -1 else it returns the index
            if line.lower().find(note) != -1:
                # initialize our results to have a key of the file name we are on
                if to_process not in res:
                    # each value will be a list
                    res[to_process] = []
                # add our information
                res[to_process].append({"line_num": line_num, "line_value": line})
                # one match is enough; don't report the same line twice
                break
        # increment our counter
        line_num += 1
    r_f.close()

def print_results(res):
    # check if there were any developer notes found
    if len(res) > 0:
        # asking for a dictionary's keys gives you a list, so we can loop through it
        # remember, we used the file name as the key
        for f in res.keys():
            # the %s syntax says "put a string here", %d is the same for a number
            print("File %s has %s developer notes. They are:" % (f, len(res[f])))
            # our value for the key here is a list, so again, we can loop through it
            # (see, for loops are way too handy)
            for note in res[f]:
                # embed a tab for cleanliness
                print("\tLine %d: %s" % (note["line_num"], note["line_value"]))
    else:
        print("No developer notes found.")

# set our base for our source code
# (a raw string, so the \t in the path is not read as a tab)
source_base = r"path\to\your\code"
# create a dictionary which will hold our results
results = {}
# go!
do_search(source_base, results)

The places you modify on this script to make it your own are:

• ‘interesting’ – put the extension for the files you are interested in finding things in. I have it checking Java, HTML, C++ and Ruby in this example
• ‘notes’ – these are the strings you want to look for; just keep adding to the list as you discover new breadcrumbs left by the developers. One trick mentioned in the comments is to make these all lowercase. This is because we can skip the case-sensitivity by making the criteria lowercase and lowering the line we are checking.
• ‘source_base’ – is the path to the top of the source tree being checked

And yes, this could be rewritten to use threads or something to improve its efficiency. But I have yet to be thwarted by the lack of performance of the script. So what if it takes 10 minutes to run, just as long as I get the information I am looking for.

Posted on October 28, 2008 in GLSEC2008 by adam | No Comments »

I’m doing a 3.5h tutorial on Scripting Recipes for Testers next week at GLSEC and am starting to collect all my notes and thoughts together so as not to make a complete fool of myself. I know there are a number of people who read this and consider themselves scripters, so a question. What do you think of this format?

1. Why testers should learn to script, scripting is not hard, etc.
2. Introduction to Recipe (why it is useful and the concepts it helps / assists)
3. Presentation of the script (walk through the logic, show customizations, etc.)
4. Repeat steps 2 and 3 for 6 or 7 different recipes (though I suspect I am going to have to have a number in the hat as well)
5. Some hands-on time to work with participants on their own scripting problems

Depending on the make-up of the class, the content could be too simplistic or too complicated, but I guess that is part of the “fun” of speaking. Last year at GLSEC I did the developer tutorial and not the QA one so I’m not sure of the expected demographic. But any thoughts on the format itself? I seem to think that if I was in a similar one it would be useful to me. Especially since all the scripts shown will be available to attendees.

I’ll be posting the recipes over the next couple of days.

Posted on October 16, 2008 in Uncategorized by adam | No Comments »

I used to have a set of ‘brand of Adam’ business cards separate from the ones my employer(s) would give me. I don’t anymore, but if I did get another set drawn up, this is totally going to be the image I use.

Posted on October 13, 2008 in Uncategorized by adam | 3 Comments »

Scott’s most recent column is called Software testers are not helpless. In it he starts to think about the seemingly cultural problem of testers who feel they are helpless within their organizations (which too often becomes a self-fulfilling prophecy). He doesn’t speculate why this happens, but I will.

When I was at HP I would often feel overwhelmingly helpless towards the end of the project. Why? ‘Because they were releasing it when I told them they shouldn’t. They should listen to me! Don’t they know how dumb they are being?’, etc. This also led to a bit of burn-out and a drop in performance as ‘they are going to ship regardless of what I say’.

Huh? Clearly this was me operating in a manner completely inconsistent with the way I do now. At the time my mission, as I perceived it, was to control the release based upon the quality information I had gathered. James enlightened me to the mission I use now. I provide information. An input.
Some small (though important) part of the greater decision. This reorientation removes a lot of the sources of negative thought that lead to helplessness.

Is the sole solution to internalize the more realistic mission of providing quality information to stakeholders? Well, not quite. The other half I think is to take responsibility for the things that are making you feel helpless. Some examples:

• Need something changed on a server? Learn how to configure it.
• Is a bug thwarting you in some way? Learn to fix it.
• Does your process suck? Propose and champion a solution.
• Want static checks included in your build system? Integrate them.

Self-empowerment and clarity of a realistic mission are two powerful tools to help remedy helplessness. I’m sure there are others, but I think those ones cross personality types (extrovert and introvert) and are ones that can be implemented in most organizations. (Just ease your change of mission on management; they don’t like change this big all that fast.)

Posted on October 12, 2008 in Quality by adam | 1 Comment »

I spoke at the University of Toronto this week for an hour to about 40 of Greg Wilson’s third year students. It could have used more (any) rehearsal and as usual I had a major case of the nerves, but overall I think it was a success and I got to the points I wanted. This is the mini-essay version of that talk (primarily so it can be sent to the students)

Being Picky about Terminology
Even though the introduction used the term QA a half dozen times, what I do and what I talked about is actually testing. Why? Well, let’s look at the term QA a bit.

• Quality – The definition I currently use is ‘Value to some person that matters’. For more build-up on this concept see here and especially here.
• Assurance – Most people who do QA cannot actually achieve or influence this part of the term. Can you actually assure the quality of the product? Do you make staffing decisions? How about schedule ones?
Do you have final say on the release metrics and which bugs get fixed? Odds are you don’t and won’t. So you aren’t really assuring much.

Okay then. What is testing then? Well, the AST has defined it as

• An empirical technical investigation conducted to provide stakeholders with quality related information
• A cognitively complex activity that requires critical thinking, effective communication and rapid self-directed learning

(If memory serves, Cem Kaner heavily influenced that definition)

The key words in the first bullet are provide quality related information. That is the primary motivator for all testing activities. Notice how neither point says anything remotely like ‘improve the quality’. The Quality is actually your (the programmer’s) responsibility. As Michael Bolton has been saying recently, testers are there to defend the Quality of the code.

Oracles
Whether you are a tester or a programmer, you need to know what your oracle is when you are testing. This is really an unfortunate term given the size of Larry Ellison’s company, but this is not Oracle as in the database, but Oracle as in the Oracle of Delphi (who could talk to the Gods but was really stoned on sulphur fumes). An oracle in testing terms is the principle or mechanism by which you recognize a problem (BBST). This could be a person, another application, your professor, the TA, or even internal.

Ask Questions
A large part of testing can be summarized as ‘asking questions of the application’. Ask a lot of different questions. If you ask the same questions all the time, you won’t learn anything new. But don’t limit yourself to asking questions only of the product itself. Ask about the things circling the product. Such as the requirements.

The requirements for the assignments you have received certainly have enough information to do something with (both create and test). But really, the requirements kinda suck. But that is okay. Requirements in the real world do generally suck.
And they likely always will. The world is a dynamic place which is undergoing constant change. As such, the requirements will change constantly too. Your job though is to question every line of them. Ask the obvious question; you might not get the obvious answer. Ask the crazy questions; that is often enlightening. Ask why we are bothering with this feature; that will help clarify your mission.

Mission
Knowing your mission is important to testing. What information are the people you are testing for asking you to find? Doing security testing might be fun, but if your employer wants performance information then you are off mission. It doesn’t mean you can’t do security checks, but make sure you have given them the performance data they were asking for first.

Complete Coverage
Complete coverage is technically possible if you are doing TDD in its purest form. However, that number is often abused. Yes, you have touched 100% of the lines of code, but you have not touched 100% of the code with all possible values and in all possible contexts. This is part of the ‘Impossibility of Complete Testing’ which is also covered in the BBST course linked to above. Anyone who tells you otherwise is lying or just confused.

This doesn’t mean there is not value in coverage tools, or even in aiming for 100% coverage. They are a useful class of tools that tell you where you have under-tested and also let you determine when your coverage has changed (for better or worse), which then lets you question whether that can be lived with or not. (There is that questioning concept again…)

Code Review
Ask anyone who has written a book about the value of an editor. Just as writers fill in the blanks between what they meant to write and what they actually did write, programmers will do this with their code. While having every line of your code reviewed by a peer is a bit of overkill, it is certainly valuable for anything of any sort of complexity. Learn how to review code.
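Circling back to the Complete Coverage point for a second, here is a tiny illustration of 100% of the lines not being 100% of the values. The function and its check are invented for this example and are not from the talk.

```python
def average(values):
    """Return the mean of a list of numbers."""
    total = sum(values)
    return total / len(values)


# This single check executes every line of average(), so a line-coverage
# tool would report 100% for it.
assert average([2, 4, 6]) == 4


def fails_on_empty_list():
    """Ask a question the check above never asked: what about no values?"""
    try:
        average([])
    except ZeroDivisionError:
        return True
    return False


print("average([]) blows up: %s" % fails_on_empty_list())
```

Every line was touched, the coverage number looks perfect, and the empty list still falls over. The number tells you where you have not been; it does not tell you that you are done.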
Random Bits

• Don’t fall into the trap of thinking ‘No one will ever do that…’. No one will ever do that who is acting in a sane manner, does not think they are clever, is not trying to find holes and/or problems with the system, and is only ever using it in the manner it was envisioned. Well, the real world is messy. Someone is almost guaranteed to do that at some point. Hopefully you won’t look too dumb when they do.
• You are nothing like the person who will be using the code you produce. Well, the majority of the time at any rate. Think of the person who calls you to ‘fix their computer’. If they could use your application, then you are doing pretty well.
• Bugs tend to cluster. When you find one in a piece of code, spend some time to poke around and see what else is lurking.

That was more-or-less the hour, except for some discussion around possible tests for a term assignment. I likely should have written this before the class, shouldn’t I. 🙂

Posted on October 7, 2008 in Uncategorized by adam | No Comments »

We’re breaking our monolithic application into a series of smaller services that a much lighter-weight Rails application will then utilize. These services will be (initially) sitting on the same machine behind an Apache load balancer. The front application will be talking SSL to the load balancer. This post is essentially how to wire this all together (so it can be tested in a configuration similar to production)

Before starting, here are the assumptions I am going to be working from:

• The front application is currently able to talk directly to each of the services (no proxy or clustering)
• Rails is being served by mongrel.
Sorry Monorail or Passenger kids; you are too bleeding edge for me
• This was written on a Mac so some of the security conventions (like sudo for lots of stuff) might be OS quirks

Part 1: Generate SSL Certificates
In production, any customer port that is secured should likely be done through a commercial Certificate Authority (CA) as they have their Root Certificates installed in browsers by default (a perk they pay dearly for). But given that this is being done in the context of testing and the services will only be used by clients of our creation, we can use a pretend CA. The one that is in the copy of OpenSSL that is on this machine is called demoCA. Let’s set it up.

Set up your CA

$ mkdir ~/certificates
$ cd ~/certificates
$ mkdir demoCA
$ mkdir demoCA/certs
$ echo 00 > demoCA/serial
$ touch demoCA/index.txt

Now that you have a bit of infrastructure taken care of, you need to make the CA’s key

$ openssl genrsa -des3 -out ca.key 1024

Then create a new request and sign it with the key in one shot

$ openssl req -new -x509 -days 365 -key ca.key -out ca.crt

Now that we have a CA you can make the request for your service. But first it needs a key too.

$ openssl genrsa -out s1.key 1024

(alternatively you could have added a -des3 to password protect the key, but it is a pain. don’t do it)

Here is the actual request. It is very important that when it prompts you for the ‘Common Name’ (CN) you put in the host name of the machine you are trying to access. For instance, adam.zerofootprint.net. Crypto is all about trust and one of the checks is that the hostname matches the CN.

$ openssl req -new -key s1.key -out s1.csr

And sign it using our CA

$ openssl ca -in s1.csr -out s1.pem -keyfile ca.key -cert ca.crt

If you put a password on your server’s keyfile, you should remove it. It’s not necessarily as secure anymore, but…

$ cp s1.key s1.key.orig
$ openssl rsa -in s1.key.orig -out s1.key

Repeat the creation of certificates for each service you have, making sure to change the names of the files things are saved into.
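If you have more than a couple of services, it can help to generate the per-service commands rather than retype them. This is only a sketch: the openssl commands are the same three shown above with the file names swapped, but the function name `certificate_commands` and the second service name `s2` are made up for illustration. It prints the commands instead of running them, since the req and ca steps prompt for input.

```python
def certificate_commands(service, ca_key="ca.key", ca_cert="ca.crt"):
    """Return the key, request and signing commands for one service."""
    key = service + ".key"
    csr = service + ".csr"
    pem = service + ".pem"
    return [
        ["openssl", "genrsa", "-out", key, "1024"],
        ["openssl", "req", "-new", "-key", key, "-out", csr],
        ["openssl", "ca", "-in", csr, "-out", pem,
         "-keyfile", ca_key, "-cert", ca_cert],
    ]


# "s2" is a hypothetical second service; use whatever names you have.
for service in ["s1", "s2"]:
    for command in certificate_commands(service):
        print("$ " + " ".join(command))
```

Remember that each request still needs its own Common Name when openssl prompts for it.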

Part 2: Setup Virtual Servers
I really don’t like how Apache is configured on the Mac by default, so this next section will likely be redundant on Ubuntu or other nicely configured servers.

Everything in Apache for the last couple of years has been set up as a Virtual Server. Even the default port (80) is set up in that manner. We’re going to pretend that there are 2 services to be hosted in this setup. Infrastructure again.

$ cd /etc/apache2
$ sudo mkdir sites-available
$ sudo mkdir sites-enabled
$ cd sites-available

In the sites-available directory you put in the actual virtual host definitions. Here is the virtual host definition I started with for one of the services. I also have one for port 7501.

Listen 7500
NameVirtualHost *:7500

<VirtualHost *:7500>
DocumentRoot "/Library/WebServer/Documents/s1"
# you need a different ServerName for each host
ServerName s1.your.site
ErrorLog "/private/var/log/apache2/s1-ssl-error.log"

</VirtualHost>
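Since the definition for the second service is the same thing with a different name and port, one way to avoid copy-and-paste drift is to render them from a template. A rough sketch follows; the template mirrors the definition above, while the name `s2` and the function `render_vhost` are assumptions for illustration.

```python
# Mirrors the virtual host definition above, with the name and port
# pulled out as placeholders.
VHOST_TEMPLATE = """Listen %(port)d
NameVirtualHost *:%(port)d

<VirtualHost *:%(port)d>
DocumentRoot "/Library/WebServer/Documents/%(name)s"
# you need a different ServerName for each host
ServerName %(name)s.your.site
ErrorLog "/private/var/log/apache2/%(name)s-ssl-error.log"

</VirtualHost>
"""


def render_vhost(name, port):
    """Return the virtual host definition for one service."""
    return VHOST_TEMPLATE % {"name": name, "port": port}


# "s2" is a hypothetical name for the service on port 7501.
for name, port in [("s1", 7500), ("s2", 7501)]:
    print(render_vhost(name, port))
```

Each rendered block would be saved as its own file in sites-available.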

In order to have apache pick up these hosts you need to include them in your main config file. On the mac it is httpd.conf.

Include /etc/apache2/sites-enabled/*

This of course won’t work without a bit of magic. I’m only showing the first service, but you have to do this for each one.

$ cd /etc/apache2/sites-enabled
$ sudo ln -s ../sites-available/s1 s1

Restart apache and make sure that you can get to all your virtual hosts. They are still running in the clear over http.

Part 3: SSL-ize Virtual Servers
The end goal of course is to have communications to these virtual servers secured, so now we actually turn it on. Again, some infrastructure.

$ cd /etc/apache2
$ sudo mkdir certificates
$ cd certificates
$ sudo cp ~/certificates/s1.key .
$ sudo cp ~/certificates/s1.pem .

Listen 7500
NameVirtualHost *:7500

<VirtualHost *:7500>
DocumentRoot "/Library/WebServer/Documents/s1"
# you need a different ServerName for each host
ServerName s1.your.site
ErrorLog "/private/var/log/apache2/s1-ssl-error.log"

# these were all the defaults from the stock mac install; tune as necessary/desired
SSLEngine on
SSLCertificateFile "/private/etc/apache2/certificates/s1.pem"
SSLCertificateKeyFile "/private/etc/apache2/certificates/s1.key"

BrowserMatch ".*MSIE.*" \
    nokeepalive ssl-unclean-shutdown

CustomLog "/private/var/log/apache2/ssl_request_log" \
    "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"

</VirtualHost>

Again, restart apache and see that going to your ports over https works and you have the correct certificate being served (view the certificate and check the CN)

Part 4: Contacting the Nodes
I’m going to skip how you setup a mongrel cluster as there are a tonne of sites that deal with that already. In this example I’m going to have a 2 node cluster.

We’re going to use the same convention for the cluster nodes as we used for the virtual hosts.

$ cd /etc/apache2
$ sudo mkdir nodes-available
$ sudo mkdir nodes-enabled
$ cd nodes-available

In the nodes-available you want to define the members of each node. Each of our services is going to have only 1 node in this example. And each member is going to listen on localhost.

BalancerMember http://127.0.0.1:8000
BalancerMember http://127.0.0.1:8001

Now you might be a bit alarmed that it is using http, which is somewhat ironic given the whole point of this is to be more secure, but if someone has the ability to sniff localhost then you are screwed already. Naturally, each member needs to have its own port, and the ports need to be correct.

In order to redirect clients around to the various nodes we need to do a bit of modification to our virtual hosts file we created in step 2 and added ssl stuff to in 3. Specifically the Proxy section and the rewrite rules.

Listen 7500
NameVirtualHost *:7500

<Proxy balancer://s1_cluster>
Include "/etc/apache2/nodes-enabled/s1*"
</Proxy>

<VirtualHost *:7500>
# various SSL settings including certificates for this port go here

RewriteEngine On
RewriteRule ^/(.*)$ balancer://s1_cluster%{REQUEST_URI} [P,QSA,L]
</VirtualHost>

That’s all there is to it. Now you will be able to run multiple services on the same host with proper SSL certificates. Once a service has outgrown its ability to share space with the other services you can just move the certificate to a separate machine and tweak the DNS to point to the new location (since the CN is the machine name). No service reconfiguration necessary.

Hope this saves someone some thinking.