Wednesday, May 28, 2014

Contrivance without confluence

The previous contrivance (without conflation) showed how to count only user-viewed visits to a web page, depending on the user-agent (i.e. browser) to load an image each time the page is viewed. This is good because that contrivance no longer counts requests for the page from robots and web crawler software, as it did in its first incarnation.

However, much information is lost as the count is maintained, because a mere tally is a stark abstraction of the visit information. Left out are the date of the visit, the user's IP address, the kind of browser used, the referring page, etc. etc.

In this contrivance, we'll show one way of capturing the date information, so that the counts do not all run together into a single number. It is as if each day's tally is a separate river flowing into the month, flowing into the year*.

The idea is to create a folder for each year, month, and day on which a visit occurs, and to keep a separate tally for each date.

Here is a CGI script to accomplish the task.

#!/bin/bash
echo "Content-type: image/gif"
echo
DATE=`date +%Y/%m/%d`
FOLDER_Y=../../counters/`echo $DATE | cut -d / -f 1`
FOLDER_M=../../counters/`echo $DATE | cut -d / -f 1-2`
FOLDER_D=../../counters/`echo $DATE | cut -d / -f 1-3`
if [ ! -d "$FOLDER_Y" ]; then
  mkdir $FOLDER_Y
fi
if [ ! -d "$FOLDER_M" ]; then
  mkdir $FOLDER_M
fi
if [ ! -d "$FOLDER_D" ]; then
  mkdir $FOLDER_D
fi
echo -n 1 >>$FOLDER_D/tallies
cat 1x1.gif

As before, line 1 lets the web server know that a bash shell should interpret this script, line 2 informs the web server (which lets the browser know) that an image will be returned, and line 3 signals the end of HTTP headers.

Line 4 obtains the system date from the machine running the web server, in the format YYYY/MM/DD. This is convenient, as the separating slash also has meaning for the file system, indicating nested folders. Note that this machine might be in a different time zone from the web master, and very often from the visitors as well. This scheme abstracts out the time of the visit, along with all of the other information available to the script.
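
If the server's clock is not in the zone you want the tallies grouped by, the date command can be told which zone to use through the TZ environment variable. Here is a minimal sketch. The zone name America/Chicago is my choice for illustration (any name from the system's zone database would do), and the helper name tally_date is made up for the example.

```bash
# Hypothetical helper: print today's date as YYYY/MM/DD in a chosen time zone.
# The default zone name below is an assumption made for illustration.
tally_date() {
  TZ="${1:-America/Chicago}" date +%Y/%m/%d
}

tally_date          # date in U.S. central time
tally_date UTC      # date in UTC
```

In the CGI script, the same effect could be had by putting the TZ assignment in front of the date command on line 4.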

Lines 5 and 8-10 create the folder for the year, if it doesn't already exist.

Lines 6 and 11-13 create the folder for the month, if necessary.

Lines 7 and 14-16 create the folder for the day, if necessary.

Line 17 tallies the count.

Line 18 as before provides the actual image data to be returned to the web server, which will pass it along to the user's browser.

To use this script you would upload it to the cgi-bin folder of the public html folder of your provider's web server machine. Supposing you named it tallywithdate.cgi, any request from an image tag in one of your web pages for
[your domain]/cgi-bin/tallywithdate.cgi
would result in a view of that page receiving one tally or count.

Now, the question you are going to ask is, "Why do it this way, rather than using a relational database?" Very good question. In my case, it was to avoid setting up all of the machinery required. All that I needed was the date information, and the publisher of the website was content to have the date be relative to the U.S. central time zone (where the web server machine is located).

Another reason for using this technique is that, for this web site, I did not need PHP, because all of the web pages are generated off-line, and uploaded periodically, as a collection of hundreds of HTML pages. So, I took it as a challenge to implement the few pieces that required some server-side logic using only the Bash shell.

A word about disk space requirements. On the machine that hosts the web site in question, the file system uses blocks of 4K (4096 bytes). So one of these will be required for each year, month, day, and tallies file. Since the tallies file uses base one, it will require an additional 4K block when the tally exceeds 4096, and so on. I don't know how this would compare to a relational database solution. There, each count event would require at least 10 or 12 bytes for the record to contain the count and the system date. The exact comparison will be left as an exercise for the reader.
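
The block arithmetic can be sketched in the shell. This assumes the 4096-byte block size mentioned above and one byte per tally. The function name blocks_for_tallies is made up for the example, and a real file system may account for small files differently.

```bash
# Hypothetical helper: number of 4096-byte blocks needed by a base-one
# tallies file holding N tallies (one byte each), rounding up.
blocks_for_tallies() {
  echo $(( ($1 + 4095) / 4096 ))
}

blocks_for_tallies 1       # 1
blocks_for_tallies 4096    # 1
blocks_for_tallies 4097    # 2
```

So a class of page with up to 4096 views in a day costs one block for its tallies file, on top of the block for each new day, month, and year folder.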

Another useful piece of information that is available to the script is the referring page. That information consists of the entire URL of the page containing the tally (image) request. That could be quite large. As an alternative, the pages of the web site in question are partitioned into equivalence classes, and each class of page is assigned a simple identifier. Then the following script (replacing only the penultimate line of the first script) records the tallies in a separate file for each class of web page.

#!/bin/bash
echo "Content-type: image/gif"
echo
DATE=`date +%Y/%m/%d`
FOLDER_Y=../../counters/`echo $DATE | cut -d / -f 1`
FOLDER_M=../../counters/`echo $DATE | cut -d / -f 1-2`
FOLDER_D=../../counters/`echo $DATE | cut -d / -f 1-3`
if [ ! -d "$FOLDER_Y" ]; then
  mkdir $FOLDER_Y
fi
if [ ! -d "$FOLDER_M" ]; then
  mkdir $FOLDER_M
fi
if [ ! -d "$FOLDER_D" ]; then
  mkdir $FOLDER_D
fi
TALLIES=`echo $QUERY_STRING | grep -o '[a-zA-Z][0-9a-zA-Z]*' | head -n 1`
if [ -n "$TALLIES" ]; then
  echo -n 1 >>$FOLDER_D/$TALLIES
fi
cat 1x1.gif

There you have it. Instead of a tally going into a file named tallies, it will go into a file whose name is provided as the query string of the image URL. For example, a request from one of your (fabulous) web pages that looked like this
[your domain]/cgi-bin/tallywithdate.cgi?fabulous
would result in a tally to the file named fabulous in the folder for the current (server) date.

The query string (portion of the image URL following the question mark) is sanitized (line 17) by accepting only the first occurrence of an identifier of letters and digits, starting with a letter. If such an identifier is present in the query string, then it is used as a file name (lines 18-20) to collect the tallies.
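
To see what the pattern keeps, the same grep can be tried at the command line. This sketch wraps it in a hypothetical function named first_identifier and pipes the result through head -n 1, so that only the first match survives. That matters because grep -o by itself prints every match on its own line.

```bash
# Hypothetical helper: keep only the first run of letters and digits
# that starts with a letter; print nothing if there is none.
first_identifier() {
  printf '%s\n' "$1" | grep -o '[a-zA-Z][0-9a-zA-Z]*' | head -n 1
}

first_identifier 'fabulous'            # fabulous
first_identifier '../../etc/passwd'    # etc
first_identifier '42&page=home'        # page
first_identifier '12345'               # (prints nothing)
```

Because slashes and dots can never be part of the result, a query string cannot steer the tally to a file outside the folder for the day.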

Scripts to display the count of page views for any given date, or date range, and for any class of web page are all left as exercises for the reader.
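
As a starting point for the first of those exercises, here is one possible sketch. It assumes the folder layout created above (a counters folder holding YYYY/MM/DD folders, with one base-one file per class of page). The function name count_views and its arguments are my own invention for illustration.

```bash
# Hypothetical helper: print the number of views recorded for one class
# of page on one date. Arguments: counters folder, date (YYYY/MM/DD), class.
count_views() {
  file="$1/$2/$3"
  if [ -f "$file" ]; then
    # Each view appended one byte, so the byte count is the tally.
    # The arithmetic expansion strips any padding that wc may emit.
    echo $(( $(wc -c <"$file") ))
  else
    echo 0
  fi
}

# Example: views of the "fabulous" class on May 28, 2014.
count_views ../../counters 2014/05/28 fabulous
```

A date range could then be handled by looping over the dates and adding up the results.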

* So, with this iteration of the contrivance, we leave behind the confluence of all the counts into a single number. Hence the contrived title of this post: this is a contrivance without confluence.

[Added May 31] If you would like to limit the class names to those in a particular list of identifiers, place those identifiers, one per line, in a file named, say, eclassids, and add this line of code just before the last if of the script.

TALLIES=`grep "^$TALLIES$" ../../counters/eclassids | head -1`
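
The effect of that line can be tried outside the web server. This sketch uses a throwaway list file created with mktemp, and the class names in it are invented for the example.

```bash
# Build a throwaway list of allowed class names (invented for illustration).
LIST=$(mktemp)
printf '%s\n' fabulous home about >"$LIST"

# Hypothetical helper: print the name only if it appears in the list
# as a whole line; print nothing otherwise.
allowed_class() {
  grep "^$1\$" "$LIST" | head -n 1
}

allowed_class fabulous    # fabulous
allowed_class fab         # (prints nothing)
```

An unlisted name leaves TALLIES empty, so the final if of the script skips the tally.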

[Added Jan 10, 2021] Note that lines 8-16 could be replaced by this single line of code:

mkdir -p $FOLDER_D

The -p flag will cause the mkdir command to create all the directories which don't already exist without giving any error messages. This would include the parent directories for a new month and year.

Wednesday, May 21, 2014

Contrivance without conflation

In a previous post, "Contrivance without conclusion," I presented a device to count web page views. As noted in the post, that contrivance really counts web page requests. As such, it would also count requests from web crawlers used by search engines, and other* cases where a human being has not actually viewed the page.

The trick commonly used to count only views of a web page is to embed in the page an image containing the number of views. The technique is sometimes called a web bug. When the page is requested by a user-operated browser, the browser normally** will also request all embedded images so that they can be displayed in their proper places on the page.

There are dozens of web sites which offer free counters of this kind. However, if you choose to use one of these you will be bound by their terms and conditions, and subject to them discontinuing their service, which can occur without notice.

So, why not make your own? Here is a modification of the earlier contrivance that accomplishes this.

#!/bin/bash
echo "Content-type: image/gif"
echo
echo -n 1 >>../../tallies
SUM=`cat ../../tallies | wc -c`
LEN=${#SUM}
IMG="H/$LEN"
for d in `seq 1 $LEN`
  do
    IMG="$IMG S"
    IMG="$IMG P/`echo $LEN-$d | bc`"
    IMG="$IMG `echo -n $SUM | tail -c $d | head -c 1`"
  done
cd ../../counters
cat $IMG T

Line 1, as before, tells the web server that this file is a program written in the "bash" language.

Line 2 instructs the web server to send the user agent an HTTP header informing it of the type of content, in this case, a GIF image.

Line 3 as before is a blank line that signals the end of HTTP headers, so that all that follows will be sent to the user agent as the actual content produced by this CGI script (which happens only at line 15, but that is getting ahead of the story).

Line 4 as before increments the counter.

Line 5 assigns the visitor count as the value of the shell variable named SUM. As a running example, let's assume the value is 1957.

Line 6 assigns the length of the visitor count number (i.e. the number of digits it contains) to the shell variable named LEN. For our running example, this will be 4.

Line 7 assigns an initial value to the shell variable named IMG, the string "H/" followed by the actual length, "H/4" in our running example.

Line 8 begins a loop which, in our running example, will be used 4 times, since the command `seq 1 4` will return 1 2 3 4.

Line 9 open brackets the lines which will be repeated.

Line 10 appends a space character and the letter S to the IMG variable.

Line 11 appends a space character, then "P/" and a number (the position, zero-based). For our running example, the numbers will be 3 2 1 0, as computed by having the command bc evaluate each of 4-1, 4-2, etc.

Line 12 appends a space character, then a digit from the number to be displayed. In our running example, these digits will be 7 5 9 1 (because we are looking at them in reverse order).

Line 13 close brackets the lines which were to be repeated.

Line 14 changes the current directory to a folder named "counters" (which must be created--see below) which is out of the way of the files which can be served by the web server. This directory or folder must contain image fragments, which are used by the final line.

Line 15 concatenates many bits and pieces, which are each binary files. When all strung together, and ending with the contents of the binary file named T, this will produce a valid GIF file which will be the number from the web page visitor count.

In our running example, the value stored in the IMG variable at the end of the loop will be

H/4 S P/3 7 S P/2 5 S P/1 9 S P/0 1

and the complete command of line 15 would thus be

cat H/4 S P/3 7 S P/2 5 S P/1 9 S P/0 1 T

which will output the fourteen fragments--named H/4, S, P/3, 7, etc. and finally, the fragment named T--to the web server (which passes this content along to the user agent (the browser)).
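
The loop can also be tried on its own, away from the web server. The sketch below wraps the same steps in a hypothetical function named fragment_list. As a liberty of mine, it uses the shell's built-in arithmetic in place of seq and bc, which should give the same list.

```bash
# Hypothetical helper: print the list of fragment names for a number.
fragment_list() {
  sum=$1
  len=${#sum}
  img="H/$len"
  d=1
  while [ "$d" -le "$len" ]; do
    # Digit d counting from the right, as in the original script.
    digit=$(printf '%s' "$sum" | tail -c "$d" | head -c 1)
    img="$img S P/$((len - d)) $digit"
    d=$((d + 1))
  done
  echo "$img T"
}

fragment_list 1957    # H/4 S P/3 7 S P/2 5 S P/1 9 S P/0 1 T
```

The printed list is exactly the set of arguments that the final cat command receives.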

To make this work, you will need to upload the file to the cgi-bin folder in the public html folder provided by your web hosting company, naming it, say, "tallyimage.cgi". Once you have done this, anyone who visits the page at
[your domain name]/cgi-bin/tallyimage.cgi
will see the number of visitors, shown as a decimal number. And that is all that will be shown in the browser, because the user will have requested an image. At this point it will be a broken image (because the script will fail to execute correctly), until you install the image fragments.

You will also need to download the image fragments from the file counters.zip (which is also mentioned/used in a web page entitled "Image generation in DataPerfect") and unzip this file into a new folder named "counters" (for that is the name mentioned in line 14) in the folder containing the public html folder provided by your web hosting company.

As a mnemonic, the 32 fragment names are

folder H for "header" (containing fragments named 1, 2, 3, ... 9, X (not used here))
S for "separator"
folder P for "position" (zero-based, containing fragments named 0, 1, 2, ... 9)
9 for the digit nine, etc.
T for "terminator"

Suppose you don't want to display the count on a web page, but you want to just count it as having been viewed? In this case, here is a much simpler CGI script to accomplish the task.

#!/bin/bash
echo "Content-type: image/gif"
echo
echo -n 1 >>../../tallies
cat 1x1.gif

The first four lines are identical to the previous CGI script.

Line 5 copies a file named 1x1.gif to the web server, which passes it along to the user agent as the requested image. This file, as its name suggests, is a small one by one pixel GIF image, which (though its name does not suggest this) is transparent, so it can be included somewhere on a page without disruption of the visual appearance of the page.

The two contrivances live:
http://sanbachs.net/tallypagewithimage.html
http://sanbachs.net/tallythispage.html

These are normal web pages, which call for and display the contrivance images. On the second page, I have enlarged the image and surrounded it with a border so that you can see where it is. This will allow you to download a 1x1.gif file more easily.

Note that these web pages (and those of the previous post) all share the same "tallies" file, and thus share a page view counter.

As a modification to the previous "contrivance without conclusion," this is a contrivance without conflation of the two kinds of visits: the ones from a genuine request to view a page, and the ones from web crawlers and other user agents which do not request the images on the page.

* Technically, any software which makes a request for a web page is called a User Agent.

** A user can disable the automatic display of images on a web page, unfortunately, so this trick will miss counting such page views.

Friday, May 16, 2014

Context without content

I approach this blog post with a bit of apprehension. It is key to understanding much of what I intend to write going forward. Yet it is a bit dry, and maybe too technical. I will do my best to keep it interesting.

First, let's consider content without context. That could, of course, be a post by itself, but I'm trying to contrast it with the topic at hand.

I had a colleague at one of my places of work, whom I will call "Brent." Brent used to enjoy writing the number "1040" on the whiteboard, and would then look at his audience expectantly. After letting some painfully silent moments pass, he would ask, "What is this?" People would propose various things. He would finally point out that we can't know what it means until we understand the context. It could be the number of an IRS tax form (this was especially fun in April), the number of a house, Brent's favorite number*, the number of visits to a particular web page, and on and on. I used to think this was a rather self-evident point, but it was important to him that people think about it, and we often dedicated a moment of silence to its honor.

This post is about context without regard to any particular content. In other words, I wish to discuss a framework for thinking about things. I invite you to pause for a moment and consider this question, "In what sense is Cinderella alive?" I recognize that I am encouraging the same** kind of exercise that Brent enjoyed, but please humor me for a moment. I invite you, then, having considered this, to read the last half of a short blog post. The half about Cinderella (starting two paragraphs above the screenshot). The post is entitled, "Lying to children," and I hope you will find it amusing. Please come back here and finish reading this post after you have enjoyed that one.

Thank you for returning! It is always a bit of a gamble, on the Internet, to refer your readers away to another author. Especially a younger one.

The context that I wish to expose you to is from Karl Popper, a twentieth century philosopher. This way of thinking about things is commonly called "Popper's three worlds". Things might exist in World One as tangible, physical objects, such as the pages in a book which comprise the Cinderella fairy tale. Other things might exist in World Two as thoughts and mental pictures, such as those imagined by the reader of those pages, or if read aloud, by the child listening. There exist also things which, while not concrete, physical objects, nevertheless exert an impact on us, such as the character "Cinderella," and he assigns these things to World Three.

I don't want to get into this in any great detail in this post, because here I am merely setting up the stage for a number of subsequent posts, which, to be understood, will require that the reader be familiar with this way of looking at things. Every time we hear that something exists, or read a question about the existence of something, we can make distinctions about that something in terms of these three worlds.

For example, recently Facebook has been displaying an ad that asks, "Do you believe in Africa?" Probably because we are now living on the African continent. Well, not really on the continent per se, but the island of Mauritius is considered to be a part of Africa.

What does this question mean? I never clicked on the ad to find out what the advertiser meant, and the ad no longer appears. In World One, yes, I believe that Africa exists in the physical world. I have seen it on maps and globes hundreds of times, have flown over it several times, and I have changed planes in an airport there. Not having looked at the advertiser's material, I do not know what was in his or her head about Africa, what thoughts or mental pictures he or she wished to implant in my World Two. I can only assume that it is something about Africa as a World Three entity, perhaps its future or its potential for greatness or economic growth (or pride to be African?).

I invite you to consider using Popper's three worlds as a way to organize your thoughts about things. Is it a real, tangible object? World one. Is it an idea flitting through your head? World two. Is it something that, while not tangible, has a life of its own, that people talk about or to which people pledge allegiance? World three.

In fairness to modern thinking, I must point out that this way of viewing reality is not popular now. And, that will become my point in future posts, because it is my own cognitive dissonance, bouncing back and forth between different ways of viewing reality, which drives me to write this blog.

For the moment, this is just context without content.

* This was not one of the examples he used at the time, but, hey, it was his favorite number! Something that I had not thought of at that time, but realized at this writing.
** "Everyone is a mirror image of yourself—your own thinking coming back at you.", a Byron Katie quote.

Consistency without completeness

A good friend, whom I will call "Stewart," responded by email to the previous blog post, and pointed me to a column by Charles Krauthammer, as reprinted in his book, Things That Matter: Three Decades of Passions, Pastimes and Politics, pp. 64-66, Crown Publishing Group. The column is entitled, "The Central Axiom of Partisan Politics". (I would have liked to link to this column, but cannot.*)

As for the axiom itself, it is, "Conservatives think liberals are stupid. Liberals think conservatives are evil."

Where this gets interesting, to me, is his further claim that conservatives think liberals are stupid because they believe that everyone is basically good. Adding this reason to his axiom can be used to prove a contradiction, as follows (expressed from the point of view of liberals (according to conservatives according to Krauthammer)).

Every person is basically good. Conservatives are people. Therefore, conservatives are basically good. Oops. Contradiction, because, according to his axiom, liberals also believe that conservatives are evil.

A contradiction in a system of formal logic would not be a good thing. But (according to Nicholas Rescher**) people don't reason using formal logic, and so we can be perfectly happy with Krauthammer's axiom. That is, if we are inclined to agree with his views on politics. I will leave that up to you, dear reader, if you are inclined to look for, find, and read the entire column***.

Well, this post isn't meant to be about politics. Instead, the mathematician in me followed the path of least resistance in response to the word "axiom," and this post is about the journey along that path. As a mathematician, I love formal logic systems, which are based on a small number of axioms, and the things that can be proven from them. If a logical contradiction can be proven from the axioms, then they are considered to be inconsistent, which is not a good thing for an axiom system.

An ideal formal system is both consistent and complete. Unfortunately, early in the twentieth century, Kurt Gödel proved that any sufficiently powerful formal system is either incomplete or inconsistent. This undid the grand project of mathematicians at that time, who had hoped to find a set of axioms from which all true statements could be proven.

The cognitive dissonance of this post's title would exist in the minds of mathematicians, for whom consistency without completeness is not good for a formal axiomatic system. It is resolved, sadly, by Gödel's theorem demonstrating that the desired resolution is, in fact, impossible. And we're just going to have to live with that.

Mathematicians seemed to me, in the 1970's, when I was an undergraduate minoring in mathematics, to be still (forty years later) in denial about this result, for it was not taught in the mathematics department. I had to take a course from the philosophy department to learn more about it.

*The Washington Post, in which the column originally appeared on July 26, 2002, has a web site, but it requires payment to view articles older than 2005. Very interesting. This makes it impossible for me to do a proper attribution. Yet, at the same time, I understand that newspapers have been hard-hit by the Internet, and are desperately scrambling to find ways to monetize their work product. Thus, I am quoting from the book, rather than the original source, which is hereby at least acknowledged.

**Rescher, N. (1982). The Coherence Theory of Truth. University Press of America.

***Try the "Look inside" option of amazon.com once you have found his book there.

Sunday, May 11, 2014

Condescension without comprehension

This is something of a sad post for me, because it involves a disappointment from one of my all-time favorite authors, Douglas R. Hofstadter. I believe that I have read all of his books, and have studied one of them, Gödel, Escher, Bach, quite intensely. I admire both the depth of his thought and the ornateness of his writing. And, I just love all of the self-reference.

The disappointment is this quote from Le Ton beau de Marot, on page 485, in the section entitled "Concentric Rings of Diminishing Empathy".
A few rare people are exceedingly empathetic, a few are extremely sociopathic, and most of us fall somewhere closer to the middle.

The distance in "acquaintance space" at which an individual's empathy tails off is a central character trait determining many things, including one's politics.

Grosso modo, the conflict between left-wing and right-wing ideologies is simply the battle between small-radius and large-radius individuals writ large.

I loved everything about the book, except for this one sentence, which I felt to be dismissive of right-wing people. It is obvious from his writing that he "leans left," but this has nowhere else been condescending in tone. The way I understood it is, "on the scale from sociopathic to exceedingly empathetic, you would encounter right-wing people first, nearer the sociopathic end, while left-wing people are more empathetic." Or, in other words, "the less well a right-wing person knows someone, the less he or she cares about them."

This is a very interesting claim, and I believe it to be an over-simplification at best, just plain wrong in the middle, and condescending at worst.

Here, I may be in a bit of trouble, because I am not trained in political science, so I ask help from readers. I think of left-wing as tending towards the welfare state and right-wing as tending towards self-sufficiency. The rest of this post is based on that assumption, so please help me understand if I am wrong.

Side remark: I also think of left-wing as being "progressive" while right-wing is "conservative." My parents were supporters for much of my growing up years of a political party in Canada named "Progressive Conservative," so perhaps this is not a dichotomy.

I am leery of the progressive movement because of this truism: All progress requires change, but not all change is progress. In order to join in calling reckless change "progress" I would need to understand the long-term goal of the progressive movement. Someone, please paint me a picture of the ideal society towards which we are making progress with these changes? But, this would be the topic of a different post, and one which I am not ready to write until my empathy for the progressive movement increases.

Back to the topic. Personally, I identify more with "right-wing" than I do with alternatives. Nevertheless, I have personally taken early retirement in order to live half-way around the world to help people who, until recently, were very remote in my "acquaintance space." Certainly my empathy has not tailed off with the distance.

I propose that the conflict between left-wing and right-wing has nothing to do with empathy at all, but everything to do with how best to help people in need. Grosso modo, I believe, the left-wing prefers to help the needy with government programs while the right-wing prefers to help the needy with private and locally-run programs.

This is well illustrated by conversations I have enjoyed with a good friend, whom I shall call "John." John is frustrated that the government isn't doing more to help the poor. He earns a very good income and yet wishes that the federal tax rate were much higher, perhaps even 60% to 80% so that social programs could be funded, and no one would need to suffer and die in poverty. This is what is done in some countries in the world, and he wishes his country were more like these. Each person in society could then choose, if they are able, to work and pay for those who are unable, or who choose not, to work.

For my part, I told him about a fund which is managed by a private charitable organization. This fund accepts donations, and has tens of thousands of local units which actively seek out the poor and needy in order to help them with food, clothing, and shelter. And to find work if they are able and so desire. The overhead for managing the fund is provided by the charitable organization from other sources, so that the fund itself uses 100% of donations received to help the poor and needy. There are millions of voluntary contributors to this fund throughout the world, and it is actively engaged throughout the world. Would you be willing, I asked, to voluntarily contribute 20% to 40% of your income to this fund? This would go a long way towards helping those in poverty. No. He was not willing to voluntarily contribute to this fund.

I have no conclusion to offer for this post. I wish that I could have a dialog with my favorite author to find out if I am understanding his meaning correctly, or if it might be that I have simply taken offense where none was intended.

I do wish to invite my readers to consider this post from another blog, "The value of struggle." I was surprised in a speech by a left-wing person to hear that they understand that struggle has value, while at the same time seeking to remove the need to struggle. An interesting contradiction.

Constraint without constriction

Dear readers, this is a short post to beg an indulgence. Thus far, all of the blog posts have had titles which followed a strict lexical constraint. Three words. Middle word being "without" and the other words starting with "con".

However, "con-" as a prefix is written "com-" before five consonants, and is often reduced to "co-" in newer English words. Thus, I intend to relax the constraint on blog post titles just slightly, while, I hope, maintaining the spirit of the original constraint.

I sincerely hope not to offend nor to disappoint my readers.

Saturday, May 10, 2014

Contrivance without conclusion

This might be the smallest web page which counts and displays the number of times it has been seen.

#!/bin/bash
echo "Content-type: text/plain"
echo
echo -n 1 >>../../tallies
cat ../../tallies | wc -c

It is a contrivance, all right. Without conclusion? Well, not literally so. Eventually, it might run out of disk space. Probably before that the web hosting company owning the disk drive will go out of business. Probably before that the owner of the domain name will leave the planet or otherwise lose interest in maintaining it.

To make this work, you will need to reserve a domain name, engage a web hosting company for disk space and the use of a web server. Then, you will need to upload this file to a folder named "cgi-bin" within the public html folder provided by the web hosting company. You will of course give this file a name when you upload it. Supposing you give it the name "tally.cgi" it will be available to anyone with an internet connection as
[your domain name]/cgi-bin/tally.cgi
Next, you will need to convince some people to look at your web page. And that is in fact the hardest thing of all to accomplish. The social problems being harder than the technical problems.

Each time someone looks at it, he or she will see a number. Ever larger numbers. And that is all. The number will be equal to the number of times the page has been visited. No one, including you testing it, will ever see the number zero.

How does it work? Delighted that you asked.

The browser asks the web server at your domain (running on the machine belonging to the web hosting company) for the page. The web server runs the program which you have named "tally.cgi" by interpreting it as explained here. The web server uses the output of the program to prepare a response which it then forwards to the browser.

Line 1 tells the web server that this file is a program written in the "bash" language. This is one of the Linux shells, normally used at the command line prompt. So the web server starts up a bash shell and passes the file to it for execution.

Line 2 produces as output an HTTP header which will ultimately let the browser know that the content is plain text.

Line 3 outputs an empty line, which signals to the web server that the HTTP headers are finished, and that what follows will be the actual content to be sent to the browser.

Line 4 produces no output, but adds one to the file named "tallies". A quirk here is that it adds one by adding a digit "1" to the end of the file. The number of visits is maintained in base one, rather than the more familiar base ten.

Line 5 outputs a base ten number expressing the number of characters (all ones, remember) in the file named "tallies".

Then, the program stops, signalling to the web server that the page is complete, and the web server passes this information along to the browser, which will signal to the person that the page is complete.

The count will include visits of the page by web crawlers, such as the ones used by search engines. So, strictly speaking, it is not counting the number of times the page is viewed by a person (using a browser or mobile phone), but rather the number of times the page is requested of the web server.

Why use base one, instead of base ten? Glad you asked! For some insane reason your web page might become wildly popular, with people around the world accessing it over and over again just for the pleasure of seeing ever larger numbers. What happens if two of these requests arrive at the web server at exactly and precisely the same moment in time? If we had been using base ten, the operation of adding one to the number would involve reading in the (base ten) number, adding one to it, and writing out the next (base ten) number. If the web server is running two copies of your program and both copies read in the same number, both will increment it, and both will write out the same next number, and one visit will be missed. This is known in the computing industry as a race condition, and the read-increment-write sequence that needs protecting is called a critical section. By contrast, the operation of appending a character to the end of a file is indivisible, and so the count will not be missed.
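
The append side of this argument can be checked with a small experiment. This is only a sketch: it assumes a local file system, uses a throwaway file from mktemp, and the figure of 50 simultaneous visits is arbitrary.

```bash
# Throwaway tallies file for the experiment.
TALLIES=$(mktemp)

# Simulate 50 visits arriving at about the same time, each one
# appending a single "1" in its own background process.
n=0
while [ "$n" -lt 50 ]; do
  ( printf 1 >>"$TALLIES" ) &
  n=$((n + 1))
done
wait

# Convert the base-one tally to base ten.
COUNT=$(( $(wc -c <"$TALLIES") ))
echo "$COUNT"
```

If appends are indivisible as described, the number printed should equal the number of simulated visits. A base ten counter put through the same experiment would give no such guarantee.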

But won't this require more disk space than a base ten number? Yes, considerably more as the count gets bigger and bigger. This is an example of a trade-off. We are trading off space for the advantage of indivisibility or reliability. Of course, this is a bit contrived, because what does it really matter if the count is off by one once in a while? Especially if there are billions of page views, this would consume vast amounts of disk space, with each byte being exactly the same digit "1". Well, yes, but that's not going to happen, because billions of people are not going to be viewing a page that merely contains an ever larger number. They have better things to do, such as looking at pictures of cats.
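To put rough numbers on the trade-off, here is a small piece of arithmetic in the shell. The figure of one million visits is made up for illustration.

```shell
#!/bin/bash
# Hypothetical visit count, chosen only for illustration.
VISITS=1000000

UNARY_BYTES=$VISITS        # base one: one byte per visit
DECIMAL_BYTES=${#VISITS}   # base ten: one byte per digit

echo "base one: $UNARY_BYTES bytes; base ten: $DECIMAL_BYTES bytes"
```

A million visits cost a million bytes in base one, against seven bytes in base ten.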

Could this be used to count visitors to a real web page? Sure. Adapt it like this.

#!/bin/bash
echo "Content-type: text/html"
echo
echo -n 1 >>../../tallies
COUNT=`cat ../../tallies | wc -c`
cat <<ENDMARKER
... your real web page goes here ...
<p>This page has been visited $COUNT times.</p>
... the conclusion of your real web page ...
ENDMARKER

Notice the change to line 2.

The adapted line 5 does not output the visitor count, but instead assigns it as the value of a shell variable named "COUNT".

Line 6 starts copying the following lines (your real web page!), up to but not including the end marker, to the output to be sent by the web server to the browser. The construct "$COUNT" will be replaced in the output with the value of the "COUNT" variable, which you recall is the visitor count. You can safely use the plural "times" because no one will see the page when it says "visited 1 times," other than you, the first time you test it.
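The substitution can be tried on its own. In this sketch the value 7 and the sentence about the price are invented for the demonstration. It also shows something to watch for: with an unquoted end marker, every dollar sign in your real web page is subject to the same substitution, and quoting the marker turns it off.

```shell
#!/bin/bash
COUNT=7   # hypothetical value standing in for the real tally

# Unquoted end marker: $COUNT is replaced by its value.
PAGE=$(cat <<ENDMARKER
<p>This page has been visited $COUNT times.</p>
ENDMARKER
)

# Quoted end marker: the text is copied literally, dollar signs and all.
LITERAL=$(cat <<'ENDMARKER'
The price is $COUNT dollars.
ENDMARKER
)

echo "$PAGE"
echo "$LITERAL"
```

If the real page needs a literal dollar sign while still showing the count, writing it as \$ inside the unquoted form is one way to keep it.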

This paragraph marks the conclusion of this post. So, it was not the post itself which was without conclusion, but rather the contrivance described therein, which in principle would never run out of numbers.

[added Sun May 11 06:59:14 CDT 2014]
The two contrivances live:
http://sanbachs.net/cgi-bin/tally.cgi
http://sanbachs.net/cgi-bin/tallypage.cgi

Note that both web pages share the same "tallies" file, and thus share a page view counter.

[added Sun May 11 14:03:51 CDT 2014]
CGI stands for "Common Gateway Interface"

Wednesday, May 7, 2014

Controversy without contention

In a word, I suppose, diplomacy. Of course, I am only an armchair diplomat, having neither experience nor training.

The cognitive dissonance (required of posts in this blog) in this post is pretty subtle, as the two words are very closely related.

Disagreements happen all the time. It seems that we are built in such a way as to see situations from our particular perspective, which is necessarily different from the perspective of another, even if standing very close and looking in the same direction. As we attempt to communicate our view, it may be different from our listener's view, and controversy can easily ensue.

In this respect, may I draw your attention to this quotation from a presentation, "Seeing beyond the Leaf", given by Dieter F. Uchtdorf as the keynote address at the Church History Symposium of 2014. It is based on an analogy of individuals as leaves of a tree.

"One of the weaknesses we have as mortals is to assume that our 'leaf' is all there is—that our experience encompasses everyone else’s, that our truth is complete and universal."

Controversy does not have to degenerate into contention, although it easily can, especially if those involved feel strongly about their respective viewpoints. Perhaps the simplest way to achieve resolution is to agree to disagree. One could hope that the parties involved would take care to understand each other's point of view first, but of course that takes time and effort, not to mention a willingness to listen.

In another word, this post is about peacemaking.

In her patriarchal blessing, our mother was promised that her children would be peacemakers. I remember her often telling us this as we quarreled together. Personally, I have more often been a peace-seeker rather than a peace-maker, preferring to avoid even controversy, let alone contention. I can only recall one, very short-lived, fist fight in which I was involved, and that completely unintentional. This is recounted in the first of seven western stories in another blog. The second story there is an example of my avoiding controversy in a passive-aggressive way, which comes more naturally to me than fisticuffs.

There is much more to say about controversy without contention, so this is merely an introduction. In many ways, this blog is motivated by my desire to reify the promise in my mother's patriarchal blessing, at least for one of her children.


Monday, May 5, 2014

Conrad without con

This is a blog devoted to raising and resolving instances of cognitive dissonance.

So, how to resolve the dissonance of the blog's name? For, "con" means roughly "with", and "with without with" seems a bit dissonant, besides being self-contradictory. (After all, "with" without "w i t h" would just be the empty string, leaving just "out" after substitution throughout).

Hopefully, over time, the posts will give enough examples of resolution, thus resolving the dissonance of the blog's title.

The name of the blog is also meant as a pattern of sorts, as I am self-imposing a constraint on blog post titles. The promise that I am making, as I start the blog, is that post titles will be of the form "Con[x] without con[y]" where [x] and [y] will be the remainder (possibly empty, as [y] is here) of words beginning with "con-".

Now, moving on to this initial post, which ought to be subtitled, "The French connection."

In the 1970s my work took me to the computing center of INSEE, the French census bureau, located in Lille. At the time, I lived in Châtenay-Malabry, a suburb of Paris, and commuted, mostly by train (about two hours each way). The company for which I worked, CAP Sogeti Logiciel, allowed me to work four ten-hour days a week, so that I only commuted weekly, and had room and board with a very nice couple in Lille for the nights and meals spent away from home.

The system administrator at the computing center was a great guy, whom I learned to like (and whose name I wish I could remember). My initial interaction with him is the subject of this post. At that time in the history of computing, there were no personal computers. Instead, computers were very large, very expensive, and ran very hot. So, this one was contained in a large room which was air conditioned, and had a false floor for all the cables connecting the various components. The entire room was much less powerful than one of today's laptop computers, and it was shared by dozens of users, all software engineers. It also required an operator, actually operators, who worked in shifts around the clock. The whole operation involved many employees. Work was prepared on punched cards with each bit of work called a "job". A job was at least a dozen or so cards, and often hundreds of cards. It was submitted to the operator, who had it read into the computer. When its turn came, the computer executed (obeyed, or carried out) the instructions specified on the punched cards. These instructions were written in Job Control Language (JCL). The result of the execution was printed on continuous, fan-folded paper, and available from the operator when the job was finished, as he or she got around to separating the outputs and delivering them to a room where they were sorted into bins according to the job name.

The syntax of JCL requires that the first card of the job look like this:

//JOBNAME  JOB

Where "JOBNAME" is replaced by a name of the submitter's choosing, consisting of letters and digits but starting with a letter. To keep things under control, the system administrator of the Lille facility had refined the requirement so that the first card must look like this:

//XYZ9999  JOB

Where the pattern XYZ9999 is meant to show the form of the name of the job, with XYZ indicating the first three letters of the family name of the person submitting the job and 9999 indicating a unique number. Since the first three letters of my family name are "con" and since "con" is a word unfit for proper public conversation in French, I sought an exception to the rule. The operator snickered and then told me that an exception would probably not be made, but that I could always go and ask the system administrator anyway. When I knocked at his door, he lifted his head reluctantly from his work and the conversation went something like this. Of course, I didn't write it down or record it, so the dialog is fiction, and of course, it was also in French.

"Yes?"

"Could I get an exception to the job naming rule?"

"No," as his head went back down. Then as I turned to leave, "What is your last name?"

"Conrad."

"Your job names will begin CNR," he snapped, and went back to work.

Thus were the operators deprived of endless amusement, and I was spared oft-repeated embarrassment. My name was no longer Conrad with "con" but rather Conrad as "CNR". The first card of the first job that I submitted was

//CNR0001  JOB

and over the next several months I submitted hundreds of jobs, each with a different number, but all with names beginning "CNR."

Finally, an unrelated anecdote--a bonus resolution. One of the officials of INSEE, with whom we often worked when he visited from their head office in Paris, always called me "Conrad" in our conversations. "Conrad" is a very common given name in French, and rather uncommon as a family name. One day as we waited to board a train, he apologized for having always called me by my first name, explaining that it was the only name he knew for me. In France at the time, colleagues commonly called each other by the family name, generally dropping the leading "Monsieur" or "Madame" or "Mademoiselle" as familiarity increased, but only used given names when very, very familiar (which takes a lot of time in French society). He was relieved to learn that "Conrad" was my family name and that he had been correctly addressing me all along. Another bit of cognitive dissonance resolved.