When I first started to learn how to program I was confused by loops. I will write something short here about how loops work.
Possibly the simplest loop is the “while” loop. It loops while some condition is true.
Let’s suppose we want to get the character count of the first 20 visible words in a block of HTML (this is based on an actual request from a client). Let’s suppose that we are working on a WordPress site, so we will use the WordPress function get_the_content() to get the content:
$contentWithTextAndHtml = get_the_content();
$howManyCharactersHaveWeCountedSoFar = 0;
$howManyWhiteSpacesHaveWeFoundSoFar = 0;
while (20 > $howManyWhiteSpacesHaveWeFoundSoFar) {
$thisCharacter = substr($contentWithTextAndHtml, $howManyCharactersHaveWeCountedSoFar, 1);
if ($thisCharacter == " ") $howManyWhiteSpacesHaveWeFoundSoFar = $howManyWhiteSpacesHaveWeFoundSoFar + 1;
$howManyCharactersHaveWeCountedSoFar = $howManyCharactersHaveWeCountedSoFar + 1;
}
This while loop is going to continue for as long as $howManyWhiteSpacesHaveWeFoundSoFar is less than 20. Inside of the loop, every time we find a white space, we increase $howManyWhiteSpacesHaveWeFoundSoFar by 1. So after we have found 20 white spaces, this loop will stop running.
We can clean up this code somewhat, using shortcuts that PHP allows. For instance, this:
$howManyCharactersHaveWeCountedSoFar = $howManyCharactersHaveWeCountedSoFar + 1;
can also be written as this:
$howManyCharactersHaveWeCountedSoFar++;
The “++” means “add 1 to this variable”. So we can use this trick to have slightly less code:
while (20 > $howManyWhiteSpacesHaveWeFoundSoFar) {
$thisCharacter = substr($contentWithTextAndHtml, $howManyCharactersHaveWeCountedSoFar, 1);
if ($thisCharacter == " ") $howManyWhiteSpacesHaveWeFoundSoFar++;
$howManyCharactersHaveWeCountedSoFar++;
}
But our loop isn’t really doing what we want. We want to find the first 20 visible words, so we do not want to count the white spaces that are inside of HTML. If we had simple text, with no HTML, the above loop would work fine. For instance, it would find the first 20 words in this block of text:
“SuperAmazing.com, a subsidiary of Amazing, the leading provider of integrated messaging and collaboration services, today announced the availability of an enhanced version of its Enterprise Messaging Service (CMS) 2.0, a lower cost webmail alternative to other business email solutions such as Microsoft Exchange, GroupWise and LotusNotes offerings.”
But what if we are dealing with a block of text that has a lot of HTML in it, like this:
“<img src="/images/corporate/logos/super_amazing.jpg" alt="Company logo for SuperAmazing.com" /> SuperAmazing.com, a subsidiary of <a href="http://www.amazing.com/">Amazing</a>, the leading provider of integrated messaging and collaboration services, today announced the availability of an enhanced version of its Enterprise Messaging Service (CMS) 2.0, a lower cost webmail alternative to other business email solutions such as Microsoft Exchange, GroupWise and LotusNotes offerings.”
Let’s suppose we need to get the character count (including HTML) out to the 20th visible word. We will need some simple way to keep track of whether a white space is inside of an HTML tag. So we could do something like this:
$contentWithTextAndHtml = get_the_content();
$howManyCharactersHaveWeCountedSoFar = 0;
$howManyWhiteSpacesHaveWeFoundSoFar = 0;
$areWeInsideOfHtml = false;
while (20 > $howManyWhiteSpacesHaveWeFoundSoFar) {
$thisCharacter = substr($contentWithTextAndHtml, $howManyCharactersHaveWeCountedSoFar, 1);
if ($thisCharacter == "<") $areWeInsideOfHtml = true;
if ($thisCharacter == ">") $areWeInsideOfHtml = false;
if (!$areWeInsideOfHtml) {
if ($thisCharacter == " ") $howManyWhiteSpacesHaveWeFoundSoFar++;
}
$howManyCharactersHaveWeCountedSoFar++;
}
So now we only count white spaces when $areWeInsideOfHtml is false.
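To see the difference the $areWeInsideOfHtml check makes, here is a small sketch that runs outside of WordPress. The function name, the $skipHtml switch, the limit of 2 white spaces and the sample string are all made up for this demo; they are not part of the client's code.

```php
<?php
// Hypothetical helper: the loop from above wrapped in a function so it can
// be run on a hard-coded string instead of get_the_content().
// Like the loop above, it assumes the text has enough white spaces in it.
function countCharactersUpToWhiteSpaceLimit($contentWithTextAndHtml, $whiteSpaceLimit, $skipHtml) {
    $howManyCharactersHaveWeCountedSoFar = 0;
    $howManyWhiteSpacesHaveWeFoundSoFar = 0;
    $areWeInsideOfHtml = false;
    while ($whiteSpaceLimit > $howManyWhiteSpacesHaveWeFoundSoFar) {
        $thisCharacter = substr($contentWithTextAndHtml, $howManyCharactersHaveWeCountedSoFar, 1);
        // Only track tags when $skipHtml is switched on.
        if ($skipHtml && $thisCharacter == "<") $areWeInsideOfHtml = true;
        if ($skipHtml && $thisCharacter == ">") $areWeInsideOfHtml = false;
        if (!$areWeInsideOfHtml) {
            if ($thisCharacter == " ") $howManyWhiteSpacesHaveWeFoundSoFar++;
        }
        $howManyCharactersHaveWeCountedSoFar++;
    }
    return $howManyCharactersHaveWeCountedSoFar;
}

// Made-up sample: three white spaces inside the tag, three outside of it.
$sample = '<a href="/x" title="A B">One two</a> three four';

$countingEveryWhiteSpace = countCharactersUpToWhiteSpaceLimit($sample, 2, false);
$countingVisibleWhiteSpaceOnly = countCharactersUpToWhiteSpaceLimit($sample, 2, true);

echo $countingEveryWhiteSpace . "\n";       // 13: stopped inside the <a> tag
echo $countingVisibleWhiteSpaceOnly . "\n"; // 37: stopped after "two</a> "
```

Without the check, the loop uses up its 2 white spaces before it has even left the opening tag. With the check, it stops right after the second visible word.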
If loops confuse you, then you might have trouble figuring out what this line does:
$thisCharacter = substr($contentWithTextAndHtml, $howManyCharactersHaveWeCountedSoFar, 1);
We want to get one character at a time. substr() lets us get a small section of a string. These are the 3 parameters:
$contentWithTextAndHtml – this is the string we should look inside.
$howManyCharactersHaveWeCountedSoFar – this is where in the string we start to look.
1 – this is how many characters we should get. In our case, we just want to get 1 character at a time.
As the code loops, what is basically happening is this:
$thisCharacter = substr($contentWithTextAndHtml, 0, 1);
$thisCharacter = substr($contentWithTextAndHtml, 1, 1);
$thisCharacter = substr($contentWithTextAndHtml, 2, 1);
$thisCharacter = substr($contentWithTextAndHtml, 3, 1);
$thisCharacter = substr($contentWithTextAndHtml, 4, 1);
$thisCharacter = substr($contentWithTextAndHtml, 5, 1);
$thisCharacter = substr($contentWithTextAndHtml, 6, 1);
$thisCharacter = substr($contentWithTextAndHtml, 7, 1);
$thisCharacter = substr($contentWithTextAndHtml, 8, 1);
$thisCharacter = substr($contentWithTextAndHtml, 9, 1);
$thisCharacter = substr($contentWithTextAndHtml, 10, 1);
$thisCharacter = substr($contentWithTextAndHtml, 11, 1);
$thisCharacter = substr($contentWithTextAndHtml, 12, 1);
And so on. You see what is going on here? Each time we loop, we look one character further into $contentWithTextAndHtml, and we get just 1 character. This allows us to eventually get every character in the whole block of text.
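If you want to watch this happen, here is a tiny sketch you can run on its own. The string "Hello" and the variable names $text, $position and $collected are made up for the demo.

```php
<?php
$text = "Hello";
$position = 0;
$collected = "";
// "Hello" has 5 characters, at positions 0, 1, 2, 3 and 4.
while (5 > $position) {
    $thisCharacter = substr($text, $position, 1);
    // Glue each character onto the end, with a bar so we can see the steps.
    $collected = $collected . $thisCharacter . "|";
    $position++;
}
echo $collected . "\n"; // prints H|e|l|l|o|
```

Each pass through the loop picks up exactly one character, one position further along than the last.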
But there is still a problem with our code. What happens when we are given a block of text that does not have 20 white spaces in it? What if we get text like this:
“OmniNewsGather is one of our clients.”
That text only has 5 white spaces in it. So this line of the while loop would never stop:
while (20 > $howManyWhiteSpacesHaveWeFoundSoFar) {
We would end up with what is called an “infinite loop”. This is a loop that never ends. Because $howManyWhiteSpacesHaveWeFoundSoFar never equals 20, the loop just keeps going forever. You will be wondering why your code isn’t working.
I recall the first time I wrote an infinite loop. Debugging it was hellish because there were no errors. This is not like getting a syntax error, which usually tells you what you need to fix. An infinite loop is a subtle, hard-to-find bug.
Do not count on substr() to give you a clue. When you request characters that are beyond the end of $contentWithTextAndHtml, PHP does not raise an error. substr() quietly returns false (or, as of PHP 8.0, an empty string). For instance, there are 37 characters in this block of text:
“OmniNewsGather is one of our clients.”
The characters sit at positions 0 through 36, so the code runs off the end once it does this:
$thisCharacter = substr($contentWithTextAndHtml, 37, 1);
because 37 is beyond the end of $contentWithTextAndHtml. Neither false nor an empty string counts as a white space, so $howManyWhiteSpacesHaveWeFoundSoFar never goes up again and the loop keeps spinning.
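You can check for yourself what substr() hands back past the end of a string. As far as I know it is an empty string as of PHP 8.0 and false before that, so this sketch accepts either. The variable names are made up for the demo.

```php
<?php
$text = "OmniNewsGather is one of our clients.";

$lastCharacter = substr($text, 36, 1); // position 36 holds the final "."
$pastTheEnd = substr($text, 37, 1);    // there is nothing at position 37

var_dump(strlen($text));   // int(37)
var_dump($lastCharacter);  // string(1) "."
var_dump($pastTheEnd);     // empty string or false, depending on the PHP version

// Either way there is no error, and the result is not a white space.
$cameBackEmpty = ($pastTheEnd === "" || $pastTheEnd === false);
$looksLikeWhiteSpace = ($pastTheEnd == " ");
```

So the loop gets no warning at all. It just keeps receiving a value that never matches " ".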
What we need to do is put in an extra check, to see if we have reached the end of $contentWithTextAndHtml. So our code would look like this:
$contentWithTextAndHtml = get_the_content();
$lengthOfText = strlen($contentWithTextAndHtml);
$howManyCharactersHaveWeCountedSoFar = 0;
$howManyWhiteSpacesHaveWeFoundSoFar = 0;
$areWeInsideOfHtml = false;
while (20 > $howManyWhiteSpacesHaveWeFoundSoFar && $lengthOfText > $howManyCharactersHaveWeCountedSoFar) {
$thisCharacter = substr($contentWithTextAndHtml, $howManyCharactersHaveWeCountedSoFar, 1);
if ($thisCharacter == "<") $areWeInsideOfHtml = true;
if ($thisCharacter == ">") $areWeInsideOfHtml = false;
if (!$areWeInsideOfHtml) {
if ($thisCharacter == " ") $howManyWhiteSpacesHaveWeFoundSoFar++;
}
$howManyCharactersHaveWeCountedSoFar++;
}
So, what does this line mean?
while (20 > $howManyWhiteSpacesHaveWeFoundSoFar && $lengthOfText > $howManyCharactersHaveWeCountedSoFar) {
In English, this would read as “Loop until $howManyWhiteSpacesHaveWeFoundSoFar equals 20, but only while the length of the text is greater than the number of characters we have counted so far.”
The second part of this while statement protects us against infinite loops. Because now, if we have a sentence like this:
“OmniNewsGather is one of our clients.”
Then the loop will stop after counting 37 characters, even if the code has not yet found 20 white spaces outside of HTML.
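Here is a sketch of the guarded loop that you can run without WordPress. I have wrapped it in a function so we can feed it the short sentence directly; the function name and the $whiteSpaceLimit parameter are my own invention for this demo.

```php
<?php
// Hypothetical wrapper around the guarded loop. Pass in any string in
// place of get_the_content().
function countCharactersThroughVisibleWords($contentWithTextAndHtml, $whiteSpaceLimit = 20) {
    $lengthOfText = strlen($contentWithTextAndHtml);
    $howManyCharactersHaveWeCountedSoFar = 0;
    $howManyWhiteSpacesHaveWeFoundSoFar = 0;
    $areWeInsideOfHtml = false;
    // Stop at the white space limit OR at the end of the text, whichever comes first.
    while ($whiteSpaceLimit > $howManyWhiteSpacesHaveWeFoundSoFar && $lengthOfText > $howManyCharactersHaveWeCountedSoFar) {
        $thisCharacter = substr($contentWithTextAndHtml, $howManyCharactersHaveWeCountedSoFar, 1);
        if ($thisCharacter == "<") $areWeInsideOfHtml = true;
        if ($thisCharacter == ">") $areWeInsideOfHtml = false;
        if (!$areWeInsideOfHtml) {
            if ($thisCharacter == " ") $howManyWhiteSpacesHaveWeFoundSoFar++;
        }
        $howManyCharactersHaveWeCountedSoFar++;
    }
    return $howManyCharactersHaveWeCountedSoFar;
}

$shortText = "OmniNewsGather is one of our clients.";
$countForShortText = countCharactersThroughVisibleWords($shortText);
$countForEmptyText = countCharactersThroughVisibleWords("");

echo $countForShortText . "\n"; // 37: the loop stopped at the end of the text
echo $countForEmptyText . "\n"; // 0: the loop never ran at all
```

Even an empty string is safe now, because the length check fails before the first pass.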
When the loop is done running, $howManyCharactersHaveWeCountedSoFar might equal as little as 60 or 70 or it might equal 600 or more, depending on how much HTML there is in the first 20 visible words. For the client, our actual task was to run some tests on the full string that makes up the first 20 visible words:
$shortenedContent = substr($contentWithTextAndHtml, 0, $howManyCharactersHaveWeCountedSoFar);
if (stristr($shortenedContent, "href")) {
// do stuff here if a link is detected
}
The while() loop is, I think, easy for beginners to get. What is tougher is the for() loop. It is important to realize that the for() loop is totally arbitrary. At some point some programmers decided it would be convenient to take the elements of a while() loop, and write them all on one line. A typical for() loop would look like this:
for ($i=0; $i < count($arrayOfWordPressPosts); $i++) {
$thisPost = $arrayOfWordPressPosts[$i];
// do stuff here with WordPress posts
}
This part of the for() loop simply sets up a variable that we can use to keep count of how many times we have looped:
$i=0
Traditionally, the “i” is supposed to stand for “incrementor”. We will increment it each time we loop. “Increment” basically means “we will add 1 to this variable”.
This part of the loop explains how long the loop should run for:
$i < count($arrayOfWordPressPosts)
Assume that we have an array with 5 WordPress posts in it. The above loop will run until $i equals 5. That is, it will run when $i equals 0, 1, 2, 3 and 4, which means it will run 5 times.
The last part of this for() loop statement increments $i:
$i++
This last part is called every time the loop loops. You might ask “Why does the first part, which assigns a zero to $i, run only once, but the last part runs every time that the loop loops?” That is a good question. I do not know the answer. This aspect of for() loops has always struck me as completely arbitrary.
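One way to look at it is to write the same loop both ways and compare. In this sketch the array holds plain strings as stand-ins for real WordPress posts, and the $seenByFor and $seenByWhile arrays are made up so we can check that both loops visit the same items.

```php
<?php
// Stand-in data: plain strings instead of real WordPress post objects.
$arrayOfWordPressPosts = array("first post", "second post", "third post");

// The for() version.
$seenByFor = array();
for ($i = 0; $i < count($arrayOfWordPressPosts); $i++) {
    $seenByFor[] = $arrayOfWordPressPosts[$i];
}

// The same three parts, spread out as a while() loop.
$seenByWhile = array();
$i = 0;                                      // first part: runs once, before the loop starts
while ($i < count($arrayOfWordPressPosts)) { // middle part: checked before every pass
    $seenByWhile[] = $arrayOfWordPressPosts[$i];
    $i++;                                    // last part: runs at the end of every pass
}

echo implode(", ", $seenByWhile) . "\n"; // first post, second post, third post
```

Seen this way, the first part is just the setup line that would sit above a while() loop, and the last part is the line that would sit at the bottom of its body.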