Wednesday, September 16, 2020

Google Lemonade: GOOGLE HAS DESTROYED SOME OF MY NAMED ANCHORS!

When I was writing about the Cos Cob Power Plant, I wanted to reference my notes on ComEd's Fisk Power Plant using the link https://industrialscenery.blogspot.com/2014/05/midwest-generation-power-plants.html#fisk. But when I tested that link, it didn't work! So I looked at the HTML of the target notes. What I expected to find was the named anchor:
<a href="https://www.blogger.com/null" name="fisk"></a>
What I found was:
<a href="https://www.blogger.com/null"></a> 
Note that the "name" attribute is missing! The href attribute is something that Blogger adds after I type in <a name="fisk"></a> and it's extraneous. BTW, I learned years ago that <a name="fisk"/> doesn't work.
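This damage pattern can be checked for mechanically. Here is a minimal Python sketch, assuming the post's HTML has been copied out of the editor into a string. The names `NamelessAnchorFinder` and `find_nameless_anchors` are made up for illustration and are not part of Blogger. It only catches anchors that kept the bogus href but lost the name; it cannot see an anchor that was removed entirely.

```python
from html.parser import HTMLParser


class NamelessAnchorFinder(HTMLParser):
    """Collects <a> tags that have the Blogger "/null" href but no name attribute."""

    def __init__(self):
        super().__init__()
        self.hits = []  # (line, column) positions reported by the parser

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href") or ""
        # Assumption: a damaged named anchor still carries the extraneous href.
        if href.endswith("blogger.com/null") and "name" not in attr_map:
            self.hits.append(self.getpos())


def find_nameless_anchors(html):
    """Return the parser positions of anchors that look like damaged named anchors."""
    finder = NamelessAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.hits


if __name__ == "__main__":
    sample = '<p>Fisk notes</p>\n<a href="https://www.blogger.com/null"></a>'
    print(find_nameless_anchors(sample))
```

An empty list means no anchor of this particular shape was found, not that the post is undamaged.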

The Midwest notes had three named anchors, and all three of them were missing the "name" attribute. So I added it back in. The new version no longer adds a bogus href attribute, which is good.
<a name="fisk"></a>
I quickly spot-checked another post that had named anchors, and it was OK. So I lost some sleep trying to figure out how to find the destroyed named anchors in my thousands of posts. The new version supports a "body" search, which looks at the HTML as well as the visible text. So the query I needed was the boolean expression:
body:blogger.com/null and not body:name=
But Blogger doesn't support boolean expressions. (But believe me, the new search is far better than the search in the legacy version, since the author's search in the legacy version broke on April 3, 2018, and I've had to find posts using just a label ever since. In fact, the search function is the only thing I have found so far that is better in the new version.) So I needed to do two body queries and eliminate from the first result those posts that appear in the second result. But the results of these searches are so big that I haven't taken the time to even force all of the results to load.
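The "eliminate from the first result" step is a set difference. Here is a minimal Python sketch, assuming the two result lists have somehow been copied out of Blogger as lists of post URLs. The function name and the sample URLs are hypothetical.

```python
def posts_missing_name(null_href_results, name_attr_results):
    """Return posts from the first query that do not appear in the second.

    null_href_results: posts matching body:blogger.com/null
    name_attr_results: posts matching body:name=
    The order of the first list is preserved.
    """
    has_name = set(name_attr_results)
    return [post for post in null_href_results if post not in has_name]


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    null_href = ["example.com/post-a", "example.com/post-b", "example.com/post-c"]
    with_name = ["example.com/post-b"]
    print(posts_missing_name(null_href, with_name))
```

Like the boolean query itself, this misses any post in which one named anchor survived and another was damaged, because that post still matches "name=".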

Then yesterday I was working on my Met "bridge to nowhere" notes and discovered that the forward reference https://industrialscenery.blogspot.com/2016/12/the-met-and-its-abandoned-bridge-over-c.html#met did not work. So I switched to HTML and searched for "/null" so that I could add name="met". But I did not find it! So I added "zzz" to the notes where the name should be and looked for zzz in the HTML. To my horror, there was no <a> element for a named anchor! So I can't even search for "/null" to find where I have problems.
<a href="https://www.facebook.com/dennis.debruler.7">Dennis DeBruler</a> The former "L" bridge in the background. Now it is just a signal bridge. This "L" route was replaced by the Dearborn subway.</td></tr>
</tbody></table>
zzz<br />
The Metropolitan West Side Elevated Railroad was the third one built in Chicago and the first to use electric traction. [<a href="http://www.chicago-l.org/operations/lines/garfield.html" target="_blank">GarfieldPark</a>]
So I added <a name="met"/> to see if it still needs the </a> text. The resulting HTML is:
<a name="met">
That is not correct HTML, but it did work when I tested it.

So now I have the situation that Google has destroyed some of my named anchors and I have no way of finding what needs to be fixed. Named anchors are a feature that I use rather frequently, so I was glad when I saw that the new version supported adding them. Google has since removed that support.

Unlike Google destroying URLs, I have no idea what causes it to destroy named anchors so I don't know what to avoid doing. In fact, I don't even know if it was the new or the legacy version that destroyed them! They may have been broken for years, and I just now happened to need a broken one. 

In the case of URLs, I have to avoid removing the formatting of text. I did a quick test. This bug is still there even though I found and reported that content destruction bug soon after they released the new version. (The correct domain in the screenshot is not blogger.com.) At least they improved their URL display a few weeks ago so that it is easier to check the URL content.

I did force the entire "body:name=" result set to be loaded, and I saved it below as a series of window captures so that a year or so from now I can see how many more named anchors Google has destroyed. But before I did that, I took a nap. After sleeping on it, I realized that comparing against this list would tell me which posts were damaged, but I wouldn't know the names that got lost. But then it occurred to me that I could find the lost names by searching the blog for links to the name. To test this theory, I searched for posts that referenced the Midwest post:
body:"https://industrialscenery.blogspot.com/2014/05/midwest-generation-power-plants.html#"
And got:
If this doesn't remind me of the names, then I can go into a post, switch to HTML and search for ".html#" to find the names. Hopefully, when I learn the names, I can remember where they belong in a post.
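Pulling the names out of the ".html#" links can also be scripted. Here is a minimal Python sketch, assuming the HTML of a referencing post is available as a string and that its links use double-quoted href attributes. The function name `referenced_anchor_names` is made up for illustration.

```python
import re
from urllib.parse import urldefrag

# Assumption: links are written as href="..." with double quotes.
HREF_RE = re.compile(r'href="([^"]+)"')


def referenced_anchor_names(html, target_url):
    """Return the sorted anchor names that links in html use for target_url."""
    names = set()
    for href in HREF_RE.findall(html):
        url, fragment = urldefrag(href)  # splits "page.html#fisk" at the "#"
        if url == target_url and fragment:
            names.add(fragment)
    return sorted(names)


if __name__ == "__main__":
    target = ("https://industrialscenery.blogspot.com/2014/05/"
              "midwest-generation-power-plants.html")
    post = '<a href="' + target + '#fisk">Fisk</a> and <a href="' + target + '">all</a>'
    print(referenced_anchor_names(post, target))
```

This recovers the names that other posts expect to exist. It does not say where in the target post each name belongs.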

Current posts that have named anchors

[Window captures of the "body:name=" search results]

If sweeping out 21 screenshots and saving them wasn't a torturous enough waste of my time, I remembered that I have a second blog. Fortunately, I don't cross reference as much in that blog.

[Window captures of the search results for the second blog]
