You have configured your Google Mini and integrated it with your site. What you find now is that your results are being skewed by irrelevant content on your site. This is what I’ve just found.
Exclude unwanted page sections
The result set was upset by the “customers who bought this also bought…” control and by the site page header and footer. This turned out to be very simple to resolve. There are HTML comment tags that can be used to stop parts of the page from being indexed. They are defined in the documentation, under “Excluding Unwanted Text from the Index”.
Here are the examples pulled from that documentation, for brevity:
<!--googleoff: anchor--><A href=sharks_rugby.html>shark </A> <!--googleon: anchor-->
<!--googleoff: snippet-->Come to the fair!<!--googleon: snippet-->
<!--googleoff: all-->Come to the fair!<!--googleon: all-->
You surround the control or section of the page that you do not want to participate in the results with a matching googleoff/googleon pair of HTML comments, as in the examples above. This does not affect the rendering of your page, but it does mean something to the Google search appliance. The attribute in the tag controls what is excluded:
index: The words between the tags are ignored by Google and treated as if they don’t occur on the page at all.
anchor: The text of an HTML anchor tag that links to another page is not associated with the destination page, so the link on this page will not cause that page to appear as a result.
snippet: The text between the tags is not used in the auto-generated snippet that is included in the search results.
all: Turns on all the attributes. Text between the tags is not indexed, not followed to a linked-to page, and not used for a snippet.
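The wrapping itself is mechanical, so it can be sketched in a few lines. This is only an illustration in Python, not code from the site: the `googleoff` helper, the `VALID_ATTRIBUTES` set and the sample footer markup are all made up for the example. Only the comment syntax and the four attribute names come from the documentation quoted above.

```python
# Hypothetical helper: wraps an HTML fragment in a matching
# googleoff/googleon comment pair. The names are illustrative only.
VALID_ATTRIBUTES = {"index", "anchor", "snippet", "all"}


def googleoff(fragment: str, attribute: str = "index") -> str:
    """Return the fragment surrounded by googleoff/googleon comments."""
    if attribute not in VALID_ATTRIBUTES:
        raise ValueError(f"unknown googleoff attribute: {attribute!r}")
    return f"<!--googleoff: {attribute}-->{fragment}<!--googleon: {attribute}-->"


# Made-up footer markup standing in for a master page footer.
footer = '<div id="footer"><a href="contact.html">Contact us</a></div>'
wrapped_footer = googleoff(footer, "all")
print(wrapped_footer)
```

The opening and closing comments must use the same attribute, which is why the helper takes it once and writes it into both.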
To solve my problem, googleoff was applied to:
- “Customers who bought this bought” control reference
- Product category breadcrumb on the product pages
- Master page headers and footers
As a result, a search for “contact us” no longer returns every page in the site. It used to, because the contact page was linked from every page through the site master pages. The snippets in the search results are also much more relevant, which gives much richer results overall.
Caution should be applied to avoid excluding too much of your content from Google, as you can’t predict what someone will search for on your site, or why. Excluding too much content may make it harder for visitors to find what they need, or stop them finding it at all.
Check the documentation for other controls you have available to control the indexing of pages (the crawl).
Since ASP.NET 2.0 I’ve always liked the accessible way that caching has been presented to the ASP.NET developer. There are a number of good options for improving your server performance through server caches that are all really easy to get your head around and simple to implement in their most basic form.
Digging deeper into the cache
As with most aspects of development, a closer look shows that caching can get much more complex once you try to optimise it for the usage patterns of your particular site.
The caching on the ecommerce site I’m working on at the moment was mostly removed during a site revamp last year, with the intention of putting it back in again. Some of the new features I’m writing would be hit hard after a monthly email shot, so caching is very attractive for these pages, and I ended up reminding myself of my options.
The site has a grid containing up to a thousand products and prices to show to the site visitor. The view of these items had to have the following features:
- Sorting of the grid
- Filtering of the items viewed
- Paging of the results, with a variable number of items per page
- Items shown in the context of the current user’s language and currency selection
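Page caching is not really helpful here, since each combination of sort, filter, page and user settings produces a different page. Instead, I’m better off caching the IEnumerable collection that represents the items in the grid.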
Watch out for bad performance behaviour
The thing you have to remember about web pages is that multiple users may be accessing the same page at the same time. This means you have to be a bit defensive about how you cache the object. I know the object takes a long time to build, as it comes from a database query that is quite intensive, plus some post-query processing that takes a bit of time too. So it is very realistic that if three users hit the page in quick succession when the cache is empty, they will all see it as empty and all simultaneously try to populate it, overwriting each other as they finish and write to the cache. Obviously this is going to load the application and database unnecessarily.
Reading around the subject, I found that the good old singleton pattern is a good one to start off with. If you’ve been through the joy of multithreaded applications you will know about lock in C#, or SyncLock in VB. This lets you lock a section of code so that only one thread may be running that section at any time; the other threads are blocked until the lock is released. If you put a lock around the cache refresh, the other users hitting the page simply wait for the refresh to finish and then read from the cache, instead of each rebuilding it.
The other thing to watch out for, which I noticed while reading up, is that many code examples for using the cache forget that the item may drop out of the cache at any moment. If you check that the object exists in the cache and then fetch it in a separate step, it may have dropped out in between and nothing will be returned. So fetch it once into a local variable and test that.
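Both points can be shown in one small, language-neutral sketch. This is Python rather than the ASP.NET code from the site, and `GridCache`, `get_or_build` and `load_products` are names invented for the example; it is not the ASP.NET Cache API. It assumes a simple in-memory dictionary and a single lock around the rebuild.

```python
import threading
import time


class GridCache:
    """Illustrative in-memory cache; a stand-in, not the ASP.NET Cache."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def get_or_build(self, key, build):
        # Fetch once into a local and test the local. Never check that
        # the key exists and then fetch it again in a second step.
        value = self._items.get(key)
        if value is not None:
            return value
        with self._lock:
            # Check again inside the lock: another thread may have
            # filled the cache while this one was waiting.
            value = self._items.get(key)
            if value is None:
                value = build()
                self._items[key] = value
            return value


build_count = 0


def load_products():
    """Stand-in for the slow database query and post-processing."""
    global build_count
    build_count += 1  # only ever runs while the cache lock is held
    time.sleep(0.1)
    return ["product-1", "product-2", "product-3"]


cache = GridCache()
results = []


def visitor():
    results.append(cache.get_or_build("products", load_products))


# Three visitors hit the empty cache in quick succession.
threads = [threading.Thread(target=visitor) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(build_count, len(results))
```

The second check inside the lock is what stops the waiting threads from rebuilding the object once they get their turn. The cost is that every visitor arriving during a rebuild waits for it to finish.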
With the cache working, I used LINQ to Objects to get the paging and filtering working. What a joy to do; I’m really getting into LINQ!
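The idea of filtering, sorting and paging the cached collection in memory, rather than going back to the database, can be sketched roughly as follows. This is Python rather than LINQ, and the `Product` fields, the `grid_page` function and the sample data are all assumptions made for the illustration. Language and currency handling are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    # Hypothetical fields; the real grid items are not described.
    name: str
    category: str
    price: float


def grid_page(items, category=None, sort_key="name", descending=False,
              page=1, page_size=10):
    """Filter, sort and page an already-cached list of products."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be 1 or greater")
    filtered = [p for p in items if category is None or p.category == category]
    ordered = sorted(filtered, key=lambda p: getattr(p, sort_key),
                     reverse=descending)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


products = [
    Product("Kettle", "kitchen", 25.0),
    Product("Toaster", "kitchen", 30.0),
    Product("Lamp", "lighting", 15.0),
    Product("Blender", "kitchen", 45.0),
]

first_page = grid_page(products, category="kitchen", sort_key="price",
                       page=1, page_size=2)
second_page = grid_page(products, category="kitchen", sort_key="price",
                        page=2, page_size=2)
print([p.name for p in first_page], [p.name for p in second_page])
```

Because the cached list is never modified, the same cached object can serve every combination of sort, filter and page size.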
Windiff directory comparison tool, by Microsoft
It is always great to discover utilities that already exist on your machine.
Yesterday I wanted to compare two directory trees containing thousands of directories to see what had changed between two computers. I was trying to resolve a source control error for a big ASP.NET website.
In this case I was not looking for anything complicated, merely what had changed between the machines. In the past I’ve tried utilities downloaded from the internet that do all sorts of interesting things when comparing files and directories. However, I didn’t have any of these installed on the machine I was using. I then uncovered Windiff.exe.
In Microsoft Windows 2000 and later, Windiff.exe is included on the original CD-ROM in the Support\Tools folder. Details of how to use it can be found here: http://support.microsoft.com/kb/159214 . It is a graphical Windows application that shows in two columns what has changed between the two directories, and allows the user to drill down into the files listed to look at the differences within those files too.
The application also allows editing of those files. Very, very useful. So I went to the Vista Start menu, typed windiff into Start Search and was away. I compared the directories, found my missing files, and the job was done.
It happens so rarely that a job ends up simpler than you thought, so it’s worth blogging about when it happens!
System.Web.UI.UpdatePanelTriggerCollection must have items of type 'System.Web.UI.UpdatePanelTrigger'. 'asp:AsyncPostBackTrigger' is of type 'System.Web.UI.AsyncPostBackTrigger'.
I scratched my head for a little while, thinking this was something more complicated than it was.
The error appeared after upgrading an ASP.NET site from ASP.NET 2.0 to the 3.5 framework, where the site used a derived AJAX update panel in a separate referenced class library.
<asp:AsyncPostBackTrigger ControlID="albViewMoreCategories" EventName="Click" />
- was the line causing the compilation error.
The website project updated to 3.5 fine and changed its reference to the System.Web.Extensions assembly to the 3.5 version. However, the associated class library still clung on to the previous 1.0 version of the assembly.
Removing the System.Web.Extensions 1.0 reference from the class library and replacing it with the 3.5 version got rid of the conflict between the two projects.
I hope this helps someone.