Trawling for Broken / Invalid Images

What does this trawl type find?

DeepTrawl allows you to find broken or invalid images. These might be contained within either IMG or OBJECT tags.
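To illustrate what "contained within IMG or OBJECT tags" means in practice, the following is a minimal Python sketch that collects candidate image URLs from a page. It is not DeepTrawl's own code. The names `ImageURLCollector` and `collect_image_urls` are hypothetical, and the sketch assumes that images are referenced through the `src` attribute of IMG and the `data` attribute of OBJECT.

```python
from html.parser import HTMLParser


class ImageURLCollector(HTMLParser):
    """Collect candidate image URLs from IMG and OBJECT tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before calling this.
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.urls.append(attributes["src"])
        elif tag == "object" and attributes.get("data"):
            self.urls.append(attributes["data"])


def collect_image_urls(html):
    """Return the image URLs found in an HTML string, in document order."""
    collector = ImageURLCollector()
    collector.feed(html)
    collector.close()
    return collector.urls
```

Tags without a `src` or `data` attribute are skipped, since they give no URL to check.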

Invalid images might take the form of:

  • Malformed image URLs
  • Images hosted on a server which cannot be contacted
  • HTTP error codes returned when trying to access an image (e.g. 404)
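The three categories above can be told apart with a small Python sketch. This is an assumption about how such a check might be written, not a description of DeepTrawl's internals. The function name `check_image`, its return strings, and the injectable `opener` parameter (used so the function can be exercised without network access) are all hypothetical.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def check_image(url, opener=urllib.request.urlopen, timeout=10):
    """Return 'ok', 'malformed', 'unreachable' or 'http <code>' for one image URL."""
    # Category 1: malformed URL (this sketch only accepts full http/https URLs).
    try:
        parts = urlsplit(url)
    except ValueError:
        return "malformed"
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return "malformed"

    try:
        with opener(url, timeout=timeout):
            return "ok"
    except urllib.error.HTTPError as err:
        # Category 3: the server answered, but with an error code such as 404.
        # HTTPError must be caught first because it is a subclass of URLError.
        return f"http {err.code}"
    except (urllib.error.URLError, OSError):
        # Category 2: the server could not be contacted (DNS failure, timeout, ...).
        return "unreachable"
```

For example, `check_image("http://example.com/logo.png")` would return `"http 404"` if the server responded that the image does not exist.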

Missing images annoy users: the page loses content, and its layout can also be adversely affected.

Running DeepTrawl regularly to find missing images is recommended. This will ensure the images from your site and those hosted on other web sites appear as they should.

How to use this trawl type

You should make sure this trawl type is switched on. There should be a green tick next to this trawl type in the Trawl for... tab. If there is a red cross instead, click the cross to toggle the trawl type on.

This trawl type also has several options in its settings dialog, which you can open by pressing the Settings link to the right of the trawl type in the Trawl for... tab. The dialog is shown below, and its features are explained beneath it.

Above: The image validator settings dialog


Screenshot item 1

The Trawl for malformed URLs check box enables or disables checking for badly structured URLs (for example, full URLs whose protocol is not widely recognised).
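As a rough illustration of what a "badly structured" URL might look like, here is a minimal Python sketch. The rules it applies, the function name `is_malformed`, and the set of schemes treated as "widely recognised" are assumptions made for this example. DeepTrawl's actual rules may differ.

```python
from urllib.parse import urlsplit

# Assumption: which schemes count as "widely recognised" is this sketch's choice.
RECOGNISED_SCHEMES = {"http", "https", "ftp", "data", "file"}


def is_malformed(url):
    """Return True if a URL looks badly structured (illustrative rules only)."""
    url = url.strip()
    if not url:
        return True
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit rejects some inputs outright, e.g. an unclosed IPv6 bracket.
        return True
    if not parts.scheme:
        # A relative URL such as "images/logo.png" has no protocol to check.
        return False
    if parts.scheme not in RECOGNISED_SCHEMES:
        return True
    # Network schemes need a host part: "http:/example.com" has none.
    if parts.scheme in ("http", "https", "ftp") and not parts.netloc:
        return True
    return False
```

With these rules, a typo such as `htpp://example.com/a.png` is flagged because `htpp` is not a recognised protocol.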

Screenshot item 2

The Trawl for problems contacting remote servers check box controls whether DeepTrawl reports a problem when a server cannot be reached over the Internet. This should usually be left on.

Screenshot item 4

The Trawl for HTTP codes... section of the dialog lets you switch searching for specific HTTP error codes on or off. For instance, you may wish never to see errors caused by 401 (unauthorized) HTTP codes.
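The effect of switching individual codes off can be pictured as a simple filter over the results. The Python sketch below is only an analogy for the dialog setting. The function name `filter_reported`, the `(url, status_code)` result format, and the `ignored_codes` parameter are hypothetical.

```python
def filter_reported(results, ignored_codes=frozenset({401})):
    """Keep only (url, status_code) pairs that are errors and are not ignored.

    Codes of 400 and above are treated as errors in this sketch.
    """
    return [
        (url, code)
        for url, code in results
        if code >= 400 and code not in ignored_codes
    ]
```

With the default setting, an image that returns 401 is left out of the report, while one that returns 404 or 500 is still listed.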

How to solve the problems found

Solving 404 errors

A 404 error is usually caused by the owner of the web site where the image is hosted removing or moving the image. It is recommended that you either remove this image from your site's design or use an alternative image.

Solving Malformed URL errors

Most of the time, malformed URL errors are caused by a typo in your HTML. It is suggested that you review the image URL to make sure it is valid.

Solving Non-contactable server errors

A server may only be temporarily non-contactable. If you are worried the server is unavailable too often, it is recommended you remove this image from your site.
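Because an outage may be temporary, it can help to retry a few times before concluding that a server is unavailable. The following Python sketch shows one way to do that. It is not a DeepTrawl feature. The function name `reachable_after_retries` and its parameters are hypothetical, and `probe` stands for any function of your own that returns True when the server responds.

```python
import time


def reachable_after_retries(probe, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing between failed tries.

    Returns True as soon as one attempt succeeds, False if all of them fail.
    """
    for attempt in range(attempts):
        if probe():
            return True
        # Do not wait after the final attempt; there is nothing left to retry.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

If a server still fails after several spaced-out attempts, or fails again on later trawls, that is a sign it is unavailable too often.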